During the oceansprint I wrote a script to dump all nixpkgs pull requests (metadata only; the actual code content can be fetched with git and correlated via the git SHA references).
The script is designed so that it can be restarted if it is interrupted or the GitHub API fails. The same mechanism can be used to update the file incrementally.
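A minimal sketch of that restart/incremental-update idea: records are appended to a JSON Lines file, and on startup the file is read to find the highest PR number already saved, so the scrape can resume from there. The function names (`last_scraped_pr`, `append_prs`) are illustrative, not taken from the actual script.

```python
import json
from pathlib import Path


def last_scraped_pr(path: Path) -> int:
    """Return the highest PR number already in the file, or 0 if it is empty/missing."""
    if not path.exists():
        return 0
    numbers = [
        json.loads(line)["number"]
        for line in path.read_text().splitlines()
        if line.strip()
    ]
    return max(numbers, default=0)


def append_prs(path: Path, prs: list) -> None:
    """Append PR records as one JSON object per line (JSON Lines)."""
    with path.open("a") as f:
        for pr in prs:
            f.write(json.dumps(pr) + "\n")
```

On restart, a scraper built this way only requests PRs newer than `last_scraped_pr(path)`, which also makes incremental updates of an existing dump cheap.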
The code is here in case someone wants to make this a project: Scrape all nixpkgs pull requests · GitHub
On a 1 Gbit/s connection, a full scrape takes less than a day. Here is the result: prs.jsonl.zstd - Google Drive (compressed with zstandard)
The file is in JSON Lines format (https://jsonlines.org/).
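A minimal sketch of consuming the dump: after decompressing (e.g. with `zstd -d prs.jsonl.zstd`), each line is one JSON object, so the file can be streamed without loading it all into memory. Any specific field names in the records are assumptions about the metadata schema, not confirmed from the dump itself.

```python
import json


def iter_jsonl(path):
    """Yield one parsed record per non-empty line of a JSON Lines file."""
    with open(path) as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

This streams records one at a time, which matters for a dump of this size; a plain `json.load` on the whole file would not work anyway, since JSON Lines is not a single JSON document.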