Public Datasets
- GitHub Archive is a project to record the public GitHub timeline, archive it, and make it easily accessible for further analysis. All the open source code on GitHub is also available in Google BigQuery.
- Common Crawl is an open repository of web crawl data that can be accessed and analyzed by anyone.
- Our World in Data
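
As a rough illustration of the "further analysis" that GitHub Archive is meant to enable, here is a minimal sketch that counts event types in one archive file. It assumes the file is gzip-compressed, newline-delimited JSON and that each event carries a `type` field such as `PushEvent`. The function name `count_event_types` is hypothetical, and the sample data is synthetic so the sketch runs offline.

```python
import gzip
import json
from collections import Counter


def count_event_types(gz_bytes: bytes) -> Counter:
    """Count event types in a gzip-compressed, newline-delimited JSON dump.

    Assumption: each non-empty line is one JSON object with a "type" key.
    Events without that key are counted under "unknown".
    """
    counts = Counter()
    # Split on b"\n" only, so unusual characters inside JSON strings
    # are never mistaken for record boundaries.
    for line in gzip.decompress(gz_bytes).split(b"\n"):
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        counts[event.get("type", "unknown")] += 1
    return counts


if __name__ == "__main__":
    # Synthetic stand-in for a downloaded archive file.
    sample_events = [
        {"type": "PushEvent"},
        {"type": "WatchEvent"},
        {"type": "PushEvent"},
    ]
    payload = "\n".join(json.dumps(e) for e in sample_events).encode("utf-8")
    print(count_event_types(gzip.compress(payload)).most_common())
```

To run this against real data, read a downloaded archive file in binary mode and pass its bytes to the function. Check the GitHub Archive site for the current file naming and download instructions.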