Freecode (Freshmeat) Collections
Freecode (formerly Freshmeat) is a directory of open source projects.
Every month we download Freecode's own RDF file of information about the projects listed in that directory, parse it, and load the results into our database. We then make that data freely available for you to use as you wish.
Project Metadata:
- Project names (long name and short names)
- Project textual descriptions
- Project URL (the Freshmeat URL and the 'real' project URL)
- Project license(s)
- Project author(s)
- Project stats (vitality, popularity, etc., as determined by Freshmeat)
- Project trove categories (tags)
After the interesting variables are parsed out of the RDF file and stored in the database, we release the data in several formats: delimited flat files, SQL files, and live database query access.
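The parse, store, and release steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not our actual collection code: the sample XML, its element names (name_short, name_full, license), the table layout, and the function names are all hypothetical and do not reflect the real schema of the Freecode RDF file.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample input. The element names below are assumptions
# made for illustration; they are not the real Freecode RDF schema.
SAMPLE_XML = """<projects>
  <project>
    <name_short>foo</name_short>
    <name_full>Foo Toolkit</name_full>
    <license>GPL</license>
  </project>
  <project>
    <name_short>bar</name_short>
    <name_full>Bar Server</name_full>
    <license>BSD</license>
  </project>
</projects>"""

FIELDS = ("name_short", "name_full", "license")


def parse_projects(xml_text):
    """Return one tuple of FIELDS values per <project> element."""
    root = ET.fromstring(xml_text)
    return [
        tuple(project.findtext(field, default="") for field in FIELDS)
        for project in root.iter("project")
    ]


def load_projects(conn, rows):
    """Store the parsed rows in a (hypothetical) projects table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS projects "
        "(name_short TEXT, name_full TEXT, license TEXT)"
    )
    conn.executemany("INSERT INTO projects VALUES (?, ?, ?)", rows)
    conn.commit()


def export_delimited(conn, delimiter="\t"):
    """Dump the projects table as a delimited flat file with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(
        conn.execute(
            "SELECT name_short, name_full, license "
            "FROM projects ORDER BY name_short"
        )
    )
    return out.getvalue()


conn = sqlite3.connect(":memory:")
load_projects(conn, parse_projects(SAMPLE_XML))
flat_file = export_delimited(conn)
print(flat_file)
```

An in-memory SQLite database stands in for the real database here so that the sketch is self-contained; the same three steps would apply to any SQL backend.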
Frequently Asked Questions:
- How come you don't provide the trove categories in the file downloads?
We'd really like to make some sense of the trove categories, actually. Ideally, we'd like to relate each numeric trove category to a textual description of that trove category, and create a "key" table for this information. Then we'd feel more comfortable releasing the trove categories for each project. Let us know if you'd like to work on this.
- How come some of the data is missing from your Freshmeat downloads in early 2009?
Mostly because Freshmeat embarked on a total site redesign during this timeframe and stopped publishing its RDF file of project data. In mid-2009 they started publishing the file again, and we were able to resume collecting the data in May and June. However, we were told that this method was deprecated and that we would have to start using their API to collect data. So far, we have found that the RDF files are still being produced, and we are still using them.