February data released

Go to our file release page on Sourceforge and pick up all the latest files.

Included this month: flat files and data marts (SQL statements) for Sourceforge (SF), Freshmeat (FM), Rubyforge (RF), and Objectweb (OW).

(Debian is on the way! I'll update here as soon as it's ready.)


Jan files released and Dec datamarts posted

Hi Moles!

The January data files have been released, and I also went back and released the December data marts (SQL statements for building your own database).

Have fun with it! Grab the data from our Sourceforge project page.

December data released

Hello moles! After considerable delay, here is the December data. Enjoy!

Download the data from our SF release page!

November data released

Files for November 2007 have been released here:

Sourceforge File Release Page

October data released

Sorry for the lack of postings! The data has been released; I just forgot to post the updates.

Here are the October data links.

August data released (mostly)

Greetings moles! We have the August data release mostly done.

Get the data marts (SQL) or the flat files over on our Sourceforge file release page, or use the query tool to explore.

I say the August release is "mostly" done because we had some technical difficulties with the Sourceforge and Debian releases, but these should be cleared up around mid-month.

So if you're looking for Sourceforge data and you just can't wait, why not give the Freshmeat, Rubyforge, Objectweb, or Free Software Foundation data a try? You might find some interesting things in there.

We have added new trove definitions to the Freshmeat data to better explain the trove categories of each project. In the flat files, the new files are fmProjectTrove and fmTroveDefs. In the data marts, the new file is datamart_fm_trove_defs. A big thanks to user Bob Daly for this trove information and the idea to include it!
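If you want to see how the two new flat files might fit together, here is a minimal sketch in Python. The column names (`project`, `trove_id`, `trove_name`), the tab delimiter, and the sample rows are all assumptions made up for illustration; check the headers of the real fmProjectTrove and fmTroveDefs files before relying on any of it.

```python
import csv
import io

def load_rows(text, delimiter="\t"):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

def label_projects(project_trove_rows, trove_def_rows):
    """Map each project to the readable names of its trove categories."""
    # Lookup table built from the definitions file (hypothetical columns).
    names = {row["trove_id"]: row["trove_name"] for row in trove_def_rows}
    labelled = {}
    for row in project_trove_rows:
        name = names.get(row["trove_id"], "unknown")
        labelled.setdefault(row["project"], []).append(name)
    return labelled

# Made-up sample data standing in for fmProjectTrove and fmTroveDefs.
project_trove_text = "project\ttrove_id\nexampleproj\t12\nexampleproj\t40\n"
trove_defs_text = "trove_id\ttrove_name\n12\tCategory A\n40\tCategory B\n"

labelled = label_projects(load_rows(project_trove_text),
                          load_rows(trove_defs_text))
print(labelled)
```

The definitions file acts as a plain lookup table, so a project with several trove rows simply collects several readable names.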

SF Status pages: bug fix

Just wanted to let you guys know about a bug fix on the status pages for the SF data. Each project has a "status" (e.g. alpha, beta, production). We were under the impression that each project only had a single status (we assumed this represented the project's "current" status), but this turned out not to be the case.

Our code was therefore erroneously grabbing only the first of a possible list of status codes. Consequently, some projects that had multiple status codes were not shown correctly.
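To make the difference concrete, here is a small Python sketch of the old and new behavior. It is not our actual collection code; the function names and the sample status list are invented for illustration.

```python
def first_status_only(status_codes):
    """The buggy behavior described above: keep only the first code."""
    return status_codes[:1]

def all_statuses(status_codes):
    """The corrected behavior: keep every distinct code, in original order."""
    seen = []
    for code in status_codes:
        if code not in seen:
            seen.append(code)
    return seen

# Hypothetical project that lists more than one status.
project_statuses = ["beta", "production"]

print(first_status_only(project_statuses))
print(all_statuses(project_statuses))
```

A project with a single status comes out the same either way, which is why only the multi-status projects were shown incorrectly.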

I have gone into the June 2007 and April 2007 data and made the corrections, and I'll get to the older data sets at some point. (Let me know if this is a high priority for you and I'll try to get to them faster.) Obviously, the upcoming August data set will now be run correctly.

When looking through the old packages, you'll want to look for the files marked ".fixed.bz2".
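If you have a pile of old packages unpacked locally, something like the following Python sketch can pick out the corrected files and read them. The directory you pass in is up to you, and the sketch assumes the files are bzip2-compressed text, which the ".bz2" suffix suggests.

```python
import bz2
from pathlib import Path

def find_fixed_files(package_dir):
    """Return every file under package_dir whose name ends in .fixed.bz2."""
    return sorted(Path(package_dir).rglob("*.fixed.bz2"))

def read_lines(path):
    """Decompress a bz2 file and return its contents as a list of text lines."""
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return handle.read().splitlines()
```

For example, `for path in find_fixed_files("."): print(path, len(read_lines(path)))` lists each corrected file in the current directory tree along with its line count.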

Debian data released

I collected some Debian package data and started parsing it to see what kind of stuff we might find in there.

I will probably need some help from the user community on this one, to know what sort of data you find interesting in these packages.

Here are the files I collected:

Obviously there is a lot of information there, and I only parsed some of it out for this initial run. Here are the items I parsed and released:

  • package name, version, parent directory
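For anyone who wants to experiment with similar parsing, here is a rough Python sketch. It is not the parser I used. It assumes input shaped like a stanza from a Debian Packages index, where the `Package`, `Version`, and `Filename` fields are standard, and it treats the directory part of `Filename` as the "parent directory", which is my guess at a reasonable reading. The sample stanza is made up.

```python
import posixpath

def parse_stanza(text):
    """Parse one 'Key: value' stanza into a dict, ignoring continuation lines."""
    fields = {}
    for line in text.splitlines():
        if not line or line[0].isspace():
            continue  # blank lines and indented continuation lines are skipped
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def summarize(text):
    """Pull out the package name, version, and parent directory."""
    fields = parse_stanza(text)
    return {
        "name": fields.get("Package"),
        "version": fields.get("Version"),
        "parent_directory": posixpath.dirname(fields.get("Filename", "")),
    }

# Made-up stanza for illustration only.
sample = """Package: hello
Version: 2.2-2
Filename: pool/main/h/hello/hello_2.2-2_i386.deb
Description: The classic greeting
 A longer description continues on indented lines.
"""

print(summarize(sample))
```

Splitting on the first colon only keeps version strings with an epoch (such as `1:2.3-1`) intact.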

July data released

July data is out for Freshmeat, Rubyforge, Objectweb, and the Free Software Foundation.

Go get it!

June data released - all forges

June data has been released for all forges.

Head over to the project page on Sourceforge and gather all the data you need!
