June data sets released

Hi moles, the June 2009 data sets are released.


Datamarts (SQL files) and flat (delimited) files are located on our Google Code downloads area.

oss2009 requests, etc

Just back from OSS 2009 in Skövde, Sweden. (Finally figured out how to make the ö character on a Mac: hit option-u, then o.) Here are the requests I heard while sitting in talks, either for new forges, for features that FLOSSmole could provide, or just things that people were using or needing that might intersect with our mission here:

debian popularity contest
KDE's "10 years of data in an xml logfile"
"git" everything
developer skills from sourceforge

May data, and April and May datamarts released

Go grab the May data and the April and May datamarts from our Google Code web site.

I'm backing up to Teragrid now, so Teragrid users, you should have a nice new set of data to play with RSN (real soon now).

April 2009 data released

Hi moles, April data has been released over on the Google Code site (all forges). I'll get that up on the Teragrid site ASAP. We have a few small problems with the scripts over there, nothing major, but they will slow me down in getting the data uploaded.

FOSSology grants

Hello moles,

Here is an RFP I received this past week. I've copied it here.

Request for PROPOSALS

Hewlett Packard, major sponsor of the FOSSology project
(http://fossology.org), is accepting short proposals from academic
institutions involved in research in areas of interest to the
FOSSology project. FOSSology is an open source project dedicated to
analysis and mining of open source data and using the results for the
betterment of the free and open source software (FOSS) and FOSS

Interesting areas include but are not limited to:

- vulnerability analysis
- vulnerability tracking
- dependency analysis
- code reuse detection

Currently, the FOSSology project concentrates on software license
detection, but would like to expand into other areas. Feel free to
suggest your own area or propose something around the above topics

Grant amounts are in the range of $5,000 - $20,000 USD. The number of
grants awarded is dependent on the number and quality of proposals

Proposals should be short (1-2 pages).

Grant recipients will be expected to communicate with other FOSSology
developers through the public mailing list and/or IRC channel. If

Feb and March data released

Sorry for the delay on February, but as a bonus I'll throw in the March data too!

Go to our Google Code downloads page and have fun.

Teragrid updated with Dec data

I forgot to mention this here, but I finished updating the latest complete data sources to the Teragrid, so those of you who have requested database access should be able to play with that data there. The new data sources are:


Have fun! Let me know if there are problems or things that are just not quite right.

Dec data for SF released

Sourceforge data is available for December. Get it now at the FLOSSmole Google Code Project Page.

Update (Jan/12/2009): Also datamarts from December are posted. Same place.

NOTE: The stats data looks somewhat strange to me. I suspect that the stats server was having some issues around the first week of December, so be careful of that data. I parsed the data as I found it, but I think it was not in great shape on the server when we collected it.

minor forges ready for December, inc. FSF

The December "minor" forges are ready for download from the FLOSSmole Google Code Project Page: FM, RF, OW. Enjoy!

Update: I've also released Free Software Foundation (FSF) for October and December. This parser has been a long time coming. But better late than never.

Problem with Public Areas file

We recently had an error in the Sourceforge Public Areas data dumps for October.

New files are available for you to download. They are marked "2".