megan's blog

April 2009 data released

Hi moles, April data has been released on the Google Code site (all forges). I'll get it up on the Teragrid site ASAP. We have a few small problems with the scripts over there. Nothing major, but it will slow me down in getting the data uploaded.

FOSSology grants

Hello moles,

Here is an RFP I received this past week. I've copied it here.

Request for PROPOSALS
----------------------

Hewlett Packard, major sponsor of the FOSSology project
(http://fossology.org), is accepting short proposals from academic
institutions involved in research in areas of interest to the
FOSSology project. FOSSology is an open source project dedicated to
analysis and mining of open source data and using the results for the
betterment of the free and open source software (FOSS) and FOSS
management.

Interesting areas include but are not limited to:

- vulnerability analysis
- vulnerability tracking
- dependency analysis
- code reuse detection

Currently, the FOSSology project concentrates on software license
detection, but would like to expand into other areas. Feel free to
suggest your own area or propose something around the above topics

Grant amounts are in the range of $5,000 - $20,000 USD. The number of
grants awarded is dependent on the number and quality of proposals
received.

Proposals should be short (1-2 pages).

Grant recipients will be expected to communicate with other FOSSology
developers through the public mailing list and/or IRC channel. If

Feb and March data released

Sorry for the delay on February, but as a bonus I'll throw in the March data too!

Go to our Google Code downloads page and have fun.

Teragrid updated with Dec data

I forgot to mention this here, but I finished uploading the latest complete data sources to the Teragrid, so those of you who have requested database access should be able to play with that data there. The new data sources are:

150-sf
151-fm
152-rf
153-ow
154-fsf

Have fun! Let me know if there are problems or things that are just not quite right.

Dec data for SF released

SourceForge data is available for December. Get it now at the FLOSSmole Google Code Project Page.

Update (Jan/12/2009): Also datamarts from December are posted. Same place.

NOTE: The stats data looks somewhat strange to me. I suspect that the stats server was having some issues around the first week of December, so be careful of that data. I parsed the data as I found it, but I think it was not in great shape on the server when we collected it.

minor forges ready for December, inc. FSF

The December "minor" forges are ready for download from the FLOSSmole Google Code Project Page: FM, RF, OW. Enjoy!

Update: I've also released Free Software Foundation (FSF) for October and December. This parser has been a long time coming. But better late than never.

Problem with Public Areas file

We recently had an error in the SourceForge Public Areas data dumps for October.

New files are available for you to download. They are marked "2".

October developers

Better late than never on the October Developers and Project Developers for SourceForge:

Download these developer files here

Sorry about the .gz format instead of the .bz - for some reason I have problems using the Google Code auto-uploader with .bz files. It's a very strange problem, random and intermittent, hard to pin down why some .bz files will go and some will not. Anyway, I gzipped them and we are good to go.
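If your own scripts expect the usual bzip2 files, a small helper that picks the decompressor from the file extension saves some fiddling. This is only a minimal sketch: the function name open_dump is hypothetical, and it assumes the dumps are UTF-8 text, which the post does not state.

```python
import bz2
import gzip


def open_dump(path):
    """Open a data dump as text, choosing the codec from the extension.

    Assumption: the dump is UTF-8 text. Change the encoding if yours differs.
    """
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    if path.endswith((".bz", ".bz2")):
        return bz2.open(path, "rt", encoding="utf-8")
    # Fall back to a plain, uncompressed text file.
    return open(path, "r", encoding="utf-8")
```

Use it like a normal file: `with open_dump(name) as f:` and then loop over the lines.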

October data

The October data are available at the following locations:

Google Code Downloads Page

Notes:
  • SF Stats are unavailable this month because the server was not reliably reporting stats when we did our collection.
  • There were a few errors in the file uploads. These files can't be taken off of Google Code, so I've just marked the files "DO NOT DOWNLOAD". You can usually tell these files by their very tiny filesize.
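If you have already pulled down a batch of files, a quick size check can flag the broken uploads. This is a rough sketch only: the function name suspicious_files and the 1024-byte cutoff are assumptions, so pick a threshold that makes sense for the files you actually downloaded.

```python
from pathlib import Path


def suspicious_files(directory, min_bytes=1024):
    """Return files in directory smaller than min_bytes, smallest first.

    Assumption: a failed upload shows up as a very small file. The default
    cutoff is arbitrary and not taken from the post.
    """
    small = [
        p for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_size < min_bytes
    ]
    return sorted(small, key=lambda p: p.stat().st_size)
```

Anything it returns is worth comparing against the "DO NOT DOWNLOAD" labels before you load it.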
