Collection information

Details about the repository collections

July 2009 data

Hello moles, our July 2009 data has been released. This month we have ObjectWeb, Freshmeat, Rubyforge, and the Free Software Foundation directory.

Go to our Google Code pages to download the data.

The most recent datasource_ids are:

Data Resources: 

SourceKibitzer Collections

SourceKibitzer, now defunct, was an initiative to collect metrics about the performance of various open source software products. (Here is a Wikipedia article about SourceKibitzer.)

SourceKibitzer sent FLOSSmole its data on a regular basis from February 2007 through September 2007. We dutifully stored this data, and it is available to researchers who are interested in the SK metrics from this time period. The datasource_ids are as follows:

  • 51: 2007-Feb SourceKibitzer
  • 56: 2007-Mar SourceKibitzer
  • 62: 2007-Apr SK
  • 67: 2007-May SK
  • 73: 2007-Jun SK
  • 79: 2007-Jul SK
  • 85: 2007-Aug SK
  • 91: 2007-Sep SK

Data explanation
Here are the metrics provided for 500-odd projects by SourceKibitzer:

  • project name
  • density of comments (DC): ratio of the sum of the comment lines to the sum of all lines in all source files of the package. Indicates how much of the code is commented.
  • todo count (TODO_COUNT): number of TODO comment lines, summed over all source files of the package. The following patterns are recognized as TODO comments: FIX-ME, FIXME, FIX-IT, FIXIT, TO-DO, TODO, XXX, TBD.
  • commented lines of code (CLOC): number of lines that contain comments.
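
To make the three metric definitions above concrete, here is a minimal sketch of how they could be computed. It is not SourceKibitzer's implementation, which is not documented here. The function name comment_metrics is hypothetical, and the sketch assumes that a comment is anything from "//" to the end of a line and that TODO patterns are matched case-sensitively.

```python
import re

# The TODO markers listed in the text. Case-sensitive matching is an
# assumption; the original tool's exact rules are not documented here.
TODO_PATTERN = re.compile(r"FIX-ME|FIXME|FIX-IT|FIXIT|TO-DO|TODO|XXX|TBD")


def comment_metrics(source_files):
    """Compute DC, TODO_COUNT and CLOC for one package (hypothetical helper).

    source_files: iterable of strings, each the full text of one source file.
    Assumption: a comment is anything from '//' to the end of the line.
    """
    total_lines = 0
    cloc = 0        # lines that contain a comment
    todo_count = 0  # comment lines that contain a TODO marker
    for text in source_files:
        for line in text.splitlines():
            total_lines += 1
            _, marker, comment = line.partition("//")
            if not marker:
                continue  # no comment on this line
            cloc += 1
            if TODO_PATTERN.search(comment):
                todo_count += 1
    # DC is the ratio of comment lines to all lines; 0.0 for an empty package.
    dc = cloc / total_lines if total_lines else 0.0
    return {"DC": dc, "TODO_COUNT": todo_count, "CLOC": cloc}
```

Because the sketch only looks for "//", it miscounts block comments and any "//" that appears inside a string literal (a URL, for example). A real tool would need a language-aware tokenizer.
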

Data Resources:

Free Software Foundation Collections

The Free Software Foundation Directory lists open source projects that run under "free" systems, particularly GNU and GNU/Linux variants.

Every month we collect the available project-level metadata from the Free Software Foundation's directory and load it into our database. We then parse those HTML pages and extract interesting data elements. After parsing, we save each piece of data in our database as well. We then provide this data back to researchers to do with as they wish.
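
The collect, store, parse, store cycle described above can be sketched roughly as follows. This is an illustration only, not FLOSSmole's actual code. The table names (raw_pages, parsed_projects), the helper names, and the choice of the page title as the extracted element are all assumptions, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self.title = (self.title or "") + data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def make_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_pages (datasource_id INTEGER, project TEXT, html TEXT)"
    )
    conn.execute(
        "CREATE TABLE parsed_projects (datasource_id INTEGER, project TEXT, title TEXT)"
    )
    return conn


def store_and_parse(conn, datasource_id, project, html):
    """Save the raw page first, then save the element parsed out of it."""
    conn.execute(
        "INSERT INTO raw_pages (datasource_id, project, html) VALUES (?, ?, ?)",
        (datasource_id, project, html),
    )
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = parser.title.strip() if parser.title else None
    conn.execute(
        "INSERT INTO parsed_projects (datasource_id, project, title) VALUES (?, ?, ?)",
        (datasource_id, project, title),
    )
    return title
```

Keeping the raw page next to the parsed values means a page can be re-parsed later if the extraction rules change, without collecting it again.
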

Project Items:

Data Resources: 

ObjectWeb Collections

OW2 Forge (formerly ObjectWeb forge) is used as a repository for projects created by developers of open source middleware in the OW2 community.

Data Resources: 

Rubyforge Collections

Rubyforge is a repository designed to support the open source development community working in the Ruby programming language.

Every month (or so) we collect the Rubyforge list of projects and some basic developer information. We insert this data into our database, then parse out various interesting data elements and store those in the database as well. We then provide this data back to you in several formats for you to do with as you please.

Project Items:

Data Resources: 

Freecode (Freshmeat) Collections

Freecode (formerly Freshmeat) is a directory of open source projects.

Every month we download Freecode's own RDF file of information about projects listed on that directory, parse the information, and load it into our database. We then provide that data freely back to you to do with as you wish.
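
As a rough illustration of the parsing step, the sketch below reads an XML project listing and turns each project into a dictionary ready to be loaded into a database. The element names in SAMPLE (projects, project, name, description) and the function name parse_projects are made up for this example. They are not the real layout of Freecode's RDF file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real file's element names are not shown here.
SAMPLE = """<projects>
  <project>
    <name>demo</name>
    <description>A demo project</description>
  </project>
</projects>"""


def parse_projects(xml_text):
    """Return one dict per <project>, mapping child tag to its stripped text."""
    root = ET.fromstring(xml_text)
    rows = []
    for project in root.iter("project"):
        rows.append({child.tag: (child.text or "").strip() for child in project})
    return rows


rows = parse_projects(SAMPLE)
```

Each dictionary in rows maps one child element to its text, so it can be handed directly to a parameterized INSERT statement.
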

Project Metadata:

Data Resources: 

Sourceforge Collections

Sourceforge is a large repository of open source software development projects.

From 2004 to 2009, approximately six times per year (every other month), FLOSSmole collected, parsed, and stored metadata about each of the projects on Sourceforge.

However, as of 2009, FLOSSmole can no longer support this effort. Instead, we recommend that researchers use the SRDA repository of SF data hosted at Notre Dame.

Project-level metadata that we collected

Data Resources: