Getting the Data
FLOSSmole collects data from numerous open source software development forges, and we also accept data donations.
Organization of the Data
Each forge we collect from is given a "forge id", and each collection from a forge (or each donation) is given a number, called a "data source id". The collected data is stored in our database, parsed, and re-released for researchers to use as they wish. We publish schemas for our database to help you understand how it is organized, along with details about each collection.
Getting the FLOSSmole Data
There are three ways to access the FLOSSmole data, which are described below.
- Flat delimited files
  Download these from our page at Google Code. The files are named with the abbreviation for their forge or source (e.g. Sourceforge is "SF"), a brief description of the parsed data you are getting, and the date of the collection. Example: sfRawPublicAreas2009-Jun.txt.bz2 (These files are in bzip2 format, which any standard decompression utility should be able to open.)
- SQL files (MySQL creates and inserts)
  Download these from our page at Google Code. These files all start with the word "datamart", then the forge abbreviation, then the date. Example: datamart_sf_stats.2009-Jun.sql.bz2 Refer to the schema descriptions as you download these files so you'll know what you're getting.
- Direct database access
  Request access by joining the FLOSSmole mailing list. More details are available on getting direct database access.
In addition, we have older data in flat files and SQL files that are still available from Sourceforge.
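If you would rather inspect a flat delimited file without unpacking it first, the following is a minimal sketch in Python using only the standard library. The function name `read_flat_file`, the tab delimiter, and the UTF-8 encoding are assumptions made for illustration; check the actual delimiter and encoding of the file you downloaded before relying on the output.

```python
import bz2
import csv


def read_flat_file(path, delimiter="\t", limit=5):
    """Return rows from a bzip2-compressed delimited text file.

    `delimiter` and the UTF-8 encoding are assumptions, not documented
    properties of the FLOSSmole files. Pass limit=None to read every row.
    """
    rows = []
    # bz2.open in text mode decompresses the file as it is read,
    # so the archive never has to be unpacked on disk.
    with bz2.open(path, mode="rt", encoding="utf-8",
                  errors="replace", newline="") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            rows.append(row)
            if limit is not None and len(rows) >= limit:
                break
    return rows
```

For example, after downloading the file named above, `read_flat_file("sfRawPublicAreas2009-Jun.txt.bz2")` would return its first five rows as lists of strings, which is usually enough to confirm the column layout against the schema descriptions.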
