February Data Released for All Forges

We've been digging as usual, and we now announce that February data is released for all 5(!) forges. This includes:

forge (abbreviation) - datasource_id
Sourceforge (SF) - 46
Freshmeat (FM) - 47
Rubyforge (RF) - 48
Objectweb (OW) - 49
Free Software Foundation (FSF) - 50

Get the files at our Sourceforge Project Page

FLOSSmole mentioned in Information Week

FLOSSmole gets a mention in a nice little article in Information Week, a US trade magazine. The article is about how to tell open source "winners and losers" apart. It is really more about Open BRR and the Business Readiness Rating, which deserve a proper shout-out, but here is the excerpt:

How To Tell The Open Source Winners From The Losers
By Charles Babcock
Feb 3, 2007

(this excerpt is from page 2):

database schema

You can browse the SQL version of the FLOSSmole database schema in CVS on Sourceforge, or you can check the schemaspy-generated view of the schema. The CVS version is updated less often than the schemaspy version. I'll try to generate a new schemaspy view every few months and overwrite the old one.

FLOSSmole now includes FSF data

FLOSSmole now includes data from the FSF (Free Software Foundation) directory (original directory link).

The flat files containing the data can be found on our FSF Sourceforge file release page.

Some facts of note:
--the FSF directory contains 5226 projects
--the FSF directory allows projects whose names differ only by letter case, e.g. ANT and ant are considered different projects
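
If you load the FSF data into a tool that compares names without regard to case, names like ANT and ant can collide. Here is a minimal sketch of one way to spot such names before loading. The function name and the sample list are made up for illustration and are not part of the FLOSSmole files.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Group project names that differ only by letter case.

    Hypothetical helper: returns a dict mapping the case-folded name
    to the sorted list of distinct spellings, for collisions only.
    """
    groups = defaultdict(set)
    for name in names:
        groups[name.casefold()].add(name)
    # Keep only the groups that contain more than one distinct spelling.
    return {key: sorted(spellings)
            for key, spellings in groups.items()
            if len(spellings) > 1}


# Illustrative sample only; real names would come from the FSF flat files.
sample_names = ["ANT", "ant", "gcc", "Emacs"]
print(find_case_collisions(sample_names))
```

Anything this reports needs a case-sensitive key (or a case-sensitive collation) to stay distinct in your own database.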

January FM, RF, OW files released

We've released the January 2007 files for Freshmeat, Objectweb, and Rubyforge.

In February, look forward to the next Sourceforge release, as well as a new feature: "All Time Stats" for Sourceforge!

Get the latest files here

December data released, some updates

December data has been released. But first, some informational updates.

1. Project_indexes (the table where Sourceforge web pages are stored) now uses the UTF-8 encoding. This addresses the concerns about character sets and corruption in our storage of Sourceforge project home pages.

2. The "forges" table is now in use. We have 5 forges, as follows:
0 - TE - test
1 - SF - Sourceforge
2 - FM - Freshmeat
3 - RF - Rubyforge
4 - OW - Objectweb

The datasources table now shows which forge the datasource is pulled from. The join column between the "forges" and "datasources" tables is "forge_id".

3. December data has been released for Sourceforge (SF), Objectweb (OW), Rubyforge (RF), and Freshmeat (FM). Get the files on Sourceforge as follows:

-- All projects, SF, FM, OW, RF link
-- Sourceforge, December link (Datasource_id = 38)
-- Freshmeat, December link (Datasource_id = 41)
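
For anyone loading these tables locally, the join between "forges" and "datasources" described in item 2 might look roughly like the sketch below. It uses an in-memory SQLite database as a stand-in. Only the table names, the "forge_id" join column, the forge list, and datasource ids 38 and 41 come from this post; the other column names (forge_abbr, forge_name) are assumptions, so check the real schema before relying on them.

```python
import sqlite3

# In-memory stand-in for the FLOSSmole database (assumption: the real
# tables have more columns, and their names may differ from these).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forges (
        forge_id   INTEGER PRIMARY KEY,
        forge_abbr TEXT,   -- hypothetical column name
        forge_name TEXT    -- hypothetical column name
    );
    CREATE TABLE datasources (
        datasource_id INTEGER PRIMARY KEY,
        forge_id      INTEGER
    );
""")

# Forge ids and abbreviations as listed in item 2.
conn.executemany(
    "INSERT INTO forges VALUES (?, ?, ?)",
    [(0, "TE", "test"), (1, "SF", "Sourceforge"), (2, "FM", "Freshmeat"),
     (3, "RF", "Rubyforge"), (4, "OW", "Objectweb")],
)
# The two December datasource ids mentioned in item 3.
conn.executemany("INSERT INTO datasources VALUES (?, ?)", [(38, 1), (41, 2)])

# Look up which forge each datasource was pulled from.
rows = conn.execute("""
    SELECT d.datasource_id, f.forge_abbr, f.forge_name
    FROM datasources AS d
    JOIN forges AS f ON f.forge_id = d.forge_id
    ORDER BY d.datasource_id
""").fetchall()

for datasource_id, abbr, name in rows:
    print(datasource_id, abbr, name)
```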

added "home pages"

There are 2 types of "home pages" for projects on Sourceforge:

1. A project's summary page. This is not a real home page, but sometimes people call it one. It lives on the SF servers and it has the URL format, where "projectname" is replaced by the actual name of the project. In our system, we call this address "url", and it's located in the projects table.

2. A project's "real" home page. A request came in this week for us to parse these out. There are 2 types of real home pages:
a. Homepages that live on the SF servers and give a project a URL like this:
b. Homepages that live on some other server and are not hosted by SF in any way.

I wrote a parser for these "real urls" today and created a new column in the projects table called "real_url" to hold this data. I then released files in the sfRawData package for August 2006 and October 2006 showing these "real urls". Remember that real urls are reported by the project administrators. For the vast majority of projects, the URL is of type "a" above. But for the projects whose home page is of type "b", this column may help in tracking down additional info about them.
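
If you want to split the "real_url" values into type "a" and type "b" yourself, something like the sketch below might do. It is not the parser mentioned above. The function name is made up, and the default domain "sourceforge.net" is my assumption about what counts as an SF-hosted address; pass a different value if that does not match your data.

```python
from urllib.parse import urlparse


def classify_real_url(real_url, sf_domain="sourceforge.net"):
    """Classify a real_url value as type "a", type "b", or "unknown".

    Hypothetical helper. sf_domain is an assumption, not a value
    taken from the FLOSSmole parser.
    """
    host = (urlparse(real_url).hostname or "").lower()
    if not host:
        # No scheme or no host, so there is nothing to classify.
        return "unknown"
    if host == sf_domain or host.endswith("." + sf_domain):
        return "a"  # hosted under the assumed SF domain
    return "b"      # hosted on some other server


# Illustrative values only, not real projects.
for url in ["http://myproject.sourceforge.net/", "http://www.example.org/", ""]:
    print(repr(url), classify_real_url(url))
```

Since these URLs are reported by project administrators, expect some malformed values to end up in the "unknown" bucket.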

October SF, FM, OW, RF data released

It's my Fall Break, so you know that means the October releases are finally here! (This includes Sourceforge releases, yay.)

Get the text files here.

1- FRESHMEAT fmProjectInfo (fmProjectInfo2006-Oct)
2- RUBYFORGE rfRawData (rfRawData2006-Oct)
3- SOURCEFORGE sfProjectInfo (sfProjectInfo01-oct-2006); sfRawData (sfRawData01-Oct-2006); sfRawDeveloperData (sfRawDeveloperData01-Oct-2006)
4- OBJECTWEB owRawData (osRawData2006-Oct)

September RF, FM, OW data released

The September 2006 data was released today for Rubyforge (RF), Freshmeat (FM), and Objectweb (OW).


You can pick up those datafiles here on our FLOSSmole Files Page on Sourceforge.


August Data released

Go to our file release page on Sourceforge to get the latest files for August.

What's included here:

1- FRESHMEAT fmProjectInfo (fmProjectInfo2006-Aug)
2- RUBYFORGE rfRawData (rfRawData2006-Aug)
3- SOURCEFORGE sfProjectInfo (sfProjectInfo01-Aug-2006); sfRawData (sfRawData01-Aug-2006); sfRawDeveloperData (sfRawDeveloperData01-Aug-2006)
4- OBJECTWEB owRawData (osRawData2006-Aug)