FLOSSmole mentioned in Information Week

FLOSSmole gets mentioned in a nice little article in Information Week, a US trade magazine. The article is about how to tell open source "winners" from "losers". With a proper shout-out to Open BRR and the Business Readiness Rating, which is what the article is really about, here is the excerpt:

How To Tell The Open Source Winners From The Losers
By Charles Babcock
InformationWeek
Feb 3, 2007

(this excerpt is from page 2):

database schema

You can browse the SQL version of the FLOSSmole database schema in CVS on Sourceforge, or you can check the schemaspy-generated view of the schema. The CVS version is updated less often than the schemaspy version. I'll try to run a schemaspy view every few months and overwrite the old one. The query tool also lists all the tables, so if you see a table in there and want to know what it does, run "describe <tablename>" or just check out the schema!

FLOSSmole now includes FSF data

FLOSSmole now includes data from the FSF (Free Software Foundation) directory (original directory link).

The flat files containing the data can be found on our FSF sourceforge file release page.

Some facts of note:
--the FSF directory contains 5226 projects
--the FSF directory allows projects whose names differ only in case, e.g. ANT and ant are considered different projects
--our datasource_id for this initial run is "45" if anyone is checking the query tool
--FSF is forge #6 in our FLOSSmole system
--our FSF tables are all prefixed with "fsf" (e.g. "fsf_projects").
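
Because FSF project names can differ only in case, anything that matches projects by name across forges should compare names exactly, or at least flag the collisions first. Here is a minimal sketch of such a check in Python. The function name and the sample list are made up for illustration; only "ANT" and "ant" come from the directory itself.

```python
from collections import defaultdict


def names_differing_only_by_case(names):
    """Group names by their case-folded form and keep groups with 2+ spellings."""
    groups = defaultdict(set)
    for name in names:
        groups[name.casefold()].add(name)
    # Only report keys where more than one distinct spelling exists.
    return {key: sorted(variants)
            for key, variants in groups.items()
            if len(variants) > 1}


# Hypothetical sample; a real run would read names from the fsf_projects flat file.
sample = ["ANT", "ant", "Emacs", "gcc"]
collisions = names_differing_only_by_case(sample)
print(collisions)  # {'ant': ['ANT', 'ant']}
```

If the result is non-empty, a case-insensitive comparison would silently merge those projects.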

January FM, RF, OW files released

We've released the January 2007 files for Freshmeat, Objectweb, and Rubyforge.

In February, look forward to the next Sourceforge release, as well as a new feature: "All Time Stats" for Sourceforge!

Get the latest files here

One more newish thing... if you're on the mailing list, this is old news... but for you data junkies, you can now play around with the content that schemaspy generated about our data model. If we like this format, I'll generate these bi-monthly along with the SF data and post them on the sidebar permanently.

As always, if you feel like helping out the team with coding or documentation, send me an email and we'll chat. (mconklin at elon dot edu)

December data released, some updates

December data has been released. But first, some informational updates.

1. Project_indexes (the table where Sourceforge web pages are stored) is now set to use the UTF-8 encoding. This addresses the concerns about character sets and corruption in our storage of Sourceforge project home pages.

2. The "forges" table is now in use. We have 5 forges, as follows:
0 - TE - test
1 - SF - Sourceforge
2 - FM - Freshmeat
3 - RF - Rubyforge
4 - OW - Objectweb

The datasources table now shows which forge the datasource is pulled from. The join column between the "forges" and "datasources" tables is "forge_id".

3. December data has been released for Sourceforge (SF), Objectweb (OW), Rubyforge (RF), and Freshmeat (FM). Get the files on Sourceforge as follows:

-- All projects, SF, FM, OW, RF link
-- Sourceforge, December link (Datasource_id = 38)
-- Freshmeat, December link (Datasource_id = 41)
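
For anyone using the query tool, the forges/datasources join from item 2 might look something like the sketch below. It uses Python's built-in sqlite3 module as a stand-in so it runs anywhere. Only "forge_id", the forge numbers and abbreviations, and datasource ids 38 and 41 come from this post. The other column names (forge_abbr, forge_name, datasource_id) are assumptions, so check the schema for the real ones.

```python
import sqlite3

# In-memory stand-in database; the real tables live in the FLOSSmole database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forges (
        forge_id   INTEGER PRIMARY KEY,
        forge_abbr TEXT,   -- assumed column name
        forge_name TEXT    -- assumed column name
    );
    CREATE TABLE datasources (
        datasource_id INTEGER PRIMARY KEY,  -- assumed column name
        forge_id      INTEGER               -- join column named in the post
    );
""")
conn.executemany(
    "INSERT INTO forges VALUES (?, ?, ?)",
    [(0, "TE", "test"), (1, "SF", "Sourceforge"), (2, "FM", "Freshmeat"),
     (3, "RF", "Rubyforge"), (4, "OW", "Objectweb")],
)
# The two December datasources mentioned above.
conn.executemany("INSERT INTO datasources VALUES (?, ?)", [(38, 1), (41, 2)])

# Which forge does each datasource come from?
rows = conn.execute("""
    SELECT d.datasource_id, f.forge_abbr, f.forge_name
    FROM datasources d
    JOIN forges f ON f.forge_id = d.forge_id
    ORDER BY d.datasource_id
""").fetchall()

for row in rows:
    print(row)
```

The SELECT statement is the part that matters; everything before it just builds a toy copy of the two tables.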

added "home pages"

There are 2 types of "home pages" for projects on Sourceforge:

1. A project's summary page. This is not a real home page, but sometimes people call it one. It lives on the SF servers and it has the URL format http://sf.net/projects/projectname, where "projectname" is replaced by the actual name of the project. In our system, we call this address "url", and it's located in the projects table.

2. A project's "real" home page. A request came in this week for us to parse these out. There are 2 types of real home pages:
a. Homepages that live on the shell.sf.net servers and give a project a URL like this: http://projectname.sf.net
b. Homepages that live on some other server and are not hosted by SF in any way.

I wrote a parser for these "real urls" today and created a new column in the projects table called "real_url" to hold this data. I then released files in the sfRawData package for August 2006 and October 2006 showing these "real urls". Remember that real urls are reported by the project administrators. For the vast majority of projects, the URL is of type "a" above. But for the projects whose URL is of type "b", this may help in tracking down additional info about them.
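
If you want to separate type "a" from type "b" yourself, here is a rough sketch in Python. This is not the parser described above, just one way to sort values from the real_url column by hostname. The function name and the "summary" label are made up, and treating sourceforge.net the same as sf.net is an assumption.

```python
from urllib.parse import urlparse

# sf.net appears in the post; sourceforge.net is included as an assumption.
SF_DOMAINS = ("sf.net", "sourceforge.net")


def classify_real_url(real_url):
    """Return 'a', 'b', 'summary', or None for a project URL."""
    host = (urlparse(real_url).hostname or "").lower()
    if not host:
        return None  # empty value, or no scheme such as http://
    for domain in SF_DOMAINS:
        if host == domain or host == "www." + domain:
            return "summary"  # looks like the SF summary page, not a real home page
        if host.endswith("." + domain):
            return "a"  # projectname.sf.net style, hosted by SF
    return "b"  # hosted somewhere else entirely


print(classify_real_url("http://projectname.sf.net"))           # a
print(classify_real_url("http://www.example.org/project/"))     # b
print(classify_real_url("http://sf.net/projects/projectname"))  # summary
```

Since administrators type these URLs in by hand, expect some values that don't parse cleanly; the sketch returns None for those rather than guessing.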

October SF, FM, OW, RF data released

It's my Fall Break, so you know that means the October releases are finally here! (This includes Sourceforge releases, yay.)

Get the text files here.

1- FRESHMEAT fmProjectInfo (fmProjectInfo2006-Oct)
2- RUBYFORGE rfRawData (rfRawData2006-Oct)
3- SOURCEFORGE sfProjectInfo (sfProjectInfo01-oct-2006); sfRawData (sfRawData01-Oct-2006); sfRawDeveloperData (sfRawDeveloperData01-Oct-2006)
4- OBJECTWEB owRawData (osRawData2006-Oct)

Check the release notes for more guidance on what is inside each file. Also the "how to use this data" posting might be helpful. (Even though the original date on this posting is "April 2005", I continue to update it, almost like a mini-FAQ.)

September RF, FM, OW data released

The September 2006 data was released today for:

--Freshmeat
--RubyForge
--ObjectWeb

You can pick up those datafiles here on our FLOSSmole Files Page on Sourceforge.

Enjoy!

August Data released

Go to our file release page on Sourceforge to get the latest files for August.

What's included here:

1- FRESHMEAT fmProjectInfo (fmProjectInfo2006-Aug)
2- RUBYFORGE rfRawData (rfRawData2006-Aug)
3- SOURCEFORGE sfProjectInfo (sfProjectInfo01-Aug-2006); sfRawData (sfRawData01-Aug-2006); sfRawDeveloperData (sfRawDeveloperData01-Aug-2006)
4- OBJECTWEB owRawData (osRawData2006-Aug)

Check the release notes for more guidance on what is inside each file. Also the "how to use this data" posting might be helpful. (Even though the original date on this posting is "April 2005", I continue to update it, almost like a mini-FAQ.)

Enjoy!

rubyforge data released

Hello moles, and happy summer! I've just released Rubyforge data from July 2006.

Now granted, Rubyforge is not as large as Sourceforge. But it has considerable "buzz", for what that's worth. And with Ruby being a relatively new language and Rubyforge a relatively new forge, I figure it's worth watching, especially considering how easy it is to collect their data! (They put out an XML file with a bit of the data in it, and with only 1700 or so projects, it's much easier to scrape the rest than on CERTAIN OTHER FORGES. Thank you for that, Rubyforge!)

Rubyforge files available here:
https://sourceforge.net/project/showfiles.php?group_id=119453&package_id...

Unfortunately, even though RF is using the same software as SF, they don't have a donation system (so no donor files), and they don't have a statistics engine like SF. So the statistics are a little weak.

One other note: like Freshmeat, Rubyforge will be scraped MONTHLY. Sourceforge will continue to be scraped BI-MONTHLY (every other month). This is due to the size and complexity of the SF scrape.

ObjectWeb coming next.