megan's blog

Announcement: data marts now available

By popular demand of our user base (and some hard work by our developers, especially ruphus_13), we now provide data marts for Sourceforge data.

The new package, called DataMarts, contains all the SQL create and insert statements for creating your own version of the FLOSSmole database, covering multiple data sources (Sourceforge, Freshmeat, Rubyforge, ObjectWeb, FSF).

The marts are created after each of our data collections. We collect and parse the data as usual, load it into our database, and create the raw flat-file data dumps as we have been doing since 2004. The new feature we are announcing today is that we now also provide SQL data dumps, so you can auto-load our data into your own local database for easier processing and more complex mining tasks.

So, there are now numerous ways to get our data:

--install the data marts into your own MySQL database
--download and analyze the flat, delimited data files
--play around with the query tool
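
For the flat-file route, here is a minimal sketch of reading one of the delimited files in Python. It assumes a tab delimiter and a header line, and the column names (proj_unixname, dev_count) and sample rows are made up for illustration; check the actual files for their real delimiter and columns.

```python
import csv
import io

def read_flat_file(handle, delimiter="\t"):
    """Yield one dict per data row, keyed by the file's header line."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        yield row

# Hypothetical sample standing in for a downloaded file;
# the column names and values are assumptions, not the real layout.
sample = io.StringIO(
    "proj_unixname\tdev_count\n"
    "alpha\t3\n"
    "beta\t12\n"
)

rows = list(read_flat_file(sample))

# Example analysis: projects with more than 5 developers.
big_projects = [r["proj_unixname"] for r in rows if int(r["dev_count"]) > 5]
print(big_projects)  # ['beta']
```

To use it on a real download, pass an open file handle instead of the in-memory sample and adjust the delimiter argument if the file is not tab-separated.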

April 2007 data released for all forges

April 2007 data is released for all forges. Here is a summary of the data we have and where to get it:

  • Sourceforge data
    • General Forge Information (Get it)
      • Project code names, project display names, developer counts, date project was registered, long project descriptions

    • Developer Information (Get it)
      • Developer login names, real names, developers per project, the role each developer has on a project, and whether they are an admin

    • Data about Projects (Get it)
      • Database type by project, number of downloads per project, rank of project, intended audience, topic of project, status of project, license(s), operating system(s), programming language(s), real URL of project, tracker data, donors to projects, user interfaces


  • Freshmeat data (Get it)

New Query Tool

Check out the New Query Tool for running common, pre-defined queries. (Thanks, Gregg!)

The old query tool is still available. We'll be adding real-time graphing and some more bells and whistles to the new tool as time allows.

March data released

March data is released for the following sources (forges/directories/repositories):

--Freshmeat
--Rubyforge
--ObjectWeb
--Free Software Foundation
--SourceKibitzer

Get the data from our Sourceforge file release page

Enjoy!

(The April release will include Sourceforge and the other 5 forges.)

new donations from SourceKibitzer

Great news, moles: we have a new donation partner, SourceKibitzer.

The facts:
--In our system, SourceKibitzer is forge #6, and has the abbreviation "SK".
--SK will be part of the monthly data cycle, so expect new SK files once per month (just like Freshmeat, Rubyforge, ObjectWeb, and the Free Software Foundation).
--SK files are available on our file releases page on Sourceforge.

February Data Released for All Forges

Hi moles!

We've been digging as usual, and we now announce that February data is released for all 5(!) forges. This includes:

forge (abbreviation) - datasource_id
=====================================
Sourceforge (SF) - 46
Freshmeat (FM) - 47
Rubyforge (RF) - 48
ObjectWeb (OW) - 49
Free Software Foundation (FSF) - 50
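
If you script against these files, the table above can be kept as a small lookup. This is only a sketch: the function names are made up for illustration, and the ids are the ones listed for this February release, so don't assume they carry over to other months.

```python
# datasource_id values for the February release, taken from the table above.
# Do not assume the same ids apply to other releases.
FEBRUARY_DATASOURCES = {
    "SF": ("Sourceforge", 46),
    "FM": ("Freshmeat", 47),
    "RF": ("Rubyforge", 48),
    "OW": ("ObjectWeb", 49),
    "FSF": ("Free Software Foundation", 50),
}

def datasource_id(abbreviation):
    """Return the datasource_id for a forge abbreviation such as 'SF'."""
    _name, ds_id = FEBRUARY_DATASOURCES[abbreviation.upper()]
    return ds_id

def forge_name(ds_id):
    """Return the forge name for a datasource_id, or None if it is not listed."""
    for name, known_id in FEBRUARY_DATASOURCES.values():
        if known_id == ds_id:
            return name
    return None

print(datasource_id("fm"))  # 47
print(forge_name(50))       # Free Software Foundation
```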

Get the files at our Sourceforge Project Page

FLOSSmole mentioned in Information Week

FLOSSmole gets a mention in a nice little article in Information Week, a US trade magazine. The article is about how to differentiate open source "winners and losers". It is really more about Open BRR and the Business Readiness Rating, which get a proper shout-out, but here is the excerpt:

How To Tell The Open Source Winners From The Losers
By Charles Babcock
InformationWeek
Feb 3, 2007

(this excerpt is from page 2):

database schema

You can browse the SQL version of the FLOSSmole database schema in CVS on Sourceforge, or you can check the SchemaSpy-generated view of the schema. The CVS version is updated less often than the SchemaSpy version. I'll try to run a SchemaSpy view every few months and overwrite the old one.

FLOSSmole now includes FSF data

FLOSSmole now includes data from the FSF (Free Software Foundation) directory (original directory link).

The flat files containing the data can be found on our FSF Sourceforge file release page.

Some facts of note:
--the FSF directory contains 5226 projects
--the FSF directory allows project names that differ only in case, e.g. ANT and ant are considered different projects
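
The case issue matters if you load the names into a tool that compares strings case-insensitively (some database collations do), because ANT and ant would then look like one project. A minimal sketch for spotting such names before you load them; the function name and the sample names other than ANT and ant are made up for illustration.

```python
from collections import defaultdict

def case_collisions(names):
    """Group project names that differ only in letter case."""
    groups = defaultdict(set)
    for name in names:
        groups[name.lower()].add(name)
    # Keep only the groups with more than one distinct spelling.
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }

# Illustrative sample; only ANT and ant come from the FSF note above.
names = ["ANT", "ant", "Emacs", "gcc"]
print(case_collisions(names))  # {'ant': ['ANT', 'ant']}
```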

January FM, RF, OW files released

We've released the January 2007 files for Freshmeat, ObjectWeb, and Rubyforge.

In February, look forward to the next Sourceforge release, as well as a new feature: "All Time Stats" for Sourceforge!

Get the latest files here
