May 2007 data released for small forges

May 2007 data is released for the small forges. (Reminder: the next Sourceforge data release is scheduled for June.)

As usual, there are 3 ways to get FLOSSmole data:
(1) Flat files (includes May 2007 data, plus historical data if you wish)
(2) Get the data marts
(3) Browse results of common queries

Announcement: data marts now available

By popular demand from our user base (and thanks to some hard work by our developers, especially ruphus_13), we now provide data marts for Sourceforge data.

The new package, called DataMarts, contains all the SQL create and insert statements for creating your own version of the FLOSSmole database, for multiple data sources (Sourceforge, Freshmeat, Rubyforge, ObjectWeb, FSF).

The marts are created after each of our data collections. We collect and parse the data, load it into our database, and create the raw flat-file data dumps, as we have been doing since 2004. The new feature we are announcing today is that we now also provide SQL data dumps, so you can load our data into your own local database for easier processing and more complex mining tasks.

So, there are now three ways to get our data:

--install the data marts into your own MySQL database
--download and analyze the flat, delimited data files
--play around with the query tool
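
For the flat-file route, a few lines of Python are often enough to start exploring. The sketch below is only an illustration: the column names ("project", "dev_count") and the tab delimiter are assumptions rather than the actual layout of our files, so check the header of the file you download and adjust.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded flat file.
# The real column names and delimiter may differ.
SAMPLE = (
    "project\tdev_count\n"
    "alpha\t3\n"
    "beta\t12\n"
)


def read_flat_file(handle, delimiter="\t"):
    """Return the rows of a delimited flat file as a list of dicts."""
    return list(csv.DictReader(handle, delimiter=delimiter))


rows = read_flat_file(io.StringIO(SAMPLE))

# Example analysis: find the project with the most developers.
largest = max(rows, key=lambda row: int(row["dev_count"]))
print(largest["project"])
```

To read a real file, open it with open(path, newline="") and pass the handle to read_flat_file in place of the io.StringIO sample.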

April 2007 data released for all forges

April 2007 data is released for all forges. Here is a summary of the data we have and where to get it:

  • Sourceforge data
    • General Forge Information (Get it)
      • Project code names, project display names, developer counts, date project was registered, long project descriptions

    • Developer Information (Get it)
      • Developer login names, real names, developers per project, each developer's role on the project, and whether they are an admin

    • Data about Projects (Get it)
      • Database type by project, number of downloads per project, rank of project, intended audience, topic of project, status of project, license(s), operating system(s), programming language(s), real URL of project, tracker data, donors to projects, user interfaces


  • Freshmeat data (Get it)

New Query Tool

Check out the New Query Tool for running common, predefined queries. (Thanks, Gregg!)

The old query tool is still available. We'll be adding real-time graphing and some more bells and whistles to the new tool as time allows.

March data released

March data is released for the following sources (forges/directories/repositories):

--Freshmeat
--Rubyforge
--ObjectWeb
--Free Software Foundation
--SourceKibitzer

Get the data from our Sourceforge file release page.

Enjoy!

(The April release will include Sourceforge and the other 5 forges.)

New donations from SourceKibitzer

Great news, moles: we have a new donation partner, SourceKibitzer.

The facts:
--In our system, SourceKibitzer is forge #6, and has the abbreviation "SK".
--SK will be part of the monthly data cycle, so expect new SK files once per month (just like Freshmeat, Rubyforge, ObjectWeb, and the Free Software Foundation).
--SK files are available on our file releases page on Sourceforge.

The first file we released from SourceKibitzer is for February 2007. For each of 500-odd projects, it includes:

project name
density of comments
todo count
commented lines of code
total lines of code
non-comment lines of code
non-commenting source statements
number of methods
sum of data abstraction coupling
boolean expression complexity
fanout
npath complexity
weighted method count

Some very interesting stuff! Get the SourceKibitzer February data.

February Data Released for All Forges

Hi moles!

We've been digging as usual, and we now announce that February data is released for all 5(!) forges. This includes:

forge (abbreviation) - datasource_id
=====================================
Sourceforge (SF) - 46
Freshmeat (FM) - 47
Rubyforge (RF) - 48
ObjectWeb (OW) - 49
Free Software Foundation (FSF) - 50
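
If you are filtering our data by forge, the table above translates directly into a lookup. This is a minimal sketch: the datasource_id values are copied from the table, but the helper names and the sample rows are made up for illustration, and these ids apply to the February run only.

```python
# datasource_id values for the February release, copied from the table above.
FEBRUARY_DATASOURCES = {
    "SF": ("Sourceforge", 46),
    "FM": ("Freshmeat", 47),
    "RF": ("Rubyforge", 48),
    "OW": ("ObjectWeb", 49),
    "FSF": ("Free Software Foundation", 50),
}


def datasource_id(abbreviation):
    """Look up the February datasource_id for a forge abbreviation."""
    _name, ds_id = FEBRUARY_DATASOURCES[abbreviation.upper()]
    return ds_id


def rows_for_forge(rows, abbreviation):
    """Keep only the rows collected from the given forge."""
    wanted = datasource_id(abbreviation)
    return [row for row in rows if row["datasource_id"] == wanted]


# Hypothetical rows; real rows would come from the flat files or the database.
SAMPLE_ROWS = [
    {"project": "alpha", "datasource_id": 46},
    {"project": "beta", "datasource_id": 48},
    {"project": "gamma", "datasource_id": 48},
]

print([row["project"] for row in rows_for_forge(SAMPLE_ROWS, "RF")])
```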

Get the files at our Sourceforge project page.

FLOSSmole mentioned in InformationWeek

FLOSSmole is mentioned in a nice little article in InformationWeek, a US trade magazine. The article is about how to tell open source "winners" from "losers". With a proper shout-out to Open BRR and the Business Readiness Rating, which is what the article is really about, here is the excerpt:

How To Tell The Open Source Winners From The Losers
By Charles Babcock
InformationWeek
Feb 3, 2007

(this excerpt is from page 2):

Database schema

You can browse the SQL version of the FLOSSmole database schema in CVS on Sourceforge, or you can check the SchemaSpy-generated view of the schema. The CVS version is updated less often than the SchemaSpy version. I'll try to run SchemaSpy every few months and overwrite the old view. The query tool also lists all the tables, so if you see a table there and want to know what it does, run "describe <tablename>" or just check out the schema!

FLOSSmole now includes FSF data

FLOSSmole now includes data from the FSF (Free Software Foundation) directory (original directory link).

The flat files containing the data can be found on our FSF Sourceforge file release page.

Some facts of note:
--the FSF directory contains 5226 projects
--the FSF directory allows project names that differ only by case, e.g. ANT and ant are considered different projects
--our datasource_id for this initial run is "45" if anyone is checking the query tool
--FSF is forge #6 in our FLOSSmole system
--our FSF tables all carry the "fsf" prefix (e.g. "fsf_projects").
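
Because of the case-sensitivity point above, it is worth checking for near-duplicate names before joining FSF data against a case-insensitive source. Here is one way to sketch that check; the function name is hypothetical, and apart from ANT and ant the sample names are only placeholders.

```python
from collections import defaultdict


def case_collisions(names):
    """Group project names that are identical except for letter case."""
    groups = defaultdict(set)
    for name in names:
        groups[name.lower()].add(name)
    # Keep only the groups that contain more than one distinct spelling.
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


# ANT and ant are the example from the list above; the rest are placeholders.
sample_names = ["ANT", "ant", "Bison", "gawk"]
print(case_collisions(sample_names))  # {'ant': ['ANT', 'ant']}
```

Any name that shows up in the result needs an exact, case-sensitive match when you look it up.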