savannah

January 2012 releases

We're cruising ahead with January 2012 releases. Grab the data from the Google Code site or from Teragrid.

Freecode - done (formerly known as Freshmeat)
Savannah - done
Tigris - done
Rubyforge - done
Objectweb - done
Launchpad - done

Google Code - still running
Alioth - bug submitted #54
Github - will start as soon as Google is done

Free Software Foundation - bug still not fixed (this is my fault) #51

Interesting things: the most popular data from November was... drumroll please... Google Code and Github.

November 2011 data entered

Here is the status of the November 2011 collection:

done & ready to download on Google Code or query in Teragrid...
============
RUBYFORGE
OBJECTWEB
TIGRIS
LAUNCHPAD
SAVANNAH
ALIOTH
GITHUB

still collecting...
============
GOOGLE

collectors broken and waiting to be fixed...
============
FRESHMEAT (BUG # 43)
UDD (BUG # 50)
DEBIAN (BUG # 48)
FREE SOFTWARE FOUNDATION (BUG # 51)

How many projects of each team size are listed in Savannah?

Description

This chart shows the number of projects of each team size listed in Savannah.

Visualization

Projects listed as having 0 developers were disregarded (73 projects).

SQL Script

SELECT project_dev_count, COUNT( DISTINCT project_name ) AS count
FROM sv_projects
WHERE datasource_id = <current>
GROUP BY project_dev_count
ORDER BY count DESC , project_dev_count;
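
The note under the chart says projects with 0 developers were disregarded, but the query as written still returns a row for them. Here is a minimal sketch of one way to drop them in the query itself. It uses Python's sqlite3 as a stand-in for the MySQL database, passes the datasource_id as a parameter instead of <current>, and renames the count column to project_count. The sample rows, the datasource_id values 0 and 1, and the function names are all made up for illustration.

```python
import sqlite3

# Query from the post with two changes: datasource_id is a parameter
# (instead of <current>) and 0-developer projects are filtered out.
TEAM_SIZE_SQL = """
SELECT project_dev_count, COUNT(DISTINCT project_name) AS project_count
FROM sv_projects
WHERE datasource_id = ?
  AND project_dev_count > 0
GROUP BY project_dev_count
ORDER BY project_count DESC, project_dev_count
"""


def team_size_counts(conn, datasource_id):
    """Return (team_size, project_count) rows for one datasource."""
    return conn.execute(TEAM_SIZE_SQL, (datasource_id,)).fetchall()


def demo_team_size_db():
    """Build an in-memory table with invented rows (not real Savannah data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sv_projects ("
        "project_name TEXT, project_dev_count INTEGER, datasource_id INTEGER)"
    )
    rows = [
        ("alpha", 1, 1),
        ("beta", 1, 1),
        ("gamma", 3, 1),
        ("empty", 0, 1),  # dropped by the project_dev_count > 0 filter
        ("older", 1, 0),  # belongs to a different datasource
    ]
    conn.executemany("INSERT INTO sv_projects VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    for team_size, project_count in team_size_counts(demo_team_size_db(), 1):
        print(team_size, project_count)
```

Against the real database, the same WHERE clause should carry over to MySQL unchanged; only the parameter placeholder style depends on the client library you use.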

How has the use of "Free" and "Open" in project names grown by year?

Description

This chart shows, for each repository, the number of new projects per year (through 2010) whose names contain the word "Free" or the word "Open".

Visualization

SQL Script

Freshmeat:

SELECT YEAR( date_added ) , COUNT( DISTINCT project_id ) AS Count
FROM fm_projects
WHERE projectname_full LIKE "%free%"
AND datasource_id = <current>
GROUP BY YEAR( date_added )
ORDER BY YEAR( date_added );


SELECT YEAR( date_added ) , COUNT( DISTINCT project_id ) AS Count
FROM fm_projects
WHERE projectname_full LIKE "%open%"
AND datasource_id = <current>
GROUP BY YEAR( date_added )
ORDER BY YEAR( date_added );

Rubyforge:

SELECT YEAR( date_registered ) , COUNT( DISTINCT proj_unixname ) AS Count
FROM rf_projects
WHERE proj_unixname LIKE "%free%"
AND datasource_id = <current>
GROUP BY YEAR( date_registered )
ORDER BY YEAR( date_registered );


SELECT YEAR( date_registered ) , COUNT( DISTINCT proj_unixname ) AS Count
FROM rf_projects
WHERE proj_unixname LIKE "%open%"
AND datasource_id = <current>
GROUP BY YEAR( date_registered )
ORDER BY YEAR( date_registered );

Savannah:

SELECT YEAR( registration_date ) , COUNT( DISTINCT project_name ) AS Count
FROM sv_projects
WHERE project_name LIKE "%free%"
AND datasource_id = <current>
GROUP BY YEAR( registration_date )
ORDER BY YEAR( registration_date );


SELECT YEAR( registration_date ) , COUNT( DISTINCT project_name ) AS Count
FROM sv_projects
WHERE project_name LIKE "%open%"
AND datasource_id = <current>
GROUP BY YEAR( registration_date )
ORDER BY YEAR( registration_date );
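
The queries above differ only in the keyword, so they can be run from one parameterized statement. This is a rough sketch for the Savannah table, again using Python's sqlite3 as a stand-in for MySQL. SQLite has no YEAR() function, so the sketch swaps in strftime('%Y', ...), which returns the year as text and assumes dates stored as YYYY-MM-DD. The sample rows, the datasource_id values, and the function names are invented for illustration.

```python
import sqlite3

# Same shape as the Savannah queries in the post, with the LIKE pattern and
# the datasource_id passed as parameters. strftime('%Y', ...) replaces
# MySQL's YEAR() because this sketch runs on SQLite.
KEYWORD_BY_YEAR_SQL = """
SELECT strftime('%Y', registration_date) AS reg_year,
       COUNT(DISTINCT project_name) AS project_count
FROM sv_projects
WHERE project_name LIKE ?
  AND datasource_id = ?
GROUP BY reg_year
ORDER BY reg_year
"""


def keyword_counts_by_year(conn, keyword, datasource_id):
    """Return (year, count) rows for project names containing keyword."""
    pattern = f"%{keyword}%"
    return conn.execute(KEYWORD_BY_YEAR_SQL, (pattern, datasource_id)).fetchall()


def demo_keyword_db():
    """Build an in-memory table with invented rows (not real Savannah data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sv_projects ("
        "project_name TEXT, registration_date TEXT, datasource_id INTEGER)"
    )
    rows = [
        ("freeciv-tools", "2008-03-01", 1),
        ("OpenThing", "2008-07-15", 1),
        ("libfree", "2009-01-10", 1),
        ("freebar", "2009-02-02", 1),
        ("openfoo", "2009-05-02", 1),
        ("plain", "2009-06-01", 1),
        ("freeold", "2008-01-01", 0),  # different datasource, not counted
    ]
    conn.executemany("INSERT INTO sv_projects VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = demo_keyword_db()
    for keyword in ("free", "open"):
        print(keyword, keyword_counts_by_year(conn, keyword, 1))
```

One thing to watch: SQLite's LIKE ignores case for ASCII letters by default, while in MySQL that depends on the column's collation, so counts could differ if the real tables use a case-sensitive collation.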

How have projects in each repository grown by year?

Description

This chart shows the number of NEW projects added to each repository by month/year.

Visualization

Notes: RF had 697 projects without a project start date. OW had one project started in 1970.

SQL Script


SELECT MONTH( date_added ) , YEAR( date_added ) , COUNT( DISTINCT project_id )
FROM fm_projects
WHERE datasource_id = <current>
GROUP BY YEAR( date_added ) , MONTH( date_added )
ORDER BY YEAR( date_added ) , MONTH( date_added );


SELECT MONTH( date_registered ) , YEAR( date_registered ) , COUNT( DISTINCT proj_unixname )
FROM rf_projects
WHERE datasource_id = <current>
GROUP BY YEAR( date_registered ) , MONTH( date_registered )
ORDER BY YEAR( date_registered ) , MONTH( date_registered );


SELECT MONTH( registration_date ) , YEAR( registration_date ) , COUNT( DISTINCT project_name )
FROM sv_projects
WHERE datasource_id = <current>
GROUP BY YEAR( registration_date ) , MONTH( registration_date )
ORDER BY YEAR( registration_date ) , MONTH( registration_date );

How many projects are listed in each repository?

Description

This chart shows the number of projects that FLOSSmole most recently collected from each repository.

Visualization

Project Count Chart


SQL Script


SELECT COUNT( DISTINCT proj_name )
FROM gc_projects
WHERE datasource_id = <current>;


SELECT COUNT( DISTINCT project_id )
FROM fm_projects
WHERE datasource_id = <current>;


SELECT COUNT( DISTINCT project_name )
FROM lp_projects
WHERE datasource_id = <current>;


SELECT COUNT( DISTINCT proj_unixname )
FROM rf_projects
WHERE datasource_id= <current>;


SELECT COUNT( DISTINCT proj_num )
FROM fsf_projects
WHERE datasource_id= <current>;


SELECT COUNT( DISTINCT project_name )
FROM sv_projects
WHERE datasource_id= <current>;


SELECT COUNT( DISTINCT unixname )
FROM tg_projects
WHERE datasource_id = <current>;


SELECT COUNT( DISTINCT proj_unixname )
FROM ow_projects
WHERE datasource_id= <current>;
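
Since the eight queries above all have the same shape, they can be driven from a small list of table and key-column pairs. The sketch below does that with Python's sqlite3 standing in for the MySQL database. The table and column names are copied from the queries above, and the datasource_ids 266 and 269 come from the May 2011 datasource list; the rows themselves, the datasource_id 1 used for Tigris, and the function names are invented for illustration.

```python
import sqlite3

# (table, key column) pairs copied from the queries in the post.
REPOSITORIES = [
    ("gc_projects", "proj_name"),
    ("fm_projects", "project_id"),
    ("lp_projects", "project_name"),
    ("rf_projects", "proj_unixname"),
    ("fsf_projects", "proj_num"),
    ("sv_projects", "project_name"),
    ("tg_projects", "unixname"),
    ("ow_projects", "proj_unixname"),
]


def project_counts(conn, current_ids):
    """Map table name -> distinct project count for its datasource_id.

    current_ids maps a table name to the datasource_id that stands in for
    <current>. Tables missing from current_ids are skipped.
    """
    counts = {}
    for table, key in REPOSITORIES:
        if table not in current_ids:
            continue
        # Table and column names come from the fixed list above, never from
        # user input, so formatting them into the SQL text is safe here.
        sql = f"SELECT COUNT(DISTINCT {key}) FROM {table} WHERE datasource_id = ?"
        counts[table] = conn.execute(sql, (current_ids[table],)).fetchone()[0]
    return counts


def demo_repo_db():
    """Build in-memory tables with invented rows (not real FLOSSmole data)."""
    conn = sqlite3.connect(":memory:")
    for table, key in REPOSITORIES:
        conn.execute(f"CREATE TABLE {table} ({key} TEXT, datasource_id INTEGER)")
    conn.executemany(
        "INSERT INTO sv_projects VALUES (?, ?)",
        [("a", 269), ("b", 269), ("a", 269), ("c", 204)],
    )
    conn.executemany("INSERT INTO rf_projects VALUES (?, ?)", [("x", 266)])
    return conn


if __name__ == "__main__":
    ids = {"sv_projects": 269, "rf_projects": 266, "tg_projects": 1}
    print(project_counts(demo_repo_db(), ids))
```

Each forge has its own datasource_id per collection, which is why the function takes one id per table rather than a single value.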

May 2011 Data Released

May 2011 data has been released to Google Code and uploaded into Data Central at Teragrid.

Datasources:
263 2011-Mar UDD bugfix replaces 262
264 2011-Mar UDD bugfix replaces 263
265 2011-May UDD May 2011 UDD donation
266 Rubyforge 2011-May Rubyforge 2011-May
267 Objectweb 2011-May Objectweb 2011-May
268 FSF 2011-May Free Software Foundation 2011-May
269 Savannah 2011-May Savannah 2011-May
270 2011-May FM May 2011 Freshmeat

Status of other collectors:
Launchpad - parsing problem
Tigris - mailing list collector problem
Github - collection problem
Google Code - still running (it will be about a month until these are out)

Link to FLOSSmole files on Google Code
Link to instructions for how to access FLOSSmole db at Teragrid

January file releases

We just released data files for the following forges. You can head over to the FLOSSmole data downloads page at Google Code to download any of these files, or wait for them to be released to the Teragrid for live querying (shortly!).

datasource_id, forge_id, abbreviation, name
237 2 FM Freshmeat
238 3 RF Rubyforge
239 4 OW ObjectWeb
240 5 FSF Free Software Foundation
241 10 SV Savannah
243 12 GC Google Code
244 13 TG Tigris

Still running...
245 14 LP Launchpad

Broken...
242 11 GH Github

February 2010 Data Released

Lots of new data for you to peruse out on our FLOSSmole Data Downloads Page.

Here's what's out there, recently added:

Google Code, March 2010 (GC) - list of all GC projects donated by Audris Mockus (HUGE THANK YOU TO AUDRIS FOR THIS!!)
Freshmeat, February 2010 (FM)
Objectweb, February 2010 (OW)
Rubyforge, February 2010 (RF)
Github, February 2010 (GH)
Free Software Foundation, February 2010 (FSF)
Savannah, February 2010 (SV)
and Sourceforge from December 2009 (SF)

We have another set of bugs to fix in the Sourceforge collection for 2010, but those fixes are forthcoming. I'm running a collection now. Hopefully the data will be good. We may even have stats this time. Hallelujah.

Also, thanks to my phenomenal undergraduate superstar Steven Norris, Tigris is coming soon!! and Debian after that. We are rocking the repository collection...

December 2009 data released

December data has been released for the following forges:

(datasource-abbreviation-full name)
200-fm-freshmeat
201-rf-rubyforge
202-ow-objectweb
203-fsf-free software foundation
204-sv-savannah
205-gh-github

Sourceforge is in progress... it will be datasource_id=206.

Get the data here:
http://code.google.com/p/flossmole/downloads/list

Remember that the files marked "DM" are SQL files (MySQL), while the files marked .txt are flat, delimited text files.
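
If you go the .txt route, the files can be read with any delimited-text reader. Here is a small sketch using Python's csv module. The post does not say which delimiter the files use, so the tab default below is only an assumption; check a file and pass a different delimiter if needed. The sample text and the function name are made up for illustration.

```python
import csv
import io


def read_delimited(text_stream, delimiter="\t"):
    """Yield each non-blank row of a delimited text stream as a list of strings.

    The tab default is an assumption, not something the post confirms.
    """
    reader = csv.reader(text_stream, delimiter=delimiter)
    for row in reader:
        if row:  # csv.reader yields [] for blank lines; skip them
            yield row


# Invented sample standing in for the contents of a downloaded .txt file.
SAMPLE = "alpha\t3\t204\nbeta\t1\t204\n"
rows = list(read_delimited(io.StringIO(SAMPLE)))

if __name__ == "__main__":
    for row in rows:
        print(row)
```

For a real download, open the file with `open(path, newline="")` and pass the file object in place of the `io.StringIO` wrapper.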
