Lots of new data for you to peruse out on our FLOSSmole Data Downloads Page.
Here's what's out there, recently added:
Google Code, March 2010 (GC) - list of all GC projects donated by Audris Mockus (HUGE THANK YOU TO AUDRIS FOR THIS!!)
Freshmeat, February 2010 (FM)
Objectweb, February 2010 (OW)
Rubyforge, February 2010 (RF)
Github, February 2010 (GH)
Free Software Foundation, February 2010 (FSF)
Savannah, February 2010 (SV)
and Sourceforge from December 2009 (SF)
We have another set of bugs to fix in the Sourceforge collection for 2010, but the fixes are forthcoming. I'm running a collection now. Hopefully the data will be good. We may even have stats this time. Hallelujah.
Also, thanks to my phenomenal undergraduate superstar Steven Norris, Tigris is coming soon, with Debian after that!! We are rocking the repository collection...
December data has been released for the following forges:
(datasource_id-abbreviation-full name)
200-fm-freshmeat
201-rf-rubyforge
202-ow-objectweb
203-fsf-free software foundation
204-sv-savannah
205-gh-github
Sourceforge is in progress... it will be datasource_id=206.
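If you script against these releases, the labels above can be turned into a lookup table for the datasource_id to plug into your queries. This is only a rough sketch: the function name parse_datasource_label and the dictionary latest_id_by_forge are made up for illustration, and it assumes every label follows the id-abbreviation-name pattern shown above.

```python
def parse_datasource_label(label):
    """Split a label such as '200-fm-freshmeat' into (id, abbreviation, name).

    Assumes the 'id-abbreviation-full name' pattern used in the list above;
    only the first two hyphens are treated as separators.
    """
    datasource_id, abbreviation, full_name = label.split("-", 2)
    return int(datasource_id), abbreviation, full_name


# Labels copied from the December release list.
labels = [
    "200-fm-freshmeat",
    "201-rf-rubyforge",
    "202-ow-objectweb",
    "203-fsf-free software foundation",
    "204-sv-savannah",
    "205-gh-github",
]

# Hypothetical lookup table: forge abbreviation -> datasource_id.
latest_id_by_forge = {
    abbreviation: datasource_id
    for datasource_id, abbreviation, _ in map(parse_datasource_label, labels)
}
```

With this in place, latest_id_by_forge["rf"] gives the Rubyforge datasource_id for this release.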
Get the data here:
http://code.google.com/p/flossmole/downloads/list
Remember that the files marked "DM" are SQL files (MySQL), while the files marked .txt are flat, delimited text files.
This month we have data from Freshmeat, Rubyforge, Objectweb, Savannah, Github, and the Free Software Foundation.
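For anyone who would rather work with the flat files than load the SQL, here is a minimal sketch of reading one in Python. The post does not say which delimiter or encoding the files use, so the tab delimiter and UTF-8 below are assumptions to adjust, and the helper names open_text and read_flat_file are made up for this example.

```python
import bz2
import csv
from pathlib import Path


def open_text(path):
    """Open a plain or bz2-compressed text file for reading.

    UTF-8 is an assumption; change the encoding if the file says otherwise.
    """
    path = Path(path)
    if path.suffix in (".bz2", ".bz"):
        return bz2.open(path, mode="rt", encoding="utf-8", newline="")
    return open(path, mode="rt", encoding="utf-8", newline="")


def read_flat_file(path, delimiter="\t"):
    """Return the rows of a delimited flat file as lists of strings.

    The tab delimiter is an assumption; pass the real one if it differs.
    """
    with open_text(path) as handle:
        return list(csv.reader(handle, delimiter=delimiter))
```

Calling read_flat_file on a downloaded .txt file returns one list of column values per line, which is usually enough to start poking at the data.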
Downloads available at Google Code
Remember, the SQL is available in the datamart*.sql.bz files; the flat (delimited) data is available in the other files.
We're still working on getting our Sourceforge scraper back up and running, and we thank you for your patience.
October 2009 data has been released. Here are the forges we have this month:
Freshmeat
Rubyforge
ObjectWeb
Free Software Foundation directory
Savannah (new)
GitHub (new)
The Sourceforge scraper is still undergoing a rewrite, but we will be collecting from there again soon. In the meantime, don't forget that the June 2009 data is available, along with the Notre Dame data, if that helps at all.
Enjoy!
Data has been released for FSF, FM, RF, OW. Go get it!! Have fun.
That Freshmeat data looks fairly popular. Anyone want to tell us how you use this data?
Hello moles, our July 2009 data has been released: this month we have Objectweb, Freshmeat, Rubyforge, and the Free Software Foundation directory.
Go to our Google Code pages to download the data.
The most recent datasource_ids are:
178-fm-July2009
179-rf-July2009
180-ow-July2009
181-fsf-July2009
Description
This chart shows the top programming languages used by projects in Rubyforge.
Visualization

SQL Script
SELECT rfpl.description, COUNT(DISTINCT rfpl.proj_unixname) AS project_count
FROM rf_project_programming_language rfpl
WHERE rfpl.datasource_id = <current>
GROUP BY rfpl.description
ORDER BY project_count DESC;
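To see what this query does without a copy of the full database, here is a small sketch that runs the same shape of query on a few made-up rows. It uses Python's built-in sqlite3 as a stand-in for MySQL, passes the datasource_id as a query parameter in place of the <current> placeholder, and guesses at the column types. The project names and the datasource_id 150 are invented; 179 is the July 2009 Rubyforge id from the release notes.

```python
import sqlite3

# Same grouping as the Rubyforge language query, with a bound parameter
# standing in for the <current> datasource_id placeholder.
QUERY = """
SELECT description, COUNT(DISTINCT proj_unixname) AS project_count
FROM rf_project_programming_language
WHERE datasource_id = ?
GROUP BY description
ORDER BY project_count DESC
"""


def top_languages(connection, datasource_id):
    """Return (language, distinct project count) rows for one datasource."""
    return connection.execute(QUERY, (datasource_id,)).fetchall()


# In-memory stand-in table; the column types are assumptions.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE rf_project_programming_language "
    "(proj_unixname TEXT, description TEXT, datasource_id INTEGER)"
)
connection.executemany(
    "INSERT INTO rf_project_programming_language VALUES (?, ?, ?)",
    [
        ("alpha", "Ruby", 179),
        ("alpha", "Ruby", 179),  # duplicate row, counted once by DISTINCT
        ("beta", "Ruby", 179),
        ("beta", "C", 179),
        ("gamma", "Ruby", 150),  # older made-up datasource, filtered out
    ],
)

language_counts = top_languages(connection, 179)
```

Passing the id as a parameter also makes it easy to rerun the same chart query against an older release for comparison.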
Description
This chart shows the top operating systems used by projects in Rubyforge.
Visualization

SQL Script
SELECT rfop.description, COUNT(DISTINCT rfop.proj_unixname) AS project_count
FROM rf_project_operating_system rfop
WHERE rfop.datasource_id = <current>
GROUP BY rfop.description
ORDER BY project_count DESC;
Description
This chart shows the number of projects of each team size listed in Rubyforge.
Visualization

SQL Script
SELECT dev_count, COUNT(DISTINCT proj_unixname) AS project_count
FROM rf_projects
WHERE datasource_id = <current>
GROUP BY dev_count
ORDER BY project_count DESC, dev_count;
Description
This chart shows the number of new projects in each repository that use the words "Free" and "Open" in project names. (We ran the queries for this chart in June 2009, so that year was not yet complete, which explains the apparent drop-off in the 2009 numbers.)
Visualization

SQL Script
Sourceforge:
SELECT YEAR(date_registered), COUNT(DISTINCT proj_unixname)
FROM projects
WHERE proj_unixname LIKE '%free%'
AND datasource_id = <current>
GROUP BY YEAR(date_registered)
ORDER BY YEAR(date_registered);
SELECT YEAR(date_registered), COUNT(DISTINCT proj_unixname)
FROM projects
WHERE proj_unixname LIKE '%open%'
AND datasource_id = <current>
GROUP BY YEAR(date_registered)
ORDER BY YEAR(date_registered);
Freshmeat:
SELECT YEAR(date_added), COUNT(DISTINCT project_id)
FROM fm_projects
WHERE projectname_full LIKE '%free%'
AND datasource_id = <current>
GROUP BY YEAR(date_added)
ORDER BY YEAR(date_added);
SELECT YEAR(date_added), COUNT(DISTINCT project_id)
FROM fm_projects
WHERE projectname_full LIKE '%open%'
AND datasource_id = <current>
GROUP BY YEAR(date_added)
ORDER BY YEAR(date_added);
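If you are working from the flat files instead of MySQL, the same per-year counts can be approximated in plain Python. This is a sketch on made-up rows, not the code used for the chart: the function name count_names_by_year and the sample projects are invented, and the case-insensitive substring match is meant to mimic how LIKE behaves under MySQL's usual case-insensitive collations.

```python
from collections import defaultdict
from datetime import date


def count_names_by_year(projects, word):
    """Count distinct project names containing `word`, grouped by year.

    `projects` is an iterable of (name, registration_date) pairs.
    """
    names_by_year = defaultdict(set)
    needle = word.lower()
    for name, registered in projects:
        if needle in name.lower():
            names_by_year[registered.year].add(name)
    return {year: len(names) for year, names in sorted(names_by_year.items())}


# Made-up sample rows, for illustration only.
sample_projects = [
    ("freecalc", date(2007, 3, 1)),
    ("openwidget", date(2008, 5, 9)),
    ("OpenFreeTool", date(2008, 11, 20)),
    ("freecalc", date(2007, 3, 1)),  # duplicate row, counted once
    ("plainapp", date(2009, 1, 15)),
]

free_counts = count_names_by_year(sample_projects, "free")
open_counts = count_names_by_year(sample_projects, "open")
```

A name containing both words, like the made-up "OpenFreeTool", shows up in both counts, just as it would in the two separate SQL queries.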