rubyforge

Mining Software Repositories '16 paper, slides & data

This weekend I'll be presenting at Mining Software Repositories 2016 in Austin, TX. My talk is in the data sets track and is entitled "Data Sets: The Circle of Life in Ruby Hosting, 2003-2015" (PDF). Here are the slides, and here are the quick links to the flat data: RF and RG.

Growth of Projects in Each Repository (June 2014)

Description:

This chart shows the number of NEW projects added to each repository by month/year.

Visualization:

Notes: RF had ~750 projects without a project start date.

SQL Script:

SELECT MONTH( date_added ) , YEAR( date_added ) , COUNT( DISTINCT project_id )
FROM fm_projects
WHERE datasource_id = [current]
GROUP BY YEAR( date_added ) , MONTH( date_added )
ORDER BY YEAR( date_added ) , MONTH( date_added );

SELECT MONTH( date_registered ) , YEAR( date_registered ) , COUNT( DISTINCT proj_unixname )
FROM rf_projects
WHERE datasource_id = [current]
GROUP BY YEAR( date_registered ) , MONTH( date_registered )
ORDER BY YEAR( date_registered ) , MONTH( date_registered );

SELECT MONTH( registration_date ) , YEAR( registration_date ) , COUNT( DISTINCT project_name )
FROM sv_projects
WHERE datasource_id = [current]
GROUP BY YEAR( registration_date ) , MONTH( registration_date )
ORDER BY YEAR( registration_date ) , MONTH( registration_date );

Growth of use of "Free" and "Open" in Project Names (2014 data)

Description:

This chart shows the number of new projects in each repository that use the words "Free" and "Open" in project names through 2014.

Visualization:

SQL Script:

Freshmeat:

SELECT YEAR( date_added ) , COUNT( DISTINCT project_id ) AS Count
FROM fm_projects
WHERE projectname_full LIKE "%free%"
AND datasource_id = [current]
GROUP BY YEAR( date_added )
ORDER BY YEAR( date_added );

SELECT YEAR( date_added ) , COUNT( DISTINCT project_id ) AS Count
FROM fm_projects
WHERE projectname_full LIKE "%open%"
AND datasource_id = [current]
GROUP BY YEAR( date_added )
ORDER BY YEAR( date_added );

Rubyforge:

SELECT YEAR( date_registered ) , COUNT( DISTINCT proj_unixname ) AS Count
FROM rf_projects
WHERE proj_unixname LIKE "%free%"
AND datasource_id = [current]
GROUP BY YEAR( date_registered )
ORDER BY YEAR( date_registered );

SELECT YEAR( date_registered ) , COUNT( DISTINCT proj_unixname ) AS Count
FROM rf_projects
WHERE proj_unixname LIKE "%open%"
AND datasource_id = [current]
GROUP BY YEAR( date_registered )
ORDER BY YEAR( date_registered );

Savannah:

SELECT YEAR( registration_date ) , COUNT( DISTINCT project_name ) AS Count
FROM sv_projects
WHERE project_name LIKE "%free%"
AND datasource_id = [current]
GROUP BY YEAR( registration_date )
ORDER BY YEAR( registration_date );

SELECT YEAR( registration_date ) , COUNT( DISTINCT project_name ) AS Count
FROM sv_projects
WHERE project_name LIKE "%open%"
AND datasource_id = [current]
GROUP BY YEAR( registration_date )
ORDER BY YEAR( registration_date );
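The two per-word queries for each repository could also be folded into one pass with conditional aggregation. The sketch below is only an illustration under several assumptions: it uses Python's built-in sqlite3 module as a stand-in for the real MySQL database, a cut-down rf_projects table holding four invented rows, SQLite's strftime in place of YEAR(), and a placeholder datasource_id of 1. It also assumes each proj_unixname appears once per datasource, so SUM over rows matches COUNT( DISTINCT ... ).

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rf_projects ("
    "proj_unixname TEXT, date_registered TEXT, datasource_id INTEGER)"
)
# Invented sample rows for illustration only, not real Rubyforge data.
conn.executemany(
    "INSERT INTO rf_projects VALUES (?, ?, ?)",
    [
        ("freeride", "2003-08-01", 1),
        ("openwidget", "2003-09-15", 1),
        ("rake", "2004-01-10", 1),
        ("openfree", "2004-02-20", 1),
    ],
)

# One row per year, with a separate count for each word.
query = """
SELECT strftime('%Y', date_registered) AS yr,
       SUM(CASE WHEN proj_unixname LIKE '%free%' THEN 1 ELSE 0 END) AS free_count,
       SUM(CASE WHEN proj_unixname LIKE '%open%' THEN 1 ELSE 0 END) AS open_count
FROM rf_projects
WHERE datasource_id = ?
GROUP BY yr
ORDER BY yr
"""
word_counts = conn.execute(query, (1,)).fetchall()
for yr, free_count, open_count in word_counts:
    print(yr, free_count, open_count)
```

A name containing both words (like the invented "openfree") is counted once in each column, which matches what the separate queries would report.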

Number of Projects per Team Size in Rubyforge (June 2014)

Description:

This chart shows the number of projects for each team size in Rubyforge.

Visualization:

Projects listed as having 0 developers were disregarded (159 projects).

SQL Script:

SELECT dev_count, COUNT( DISTINCT proj_unixname ) AS count
FROM rf_projects
WHERE datasource_id = [current]
GROUP BY dev_count
ORDER BY count DESC , dev_count;
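The query as written still returns a row for dev_count = 0, so the zero-developer projects were presumably dropped when the chart was built. If you would rather exclude them in the query itself, something like the following should work. This is a sketch only: it runs against an in-memory SQLite table through Python's sqlite3 module rather than the real database, the four rows are invented, and the datasource_id of 1 is a placeholder.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rf_projects ("
    "proj_unixname TEXT, dev_count INTEGER, datasource_id INTEGER)"
)
# Invented sample rows for illustration only.
conn.executemany(
    "INSERT INTO rf_projects VALUES (?, ?, ?)",
    [
        ("alpha", 0, 1),  # zero developers: should be excluded
        ("beta", 1, 1),
        ("gamma", 1, 1),
        ("delta", 3, 1),
    ],
)

# Same grouping as the original query, plus a filter on dev_count.
query = """
SELECT dev_count, COUNT(DISTINCT proj_unixname) AS project_count
FROM rf_projects
WHERE datasource_id = ?
  AND dev_count > 0
GROUP BY dev_count
ORDER BY project_count DESC, dev_count
"""
team_sizes = conn.execute(query, (1,)).fetchall()
for dev_count, project_count in team_sizes:
    print(dev_count, project_count)
```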

Most Commonly Used Operating Systems by Rubyforge Projects (June 2014)

Description:

This chart shows the top operating systems used by projects in Rubyforge.

Visualization:

SQL Script:

SELECT rfop.description AS System, COUNT( DISTINCT rfop.proj_unixname ) AS Count
FROM rf_project_operating_system rfop
WHERE rfop.datasource_id = [current]
GROUP BY System
ORDER BY Count DESC;

Most Commonly Used Languages by Rubyforge Projects (June 2014)

Description:

This chart shows the top programming languages used by projects in Rubyforge.

Visualization:

SQL Script:

SELECT rfpl.description AS Lang, COUNT( DISTINCT rfpl.proj_unixname ) AS Count
FROM rf_project_programming_language rfpl
WHERE rfpl.datasource_id = [current]
GROUP BY Lang
ORDER BY Count DESC;

Rubyforge Project Registrations, 2003-2014

Rubyforge is a software development "code forge" associated with projects written in the Ruby programming language. This chart shows the growth of new projects registered on this forge from July 2003 - December 2013. We used datasource_id=317 in this example.

The SQL to generate the data used to populate this graph is as follows (fill in the datasource_id accordingly):

SELECT CONCAT(MONTH( date_registered ) , '-', YEAR( date_registered )) , COUNT( DISTINCT proj_unixname )
FROM rf_projects
WHERE datasource_id = 12987
GROUP BY YEAR( date_registered ) , MONTH( date_registered )
ORDER BY YEAR( date_registered ) , MONTH( date_registered );

Rubyforge License Counts, June 2014

Each project on Rubyforge can list what license it uses. The following chart was generated in June 2014 (datasource_id=12987) to show the most common licenses (all those used by more than 10 projects) and how many projects use each.

Here is the SQL code used to generate the data for this chart:

SELECT description, count( * )
FROM rf_project_licenses
WHERE datasource_id = [current datasource_id]
GROUP BY 1
ORDER BY 2 DESC;
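The query above returns every license, so the "more than 10 projects" cutoff must have been applied afterwards. One way to apply it in SQL is a HAVING clause, sketched here. Treat everything outside the original query as an assumption: the SQLite in-memory table (via Python's sqlite3) stands in for the real database, the proj_unixname column and the license names are made up, and the datasource_id of 1 is a placeholder.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rf_project_licenses ("
    "description TEXT, proj_unixname TEXT, datasource_id INTEGER)"
)
# Invented rows: 12 projects on License A, 11 on License B, 3 on License C.
rows = (
    [("License A", f"a_proj_{i}", 1) for i in range(12)]
    + [("License B", f"b_proj_{i}", 1) for i in range(11)]
    + [("License C", f"c_proj_{i}", 1) for i in range(3)]
)
conn.executemany("INSERT INTO rf_project_licenses VALUES (?, ?, ?)", rows)

# Keep only licenses used by more than `threshold` projects.
query = """
SELECT description, COUNT(*) AS project_count
FROM rf_project_licenses
WHERE datasource_id = ?
GROUP BY description
HAVING COUNT(*) > ?
ORDER BY project_count DESC
"""
threshold = 10
common_licenses = conn.execute(query, (1, threshold)).fetchall()
for description, project_count in common_licenses:
    print(description, project_count)
```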

Last Rubyforge Collection

The last Rubyforge collection happened yesterday. The datasource_id = 12987. All the data is located on our file downloads site, or in the database (ossmole_merged schema, tables prefixed 'rf', use datasource_id=12987 in your SQL queries).

RIP Rubyforge! We have been collecting from there for 10 years. Charts and graphs coming soon.

rubyforge shuts down

New March 2014 data released

Some new forge data, collected 04-Mar-2014, has been released.

The datasource_ids are as follows:

8079 - freecode
8080 - rubyforge
8081 - objectweb
8082 - savannah
8083 - tigris
8084 - alioth

IRC data:
8085 - 8134: Apache ServiceMix
8135 - 8185: Apache Camel
8186 - 8236: Apache ActiveMQ
8237 - 8287: Apache CXF
8288 - 8338: Apache Aries
8339 - 8389: Apache Kalumet
8390 - 8440: Apache Karaf
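If you are working with rows from several of these collections at once, a small lookup can turn a datasource_id back into its collection name. The sketch below simply encodes the ids and ranges listed above; the names FORGE_IDS, IRC_RANGES and label_for are hypothetical helpers and not part of the database or any FLOSSmole tooling.

```python
# Single-id forge collections from the March 2014 release.
FORGE_IDS = {
    8079: "freecode",
    8080: "rubyforge",
    8081: "objectweb",
    8082: "savannah",
    8083: "tigris",
    8084: "alioth",
}

# Inclusive datasource_id ranges for the IRC collections.
IRC_RANGES = [
    (8085, 8134, "Apache ServiceMix"),
    (8135, 8185, "Apache Camel"),
    (8186, 8236, "Apache ActiveMQ"),
    (8237, 8287, "Apache CXF"),
    (8288, 8338, "Apache Aries"),
    (8339, 8389, "Apache Kalumet"),
    (8390, 8440, "Apache Karaf"),
]


def label_for(datasource_id):
    """Return the collection name for an id, or None if it is not in this release."""
    if datasource_id in FORGE_IDS:
        return FORGE_IDS[datasource_id]
    for low, high, name in IRC_RANGES:
        if low <= datasource_id <= high:
            return name
    return None


print(label_for(8080))
print(label_for(8200))
```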

Data is available either in the flat files or by direct database access. Happy digging!