As of June 2009, how many projects of each team size are listed in the Free Software Foundation directory?

Description

This chart shows the number of projects of each team size listed in the Free Software Foundation directory.

Visualization

Projects listed as having 0 developers were disregarded (53 projects).

Free Software Foundation Developer Count Chart

SQL Script

SELECT calc_dev_count, count(DISTINCT proj_num) AS project_count
FROM fsf_projects
WHERE datasource_id = <current>
GROUP BY calc_dev_count
ORDER BY project_count DESC, calc_dev_count;
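The query above can be tried on a small scale without the FLOSSmole database. The following is a minimal sketch, not the script we used: it assumes an in-memory SQLite table standing in for the MySQL fsf_projects table, uses made-up rows, and adds a calc_dev_count > 0 filter to mimic the way the chart disregards projects with 0 developers. The function name team_size_counts is hypothetical.

```python
import sqlite3


def team_size_counts(conn, datasource_id):
    """Return (team_size, project_count) rows, skipping 0-developer projects."""
    sql = """
        SELECT calc_dev_count, COUNT(DISTINCT proj_num) AS project_count
        FROM fsf_projects
        WHERE datasource_id = ?
          AND calc_dev_count > 0
        GROUP BY calc_dev_count
        ORDER BY project_count DESC, calc_dev_count
    """
    return conn.execute(sql, (datasource_id,)).fetchall()


# Made-up sample rows: (proj_num, calc_dev_count, datasource_id)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fsf_projects "
    "(proj_num INTEGER, calc_dev_count INTEGER, datasource_id INTEGER)"
)
conn.executemany(
    "INSERT INTO fsf_projects VALUES (?, ?, ?)",
    [(1, 1, 176), (2, 1, 176), (3, 2, 176), (4, 0, 176), (5, 1, 175)],
)

sample_counts = team_size_counts(conn, 176)
print(sample_counts)
```

Passing the datasource ID as a bound parameter plays the role of the <current> placeholder in the script above.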

As of June 2009, how has the use of "Free" and "Open" in project names grown by year?

Description

This chart shows the number of new projects in each repository that use the words "Free" and "Open" in their project names. (We ran the queries for this chart in June, so the 2009 figures cover only part of the year, which explains the apparent drop-off in 2009.)

Visualization

Freshmeat Free & Open Count Chart

SQL Script

Sourceforge:

SELECT year(date_registered), count(DISTINCT proj_unixname) FROM projects
WHERE proj_unixname LIKE "%free%"
AND datasource_id = <current>
GROUP BY year(date_registered)
ORDER BY year(date_registered);


SELECT year(date_registered), count(DISTINCT proj_unixname) FROM projects
WHERE proj_unixname LIKE "%open%"
AND datasource_id = <current>
GROUP BY year(date_registered)
ORDER BY year(date_registered);

Freshmeat:

SELECT year(date_added), count(DISTINCT project_id) FROM fm_projects
WHERE projectname_full LIKE "%free%"
AND datasource_id = <current>
GROUP BY year(date_added)
ORDER BY year(date_added);


SELECT year(date_added), count(DISTINCT project_id) FROM fm_projects
WHERE projectname_full LIKE "%open%"
AND datasource_id = <current>
GROUP BY year(date_added)
ORDER BY year(date_added);

Rubyforge:

SELECT year(date_registered), count(DISTINCT proj_unixname) FROM rf_projects
WHERE proj_unixname LIKE "%free%"
AND datasource_id = <current>
GROUP BY year(date_registered)
ORDER BY year(date_registered);


SELECT year(date_registered), count(DISTINCT proj_unixname) FROM rf_projects
WHERE proj_unixname LIKE "%open%"
AND datasource_id = <current>
GROUP BY year(date_registered)
ORDER BY year(date_registered);
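To see how the per-year keyword counts behave, here is a small sketch under stated assumptions: it uses an in-memory SQLite table in place of the MySQL projects table, SQLite's strftime('%Y', ...) in place of MySQL's year(), and made-up project names and dates. The helper name keyword_counts_by_year is hypothetical.

```python
import sqlite3


def keyword_counts_by_year(conn, keyword, datasource_id):
    """Count distinct project names containing `keyword`, grouped by registration year."""
    sql = """
        SELECT strftime('%Y', date_registered) AS reg_year,
               COUNT(DISTINCT proj_unixname) AS project_count
        FROM projects
        WHERE proj_unixname LIKE ?
          AND datasource_id = ?
        GROUP BY reg_year
        ORDER BY reg_year
    """
    return conn.execute(sql, (f"%{keyword}%", datasource_id)).fetchall()


# Made-up sample rows: (proj_unixname, date_registered, datasource_id)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects "
    "(proj_unixname TEXT, date_registered TEXT, datasource_id INTEGER)"
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [
        ("freeciv", "2007-03-01", 172),
        ("openfoo", "2007-05-02", 172),
        ("libfree", "2008-01-10", 172),
        ("barfoo", "2008-02-01", 172),
        ("openbar", "2009-04-01", 172),
        ("freeopen", "2009-05-01", 172),
    ],
)

free_by_year = keyword_counts_by_year(conn, "free", 172)
open_by_year = keyword_counts_by_year(conn, "open", 172)
print(free_by_year)
print(open_by_year)
```

Because "free" and "open" are counted by separate queries, a name containing both words (freeopen in the sample) is counted once in each series.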

As of June 2009, how many projects of each team size are listed in Freshmeat?

Description

This chart shows the number of projects of each team size listed in Freshmeat.

Visualization

Projects listed as having 0 developers were disregarded (332 projects).

Freshmeat Developer Count Chart

SQL Script

SELECT calc_dev_count, count(DISTINCT project_id) AS project_count
FROM fm_projects
WHERE datasource_id = <current>
GROUP BY calc_dev_count
ORDER BY project_count DESC, calc_dev_count;

As of June 2009, how many projects at each repository share URLs?

Description

This chart shows the number of projects at each repository that share URLs.

Visualization

Number of Projects at each Repository that List a Home Page at Another Repository

Shared URLs Table

Shared URLs Chart

Matching projects by URL has two possibilities: projects listed on different forges might both display the same external URL, or projects on one forge might actually list the project site on a competing forge as the home page of record. The diagram above depicts each forge/directory in FLOSSmole and how many of its projects list another forge as the actual hosting home page. For example, the topmost arrow shows 11,229 projects on Freshmeat that actually have Sourceforge listed as the home page. The arrows show the direction of the relationship (e.g. 11,229 Freshmeat projects show a home page on Sourceforge, but only 10 Sourceforge projects list a Freshmeat home page). Pairs of forges with no URLs in common do not show an arrow. (No Rubyforge projects list ObjectWeb URLs, and vice versa.)
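The directional counting described above can be sketched in a few lines of Python. This is an illustration only, with made-up URLs: it covers the second kind of match (a project on one forge listing a home page on another) and reuses the same substrings that the SQL scripts on this page search for. The names FORGE_PATTERNS and count_cross_forge_homepages are hypothetical.

```python
from collections import Counter

# Substrings that identify each forge in a home page URL
# (the same patterns used in the LIKE clauses of the SQL scripts).
FORGE_PATTERNS = {
    "SF": ("sourceforge", "sf.net"),
    "FM": ("freshmeat",),
    "RF": ("rubyforge",),
    "OW": ("objectweb",),
}


def count_cross_forge_homepages(projects):
    """Count (listing_forge, home_page_forge) pairs.

    `projects` is an iterable of (listing_forge, homepage_url) tuples.
    """
    counts = Counter()
    for listing_forge, url in projects:
        url_lower = (url or "").lower()
        for target_forge, patterns in FORGE_PATTERNS.items():
            if target_forge == listing_forge:
                continue  # a forge pointing at itself is not a cross-forge link
            if any(pattern in url_lower for pattern in patterns):
                counts[(listing_forge, target_forge)] += 1
    return counts


# Made-up sample data
sample = [
    ("FM", "http://foo.sourceforge.net/"),
    ("FM", "http://bar.sf.net/"),
    ("FM", "http://example.org/"),
    ("SF", "http://freshmeat.net/projects/foo"),
    ("RF", "http://baz.rubyforge.org/"),
]
cross_counts = count_cross_forge_homepages(sample)
print(cross_counts)
```

Each key is ordered (listing forge, home page forge), so ("FM", "SF") and ("SF", "FM") are separate counts, just like the two arrow directions in the diagram.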

For more information on matching project names and URLs, see:

Squire, M. (2009). Integrating projects from multiple open source code forges. International Journal of Open Source Software & Processes, 1(1). January-March 2009. pp. 46-57.

SQL Script

RF-SF

SELECT count(r.proj_unixname) FROM rf_projects r
WHERE (r.real_url like "%sourceforge%"
OR r.real_url like "%sf.net%")
AND datasource_id= <current>;

RF-FM

SELECT count(r.proj_unixname) FROM rf_projects r
WHERE r.real_url like "%freshmeat%"
AND datasource_id= <current>;

RF-OW

SELECT count(r.proj_unixname) FROM rf_projects r
WHERE r.real_url like "%objectweb%"
AND datasource_id= <current>;

FM-RF

SELECT count(f.project_id) FROM fm_project_homepages f
WHERE f.real_url_homepage like "%rubyforge%"
AND datasource_id= <current>;

FM-SF

SELECT count(f.project_id) FROM fm_project_homepages f
WHERE (f.real_url_homepage like "%sourceforge%"
OR f.real_url_homepage like "%sf.net%")
AND datasource_id= <current>;

FM-OW

SELECT count(f.project_id) FROM fm_project_homepages f
WHERE f.real_url_homepage like "%objectweb%"
AND datasource_id= <current>;

SF-FM

SELECT count(p.proj_unixname) FROM projects p
WHERE p.real_url like "%freshmeat%"
AND datasource_id= <current>;

SF-OW

SELECT count(p.proj_unixname) FROM projects p
WHERE p.real_url like "%objectweb%"
AND datasource_id= <current>;

SF-RF

SELECT count(p.proj_unixname) FROM projects p
WHERE p.real_url like "%rubyforge%"
AND datasource_id= <current>;

OW-SF

SELECT count(o.proj_unixname) FROM ow_projects o
WHERE (o.real_url like "%sourceforge%" or o.real_url like "%sf.net%")
AND datasource_id= <current>;

OW-RF

SELECT count(o.proj_unixname) FROM ow_projects o
WHERE o.real_url like "%rubyforge%"
AND datasource_id= <current>;

OW-FM

SELECT count(o.proj_unixname) FROM ow_projects o
WHERE o.real_url like "%freshmeat%"
AND datasource_id= <current>;

As of June 2009, how many projects at each repository share identical short project names?

Description

This chart shows the number of projects at each repository that share project names.

Visualization

Number of Projects at each Repository that Share an Identical Short Project Name

Shared Short Names Table

Shared Short Names Chart

This graph shows the number of short project names shared between each pair of repositories. For instance, starfish is a project listed on both Sourceforge and Rubyforge. On Rubyforge, it is described as a "tool to make programming ridiculously easy", but on Sourceforge the starfish project is described as a password management application. There are 1,367 projects with shared names on Rubyforge and Sourceforge.
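As a rough illustration of the pairwise name matching, the sketch below intersects sets of short names for every pair of repositories. The data are made up, and the function name shared_name_counts is hypothetical. Note one assumption: set intersection counts each distinct shared name once, whereas the SQL joins count matching row pairs, so the two can differ if a name appears more than once within a table.

```python
from itertools import combinations


def shared_name_counts(names_by_repo):
    """Return {(repo_a, repo_b): number of distinct short names found in both}."""
    shared = {}
    for repo_a, repo_b in combinations(sorted(names_by_repo), 2):
        common = set(names_by_repo[repo_a]) & set(names_by_repo[repo_b])
        shared[(repo_a, repo_b)] = len(common)
    return shared


# Made-up sample data
sample_names = {
    "SF": ["starfish", "gimp", "foo"],
    "RF": ["starfish", "rake"],
    "FM": ["gimp", "starfish"],
}
pair_counts = shared_name_counts(sample_names)
print(pair_counts)
```

A shared name says nothing about whether the projects are the same, as the starfish example shows.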

For more information on matching project names and URLs, see:

Squire, M. (2009). Integrating projects from multiple open source code forges. International Journal of Open Source Software & Processes, 1(1). January-March 2009. pp. 46-57.

SQL Script

RF-SF

SELECT count(p.proj_unixname) FROM projects p, rf_projects r
WHERE p.proj_unixname = r.proj_unixname
AND p.datasource_id= <current>
AND r.datasource_id= <current>;

RF-FM

SELECT count(f.projectname_short_fixed) FROM fm_projects f, rf_projects r
WHERE f.projectname_short_fixed = r.proj_unixname
AND f.datasource_id = <current> 
AND r.datasource_id = <current>;

FM-SF

SELECT count(f.projectname_short_fixed) FROM fm_projects f, projects p
WHERE f.projectname_short_fixed = p.proj_unixname
AND f.datasource_id = <current> 
AND p.datasource_id = <current>;

SF-OW

SELECT count(p.proj_unixname) FROM projects p, ow_projects o
WHERE p.proj_unixname = o.proj_unixname
AND p.datasource_id= <current> 
AND o.datasource_id= <current>;

RF-OW

SELECT count(r.proj_unixname) FROM rf_projects r, ow_projects o
WHERE r.proj_unixname = o.proj_unixname
AND r.datasource_id= <current> 
AND o.datasource_id= <current>;

FM-OW

SELECT count(f.projectname_short_fixed) FROM fm_projects f, ow_projects o
WHERE f.projectname_short_fixed = o.proj_unixname
AND f.datasource_id = <current>
AND o.datasource_id = <current>;

As of June 2009, how have projects in each repository grown by year?

Description

This chart shows the number of NEW projects added to each repository by year.

Visualization

Project Growth Chart

SQL Script

Sourceforge:

SELECT year(date_registered), count(DISTINCT proj_unixname) FROM projects
WHERE datasource_id = <current>
GROUP BY year(date_registered)
ORDER BY year(date_registered);

Freshmeat:

SELECT year(date_added), count(DISTINCT project_id) FROM fm_projects
WHERE datasource_id= <current>
GROUP BY year(date_added)
ORDER BY year(date_added);

Rubyforge:

SELECT year(date_registered), count(DISTINCT proj_unixname) FROM rf_projects
WHERE datasource_id= <current>
GROUP BY year(date_registered)
ORDER BY year(date_registered);

Objectweb:

SELECT year(date_registered), count(DISTINCT proj_unixname) FROM ow_projects
WHERE datasource_id= <current>
GROUP BY year(date_registered)
ORDER BY year(date_registered);

As of June 2009, how many projects are listed in each repository?

Description

This chart shows the number of projects that FLOSSmole most recently collected from each repository.

Visualization

Project Count Chart

SQL Script

Sourceforge:

SELECT count(DISTINCT proj_unixname) FROM projects
WHERE datasource_id = <current>;

Freshmeat:

SELECT count(DISTINCT project_id) FROM fm_projects
WHERE datasource_id = <current>;

Rubyforge:

SELECT count(DISTINCT proj_unixname) FROM rf_projects
WHERE datasource_id = <current>;

Objectweb:

SELECT count(DISTINCT proj_unixname) FROM ow_projects
WHERE datasource_id = <current>;

Free Software Foundation:

SELECT count(DISTINCT proj_num) FROM fsf_projects
WHERE datasource_id = <current>;
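The five counts can also be collected in one pass. The sketch below is only an illustration: it runs against in-memory SQLite tables with made-up rows, and it assumes that the <current> placeholder corresponds to the numbers in the June 2009 release list on this page (172-sf through 176-fsf). The names REPOSITORIES and project_counts are hypothetical.

```python
import sqlite3

# repository -> (table, key column, datasource_id)
# Assumption: the IDs are taken from the June 2009 release list (172-sf ... 176-fsf).
REPOSITORIES = {
    "Sourceforge": ("projects", "proj_unixname", 172),
    "Freshmeat": ("fm_projects", "project_id", 173),
    "Rubyforge": ("rf_projects", "proj_unixname", 174),
    "Objectweb": ("ow_projects", "proj_unixname", 175),
    "Free Software Foundation": ("fsf_projects", "proj_num", 176),
}


def project_counts(conn, repositories):
    """Return {repository: number of distinct projects for its datasource_id}."""
    counts = {}
    for name, (table, key_column, datasource_id) in repositories.items():
        # Table and column names come from the fixed mapping above, never user input.
        sql = f"SELECT COUNT(DISTINCT {key_column}) FROM {table} WHERE datasource_id = ?"
        counts[name] = conn.execute(sql, (datasource_id,)).fetchone()[0]
    return counts


# Made-up sample data so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
for table, key_column, datasource_id in REPOSITORIES.values():
    conn.execute(f"CREATE TABLE {table} ({key_column} TEXT, datasource_id INTEGER)")
    conn.executemany(
        f"INSERT INTO {table} VALUES (?, ?)",
        [
            ("a", datasource_id),
            ("a", datasource_id),  # duplicate row, counted once by DISTINCT
            ("b", datasource_id),
            ("c", datasource_id - 100),  # belongs to an older collection
        ],
    )

sample_project_counts = project_counts(conn, REPOSITORIES)
print(sample_project_counts)
```

Filtering on datasource_id keeps rows from earlier collections out of the count, which is why each query on this page carries that condition.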

June data sets released

Hi moles, the June 2009 data sets are released.

172-sf
173-fm
174-rf
175-ow
176-fsf

Datamarts (sql files) and flat (delimited) files are located on our Google Code downloads area.

oss2009 requests, etc

Just back from OSS 2009 in Skövde, Sweden. (Finally figured out how to make the ö character on a Mac: hit option-u, then o.) Here are the requests I heard while sitting in talks, whether for new forges, for features that FLOSSmole could provide, or just for things that people were using or needing that might intersect with our mission here:

debian popularity contest
UDD
KDE's "10 years of data in an xml logfile"
sugarForge
ascencia?
eclipse
"git" everything
developer skills from sourceforge
rdfohloh
launchpad
gforge
fusionforge
dataportability.org
a wiki for common analyses, charts, graphs, SQL commands
a taxonomy of forges

May data, and April and May datamarts released

Go grab the May data and the April and May datamarts from our Google Code web site.

I'm backing up to Teragrid now, so Teragrid users, you should have a nice new set of data to play with RSN (real soon now).