google code

January 2012 releases

We're cruising ahead with the January 2012 releases. Grab the data from our Google Code site or from the Teragrid.

Freecode (formerly known as Freshmeat) - done
Savannah - done
Tigris - done
Rubyforge - done
Objectweb - done
Launchpad - done

Google Code - still running
Alioth - bug submitted #54
Github - will start as soon as Google Code is done

Free Software Foundation - bug still not fixed (this is my fault) #51

Interesting things: the most popular data from November... drumroll please... Google Code and Github.

Google Code data available

Google Code is our longest data collection effort each month. We've collected everything for November and posted it for your data mining pleasure. Get the files or access it on the Teragrid with direct database access (datasource_id=285).

What are the most common content licenses used by projects listed in Google Code?

Description

This chart shows the top content licenses used by projects in Google Code. There were 197,465 projects that did not list a content license.

Visualization



SQL Script

SELECT gc.content_license AS License, COUNT(proj_name) AS Count
FROM gc_projects gc
WHERE gc.datasource_id = <current>
GROUP BY License
ORDER BY Count DESC;
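
If you'd like to try the license query before connecting to the live database, here is a minimal sketch that runs it against an in-memory SQLite table from Python. The toy rows and the license_counts helper are made up for illustration, the count alias is renamed to project_count, and the real gc_projects table has more columns than the three assumed here.

```python
import sqlite3

# Toy stand-in for gc_projects (assumed columns; the rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gc_projects "
    "(datasource_id INTEGER, proj_name TEXT, content_license TEXT)"
)
conn.executemany(
    "INSERT INTO gc_projects VALUES (?, ?, ?)",
    [
        (285, "alpha", "Creative Commons 3.0 BY"),
        (285, "beta", "Creative Commons 3.0 BY"),
        (285, "gamma", "Creative Commons 3.0 BY-SA"),
        (285, "delta", None),  # project with no content license listed
        (271, "alpha", "Creative Commons 3.0 BY"),  # other datasource, filtered out
    ],
)


def license_counts(conn, datasource_id):
    """Hypothetical helper: count projects per content license for one datasource."""
    query = """
        SELECT gc.content_license AS License, COUNT(gc.proj_name) AS project_count
        FROM gc_projects gc
        WHERE gc.datasource_id = ?
        GROUP BY gc.content_license
        ORDER BY project_count DESC
    """
    return conn.execute(query, (datasource_id,)).fetchall()


rows = license_counts(conn, 285)
for license_name, count in rows:
    # Projects without a license come back as NULL (None in Python).
    print(license_name or "(none listed)", count)
```

Projects that did not list a license show up as a single NULL group, which is where a figure like the "did not list a content license" count above would come from.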

What are the most common code licenses used by projects listed in Google Code?

Description

This chart shows the top code licenses used by projects in Google Code. There were 23,810 projects that did not have a code license listed.

Visualization



SQL Script

SELECT gc.code_license AS License, COUNT(proj_name) AS Count
FROM gc_projects gc
WHERE gc.datasource_id = <current>
GROUP BY License
ORDER BY Count DESC;

June Data: Google Code, Launchpad, Github

Summer is a beautiful thing. Moles, we've got a huge Google Code release for you (ds=271), plus the revamped Launchpad (ds=272) and Github (ds=273).

Get your FRESH June data on our Google Code Downloads Page or LIVE on the Teragrid.

Tigris is fixed and is running right now. We're also writing a new collector for Alioth! Lots of new stuff.

Got a bug in the Freshmeat collector, so I'm wrangling that. Thanks to a user for reporting that bug. Don't forget we do have a bug-tracking system on Google Code.

Finally, we've got a fresh UDD upload and Debian data coming soon also. We're just so productive right now!

Also don't forget to check out our collection of Everything You Ever Wanted to Know About Code Forges - data also available on our Google Code download site.

How many projects of each team size are listed in Google Code?

Description

This chart shows the number of projects of each team size listed in Google Code.

Visualization

SQL Script

Use the results from the following query to create a temp table (called gc_temp in the second query).

SELECT datasource_id, unixname, COUNT( dev_name ) AS dev_count
FROM gc_developer_projects
WHERE datasource_id = <current>
GROUP BY datasource_id, unixname;

Use the temp table for the following query.

SELECT dev_count, COUNT(unixname) AS Count
FROM gc_temp
WHERE datasource_id = <current>
GROUP BY dev_count
ORDER BY Count DESC, dev_count;
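
If you'd rather skip the temp table, the two steps can probably be folded into one query by using the first step as a subquery. Here is a rough sketch against an in-memory SQLite table from Python. The developer rows and the team_size_distribution helper are invented for illustration, and only the three columns used by the queries above are assumed.

```python
import sqlite3

# Toy stand-in for gc_developer_projects (assumed columns; the rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gc_developer_projects "
    "(datasource_id INTEGER, unixname TEXT, dev_name TEXT)"
)
conn.executemany(
    "INSERT INTO gc_developer_projects VALUES (?, ?, ?)",
    [
        (285, "alpha", "ann"),
        (285, "alpha", "bob"),
        (285, "beta", "cy"),
        (285, "gamma", "dee"),
        (285, "delta", "eve"),
        (285, "delta", "fay"),
        (285, "delta", "gus"),
        (271, "alpha", "ann"),  # other datasource, filtered out
    ],
)


def team_size_distribution(conn, datasource_id):
    """Hypothetical helper: number of projects for each team size."""
    query = """
        SELECT dev_count, COUNT(unixname) AS project_count
        FROM (
            -- Step 1: developers per project (the temp table in the two-step version)
            SELECT unixname, COUNT(dev_name) AS dev_count
            FROM gc_developer_projects
            WHERE datasource_id = ?
            GROUP BY unixname
        ) AS team_sizes
        -- Step 2: projects per team size
        GROUP BY dev_count
        ORDER BY project_count DESC, dev_count
    """
    return conn.execute(query, (datasource_id,)).fetchall()


sizes = team_size_distribution(conn, 285)
for dev_count, project_count in sizes:
    print(dev_count, project_count)
```

The temp table is still a reasonable choice if you plan to reuse the per-project developer counts for other questions.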

How many projects are listed in each repository?

Description

This chart shows the number of projects that FLOSSmole most recently collected from each repository.

Visualization

Project Count Chart


SQL Script


-- Google Code
SELECT COUNT( DISTINCT proj_name )
FROM gc_projects
WHERE datasource_id = <current>;

-- Freshmeat
SELECT COUNT( DISTINCT project_id )
FROM fm_projects
WHERE datasource_id = <current>;

-- Launchpad
SELECT COUNT( DISTINCT project_name )
FROM lp_projects
WHERE datasource_id = <current>;

-- Rubyforge
SELECT COUNT( DISTINCT proj_unixname )
FROM rf_projects
WHERE datasource_id = <current>;

-- Free Software Foundation
SELECT COUNT( DISTINCT proj_num )
FROM fsf_projects
WHERE datasource_id = <current>;

-- Savannah
SELECT COUNT( DISTINCT project_name )
FROM sv_projects
WHERE datasource_id = <current>;

-- Tigris
SELECT COUNT( DISTINCT unixname )
FROM tg_projects
WHERE datasource_id = <current>;

-- ObjectWeb
SELECT COUNT( DISTINCT proj_unixname )
FROM ow_projects
WHERE datasource_id = <current>;
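
Since these eight queries differ only in the table and the project column, one way to run them all is a small loop. The sketch below uses Python with in-memory SQLite tables. The table and column names come from the queries above, but the count_projects helper, the toy rows, and the datasource ids (901 and up) are invented for illustration. Each repository has its own current datasource_id, so the ids are passed in per table.

```python
import sqlite3

# Table -> project-identifying column, taken from the queries above.
PROJECT_KEYS = {
    "gc_projects": "proj_name",
    "fm_projects": "project_id",
    "lp_projects": "project_name",
    "rf_projects": "proj_unixname",
    "fsf_projects": "proj_num",
    "sv_projects": "project_name",
    "tg_projects": "unixname",
    "ow_projects": "proj_unixname",
}


def count_projects(conn, current_ids):
    """Hypothetical helper: distinct project count per repository table."""
    counts = {}
    for table, key in PROJECT_KEYS.items():
        # Identifiers come from the fixed dict above, never from user input;
        # only the datasource_id is passed as a bound parameter.
        query = f"SELECT COUNT(DISTINCT {key}) FROM {table} WHERE datasource_id = ?"
        counts[table] = conn.execute(query, (current_ids[table],)).fetchone()[0]
    return counts


# Toy setup: table number i gets i distinct projects plus one duplicate row.
conn = sqlite3.connect(":memory:")
current_ids = {}
for i, (table, key) in enumerate(PROJECT_KEYS.items(), start=1):
    conn.execute(f"CREATE TABLE {table} (datasource_id INTEGER, {key} TEXT)")
    toy_rows = [(900 + i, f"p{n}") for n in range(i)] + [(900 + i, "p0")]
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", toy_rows)
    current_ids[table] = 900 + i  # made-up datasource id for this table

counts = count_projects(conn, current_ids)
for table, n in counts.items():
    print(table, n)
```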

March data for Google Code posted

Here is the March 2011 data for Google Code projects, available on our own GC page. Upload to Teragrid is happening now if you prefer direct db access.

The datasource is 252. Available data includes:

--basic project info for each project on Google Code
--links for each project
--people on each project (some hashed)
--blogs for each project
--labels for each project
--groups for each project

January file releases

We just released data files for the following forges. You can head over to the FLOSSmole data downloads page at Google Code to download any of these files, or wait for them to be released to the Teragrid for live querying (shortly!).

datasource_id  forge_id  abbreviation  name
237            2         FM            Freshmeat
238            3         RF            Rubyforge
239            4         OW            ObjectWeb
240            5         FSF           Free Software Foundation
241            10        SV            Savannah
243            12        GC            Google Code
244            13        TG            Tigris

Still running...
245            14        LP            Launchpad

Broken...
242            11        GH            Github

Adding 1000 data files to Google Code

I've got about 1000 files that were hosted on Sourceforge (and still are), but I'm trying to move all our files into one place. I am running scripts all day to download these from Sourceforge, relabel them, and move them to Google Code.

If you see old files showing up at Google Code, that's why! Don't forget that you can use the search box there if you are looking for a specific file or topic. Also, send email to the mailing list if you can't find something you're looking for and I'll help you out.

UPDATE: this action apparently broke the Google Code files download page for our project. I've submitted a bug report.
