google code

Adding 1000 data files to Google Code

I've got about 1000 files that were (and still are) hosted on Sourceforge, but I'm trying to move all our files into one place. I am running scripts all day to download these from SF, relabel them, and move them to Google Code.

If you see old files showing up at Google Code, that's why! Don't forget that you can use the search box there if you are looking for a specific file or topic. Also, send email to the mailing list if you can't find something you're looking for and I'll help you out.

UPDATE: this action apparently broke the Google Code files download page for our project. I've submitted a bug report.

Google Code Sept 2010 data out

Whew! Google Code is collected, parsed, and released. Backup to TeraGrid is happening now (just as soon as I solve this little issue of "disk quota exceeded" - fun!). In the meantime, go to the FLOSSmole Google Code Downloads Page and get your hot fresh data.

Remember, the files marked "datamarts" are SQL code you can use to make your own version of the database. The raw delimited files are marked .txt.bz2. The datasource_id for Google Code this release is 235.
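
If you just want to peek inside one of the raw .txt.bz2 files without unpacking it first, here is a minimal Python sketch. It assumes the file is tab-delimited UTF-8 text, which you should check against the file you actually downloaded. The function name and the example file name are made up for illustration and are not part of any FLOSSmole tooling.

```python
import bz2
import csv


def read_delimited_bz2(path, delimiter="\t"):
    """Return the rows of a bz2-compressed delimited text file as lists of strings.

    Assumption: the delimiter is a tab and the text is UTF-8. Pass a
    different delimiter if the file you downloaded uses another one.
    """
    # bz2.open in text mode ("rt") decompresses while reading, so the
    # file never has to be unpacked on disk.
    with bz2.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.reader(handle, delimiter=delimiter))


# Example use (hypothetical file name):
#   rows = read_delimited_bz2("gc_projects.txt.bz2")
#   print(rows[:5])
```

This reads the whole file into memory, which is fine for a quick look but may not suit the largest files. For those, loop over csv.reader directly instead of wrapping it in list().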

New Google Code Data Released

Hello moles! I've released a new set of Google Code project data to our own downloads page (on Google Code, no less!) - the datasource_id is 226.

This data took over a month to collect. Included are the following:

--project names (info)
--project license, code and content (info)
--project summary (info)
--project description (info)
--project activity level (info)
--who works on what project and what their role is (people)
--what blogs are listed for each project (blogs)
--what links are listed for each project (links)
--what labels are used to describe each project (labels)

May 2010 Data released

May 2010 data is released for some forges.

-Freshmeat (datasource 218)
-Rubyforge (datasource 219)
-ObjectWeb (datasource 220)
-Free Software Foundation (datasource 221)
-Google Code (datasource 222) - list of projects only

Our collectors for Savannah, Sourceforge, Github, Tigris, and Launchpad are all undergoing maintenance at the moment.

UPDATE May 28, 2010
-Savannah data has been released (datasource 224)

Link to download the FLOSSmole data on Google Code.

February 2010 Data Released

Lots of new data for you to peruse out on our FLOSSmole Data Downloads Page.

Here's what's out there, recently added:

Google Code, March 2010 (GC) - list of all GC projects donated by Audris Mockus (HUGE THANK YOU TO AUDRIS FOR THIS!!)
Freshmeat, February 2010 (FM)
Objectweb, February 2010 (OW)
Rubyforge, February 2010 (RF)
Github, February 2010 (GH)
Free Software Foundation, February 2010 (FSF)
Savannah, February 2010 (SV)
and Sourceforge from December 2009 (SF)

We have another set of bugs to fix with Sourceforge collection this year (2010), but those fixes are forthcoming. I'm running a collection now. Hopefully the data will be good. We may even have stats this time. Hallelujah.

Also, thanks to my phenomenal undergraduate superstar Steven Norris, Tigris is coming soon, with Debian after that! We are rocking the repository collection...
