Details about the repository collections
Submitted by megan on September 14, 2011 - 5:55pm
Here is the pre-print copy of the paper on forges that David and I have written. I am going to present it at HICSS 45 in January.
Squire, M. and Williams, D. (2012). Describing the software forge ecosystem. 45th Hawaii International Conference on System Sciences. Maui, Hawaii. January 4-7. Forthcoming.
Submitted by megan on August 31, 2011 - 4:37pm
Here is the status of each collection for September 2011:
The stages are:
1. collecting (some projects have sub-stages here)
2. parsing
3. files released to Google Code
4. data released to Teragrid
UPDATED as of 05-Sep-2011 at 12:41PM:
Freshmeat - collector/parser being re-written for accuracy and bugfixes
Rubyforge - files released to Google Code & data uploaded into Teragrid
Objectweb - files released to Google Code & data uploaded into Teragrid
Submitted by megan on June 20, 2011 - 5:00am
Summer is a beautiful thing. Moles, we've got a huge Google Code release for you (ds=271), and the revamped Launchpad (ds=272), and also Github (ds=273).
Get your FRESH June data on our Google Code Downloads Page or LIVE on the Teragrid.
Tigris is fixed and is running right now. We're also writing a new collector for Alioth! Lots of new stuff.
Submitted by megan on March 4, 2011 - 11:33am
Most of the March data has been released to our page on Google Code. Included forge collections are: Free Software Foundation, Freshmeat, Rubyforge, Objectweb, Savannah, Tigris. Google Code is still running. Github and Launchpad are not functional right now (waiting on bug fixes).
Submitted by megan on March 3, 2011 - 7:41pm
I've backed up the Jan/Feb data to Teragrid for your live queries. Be sure to log in there and use your database querying tool of choice to check out the data. (If you need an account, read these instructions for how to get yourself an account.)
The datasource_id information is as follows:
237 FM-Freshmeat
238 RF-Rubyforge
239 OW-ObjectWeb
240 FSF-FreeSoftwareFndtn
241 SV-Savannah
243 GC-GoogleCode
244 TG-Tigris
246 - Debian metrics
Submitted by megan on February 16, 2011 - 10:35am
One of the undergraduate students working on this project, Carter Kozak, recently collected all of the Debian package data for January and parsed it for relevant software engineering metrics. He concentrated on C/C++ code. He also integrated some Debian metadata, such as popcon (popularity contest) and sources.gz. He has written up his findings in a paper (as yet unpublished, but just you wait!) and has donated his data to FLOSSmole. You can find the data on our FLOSSmole data downloads page on Google.
Submitted by megan on February 15, 2011 - 11:04am
Just released data files for the following forges. You can head over to the FLOSSmole data downloads page at Google Code to download any of these files, or wait for them to be released to the Teragrid for live querying (shortly!).
datasource_id, forge_id, abbreviation, name
237 2 FM Freshmeat
238 3 RF Rubyforge
239 4 OW ObjectWeb
240 5 FSF Free Software Foundation
241 10 SV Savannah
243 12 GC Google Code
244 13 TG Tigris
Still running...
245 14 LP Launchpad
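If you want to use these IDs in your own scripts, here is a minimal sketch of turning the rows above into a lookup table keyed by datasource_id. The names `Forge`, `ROWS`, `parse_rows` and `FORGES` are made up for this illustration and are not part of any FLOSSmole tool; the rows are copied from the list above, including Launchpad, which was still running at the time.

```python
from typing import Dict, NamedTuple


class Forge(NamedTuple):
    """One row of the list above (hypothetical record type)."""
    datasource_id: int
    forge_id: int
    abbreviation: str
    name: str


# Rows copied from the post: datasource_id, forge_id, abbreviation, name.
ROWS = """\
237 2 FM Freshmeat
238 3 RF Rubyforge
239 4 OW ObjectWeb
240 5 FSF Free Software Foundation
241 10 SV Savannah
243 12 GC Google Code
244 13 TG Tigris
245 14 LP Launchpad
"""


def parse_rows(text: str) -> Dict[int, Forge]:
    """Build a datasource_id -> Forge mapping from whitespace-separated rows."""
    forges: Dict[int, Forge] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split at most three times so multi-word names stay in one piece.
        ds_id, forge_id, abbreviation, name = line.split(maxsplit=3)
        forges[int(ds_id)] = Forge(int(ds_id), int(forge_id), abbreviation, name)
    return forges


FORGES = parse_rows(ROWS)

if __name__ == "__main__":
    for forge in FORGES.values():
        print(forge.datasource_id, forge.abbreviation, forge.name)
```

Keying on datasource_id rather than forge_id is a choice made for this sketch, since datasource_id is the number quoted in the release announcements.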
Submitted by megan on February 11, 2011 - 9:28am
I've got about 1000 files that were hosted on Sourceforge (still are) but I'm trying to move all our files into one place. I am running scripts all day to d/l these from SF, relabel them, and move them to Google Code.
If you see old files showing up at Google Code, that's why! Don't forget that you can use the search box there if you are looking for a specific file or topic. Also, send email to the mailing list if you can't find something you're looking for and I'll help you out.
Submitted by megan on February 10, 2011 - 9:20am
Freshmeat, Objectweb, Rubyforge, Savannah, FSF, Tigris - all done, waiting for release
The Github collector is broken; I have a student working on fixing that now. Github suddenly made it hard to seed the initial project list, so we're trying to figure out a way to get the entire corpus of projects. I'm getting flashbacks of when SF got really big and made it difficult for everyone to work with.
Google Code is plugging along. We're on the 7th out of 8 collection processes. Won't be too much longer on that.
Submitted by megan on September 21, 2010 - 10:47am
Just released Github data for September today. This took about 10 days to collect, parse and release. You can download the files here (along with Freshmeat, Rubyforge, Objectweb, Free Software Foundation, Savannah, Tigris, etc), or wait for the next Teragrid backup if you want direct database access. (Google Code is collecting now, and upon completion of that, I'll do a final September Teragrid backup.)