Collection information |
We're cruising ahead with the January 2012 releases. Grab the data from our Google Code site or from the Teragrid. Status so far: Freecode (formerly known as Freshmeat): done; Google Code: still running; Free Software Foundation: bug still not fixed (this is my fault), #51. Interesting things: the most popular data from November was... drumroll please... Google Code and Github. |
|||
Google Code is our longest data collection effort each month. We've collected everything for November and posted it for your data mining pleasure. Get the files, or access the data on the Teragrid with direct database access (datasource_id=285). |
|||
Here is the status of the November 2011 collection: done & ready to download on Google Code or query in Teragrid... still collecting... collectors broken and waiting to be fixed... |
|||
1. The Free Software Foundation directory changed its layout to a wiki, so we're rewriting our collector to parse RDF instead. This will change the tables we use for FSF data. 2. We were able to convince our dear colleague Audris Mockus to run his Google Code collector and gather the latest list of project names for us. SWEET! This means a Google Code run is imminent. 3. UDD and Debian still need to be re-run and automated. 4. In case you are keeping track of the different forges, Berlios is shutting down as of Dec 31 2011. We're still plugging along with all this stuff. Hope you are finding the data helpful. Let us know what we can provide. (And join the Mailing List!) |
|||
Here is the pre-print copy of the paper on forges that David and I have written. I am going to present it at HICSS 45 in January. Squire, M. and Williams, D. (2012). Describing the software forge ecosystem. 45th Hawaii International Conference on System Sciences. Maui, Hawaii. January 4-7. Forthcoming. |
|||
Here is the status of each collection for September 2011 (stages UPDATED as of 05-Sep-2011 at 12:41PM): Rubyforge: files released to Google Code & data uploaded into Teragrid; Objectweb: files released to Google Code & data uploaded into Teragrid; Free Software Foundation Directory: files released to Google Code & data uploaded into Teragrid; Savannah: files released to Google Code & data uploaded into Teragrid; Github: files released to Google Code & data uploaded into Teragrid; Tigris: files released to Google Code & data uploaded into Teragrid; Google Code: I am looking at getting a new list of projects, as the one we've been using is quite old now (Oct 2010); Launchpad: files released to Google Code & data uploaded into Teragrid; Alioth: files released to Google Code & data uploaded into Teragrid (new tables made); Debian Metrics: waiting on README; Ultimate Debian Database: importing into database, error on table create. |
|||
Summer is a beautiful thing. Moles, we've got a huge Google Code release for you (ds=271), the revamped Launchpad (ds=272), and also Github (ds=273). Get your FRESH June data on our Google Code Downloads Page or LIVE on the Teragrid. Tigris is fixed and is running right now. We're also writing a new collector for Alioth! Lots of new stuff. Got a bug in the Freshmeat collector, so I'm wrangling that. Thanks to a user for reporting that bug. Don't forget we do have a bug-tracking system on Google Code. Finally, we've got a fresh UDD upload and Debian data coming soon. We're just so productive right now! Also don't forget to check out our collection of Everything You Ever Wanted to Know About Code Forges; that data is also available on our Google Code download site. |
|||
Most of the March data has been released to our page on Google Code. Included forge collections are: Free Software Foundation, Freshmeat, Rubyforge, Objectweb, Savannah, Tigris. Google Code is still running. Github and Launchpad are not functional right now (waiting on bug fixes). There are two ways to get the data: download the files from our page on Google Code, or log into our database on the Teragrid and live-query the data. Read these instructions on getting a login. Have fun! |
|||
I've backed up the Jan/Feb data to Teragrid for your live queries. Be sure to log in there and use your database querying tool of choice to check out the data. (If you need an account, read these instructions on how to get one.) The datasource_id information is as follows: 237 FM-Freshmeat. Enjoy!! |
|||
One of the undergraduate students working on this project, Carter Kozak, recently collected all of the Debian package data for January and parsed it for relevant software engineering metrics. He concentrated on C/C++ code. He also integrated some Debian metadata, such as popcon (popularity contest) and sources.gz. He has written up his findings in a paper (as yet unpublished, but just you wait!) and has donated his data to FLOSSmole. You can find the data on our FLOSSmole data downloads page on Google. |
|||
