Blog

February GitHub data released

Submitted by megan on March 1, 2012 - 9:23am

February data has been released for GitHub.

Get the data from our Google Code downloads page, or request direct database access.

The GitHub data includes the following values:
project name
developer name
description
private (yes/no)
fork number
homepage
number of watchers
open issues
...and all the XML values that these fields are based on!
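
If you want to pull these values into a script, here is a minimal sketch of one way to do it. It assumes a tab-delimited file with a header row; the column names, the `GithubProject` class, and the sample row are all made up for illustration and may not match the released files. It also assumes "fork number" means a count of forks.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class GithubProject:
    # Field names are hypothetical; they mirror the values listed in the post.
    project_name: str
    developer_name: str
    description: str
    private: bool
    forks: int  # assumption: "fork number" is a count of forks
    homepage: str
    watchers: int
    open_issues: int


def read_projects(handle):
    """Parse tab-delimited rows (with a header line) into GithubProject records."""
    reader = csv.DictReader(handle, delimiter="\t")
    projects = []
    for row in reader:
        projects.append(
            GithubProject(
                project_name=row["project_name"],
                developer_name=row["developer_name"],
                description=row["description"],
                private=row["private"].strip().lower() in ("yes", "true", "1"),
                forks=int(row["forks"]),
                homepage=row["homepage"],
                watchers=int(row["watchers"]),
                open_issues=int(row["open_issues"]),
            )
        )
    return projects


# Made-up sample data, for illustration only.
SAMPLE = (
    "project_name\tdeveloper_name\tdescription\tprivate\tforks\thomepage\twatchers\topen_issues\n"
    "example-project\texample-dev\tA sample project\tno\t3\thttp://example.org\t12\t1\n"
)

projects = read_projects(io.StringIO(SAMPLE))
```

To read a real file, pass an open file handle to `read_projects` instead of the `io.StringIO` wrapper, and adjust the column names to whatever the download actually uses.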

Have fun!

Data Resources:

Collection information

Tags:

github

February Google Code data released

Submitted by megan on February 27, 2012 - 12:50pm

Google Code data has been released for January/February 2012.

Get the data from our Google Code downloads page, or request direct database access.

Be aware that there is one open bug for the Google Code collection that may affect your use of this data.

Data Resources:

Collection information

Tags:

google

January 2012 releases

Submitted by megan on January 18, 2012 - 1:34pm

We're cruising ahead with January 2012 releases. Grab the data from our Google Code site or from the TeraGrid.

Freecode (formerly known as Freshmeat) - done
Savannah - done
Tigris - done
Rubyforge - done
Objectweb - done
Launchpad - done

Data Resources:

Collection information

Google Code data available

Submitted by megan on November 21, 2011 - 11:01am

Google Code is the collection that takes us the longest each month. We've collected everything for November and posted it for your data mining pleasure. Get the files, or query the data on the TeraGrid with direct database access (datasource_id=285).
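
For anyone new to direct database access, the idea is to filter rows by the datasource_id for the collection you want. Here is a rough sketch of that pattern. It uses an in-memory SQLite database as a stand-in, and the `projects` table, its columns, and the sample rows are hypothetical; the real schema and connection details will differ.

```python
import sqlite3

DATASOURCE_ID = 285  # the November Google Code collection named in the post

# Stand-in database; the table and rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (datasource_id INTEGER, project_name TEXT)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?)",
    [(284, "older-project"), (285, "example-a"), (285, "example-b")],
)


def projects_for(connection, datasource_id):
    """Return project names belonging to one collection run, sorted by name."""
    cursor = connection.execute(
        "SELECT project_name FROM projects "
        "WHERE datasource_id = ? ORDER BY project_name",
        (datasource_id,),
    )
    return [name for (name,) in cursor]


november = projects_for(conn, DATASOURCE_ID)
```

Passing the id as a query parameter, rather than pasting it into the SQL string, keeps the same function usable for any other collection run.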

Data Resources:

Collection information

Tags:

google code

Freshmeat becomes Freecode, and how our data is affected

Submitted by megan on November 17, 2011 - 11:51am

Three things happened recently that affect our Freshmeat collection:

1. Freshmeat announced a name change to Freecode.
2. We have an issue (issue #43) that talks about how the trove definitions for Freshmeat are out of date.
3. Freshmeat replaced trove with tagging and we missed the memo.

What I've done is as follows:

For item 1: I decided not to rename our abbreviation for Freshmeat. It will remain "FM".

Tags:

freshmeat

freecode

November 2011 data entered

Submitted by megan on November 2, 2011 - 12:49pm

Here is the status of the November 2011 collection:

done & ready to download on Google Code or query in TeraGrid...
============
RUBYFORGE
OBJECTWEB
TIGRIS
LAUNCHPAD
SAVANNAH
ALIOTH
GITHUB

still collecting...
============
GOOGLE

Data Resources:

Collection information

FLOSSmole as a catalyst for research

Submitted by megan on October 14, 2011 - 8:53am

One of the papers at the 2011 OSS conference is entitled "Building Knowledge in Open Source Software Research in Six Years of Conferences". It surveys the contributions of papers presented at the OSS conferences, and builds social networks of the papers, identifying research streams along the way.

Findings particular to FLOSSmole:

"Cluster #82. The largest cluster originates from node #82. Paper #82 introduces the OSSmole project (later called FLOSSmole). OSSmole is a repository of data, scripts, and analysis of data collected from OSS projects."

and

Tags:

research

conference

Current challenges for Fall

Submitted by megan on October 11, 2011 - 10:06am

1. The Free Software Foundation directory changed its layout to a wiki, so we're rewriting our collector to parse RDF instead. This will change the tables we use for FSF data.

2. We were able to convince our dear colleague Audris Mockus to run his Google Code collector and gather the latest list of project names for us. SWEET! This means a Google Code run is imminent.

3. UDD and Debian still need to be re-run and automated.

Data Resources:

Collection information

Forges paper pre-print

Submitted by megan on September 14, 2011 - 5:55pm

Here is the pre-print copy of the paper on forges that David and I have written. I am going to present it at HICSS 45 in January.

Squire, M. and Williams, D. (2012). Describing the software forge ecosystem. 45th Hawaii International Conference on System Sciences. Maui, Hawaii. January 4-7. Forthcoming.

Data Resources:

Collection information

Tags:

forges

September 2011 data, in progress

Submitted by megan on August 31, 2011 - 4:37pm

Here is the status of each collection for September 2011:

The stages are:
1. collecting (some projects have sub-stages here)
2. parsing
3. files released to Google Code
4. data released to Teragrid

UPDATED as of 05-Sep-2011 at 12:41PM:
Freshmeat - collector/parser being re-written for accuracy and bugfixes

Rubyforge - files released to Google Code & data uploaded into TeraGrid

Objectweb - files released to Google Code & data uploaded into TeraGrid

Data Resources:

Collection information
