Collection information

Details about the repository collections

RubyGems collections updated

The RubyGems collections for datasource_id 61243 (July 2017) have been updated. They are available in the database under the "rubygems" schema, or as flat files (latest datasource_id only) on the FLOSSmole data server.

Some sample queries:


ObjectWeb Collections updated

We updated our ObjectWeb collections. You can find out more about the data we have, download flat data files in raw format, or query the data directly from the FLOSSmole database. The most recent datasource_id is 70912.

Some sample queries:

What licenses are used most by ObjectWeb projects?
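A question like this one usually comes down to a GROUP BY on the license column, filtered to a single datasource_id. Here is a minimal sketch of that idea. The table name project_licenses, its columns, and the demo rows are all hypothetical and are not the actual FLOSSmole schema. It runs against an in-memory SQLite database so it can be tried without database access; the FLOSSmole database itself is MySQL, where the driver's placeholder syntax may differ.

```python
import sqlite3


def top_licenses(conn, datasource_id, limit=5):
    """Return (license, project_count) rows, most common license first."""
    # Hypothetical table and column names, for illustration only.
    query = """
        SELECT license, COUNT(DISTINCT project_name) AS project_count
        FROM project_licenses
        WHERE datasource_id = ?
        GROUP BY license
        ORDER BY project_count DESC, license ASC
        LIMIT ?
    """
    return conn.execute(query, (datasource_id, limit)).fetchall()


def build_demo_db():
    """Create an in-memory database holding a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE project_licenses ("
        "datasource_id INTEGER, project_name TEXT, license TEXT)"
    )
    demo_rows = [
        (70912, "alpha", "LGPL"),        # made-up project names and licenses
        (70912, "beta", "LGPL"),
        (70912, "gamma", "Apache-2.0"),
        (1, "older", "BSD"),             # placeholder id for an earlier collection
    ]
    conn.executemany("INSERT INTO project_licenses VALUES (?, ?, ?)", demo_rows)
    return conn


if __name__ == "__main__":
    for license_name, count in top_licenses(build_demo_db(), 70912):
        print(license_name, count)
```

Filtering on datasource_id matters because each collection run is stored under its own id; without the filter, a project collected in several runs would be counted more than once.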


Microsoft CodePlex data

CodePlex was Microsoft's open source code forge. It began in 2006 and shut down in 2017. We collected the data at the time of the shutdown and have made it available here at FLOSSmole for anyone to use.

Data is available in raw format or in the FLOSSmole database.

Sample graphics


Google Code Project Create Dates

Project creation dates for every Google Code project, from February 4, 2011 (when Google Code first started tracking project creation dates) until March 12, 2015, when Google Code was shut down.


LKML (email) study: data/paper available

We presented this paper at OpenSym 2016 this week.

Schneider, D., Spurlock, S., and Squire, M. (2016). Differentiating Communication Patterns of Leaders on the Linux Kernel Mailing List. In Proceedings of the 12th International Symposium on Open Collaboration (OpenSym 2016).


New "Apache Projects & Contributors" data dump

I spent a few days in May updating the list of all Apache project contributors (full name and Apache system name, when available) and their organizations (when available). This data set was first released in 2013 in the MSR paper "Project Roles in the Apache Foundation: A Data Set".

Fields:


RubyGems data updated June 2016


Hello moles, the latest RubyGems data has been collected. We now have two RubyGems collections:

  • 61240: November 2015
  • 61243: June 2016

The data can be found in two places:

Tables include:


Bitcoin-dev, Ubuntu, Perl6, Django, Puppet IRC logs are updated

Thanks to the work of my two summer research assistants, Evan Ashwell & Greg Batchelor, the IRC logs for #bitcoin-dev, #perl6, #ubuntu, #django, and puppet (#gen, #dev, and #razor) have been updated.

Things to know:


RubyGems.org collection, Nov 2015

We have added RubyGems.org data under datasource_id 61240. RubyGems.org is the official gem host for Ruby projects.

The scripts we used to collect this data are available on GitHub, and the SQL dumps are available on our data server. Direct database access is also available: existing database users have been given access to the new database, called 'Rubygems', on the MySQL server.


August 2015 Launchpad data

We have added Launchpad data under datasource_id 58458. Launchpad is a repository for projects affiliated with Ubuntu. Summer research assistant Gavan Roth wrote some scripts to collect this data.

  • Download the flat files, or
  • Access and query this data via the MySQL interface

Here is a query to show some of the data that is available:
