FLOSSmole

Last Rubyforge Collection

Submitted by megan on May 15, 2014 - 1:19pm

The last Rubyforge collection happened yesterday, under datasource_id 12987. All the data is available on our file downloads site or in the database (ossmole_merged schema, tables prefixed 'rf'; use datasource_id=12987 in your SQL queries).
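
For anyone who wants a starting point, here is a minimal sketch of filtering on that final datasource_id from Python. The table name rf_projects and the column proj_unixname are assumptions made up for illustration (the post only says the tables are prefixed 'rf'), and an in-memory SQLite database stands in for the MySQL server so the snippet runs on its own.

```python
import sqlite3

LAST_RUBYFORGE_DATASOURCE_ID = 12987  # the final collection, per the post


def fetch_last_collection(conn, table="rf_projects"):
    """Return every row of `table` from the last Rubyforge collection.

    `table` is a hypothetical name; substitute a real 'rf'-prefixed table.
    """
    # Table names cannot be bound as query parameters, so check the name
    # before formatting it into the SQL string.
    if not (table.startswith("rf") and table.replace("_", "").isalnum()):
        raise ValueError("expected an 'rf'-prefixed table name")
    # SQLite uses '?' placeholders; most MySQL drivers use '%s' instead.
    query = f"SELECT * FROM {table} WHERE datasource_id = ?"
    return conn.execute(query, (LAST_RUBYFORGE_DATASOURCE_ID,)).fetchall()


# Stand-in data so the sketch runs without access to the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rf_projects (proj_unixname TEXT, datasource_id INTEGER)")
conn.executemany(
    "INSERT INTO rf_projects VALUES (?, ?)",
    [("example_a", 12987), ("example_b", 8080), ("example_c", 12987)],
)
rows = fetch_last_collection(conn)
```

Binding the id as a parameter rather than pasting it into the SQL string keeps the same function reusable for earlier collections.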

RIP Rubyforge! We have been collecting from there for 10 years. Charts and graphs coming soon.

[Image: rubyforge shuts down]

Data Resources:

Collection information

Tags:

rubyforge

12987


Django IRC data loaded into database

Submitted by megan on March 25, 2014 - 4:47pm

Django is a Python web framework and, of course, an open source project. I have downloaded the entire collection of IRC logs for this project, starting with the first logs from 2011. The logs have been split into lines, parsed into fields (message, sender, time, date, etc.), and loaded into the ossmole_merged database on our live MySQL server, in a table called django_irc.

Each datasource_id represents one day's log file. Right now we have datasource_ids 8442 through 9435.
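
Since each datasource_id is one day's log, grouping by it gives a per-day message count. The sketch below shows the idea, assuming one row per log line. Only the table name django_irc and the datasource_id range come from this post; the line_message column and the sample rows are made up for illustration, and an in-memory SQLite database stands in for the live MySQL server.

```python
import sqlite3

FIRST_ID, LAST_ID = 8442, 9435  # datasource_id range given in the post

# One datasource_id per daily log file, so the size of the inclusive
# range is the number of daily logs loaded so far.
num_daily_logs = LAST_ID - FIRST_ID + 1

# Stand-in for the django_irc table; line_message is a hypothetical column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE django_irc (datasource_id INTEGER, line_message TEXT)")
conn.executemany(
    "INSERT INTO django_irc VALUES (?, ?)",
    [
        (8442, "hi"),
        (8442, "hello"),
        (8443, "any docs on forms?"),
        (9999, "outside the Django range"),
    ],
)

# Count rows per day, restricted to the Django IRC datasource_ids.
messages_per_day = conn.execute(
    "SELECT datasource_id, COUNT(*) FROM django_irc "
    "WHERE datasource_id BETWEEN ? AND ? "
    "GROUP BY datasource_id ORDER BY datasource_id",
    (FIRST_ID, LAST_ID),
).fetchall()
```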

Data Resources:

Collection information

Tags:

django


New March 2014 data released

Submitted by megan on March 11, 2014 - 11:57am

Some new forge data, collected 04-Mar-2014, has been released.

The datasource_ids are as follows:

8079 - freecode
8080 - rubyforge
8081 - objectweb
8082 - savannah
8083 - tigris
8084 - alioth

IRC data:
8085 - 8134: Apache ServiceMix
8135 - 8185: Apache Camel
8186 - 8236: Apache ActiveMQ
8237 - 8287: Apache CXF
8288 - 8338: Apache Aries
8339 - 8389: Apache Kalumet
8390 - 8440: Apache Karaf
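
If you are working with rows from several of these sources at once, a small lookup from datasource_id back to its source can be handy. Here is one way it might be sketched in Python. The ranges are copied from the lists above; the function name source_for and the choice of a binary search are just illustrative.

```python
import bisect

# (first_id, last_id, source), copied from the lists above and kept
# sorted by first_id so a binary search can find the candidate range.
RANGES = [
    (8079, 8079, "freecode"),
    (8080, 8080, "rubyforge"),
    (8081, 8081, "objectweb"),
    (8082, 8082, "savannah"),
    (8083, 8083, "tigris"),
    (8084, 8084, "alioth"),
    (8085, 8134, "Apache ServiceMix"),
    (8135, 8185, "Apache Camel"),
    (8186, 8236, "Apache ActiveMQ"),
    (8237, 8287, "Apache CXF"),
    (8288, 8338, "Apache Aries"),
    (8339, 8389, "Apache Kalumet"),
    (8390, 8440, "Apache Karaf"),
]
STARTS = [first for first, _, _ in RANGES]


def source_for(datasource_id):
    """Return the source name for a March 2014 datasource_id, or None."""
    # Find the last range whose first_id is <= datasource_id.
    i = bisect.bisect_right(STARTS, datasource_id) - 1
    if i >= 0 and datasource_id <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

For example, source_for(8135) returns "Apache Camel", and an id outside this release, such as 8441, returns None.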

Data Resources:

Collection information


New Apache project IRC data

Submitted by megan on January 22, 2014 - 10:44am

Hello moles! Happy January. Here are some fresh new data sources for your mining pleasure:

1. Freenode channel list and topics (all public channels with 3 or more users). The table is called "fn_irc_channels".
2. Apache ActiveMQ IRC logs (one datasource_id per day, one row per message).
3. Apache Aries IRC logs
4. Apache Camel IRC logs
5. Apache CXF IRC logs
6. Apache Karaf IRC logs
7. Apache Kalumet IRC logs
8. Apache ServiceMix IRC logs

Here is a sample of what the structure looks like for items 2-8:

Data Resources:

Collection information

Tags:

apache

irc

freenode


New Apache People-Roles-Projects data

Submitted by megan on December 24, 2013 - 6:13pm

Hot off the presses! Another update to the Apache people-roles-projects data:

Datasources 1578-1585 have updated information on people working on Apache projects, including committer lists, PMC lists, PMC chairs, etc.

Timezones are now being collected as well.

This is an update to the original dataset described in the paper "Project Roles in the Apache Software Foundation: A Dataset" (2013), written by yours truly.

Data Resources:

Collection information

Tags:

apache

1585


Apache Camel data

Submitted by megan on December 16, 2013 - 1:53pm

We have released several files of Apache Camel IRC log data.

Sources:
originally stored by Dan Kulp
More about Apache Camel

Sample Queries for the IRC data:

Data Resources:

Examples

Tags:

apache

December 2013 data released

Submitted by megan on December 16, 2013 - 1:29pm

December data has been released. We have a few old standbys (fc, rf, ow, sv, al) and some hot fresh data as well.

What is new, you ask? Well, we have some IRC chat log data for the Apache project Camel [1]. A nice new social data set, all parsed and organized into relational database format for you to query.

Data Resources:

Collection information


Rubyforge goes into the sunset

Submitted by megan on December 1, 2013 - 11:12am

We've been collecting Rubyforge data almost since the beginning. Last month we reported on the decline of Rubyforge in light of newer forges, like Github. Here's the chart we drew:

Now we've got this lovely pair of images to contend with:

and

Data Resources:

Collection information

Tags:

rubyforge


New home for flat files

Submitted by megan on October 2, 2013 - 2:34pm

Many of you know that we provide flat files of our data for download by anyone at any time. Until recently we hosted these on Google Code (before 2009 or so, we hosted them on Sourceforge). Google Code has now announced that projects will not be able to offer file downloads as of January 2014, so we had to find a new home for our files.

Data Resources:

Collection information

Tags:

flossdata

A decade of forges

Submitted by megan on September 12, 2013 - 8:21am

We here at FLOSSmole have been gathering data about how free, libre, and open source software is made for about 10 years now, actually a little more.

In that time, a lot has changed in the forge landscape, both with the players and with the tools.

Just for fun, I decided to run a few quick queries to show the ascendance of Github and the concurrent decline of some smaller forges. These two graphs show the rate of new project creation (called 'registration' on Rubyforge and 'creation' at Github - and yes, the Github numbers do include forks).

Tags:

rubyforge

github

