RubyGems data updated June 2016

Hello moles, the latest RubyGems data has been collected. We now have two RubyGems collections:

  • 61240: November 2015
  • 61243: June 2016

The data can be found in two places:

Tables include:

  • rubygems_project_authors (the author(s) listed for each gem/project)
  • rubygems_project_create_dates (earliest known release date for a gem/project)
  • rubygems_project_devdep (development dependencies for each gem/project)
  • rubygems_project_facts (basic project metadata scraped from project page)
  • rubygems_project_links (the list of links provided by each gem/project, e.g. home page, documentation, etc.)
  • rubygems_project_owners (the owner(s) listed for each gem/project)
  • rubygems_project_pages (the HTML and RSS where we got this data; one per gem, per datasource_id)
  • rubygems_project_rtdep (runtime dependencies for each gem/project)
  • rubygems_project_versions (each time the gem/project is released, it creates a new version)
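
Since both collections live in the same tables and are distinguished by datasource_id, one natural use is to compare them. The following is only a rough sketch of that idea, not our collection code: it uses an in-memory SQLite database as a stand-in for the MySQL server, and the project_name column, the build_demo_db and gems_added_between helpers, and the example gem names are all assumptions made for illustration. Check the real schema before adapting it.

```python
import sqlite3

# datasource_id values from the list of collections above.
OLD_COLLECTION = 61240  # November 2015
NEW_COLLECTION = 61243  # June 2016


def gems_added_between(conn, old_id, new_id):
    """Return sorted names of gems present in new_id but absent from old_id.

    Assumes rubygems_project_pages has project_name and datasource_id
    columns; project_name is a hypothetical column name.
    """
    rows = conn.execute(
        """
        SELECT project_name
        FROM rubygems_project_pages
        WHERE datasource_id = ?
          AND project_name NOT IN (
              SELECT project_name
              FROM rubygems_project_pages
              WHERE datasource_id = ?
          )
        ORDER BY project_name
        """,
        (new_id, old_id),
    ).fetchall()
    return [name for (name,) in rows]


def build_demo_db():
    """Build a tiny in-memory stand-in table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rubygems_project_pages ("
        "project_name TEXT NOT NULL, "
        "datasource_id INTEGER NOT NULL, "
        "PRIMARY KEY (project_name, datasource_id))"
    )
    conn.executemany(
        "INSERT INTO rubygems_project_pages VALUES (?, ?)",
        [
            ("example_gem_a", OLD_COLLECTION),  # present in both collections
            ("example_gem_a", NEW_COLLECTION),
            ("example_gem_b", NEW_COLLECTION),  # only in the newer collection
        ],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(gems_added_between(demo, OLD_COLLECTION, NEW_COLLECTION))
```

Because rubygems_project_pages holds one row per gem per datasource_id, it is a convenient table for this kind of set difference; the same pattern should carry over to the other tables if they are keyed the same way.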

Mining Software Repositories '16 paper, slides & data

This weekend I'll be presenting at Mining Software Repositories 2016 in Austin, TX. My talk is in the data sets track and is entitled Data Sets: The Circle of Life in Ruby Hosting, 2003-2015 (PDF). Here are the slides. And here are the quick links to the flat data: RF and RG.

RubyGems.org collection, Nov 2015

We have added RubyGems.org data under datasource_id 61240. RubyGems.org is the official gem host for Ruby projects.

The scripts we used to collect this data are available on GitHub, and the SQL dumps are available on our data server. Direct database access is also available. Existing database users have been given access to the new database, called 'Rubygems', on the MySQL server.

