Submitted by megan on May 12, 2016 - 3:10pm
Submitted by megan on December 17, 2015 - 10:45am
We have added RubyGems.org data under datasource_id 61240. RubyGems.org is the official gem host for Ruby projects.
The scripts we used to collect this data are available on GitHub, and the SQL dumps are available on our data server. Direct database access is also available. Existing database users have been given access to the new database on the MySQL server, which is called 'Rubygems'.
Submitted by megan on September 14, 2015 - 8:26am
The International Conference on Open Source Systems (OSS) is a long-standing international forum for researchers, practitioners from business and industry, enthusiasts, and students to present and discuss the latest trends, experiences, and concerns in the field of Free/Libre Open Source Software.
The 12th OSS Conference will take place in the city of Gothenburg, from 30 May to 02 June 2016.
Call for contributions
Submitted by megan on August 3, 2015 - 10:48am
We have added Launchpad data under datasource_id 58458. Launchpad is a repository for projects affiliated with Ubuntu. Summer research assistant Gavan Roth wrote some scripts to collect this data.
--Download the flat files, or
--Access and query this data via the MySQL interface
Here is a query to show some of the data that is available:
Submitted by megan on July 22, 2015 - 12:54pm
I've uploaded metadata (and when available, actual papers) for some relevant 2015 open source conferences (including the 2015 OSS conference, HICSS, and Mining Software Repositories) to FLOSShub/biblio.
There are now 1589 papers on FLOSShub/biblio. It makes a nice addition/backup/source for Google Scholar and the other larger publishing sites.
Submitted by megan on June 1, 2015 - 11:21am
Today my new research assistant Gavan & I are performing some maintenance tasks on the database, including a reorganization of the places where the data tables live. Hopefully this will mean that the data is much better organized.
Here is the GitHub summary of what we are doing, with a brief summary below.
We will leave old copies of the most popular tables for a few days, in order to give everyone time to rework scripts, etc.
Submitted by megan on May 29, 2015 - 10:04pm
Back in the 2000s, the GNU Enterprise (GNUe) project chat logs (and human-created chat log summaries!) were used in several papers in the area of text summarization, especially dialogue summarization.
The reason the GNUe chat logs and summaries were used is that the logs were accompanied by summaries that were compiled periodically (manually) by a human. The summarized chat logs can thus be considered a kind of "gold standard" for what kind of summary a machine summarizer should produce.
Submitted by megan on May 22, 2015 - 7:23am
A pastebin is a web site where developers can paste in some code, get back a URL, and then share that URL with others. Pastebins are handy in IRC chat or in email, where a large block of source code would look ugly or lose its formatting. However, pastebin URLs disappear over time, and this presents a problem for those of us who collect old data or want to study software evolution.
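As a rough illustration of the collection problem, here is a minimal Python sketch that scans chat log lines for pastebin links so they could be archived before they expire. The host list, the function name `find_pastebin_urls`, and the sample log lines are all assumptions made for this example; they are not taken from our collection scripts.

```python
import re
from urllib.parse import urlsplit

# Hypothetical host list: a few well-known pastebin services used as examples.
# A real collection effort would need a longer, curated list.
PASTEBIN_HOSTS = {"pastebin.com", "paste.ubuntu.com", "dpaste.com"}

# Loose URL matcher: anything starting with http(s):// up to whitespace,
# angle brackets, or quotes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def find_pastebin_urls(log_lines):
    """Return (line_number, url) pairs for URLs hosted on a known pastebin."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        for match in URL_PATTERN.finditer(line):
            # Drop punctuation that often trails a URL in chat text.
            url = match.group(0).rstrip(".,;:)")
            try:
                host = urlsplit(url).hostname
            except ValueError:
                continue  # malformed URL, skip it
            if host and host.lower() in PASTEBIN_HOSTS:
                hits.append((number, url))
    return hits


if __name__ == "__main__":
    sample_log = [
        "<alice> see http://pastebin.com/abc123 for the patch",
        "<bob> thanks",
        "<carol> docs at https://example.org/readme, log at https://paste.ubuntu.com/p/xyz/.",
    ]
    for number, url in find_pastebin_urls(sample_log):
        print(number, url)
```

The sketch only locates the links; fetching and storing the pasted code while the URL is still alive would be a separate step.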
Submitted by megan on December 31, 2014 - 1:48pm
Submitted by megan on November 13, 2014 - 9:21am
I appeared (?) on David Levine's radio show Hearsay Culture out of KZSU-FM (Stanford U.) today. 10am PST for the stream, or listen later on a podcast at HearsayCulture.com.