Submitted by gbatchelor on July 1, 2016 - 10:31am
Django IRC D3 CONTRIBUTIONS GRAPH
This graph represents the number of posts in the Django IRC logs. The lighter green squares represent days with fewer posts than the darker green squares. Months go from left to right and are separated by the darker lines. Days go by columns from left to right.
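For readers who want to build a similar view from the logs, here is a minimal sketch in Python of the bookkeeping behind a contributions-style graph: counting posts per day, mapping each count to a shade, and placing each day in a grid. It is not the D3 code behind the graph above. The function names, the four shade levels, and the GitHub-style layout (one column per week, one row per weekday, Monday first) are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date


def posts_per_day(post_dates):
    """Count posts per calendar day from an iterable of date objects."""
    return Counter(post_dates)


def shade_level(count, max_count, levels=4):
    """Map a daily count to a shade: 0 = no posts, `levels` = darkest green.

    The number of levels is an assumption, not taken from the original graph.
    """
    if count <= 0 or max_count <= 0:
        return 0
    # Ceiling division, so any non-zero count gets at least level 1.
    return min(levels, -(-count * levels // max_count))


def grid_position(day, start):
    """Return (column, row) for `day` in a grid that begins at `start`.

    Assumed layout: each column is one week, each row is a weekday (Monday = 0).
    """
    days_since_start = (day - start).days
    column = (days_since_start + start.weekday()) // 7
    return column, day.weekday()


if __name__ == "__main__":
    # Made-up sample dates, only to show the calls.
    sample = [date(2016, 6, 1), date(2016, 6, 1), date(2016, 6, 6)]
    counts = posts_per_day(sample)
    busiest = max(counts.values())
    for day in sorted(counts):
        print(day, grid_position(day, date(2016, 6, 1)), shade_level(counts[day], busiest))
```

The shade is scaled against the busiest day in the data, so one unusually busy day will make every other square look lighter. Capping the maximum or using quantiles would be a reasonable alternative.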
Submitted by gbatchelor on June 29, 2016 - 11:41am
Ubuntu IRC D3 CONTRIBUTIONS GRAPH
This graph represents the number of posts in the Ubuntu IRC logs. The lighter green squares represent days with fewer posts than the darker green squares. Months go from left to right and are separated by the darker lines. Days go by columns from left to right.
Submitted by gbatchelor on June 23, 2016 - 9:55am
BITCOIN IRC D3 CONTRIBUTIONS GRAPH
This graph represents the number of posts in the Bitcoin IRC logs. The lighter green squares represent days with fewer posts than the darker green squares. Months go from left to right and are separated by the darker lines. Days go by columns from left to right.
Submitted by megan on June 2, 2016 - 1:17pm
Thanks to the work of my two summer research assistants Evan Ashwell & Greg Batchelor, the IRC channels for #bitcoin-dev, perl6, #ubuntu, #django, and puppet (#gen, #dev, and #razor) have been updated.
Things to know:
Submitted by megan on May 29, 2015 - 10:04pm
Back in the 2000s, the GNU Enterprise (GNUe) project chat logs (and human-created chat log summaries!) were used by several papers in the area of text summarization, especially dialogue summarization.
The GNUe chat logs were used because they were accompanied by summaries that a human compiled periodically by hand. The summarized chat logs can thus be considered a kind of "gold standard" for the summary a machine summarizer should produce.
Submitted by megan on April 8, 2015 - 1:06pm
Hi moles! New IRC chat logs are now cleaned and stored in the irc database on the FLOSSmole MySQL server, thanks to Andrea Black, one of our intrepid FLOSSmole research assistants. This data is part of an overall IRC collection started by another student, Becca Gazda, last summer.
We now have the following IRC chat histories available:
Apache
--activemq
--aries
--camel
--cxf
--kalumet
--karaf
--servicemix
Submitted by megan on December 31, 2014 - 1:48pm
Submitted by megan on August 1, 2014 - 12:17pm
In my continuing quest to be organized, I've created a new schema to hold just the IRC log data. On the database server (access instructions here), there is a new schema called 'irc' and it includes (for now) Ubuntu logs, Django logs, 7 Apache projects, and the topic lines from Freenode for all channels with 3+ users.
Coming soon: email updates, including Linux Kernel Mailing List (LKML) and more IRC (WordPress, etc.).
Enjoy!
Submitted by megan on January 22, 2014 - 10:44am
Hello moles! Happy January. Here are some fresh new data sources for your mining pleasure:
1. Freenode channel list and topics (all public channels with 3 or more users). The table is called "fn_irc_channels".
2. Apache ActiveMQ IRC logs (one datasource_id per day, one row per message).
3. Apache Aries IRC logs
4. Apache Camel IRC logs
5. Apache CXF IRC logs
6. Apache Karaf IRC logs
7. Apache Kalumet IRC logs
8. Apache ServiceMix IRC logs
Here is a sample of what the structure looks like for 2-8: