Apache Camel data

We have released several files of Apache Camel IRC log data.

Sources:
originally stored by Dan Kulp
More about Apache Camel

Related Data Sets
Apache Twitter Handles
Apache Project People & Roles

Sample Queries for the IRC data:

List the most prolific IRC posters, in descending order of post count:
SELECT about_user, COUNT(*) AS post_count
FROM apache_camel_irc
GROUP BY about_user
ORDER BY post_count DESC;

List the Twitter handles and svn_ids (if known) for anyone who also appears in Apache Camel's IRC logs:
SELECT DISTINCT i.about_user, t.twitter_screen_name, t.svn_id
FROM apache_camel_irc i
INNER JOIN apache_twitter t
ON i.about_user = t.svn_id;
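
To try the two sample queries without access to the real database, they can be run against a toy in-memory SQLite database. This is only a minimal sketch: the table and column names are the ones used in the queries on this page, but the column types and all sample rows are invented for illustration, and the real tables may well have more columns.

```python
import sqlite3

# Toy stand-in for the real tables; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apache_camel_irc (about_user TEXT);
    CREATE TABLE apache_twitter (svn_id TEXT, twitter_screen_name TEXT);
    INSERT INTO apache_camel_irc VALUES ('alice'), ('alice'), ('bob'), ('carol');
    INSERT INTO apache_twitter VALUES ('alice', 'alice_tweets'), ('dave', 'dave_tweets');
""")

# Most prolific posters, highest post count first (ties broken by name).
top_posters = conn.execute("""
    SELECT about_user, COUNT(*) AS post_count
    FROM apache_camel_irc
    GROUP BY about_user
    ORDER BY post_count DESC, about_user
""").fetchall()

# IRC users whose nick matches an svn_id in the Twitter table.
known_handles = conn.execute("""
    SELECT DISTINCT i.about_user, t.twitter_screen_name, t.svn_id
    FROM apache_camel_irc i
    INNER JOIN apache_twitter t ON i.about_user = t.svn_id
""").fetchall()

print(top_posters)
print(known_handles)
```

Note that the inner join keeps only IRC users whose nick exactly matches an svn_id; in the toy data, 'bob' and 'carol' drop out of the second result.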

The IRC data currently spans datasource IDs 393 through 1572. Each daily log file is a separate source, so each one gets its own datasource_id.
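
Since each daily log has its own datasource_id, posts can be counted per log file. The sketch below assumes that apache_camel_irc carries a datasource_id column, which this page does not confirm, and it uses invented rows in an in-memory SQLite database. The bounds 393 and 1572 are the ones quoted above.

```python
import sqlite3

# Assumed schema: a datasource_id column on apache_camel_irc (hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apache_camel_irc (datasource_id INTEGER, about_user TEXT);
    INSERT INTO apache_camel_irc VALUES
        (393, 'alice'), (393, 'bob'), (394, 'alice'), (2000, 'carol');
""")

# Count posts per daily log, restricted to the IRC datasource range.
posts_per_log = conn.execute("""
    SELECT datasource_id, COUNT(*) AS post_count
    FROM apache_camel_irc
    WHERE datasource_id BETWEEN ? AND ?
    GROUP BY datasource_id
    ORDER BY datasource_id
""", (393, 1572)).fetchall()

print(posts_per_log)
```

Passing the bounds as parameters makes it easy to widen the range as new daily logs are added.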
