Apache Camel data
We have released several files of Apache Camel IRC log data.
Sample Queries for the IRC data:
List the most prolific IRC posters, in order of their post count
SELECT about_user, count( * )
FROM apache_camel_irc
GROUP BY 1
ORDER BY 2 DESC
List the Twitter handles and svn_ids (if known) for anyone who also appears in the Apache Camel IRC logs
SELECT DISTINCT i.about_user, t.twitter_screen_name, t.svn_id
FROM apache_camel_irc i
INNER JOIN apache_twitter t
ON i.about_user = t.svn_id
The datasource IDs for the IRC data are currently 393-1572. Each daily log file gets its own datasource_id, since each file is treated as a separate source.
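Because each daily log file has its own datasource_id, grouping by that ID should give a per-day post count. The query below is only a sketch: it assumes that apache_camel_irc carries a datasource_id column (this page does not confirm that), and post_count is just an illustrative alias.

```sql
-- Assumption: apache_camel_irc has a datasource_id column.
-- One row per daily log file, with the number of posts in that file.
SELECT datasource_id, count( * ) AS post_count
FROM apache_camel_irc
WHERE datasource_id BETWEEN 393 AND 1572
GROUP BY datasource_id
ORDER BY datasource_id
```

If more log files are released later, the upper bound of the range would need to be adjusted accordingly.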