(1) You might notice we changed the name of OSSmole to FLOSSmole. This name change is to reflect the presence of free and libre projects as well. Plus, it will alleviate confusion about how to pronounce the name of the project... now it's just two normal English words: "floss" and "mole". No more spelling or guessing!

(2) This July run is taking forever! I've got all the index.html files and all the developer data, but the scrapes are taking a really long time to UPDATE and INSERT into MySQL. Not sure what's going on there, but I thank you for your patience. Some of them have been running for a week and I'm still only on 'y'.

(3) Finally, is anyone going to be at OSCON next week? I made a t-shirt with the mole logo on it so if you see me, you'll know :)

almost there...

I'm leaving for the beach, but as soon as we get Internet set up over there (this afternoon?), I'll pick back up. Finished 'm', 'n', and 'o' last night, so we're on "p".


OSSmole gives a huge shout-out to swik, an open wiki-like database about open source projects.

Here's the OSSmole page on swik, and hopefully it'll reflect my comments here about the July run happening at this very moment... JULY DATA should be done before I leave for the beach next week, yay. I'm on 'g' right now.

july data - stay tuned

July data coming soon... I'm running the scrape of SF e'en as we speak... Then to add freshmeat!

java spider code released

We've released our spider code (Java) and a nice library (with documentation!) so you can do spiders of Sourceforge yourself. Here is the Java library file (API), and you can go to CVS (project=OSSmoleJava) to see the source. Enjoy! Special thanks to gconklin for this code.

How to use this data

(Note: This message is updated periodically with new info.)

The FLOSSmole project provides data about:

(a) all projects on Sourceforge
(b) all developers on Sourceforge
(c) all projects on Sourceforge AND who is developing for them, their roles, whether they are an administrator, etc.
(d) all Sourceforge projects and their programming languages, operating systems, user interfaces, end user audience, registration dates, etc. (new: donations!)
(e) Edit, Oct-2005: much of the above, but for Freshmeat, also
(f) Edit, Jul-2006: also, Rubyforge
(g) Edit, Jul-2006: also, Objectweb
(h) Edit, Jan-2007: also, Free Software Foundation directory
(i) Edit, Feb-2007: also, SourceKibitzer donates data

We have done runs on Sourceforge starting in early 2004 and we have received donated Sourceforge data for December 2004 from Dawid Weiss in Poland.

We also began scraping Freshmeat, Rubyforge, and Objectweb, and we receive data from SourceKibitzer. Get the complete list of data sources here. (This is a list of each of our scrapes, with its date and its "datasource" ID.) The abbreviations for the forges are RF (Rubyforge), SF (Sourceforge), FM (Freshmeat), OW (Objectweb), FSF (Free Software Foundation Directory), SK (SourceKibitzer).
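If you're working with the datasource list in a script, a small lookup table for these abbreviations can be handy. Here is a minimal Python sketch. The names FORGES and forge_name are made up for illustration and are not part of any FLOSSmole schema or tool; only the abbreviations themselves come from the list above.

```python
# Forge abbreviations exactly as listed in the post above.
# FORGES and forge_name are hypothetical names used only for this sketch.
FORGES = {
    "RF": "Rubyforge",
    "SF": "Sourceforge",
    "FM": "Freshmeat",
    "OW": "Objectweb",
    "FSF": "Free Software Foundation Directory",
    "SK": "SourceKibitzer",
}


def forge_name(abbrev):
    """Return the full forge name for an abbreviation (case-insensitive)."""
    key = abbrev.strip().upper()
    if key not in FORGES:
        raise KeyError(f"unknown forge abbreviation: {abbrev!r}")
    return FORGES[key]
```

Calling forge_name("sf") gives back the full forge name, and an abbreviation that isn't in the list raises a KeyError rather than silently passing through.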

April 2005 Raw Data Released

I've released the raw data files for the April 2005 Sourceforge scrape.
  1. Get the Raw List of Projects (full list of SF projects, registration dates, etc)
  2. Get the Raw Project Data (includes operating systems, programming languages, etc)
  3. Get the Raw Developer Data (includes developer list and developer-projects list, with new administrative flag!)

This is good stuff! Summary reports coming soon.

data donations

Thanks to all who have donated and used FLOSSmole data. Here is a short explanation of who has collected data for us so far:

  • Partially supported by NSF Grants 03-41475 and 04-14468. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

  • Partial data donated by Dawid Weiss, Institute of Computing Science, Poznan University of Technology, from research funded by the European Commission via FP6 Co-ordinated Action Project 004337 in priority IST-2002- (CALIBRE), http://www.calibre.ie/

  • Partial data donated by Megan Conklin, Elon University, Department of Computing Sciences.

  • Partial data donated by Kevin Crowston and James Howison, Syracuse University.

  • Partial data donated by Mark Kofman and Anton Litvinenko of SourceKibitzer.org

  • [Your name here!]

Sourceforge Bug Tracker data and analysis scripts

Just wanted to put in a pointer to the data and scripts that we used for our recent First Monday paper, The social structure of Free and Open Source software development. This data is part of OSSmole, and Megan and I are currently working away at merging our databases. But it is available now on the Syracuse FLOSS research site if people want to jump in.