megan's blog

May 2010 Data released

May 2010 data is released for some forges.

-Freshmeat (datasource 218)
-Rubyforge (datasource 219)
-ObjectWeb (datasource 220)
-Free Software Foundation (datasource 221)
-Google Code (datasource 222) - list of projects only

Our collectors for Savannah, Sourceforge, Github, Tigris, and Launchpad are all undergoing maintenance at the moment.

UPDATE May 28, 2010
-Savannah data has been released (datasource 224)

Link to download the FLOSSmole data on Google Code.

December Sourceforge Data released

After a long delay, the December Sourceforge data has been released. You may recall that over summer 2009, SF redesigned their web site, which broke many of our crawlers and all of our parsers.

We have re-written these and, with only a few exceptions, we have pretty much the same data as we always had.

Here are some release notes:

1. The datasource_id is 206.
2. Donors data is not available in the Dec 2009 release. Donors were moved to their own page, so we have to add this to the collection for next time.
3. Statistics data is not available in the Dec 2009 release. We accidentally collected the wrong stats pages, so we had to throw these out and re-write for next time.
4. Status data (alpha, beta, mature, etc) is not available in the Dec 2009 release. This information is still being collected and kept by SF, but we can't find where it's being reported on their web site. If you have any ideas, send them to the mailing list (ossmole-discuss@lists.sourceforge.net).

Files are located at our Google Code page: http://code.google.com/p/flossmole/downloads/list

For those of you with database access on the sdsc server, I'll get these files over there ASAP.

November 2009 data released

This month we have data from Freshmeat, Rubyforge, Objectweb, Savannah, Github, and the Free Software Foundation.

Downloads available at Google Code

Remember, the SQL is available in the datamart*.sql.bz files; the flat (delimited) data is available in the other files.
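
If you want to pull one of the flat files into a script, something like the following might be a starting point. It is only a rough sketch: the tab delimiter, the header row, and the idea that a .bz or .bz2 suffix means bzip2 compression are all assumptions, and read_delimited is a made-up helper name, so check the actual download before relying on it.

```python
import bz2
import csv
from pathlib import Path


def read_delimited(path, delimiter="\t"):
    """Return the rows of a delimited text file as a list of dicts.

    Assumptions (not confirmed by the post): the first row is a header,
    fields are tab-separated, and a .bz/.bz2 suffix means bzip2 compression.
    """
    path = Path(path)
    # Pick a bzip2-aware opener for compressed files, plain open() otherwise.
    opener = bz2.open if path.suffix in {".bz", ".bz2"} else open
    with opener(path, mode="rt", encoding="utf-8", newline="") as handle:
        # DictReader uses the header row as the keys for every later row.
        return list(csv.DictReader(handle, delimiter=delimiter))
```

Call it with the path to whichever flat file you downloaded, and pass a different delimiter if the file turns out not to be tab-separated.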

We're still working on getting our Sourceforge scraper back up and running, and we thank you for your patience.

October 2009 data released

October 2009 data has been released. Here are the forges we have this month:
Freshmeat
Rubyforge
ObjectWeb
Free Software Foundation directory
Savannah (new)
GitHub (new)

FLOSSmole Downloads

Sourceforge is still undergoing a re-write, but we will be collecting from there again soon. In the meantime, don't forget that the June 2009 data is available, and there is also the Notre Dame data, if that helps at all.

Enjoy!

September 2009 data released

Data has been released for FSF, FM, RF, OW. Go get it!! Have fun.

Google Code Downloads Page

That Freshmeat data looks fairly popular. Anyone want to tell us how you use this data?

Savannah data available

Savannah data has been released for July. See what you think! (Datasource_id = 182)

July 2009 data

Hello moles, our July 2009 data has been released: this month we have Objectweb, Freshmeat, Rubyforge, and the Free Software Foundation directory.

Go to our Google Code pages to download the data.

The most recent datasource_ids are:
178-fm-July2009
179-rf-July2009
180-ow-July2009
181-fsf-July2009

June data sets released

Hi moles, the June 2009 data sets are released.

172-sf
173-fm
174-rf
175-ow
176-fsf

Datamarts (sql files) and flat (delimited) files are located on our Google Code downloads area.

oss2009 requests, etc

Just back from OSS 2009 in Skövde, Sweden. (Finally figured out how to make the ö character on a Mac: hit option-u, then o.) Here are the requests I heard while sitting in talks, either for new forges, for features that FLOSSmole could provide, or just things that people were using or needing that might intersect with our mission here:

debian popularity contest
UDD
KDE's "10 years of data in an xml logfile"
sugarForge
ascencia?
eclipse
"git" everything
developer skills from sourceforge
rdfohloh
launchpad
gforge
fusionforge
dataportability.org
a wiki for common analyses, charts, graphs, SQL commands
a taxonomy of forges

May data, and April and May datamarts released

Go grab the May data and the April and May datamarts from our Google Code web site.

I'm backing up to Teragrid now, so Teragrid users, you should have a nice new set of data to play with RSN (real soon now).