SF project descriptions

We got a request for SourceForge project descriptions. These are the short paragraphs that project owners write to describe their projects. I've parsed out the descriptions and put them in this file release. If you're using the query tool, I also created a new table called project_description to hold this information.

freshmeat dec and jan

December and January Freshmeat files have been added as datasource_ids 14 (Dec) and 15 (Jan). Use the Query Tool to explore the fm_* tables (these are the tables that hold the Freshmeat data).

december 2005 data

We've run the December 2005 SourceForge data. If you're using the query tool, the raw HTML has been stored as datasource_id 13; otherwise, the text files are on our SourceForge project page.

We've got the usual stuff: all the SourceForge project names, all project data, developer counts, who is working on what projects, what programming languages are being used, operating system counts, all that good stuff. Have fun!

Current status: found something new to add... Donors. This could be interesting for an SNA (social network analysis). I'll get a script written to parse donors and make a new table. When I'm done, I'll post here.

query tool

Version .01 of our query tool is up and running. Thanks, Dawid!

October Data, updated

The SF and Freshmeat (surprise!) data collections for October are DONE. We had a 10-machine grid working to collect this time. Very speedy! From here on out, we plan to run collections on a 60-day rotation rather than a 90-day one. This will also match up nicely with the 60-day SourceForge stats interval.

Also, we have a working prototype of our live query tool (thanks, Dawid!). We're just waiting for the production environment to be set up, and then it will be available for you all to use.

Here is the master file list on our SF project page: Master List of FLOSSmole Files, but we also have quicker links to:

Happy Birthday, FLOSSmole

Happy Birthday, FLOSSmole (née OSSmole). It was 1 year ago today that we started the FLOSSmole project on SourceForge. What a joyous occasion.

Like any one-year-old, FLOSSmole is growing rapidly, learning new things, meeting new people, and generally being cute. (OK, maybe spreadsheets aren't all that cute after all.)

But like any proud mama, I am pleased with the progress of FLOSSmole, and I'd like to take all the credit, but really, we all know that with open source projects, "it takes a village". (And basically my role is just to avoid being the Village Idiot.) Thank you to a wonderful team.

July Raw Developer Data

The July raw developer data has finally been released. I had actually forgotten about it (oops). We had a problem in the way our spider collected the datasource_id=5 developers, so I had put off the release until I fixed that problem, and then I promptly forgot about the whole thing.

In any event, the files are posted, so enjoy!

These files contain (a) complete lists of developers and (b) complete lists of which developer is working on which project, and whether they are an admin on that project. If you need historical data (to judge developer movement between projects), refer to earlier file releases (e.g. April, January, November 2004, etc).
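For anyone who wants to judge developer movement between two releases, here is a minimal Python sketch. It assumes each release has already been loaded as a list of (developer, project, is_admin) rows; that row layout, the function names, and the sample data are all made up for illustration and are not the actual file format.

```python
def memberships(rows):
    """Return the set of (developer, project) pairs found in one release."""
    return {(dev, proj) for dev, proj, _is_admin in rows}


def movement(old_rows, new_rows):
    """Compare two releases and report which memberships appeared or vanished."""
    old, new = memberships(old_rows), memberships(new_rows)
    return {
        "joined": sorted(new - old),  # pairs present only in the newer release
        "left": sorted(old - new),    # pairs present only in the older release
    }


# Hypothetical sample rows standing in for two file releases.
april = [("alice", "projA", True), ("bob", "projA", False)]
july = [("alice", "projA", True), ("bob", "projB", False)]

changes = movement(april, july)
print(changes)
```

Set differences keep the comparison simple: anything in the newer release but not the older one counts as "joined", and the reverse counts as "left".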

And the developer problem was fixed in the spider code, so everything should be back to normal for the October release! Thanks for your patience.

some changes in the works

Coming up:

1. we'll be doing our next SF "run" in October

2. we're making a web interface for "live" read-only queries. This will satisfy our occasional middle-of-the-night wild hypotheses... "gee, I wonder if developers who program in Python also program in Java or Perl... and where are all these Ruby hackers coming from anyway...?" You've got questions, we've got answers.

3. if you've used our data in a project, let us know! It feels good to know we've helped you out. Even better, consider letting us host your data or your results. You'll be world-famous, and get the satisfaction of "giving back" to the open source community.
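As a rough illustration of the kind of question in item 2, here is a minimal Python sketch using an in-memory SQLite database. The table and column names (developer_projects, project_languages, dev, project, language) and the sample rows are invented for the example; they are not the actual FLOSSmole schema, and the real query tool may work quite differently.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE developer_projects (dev TEXT, project TEXT);
CREATE TABLE project_languages (project TEXT, language TEXT);
INSERT INTO developer_projects VALUES
    ('alice', 'p1'), ('alice', 'p2'), ('bob', 'p1'), ('carol', 'p3');
INSERT INTO project_languages VALUES
    ('p1', 'Python'), ('p2', 'Java'), ('p3', 'Perl');
""")


def devs_using(language):
    """Return the set of developers on at least one project in `language`."""
    rows = conn.execute(
        "SELECT DISTINCT dp.dev "
        "FROM developer_projects dp "
        "JOIN project_languages pl ON pl.project = dp.project "
        "WHERE pl.language = ?",
        (language,),
    )
    return {row[0] for row in rows}


python_devs = devs_using("Python")
# Python developers who also show up on a Java or Perl project.
also_other = python_devs & (devs_using("Java") | devs_using("Perl"))
print(sorted(python_devs), sorted(also_other))
```

The query is read-only, which matches the spirit of the planned interface: it only selects from the tables and never modifies them.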

July Summary Project Data

Since I've been at OSCON this week, I've been delayed in posting my summary analyses. Nevertheless, for those who are waiting, I've posted some basic summaries in the directory below.

Here is a directory of Excel files showing October 2004, January 2005, April 2005, and July 2005 Summary Data for projects on SourceForge.

Remember, raw data is also available!

July Raw Project Data

Good news, moles! Some of the July project files (SF only) have been posted to the SF project page.

(1) July Raw Project Info here: SourceForge.net: Project Info Filelist.

There are 2 files in here:

--ProjectList gives the project names only
--ProjectInfo gives the project unixnames, dates registered, number of developers, and the date the information was collected by FLOSSmole.

(2) July Raw Project Data here: SourceForge.net: Project Data Filelist

There are 8 files in here:
--Database Environments
--Intended Audience
--License Types
--Operating Systems
--Programming Languages
--Project Status
--Project Topic
--User Interface

Coming up soon will be Summary Project Data with beautiful charts (hopefully!), Raw Developer Data, and Summary Developer Data.