jhowison's blog

Direct DB access for FLOSSmole collection available

UPDATE: The new procedure for requesting a username/password for direct DB access (as of January 2013) is as follows:

To demonstrate usage to granting agencies and to monitor runaway queries (it happens!), interested users need to contact the FLOSSmole project via the ossmole-discuss mailing list to request a personal username and password, which should not be shared.

Other than that simple request, we're not introducing any new AUPs or conditions.

To get started, join the ossmole-discuss list and send a message requesting database access, including your preferred username. Turnaround should be no longer than a business day or two.

Once we receive your request, we will generate your account and provide access to all the data we have in our MySQL database! Easy.

Social Network analysis over time using FLOSSmole data

Just sent off the camera-ready version of a paper built using data available in the tracker tables of the FLOSSmole database.

Howison, J., Inoue, K., and Crowston, K. (2006). Social dynamics of free and open source team communications. In Proceedings of the IFIP 2nd International Conference on Open Source Software, Lake Como, Italy. Available from: http://floss.syr.edu/publications/howison_dynamic_sna_intoss_ifip_short.pdf

This paper furthers inquiry into the social structure of free and open source software (FLOSS) teams by undertaking social network analysis across time. Contrary to expectations, we confirmed earlier findings of a wide distribution of centralizations even when examining the networks over time. The paper also provides empirical evidence that while change at the center of FLOSS projects is relatively uncommon, participation across the project communities is highly skewed, with many participants appearing for only one period. Surprisingly, large project teams are not more likely to undergo change at their centers.
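For readers who have not met "centralization" before, here is a minimal sketch of one common version, Freeman's degree centralization, computed from a list of who-interacted-with-whom pairs. The abstract does not say which centralization measure the paper uses, so treat this as an illustration of the general idea only; the function name and the toy edge lists are made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def degree_centralization(edges: Iterable[Tuple[str, str]]) -> float:
    """Freeman degree centralization of an undirected network.

    Returns 1.0 for a perfect star (one hub talks to everyone, nobody
    else talks to each other) and 0.0 when everyone has the same degree.
    Assumption: this is an illustrative measure, not necessarily the one
    used in the paper.
    """
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)

    n = len(neighbors)
    if n < 3:
        raise ValueError("need at least three connected participants")

    degrees = [len(others) for others in neighbors.values()]
    max_degree = max(degrees)
    # Sum of gaps to the most central node, divided by the largest
    # possible sum, which occurs in a star network: (n - 1) * (n - 2).
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical toy networks for one time period each.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
print(degree_centralization(star), degree_centralization(triangle))
```

Running the same function on one edge list per time period gives a centralization series for a project, which is the kind of over-time view the paper is about.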

Sourceforge Bug Tracker data and analysis scripts

Just wanted to put in a pointer to the data and scripts that we used for our recent First Monday paper, The social structure of Free and Open Source software development. This data is part of OSSmole, and Megan and I are currently working on merging our databases. But it is available now on the Syracuse FLOSS research site if people want to jump in.

Graphs of developer counts over time

As an example of the data and analysis in the system, here is a graphic of developer counts over time, taken from the Project Summaries pages, developed by James Howison and Kevin Crowston using the OSSmole data. The time series are sorted, programmatically, into six categories: consistently rising, mostly rising, not trending, mostly falling, consistently falling, and dead projects.
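To give a feel for what sorting a series into those categories can look like, here is a rough sketch. The rules below (counting period-to-period rises and falls, and calling a project dead when its last observed count is zero) are assumptions made for illustration; they are not the rules used to build the Project Summaries pages, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def classify_trend(counts: Sequence[Optional[int]]) -> str:
    """Put a developer-count time series into one of six trend categories.

    None stands for a period with no data (NA). The thresholds are
    illustrative assumptions, not the project's actual rules.
    """
    observed = [c for c in counts if c is not None]  # drop NA periods
    if not observed or observed[-1] == 0:
        return "dead"

    # Change between each pair of consecutive observations.
    steps = [later - earlier for earlier, later in zip(observed, observed[1:])]
    rises = sum(1 for s in steps if s > 0)
    falls = sum(1 for s in steps if s < 0)

    if rises and not falls:
        return "consistently rising"
    if falls and not rises:
        return "consistently falling"
    if rises > falls:
        return "mostly rising"
    if falls > rises:
        return "mostly falling"
    return "not trending"


print(classify_trend([None, None, 1, 1, 1]))  # a series with no change at all
print(classify_trend([1, 3, 2, 4, 5]))
```

A flat series such as NA,NA,1,1,1 has no rises and no falls, so under these assumed rules it lands in the not trending category.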



This picture only shows curves and categories for a sample of 120 projects. It can be compared against the categorization of the total population of Sourceforge projects, as shown in the histogram.



The sample of 120 projects has substantially more consistently rising projects, so it seems clear that the sample is generally more successful. The large unchanging (or not trending) category in the total population reflects the fact that the mode is NA,NA,1,1,1 and our finding that 65,561 of the total 98,568 projects (i.e., 67%) seen over 5 years have never had more than 1 developer.
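The arithmetic behind that share is easy to check, and the "never more than 1 developer" test is simple to express. A small sketch follows; the two totals come from the post, while the helper name and the sample series are made up for the example.

```python
from typing import Optional, Sequence

TOTAL_PROJECTS = 98_568            # figure from the post
SINGLE_DEVELOPER_PROJECTS = 65_561  # figure from the post


def never_more_than_one(counts: Sequence[Optional[int]]) -> bool:
    """True if every observed developer count is 0 or 1.

    None stands for a period with no data (NA). Hypothetical helper,
    not part of the OSSmole scripts.
    """
    observed = [c for c in counts if c is not None]
    return bool(observed) and max(observed) <= 1


share = SINGLE_DEVELOPER_PROJECTS / TOTAL_PROJECTS
print(f"{share:.1%}")  # 66.5%, which rounds to the 67% quoted

print(never_more_than_one([None, None, 1, 1, 1]))  # the modal series
print(never_more_than_one([1, 2, 1, 1, 1]))
```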

The latest Database schema

Megan and I have done some work on commenting the proposed database schema, explaining what each field is and why it is there. The schema is in CVS and is available via the web interface. It is easier to read in an editor capable of syntax coloring for MySQL.

We'd very much appreciate feedback on the generality, or lack thereof, and coverage of people's interest areas. The best place for that is the ossmole-discuss mailing list.