jhowison's blog
Hello moles,
I'm excited to give you all a heads up that the entire FLOSSmole database is now available directly via a MySQL server. We have transferred the database to the NSF TeraGrid Data Central hosting site [1] (based at the San Diego Supercomputer Center). It's a bigger machine and professionally administered, which is much better than we could offer ourselves. See below for the access procedure.

The process of transferring the database also enabled us to prepare comprehensive datamarts for each data source in the database. These are mysqldump files that can be used for local access to parts of the database. There are two for each data source: one containing the raw HTML pages and one, substantially smaller, containing just the parsed data points. These will be available shortly for those who want to install a local copy of the DB. We'd be very interested to hear the reasons people find to do that, though, since we'd like to have people sharing useful transformations of the data, and the Data Central database should be pretty quick.

So now we have three great options for accessing the FLOSSmole data:
1. The traditional monthly flat files
2. Direct MySQL access to the full database at Data Central
3. Comprehensive datamarts for local access

Database access further info
|||
Just sent off the camera-ready version of a paper built using data available in the tracker tables of the FLOSSmole database.
Howison, J., Inoue, K., and Crowston, K. (2006). Social dynamics of free and open source team communications. In Proceedings of the IFIP 2nd International Conference on Open Source Software, Lake Como, Italy. Available from: http://floss.syr.edu/publications/howison_dynamic_sna_intoss_ifip_short.pdf

This paper furthers inquiry into the social structure of free and open source software (FLOSS) teams by undertaking social network analysis across time. Contrary to expectations, we confirmed earlier findings of a wide distribution of centralizations even when examining the networks over time. The paper also provides empirical evidence that while change at the center of FLOSS projects is relatively uncommon, participation across the project communities is highly skewed, with many participants appearing for only one period. Surprisingly, large project teams are not more likely to undergo change at their centers.
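For readers unfamiliar with the term, a network's centralization summarizes how strongly its ties concentrate on a single node. The Python sketch below computes Freeman's degree centralization for a small undirected network. It is only an illustration: the paper may well use a different variant (for example directed or weighted), and the function name and the toy `star` and `clique` networks are made up for this example.

```python
from typing import Dict, Set


def degree_centralization(adjacency: Dict[str, Set[str]]) -> float:
    """Freeman degree centralization of an undirected network.

    `adjacency` maps each node to the set of its neighbours.
    The result is 1.0 for a star and 0.0 when every node has the same degree.
    """
    n = len(adjacency)
    if n < 3:
        raise ValueError("centralization needs at least three nodes")
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    max_degree = max(degrees)
    # Sum of gaps to the most central node, divided by the largest
    # possible sum, which is reached by a star network.
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical toy networks: one hub with three spokes, and four
# participants who all talk to each other.
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
clique = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
}

print(degree_centralization(star))
print(degree_centralization(clique))
```

A "wide distribution of centralizations" then means that project networks are spread across the range between these two extremes rather than clustering near one of them.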
|||
Just wanted to put in a pointer to the data and scripts that we used for our recent First Monday paper, The social structure of Free and Open Source software development. This data is part of OSSmole, and Megan and I are currently working on merging our databases. But it is available now on the Syracuse FLOSS research site if people want to jump in.
|||
As an example of the data and analysis in the system, here is a graphic of developer counts over time, taken from the Project Summaries pages, developed by James Howison and Kevin Crowston using the OSSmole data. The time series are sorted, programmatically, into six categories: consistently rising, mostly rising, not trending, mostly falling, consistently falling, and dead projects.
![]()

This picture only shows curves and categories for a sample of 120 projects. This can be compared against the categorization of the total population of SourceForge projects, shown in the histogram.

![]()

The sample of 120 projects has substantially more consistently rising projects, so it seems clear that the sample is generally more successful. The large unchanging (or not trending) category in the total population reflects the fact that the mode is NA,NA,1,1,1 and our finding that 65,561 of the total 98,568 projects (i.e. 67%) seen over 5 years have never had more than 1 developer.
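For anyone who wants to try a similar categorization on their own data, here is a minimal Python sketch. The rules in it are assumptions made for illustration and are not the rules used for the graphics above; the function name `categorize` and the use of `None` for NA are also just choices for this example.

```python
from typing import Optional, Sequence


def categorize(counts: Sequence[Optional[int]]) -> str:
    """Assign a developer-count time series to one of six trend categories.

    Illustrative rules only (assumed, not the original analysis).
    None stands for NA, i.e. the project was not observed in that period.
    """
    observed = [c for c in counts if c is not None]
    # Assumption: a project with no data, or no developers at the end, is dead.
    if not observed or observed[-1] == 0:
        return "dead"
    steps = [b - a for a, b in zip(observed, observed[1:])]
    ups = sum(1 for s in steps if s > 0)
    downs = sum(1 for s in steps if s < 0)
    if ups == 0 and downs == 0:
        return "not trending"
    if downs == 0:
        # Never falls and rises at least once.
        return "consistently rising"
    if ups == 0:
        # Never rises and falls at least once.
        return "consistently falling"
    if ups > downs:
        return "mostly rising"
    if downs > ups:
        return "mostly falling"
    return "not trending"


# The modal series mentioned above: two NA periods, then one developer.
print(categorize([None, None, 1, 1, 1]))

# Share of projects that never had more than one developer.
single_developer_share = 100 * 65561 / 98568
print(round(single_developer_share))
```

Under these assumed rules the modal series NA,NA,1,1,1 lands in the not trending category, which matches the description of the histogram.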
|||
Megan and I have done some work on commenting the proposed database schema, explaining what each field is and why it is there. The schema is in CVS and is available via the web interface. It is easier to read in an editor capable of MySQL syntax coloring.
We'd very much appreciate feedback on the generality, or lack thereof, and coverage of people's interest areas. The best place for this is the ossmole-discuss mailing list.
|||


