SourceKibitzer Collections

SourceKibitzer, now defunct, was an initiative to collect metrics about the performance of various open source software products. (Wikipedia has an article about SourceKibitzer.)

SourceKibitzer sent FLOSSmole its data on a regular basis from February 2007 through September 2007. We dutifully stored this data, and it is available to researchers who are interested in the SK metrics from this time period. The datasource IDs are as follows:

  • 51: 2007-Feb SourceKibitzer
  • 56: 2007-Mar SourceKibitzer
  • 62: 2007-Apr SK
  • 67: 2007-May SK
  • 73: 2007-Jun SK
  • 79: 2007-Jul SK
  • 85: 2007-Aug SK
  • 91: 2007-Sep SK
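
For scripted work, the list above can be kept as a small lookup table. This is only a convenience sketch: the names SK_DATASOURCE_IDS and sk_datasource_id are made up for illustration and are not part of any FLOSSmole tooling.

```python
from typing import Optional

# Datasource IDs copied from the list above, keyed by (year, month).
# The dictionary and function names are hypothetical.
SK_DATASOURCE_IDS = {
    (2007, 2): 51,
    (2007, 3): 56,
    (2007, 4): 62,
    (2007, 5): 67,
    (2007, 6): 73,
    (2007, 7): 79,
    (2007, 8): 85,
    (2007, 9): 91,
}


def sk_datasource_id(year: int, month: int) -> Optional[int]:
    """Return the datasource ID for a month, or None if SK sent no data."""
    return SK_DATASOURCE_IDS.get((year, month))
```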

Data explanation
Here are the metrics provided for 500-odd projects by SourceKibitzer:

  • project name
  • density of comments (DC: Density of comments. Ratio of the number of comment lines to the number of all lines in all source files of the package. Indicates how much of the code is commented.)
  • todo count (TODO_COUNT: Number of TODO comments. Sums up the number of TODO comment lines in all source files of the package. The following patterns are recognized as TODO comments: FIX-ME, FIXME, FIX-IT, FIXIT, TO-DO, TODO, XXX, TBD.)
  • commented lines of code (CLOC: Number of lines that contain comments.)
  • total lines of code (LOC: Total number of lines in package/source file including blank lines, executable lines, and comments.)
  • non-comment lines of code (NCLOC: Number of lines containing source code. In other words, the number of lines that are not comments.)
  • non-commenting source statements (NCSS: Non Commenting Source Statements. Counts the number of source statements, excluding blank lines and comments. The value is not affected by the code style.)
  • number of methods (NOM: Number of methods in the package/source file.)
  • sum of data abstraction coupling (ABSTR_COUPL: Data Abstraction Coupling. Sums up the Data Abstraction Coupling values of all source files in the package. For a source file, measures the number of instantiations of classes within that source file. A higher value implies a more complex data structure.)
  • boolean expression complexity (BOOL_EXP: Boolean Expression Complexity. Sums up the Boolean Expression Complexity of all expressions in the package/source file. The value for an expression is the number of logical operators that the expression contains.)
  • fanout (FANOUT: Fan Out Complexity. Sums up the Fan Out Complexity values of all source files in the package. The value for a source file is the number of classes that the source file relies on.)
  • npath complexity (NPATH: NPath Complexity. Sums up the NPath Complexity values of all methods in the package/source file. The value for a method is the number of possible execution paths through that method, including nested conditional statements and multipart boolean expressions.)
  • weighted method count (WMC: Weighted method count. Sums up the McCabe Cyclomatic Complexity values of all methods in the package/source file.)
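
To make the line-based definitions above more concrete, here is a rough Python sketch of how LOC, CLOC, NCLOC, DC, and TODO_COUNT might be computed for a single source file. It is not SourceKibitzer's implementation, which we do not have. It assumes Java-style comments (// and /* ... */), treats a blank line inside a block comment as a comment line, counts NCLOC as lines that contain any code, matches the TODO patterns case-sensitively, and ignores comment markers inside string literals. The function name comment_metrics and the sample file are made up for illustration.

```python
import re

# TODO markers as listed in the TODO_COUNT definition above (case-sensitive here).
TODO_PATTERN = re.compile(r"FIX-ME|FIXME|FIX-IT|FIXIT|TO-DO|TODO|XXX|TBD")


def comment_metrics(source: str) -> dict:
    """Approximate LOC, CLOC, NCLOC, TODO_COUNT and DC for one source file."""
    loc = cloc = ncloc = todo_count = 0
    in_block = False  # True while inside an unclosed /* ... */ comment
    for line in source.splitlines():
        loc += 1
        text = line.strip()
        has_comment = in_block  # a line inside a block comment is a comment line
        comment_text = ""
        code_text = ""
        i = 0
        while i < len(text):
            if in_block:
                has_comment = True
                end = text.find("*/", i)
                if end == -1:
                    comment_text += text[i:]
                    i = len(text)
                else:
                    comment_text += text[i:end]
                    i = end + 2
                    in_block = False
            elif text.startswith("//", i):
                has_comment = True
                comment_text += text[i + 2:]
                i = len(text)
            elif text.startswith("/*", i):
                has_comment = True
                in_block = True
                i += 2
            else:
                code_text += text[i]
                i += 1
        if has_comment:
            cloc += 1
            # TODO_COUNT counts comment lines, not individual markers.
            if TODO_PATTERN.search(comment_text):
                todo_count += 1
        if code_text.strip():
            ncloc += 1
    dc = cloc / loc if loc else 0.0
    return {"LOC": loc, "CLOC": cloc, "NCLOC": ncloc,
            "TODO_COUNT": todo_count, "DC": dc}


sample = """\
/* Demo class.
   TODO: add logging */
public class Demo {
    // FIXME handle null
    int x = 0; // counter

    int todo = 1;
}
"""

metrics = comment_metrics(sample)
print(metrics)
```

Note that a line holding both code and a trailing comment counts toward both CLOC and NCLOC in this sketch, so the two do not have to add up to LOC. The statement-level and complexity metrics (NCSS, NPATH, WMC, and so on) require actual parsing and are not covered here.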