Code, Process, and Code Quality Metrics in ASF projects
# Description
Recent work on open source sustainability shows that successful trajectories of projects in the Apache Software Foundation Incubator (ASFI) can be predicted early on, using a set of socio-technical measures. Because OSS projects are socio-technical systems centered around code artifacts, we hypothesize that sustainable projects may exhibit different code and process patterns than unsustainable ones, and that those patterns can grow more apparent as projects evolve over time. Here we studied the code and coding processes of over 200 ASFI projects, and found that ASFI graduated projects have different patterns of code quality and complexity than retired ones. The same holds for coding processes: for example, feature commits and bug-fixing commits are correlated with project graduation success. We find that minor contributors and major contributors (who contribute <5% and >=95% of commits, respectively) are associated with graduation outcomes, implying that developers who contribute fewer commits are also important for a project's success. This study provides evidence that OSS projects, especially nascent ones, can benefit from introspection and instrumentation using multidimensional modeling of the whole system, including code, processes, and code quality measures, and how they are interconnected over time.
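
The minor/major contributor split described above can be illustrated with a small sketch. It is not the study's actual tooling: the function name `classify_contributors`, the `"other"` label, and the input format (one author name per commit) are assumptions made for illustration. Only the <5% and >=95% thresholds come from the description.

```python
from collections import Counter

# Thresholds taken from the description; everything else here is illustrative.
MINOR_SHARE = 0.05  # authors with less than 5% of commits are "minor"
MAJOR_SHARE = 0.95  # authors with at least 95% of commits are "major"


def classify_contributors(commit_authors):
    """Label each author by their share of a project's commits.

    `commit_authors` is assumed to be an iterable with one author name
    per commit (a hypothetical input format).
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    labels = {}
    for author, n_commits in counts.items():
        share = n_commits / total
        if share < MINOR_SHARE:
            labels[author] = "minor"
        elif share >= MAJOR_SHARE:
            labels[author] = "major"
        else:
            labels[author] = "other"  # assumed label for everyone in between
    return labels


if __name__ == "__main__":
    authors = ["alice"] * 96 + ["bob"] * 3 + ["carol"] * 1
    print(classify_contributors(authors))
```

In practice the author list could be drawn from a project's version-control history, one entry per commit.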
# Findings
- Graduated and retired projects differ in their code (graduated projects have less code per author and fewer directories per author), processes (graduated projects commit more and delete more), and quality (graduated projects have more complex code and more test code).
- Graduated and retired projects follow different trajectories once they enter the incubator. Some projects are somewhat better equipped to graduate fast, while others strive for a more constant but less commit-heavy activity. Finally, retired projects are more likely to have a higher burden per contributor due to having fewer contributors, an increasing codebase size, and being less likely to attract new contributors.
- An increase in the following metrics increases the odds of graduation: lines of code, major and minor contributors, feature commits, corrective commits, medium-complexity functions (McCabe complexity 11-25), and very large functions. Conversely, an increase in the following metrics decreases the odds: top-level directories, average files modified per commit, very large file sizes, and code duplication percentage.
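
To make "increases the odds" concrete, here is a minimal sketch of how a coefficient translates into an odds ratio. It assumes a logistic-regression-style model, which this README does not confirm. The function names and the example coefficient are hypothetical and are not results from the study.

```python
import math


def odds_ratio(coefficient, delta=1.0):
    """Multiplicative change in odds when a metric rises by `delta`.

    Assumes a logistic-regression-style model, where the log-odds are
    linear in the metric. A positive coefficient gives a ratio above 1
    (odds go up); a negative one gives a ratio below 1 (odds go down).
    """
    return math.exp(coefficient * delta)


def updated_probability(p, coefficient, delta=1.0):
    """Apply the odds ratio to a baseline graduation probability `p` (0 < p < 1)."""
    odds = p / (1.0 - p)
    new_odds = odds * odds_ratio(coefficient, delta)
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    beta = 0.4  # hypothetical coefficient, for illustration only
    print(odds_ratio(beta))
    print(updated_probability(0.5, beta))
```

Under this assumption, a metric listed as increasing the odds would have a positive coefficient, and one listed as decreasing the odds would have a negative coefficient.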
# Paper
The paper can be found here.