Welcome to the 4th International Workshop on Software Health in Projects, Ecosystems and Communities

VIRTUAL (originally Madrid, Spain) - May 29th, 2021 - co-located with the 43rd International Conference on Software Engineering (ICSE 2021)

Factors impacting software health vary substantially depending on the viewpoint of the involved stakeholders: software quality attributes concern the source code artefacts, social factors concern the community of contributors maintaining the software artefacts, and business and legal factors concern commercial aspects of the software product. Because of this variety, there cannot be a unique definition of what constitutes software health, since it encompasses many different development and evolution attributes, including success, longevity, growth, resilience, survival, diversity, sustainability, license compatibility, etc.

As witnessed by recent initiatives such as the Linux Foundation's CHAOSS project on community health analytics and the SECO-ASSIST research project, there is a recognized need for socio-technical perspectives concerning software health in projects, ecosystems and communities. Such perspectives are challenging, due to the volatile storage of information regarding social relations, conflicts and interactions. There is a need to find better methods, techniques and tools to monitor software health, as well as to predict and take corrective measures when health implications arise. SoHeal 2021 will have a special focus on software ecosystem health, where these issues are more pronounced. Due to the socio-technical dimension of software ecosystems, measuring their health, identifying the issues, and fixing them is particularly challenging.

SoHeal aims to enable and promote collaboration between academia and industry, unifying the views on software health of researchers and practitioners. The workshop's goals are to: (i) raise awareness of practitioners' problems with software health; (ii) familiarize practitioners with the progress made by academia; and (iii) connect the two communities to advance the body of knowledge and the state of practice.

Important dates

  • Position paper submission deadline: 19 January 2021 (extended from 12 January 2021)
  • Industry/Practitioner talk proposal deadline: 19 January 2021 (extended from 12 January 2021)
  • Notification of acceptance: 22 February 2021
  • Camera Ready for accepted position papers: 12 March 2021
  • Workshop: 29 May 2021

Registration

Authors and attendees should register through ICSE.

Accepted Papers

Health is Wealth: Evaluating the Health of the Bitcoin Ecosystem in GitHub

Khadija Osman (Carleton University, Canada), Olga Baysal (Carleton University, Canada)

Abstract. Bitcoin is a virtual and decentralized cryptocurrency that operates in a peer-to-peer network providing a private payment mechanism. It is a multi-billion dollar cryptocurrency, and hundreds of other cryptocurrencies are created based on it. Bitcoin is based on open source software (OSS) development. This paper presents the first comprehensive study of the Bitcoin ecosystem in GitHub organized around the 481 most popular and actively developed Bitcoin-related projects over eight years (2010-2018). Our work includes manual categorization of the projects, defining software health metrics, classification of projects according to these health metrics, and evaluation of the health trends of the ecosystem. The main findings suggest that the Bitcoin ecosystem in GitHub is represented by nine categories of projects. Moreover, the health of the majority of the projects is assessed as "Low Risk".

Pre-print

A Quantitative Assessment of Package Freshness in Linux Distributions

Damien Legay (University of Mons), Alexandre Decan (University of Mons), Tom Mens (University of Mons)

Abstract. Linux users expect fresh packages in the official repositories of their distributions. Yet, due to philosophical divergences, the packages available in various distributions do not all have the same degree of freshness. Users therefore need to be informed as to those differences. Through quantitative empirical analyses, we assess and compare the freshness of 890 common packages in six mainstream Linux distributions. We find that at least one out of ten packages is outdated, but the proportion of outdated packages varies greatly between these distributions. Using the metrics of update delay and time lag, we find that the majority of packages are using versions less than 3 months behind the upstream in 5 of those 6 distributions. We contrast the user perception of package freshness with our analyses and order the considered distributions in terms of package freshness to help Linux users in choosing a distribution that most fits their needs and expectations.

Pre-print

Does the duration of rapid release cycles affect the bug handling activity?

Thorn Jansen (Eindhoven University of Technology), Zeinab Abou Khalil (University of Mons and University of Lille), Eleni Constantinou (Eindhoven University of Technology), Tom Mens (University of Mons)

Abstract. Software projects are regularly updated with new functionality and bug fixes through so-called releases. In recent years, software projects have been shifting to shorter, rapid release cycles that can affect software development processes such as the bug handling activity. Past research has focused on the impact of switching from traditional to rapid release cycles with respect to bug handling activity, but the duration of fast cycles has not yet been studied. We empirically investigate bugs and releases of 420 open source projects having rapid release cycles to understand the effect of variable and rapid release cycle durations on bug handling activity. We group the releases of these projects into five categories of release cycle durations. For each project, we investigate how the sequence of releases is related to bug handling activity metrics and the effect of the variability of cycle durations on bug fixing. Our results show that there is no statistically significant difference for any of the studied bug handling activity metrics in the presence of variable rapid release cycle durations. These findings indicate that the duration of fast release cycles does not seem to impact bug handling activity.

Pre-print

Open Source Community Health: Analytical Metrics and Their Corresponding Narratives

Sean Goggins (University of Missouri), Kevin Lumbard (University of Nebraska at Omaha), Matt Germonprez (University of Nebraska at Omaha)

Abstract. Open source projects are most often evaluated by potential contributors and consumers using metrics that describe a level of activity within the project because those measurements are available. The principal question in the minds of most evaluators, however, is "How healthy and sustainable is this project in the context of its competitors or dependent projects?" Limitations of current analysis methods focused on trace data alone are discussed and reviewed in depth. Next, our methods for conducting engaged field research, developing metrics standards as part of a corporate communal partnership, and molding tools that evolve through a reflexive discourse with practitioners using standard metrics are framed as an approach to consider for examining open source software health and sustainability. Researchers, in particular, need tools for increasing the feasibility of comprehensive, multi-project health and sustainability studies, and connecting trace data with human experience. From a practice perspective, these same conditions are increasing the difficulty organizations and individuals engaged in open source face when trying to understand the status, condition, and health of a particular project, the project's ecosystem or ecosystems emerging around their specific project context. This study examines the work of a Linux Foundation working group, CHAOSS (Community Health Analytics Open Source Software), during the first four years of its formation. The paper concludes with examples of CHAOSS metrics operationalized in partnership with corporate collaborators in a manner that emphasizes comparison, transparency, trajectory and visualization as components for discursive, evolutionary understanding of open source software health.

Pre-print

Accepted Talks

Choice Matters: Contrasting Package Manager User Experience

Raula Gaikovina Kula (Nara Institute of Science and Technology), Syful Islam (Nara Institute of Science and Technology), Bodin Chinthanet (Nara Institute of Science and Technology), Christoph Treude (University of Adelaide), Takashi Ishio (Nara Institute of Science and Technology), Kenichi Matsumoto (Nara Institute of Science and Technology)

Abstract. A package manager (PM) is crucial to most technology stacks, acting as a broker to ensure that a verified dependency package is correctly installed, configured, or removed from an application. Diversity in technology stacks has led to dozens of PMs with various features. While recent studies have shown that developers struggle to migrate their dependencies, the common assumption is that PMs are used without any issue. In this study, we explore sixteen PMs to understand whether their features correlate with the experience of their users. By studying experience through the questions that developers ask on the question-and-answer site Stack Overflow, we find that developer questions are grouped into three themes (i.e., PM management, Input-Output, and Package Usage). Our analysis results indicate that specific features are correlated with the user experience. Our work lays out future directions to investigate the trade-offs involved in designing the ideal PM.

Can Test Cases Foresee Software Health? Results from a Recent Empirical Study

Fabiano Pecorelli (Sesa Lab - University of Salerno), Fabio Palomba (Sesa Lab - University of Salerno), Andrea De Lucia (Sesa Lab - University of Salerno)

Abstract. Testing represents a crucial activity to ensure software health. Previous studies showed that test-related factors (e.g., code coverage) can predict the future software code quality, as measured by post-release defects. These studies provided compelling evidence on the relation between tests and post-release defects, yet they considered different test-related factors separately: as such, there is still a lack of knowledge of whether these factors remain good predictors when considered all together. In this talk proposal, we report the results of a recent study we conducted on how test-related factors relate to production code quality in APACHE systems. We investigated how the presence of tests relates to post-release defects; then, we analyzed the role played by the test-related factors previously shown as significantly related to post-release defects. From the practitioner's perspective, the findings of the study show that, when controlling for other metrics (e.g., size), test-related factors have a limited connection to post-release defects.
