Invited keynote: Can software be healthy?

Jesus Gonzalez-Barahona, LibreSoft research group, Universidad Rey Juan Carlos, Madrid (Spain)

Abstract: Health, as a metaphor, is often applied to software. At first sight, it is a tempting one. Everyone would like their software projects and products to be healthy. If some project is not healthy, something is wrong with it, and maybe whatever is wrong can be detected and fixed. This leads to the idea that software health, or indicators of being healthy, could be monitored, and that policies to increase the health of a project could be put in place and tracked. However, to what extent can the health metaphor be applied to software? What does it mean for a software product, or for the project producing it, to be "healthy"? The health metaphor seems to imply that there is some "ideal" state in which every piece of software should be. If it is not in that state, well, it is ill, and measures should be taken to heal it. But how can we define this ideal state? And if we could, does it correspond to some real situation? The talk will explore this path by comparing software (product and project) health with the concept of software quality. By focusing on free, open source software, where many quality models for projects have been proposed, we will explore whether we can learn some lessons that could be useful when discussing software health. Based on them, we will try to answer the question "what does it mean to be healthy?" for software products and projects.

Bio: Jesus Gonzalez-Barahona is co-founder of Bitergia, a company focusing on software development analytics. He is a full professor in telematics at the Universidad Rey Juan Carlos, where he co-leads the LibreSoft research group, focused on libre software and open collaboration in different areas such as software engineering, virtual communities and e-learning. He is an active member of the Linux Foundation's CHAOSS community, which focuses on community health analytics of open source software.

Presentation slides

Detailed Program

The paper abstracts are found at the end of the page

08:30-09:00   Registration
09:00-10:30   Session 1
  [15'] Introduction by the workshop chairs (Slides)
  [15'] Introduction of participants
  [45'] Keynote by Jesus Gonzalez-Barahona: "Can software be healthy?" (Presentation slides, Recording)
  [15'] Discussion
10:30-11:00   Coffee
11:00-12:30   Session 2
  [25'+5'] Josianne Marsan (with Patrick Marois, Mathieu Templier and Bogdan Negoita): "Corporate involvement in Open Source Software ecosystems: challenges and strategies to sustain ecosystem health" (Slides, Recording)
  [25'+5'] Amel Charleux (with Robert Viseur): "Exploring impacts of managerial decisions and community configuration on open source project health"
  [25'+5'] Giuseppe Iaffaldano (with Igor Steinmacher, Fabio Calefato, Marco Gerosa and Filippo Lanubile): "Why do developers take breaks from contributing to OSS projects? A preliminary analysis"
12:30-14:00   Lunch
14:00-15:30   Session 3
  [25'+5'] Sean Goggins and Matt Germonprez: "Open Source Health and Sustainability Metrics: The CHAOSS Working Group Update" (Slides, Recording)
  [25'+5'] Arun Azhakesan: "Healthy Coexistence: Can FOSS based compliance tools talk to software health tools and dashboards?" (Slides, Recording)
  [25'+5'] Vadim Zaytsev: "Ecosystem Health as a Reason for Migration: The Mainframe Case" (Recording)
15:30-16:00   Coffee
16:00-17:30   Session 4
  [25'+5'] Wolfgang Mauerer: "OSS community, health and ecosystem research: Theory and, or theory versus practice?" (Recording)
  [25'+5'] Andrea Capiluppi (with Nemitari Ajienka): "Software Ecosystems and Application Domains: An Empirical Study based on Java Software" (Slides, Recording)
  [25'+5'] Armstrong Foundjem: "Onboarding Experience at the Edge of Software Ecosystem Health" (Recording)
17:30-18:00   Discussion & Wrap-up

Josianne Marsan (with Patrick Marois, Mathieu Templier and Bogdan Negoita): Corporate involvement in Open Source Software ecosystems: challenges and strategies to sustain ecosystem health

Today's organizations are facing more volatile and complex environments, to which they must creatively adapt by reinventing themselves. Being innovative requires new forms of organization that can be supported by open and collaborative interactions, such as creativity networks, crowdsourcing, or innovative ecosystems. In the particular context of software development, open source software (OSS) projects follow the same logic of collaboration. Evidence shows that public and private businesses increasingly tend to become involved in large-scale OSS projects, namely OSS ecosystems, to reduce effort and accelerate innovation (Herbsleb et al., 2016). As a result, independent and voluntary contributors as well as corporate employees interact in OSS ecosystems within a shared market for software and services. Such interaction results in many challenges, notably with respect to the commercial interests of firms, which contrast with the basic values of OSS communities (Fitzgerald, 2006). On the one hand, the relationship between the community and the businesses could provide benefits for both sides, including knowledge transfer and learning, reputation, financial support, or sharing of strategic resources (Germonprez et al., 2016). On the other hand, the participation of businesses might constrain development activities and limit the self-governance of OSS communities (Capra et al., 2011). In some cases, the relationship deteriorates and, as a result, leads to severe negative impacts threatening the long-term health of the ecosystem, as illustrated by the engagement of Oracle in the LibreOffice project (Gamalielsson and Lundell, 2014). Therefore, there is a need to better understand the various challenges that arise in OSS corporate-communal engagements, and how to cope with such threats in order to encourage mutual benefits and sustain the long-term health of OSS ecosystems.
To do so, we interviewed 18 OSS ecosystem experts, covering various roles (e.g., developers, maintainers, community managers), and having many years of experience and contributions in large international OSS projects. In this talk, we will present insights from our experts with respect to the main challenges related to OSS corporate-communal engagements, as well as strategies to help carry out a smooth and sustainable involvement. We will illustrate our findings with evidence from the field and concrete examples of corporate engagement.

Amel Charleux (with Robert Viseur): Exploring impacts of managerial decisions and community configuration on open source project health

This research aims to explore the impacts of managerial decisions and community configuration on the health of open source communities and ecosystems. Combining a longitudinal single case study with more than 46 interviews with open source community members (editors and contributors), we were able to identify key managerial changes that impact community activity and highlight how certain strategic parameters of the project determine the community configuration and its health. In this study, we use health to cover a wide range of aspects, going from the dynamism of external contributions to the valuation of contributors and contributions and the dissemination to end users. We first show that managerial decisions are as important as technical choices for software health. Some purely managerial decisions, regarding governance structures for example, can impact the community's willingness to contribute and therefore the resources available to develop the software. Secondly, we put forward the importance of having a coopetitive community (i.e., a community with several competing firms) to stimulate and attract valuable users and contributors who, according to our interviewees, contribute greatly to software health.

Giuseppe Iaffaldano (with Igor Steinmacher, Fabio Calefato, Marco Gerosa and Filippo Lanubile): Why do developers take breaks from contributing to OSS projects? A preliminary analysis

Creating a successful and sustainable Open Source Software (OSS) project often depends on the strength and the health of the community behind it. Current literature explains the contributors’ lifecycle, starting with the motivations that drive people to contribute and barriers to joining OSS projects, covering developers’ evolution until they become core members. However, the stages when developers leave the projects are still weakly explored and are not well-defined in existing developers’ lifecycle models. In this position paper, we enrich the knowledge about the leaving stage by identifying sleeping and dead states, representing temporary and permanent breaks that developers take from contributing. We conducted a preliminary set of semi-structured interviews with active developers. We analyzed the answers by focusing on defining and understanding the reasons for the transitions to/from sleeping and dead states. This paper raises new questions that may guide further discussions and research, which may ultimately benefit OSS communities.
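
As a rough illustration of the sleeping and dead states mentioned in this abstract, the sketch below labels a developer by the time elapsed since their last contribution. It is not the authors' method: the abstract gives no operational definition, so the thresholds (90 and 365 days), the State enum and the classify function are all hypothetical, and a "permanent" break can only be approximated by a long period of inactivity.

```python
from datetime import date, timedelta
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    SLEEPING = "sleeping"  # temporary break from contributing
    DEAD = "dead"          # break long enough to be treated as permanent


# Hypothetical thresholds; the abstract does not define any.
SLEEPING_AFTER = timedelta(days=90)
DEAD_AFTER = timedelta(days=365)


def classify(last_contribution: date, today: date) -> State:
    """Label a developer by the time elapsed since their last contribution."""
    idle = today - last_contribution
    if idle >= DEAD_AFTER:
        return State.DEAD
    if idle >= SLEEPING_AFTER:
        return State.SLEEPING
    return State.ACTIVE


if __name__ == "__main__":
    # Made-up developers and dates, for illustration only.
    today = date(2019, 5, 28)
    last_seen = {
        "alice": date(2019, 5, 1),
        "bob": date(2018, 12, 1),
        "carol": date(2017, 1, 1),
    }
    for name, last in last_seen.items():
        print(name, classify(last, today).value)
```

A time-based rule like this only detects that a break happened; the reasons for the transitions, which are the focus of the interviews, cannot be derived from activity data alone.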

Sean Goggins and Matt Germonprez: Open Source Health and Sustainability Metrics: The CHAOSS Working Group Update

The Linux Foundation's CHAOSS working group is just over two years old, and is focused on the definition of health and sustainability metrics that can be referenced by industry, contributors and academics. It comprises three active working groups: a core group, a group focused on growth, maturity and decline, and a group focused on diversity and inclusion. This talk will provide a summary-level report of the progress of each group, while aiming to facilitate discussion among participants regarding what types of metrics might be missing, most critical or most challenging to implement consistently. Since its inception at the Linux Foundation’s Open Source Leadership Summit in Spring 2017, the CHAOSS project has become a Linux Foundation project, was officially announced by Jim Zemlin at the Open Source Summit North America in Fall 2017, and has come to include members from 70+ organizations such as Red Hat, Pivotal, Intel, Mozilla, The Eclipse Foundation, Oath, and Bitergia. Both Matt Germonprez and Sean Goggins are founders and board members of the CHAOSS project. The mission of the CHAOSS project is to: 1. Produce integrated, open source software for analyzing software development, and definitions of the standards and models used in that software in specific use cases; 2. Establish implementation-agnostic open source project health indicators for measuring community activity, contributions, and health; and 3. Optionally produce standardized open source project health indicator exchange formats, detailed use cases, models, or recommendations to analyze specific issues in the industry/OSS world. Augur is an open source software project that is a CHAOSS community technology. Augur fills an important technology niche not addressed by other analytics tools within CHAOSS. Augur is focused on building human-centered open source project health indicators, defined by collaborations of the CHAOSS project and other open source project stakeholders.
Augur is a project focused on making sense of data using four key human-centered data science strategies: 1. Enabling comparisons, letting people navigate complex unknowns analogically as well as see how their project compares with others they are familiar with. 2. Making time a fundamental dimension in all open source project health indicators, as point-in-time scores are useful. Augur makes historical comparisons that can be used to anticipate a trajectory. 3. Making the provenance from raw data to open source project health indicator visualizations transparent. 4. Making visualizations downloadable as .csv, .svg, or other data exchange formats, because (a) people trust open source project health indicators when they can see the underlying data and (b) providing traceability back to the CHAOSS project open source project health indicators requires easy transparency.

Arun Azhakesan: Healthy Coexistence: Can FOSS based compliance tools talk to software health tools and dashboards?

This talk proposal summarizes the need for a more health-metrics-based, effective and collaborative approach to handling the compliance requirements for using open source software, and how software health metrics can contribute to a cohesive compliance process. Usage of open source software for development is ever increasing, and so is the need for effective compliance processes. The focus is now shifting from proprietary solutions to FOSS based compliance tools. Compliance teams should aim at providing a seamless and non-intrusive process to assess open source software usage in their products. Compliance processes should be designed keeping in mind the dynamic requirements of continuous development practices. Leveraging the potential of various FOSS based compliance tools can provide significant improvements and increased effectiveness for the compliance process in software development. The usage of open source-based tools for compliance should not be limited to license scanners and copyright information identifiers; it should also include tools that could determine the quality of the potential open source software that could be used in the product. The talk will explore the possibilities of synergies that can elevate the compliance process to the next level. Making choices based on software health is a key factor in determining the obligations and risks involved. The compliance process should enable the development team to make the right choice. A centralized approach to evaluating the health metrics of the potential open source software based on historical data can be an efficient method. Creating awareness among the developer community about the quality metrics for using open source software in products can be a great booster to the open source cultural shift in organizations.

Vadim Zaytsev: Ecosystem Health as a Reason for Migration: The Mainframe Case

At SoHeal'19, we would like to discuss software health at the ecosystem level as a reason for migration away from that ecosystem even if its technical aspects are more than satisfactory, and to share our experience in migrating from mainframe to .NET, LLVM, etc.

Wolfgang Mauerer: OSS community, health and ecosystem research: Theory and, or theory versus practice?

Using substantial amounts of OSS in industrial products is a common reality. However, while a decade or two ago licensing issues, questions about code maturity and (perceived) legal problems with contributing back to upstream communities were crucial for industrial users of OSS, new challenges have started to enter the stage: In particular, the reuse of often tiny components (or: people pulling in npm modules like popping M&Ms) not only leads to an increasing number of technical dependencies, but also places greater stress on the question of trust in an increasing number of people, communities and ecosystems. Well-known, large communities like the Linux kernel are an often pursued target of scientific analysis, but industry tends to work with and in these communities anyway. Smaller communities of projects that receive less publicity, but are nonetheless fundamental for the overall ecosystem, are typically less well understood by industry, and their health and other properties are hard to quantify automatically because, from a scientific point of view, reasonable statistical data is lacking. This leads to a gap between what industry needs to know, and the insights science can provide. In this (probably opinionated) talk, we discuss this gap from two often opposite sides, practical industrial application and fundamental software engineering research: As a researcher, the author has never understood why industrial belief in software engineering research seems to often stop at using design patterns, and why industry does not try to benefit more from scientific insight. As an industrial practitioner, the author has never understood why academia would need to tell industrial engineers who have participated in OSS projects for years what they have done, post facto, and why research does not listen more closely to what industry is interested in, and needs to know.
By unsplitting his two industrial and scientific personalities, and guided by examples from his scientific and industrial work, the author will discuss what missing overlaps between the two worlds exist, and how both sides can contribute to closing them.

Andrea Capiluppi (with Nemitari Ajienka): Software Ecosystems and Application Domains: An Empirical Study based on Java Software

Research on empirical software engineering has increasingly used data from online repositories or collective efforts. The latest trend among researchers is to gather as much data as possible to (i) prevent bias in the representation of a small sample, (ii) work with a sample as close as possible to the population itself, and (iii) showcase the performance of existing or new tools in treating vast amounts of data. The effects of harvesting enormous amounts of data have been only marginally considered so far: data could be corrupted, repositories could be forked, and developer identities could be duplicated. In this paper we posit that there is a fundamental flaw in harvesting large amounts of data and in generalising the conclusions: the application domain, or context, of the analysed systems must be the primary factor for the cluster sampling of FOSS projects. In this paper we analyse a sample of software systems and, using an existing approach based on Latent Dirichlet Allocation (LDA), we derive their application domains. We extract a suite of structural OO metrics from each project, and cluster projects by domain: we show that most of the chosen metrics come from different populations, depending on the application domain.
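
One way to check whether a metric "comes from different populations" across application domains is a rank-based test such as Kruskal-Wallis. The abstract does not name the test the authors used, so the sketch below is an assumption for illustration only: it computes the Kruskal-Wallis H statistic (without tie correction) in plain Python, and the domain names and metric values are made up.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


if __name__ == "__main__":
    # Hypothetical per-project values of one OO metric, grouped by domain.
    metric_by_domain = {
        "web": [4.1, 5.3, 4.8, 6.0],
        "database": [7.9, 8.4, 9.1, 7.2],
        "games": [5.5, 6.1, 5.9, 6.8],
    }
    h = kruskal_wallis_h(list(metric_by_domain.values()))
    print(f"H = {h:.3f}")
```

For larger samples, H is usually compared against a chi-squared distribution with (number of groups - 1) degrees of freedom; a large H suggests that at least one domain differs from the others for that metric.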

Armstrong Foundjem: Onboarding Experience at the Edge of Software Ecosystem Health

In this paper, I introduce the phenomenon of onboarding experience as an essential factor within an open source software ecosystem that may affect its health, either positively or negatively. In order to understand how this phenomenon relates to software ecosystem health, I interviewed 23 top developers at the OpenStack Foundation. I also conducted a focus group study with eight participants during a two-day, pre-summit onboarding training, which is organized by the OpenStack upstream University. Based on these studies, I found that onboarding experience is a remarkable phenomenon, which can improve the health of a software ecosystem by strengthening growth and survival, promoting diversity and inclusion, and ensuring a successful platform for businesses.