A Primer on the Open Source Movement from a Health Care Perspective

Open source, in myriad forms, has emerged as a significant development model that drives both innovation and technological dispersion. Ignore it at your peril, as did the major computer companies destroyed or totally remade by Linux and free software, or encyclopedia publishers by Wikipedia, or journalists and marketers by social media. The term "open source" was associated first with free software, but it goes far beyond software now. People around the world use open hardware, demand open government, share open data, and--yes--pursue open health. The field of health, in particular, will be transformed by open source principles in software, in research, in consultations and telemedicine, and in the various forms of data sharing all these processes call for. This article starts with a definition of open source and a look at some major open source software projects in health care. I then present the dominant research in open source by examining several books on the topic, many of them classics. These authors search for answers to the following frequently-asked questions:

  • How can an accumulation of scattered contributors produce a coherent product?
  • What motivates volunteers to contribute to a project where they don't even know the other contributors?
  • How is quality maintained during this relatively uncoordinated effort?
  • How can leaders maximize the value of cooperating volunteers in a community?
  • What economic model can sustain an open source project?

The main books consulted for the article are:

I will touch on several other interesting texts along the way:

Many of these books are available for free on the Internet and have open licenses, in keeping with the sympathies of their authors for transparency and the free flow of information. But I encourage you to patronize the original publishers who invested in the production and dissemination of the books. By the way, although I did not deliberately choose to promote books by my own company (O'Reilly Media), it turns out that many of the books on this list were published there (and several were edited by me), given our historic role in promoting open source technologies and ideals.

Eric S. Raymond: Who, and what, lie behind open source software?

A great deal of mystery swirls around the process by which magnificent free software projects such as Linux emerge. Some people also harbor persistent myths about open source, such as that it is created by amateur hackers sitting in their bedrooms, or is not robust and trustworthy, or lacks support.

Several books can help to set readers on the path to enlightenment regarding the viability of free software. In addition, some will shine a much-needed light on the intriguing processes that lead to its creation. In this section, I will focus on Raymond’s Cathedral & the Bazaar because it is historic and investigates much of the motivation behind those who participate in free software. But it is admittedly old--the first version of Chapter 1 was written in 1996, the first edition of the book was published in 1999, and its final revision was in 2002.

Thus, Raymond should be supplemented by Coleman’s Coding Freedom, which offers a real-life look at a significant free software community. Whitehurst’s Open Organization shows how different but recognizable principles can motivate a successful open source company. And two other books, The Art of Community and Producing Open Source Software, demonstrate that open source projects can be managed through predictable techniques.

Raymond has a background well-suited to a deep investigation into the workings of open source communities. He is a dyed-in-the-wool hacker (of the type memorialized by Levy in his classic book), who maintained a personal web site for a long time and has worked on an ancient “jargon file” for aspiring hackers. But Raymond writes this book in order to offer practical reasons for open source. For instance, Chapter 4 claims to provide "a hard-nosed economic explanation of what makes open-source cooperation sustainable" and contains eight economic models, only a couple of which have proven to be sustainable. Raymond also draws on his background in anthropology (as does Coleman). Interestingly, Raymond is also a libertarian, belying the popular conception of open source advocates as communitarians or socialists (although there is also a strain among computer programmers of “California optimism” that aligns with libertarianism and co-exists just fine with free software).

Perhaps Raymond’s most prophetic contribution was his early recognition that loosely associated groups of cooperating peers can out-perform hierarchical organizations. This theme, which challenges the philosophical basis of capitalism as well as government, later became central to Benkler’s much-cited The Wealth of Networks, as well as to Howe’s more popular screed Crowdsourcing and to Rifkin’s idiosyncratic but intriguing The Zero Marginal Cost Society.

Raymond also foresaw that, in order for peers to succeed in forming productive communities, software tools may be crucial. For instance, Chapter 2 of his book suggests why Linux did not emerge until the 1990s even though many versions of the concept on which it was based (the Unix operating system) had been circulating two decades before that: the Internet had to become faster and support data exchange more easily. The significance of technological advances for peer collaboration comes up two decades later in Noveck’s Smart Citizens, Smarter State.

Some of the classic principles of free software that Raymond first publicized in Chapter 2 of his book include:

  • “Every good work of software starts by scratching a developer's personal itch.” Chapter 4 points out that 95% of development is done in-house, not for sale, so costs are already absorbed by the business. The model of software development that was best known in the 1990s--individual licenses of software on disk--was actually an anomaly, and by now is even more so. Raymond’s observation that businesses create software first and foremost to “scratch an itch” now underlies a common software development model that I call closed core.
  • “Release Early, Release Often.” Developing your software incrementally and letting the community bang on it generates useful feedback quickly. It took a couple of decades for the general software community to re-discover this open source principle and embody it in development models such as Agile and Scrum.
  • The doctrine that "given enough eyeballs, all bugs are shallow." This claim has a specific technical basis that I describe elsewhere. It is not entirely reliable, as the kerfuffle over the Heartbleed flaw in key open source security software (OpenSSL) a couple of years ago showed. But the open source community has proven very adept at fixing bugs once they are found.
  • The concept of a minimum viable product decades before the business community discovered it. Raymond writes: "Your program doesn't have to work particularly well. It can be crude, buggy, incomplete, and poorly documented. What it must not fail to do is (a) run, and (b) convince potential co-developers that it can be evolved into something really neat in the foreseeable future."
  • The claim that conventional tools and goals of management (coordination, motivation) become irrelevant in a network of volunteers. A more refined examination of this principle comes much later in Whitehurst’s Open Organization.

Before I come up from the rich mines of The Cathedral & the Bazaar, I’ll list a few other intriguing insights from the book:

  • The importance of open, widely recognized formats to allow sharing in the noosphere (or ergosphere, a term invented in Chapter 3). Without a shared format, Raymond warns, content or software is tied to the machine where it resides, and therefore is owned by the owner of that machine (or nowadays, a virtual system in the cloud). One could be induced to think that Raymond is commenting directly on the current state of data in health care.
  • Raymond recognizes the importance of ego and reputation in volunteer contributions, and suggests both indulging these needs (in Chapters 2 and 3) and moving people away from this “territorial” instinct (in Chapter 3).
  • In section 4.9.3 (“Give Away the Recipe, Open a Restaurant”), although Raymond did not directly anticipate the modern cloud, he essentially describes the model of several companies that later offered open-source software: WordPress, Heroku, Cloudera, and far too many others to list.

Gabriella Coleman: Passion and changing the world

Raymond’s cool appraisal of free and open source software, set on showing that it’s economically and psychologically robust, leaves little room for passion. Chapter 3 admits that ideological fervor drives some contributions, but The Cathedral & the Bazaar doesn’t enquire much further into the idea that passion could drive open source work.

However, passion is what drives most people who enter healthcare, and particularly those who put information technology under the microscope. I found this passion among the people I interviewed about OpenMRS, Sage Bionetworks, the tranSMART foundation, and Open mHealth. So let’s turn to another and much more contemporary book, Gabriella Coleman’s Coding Freedom, to see what mission-driven health care workers can learn about open source.

Coding Freedom studies the community that maintains Debian, one of several hundred distributions of Linux built by companies, communities, or individuals. Before I explore this community, I feel it important to justify the exploration by providing a bit of history around Linux--or GNU/Linux, as many free software advocates insist on calling the distributions. We will see why.

Linus Torvalds was famously just a college undergraduate when he put out Linux as a kernel in 1991. A kernel does very little that the eye can see: it schedules jobs, manages memory, and does other things we rarely think about when using our computers. To fill the enormous gap between machine operations and user interface, Linus made sure his kernel worked with (and was buildable by) a set of utilities from the GNU project. In fact, before making his historic announcement, he pointedly let users know they could run commands to control the kernel, by getting a utility called Bash to run. There would be no Linux without GNU; thus, although the kernel is properly called just Linux, the entire operating system is often called GNU/Linux.

Atop this powerful but geeky combo, most users want graphical interfaces, useful applications such as Web browsers, and other conveniences of 20th and 21st century living. Several small companies thus sprang up to accumulate handfuls of free software (and sometimes proprietary add-ons as well) and to sell Linux as part of a larger distribution, first on floppy diskettes and then on CD-ROMs. Each had strengths and weaknesses, but none opened up the full potential of Linux. Furthermore, free software advocates were frustrated by their lack of input into the decisions of the companies making the distributions.

It was a heady time. In 1993, a couple of people searching for a way into the tech business founded Red Hat, which soon became the leading vendor of GNU/Linux distributions and is now worth tens of billions of dollars. We’ll hear more about that company later. Around the same time, however, a group of Linux activists decided to go in an entirely different direction and to bring the community development model of Linux and other free software into the task of delivering a distribution. This was Debian.

Although Red Hat offers a free software version of its distribution and thrives as a commercial venture, it is Debian that most Linux users look to as a canonical distribution. Debian is the basis of many other distributions, including the popular Ubuntu (used on most Linux desktops and now also on servers). Ubuntu is run by a company, but Debian has remained purely a community of volunteers. It is nearly unique as a large, worldwide community responsible for critical software, and is thus both a valuable model for open source proponents and an excellent choice for a cultural anthropological project--which brings us to Gabriella Coleman and Coding Freedom.

Coleman rips open the psychology of the passionate hacker in a way I haven’t found in any other book. In place of Raymond’s business models and tit-for-tat game theories, she declares, "free software development is not simply a technical endeavor but also a moral one" (Chapter 4). She extols hackers’ “liberal visions and romantic sensibilities” (Introduction), compares their interactions to the creativity of a jazz ensemble (Chapter 3), and describes the sheer joy hackers experience at conferences--“ritual-like affairs” (Chapter 1) where they find others who care about and can talk about a shared mission.

Coding in Coleman’s world is an emotional activity, and communities are formed around shared feelings. This would be a wonderful atmosphere to cultivate in health care, a field already teeming with caring professionals. Over the past five years, developers have streamed into health care, overloading app stores (often with apps that don’t actually improve health) and piling into the coding challenges arranged by various health care organizations. But programmers with high aspirations run into barriers right away:

  • Awkward medical codes that are hard to manipulate and difficult to integrate into a workflow. Some, such as ICD-10, are driven by billing instead of treatment. Others, such as SNOMED, are extremely granular and sometimes involve extra attributes, so care is needed to present the appropriate choices in a way that lets clinicians choose properly. Too many devices and electronic systems--such as lab equipment--shirk the responsibility of assigning codes, shoving it onto staff.
  • Outdated regulations that were instituted for patient safety but impose arbitrary restrictions on activities and evolve creakily to reflect changes in medical settings and practices.
  • Privacy rules that developers are not accustomed to, as important as they may be to quelling a Wild West data environment.
  • Unspoken and impenetrable workflows, usually more directed to billing than health care delivery.
  • Arcane distinctions between disciplines, ensconced in organizational walls built up over decades, which hamper the continuum of care and patient hand-offs and are a major source of medical mistakes.
  • Subject-matter experts from clinical staff whose input is crucial but who may resist change or have unreasonable expectations.

Thus, developers have to undergo weeks of what resembles the standardized testing required for civil service jobs--a discipline entirely unlike the skills they have cultivated as top programmers--and learn to listen empathetically to a range of clinicians, administrators, and patients. These tasks distance the health care field from the cheerful haunts of Debian development. Debian’s talk of software building standards, version numbering, file system layouts, and other operating system requirements seems nerdy to outsiders but is comfy for developers. The same cannot be said for health care app development.

Yochai Benkler: The significance of open source outside the field of software

Raymond aptly captured open source in the 1990s, providing an update to the view of hackers that Steven Levy gave the world in 1984. Starting in the 2000s, another generation of books on open source looked beyond free software to other social movements. These titles include Ghosh’s CODE, Howe’s Crowdsourcing, and Rifkin’s The Zero Marginal Cost Society. In The Wealth of Networks, Benkler points to three major aspects that distinguish the communities that concern his book (Introduction, pp. 4-5): they are non-proprietary, non-market, and networked. Benkler announces the end of the unchallenged expert. Like Raymond, he champions networks over hierarchies. He has little praise for controlled systems, but of course understands that coordination of some type is still necessary. Coleman’s Coding Freedom stresses that large software projects are consciously organized, a point also made in Fogel's Producing Open Source Software.

Health care is a collaborative activity. If anything, the need for collaboration is ever-expanding: ACOs and continuity of care require organizations to work tightly together, and the professionals are also being told to incorporate patients and their families into the team. So health care should be able to connect better with its mission by examining the lessons of open source movements where collaboration must be solicited, reaped, and rewarded.

Benkler stresses that open source requires individual autonomy, because voluntary participation is the beginning of collaboration. Autonomy is relevant to the modern patient, who needs to get out from under the thumb of the paternalistic medical system. Ironically, the greater autonomy provided by computing power and Internet sources of information promotes people working together.

As a law professor, Benkler is chiefly concerned with improving the legal framework for the information exchange that provides the lifeblood to collective creativity. A good chunk of the book covers such issues as copyright, patents, and ownership rules that have little resonance for health care professionals, although they are certainly relevant to pharmacological research and occasionally other issues as well. (The books by law professor Lawrence Lessig also take on this battle, and it forms a large part of Coleman’s book as well.)

Benkler’s Chapter 9 (pp. 352-353) suggests that post-docs in biology could be harnessed to work together on finding cures for conditions that would not generate enough financial payback to interest pharma companies. Benkler does not suggest who would handle other critical aspects of drug development, such as clinical trials, under this scenario. In fact, he admits, “This proposal about medicine is, at this stage, the most imaginary among the commons-based strategies for development suggested here.” In the model I called “closed core,” companies contribute to free software in a situation comparable to pre-competitive research in pharma.

Readers of this article can ask: even if we aren’t concerned with Benkler's precise list of topics, what artificial hurdles does the health care field throw up all its own? What could be achieved by freer information sharing, and how can we persuade the field to loosen its constraints? Benkler’s mission is to break down the intellectual property monopolies--a mission not of direct interest to health care readers, but a useful analogy to think of when looking for traits of health care that hold back participation.

Jono Bacon: Governing the ungovernable

Seeing how open source development is passion-driven, relentlessly non-hierarchical, and based on individual goal-setting by each contributor, one can easily conclude that the process is unmanageable. Yet every author in this survey insists that open source requires management. In addition, the values one cultivates as an open source manager--flexibility, candor, the ability to listen, skills in mediation--work superbly in other organizations and life ventures as well.

This section touches on strategies to manage open source projects, drawing largely from Jono Bacon’s Art of Community: Building the New Age of Participation. Management tools and practices also appear in Karl Fogel’s Producing Open Source Software: How to Run a Successful Free Software Project. Jim Whitehurst’s Open Organization and Jeff Howe’s Crowdsourcing claim that similar principles work in the business world, while Beth Simone Noveck’s Smart Citizens, Smarter State applies them to government and Rick Falkvinge’s Swarmwise claims they work in a wide range of commercial and non-profit settings.

Bacon brought invaluable experience to The Art of Community from his job as community manager of the Ubuntu project. As mentioned earlier in this article, Ubuntu is a Linux distribution based on Debian. Unlike the free-for-all community around Debian, however, Ubuntu was launched and is still maintained by a single company, Canonical. Therefore, Ubuntu, with a certain degree of centralized decision making and a heightened role for full-time paid staff, is controlled differently from Debian. Bacon was acutely conscious of the responsibility borne by his company for driving Ubuntu forward, and equally dedicated to making sure that community members outside the company felt valued and listened to.

Thus, Bacon starts off his book with communication practices that open source leaders are wise to follow. These are basically to communicate often and to be as transparent as possible. Whitehurst and other authors would certainly endorse these traits. Whitehurst, who is president and CEO of the Red Hat mentioned earlier, does an excellent job presenting the Internet-age mode of interaction for a business leader. He translates popular community values of transparency and grassroots activism into concepts that appeal to managers, such as “employee motivation” and “change management.” His book helps to explain the astounding continued growth of Red Hat, one of very few software companies whose products are totally open source. (To summarize the book very broadly, Whitehurst says: bring all stakeholders into important decision-making processes, and both motivation and change management will follow.)

Top-down communication (including the use of the press and social media) receives the most attention in The Art of Community, but managing two other forms of communication--bottom-up and side-to-side--is also crucial. In particular, one must be alert to conflict and extreme negativity among participants, and handle these gingerly but clearly.

Open source projects require the same forms of management as other projects. In the second edition of his book, Bacon taps into the current mania for data and adds a chapter on tools for measuring progress. These measurements apply to the growth and health of the community as well. For instance, do you have enough senior community members, or are members burning out and dropping away? Do you have enough new members, or is your community stagnating?

Prospects for open source in health care

The health care industry--obsessed with risk avoidance, intensely hierarchical, heavily silo’d and suspicious of outsiders--may seem poor soil for open source values. But change is just over the horizon. Conditions favorable to open source are congealing, whereas proprietary software models tend to fail in health care.

Can you envy a software firm that has to spend hundreds of thousands of dollars to develop and prove the validity of an app, and then tries to recoup those costs by selling it for a dollar or two? Or has to strike deals with scads of different health care providers and payers? These models are unsustainable even for firms that try to make their money on hardware. Most offer services and sell data on their users for extra revenue--but the first falls victim to fickle users who give up on a product, while the second raises ethical and privacy challenges. No wonder that software and data analytics are rarely found in healthcare, except for a highly fragmented and pathetic bevy of data repositories that replace paper records.

Meanwhile, health care providers and payers are facing stresses that call for new software solutions. Practices that were only marginally profitable before are now stressed even more, while those that are still raking in the dollars face payment models that require data crunching, information exchange and the smart application of knowledge. Cognitive loads increase through precision medicine, personalized medicine, and population health. Pay-for-value models reward collaboration, so the old silos will eventually have to come down. Once institutions are cooperating, they have incentives to make data sharable and support software development as a strategic cost-cutting measure. Open source fits this environment quite jollily, whereas proprietary vendors find themselves stuck in intellectual property traps that their potential customers are abandoning.

What reformers need to do is explain to the patients, payers, and providers how the pieces fit together. Open source is an integral part of a strategy to improve care and lower costs through teamwork and data analysis. If we have a future in health care, it is bound up with free, ubiquitous, and appropriate software.