Munnecke on "Dots-First" vs. "Links-First" Metadata Approach, or Why ICD10 is Going to Fail

Here is a 1986 letter from Rep. Sonny Montgomery, chair of the House VA committee, to VA Administrator Thomas Turnage about NHS metadata sharing. Note that, even in 1986, the Committee on Veterans’ Affairs was savvy to, and advocating, the use of metadata (then called the “data dictionary – a roadmap to the database”).  It understood its use in VistA (then called DHCP) and its role in portability (then with the Indian Health Service), and it hoped to see it used for the Department of Defense’s Composite Health Care System.

Today, metadata is a household word, given the NSA’s use of it.  But it reflects an entirely different perspective on how we view complex systems.

Imagine a complex system, represented by millions of dots, with even more connectors between the dots.  We can think of the dots as representing the “data” in the system, and the connectors (links) representing the “metadata” in the system. This perspective generates an overwhelming number of dots and links, well beyond any human capacity to understand.

One way to approach this complexity I’ll call the “Dots-first” approach.  This approach tries to categorize the dots, pigeonholing them into a predefined hierarchy of terms: “A place for every dot, and every dot in its place.”  This goes back to Aristotle, and the law of the excluded middle.  Something is either A or Not A, but not both.  We just keep applying this “law” progressively until we get a tidy Aristotelian hierarchy of categories. 

Libraries filed their books this way, according to the Dewey Decimal system.  If you wanted to find a book, you could look in a card catalog for title, author, and subject, then just go to the shelves to find the book.  The links between the dots are largely ignored.  For example, it would be impossible to maintain the card catalog by all the subjects referenced in all the books, or all of the references to other books and papers.  Order is maintained by ignoring links that don’t fit the cataloging/indexing system.
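To make the Dots-first idea concrete, here is a minimal sketch of such a catalog. The class name, the titles, and the class numbers are all made up for illustration, and this is not how any real library system is built. The point is only that each book gets exactly one slot, and whatever it links to is thrown away at filing time.

```python
from collections import defaultdict


class DotsFirstCatalog:
    """Hypothetical catalog: one class number per book, links ignored."""

    def __init__(self):
        # class number -> titles filed under that number
        self.shelves = defaultdict(list)

    def file(self, title, class_number, mentions=()):
        # 'mentions' (subjects and works the book refers to) is accepted
        # but deliberately dropped: it does not fit the filing scheme.
        self.shelves[class_number].append(title)

    def find(self, class_prefix):
        # The only question the catalog can answer: what sits under this class?
        return sorted(
            title
            for number, titles in self.shelves.items()
            if number.startswith(class_prefix)
            for title in titles
        )


catalog = DotsFirstCatalog()
# Class numbers below are illustrative placeholders, not looked-up values.
catalog.file("Clinical Handbook", "610.7", mentions=["turtle bites"])
catalog.file("Whales of the Pacific", "599.5", mentions=["Clinical Handbook"])
print(catalog.find("610"))
```

Nothing in this structure can answer "which books mention turtle bites?", because that information was discarded when the book was filed.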

An alternative approach I’ll call the “Links-first” approach.  This approach focuses on the links, not the dots.  It revels in lots of links, and manages them at a metadata level, maintaining the context of the information.  It can work with the Dots-first categorization schemes, but it doesn’t need them.  This is the approach taken by Google.  It scans the web, indexing information, growing the context of the dot with every new link established.
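A rough sketch of the Links-first idea, under heavy assumptions: this is not Google's algorithm or VistA's data dictionary, and the class, relation names, and titles are hypothetical. It only shows a store where the links are the primary data, and where a dot's context grows each time a link is added.

```python
from collections import defaultdict


class LinkIndex:
    """Hypothetical Links-first store: a dot is known by its links."""

    def __init__(self):
        # dot -> set of (relation, other dot)
        self.links = defaultdict(set)

    def link(self, source, relation, target):
        # Record the link in both directions so either dot can be a way in.
        self.links[source].add((relation, target))
        self.links[target].add((f"inverse:{relation}", source))

    def context(self, dot):
        # Everything currently known about a dot is simply its links.
        return sorted(self.links.get(dot, set()))

    def search(self, term):
        # Find dots that are linked to anything whose name matches the term.
        term = term.lower()
        return sorted(
            dot
            for dot, edges in self.links.items()
            if any(term in other.lower() for _, other in edges)
        )


index = LinkIndex()
index.link("Reptile Handbook", "mentions", "turtle bites")
index.link("Emergency Medicine Notes", "cites", "Reptile Handbook")
index.link("Emergency Medicine Notes", "mentions", "turtle bites")
print(index.search("turtle"))
print(index.context("Reptile Handbook"))
```

No category had to exist before the links were recorded. A classification number, if one is available, would just be one more link on the dot.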

If a book had a Dewey Decimal System number assigned to it, Google would pick it up as just another piece of metadata.  Users could search for the book using it, but why would they?  Why revert to the “every dot in its place and a place for every dot” scheme when you can use the much richer contextual search that Google provides?

Sonny Montgomery – in 1986 – was advocating the “Links-first” approach that we pioneered in VistA.   This approach came up again in the metadata discussions of the PCAST report.

Bureaucracies typically prefer to focus on the dots.  If a Dewey Decimal System isn’t working well enough, the solution is to add more digits of precision to it, more librarians to catalog the books, and larger staffs, standards committees, and regulation to ensure that the dots all stay in their assigned pigeonholes.

This is what is happening with ICD10 today.  After the October 2014 rollout, we will have the ability to differentiate “W59.21 Bitten by turtle” and “W59.22 Struck by turtle” as two distinct dots in the medical information universe.  Unfortunately, we are lacking dots to name tortoises, armadillos, or possums.  Struck By Orca (both the name of a book and an ICD10 code) provides some artistic insight into the new coding system.

The continued expectation that we can understand medicine from a “Dots-first” approach is a travesty in today’s world of interconnection, rapidly growing knowledge and life-science discoveries, and personalization.  People use Google, not card catalogs, to find their information, and do so in a much richer, quicker, and more informative way than anything before in human history.

The “Dots-first” thinkers will panic at the emergence of a “Links-first” metadata approach.  How can we establish order if we don’t have experts reviewing the books, applying international standards, and librarians carefully typing and filing the catalogs?

One of the criticisms in the early days of VistA was that its metadata-driven model would lead to “Helter Skelter” development, and that only centralization could make things orderly.  (Helter Skelter was the name of the Charles Manson murder movie at the time, so the term carried a lot of linguistic baggage with it.)  The critics could see only the Dots-first framework, and the ensuing failures of the centralized, waterfall development of $100m+ megaprojects have continually proven that their approach doesn’t work.  Yet they continue to blame their failures on the decentralized, metadata-driven core of the system.

There are technologies that address this, such as the Semantic Web or Linked Data initiatives.  But I’m afraid that there is so much money to be made “improving” the medical Dewey Decimal Systems and patching up all the holes in the Dots-first kludges that adopting them will be a tremendous uphill battle.