What Other Industries Can Learn from Healthcare’s Knowledge Graphs

Healthcare didn’t become a leader in knowledge graphs by adopting new technology early. It did so by investing, over centuries, in shared meaning. Long before modern data platforms or AI, medicine aligned on what exists (ontologies), how entities are named (controlled vocabularies), how evidence is generated (observations), how data moves between systems (interoperability standards), and how alignment is enforced (through regulation, collaboration, and public funding).

This article shows that healthcare is not unique in needing these foundations, and it is no longer unique in building them. Other industries are already developing shared ontologies, vocabularies, observation standards, and exchange models in law, finance, climate science, construction, cybersecurity, and government. The difference is not feasibility, but maturity and coordination.

In the sections that follow, I walk through the key lessons other industries can take from healthcare’s experience, highlighting what healthcare got right and pointing to concrete examples from other domains where similar approaches are already working.

Shared ontologies — agree on what exists

The healthcare industry has a lot of ontologies. There are ontologies for anatomy (Uberon), genes (Gene Ontology), chemical compounds (ChEBI), and hundreds of other domains. Repositories such as BioPortal and the OBO Foundry provide access to well over a thousand biomedical ontologies. Most of these are domain ontologies: they describe the domain of healthcare.

In addition to these domain ontologies, healthcare uses cross-domain ontologies like Schema.org and QUDT (Quantities, Units, Dimensions, and Types). Healthcare ontologies are built with the Web Ontology Language (OWL), the Shapes Constraint Language (SHACL), and the Simple Knowledge Organization System (SKOS), all standards from the World Wide Web Consortium (W3C); more on this later. There are also upper ontologies, which model things at a higher level than a specific domain. Examples include the Basic Formal Ontology (BFO), the Suggested Upper Merged Ontology (SUMO), and gist, a lightweight upper ontology.

Other industries can learn from healthcare’s history of codifying a shared understanding of a domain and explicitly agreeing on what exists and how those things relate. While healthcare benefited from centuries of empirical science, all industries and organizations deal with entities and rules that can be codified. Finance, law, supply chains, and even religious institutions have long relied on formalized structures to reason. Here are some examples of ontologies being used successfully in other industries:

  • The European Legislation Identifier (ELI) Ontology is a strong example of a free, publicly funded ontology built using W3C standards. It provides a shared semantic model for legislation across EU member states, defining how laws, amendments, jurisdictions, and legal relationships are identified and linked. Rather than digitizing documents alone, it encodes how the legal system itself works.
  • The Environment Ontology (ENVO) is a complementary example from the scientific community. ENVO is a community-led, open ontology that represents environments, ecosystems, habitats, and environmental processes. It demonstrates that shared ontologies don’t require centralized authority; they can emerge from distributed expert consensus and still become widely used infrastructure.
  • The Financial Industry Business Ontology (FIBO) shows how finance, like healthcare, benefits from agreeing on core concepts (entities, contracts, and instruments) so firms compete on products rather than on definitions.
  • EarthPortal is like BioPortal for the Earth sciences, though at a smaller scale. It’s a home for Earth science ontologies and is largely community-driven rather than publicly funded like BioPortal.
  • This is only a small subset; for the full list, go to this app.

Treat controlled vocabularies as infrastructure, not project-specific

Healthcare advanced by treating catalogs of real-world entities as first-class infrastructure. It has controlled vocabularies for conditions and procedures (SNOMED CT), diseases (ICD-11), adverse effects (MedDRA), drugs (RxNorm), compounds (ChEBI and PubChem), proteins (UniProt), and genes (NCBI Gene). There are even efforts that tie many of these together into a unified knowledge graph, such as the Scalable Precision Medicine Open Knowledge Engine (SPOKE), the Monarch Initiative, and Open Targets.

Other industries can do the same by building and curating lists of the things they depend on (companies, industries, financial instruments, policies, parts) and publishing them as open, machine-readable datasets. Here are a few prominent examples from other industries:

  • The United Nations Bibliographic Information System (UNBIS) Thesaurus is a great example of a free, publicly funded taxonomy that standardizes subjects, geographies, and institutional concepts across the UN system. It acts as a shared controlled vocabulary that enables interoperability across agencies, reports, and repositories.
  • An example from finance is the Legal Entity Identifier (LEI) system. LEI provides a global, open identifier for legal entities participating in financial transactions.
  • The International Financial Reporting Standards (IFRS) Foundation maintains the IFRS Accounting Taxonomy, which contains elements for tagging financial statements prepared in accordance with IFRS Accounting Standards.
  • AGROVOC is a multilingual controlled vocabulary maintained by the Food and Agriculture Organization (FAO) of the United Nations to promote interoperability of reports and data.
  • GeoNames is an open geographic database of over 25 million place names, identifiers, and geographic features. It’s widely used across industries from logistics to news media and is published using W3C standards.
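To make the idea of a shared vocabulary concrete, here is a minimal Python sketch of resolving local labels to shared identifiers. It is loosely modelled on the preferred-label and alternative-label pattern used by SKOS, but it is not an implementation of SKOS or of any vocabulary named above. The `Concept` and `Vocabulary` classes, the `ex:` identifiers, and the sample labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One vocabulary entry: a stable identifier plus its labels."""
    identifier: str          # hypothetical shared identifier, e.g. "ex:C001"
    pref_label: str          # the single preferred name
    alt_labels: tuple = ()   # synonyms and local spellings


class Vocabulary:
    """Case-insensitive lookup from any known label to its concept."""

    def __init__(self, concepts):
        self._by_label = {}
        for concept in concepts:
            for label in (concept.pref_label, *concept.alt_labels):
                self._by_label[label.casefold()] = concept

    def resolve(self, label):
        """Return the matching Concept, or None if the label is unknown."""
        return self._by_label.get(label.strip().casefold())


# Hypothetical entries; a real vocabulary would be published and curated.
VOCAB = Vocabulary([
    Concept("ex:C001", "Myocardial infarction", ("heart attack", "MI")),
    Concept("ex:C002", "Hypertension", ("high blood pressure",)),
])


def to_identifier(label):
    """Map a free-text label to a shared identifier, or None if unmapped."""
    concept = VOCAB.resolve(label)
    return concept.identifier if concept else None


local_records = ["Heart attack", "MI", "high blood pressure", "flu"]
mapped = {label: to_identifier(label) for label in local_records}
```

Two systems that write "Heart attack" and "MI" end up with the same identifier, while "flu" stays unmapped and can be flagged for curation. That shared identifier is what lets independent datasets be joined.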

Let empirical observation drive structure

Healthcare evolved through observation, experimentation, and replication. Claims about drugs must be backed by evidence, and dogmatists were (eventually) overruled by empirical results. In healthcare, the Clinical Data Interchange Standards Consortium (CDISC) standardizes how clinical trial observations (measurements, outcomes, and adverse events) are recorded and evaluated, enabling cumulative, reproducible evidence. Other industries are also embracing a standardized approach to recording observational data:

  • The Climate and Forecast Metadata Conventions (CF Conventions) standardize how observed climate variables are described across sensors and models, enabling scientific data to be shared, compared, and reused. They’re developed and maintained through an open, community-driven process.
  • The Industry Foundation Classes (IFC) from buildingSMART International define a shared representation of real-world structures (buildings, components, and systems) across design, construction, and operations. This enables observations about buildings to accumulate over a structure’s full lifecycle.

Standardize how data is shared, not only what it means

Healthcare didn’t stop at shared semantics and evidence standards; it also standardized interoperability. The Health Level Seven International (HL7) standards, most notably HL7 FHIR, define how clinical data such as patients, observations, medications, and encounters are exchanged between systems. Here are some examples from other industries:

  • The eXtensible Business Reporting Language (XBRL) standardizes how financial statements and disclosures are reported to regulators and markets. XBRL taxonomies are created by regulators and published through registries coordinated by XBRL International.
  • The National Information Exchange Model (NIEM) is a framework for building information schemas by aligning on common vocabulary and design rules across domains. This enables information about people, events, and cases to move between agencies or organizations without losing meaning or legal integrity.
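As an illustration of what a standardized exchange format looks like in practice, the sketch below builds a FHIR-style Observation in Python and round-trips it through JSON, as if one system were sending it to another. The field names follow my reading of the FHIR Observation resource, but the payload has not been validated against the specification. The `make_observation` helper and the patient id `example` are hypothetical.

```python
import json


def make_observation(patient_id, loinc_code, display, value, unit, ucum_code):
    """Build a FHIR-style Observation as a plain dict (not spec-validated)."""
    return {
        "resourceType": "Observation",
        "status": "final",
        # What was measured, identified by a code from a shared vocabulary.
        "code": {
            "coding": [
                {"system": "http://loinc.org", "code": loinc_code, "display": display}
            ]
        },
        # Who it was measured on (hypothetical patient reference).
        "subject": {"reference": f"Patient/{patient_id}"},
        # The measured value with a coded unit.
        "valueQuantity": {
            "value": value,
            "unit": unit,
            "system": "http://unitsofmeasure.org",
            "code": ucum_code,
        },
    }


# The sending system serializes the observation to JSON.
payload = json.dumps(
    make_observation("example", "8867-4", "Heart rate", 72, "beats/minute", "/min")
)

# The receiving system parses it and reads the same coded meaning.
received = json.loads(payload)
measured_code = received["code"]["coding"][0]["code"]
measured_value = received["valueQuantity"]["value"]
```

The exchange format carries references into shared vocabularies (a code system plus a code) rather than free text, which is why the receiver can interpret the measurement without knowing anything about the sender.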

Use regulation to force semantic alignment

Strong regulatory pressure forced healthcare to align on definitions of terms and standards for empirical studies. The FDA reinforces this alignment by requiring conformity to standards and controlled terminologies, such as CDISC for clinical trial data and MedDRA for adverse event reporting. Other industries, like finance and aviation, are also highly regulated and have standardized ways of reporting and tracking compliance.

Notably, in healthcare, organizations like the FDA and WHO actively require the use of shared vocabularies like MedDRA, ICD, and CDISC in regulatory processes. In finance, while regulators like the SEC and FINRA enforce reporting and compliance, there is not a comparably mature, shared ecosystem of regulatory vocabularies.

Separate pre-competitive semantics from competitive advantage

Healthcare companies compete on drugs, not on the definition of medicine. Agreeing on the definition of terms and on best practices for sharing data doesn’t impede competition. The Pistoia Alliance exemplifies this approach in life sciences by bringing competitors together to develop shared semantic standards and interoperability practices as pre-competitive infrastructure. Here are some examples from other industries:

  • EDM Council plays a role in finance similar to that of the Pistoia Alliance in life sciences, bringing competing institutions together to develop shared data semantics and standards (including FIBO) as pre-competitive infrastructure.
  • buildingSMART International brings together software vendors, architects, engineers, and construction firms to maintain Industry Foundation Classes (IFC). Vendors compete on tools, but agree on building and component terms and the way they are represented.
  • The MITRE Corporation, an R&D organization, publishes MITRE ATT&CK, a knowledge base of adversary tactics and techniques for decision support in cybersecurity operations. While security contractors compete on tools, they can agree on the language for describing threats and incidents.

Fund shared knowledge as a public good

Public funding has been essential for building and maintaining healthcare’s ontologies and controlled vocabularies; funding from the National Institutes of Health (NIH), in particular, has sustained core biomedical ontologies and controlled vocabularies. It’s unlikely that one organization would build all of them on its own. Other industries could build consortia, foundations, and public-private partnerships to support similar semantic infrastructure. Some have already benefited from public funding: the ELI ontology and the UNBIS Thesaurus described above are both publicly funded.

Anchor meaning in open standards

Aligning with open standards ensures that knowledge outlives any single vendor, platform, or technology. Organizations like the World Wide Web Consortium (W3C) define foundational standards like RDF, OWL, and SHACL. By anchoring semantics in open standards rather than vendor-specific schemas, industries create knowledge that can be reused, integrated, and reasoned over for decades, even as tools and architectures evolve.
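The following Python sketch illustrates the three ideas behind those standards using only the standard library: facts stored as subject-predicate-object triples (the RDF data model), a class hierarchy that supports simple inference (in the spirit of RDFS and OWL), and a constraint check over the data (in the spirit of SHACL). It is a toy, not an implementation of any W3C specification. Every `ex:` name is hypothetical, and the prefixed strings are plain Python strings rather than real IRIs.

```python
# Facts as (subject, predicate, object) triples; all ex: names are hypothetical.
TRIPLES = {
    ("ex:Aspirin", "rdf:type", "ex:Drug"),
    ("ex:Ibuprofen", "rdf:type", "ex:Drug"),
    ("ex:Drug", "rdfs:subClassOf", "ex:ChemicalEntity"),
    ("ex:ChemicalEntity", "rdfs:subClassOf", "ex:Entity"),
    ("ex:Aspirin", "rdfs:label", "aspirin"),
}


def superclasses(cls, triples):
    """All classes reachable from cls by following rdfs:subClassOf."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(entity, triples):
    """Asserted types of an entity plus the types implied by the hierarchy."""
    direct = {o for s, p, o in triples if s == entity and p == "rdf:type"}
    inferred = set()
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return direct | inferred


def missing_labels(cls, triples):
    """Constraint check: list instances of cls that lack an rdfs:label."""
    typed = {s for s, p, o in triples if p == "rdf:type"}
    instances = {s for s in typed if cls in types_of(s, triples)}
    labelled = {s for s, p, o in triples if p == "rdfs:label"}
    return sorted(instances - labelled)


aspirin_types = types_of("ex:Aspirin", TRIPLES)
unlabelled = missing_labels("ex:ChemicalEntity", TRIPLES)
```

Here `aspirin_types` includes `ex:ChemicalEntity` and `ex:Entity` even though neither was asserted directly, and `unlabelled` flags `ex:Ibuprofen`. Because the data is just triples with agreed predicate names, any tool that understands the same conventions can run the same inference and the same checks.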

Build incrementally

Knowledge graphs in healthcare are the result of a long history of discovering new things, documenting the findings, cataloging the instances of classes, and conducting experiments. It’s unlikely that an industry can build a domain knowledge graph top-down. Well-structured domain knowledge is also not something that can be built quickly, even with AI.

Conclusion

Long before modern data platforms or AI, medicine invested in shared definitions, controlled vocabularies, empirical standards, and interoperable ways of exchanging evidence. Those decisions allowed knowledge to accumulate rather than fragment.

Other industries don’t need to replicate healthcare’s path exactly, but they can adopt some of its principles. Agree on what exists. Treat reference data and vocabularies as shared infrastructure. Let observation and evidence drive structure. Use regulation and collaboration to enforce alignment. Fund semantics as a public good. Anchor meaning in open standards so it outlives any single vendor or system.

Healthcare didn’t succeed because it adopted AI early. It succeeded because it spent centuries externalizing meaning. Knowledge graphs don’t create that agreement, but they finally make it computable, reusable, and scalable.

About the author: Steve Hedden is the Head of Product Management at TopQuadrant, where he leads the strategy for EDG, a platform for knowledge graph and metadata management. His work focuses on bridging enterprise data governance and AI through ontologies, taxonomies, and semantic technologies. Steve writes and speaks regularly about knowledge graphs and the evolving role of semantics in AI systems.
