Glossary

R2RML

R2RML (RDB to RDF Mapping Language) is a W3C standard language for defining customized mappings from relational database schemas to RDF datasets and ontologies.

W3C STANDARD

What is R2RML?

R2RML (RDB to RDF Mapping Language) is a W3C Recommendation, published in 2012, for expressing customized mappings from relational database content to the Resource Description Framework (RDF), enabling the creation of knowledge graphs from existing SQL data.

R2RML is a declarative mapping language that defines rules for transforming data stored in relational databases into RDF datasets. It operates on the logical schema of the source database, allowing developers to specify how tables, rows, and columns correspond to RDF triples (subject, predicate, object). This process creates a virtual or materialized RDF graph that semantically represents the underlying relational data, forming the core of a semantic data fabric.

The standard enables the creation of customized RDF views over SQL data without altering the original database. Mappings define logical tables, term maps for generating IRIs and literals, and predicate-object maps to construct triples. This is foundational for building virtual knowledge graphs and is extended by RML for non-relational sources. R2RML ensures deterministic, repeatable transformation of enterprise data into a format ready for semantic reasoning and graph-based querying with SPARQL.

Key Components of an R2RML Mapping

An R2RML mapping document is an RDF graph that defines how data from a relational database is transformed into a target RDF dataset. It consists of several core logical components that work together.

Triples Map

The Triples Map is the core construct that defines a rule for generating RDF triples from logical database rows. Each map specifies:

A Logical Table: The source of rows (a base table, SQL view, or valid SQL query).
A Subject Map: Defines how to generate the subject IRI or blank node for each row.
Predicate-Object Maps: A set of rules that, paired with the subject, generate predicate-object pairs to form complete triples.
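
As a rough illustration of how these three parts combine, the sketch below emulates a single Triples Map in plain Python over an in-memory SQLite table. The EMP table, its columns, the example.com IRIs, the dictionary layout, and the helper name apply_triples_map are all illustrative assumptions; they are not part of R2RML or of any processor's API.

```python
import sqlite3

# Hypothetical source data: an EMP table with three columns.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, JOB TEXT)")
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)",
                 [(7369, "SMITH", "CLERK"), (7499, "ALLEN", "SALESMAN")])

# A dictionary standing in for one Triples Map (layout is an assumption).
triples_map = {
    "logical_table": "SELECT EMPNO, ENAME, JOB FROM EMP ORDER BY EMPNO",
    "subject_template": "http://example.com/employee/{EMPNO}",
    "predicate_object_maps": [
        ("http://example.com/ns#name", "ENAME"),
        ("http://example.com/ns#job", "JOB"),
    ],
}

def apply_triples_map(conn, tmap):
    """Yield (subject, predicate, object) tuples for every logical table row."""
    for row in conn.execute(tmap["logical_table"]):
        subject = tmap["subject_template"].format(**dict(row))
        for predicate, column in tmap["predicate_object_maps"]:
            value = row[column]
            if value is not None:  # a NULL column value yields no triple
                yield (subject, predicate, str(value))

triples = list(apply_triples_map(conn, triples_map))
for triple in triples:
    print(triple)
```

Each row contributes one subject and one triple per predicate-object rule, which is why two rows and two rules give four triples here.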

Logical Table

A Logical Table identifies the set of database rows used as input for a Triples Map. R2RML defines two forms:

Base Table or View: Referenced directly by its name (rr:tableName).
R2RML View: A valid SQL query (rr:sqlQuery) whose results are treated as a virtual table. This enables joins, filters, and computed columns before mapping.

The logical table provides the column values referenced in subsequent mapping rules.
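
The difference between a named table and a query-backed logical table can be sketched with SQLite. The DEPT and EMP tables and the helper logical_table_rows are hypothetical, and quoting the table name this way is a simplification of how real processors handle SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPT (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT);
    CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, DEPTNO INTEGER);
    INSERT INTO DEPT VALUES (10, 'APPSERVER'), (20, 'RESEARCH');
    INSERT INTO EMP VALUES (7369, 'SMITH', 10), (7499, 'ALLEN', 10);
""")

def logical_table_rows(conn, table_name=None, sql_query=None):
    """Return (column_names, rows) for a named table or for a SQL query."""
    if (table_name is None) == (sql_query is None):
        raise ValueError("give exactly one of table_name or sql_query")
    sql = sql_query if sql_query else f'SELECT * FROM "{table_name}"'
    cursor = conn.execute(sql)
    columns = [d[0] for d in cursor.description]
    return columns, cursor.fetchall()

# A base table, referenced by name.
base_cols, base_rows = logical_table_rows(conn, table_name="DEPT")

# A query-backed logical table that joins and aggregates before mapping.
view_cols, view_rows = logical_table_rows(conn, sql_query="""
    SELECT D.DEPTNO, D.DNAME, COUNT(E.EMPNO) AS STAFF
    FROM DEPT D LEFT JOIN EMP E ON E.DEPTNO = D.DEPTNO
    GROUP BY D.DEPTNO, D.DNAME
    ORDER BY D.DEPTNO
""")
print(base_cols, base_rows)
print(view_cols, view_rows)
```

The computed STAFF column exists only in the query result, yet mapping rules could reference it exactly like a physical column.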

Term Map

A Term Map is a rule for generating an RDF term (an IRI, blank node, or literal). It is a foundational component used within Subject, Predicate, and Object Maps. Key types include:

Constant-valued Term Map: Always generates the same predefined IRI or literal (rr:constant).
Column-valued Term Map: Generates a term directly from the value of a specified database column (rr:column), without string manipulation.
Template-valued Term Map: Uses a string template (rr:template) that concatenates column values and constants, most often to build IRIs (e.g., http://example.com/employee/{EMP_ID}).
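
A minimal sketch of the three kinds of term map, written as plain Python functions over a row dictionary. The function names are hypothetical, and the percent-encoding only approximates the IRI-safe encoding that R2RML applies to template values: urllib's quote() also encodes non-ASCII characters, which the specification leaves intact.

```python
from urllib.parse import quote

def constant_term(value):
    """Constant-valued: ignore the row and return the same term every time."""
    return lambda row: value

def column_term(column):
    """Column-valued: return the column value unchanged (None for NULL)."""
    return lambda row: row.get(column)

def template_term(template):
    """Template-valued: substitute {COLUMN} placeholders with encoded values."""
    def generate(row):
        encoded = {}
        for name, value in row.items():
            if value is None:
                continue  # leave NULL columns out so the lookup fails below
            encoded[name] = quote(str(value), safe="")
        try:
            return template.format(**encoded)
        except KeyError:
            return None  # a referenced column was NULL or missing: no term
    return generate

row = {"EMP_ID": 42, "ENAME": "A B", "JOB": None}
print(constant_term("http://example.com/ns#Employee")(row))
print(column_term("ENAME")(row))
print(template_term("http://example.com/employee/{EMP_ID}")(row))
print(template_term("http://example.com/name/{ENAME}")(row))
```

Returning None for a NULL input mirrors the rule that a term map produces no term, and therefore no triple, when a referenced column is NULL.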

Subject Map

The Subject Map is a special Term Map within a Triples Map that defines the subject of all triples produced by that map. It specifies:

The IRI or blank node identifier for the resource being described.
Optional Graph Maps to place the triples into named graphs.
Optional Class IRIs (using rr:class) to assert an rdf:type for the subject.

Every Triples Map must have exactly one Subject Map, because every generated triple needs a subject.
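
The effect of class IRIs and a graph assignment can be sketched as follows. The helper subject_map, the row layout, and the example.com IRIs are assumptions made for illustration; a graph value of None stands for the default graph.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def subject_map(row, template, classes=(), graph=None):
    """Return the subject IRI plus one rdf:type quad per class IRI."""
    subject = template.format(**row)
    quads = [(subject, RDF_TYPE, cls, graph) for cls in classes]
    return subject, quads

subject, type_quads = subject_map(
    {"EMPNO": 7369},
    "http://example.com/employee/{EMPNO}",
    classes=["http://example.com/ns#Employee"],
    graph="http://example.com/graph/hr",
)
print(subject)
print(type_quads)
```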

Predicate-Object Map

A Predicate-Object Map is a rule that, together with a subject from the Subject Map, creates one or more predicate-object pairs to form triples. It consists of:

One or more Predicate Maps: Term Maps that generate the predicate IRI (e.g., foaf:name).
One or more Object Maps (or Referencing Object Maps): Term Maps that generate the object of the triple, which can be a literal, IRI, or blank node.

A single Predicate-Object Map generates one triple for every combination of its predicates and objects, so it can produce multiple triples for the same subject.
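
The pairing of predicates with objects can be sketched as a cross product. Here each term map is modeled as a plain function from a row to a term; that modeling and the helper name predicate_object_pairs are assumptions for illustration.

```python
from itertools import product

def predicate_object_pairs(row, predicate_maps, object_maps):
    """Pair every generated predicate with every generated object for one row."""
    predicates = [pm(row) for pm in predicate_maps]
    objects = [om(row) for om in object_maps]
    return [(p, o) for p, o in product(predicates, objects)
            if p is not None and o is not None]

row = {"ENAME": "SMITH"}
pairs = predicate_object_pairs(
    row,
    predicate_maps=[lambda r: "http://xmlns.com/foaf/0.1/name",
                    lambda r: "http://www.w3.org/2000/01/rdf-schema#label"],
    object_maps=[lambda r: r.get("ENAME")],
)
print(pairs)
```

Two predicate maps and one object map give two pairs, so the same name value is asserted under both properties.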

Referencing Object Map (Foreign Key)

A Referencing Object Map is a special type of Object Map that generates an object by reusing the subject of another Triples Map, much as a foreign key references another table's primary key. This is the primary mechanism for creating links between resources (relationships typically modeled as object properties). It defines:

A Parent Triples Map: The Triples Map whose subjects are referenced.
Join Conditions: Optional conditions, each pairing a column in the child logical table (rr:child, e.g., DEPT_ID) with a column in the parent logical table (rr:parent, e.g., ID).

The result is a set of RDF triples that connect entities, forming the graph structure.
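
A join condition behaves much like a SQL join between the child and parent logical tables. The sketch below assumes hypothetical EMP and DEPT tables, uses the DEPT_ID and ID columns from the example above, and invents the IRI templates and the predicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPT (ID INTEGER PRIMARY KEY, DNAME TEXT);
    CREATE TABLE EMP (EMP_ID INTEGER PRIMARY KEY, ENAME TEXT, DEPT_ID INTEGER);
    INSERT INTO DEPT VALUES (10, 'RESEARCH');
    INSERT INTO EMP VALUES (1, 'SMITH', 10), (2, 'ALLEN', NULL);
""")

CHILD_SUBJECT = "http://example.com/employee/{}"     # subject template of the child map
PARENT_SUBJECT = "http://example.com/department/{}"  # subject template of the parent map
PREDICATE = "http://example.com/ns#department"

# Child column DEPT_ID must equal parent column ID.
join_sql = """
    SELECT child.EMP_ID, parent.ID
    FROM EMP AS child JOIN DEPT AS parent ON child.DEPT_ID = parent.ID
    ORDER BY child.EMP_ID
"""
links = [
    (CHILD_SUBJECT.format(emp_id), PREDICATE, PARENT_SUBJECT.format(dept_id))
    for emp_id, dept_id in conn.execute(join_sql)
]
print(links)
```

The employee with a NULL DEPT_ID has no matching parent row, so no link triple is produced for it.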

STANDARD COMPARISON

R2RML vs. Related Mapping Approaches

A technical comparison of W3C-standard R2RML against other common methods for mapping relational data to semantic formats.

Mapping Feature / Characteristic	R2RML (W3C Standard)	Direct RDF Export / Dump	ORM-to-RDF Libraries	Proprietary Mapping Tools
Standardization Body	W3C Recommendation	Vendor-specific	Library-specific	Vendor-specific
Output Data Model	RDF Dataset	RDF (often simple triples)	RDF/OWL (object-centric)	Vendor-defined (often RDF)
Mapping Definition Format	RDF (Turtle/RDF/XML)	Implicit in export logic	Programmatic (e.g., Java/Python annotations)	Proprietary GUI or DSL
Mapping Expressivity	Complex joins, templates, data transformations	Basic 1:1 table-to-class, column-to-property	Limited to object-relational mapping patterns	High (vendor-dependent), often includes transformations
Logical vs. Physical Mapping	Logical (declarative, source-independent)	Physical (tightly coupled to source schema)	Physical (coupled to object model)	Typically logical or hybrid
Query Federation Support
Incremental Materialization Support
Portability / Vendor Lock-in
Primary Use Case	Enterprise semantic integration, Virtual Knowledge Graphs	One-time data migration, simple publishing	Application-specific RDF generation	Controlled vendor ecosystem integration

ENTERPRISE KNOWLEDGE GRAPHS

Primary Use Cases for R2RML

R2RML (RDB to RDF Mapping Language) is a W3C standard for defining mappings from relational databases to RDF datasets. Its primary applications center on unlocking structured enterprise data for semantic integration and advanced analytics.

Legacy System Modernization

R2RML provides a non-invasive bridge to modernize legacy relational systems without disrupting existing applications. It allows organizations to expose decades of operational data stored in SQL databases as a standards-based knowledge graph. This enables:

Incremental adoption of semantic technologies.
Reuse of existing ETL investments by adding a semantic mapping layer.
Connection of siloed databases (e.g., CRM, ERP) into a unified RDF model for cross-system queries.

Building Virtual Knowledge Graphs

A core use case is creating virtual knowledge graphs (VKGs). Instead of physically replicating terabytes of relational data into a triplestore, R2RML mappings define a virtual RDF view. Queries in SPARQL are translated on-the-fly into optimized SQL, enabling real-time access to current data. This is critical for:

Data virtualization scenarios requiring a single graph query endpoint.
Enforcing data sovereignty by leaving sensitive data in its original, governed database.
Integrating live transactional data into semantic applications without latency from batch replication.
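
To make the on-the-fly translation concrete, here is a deliberately tiny sketch that rewrites one triple pattern of the form ?s <predicate> ?o into SQL. The MAPPING index, the helper names, and the EMP table are hypothetical, and production query rewriters handle far more (joins across patterns, filters, optimization) than this toy does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, ENAME TEXT)")
conn.executemany("INSERT INTO EMP VALUES (?, ?)", [(7369, "SMITH"), (7499, "ALLEN")])

NAME = "http://example.com/ns#name"

# Hypothetical index derived from a mapping:
# predicate IRI -> (table, key column, value column, subject template)
MAPPING = {
    NAME: ("EMP", "EMPNO", "ENAME", "http://example.com/employee/{}"),
}

def translate_pattern(predicate):
    """Translate the pattern `?s <predicate> ?o` into SQL over the source table."""
    table, key, column, _ = MAPPING[predicate]
    return (f"SELECT {key}, {column} FROM {table} "
            f"WHERE {column} IS NOT NULL ORDER BY {key}")

def answer_pattern(conn, predicate):
    """Run the translated SQL and rebuild (subject IRI, object) bindings."""
    template = MAPPING[predicate][3]
    return [(template.format(key), value)
            for key, value in conn.execute(translate_pattern(predicate))]

print(translate_pattern(NAME))
print(answer_pattern(conn, NAME))
```

No triples are stored anywhere: the bindings are assembled from the live table at query time.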

Semantic Data Integration Hub

R2RML serves as the translation layer in a semantic data fabric. It maps heterogeneous relational schemas from different departments or acquisitions into a unified ontology (e.g., schema.org, a custom enterprise ontology). This resolves structural conflicts and creates a consistent business vocabulary. Key functions include:

Schema alignment: Mapping CUSTOMER.ID (Sales DB) and CLIENT.CLIENT_NO (Service DB) to a single ex:Customer class.
Data value transformation: Converting status codes (e.g., 'A') to human-readable IRIs (e.g., <http://example.com/status/Active>), typically with a SQL CASE expression inside an R2RML view.
Provenance tracking: Using R2RML graph maps to place each triple in a named graph that identifies its source database.
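
The three functions above can be sketched together. For brevity both hypothetical source tables live in one SQLite database; the meaning of code 'I', the graph IRIs, and the subject templates are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (ID INTEGER, STATUS TEXT);
    CREATE TABLE CLIENT (CLIENT_NO INTEGER, STATUS TEXT);
    INSERT INTO CUSTOMER VALUES (1, 'A');
    INSERT INTO CLIENT VALUES (77, 'I');
""")

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CUSTOMER_CLASS = "http://example.com/ns#Customer"
STATUS_PROP = "http://example.com/ns#status"
# Value transformation: a SQL CASE turns codes into readable names.
STATUS_SQL = "CASE STATUS WHEN 'A' THEN 'Active' WHEN 'I' THEN 'Inactive' END"

# One entry per source: (named graph, logical-table query, subject template).
SOURCES = [
    ("http://example.com/graph/sales",
     f"SELECT ID AS CUST_KEY, {STATUS_SQL} AS STATUS_NAME FROM CUSTOMER",
     "http://example.com/customer/sales/{}"),
    ("http://example.com/graph/service",
     f"SELECT CLIENT_NO AS CUST_KEY, {STATUS_SQL} AS STATUS_NAME FROM CLIENT",
     "http://example.com/customer/service/{}"),
]

def integrate(conn):
    """Map both schemas to one class, tagging each quad with its source graph."""
    quads = []
    for graph, view_sql, subject_template in SOURCES:
        for key, status_name in conn.execute(view_sql):
            subject = subject_template.format(key)
            quads.append((subject, RDF_TYPE, CUSTOMER_CLASS, graph))
            if status_name is not None:
                status_iri = f"http://example.com/status/{status_name}"
                quads.append((subject, STATUS_PROP, status_iri, graph))
    return quads

quads = integrate(conn)
for quad in quads:
    print(quad)
```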

Foundation for Graph-Based RAG

R2RML is well suited to building deterministic factual grounding in Retrieval-Augmented Generation (RAG) systems. It transforms reliable enterprise relational data into a high-quality knowledge graph that serves as a verifiable source for large language models. This application:

Reduces hallucinations by tethering LLM responses to mapped, structured facts.
Enables complex multi-hop reasoning across relationships (e.g., "Find projects for customers in the healthcare sector") that are explicit in the database but implicit in documents.
Provides audit trails, as every generated answer can be traced back to specific database records via the R2RML mapping.
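
One way to picture the audit trail is to invert a subject template: given an IRI cited in an answer, recover the key values and build a parameterized lookup against the source table. The helper names are hypothetical, and the sketch assumes templates whose placeholders are separated by slashes.

```python
import re
from urllib.parse import unquote

def template_to_regex(template):
    """Turn 'http://example.com/employee/{EMP_ID}' into a regex with named groups."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        # Odd positions hold column names, even positions hold literal text.
        pattern += f"(?P<{part}>[^/]+)" if i % 2 else re.escape(part)
    return re.compile("^" + pattern + "$")

def trace_iri(iri, template, table):
    """Return (sql, params) that locate the source row, or None if no match."""
    match = template_to_regex(template).match(iri)
    if match is None:
        return None
    columns = list(match.groupdict())
    where = " AND ".join(f"{column} = ?" for column in columns)
    params = [unquote(match.group(column)) for column in columns]
    return f"SELECT * FROM {table} WHERE {where}", params

print(trace_iri("http://example.com/employee/42",
                "http://example.com/employee/{EMP_ID}", "EMP"))
```

The recovered values are strings; a real implementation would cast them to the column's SQL type before running the lookup.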

Enabling Federated Query & Analytics

By providing a standardized RDF view of relational data, R2RML enables query federation across hybrid data landscapes. A SPARQL endpoint powered by R2RML can participate in federated queries that join data from:

Other knowledge graphs (triplestores).
Document databases via companion mapping languages such as RML.
Public linked open data clouds.

This allows for complex analytics that were previously impractical, such as enriching internal customer data with demographic information from DBpedia, all within a single query.

Semantic Governance & Compliance

R2RML mappings act as executable documentation of how business concepts map to physical data. This is vital for data governance, regulatory compliance (like GDPR), and auditability. Use cases include:

Defining PII (Personally Identifiable Information) in semantic terms: Mapping an EMPLOYEES.SSN column to a dedicated identifier property (e.g., a hypothetical ex:nationalId) with appropriate access tags.
Supporting "Right to be Forgotten": The mapping shows exactly which database records correspond to a semantic entity, enabling precise deletion.
Maintaining data lineage: The mapping itself is a key artifact that links the business ontology (the "what") to the system-of-record database (the "where").

R2RML

Frequently Asked Questions

R2RML (RDB to RDF Mapping Language) is a W3C standard for mapping relational database schemas to RDF datasets. These FAQs address its core purpose, mechanics, and role in enterprise semantic architectures.

R2RML (RDB to RDF Mapping Language) is a declarative, W3C-standardized language for defining customized mappings from relational database (RDB) schemas to RDF (Resource Description Framework) datasets. It works by allowing a data architect to write a mapping document, typically in Turtle (TTL) format, that specifies how rows and columns in database tables are transformed into RDF triples (subject-predicate-object statements). An R2RML processor (or mapper) executes this document against a live database, generating a virtual or materialized RDF graph.

The mapping defines logical tables (base tables, views, or SQL queries), subject maps (how to generate the subject IRI or blank node for each row), and predicate-object maps (how to generate predicates, which are always IRIs, and objects, which can be IRIs, literals, or blank nodes). This process creates a semantic layer over existing relational data without altering the source database.
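
For a concrete picture, the sketch below shows a small mapping document in Turtle alongside hand-written Python that produces the N-Triples a processor would be expected to materialize for it. The Python does not parse the Turtle; it hard-codes the same rule for illustration. The EMP table and the example.com IRIs are assumptions.

```python
import sqlite3

# A small R2RML mapping, shown for reference only (not parsed by this sketch).
MAPPING_TTL = """
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.com/ns#> .

<#EmployeeMap>
    rr:logicalTable [ rr:tableName "EMP" ] ;
    rr:subjectMap [
        rr:template "http://example.com/employee/{EMPNO}" ;
        rr:class ex:Employee
    ] ;
    rr:predicateObjectMap [
        rr:predicate ex:name ;
        rr:objectMap [ rr:column "ENAME" ]
    ] .
"""

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def literal(value):
    """Serialize a plain string literal with basic N-Triples escaping."""
    escaped = (str(value).replace("\\", "\\\\")
               .replace('"', '\\"').replace("\n", "\\n"))
    return f'"{escaped}"'

def materialize(conn):
    """Hand-coded equivalent of the mapping above, emitting N-Triples lines."""
    lines = []
    for empno, ename in conn.execute("SELECT EMPNO, ENAME FROM EMP ORDER BY EMPNO"):
        subject = f"<http://example.com/employee/{empno}>"
        lines.append(f"{subject} <{RDF_TYPE}> <http://example.com/ns#Employee> .")
        if ename is not None:  # NULL names produce no triple
            lines.append(f"{subject} <http://example.com/ns#name> {literal(ename)} .")
    return lines

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, ENAME TEXT)")
conn.executemany("INSERT INTO EMP VALUES (?, ?)", [(7369, "SMITH"), (7499, None)])
nt_lines = materialize(conn)
print("\n".join(nt_lines))
```

The source table is only read, never modified, which is the sense in which the mapping forms a layer over the existing data.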

SEMANTIC DATA FABRIC

Related Terms

R2RML is a core standard within the semantic data fabric stack. These related terms define the broader ecosystem of technologies and architectural patterns for enterprise data integration and knowledge representation.

RML (RDF Mapping Language)

RML is a superset and generalization of R2RML. While R2RML maps only from relational databases, RML provides a unified framework for defining mappings from heterogeneous data formats—including JSON, CSV, XML, and relational tables—to RDF. It uses logical sources and iterator-based mapping rules to transform nested and semi-structured data into a knowledge graph. This makes it essential for modern data fabrics that integrate diverse APIs and file-based sources.
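
The iterator idea can be sketched without any RML tooling: select the records inside a JSON document, then apply the same template-style rules used for relational rows. The JSON layout and helper names are hypothetical, and the single top-level key lookup is a stand-in for a real iterator expression such as a JSONPath.

```python
import json

SOURCE = '{"employees": [{"id": 1, "name": "Smith"}, {"id": 2, "name": "Allen"}]}'

def iterate(document, key):
    """Stand-in for an iterator: return the records under one top-level key."""
    return json.loads(document)[key]

def map_records(records, subject_template, predicate, field):
    """Apply one subject template and one predicate-object rule to each record."""
    return [
        (subject_template.format(**record), predicate, record[field])
        for record in records
        if record.get(field) is not None
    ]

triples = map_records(
    iterate(SOURCE, "employees"),
    "http://example.com/employee/{id}",
    "http://example.com/ns#name",
    "name",
)
print(triples)
```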

Virtual Knowledge Graph (VKG)

A Virtual Knowledge Graph is a system that provides a unified, queryable RDF graph interface over underlying heterogeneous data sources without physically materializing all the triples. It uses R2RML or RML mapping definitions to translate SPARQL queries in real-time into the native query languages of the source systems (e.g., SQL, MongoDB queries). This enables on-demand access to fresh data and is a key architectural pattern for logical data fabrics, avoiding large-scale ETL and data duplication.

SPARQL

SPARQL is the W3C-standardized query language for RDF knowledge graphs. It is the primary query interface for data mapped via R2RML. SPARQL allows for:

Graph pattern matching to find subgraphs.
Aggregation and filtering of results.
Combining data from multiple graphs (federation).
Updating graph stores (with SPARQL Update).

When an R2RML mapping is deployed, applications query the resulting RDF dataset using SPARQL. In a virtual deployment, the system translates each query (via the mapping) into SQL against the original relational database; in a materialized deployment, the query runs against the generated triples.

Ontology

An ontology is a formal, machine-readable specification of a conceptual model for a domain. It defines:

Classes (types of things, like Person or Product).
Properties (relationships and attributes, like worksFor or hasPrice).
Constraints and logical rules (like a Person has exactly one birthdate).

R2RML mappings do not define an ontology; they map database schemas to instances of one. The target RDF vocabulary (e.g., classes and properties from an OWL or RDFS ontology) is referenced within the R2RML mapping rules to give semantic meaning to the mapped data.

Data Virtualization

Data Virtualization is an integration pattern that provides a unified, abstracted view of data from multiple disparate sources in real time, without requiring physical movement or replication. A Virtual Knowledge Graph implemented with R2RML is a form of semantic data virtualization. The R2RML mappings, executed by a query-rewriting processor, present relational data as a virtual RDF graph. This pattern is central to logical data fabrics, enabling agile access while leaving source systems of record intact.

Semantic Integration

Semantic Integration is the process of combining data from disparate sources by resolving schematic and data-level conflicts to achieve a unified, meaningful view. R2RML is a key technical enabler for semantic integration at the data access layer. It addresses the schematic heterogeneity problem by mapping disparate relational schemas to a common, shared ontology (RDF model). This allows queries to be written in terms of business concepts (e.g., Customer, Order) rather than underlying table and column names, which supports interoperability across systems.

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
