R2RML is a declarative mapping language that defines rules for transforming data stored in relational databases into RDF datasets. It operates on the logical schema of the source database, allowing developers to specify how tables, rows, and columns correspond to RDF triples (subject, predicate, object). This process creates a virtual or materialized RDF graph that semantically represents the underlying relational data, forming the core of a semantic data fabric.
What is R2RML?
R2RML (RDB to RDF Mapping Language) is the definitive W3C standard for mapping relational database content to the Resource Description Framework (RDF), enabling the creation of knowledge graphs from existing SQL data.
The standard enables the creation of customized RDF views over SQL data without altering the original database. Mappings define logical tables, term maps for generating IRIs and literals, and predicate-object maps to construct triples. This is foundational for building virtual knowledge graphs and is extended by RML for non-relational sources. R2RML ensures deterministic, repeatable transformation of enterprise data into a format ready for semantic reasoning and graph-based querying with SPARQL.
Key Components of an R2RML Mapping
An R2RML mapping document is an RDF graph that defines how data from a relational database is transformed into a target RDF dataset. It consists of several core logical components that work together.
Triples Map
The Triples Map is the core construct that defines a rule for generating RDF triples from logical database rows. Each map specifies:
- A Logical Table: The source of rows (a base table, SQL view, or valid SQL query).
- A Subject Map: Defines how to generate the subject IRI or blank node for each row.
- Predicate-Object Maps: A set of rules that, paired with the subject, generate predicate-object pairs to form complete triples.
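As a rough illustration of how these three parts cooperate, the sketch below imitates a Triples Map in plain Python over an in-memory SQLite database. It is not an R2RML processor: the TriplesMap class, the generate_triples function, the EMP table, and the example.com IRIs are all assumptions made for the example, and the comment at the top shows what a comparable mapping might look like in Turtle.

```python
import sqlite3
from dataclasses import dataclass, field

# A comparable R2RML mapping in Turtle might look like this (illustrative only):
#   <#EmployeeMap>
#       rr:logicalTable [ rr:tableName "EMP" ] ;
#       rr:subjectMap [ rr:template "http://example.com/employee/{EMP_ID}" ] ;
#       rr:predicateObjectMap [ rr:predicate foaf:name ;
#                               rr:objectMap [ rr:column "NAME" ] ] .


@dataclass
class TriplesMap:
    """Hypothetical in-memory stand-in for an R2RML Triples Map."""
    logical_table_sql: str      # query that yields the input rows
    subject_template: str       # IRI template with {COLUMN} placeholders
    predicate_object_maps: list = field(default_factory=list)  # (predicate IRI, column)


def generate_triples(conn, tmap):
    """Apply one Triples Map to every row of its logical table."""
    conn.row_factory = sqlite3.Row
    triples = []
    for row in conn.execute(tmap.logical_table_sql):
        values = {key: row[key] for key in row.keys()}
        # Assumes the columns used in the subject template are never NULL.
        subject = tmap.subject_template.format(**values)
        for predicate, column in tmap.predicate_object_maps:
            if values[column] is not None:  # a NULL value yields no triple
                triples.append((subject, predicate, str(values[column])))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMP_ID INTEGER PRIMARY KEY, NAME TEXT)")
conn.executemany("INSERT INTO EMP VALUES (?, ?)", [(1, "Ada"), (2, None)])

emp_map = TriplesMap(
    logical_table_sql="SELECT EMP_ID, NAME FROM EMP ORDER BY EMP_ID",
    subject_template="http://example.com/employee/{EMP_ID}",
    predicate_object_maps=[("http://xmlns.com/foaf/0.1/name", "NAME")],
)
triples = generate_triples(conn, emp_map)
```

Only the first row produces a triple here, because the second row has a NULL name and a NULL column value generates no term.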
Logical Table
A Logical Table identifies the set of database rows used as input for a Triples Map. It can be defined in two ways:
- Base Table or View: Referenced directly by its name (rr:tableName).
- R2RML View: A valid SQL query (rr:sqlQuery) whose results are treated as a virtual table. This enables complex joins and transformations before mapping.
The logical table provides the column values referenced in subsequent mapping rules.
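The difference between a named table and a query-backed view can be sketched as a small helper that resolves either form to the SQL that produces the rows. The dictionary keys mirror the R2RML properties rr:tableName and rr:sqlQuery, but the effective_sql function and the sample schema are assumptions for illustration.

```python
import sqlite3


def effective_sql(logical_table):
    """Return the SQL query whose result set is the logical table.

    `logical_table` is a hypothetical dict with exactly one of the keys
    "tableName" (base table or view) or "sqlQuery" (R2RML view).
    """
    if ("tableName" in logical_table) == ("sqlQuery" in logical_table):
        raise ValueError("specify exactly one of tableName or sqlQuery")
    if "tableName" in logical_table:
        name = logical_table["tableName"].replace('"', '""')  # quote the identifier
        return f'SELECT * FROM "{name}"'
    return logical_table["sqlQuery"]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (EMP_ID INTEGER PRIMARY KEY, NAME TEXT, DEPT_ID INTEGER);
    INSERT INTO EMP VALUES (1, 'Ada', 10), (2, 'Grace', 10);
""")

base = {"tableName": "EMP"}
view = {"sqlQuery": "SELECT DEPT_ID, COUNT(*) AS STAFF FROM EMP GROUP BY DEPT_ID"}

base_rows = conn.execute(effective_sql(base)).fetchall()
view_rows = conn.execute(effective_sql(view)).fetchall()
```

The view form lets the mapping work with derived columns such as STAFF that do not exist in any base table.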
Term Map
A Term Map is a rule for generating an RDF term (an IRI, blank node, or literal). It is a foundational component used within Subject, Predicate, and Object Maps. Key types include:
- Constant-valued Term Map: Always generates the same predefined IRI or literal.
- Column-valued Term Map: Generates a term directly from the value of a specified database column (rr:column).
- Template-valued Term Map: Uses a string template with {COLUMN} placeholders that concatenates column values and constant text to build IRIs (e.g., http://example.com/employee/{EMP_ID}).
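A minimal sketch of the three kinds of term map, written as Python closures over a row dictionary. The function names are hypothetical, and the percent-encoding via urllib.parse.quote only approximates the IRI-safe encoding that R2RML applies to template values (it also escapes non-ASCII characters, which is stricter than necessary).

```python
import re
from urllib.parse import quote

_PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def constant_term(value):
    """Constant-valued: the same term for every row."""
    return lambda row: value


def column_term(column):
    """Column-valued: the column's value as-is, or no term for NULL."""
    return lambda row: None if row[column] is None else str(row[column])


def template_term(template):
    """Template-valued: fill {COLUMN} placeholders with encoded values."""
    columns = _PLACEHOLDER.findall(template)

    def build(row):
        if any(row[c] is None for c in columns):
            return None  # any NULL input means no term is generated
        return _PLACEHOLDER.sub(
            lambda m: quote(str(row[m.group(1)]), safe=""), template
        )

    return build


row = {"EMP_ID": "A 7", "NAME": "Ada"}
type_iri = constant_term("http://example.com/ns#Employee")(row)
name = column_term("NAME")(row)
iri = template_term("http://example.com/employee/{EMP_ID}")(row)
```

Encoding the inserted values keeps characters such as spaces or slashes in the data from producing malformed or ambiguous IRIs.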
Subject Map
The Subject Map is a special Term Map within a Triples Map that defines the subject of all triples produced by that map. It specifies:
- The IRI or blank node identifier for the resource being described.
- Optional Graph Maps to place the triples into named graphs.
- Optional Class IRIs (using rr:class) to assert an rdf:type for the subject.
A Subject Map is required for every Triples Map, as every triple must have a subject.
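To make the role of class IRIs and graph maps concrete, the following sketch derives the subject and its rdf:type statements for a single row. The subject_map_quads function, the row contents, and the example.com IRIs are assumptions; None is used here to stand for the default graph.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def subject_map_quads(row, template, classes=(), graphs=(None,)):
    """Return the subject IRI and one rdf:type quad per class and graph."""
    subject = template.format(**row)
    quads = [(subject, RDF_TYPE, cls, graph) for cls in classes for graph in graphs]
    return subject, quads


row = {"EMP_ID": 7}
subject, quads = subject_map_quads(
    row,
    template="http://example.com/employee/{EMP_ID}",
    classes=["http://example.com/ns#Employee"],
    graphs=["http://example.com/graph/hr"],
)
```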
Predicate-Object Map
A Predicate-Object Map is a rule that, together with a subject from the Subject Map, creates one or more predicate-object pairs to form triples. It consists of:
- One or more Predicate Maps: Term Maps that generate the predicate IRI (e.g., foaf:name).
- One or more Object Maps (or Referencing Object Maps): Term Maps that generate the object of the triple, which can be a literal, IRI, or blank node.
A single Predicate-Object Map can generate multiple triples for the same subject, because it produces one triple for every combination of its predicates and objects.
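The way one Predicate-Object Map fans out into several triples can be sketched as a Cartesian product of its predicates and objects. The expand function and the sample values are illustrative assumptions.

```python
from itertools import product


def expand(subject, predicates, objects):
    """Pair every predicate with every object for the given subject."""
    return [(subject, p, o) for p, o in product(predicates, objects)]


triples = expand(
    "http://example.com/employee/7",
    ["http://xmlns.com/foaf/0.1/name", "http://www.w3.org/2000/01/rdf-schema#label"],
    ["Ada"],
)
```

Two predicates and one object give two triples, which is a compact way to publish the same value under more than one vocabulary.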
Referencing Object Map (Foreign Key)
A Referencing Object Map (often called a Foreign Key Map) is a special type of Object Map that generates an object by referencing the subject of another Triples Map. This is the primary mechanism for creating links (owl:ObjectProperty relationships) between resources. It defines:
- A Parent Triples Map: The Triples Map whose subjects are referenced.
- Join Conditions: Specifies how a column in the child logical table (e.g., DEPT_ID) matches a column in the parent logical table (e.g., ID).
This creates RDF triples that connect entities, forming the graph structure.
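One way to picture a Referencing Object Map is as a SQL join between the child and parent logical tables, with the child's subject and the parent's subject becoming the two ends of the link. The sketch below assumes each subject template uses a single key column; the referencing_triples function, the EMP and DEPT tables, and the ex:department predicate are hypothetical.

```python
import sqlite3


def referencing_triples(conn, predicate, child, parent, join):
    """Link child subjects to parent subjects via one join condition."""
    child_col, parent_col = join
    sql = (
        f'SELECT child."{child["key"]}", parent."{parent["key"]}" '
        f'FROM ({child["sql"]}) AS child, ({parent["sql"]}) AS parent '
        f'WHERE child."{child_col}" = parent."{parent_col}" '
        f'ORDER BY 1, 2'
    )
    triples = []
    for child_key, parent_key in conn.execute(sql):
        subject = child["template"].format(**{child["key"]: child_key})
        obj = parent["template"].format(**{parent["key"]: parent_key})
        triples.append((subject, predicate, obj))
    return triples


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPT (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE EMP (EMP_ID INTEGER PRIMARY KEY, NAME TEXT, DEPT_ID INTEGER);
    INSERT INTO DEPT VALUES (10, 'Research');
    INSERT INTO EMP VALUES (1, 'Ada', 10), (2, 'Grace', NULL);
""")

child = {"sql": "SELECT * FROM EMP", "key": "EMP_ID",
         "template": "http://example.com/employee/{EMP_ID}"}
parent = {"sql": "SELECT * FROM DEPT", "key": "ID",
          "template": "http://example.com/department/{ID}"}

links = referencing_triples(
    conn, "http://example.com/ns#department", child, parent, ("DEPT_ID", "ID")
)
```

The employee with a NULL DEPT_ID matches no parent row, so no link is produced for it.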
R2RML vs. Related Mapping Approaches
A technical comparison of W3C-standard R2RML against other common methods for mapping relational data to semantic formats.
| Mapping Feature / Characteristic | R2RML (W3C Standard) | Direct RDF Export / Dump | ORM-to-RDF Libraries | Proprietary Mapping Tools |
|---|---|---|---|---|
| Standardization Body | W3C Recommendation | Vendor-specific | Library-specific | Vendor-specific |
| Output Data Model | RDF Dataset | RDF (often simple triples) | RDF/OWL (object-centric) | Vendor-defined (often RDF) |
| Mapping Definition Format | RDF (Turtle, RDF/XML) | Implicit in export logic | Programmatic (e.g., Java/Python annotations) | Proprietary GUI or DSL |
| Mapping Expressivity | Complex joins, templates, data transformations | Basic 1:1 table-to-class, column-to-property | Limited to object-relational mapping patterns | High (vendor-dependent), often includes transformations |
| Logical vs. Physical Mapping | Logical (declarative, source-independent) | Physical (tightly coupled to source schema) | Physical (coupled to object model) | Typically logical or hybrid |
| Query Federation Support | | | | |
| Incremental Materialization Support | | | | |
| Portability / Vendor Lock-in | | | | |
| Primary Use Case | Enterprise semantic integration, Virtual Knowledge Graphs | One-time data migration, simple publishing | Application-specific RDF generation | Controlled vendor ecosystem integration |
Primary Use Cases for R2RML
R2RML (RDB to RDF Mapping Language) is a W3C standard for defining mappings from relational databases to RDF datasets. Its primary applications center on unlocking structured enterprise data for semantic integration and advanced analytics.
Legacy System Modernization
R2RML provides a non-invasive bridge to modernize legacy relational systems without disrupting existing applications. It allows organizations to expose decades of operational data stored in SQL databases as a standards-based knowledge graph. This enables:
- Incremental adoption of semantic technologies.
- Reuse of existing ETL investments by adding a semantic mapping layer.
- Connection of siloed databases (e.g., CRM, ERP) into a unified RDF model for cross-system queries.
Building Virtual Knowledge Graphs
A core use case is creating virtual knowledge graphs (VKGs). Instead of physically replicating terabytes of relational data into a triplestore, R2RML mappings define a virtual RDF view. Queries in SPARQL are translated on-the-fly into optimized SQL, enabling real-time access to current data. This is critical for:
- Data virtualization scenarios requiring a single graph query endpoint.
- Enforcing data sovereignty by leaving sensitive data in its original, governed database.
- Integrating live transactional data into semantic applications without latency from batch replication.
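The on-the-fly translation can be illustrated with a deliberately tiny rewriter that turns a single triple pattern of the form "?s <predicate> ?o" into SQL. Real VKG engines handle full SPARQL and optimize the generated SQL; the MAPPINGS table, the rewrite_pattern and answer functions, and the EMP schema below are assumptions for the sketch.

```python
import sqlite3

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical, pre-digested mapping: predicate IRI -> where its data lives.
MAPPINGS = {
    FOAF_NAME: {
        "table": "EMP",
        "key": "EMP_ID",
        "column": "NAME",
        "subject_prefix": "http://example.com/employee/",
    }
}


def rewrite_pattern(predicate):
    """Translate the pattern '?s <predicate> ?o' into SQL plus parameters."""
    m = MAPPINGS[predicate]
    sql = (
        f'SELECT ? || "{m["key"]}" AS s, "{m["column"]}" AS o '
        f'FROM "{m["table"]}" WHERE "{m["column"]}" IS NOT NULL '
        f'ORDER BY "{m["key"]}"'
    )
    return sql, (m["subject_prefix"],)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMP_ID INTEGER PRIMARY KEY, NAME TEXT)")
conn.execute("INSERT INTO EMP VALUES (1, 'Ada')")


def answer(predicate):
    sql, params = rewrite_pattern(predicate)
    return conn.execute(sql, params).fetchall()


before = answer(FOAF_NAME)
conn.execute("INSERT INTO EMP VALUES (2, 'Grace')")  # a live update at the source
after = answer(FOAF_NAME)
```

Because nothing is materialized, the second query sees the new row immediately, which is the property that distinguishes a virtual graph from a batch-replicated one.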
Semantic Data Integration Hub
R2RML serves as the translation layer in a semantic data fabric. It maps heterogeneous relational schemas from different departments or acquisitions into a unified ontology (e.g., schema.org, a custom enterprise ontology). This resolves structural conflicts and creates a consistent business vocabulary. Key functions include:
- Schema alignment: Mapping CUSTOMER.ID (Sales DB) and CLIENT.CLIENT_NO (Service DB) to a single ex:Customer class.
- Data value transformation: Converting status codes (e.g., 'A') to human-readable IRIs (e.g., <http://example.com/status/Active>).
- Provenance tracking: Using R2RML's named graphs to tag which source database each triple originated from.
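The three functions above can be sketched together: two separate SQLite databases stand in for the Sales and Service systems, a SQL CASE expression performs the status-code translation inside the logical table's query, and each source writes into its own named graph. The table layouts, the 'I' code, the STATE column, the ex: namespace, and the graph IRIs are all assumptions for illustration.

```python
import sqlite3

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.com/"

# Status codes become IRI path segments inside the logical table's query.
STATUS_CASE = "CASE {col} WHEN 'A' THEN 'Active' WHEN 'I' THEN 'Inactive' END"


def make_source(name, setup_sql, query):
    """Create one in-memory database standing in for a source system."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(setup_sql)
    return {"name": name, "conn": conn, "query": query}


sales = make_source(
    "sales",
    "CREATE TABLE CUSTOMER (ID INTEGER PRIMARY KEY, STATUS TEXT);"
    "INSERT INTO CUSTOMER VALUES (1, 'A');",
    "SELECT ID AS CUST_KEY, " + STATUS_CASE.format(col="STATUS")
    + " AS STATUS_NAME FROM CUSTOMER ORDER BY ID",
)
service = make_source(
    "service",
    "CREATE TABLE CLIENT (CLIENT_NO INTEGER PRIMARY KEY, STATE TEXT);"
    "INSERT INTO CLIENT VALUES (7, 'I');",
    "SELECT CLIENT_NO AS CUST_KEY, " + STATUS_CASE.format(col="STATE")
    + " AS STATUS_NAME FROM CLIENT ORDER BY CLIENT_NO",
)


def customer_quads(source):
    """Map one source's rows to ex:Customer quads in a per-source graph."""
    graph = f"{EX}graph/{source['name']}"
    quads = []
    for key, status_name in source["conn"].execute(source["query"]):
        subject = f"{EX}customer/{source['name']}/{key}"
        quads.append((subject, RDF_TYPE, EX + "ns#Customer", graph))
        if status_name is not None:  # unknown codes yield no status triple
            quads.append((subject, EX + "ns#status", f"{EX}status/{status_name}", graph))
    return quads


quads = customer_quads(sales) + customer_quads(service)
```

Both sources end up typed as the same class, while the fourth element of each quad records where the statement came from.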
Foundation for Graph-Based RAG
R2RML is essential for building deterministic factual grounding in Retrieval-Augmented Generation (RAG) systems. It transforms reliable enterprise relational data into a high-quality knowledge graph that serves as a verifiable source for large language models. This application:
- Reduces hallucinations by tethering LLM responses to mapped, structured facts.
- Enables complex multi-hop reasoning across relationships (e.g., "Find projects for customers in the healthcare sector") that are explicit in the database but implicit in documents.
- Provides audit trails, as every generated answer can be traced back to specific database records via the R2RML mapping.
Enabling Federated Query & Analytics
By providing a standardized RDF view of relational data, R2RML enables query federation across hybrid data landscapes. A SPARQL endpoint powered by R2RML can participate in federated queries that join data from:
- Other knowledge graphs (triplestores).
- Document databases via companion mapping languages like RML.
- Public linked open data clouds.
This allows for analytics that are difficult to achieve otherwise, such as enriching internal customer data with demographic information from DBpedia, all within a single query.
Semantic Governance & Compliance
R2RML mappings act as executable documentation of how business concepts map to physical data. This is vital for data governance, regulatory compliance (like GDPR), and auditability. Use cases include:
- Defining PII (Personally Identifiable Information) in semantic terms: Mapping an EMPLOYEES.SSN column to a dedicated property (e.g., a hypothetical ex:socialSecurityNumber) with appropriate access tags.
- Supporting "Right to be Forgotten": The mapping shows exactly which database records correspond to a semantic entity, enabling precise deletion.
- Maintaining data lineage: The mapping itself is a key artifact that links the business ontology (the "what") to the system-of-record database (the "where").
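The traceability described above can be sketched by running a subject template in reverse: given an entity's IRI, recover the key values and build the query that locates the underlying record. The template_to_regex and locate_record functions, the EMPLOYEES layout, and the placeholder SSN value are assumptions, and the pattern assumes key values contain no unencoded slash.

```python
import re
import sqlite3
from urllib.parse import unquote


def template_to_regex(template):
    """Turn an IRI template into a regex with one named group per column."""
    parts = re.split(r"\{([^{}]+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        # Even positions are literal text, odd positions are column names.
        pattern += re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
    return re.compile("^" + pattern + "$")


def locate_record(iri, template, table):
    """Return (sql, params) selecting the row behind `iri`, or None."""
    match = template_to_regex(template).match(iri)
    if match is None:
        return None
    keys = {col: unquote(val) for col, val in match.groupdict().items()}
    where = " AND ".join(f'"{col}" = ?' for col in keys)
    return f'SELECT * FROM "{table}" WHERE {where}', tuple(keys.values())


TEMPLATE = "http://example.com/employee/{EMP_ID}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMP_ID TEXT PRIMARY KEY, SSN TEXT)")
conn.execute("INSERT INTO EMPLOYEES VALUES ('E7', '000-00-0000')")

sql, params = locate_record("http://example.com/employee/E7", TEMPLATE, "EMPLOYEES")
rows = conn.execute(sql, params).fetchall()
```

The same WHERE clause could drive a DELETE or an access audit, which is what makes the mapping useful as a lineage artifact.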
Frequently Asked Questions
R2RML (RDB to RDF Mapping Language) is a W3C standard for mapping relational database schemas to RDF datasets. These FAQs address its core purpose, mechanics, and role in enterprise semantic architectures.
R2RML (RDB to RDF Mapping Language) is a declarative, W3C-standardized language for defining customized mappings from relational database (RDB) schemas to RDF (Resource Description Framework) datasets. A data architect writes a mapping document, typically in Turtle (TTL) format, that specifies how rows and columns in database tables are transformed into RDF triples (subject-predicate-object statements). An R2RML processor (or mapper) executes this document against a live database, generating a virtual or materialized RDF graph. The mapping defines logical tables (base tables, views, or SQL queries), subject maps (how to generate the subject IRI or blank node for each row), and predicate-object maps (how to generate predicates, which are IRIs, and objects, which can be IRIs, literals, or blank nodes). This process creates a semantic layer over existing relational data without altering the source database.
Related Terms
R2RML is a core standard within the semantic data fabric stack. These related terms define the broader ecosystem of technologies and architectural patterns for enterprise data integration and knowledge representation.
Virtual Knowledge Graph (VKG)
A Virtual Knowledge Graph is a system that provides a unified, queryable RDF graph interface over underlying heterogeneous data sources without physically materializing all the triples. It uses R2RML or RML mapping definitions to translate SPARQL queries in real-time into the native query languages of the source systems (e.g., SQL, MongoDB queries). This enables on-demand access to fresh data and is a key architectural pattern for logical data fabrics, avoiding large-scale ETL and data duplication.
Ontology
An ontology is a formal, machine-readable specification of a conceptual model for a domain. It defines:
- Classes (types of things, like Person or Product).
- Properties (relationships and attributes, like worksFor or hasPrice).
- Constraints and logical rules (like "a Person has exactly one birthdate").
R2RML mappings do not define an ontology; they map database schemas to instances of one. The target RDF vocabulary (e.g., classes and properties from an OWL or RDFS ontology) is referenced within the R2RML mapping rules to give semantic meaning to the mapped data.
Data Virtualization
Data Virtualization is an integration pattern that provides a unified, abstracted view of data from multiple disparate sources in real-time, without requiring physical movement or replication. A Virtual Knowledge Graph implemented with R2RML is a form of semantic data virtualization. The R2RML mapping layer acts as the virtualization engine, presenting relational data as a virtual RDF graph. This pattern is central to logical data fabrics, enabling agile access while leaving source systems of record intact.
Semantic Integration
Semantic Integration is the process of combining data from disparate sources by resolving schematic and data-level conflicts to achieve a unified, meaningful view. R2RML is a key technical enabler for semantic integration at the data access layer. It addresses the schematic heterogeneity problem by mapping disparate relational schemas to a common, shared ontology (RDF model). This allows queries to be written in terms of business concepts (e.g., Customer, Order) rather than underlying table and column names.

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.