ARCore is Google's software development kit (SDK) for creating augmented reality applications on Android. It provides three core capabilities: motion tracking to understand the device's position relative to the world, environmental understanding to detect horizontal and vertical surfaces, and light estimation to match the lighting of virtual objects to their surroundings. These features allow developers to anchor digital content convincingly within the user's physical environment.
ARCore
What is ARCore?
ARCore is Google's foundational platform for building augmented reality experiences on Android, enabling devices to understand and interact with the physical world.
ARCore fuses visual data from the camera with inertial readings from the device's IMU (Inertial Measurement Unit), a technique known as Visual-Inertial Odometry (VIO). It builds a sparse point cloud of the environment for tracking and detects planes for content placement. Because this processing runs on the device, ARCore supports markerless AR without specialized hardware, providing the perceptual foundation for applications ranging from interactive gaming to digital twin visualization.
Core Capabilities of ARCore
ARCore is Google's platform for building augmented reality experiences on Android, enabling digital content to interact with the real world through three foundational pillars: motion tracking, environmental understanding, and light estimation.
How ARCore Works: The Technical Pipeline
ARCore, Google's platform for Android augmented reality, operates through a real-time pipeline that fuses sensor data to understand and interact with the physical world.
ARCore's pipeline begins with motion tracking, which uses the device's camera and Inertial Measurement Unit (IMU) to estimate its 6DoF pose in real time. It identifies visual feature points across frames and fuses this data with gyroscope and accelerometer readings via sensor fusion, creating a stable coordinate system for virtual content. This process is a form of Visual-Inertial Odometry (VIO), a core component of Visual SLAM systems.
Concurrently, environmental understanding detects horizontal and vertical planes, like floors and walls, through plane detection. Light estimation analyzes the camera image to match the lighting of virtual objects to the real scene. For advanced geometry, depth mapping uses the device's sensors to create a real-time depth map, enabling occlusion and more realistic interactions. These components collectively enable spatial mapping and scene understanding for persistent AR experiences.
ARCore in the Development Ecosystem
ARCore is Google's platform for building augmented reality experiences on Android, offering motion tracking, environmental understanding, and light estimation. This section details its core technical subsystems and their role in the spatial computing stack.
Motion Tracking & Visual-Inertial Odometry (VIO)
ARCore's foundational capability is 6DoF pose estimation through Visual-Inertial Odometry (VIO). It fuses data from the device's camera and Inertial Measurement Unit (IMU) to track the device's position and orientation in real time.
- Process: Identifies feature points in the camera feed and tracks them across frames, using IMU data to maintain accuracy during fast motion or poor lighting.
- Output: Continuously provides the device's estimated pose (position and orientation) in world coordinates, enabling virtual objects to remain anchored as the camera moves.
- Key Benefit: Enables persistent AR content placement without markers or pre-scanned environments.
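
To make the 6DoF pose concrete, here is a minimal Python sketch (not ARCore code) of how a tracked pose keeps an anchored point fixed in the world. The helper names `yaw_rotation` and `world_to_device` are hypothetical, rotation is limited to the vertical axis for brevity, and the sketch assumes a right-handed frame with Y up and the device looking along -Z.

```python
import math

def yaw_rotation(theta):
    """3x3 rotation about the vertical (Y) axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def world_to_device(rotation, translation, point):
    """Express a world-space point in the device frame: R^T (p - t)."""
    d = [point[i] - translation[i] for i in range(3)]
    # Multiplying by the transpose inverts a rotation matrix.
    return [sum(rotation[r][c] * d[r] for r in range(3)) for c in range(3)]

# A fixed anchor 2 m in front of where the session started.
anchor_world = [0.0, 0.0, -2.0]

# Two device poses (rotation, translation) expressed in the world frame.
pose_walked = (yaw_rotation(0.0), [0.0, 0.0, -1.0])          # moved 1 m forward
pose_turned = (yaw_rotation(math.pi / 2), [0.0, 0.0, 0.0])   # turned 90 degrees left

near = world_to_device(*pose_walked, anchor_world)   # anchor now 1 m ahead
side = world_to_device(*pose_turned, anchor_world)   # anchor now 2 m to the right
```

The anchor's world coordinates never change; only the device pose does, which is why rendering from the latest pose keeps the object visually pinned in place.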
Environmental Understanding & Plane Detection
This subsystem interprets the geometry of the physical world. Using feature points and depth data (when available), ARCore performs plane detection to identify flat horizontal and vertical surfaces such as floors, tables, and walls.
- Mechanism: Clusters feature points into large, connected planes and provides their boundaries and pose.
- Application: Essential for placing virtual objects that appear to rest on real surfaces. This data can feed into higher-level scene understanding or spatial mapping.
- Advanced Output: With depth data, applications can go further and build a coarse 3D mesh of surfaces for occlusion and physics.
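
The clustering idea can be illustrated with a deliberately simplified Python sketch. It is not ARCore's algorithm: it only finds one horizontal plane by grouping feature points into height bins, and the function name `detect_horizontal_plane` and its parameters are hypothetical.

```python
from collections import defaultdict

def detect_horizontal_plane(points, bin_size=0.05, min_points=4):
    """Group 3D points by height and return the best-supported plane.

    points: iterable of (x, y, z) in meters, with Y pointing up.
    Returns (height, (min_x, max_x, min_z, max_z)) or None.
    """
    bins = defaultdict(list)
    for x, y, z in points:
        bins[round(y / bin_size)].append((x, y, z))
    # The bin with the most points is the strongest plane candidate.
    best = max(bins.values(), key=len, default=[])
    if len(best) < min_points:
        return None
    height = sum(p[1] for p in best) / len(best)
    xs = [p[0] for p in best]
    zs = [p[2] for p in best]
    return height, (min(xs), max(xs), min(zs), max(zs))

feature_points = [
    (0.0, 0.00, 0.0), (1.0, 0.01, 0.0), (0.0, -0.01, 1.0),
    (1.0, 0.00, 1.0), (0.5, 0.01, 0.5),    # points on the floor
    (0.3, 0.74, 0.2), (0.9, 1.30, 0.4),    # clutter above the floor
]
plane = detect_horizontal_plane(feature_points)
```

Here `plane` describes a surface near height 0 spanning one meter in X and Z, which is the kind of boundary and pose information an app needs to place an object on the floor.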
Light Estimation
For virtual objects to appear believably integrated, they must match the ambient lighting. ARCore's light estimation analyzes the camera image to estimate the scene's average intensity and a color correction that compensates for the color cast of the ambient light.
- Function: In Environmental HDR mode, provides a main directional light (approximating the dominant light source in the scene) and ambient spherical harmonics.
- Result: Virtual objects cast consistent shadows and exhibit accurate specular highlights, dramatically increasing visual coherence.
- Evolution: Earlier versions provided simple ambient intensity; modern implementations offer more sophisticated HDR lighting estimation for higher fidelity.
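
As a rough illustration of the simple ambient estimate, the Python sketch below averages camera pixels into one intensity value and a per-channel color gain, then applies both to a virtual surface. The function names and the choice of Rec. 709 luminance weights are assumptions made for this example, not ARCore's implementation.

```python
def estimate_ambient_light(pixels):
    """Estimate ambient light from RGB pixels with channels in [0, 1].

    Returns (intensity, gain): mean luminance and per-channel gains
    normalized so that the green gain is 1.0.
    """
    n = len(pixels)
    mean_r = sum(p[0] for p in pixels) / n
    mean_g = max(sum(p[1] for p in pixels) / n, 1e-6)  # avoid dividing by zero
    mean_b = sum(p[2] for p in pixels) / n
    intensity = 0.2126 * mean_r + 0.7152 * mean_g + 0.0722 * mean_b
    gain = (mean_r / mean_g, 1.0, mean_b / mean_g)
    return intensity, gain

def shade(albedo, intensity, gain):
    """Tint and dim a virtual surface color to match the estimate."""
    return tuple(min(1.0, a * intensity * g) for a, g in zip(albedo, gain))

warm_room = [(0.8, 0.6, 0.4), (0.6, 0.5, 0.3), (0.7, 0.55, 0.35)]
intensity, gain = estimate_ambient_light(warm_room)
lit_white = shade((1.0, 1.0, 1.0), intensity, gain)
```

In the warm-lit sample, the red gain exceeds 1 and the blue gain falls below it, so a white virtual object picks up the same warm tint as its surroundings.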
Depth API & Scene Reconstruction
ARCore's Depth API provides a dense depth map in real time. On supported devices it estimates depth from camera motion, so a dedicated depth sensor is not required; when a time-of-flight (ToF) sensor is present, its data improves the result. This enables advanced interactions and detailed 3D scene reconstruction.
- Capabilities: Allows virtual objects to be occluded by real-world geometry and enables physics interactions with complex surfaces.
- Technical Basis: Creates a point cloud or depth image that can be used for surface reconstruction, moving beyond simple plane detection to understand complex geometry.
- Use Case: Critical for applications like measuring real objects, scanning environments for digital twins, or creating immersive occlusion effects.
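
Depth-based occlusion reduces to a per-pixel comparison, sketched below in plain Python. The depth maps are tiny hand-written grids in meters and `occlusion_mask` is a hypothetical helper; in a real app this test would typically run in a shader against the depth image.

```python
def occlusion_mask(real_depth, virtual_depth):
    """Return a grid that is True where the virtual object is visible.

    real_depth: per-pixel distance to the real scene, in meters.
    virtual_depth: per-pixel distance to the virtual object, or None
    where the object does not cover the pixel.
    """
    mask = []
    for real_row, virt_row in zip(real_depth, virtual_depth):
        mask.append([v is not None and v < r
                     for r, v in zip(real_row, virt_row)])
    return mask

# A wall 2.0 m away, with a real pillar at 0.8 m in the right-hand column.
real_depth = [[2.0, 2.0, 0.8],
              [2.0, 2.0, 0.8]]
# A virtual object 1.5 m away covering the middle and right columns.
virtual_depth = [[None, 1.5, 1.5],
                 [None, 1.5, 1.5]]

visible = occlusion_mask(real_depth, virtual_depth)
```

The object is drawn in the middle column, where it is closer than the wall, and hidden in the right column, where the real pillar is in front of it.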
Cloud Anchors & Persistent AR
ARCore Cloud Anchors enable multi-user, persistent AR experiences by creating spatial anchors that can be resolved by different devices at different times.
- Process: The device uploads visual features from its environment to Google's cloud. The cloud processes this data to create a unique anchor that other devices can later recognize and localize against.
- Function: Solves the problem of shared frame-of-reference, enabling collaborative AR apps and experiences that persist across sessions (persistent AR).
- Underlying Tech: Relies on visual recognition and large-scale feature matching rather than GPS, providing room-scale precision.
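
The shared frame of reference can be sketched with a small 2D example in Python. Each device has its own world coordinates, but once both have located the same anchor, content defined relative to that anchor lands at the same physical spot for both. The poses below are made-up values and `anchor_to_world` is a hypothetical helper; no cloud service is involved in this sketch.

```python
import math

def anchor_to_world(anchor_pose, local_point):
    """Map a point from the anchor's frame into a device's world frame.

    anchor_pose: (x, z, heading) of the anchor in that device's
    world coordinates, with heading in radians.
    """
    ax, az, heading = anchor_pose
    lx, lz = local_point
    c, s = math.cos(heading), math.sin(heading)
    return (ax + c * lx - s * lz, az + s * lx + c * lz)

# Content is stored once, relative to the shared anchor.
content_local = (0.5, 0.0)

# Each device resolved the same anchor, but in its own world frame.
anchor_on_a = (1.0, 2.0, 0.0)
anchor_on_b = (-3.0, 0.0, math.pi / 2)

content_on_a = anchor_to_world(anchor_on_a, content_local)
content_on_b = anchor_to_world(anchor_on_b, content_local)
```

The two results differ numerically because the devices' world frames differ, yet both place the content half a meter from the anchor along the anchor's own x axis.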
Integration with the Android Sensor Stack
ARCore is not a sensor in its own right but a sensor fusion layer built on top of Android's camera and sensor stack. It coordinates the camera, IMU, and other sensors.
- Synchronization: Precisely time-aligns camera frames with high-frequency IMU gyroscope and accelerometer readings, which is critical for robust VIO.
- Calibration: Manages device-specific camera intrinsics and IMU-camera extrinsics (their relative position) to ensure accurate measurements.
- System Resource Management: Dynamically adjusts CPU/GPU usage and camera parameters to balance AR performance with device battery life and thermal constraints.
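
The synchronization step can be illustrated with a short Python sketch that resamples a high-rate IMU signal at a camera frame's timestamp using linear interpolation. The sample rates and values are invented for the example and `imu_at` is a hypothetical helper, not part of any Android or ARCore API.

```python
from bisect import bisect_left

def imu_at(timestamps, values, t):
    """Linearly interpolate an IMU signal at time t (seconds).

    timestamps must be sorted ascending; t is clamped to the
    sampled range.
    """
    if t <= timestamps[0]:
        return values[0]
    if t >= timestamps[-1]:
        return values[-1]
    i = bisect_left(timestamps, t)       # first sample at or after t
    t0, t1 = timestamps[i - 1], timestamps[i]
    w = (t - t0) / (t1 - t0)             # 0 at t0, 1 at t1
    return values[i - 1] + w * (values[i] - values[i - 1])

# Gyroscope samples every 5 ms (200 Hz), in rad/s.
imu_times = [0.000, 0.005, 0.010, 0.015]
gyro_z = [0.10, 0.20, 0.40, 0.40]

# A camera frame captured between the second and third IMU samples.
frame_time = 0.0075
rate_at_frame = imu_at(imu_times, gyro_z, frame_time)
```

The frame falls halfway between two gyroscope readings, so the interpolated rate is halfway between them (about 0.30 rad/s). Errors in this alignment would be treated as motion by the tracker, which is why the timing matters.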
ARCore vs. ARKit: A Technical Comparison
A feature-by-feature comparison of Google's ARCore and Apple's ARKit, the dominant SDKs for building augmented reality applications on mobile devices.
| Core Feature / Metric | ARCore (Google) | ARKit (Apple) | Primary Use Case |
|---|---|---|---|
| Primary Platform | Android (7.0+ / API Level 24+) | iOS (11.0+) | Mobile AR Development |
| Core Tracking Method | Visual-Inertial Odometry (VIO) | Visual-Inertial Odometry (VIO) | 6DoF device pose estimation |
| Environmental Understanding | Plane detection (horizontal & vertical), feature points | Plane detection (horizontal & vertical), feature points | Virtual object placement & occlusion |
| Light Estimation | Environmental HDR light estimation | Environmental lighting with intensity & color temperature | Realistic virtual object shading |
| Depth API / LiDAR Integration | Depth API (depth from motion; uses a ToF sensor on devices that have one) | Scene Geometry API & LiDAR Scanner (hardware on Pro devices) | Occlusion, physics, mesh generation |
| Cloud Anchors / Persistence | Cloud Anchors (cross-platform) | Persistent World Tracking & Collaborative Sessions | Shared multi-user & persistent AR experiences |
| Face Tracking | Augmented Faces API | TrueDepth Camera system (front-facing) | Selfie filters & facial expression analysis |
| Image Tracking | Augmented Images API | Image Tracking & Detection | Triggering AR from 2D markers or pictures |
| Object Tracking | No dedicated 3D object tracking API | Object Scanning & Detection | Placing AR content on/around specific 3D objects |
| Motion Tracking | 6DoF | 6DoF world tracking; 3DoF orientation-only configuration also available | Device orientation & position tracking |
| World Mapping / Meshing | Generates feature points & planes; mesh via Depth API | Generates coarse world mesh (enhanced with LiDAR) | Environmental understanding for occlusion & physics |
| People Occlusion | Depth-based occlusion via Depth API on supported hardware | People Occlusion (with LiDAR or A12+ Bionic) | Realistic AR content interaction with people |
| Development Language | Java, Kotlin, C/C++/NDK, Unity, Unreal | Swift, Objective-C, C++, Unity, Unreal | Native & game engine SDK integration |
| Key Hardware Dependency | Motion & camera sensors; Google Play Services for AR | A9+ chip (iOS 11), A12+ for advanced features | Performance & feature availability |
Frequently Asked Questions
ARCore is Google's platform for building augmented reality experiences on Android. This FAQ addresses common technical questions for developers and architects implementing spatial computing solutions.
ARCore is Google's software development kit (SDK) for building augmented reality (AR) applications on Android devices. It works by using a process called concurrent odometry and mapping (COM) to understand the device's position relative to the world. ARCore uses the phone's camera to detect visually distinct feature points, fuses this data with readings from the Inertial Measurement Unit (IMU) for robust motion tracking, and constructs a geometric understanding of flat surfaces and environmental lighting to enable realistic virtual object placement and interaction.
Key underlying processes include:
- Motion Tracking: Estimates the device's 6DoF pose (position and orientation) in real time.
- Environmental Understanding: Detects horizontal and vertical surfaces like floors and walls.
- Light Estimation: Assesses ambient lighting to correctly shade virtual objects.
Related Terms
ARCore operates within a broader ecosystem of spatial computing technologies. These related concepts define the underlying systems for mapping, understanding, and interacting with the physical world.
Simultaneous Localization and Mapping (SLAM)
Simultaneous Localization and Mapping (SLAM) is the foundational computational technique that enables ARCore's core functionality. It is the process by which a device constructs a map of an unknown environment while simultaneously tracking its own position within that map. ARCore implements a Visual-Inertial SLAM system that fuses camera images with inertial sensor data from the device's IMU.
- Key Components: Feature tracking, sparse mapping, and pose estimation.
- Contrast with ARCore: SLAM is the generic algorithmic family; ARCore is Google's specific, production-hardened implementation for Android, incorporating additional APIs for plane detection and light estimation.
Visual-Inertial Odometry (VIO)
Visual-Inertial Odometry (VIO) is the real-time pose estimation engine at the heart of ARCore's motion tracking. It is a sensor fusion technique that combines a continuous stream of visual data from the camera with high-frequency motion data from the Inertial Measurement Unit (IMU).
- Purpose: To estimate the device's 6-degree-of-freedom (6DoF) position and orientation.
- Advantage over pure vision: The IMU provides robust tracking during fast motion, temporary occlusion, or low-texture environments where visual features are scarce. ARCore's VIO system is optimized for power efficiency and accuracy on mobile System-on-a-Chip (SoC) architectures.
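
The benefit of fusing the two sensors can be shown with a one-dimensional Python sketch. It uses a complementary filter as a simple stand-in for the estimator a production VIO system would use (typically a Kalman filter or an optimization-based method), and every value in it is invented: a stationary device, a small gyroscope bias, and a visual measurement every tenth IMU sample.

```python
def fuse_yaw(gyro_rates, dt, visual_yaw, alpha=0.8):
    """Fuse gyroscope rates with occasional visual yaw measurements.

    gyro_rates: angular rate per IMU sample, in rad/s.
    visual_yaw: dict mapping sample index to a yaw (rad) from vision.
    alpha: weight kept on the gyro-integrated estimate at each update.
    """
    yaw = 0.0
    history = []
    for i, rate in enumerate(gyro_rates):
        yaw += rate * dt                      # predict from the IMU
        if i in visual_yaw:                   # correct with vision
            yaw = alpha * yaw + (1.0 - alpha) * visual_yaw[i]
        history.append(yaw)
    return history

dt = 0.005                                    # 200 Hz IMU
biased_gyro = [0.01] * 1000                   # stationary device, 0.01 rad/s bias
camera_yaw = {i: 0.0 for i in range(0, 1000, 10)}  # vision reports no rotation

gyro_only = fuse_yaw(biased_gyro, dt, visual_yaw={})
fused = fuse_yaw(biased_gyro, dt, visual_yaw=camera_yaw)
```

After five seconds the gyro-only estimate has drifted by 0.05 rad, while the fused estimate stays within a few milliradians of the true value. The IMU supplies smooth high-rate updates between frames, and the camera keeps the accumulated error bounded.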
Spatial Anchor
A Spatial Anchor is a fixed point of reference in the real world that ARCore creates and tracks. It keeps virtual content at the same physical location as ARCore refines its understanding of the environment, even if the environment changes slightly. Recalling that location in a later app session or on another device requires hosting the anchor as a Cloud Anchor.
- Mechanism: ARCore generates a unique descriptor for the local geometry and visual features surrounding the anchor point.
- Cloud Anchors: ARCore's Cloud Anchors service enables shared multi-user experiences by allowing anchors to be hosted online and resolved by other devices in the same location.
- Use Case: Placing a virtual sculpture in a lobby that multiple visitors can see days later from their own devices.
Scene Understanding
Scene Understanding refers to ARCore's ability to parse the physical environment beyond simple geometry. It involves identifying semantic and functional properties of surfaces and objects.
- Core Capabilities:
  - Plane Detection: Identifying horizontal (floors, tables) and vertical (walls) surfaces.
  - Depth API: Generating real-time depth maps using the device's camera(s) or time-of-flight sensor.
  - Semantic Understanding: Per-pixel classification of outdoor scenes (e.g., 'sky', 'building', 'road') through the Scene Semantics API.
- Purpose: This understanding allows virtual objects to interact realistically with the world: resting on tables, being occluded by real objects, or bouncing on the floor.
ARKit
ARKit is Apple's counterpart framework to ARCore, providing augmented reality capabilities for iOS and iPadOS devices. It serves the same fundamental purpose but within Apple's ecosystem.
- Technical Comparison: Both platforms offer motion tracking, plane detection, light estimation, and image anchoring. They differ in underlying sensor fusion implementations, specific API features (e.g., ARCore's Cloud Anchors vs. ARKit's RealityKit and Object Capture), and hardware optimization targets.
- Developer Impact: The existence of these parallel platforms led to cross-platform SDKs such as Unity's AR Foundation, which abstracts the underlying native APIs. Google's ARCore Extensions for AR Foundation builds on it to expose ARCore-specific features such as Cloud Anchors.

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.