Glossary
Spatial Computing Architectures

Terms related to systems for mapping, understanding, and interacting with the physical world. Target: AR/VR system architects and CTOs.
Simultaneous Localization and Mapping (SLAM)
Simultaneous Localization and Mapping (SLAM) is a computational technique used by robots and autonomous systems to construct a map of an unknown environment while simultaneously tracking their own position within it.
Visual-Inertial Odometry (VIO)
Visual-Inertial Odometry (VIO) is a sensor fusion technique that combines data from one or more cameras and an Inertial Measurement Unit (IMU) to estimate the 6-degree-of-freedom (6DoF) pose of a device. Because the IMU keeps supplying motion data when images are blurred or feature-poor, VIO tends to track more robustly than vision alone during rapid motion or brief visual degradation.
Point Cloud
A point cloud is a set of data points in a 3D coordinate system, typically representing the external surface of an object or scene, generated by sensors like LiDAR or through photogrammetry.
Voxel Grid
A voxel grid is a 3D volumetric representation of space, analogous to a 2D pixel grid, where each voxel (volume element) stores information such as occupancy, color, or density.
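As a minimal sketch of the idea, the snippet below bins 3D points into a sparse voxel grid and stores a per-voxel point count as the occupancy information. The function name `voxelize`, the 0.5 m voxel size, and the sample points are illustrative assumptions, not part of any particular library.

```python
from collections import Counter

def voxelize(points, voxel_size):
    """Map 3D points to integer voxel indices and count points per voxel."""
    counts = Counter()
    for x, y, z in points:
        # Floor division assigns each point to the voxel that contains it,
        # including points with negative coordinates.
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        counts[key] += 1
    return counts

# Hypothetical sample data: four points, two of which share a voxel.
points = [(0.1, 0.2, 0.0), (0.4, 0.1, 0.3), (1.2, 0.0, 0.0), (-0.1, 0.0, 0.0)]
grid = voxelize(points, voxel_size=0.5)
occupied = set(grid)  # the sparse set of occupied voxel indices
```

Storing only occupied voxels in a dictionary keeps memory proportional to the observed surface rather than to the full volume, which is one common design choice for large scenes.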
Signed Distance Function (SDF)
A Signed Distance Function (SDF) is a scalar field that, for any point in space, gives the shortest distance to the surface of an object, with the sign indicating whether the point is inside or outside the object (by the most common convention, negative inside and positive outside).
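The simplest closed-form case is a sphere, sketched below under the negative-inside convention. The function name `sphere_sdf` is an illustrative choice.

```python
import math

def sphere_sdf(point, center, radius):
    """Signed distance from `point` to a sphere: negative inside, positive outside."""
    return math.dist(point, center) - radius

inside = sphere_sdf((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0)      # at the center
on_surface = sphere_sdf((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0)  # on the surface
outside = sphere_sdf((2.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0)     # one unit outside
```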
Semantic Segmentation
Semantic segmentation is a computer vision task that assigns a class label (e.g., 'car', 'road', 'person') to every pixel in an image, providing a dense understanding of scene composition.
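A segmentation network typically outputs one score per class for every pixel. The sketch below shows only the final labeling step, taking the highest-scoring class per pixel. The function name `label_map`, the class list, and the tiny 2x2 score array are made-up illustrations; no real model is involved.

```python
def label_map(scores, class_names):
    """scores[row][col] is a list of per-class scores; return the winning label per pixel."""
    return [
        [class_names[max(range(len(px)), key=px.__getitem__)] for px in row]
        for row in scores
    ]

class_names = ["road", "car", "person"]
# Hypothetical per-pixel class scores for a 2x2 image.
scores = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.3, 0.3, 0.4], [0.6, 0.3, 0.1]],
]
labels = label_map(scores, class_names)
```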
Depth Map
A depth map is an image or image channel where each pixel value represents the distance from the camera to the corresponding point in the 3D scene, usually measured along the camera's optical axis (z-depth) rather than along the viewing ray.
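A depth value becomes a 3D point once the camera intrinsics are known. The sketch below assumes an ideal pinhole camera with no lens distortion and assumes the stored value is z-depth along the optical axis. The function name `backproject` and the intrinsics (`fx`, `fy`, `cx`, `cy`) are illustrative values.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Convert pixel (u, v) with z-depth `depth` into a camera-frame 3D point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Hypothetical intrinsics for a 640x480 camera.
fx = fy = 500.0
cx, cy = 320.0, 240.0

center_point = backproject(320, 240, 2.0, fx, fy, cx, cy)  # pixel at the principal point
offset_point = backproject(420, 240, 2.0, fx, fy, cx, cy)  # 100 px to the right
```

Applying this to every valid pixel turns a depth map into a point cloud in the camera frame.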
Bundle Adjustment
Bundle adjustment is a nonlinear least-squares optimization in computer vision and photogrammetry that jointly refines the 3D coordinates of scene points, the camera intrinsic parameters, and/or the camera poses to minimize reprojection error across a set of images.
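The quantity being minimized can be written down compactly. The sketch below evaluates only the cost (the sum of squared reprojection residuals) for a pinhole camera without distortion; it does not perform the optimization itself, which in practice is handed to a nonlinear least-squares solver. All names (`project`, `transform`, `total_reprojection_error`) and the sample values are illustrative assumptions.

```python
def transform(R, t, p):
    """Map a world point into the camera frame: R @ p + t."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)

def total_reprojection_error(observations, points, poses, fx, fy, cx, cy):
    """observations: list of (camera_index, point_index, (u, v)) pixel measurements."""
    total = 0.0
    for cam_idx, pt_idx, (u_obs, v_obs) in observations:
        R, t = poses[cam_idx]
        u, v = project(transform(R, t, points[pt_idx]), fx, fy, cx, cy)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
poses = [(identity, (0.0, 0.0, 0.0))]   # one camera at the world origin
points = [(0.0, 0.0, 2.0)]              # one point 2 m in front of it

perfect = total_reprojection_error([(0, 0, (320.0, 240.0))], points, poses, 500.0, 500.0, 320.0, 240.0)
noisy = total_reprojection_error([(0, 0, (323.0, 244.0))], points, poses, 500.0, 500.0, 320.0, 240.0)
```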
Loop Closure
Loop closure is the process in SLAM where a system recognizes a previously visited location, allowing it to correct accumulated drift in its pose estimate and map by enforcing global consistency.
Iterative Closest Point (ICP)
Iterative Closest Point (ICP) is an algorithm used to align two 3D point clouds by iteratively minimizing the distance between corresponding points, commonly used for point cloud registration and scan matching.
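To keep the example short, the sketch below runs point-to-point ICP in 2D rather than 3D, using brute-force nearest-neighbor matching and a closed-form rigid alignment at each iteration. It assumes the initial misalignment is small enough for nearest neighbors to be the right correspondences. The function names and sample data are illustrative.

```python
import math

def best_rigid_2d(src, dst):
    """Closed-form least-squares rotation and translation for paired 2D points."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    s_dot = s_cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - csx, ay - csy, bx - cdx, by - cdy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated source centroid onto the target centroid.
    return theta, cdx - (c * csx - s * csy), cdy - (s * csx + c * csy)

def icp_2d(source, target, iterations=20):
    """Iteratively match each source point to its nearest target point and realign."""
    current = list(source)
    for _ in range(iterations):
        matched = [min(target, key=lambda q: math.dist(p, q)) for p in current]
        theta, tx, ty = best_rigid_2d(current, matched)
        c, s = math.cos(theta), math.sin(theta)
        current = [(c * x - s * y + tx, s * x + c * y + ty) for x, y in current]
    return current

# Hypothetical data: the source is the target rotated by 0.1 rad and shifted.
target = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
c, s = math.cos(0.1), math.sin(0.1)
source = [(c * x - s * y + 0.2, s * x + c * y - 0.1) for x, y in target]
aligned = icp_2d(source, target)
```

Production implementations usually replace the brute-force search with a spatial index such as a k-d tree and reject outlier correspondences before each alignment step.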
Pose Graph
A pose graph is a sparse graphical model used in SLAM where nodes represent estimated robot poses (positions and orientations) and edges represent spatial constraints between them derived from sensor measurements.
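One possible in-memory layout, sketched in 2D for brevity: nodes hold `(x, y, theta)` poses and each edge holds a measured relative pose expressed in the frame of its first node. The class name `PoseGraph`, its methods, and the sample values are illustrative assumptions. The `residual` method shows the per-edge error that a pose graph optimizer would drive toward zero.

```python
import math
from dataclasses import dataclass, field

@dataclass
class PoseGraph:
    poses: dict = field(default_factory=dict)  # node id -> (x, y, theta)
    edges: list = field(default_factory=list)  # (i, j, (dx, dy, dtheta)) measured in frame i

    def add_pose(self, node_id, pose):
        self.poses[node_id] = pose

    def add_edge(self, i, j, measurement):
        self.edges.append((i, j, measurement))

    def residual(self, edge):
        """Difference between the relative pose implied by the estimates and the measurement."""
        i, j, (mx, my, mth) = edge
        xi, yi, thi = self.poses[i]
        xj, yj, thj = self.poses[j]
        c, s = math.cos(thi), math.sin(thi)
        dx, dy = xj - xi, yj - yi
        # Express the displacement from node i to node j in node i's frame.
        px, py = c * dx + s * dy, -s * dx + c * dy
        # Wrap the angular error into [-pi, pi).
        dth = (thj - thi - mth + math.pi) % (2 * math.pi) - math.pi
        return (px - mx, py - my, dth)

graph = PoseGraph()
graph.add_pose(0, (0.0, 0.0, 0.0))
graph.add_pose(1, (1.0, 0.0, 0.0))
graph.add_pose(2, (1.1, 0.1, 0.0))          # estimate has drifted by 0.1 m in x and y
graph.add_edge(0, 1, (1.0, 0.0, 0.0))       # odometry
graph.add_edge(2, 0, (-1.0, 0.0, 0.0))      # hypothetical loop closure back to node 0
loop_residual = graph.residual(graph.edges[1])
```

A non-zero residual on the loop closure edge is exactly the accumulated drift that optimization would redistribute across the trajectory.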
Spatial Anchor
A spatial anchor is a persistent point of reference in the real world that a mixed reality or augmented reality application can use to precisely place and recall virtual content across sessions.
Scene Understanding
Scene understanding is the high-level computer vision task of parsing a visual scene to identify objects, surfaces, layouts, and their semantic relationships and physical properties.
6DoF Pose
6DoF Pose refers to the complete position and orientation of an object in three-dimensional space, defined by three translational degrees of freedom (x, y, z) and three rotational degrees of freedom (roll, pitch, yaw).
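A 6DoF pose is often packed into a 4x4 homogeneous transform. The sketch below assumes angles in radians and the rotation order R = Rz(yaw) · Ry(pitch) · Rx(roll); other frameworks use different axis and ordering conventions, so treat this as one possibility. The function names are illustrative.

```python
import math

def pose_matrix(x, y, z, roll, pitch, yaw):
    """Build a 4x4 transform from translation and roll/pitch/yaw (radians)."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, x],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, y],
        [-sp, cp * sr, cp * cr, z],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(T, p):
    """Transform a 3D point by a 4x4 homogeneous matrix."""
    return tuple(T[i][0] * p[0] + T[i][1] * p[1] + T[i][2] * p[2] + T[i][3] for i in range(3))

# A 90-degree yaw followed by a translation of (1, 2, 3).
T = pose_matrix(1.0, 2.0, 3.0, 0.0, 0.0, math.pi / 2)
moved = apply(T, (1.0, 0.0, 0.0))  # approximately (1.0, 3.0, 3.0)
```

Roll/pitch/yaw suffers from gimbal lock near a pitch of ±90 degrees, which is one reason tracking systems often store orientation as a quaternion instead.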
Spatial Mapping
Spatial mapping is the process of creating a 3D digital representation of the physical environment, including its geometry and sometimes semantics, for use in augmented reality, robotics, and spatial computing applications.
Surface Reconstruction
Surface reconstruction is the process of creating a continuous polygonal mesh or other surface representation from a set of unorganized 3D points, such as those from a point cloud.
ARKit
ARKit is Apple's software framework for building augmented reality experiences on iOS devices, providing capabilities like world tracking, scene understanding, and face tracking.
ARCore
ARCore is Google's platform for building augmented reality experiences on Android, offering motion tracking, environmental understanding, and light estimation.
OpenXR
OpenXR is a royalty-free, open standard developed by the Khronos Group that gives applications a common, cross-platform API for accessing a wide range of virtual reality and augmented reality devices and runtimes.
Sensor Fusion
Sensor fusion is the process of combining sensory data from disparate sources (e.g., cameras, IMUs, LiDAR) to produce information that is more accurate, complete, and reliable than that provided by any individual sensor.
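The accuracy claim can be illustrated with the simplest fusion rule, inverse-variance weighting of several measurements of the same quantity. The sketch assumes the measurements are independent and unbiased with known variances. The function name `fuse` and the sample readings are illustrative.

```python
def fuse(measurements):
    """Fuse (value, variance) pairs by inverse-variance weighting."""
    weights = [1.0 / var for _, var in measurements]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, measurements)) / total
    # The fused variance is smaller than any individual variance.
    return value, 1.0 / total

# Two hypothetical range readings of the same distance, each with variance 4.0.
fused_value, fused_variance = fuse([(10.0, 4.0), (12.0, 4.0)])
```

Here the fused estimate lands midway between the readings and its variance is halved, which is the sense in which the combined result is more reliable than either sensor alone.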
Kalman Filter
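A Kalman filter is a recursive estimator used in sensor fusion and state estimation that alternates between predicting a system's next state from a motion model and correcting that prediction with new measurements. For linear systems with Gaussian noise, it is optimal in the sense that it minimizes the mean squared estimation error.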
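The predict and update steps are easiest to see in one dimension. The sketch below tracks a single scalar that is assumed to be nearly constant, so the prediction step only inflates the uncertainty. The function name `kalman_1d`, the noise variances, and the measurement sequence are illustrative assumptions.

```python
def kalman_1d(measurements, meas_var, process_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a static state model; returns estimates and final variance."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        p = p + process_var
        # Update: the gain k balances prediction uncertainty against measurement noise.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates, p

# Hypothetical input: fifty identical readings of 5.0, starting from an estimate of 0.0.
estimates, final_var = kalman_1d([5.0] * 50, meas_var=1.0, process_var=0.01)
```

Real trackers use the multivariate form with a state vector and covariance matrices, and extended or unscented variants when the motion or measurement model is nonlinear.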
Feature Tracking
Feature tracking is the process of following distinctive points (features) across a sequence of images or video frames to estimate motion, optical flow, or camera pose.
Global Map
In SLAM and robotics, a global map is the unified, consistent representation of the entire known environment, often built by merging local submaps and corrected through loop closure.
Visual SLAM
Visual SLAM is a class of SLAM techniques that use one or more cameras as the primary sensor for both localization and mapping, without relying on pre-existing maps or external positioning systems.
ORB-SLAM
ORB-SLAM is a family of feature-based visual SLAM systems that use ORB features for tracking, mapping, relocalization, and loop closing. The original system is monocular; ORB-SLAM2 added stereo and RGB-D support.
Plane Detection
Plane detection is a computer vision process that identifies flat surfaces (like floors, walls, and tables) in a 3D scene, a fundamental capability for AR placement and spatial understanding.
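One common way to find a dominant plane in a point cloud is RANSAC: repeatedly fit a plane to three random points and keep the candidate with the most inliers. The sketch below is a bare-bones version of that idea and is not how any specific AR framework implements plane detection. The function names, the 2 cm threshold, the iteration count, and the synthetic data are assumptions.

```python
import math
import random

def plane_through(p, q, r):
    """Return (unit_normal, d) for the plane n.x + d = 0 through three points, or None if degenerate."""
    ux, uy, uz = (q[i] - p[i] for i in range(3))
    vx, vy, vz = (r[i] - p[i] for i in range(3))
    n = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
    length = math.sqrt(sum(c * c for c in n))
    if length < 1e-12:
        return None  # the three points are collinear
    n = tuple(c / length for c in n)
    return n, -sum(n[i] * p[i] for i in range(3))

def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Return the best (plane, inliers) pair found over random three-point samples."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_through(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [p for p in points if abs(sum(n[i] * p[i] for i in range(3)) + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers

# Synthetic scene: a 5x5 grid of floor points at z = 0 plus five off-plane points.
floor = [(0.5 * i, 0.5 * j, 0.0) for i in range(5) for j in range(5)]
clutter = [(0.3, 0.7, 0.5), (1.1, 0.2, 0.9), (1.7, 1.6, 1.3), (0.9, 1.4, 0.7), (0.2, 1.9, 1.1)]
plane, inliers = ransac_plane(floor + clutter)
```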
World Mesh
A world mesh is a 3D polygonal mesh, generated in real time, that represents the reconstructed surfaces of the physical environment and is used for occlusion, physics, and navigation in mixed reality applications.
Bounding Volume Hierarchy (BVH)
A Bounding Volume Hierarchy (BVH) is a tree structure used in computer graphics and computational geometry to organize objects in space, enabling efficient spatial queries like ray intersection and collision detection.
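A minimal sketch of the structure, assuming axis-aligned bounding boxes (AABBs), a non-empty input, and a simple median split along the longest axis. Production builders often use a surface area heuristic instead. All names and the sample boxes are illustrative.

```python
from dataclasses import dataclass
from typing import Optional

# A box is ((min_x, min_y, min_z), (max_x, max_y, max_z)).

def union(a, b):
    """Smallest box enclosing both a and b."""
    return (tuple(map(min, a[0], b[0])), tuple(map(max, a[1], b[1])))

def overlaps(a, b):
    return all(a[0][k] <= b[1][k] and b[0][k] <= a[1][k] for k in range(3))

@dataclass
class Node:
    box: tuple
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    item: Optional[int] = None  # leaf payload: index into the input list

def build(boxes, indices=None):
    """Recursively build a BVH over a non-empty list of boxes."""
    if indices is None:
        indices = list(range(len(boxes)))
    if len(indices) == 1:
        return Node(box=boxes[indices[0]], item=indices[0])
    bounds = boxes[indices[0]]
    for i in indices[1:]:
        bounds = union(bounds, boxes[i])
    # Split at the median box center along the longest axis of the enclosing box.
    axis = max(range(3), key=lambda k: bounds[1][k] - bounds[0][k])
    indices = sorted(indices, key=lambda i: boxes[i][0][axis] + boxes[i][1][axis])
    mid = len(indices) // 2
    return Node(bounds, build(boxes, indices[:mid]), build(boxes, indices[mid:]))

def query(node, box, hits=None):
    """Collect indices of leaf boxes overlapping `box`, pruning subtrees that cannot match."""
    hits = [] if hits is None else hits
    if node is None or not overlaps(node.box, box):
        return hits
    if node.item is not None:
        hits.append(node.item)
    else:
        query(node.left, box, hits)
        query(node.right, box, hits)
    return hits

# Four unit cubes spaced along the x axis at x = 0, 2, 4, 6.
boxes = [((float(x), 0.0, 0.0), (x + 1.0, 1.0, 1.0)) for x in (0, 2, 4, 6)]
root = build(boxes)
hits = sorted(query(root, ((1.5, 0.0, 0.0), (4.5, 1.0, 1.0))))
```

The same traversal pattern applies to ray intersection: test the ray against a node's box first and descend only when it hits.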
Foveated Rendering
Foveated rendering is a graphics optimization technique that reduces rendering quality in the user's peripheral vision (where the eye perceives less detail) while maintaining high resolution in the central foveal region, which lowers the GPU workload per frame.
Hand Tracking
Hand tracking is the computer vision technology that detects, localizes, and estimates the pose (joint positions) of a user's hands in real time, enabling natural interaction in virtual and augmented reality.