Inferensys

Glossary

TinyML Frameworks

Terms related to software libraries and toolchains specifically designed for developing and deploying machine learning models on microcontrollers. Target: Firmware Developers.

TensorFlow Lite Micro (TFLM)

TensorFlow Lite Micro (TFLM) is a cross-platform, open-source deep learning inference framework designed to run neural network models on microcontrollers and other devices with only kilobytes of memory.

CMSIS-NN

CMSIS-NN is a collection of efficient neural network kernels developed by Arm as part of the Cortex Microcontroller Software Interface Standard (CMSIS) to maximize performance on Arm Cortex-M processor cores.
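As a rough illustration of the arithmetic such kernels perform, the following is a minimal portable-C sketch of an int8 dot product with a 32-bit accumulator and fixed-point rescaling. The function `dot_s8` and its parameters are hypothetical and are not part of the CMSIS-NN API; the library's own kernels rely on processor-specific instructions where available and on their own requantization routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a wide intermediate value into the int8 range. */
static int8_t saturate_int8(int64_t v)
{
    if (v > 127) return 127;
    if (v < -128) return -128;
    return (int8_t)v;
}

/* Hypothetical helper (NOT a CMSIS-NN function): int8 dot product that
 * accumulates in 32 bits, then rescales by multiplier / 2^shift,
 * rounding half away from zero, and saturates to int8. */
int8_t dot_s8(const int8_t *a, const int8_t *b, size_t n,
              int32_t bias, int32_t multiplier, int shift)
{
    assert(a != NULL && b != NULL);
    assert(multiplier >= 0 && shift >= 1 && shift <= 31);

    int32_t acc = bias;
    for (size_t i = 0; i < n; ++i) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }

    int64_t scaled = (int64_t)acc * multiplier;
    int64_t divisor = (int64_t)1 << shift;
    int64_t half = divisor / 2;
    int64_t q = (scaled >= 0) ? (scaled + half) / divisor
                              : -((-scaled + half) / divisor);
    return saturate_int8(q);
}
```

Accumulating in a wider type and saturating only at the end is the usual way to keep 8-bit arithmetic from overflowing mid-sum.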

STM32Cube.AI

STM32Cube.AI is an STMicroelectronics development tool that converts pre-trained neural networks into optimized C code for deployment on STM32 microcontroller families.

Edge Impulse

Edge Impulse is a cloud-based development platform that provides an end-to-end workflow for building, optimizing, and deploying machine learning models to microcontroller and edge device targets.

ESP-DL

ESP-DL is Espressif Systems' deep learning library providing optimized neural network operations and model deployment tools for their ESP32 series of microcontrollers.

SensiML

SensiML is a software toolkit for creating AI algorithms that analyze real-time sensor data, enabling the development of intelligent sensing applications for microcontrollers.

MicroTVM

MicroTVM is a component of Apache TVM that enables the compilation and deployment of machine learning models onto bare-metal microcontrollers by providing a minimal runtime and ahead-of-time (AOT) compilation.

uTensor

uTensor is an open-source, lightweight machine learning inference framework built specifically for microcontrollers, pairing a small C++ runtime with a code generator that converts trained TensorFlow models into C++ source.

ELL

ELL (Embedded Learning Library) is an open-source library from Microsoft for building and deploying machine-learned models onto resource-constrained platforms such as microcontrollers and single-board computers.

TinyEngine

TinyEngine is a memory-efficient deep learning inference engine, developed as part of MCUNet, that generates specialized C code containing only the operations a given neural network needs, minimizing memory overhead on microcontrollers.

EON Compiler

The EON Compiler is a tool within the Edge Impulse platform that compiles a trained neural network directly into C++ source code, removing the need for a runtime interpreter and thereby reducing RAM and flash usage on the target device.

MLPerf Tiny

MLPerf Tiny is a benchmark suite from MLCommons designed to measure the latency, energy use, and accuracy of machine learning inference on ultra-low-power devices like microcontrollers.

MCUNet

MCUNet is a system-algorithm co-design framework that jointly optimizes the neural architecture (via TinyNAS) and the inference engine (TinyEngine) to enable deep learning on microcontrollers with severely limited memory.

TinyML Toolchain

A TinyML toolchain is the integrated set of software tools—including compilers, optimizers, profilers, and deployment utilities—used to convert, optimize, and deploy machine learning models onto microcontroller hardware.

FlatBuffer Model

A FlatBuffer model is a neural network model serialized using the FlatBuffers cross-platform serialization library, which is the standard, memory-efficient format used by TensorFlow Lite and TensorFlow Lite Micro.
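One cheap sanity check firmware can run before handing a buffer to an inference engine is to look for the FlatBuffers file identifier, which for TensorFlow Lite models is the four bytes `TFL3` at offsets 4 to 7. The sketch below assumes that layout; the function name `looks_like_tflite` is hypothetical, and the check does not replace full schema verification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns true when the buffer is long enough to
 * hold a FlatBuffers header and carries the "TFL3" file identifier in
 * bytes 4..7. Bytes 0..3 hold the offset to the root table and are not
 * inspected here. */
bool looks_like_tflite(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 8) {
        return false;
    }
    return memcmp(buf + 4, "TFL3", 4) == 0;
}
```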

C Array Model

A C array model is a neural network model represented as a constant C/C++ byte array (header file) within source code, enabling direct compilation into a firmware binary without a separate file system.
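A minimal sketch of what such an array can look like is shown below. The symbol names `g_model_data` and `g_model_data_len` and the eight placeholder bytes are assumptions for illustration; a real array is generated from the model file, for example with `xxd -i model.tflite`, and is typically many kilobytes long. The 16-byte alignment is a conservative choice, since some runtimes expect the model buffer to be aligned.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Placeholder bytes only (hypothetical): a real model array is produced
 * by a converter tool and contains the full serialized model. Declaring
 * it const lets the linker place it in flash on most MCU toolchains. */
alignas(16) const unsigned char g_model_data[] = {
    0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33,
};

/* Length in bytes, computed at compile time. */
const size_t g_model_data_len = sizeof(g_model_data);

/* Compile-time guard: the array must at least hold a FlatBuffers header. */
static_assert(sizeof(g_model_data) >= 8, "model array is too short");
```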

nncase

nncase is an open-source neural network compiler developed by Canaan Inc. that compiles models from frameworks like TensorFlow and ONNX into high-performance code for edge inference, supporting microcontrollers via its CPU backend.

Micro Interpreter

A micro interpreter is a minimal runtime component within a TinyML framework (like TFLM) that reads a model, plans its execution graph, and invokes optimized kernel functions to perform inference on a microcontroller.
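The dispatch idea can be sketched in a few lines of C: walk a list of operations and call the matching kernel through a function-pointer table. Everything here (`OpCode`, `Op`, `interpreter_invoke`, and the two toy kernels) is hypothetical and far simpler than a real runtime such as TFLM, which also handles tensor metadata and memory planning.

```c
#include <assert.h>
#include <stddef.h>

/* Toy operation set (hypothetical). */
typedef enum { OP_ADD_CONST, OP_RELU, OP_COUNT } OpCode;

typedef struct {
    OpCode code;
    float param; /* only used by OP_ADD_CONST */
} Op;

typedef void (*Kernel)(float *data, size_t n, float param);

static void kernel_add_const(float *data, size_t n, float param)
{
    for (size_t i = 0; i < n; ++i) data[i] += param;
}

static void kernel_relu(float *data, size_t n, float param)
{
    (void)param; /* unused */
    for (size_t i = 0; i < n; ++i) {
        if (data[i] < 0.0f) data[i] = 0.0f;
    }
}

/* Kernel table indexed by op code. */
static const Kernel kKernels[OP_COUNT] = {
    [OP_ADD_CONST] = kernel_add_const,
    [OP_RELU] = kernel_relu,
};

/* Run every op in order, in place, on a single tensor. */
void interpreter_invoke(const Op *ops, size_t op_count,
                        float *data, size_t n)
{
    assert(ops != NULL && data != NULL);
    for (size_t i = 0; i < op_count; ++i) {
        assert(ops[i].code >= 0 && ops[i].code < OP_COUNT);
        kKernels[ops[i].code](data, n, ops[i].param);
    }
}
```

Because only the kernels listed in the table are linked in, a table-driven design like this also makes it easy to leave unused operators out of the firmware image.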

Tensor Arena

The tensor arena is a contiguous block of memory, usually a statically allocated SRAM buffer supplied by the application, that a TinyML inference engine uses to hold input, output, and intermediate activation tensors along with other temporary data during model execution.
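One common way to carve tensors out of such a block is a bump allocator, sketched below under simplifying assumptions: allocations are never freed individually, and the names `Arena`, `arena_init`, and `arena_alloc` are hypothetical rather than taken from any particular framework.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *base; /* start of the caller-supplied buffer */
    size_t size;   /* total bytes available */
    size_t used;   /* bytes handed out so far, including padding */
} Arena;

void arena_init(Arena *a, uint8_t *buffer, size_t size)
{
    assert(a != NULL && buffer != NULL);
    a->base = buffer;
    a->size = size;
    a->used = 0;
}

/* Returns an aligned pointer inside the arena, or NULL if the request
 * does not fit. `align` must be a power of two. */
void *arena_alloc(Arena *a, size_t bytes, size_t align)
{
    assert(a != NULL);
    assert(align != 0 && (align & (align - 1)) == 0);

    uintptr_t start = (uintptr_t)a->base + a->used;
    uintptr_t aligned = (start + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t offset = (size_t)(aligned - (uintptr_t)a->base);

    if (offset > a->size || bytes > a->size - offset) {
        return NULL; /* arena too small: caller must enlarge the buffer */
    }
    a->used = offset + bytes;
    return a->base + offset;
}
```

A NULL return here corresponds to the familiar failure mode where the arena must be enlarged before the model will run.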

CMSIS-DSP

CMSIS-DSP is a library of common digital signal processing (DSP) functions optimized for Arm Cortex-M and Cortex-A processors, providing a foundational building block for efficient sensor data processing in TinyML applications.

AI Coprocessor

An AI coprocessor is a dedicated hardware accelerator, such as a microNPU (Neural Processing Unit), integrated into a microcontroller or system-on-chip to offload and dramatically accelerate neural network inference tasks.

Ethos-U55

The Arm Ethos-U55 is a microNPU (Neural Processing Unit) designed as a configurable, area- and power-efficient accelerator for machine learning inference in embedded and IoT endpoint devices using Cortex-M CPUs.

NPU SDK

An NPU SDK is a software development kit provided by a silicon vendor that contains compilers, runtime libraries, and profiling tools needed to deploy and execute neural network models on their dedicated Neural Processing Unit hardware.

Micro-Compiler

A micro-compiler in TinyML is a specialized compiler (e.g., within TVM or a vendor SDK) that translates high-level neural network models into highly optimized, low-level machine code or C code targeted for microcontroller execution.

Operator Fusion

Operator fusion is a graph optimization technique where consecutive neural network operations (layers) are combined into a single, compound kernel to reduce memory accesses and overhead, critical for efficient microcontroller inference.
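The effect can be seen in a small sketch that compares a two-pass bias-add followed by ReLU against a single fused loop. The function names are hypothetical; the point is that the fused version needs no intermediate buffer and touches each element once.

```c
#include <assert.h>
#include <stddef.h>

/* Unfused step 1: writes the intermediate result to a scratch buffer. */
void add_bias(const float *in, const float *bias, float *tmp, size_t n)
{
    assert(in != NULL && bias != NULL && tmp != NULL);
    for (size_t i = 0; i < n; ++i) tmp[i] = in[i] + bias[i];
}

/* Unfused step 2: reads the scratch buffer back and applies ReLU. */
void relu(const float *tmp, float *out, size_t n)
{
    assert(tmp != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) out[i] = tmp[i] > 0.0f ? tmp[i] : 0.0f;
}

/* Fused: one pass, no scratch buffer, same result. */
void add_bias_relu_fused(const float *in, const float *bias,
                         float *out, size_t n)
{
    assert(in != NULL && bias != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        float v = in[i] + bias[i];
        out[i] = v > 0.0f ? v : 0.0f;
    }
}
```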

Graph Optimization

Graph optimization in TinyML is the process of transforming a neural network's computational graph—through techniques like constant folding and operator fusion—to reduce its memory footprint and improve execution speed on constrained hardware.
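Constant folding can be illustrated with two back-to-back affine operations whose parameters are known ahead of time: they collapse into one operation before the model ever reaches the device. The `Affine` type and both functions below are hypothetical and stand in for what a converter or compiler would do on the host.

```c
#include <assert.h>
#include <stddef.h>

/* y = x * scale + offset, with constants known at conversion time. */
typedef struct {
    float scale;
    float offset;
} Affine;

/* Fold "apply first, then second" into a single equivalent op:
 * (x * s1 + o1) * s2 + o2  ==  x * (s1 * s2) + (o1 * s2 + o2). */
Affine fold_affine(Affine first, Affine second)
{
    Affine folded = {
        first.scale * second.scale,
        first.offset * second.scale + second.offset,
    };
    return folded;
}

/* Apply one affine op in place. */
void apply_affine(Affine op, float *data, size_t n)
{
    assert(data != NULL);
    for (size_t i = 0; i < n; ++i) {
        data[i] = data[i] * op.scale + op.offset;
    }
}
```

After folding, the device runs one loop instead of two and stores one pair of constants instead of two.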

Model Zoo

A TinyML model zoo is a curated repository of pre-trained, optimized, and benchmarked neural network models for common edge tasks (like keyword spotting or visual wake words), ready for deployment on specific microcontroller platforms.

Embedded ML Framework

An embedded ML framework is a software library or toolchain, such as TensorFlow Lite Micro or CMSIS-NN, specifically engineered to enable the deployment and execution of machine learning models on microcontroller-based embedded systems.

On-Device SDK

An on-device SDK is a vendor-specific software development kit that provides libraries, APIs, and tools to develop applications that include local, on-device machine learning inference, typically for a family of microcontrollers or processors.

Deployment Workflow

The TinyML deployment workflow is the end-to-end process of converting a trained model, optimizing it for target hardware, integrating it into embedded firmware, and validating its performance and resource usage on the actual device.