Glossary

Configuration Management (Hydra, YAML Config)

Configuration management in machine learning is the practice of externalizing all tunable parameters and settings into structured files or frameworks to separate code from configuration and ensure reproducibility.

EXPERIMENT TRACKING

What is Configuration Management (Hydra, YAML Config)?

Configuration management is a foundational practice in machine learning engineering that separates tunable parameters from core application logic to ensure reproducibility, facilitate experimentation, and streamline deployment.

Configuration management in machine learning is the systematic practice of externalizing all tunable parameters, settings, and environment variables into declarative, structured files—such as YAML, JSON, or TOML—to decouple configuration from code. This separation enables reproducibility by providing a complete, versioned record of every experiment's settings, facilitates hyperparameter tuning by allowing easy swapping of values, and supports environment-specific deployments (development, staging, production) without code changes. Frameworks like Hydra extend this concept by enabling hierarchical, composable, and dynamically configurable setups from the command line.

Using a structured format like YAML provides human-readable, hierarchical organization for complex configurations encompassing model architectures, data paths, optimizer settings, and logging directives. Advanced tools manage configuration inheritance, variable interpolation, and environment variable overrides. This discipline is critical for experiment tracking systems, which log these configurations alongside metrics and artifacts, creating an auditable lineage. Proper configuration management directly supports Evaluation-Driven Development by ensuring every model result can be precisely linked to its defining parameters for rigorous benchmarking and comparison.

CONFIGURATION MANAGEMENT

Core Components of ML Configuration Management

Configuration management externalizes tunable parameters into structured files, separating code from settings to ensure reproducibility and simplify experimentation. The sections below detail its foundational components.

Configuration Files (YAML/JSON)

Configuration files are human-readable, structured documents (commonly YAML or JSON) that store all adjustable parameters for a machine learning experiment. This includes:

Hyperparameters (e.g., learning rate, batch size, layer count)
Data paths and preprocessing settings
Model architecture definitions
Training loop controls (e.g., number of epochs)

By externalizing these settings, the core training code remains static while configurations can be swapped easily, enabling parameter sweeps and A/B testing without code changes. This is the bedrock of reproducibility: together with the code commit, data version, and software environment, the config file lets a run be recreated.
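As a minimal sketch of this separation, the snippet below keeps a training function static and feeds it settings parsed from a JSON document. The config keys and the `train` function are hypothetical, and the config is an inline string only so the example is self-contained; in a project it would live in its own file.

```python
import json

# Hypothetical config; in a real project this would be read from a file
# such as config.json rather than an inline string.
CONFIG_TEXT = """
{
  "training": {"learning_rate": 0.001, "batch_size": 32, "epochs": 3},
  "data": {"path": "/datasets/cifar10"}
}
"""


def train(config: dict) -> str:
    """Stand-in for a training loop: it only reads settings from config."""
    t = config["training"]
    return (
        f"training on {config['data']['path']} for {t['epochs']} epochs "
        f"(lr={t['learning_rate']}, batch_size={t['batch_size']})"
    )


config = json.loads(CONFIG_TEXT)
summary = train(config)
print(summary)
```

Changing the batch size or data path now means editing the config document, not the function.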

Hierarchical & Composable Configs

Hierarchical configuration organizes settings into a tree-like structure, allowing for inheritance and overrides. A base config (e.g., model/base.yaml) can define common defaults, while experiment-specific configs (e.g., experiment/bert.yaml) extend and modify only the necessary parameters.

Composability allows configs to be assembled from multiple modular files. For instance, you can have separate files for dataset.yaml, optimizer.yaml, and model.yaml, which are stitched together at runtime. This promotes modularity, reduces duplication, and allows for easy mixing and matching of components (e.g., trying the same optimizer with five different model architectures).
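The inheritance-and-override idea can be sketched with plain dictionaries. The `deep_merge` helper and the config contents below are illustrative assumptions, not the mechanism any particular framework uses; they only show how an experiment config can change a few values while inheriting the rest.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where values in override replace or extend base."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so sibling keys in the base section are preserved.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical contents of a base config and an experiment-specific config.
base = {
    "model": {"name": "bert", "layers": 12, "dropout": 0.1},
    "optimizer": {"name": "adam", "lr": 1e-3},
}
experiment = {"model": {"layers": 24}, "optimizer": {"lr": 5e-5}}

config = deep_merge(base, experiment)
print(config)
```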

Configuration Frameworks (Hydra)

Configuration frameworks like Hydra provide a programmatic layer to manage YAML files, overcoming their static limitations. Key features include:

Dynamic overrides from the command line (e.g., python train.py model.layer_size=512)
Multirun sweeps that launch one job per parameter combination from a single command (jobs run sequentially by default; launcher plugins can run them in parallel)
Config composition via config groups and defaults lists
Runtime instantiation of Python objects directly from config values

These frameworks transform static configs into a dynamic experimentation API, bridging the gap between declarative settings and executable code. They enforce structure while providing the flexibility needed for rapid iterative development.
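To make the command-line override idea concrete, here is a simplified stand-in written with the standard library. It is not Hydra's implementation, and Hydra's real override grammar is considerably richer; the `apply_override` helper is a hypothetical name used only for illustration.

```python
import ast


def apply_override(config: dict, override: str) -> None:
    """Apply one 'a.b.c=value' override to a nested dict in place."""
    dotted_key, raw_value = override.split("=", 1)
    try:
        # Interpret numbers, booleans, etc.; fall back to a plain string.
        value = ast.literal_eval(raw_value)
    except (ValueError, SyntaxError):
        value = raw_value
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


config = {"model": {"layer_size": 256}, "training": {"batch_size": 32}}
# Stand-in for sys.argv[1:], e.g. `python train.py model.layer_size=512 ...`
for arg in ["model.layer_size=512", "training.optimizer=adam"]:
    apply_override(config, arg)
print(config)
```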

Environment & Secret Management

Environment-specific configuration separates settings that vary between development, staging, and production environments (e.g., API endpoints, database URIs, log levels). This is often managed via separate config files (config_dev.yaml, config_prod.yaml) or environment variables.

Secret management is the secure handling of sensitive data like API keys, passwords, and tokens. Best practices dictate that these are never hardcoded in config files committed to version control. Instead, they are injected at runtime via environment variables or dedicated secret management services (e.g., HashiCorp Vault, AWS Secrets Manager), linking configuration management to security posture.
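A small sketch of runtime injection, assuming a hypothetical `TRACKING_API_KEY` variable and a placeholder endpoint: non-secret settings stay in the config, while the secret is read from the environment and the program fails fast if it is absent.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail fast."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value


# Set here only so the example runs; normally the deployment environment
# or a secret manager integration would provide it.
os.environ.setdefault("TRACKING_API_KEY", "example-not-a-real-key")

# Non-secret settings come from the config file; the secret is injected.
config = {"tracking": {"endpoint": "https://tracking.example.invalid"}}
config["tracking"]["api_key"] = require_env("TRACKING_API_KEY")
print(sorted(config["tracking"]))
```

Failing at startup is a deliberate choice here: a missing key surfaces before any compute is spent.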

Config Versioning & Provenance

Config versioning treats configuration files as first-class artifacts, storing them in version control (e.g., Git) alongside the training code. Each experiment run is linked to the exact config file snapshot that produced it.

This establishes full lineage and provenance: given any trained model, you can trace back to the precise code commit, data version, and configuration that generated it. This is essential for auditability, debugging, and regulatory compliance. Experiment tracking platforms can record the config as run metadata, giving each run a durable record of its experimental setup.
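One lightweight way to link a run to its settings is to write a manifest into the run directory. The sketch below assumes a hypothetical `snapshot_config` helper, a made-up manifest file name, and a placeholder commit hash; a real script might obtain the commit from Git instead.

```python
import json
import tempfile
from pathlib import Path


def snapshot_config(config: dict, code_commit: str, run_dir: Path) -> Path:
    """Write the config and code commit used for a run into its directory."""
    run_dir.mkdir(parents=True, exist_ok=True)
    manifest = {"code_commit": code_commit, "config": config}
    path = run_dir / "run_manifest.json"
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return path


config = {"model": {"name": "resnet50"}, "training": {"learning_rate": 0.001}}
run_dir = Path(tempfile.mkdtemp()) / "run_001"
# The commit hash is a placeholder, not a real revision.
manifest_path = snapshot_config(config, "abc1234", run_dir)
restored = json.loads(manifest_path.read_text())
print(restored["code_commit"], restored["config"]["model"]["name"])
```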

Validation & Schema Enforcement

Config validation ensures that configuration files are syntactically correct and semantically valid before a costly training run begins. This prevents runtime failures due to typos, incorrect types, or missing required fields.

Schema enforcement tools like Pydantic or OmegaConf's structured configs allow you to define a strict schema (types, value ranges, optional/mandatory fields) for your configuration. The framework then validates the loaded config against this schema, providing immediate error feedback. This acts as a contract between the configuration and the code, catching errors early and serving as machine-readable documentation for all allowable parameters.
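The contract idea can be illustrated without any framework by using a standard-library dataclass. The `TrainingConfig` schema and its bounds are hypothetical; tools such as Pydantic or OmegaConf offer richer validation than this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical schema for the training section of a config."""

    learning_rate: float
    batch_size: int
    optimizer: str = "adam"

    def __post_init__(self) -> None:
        if isinstance(self.batch_size, bool) or not isinstance(self.batch_size, int):
            raise TypeError("batch_size must be an int")
        if self.batch_size <= 0:
            raise ValueError("batch_size must be positive")
        if not 0.0 < self.learning_rate < 1.0:
            raise ValueError("learning_rate must be in (0, 1)")


def load_training_config(raw: dict) -> TrainingConfig:
    """Build the schema object; unknown or missing keys raise TypeError."""
    return TrainingConfig(**raw)


good = load_training_config({"learning_rate": 0.001, "batch_size": 32})
try:
    load_training_config({"learning_rate": 0.001, "batch_sise": 32})  # typo
except TypeError as err:
    print(f"rejected before training: {err}")
```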

How Configuration Management Works in Practice

Configuration management is the engineering practice of externalizing all tunable parameters and settings from code into structured files or dedicated frameworks to ensure reproducibility, facilitate experimentation, and enable safe deployment.

In practice, configuration management separates an application's logic from its settings. All adjustable parameters—such as model architecture choices, hyperparameters, file paths, and feature flags—are defined in external files, typically in structured formats like YAML, JSON, or TOML. This decoupling allows developers to modify system behavior without altering the core codebase. Frameworks like Hydra extend this concept by enabling hierarchical, composable configurations and dynamic overrides directly from the command line, streamlining complex experimental setups.

The primary workflow involves defining a base configuration schema, then creating specific configuration files or profiles for different environments (development, testing, production) or experiments. During execution, the application loads the specified configuration, injecting parameters into the runtime. This practice is integral to experiment tracking, as the exact configuration used for each training run is logged as immutable metadata. It ensures that any model can be precisely reproduced by re-executing the code with its logged configuration, forming the bedrock of reliable machine learning operations (MLOps).

COMMON FRAMEWORKS AND TOOLS

Configuration Management (Hydra, YAML Config)

Configuration management in machine learning externalizes tunable parameters into structured files, separating code from settings to ensure reproducibility and systematic experimentation.

The Core Principle: Separation of Code and Config

Configuration management enforces a strict separation between a model's algorithmic logic (the code) and its tunable parameters (the config). This principle is foundational for reproducibility and scalable experimentation.

Code defines the model architecture, training loop, and data loading logic.
Config files (e.g., YAML, JSON) specify hyperparameters, file paths, dataset names, and environment flags.

This separation allows the same codebase to be executed with hundreds of different configurations without modification, enabling systematic hyperparameter sweeps and A/B testing. It also ensures that every experiment run is fully documented by its configuration file, a cornerstone of experiment tracking.
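The following sketch shows one unchanged function executed over a small grid of configurations. The search space and the `train` stand-in (which returns a made-up score, not a real metric) are assumptions for illustration only.

```python
from itertools import product


def train(config: dict) -> float:
    """Stand-in for a real training run; returns a made-up score."""
    return round(config["learning_rate"] * config["batch_size"], 6)


# Hypothetical search space: every combination becomes one run config.
search_space = {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [32, 64]}

keys = list(search_space)
runs = []
for values in product(*(search_space[k] for k in keys)):
    config = dict(zip(keys, values))
    # Each run keeps its own config, so results stay traceable to settings.
    runs.append({"config": config, "score": train(config)})

print(len(runs))  # 3 learning rates x 2 batch sizes
```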

YAML: The Standard Configuration Format

YAML (YAML Ain't Markup Language) is the de facto standard for ML configuration files due to its human-readable, hierarchical structure. It supports complex nested dictionaries, lists, and basic data types, making it ideal for organizing model parameters.

Example Structure:

```yaml
model:
  name: "resnet50"
  pretrained: true
training:
  batch_size: 32
  learning_rate: 0.001
  optimizer: "Adam"
data:
  path: "/datasets/cifar10"
```

Key advantages include readability for non-engineers and library support in most programming languages (for example, PyYAML in Python). The hierarchical nature allows logical grouping of parameters for data, model, training, and evaluation.
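For reference, the example structure can be loaded in Python roughly as follows. This assumes PyYAML is installed; the config is embedded as a string only to keep the snippet self-contained.

```python
import yaml  # PyYAML, a third-party package (pip install pyyaml)

CONFIG_TEXT = """
model:
  name: "resnet50"
  pretrained: true
training:
  batch_size: 32
  learning_rate: 0.001
  optimizer: "Adam"
data:
  path: "/datasets/cifar10"
"""

# safe_load parses plain data only and does not construct arbitrary objects.
config = yaml.safe_load(CONFIG_TEXT)

print(config["model"]["name"])
print(type(config["training"]["learning_rate"]).__name__)
```

`safe_load` is generally preferred over `load` for config files because it does not build arbitrary Python objects from the input.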

Hydra: Advanced Composition and Overrides

Hydra is a popular open-source framework from Meta (originally released by Facebook Research) that extends basic YAML configuration with powerful composition and override capabilities. Its primary innovation is the ability to dynamically compose a configuration from multiple sources at runtime.

Config Groups: Organize configurations into directories (e.g., model/resnet.yaml, dataset/cifar.yaml).
Command-Line Overrides: Modify any parameter at launch: python train.py training.batch_size=64 model=vit.
Multirun: Launch sweeps over configs with a single command for hyperparameter tuning.
Instantiating Objects: Directly create Python objects (e.g., optimizer instances) from config definitions.

This makes Hydra exceptionally powerful for managing complex experiments with many interchangeable components, directly feeding into experiment tracking systems.
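Object instantiation from config can be pictured with the simplified stand-in below. Hydra's real helper is `hydra.utils.instantiate`, which reads a `_target_` key naming the class to build; this sketch reimplements only the basic idea with `importlib` and omits features such as nested instantiation. A standard-library class stands in for an optimizer.

```python
import importlib


def instantiate(spec: dict):
    """Create the object named by spec['_target_'], passing other keys as kwargs."""
    module_path, _, attr_name = spec["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_path), attr_name)
    kwargs = {k: v for k, v in spec.items() if k != "_target_"}
    return target(**kwargs)


# datetime.timedelta is used only as a stand-in for a model or optimizer class.
spec = {"_target_": "datetime.timedelta", "hours": 1, "minutes": 30}
obj = instantiate(spec)
print(type(obj).__name__, obj.total_seconds())
```

Because the class path lives in the config, swapping one component for another requires no change to the calling code.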

OmegaConf: Hydra's Configuration Engine

OmegaConf is the underlying library that powers Hydra, providing a unified interface for handling configuration data from YAML files, CLI arguments, and environment variables. It solves common issues with plain YAML dictionaries.

Key features include:

Variable Interpolation: Reference other config values within the file: log_dir: "${data.path}/logs".
Runtime Merging: Seamlessly merge configurations from defaults, YAML files, and command-line inputs.
Structured Configs: Use Python dataclasses to define a strict schema for configurations, enabling type safety and validation.

While often used through Hydra, OmegaConf can also be used on its own as a configuration library. Interpolations are resolved lazily when a value is accessed (or eagerly with OmegaConf.resolve), and structured configs validate types as values are set.

Integration with Experiment Trackers

Configuration management is intrinsically linked to experiment tracking. Tracking platforms such as MLflow and Weights & Biases can record the complete configuration used for a training run as run parameters, typically through an explicit logging call in the training script.

Config Logging: Hydra writes the composed config and the overrides used to the run's output directory, and the training script can forward the resolved config to the tracking server.
Run Comparison: Trackers use logged configurations to enable run comparison, allowing engineers to filter and contrast experiments based on specific hyperparameter values.
Reproducibility: The logged config, combined with the environment snapshot and code version, provides the complete recipe needed to reproduce any past experiment.

This creates a closed loop where configuration defines the experiment, and tracking validates and records its outcome.
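Trackers commonly store parameters as flat key-value pairs, so a nested config is often flattened before logging. The helper below is a hypothetical sketch of that step; the tracker call itself is left out because it depends on the platform in use.

```python
def flatten(config: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into {'a.b.c': value} pairs for logging."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


config = {
    "model": {"name": "resnet50", "pretrained": True},
    "training": {"batch_size": 64, "learning_rate": 0.001},
}
params = flatten(config)
# `params` is what would be handed to the tracker's parameter-logging call.
print(params)
```

Dotted names such as `training.batch_size` also make it straightforward to filter and compare runs by a single hyperparameter.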

Best Practices and Common Patterns

Effective configuration management follows established engineering patterns to prevent complexity and errors.

Hierarchical Organization: Structure configs to mirror your code's architecture (e.g., data, model, training, evaluation).
Environment-Specific Configs: Use separate config files or overrides for local, staging, and production environments to manage different datasets or resource limits.
Sensitive Data Handling: Never store secrets (API keys, passwords) in version-controlled config files. Use environment variables or secret management tools, which can be referenced via interpolation (e.g., api_key: ${oc.env:API_KEY} in OmegaConf).
Schema Validation: Use Pydantic or Structured Configs in Hydra to validate data types and required fields at startup, catching errors before a costly training run begins.
Defaults and Overrides: Maintain a base config.yaml with sensible defaults, allowing specific experiments to override only the necessary parameters.

Configuration Management vs. Related Concepts

A comparison of configuration management tools and practices against adjacent concepts in the ML lifecycle, highlighting their distinct purposes and overlaps.

Feature / Purpose	Configuration Management (Hydra, YAML)	Experiment Tracking (MLflow, W&B)	Hyperparameter Tuning (Optuna, Ray Tune)	Model Registry
Primary Goal	Separate code from config; manage hierarchical settings for runs.	Log, version, and compare runs (params, metrics, artifacts).	Automate the search for optimal hyperparameter values.	Centralize storage, versioning, and stage management of trained models.
Core Artifact	Structured config files (e.g., YAML, JSON).	Run metadata (ID, params, metrics, tags).	Optimization history and best trial configuration.	Model artifact, version, and stage metadata.
Key Output	Resolved, validated configuration for a single run.	Experiment dashboard for run comparison and analysis.	A set of evaluated trials with performance metrics.	A registered, versioned model ready for deployment.

Frequently Asked Questions

Configuration management is the engineering practice of externalizing all tunable parameters and settings from code into structured files or dedicated frameworks. This separation is fundamental for reproducibility, collaboration, and systematic experimentation in machine learning.

Configuration management in machine learning is the systematic practice of externalizing all tunable parameters, settings, and environment variables into declarative files (e.g., YAML, JSON) or dedicated frameworks, thereby separating the model's logic from its configuration. This separation ensures that every aspect of a training run—from hyperparameters and dataset paths to model architecture choices—is explicitly defined, versioned, and reproducible. It directly supports experiment tracking by providing a clear, auditable record of what was executed, enabling precise run comparison and eliminating the ambiguity of hard-coded values scattered throughout scripts.

Related Terms

Configuration management in machine learning is a foundational practice for reproducibility. These related concepts detail the specific tools, file formats, and methodologies that enable the separation of code from configuration.

Hydra

Hydra is an open-source framework from Facebook Research for elegantly configuring complex applications. Its core innovation is dynamic, composable configuration via a command-line interface. Key features include:

Hierarchical configuration through config groups and defaults lists.
Command-line overrides that allow any parameter to be changed without altering source files.
Multirun capability for launching sweeps over configurations from a single command.
Structured configs via Python dataclasses for type safety and IDE support.

It is widely used in ML projects to manage experiments, separating the config.yaml from the train.py script.

YAML (YAML Ain't Markup Language)

YAML is a human-readable data serialization language and the de facto standard for configuration files in machine learning and DevOps. Its structure uses indentation and simple punctuation, making it more intuitive than JSON or XML for complex nested settings. In ML configuration, YAML files typically define:

Hyperparameters (learning rate, batch size, epochs).
Model architecture specifications (layer sizes, activation functions).
Data pipeline paths and transformation parameters.
Training environment settings (device, distributed strategy).

Tools such as Hydra, Kubernetes, and Ansible rely on YAML for declarative configuration management.

OmegaConf

OmegaConf is the underlying configuration library that powers the Hydra framework. It provides a unified API for handling configuration data from multiple sources (YAML files, CLI arguments, environment variables) and merging them seamlessly. Its primary features are:

Variable interpolation (e.g., ${data.path}/train).
Runtime type validation and schema enforcement.
Read/write access to nested configurations using dot notation.
Resolution of references across the configuration tree.

While Hydra provides the workflow, OmegaConf provides the fundamental data structure (DictConfig, ListConfig) that makes dynamic, hierarchical configuration possible in Python.

Configuration Drift

Configuration drift occurs when the actual runtime configuration of a system diverges from its intended, version-controlled specification. In ML, this is a critical failure mode for reproducibility. Causes include:

Manual command-line overrides that are not logged.
Environment variable changes not captured in the experiment tracker.
Implicit defaults from libraries that differ between runs.
Secret or local config files not committed to version control.

Mitigation requires rigorous practices: using structured configs, logging all overrides as run metadata, and employing configuration hashing to detect changes.
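Configuration hashing can be sketched as follows: serialize the resolved config in a canonical form and hash it, so two runs can be compared by fingerprint. The `config_fingerprint` helper is a hypothetical name, and JSON with sorted keys is just one reasonable choice of canonical form.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Hash a canonical JSON form so key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


logged = {"training": {"batch_size": 32, "learning_rate": 0.001}}
# Same settings in a different key order, then a silently changed value.
reordered = {"training": {"learning_rate": 0.001, "batch_size": 32}}
drifted = {"training": {"batch_size": 64, "learning_rate": 0.001}}

print(config_fingerprint(logged) == config_fingerprint(reordered))
print(config_fingerprint(logged) == config_fingerprint(drifted))
```

Storing the fingerprint with each run makes drift visible as a mismatch between the logged hash and the hash of the config actually in effect.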

Structured Configs

Structured Configs refer to the practice of defining configuration schemas using native language constructs like Python dataclasses, Pydantic models, or TypedDicts. This approach moves beyond unstructured YAML dictionaries to provide:

Type safety and validation at edit time and runtime.
IDE autocompletion and inline documentation.
Explicit defaults and optional/required field specification.
Inheritance and composition through object-oriented principles.

Frameworks like Hydra and Pydantic Settings enable structured configs, ensuring that configuration errors are caught early and the config itself becomes a source of documentation.

Environment Variables & .env Files

Environment variables and .env files are a standard method for managing configuration that varies between deployments (development, staging, production) or contains sensitive data. They are used to externalize:

API keys, database passwords, and other secrets.
Service endpoints and resource locations.
Feature flags and operational modes.

In ML, tools like python-dotenv load variables from a .env file into the process environment. Best practice is to treat the .env file as secret (excluded from Git) while committing a .env.example template. Configuration management systems like Hydra can then read these variables into the main config (for example, through OmegaConf's oc.env interpolation), keeping secrets out of YAML files.
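The loading step can be approximated with the standard library, as in the sketch below. It is a deliberately simplified stand-in for what python-dotenv does (no multi-line values, escapes, or variable expansion), and the variable names are hypothetical.

```python
import os


def parse_env_lines(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical .env contents; a real file would be read from disk.
ENV_TEXT = """
# local development settings
APP_ENV=development
TRACKING_URL="http://localhost:5000"
"""

for key, value in parse_env_lines(ENV_TEXT).items():
    # Variables already set in the real environment take precedence.
    os.environ.setdefault(key, value)

print(os.environ["APP_ENV"])
```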

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
