Data profiling is the automated statistical analysis of a dataset to discover and document its structural characteristics, content patterns, data quality issues, and inter-column relationships. It operates at the column and table level, generating metrics such as null counts, value distributions, data types, and uniqueness to create a quantitative portrait of the data's current state. Profiling is distinct from data discovery, which focuses on locating data assets, and is a prerequisite for effective data validation and schema management.
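To make the column-level metrics concrete, the following is a minimal sketch in plain Python, not a reference implementation. The function names `profile_column` and `profile_table`, the choice of metrics, and the sample rows are all illustrative assumptions. It treats `None` as null and assumes every value is hashable.

```python
from collections import Counter


def profile_column(values):
    """Return basic profiling metrics for one column (a list of values).

    Assumption: None marks a missing value and all values are hashable.
    """
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(counts),
        # True when no non-null value repeats (candidate key column).
        "is_unique": len(counts) == len(non_null),
        # Names of the Python types observed among non-null values.
        "types": sorted({type(v).__name__ for v in non_null}),
        # Up to three most frequent values with their counts.
        "top_values": counts.most_common(3),
    }


def profile_table(rows):
    """Profile every column of a table given as a list of dicts."""
    # Collect column names in first-seen order.
    columns = list(dict.fromkeys(key for row in rows for key in row))
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample data, used only for illustration.
rows = [
    {"id": 1, "country": "DE", "age": 34},
    {"id": 2, "country": "FR", "age": None},
    {"id": 3, "country": "DE", "age": 51},
    {"id": 4, "country": None, "age": 34},
]

report = profile_table(rows)
for column, metrics in report.items():
    print(column, metrics)
```

A production profiler would typically add numeric summaries (minimum, maximum, quantiles), pattern detection, and cross-column checks. This sketch covers only null counts, value distributions, observed types, and uniqueness.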




