Poor data curation invalidates billion-molecule screens. The promise of screening vast chemical libraries in silico collapses when the underlying molecular representations, such as SMILES strings or 3D conformers, contain errors or lack critical stereochemical information. AI models then end up optimizing for compounds that do not exist.
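
As a rough illustration of the representation errors described above, the sketch below runs a few cheap syntactic checks on SMILES strings before they enter a screening library. It is a minimal pre-filter under stated assumptions, not a real validator: the function names `smiles_precheck` and `has_stereo_markers` are hypothetical, only the Python standard library is used, and the checks cover bracket, parenthesis, and ring-closure bookkeeping only. Valence errors and stereocenters that are present but unspecified cannot be detected this way and would need a cheminformatics toolkit such as RDKit.

```python
import re
from typing import List

# Ring-closure labels outside bracket atoms: '%' plus two digits, or one digit.
_RING_LABEL = re.compile(r"%\d{2}|\d")
# Bracket atoms such as [C@@H] or [13CH4]; digits inside them are not ring labels.
_BRACKET_ATOM = re.compile(r"\[[^\[\]]*\]")


def smiles_precheck(smiles: str) -> List[str]:
    """Return problems found by cheap syntactic checks (empty list if none)."""
    s = smiles.strip()
    if not s:
        return ["empty string"]

    problems = []

    # Square brackets must come in matching pairs.
    if s.count("[") != s.count("]"):
        problems.append("unbalanced square brackets")

    # Parentheses must balance and never close before they open.
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                break
    if depth != 0:
        problems.append("unbalanced parentheses")

    # Each ring-closure label must occur an even number of times
    # (labels may be reused, so "exactly twice" would be too strict).
    outside = _BRACKET_ATOM.sub("", s)
    counts = {}
    for label in _RING_LABEL.findall(outside):
        number = int(label.lstrip("%"))
        counts[number] = counts.get(number, 0) + 1
    for number in sorted(counts):
        if counts[number] % 2 != 0:
            problems.append(f"unclosed ring label {number}")

    return problems


def has_stereo_markers(smiles: str) -> bool:
    """True if the string carries any tetrahedral (@) or double-bond (/, \\) marker."""
    return any(marker in smiles for marker in ("@", "/", "\\"))
```

An entry such as `c1ccccc` is flagged for an unclosed ring, while `CC(N)C(=O)O` passes the syntax checks but carries no stereo markers. That absence is not an error in itself, since the intended molecule may be achiral or racemic, but it is a signal that the record deserves a closer look before a model treats it as a single well-defined compound.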














