Trigger: Raw clinical database is declared 'clean' and locked in the EDC (e.g., Medidata Rave).
Context Pulled: AI agent retrieves the final raw datasets, the annotated Case Report Form (aCRF), and the study protocol from connected systems (EDC, eTMF).
Agent Action:
- Mapping Suggestion: Uses the aCRF and protocol to propose initial SDTM domain mappings (e.g., which raw variables become --TESTCD in LB). It flags ambiguous mappings for human review.
- Conformance Check: Validates proposed datasets against CDISC SDTM IG rules and any study-specific implementation guide, checking for:
  - Variable naming and order
  - Controlled terminology
  - Presence of required domains (DM, VS, LB, etc.)
  - --SEQ generation logic
- Traceability: Automatically generates a draft Define-XML (define.xml) skeleton with variable-level metadata and origins.
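The conformance checks listed above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual implementation: the rule tables (REQUIRED_DOMAINS, EXPECTED_ORDER, ALLOWED_TERMS), the Finding record, and check_conformance are hypothetical names, the rule content is heavily abbreviated, and the --SEQ check only verifies uniqueness within a subject as a stand-in for the full generation logic.

```python
from dataclasses import dataclass

# Hypothetical, abbreviated rule inputs. A real run would load these from the
# SDTM IG version and controlled terminology package the study actually uses.
REQUIRED_DOMAINS = {"DM", "VS", "LB"}
EXPECTED_ORDER = {
    "LB": ["STUDYID", "DOMAIN", "USUBJID", "LBSEQ", "LBTESTCD", "LBORRES"],
}
ALLOWED_TERMS = {("LB", "LBTESTCD"): {"ALT", "AST", "GLUC"}}


@dataclass
class Finding:
    severity: str  # "error" or "warning"
    domain: str
    message: str


def check_conformance(datasets):
    """Run the sketch checks. `datasets` maps a domain code to a list of row dicts."""
    findings = []

    # Presence of required domains.
    for domain in sorted(REQUIRED_DOMAINS - set(datasets)):
        findings.append(Finding("error", domain, "required domain is missing"))

    for domain, rows in datasets.items():
        if not rows:
            continue

        # Variable naming and order, judged from the first row's keys.
        expected = EXPECTED_ORDER.get(domain)
        if expected is not None and list(rows[0]) != expected:
            findings.append(
                Finding("error", domain, "variable names or order differ from spec")
            )

        # Controlled terminology: report each distinct non-standard value once.
        for (rule_domain, var), allowed in ALLOWED_TERMS.items():
            if rule_domain != domain:
                continue
            bad = sorted({r[var] for r in rows if var in r and r[var] not in allowed})
            for value in bad:
                findings.append(
                    Finding("warning", domain, f"{var}={value!r} is not in controlled terminology")
                )

        # --SEQ must be unique within a subject inside a domain.
        seq_var = f"{domain}SEQ"
        seen = set()
        for r in rows:
            if seq_var not in r:
                continue
            key = (r.get("USUBJID"), r[seq_var])
            if key in seen:
                findings.append(
                    Finding("error", domain, f"duplicate {seq_var} {key[1]} for {key[0]}")
                )
            seen.add(key)

    return findings
```

Keeping the rules in plain data tables rather than in code is one way to let a study-specific implementation guide extend or override the defaults without touching the checking logic.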
System Update: A validation report is posted to the study team's collaboration portal (e.g., Jira, SharePoint), listing passed checks, warnings (e.g., non-standard terms), and critical errors (e.g., missing USUBJID). The agent creates a ticket for each flagged item and assigns it to the relevant programmer.
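One way to picture the report and ticket step is the sketch below. It assumes check results arrive as simple dicts and that ticket ownership is decided per SDTM domain; DOMAIN_OWNERS, DEFAULT_OWNER, build_report, and build_tickets are hypothetical names, and the ticket dicts are generic payloads rather than the request format of any particular tracker.

```python
from collections import defaultdict

# Hypothetical routing table: which programmer owns which SDTM domain.
DOMAIN_OWNERS = {"DM": "programmer_a", "LB": "programmer_b"}
DEFAULT_OWNER = "lead_programmer"

STATUSES = ("passed", "warning", "critical")


def build_report(checks):
    """Group check results by status.

    `checks` is a list of dicts with keys: rule, domain, status, detail.
    """
    grouped = defaultdict(list)
    for check in checks:
        grouped[check["status"]].append(check)
    # Always return all three sections, even when one is empty.
    return {status: grouped.get(status, []) for status in STATUSES}


def build_tickets(report):
    """Create one generic ticket payload per warning or critical finding."""
    tickets = []
    for status in ("critical", "warning"):  # critical items first
        for check in report[status]:
            tickets.append(
                {
                    "title": f"[{status.upper()}] {check['domain']}: {check['rule']}",
                    "description": check["detail"],
                    "assignee": DOMAIN_OWNERS.get(check["domain"], DEFAULT_OWNER),
                    "priority": "high" if status == "critical" else "medium",
                }
            )
    return tickets
```

Passed checks stay in the report for the audit trail but produce no tickets, so the team's queue contains only items that need action.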
Human Review Point: Lead Statistical Programmer reviews the mapping suggestions and validation report, approving or overriding before final SDTM programming begins.
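The approve-or-override gate could be modelled along these lines. This is only a sketch under stated assumptions: MappingSuggestion, ReviewDecision, and apply_review are hypothetical names, and the rule that a suggestion without an explicit decision stays pending is an illustrative design choice, not something the workflow above specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MappingSuggestion:
    raw_variable: str
    sdtm_domain: str
    sdtm_variable: str
    ambiguous: bool = False  # set by the agent when it is unsure


@dataclass
class ReviewDecision:
    raw_variable: str
    approved: bool
    override_variable: Optional[str] = None  # reviewer's replacement target


def apply_review(suggestions, decisions):
    """Merge agent suggestions with the lead programmer's decisions.

    Returns (final, pending): `final` maps each raw variable to its agreed
    (domain, variable) target; `pending` lists raw variables still unresolved.
    """
    by_variable = {d.raw_variable: d for d in decisions}
    final, pending = {}, []
    for s in suggestions:
        decision = by_variable.get(s.raw_variable)
        if decision is None:
            # No explicit human decision yet: never auto-accept.
            pending.append(s.raw_variable)
        elif decision.approved:
            final[s.raw_variable] = (s.sdtm_domain, s.sdtm_variable)
        elif decision.override_variable:
            final[s.raw_variable] = (s.sdtm_domain, decision.override_variable)
        else:
            # Rejected without a replacement: still needs a mapping.
            pending.append(s.raw_variable)
    return final, pending
```

Final SDTM programming would then start only once the pending list is empty, which keeps every mapping traceable to a human decision.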