AI integration for custodian identification typically connects to three key platform surfaces: the custodian management module, the data processing pipeline, and the review workspace. The AI agent ingests communication metadata (from email servers, MS 365, Slack exports) and content, then analyzes patterns like communication volume, centrality in networks, topic clusters, and keyword frequency. Results are pushed back to the platform as a ranked custodian list, often creating or updating custom objects (e.g., Custodian records in Relativity) with AI-generated fields for RelevanceScore, KeyTopics, and CommunicationNetworkRole.
AI for Custodian Identification and Ranking

Where AI Fits in Custodian Identification
A technical blueprint for integrating AI into the custodian identification workflow within e-discovery platforms like Relativity, Everlaw, DISCO, and Nuix.
Implementation involves a background service that polls the platform's API for new data collections or uses webhooks triggered by processing completion. For each custodian candidate, the AI runs a multi-step analysis: 1) Network Analysis to map who communicates with whom, 2) Content Scoring based on case-relevant terms and concepts, and 3) Temporal Analysis to flag activity around key dates. The output is a structured payload sent via the platform's REST API to update custodian records, often triggering platform-native workflows for legal hold issuance or collection prioritization. This reduces the manual analyst work from days of spreadsheet analysis to hours of AI-assisted review.
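As a rough illustration of how the three analysis steps might be combined into one ranking, the sketch below takes per-step scores that are assumed to be already normalized to the 0 to 1 range. The weights, the `rank_custodians` helper, and the sample addresses are hypothetical and are not taken from any platform API.

```python
from typing import Dict, List, Tuple

# Hypothetical weights for the three steps; calibrate them per matter.
WEIGHTS = {"network": 0.4, "content": 0.4, "temporal": 0.2}


def rank_custodians(sub_scores: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Combine per-step scores (each assumed to be in [0, 1]) into one ranked list."""
    ranked = []
    for custodian, scores in sub_scores.items():
        # A missing step counts as 0 so incomplete analysis never inflates a rank
        total = sum(WEIGHTS[step] * scores.get(step, 0.0) for step in WEIGHTS)
        ranked.append((custodian, round(total, 4)))
    # Highest combined score first; ties broken by name for stable output
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


candidates = {
    "alice@example.com": {"network": 0.9, "content": 0.7, "temporal": 1.0},
    "bob@example.com": {"network": 0.3, "content": 0.8, "temporal": 0.0},
    "carol@example.com": {"network": 0.5, "content": 0.2, "temporal": 0.5},
}
ranking = rank_custodians(candidates)
for custodian, score in ranking:
    print(f"{custodian}: {score}")
```

A plain weighted sum is only one option. It is used here because each weight stays easy to explain when a reviewer asks why a custodian was ranked where they were.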
Rollout should start with a pilot on a single matter, comparing AI rankings against a senior reviewer's manual list to calibrate scoring thresholds. Governance is critical: all AI-generated scores and tags must be auditable, with the underlying analysis (e.g., "why was this custodian ranked #1?") accessible via a drill-down report or linked annotation. Integrate a human-in-the-loop approval step before the ranked list updates the production custodian module, ensuring the legal team maintains final control. This phased approach de-risks the integration while demonstrating concrete time savings in the critical early phase of discovery.
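One simple way to run the pilot comparison is to measure how much of the AI's top-k list also appears on the senior reviewer's list, and which reviewer picks the AI missed. The sketch below assumes both lists use the same custodian identifiers; the helper names and sample data are illustrative only.

```python
from typing import List


def precision_at_k(ai_ranking: List[str], reviewer_list: List[str], k: int) -> float:
    """Fraction of the AI's top-k custodians that the reviewer also listed."""
    top_k = ai_ranking[:k]
    if not top_k:
        return 0.0
    reviewer_set = set(reviewer_list)
    return sum(1 for c in top_k if c in reviewer_set) / len(top_k)


def missed_by_ai(ai_ranking: List[str], reviewer_list: List[str], k: int) -> List[str]:
    """Reviewer-selected custodians that do not appear in the AI's top k."""
    top_k = set(ai_ranking[:k])
    return [c for c in reviewer_list if c not in top_k]


ai_ranking = ["alice", "bob", "carol", "dave", "erin"]
reviewer_list = ["alice", "carol", "frank"]

p3 = precision_at_k(ai_ranking, reviewer_list, 3)
missed = missed_by_ai(ai_ranking, reviewer_list, 3)
print(f"precision@3 = {p3:.2f}, missed = {missed}")
```

Custodians in the missed list are the most useful calibration signal, since they show where the scoring would have left someone out of an early hold.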
Platform-Specific Integration Surfaces
Targeting Native Custodian Objects
AI for custodian ranking integrates directly with the custodian or person management modules within e-discovery platforms. In Relativity, this means enriching the Custodian object or related custom objects via the REST API, adding fields like AI_RelevanceScore, AI_CommunicationVolume, and AI_KeyTopicTags. For Everlaw, integration focuses on the People feature, using its API to append analysis results as properties or link to smart tags.
Key surfaces include:
- Custodian data grids: Inject AI-generated scores and tags for sortable, filterable columns.
- Custodian reports: Automate generation of custodian heatmaps and prioritization lists.
- Hold notification workflows: Trigger communications based on AI-ranked tiers, integrating with platform email or export features.
The goal is to make AI outputs native, searchable, and actionable within the platform's existing custodian workflow, avoiding external reports that create silos.
High-Value AI Use Cases for Custodian Identification and Ranking
Custodian identification is a critical, time-intensive phase in e-discovery. These AI integration patterns connect directly to platform custodian management modules to analyze communication patterns, content relevance, and organizational roles, outputting prioritized lists and risk scores.
Communication Network Analysis
AI analyzes email To/CC/BCC fields, chat participants, and meeting invites to map communication volume and centrality. Integrates with the platform's custodian object to auto-populate relationship strength and influence scores, highlighting key hubs beyond the org chart.
Content Relevance Scoring
LLMs evaluate custodian document sets against case issues and keywords. Scores for topic prevalence, sentiment shifts, and privilege likelihood are written back as custom fields in the custodian record, enabling sortable, data-driven prioritization for legal hold.
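The LLM scoring itself depends on the model and prompt in use, so the sketch below substitutes a much simpler keyword baseline: the share of a custodian's documents that mention at least one case term. It is a stand-in for the LLM evaluation described above, not an implementation of it, and the `term_relevance` helper and sample documents are hypothetical.

```python
import re
from collections import Counter
from typing import List, Tuple


def term_relevance(documents: List[str], case_terms: List[str]) -> Tuple[float, Counter]:
    """Return (share of documents with at least one case term, per-term document counts)."""
    if not documents:
        return 0.0, Counter()
    # Whole-word, case-insensitive matching for each term
    patterns = {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in case_terms
    }
    hits = Counter()
    matching_docs = 0
    for text in documents:
        found = [term for term, pattern in patterns.items() if pattern.search(text)]
        if found:
            matching_docs += 1
            hits.update(found)
    return matching_docs / len(documents), hits


docs = [
    "Pricing update for the merger is attached.",
    "Lunch on Friday?",
    "Please hold the pricing deck until legal signs off.",
    "Merger timeline moved to Q3.",
]
score, term_hits = term_relevance(docs, ["merger", "pricing"])
print(score, dict(term_hits))
```

A baseline like this is also useful during calibration: if the LLM score and the keyword score disagree sharply for a custodian, that record is worth a manual look.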
Role & Tenure Enrichment
AI agents cross-reference custodian names with HRIS data (via secure APIs) to append job function, project history, and employment timeline. This context is injected into the platform to flag custodians in sensitive roles or during critical periods.
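The "critical periods" check reduces to a date-range overlap between an employment span and the period of interest. The sketch below assumes an HRIS export has already been loaded into a dictionary keyed by email; the record layout and field names are hypothetical.

```python
from datetime import date
from typing import Optional


def employed_during(start: date, end: Optional[date],
                    period_start: date, period_end: date) -> bool:
    """True if the employment span overlaps the period (end=None means still employed)."""
    effective_end = end or date.max
    return start <= period_end and effective_end >= period_start


# Hypothetical HRIS export keyed by email address
hr_records = {
    "alice@example.com": {"title": "VP Sales", "start": date(2018, 3, 1), "end": None},
    "bob@example.com": {"title": "Analyst", "start": date(2015, 1, 5), "end": date(2019, 6, 30)},
}
critical = (date(2020, 1, 1), date(2020, 12, 31))

enriched = {
    email: {
        **record,
        "active_in_critical_period": employed_during(record["start"], record["end"], *critical),
    }
    for email, record in hr_records.items()
}
for email, record in enriched.items():
    print(email, record["title"], record["active_in_critical_period"])
```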
Risk-Based Custodian Tiering
A composite AI model synthesizes network, content, and role data to assign custodians to Tier 1 (Critical), Tier 2 (Relevant), or Tier 3 (Peripheral). Results populate a platform dashboard or custom object, driving collection strategy and resource allocation.
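Once a composite score exists, tier assignment can be as simple as two thresholds. The cut-offs below (0.75 and 0.40) are placeholder assumptions that a pilot would need to calibrate; only the tier names come from the description above.

```python
def assign_tier(score: float, critical: float = 0.75, relevant: float = 0.40) -> str:
    """Map a composite score in [0, 1] to one of the three tiers."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= critical:
        return "Tier 1 (Critical)"
    if score >= relevant:
        return "Tier 2 (Relevant)"
    return "Tier 3 (Peripheral)"


composite_scores = {"alice": 0.84, "bob": 0.44, "carol": 0.38}
tiers = {name: assign_tier(score) for name, score in composite_scores.items()}
print(tiers)
```

Keeping the thresholds as explicit parameters makes it straightforward to record, per matter, which cut-offs produced a given tier list.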
Dynamic Custodian List Maintenance
As new data is ingested, AI continuously re-evaluates custodian rankings. Webhooks or platform event handlers trigger re-scoring, and the custodian management interface is updated automatically, ensuring the legal team works from a current, evidence-based list.
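A re-scoring handler typically needs to tell the legal team what changed, not just overwrite the list. The sketch below compares rank positions before and after a re-score and reports custodians who are new or who climbed by at least `min_jump` places. The function names and the threshold are illustrative assumptions; how the handler is triggered depends on the platform.

```python
from typing import Dict, List, Optional, Tuple


def rank_positions(scores: Dict[str, float]) -> Dict[str, int]:
    """Map each custodian to a 1-based rank, highest score first."""
    ordered = sorted(scores, key=lambda c: (-scores[c], c))
    return {custodian: i + 1 for i, custodian in enumerate(ordered)}


def rising_custodians(old_scores: Dict[str, float], new_scores: Dict[str, float],
                      min_jump: int = 2) -> List[Tuple[str, Optional[int], int]]:
    """Return (custodian, old_rank, new_rank) for newcomers and large upward moves."""
    old_rank = rank_positions(old_scores)
    new_rank = rank_positions(new_scores)
    alerts = []
    for custodian, rank in new_rank.items():
        previous = old_rank.get(custodian)
        if previous is None or previous - rank >= min_jump:
            alerts.append((custodian, previous, rank))
    return sorted(alerts, key=lambda alert: alert[2])


old = {"alice": 0.9, "bob": 0.6, "carol": 0.5, "dave": 0.2}
new = {"alice": 0.9, "bob": 0.6, "carol": 0.5, "dave": 0.8, "erin": 0.1}
alerts = rising_custodians(old, new)
print(alerts)
```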
Integration with Legal Hold Modules
Prioritized custodian lists and risk scores are formatted and pushed directly into the platform's legal hold issuance workflow (e.g., Relativity Legal Hold, Everlaw's Custodian Manager). This automates the creation of hold groups and tracks custodian responsiveness.
Example AI-Powered Custodian Workflows
These workflows demonstrate how AI agents can be integrated into e-discovery platforms to automate the identification, ranking, and management of custodians. Each pattern connects to platform APIs for data ingestion, analysis, and result output, creating a closed-loop system that reduces manual investigation time from weeks to days.
Trigger: A new matter is created in the e-discovery platform (e.g., Relativity, Everlaw) and a data source (like an Office 365 tenant) is staged for collection.
Workflow:
- An AI agent is triggered via a platform webhook or scheduled job.
- The agent queries the platform's API for the matter's metadata and uses system connectors to pull communication logs (Exchange Online, Teams, Slack exports) before full collection.
- Using graph analysis and LLM-powered pattern recognition, the agent analyzes:
  - Communication Volume & Centrality: Who sends/receives the most messages?
  - Topic Clustering: Which individuals are central to discussions about key case concepts (e.g., "merger," "pricing")?
  - Temporal Analysis: Who was active during critical event periods?
- The agent generates a ranked list of custodians with confidence scores and supporting rationale.
- It creates Custom Objects or populates a custodian management worksheet within the platform via API, pre-populating fields like Custodian Name, Employee ID, Data Source, Relevance Score, and Key Rationale.
Human Review Point: The legal team reviews the AI-generated list in the platform's UI, adjusting the preservation order and collection scope before issuing legal holds.
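The temporal analysis step in this workflow can be approximated by counting each sender's messages that fall within a window around the matter's key dates. The sketch below assumes messages have been reduced to (sender, sent date) pairs; the seven-day window and the sample data are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Iterable, List, Tuple


def activity_near_key_dates(messages: Iterable[Tuple[str, date]],
                            key_dates: List[date],
                            window_days: int = 7) -> Counter:
    """Count messages per sender sent within window_days of any key date."""
    window = timedelta(days=window_days)
    counts = Counter()
    for sender, sent_on in messages:
        if any(abs(sent_on - key) <= window for key in key_dates):
            counts[sender] += 1
    return counts


messages = [
    ("alice", date(2021, 5, 3)),
    ("alice", date(2021, 5, 9)),
    ("bob", date(2021, 5, 20)),
    ("bob", date(2021, 8, 1)),
    ("carol", date(2021, 4, 24)),
]
key_dates = [date(2021, 5, 1)]  # e.g. a hypothetical announcement date

near = activity_near_key_dates(messages, key_dates)
print(dict(near))
```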
Implementation Architecture: Data Flow and APIs
A production-ready architecture for ingesting communication data, applying AI analysis, and outputting prioritized custodian lists directly into your e-discovery platform's management modules.
The integration connects to communication data sources—typically Microsoft 365 Exchange Online, Google Workspace, Slack Enterprise Grid, or on-premises email archives—via their respective APIs or secure data exports. A dedicated ingestion service pulls metadata (To, From, CC, BCC, Timestamps) and content, normalizing it into a unified schema. This raw corpus is then processed by an AI analysis pipeline that performs entity resolution (matching email addresses to known employee records), communication graph analysis (identifying central nodes and clusters), and content salience scoring (using LLMs to flag messages containing key case terms, sensitive topics, or urgent sentiment).
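Entity resolution often starts with something modest: normalizing raw header values and looking them up in a table of known aliases. The sketch below uses the standard library's `email.utils.parseaddr` to strip display names; the alias table, employee IDs, and the `resolve` helper are hypothetical, and real matters usually need fuzzier matching on top of this.

```python
from email.utils import parseaddr
from typing import Dict, Iterable, List, Set, Tuple


def normalize_address(raw: str) -> str:
    """Strip any display name and lower-case the address."""
    _, addr = parseaddr(raw.strip())
    return (addr or raw).strip().lower()


def resolve(addresses: Iterable[str],
            alias_map: Dict[str, str]) -> Tuple[Dict[str, Set[str]], List[str]]:
    """Group addresses by employee ID and collect the ones that match no known alias."""
    resolved: Dict[str, Set[str]] = {}
    unresolved: List[str] = []
    for raw in addresses:
        addr = normalize_address(raw)
        employee_id = alias_map.get(addr)
        if employee_id is None:
            unresolved.append(addr)
        else:
            resolved.setdefault(employee_id, set()).add(addr)
    return resolved, unresolved


# Hypothetical alias table built from HR or directory data
alias_map = {
    "a.smith@corp.com": "E100",
    "asmith@corp.com": "E100",
    "bob@corp.com": "E200",
}
addresses = [
    "Alice Smith <A.Smith@corp.com>",
    "asmith@corp.com ",
    "BOB@corp.com",
    "unknown@vendor.com",
]
resolved, unresolved = resolve(addresses, alias_map)
print(resolved, unresolved)
```

Keeping the unresolved addresses rather than dropping them matters here, because external counterparties and personal accounts are often exactly the people a reviewer wants surfaced.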
Results are structured into a custodian profile object for each individual, containing calculated metrics like: message_volume, centrality_score, topic_relevance_score, and timeline_coverage. These profiles are pushed via the e-discovery platform's API—such as the Relativity Custodian Manager API, Everlaw's People API, or DISCO's Custodians endpoint—to create or update custodian records. The AI system can auto-populate fields for Collection Priority (High/Medium/Low), Suggested Search Terms, and Notes with the rationale for the ranking, turning a manual, spreadsheet-driven process into an API-driven workflow that updates in near real-time as new data is ingested.
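A custodian profile of this shape could be modeled as a small dataclass and serialized to JSON before being sent to the platform. In the sketch below the metric names follow the text above, but the `CustodianProfile` class, the priority thresholds, and the payload layout are assumptions; the field names a given platform expects will differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CustodianProfile:
    email: str
    message_volume: int
    centrality_score: float
    topic_relevance_score: float
    timeline_coverage: float
    rationale: str = ""

    def collection_priority(self) -> str:
        """Derive High/Medium/Low from the mean of the three normalized scores."""
        combined = (
            self.centrality_score + self.topic_relevance_score + self.timeline_coverage
        ) / 3
        if combined >= 0.66:  # hypothetical cut-offs
            return "High"
        if combined >= 0.33:
            return "Medium"
        return "Low"

    def to_payload(self) -> dict:
        """Flatten the profile and attach the derived priority."""
        body = asdict(self)
        body["collection_priority"] = self.collection_priority()
        return body


profile = CustodianProfile(
    email="alice@example.com",
    message_volume=1240,
    centrality_score=0.82,
    topic_relevance_score=0.71,
    timeline_coverage=0.9,
    rationale="Hub between sales and finance during the review period",
)
payload_json = json.dumps(profile.to_payload(), sort_keys=True, indent=2)
print(payload_json)
```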
For governance, all AI-generated scores and recommendations are logged with versioning and audit trails, allowing legal teams to review the rationale for a custodian's rank. The system can be configured to operate in an assistive mode, requiring a reviewer's approval before updating platform records, or a fully automated mode for low-risk, high-volume matters. This architecture ensures the AI acts as a force multiplier for legal teams, moving custodian identification from a multi-day, manual analysis task to a process that delivers a continuously refined, data-driven priority list within hours of data receipt.
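The two operating modes and the audit requirement can be sketched together: every score becomes a versioned audit record, and a gate decides whether that record may be written to the platform. The record layout, the `model_version` string, and the mode names used as string constants are assumptions for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(custodian: str, score: float, rationale: str,
                 model_version: str, approved_by: Optional[str] = None) -> dict:
    """Build an audit entry with a checksum over its own contents."""
    body = {
        "custodian": custodian,
        "score": score,
        "rationale": rationale,
        "model_version": model_version,
        "approved_by": approved_by,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
    # The checksum lets a later reviewer detect edits to the stored record
    serialized = json.dumps(body, sort_keys=True).encode("utf-8")
    body["checksum"] = hashlib.sha256(serialized).hexdigest()
    return body


def may_write_to_platform(record: dict, mode: str) -> bool:
    """Assistive mode needs a named approver; automated mode does not."""
    if mode == "automated":
        return True
    if mode == "assistive":
        return record["approved_by"] is not None
    raise ValueError(f"unknown mode: {mode}")


pending = audit_record("alice@example.com", 0.84, "High centrality", "ranker-v1")
approved = audit_record("alice@example.com", 0.84, "High centrality", "ranker-v1",
                        approved_by="senior.reviewer@example.com")
print(may_write_to_platform(pending, "assistive"), may_write_to_platform(approved, "assistive"))
```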
Code and Payload Examples
Analyzing Email and Chat Networks
This Python example uses the Relativity REST API to fetch communication metadata, then applies network analysis to identify central custodians. The script calculates metrics like betweenness centrality and communication volume to score each custodian's potential importance.
```python
import requests
import networkx as nx
import pandas as pd

# Fetch communication data from Relativity.
# The endpoint path and response shape below are illustrative; confirm them
# against the API documentation for your Relativity version.
relativity_api_url = "https://your-instance.relativity.com/Relativity.REST/api/"
auth_token = "YOUR_OAUTH_TOKEN"
workspace_id = 123456

# Query for email metadata (From, To, CC, Date)
query_payload = {
    "condition": "('Document Type' EQUALS 'Email')",
    "fields": ["Control Number", "Extracted Email From", "Extracted Email To",
               "Extracted Email CC", "Date Sent"],
    "length": 10000,
}

response = requests.post(
    f"{relativity_api_url}workspace/{workspace_id}/documents/query",
    headers={"Authorization": f"Bearer {auth_token}", "Content-Type": "application/json"},
    json=query_payload,
    timeout=60,
)
response.raise_for_status()  # fail fast on authentication or query errors

# Build a network graph; each edge weight counts messages between two people
G = nx.Graph()
for doc in response.json()["Objects"]:
    sender = (doc.get("Extracted Email From") or "").strip()
    if not sender:
        continue
    # To/CC can be missing or null, so default to an empty string before splitting
    recipients = ((doc.get("Extracted Email To") or "").split(";")
                  + (doc.get("Extracted Email CC") or "").split(";"))
    for recipient in recipients:
        recipient = recipient.strip()
        if not recipient or recipient == sender:  # skip blanks and self-addressed mail
            continue
        if G.has_edge(sender, recipient):
            G[sender][recipient]["weight"] += 1
        else:
            G.add_edge(sender, recipient, weight=1)

# betweenness_centrality interprets its weight argument as a distance, so use
# the inverse of the message count: frequent correspondents are "closer"
for _, _, data in G.edges(data=True):
    data["distance"] = 1.0 / data["weight"]

# Calculate centrality and volume scores
centrality_scores = nx.betweenness_centrality(G, weight="distance")
volume_scores = dict(G.degree(weight="weight"))
max_volume = max(volume_scores.values(), default=0) or 1  # avoid division by zero

# Combine into a custodian ranking DataFrame
df_ranking = pd.DataFrame(
    [
        {
            "Custodian": custodian,
            "Betweenness_Centrality": centrality_scores[custodian],
            "Communication_Volume": volume_scores[custodian],
            "Score": centrality_scores[custodian] * 0.7
                     + (volume_scores[custodian] / max_volume) * 0.3,
        }
        for custodian in G.nodes()
    ],
    columns=["Custodian", "Betweenness_Centrality", "Communication_Volume", "Score"],
).sort_values("Score", ascending=False)

print(df_ranking.head(10))
```
This analysis identifies custodians who act as communication hubs or bridges between teams, who are often critical for legal hold.
Realistic Time Savings and Operational Impact
This table illustrates the operational impact of integrating AI into the custodian identification workflow within platforms like Relativity, Everlaw, DISCO, and Nuix. It compares manual processes against AI-assisted workflows, showing realistic time savings and improvements in accuracy and strategic focus.
| Workflow Stage | Manual / Traditional Process | AI-Assisted Process | Impact & Implementation Notes |
|---|---|---|---|
| Initial Custodian List Generation | Manual compilation from HR directories and interview notes over 2-3 days | AI analyzes org charts, communication metadata, and content to propose a ranked list in 2-4 hours | Reduces foundational legwork; output is a CSV or JSON for import into platform custodian modules |
| Communication Pattern Analysis | Manual review of sample email threads to map relationships (weeks) | AI models analyze entire corpus to map communication frequency, centrality, and topic clusters (hours) | Uncovers hidden key players and informal networks not visible in org charts |
| Relevance Scoring & Prioritization | Subjective ranking based on custodian title and initial interviews | AI scores custodians based on volume of relevant comms, keyword density, and connection to key issues | Creates a data-driven priority queue for legal hold and collection, reducing collection scope by 20-40% |
| Legal Hold Notice Drafting | Generic notices drafted manually for all custodians | AI generates personalized notice summaries referencing their likely relevant data types and preservation duties | Increases compliance and understanding; integrates with platform's notice tracking via API |
| Collection Scope Definition | Broad collection mandates based on custodian role | AI recommends specific date ranges, data sources (email, Slack, cloud drives), and search terms per custodian | Focuses collection efforts, cutting downstream processing and review volume significantly |
| Integration with Platform Custodian Module | Manual data entry of custodian details and manual tagging | Automated API sync of AI-generated profiles, scores, and recommended tags into platform custodian objects | Ensures a single source of truth; enables dynamic reporting and workflow triggers based on custodian rank |
| Ongoing Custodian Re-ranking | Static list; re-evaluation only upon new manual discovery | Continuous re-scoring as new data is ingested, with alerts for custodians rising in relevance | Maintains investigation agility; implemented via platform event handlers or scheduled scripts |
Governance, Security, and Phased Rollout
A secure, phased implementation ensures AI-driven custodian analysis enhances—rather than disrupts—existing legal hold and collection workflows.
Implementation begins by connecting to the e-discovery platform's custodian management module via its API (e.g., Relativity's Custodian object, Everlaw's Custodians endpoint). The AI agent ingests communication metadata—From/To/CC, Date, Subject—and content from a controlled, pre-processed dataset. It analyzes patterns like communication volume, centrality in networks, and topic relevance to key issues, outputting a ranked list with confidence scores and supporting evidence. These results are written back as custom fields (e.g., AI_Custodian_Rank, AI_Key_Topics) or to a dedicated reporting object, never overwriting existing legal team classifications.
Security is paramount. The AI service operates within the same secure environment as the e-discovery platform, with all data in transit and at rest encrypted. Access is governed by the platform's native RBAC; the AI only processes data the authenticated user can already see. All AI interactions—queries, model calls, result writes—are logged to a dedicated audit trail, creating a defensible record of the automated analysis for judicial or regulatory scrutiny. For highly sensitive matters, a human-in-the-loop approval step can be configured, where the AI's custodian list is presented in a staging area for a senior reviewer or case manager to approve before promotion to the live custodian list.
A phased rollout minimizes risk. Phase 1 (Pilot): The AI runs in a parallel, non-production workspace on a closed set of historical matters. The legal team compares its rankings against known outcomes to calibrate confidence thresholds. Phase 2 (Assisted): The AI is enabled for active matters but its outputs are presented as "recommendations" alongside manual custodian lists within the platform's UI, allowing teams to build trust. Phase 3 (Integrated): For validated workflows, the AI automatically updates custodian priority scores and tags, triggering platform-native alerts or workflow rules to expedite collection. This controlled approach turns a powerful capability into a reliable, governed component of the legal process.
Frequently Asked Questions
Practical questions for legal and technical teams implementing AI to identify and prioritize key custodians within e-discovery platforms like Relativity, Everlaw, DISCO, and Nuix.
The AI agent ingests and analyzes multiple data streams to build a comprehensive communication and content profile for each potential custodian:
- Communication Metadata: From email headers, chat logs (Slack, Teams), and calendar invites to map To/From/CC patterns, frequency, and timing.
- Content Analysis: The body of emails, chat messages, and documents to identify topics discussed, project names, technical jargon, and sentiment.
- Platform-Specific Data: Native fields from the e-discovery platform, such as:
  - Relativity: Custodian Manager fields, Email Threading results, DtSearch index metadata.
  - Everlaw: Custodian object properties, Communication Analysis data.
  - DISCO: Custodian attributes from the processing engine.
  - Nuix: People entities identified during processing.
- External HR Data (if integrated): Job titles, departments, and tenure from systems like Workday to enrich the analysis.
The agent uses this data to calculate metrics like communication centrality, topic authority, and temporal relevance to the matter's key dates.

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.