
AI Integration with Eyefinity Speech-to-Text

Add advanced speech recognition to Eyefinity for real-time exam transcription, voice-based record search, and accessibility features using cloud STT APIs and audio capture hooks.


ARCHITECTURE AND IMPLEMENTATION

Where Speech-to-Text Fits in Eyefinity Workflows

Integrating speech-to-text into Eyefinity transforms manual dictation and note entry into automated, structured data capture, directly within existing clinical and administrative workflows.

Speech-to-text (STT) integration connects to Eyefinity at three primary surfaces: the clinical documentation module for SOAP notes and exam findings, the patient communication hub for call logging and message intake, and the optical sales/order entry interface for hands-free product specification. Instead of a standalone dictation app, the integration uses Eyefinity's API hooks—like the Note creation endpoint and CommunicationLog object—to inject transcribed text directly into the relevant patient record or workflow queue. This means a provider's spoken exam notes become a draft SOAP note in the chart, a front-desk call about frame availability is logged as a searchable activity, and an optician's voice commands during a fitting populate the order form, all without switching applications.

Implementation requires a secure, cloud-based STT service (like Azure Speech, Google Speech-to-Text, or AWS Transcribe) configured with a custom vocabulary built from Eyefinity's optical and clinical terminology. Audio is captured via the practice's existing hardware—examination room PCs, front-desk headsets, or mobile devices—and streamed to the STT service. The resulting transcript is then processed by a lightweight orchestration layer that maps the text to the correct Eyefinity data model: identifying the patient context from the audio session, applying structured formatting based on note templates (e.g., pulling OD/OS measurements into separate fields), and calling the appropriate Eyefinity API to create or update the record. For sensitive PHI, audio streams and transcripts should be ephemeral, with final data persisted only within Eyefinity's audit-trailed environment.
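The field-mapping step described above can be illustrated with a minimal sketch. It assumes the transcript renders refractions in the common "OD: sphere cylinder x axis" form; the function name `parse_refraction` and the output keys are hypothetical and are not Eyefinity field names.

```python
import re

# Matches e.g. "OD: -1.25 -0.50 x 180" or "OS +0.75 -1.00 x 90".
REFRACTION = re.compile(
    r"\b(?P<eye>OD|OS)\b:?\s*"
    r"(?P<sphere>[+-]?\d+\.\d{2})\s+"
    r"(?P<cylinder>[+-]?\d+\.\d{2})\s*[xX]\s*"
    r"(?P<axis>\d{1,3})\b"
)


def parse_refraction(transcript: str) -> dict:
    """Return {eye: {sphere, cylinder, axis}} for each eye found in the text."""
    result = {}
    for m in REFRACTION.finditer(transcript):
        axis = int(m.group("axis"))
        if not 0 <= axis <= 180:
            continue  # implausible axis: leave the value for manual entry
        result[m.group("eye")] = {
            "sphere": float(m.group("sphere")),
            "cylinder": float(m.group("cylinder")),
            "axis": axis,
        }
    return result


fields = parse_refraction(
    "Refraction today OD: -1.25 -0.50 x 180, OS: -1.00 -0.75 x 170."
)
print(fields)
```

Values that do not match the pattern are simply left out, so the provider fills them in during review rather than correcting a wrong guess.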

Rollout should start with a single, high-volume workflow, such as post-exam note dictation, to validate accuracy and user adoption before expanding. Governance is critical: all AI-generated drafts must be clearly flagged in the UI for provider review and sign-off, so the chart continues to meet the legal record-keeping requirements of an EHR. A feedback loop should also be established in which provider corrections to transcripts are used to extend the custom vocabulary and, where the STT provider supports it, adapt the models to practice-specific accents and jargon. This turns the integration from a simple transcription tool into a continuously improving system. The target is a 50-70% reduction in manual data entry in targeted workflows, a figure each practice should validate during its pilot, while providers stay within the Eyefinity interface they already use.
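One way to picture the correction feedback loop is to diff each AI draft against the signed note and count the terms providers typed in place of what the engine heard. The sketch below uses Python's standard `difflib`; the function name `corrected_terms` and the sample sentences are invented for illustration, and how the resulting term list is loaded into a given STT provider is provider-specific and not shown.

```python
from collections import Counter
from difflib import SequenceMatcher


def corrected_terms(draft: str, signed: str) -> Counter:
    """Count the phrases a provider substituted for the STT output."""
    a, b = draft.lower().split(), signed.lower().split()
    counts = Counter()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":  # words in the draft were swapped for others
            counts[" ".join(b[j1:j2])] += 1
    return counts


# Hypothetical (draft, signed) pairs collected after provider sign-off.
pairs = [
    ("patient has my opia in both eyes", "patient has myopia in both eyes"),
    ("mild blefaritis noted", "mild blepharitis noted"),
    ("history of my opia", "history of myopia"),
]
vocab = Counter()
for draft, signed in pairs:
    vocab.update(corrected_terms(draft, signed))
print(vocab.most_common(2))
```

Terms that recur across many corrections are the natural candidates for the custom vocabulary.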

PLATFORM SURFACES

Eyefinity Surfaces for Speech-to-Text Integration

Real-Time SOAP Note Drafting

The core surface for STT integration is the patient encounter workflow within ExamWRITER or the clinical documentation module. By connecting to Eyefinity's audio capture hooks—often via a companion mobile app or desktop microphone—speech can be streamed to a cloud STT service (like Azure Speech, Google Speech-to-Text, or Deepgram) and returned as structured text.

This real-time transcription drafts the SOAP note, chiefly the subjective (S) and objective (O) sections, significantly reducing manual typing. The integration must map transcribed findings to the correct fields in Eyefinity's patient chart data model, such as ChiefComplaint, HistoryOfPresentIllness, and Assessment. A post-processing layer can apply optometry-specific entity recognition to tag terms like 'myopia', 'hyperopia', or 'IOP' for easier review and coding.
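As a rough illustration of the tagging step, the sketch below matches a small term lexicon against the transcript and returns character spans that a review UI could highlight. The lexicon, labels, and function name are hypothetical; a production system would more likely use a curated clinical vocabulary or a trained entity-recognition model.

```python
import re

# Hypothetical starter lexicon; a real one would be curated by clinicians.
LEXICON = {
    "myopia": "condition",
    "hyperopia": "condition",
    "glaucoma": "condition",
    "iop": "measurement",
    "intraocular pressure": "measurement",
}

# Longest terms first so multi-word entries win over shorter overlaps.
PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(t) for t in sorted(LEXICON, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def tag_entities(text: str) -> list[dict]:
    """Return each lexicon hit with its label and character span."""
    return [
        {
            "term": m.group(1),
            "label": LEXICON[m.group(1).lower()],
            "start": m.start(1),
            "end": m.end(1),
        }
        for m in PATTERN.finditer(text)
    ]


hits = tag_entities("Mild myopia OU. IOP 16 mmHg, no signs of glaucoma.")
for hit in hits:
    print(hit)
```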

Implementation requires a secure, low-latency webhook from the STT provider to an intermediary service that formats the payload and calls the Eyefinity API to update the open encounter record, ensuring the provider's workflow is not interrupted.

OPTICAL PRACTICE AUTOMATION

High-Value Speech-to-Text Use Cases for Eyefinity

Integrating advanced speech-to-text into Eyefinity transforms voice into structured, actionable data. These use cases leverage Eyefinity's audio capture hooks and cloud APIs to automate documentation, enhance accessibility, and unlock real-time insights from patient encounters.

Ambient Clinical Documentation

Automatically transcribe patient-provider conversations during exams into structured SOAP note drafts within Eyefinity's clinical module. The system identifies key sections (Subjective, Objective) and populates relevant fields, reducing manual charting time after each visit.

Charting time: hours -> minutes

Voice-Activated Frame & Lens Search

Enable optical staff to search Eyefinity's frame inventory and lens catalog using natural voice queries (e.g., 'Show me titanium frames under $200'). The system converts speech to a structured search, accelerating product discovery and patient consultations at the optical desk.

Inventory lookup: batch -> real-time
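The conversion from a spoken request to a structured search can be sketched as below. This is a minimal example under stated assumptions: the material list, the filter keys (`material`, `max_price`, `min_price`), and the function name are hypothetical, and the resulting filter would still need to be translated into whatever query the inventory system accepts.

```python
import re

MATERIALS = {"titanium", "acetate", "metal", "plastic"}  # hypothetical catalog values


def parse_frame_query(utterance: str) -> dict:
    """Turn a transcribed request into a filter dict for an inventory search."""
    text = utterance.lower()
    filters = {}
    material = next(
        (m for m in sorted(MATERIALS) if re.search(rf"\b{m}\b", text)), None
    )
    if material:
        filters["material"] = material
    price = re.search(r"\b(under|below|over|above)\s+\$?(\d+(?:\.\d{1,2})?)", text)
    if price:
        key = "max_price" if price.group(1) in ("under", "below") else "min_price"
        filters[key] = float(price.group(2))
    return filters


print(parse_frame_query("Show me titanium frames under $200"))
```

STT engines differ in how they render amounts ("$200", "200 dollars", "two hundred dollars"), so the price pattern would need tuning against real transcripts.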

Hands-Free Insurance Verification

Front desk staff can verbally state a patient's insurance details into a headset. Speech-to-text populates the verification form in Eyefinity's insurance module and triggers an automated eligibility check via payer APIs, streamlining the check-in process.

Typical implementation: 1 sprint

Accessible Patient Intake & Kiosks

Integrate speech-to-text into Eyefinity's patient portal and on-site kiosks. Patients can verbally complete registration forms, update histories, or ask questions, improving accessibility and reducing front-desk burden for demographic data entry.

Data entry reduction: same day

Voice-Driven Optical Order Entry

Opticians dictate frame selections, lens prescriptions, and add-ons during patient fittings. Speech is converted into a pre-filled optical lab order within Eyefinity, minimizing manual data entry errors and accelerating order submission to partner labs.

Order processing: batch -> real-time

Post-Visit Call Note Automation

Automatically transcribe follow-up phone calls with patients regarding contact lens orders, billing questions, or post-op checks. Summaries are attached to the patient record in Eyefinity, creating a searchable audit trail and ensuring continuity of care.

Call logging: hours -> minutes

PRACTICAL IMPLEMENTATION PATTERNS

Example Speech-to-Text Workflows in Eyefinity

Integrating speech-to-text into Eyefinity transforms manual dictation and note-taking into structured, searchable data. These workflows show how to connect cloud STT services to Eyefinity's audio capture hooks and API ecosystem for real-time clinical and operational automation.

This workflow captures the provider-patient conversation during an exam and generates a structured SOAP note draft in Eyefinity.

Trigger: Provider starts a patient encounter in Eyefinity and activates the 'Ambient Note' feature via a button or voice command.
Context/Data Pulled: The system retrieves the patient's demographic data, past ocular history, and current visit reason from Eyefinity's patient and encounter APIs to prime the context.
Model/Agent Action: A local or cloud-based audio stream is sent to a specialized STT service (e.g., Google Cloud Speech-to-Text with a medical conversation model and speaker diarization). The raw transcript is then processed by an LLM agent prompted with optometry-specific templates to extract key findings: Chief Complaint, History of Present Illness, Objective findings (e.g., a refraction of "OD: -1.25 -0.50 x 180"), Assessment, and Plan.
System Update: The structured draft is posted back to the correct encounter in Eyefinity using the EncounterNote API endpoint. The draft is clearly marked as AI-generated and placed in a "Review" status.
Human Review Point: The provider reviews, edits, and signs the note within Eyefinity. All edits are logged, and the final note is stored as the official record.

Technical Note: Implementation requires handling PHI securely: use an STT provider that will sign a HIPAA business associate agreement (BAA), encrypt audio in transit, and do not retain audio after processing.
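The "Review" status and edit logging in the steps above can be modeled with a small state object. This is a sketch only: the class, field names, and status values are hypothetical and do not describe Eyefinity's data model, which would own the real record and audit trail.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class DraftNote:
    encounter_id: str
    sections: dict                      # e.g. {"subjective": "...", "plan": "..."}
    ai_generated: bool = True           # surfaced in the UI so drafts are never mistaken for signed notes
    status: str = "review"              # "review" -> "signed"
    edits: list = field(default_factory=list)

    def edit(self, user: str, section: str, text: str) -> None:
        """Apply a provider edit and record what the section said before."""
        if self.status == "signed":
            raise ValueError("signed notes are immutable")
        self.edits.append({
            "user": user,
            "section": section,
            "before": self.sections.get(section, ""),
            "at": _now(),
        })
        self.sections[section] = text

    def sign(self, provider: str) -> None:
        """Record the sign-off and lock the note."""
        self.edits.append({"user": provider, "action": "sign", "at": _now()})
        self.status = "signed"


note = DraftNote("enc-123", {"subjective": "Blurry distance vision for 2 weeks."})
note.edit("dr_lee", "subjective", "Blurry distance vision for 2 weeks, worse at night.")
note.sign("dr_lee")
print(note.status, len(note.edits))
```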

FROM AUDIO CAPTURE TO STRUCTURED CLINICAL DATA

Implementation Architecture: Connecting STT Services to Eyefinity

A production-ready blueprint for integrating cloud-based speech-to-text services into Eyefinity's patient encounter workflows.

The integration architecture connects Eyefinity's audio capture hooks—typically from its telehealth module, dictation tools, or patient portal—to a secure, cloud-based STT service like Azure Speech, Google Speech-to-Text, or AWS Transcribe. The core workflow is event-driven: when an audio file is saved to a designated location in Eyefinity's document management system or a visit_audio_ready webhook is triggered, a middleware service (often deployed as a secure container) retrieves the file, streams it to the STT API, and returns a structured transcript. This transcript is then processed to extract key entities like patient identifiers, visit dates, and clinical terms, which are mapped to corresponding fields in Eyefinity's SOAP note templates or clinical documentation modules via its RESTful API, creating a draft note ready for provider review and signature.

For real-time use cases, such as during a virtual visit, the architecture leverages Eyefinity's telehealth SDK or embedded browser components to stream audio directly to the STT service. The real-time transcript is displayed in a side-panel within the Eyefinity interface, allowing the provider to correct terms on the fly. Post-visit, the final transcript is automatically associated with the patient's chart and encounter ID. Critical implementation details include configuring specialty-specific speech models (trained on optometric terminology for conditions like myopia, glaucoma, or diabetic retinopathy), implementing PHI redaction at the STT layer before any data persists, and setting up a dead-letter queue for failed transcriptions to ensure no patient audio is lost.
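The dead-letter queue mentioned above can be sketched as a retry wrapper that parks a reference to the audio when every attempt fails. The in-memory `deque` stands in for a durable queue, and `transcribe` is whatever callable wraps the chosen STT provider; both are assumptions made for this example.

```python
import time
from collections import deque

dead_letters = deque()  # stand-in for a durable queue in a real deployment


def transcribe_with_retry(audio_ref, transcribe, max_attempts=3, base_delay=0.0):
    """Call transcribe(audio_ref); park the reference in the DLQ if all attempts fail."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return transcribe(audio_ref)
        except Exception as exc:  # in practice, catch only the provider's transient errors
            last_error = exc
            if attempt < max_attempts:
                time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    dead_letters.append({
        "audio_ref": audio_ref,       # a pointer to the audio, not the audio itself
        "error": repr(last_error),
        "attempts": max_attempts,
    })
    return None


def flaky(audio_ref):
    raise TimeoutError("STT service unavailable")


result = transcribe_with_retry("visits/enc-123.wav", flaky)
print(result, list(dead_letters))
```

`base_delay` defaults to zero so the example runs instantly; a real deployment would set a non-zero delay and alert on queue depth.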

Rollout follows a phased approach, starting with non-clinical workflows like voice-based search in patient records or accessibility features for staff, then progressing to pilot groups for visit documentation. Governance is enforced through Eyefinity's existing role-based access controls (RBAC), which determine who can initiate transcriptions and edit drafts, and all actions are logged to its audit trail for compliance. The final architecture ensures transcripts become a searchable, structured asset within Eyefinity, reducing manual data entry from hours to minutes per provider per day, while keeping sensitive audio data encrypted in transit and at rest under a HIPAA business associate agreement (BAA) with the cloud STT provider.

EYEFINITY SPEECH-TO-TEXT INTEGRATION

Code and Payload Examples

Capturing and Processing Audio Streams

Integrate with Eyefinity's audio capture hooks to stream patient-provider conversations to a cloud STT service like Azure Speech or Google Cloud Speech-to-Text. The system listens for specific encounter triggers (e.g., exam room session start) via the Eyefinity API, begins streaming, and returns a structured transcript appended to the patient's chart.

Example Workflow:

Event Trigger: POST /api/v1/encounter/{id}/start from Eyefinity.
Audio Capture: Stream from designated device/room system.
STT Processing: Send stream to configured provider with medical terminology boost.
Data Return: Structured JSON with speaker diarization and timestamps is posted back to the encounter's notes section via PATCH /api/v1/encounter/{id}/notes.

python
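# Sketch of an encounter-start webhook handler. The capture function and the
# STT and Eyefinity clients are passed in because their real interfaces are
# not specified here; the method names are illustrative.
def format_transcript_for_ehr(segments):
    """Render diarized segments as 'speaker: text' lines."""
    return "\n".join(f"{seg['speaker']}: {seg['text']}" for seg in segments)


def handle_encounter_start(encounter_id, room_device_id,
                           start_audio_capture, stt_client, eyefinity_api):
    # Open the audio stream from the exam room device.
    audio_stream = start_audio_capture(room_device_id)
    # Transcribe with a medical model and speaker diarization enabled.
    transcript = stt_client.streaming_recognize(
        audio_stream,
        config={"model": "medical_conversation", "enable_speaker_diarization": True},
    )
    # Post the formatted transcript to the encounter's notes for provider review.
    eyefinity_api.update_encounter_notes(
        encounter_id,
        notes=format_transcript_for_ehr(transcript),
    )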
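To make the "structured JSON with speaker diarization and timestamps" in the workflow above concrete, here is one possible payload shape. The keys (`source`, `status`, `turns`) and the helper name are assumptions for illustration, not a documented Eyefinity schema.

```python
import json


def build_notes_payload(segments):
    """Merge consecutive segments from the same speaker into one turn each."""
    turns = []
    for seg in segments:
        if turns and turns[-1]["speaker"] == seg["speaker"]:
            turns[-1]["text"] += " " + seg["text"]
            turns[-1]["end"] = seg["end"]
        else:
            turns.append(dict(seg))  # copy so the input segments stay unchanged
    return {"source": "stt", "status": "draft", "turns": turns}


segments = [
    {"speaker": "provider", "start": 0.0, "end": 2.1, "text": "Any changes in your vision?"},
    {"speaker": "patient", "start": 2.4, "end": 4.0, "text": "Distance is blurry."},
    {"speaker": "patient", "start": 4.2, "end": 5.5, "text": "Mostly at night."},
]
payload = build_notes_payload(segments)
print(json.dumps(payload, indent=2))
```

Merging adjacent segments keeps the note readable while the start and end times still let a reviewer jump back to the matching audio, if audio is retained at all.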

EYEFINITY SPEECH-TO-TEXT INTEGRATION

Realistic Time Savings and Operational Impact

How integrating advanced speech-to-text with Eyefinity transforms manual documentation and search workflows, based on typical practice operations.

Workflow / Metric	Before AI Integration	After AI Integration	Implementation Notes
Patient encounter note creation	5-10 minutes manual typing/clicking	1-2 minutes voice dictation + AI draft	Ambient listening or post-visit dictation via STT API; draft populates SOAP note fields.
Searching patient records for specific symptoms	2-3 minutes of manual filtering and scanning	30 seconds via natural language voice query	Vector search on transcribed notes enables semantic queries like 'patients with dry eye and contact lens wear'.
Prior authorization documentation assembly	15-20 minutes gathering and summarizing notes	5 minutes with automated summarization and packet drafting	AI extracts relevant data from encounter transcripts and past notes to populate PA forms.
Staff training on documentation updates	30-60 minute live sessions per policy change	10-minute AI-generated summary + interactive Q&A	AI analyzes new policy docs and creates bite-sized training materials accessible via practice portal.
Post-visit summary for patient portal	Manual copy/paste or skipped due to time	Automated 1-paragraph summary generated in <1 minute	AI condenses encounter transcript into patient-friendly language; staff reviews before sending.
Coding validation from visit dialogue	Retrospective manual review after visit	Real-time code suggestions during transcription	AI analyzes transcript against CPT/ICD-10 rules, flags potential missed codes for review.
Accessibility for patients/staff with disabilities	Reliance on manual note-taking or third-party services	Real-time captions for telehealth and in-office consultations	STT provides live transcription displayed on screen, integrated into Eyefinity's telehealth module.

SECURE, CONTROLLED IMPLEMENTATION FOR CLINICAL WORKFLOWS

Governance, Security, and Phased Rollout

Integrating speech-to-text into Eyefinity requires a security-first architecture and a phased rollout to manage clinical risk and user adoption.

Deploying AI-powered speech-to-text for Eyefinity begins with a zero-trust data architecture. Audio captured from exam rooms or provider devices is encrypted in transit and processed by a dedicated cloud STT service covered by a HIPAA business associate agreement (like Azure Speech or Google Cloud Speech-to-Text). The STT service retains neither the raw audio nor the transcript after processing. Instead, structured text is returned via a secure API to a middleware layer that handles PHI redaction and context enrichment, attaching the correct patient ID, encounter ID, and provider ID from Eyefinity's API, before the draft note is inserted into the correct chart section. All data flows are logged for a full audit trail, and access is governed by Eyefinity's native RBAC, ensuring only authorized staff can view or edit AI-generated content.
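The middleware's redaction and enrichment steps might look like the sketch below. It is deliberately simplistic: the regular expressions catch only a few obvious identifier formats and will miss names, addresses, and much else, and the function and key names are hypothetical.

```python
import re

# Illustrative patterns only; pattern matching alone is not sufficient de-identification.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def enrich(transcript: str, patient_id: str, encounter_id: str, provider_id: str) -> dict:
    """Attach chart identifiers as metadata and keep them out of the free text."""
    return {
        "patient_id": patient_id,
        "encounter_id": encounter_id,
        "provider_id": provider_id,
        "text": redact(transcript),
    }


record = enrich("Call me at 555-201-3344 or jo@example.com.", "pt-42", "enc-123", "dr-7")
print(record["text"])
```

Keeping identifiers in structured metadata rather than in the transcript text makes it easier to log or analyze the text without exposing who it belongs to.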

A successful rollout follows a three-phase pilot approach to de-risk the integration and prove value:

Phase 1: Shadow Mode. The STT system transcribes encounters in real time, but outputs are visible only to a pilot group (e.g., two optometrists and an assistant) in a separate dashboard. This validates accuracy for ophthalmic terminology (e.g., 'conjunctival injection', 'myopic degeneration') and workflow fit without altering production records.
Phase 2: Draft-Assist Mode. For the pilot group, structured draft SOAP notes are created within Eyefinity as unsigned drafts, prefilled in the subjective and objective sections. Providers review, edit, and sign off as usual. This phase measures time saved per encounter and captures feedback on note structure.
Phase 3: Controlled Expansion. Roll out to additional providers, coupled with human-in-the-loop rules. For example, any note containing a high-risk clinical finding (like 'suspicious nevus') is flagged for mandatory review before signing. Continuous monitoring tracks adoption rates and accuracy metrics per provider.
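A human-in-the-loop rule of the kind described in Phase 3 can be expressed as a small function that returns review flags for a draft. The watch list, the confidence threshold, and the flag names are all hypothetical; a practice would define its own with clinical input.

```python
WATCHLIST = ("suspicious nevus", "retinal detachment", "papilledema")  # hypothetical list


def review_flags(note_text: str, min_word_confidence: float,
                 threshold: float = 0.85) -> list[str]:
    """Return the reasons a draft must get explicit review before signing."""
    lowered = note_text.lower()
    flags = [f"finding:{term}" for term in WATCHLIST if term in lowered]
    if min_word_confidence < threshold:
        # The STT engine was unsure of at least one word in the note.
        flags.append("low_stt_confidence")
    return flags


flags = review_flags("Suspicious nevus noted temporal to the macula OD.", 0.91)
print(flags)
```

An empty list means no extra rule fired; the provider still reviews and signs the note as usual.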

Governance is maintained through a dedicated model management layer. This includes regular accuracy audits on a held-out set of encounter transcripts, monitoring for model drift in specialty terminology, and a clear process for providers to flag errors. These flags feed directly into prompt and workflow refinements. This controlled, iterative approach ensures the AI integration enhances productivity without compromising the integrity of the clinical record or adding undue risk to the practice. For related architectural patterns, see our guides on /integrations/optometry-practice-management-platforms/ai-integration-for-revolutionehr-clinical-documentation and /integrations/electronic-health-record-platforms/ai-governance-frameworks.
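The accuracy audits mentioned above are commonly scored with word error rate (WER): the word-level edit distance between a reference transcript and the STT output, divided by the number of reference words. The sketch below is a plain implementation of that standard metric; the sample sentences are invented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    prev = list(range(len(hyp) + 1))  # distances for an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate(
    "mild conjunctival injection in the right eye",
    "mild conjunctival infection in right eye",
)
print(round(wer, 3))
```

Tracking WER on the same held-out set over time, and separately for specialty terms, is one way to notice drift before providers do.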


IMPLEMENTATION AND WORKFLOW DETAILS

FAQ: Speech-to-Text Integration for Eyefinity

Practical answers for integrating advanced speech-to-text (STT) into Eyefinity workflows, covering real-time transcription, voice search, accessibility, and the technical patterns for connecting cloud STT APIs to Eyefinity's audio capture hooks and data model.

Eyefinity provides several integration surfaces where audio can be captured and processed:

Telehealth/Virtual Visit Module: Audio streams from integrated video platforms (e.g., Zoom, embedded WebRTC) can be captured via webhook or API post-visit.
Voice Note Attachments: Clinicians can record voice memos attached to patient records or specific orders. These are typically stored as audio files in Eyefinity's document management system (DMS).
Front-Desk Call Recording Hooks: For practices recording phone calls for quality or training, integration can be added to the call recording system's output feed.
Mobile App Dictation: The Eyefinity mobile app supports audio recording for notes, which can be sent to a cloud STT service before syncing back as text.

Implementation Note: Most integrations use a sidecar architecture. Audio is captured, sent securely (e.g., via HTTPS POST) to a cloud STT service like Google Speech-to-Text, AWS Transcribe, or Azure Speech, and the resulting transcript is posted back to the relevant Eyefinity record via its REST API or used to populate a note field.

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.

