Speech-to-text (STT) integration connects to Eyefinity at three primary surfaces: the clinical documentation module for SOAP notes and exam findings, the patient communication hub for call logging and message intake, and the optical sales/order entry interface for hands-free product specification. Instead of a standalone dictation app, the integration uses Eyefinity's API hooks—like the Note creation endpoint and CommunicationLog object—to inject transcribed text directly into the relevant patient record or workflow queue. This means a provider's spoken exam notes become a draft SOAP note in the chart, a front-desk call about frame availability is logged as a searchable activity, and an optician's voice commands during a fitting populate the order form, all without switching applications.
AI Integration with Eyefinity Speech-to-Text

Where Speech-to-Text Fits in Eyefinity Workflows
Integrating speech-to-text into Eyefinity transforms manual dictation and note entry into automated, structured data capture, directly within existing clinical and administrative workflows.
Implementation requires a secure, cloud-based STT service (like Azure Speech, Google Speech-to-Text, or AWS Transcribe) configured with a custom vocabulary built from Eyefinity's optical and clinical terminology. Audio is captured via the practice's existing hardware—examination room PCs, front-desk headsets, or mobile devices—and streamed to the STT service. The resulting transcript is then processed by a lightweight orchestration layer that maps the text to the correct Eyefinity data model: identifying the patient context from the audio session, applying structured formatting based on note templates (e.g., pulling OD/OS measurements into separate fields), and calling the appropriate Eyefinity API to create or update the record. For sensitive PHI, audio streams and transcripts should be ephemeral, with final data persisted only within Eyefinity's audit-trailed environment.
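The field-mapping step described above (pulling OD/OS measurements into separate fields) could be sketched as follows. This is a minimal illustration, not Eyefinity's actual data model: the regular expression, the `extract_refraction` name, and the output keys are all assumptions, and it presumes the STT service already renders spoken numbers as digits.

```python
import re
from typing import Dict

# Matches dictated refractions such as "OD: -1.25 -0.50 x 180".
# The pattern and the output field names are illustrative assumptions.
REFRACTION = re.compile(
    r"\b(?P<eye>OD|OS)\s*:?\s*"
    r"(?P<sphere>[+-]?\d+\.\d{2})\s+"
    r"(?P<cylinder>[+-]?\d+\.\d{2})\s*"
    r"x\s*(?P<axis>\d{1,3})\b",
    re.IGNORECASE,
)


def extract_refraction(transcript: str) -> Dict[str, Dict[str, float]]:
    """Return one entry per eye found in the transcript, keyed by 'OD' or 'OS'."""
    fields: Dict[str, Dict[str, float]] = {}
    for match in REFRACTION.finditer(transcript):
        axis = int(match.group("axis"))
        if not 0 <= axis <= 180:
            continue  # An axis outside 0-180 degrees is likely a transcription error.
        fields[match.group("eye").upper()] = {
            "sphere": float(match.group("sphere")),
            "cylinder": float(match.group("cylinder")),
            "axis": axis,
        }
    return fields
```

Values that fail the range check are skipped rather than guessed, so the provider sees an empty field during review instead of a plausible but wrong number.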
Rollout should start with a single, high-volume workflow—such as post-exam note dictation—to validate accuracy and user adoption before expanding. Governance is critical: all AI-generated drafts must be clearly flagged in the UI for provider review and sign-off, maintaining the legal record-keeping requirements of an EHR. Additionally, a feedback loop should be established where corrections to transcripts are used to retrain and improve the acoustic and language models for practice-specific accents and jargon. This turns the integration from a simple transcription tool into a continuously improving system that reduces manual data entry by 50-70% in targeted workflows, while keeping the provider securely within the Eyefinity interface they already use.
Eyefinity Surfaces for Speech-to-Text Integration
Real-Time SOAP Note Drafting
The core surface for STT integration is the patient encounter workflow within ExamWRITER or the clinical documentation module. By connecting to Eyefinity's audio capture hooks—often via a companion mobile app or desktop microphone—speech can be streamed to a cloud STT service (like Azure Speech, Google Speech-to-Text, or Deepgram) and returned as structured text.
This real-time transcription populates the subjective (S) and objective (O) sections of SOAP notes, significantly reducing manual typing. The integration must map transcribed findings to the correct fields in Eyefinity's patient chart data model, such as ChiefComplaint, HistoryOfPresentIllness, and Assessment. A post-processing layer can apply optometry-specific entity recognition to tag terms like 'myopia', 'hyperopia', or 'IOP' for easier review and coding.
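As a rough sketch of the post-processing layer, a dictionary-based tagger can mark known terms with their character offsets so a review UI can highlight them. The term list, labels, and function name here are hypothetical; a production system would more likely use a maintained clinical vocabulary or a trained entity model.

```python
import re
from typing import List, Tuple

# Illustrative term list; a real deployment would load a maintained vocabulary.
OPTOMETRY_TERMS = {
    "myopia": "DIAGNOSIS",
    "hyperopia": "DIAGNOSIS",
    "iop": "MEASUREMENT",
}

_TERM_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in OPTOMETRY_TERMS) + r")\b",
    re.IGNORECASE,
)


def tag_entities(text: str) -> List[Tuple[str, str, int, int]]:
    """Return (matched_text, label, start, end) for each known term in text."""
    return [
        (m.group(0), OPTOMETRY_TERMS[m.group(0).lower()], m.start(), m.end())
        for m in _TERM_PATTERN.finditer(text)
    ]
```

For example, `tag_entities("Mild myopia, IOP 16")` returns `[("myopia", "DIAGNOSIS", 5, 11), ("IOP", "MEASUREMENT", 13, 16)]`.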
Implementation requires a secure, low-latency webhook from the STT provider to an intermediary service that formats the payload and calls the Eyefinity API to update the open encounter record, ensuring the provider's workflow is not interrupted.
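One common way to secure such a webhook is to verify an HMAC signature over the request body before the intermediary service trusts the payload. The sketch below assumes the STT provider signs each request with a shared secret using HMAC-SHA256 and sends the hex digest in a header; whether a given provider does this, and under which header name, should be checked against its documentation.

```python
import hashlib
import hmac
import json


def verify_and_parse_webhook(body: bytes, signature_hex: str, secret: bytes) -> dict:
    """Check the HMAC-SHA256 signature of a webhook body, then decode the JSON."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(expected, signature_hex):
        raise ValueError("webhook signature mismatch")
    return json.loads(body)
```

Rejecting unsigned or mismatched payloads before any parsing keeps forged transcripts from ever reaching the code that writes to the encounter record.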
High-Value Speech-to-Text Use Cases for Eyefinity
Integrating advanced speech-to-text into Eyefinity transforms voice into structured, actionable data. These use cases leverage Eyefinity's audio capture hooks and cloud APIs to automate documentation, enhance accessibility, and unlock real-time insights from patient encounters.
Ambient Clinical Documentation
Automatically transcribe patient-provider conversations during exams into structured SOAP note drafts within Eyefinity's clinical module. The system identifies key sections (Subjective, Objective) and populates relevant fields, reducing manual charting time after each visit.
Voice-Activated Frame & Lens Search
Enable optical staff to search Eyefinity's frame inventory and lens catalog using natural voice queries (e.g., 'Show me titanium frames under $200'). The system converts speech to a structured search, accelerating product discovery and patient consultations at the optical desk.
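To illustrate how a transcribed query might become a structured search, here is a small rule-based sketch. The material list, filter names, and `parse_frame_query` function are hypothetical, and a real deployment might use an LLM or intent classifier instead of keyword rules.

```python
import re
from typing import Dict, Union

# Hypothetical material list for illustration only.
KNOWN_MATERIALS = ("titanium", "acetate", "stainless steel")

_PRICE_PATTERN = re.compile(
    r"(?:under|below|less than)\s+\$?\s*(\d+(?:\.\d{1,2})?)"
)


def parse_frame_query(utterance: str) -> Dict[str, Union[str, float]]:
    """Turn a transcribed query into search filters (material, max_price)."""
    text = utterance.lower()
    filters: Dict[str, Union[str, float]] = {}
    for material in KNOWN_MATERIALS:
        if material in text:
            filters["material"] = material
            break
    price = _PRICE_PATTERN.search(text)
    if price:
        filters["max_price"] = float(price.group(1))
    return filters
```

With the example query from the text, `parse_frame_query("Show me titanium frames under $200")` returns `{"material": "titanium", "max_price": 200.0}`.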
Hands-Free Insurance Verification
Front desk staff can verbally state a patient's insurance details into a headset. Speech-to-text populates the verification form in Eyefinity's insurance module and triggers an automated eligibility check via payer APIs, streamlining the check-in process.
Accessible Patient Intake & Kiosks
Integrate speech-to-text into Eyefinity's patient portal and on-site kiosks. Patients can verbally complete registration forms, update histories, or ask questions, improving accessibility and reducing front-desk burden for demographic data entry.
Voice-Driven Optical Order Entry
Opticians dictate frame selections, lens prescriptions, and add-ons during patient fittings. Speech is converted into a pre-filled optical lab order within Eyefinity, minimizing manual data entry errors and accelerating order submission to partner labs.
Post-Visit Call Note Automation
Automatically transcribe follow-up phone calls with patients regarding contact lens orders, billing questions, or post-op checks. Summaries are attached to the patient record in Eyefinity, creating a searchable audit trail and ensuring continuity of care.
Example Speech-to-Text Workflows in Eyefinity
Integrating speech-to-text into Eyefinity transforms manual dictation and note-taking into structured, searchable data. These workflows show how to connect cloud STT services to Eyefinity's audio capture hooks and API ecosystem for real-time clinical and operational automation.
This workflow captures the provider-patient conversation during an exam and generates a structured SOAP note draft in Eyefinity.
- Trigger: Provider starts a patient encounter in Eyefinity and activates the 'Ambient Note' feature via a button or voice command.
- Context/Data Pulled: The system retrieves the patient's demographic data, past ocular history, and current visit reason from Eyefinity's patient and encounter APIs to prime the context.
- Model/Agent Action: A local or cloud-based audio stream is sent to a specialized STT service (e.g., Google Cloud Speech-to-Text with medical diarization). The raw transcript is then processed by an LLM agent prompted with optometry-specific templates to extract key findings: Chief Complaint, History of Present Illness, objective findings (e.g., a refraction such as "OD: -1.25 -0.50 x 180"), Assessment, and Plan.
- System Update: The structured draft is posted back to the correct encounter in Eyefinity using the `EncounterNoteAPI` endpoint. The draft is clearly marked as AI-generated and placed in a "Review" status.
- Human Review Point: The provider reviews, edits, and signs the note within Eyefinity. All edits are logged, and the final note is stored as the official record.
Technical Note: Implementation requires handling PHI securely, often using a HIPAA-compliant STT provider and ensuring audio data is encrypted in transit and not retained post-processing.
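The draft produced in the System Update step could be modeled along these lines. This is a sketch under assumptions: the field names and JSON shape are invented for illustration and do not reflect a documented Eyefinity schema. The point is that the AI-generated flag and "Review" status are defaults set in code, so they do not depend on the caller remembering them.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class DraftNote:
    encounter_id: str
    chief_complaint: str
    history_of_present_illness: str
    assessment: str
    plan: str
    ai_generated: bool = True  # Always flagged so reviewers can tell the source.
    status: str = "Review"     # Stays in review until a provider signs.
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_payload(note: DraftNote) -> str:
    """Serialize the draft as JSON for the note endpoint."""
    return json.dumps(asdict(note))
```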
Implementation Architecture: Connecting STT Services to Eyefinity
A production-ready blueprint for integrating cloud-based speech-to-text services into Eyefinity's patient encounter workflows.
The integration architecture connects Eyefinity's audio capture hooks—typically from its telehealth module, dictation tools, or patient portal—to a secure, cloud-based STT service like Azure Speech, Google Speech-to-Text, or AWS Transcribe. The core workflow is event-driven: when an audio file is saved to a designated location in Eyefinity's document management system or a visit_audio_ready webhook is triggered, a middleware service (often deployed as a secure container) retrieves the file, streams it to the STT API, and returns a structured transcript. This transcript is then processed to extract key entities like patient identifiers, visit dates, and clinical terms, which are mapped to corresponding fields in Eyefinity's SOAP note templates or clinical documentation modules via its RESTful API, creating a draft note ready for provider review and signature.
For real-time use cases, such as during a virtual visit, the architecture leverages Eyefinity's telehealth SDK or embedded browser components to stream audio directly to the STT service. The real-time transcript is displayed in a side-panel within the Eyefinity interface, allowing the provider to correct terms on the fly. Post-visit, the final transcript is automatically associated with the patient's chart and encounter ID. Critical implementation details include configuring specialty-specific speech models (trained on optometric terminology for conditions like myopia, glaucoma, or diabetic retinopathy), implementing PHI redaction at the STT layer before any data persists, and setting up a dead-letter queue for failed transcriptions to ensure no patient audio is lost.
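The dead-letter queue mentioned above can be sketched in a few lines. Here an in-memory `deque` stands in for a durable queue, and `transcribe` is a hypothetical callable wrapping whichever STT client the practice uses; a production version would also add backoff between attempts and persist the queue.

```python
from collections import deque
from typing import Callable, Deque, Optional, Tuple


def transcribe_with_dlq(
    audio_id: str,
    transcribe: Callable[[str], str],
    dead_letters: Deque[Tuple[str, str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Try transcription up to max_attempts; park the audio ID on failure."""
    last_error = ""
    for _ in range(max_attempts):
        try:
            return transcribe(audio_id)
        except Exception as exc:  # Broad on purpose: any failure counts as a retry.
            last_error = repr(exc)
    # Record the reference and reason so the audio can be reprocessed later.
    dead_letters.append((audio_id, last_error))
    return None
```

Only the audio reference and the error are queued, not the audio itself, which keeps PHI out of the failure path.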
Rollout follows a phased approach: starting with non-clinical workflows like voice-based search in patient records or accessibility features for staff, then progressing to pilot groups for visit documentation. Governance is enforced through Eyefinity's existing role-based access controls (RBAC) to determine who can initiate transcriptions and edit drafts, and all actions are logged to its audit trail for compliance. The final architecture ensures transcripts become a searchable, structured asset within Eyefinity, reducing manual data entry from hours to minutes per provider per day, while keeping sensitive audio data encrypted in transit and at rest, aligned with HIPAA BAA requirements of major cloud STT providers.
Code and Payload Examples
Capturing and Processing Audio Streams
Integrate with Eyefinity's audio capture hooks to stream patient-provider conversations to a cloud STT service like Azure Speech or Google Cloud Speech-to-Text. The system listens for specific encounter triggers (e.g., exam room session start) via the Eyefinity API, begins streaming, and returns a structured transcript appended to the patient's chart.
Example Workflow:
- Event Trigger: `POST /api/v1/encounter/{id}/start` from Eyefinity.
- Audio Capture: Stream from designated device/room system.
- STT Processing: Send stream to configured provider with medical terminology boost.
- Data Return: Structured JSON with speaker diarization and timestamps is posted back to the encounter's notes section via `PATCH /api/v1/encounter/{id}/notes`.
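```python
# Pseudo-code for handling an encounter start webhook.
# start_audio_capture, stt_client, eyefinity_api, and format_transcript_for_ehr
# are placeholders for practice-specific integrations, not real library calls.
def handle_encounter_start(encounter_id, room_device_id):
    # Open the audio stream from the exam room device
    audio_stream = start_audio_capture(room_device_id)

    # Stream audio to the STT provider with speaker diarization enabled
    transcript = stt_client.streaming_recognize(
        audio_stream,
        config={"model": "medical_conversation", "enable_speaker_diarization": True},
    )

    # Post structured transcript to Eyefinity
    eyefinity_api.update_encounter_notes(
        encounter_id,
        notes=format_transcript_for_ehr(transcript),
    )
```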
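One possible shape for the `format_transcript_for_ehr` helper named in the pseudo-code is shown below. The segment structure (a list of dictionaries with `speaker`, `start`, and `text` keys) is an assumption; real STT responses differ by provider and would need an adapter first.

```python
from typing import Dict, List


def format_transcript_for_ehr(segments: List[Dict]) -> str:
    """Render diarized segments as '[mm:ss] Speaker: text' lines.

    Each segment is assumed to look like
    {"speaker": "Provider", "start": 12.4, "text": "Any headaches?"}.
    """
    lines = []
    # Sort by start time so out-of-order segments still read chronologically.
    for seg in sorted(segments, key=lambda s: s["start"]):
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(
            f"[{minutes:02d}:{seconds:02d}] {seg['speaker']}: {seg['text'].strip()}"
        )
    return "\n".join(lines)
```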
Realistic Time Savings and Operational Impact
How integrating advanced speech-to-text with Eyefinity transforms manual documentation and search workflows, based on typical practice operations.
| Workflow / Metric | Before AI Integration | After AI Integration | Implementation Notes |
|---|---|---|---|
| Patient encounter note creation | 5-10 minutes manual typing/clicking | 1-2 minutes voice dictation + AI draft | Ambient listening or post-visit dictation via STT API; draft populates SOAP note fields. |
| Searching patient records for specific symptoms | 2-3 minutes of manual filtering and scanning | 30 seconds via natural language voice query | Vector search on transcribed notes enables semantic queries like 'patients with dry eye and contact lens wear'. |
| Prior authorization documentation assembly | 15-20 minutes gathering and summarizing notes | 5 minutes with automated summarization and packet drafting | AI extracts relevant data from encounter transcripts and past notes to populate PA forms. |
| Staff training on documentation updates | 30-60 minute live sessions per policy change | 10-minute AI-generated summary + interactive Q&A | AI analyzes new policy docs and creates bite-sized training materials accessible via practice portal. |
| Post-visit summary for patient portal | Manual copy/paste or skipped due to time | Automated 1-paragraph summary generated in <1 minute | AI condenses encounter transcript into patient-friendly language; staff reviews before sending. |
| Coding validation from visit dialogue | Retrospective manual review after visit | Real-time code suggestions during transcription | AI analyzes transcript against CPT/ICD-10 rules, flags potential missed codes for review. |
| Accessibility for patients/staff with disabilities | Reliance on manual note-taking or third-party services | Real-time captions for telehealth and in-office consultations | STT provides live transcription displayed on screen, integrated into Eyefinity's telehealth module. |
Governance, Security, and Phased Rollout
Integrating speech-to-text into Eyefinity requires a security-first architecture and a phased rollout to manage clinical risk and user adoption.
Deploying AI-powered speech-to-text for Eyefinity begins with a zero-trust data architecture. Audio capture from exam rooms or provider devices is encrypted in transit and processed by a dedicated, HIPAA-compliant cloud STT service (like Azure Speech or Google Cloud Speech-to-Text). Transcripts are never stored with the raw audio by the AI service. Instead, structured text is returned via a secure API to a middleware layer that handles PHI redaction and context enrichment—appending the correct patient ID, encounter ID, and provider ID from Eyefinity's API—before the draft note is inserted into the correct chart section. All data flows are logged for a full audit trail, and access is governed by Eyefinity's native RBAC, ensuring only authorized staff can view or edit AI-generated content.
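The middleware's redaction and context-enrichment step might be sketched like this. The two regular expressions are deliberately simple examples and are not sufficient PHI detection on their own; the context keys are assumed names for the identifiers the text says come from Eyefinity's API.

```python
import re
from typing import Dict

# Simple patterns for illustration; regexes alone are not sufficient for PHI.
_PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact_phi(text: str) -> str:
    """Replace each matched identifier with a bracketed label."""
    for label, pattern in _PHI_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def enrich(transcript: str, context: Dict[str, str]) -> Dict[str, str]:
    """Attach patient, encounter, and provider IDs to a redacted transcript."""
    required = ("patient_id", "encounter_id", "provider_id")
    missing = [key for key in required if not context.get(key)]
    if missing:
        # Refuse to build a draft that cannot be tied to a specific chart.
        raise ValueError(f"missing context: {missing}")
    return {**{key: context[key] for key in required}, "text": redact_phi(transcript)}
```

Failing loudly on missing identifiers is a deliberate choice here: a transcript without an encounter ID should go to an error path rather than be attached to the wrong chart.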
A successful rollout follows a three-phase pilot approach to de-risk the integration and prove value:
- Phase 1: Shadow Mode. The STT system transcribes encounters in real-time but outputs are only visible to a pilot group (e.g., two optometrists and an assistant) in a separate dashboard. This validates accuracy for ophthalmic terminology (e.g., 'conjunctival injection', 'myopic degeneration') and workflow fit without altering production records.
- Phase 2: Draft-Assist Mode. For the pilot group, structured draft SOAP notes are created within Eyefinity as unsigned drafts, prefilled in the subjective and objective sections. Providers review, edit, and sign off as usual. This phase measures time saved per encounter and captures feedback on note structure.
- Phase 3: Controlled Expansion. Roll out to additional providers, coupled with human-in-the-loop rules—for example, any note with a high-confidence clinical finding (like 'suspicious nevus') is flagged for mandatory review before signing. Continuous monitoring tracks adoption rates and accuracy metrics per provider.
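A human-in-the-loop rule of the kind described in Phase 3 could be expressed as a small function that returns the reasons a note needs mandatory review. The watch list below is hypothetical, and the transcription-confidence threshold is an added assumption that goes beyond the finding-based rule in the text.

```python
from typing import List

# Hypothetical watch list; the practice's clinical lead would own the real one.
MANDATORY_REVIEW_TERMS = ("suspicious nevus", "retinal detachment")


def review_reasons(
    note_text: str, stt_confidence: float, min_confidence: float = 0.85
) -> List[str]:
    """Return the reasons a draft must be reviewed; an empty list means none."""
    reasons = []
    lowered = note_text.lower()
    for term in MANDATORY_REVIEW_TERMS:
        if term in lowered:
            reasons.append(f"finding: {term}")
    # Assumed extra rule: low transcription confidence also forces review.
    if stt_confidence < min_confidence:
        reasons.append("low transcription confidence")
    return reasons
```

Returning reasons rather than a bare boolean lets the review UI tell the provider why a note was held.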
Governance is maintained through a dedicated model management layer. This includes regular accuracy audits on a held-out set of encounter transcripts, monitoring for model drift in specialty terminology, and a clear process for providers to flag errors. These flags feed directly into prompt and workflow refinements. This controlled, iterative approach ensures the AI integration enhances productivity without compromising the integrity of the clinical record or adding undue risk to the practice. For related architectural patterns, see our guides on /integrations/optometry-practice-management-platforms/ai-integration-for-revolutionehr-clinical-documentation and /integrations/electronic-health-record-platforms/ai-governance-frameworks.
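For the accuracy audits described above, word error rate (WER) is a standard metric: the word-level edit distance between the provider-corrected reference and the STT output, divided by the reference length. The sketch below is a plain-Python version; lowercasing and whitespace tokenization are simplifying assumptions.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref: List[str] = reference.lower().split()
    hyp: List[str] = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # previous[j] is the distance between the ref words seen so far and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1] / len(ref)
```

For example, a reference of "mild conjunctival injection od" transcribed as "mild conjunctival infection od" has one substitution in four words, so the function returns 0.25. Tracking this per provider over time is one way to spot drift in specialty terminology.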
FAQ: Speech-to-Text Integration for Eyefinity
Practical answers for integrating advanced speech-to-text (STT) into Eyefinity workflows, covering real-time transcription, voice search, accessibility, and the technical patterns for connecting cloud STT APIs to Eyefinity's audio capture hooks and data model.
Eyefinity provides several integration surfaces where audio can be captured and processed:
- Telehealth/Virtual Visit Module: Audio streams from integrated video platforms (e.g., Zoom, embedded WebRTC) can be captured via webhook or API post-visit.
- Voice Note Attachments: Clinicians can record voice memos attached to patient records or specific orders. These are typically stored as audio files in Eyefinity's document management system (DMS).
- Front-Desk Call Recording Hooks: For practices recording phone calls for quality or training, integration can be added to the call recording system's output feed.
- Mobile App Dictation: The Eyefinity mobile app supports audio recording for notes, which can be sent to a cloud STT service before syncing back as text.
Implementation Note: Most integrations use a sidecar architecture. Audio is captured, sent securely (e.g., via HTTPS POST) to a cloud STT service like Google Speech-to-Text, AWS Transcribe, or Azure Speech, and the resulting transcript is posted back to the relevant Eyefinity record via its REST API or used to populate a note field.
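The post-back half of the sidecar pattern can be sketched with the standard library alone. This builds, but does not send, a request to the `PATCH /api/v1/encounter/{id}/notes` path used in the example workflow earlier; the base URL, JSON body shape, and bearer-token authentication are assumptions and would need to match the actual Eyefinity API contract.

```python
import json
import urllib.request


def build_notes_request(
    base_url: str, encounter_id: str, transcript: str, token: str
) -> urllib.request.Request:
    """Build a PATCH request that posts a transcript to an encounter's notes."""
    if not base_url.lower().startswith("https://"):
        raise ValueError("PHI must only be sent over HTTPS")
    # Body shape is hypothetical; adjust to the real note schema.
    body = json.dumps({"notes": transcript, "source": "stt"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/encounter/{encounter_id}/notes",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Separating request construction from sending makes the HTTPS check and payload shape easy to unit test without network access; the caller would pass the result to `urllib.request.urlopen`.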

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.