Speech Act Theory is a linguistic and philosophical framework that analyzes utterances as actions (such as informing, requesting, or promising) rather than mere statements of fact. Developed by philosophers J.L. Austin and John Searle, it distinguishes between the locutionary act (the literal utterance), the illocutionary act (the intended force or function), and the perlocutionary act (the effect on the listener). In multi-agent systems, this theory provides the semantic foundation for Agent Communication Languages (ACLs) and Interaction Protocols, so that a message between autonomous agents is interpreted as an action with a defined intent (a request, a commitment, a report), not merely transmitted as data.
Speech Act Theory

What is Speech Act Theory?
Speech Act Theory is a linguistic and philosophical framework that models communication as the performance of actions, forming the theoretical foundation for communicative acts in multi-agent interaction protocols.
For agent coordination, Speech Act Theory enables precise modeling of communicative intent, which is critical for protocols like the Contract Net Protocol (a request for bids) or establishing Social Commitments (a promise to act). An illocutionary force, such as a REQUEST or INFORM, is formally encoded within an ACL message, allowing agents to reason about obligations and trigger appropriate behavioral responses. This moves agent communication beyond simple data exchange to a framework of performative utterances that directly shape system state, task allocation, and collaborative plans within an orchestrated environment.
Core Concepts of Speech Act Theory
Speech Act Theory is a linguistic and philosophical framework that models communication as the performance of actions (e.g., informing, requesting, promising), forming the theoretical foundation for communicative acts in agent interaction protocols.
Locutionary, Illocutionary, Perlocutionary Acts
J.L. Austin's core taxonomy distinguishes three simultaneous dimensions of a speech act:
- Locutionary Act: The literal utterance of words with a specific meaning and reference (e.g., saying "The door is open").
- Illocutionary Act: The intended action performed in saying something (e.g., informing, warning, or promising). This is the core communicative force.
- Perlocutionary Act: The consequential effect achieved by saying something on the listener's thoughts or actions (e.g., persuading someone to close the door, startling them).
In multi-agent systems, protocols focus on standardizing the illocutionary force to ensure predictable interpretation.
Illocutionary Force & Felicity Conditions
The illocutionary force is the speaker's intent (e.g., to request, commit, or declare). For an act to be successful or "felicitous," certain felicity conditions must be met:
- Preparatory Conditions: Contextual prerequisites (e.g., to promise, you must be capable of performing the promised action).
- Sincerity Condition: The speaker must possess the requisite psychological state (e.g., intention to fulfill a promise).
- Essential Condition: The utterance counts as undertaking an obligation (e.g., a promise creates an expectation of future action).
Agent communication languages (ACLs) encode these conditions in message semantics to validate communicative acts.
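The three conditions above can be sketched as a check that runs before a promise is accepted. This is a minimal illustration, not how any particular ACL implements its semantics: the `Agent` class, its `capabilities` and `intentions` sets, and the `promise` function are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    capabilities: set[str] = field(default_factory=set)  # actions it can perform
    intentions: set[str] = field(default_factory=set)    # actions it means to perform


def failed_conditions(speaker: Agent, action: str) -> list[str]:
    """Return the names of the felicity conditions a promise would violate."""
    failures = []
    if action not in speaker.capabilities:
        failures.append("preparatory")  # speaker cannot perform the action
    if action not in speaker.intentions:
        failures.append("sincerity")    # speaker does not intend to perform it
    return failures


def promise(speaker: Agent, hearer: Agent, action: str,
            obligations: list[tuple[str, str, str]]) -> bool:
    """Record an obligation only if the promise is felicitous."""
    if failed_conditions(speaker, action):
        return False
    # Essential condition: a felicitous promise counts as taking on an obligation.
    obligations.append((speaker.name, hearer.name, action))
    return True
```

In a real system the sincerity condition cannot be inspected from outside the speaker, so a receiver would usually assume it and rely on commitments or reputation instead.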
Directive vs. Commissive Acts
John Searle's classification categorizes illocutionary acts by their direction of fit and psychological state:
- Directives: Acts where the speaker attempts to get the hearer to do something. Direction of fit is world-to-word (the world changes to match the words). Examples: requests, commands, questions. Associated psychological state is want or desire.
- Commissives: Acts where the speaker commits to a future course of action. Direction of fit is also world-to-word. Examples: promises, pledges, oaths. Associated psychological state is intention.
These are fundamental for agent protocols involving task allocation (directives) and forming commitments (commissives).
Assertives, Expressives, & Declarations
Searle's other core categories complete the model of agent communication:
- Assertives: Acts that commit the speaker to the truth of a proposition. Direction of fit is word-to-world (words match the world). Examples: informing, stating, concluding. Psychological state is belief.
- Expressives: Acts that express the speaker's psychological state about a state of affairs. No direction of fit. Examples: thanking, apologizing, congratulating.
- Declarations: Acts that immediately change institutional reality by their successful performance. Direction of fit is both ways (words change the world). Examples: "You're fired," "I pronounce you married." These require a specific institutional role or authority, highly relevant for agents operating within electronic institutions.
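Searle's five categories, as described in the two lists above, can be captured in a small lookup table. The sketch below is one possible encoding. The names `Fit`, `Category`, and `SEARLE` are illustrative, and `None` marks a psychological state that the text above does not fix to a single value.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Fit(Enum):
    WORD_TO_WORLD = "word-to-world"  # words must match the world
    WORLD_TO_WORD = "world-to-word"  # the world is to change to match the words
    BOTH = "both"
    NONE = "none"


@dataclass(frozen=True)
class Category:
    fit: Fit
    psychological_state: Optional[str]  # None: no single state given above
    examples: tuple[str, ...]


SEARLE = {
    "assertive": Category(Fit.WORD_TO_WORLD, "belief", ("inform", "state", "conclude")),
    "directive": Category(Fit.WORLD_TO_WORD, "want or desire", ("request", "command", "question")),
    "commissive": Category(Fit.WORLD_TO_WORD, "intention", ("promise", "pledge", "oath")),
    "expressive": Category(Fit.NONE, None, ("thank", "apologize", "congratulate")),
    "declaration": Category(Fit.BOTH, None, ("fire", "pronounce married")),
}


def categories_with_fit(fit: Fit) -> list[str]:
    """List the category names that share a direction of fit."""
    return [name for name, cat in SEARLE.items() if cat.fit is fit]
```

For instance, `categories_with_fit(Fit.WORLD_TO_WORD)` groups directives and commissives, which is why both matter for task allocation: each describes a change the world is expected to undergo.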
Performativity & Agent Communication Languages
A performative utterance is one where "saying is doing"—the speech act itself constitutes the action (e.g., "I promise..."). This concept is directly implemented in Agent Communication Languages (ACLs) like FIPA ACL. An ACL message explicitly encodes:
- Performative: The illocutionary force (e.g., inform, request, cfp (call-for-proposal), propose).
- Content: The propositional content of the message.
- Sender/Receiver: The participating agents.
- Protocol & Conversation ID: The interaction context.
This formalization allows software agents to unambiguously interpret messages as actions, not just strings, enabling reliable coordination.
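The message fields listed above map naturally onto a small record type. The following is a sketch only: `ACLMessage` and its `reply` helper are hypothetical and do not reproduce the API of any FIPA implementation, and the performative set is limited to the four named above.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

# Only the performatives mentioned in the text; a real ACL defines more.
PERFORMATIVES = {"inform", "request", "cfp", "propose"}


@dataclass(frozen=True)
class ACLMessage:
    performative: str                 # illocutionary force
    sender: str
    receiver: str
    content: str                      # propositional content
    protocol: Optional[str] = None    # interaction protocol, if any
    conversation_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def __post_init__(self) -> None:
        if self.performative not in PERFORMATIVES:
            raise ValueError(f"unknown performative: {self.performative!r}")

    def reply(self, performative: str, content: str) -> "ACLMessage":
        """Build an answer that stays in the same conversation."""
        return ACLMessage(
            performative=performative,
            sender=self.receiver,
            receiver=self.sender,
            content=content,
            protocol=self.protocol,
            conversation_id=self.conversation_id,
        )
```

Keeping the performative separate from the content is the point of the design: a receiver can decide how to react from the performative alone, before parsing the content.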
Conversation Policies & Interaction Protocols
Individual speech acts are composed into structured sequences called conversation policies or interaction protocols. These define the permissible flow of communicative acts between agents to achieve a larger goal. Common patterns include:
- Request-Response: A request followed by an inform (result) or refuse.
- Contract Net: A cfp (call-for-proposal), followed by propose messages from bidders, leading to accept-proposal/reject-proposal and inform-done/failure.
- Auctions & Negotiations: Sequences of propose, counter-propose, accept, and reject.
Protocols are often specified using finite state machines or UML sequence diagrams, providing a predictable framework for complex, stateful agent dialogues grounded in speech act theory.
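As an illustration of the finite-state-machine view, the Contract Net sequence described above can be written as a transition table. This is a simplified sketch for a single initiator and a single bidder. The state names are invented for the example, and refusals, timeouts, and multiple bidders are left out.

```python
class ProtocolError(Exception):
    """Raised when a performative is not allowed in the current state."""


# (current state, performative) -> next state
TRANSITIONS = {
    ("start", "cfp"): "awaiting_proposal",
    ("awaiting_proposal", "propose"): "awaiting_decision",
    ("awaiting_decision", "accept-proposal"): "awaiting_result",
    ("awaiting_decision", "reject-proposal"): "closed",
    ("awaiting_result", "inform-done"): "completed",
    ("awaiting_result", "failure"): "failed",
}


def step(state: str, performative: str) -> str:
    """Advance the conversation by one message or reject the message."""
    try:
        return TRANSITIONS[(state, performative)]
    except KeyError:
        raise ProtocolError(
            f"{performative!r} is not allowed in state {state!r}"
        ) from None


def run(performatives: list[str]) -> str:
    """Replay a whole conversation and return its final state."""
    state = "start"
    for performative in performatives:
        state = step(state, performative)
    return state
```

A conversation such as `run(["cfp", "propose", "accept-proposal", "inform-done"])` ends in `"completed"`, while an out-of-order message (for example a `propose` before any `cfp`) raises `ProtocolError`.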
Frequently Asked Questions
Speech Act Theory provides the formal linguistic foundation for communication in multi-agent systems, modeling messages as actions that change the state of a conversation and the commitments between agents.
Speech Act Theory is a linguistic and philosophical framework that models communication as the performance of actions, such as informing, requesting, or promising, forming the theoretical basis for Agent Communication Languages (ACLs) and Interaction Protocols in multi-agent systems. Developed by philosophers J.L. Austin and John Searle, it posits that uttering a sentence (a locutionary act) performs an action (an illocutionary act) and produces an effect on the hearer (a perlocutionary act). In AI, this translates agents' messages into formal communicative acts with defined semantics, enabling predictable, goal-directed interactions. For example, an agent sending a REQUEST speech act asks the recipient to perform an action, and the governing protocol typically expects the recipient to agree or refuse, while an INFORM act is intended to update the recipient's beliefs.
Related Terms
Speech Act Theory provides the philosophical foundation for communication in multi-agent systems. These related concepts detail the formal languages, protocols, and normative structures built upon it to enable structured agent interaction.
Agent Communication Language (ACL)
A formal language with defined syntax, semantics, and pragmatics that enables autonomous agents to exchange knowledge and requests. It operationalizes Speech Acts into computable messages.
- Syntax: Defines the message structure (e.g., sender, receiver, performative, content).
- Semantics: Provides the precise meaning of message elements.
- Pragmatics: Governs the intended effect and context of the message exchange.
- Standard Example: FIPA ACL, from the Foundation for Intelligent Physical Agents (FIPA), is a widely referenced standard specifying performatives like inform, request, cfp (call-for-proposal), and propose.
Interaction Protocol
A predefined, structured sequence of permissible message exchanges between agents to achieve a specific communicative goal. It choreographs Speech Acts into a predictable conversation.
- Formal Specification: Often defined using finite state machines, Petri nets, or sequence diagrams.
- Common Protocols: Contract Net Protocol (for task allocation), various auction protocols (English, Dutch, Vickrey), and negotiation protocols.
- Role: Ensures that a series of communicative acts (e.g., a call for proposals followed by bids and an award) follows a mutually understood pattern, preventing miscoordination.
Social Commitments
Normative constructs that create obligations between agents, defining that a debtor agent is committed to a creditor agent to bring about a certain condition. They provide a formal mechanism for trust and accountability.
- Foundation: Grounded in the Speech Act of promising.
- Lifecycle: Commitments have states (e.g., active, fulfilled, violated, terminated).
- Function: Enable agents to reason about the future behavior of others, forming the basis for cooperative planning. If Agent A commits to delivering a result to Agent B, B can plan its subsequent actions accordingly.
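The commitment lifecycle above can be sketched as a small state machine. The assumption here is that every transition starts from the active state and that the other three states are final. `Commitment` and `State` are illustrative names, not part of any standard library for commitments.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    FULFILLED = "fulfilled"
    VIOLATED = "violated"
    TERMINATED = "terminated"


@dataclass
class Commitment:
    debtor: str      # agent that owes the condition
    creditor: str    # agent the condition is owed to
    condition: str
    state: State = State.ACTIVE

    def _close(self, new_state: State) -> None:
        # Only an active commitment can change state in this sketch.
        if self.state is not State.ACTIVE:
            raise ValueError(f"commitment is already {self.state.value}")
        self.state = new_state

    def fulfill(self) -> None:
        self._close(State.FULFILLED)

    def violate(self) -> None:
        self._close(State.VIOLATED)

    def terminate(self) -> None:
        self._close(State.TERMINATED)
```

With this shape, Agent B can check `commitment.state` before scheduling work that depends on Agent A's result.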
Electronic Institutions
Computational frameworks that define the norms, rules, and structured interaction spaces governing autonomous agents, ensuring orderly societal interactions. They provide the 'rules of the game' for Speech Act-based communication.
- Components: Include roles agents can play, scenes (virtual rooms for specific interactions), and normative rules that specify penalties or rewards.
- Analogy: Functions like a digital marketplace or courtroom, where protocols dictate who can speak, when, and what they can say.
- Purpose: Reduces complexity and enforces desirable global properties (e.g., fairness, non-repudiation) in open multi-agent systems.
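A rough sketch of the "who can say what" part of a scene is given below. It assumes a scene is defined by a set of permitted (role, performative) pairs. The `Scene` class is hypothetical, and it does not model ordering ("when") or penalties.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    name: str
    # (role, performative) pairs that the scene's rules allow
    permitted: set[tuple[str, str]] = field(default_factory=set)
    # accepted utterances, kept as an audit trail
    log: list[tuple[str, str, str]] = field(default_factory=list)

    def speak(self, agent: str, role: str, performative: str) -> bool:
        """Accept the utterance only if the agent's role may perform it."""
        if (role, performative) not in self.permitted:
            return False
        self.log.append((agent, role, performative))
        return True


auction = Scene("auction", {("auctioneer", "cfp"), ("bidder", "propose")})
```

Here `auction.speak("a1", "auctioneer", "cfp")` is accepted and logged, while a bidder attempting `cfp` is rejected and leaves no trace in the log.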
Belief-Desire-Intention (BDI) Architecture
A prominent software model for intelligent agents based on practical reasoning, where an agent's behavior is driven by its Beliefs (world model), Desires (goals), and Intentions (committed plans). Speech Acts are the primary means for BDI agents to interact.
- Reasoning Loop: Agents perceive (update beliefs), deliberate on desires, plan (form intentions), and act.
- Communication Link: An inform speech act is intended to update the recipient's beliefs. A request speech act can give the recipient a new desire or intention, if the recipient chooses to adopt it.
- Framework Example: JACK implements the BDI model directly. JADE is a FIPA-compliant agent platform with built-in ACL messaging; BDI reasoning can be added to it through extensions such as Jadex.
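The communication link can be illustrated with a toy agent that reacts to the two performatives mentioned above. This is a sketch under simplifying assumptions, not the JACK or JADE API: `BDIAgent` is a hypothetical class, beliefs are plain strings, and the trust check stands in for whatever belief-revision policy a real agent would apply.

```python
from dataclasses import dataclass, field


@dataclass
class BDIAgent:
    name: str
    trusted: set[str] = field(default_factory=set)   # senders whose reports are believed
    beliefs: set[str] = field(default_factory=set)
    desires: list[str] = field(default_factory=list)

    def receive(self, sender: str, performative: str, content: str) -> None:
        if performative == "inform":
            # Adopt the reported fact only if the sender is trusted.
            if sender in self.trusted:
                self.beliefs.add(content)
        elif performative == "request":
            # A request becomes a candidate goal; deliberation happens elsewhere.
            if content not in self.desires:
                self.desires.append(content)
        else:
            raise ValueError(f"unsupported performative: {performative!r}")
```

The agent stays autonomous in this model: an inform from an untrusted sender is ignored, and a request only adds a desire rather than forcing an intention.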
FIPA ACL Performatives
The set of communicative acts defined by the FIPA standard, providing a concrete vocabulary for Agent Communication Languages. Each performative corresponds to a type of Speech Act with formal semantics.
- Assertives: inform, confirm, disconfirm (commit the speaker to the truth of a proposition).
- Directives: request, cfp (call-for-proposal), subscribe (attempt to get the hearer to perform an action).
- Commissives: propose, accept-proposal, reject-proposal (propose commits the speaker to a future course of action; accept-proposal and reject-proposal settle whether that commitment takes effect).
- Declaratives: proxy (asks another agent to pass a message on to agents it selects; only a loose fit for Searle's declarations).
- Example: (request :sender A :receiver B :content "open valve V1") is a directive Speech Act.
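To make the example message concrete, the sketch below renders and parses that flat, parenthesized form. It assumes one level of `:key value` pairs with a double-quoted content string, which is enough for the example but far short of the full FIPA string encoding (no nested expressions, for instance). `render` and `parse` are names made up for this illustration.

```python
import shlex


def render(performative: str, sender: str, receiver: str, content: str) -> str:
    """Build a flat message string with a quoted content field."""
    escaped = content.replace("\\", "\\\\").replace('"', '\\"')
    return f'({performative} :sender {sender} :receiver {receiver} :content "{escaped}")'


def parse(text: str) -> dict[str, str]:
    """Split a flat message string into its performative and fields."""
    text = text.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError("message must be wrapped in parentheses")
    # shlex handles the double-quoted content as a single token.
    tokens = shlex.split(text[1:-1])
    if not tokens or len(tokens) % 2 == 0:
        raise ValueError("expected a performative followed by :key value pairs")
    fields = {"performative": tokens[0]}
    for key, value in zip(tokens[1::2], tokens[2::2]):
        if not key.startswith(":"):
            raise ValueError(f"expected a :key, got {key!r}")
        fields[key[1:]] = value
    return fields
```

Parsing the example above yields the performative `request` with sender `A`, receiver `B`, and content `open valve V1`; a receiver can then dispatch on the performative alone.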

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.