Speech Act Theory

Speech Act Theory is a linguistic and philosophical framework that models communication as the performance of actions, forming the theoretical basis for agent interaction protocols in AI.
AGENT COORDINATION PATTERNS

What is Speech Act Theory?

Speech Act Theory is a linguistic and philosophical framework that analyzes utterances as actions, such as informing, requesting, or promising, rather than as mere statements of fact. Developed by philosophers J.L. Austin and John Searle, it distinguishes between the locutionary act (the literal utterance), the illocutionary act (the intended force or function), and the perlocutionary act (the effect on the listener). In multi-agent systems, the theory provides the semantic foundation for Agent Communication Languages (ACLs) and Interaction Protocols, so that messages between autonomous agents are not just transmitted but interpreted as actions with a defined communicative intent.

For agent coordination, Speech Act Theory enables precise modeling of communicative intent, which is critical for protocols like the Contract Net Protocol (which opens with a call for proposals) or for establishing Social Commitments (a promise to act). An illocutionary force, such as REQUEST or INFORM, is encoded explicitly as the performative of an ACL message, allowing agents to reason about obligations and trigger appropriate behavioral responses. This moves agent communication beyond simple data exchange to a framework of performative utterances that shape system state, task allocation, and collaborative plans within an orchestrated environment.

Core Concepts of Speech Act Theory

01

Locutionary, Illocutionary, Perlocutionary Acts

J.L. Austin's core taxonomy distinguishes three simultaneous dimensions of a speech act:

  • Locutionary Act: The literal utterance of words with a specific meaning and reference (e.g., saying "The door is open").
  • Illocutionary Act: The intended action performed in saying something (e.g., informing, warning, or promising). This is the core communicative force.
  • Perlocutionary Act: The consequential effect achieved by saying something on the listener's thoughts or actions (e.g., persuading someone to close the door, startling them).

In multi-agent systems, protocols focus on standardizing the illocutionary force to ensure predictable interpretation.

02

Illocutionary Force & Felicity Conditions

The illocutionary force is the speaker's intent (e.g., to request, commit, or declare). For an act to be successful or "felicitous," certain felicity conditions must be met:

  • Preparatory Conditions: Contextual prerequisites (e.g., to promise, you must be capable of performing the promised action).
  • Sincerity Condition: The speaker must possess the requisite psychological state (e.g., intention to fulfill a promise).
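  • Essential Condition: The utterance counts as undertaking an obligation (e.g., a promise creates an expectation of future action).

Agent Communication Languages (ACLs) encode analogous conditions in their message semantics so that communicative acts can be validated. FIPA ACL, for instance, defines feasibility preconditions and a rational effect for each communicative act.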
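
The three conditions above can be sketched as a check an agent runs before it records a promise as a commitment. This is a minimal illustration in plain Python, not the semantics of any real ACL: `AgentState`, `Promise`, `check_promise`, and `commit` are hypothetical names, and modeling capabilities and intentions as sets of action names is an assumption made for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical view of what the promising agent can do and intends to do."""
    capabilities: set[str] = field(default_factory=set)
    intentions: set[str] = field(default_factory=set)


@dataclass(frozen=True)
class Promise:
    speaker: str
    hearer: str
    action: str


def check_promise(promise: Promise, speaker_state: AgentState) -> list[str]:
    """Return the names of the felicity conditions the promise violates."""
    violations = []
    # Preparatory condition: the speaker must be capable of the promised action.
    if promise.action not in speaker_state.capabilities:
        violations.append("preparatory")
    # Sincerity condition: the speaker must actually intend to perform it.
    if promise.action not in speaker_state.intentions:
        violations.append("sincerity")
    return violations


def commit(promise: Promise, speaker_state: AgentState, commitments: list) -> list[str]:
    """Essential condition: a felicitous promise counts as a new obligation."""
    violations = check_promise(promise, speaker_state)
    if not violations:
        commitments.append((promise.speaker, promise.hearer, promise.action))
    return violations
```

A promise to perform an action the agent cannot perform and does not intend to perform would return both violations and leave the commitment list unchanged.
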
03

Directive vs. Commissive Acts

John Searle's classification categorizes illocutionary acts by their direction of fit and psychological state:

  • Directives: Acts where the speaker attempts to get the hearer to do something. Direction of fit is world-to-word (the world changes to match the words). Examples: requests, commands, questions. Associated psychological state is want or desire.
  • Commissives: Acts where the speaker commits to a future course of action. Direction of fit is also world-to-word. Examples: promises, pledges, oaths. Associated psychological state is intention.

Both categories are fundamental to agent protocols: directives for task allocation and commissives for forming commitments.

04

Assertives, Expressives, & Declarations

Searle's other core categories complete the model of agent communication:

  • Assertives: Acts that commit the speaker to the truth of a proposition. Direction of fit is word-to-world (words match the world). Examples: informing, stating, concluding. Psychological state is belief.
  • Expressives: Acts that express the speaker's psychological state about a state of affairs. No direction of fit. Examples: thanking, apologizing, congratulating.
  • Declarations: Acts that immediately change institutional reality by their successful performance. Direction of fit runs both ways (the words change the world, and the world thereby matches the words). Examples: "You're fired," "I pronounce you married."

Declarations require a specific institutional role or authority, which makes them highly relevant for agents operating within electronic institutions.
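
Searle's five categories, as described in this section and the previous one, can be written down as a small lookup table. The sketch below is illustrative only: `Fit`, `ActClass`, `TAXONOMY`, and `classify` are hypothetical names, and the example verbs are the ones listed in the text rather than an exhaustive vocabulary.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Fit(Enum):
    WORD_TO_WORLD = "word-to-world"  # the words must match the world
    WORLD_TO_WORD = "world-to-word"  # the world is to change to match the words
    BOTH = "both"
    NONE = "none"


@dataclass(frozen=True)
class ActClass:
    name: str
    fit: Fit
    psychological_state: Optional[str]  # None where the text gives no single state
    examples: tuple


TAXONOMY = (
    ActClass("assertive", Fit.WORD_TO_WORLD, "belief", ("inform", "state", "conclude")),
    ActClass("directive", Fit.WORLD_TO_WORD, "desire", ("request", "command", "question")),
    ActClass("commissive", Fit.WORLD_TO_WORD, "intention", ("promise", "pledge", "oath")),
    ActClass("expressive", Fit.NONE, None, ("thank", "apologize", "congratulate")),
    ActClass("declaration", Fit.BOTH, None, ("fire", "pronounce")),
)


def classify(verb: str) -> Optional[ActClass]:
    """Return the category whose example list contains the verb, if any."""
    for act_class in TAXONOMY:
        if verb in act_class.examples:
            return act_class
    return None
```

For instance, `classify("promise")` returns the commissive entry, while a verb outside the table returns `None`.
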
05

Performativity & Agent Communication Languages

A performative utterance is one where "saying is doing"—the speech act itself constitutes the action (e.g., "I promise..."). This concept is directly implemented in Agent Communication Languages (ACLs) like FIPA ACL. An ACL message explicitly encodes:

  • Performative: The illocutionary force (e.g., inform, request, cfp (call-for-proposal), propose).
  • Content: The propositional content of the message.
  • Sender/Receiver: The participating agents.
  • Protocol & Conversation ID: The interaction context.

This formalization allows software agents to unambiguously interpret messages as actions, not just strings, enabling reliable coordination.
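
The message fields listed above map naturally onto a small record type. The following is a plain-Python sketch, not FIPA's wire encoding or the API of any agent framework: the `ACLMessage` class, the `PERFORMATIVES` set, and the agent names, protocol label, and content string in the usage note are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative subset of performatives named in this article.
PERFORMATIVES = {
    "inform", "request", "cfp", "propose",
    "accept-proposal", "reject-proposal", "refuse", "failure",
}


@dataclass(frozen=True)
class ACLMessage:
    performative: str                      # the illocutionary force
    sender: str
    receiver: str
    content: str                           # the propositional content
    protocol: Optional[str] = None         # interaction protocol in use
    conversation_id: Optional[str] = None  # ties messages to one dialogue

    def __post_init__(self):
        # Reject messages whose force the receiver could not interpret.
        if self.performative not in PERFORMATIVES:
            raise ValueError(f"unknown performative: {self.performative!r}")
```

A call for proposals could then be built as `ACLMessage("cfp", "manager", "bidder-1", "deliver(parcel-42)", protocol="contract-net", conversation_id="conv-001")`, while an unrecognized performative raises `ValueError` at construction time.
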
06

Conversation Policies & Interaction Protocols

Individual speech acts are composed into structured sequences called conversation policies or interaction protocols. These define the permissible flow of communicative acts between agents to achieve a larger goal. Common patterns include:

  • Request-Response: A request followed by an inform (result) or refuse.
  • Contract Net: A cfp (call-for-proposal), followed by propose messages from bidders, leading to accept-proposal/reject-proposal and inform-done/failure.
  • Auctions & Negotiations: Sequences of propose, counter-propose, accept, and reject.

Protocols are often specified using finite state machines or UML sequence diagrams, providing a predictable framework for complex, stateful agent dialogues grounded in speech act theory.
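
As a sketch of the finite-state-machine view, the Contract Net sequence described above can be expressed as a transition table for a single initiator-bidder conversation. The state names, `TRANSITIONS`, `ProtocolError`, and `run` are hypothetical, and the table is a simplification that covers only the messages named in this section (for example, it treats `inform-done` as a single label and omits timeouts and multiple bidders).

```python
# (current state, performative received or sent) -> next state
TRANSITIONS = {
    ("start", "cfp"): "awaiting_proposal",
    ("awaiting_proposal", "propose"): "proposal_received",
    ("proposal_received", "accept-proposal"): "awaiting_result",
    ("proposal_received", "reject-proposal"): "closed",
    ("awaiting_result", "inform-done"): "closed",
    ("awaiting_result", "failure"): "closed",
}


class ProtocolError(Exception):
    """Raised when a performative is not permitted in the current state."""


def run(performatives, state="start"):
    """Replay a sequence of performatives and return the final state."""
    for performative in performatives:
        key = (state, performative)
        if key not in TRANSITIONS:
            raise ProtocolError(
                f"{performative!r} is not allowed in state {state!r}"
            )
        state = TRANSITIONS[key]
    return state
```

Replaying `["cfp", "propose", "accept-proposal", "inform-done"]` ends in the `closed` state, whereas sending `accept-proposal` before any `propose` raises `ProtocolError`, which is the kind of out-of-order message a conversation policy is meant to rule out.
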
SPEECH ACT THEORY

Frequently Asked Questions

Speech Act Theory provides the formal linguistic foundation for communication in multi-agent systems, modeling messages as actions that change the state of a conversation and the commitments between agents.

Speech Act Theory is a linguistic and philosophical framework that models communication as the performance of actions, such as informing, requesting, or promising, forming the theoretical basis for Agent Communication Languages (ACLs) and Interaction Protocols in multi-agent systems. Developed by philosophers J.L. Austin and John Searle, it posits that uttering a sentence (a locutionary act) performs an action (an illocutionary act) and can produce an effect on the hearer (a perlocutionary act). In AI, this translates agents' messages into formal communicative acts with defined semantics, enabling predictable, goal-directed interactions. For example, an agent sending a REQUEST creates an expectation that the recipient will respond, for instance by agreeing or refusing, while an INFORM act is intended to update the recipient's beliefs.
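
To make the REQUEST and INFORM example concrete, here is a minimal sketch of a receiving agent that dispatches on the performative. The `Agent` class and its `receive` method are hypothetical. Replying with `agree` to every request and trusting every `inform` are simplifying assumptions; a real agent would decide whether to agree or refuse and whether to believe the sender.

```python
class Agent:
    def __init__(self, name):
        self.name = name
        self.beliefs = set()  # propositions the agent currently holds true
        self.pending = []     # requests the agent has agreed to handle

    def receive(self, performative, sender, content):
        """Interpret a message by its illocutionary force; return a reply or None."""
        if performative == "inform":
            # Assumption: the sender is trusted, so the content becomes a belief.
            self.beliefs.add(content)
            return None
        if performative == "request":
            # Assumption: every request is accepted and queued for later work.
            self.pending.append((sender, content))
            return ("agree", self.name, content)
        # Anything else is outside this sketch's vocabulary.
        return ("not-understood", self.name, content)
```

Here an `inform` changes only the receiver's beliefs, while a `request` both queues work and produces a reply, which is the difference in force that the performative carries.
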

Prasad Kumkar

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over more than five years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.