Inferensys

Glossary

Programming by Example (PBE)

Programming by Example (PBE) is a program synthesis paradigm where a system automatically generates a general program from a set of concrete input-output pairs provided as the specification.
Isolated secure server room with network cables physically disconnected, minimal lighting, security-focused environment.
PROGRAM SYNTHESIS

What is Programming by Example (PBE)?

Programming by Example (PBE) is a subfield of program synthesis where the system infers a general program from a set of concrete input-output demonstrations.

Programming by Example (PBE) is a program synthesis paradigm where the specification is provided as a set of concrete input-output pairs. The system's core task is to infer a general program—often in a Domain-Specific Language (DSL)—that correctly produces the given outputs from the corresponding inputs. This approach is highly accessible, allowing users without programming expertise to automate repetitive data transformation tasks, such as formatting text or cleaning spreadsheets, by simply demonstrating a few desired corrections.

The technical challenge in PBE is searching a vast space of possible programs for one consistent with all examples. Solvers use techniques like enumerative search, constraint solving, and version space algebra to efficiently prune possibilities. A canonical industrial application is Microsoft Excel's FlashFill. PBE is closely related to program induction and often employs Oracle-Guided Synthesis, where the user interactively provides new examples to refine the result, bridging the gap between demonstration and a robust, executable script.

PROGRAM SYNTHESIS

Core Characteristics of Programming by Example

Programming by Example (PBE) is a program synthesis paradigm where the specification is provided as a set of concrete input-output pairs, and the system infers a general program that satisfies all examples. Its core characteristics define its unique approach to automating code creation.

01

Specification via Examples

The fundamental input to a PBE system is a specification provided as a set of concrete input-output pairs. Unlike formal logic or natural language, the user demonstrates the desired transformation with specific instances. The system's core challenge is to generalize from these finite examples to a program that works for all unseen inputs conforming to an implicit pattern. This makes PBE highly accessible, as users need no formal specification language.

  • Example: In a spreadsheet, a user provides "John Doe" -> "Doe, J." and "Jane Smith" -> "Smith, J.".
  • Goal: The synthesizer infers a program that extracts the last name, adds a comma, and appends the first initial for any similar full name input.
02

Search in a Constrained Space

PBE is fundamentally a search problem over the space of all possible programs. To make this tractable, synthesis is performed within a Domain-Specific Language (DSL). The DSL provides a limited set of primitives (e.g., string functions like Concat, Substring, Replace) and composition rules, which constrains the search to plausible, efficient programs for the target domain.

  • Key Mechanism: The synthesizer uses deductive search, enumerative search, or constraint solving to explore the DSL-defined space.
  • Benefit: This ensures generated programs are usable and efficient within their domain, such as data transformations in Excel or shell script commands.
03

Ranking and Disambiguation

Given a finite set of examples, there are often many candidate programs that satisfy them. PBE systems employ a ranking function or Occam's razor principle to select the simplest, most likely program intended by the user. This is critical for user intent disambiguation.

  • Common Heuristics: Prefer shorter programs, programs using more familiar operators, or programs that generalize in predictable ways.
  • Example: For examples "cat" -> "c" and "dog" -> "d", both first character and remove all but first character are valid. Ranking selects the simpler, more intuitive first character function.
04

Interactive Refinement Loop

PBE is typically an interactive, human-in-the-loop process. The user provides initial examples, the system proposes a program, and the user can refine the specification by:

  • Providing additional positive examples to clarify the pattern.
  • Providing negative examples (counterexamples) to rule out incorrect generalizations.
  • Directly editing the output for a new input, creating a new input-output pair. This iterative refinement loop is central to systems like Microsoft Excel's FlashFill, allowing users to quickly converge on the correct program through collaboration with the synthesizer.
05

Connection to Inductive Programming

PBE is a form of inductive programming, where a general rule (the program) is inferred from specific observations (the examples). This contrasts with deductive synthesis, which derives a program from a complete formal specification. The inductive nature introduces the generalization challenge: the system must hypothesize a program that works for the infinite set of inputs implied by the finite examples without overfitting to the provided instances. Techniques from machine learning, such as learning decision trees or using version space algebras, are often applied to manage this hypothesis space.

06

Primary Application Domains

PBE excels in domains where tasks are repetitive, rule-based, and easily demonstrated. Its major applications include:

  • Data Wrangling & Transformation: Synthesizing string manipulations, column splits, and format conversions in tools like Trifacta Wrangler or Microsoft Excel's FlashFill.
  • Shell Script & Regular Expression Synthesis: Generating command-line pipelines or regex patterns from example text matches.
  • UI Automation: Inferring macros or scripts from user demonstrations of graphical interface interactions.
  • Educational Tools: Helping novices learn programming concepts by showing the code that corresponds to their actions. The success in these areas stems from PBE's ability to turn demonstrative intent into executable code without requiring programming expertise.
PROGRAM SYNTHESIS

How Programming by Example Works

Programming by Example (PBE) is a paradigm of program synthesis where the system infers a general program from a set of concrete input-output demonstrations.

Programming by Example (PBE) is a program synthesis technique where the user specifies desired behavior by providing concrete input-output pairs, and the system automatically infers a general program that satisfies all given examples. This approach transforms a specification-by-demonstration into executable code, such as a string transformation in a spreadsheet or a data-wrangling script, without requiring the user to write formal logic or code. The core challenge is the ambiguity of examples; many possible programs can fit the same few data points.

The synthesizer resolves this ambiguity through a search-and-verify loop. It searches a space of programs defined by a Domain-Specific Language (DSL) for candidates that produce the correct outputs for all given inputs. Advanced systems like FlashFill use deductive reasoning and constraint solving to efficiently prune the search. To ensure the generated program generalizes correctly, the system may propose new counterexamples for the user to label, a process known as interactive synthesis or oracle-guided synthesis.

PROGRAMMING BY EXAMPLE (PBE)

Frequently Asked Questions

Programming by Example (PBE) is a core technique in program synthesis that enables users to generate executable code by providing concrete demonstrations of the desired behavior. This FAQ addresses common technical questions about its mechanisms, applications, and relationship to other AI paradigms.

Programming by Example (PBE) is a program synthesis paradigm where the system infers a general, executable program from a set of concrete input-output pairs provided by the user. It works by treating the examples as a specification and searching a constrained space of possible programs, defined by a Domain-Specific Language (DSL), for one that produces the correct outputs for all given inputs. The core technical challenge is search space explosion; solvers use techniques like version space algebra, Satisfiability Modulo Theories (SMT), or neural-guided search to efficiently find a candidate program that generalizes correctly beyond the provided examples.

Prasad Kumkar

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.