Back

Survey: Formal Logic and Reasoning Frameworks for Robotic Systems

Autonomous robots must reason about actions, time, uncertainty, and physical constraints to operate reliably in unstructured environments. Over six decades, the AI and robotics communities have developed a rich landscape of logical formalisms — from classical planning languages like PDDL to temporal logics like STL, probabilistic programming languages like Pyro, and SMT solvers like Z3 — each capturing different facets of the reasoning problem. This survey provides a comprehensive, opinionated guide to these logical tools, organized by the type of reasoning they enable, and evaluates their maturity, complementarity, and applicability to modern robotic systems. We argue that no single formalism suffices; the frontier lies in principled integration of multiple logical tools within unified architectures, increasingly mediated by large language models that translate between human intent and formal specification.

1. Introduction & Problem Definition

A robot tasked with “set the table for dinner” must determine which objects to move and in what order (task planning), compute collision-free trajectories for each manipulation (motion planning), monitor that plates do not collide during placement (safety verification), reason about uncertain object poses from noisy perception (state estimation), and recover gracefully when a glass slips from the gripper (replanning under partial observability). Each of these sub-problems demands a different style of formal reasoning, and the overarching challenge is that they are deeply coupled.

Why this matters. As robots move from structured factory floors to homes, hospitals, and disaster sites, the environments become open-ended, the tasks become linguistically specified, and the consequences of failure become safety-critical. Informal, ad-hoc reasoning — hardcoded state machines, hand-tuned heuristics — does not scale. Formal logical tools provide three things informal methods cannot: (1) compositionality — complex specifications built from simple, reusable primitives; (2) verifiability — the ability to prove, or at least systematically check, that a robot’s behavior satisfies its specification; and (3) interpretability — formal specifications are inspectable and debuggable, unlike opaque neural policies.

What makes it hard. The core difficulty is a trilemma among expressiveness, computational tractability, and groundability. Classical planning (PDDL) is tractable and groundable but cannot express continuous dynamics or uncertainty. Temporal logics (LTL, STL) elegantly express time-extended behaviors but synthesis is computationally expensive. Probabilistic programming handles uncertainty natively but inference is intractable for high-dimensional problems. No single formalism occupies the sweet spot. Moreover, robots operate at the intersection of discrete symbolic reasoning and continuous physical dynamics, a regime that strains every formalism’s assumptions. The field’s history is, in many ways, the history of attacking different faces of this trilemma.

This survey covers the full landscape: classical logic-based planning (Section 2.1), temporal logics (Section 2.2), probabilistic reasoning (Section 2.3), satisfiability-based methods (Section 2.4), ontological reasoning (Section 2.5), and behavior trees as a reactive execution formalism (Section 2.6). We trace the field’s evolution from McCarthy’s 1963 situation calculus to today’s LLM-mediated neuro-symbolic pipelines (Section 3), identify the most impactful contributions in each family (Section 4), catalog practical software tools (Section 5), assess the state of the art (Section 6), and argue for the most promising open directions (Section 7).

2. Taxonomy of Approaches

We organize logical tools for robotic reasoning along two axes: what kind of reasoning the formalism supports, and what kind of world model it assumes. Table 1 provides the high-level taxonomy; subsequent subsections elaborate each family.

Logical Tools for Robotic Reasoning
  • A Classical Logic-Based Planning
    • A1. PDDL
    • A2. ASP / Action Languages (A/B/C)
    • A3. Situation Calculus
  • B Temporal Logics
    • B1. LTL, CTL, GR(1)
    • B2. STL, MTL
    • B3. Extensions (cLTL, CaTL, ProbSTL, SpaTiaL, wSTL)
  • C Probabilistic Reasoning
    • C1. PPLs (Church, Pyro, Gen, ProbLog)
    • C2. Probabilistic Model Checking (PRISM, Storm)
    • C3. Belief-space Planning, POMDPs
  • D Satisfiability-Based Methods
    • D1. SAT / SMT (Z3)
    • D2. δ-complete reasoning (dReal)
    • D3. MILP
  • E Ontological & Knowledge Representation
    • E1. OWL / Description Logics (KnowRob)
    • E2. Prolog
  • F Behavior Trees
    • F1. BT formalism + LTL/CTL verification
    • F2. BT synthesis from temporal logic
Table 1. Taxonomy of Logical Tools for Robotic Reasoning
Family Representative Formalisms World Model Strengths Weaknesses
Classical Logic-Based Planning Situation Calculus, STRIPS, PDDL, Action Languages (A/B/C), ASP Discrete, deterministic, fully observable Mature solvers, composable actions, well-understood complexity No time, no uncertainty, no continuous dynamics
Temporal Logics LTL, CTL, MTL, STL, cLTL, CaTL, GR(1) Discrete (LTL/CTL) or continuous (STL) traces Rich temporal specifications, synthesis & monitoring State explosion, expensive synthesis, limited scalability
Probabilistic Reasoning Church, Pyro, Gen, ProbLog, PRISM, Storm, POMDPs Stochastic, partially observable Handles uncertainty natively, principled inference Inference intractability, scalability bottleneck
Satisfiability-Based SAT, SMT (Z3, dReal), MILP Constraint systems over Booleans/reals Sound, complete (or δ-complete), handles nonlinear dynamics Scalability, discretization overhead
Ontological / Knowledge Representation OWL, Description Logics, Prolog, KnowRob Structured relational knowledge Semantic interoperability, commonsense reasoning No dynamics, no temporal logic, scalability
Behavior Trees BT formalism + LTL/CTL verification Reactive, hierarchical control Modular, reactive, composable, verifiable Not a reasoning formalism per se; requires external planners

2.1 Classical Logic-Based Planning

Classical logic-based planning treats the world as a set of logical propositions (true/false facts about the current state) and actions that change those propositions. The core question is: given an initial state and a goal state, find a sequence of actions that transforms one into the other. Formally, a planning problem is a tuple \(\langle S, A, \gamma, s_0, G \rangle\) where \(S\) is a finite set of states (each defined by a set of propositions), \(A\) is a set of actions, \(\gamma: S \times A \to S\) is a deterministic transition function, \(s_0\) is the initial state, and \(G \subseteq S\) is the set of goal states. Each action \(a \in A\) has preconditions (propositions that must hold for the action to be applicable) and effects (propositions added or removed after execution).

Example. Consider a robot that must set a table. The state includes propositions like on(plate, counter), hand-empty, clear(table). The action pick-up(plate) has preconditions {on(plate, counter), hand-empty} and effects {holding(plate), ¬on(plate, counter), ¬hand-empty}. A planner searches for a sequence of such actions that reaches the goal state {on(plate, table), on(fork, table)}. The key assumption is that everything not mentioned in an action’s effects stays unchanged — the frame assumption.

The oldest family traces to McCarthy’s situation calculus [44], which reified world states (“situations”) and actions as first-order logical terms. The key operation is result(a, s) — applying action a in situation s to produce a successor situation. This was elegant but immediately ran into the frame problem: the need to explicitly axiomatize everything that does not change after each action, producing O(actions × fluents) frame axioms.

STRIPS and PDDL. The practical response to the frame problem was STRIPS [19], which introduced add/delete lists to compactly represent action effects with an implicit frame assumption: anything not mentioned in the effects is unchanged. PDDL [45] standardized and extended the STRIPS representation with types, conditional effects, universal quantification, and a modular “requirements” system. PDDL remains the de facto standard for task planning in robotics. Its subsequent versions added temporal planning with durative actions (PDDL 2.1 (Fox and Long, 2003)), derived predicates (PDDL 2.2), trajectory constraints and preferences (PDDL 3.0), and hybrid discrete-continuous processes (PDDL+). For probabilistic extensions, PPDDL introduced stochastic effects and RDDL provided relational dynamic influence diagrams for POMDPs.

PDDL’s power comes from its ecosystem: domain-independent planners like Fast Downward (Helmert et al., 2006) use sophisticated heuristic search (causal graph heuristics, landmark heuristics) to solve PDDL problems efficiently, and the International Planning Competition (IPC) has driven two decades of algorithmic progress. In robotics, PDDL is the standard task-level backbone in task-and-motion planning (TAMP) systems. PDDLStream [22] elegantly extends PDDL with streams — declarative wrappers around blackbox sampling procedures (inverse kinematics solvers, motion planners) — reducing TAMP to sequences of finite PDDL problems solved by off-the-shelf planners.

Action Languages and ASP. A parallel line of work by Gelfond and Lifschitz [24] introduced action languages (A, B, C) — propositional formalisms with built-in inertia that solve the frame problem automatically. Language B adds static causal laws for indirect effects (ramifications); Language C generalizes further. These are designed to be compiled into Answer Set Programs (logic programs under stable model semantics), enabling computation via modern ASP solvers like Clingo [23]. ASP’s strength is elaboration tolerance — the ability to modularly add constraints, defaults, exceptions, and preferences without rewriting the core program. This makes ASP particularly suited to robotic domains with commonsense reasoning requirements, incomplete knowledge, and sensing actions [56], [46]. Rizwan et al. [51] demonstrate ASP for human-robot collaborative assembly, using external atoms to bridge symbolic planning with geometric feasibility checks — a compelling model for integrating logical and continuous reasoning.

However, the ASP grounding bottleneck (full instantiation before solving) limits scalability, and ASP lacks native support for continuous variables, time, or probability. The survey by Meli et al. [46] finds that standard logic programming (Prolog, ASP) currently offers the best tradeoff of expressivity, efficiency, and software maturity for most robotic tasks, but notes that temporal and probabilistic formalisms — despite offering greater expressivity — suffer from state explosion that prevents scaling to realistic scenarios.

2.2 Temporal Logics

Classical planning answers “what sequence of actions achieves the goal?” Temporal logics answer a richer question: “does the system’s behavior over time satisfy a specification?” This shift from goal-reaching to trajectory-level properties is essential for robotics, where safety (“never collide”), liveness (“eventually reach the target”), persistence (“always maintain communication”), and ordering (“deliver water before sushi”) all matter.

Temporal logics extend standard propositional logic with operators that reason over time. The four core operators are: □ (“always” — the property must hold at every future time step), ◊ (“eventually” — the property must hold at some future time step), U (“until” — one property holds until another becomes true), and ○ (“next” — the property holds at the next time step). These operators compose: \(\square(\text{safe}) \wedge \lozenge(\text{goal})\) means “always remain safe and eventually reach the goal.”

Example. A delivery robot must visit room A then room B, while always avoiding obstacles: \(\square(\neg \text{collision}) \wedge (\lozenge(\text{at\_A} \wedge \lozenge(\text{at\_B})))\). This single formula encodes a safety constraint, a liveness requirement, and an ordering constraint — something a classical planning goal like {at_B} cannot express. The temporal logic approach works by compiling this formula into an automaton (a state machine) that tracks which parts of the specification have been satisfied so far, then finding a robot policy that drives the automaton to an accepting state.

Linear Temporal Logic (LTL). LTL (Pnueli, 1977) extends propositional logic with the temporal operators above over discrete traces. An LTL formula specifies a property that an infinite discrete trace must satisfy. The standard approach to LTL-based robot control is reactive synthesis: translate the LTL formula into a deterministic automaton (Rabin, Büchi, or for the co-safe fragment, a DFA), take the product with a discrete abstraction of the robot’s workspace, and solve the resulting game or planning problem [35]. Tools like TuLiP (Wongpiromsarn et al., 2011) and LTLMoP (Finucane et al., 2010) automate this pipeline for GR(1) specifications.

LTL’s limitations for robotics are well-known. First, synthesis suffers from state-space explosion — the automaton can be doubly exponential in the formula’s length. Second, LTL operates over discrete traces, requiring a finite abstraction of the continuous workspace, which introduces conservatism. Third, LTL is Boolean (satisfied or not), with no notion of “how well” a specification is met. Despite these limitations, LTL remains the workhorse for task specification in robotics. He et al. [28] were the first to bring LTL into manipulation planning, introducing a novel abstraction over object-location permutations that avoids workspace discretization. Bozkurt et al. [7] showed how LTL specifications can drive model-free RL via automata-based reward shaping. And Xu et al. [62] demonstrated hierarchical LTL (HLTL) for multi-robot systems, decomposing complex natural-language instructions into hierarchical temporal specifications via LLMs.

Signal Temporal Logic (STL). While LTL operates over discrete Boolean propositions, STL [41] extends temporal logic to continuous real-valued signals with time bounds. The temporal operators now carry intervals: \(\square_{\lbrack a,b\rbrack}\) means “always during time \(\lbrack a,b\rbrack\)” and \(\lozenge_{\lbrack a,b\rbrack}\) means “at some point during \(\lbrack a,b\rbrack\).” Predicates are inequalities over signals rather than Boolean atoms.

Example. For a mobile robot navigating to a goal: \(\square_{\lbrack 0,10\rbrack}(\text{speed} \lt 30) \wedge \lozenge_{\lbrack 0,5\rbrack}(\text{distance\_to\_goal} \lt 0.1)\) means “keep speed below 30 for the entire first 10 seconds, and reach within 0.1m of the goal at some point within the first 5 seconds.” Unlike LTL, STL assigns a quantitative robustness score — not just pass/fail, but how well the specification is satisfied. A trajectory with speed peaking at 20 is more robust than one peaking at 29, even though both satisfy the formula.

The seminal paper introduced monitoring — checking whether a continuous signal trace satisfies an STL formula — via a two-stage process: predicate filters convert real-valued signals to Boolean signals, then interval-propagation evaluates the temporal operators.

STL’s true power emerged with quantitative robustness semantics [17], [18], which assign a real-valued robustness score measuring how strongly a signal satisfies or violates a specification. Positive robustness means satisfaction; the magnitude indicates the margin. This robustness score is differentiable (in a generalized sense), opening the door to optimization.

Leung et al. [39] made this operational with STLCG, translating STL robustness into PyTorch computation graphs that support automatic differentiation. This was foundational: it cast STL and neural networks into the same computational language, enabling STL specifications to serve as differentiable loss functions, regularizers, or constraints in any gradient-based pipeline. The paper demonstrated trajectory planning, parametric STL clustering, and neural network regularization — all via backpropagation through temporal logic.

The practical consequence for robotics is profound. STL robustness can be encoded as Mixed-Integer Linear Program (MILP) constraints via the Big-M method (Raman et al., 2014), enabling optimal control synthesis subject to temporal specifications. Tools like PyTeLo [11] provide modular MILP encodings, and the survey by Zhao et al. [67] on optimization-based TAMP catalogs how STL integrates with trajectory optimization for contact-rich manipulation.

Extensions and Variants. The base temporal logics have been extended in multiple directions, each addressing a specific robotics need:

  • Counting Temporal Logics (cLTL/cLTL+) [53]: Adds counting quantifiers over robot populations — “at least 5 robots surveil area A” — with ILP-based synthesis that scales independently of fleet size. This breaks the exponential barrier of per-robot LTL assignment.
  • Capability Temporal Logic (CaTL/CaTL+) [12]: A fragment of STL tailored for heterogeneous robot teams, specifying tasks as tuples of (duration, region, required capabilities, required resources). The novel multi-class STL (mcSTL) computes independent robustness scores per predicate class, preventing resource deficits from being masked by capability surpluses.
  • Probabilistic STL (ProbSTL) [61]: Extends STL with a sub-language for probabilistic statements over predictions, achieving superior monitoring accuracy under perception uncertainty.
  • Spatio-Temporal Logic (SpaTiaL) [50]: Unifies spatial relations between objects with temporal task patterns for robot-agnostic manipulation specification.
  • Weighted STL (wSTL) [65]: Adds importance weights to STL sub-formulas, used for generating concise explanations of learned robot policies.

The survey by Bartocci et al. [3] on mining STL specifications provides the definitive catalog of methods for learning STL formulas from data — a critical capability when specifications are not known a priori.

2.3 Probabilistic Reasoning

Real robots face pervasive uncertainty: noisy sensors, stochastic dynamics, unknown environments, and unpredictable human partners. The formalisms above either ignore uncertainty entirely (PDDL, LTL) or treat it as adversarial nondeterminism (reactive synthesis). Probabilistic reasoning tools model uncertainty explicitly by maintaining probability distributions over possible world states and computing expectations or bounds over outcomes. The central formulation is: given a prior distribution \(P(s)\) over states, a stochastic transition model \(P(s' | s, a)\), and an observation model \(P(o | s)\), compute a belief \(b(s) = P(s | o_1, \ldots, o_t)\) and select actions that maximize expected reward (or satisfy a specification with high probability).

Example. A robot must pick up a cup whose position is uncertain due to sensor noise. Rather than planning for a single estimated pose, the robot maintains a Gaussian belief \(b(\text{cup\_pos}) = \mathcal{N}(\mu, \Sigma)\). If the covariance \(\Sigma\) is too large, the planner inserts a “look at cup” action to reduce uncertainty before grasping — the plan reasons about when to sense, not just what to do. A probabilistic model checker can then verify: “is the probability of a successful grasp above 0.95?”

Probabilistic Programming Languages (PPLs). Church [26] established probabilistic programming as a reasoning paradigm by showing that arbitrary generative models — including planning problems — can be expressed as stochastic programs and solved via probabilistic inference. The key innovation was stochastic memoization, which enabled compact representation of infinite nonparametric Bayesian models. Church demonstrated planning as inference: expressing goals as observations in a generative model, then conditioning to derive optimal actions. This paradigm directly connects to robot task planning — representing the world as a probabilistic program and inferring plans as posterior samples.

Church spawned a lineage of practical PPLs. Pyro [6] brings deep probabilistic programming to PyTorch, supporting variational inference, MCMC, and Gaussian processes at scale. Gen [15] provides programmable inference in Julia, giving researchers fine-grained control over proposal distributions and inference strategies — critical for robotics applications where domain-specific inference knowledge dramatically improves performance. ProbLog (De Raedt et al., 2007) takes a different approach, extending Prolog with probabilistic facts, enabling logical reasoning under uncertainty via weighted model counting. Its neuro-symbolic extension DeepProbLog integrates neural network predicates.

For robotics specifically, Cannizzaro et al. [10] demonstrate COBRA-PPM, combining causal Bayesian networks with Pyro for interventional inference in robot manipulation, achieving 88.6% prediction accuracy and 94.2% task success on block-stacking with sim-to-real transfer. This is currently the most direct application of probabilistic programming as a formal reasoning tool for robotic decision-making.

The real-time dimension is addressed by Hummelgren et al. [31], who introduce ProbTime, the first real-time probabilistic programming language combining inference with first-class timing primitives, using Sequential Monte Carlo under schedulability constraints.

Probabilistic Model Checking. Where PPLs perform inference over programs, probabilistic model checkers verify properties of stochastic models. The PRISM model checker (Kwiatkowska et al., 2011) and its successor Storm [29] support DTMCs, CTMCs, MDPs, and POMDPs, checking properties expressed in probabilistic temporal logics (PCTL, CSL, rPATL). These tools answer questions like “what is the maximum probability of reaching the goal while keeping expected energy below threshold?” — quantitative queries that Boolean verification cannot express.

Kwiatkowska et al. [36] provide the authoritative survey on probabilistic model checking for autonomy, showing how the framework extends from single-agent MDPs to multi-agent stochastic games with Nash equilibrium synthesis. Lacerda et al. [37] demonstrate practical deployment: their PRISM-based framework generates verified policies for mobile service robots with formal guarantees on task completion probability, deployed for 8+ months on real robots in elder-care and office environments. The key innovation is a task progression metric for co-safe LTL that enables graceful degradation when full task completion becomes impossible.

Ye et al. [63] bridge domain-specific robotics modeling and probabilistic verification with RoboChart+PRISM, providing an end-to-end pipeline from graphical state-machine design to PCTL model checking. Lampacrescia et al. [38] extend Storm with statistical model checking for faster but approximate verification of robotic systems.

Belief-Space Planning. Kaelbling and Lozano-Pérez [34] take a different approach to uncertainty: rather than verifying stochastic models, they plan directly in belief space — the space of probability distributions over world states. Their key insight is that belief-space goals and pre-images can be compactly described using a small vocabulary of logical fluents over probability distributions (e.g., BIn(cup, cupboard, 0.05) means “with probability > 0.95, the cup is in the cupboard”). These fluents have closed-form regression functions under Gaussian beliefs, enabling the same regression-based planning machinery used for deterministic TAMP to work under uncertainty. This is a hybrid logical-probabilistic formalism: logical structure for planning, probabilistic content within the fluents.

Hybrid Logic-Probabilistic Systems. Zhang et al. [66] present iCORPP, which bridges ASP-based commonsense reasoning with POMDPs by using the probabilistic logic language P-log to dynamically construct compact probabilistic models from logical knowledge. This represents the most mature integration of logic programming and probabilistic sequential decision-making for real-robot service tasks.

2.4 Satisfiability-Based Methods

A powerful meta-approach to robotic reasoning casts planning and verification problems as satisfiability queries: encode the problem as a logical formula over variables (Booleans, integers, reals), then ask a solver whether there exists an assignment of variables that makes the formula true. SAT solvers handle purely Boolean formulas; SMT (Satisfiability Modulo Theories) solvers extend this to arithmetic, inequalities, and nonlinear functions. The formulation is: given a formula \(\varphi(x_1, \ldots, x_n)\) over variables \(x_i\), find an assignment \(\sigma\) such that \(\varphi(\sigma) = \text{true}\), or prove that no such assignment exists.

Example. To verify that a robot arm trajectory is collision-free over a 5-second window, encode the arm’s forward kinematics as equations over joint angles \(q(t)\), obstacle positions as inequality constraints \(\|p_{\text{link}}(q(t)) - p_{\text{obs}}\| > r\), and the time interval as a real variable \(t \in \lbrack 0, 5\rbrack\). An SMT solver checks whether any time point violates the safety constraint. If unsatisfiable, the trajectory is provably safe; if satisfiable, the solver returns a concrete counterexample (the exact time and configuration of collision).

SMT Solving. Z3 [16] is the foundational SMT solver, supporting arithmetic, bit-vectors, arrays, and quantifiers. With 12,000+ citations, it is the de facto backend for an enormous range of formal verification and constraint-based planning tools. For robotics, Z3’s ability to reason about mixed Boolean-integer-real constraints makes it suitable for encoding temporal logic checking, planning constraints, and safety verification.

δ-Complete Reasoning. For robotic systems with nonlinear dynamics (drag proportional to velocity squared, trigonometric kinematics), exact SMT solving is undecidable. Gao et al. [20], [21] introduced δ-complete decision procedures and the dReal solver, which relaxes exact satisfiability to δ-satisfiability: a formula is either unsatisfiable or satisfiable up to a bounded numerical perturbation δ. This is grounded in computable analysis (Type 2 computability theory) and provides formal correctness guarantees for numerical reasoning about nonlinear ODEs — precisely the mathematics governing physical robots.

Planning via SMT. Bryce et al. [8] were the first to encode PDDL+ planning with nonlinear continuous change as SMT problems solved by dReal, introducing planning-specific heuristics that exploit the problem structure. Cashmore et al. [13] extended this to the full PDDL+ semantics (including processes and events) with SMTPlan, using computer algebra systems for symbolic integration and a zero-crossing constraint technique to ensure invariant conditions hold throughout time intervals — not just at discrete checkpoints. SMTPlan dramatically outperforms discretization-based planners (UPMurphi, DiNo) and is insensitive to time horizon length, but struggles with large discrete branching factors. Goldman et al. [25] combined dReal with HTN planning (SHOP2) and controller synthesis (CIRCA) in the Hy-CIRCA architecture for correct-by-construction hybrid planning.

STL Model Checking via SMT. Yu et al. [64] implement STLmc, which reduces robust STL model checking of hybrid systems to SMT queries. Their parallelized two-step algorithm first obtains a discrete abstraction (Z3), then checks nonlinear ODE refinements (dReal) in parallel. The key theoretical contribution is extending ε-strengthening from first-order logic to full STL, cleanly reducing quantitative robustness verification to Boolean satisfiability.

2.5 Ontological and Knowledge Representation

Robots operating in human environments need more than planning and verification — they need semantic knowledge about objects, their properties, spatial relations, and commonsense constraints. An ontology is a formal, structured representation of concepts and their relationships within a domain, expressed using description logics or predicate calculus. It defines a hierarchy of classes (e.g., Cup ⊏ Container ⊏ Object), properties (e.g., hasHandle, isFragile), and inference rules that allow the robot to derive implicit knowledge from explicit facts.

Example. A robot is told to “put the drink in the fridge.” The ontology knows that Juice ⊏ Drink, that Fridge ⊏ Container with property keepsCold, and that Juice instances are isPerishable. From these axioms, the robot infers that the juice bottle on the counter is the target object — without being told explicitly — and can query whether the fridge is reachable given spatial constraints. This is knowledge the planner cannot derive from PDDL preconditions alone.

KnowRob. Tenorth and Beetz [60] introduced KnowRob, the most influential knowledge processing infrastructure for cognition-enabled robots. It represents world knowledge using OWL ontologies grounded in Description Logics, with SWI-Prolog as the reasoning engine. The key architectural innovation is virtual knowledge bases — knowledge computed on demand by grounding queries in the robot’s perception system, internal data structures, or external sources. KnowRob 2.0 [4] extended this with physics simulation-based reasoning, episodic memories (NEEMs), and “computable relations” that attach procedural computation to logical predicates — enabling queries about grasp forces, pouring dynamics, and spatial reachability that purely symbolic systems cannot answer.

Surveys and Comparisons. Olivares-Alarcos et al. [49] systematically compared six ontology-based frameworks (KnowRob, ROSETTA, IEEE-ORA/OROSU, ORO, CARESSES, RehabRobo-Onto) across ontology scope, reasoning scope, and application domain, finding that no two frameworks define the same set of basic terms — even “Action” vs. “Task” lacks consensus. Manzoor et al. [43] expanded to ten systems, cataloging the logical tools used: OWL dominates for ontology encoding, Prolog for reasoning, with PDDL appearing in several systems for plan generation from ontological knowledge. Aguado et al. [1] surveyed 32 ontology-based projects for dependable autonomy, revealing that formal temporal logics (STL, LTL) are essentially absent from the ontology literature — a significant gap suggesting these communities have developed in isolation.

Ontology-Mediated Planning. John and Koopmann [33] propose a formal framework where OWL-DL knowledge bases define state descriptions and action preconditions can query implicit knowledge via description logic reasoning under open-world semantics. This is a promising direction for connecting the expressiveness of ontological reasoning with the algorithmic power of classical planning.

2.6 Behavior Trees

Behavior Trees (BTs) are not a logical formalism in the same sense as PDDL or STL, but they are the dominant reactive execution framework in robotics and serve as the critical interface between logical specifications and robot control. A BT is a directed rooted tree where leaves are either actions (things the robot does) or conditions (things the robot checks), and internal nodes define control flow: a Sequence node executes children left-to-right, succeeding only if all succeed; a Fallback node tries children in order, succeeding if any one succeeds; a Parallel node runs children concurrently. The tree is “ticked” at a fixed rate from the root, making it inherently reactive — the robot re-evaluates its situation every tick cycle.

Example. A mobile robot tasked with fetching a cup could use a BT where the root is a Sequence: (1) Fallback[check if holding cup → navigate to cup → pick up cup], (2) navigate to delivery location, (3) place cup. If the cup slips mid-delivery, the “check if holding cup” condition fails on the next tick, and the Fallback automatically re-triggers the pick-up subtree — no explicit replanning needed. This reactive recovery is the key advantage over flat plan execution.

Iovino et al. [32] provide the comprehensive survey (166 papers): BTs encode task-switching logic in this hierarchical structure, yielding modularity (subtrees are self-contained), reactivity (continuous re-evaluation via ticks), and transparency (human-readable hierarchy). Crucially, BTs generalize FSMs, decision trees, subsumption architectures, and teleo-reactive programs [32].

Formal Verification of BTs. BTs become relevant to this survey through their formal verification. Biggar et al. [5] introduced a framework expressing BTs in LTL, enabling modular verification where subtree changes can be checked independently. Serbinowska and Johnson [54] built BehaVerify, translating py_trees BTs into nuXmv models for LTL model checking, scaling to 100+ node BTs (BlueROV underwater vehicle). Their follow-up [55] formalized Stateful BTs with persistent blackboard memory, proving computational equivalence to Turing machines (unbounded) and finite automata (bounded), and achieved 100x scalability improvements over competing tools — verifying 20,000-node BTs in 32 seconds. Tadiello and Troubitsyna [58] approach BT verification via Event-B, enabling safety invariant proofs.

BT Synthesis from Temporal Logic. Hong et al. [30] address the harder direction: automatically constructing BTs that satisfy given LTL specifications, using grammar-based Monte Carlo Tree Search with CSP model verification as the simulation phase. This replaces slow simulator-based feedback with static formal verification, achieving correct-by-construction BT synthesis for robotic missions.

3. Timeline & Evolution

The field’s evolution follows a clear arc from theoretical foundations through practical tools to modern hybrid systems.

1963–1971: Logical Foundations. McCarthy’s situation calculus [44] established that first-order logic could represent dynamic worlds with actions. The frame problem was identified (McCarthy and Hayes, 1969), and STRIPS [19] provided the first practical resolution with add/delete lists.

1977–1998: Standardization and Formal Specification. Pnueli (1977) introduced LTL for program verification, which would later migrate to robotics. Gelfond and Lifschitz [24] developed stable model semantics and action languages, solving the frame problem in a principled way. PDDL [45] standardized the planning language, enabling the IPC competitions that drove two decades of planner development.

2004–2010: Continuous-Time Reasoning. Maler and Nickovic [41] introduced STL, extending temporal logic to continuous real-valued signals — a pivotal moment for cyber-physical systems. Donzé and Maler [17] and Fainekos and Pappas [18] added quantitative robustness semantics, transforming STL from a monitoring tool into an optimization objective. Church [26] established probabilistic programming as a reasoning paradigm.

2008–2013: Solver Infrastructure. Z3 [16] and dReal [21] provided the SMT backbone for formal reasoning about real-valued and nonlinear constraints. KnowRob [60] established ontological reasoning for robot manipulation. Kaelbling and Lozano-Pérez [34] demonstrated belief-space TAMP on a real PR2 robot.

2015–2020: Bridging Logic and Learning. He et al. [28] brought LTL to manipulation planning. Leung et al. [39] made STL differentiable via STLCG, bridging temporal logic and deep learning. Bozkurt et al. [7] showed LTL could drive model-free RL. Garrett et al. [7] published PDDLStream, making PDDL practical for TAMP. Cashmore et al. [13] solved full PDDL+ via SMT.

2022–present: LLMs as Specification Translators & Hybrid Systems. The most significant recent inflection point is using LLMs not as planners but as formalizers — translating natural language into PDDL, LTL, or STL specifications that are then solved by verified classical tools [59]. Systems like LLM+P, Lang2LTL, NL2STL, and Nl2Hltl2Plan [62] demonstrate this paradigm. Concurrently, formal methods for verifying learned robot policies have matured [42], and large-scale structured surveys [2] confirm the field’s shift toward probabilistic verification and formal synthesis.

Key inflection points:

  1. PDDL (1998): Created the common language that enabled systematic planner comparison and drove algorithmic progress.
  2. STL (2004): Extended formal specification to continuous signals, making temporal logic relevant to control systems.
  3. STL robustness semantics (2009–2010): Turned verification into optimization, opening the door to gradient-based methods.
  4. STLCG (2020): Made temporal logic differentiable, bridging formal methods and deep learning.
  5. LLMs as formalizers (2023–present): Removed the specification bottleneck by automating natural-language-to-formal-language translation.

4. Key Papers & Contributions

4.1 Classical Planning

McCarthy [44], “Situations, Actions, and Causal Laws.” The origin paper. Introduced the situation calculus — situations as first-order terms, actions as transformers via result(a, s), fluents parameterized by situations. Demonstrated on the monkey-and-bananas problem. Purely theoretical, with no computational implementation and no treatment of uncertainty or continuous dynamics. But it established the conceptual vocabulary (preconditions, effects, state representation) that PDDL inherited 35 years later.

McDermott et al. [45], “PDDL — The Planning Domain Definition Language.” The paper that launched a thousand planners. PDDL’s genius was separating domain “physics” from planner “advice” and factoring the language into optional requirement subsets, creating a neutral common ground adoptable at each planner’s capability level. Its limitations — deterministic only, no temporal/continuous reasoning, informally specified semantics — drove all subsequent extensions (PDDL 2.1 through PDDL+, PPDDL, RDDL). For robotics today, PDDL is indispensable as the task-level interface in TAMP systems.

Garrett et al. [22], “PDDLStream.” The most practically impactful TAMP paper. By introducing streams (declarative wrappers around blackbox samplers) and an adaptive algorithm that plans optimistically with placeholder objects, PDDLStream reduced TAMP to sequences of finite PDDL problems. The unique-optimistic-objects-per-stream-instance design enables lazy evaluation, and the system successfully executed real-world PR2 tasks (meal serving, cooking). The limitation — open-loop execution only — motivates ongoing work on closed-loop replanning.

Gelfond and Lifschitz [24], “Action Languages.” Solved the frame and ramification problems simultaneously through built-in inertia and static causal laws. The translation to answer set programs made action languages computationally practical. This paper is the bridge between theoretical action formalisms and the modern ASP planning ecosystem.

4.2 Temporal Logics

Maler and Nickovic [41], “Monitoring Temporal Properties of Continuous Signals.” The STL foundational paper. The two-stage monitoring approach (predicate filters + interval propagation) was efficient (O(k·n)) and practical — tested on a nuclear plant steam generator trace (1M time units in 2.84s). The limitation to offline, Boolean monitoring motivated all subsequent work on quantitative semantics and online monitoring.

Leung et al. [39], “Backpropagation through Signal Temporal Logic Specifications.” STLCG is, in my assessment, the single most important paper bridging formal methods and modern machine learning for robotics. By mapping STL’s recursive robustness semantics to recurrent computation graphs, it made temporal logic a differentiable “layer” composable with neural networks. The smooth approximations (logsumexp) and integral robustness formula addressed the non-smooth, point-sensitive nature of standard robustness. Every subsequent paper on differentiable STL, STL-guided RL, or STL-regularized learning builds on this foundation.

He et al. [28], “Towards Manipulation Planning with Temporal Logic Specifications.” First to bring LTL into manipulation TAMP, using a novel abstraction over object-location permutations rather than workspace discretization. The synergistic three-layer architecture (abstraction graph × DFA product → Dijkstra’s → sampling-based motion planning) avoided the state explosion that plagued standard LTL abstractions. All 50 runs per specification succeeded within 100 seconds on a 34-DOF PR2, with no backtracking. The limitation — exponential growth in objects and locations — motivates ongoing work on heuristic search in product graphs.

Sahin et al. [53], “Multirobot Coordination with Counting Temporal Logics.” An elegant extension: by tracking robot counts per state rather than identities, the cLTL aggregate encoding achieves O(h(|→|+|μ|)) decision variables — independent of fleet size N. This is the scalability breakthrough for multi-robot temporal logic planning.

4.3 Probabilistic Reasoning

Goodman et al. [26], “Church: A Language for Generative Models.” Established probabilistic programming as a reasoning formalism. The planning-as-inference demonstration — casting MDP/POMDP optimal action selection as conditioning a generative model on goal achievement — is the conceptual foundation for all subsequent work connecting probabilistic programming to robot planning. The stochastic memoization innovation for nonparametric Bayesian models was technically brilliant.

Kaelbling and Lozano-Pérez [34], “Integrated Task and Motion Planning in Belief Space.” The most influential paper connecting symbolic planning and probabilistic reasoning for manipulation. The vocabulary of belief-state fluents with closed-form regression functions under Gaussian beliefs is a rare example of a representation that is simultaneously formally clean and practically functional. The system autonomously generates perception actions as first-class operators — the planner decides when and where to look, not just what to do. Demonstrated on a real PR2 for multi-object manipulation.

Kwiatkowska et al. [36], “Probabilistic Model Checking and Autonomy.” The authoritative survey by the PRISM creators. The key insight: a single framework — probabilistic model checking with progressively richer temporal logics — handles the full spectrum from MDPs to multi-agent stochastic games with Nash equilibrium synthesis. The limitation: scalability. The paper is refreshingly honest about this: “efficiency is a major limitation.”

Lacerda et al. [37], “Probabilistic Planning with Formal Performance Guarantees for Mobile Service Robots.” The most mature deployment of probabilistic verification in real-world robotics. The task progression metric for graceful degradation — doing “as much as possible” when full task completion is impossible — is an idea every real-world robotic planner should adopt. Eight months of continuous deployment validates the approach.

4.4 Satisfiability and Hybrid Reasoning

de Moura and Bjørner [16], “Z3: An Efficient SMT Solver.” Infrastructure paper. Z3 is the silent backbone underneath much of formal methods for robotics — it’s the solver that makes constraint-based planning, STL checking, neural network verification, and PDDL+ solving computationally feasible.

Gao et al. [2012/2013], “δ-Complete Decision Procedures / dReal.” The theoretical and practical foundation for reasoning about nonlinear real arithmetic with formal guarantees. The δ-relaxation idea is philosophically important for robotics: if a system is δ-satisfiable but not truly satisfiable, it is fragile under tiny perturbations — itself a safety-relevant finding. dReal handles polynomials, trigonometric functions, exponentials, and ODE solutions.

Cashmore et al. [13], “Planning for Hybrid Systems via Satisfiability Modulo Theories.” SMTPlan is the state of the art for PDDL+ planning with nonlinear polynomial dynamics. The zero-crossing constraint for ensuring invariant conditions hold throughout intervals — not just at endpoints — is a crucial correctness technique. Performance is independent of time horizon length, unlike discretization-based methods. The limitation is discrete scalability: performance degrades sharply beyond ~5 happenings.

4.5 Ontological Reasoning

Tenorth and Beetz [60], “KnowRob.” The most cited ontology-based reasoning system for robots. The virtual knowledge base concept — computing answers on-the-fly by combining static ontological knowledge with procedural queries into perception, spatial reasoning, and web sources — is the right architecture for open-world robot knowledge. The hybrid open-world (OWL) + closed-world (Prolog) approach balances expressiveness with computational tractability.

4.6 Surveys Mapping the Landscape

Luckcuck et al. [40] is the most comprehensive taxonomy of formal methods for autonomous robotics (63 papers, 2007–2018), cataloging temporal logics, dynamic logics, process algebras, set-based methods, and ontologies. Notable gaps: no STL, no PDDL, no probabilistic programming. Meli et al. [46] fills the logic programming gap with 114 papers on ASP, Prolog, and PDDL for robotic task planning. Manganaris et al. [42] covers the intersection of formal methods with deep learning for robot policy learning and verification. Tantakoun et al. [59] maps the LLM-as-formalizer landscape. Cohen et al. [14] frames the entire field along a spectrum from formal symbols to neural embeddings, arguing that most “end-to-end” systems secretly rely on symbolic structure.

5. Software, Tools & Practical Resources

5.1 Planning Tools

Tool Language Key Features Maturity URL
Fast Downward C++ Heuristic search for PDDL 2.2, multiple IPC winner Very high (380 citations) github.com/aibasel/downward
Unified Planning (AIPlan4EU) Python Planner-independent API, auto-selects engines High (295 citations) github.com/aiplan4eu/unified-planning
Clingo C++/Python ASP grounder+solver, multi-shot solving, theory propagation Very high (766 stars) github.com/potassco/clingo
PDDLStream Python PDDL + streams for TAMP, FastDownward backend High github.com/caelan/pddlstream

Getting started with PDDL: The Planning.wiki [27] at planning.wiki is the best beginner resource, with worked examples and planner selection guides. The ROS 2 course by Rodríguez Lera et al. [52] provides hands-on integration of PDDL with robotic architectures. For ASP, the Potassco tutorial ecosystem at potassco.org covers hybrid ASP, constraint solving, and robotics applications.

5.2 Temporal Logic Tools

Tool Language Key Features Maturity URL
STLCG Python/PyTorch Differentiable STL robustness via computation graphs High (79 citations) github.com/StanfordASL/stlcg
RTAMT Python/C++ Online/offline STL monitoring, ROS 1/2 integration High (75 citations) github.com/nickovic/rtamt
PyTeLo Python MTL/STL/wSTL parsing → MILP encoding via Gurobi Moderate (10 citations) github.com/descyphy/pytelo
TuLiP Python LTL/GR(1) synthesis for hybrid systems High (125 citations) github.com/tulip-control/tulip-control
STLmc Python Robust STL model checking via Z3/dReal Moderate (18 citations) stlmc.github.io

Recommendation: If you work with continuous signals and want to integrate STL into a learning pipeline, start with STLCG. If you need runtime monitoring on a real robot (ROS 2), use RTAMT. If you need controller synthesis from LTL/GR(1) specs, use TuLiP. For MILP-based optimization subject to temporal specs, use PyTeLo.

5.3 Probabilistic Reasoning Tools

Tool Language Key Features Maturity URL
Pyro Python/PyTorch SVI, MCMC, deep generative models Very high (9000 citations) pyro.ai
NumPyro Python/JAX 100x+ faster MCMC/NUTS via JAX High github.com/pyro-ppl/numpyro
Gen.jl Julia Programmable inference, robotics-oriented High (1800 citations) gen.dev
ProbLog Python Probabilistic Prolog, weighted model counting Moderate (397 stars) github.com/ML-KULeuven/problog
Storm C++/Python Probabilistic model checker (DTMCs, MDPs, POMDPs) Very high (235 citations) stormchecker.org
PRISM Java Probabilistic model checker, multiple temporal logics Very high prismmodelchecker.org
GTSAM C++/Python Factor graph optimization for SLAM/state estimation Very high (3355 citations) github.com/borglab/gtsam

Recommendation: For Bayesian deep learning in Python, use Pyro or NumPyro. For robotics perception and state estimation, GTSAM is the standard. For programmable inference in research settings, Gen.jl offers the most control. For formal verification of stochastic models, Storm has the best combination of coverage and performance. For logic-based probabilistic reasoning, ProbLog bridges symbolic and statistical worlds.

5.4 Satisfiability Solvers

Tool Language Key Features URL
Z3 C++/Python SMT solver, arithmetic/bitvectors/arrays, quantifiers github.com/Z3Prover/z3
dReal C++/Python δ-complete SMT for nonlinear reals, ODE support dreal.github.io

5.5 Ontological Reasoning

Tool Language Key Features URL
KnowRob C++/Prolog/Python OWL ontologies + Prolog reasoning, ROS integration knowrob.org
SWI-Prolog C/Prolog Full Prolog with extensive libraries, ROS bindings swi-prolog.org

5.6 Behavior Tree Tools

Tool Language Key Features URL
BehaviorTree.CPP C++ ROS 2 integration, XML specification, widely adopted github.com/BehaviorTree/BehaviorTree.CPP
py_trees Python Python BT library, composable, ROS integration github.com/splintered-reality/py_trees
BehaVerify Python BT → nuXmv translation for LTL/CTL verification github.com/verivital/behaverify

6. State of the Art

What works best today, and under what conditions?

For deterministic task planning in manipulation, PDDL + Fast Downward (or the Unified Planning Framework for convenience) remains the gold standard. PDDLStream is the most practical TAMP framework. For temporal task specifications, LTL dominates for discrete tasks; STL dominates for continuous-time control and monitoring. The STLCG differentiable approach has been adopted widely for integrating temporal constraints into optimization and learning.

For multi-robot coordination, the choice depends on fleet size: standard LTL product-automaton approaches work for small teams (< 10); counting temporal logics (cLTL) or capability temporal logics (CaTL) scale to hundreds of agents via MILP encoding.

For uncertainty handling, the landscape is bifurcated. Probabilistic model checking (Storm, PRISM) provides formal guarantees for small-to-medium models. Probabilistic programming (Pyro, Gen) handles richer models but without formal guarantees. Belief-space TAMP [34] is the most principled integration of symbolic planning and probabilistic reasoning, but scales poorly with the number of objects.

For formal verification, model checking (nuXmv for BTs, PRISM/Storm for stochastic systems, STLmc for STL over hybrid systems) provides the strongest guarantees but faces state-space explosion. Runtime monitoring (RTAMT for STL) is the practical compromise — checking properties during execution rather than proving them for all possible executions.

Where is there consensus?

  • PDDL is the right level of abstraction for symbolic task planning in TAMP.
  • STL robustness is the right metric for continuous-time specification satisfaction in control.
  • No single formalism handles the full reasoning stack; integration is necessary.
  • LLMs should translate natural language to formal specifications, not act as planners themselves [59], [14].

Where is there disagreement?

  • Whether probabilistic programming or probabilistic model checking is more useful for robotics. Model checking provides guarantees but doesn’t scale; PPLs scale but provide no guarantees.
  • Whether temporal logic synthesis or temporal-logic-guided RL is the right approach for policy generation. Synthesis is correct-by-construction but limited to small state spaces; RL scales but sacrifices guarantees [42].
  • Whether ontological reasoning (KnowRob-style) will remain relevant as LLMs absorb commonsense knowledge. The structured, inspectable nature of ontologies argues for their continued role in safety-critical applications, but the engineering overhead is substantial.

7. Open Problems & Future Directions

7.1 The Specification Bottleneck

Writing formal specifications — whether PDDL domains, LTL formulas, or STL constraints — requires expertise that most robotics practitioners lack. LLMs as formalizers [59] partially address this, but current systems struggle with semantic errors (the generated PDDL may parse and solve but not match human intent), and fine-tuned models (e.g., Mistral-7B for NL→LTL [62]) are limited to the distribution of their training data. The open problem is verified specification generation: ensuring that the formal specification faithfully captures the user’s intent, not just that it is syntactically valid.

7.2 Scalable Formal Guarantees for Learned Policies

The central tension identified by Manganaris et al. [42] — scalability versus expressiveness in formal verification of neural policies — remains open. Set-propagation methods suffer compounding over-approximation; HJ reachability is exponential in dimension; certificate functions (Lyapunov, barrier) only verify simple safety/stability. The most promising direction is compositional verification: proving properties of subsystems independently and composing the guarantees — but this requires the right decomposition, which is itself an open problem.

7.3 Unified Discrete-Continuous-Probabilistic Reasoning

No existing framework seamlessly handles discrete task logic, continuous dynamics, temporal constraints, and probabilistic uncertainty within a single formalism. PDDL+ with SMT [13] handles discrete + continuous but not probability. Probabilistic PDDL (PPDDL/RDDL) handles discrete + probability but not continuous dynamics. STL handles continuous time but not planning. Belief-space TAMP [34] handles planning + uncertainty but not temporal specifications. The integration of these is perhaps the field’s most important open problem.

A promising approach is temporal ASP with probabilistic extensions: telingo (Cabalar et al., 2019) adds temporal operators to ASP; Cabalar et al. [9] add metric temporal constructs; combining these with probabilistic logic programming (P-log, ProbLog) could yield a unified symbolic reasoner for temporal-probabilistic-discrete domains.

7.4 Real-Time Formal Reasoning

Most formal methods operate offline: PDDL planners generate plans before execution; STL monitoring checks traces after the fact; probabilistic model checking verifies models ahead of time. Real robots need online formal reasoning — replanning when the world changes, monitoring in real-time, and adapting specifications on the fly. Hummelgren et al.’s [31] ProbTime (real-time probabilistic programming) and RTAMT’s online STL monitoring are steps in this direction, but a general framework for real-time formal reasoning with bounded computation time remains elusive.

7.5 Connection to the Researcher’s Own Work

For a researcher working at the intersection of reasoning, planning, robot manipulation, diffusion policies, foundation models, and sim-to-real transfer, the most actionable open problems are:

  1. STL-guided diffusion policies. STLCG makes STL differentiable; diffusion models generate trajectories. Combining them — using STL robustness as a guidance signal during diffusion sampling — would produce trajectories that satisfy formal temporal specifications without requiring MILP reformulation. This bridges your work on diffusion policy with formal methods.
  2. PDDL + foundation models for TAMP. PDDLStream provides the symbolic backbone; foundation models (vision-language models, world models) provide the perceptual grounding and constraint evaluation. The gap is making the stream interfaces (IK samplers, collision checkers, motion planners) learnable and differentiable rather than handcoded.
  3. Formal specifications for imitation learning. When learning from demonstrations, the “specification” is implicit in the demonstrations. Mining STL specifications from demonstrations [3] could make these implicit specifications explicit, enabling formal verification of learned policies and transfer of specifications across embodiments.
  4. Sim-to-real with formal guarantees. The reality gap means formal guarantees derived in simulation may not transfer. dReal’s δ-complete reasoning provides a principled framework: if a property is δ-robust in simulation, and the sim-to-real gap is bounded by δ, the property transfers. Making this bound tight and practical is the challenge.

Citation

If you find this survey useful, please cite it as

@misc{formal_logic_robotic_systems_survey_2026,
  author    = {Hu Tianrun},
  title     = {Formal Logic and Reasoning Frameworks for Robotic Systems},
  year      = {2026},
  publisher = {GitHub},
  url       = {https://h-tr.github.io/blog/surveys/formal-logic-reasoning-robotic-systems.html}
}
          

References

  1. Aguado, E., Gomez, V., Hernando, M. et al. (2024). A survey of ontology-enabled processes for dependable robot autonomy. Frontiers in Robotics and AI, 11.
  2. Azaiez, A., Anisi, D.A., Farrell, M., and Luckcuck, M. (2025). Revisiting formal methods for autonomous robots: A structured survey. arXiv:2509.20488.
  3. Bartocci, E., Mateis, C., Nesterini, E., and Nickovic, D. (2022). Survey on mining signal temporal logic specifications. Information and Computation, 289.
  4. Beetz, M., Beßler, D., Haidu, A. et al. (2018). KnowRob 2.0 — A 2nd generation knowledge processing framework for cognition-enabled robotic agents. IEEE ICRA.
  5. Biggar, O., Zamani, M., and Shames, I. (2020). A framework for formal verification of behavior trees with linear temporal logic. IEEE RA-L.
  6. Bingham, E., Chen, J.P., Jankowiak, M. et al. (2019). Pyro: Deep universal probabilistic programming. JMLR, 20(1).
  7. Bozkurt, A.K., Wang, Y., Zavlanos, M., and Pajic, M. (2020). Model-free reinforcement learning for stochastic games with linear temporal logic objectives. arXiv:2010.01050.
  8. Bryce, D., Gao, S., Musliner, D.J., and Goldman, R.P. (2015). SMT-based nonlinear PDDL+ planning. AAAI.
  9. Cabalar, P., Diéguez, M., Schaub, T., and Schuhmann, A. (2025). Computational methods for dynamic answer set programming. arXiv.
  10. Cannizzaro, R., Groom, M., Routley, J., Ness, R.O., and Kunze, L. (2024). COBRA-PPM: A causal Bayesian reasoning architecture using probabilistic programming for robot manipulation under uncertainty. arXiv.
  11. Cardona, G.A., Leahy, K., Mann, M., and Vasile, C.-I. (2023). PyTeLo: A flexible and efficient temporal logic tool for Python. arXiv.
  12. Cardona, G.A. and Vasile, C.-I. (2024). Planning for heterogeneous teams of robots with temporal logic, capability, and resource constraints. IJRR.
  13. Cashmore, M., Magazzeni, D., and Zehtabi, P. (2020). Planning for hybrid systems via satisfiability modulo theories. JAIR, 67, 235–283.
  14. Cohen, V., Liu, J.X., Mooney, R., Tellex, S., and Watkins, D. (2024). A survey of robotic language grounding: Tradeoffs between symbols and embeddings. IJCAI.
  15. Cusumano-Towner, M., Saad, F., Lew, A., and Mansinghka, V. (2019). Gen: A general-purpose probabilistic programming system with programmable inference. PLDI.
  16. de Moura, L. and Bjørner, N. (2008). Z3: An efficient SMT solver. TACAS.
  17. Donzé, A. and Maler, O. (2010). Robust satisfaction of temporal logic over real-valued signals. FORMATS.
  18. Fainekos, G. and Pappas, G.J. (2009). Robustness of temporal logic specifications for continuous-time signals. Theoretical Computer Science, 410(42).
  19. Fikes, R.E. and Nilsson, N.J. (1971). STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2(3-4).
  20. Gao, S., Avigad, J., and Clarke, E.M. (2012). δ-complete decision procedures for satisfiability over the reals. IJCAR.
  21. Gao, S., Kong, S., and Clarke, E.M. (2013). dReal: An SMT solver for nonlinear theories over the reals. CADE.
  22. Garrett, C.R., Lozano-Pérez, T., and Kaelbling, L.P. (2020). PDDLStream: Integrating symbolic planners and blackbox samplers via optimistic adaptive planning. ICAPS.
  23. Gebser, M., Kaminski, R., Kaufmann, B., and Schaub, T. (2014). Clingo = ASP + control: Preliminary report. ICLP Technical Communications.
  24. Gelfond, M. and Lifschitz, V. (1998). Action languages. Electronic Transactions on Artificial Intelligence, 2.
  25. Goldman, R.P., Bryce, D., Pelican, M.J.S., Musliner, D.J., and Bae, K. (2016). A hybrid architecture for correct-by-construction hybrid planning and control. AAAI Spring Symposium.
  26. Goodman, N.D., Mansinghka, V.K., Roy, D.M., Bonawitz, K., and Tenenbaum, J.B. (2008). Church: A language for generative models. UAI.
  27. Green, A. et al. (2025). Planning.wiki — The AI planning & PDDL wiki. planning.wiki.
  28. He, K., Lahijanian, M., Kavraki, L.E., and Vardi, M.Y. (2015). Towards manipulation planning with temporal logic specifications. IEEE ICRA.
  29. Hensel, C., Junges, S., Katoen, J.-P., Quatmann, T., and Volk, M. (2021). The probabilistic model checker Storm. STTT.
  30. Hong, W., Chen, Z., Li, M. et al. (2023). Formal verification based synthesis for behavior trees. SETTA.
  31. Hummelgren, L., Becker, M., and Broman, D. (2023). Real-time probabilistic programming. RTSS.
  32. Iovino, M., Scukins, E., Styrud, J. et al. (2022). A survey of behavior trees in robotics and AI. Robotics and Autonomous Systems.
  33. John, T. and Koopmann, P. (2024). Planning with OWL-DL ontologies (extended version). arXiv.
  34. Kaelbling, L.P. and Lozano-Pérez, T. (2013). Integrated task and motion planning in belief space. IJRR, 32(9-10).
  35. Kress-Gazit, H., Fainekos, G.E., and Pappas, G.J. (2009). Temporal-logic-based reactive mission and motion planning. IEEE TRO, 25(6).
  36. Kwiatkowska, M., Norman, G., and Parker, D. (2022). Probabilistic model checking and autonomy. Annual Review of Control, Robotics, and Autonomous Systems, 5.
  37. Lacerda, B., Faruq, F., Parker, D., and Hawes, N. (2019). Probabilistic planning with formal performance guarantees for mobile service robots. IJRR, 38(9).
  38. Lampacrescia, M., Klauck, M., and Palmas, M. (2024). Towards verifying robotic systems using statistical model checking in STORM. arXiv.
  39. Leung, K., Aréchiga, N., and Pavone, M. (2020). Backpropagation through signal temporal logic specifications: Infusing logical structure into gradient-based methods. WAFR / IJRR.
  40. Luckcuck, M., Farrell, M., Dennis, L.A., Dixon, C., and Fisher, M. (2019). Formal specification and verification of autonomous robotic systems: A survey. ACM Computing Surveys, 52(5).
  41. Maler, O. and Nickovic, D. (2004). Monitoring temporal properties of continuous signals. FORMATS/FTRTFT.
  42. Manganaris, A., Giammarino, V., Qureshi, A.H., and Jagannathan, S. (2026). Formal methods in robot policy learning and verification: A survey. TMLR.
  43. Manzoor, S., Rocha, Y.G., Joo, S.-H. et al. (2021). Ontology-based knowledge representation in robotic systems: A survey oriented toward applications. Applied Sciences, 11(4324).
  44. McCarthy, J. (1963). Situations, actions, and causal laws. Stanford AI Project Memo 2.
  45. McDermott, D., Ghallab, M., Howe, A. et al. (1998). PDDL — The planning domain definition language. Tech Report CVC TR-98-003.
  46. Meli, D., Nakawala, H., and Fiorini, P. (2023). Logic programming for deliberative robotic task planning. Artificial Intelligence Review.
  47. Meli, D., Sridharan, M., and Fiorini, P. (2021). Inductive learning of answer set programs for autonomous surgical task planning. Machine Learning, 110.
  48. Nickovic, D. (2026). RTAMT — Real-time STL monitoring library. github.com/nickovic/rtamt.
  49. Olivares-Alarcos, A., Beßler, D., Khamis, A. et al. (2019). A review and comparison of ontology-based approaches to robot autonomy. The Knowledge Engineering Review, 34.
  50. Pek, C., Schuppe, G.F., Esposito, F., Tumova, J., and Kragic, D. (2023). SpaTiaL: Monitoring and planning of robotic tasks using spatio-temporal logic specifications. arXiv.
  51. Rizwan, M., Patoglu, V., and Erdem, E. (2020). Human-robot collaborative assembly planning: An answer set programming approach. TPLP / ICLP.
  52. Rodríguez Lera, F.J. et al. (2024). PDDL course: Deliberation, planning, and execution in robotic architectures using ROS 2.
  53. Sahin, Y.E., Nilsson, P., and Ozay, N. (2020). Multirobot coordination with counting temporal logics. arXiv:1810.13087.
  54. Serbinowska, S.S. and Johnson, T.T. (2022). BehaVerify: Verifying temporal logic specifications for behavior trees. SEFM.
  55. Serbinowska, S.S., Robinette, P.K., Karsai, G., and Johnson, T.T. (2024). Formalizing stateful behavior trees. FMAS.
  56. Son, T.C., Pontelli, E., Balduccini, M., and Schaub, T. (2022). Answer set planning: A survey. arXiv:2202.05793.
  57. Sun, X. and Shoukry, Y. (2022). Neurosymbolic motion and task planning for linear temporal logic tasks. arXiv:2210.05180.
  58. Tadiello, M. and Troubitsyna, E. (2022). Verifying safety of behaviour trees in Event-B. iFM.
  59. Tantakoun, M., Zhu, X., and Muise, C. (2025). LLMs as planning formalizers: A survey. arXiv:2503.18971.
  60. Tenorth, M. and Beetz, M. (2013). KnowRob: A knowledge processing infrastructure for cognition-enabled robots. IJRR, 32(5).
  61. Tiger, M. and Heintz, F. (2020). Incremental reasoning in probabilistic signal temporal logic. IJCAI.
  62. Xu, S., Luo, X., Huang, Y. et al. (2024). Nl2Hltl2Plan: Scaling up natural language understanding for multi-robots through hierarchical temporal logic task representation. arXiv:2408.08188.
  63. Ye, K., Cavalcanti, A., Foster, S. et al. (2022). Probabilistic modelling and verification using RoboChart and PRISM. SoSyM, 21(2).
  64. Yu, G., Lee, J., and Bae, K. (2022). STLmc: Robust STL model checking of hybrid systems using SMT. CAV.
  65. Yuasa, M., Sreenivas, R.S., and Tran, H.T. (2026). Neuro-symbolic generation of explanations for robot policies with weighted signal temporal logic. IEEE RA-L.
  66. Zhang, S., Khandelwal, P., and Stone, P. (2024). iCORPP: Interleaved commonsense reasoning and probabilistic planning on robots. arXiv.
  67. Zhao, Z., Cheng, S., Ding, Y. et al. (2024). A survey of optimization-based task and motion planning: From classical to learning approaches. IEEE/ASME T-Mech.