Fieldtested

AI agents glossary

Agent Evaluation
The process of measuring agent performance — accuracy, reliability, cost, latency — against defined benchmarks or production data.
Agent Fallback
The mechanism by which an agent yields to a human or alternate path when it cannot proceed safely or confidently.
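One common way to trigger a fallback is a confidence threshold. The sketch below assumes the agent reports a confidence score between 0 and 1; the `route` function, the 0.8 default, and the result fields are hypothetical choices for illustration, not a standard interface.

```python
def route(answer: str, confidence: float, threshold: float = 0.8) -> dict:
    """Return the agent's answer, or yield to a human when confidence is too low."""
    if confidence >= threshold:
        return {"handled_by": "agent", "answer": answer}
    # Below the threshold: withhold the answer and escalate instead.
    return {
        "handled_by": "human",
        "answer": None,
        "reason": f"confidence {confidence:.2f} below {threshold:.2f}",
    }


confident = route("Your order ships Monday.", confidence=0.93)
unsure = route("Your order ships Monday.", confidence=0.41)
```

Real systems often combine several signals (policy violations, tool errors, repeated retries) rather than relying on a single score.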
Agent Handoff
The transfer of an in-progress task from one agent (or human) to another, with enough context to continue without restart.
Agent Harness
The execution scaffold around an LLM that manages the agent loop, tool invocation, memory, and safety controls.
Agent Loop
The repeating cycle where an agent observes, thinks, acts, and re-observes until its goal is achieved or a stop condition triggers.
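The cycle can be sketched as a bounded loop. This is a minimal illustration in which plain functions stand in for the environment and for the LLM; `run_agent_loop` and its parameter names are hypothetical, and `max_steps` plays the role of the stop condition.

```python
from typing import Callable


def run_agent_loop(
    observe: Callable[[], int],
    decide: Callable[[int], str],
    act: Callable[[str], None],
    goal_reached: Callable[[int], bool],
    max_steps: int = 10,
) -> tuple:
    """Observe, think, act, and re-observe until the goal or the step budget is hit."""
    for step in range(max_steps):
        observation = observe()
        if goal_reached(observation):
            return ("goal_reached", step)
        action = decide(observation)  # the "think" step; an LLM call in practice
        act(action)
    return ("max_steps", max_steps)  # stop condition: step budget exhausted


# Toy environment: a counter the agent must raise to 3.
state = {"counter": 0}


def increment(action: str) -> None:
    if action == "increment":
        state["counter"] += 1


result = run_agent_loop(
    observe=lambda: state["counter"],
    decide=lambda obs: "increment",
    act=increment,
    goal_reached=lambda obs: obs >= 3,
)
```

The explicit step budget matters: without it, an agent that never reaches its goal would loop indefinitely.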
Agent Memory
The mechanism by which an agent retains information across turns, sessions, or long-running tasks beyond the context window.
Agent Observability
The instrumentation and tooling that makes agent runs inspectable — traces, logs, metrics, replays — so failures can be debugged and improvements measured.
Agent Orchestration
The discipline of coordinating multiple agents — routing tasks, managing handoffs, sharing context, and resolving conflicts.
Agent Policy
The set of rules and constraints that govern what an agent may and may not do, including authentication, rate limits, and forbidden actions.
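As a rough sketch, a policy can be enforced as a check that runs before every action. The `Policy` class below is hypothetical and covers only two of the constraints named above: forbidden actions, and a simple cap on total calls standing in for a rate limit (a real rate limit would be time-windowed).

```python
from dataclasses import dataclass


@dataclass
class Policy:
    forbidden_actions: set
    max_calls: int
    calls_made: int = 0

    def check(self, action: str) -> bool:
        """Return True and count the call if the action is allowed."""
        if action in self.forbidden_actions:
            return False
        if self.calls_made >= self.max_calls:
            return False  # call budget exhausted
        self.calls_made += 1
        return True


policy = Policy(forbidden_actions={"delete_account"}, max_calls=2)
requested = ["read_file", "delete_account", "read_file", "read_file"]
decisions = [policy.check(action) for action in requested]
```

Here the second request is refused because it is forbidden, and the fourth because the call budget is spent.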
Agentic AI
AI systems that autonomously plan, act, and use tools to complete multi-step tasks.
Agentic RAG
Retrieval-Augmented Generation where the agent dynamically decides what to retrieve, when to retrieve it, and how to integrate the result.
AI Agent
A software system that uses an LLM to perceive context, decide actions, invoke tools, and complete tasks toward a goal.
Autonomous Agent
An agent that operates without step-by-step human input, deciding its own actions to reach a stated goal.
Browser Agent
An agent specialized in operating a web browser to research, fill forms, scrape data, and complete web-based tasks.
Chain of Thought
A prompting technique where the model is encouraged to produce step-by-step reasoning before answering, improving accuracy on complex tasks.
Computer Use
An agent capability to operate a computer interface directly — clicking, typing, reading screens — instead of calling APIs.
Context Window
The maximum amount of text — measured in tokens — that an LLM can process in a single inference call, including both input and generated output.
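Because input and output share the same budget, a request fits only if the prompt plus the reserved output stays within the window. The helpers below are a small sketch; the function names and the token counts are made up for illustration and do not describe any particular model.

```python
def fits_context(prompt_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """True if the prompt plus the reserved output fits in the window."""
    return prompt_tokens + max_output_tokens <= context_window


def remaining_output_budget(prompt_tokens: int, context_window: int) -> int:
    """Tokens left for generation after the prompt is counted (never negative)."""
    return max(0, context_window - prompt_tokens)


# Hypothetical 8,000-token window.
ok = fits_context(prompt_tokens=6_500, max_output_tokens=1_000, context_window=8_000)
too_big = fits_context(prompt_tokens=7_500, max_output_tokens=1_000, context_window=8_000)
budget = remaining_output_budget(prompt_tokens=7_500, context_window=8_000)
```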
Embeddings
Numerical vector representations of text (or other content) in which semantically similar items lie close together in vector space, enabling search by meaning rather than by keyword.
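To make "close together" concrete, the sketch below compares vectors with cosine similarity, one widely used closeness measure. The three-dimensional vectors are hand-written toy values, not output from a real embedding model, and `nearest` is a hypothetical helper.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" chosen by hand for illustration only.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}


def nearest(query: str, embeddings: dict) -> str:
    """Return the other key whose vector is most similar to the query's vector."""
    candidates = (key for key in embeddings if key != query)
    return max(
        candidates,
        key=lambda key: cosine_similarity(embeddings[query], embeddings[key]),
    )


closest_to_cat = nearest("cat", toy_embeddings)
```

With these toy values, "kitten" scores far closer to "cat" than "invoice" does, even though the strings share no keywords.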
Function Calling
The protocol by which an LLM emits a structured function invocation that a runtime then executes — synonymous with tool calling.
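The division of labor can be sketched as follows: the model only emits structured data, and the runtime decides whether and how to run it. The JSON shape, the `TOOLS` registry, and `get_weather` are assumptions for illustration; actual providers each define their own schema.

```python
import json


def get_weather(city: str) -> str:
    """Stand-in tool; a real one would call a weather service."""
    return f"Sunny in {city}"


# Registry of functions the runtime is willing to execute.
TOOLS = {"get_weather": get_weather}


def execute_call(model_output: str) -> str:
    """Parse the model's structured invocation and dispatch it to a registered tool."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# What a model might emit instead of a plain-text answer (hypothetical format).
model_output = '{"name": "get_weather", "arguments": {"city": "Oslo"}}'
result = execute_call(model_output)
```

Rejecting names that are not in the registry keeps the model from invoking anything the runtime has not explicitly exposed.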
Guardrails
Safety controls layered around an LLM or agent to prevent harmful, off-policy, or non-compliant outputs and actions.
Hallucination
A confident but incorrect output from an LLM — invented facts, fabricated citations, or nonexistent functions — produced as if it were grounded.
iPaaS
Integration Platform as a Service — a category of tools that connect SaaS applications via workflows, increasingly with AI agent capabilities embedded.
MCP
Model Context Protocol — an open standard from Anthropic for connecting LLMs to external tools and data sources via a uniform interface.
Model Context Protocol
The full name of MCP — an open standard for LLM-tool interoperability published by Anthropic in November 2024.
Multi-Agent System
A system of multiple specialized agents that collaborate, hand off tasks, and coordinate to solve problems larger than any single agent could handle alone.
Prompt Injection
A security vulnerability where attacker-controlled text causes an LLM to follow instructions outside its intended scope, bypassing system rules.
RAG
Retrieval-Augmented Generation — a pattern where relevant external documents are retrieved at query time and added to the prompt to ground the model's answer.
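The retrieve-then-prompt flow can be sketched in a few lines. To stay self-contained, this example ranks documents by word overlap as a stand-in for embedding search; the documents, function names, and prompt wording are all hypothetical.

```python
def score(query: str, doc: str) -> int:
    """Count words shared by the query and the document (crude relevance proxy)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieve(query: str, docs: list, k: int = 1) -> list:
    """Return the k documents with the highest overlap score."""
    return sorted(docs, key=lambda doc: score(query, doc), reverse=True)[:k]


def build_prompt(query: str, docs: list, k: int = 1) -> str:
    """Add the retrieved documents to the prompt so the answer can be grounded in them."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


DOCS = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
]
prompt = build_prompt("How long do refunds take?", DOCS)
```

The resulting prompt would then be sent to the model; production systems typically replace the overlap score with embedding similarity over a vector database.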
ReAct
A reasoning pattern (short for "Reason + Act") in which an agent interleaves explicit thought steps with actions and the observations those actions return, until the task completes.
Tool Calling
An LLM capability to invoke external functions or APIs as part of a response, used by agents to act on the world.
Vector Database
A database optimized for storing and searching high-dimensional vectors — the embeddings used in semantic search and RAG.