AgentOps
AgentOps is an observability, evaluation, and debugging platform for AI agents. Its open-source Python SDK (with TypeScript support for OpenAI Agents) instruments agent runs in two lines of code, capturing LLM calls, tool invocations, costs, latencies, and multi-agent interactions. Sessions are visualized in a hosted dashboard at app.agentops.ai with time-travel debugging, waterfall views, and replay. AgentOps offers native integrations with 400+ LLMs and frameworks including CrewAI, AutoGen / AG2, LangChain, LangGraph, LlamaIndex, OpenAI Agents, Haystack, and Camel AI.
AgentOps publishes 3 APIs on the APIs.io network. Tagged areas include AI Agents, Observability, Evaluation, Tracing, and Python SDK.
AgentOps’ developer surface includes documentation, an engineering blog, pricing, and 8 more developer resources.
APIs
AgentOps Python SDK
The AgentOps Python SDK is the primary entry point, installable via pip install agentops and initialized with two lines of code. It auto-instruments supported agent frameworks a...
AgentOps TypeScript SDK
AgentOps' TypeScript SDK provides instrumentation for the OpenAI Agents SDK in Node.js applications, surfacing the same traces and metrics as the Python SDK inside the AgentOps ...
AgentOps Dashboard
The hosted dashboard at app.agentops.ai visualizes agent sessions with waterfall views, time-travel replay, LLM cost tracking, and multi-agent interaction graphs. Supports sessi...
Features
Initialize observability with agentops.init() and automatic framework instrumentation.
Time-travel debugging with full session and event replay in the dashboard.
Token counting and cost tracking across foundation model providers and agents.
Visualize interactions between agents in CrewAI, AutoGen, LangGraph, and custom systems.
Time-based waterfall views of all events in a session.
Use the @trace decorator and OTel-aligned spans to instrument custom code paths.
Self-hosted deployment available on Enterprise plans.
SOC 2 and HIPAA compliance available on the Enterprise tier.
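The initialization and @trace features listed above might look roughly like the following in application code. This is a minimal sketch, not the SDK's documented setup: the import path agentops.sdk.decorators, the name= keyword, and the AGENTOPS_API_KEY environment variable are assumptions that may differ between SDK versions, and summarize_ticket and _noop_trace are hypothetical names used only for illustration. The sketch falls back to a no-op decorator so the code still runs when the SDK is not installed or no key is configured.

```python
import os


def _noop_trace(func=None, **_kwargs):
    """Hypothetical fallback: leaves the function untouched when tracing is off."""
    if func is None:
        # Called as @_noop_trace(name="...") -> return a pass-through decorator.
        return lambda f: f
    # Called as bare @_noop_trace.
    return func


trace = _noop_trace
api_key = os.environ.get("AGENTOPS_API_KEY")  # assumed variable name
if api_key:
    try:
        import agentops

        # Start AgentOps; supported frameworks are instrumented automatically.
        agentops.init(api_key=api_key)
        # Assumed import path for the decorator; check your SDK version.
        from agentops.sdk.decorators import trace
    except Exception:
        # SDK missing or failed to start: keep running without tracing.
        trace = _noop_trace


@trace(name="summarize_ticket")
def summarize_ticket(text: str) -> str:
    """Custom code path to record as a span: collapse runs of whitespace."""
    return " ".join(text.split())


print(summarize_ticket("  Printer   is offline "))
```

Keeping the decorator optional means the same module can run in tests and local scripts without an API key, while production runs with the key set send spans to the dashboard.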
Use Cases
Inspect multi-step agent runs, tool calls, and intermediate reasoning to find failures.
Track token usage and cost per agent, framework, and provider.
Evaluate agent performance across sessions and compare versions.
Monitor production agents with dashboards, alerts, and exports.
Visualize and debug coordination between agents in multi-agent frameworks.
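The cost-tracking use case above comes down to multiplying token counts by per-token prices and grouping the results. The sketch below illustrates that arithmetic only; it is not how AgentOps computes costs. The provider names, the prices in PRICES_PER_1K, and the shape of the call records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical USD prices per 1,000 tokens; real prices vary by provider and model.
PRICES_PER_1K = {
    "provider-a": {"prompt": 0.50, "completion": 1.50},
    "provider-b": {"prompt": 0.25, "completion": 1.00},
}


def call_cost(provider: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one LLM call: tokens times the per-1K price, summed over both directions."""
    price = PRICES_PER_1K[provider]
    return (
        prompt_tokens * price["prompt"] + completion_tokens * price["completion"]
    ) / 1000


def cost_by_agent(calls: list[dict]) -> dict[tuple[str, str], float]:
    """Sum call costs per (agent, provider) pair."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for call in calls:
        key = (call["agent"], call["provider"])
        totals[key] += call_cost(
            call["provider"], call["prompt_tokens"], call["completion_tokens"]
        )
    return dict(totals)


# Hypothetical call records, as a tracing layer might collect them.
calls = [
    {"agent": "researcher", "provider": "provider-a", "prompt_tokens": 1000, "completion_tokens": 2000},
    {"agent": "researcher", "provider": "provider-a", "prompt_tokens": 2000, "completion_tokens": 0},
    {"agent": "writer", "provider": "provider-b", "prompt_tokens": 4000, "completion_tokens": 1000},
]

totals = cost_by_agent(calls)
for (agent, provider), usd in sorted(totals.items()):
    print(f"{agent} via {provider}: ${usd:.2f}")
```

Grouping by a different key, such as framework or session, only requires changing the tuple built inside cost_by_agent.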
Integrations
Native instrumentation for OpenAI Chat Completions and Responses APIs.
First-class support for OpenAI Agents in Python and TypeScript.
Instrumentation for Anthropic Claude models.
Native CrewAI integration with multi-agent visualization.
Native integration with AG2, formerly AutoGen.
Instrumentation for LangChain chains and agents.
Trace and visualize LangGraph stateful agents.
Trace LlamaIndex RAG and agent applications.
Instrumentation for Haystack pipelines.
Native integration with the Camel AI multi-agent system.
Instrumentation for Cohere model calls.
Capture calls routed through LiteLLM across providers.
Instrumentation for Mistral models.
Instrumentation for Gemini and Vertex AI.
Instrumentation for xAI Grok models.