AgentVista ingests metrics and logs from your infrastructure and services through the OpenTelemetry Protocol (OTLP). Combined with agent traces, they give you a complete picture of what’s happening across your entire stack, not just inside the model.
Metrics and logs are not a separate product — they work alongside your AI agent traces in the same platform. A log line and the agent trace that produced it share the same trace_id, so you can move between them with a single click.
Metrics
Metrics are time-series numerical measurements: CPU utilization, memory usage, request rates, error counts, or any custom business signal you want to track over time.
Metric types
AgentVista supports three standard metric types:
| Type | Description | Example |
|---|---|---|
| gauge | A value that can go up or down at any point in time | CPU usage percentage, memory used |
| counter | A value that only increases, measuring a total count | Total HTTP requests, total tokens consumed |
| histogram | A distribution of values, used for latency and size measurements | Request duration, response size |
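To make the difference between the three types concrete, here is a minimal Python sketch of their semantics. The `Gauge`, `Counter`, and `Histogram` classes and the bucket bounds are hypothetical, written only for illustration; they are not the AgentVista or OpenTelemetry SDK API.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Gauge:
    """Holds the latest observed value; it may rise or fall."""
    value: float = 0.0

    def set(self, value: float) -> None:
        self.value = value


@dataclass
class Counter:
    """Accumulates a running total; negative increments are rejected."""
    total: float = 0.0

    def add(self, amount: float) -> None:
        if amount < 0:
            raise ValueError("counters only increase")
        self.total += amount


@dataclass
class Histogram:
    """Counts observations per bucket; bounds are inclusive upper limits."""
    bounds: tuple = (100.0, 500.0, 1000.0)  # hypothetical bounds, in ms
    counts: list = field(init=False)

    def __post_init__(self) -> None:
        # One bucket per bound, plus an overflow bucket for larger values.
        self.counts = [0] * (len(self.bounds) + 1)

    def record(self, value: float) -> None:
        self.counts[bisect_left(self.bounds, value)] += 1


cpu = Gauge()
cpu.set(71.5)
cpu.set(42.0)          # a gauge can drop again

requests = Counter()
requests.add(1)
requests.add(3)        # a counter only accumulates

latency = Histogram()
for ms in (80, 120, 450, 2300):
    latency.record(ms)  # counts per bucket: [1, 2, 0, 1]
```

The gauge keeps only the most recent reading, the counter keeps a total, and the histogram keeps a count per bucket, which is why histograms suit latency and size distributions.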
What a metric record contains
Each metric stream is identified by a name (e.g. cpu.usage, http.requests.total) and optionally a unit and description. Individual data points carry:
| Field | Description |
|---|---|
| value | The numerical measurement |
| timestamp | When the measurement was taken |
| labels | A JSON object of key-value pairs for filtering and grouping (e.g. {"service": "api", "region": "us-east-1"}) |
You can query metrics by name, filter by labels, and specify a time range. Pre-built dashboards for common services are available out of the box.
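The sketch below shows what a data point with these fields could look like, and how a query by name, labels, and time range narrows a set of points. `MetricPoint` and `query` are hypothetical names used for illustration; this is an in-memory model of the query semantics, not the AgentVista query API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MetricPoint:
    name: str
    value: float
    timestamp: datetime
    labels: dict = field(default_factory=dict)


def query(points, name, labels, start, end):
    """Return points for `name` in [start, end) whose labels include `labels`."""
    return [
        p for p in points
        if p.name == name
        and start <= p.timestamp < end
        and all(p.labels.get(k) == v for k, v in labels.items())
    ]


def at_minute(minute):
    # Helper for building sample timestamps on an arbitrary example day.
    return datetime(2024, 1, 1, 12, minute, tzinfo=timezone.utc)


points = [
    MetricPoint("cpu.usage", 41.0, at_minute(0), {"service": "api", "region": "us-east-1"}),
    MetricPoint("cpu.usage", 87.5, at_minute(5), {"service": "api", "region": "us-east-1"}),
    MetricPoint("cpu.usage", 12.0, at_minute(5), {"service": "worker", "region": "us-east-1"}),
    MetricPoint("http.requests.total", 1024, at_minute(5), {"service": "api"}),
]

# CPU usage for the api service during the first ten minutes.
api_cpu = query(points, "cpu.usage", {"service": "api"}, at_minute(0), at_minute(10))
```

A label filter here matches any point whose labels contain the requested pairs, so filtering on `service` alone still returns points that also carry a `region` label.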
Sending metrics via OTLP
The easiest way to send metrics is through an OpenTelemetry Collector. Point your existing Collector at AgentVista’s OTLP endpoint and your metrics appear in the dashboard as soon as the first batch is exported, with no application code changes required.
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  # Host metrics (CPU, memory, disk, network)
  hostmetrics:
    scrapers:
      cpu:
      memory:
      disk:
      network:
  # Prometheus scrape (if you're already running Prometheus exporters)
  prometheus:
    config:
      scrape_configs:
        - job_name: "my-service"
          static_configs:
            - targets: ["localhost:9090"]
processors:
  batch:
    timeout: 10s
exporters:
  otlphttp:
    endpoint: https://api.agentvista.dev/api/v1
    headers:
      Authorization: "Bearer av_xxxxx"
service:
  pipelines:
    metrics:
      receivers: [otlp, hostmetrics, prometheus]
      processors: [batch]
      exporters: [otlphttp]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
Logs
Logs are structured records of events that occurred in your services. AgentVista stores them with severity levels, service attribution, and optional links to the trace that produced them.
What a log record contains
| Field | Description |
|---|---|
| timestamp | When the event occurred |
| severity | The log level (see below) |
| service_name | The service that emitted this log |
| body | The log message text |
| attributes | Structured JSON fields for additional context |
| trace_id | Optional. Links this log to a distributed trace |
| span_id | Optional. Links this log to a specific span within the trace |
Severity levels
AgentVista stores logs at six severity levels, matching the six named severity ranges (TRACE through FATAL) of the OpenTelemetry log data model:
| Level | When to use |
|---|---|
| trace | Fine-grained debugging information, usually disabled in production |
| debug | Diagnostic information useful during development |
| info | Normal operational events |
| warn | Unexpected conditions that didn’t cause a failure |
| error | Failures that affected a specific operation |
| fatal | Failures that caused the service to stop |
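As a rough illustration of how these levels line up with other systems, the sketch below maps OpenTelemetry severity numbers (1 to 24, four per named range) and Python `logging` levels onto the six level names. The function names and the Python mapping are assumptions made for this example; AgentVista's own ingestion mapping may differ.

```python
import logging

# The six level names, in increasing order of severity.
LEVELS = ("trace", "debug", "info", "warn", "error", "fatal")


def level_from_severity_number(number: int) -> str:
    """Map an OpenTelemetry SeverityNumber (1..24) to its named range."""
    if not 1 <= number <= 24:
        raise ValueError("severity number must be between 1 and 24")
    # Each named range spans four consecutive numbers: 1-4 trace, 5-8 debug, ...
    return LEVELS[(number - 1) // 4]


def level_from_python(levelno: int) -> str:
    """Map a Python logging level to a level name (hypothetical mapping)."""
    if levelno >= logging.CRITICAL:
        return "fatal"
    if levelno >= logging.ERROR:
        return "error"
    if levelno >= logging.WARNING:
        return "warn"
    if levelno >= logging.INFO:
        return "info"
    if levelno >= logging.DEBUG:
        return "debug"
    return "trace"


info_level = level_from_severity_number(9)            # "info"
critical_level = level_from_python(logging.CRITICAL)  # "fatal"
```

Python has no built-in level below DEBUG, so anything lower is treated as trace in this sketch.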
Searching logs
The log explorer supports full-text search on the body field. You can filter by severity, service_name, and time range to narrow down to the events you care about.
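The filters combine as a logical AND. The sketch below models that behavior in memory, using a hypothetical `LogRecord` class that mirrors the fields listed earlier and a hypothetical `search` function. Case-insensitive substring matching is an assumption made for the example; the explorer's exact matching rules are not specified here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class LogRecord:
    timestamp: datetime
    severity: str
    service_name: str
    body: str
    attributes: dict = field(default_factory=dict)
    trace_id: Optional[str] = None
    span_id: Optional[str] = None


def search(records, text, severity=None, service_name=None, start=None, end=None):
    """Return records whose body contains `text` and that pass every given filter."""
    needle = text.lower()
    return [
        r for r in records
        if needle in r.body.lower()
        and (severity is None or r.severity == severity)
        and (service_name is None or r.service_name == service_name)
        and (start is None or r.timestamp >= start)
        and (end is None or r.timestamp < end)
    ]


noon = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
records = [
    LogRecord(noon, "info", "api", "Ticket processed", {"ticket_id": "8821"}),
    LogRecord(noon, "error", "api", "Timeout waiting for database response",
              trace_id="a1b2c3d4"),
    LogRecord(noon, "error", "worker", "Timeout contacting queue"),
]

# Error-level timeouts emitted by the api service.
api_timeouts = search(records, "timeout", severity="error", service_name="api")
```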
Log-to-trace correlation
When a log record includes a trace_id, AgentVista links it to the corresponding trace. From the log explorer, click any log line that has a trace_id to open the full trace waterfall — including all AI spans and infrastructure spans that were active when the log was emitted.
This answers the question that separate tooling can’t: was this error caused by the agent logic, the LLM call, or the database? When you can jump from an error-level log directly to the trace that contains it, the answer is usually obvious.
error log: "Timeout waiting for database response" (service: api, trace_id: a1b2c3d4)
    ↓
Trace a1b2c3d4:
    [agent] "support-bot"                  0ms ─────────────────────── 5200ms FAILED
        [llm] "extract-intent"             10ms ────── 380ms
        [tool] "fetch-ticket"              400ms ──────────────────── 5100ms FAILED
            [db] "SELECT FROM tickets..."  400ms ─────────────────── 5000ms FAILED
                                                                      ↑ timeout here
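Correlation depends on each log record carrying the identifiers of the trace that was active when it was emitted. The sketch below shows the idea with the standard library only: a `logging.Filter` stamps every record with the current `trace_id` and `span_id`. The `current_trace` variable, `new_trace_context`, `TraceContextFilter`, and `ListHandler` are hypothetical names for illustration. In a real setup the OpenTelemetry SDK's logging integration typically attaches these fields for you when a span is active.

```python
import contextvars
import logging
import secrets

# Hypothetical holder for the active trace context; a real application would
# read this from the OpenTelemetry SDK rather than manage it by hand.
current_trace = contextvars.ContextVar("current_trace", default=None)


def new_trace_context():
    """Create hex identifiers: a 16-byte trace_id and an 8-byte span_id."""
    return {"trace_id": secrets.token_hex(16), "span_id": secrets.token_hex(8)}


class TraceContextFilter(logging.Filter):
    """Stamp each record with the active trace_id and span_id, if any."""

    def filter(self, record):
        context = current_trace.get() or {}
        record.trace_id = context.get("trace_id")
        record.span_id = context.get("span_id")
        return True  # never drop the record, only annotate it


class ListHandler(logging.Handler):
    """Collect records in memory so the example can inspect them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("api")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
handler.addFilter(TraceContextFilter())
logger.addHandler(handler)

current_trace.set(new_trace_context())
logger.error("Timeout waiting for database response")
stamped = handler.records[0]  # carries the trace_id and span_id set above
```

Any record emitted outside an active trace gets `None` for both fields, which corresponds to a log line without a trace link.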
Sending logs via OTLP
Add a logs pipeline to your Collector config (shown above) and point your application’s logging library at the Collector:
# Using the OpenTelemetry Python logging handler
import logging

from opentelemetry._logs import set_logger_provider
from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter
from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor

# Register a global logger provider that batches records to the local Collector
provider = LoggerProvider()
set_logger_provider(provider)
exporter = OTLPLogExporter(endpoint="http://localhost:4317", insecure=True)
provider.add_log_record_processor(BatchLogRecordProcessor(exporter))

# Bridge the standard logging module into OpenTelemetry
handler = LoggingHandler(level=logging.INFO, logger_provider=provider)
root = logging.getLogger()
root.addHandler(handler)
# The root logger defaults to WARNING, so lower it or INFO records are dropped
root.setLevel(logging.INFO)

# Now all Python log calls flow to AgentVista via the Collector
logging.info("Ticket processed", extra={"ticket_id": "8821"})
logging.error("Database timeout", extra={"query": "SELECT...", "duration_ms": 5000})
// Using the OpenTelemetry JS logging SDK
import { LoggerProvider, BatchLogRecordProcessor } from '@opentelemetry/sdk-logs';
import { OTLPLogExporter } from '@opentelemetry/exporter-logs-otlp-grpc';

// Batch log records and export them to the local Collector over gRPC.
// Note: recent SDK releases may expect processors to be passed to the
// LoggerProvider constructor instead; check your installed version.
const provider = new LoggerProvider();
provider.addLogRecordProcessor(
  new BatchLogRecordProcessor(
    new OTLPLogExporter({ url: 'http://localhost:4317' })
  )
);

// Emit a structured record through a named logger
const logger = provider.getLogger('my-service');
logger.emit({
  severityText: 'INFO',
  body: 'Ticket processed',
  attributes: { ticket_id: '8821' },
});
If you already run Grafana Loki or another log aggregator, you can run both in parallel during migration. The OpenTelemetry Collector supports multiple exporters per pipeline, so you can point it at AgentVista and your existing destination simultaneously until you’re ready to consolidate.
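A dual-export setup could look like the fragment below. It is a sketch that assumes your existing backend accepts OTLP over HTTP. The `otlphttp/existing` exporter name and its endpoint are hypothetical placeholders; substitute whichever exporter your current aggregator requires.

```yaml
exporters:
  otlphttp:
    endpoint: https://api.agentvista.dev/api/v1
    headers:
      Authorization: "Bearer av_xxxxx"
  # Hypothetical second destination; replace with your existing backend.
  otlphttp/existing:
    endpoint: https://logs.example.internal/otlp

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      # Every record in this pipeline is sent to both exporters.
      exporters: [otlphttp, otlphttp/existing]
```

The `type/name` form lets the Collector run two instances of the same exporter type side by side. Once the migration is done, remove the second exporter from the pipeline.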