Metrics & Logs - AgentVista

AgentVista ingests metrics and logs from your infrastructure and services through the OpenTelemetry protocol. Combined with agent traces, they give you a complete picture of what’s happening across your entire stack — not just inside the model.

Metrics and logs are not a separate product — they work alongside your AI agent traces in the same platform. A log line and the agent trace that produced it share the same trace_id, so you can move between them with a single click.

Metrics

Metrics are time-series numerical measurements: CPU utilization, memory usage, request rates, error counts, or any custom business signal you want to track over time.

Metric types

AgentVista supports three standard metric types:

Type	Description	Example
`gauge`	A value that can go up or down at any point in time	CPU usage percentage, memory used
`counter`	A value that only increases, measuring a total count	Total HTTP requests, total tokens consumed
`histogram`	A distribution of values, used for latency and size measurements	Request duration, response size
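To make the difference between the three types concrete, the sketch below models their semantics in plain Python. The class names (`Gauge`, `Counter`, `Histogram`) and the bucket bounds are illustrative assumptions, not AgentVista or OpenTelemetry SDK APIs.

```python
import bisect


class Gauge:
    """Holds the latest value; it may rise or fall between readings."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class Counter:
    """Monotonic running total; negative increments are rejected."""

    def __init__(self):
        self.total = 0

    def add(self, amount=1):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.total += amount


class Histogram:
    """Counts observations per bucket; each bound is an inclusive upper limit."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.counts = [0] * (len(self.bounds) + 1)  # final bucket catches overflow
        self.sum = 0.0

    def record(self, value):
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.sum += value


cpu = Gauge()
cpu.set(71.5)
cpu.set(64.0)              # a gauge can go back down

http_requests = Counter()
http_requests.add()
http_requests.add(3)       # a counter only accumulates

latency = Histogram([100, 500, 1000])  # hypothetical bucket bounds in ms
for ms in (40, 380, 5000):
    latency.record(ms)

print(cpu.value, http_requests.total, latency.counts)  # 64.0 4 [1, 1, 0, 1]
```

A histogram keeps bucket counts rather than individual values, which is why it suits latency and size measurements where the distribution matters more than any single reading.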

What a metric record contains

Each metric stream is identified by a name (e.g. cpu.usage, http.requests.total) and optionally a unit and description. Individual data points carry:

Field	Description
`value`	The numerical measurement
`timestamp`	When the measurement was taken
`labels`	A JSON object of key-value pairs for filtering and grouping (e.g. `{"service": "api", "region": "us-east-1"}`)

You can query metrics by name, filter by labels, and specify a time range. Pre-built dashboards for common services are available out of the box.
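The query model described above (name, label filter, time range) can be sketched client-side as follows. This is a minimal illustration over in-memory data, not the AgentVista query API; `DataPoint`, `query`, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DataPoint:
    name: str
    value: float
    timestamp: datetime
    labels: dict = field(default_factory=dict)


def query(points, name, labels=None, start=None, end=None):
    """Return points for `name` whose labels include every requested pair
    and whose timestamp falls within [start, end]."""
    wanted = labels or {}
    return [
        p for p in points
        if p.name == name
        and all(p.labels.get(k) == v for k, v in wanted.items())
        and (start is None or p.timestamp >= start)
        and (end is None or p.timestamp <= end)
    ]


def ts(minute):
    """Helper: a UTC timestamp at 12:<minute> on an arbitrary sample day."""
    return datetime(2024, 1, 1, 12, minute, tzinfo=timezone.utc)


points = [
    DataPoint("cpu.usage", 41.0, ts(0), {"service": "api", "region": "us-east-1"}),
    DataPoint("cpu.usage", 87.5, ts(5), {"service": "api", "region": "us-east-1"}),
    DataPoint("cpu.usage", 12.0, ts(5), {"service": "worker", "region": "us-east-1"}),
    DataPoint("http.requests.total", 1024, ts(5), {"service": "api"}),
]

hits = query(points, "cpu.usage", labels={"service": "api"}, start=ts(3))
print([p.value for p in hits])  # [87.5]
```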

Sending metrics via OTLP

The easiest way to send metrics is through an OpenTelemetry Collector. Point your existing Collector at AgentVista’s OTLP endpoint and your metrics start appearing in the dashboard immediately — no code changes required.

# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

  # Host metrics (CPU, memory, disk, network)
  hostmetrics:
    scrapers:
      cpu:
      memory:
      disk:
      network:

  # Prometheus scrape (if you're already running Prometheus exporters)
  prometheus:
    config:
      scrape_configs:
        - job_name: "my-service"
          static_configs:
            - targets: ["localhost:9090"]

processors:
  batch:
    timeout: 10s

exporters:
  otlphttp:
    endpoint: https://api.agentvista.dev/api/v1
    headers:
      Authorization: "Bearer av_xxxxx"

service:
  pipelines:
    metrics:
      receivers: [otlp, hostmetrics, prometheus]
      processors: [batch]
      exporters: [otlphttp]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
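To check the metrics pipeline end to end without instrumenting a service, you can post a single data point to the Collector's OTLP/HTTP receiver using the protocol's JSON encoding. The sketch below uses only the Python standard library and assumes the Collector from the config above is listening on `localhost:4318`. The function names, the scope name, and the sample values are hypothetical, and how AgentVista maps OTLP attributes onto the `labels` field is an assumption.

```python
import json
import time
import urllib.request


def gauge_payload(name, value, labels, service_name, unit=""):
    """Build an OTLP/HTTP JSON body carrying one gauge data point."""

    def attrs(mapping):
        # OTLP encodes attributes as a list of typed key/value pairs.
        return [{"key": k, "value": {"stringValue": str(v)}} for k, v in mapping.items()]

    return {
        "resourceMetrics": [{
            "resource": {"attributes": attrs({"service.name": service_name})},
            "scopeMetrics": [{
                "scope": {"name": "manual-smoke-test"},  # hypothetical scope name
                "metrics": [{
                    "name": name,
                    "unit": unit,
                    "gauge": {"dataPoints": [{
                        "asDouble": float(value),
                        "timeUnixNano": str(time.time_ns()),
                        "attributes": attrs(labels),
                    }]},
                }],
            }],
        }]
    }


def send(payload, collector_url="http://localhost:4318/v1/metrics"):
    """POST the payload to the Collector's OTLP/HTTP receiver and return the status code."""
    request = urllib.request.Request(
        collector_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


payload = gauge_payload("cpu.usage", 64.0, {"region": "us-east-1"},
                        service_name="api", unit="%")
# send(payload)  # uncomment once a Collector is listening on localhost:4318
```

Sending to the local Collector rather than directly to AgentVista keeps the API key in one place, the Collector's exporter config.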

Logs

Logs are structured records of events that occurred in your services. AgentVista stores them with severity levels, service attribution, and optional links to the trace that produced them.

What a log record contains

Field	Description
`timestamp`	When the event occurred
`severity`	The log level (see below)
`service_name`	The service that emitted this log
`body`	The log message text
`attributes`	Structured JSON fields for additional context
`trace_id`	Optional — links this log to a distributed trace
`span_id`	Optional — links this log to a specific span within the trace

Severity levels

AgentVista stores logs at six severity levels, matching the six named severity ranges (TRACE through FATAL) of the OpenTelemetry log data model:

Level	When to use
`trace`	Fine-grained debugging information, usually disabled in production
`debug`	Diagnostic information useful during development
`info`	Normal operational events
`warn`	Unexpected conditions that didn’t cause a failure
`error`	Failures that affected a specific operation
`fatal`	Failures that caused the service to stop
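In the OpenTelemetry log data model, each of these levels corresponds to a range of four `SeverityNumber` values (1-4 for TRACE up to 21-24 for FATAL). The sketch below shows that mapping, plus one plausible mapping from Python's standard `logging` levels. Whether AgentVista derives the level from the severity number or the severity text is not stated here, and the Python mapping is an assumption.

```python
import logging

LEVELS = ["trace", "debug", "info", "warn", "error", "fatal"]


def level_from_severity_number(number):
    """Map an OpenTelemetry SeverityNumber (1-24) to one of the six levels."""
    if not 1 <= number <= 24:
        raise ValueError(f"SeverityNumber out of range: {number}")
    return LEVELS[(number - 1) // 4]  # each level spans four consecutive numbers


def level_from_python(levelno):
    """Map a stdlib logging level to the closest of the six levels (assumed mapping)."""
    if levelno >= logging.CRITICAL:
        return "fatal"
    if levelno >= logging.ERROR:
        return "error"
    if levelno >= logging.WARNING:
        return "warn"
    if levelno >= logging.INFO:
        return "info"
    if levelno >= logging.DEBUG:
        return "debug"
    return "trace"


print(level_from_severity_number(9))         # info
print(level_from_severity_number(17))        # error
print(level_from_python(logging.WARNING))    # warn
```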

Searching logs

The log explorer supports full-text search on the body field. You can filter by severity, service_name, and time range to narrow down to the events you care about.
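The filters combine with AND semantics, as the following in-memory sketch illustrates. It is not the log explorer's implementation: `LogRecord` and `search` are hypothetical names, and a case-insensitive substring match stands in for real full-text search.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class LogRecord:
    timestamp: datetime
    severity: str
    service_name: str
    body: str
    attributes: dict = field(default_factory=dict)
    trace_id: Optional[str] = None
    span_id: Optional[str] = None


def search(logs, text=None, severity=None, service_name=None, start=None, end=None):
    """Return records matching every filter that was supplied."""
    needle = text.lower() if text else None
    return [
        r for r in logs
        if (needle is None or needle in r.body.lower())
        and (severity is None or r.severity == severity)
        and (service_name is None or r.service_name == service_name)
        and (start is None or r.timestamp >= start)
        and (end is None or r.timestamp <= end)
    ]


noon = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
logs = [
    LogRecord(noon, "info", "api", "Ticket processed", {"ticket_id": "8821"}),
    LogRecord(noon, "error", "api", "Timeout waiting for database response",
              trace_id="a1b2c3d4"),
    LogRecord(noon, "error", "worker", "Queue connection timeout"),
]

hits = search(logs, text="timeout", severity="error", service_name="api")
print([(r.body, r.trace_id) for r in hits])
# [('Timeout waiting for database response', 'a1b2c3d4')]
```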

Log-to-trace correlation

When a log record includes a trace_id, AgentVista links it to the corresponding trace. From the log explorer, click any log line that has a trace_id to open the full trace waterfall — including all AI spans and infrastructure spans that were active when the log was emitted. This answers the question that separate tooling can’t: was this error caused by the agent logic, the LLM call, or the database? When you can jump from an error-level log directly to the trace that contains it, the answer is usually obvious.

error log: "Timeout waiting for database response" (service: api, trace_id: a1b2c3d4)
     ↓
Trace a1b2c3d4:
  [agent] "support-bot"             0ms ─────────────────────── 5200ms  FAILED
    [llm] "extract-intent"          10ms ────── 380ms
    [tool] "fetch-ticket"                  400ms ──────────────────── 5100ms  FAILED
      [db] "SELECT FROM tickets..."        400ms ─────────────────── 5000ms  FAILED
                                                                    ↑ timeout here
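Reading the waterfall above amounts to finding the deepest span that failed: its parents failed only because it did. A minimal sketch of that reasoning, using the spans from the example trace, is shown below. The `Span` type, the `root_cause` helper, and the span IDs (`s1` to `s4`) are hypothetical and not part of any AgentVista API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    kind: str
    name: str
    start_ms: int
    end_ms: int
    failed: bool = False


def root_cause(spans):
    """Return the failed span that has no failed children, or None."""
    parents_of_failures = {s.parent_id for s in spans if s.failed}
    leaves = [s for s in spans if s.failed and s.span_id not in parents_of_failures]
    # If several branches failed, prefer the one that started first.
    return min(leaves, key=lambda s: s.start_ms) if leaves else None


trace = [
    Span("s1", None, "agent", "support-bot", 0, 5200, failed=True),
    Span("s2", "s1", "llm", "extract-intent", 10, 380),
    Span("s3", "s1", "tool", "fetch-ticket", 400, 5100, failed=True),
    Span("s4", "s3", "db", "SELECT FROM tickets...", 400, 5000, failed=True),
]

cause = root_cause(trace)
print(cause.kind, cause.end_ms - cause.start_ms)  # db 4600
```

Here the agent and tool spans are marked failed only because the database query beneath them timed out, so the database span is the one to investigate.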

Sending logs via OTLP

Add a logs pipeline to your Collector config (shown above) and point your application’s logging library at the Collector:

Python

# Using the OpenTelemetry Python logging handler
from opentelemetry._logs import set_logger_provider
from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter
from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
import logging

provider = LoggerProvider()
set_logger_provider(provider)

exporter = OTLPLogExporter(endpoint="http://localhost:4317", insecure=True)
provider.add_log_record_processor(BatchLogRecordProcessor(exporter))

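handler = LoggingHandler(level=logging.INFO, logger_provider=provider)
root_logger = logging.getLogger()
root_logger.addHandler(handler)
# The root logger defaults to WARNING; lower it so INFO records reach the handler
root_logger.setLevel(logging.INFO)

# Python log calls at INFO and above now flow to AgentVista via the Collector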
logging.info("Ticket processed", extra={"ticket_id": "8821"})
logging.error("Database timeout", extra={"query": "SELECT...", "duration_ms": 5000})

Node.js

// Using the OpenTelemetry JS logging SDK
import { LoggerProvider, BatchLogRecordProcessor } from '@opentelemetry/sdk-logs';
import { OTLPLogExporter } from '@opentelemetry/exporter-logs-otlp-grpc';

const provider = new LoggerProvider();
provider.addLogRecordProcessor(
  new BatchLogRecordProcessor(
    new OTLPLogExporter({ url: 'http://localhost:4317' })
  )
);

const logger = provider.getLogger('my-service');
logger.emit({
  severityText: 'INFO',
  body: 'Ticket processed',
  attributes: { ticket_id: '8821' },
});

If you already run Loki with Grafana, or another log aggregator, you can run both in parallel during migration. The OpenTelemetry Collector supports multiple exporters per pipeline, so you can point it at AgentVista and your existing destination simultaneously until you’re ready to consolidate.
