Platform Features

Built for the Future of Infrastructure

AgentMetal combines bare-metal performance with AI-powered operations. Every feature is designed to reduce manual toil while keeping you in control.

illustration

AI Agent Operations

Every service is managed by a specialized AI agent that understands its domain deeply. Agents observe desired vs actual state, reason about changes, and execute multi-step plans — all with full audit trails.

  • 13 specialized agents covering compute, database, networking, and more
  • Continuous reconciliation loop keeps actual state matched to desired state
  • Full audit log for every decision and action taken
  • Configurable agent behavior through policy definitions
illustration

Self-Healing Infrastructure

When things break, agents detect the issue through monitoring, diagnose the root cause, and execute repairs automatically. From restarting crashed services to rebuilding failed replicas.

  • Automatic detection via Prometheus metrics and health checks
  • Root-cause analysis before executing repairs
  • Graduated response: restart, rebuild, failover, or alert
  • Incident reports generated for every healing action
illustration

Predictive Auto-Scaling

Linear regression forecasting predicts traffic spikes before they happen. Agents pre-scale infrastructure so your services never miss a beat.

  • Time-series forecasting on CPU, memory, and request metrics
  • Pre-scale resources minutes before demand arrives
  • Scale-down cooldown periods prevent thrashing
  • Custom scaling policies per service or workload
illustration

Infrastructure as Code

Define entire stacks in JSON with dependency resolution. Plan, apply, and destroy with Terraform-like semantics — but agents handle the complexity.

  • JSON stack definitions with resource dependencies
  • Plan preview shows exactly what will change before applying
  • Rollback support with state snapshots
  • Import existing resources into managed stacks
illustration

Human-in-the-Loop Approvals

Dangerous operations require human approval. Configure risk policies per operation type: safe ops auto-execute, moderate ops notify, dangerous ops wait for approval.

  • Three-tier risk classification: safe, moderate, dangerous
  • Slack and webhook notifications for pending approvals
  • Time-boxed approval windows with automatic rollback
  • Per-team and per-environment policy configuration
illustration

Multi-Provider Bare Metal

Deploy across Hetzner, OVH, and Equinix Metal. Provider-agnostic abstraction means you can move workloads between providers without changing your stack definitions.

  • Unified API across Hetzner, OVH, and Equinix Metal
  • Cross-provider VPC networking with WireGuard mesh
  • Provider-specific optimizations handled transparently
  • Migrate workloads between providers with a single command