Built for the Future of Infrastructure
AgentMetal combines bare-metal performance with AI-powered operations. Every feature is designed to reduce manual toil while keeping you in control.
AI Agent Operations
Every service is managed by a specialized AI agent that understands its domain deeply. Agents observe desired vs actual state, reason about changes, and execute multi-step plans — all with full audit trails.
- 13 specialized agents covering compute, database, networking, and more
- Continuous reconciliation loop keeps actual state matched to desired state
- Full audit log for every decision and action taken
- Configurable agent behavior through policy definitions
Self-Healing Infrastructure
When things break, agents detect the issue through monitoring, diagnose the root cause, and execute repairs automatically. From restarting crashed services to rebuilding failed replicas.
- Automatic detection via Prometheus metrics and health checks
- Root-cause analysis before executing repairs
- Graduated response: restart, rebuild, failover, or alert
- Incident reports generated for every healing action
Predictive Auto-Scaling
Linear regression forecasting predicts traffic spikes before they happen. Agents pre-scale infrastructure so your services never miss a beat.
- Time-series forecasting on CPU, memory, and request metrics
- Pre-scale resources minutes before demand arrives
- Scale-down cooldown periods prevent thrashing
- Custom scaling policies per service or workload
Infrastructure as Code
Define entire stacks in JSON with dependency resolution. Plan, apply, and destroy with Terraform-like semantics — but agents handle the complexity.
- JSON stack definitions with resource dependencies
- Plan preview shows exactly what will change before applying
- Rollback support with state snapshots
- Import existing resources into managed stacks
Human-in-the-Loop Approvals
Dangerous operations require human approval. Configure risk policies per operation type: safe ops auto-execute, moderate ops notify, dangerous ops wait for approval.
- Three-tier risk classification: safe, moderate, dangerous
- Slack and webhook notifications for pending approvals
- Time-boxed approval windows with automatic rollback
- Per-team and per-environment policy configuration
Multi-Provider Bare Metal
Deploy across Hetzner, OVH, and Equinix Metal. Provider-agnostic abstraction means you can move workloads between providers without changing your stack definitions.
- Unified API across Hetzner, OVH, and Equinix Metal
- Cross-provider VPC networking with WireGuard mesh
- Provider-specific optimizations handled transparently
- Migrate workloads between providers with a single command