Concepts

Core concepts behind AgentMetal — resources, status model, agents, reconciliation, risk levels, tools, and providers.

This page explains the core concepts behind AgentMetal's architecture.

Resources

Everything in AgentMetal is a resource. Every resource has a standard metadata envelope called ResourceMeta:

| Field | Type | Description |
|---|---|---|
| ID | string | Unique identifier (UUID) |
| Name | string | Human-readable name |
| Kind | string | Resource type (Instance, Database, VPC, etc.) |
| Version | int | Monotonically increasing version for optimistic concurrency |
| Labels | map | Key-value pairs for organizing and filtering resources |
| CreatedAt | timestamp | When the resource was created |
| UpdatedAt | timestamp | When the resource was last modified |

Each resource also has a Spec (desired state) and a Status (observed state). Agents work to make the actual state match the desired state.
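
As a rough illustration of the envelope and of how a monotonically increasing Version can support optimistic concurrency, here is a minimal Python sketch. The snake_case field names, the `Resource` wrapper, the `update_spec` helper, and the `cpus` spec key are assumptions made for this example; they are not AgentMetal's actual types or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class ResourceMeta:
    # Mirrors the fields in the table above (names adapted to Python style).
    name: str
    kind: str
    id: str = field(default_factory=lambda: str(uuid4()))
    version: int = 1
    labels: dict[str, str] = field(default_factory=dict)
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)


@dataclass
class Resource:
    meta: ResourceMeta
    spec: dict = field(default_factory=dict)    # desired state
    status: dict = field(default_factory=dict)  # observed state


class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


def update_spec(resource: Resource, expected_version: int, new_spec: dict) -> Resource:
    # Hypothetical helper: reject the write if the caller read an older version.
    if resource.meta.version != expected_version:
        raise VersionConflict(
            f"expected version {expected_version}, found {resource.meta.version}"
        )
    resource.spec = new_spec
    resource.meta.version += 1
    resource.meta.updated_at = _now()
    return resource
```

A caller reads the resource, remembers its version, and passes that version back with the write; a second writer that still holds the old version gets a `VersionConflict` instead of silently overwriting the first change.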

Status Model

Every resource follows a standard lifecycle:

Pending → Creating → Running → Updating → Deleting → Deleted
                        ↓
                      Error
  • Pending — the resource has been accepted but no agent has started work
  • Creating — an agent is actively provisioning the resource
  • Running — the resource is healthy and operational
  • Updating — an agent is applying a configuration change
  • Deleting — an agent is tearing down the resource
  • Deleted — the resource has been fully removed
  • Error — something went wrong; the agent will attempt to heal automatically
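
The lifecycle can be modeled as a small state machine. The sketch below is one possible reading of the diagram: the page only shows the happy path and an Error branch, so the full transition table (for example, Updating returning to Running, or Error being reachable from several states) is an assumption, not documented behavior.

```python
from enum import Enum


class Status(Enum):
    PENDING = "Pending"
    CREATING = "Creating"
    RUNNING = "Running"
    UPDATING = "Updating"
    DELETING = "Deleting"
    DELETED = "Deleted"
    ERROR = "Error"


# Assumed transition table; adjust to match the real lifecycle rules.
ALLOWED = {
    Status.PENDING: {Status.CREATING},
    Status.CREATING: {Status.RUNNING, Status.ERROR},
    Status.RUNNING: {Status.UPDATING, Status.DELETING, Status.ERROR},
    Status.UPDATING: {Status.RUNNING, Status.ERROR},
    Status.DELETING: {Status.DELETED, Status.ERROR},
    Status.ERROR: {Status.RUNNING, Status.DELETING},  # healed, or torn down
    Status.DELETED: set(),                            # terminal state
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if the assumed table allows moving from current to target."""
    return target in ALLOWED[current]
```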

Agents

AgentMetal has 13 specialized agents, one for each service domain plus shared infrastructure agents. Every agent implements four core methods:

  • Reconcile() — compare desired state with actual state and produce a plan to converge them
  • Heal() — detect and remediate failures (restart processes, failover replicas, replace nodes)
  • Plan() — generate a human-readable execution plan for review before applying changes
  • Tools() — return the set of domain-specific tools the agent can use
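
To make the four-method contract concrete, here is a hedged Python sketch of what such an interface might look like. The parameter and return types are assumptions (the page names the methods but not their signatures), and `KeyValueAgent` is a toy agent invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import Callable


class Agent(ABC):
    """Assumed shape of the agent contract; signatures are illustrative."""

    @abstractmethod
    def reconcile(self, desired: dict, actual: dict) -> list[str]:
        """Compare desired and actual state; return steps that converge them."""

    @abstractmethod
    def heal(self, resource_id: str) -> bool:
        """Try to remediate a failure; return True if the resource recovered."""

    @abstractmethod
    def plan(self, desired: dict, actual: dict) -> str:
        """Return a human-readable execution plan for review."""

    @abstractmethod
    def tools(self) -> dict[str, Callable]:
        """Return the domain-specific tools this agent can call."""


class KeyValueAgent(Agent):
    """Toy agent that treats state as a flat dict of settings."""

    def reconcile(self, desired, actual):
        # One step per key whose desired value differs from the actual value.
        return [f"set {k}={v}" for k, v in sorted(desired.items()) if actual.get(k) != v]

    def heal(self, resource_id):
        return True  # nothing to repair in this toy example

    def plan(self, desired, actual):
        steps = self.reconcile(desired, actual)
        return "\n".join(f"{i}. {s}" for i, s in enumerate(steps, 1)) or "no changes"

    def tools(self):
        return {}
```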

Reconciliation Loop

The reconciliation loop is the heart of AgentMetal:

  1. Observe — read desired state from etcd and actual state from the infrastructure
  2. Diff — compare the two and identify what needs to change
  3. Plan — generate a sequence of operations to converge
  4. Approve — check the risk level; auto-approve safe operations or request human approval
  5. Execute — run the planned operations using domain-specific tools
  6. Record — write the outcome to the audit log with full reasoning
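
The six steps above can be sketched as a single pass over in-memory dictionaries. This is a simplified illustration, not AgentMetal's implementation: state is passed in rather than read from etcd, the `approve` callback stands in for the risk check, the audit log is a plain list, and removal of keys is not handled.

```python
from typing import Callable


def reconcile_once(
    desired: dict,
    actual: dict,
    approve: Callable[[str], bool],
    audit_log: list,
) -> dict:
    """Run one observe/diff/plan/approve/execute/record pass; return new state."""
    # 1. Observe: desired and actual are supplied by the caller in this sketch.
    # 2. Diff: keys whose desired value differs from the actual value.
    changes = {k: v for k, v in desired.items() if actual.get(k) != v}
    # 3. Plan: one human-readable step per change.
    plan = [f"set {k}={v!r}" for k, v in changes.items()]

    new_state = dict(actual)
    for step, (key, value) in zip(plan, changes.items()):
        # 4. Approve: skip the step if it has not been approved.
        if not approve(step):
            audit_log.append({"step": step, "outcome": "awaiting approval"})
            continue
        # 5. Execute: apply the change (a real agent would call a tool here).
        new_state[key] = value
        # 6. Record: note the outcome.
        audit_log.append({"step": step, "outcome": "applied"})
    return new_state
```

Running the pass repeatedly is what makes it a loop: a step left as "awaiting approval" shows up again in the next diff until it is approved and applied.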

Risk Levels

Every operation is classified by risk:

| Level | Behavior | Example |
|---|---|---|
| Safe | Auto-executed immediately | Adding a DNS record, scaling up replicas |
| Moderate | Auto-executed with notification | Restarting a service, applying config changes |
| Dangerous | Requires human approval | Deleting data, major version upgrades, failover |
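
The three behaviors in the table map naturally onto a small gating function. The sketch below assumes two hypothetical callbacks, `notify` and `request_approval`; how AgentMetal actually delivers notifications or collects approvals is not described on this page.

```python
from enum import Enum
from typing import Callable


class Risk(Enum):
    SAFE = "Safe"
    MODERATE = "Moderate"
    DANGEROUS = "Dangerous"


def gate(
    risk: Risk,
    notify: Callable[[], None],
    request_approval: Callable[[], bool],
) -> bool:
    """Return True if the operation may execute now."""
    if risk is Risk.SAFE:
        return True              # auto-executed immediately
    if risk is Risk.MODERATE:
        notify()                 # auto-executed, but someone is told
        return True
    return request_approval()    # Dangerous: a human decides
```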

Tools

Agents use domain-specific tools to interact with infrastructure. Tools are strongly typed functions that agents call during execution:

  • SSH — execute commands on remote hosts
  • libvirt — manage VMs (create, destroy, migrate)
  • PostgreSQL — run SQL, manage replication, configure parameters
  • Redis — manage instances, configure sentinel, form clusters
  • nftables — manage firewall rules for security groups
  • WireGuard — configure VPN tunnels for VPC networking
  • HAProxy — manage load balancer configuration and hot-reload
  • MinIO — manage object storage buckets and policies
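
One way to read "strongly typed" is that each tool declares a typed argument object and a typed result, so invalid input is rejected before anything touches a host. The sketch below illustrates that idea only: `Tool`, `FirewallRuleArgs`, and `render_nft_rule` are hypothetical names, and the function merely builds rule text rather than applying it.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A")  # argument type of a tool
R = TypeVar("R")  # result type of a tool


@dataclass(frozen=True)
class Tool(Generic[A, R]):
    name: str
    run: Callable[[A], R]


@dataclass(frozen=True)
class FirewallRuleArgs:
    port: int
    protocol: str = "tcp"

    def __post_init__(self) -> None:
        # Validate at construction time so bad input never reaches a tool.
        if not 1 <= self.port <= 65535:
            raise ValueError(f"port out of range: {self.port}")
        if self.protocol not in ("tcp", "udp"):
            raise ValueError(f"unsupported protocol: {self.protocol}")


def render_nft_rule(args: FirewallRuleArgs) -> str:
    # Builds the text of an nftables rule statement; nothing is applied.
    return f"{args.protocol} dport {args.port} accept"


allow_port = Tool(name="nftables.allow_port", run=render_nft_rule)
```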

Providers

AgentMetal abstracts bare-metal providers behind a BareMetalProvider interface. Supported providers:

  • Hetzner — dedicated servers and cloud instances
  • OVH — dedicated servers across European data centers
  • Equinix Metal — global bare-metal infrastructure

The provider interface handles server provisioning, network configuration, and OS installation. Once a server is provisioned, AgentMetal agents take over all service management.
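
A provider abstraction of this kind might look like the following Python sketch. The interface name comes from this page, but the three method names, their parameters, and the `FakeProvider` and `bootstrap` helpers are assumptions chosen to mirror the three responsibilities listed above; the plan, CIDR, and image values in any call are placeholders.

```python
from typing import Protocol


class BareMetalProvider(Protocol):
    """Assumed method set: provisioning, networking, OS installation."""

    def provision_server(self, plan: str) -> str: ...
    def configure_network(self, server_id: str, cidr: str) -> None: ...
    def install_os(self, server_id: str, image: str) -> None: ...


class FakeProvider:
    """In-memory stand-in, useful for tests; talks to no real provider."""

    def __init__(self) -> None:
        self.servers: dict[str, dict] = {}

    def provision_server(self, plan: str) -> str:
        server_id = f"srv-{len(self.servers) + 1}"
        self.servers[server_id] = {"plan": plan}
        return server_id

    def configure_network(self, server_id: str, cidr: str) -> None:
        self.servers[server_id]["cidr"] = cidr

    def install_os(self, server_id: str, image: str) -> None:
        self.servers[server_id]["image"] = image


def bootstrap(provider: BareMetalProvider, plan: str, cidr: str, image: str) -> str:
    # Works with any object that has the three methods above.
    server_id = provider.provision_server(plan)
    provider.configure_network(server_id, cidr)
    provider.install_os(server_id, image)
    return server_id
```

Because `bootstrap` depends only on the protocol, swapping one provider for another does not change the calling code.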

Operations

Long-running tasks are tracked as operations. When you create a resource, the API returns an operation ID that you can poll:

agentmetal operation get op-abc123

Or via the API:

GET /v1/operations/{id}

Each operation reports its status, a progress percentage, and the list of completed steps.
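
Polling can be wrapped in a small helper. The sketch below takes a `fetch` callable (for example, a function that issues `GET /v1/operations/{id}` and returns the parsed body) so that it stays independent of any HTTP client. The response keys and the terminal status values `"succeeded"` and `"failed"` are assumptions; check the API reference for the real ones.

```python
import time
from typing import Callable

# Assumed terminal states; the real API may use different values.
TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_for_operation(
    fetch: Callable[[str], dict],
    op_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch(op_id) until the operation reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        op = fetch(op_id)
        if op["status"] in TERMINAL_STATUSES:
            return op
        if time.monotonic() >= deadline:
            raise TimeoutError(f"operation {op_id} still {op['status']!r}")
        sleep(interval)
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can supply a no-op instead of waiting in real time.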