Metal3
| Enterprise | | | | |
|---|---|---|---|---|
| Available in these plans | Free | Dev | Prod | Scale |
| Metal3 Node Provider | | | | |
The Metal3 provider allows you to provision bare metal servers as Machines using Metal3 and Ironic.
When a Machine is requested, the platform claims an available BareMetalHost resource, configures it with the requested OS image and user data, and Ironic handles the PXE boot and OS installation on the physical server.
This enables you to offer different configurations of bare metal servers, all managed through BareMetalHost resources on a Control Plane Cluster. Machines can be used as private nodes for tenant clusters or provisioned independently.
Overview
The Metal3 provider works by selecting available BareMetalHost resources on a Control Plane Cluster and provisioning them with an OS image and user data configuration. Node types let you organize bare metal servers by type, location, or other criteria. When a Machine is created:
- The platform selects an available BareMetalHost matching the node type's label selector and resource requirements.
- The platform configures the BareMetalHost with the OS image, user data, and optional network configuration.
- Ironic provisions the server through a series of steps (power management, PXE boot, in-memory installer).
- The server boots into the provisioned OS and is initialized.
When the Machine is deleted, the platform restores the BareMetalHost to its original state, making it available for reuse.
How it works: Provisioning
When a Machine is created, the platform provisions the bare metal server through the following steps:
- The platform generates a user data configuration and stores it in a Kubernetes Secret on the Control Plane Cluster.
- The provider sets the BareMetalHost's userData reference to this Secret and sets the image information from the configured OSImage or direct image properties.
- Ironic provisions the server through a series of steps: power management, PXE boot, and an in-memory installer that writes the OS to disk.
- The server boots into the provisioned OS and is initialized with the user data configuration.
When a Machine is used as a private node for a tenant cluster, the user data includes registration scripts that automatically join the server to the tenant cluster.
Infrastructure deployment
The Metal3 provider can deploy the required infrastructure components on the Control Plane Cluster. Each component can be enabled individually and customized with Helm values. You can also manage Metal3 and Ironic yourself: disable the respective component and the provider uses whatever is already deployed. The deployable components are:
- Metal3 & Ironic
- DHCP Server
- Multus
Metal3 & Ironic
Bare metal provisioning and lifecycle management. Deploys the Bare Metal Operator and Ironic.
Helm values:
| Value | Description | Default |
|---|---|---|
| ironic.image.repository | Ironic container image | quay.io/metal3-io/ironic |
| ironic.image.tag | Ironic image tag | release-32.0 |
| ipaDownloader.image.repository | IPA ramdisk downloader image | quay.io/metal3-io/ironic-ipa-downloader |
| ipaDownloader.image.tag | IPA ramdisk downloader tag | latest |
| bareMetalOperator.image.repository | Bare Metal Operator image | quay.io/metal3-io/baremetal-operator |
| bareMetalOperator.image.tag | Bare Metal Operator tag | v0.12.0 |
deploy:
  metal3:
    enabled: true
    helmValues: |
      ironic:
        image:
          tag: release-32.0
DHCP Server
Handles PXE and HTTP Boot by acting as a proxy between the bare metal servers and Ironic, which may reside on a different network. When the provider also deploys Metal3, the Ironic endpoint URLs are configured automatically.
Deployment fields (NodeProvider metal3.deploy.dhcp):
| Field | Description | Default |
|---|---|---|
| enabled | Deploy the DHCP server. | false |
| chartRepo | Override the Helm chart repository. | oci://ghcr.io/loft-sh/charts |
| chart | Override the Helm chart name. | vcluster-platform-dhcp-server |
| version | Override the Helm chart version. | Bundled version |
| helmValues | Raw YAML passed as values to the chart. | — |
Helm values:
| Value | Description | Default |
|---|---|---|
| image.registry | Container image registry. | ghcr.io |
| image.repository | Container image repository. | loft-sh/vcluster-platform-dhcp-server |
| image.tag | Container image tag. | Chart app version |
| image.pullPolicy | Image pull policy. | Always |
| imagePullSecrets | Pull secrets for the image. | [] |
| hostNetwork | Run the pod on the host network. When true, the pod also runs as root and the Multus NetworkAttachmentDefinition is not attached. | false |
| podAnnotations / podLabels | Extra pod metadata. | {} |
| resources | Container resource requests and limits. | {} |
| nodeSelector / tolerations / affinity | Pod scheduling constraints. | {} / [] / {} |
| securityContext | Container security context. Overridden to runAsUser: 0 when hostNetwork is true. | NET_BIND_SERVICE capability, runAsNonRoot: true, runAsUser: 1000 |
| extraArgs | Extra arguments appended to the server binary. Use to set the IPA inspector and installer kernel parameters. | [] |
| extraEnv | Extra environment variables on the server container. | [] |
| networkAttachmentDefinition.vip | Virtual IP with prefix for the DHCP server. Injected into the NetworkAttachmentDefinition under ipam.static. Ignored when hostNetwork: true. | 192.168.100.3/24 |
| networkAttachmentDefinition.config | CNI configuration JSON for the NetworkAttachmentDefinition. Use bridge if Control Plane Cluster nodes have a bridge attached to the provisioning network. Use macvlan if the bare metal servers are on the same network as the Control Plane Cluster nodes. | bridge br0 |
| dhcp.serverIP | IP the DHCP server advertises as itself. | $(SERVER_IP) (pod env) |
| dhcp.port | DHCP listen port. | 67 |
| dhcp.listenAddr | DHCP listen address. Set to $(SERVER_IP):{{ .Values.dhcp.port }} under hostNetwork to avoid answering on shared bridges. Rendered with tpl. | 0.0.0.0:67 |
| tftp.port | TFTP listen port. | 69 |
| tftp.listenAddr | TFTP listen address. Set to "" to disable TFTP entirely (HTTP Boot still works). Rendered with tpl. | $(SERVER_IP):69 |
| http.port / http.listenAddr / http.advertiseUrl | HTTP boot file server config. Rendered with tpl. | 8080 |
| proxy.callbackPort / proxy.callbackListenAddr / proxy.callbackAdvertiseUrl | IPA callback proxy config. Rendered with tpl. | 8081 |
| proxy.ironicHttpUrl | Ironic HTTP endpoint for image serving. | Auto-configured when Metal3 is enabled |
| proxy.ironicApiUrl | Ironic API endpoint. | Auto-configured when Metal3 is enabled |
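As an illustration of how these values combine, the sketch below disables TFTP and points the proxy at an Ironic instance that you manage yourself. Both Ironic URLs are hypothetical placeholders. Setting them by hand when deploy.metal3 is disabled is an assumption drawn from the defaults listed above, not a documented requirement.

```yaml
deploy:
  metal3:
    enabled: false              # assumption: Metal3 and Ironic are managed outside the provider
  dhcp:
    enabled: true
    helmValues: |
      tftp:
        listenAddr: ""          # empty string disables TFTP; HTTP Boot still works
      proxy:
        # Hypothetical endpoints; replace with the URLs of your own Ironic deployment.
        ironicHttpUrl: http://ironic.example.internal:6180
        ironicApiUrl: https://ironic.example.internal:6385
```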
Server flags (via extraArgs):
| Flag | Description |
|---|---|
| --inspector-collectors | Comma-separated list overriding the ipa-inspection-collectors kernel parameter in the inspector iPXE script. |
| --inspector-extra-kernel-param | Extra kernel parameter appended to the inspector iPXE script. May be specified multiple times. |
| --installer-extra-kernel-param | Extra kernel parameter appended to the installer iPXE script. May be specified multiple times. |
| --callback-insecure-skip-verify | Skip TLS verification when the callback proxy forwards to IPA. |
Per-BareMetalHost DHCP annotations:
The DHCP server reads these annotations from the matched BareMetalHost to build the DHCP reply.
| Annotation | Description |
|---|---|
| metal3.vcluster.com/ip-address | IPv4 address with prefix (CIDR) to lease. Required. |
| metal3.vcluster.com/gateway | Default gateway. If unset, the first host address in the ip-address CIDR is used. |
| metal3.vcluster.com/dns-servers | Comma-separated DNS servers. |
| metal3.vcluster.com/ntp-servers | Comma-separated NTP servers. |
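As a sketch of how these annotations might look on a host, the fragment below shows only the metadata of a BareMetalHost named server-01. All addresses are illustrative placeholders rather than defaults, and the spec is omitted.

```yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: server-01
  namespace: metal3-system
  annotations:
    # Required: address to lease, in CIDR form (illustrative value).
    metal3.vcluster.com/ip-address: 192.168.100.10/24
    # Optional: if omitted, the first host address in the CIDR is used.
    metal3.vcluster.com/gateway: 192.168.100.1
    # Optional: comma-separated lists (illustrative values).
    metal3.vcluster.com/dns-servers: 8.8.8.8,8.8.4.4
    metal3.vcluster.com/ntp-servers: 192.168.100.1
# spec (bmc, bootMACAddress, ...) omitted for brevity
```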
Example with a bridge:
deploy:
  dhcp:
    enabled: true
    helmValues: |
      networkAttachmentDefinition:
        vip: 192.168.100.2/24
        config: |
          {
            "cniVersion": "0.3.1",
            "type": "bridge",
            "bridge": "br0",
            "isDefaultGateway": false
          }
Example with macvlan (bare metal servers on the same network as the Control Plane Cluster nodes):
deploy:
  dhcp:
    enabled: true
    helmValues: |
      networkAttachmentDefinition:
        vip: 10.0.0.2/24
        config: |
          {
            "cniVersion": "0.3.1",
            "type": "macvlan",
            "master": "eth0",
            "mode": "bridge"
          }
Example with hostNetwork and custom IPA kernel parameters:
deploy:
  dhcp:
    enabled: true
    helmValues: |
      hostNetwork: true
      dhcp:
        listenAddr: "$(SERVER_IP):67"
      extraArgs:
        - --inspector-extra-kernel-param=console=ttyS0,115200n8
        - --installer-extra-kernel-param=console=ttyS0,115200n8
Multus
CNI plugin that enables attaching the DHCP server to a separate provisioning network.
Helm values:
| Value | Description | Default |
|---|---|---|
| namespace | Namespace to deploy Multus into | Provider namespace |
| image.registry | Container image registry | ghcr.io |
| image.repository | Container image repository | k8snetworkplumbingwg/multus-cni |
| image.tag | Container image tag | snapshot-thick |
deploy:
  multus:
    enabled: true
    helmValues: |
      namespace: kube-system
The DHCP server is automatically configured based on BareMetalHost resources and platform IPAM when using network properties.
Configuration
A Metal3 NodeProvider configuration consists of a cluster reference, optional infrastructure deployment settings, and a list of node types.
Cluster reference
The clusterRef specifies the Control Plane Cluster where the platform installs Metal3 components and where the BareMetalHost resources live. This cluster must be connected to vCluster Platform.
| Field | Description | Required |
|---|---|---|
| clusterRef.cluster | Name of the connected Control Plane Cluster | Yes |
| clusterRef.namespace | Namespace on the Control Plane Cluster for Metal3 components and BareMetalHost resources | Yes |
Ironic must have network access to the BMC addresses of the bare metal servers. Ensure the Control Plane Cluster where Ironic is deployed can reach the BMC network (Redfish/IPMI endpoints).
Example
This configuration deploys Metal3, Ironic, the DHCP server, and Multus, and defines a single node type that selects BareMetalHosts with the label role: compute.
apiVersion: management.loft.sh/v1
kind: NodeProvider
metadata:
  name: metal3-provider
spec:
  displayName: "Metal3 Bare Metal Provider"
  metal3:
    clusterRef:
      cluster: bare-metal-cluster
      namespace: metal3-system
    deploy:
      metal3:
        enabled: true
      dhcp:
        enabled: true
        helmValues: |
          networkAttachmentDefinition:
            vip: 192.168.100.2/24
      multus:
        enabled: true
        helmValues: |
          namespace: kube-system
    nodeTypes:
      - name: "compute-node"
        displayName: "Compute Node"
        resources:
          cpu: "32"
          memory: 128Gi
        bareMetalHosts:
          selector:
            matchLabels:
              role: compute
        properties:
          vcluster.com/os-image: ubuntu-noble
Get started
This walkthrough covers the essential steps to go from a connected Control Plane Cluster to a provisioned bare metal server.
Apply the NodeProvider.
Create a NodeProvider that references your Control Plane Cluster and defines at least one node type. The node type's label selector determines which BareMetalHosts it can claim.
apiVersion: management.loft.sh/v1
kind: NodeProvider
metadata:
  name: metal3-provider
spec:
  displayName: "Metal3 Bare Metal Provider"
  metal3:
    clusterRef:
      cluster: bare-metal-cluster
      namespace: metal3-system
    deploy:
      metal3:
        enabled: true
      dhcp:
        enabled: true
    nodeTypes:
      - name: "compute-node"
        displayName: "Compute Node"
        resources:
          cpu: "32"
          memory: 128Gi
        bareMetalHosts:
          selector:
            matchLabels:
              role: compute
        properties:
          vcluster.com/os-image: ubuntu-noble
kubectl apply -f metal3-provider.yaml
Wait for Metal3 and Ironic to be running on the Control Plane Cluster before creating BareMetalHost resources. The Metal3 webhook must be ready to validate them.
Create BMC credentials.
Create a Secret with the BMC username and password for your server. This Secret is referenced by the BareMetalHost resource.
apiVersion: v1
kind: Secret
metadata:
  name: server-01-bmc
  namespace: metal3-system
type: Opaque
stringData:
  username: admin
  password: <BMC-PASSWORD>
kubectl apply -f server-01-bmc-secret.yaml
Create a BareMetalHost.
Register the physical server by creating a BareMetalHost resource. The bmc.address scheme determines which driver Metal3 uses (Redfish, IPMI, etc.). The bootMACAddress identifies the NIC used for PXE boot.
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: server-01
  namespace: metal3-system
  labels:
    role: compute
spec:
  bmc:
    address: redfish://192.168.1.100
    credentialsName: server-01-bmc
    disableCertificateVerification: true
  bootMACAddress: "aa:bb:cc:dd:ee:01"
kubectl apply -f server-01-bmh.yaml
The server moves through the registering and inspecting states as Metal3 verifies BMC access and collects hardware inventory.
Verify the server reaches available state.
Once the BareMetalHost passes inspection, it transitions to available. This means the server is registered, its hardware inventory is collected, and it is ready for provisioning.
kubectl get baremetalhost -n metal3-system
NAME        STATE       CONSUMER   ONLINE   ERROR
server-01   available              true
Create a vCluster that claims the server.
Create a vCluster with private nodes configured to use the Metal3 provider. The platform provisions a Machine, which claims the BareMetalHost and installs the OS through Ironic.
privateNodes:
  enabled: true
  autoNodes:
    - provider: metal3-provider
      static:
        - name: compute-nodes
          quantity: 1
          nodeTypeSelector:
            - property: vcluster.com/node-type
              value: compute-node
After provisioning completes, the server boots into the configured OS and joins the tenant cluster as a worker node.
Define node types
Each node type specifies which BareMetalHosts it can claim and what properties to apply during provisioning.
Select BareMetalHosts by label
Use bareMetalHosts.selector to match BareMetalHosts by labels. Only hosts matching the selector and having sufficient resources are eligible.
nodeTypes:
  - name: "gpu-server"
    displayName: "GPU Server"
    resources:
      cpu: "64"
      memory: 256Gi
      nvidia.com/gpu: "4"
    bareMetalHosts:
      selector:
        matchLabels:
          role: gpu
          datacenter: us-east
  - name: "general-compute"
    displayName: "General Compute"
    resources:
      cpu: "32"
      memory: 128Gi
    bareMetalHosts:
      selector:
        matchLabels:
          role: compute
Configuration properties
Properties are key-value pairs on node types or Machines that control provisioning behavior.
Image configuration
vcluster.com/os-image
Type: string
References an OSImage resource by name. The OSImage's properties are used to configure the image URL, checksum, and checksum type. This is the recommended way to configure OS images.
metal3.vcluster.com/image-url
Type: string
Required: Yes (if not using vcluster.com/os-image)
Direct URL to the OS image. Example: https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
metal3.vcluster.com/image-checksum
Type: string
Required: Yes (if not using vcluster.com/os-image)
Checksum of the OS image for verification.
metal3.vcluster.com/image-checksum-type
Type: string
Checksum algorithm. Supported values: md5, sha256, sha512. If omitted, the type is auto-detected.
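For comparison with the OSImage reference, a node type that sets the image directly might look like the sketch below. It assumes the image properties go under the node type's properties block, as vcluster.com/os-image does. The checksum is a placeholder that you would replace with the real digest of the image.

```yaml
nodeTypes:
  - name: "compute-node"
    bareMetalHosts:
      selector:
        matchLabels:
          role: compute
    properties:
      # Direct image configuration, used instead of vcluster.com/os-image.
      metal3.vcluster.com/image-url: https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
      metal3.vcluster.com/image-checksum: <IMAGE-SHA256-CHECKSUM>
      # Optional: auto-detected when omitted.
      metal3.vcluster.com/image-checksum-type: sha256
```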
Network configuration
metal3.vcluster.com/network-cidr
Type: string (standard CIDR notation)
Specifies the gateway IP and subnet for IP allocation. Use standard CIDR notation where the host portion is the gateway address. The platform allocates IPs from the resulting subnet using its built-in IPAM.
Example: 10.0.0.1/24
metal3.vcluster.com/network-ip-range
Type: string (comma-separated IP ranges)
Specifies explicit IP ranges for allocation instead of CIDR-based allocation. Format: IP1-IP2,IP3-IP4
Example: 10.0.0.20-10.0.0.30,10.0.0.40-10.0.0.50
metal3.vcluster.com/dns-servers
Type: string (comma-separated)
DNS servers to configure on the provisioned server. Example: 8.8.8.8,8.8.4.4
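Put together, the allocation properties above could be set on a node type roughly as follows. This is a sketch that reuses the example values from this section. The explicit range is shown commented out as an alternative, because this page does not state whether it can be combined with network-cidr.

```yaml
properties:
  vcluster.com/os-image: ubuntu-noble
  # Gateway is 10.0.0.1; addresses are allocated from the 10.0.0.0/24 subnet.
  metal3.vcluster.com/network-cidr: 10.0.0.1/24
  metal3.vcluster.com/dns-servers: 8.8.8.8,8.8.4.4
  # Alternative to network-cidr: allocate only from explicit ranges.
  # metal3.vcluster.com/network-ip-range: 10.0.0.20-10.0.0.30,10.0.0.40-10.0.0.50
```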
metal3.vcluster.com/network-data
Type: string
Complete custom network-data configuration. When set, this overrides automatic network configuration from CIDR or IP range properties.
vcluster.com/network-data-template
Type: string
A Go template string that renders into a cloud-init network-config document. The provider evaluates this when metal3.vcluster.com/network-data is not set. The template receives the same Values object as vcluster.com/user-data-template.
vcluster.com/network-data-template-secret
Type: string (format: <namespace>/<name>)
References a Secret containing a Go template for network data. The provider evaluates this when both metal3.vcluster.com/network-data and vcluster.com/network-data-template are unset. The Secret format is the same as for vcluster.com/user-data-template-secret and must carry the vcluster.com/user-data-template-type label.
Node environment selection
vcluster.com/network-environment
Type: string
Names the NodeEnvironment whose properties are merged into the NodeClaim's effective property set. The platform resolves this from the merged properties of NodeProvider, NodeType, and NodeClaim. When set, this property takes precedence over the typed NodeClaim.spec.environmentRef field.
Server pinning
metal3.vcluster.com/server-name
Type: string
Pins a Machine to a specific BareMetalHost by name. This is useful when creating a Machine directly and you want to target a particular server. Set this property on the Machine, not on the node type.
SSH and user data
vcluster.com/ssh-keys
Type: string (comma-separated)
References SSHKey resources by name. The public keys are included in the user data during provisioning.
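A minimal sketch of the property on a node type, assuming two SSHKey resources already exist. The names ops-team and break-glass are hypothetical.

```yaml
properties:
  vcluster.com/os-image: ubuntu-noble
  # Comma-separated names of existing SSHKey resources (hypothetical names).
  vcluster.com/ssh-keys: ops-team,break-glass
```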
vcluster.com/user-data
Type: string
Custom cloud-init configuration merged with the generated user data. Accepts any valid cloud-config directives (packages, write_files, runcmd, etc.). The platform's provisioning commands are appended after any user-supplied runcmd entries.
Takes precedence over vcluster.com/user-data-template and vcluster.com/user-data-template-secret.
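As an illustration, a static cloud-config supplied through this property might look like the sketch below. The package, file, and command are arbitrary examples, and placing the property under the node type's properties block is an assumption based on how vcluster.com/os-image is set elsewhere on this page.

```yaml
properties:
  vcluster.com/user-data: |
    #cloud-config
    packages:
      - chrony
    write_files:
      - path: /etc/motd
        content: |
          Provisioned by the Metal3 provider
    runcmd:
      # The platform's provisioning commands are appended after this entry.
      - systemctl enable --now chrony
```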
vcluster.com/user-data-template
Type: string
A Go template string that renders into a cloud-config YAML document. Evaluated when vcluster.com/user-data is not set. The template receives a Values object with the following fields:
| Field | Description |
|---|---|
| NodeClaim | The NodeClaim being provisioned |
| Project | The project the node belongs to |
| Properties | Merged properties on the NodeProvider, NodeType, and NodeClaim |
| SSHKeys | SSH public keys to inject |
The metal3 provider additionally exposes:
| Field | Description |
|---|---|
| BareMetalHost | The selected BareMetalHost (unstructured) |
| AllocatedIP | IP allocated for the Machine, if IPAM is in use |
| Gateway | Default gateway for the allocated network |
Example:
annotations:
  vcluster.com/user-data-template: |
    #cloud-config
    users:
      - name: ubuntu
        ssh_authorized_keys:
        {{- range .SSHKeys }}
          - {{ .PublicKey }}
        {{- end }}
    runcmd:
      - echo "Provisioning {{ .NodeClaim.Name }}"
Takes precedence over vcluster.com/user-data-template-secret.
vcluster.com/user-data-template-secret
Type: string
The name of a Secret containing a Go template for user data. Evaluated when both vcluster.com/user-data and vcluster.com/user-data-template are unset. The Secret must have the label vcluster.com/user-data-template-type set.
Example Secret:
apiVersion: v1
kind: Secret
metadata:
  name: my-user-data-template
  namespace: loft
  labels:
    vcluster.com/user-data-template-type: cloud-config
stringData:
  template: |
    #cloud-config
    users:
      - name: ubuntu
        ssh_authorized_keys:
        {{- range .SSHKeys }}
          - {{ .PublicKey }}
        {{- end }}
Reference it by name on the node type:
annotations:
  vcluster.com/user-data-template-secret: my-user-data-template
The platform resolves user data using the first property that is set:
1. vcluster.com/user-data
2. vcluster.com/user-data-template
3. vcluster.com/user-data-template-secret
In all cases, the platform appends the vCluster join command and SSH keys to the resolved cloud-config.
Try it yourself
The vCluster Bare Metal with KubeVirt guide lets you run the full Metal3 bare metal provisioning flow locally using KubeVirt VMs as fake bare metal servers. It sets up a vind (vCluster in Docker) cluster with KubeVirt, a Metal3 NodeProvider, and simulated BareMetalHosts with Redfish BMC endpoints. No physical hardware is required.