KubeVirt Operator
Work in progress!
The KubeVirt Operator manages virtual machines on Kubernetes clusters by bridging Viti Stack infrastructure resources with KubeVirt virtualization capabilities. It reconciles Machine custom resources to create and manage KubeVirt VirtualMachine and VirtualMachineInstance resources, providing declarative VM lifecycle management.
Architecture
Controller Structure
The operator implements the Kubernetes controller pattern with the following components:
- Machine Controller: Reconciles vitistack.io/v1alpha1/Machine resources
- KubeVirt Integration: Translates Machine specs to KubeVirt VirtualMachine resources
- Network Management: Configures VM networking through NetworkConfiguration CRDs
- Storage Provisioning: Handles persistent volume claims and storage class integration
- Lifecycle Management: Manages VM creation, updates, and cleanup operations
Repository Structure
├── cmd/ # Main entry point
├── controllers/v1alpha1/ # Machine controller implementation
│ └── machine_controller.go # Primary reconciliation logic
├── config/
│ ├── crd/ # Custom Resource Definitions
│ ├── rbac/ # Role-based access control
│ ├── manager/ # Operator deployment
│ ├── prometheus/ # Monitoring configuration
│ ├── network-policy/ # Network security policies
│ └── samples/ # Example resources
├── charts/kubevirt-operator/ # Helm deployment charts
├── examples/ # Machine resource examples
├── internal/ # Internal implementation packages
├── pkg/ # Public packages and utilities
├── test/ # Test suites
└── docs/ # Setup and configuration documentation
API Reference
Machine Resource
The primary resource managed by the KubeVirt Operator:
apiVersion: vitistack.io/v1alpha1
kind: Machine
metadata:
  name: string  # Machine identifier
  namespace: string  # Kubernetes namespace
  labels:
    cluster.vitistack.io/cluster-name: string  # Associated cluster
    vitistack.io/machine-template: string  # Template reference
spec:
  # Template Configuration
  template: string  # Machine template name (small, medium, large)
  # Resource Overrides
  resources:
    cpu:
      cores: int  # CPU cores override
      threads: int  # CPU threads override
      sockets: int  # CPU sockets override
    memory:
      size: string  # Memory size (e.g., "2Gi", "4Gi")
  # Storage Configuration
  disks:
    - name: string  # Disk identifier
      size: string  # Disk size (e.g., "20Gi", "100Gi")
      storageClass: string  # Kubernetes StorageClass
      accessMode: string  # Volume access mode: ReadWriteOnce, ReadWriteMany
      volumeMode: string  # Volume mode: Filesystem, Block
  # Network Configuration
  networks:
    - name: string  # Network interface name
      networkName: string  # NetworkConfiguration reference
      model: string  # NIC model: virtio, e1000, rtl8139
      macAddress: string  # MAC address (optional)
  # Boot Configuration
  bootOrder: []string  # Boot device order: disk, network, cdrom
  # Cloud-Init Configuration
  cloudInit:
    userData: string  # Cloud-init user data
    networkData: string  # Cloud-init network configuration
    secretRef:  # Reference to secret containing cloud-init
      name: string  # Secret name
      key: string  # Secret key
  # Virtual Machine Settings
  domain:
    machine:
      type: string  # Machine type: pc-q35, pc-i440fx
    features:
      acpi: bool  # Enable ACPI
      apic: bool  # Enable APIC
      hyperv: bool  # Enable Hyper-V optimizations
    firmware:
      bootloader:
        efi: bool  # Use EFI bootloader
        secureBoot: bool  # Enable secure boot
status:
  phase: string  # Current phase: Pending, Creating, Running, Stopped, Failed
  conditions: []Condition  # Status conditions
  vmName: string  # Created VirtualMachine name
  vmiName: string  # Active VirtualMachineInstance name
  ipAddresses: []string  # Assigned IP addresses
  nodeName: string  # Kubernetes node hosting the VM
  lastUpdated: string  # Last reconciliation timestamp
  resourceVersion: string  # Current resource version
Machine Templates
Predefined resource configurations for common VM sizes:
Small Template
template: small
# Translates to:
resources:
  cpu:
    cores: 1
    threads: 1
    sockets: 1
  memory:
    size: "2Gi"
disks:
  - name: "root"
    size: "20Gi"
    storageClass: "default"
Medium Template
template: medium
# Translates to:
resources:
  cpu:
    cores: 2
    threads: 1
    sockets: 1
  memory:
    size: "4Gi"
disks:
  - name: "root"
    size: "40Gi"
    storageClass: "default"
Large Template
template: large
# Translates to:
resources:
  cpu:
    cores: 4
    threads: 1
    sockets: 1
  memory:
    size: "8Gi"
disks:
  - name: "root"
    size: "80Gi"
    storageClass: "default"
Generated KubeVirt Resources
The operator creates corresponding KubeVirt resources:
VirtualMachine Resource
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: string  # Generated from Machine name
  namespace: string  # Inherited from Machine
  labels:
    vitistack.io/managed-by: kubevirt-operator
    vitistack.io/machine: string  # Reference to source Machine
  ownerReferences:
    - apiVersion: vitistack.io/v1alpha1
      kind: Machine
      name: string  # Parent Machine name
      uid: string  # Parent Machine UID
spec:
  running: bool  # VM power state
  template:
    metadata:
      labels:
        vitistack.io/machine: string
    spec:
      domain:
        cpu:
          cores: int  # From Machine resources.cpu.cores
          threads: int  # From Machine resources.cpu.threads
          sockets: int  # From Machine resources.cpu.sockets
        memory:
          guest: string  # From Machine resources.memory.size
        devices:
          disks: []  # Generated from Machine disks
          interfaces: []  # Generated from Machine networks
          networkInterfaceMultiqueue: bool
        machine:
          type: string  # From Machine domain.machine.type
        features: {}  # From Machine domain.features
        firmware: {}  # From Machine domain.firmware
      networks: []  # Network configurations
      volumes: []  # Volume configurations
Configuration Reference
Environment Variables
| Variable | Type | Default | Description |
|---|---|---|---|
| KUBECONFIG | string | - | Kubernetes configuration file path |
| RECONCILE_INTERVAL | duration | 30s | Machine reconciliation interval |
| MAX_CONCURRENT_RECONCILES | int | 5 | Maximum concurrent reconciliations |
| METRICS_BIND_ADDRESS | string | :8080 | Metrics server bind address |
| HEALTH_PROBE_BIND_ADDRESS | string | :8081 | Health probe bind address |
| LEADER_ELECTION | bool | true | Enable leader election |
| NAMESPACE | string | - | Operator namespace |
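As a rough illustration of how these variables and defaults could be consumed, the following Go sketch parses them into a typed struct. The Config type, the LoadConfig function, and the injected lookup parameter are hypothetical names chosen for this example; the operator's actual configuration code may look different.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config mirrors the documented environment variables.
// The type and field names are illustrative, not the operator's real types.
type Config struct {
	Kubeconfig              string
	ReconcileInterval       time.Duration
	MaxConcurrentReconciles int
	MetricsBindAddress      string
	HealthProbeBindAddress  string
	LeaderElection          bool
	Namespace               string
}

// LoadConfig reads each variable through lookup and falls back to the
// documented default when the variable is unset or empty.
func LoadConfig(lookup func(string) (string, bool)) (Config, error) {
	get := func(key, def string) string {
		if v, ok := lookup(key); ok && v != "" {
			return v
		}
		return def
	}
	cfg := Config{
		Kubeconfig:             get("KUBECONFIG", ""),
		MetricsBindAddress:     get("METRICS_BIND_ADDRESS", ":8080"),
		HealthProbeBindAddress: get("HEALTH_PROBE_BIND_ADDRESS", ":8081"),
		Namespace:              get("NAMESPACE", ""),
	}
	var err error
	if cfg.ReconcileInterval, err = time.ParseDuration(get("RECONCILE_INTERVAL", "30s")); err != nil {
		return Config{}, fmt.Errorf("RECONCILE_INTERVAL: %w", err)
	}
	if cfg.MaxConcurrentReconciles, err = strconv.Atoi(get("MAX_CONCURRENT_RECONCILES", "5")); err != nil {
		return Config{}, fmt.Errorf("MAX_CONCURRENT_RECONCILES: %w", err)
	}
	if cfg.LeaderElection, err = strconv.ParseBool(get("LEADER_ELECTION", "true")); err != nil {
		return Config{}, fmt.Errorf("LEADER_ELECTION: %w", err)
	}
	return cfg, nil
}

func main() {
	// os.LookupEnv has exactly the signature LoadConfig expects.
	cfg, err := LoadConfig(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid configuration:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the parsing logic testable without touching the process environment.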
Machine Template Configuration
Template Definitions
Templates are hardcoded configurations that can be referenced by name:
| Template | CPU | Memory | Root Disk | Use Case |
|---|---|---|---|---|
| small | 1 core | 2Gi | 20Gi | Development, testing |
| medium | 2 cores | 4Gi | 40Gi | Light workloads |
| large | 4 cores | 8Gi | 80Gi | Production workloads |
Resource Override Behavior
When both template and resource overrides are specified:
spec:
  template: medium  # Base: 2 cores, 4Gi memory
  resources:
    cpu:
      cores: 4  # Override: Results in 4 cores
    memory:
      size: "8Gi"  # Override: Results in 8Gi memory
Final configuration: 4 cores, 8Gi memory, 40Gi disk (from template)
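One way to read this override rule is as a field-by-field overlay on top of the template. The Go sketch below works through the medium example under that reading. The CPU, Disk, and Resolved types and the Resolve function are simplified stand-ins invented for this illustration, and treating a zero value as "not overridden" is an assumption of the sketch.

```go
package main

import "fmt"

// CPU, Disk and Resolved are simplified stand-ins for the Machine spec types.
type CPU struct{ Cores, Threads, Sockets int }

type Disk struct{ Name, Size, StorageClass string }

type Resolved struct {
	CPU    CPU
	Memory string
	Disks  []Disk
}

// templates holds the three predefined sizes documented above.
var templates = map[string]Resolved{
	"small":  {CPU: CPU{1, 1, 1}, Memory: "2Gi", Disks: []Disk{{"root", "20Gi", "default"}}},
	"medium": {CPU: CPU{2, 1, 1}, Memory: "4Gi", Disks: []Disk{{"root", "40Gi", "default"}}},
	"large":  {CPU: CPU{4, 1, 1}, Memory: "8Gi", Disks: []Disk{{"root", "80Gi", "default"}}},
}

// Resolve starts from the named template and overlays only the fields the
// caller set. Zero values mean "not overridden" (an assumption of this sketch).
func Resolve(template string, cpu CPU, memory string) (Resolved, error) {
	base, ok := templates[template]
	if !ok {
		return Resolved{}, fmt.Errorf("template %q not found", template)
	}
	// Copy the disk slice so callers cannot mutate the shared template.
	out := Resolved{CPU: base.CPU, Memory: base.Memory, Disks: append([]Disk(nil), base.Disks...)}
	if cpu.Cores > 0 {
		out.CPU.Cores = cpu.Cores
	}
	if cpu.Threads > 0 {
		out.CPU.Threads = cpu.Threads
	}
	if cpu.Sockets > 0 {
		out.CPU.Sockets = cpu.Sockets
	}
	if memory != "" {
		out.Memory = memory
	}
	return out, nil
}

func main() {
	r, err := Resolve("medium", CPU{Cores: 4}, "8Gi")
	if err != nil {
		panic(err)
	}
	// Prints: 4 cores, 8Gi memory, 40Gi disk
	fmt.Printf("%d cores, %s memory, %s disk\n", r.CPU.Cores, r.Memory, r.Disks[0].Size)
}
```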
Storage Configuration
Storage Class Integration
| Parameter | Type | Description |
|---|---|---|
| storageClass | string | Kubernetes StorageClass name |
| size | string | Volume size (e.g., "20Gi", "100Gi") |
| accessMode | string | Volume access mode |
| volumeMode | string | Volume mode: Filesystem or Block |
Supported Access Modes
| Mode | Description | Multi-Node | Use Case |
|---|---|---|---|
| ReadWriteOnce | Single node read-write | No | Standard VM disks |
| ReadWriteMany | Multi-node read-write | Yes | Shared storage |
| ReadOnlyMany | Multi-node read-only | Yes | Read-only data |
Network Configuration
NetworkConfiguration CRD Integration
The operator integrates with Viti Stack NetworkConfiguration resources:
networks:
  - name: "eth0"
    networkName: "prod-network"  # References NetworkConfiguration
    model: "virtio"
    macAddress: "52:54:00:12:34:56"
Network Interface Models
| Model | Description | Performance | Compatibility |
|---|---|---|---|
| virtio | Paravirtualized NIC | High | Modern OS |
| e1000 | Intel E1000 emulation | Medium | Legacy OS |
| rtl8139 | Realtek RTL8139 | Low | Very old OS |
Operational Reference
Reconciliation Workflow
The Machine controller implements the following reconciliation logic:
- Resource Validation: Validates Machine specification and template references
- Template Resolution: Applies machine template and processes overrides
- Network Preparation: Ensures NetworkConfiguration resources exist
- Storage Provisioning: Creates PersistentVolumeClaims for disks
- VirtualMachine Creation: Generates KubeVirt VirtualMachine resource
- Status Monitoring: Watches VirtualMachineInstance status
- Network Configuration: Applies network settings and IP assignments
- Cleanup Management: Handles resource deletion and finalizers
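The workflow above can be pictured as an ordered pipeline that stops at the first failing step so the next reconciliation can retry from a known point. The Go sketch below shows that shape only. The Step type, RunSteps, and the succeed/fail helpers are hypothetical names for this illustration and are not taken from the operator's source.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is one stage of the reconciliation workflow (hypothetical type).
type Step struct {
	Name string
	Run  func() error
}

// RunSteps executes steps in order and stops at the first failure. It returns
// the names of the steps that completed and the error wrapped with the name
// of the step that failed.
func RunSteps(steps []Step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.Run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.Name, err)
		}
		done = append(done, s.Name)
	}
	return done, nil
}

// succeed and fail build placeholder steps for the demonstration below.
func succeed(name string) Step {
	return Step{Name: name, Run: func() error { return nil }}
}

func fail(name string) Step {
	return Step{Name: name, Run: func() error { return errors.New("simulated failure") }}
}

func main() {
	steps := []Step{
		succeed("ResourceValidation"),
		succeed("TemplateResolution"),
		succeed("NetworkPreparation"),
		fail("StorageProvisioning"),
		succeed("VirtualMachineCreation"),
	}
	done, err := RunSteps(steps)
	fmt.Println("completed:", done)
	fmt.Println("error:", err)
}
```

In a real controller each step would be idempotent, so rerunning the completed steps on the next reconcile is harmless.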
Machine Lifecycle States
| Phase | Description | Next States |
|---|---|---|
| Pending | Machine created, awaiting reconciliation | Creating, Failed |
| Creating | Resources being provisioned | Running, Failed |
| Running | VM successfully running | Stopped, Failed |
| Stopped | VM powered off | Running, Failed |
| Failed | Unrecoverable error | - |
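The phase table can be encoded directly as a transition map, which makes illegal phase changes easy to reject. This is a minimal sketch: the Phase type and CanTransition function are illustrative names, and only the transitions listed in the table are assumed to be valid.

```go
package main

import "fmt"

// Phase is a Machine lifecycle phase.
type Phase string

const (
	Pending  Phase = "Pending"
	Creating Phase = "Creating"
	Running  Phase = "Running"
	Stopped  Phase = "Stopped"
	Failed   Phase = "Failed"
)

// nextStates encodes the "Next States" column of the table.
var nextStates = map[Phase][]Phase{
	Pending:  {Creating, Failed},
	Creating: {Running, Failed},
	Running:  {Stopped, Failed},
	Stopped:  {Running, Failed},
	Failed:   {}, // terminal: no outgoing transitions
}

// CanTransition reports whether moving from one phase to another is listed
// in the table. Unknown phases have no outgoing transitions.
func CanTransition(from, to Phase) bool {
	for _, p := range nextStates[from] {
		if p == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Pending, Creating)) // true
	fmt.Println(CanTransition(Failed, Running))   // false
}
```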
KubeVirt Resource Mapping
CPU Configuration Mapping
| Machine Spec | KubeVirt VirtualMachine | Description |
|---|---|---|
| resources.cpu.cores: 2 | domain.cpu.cores: 2 | Cores per socket |
| resources.cpu.threads: 1 | domain.cpu.threads: 1 | Threads per core |
| resources.cpu.sockets: 1 | domain.cpu.sockets: 1 | CPU sockets |
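Because KubeVirt expresses CPU as a topology, the number of vCPUs the guest sees is the product of sockets, cores, and threads rather than the cores value alone. The short Go sketch below works through that arithmetic. The VCPUs helper is a hypothetical name, and treating an unset (zero) field as 1 is an assumption of this sketch.

```go
package main

import "fmt"

// VCPUs returns the guest-visible vCPU count for a CPU topology:
// sockets x cores per socket x threads per core.
// Unset (zero or negative) fields are treated as 1, an assumption of this sketch.
func VCPUs(cores, threads, sockets int) int {
	orOne := func(n int) int {
		if n <= 0 {
			return 1
		}
		return n
	}
	return orOne(sockets) * orOne(cores) * orOne(threads)
}

func main() {
	fmt.Println(VCPUs(2, 1, 1)) // medium template: 2
	fmt.Println(VCPUs(4, 2, 2)) // 4 cores x 2 threads x 2 sockets: 16
}
```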
Memory Configuration Mapping
| Machine Spec | KubeVirt VirtualMachine | Description |
|---|---|---|
| resources.memory.size: "4Gi" | domain.memory.guest: "4Gi" | Guest memory allocation |
Disk Configuration Mapping
# Machine specification
disks:
  - name: "root"
    size: "20Gi"
    storageClass: "fast-ssd"
    accessMode: "ReadWriteOnce"
# Generated PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "{machine-name}-root"
spec:
  storageClassName: "fast-ssd"
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: "20Gi"
# Generated VirtualMachine disk reference
volumes:
  - name: "root"
    persistentVolumeClaim:
      claimName: "{machine-name}-root"
disks:
  - name: "root"
    disk:
      bus: "virtio"
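The mapping above follows a simple naming rule: the claim is named after the machine and the disk, and the remaining fields are copied across. The Go sketch below expresses that rule with plain structs instead of the Kubernetes API types so it stays self-contained. Disk, PVC, and BuildPVC are hypothetical names, and defaulting an empty access mode to ReadWriteOnce is an assumption of the sketch.

```go
package main

import "fmt"

// Disk is a simplified view of one entry in the Machine's disks list.
type Disk struct {
	Name, Size, StorageClass, AccessMode string
}

// PVC holds the PersistentVolumeClaim fields shown in the mapping above.
type PVC struct {
	Name             string
	StorageClassName string
	AccessModes      []string
	Storage          string
}

// BuildPVC derives the claim for one disk. The claim name follows the
// "{machine-name}-{disk-name}" pattern from the example.
func BuildPVC(machineName string, d Disk) PVC {
	mode := d.AccessMode
	if mode == "" {
		mode = "ReadWriteOnce" // assumed default, not confirmed by the source
	}
	return PVC{
		Name:             fmt.Sprintf("%s-%s", machineName, d.Name),
		StorageClassName: d.StorageClass,
		AccessModes:      []string{mode},
		Storage:          d.Size,
	}
}

func main() {
	pvc := BuildPVC("web-1", Disk{Name: "root", Size: "20Gi", StorageClass: "fast-ssd"})
	fmt.Printf("%+v\n", pvc)
}
```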
Processing Specifications
Template Processing Algorithm
func ProcessMachineSpec(machine *Machine) *ProcessedSpec {
	spec := &ProcessedSpec{}
	// 1. Apply base template
	if template := GetTemplate(machine.Spec.Template); template != nil {
		spec.CPU = template.CPU
		spec.Memory = template.Memory
		spec.Disks = template.Disks
	}
	// 2. Apply resource overrides
	if machine.Spec.Resources.CPU != nil {
		spec.CPU = machine.Spec.Resources.CPU
	}
	if machine.Spec.Resources.Memory != nil {
		spec.Memory = machine.Spec.Resources.Memory
	}
	// 3. Merge disk configurations
	if len(machine.Spec.Disks) > 0 {
		spec.Disks = mergeDiskConfigs(spec.Disks, machine.Spec.Disks)
	}
	return spec
}
Network Configuration Processing
func ProcessNetworkConfiguration(machine *Machine) []NetworkConfig {
	var configs []NetworkConfig
	for _, network := range machine.Spec.Networks {
		// Look up the referenced NetworkConfiguration resource
		netConfig := GetNetworkConfiguration(network.NetworkName)
		config := NetworkConfig{
			Name:       network.Name,
			Model:      network.Model,
			MacAddress: network.MacAddress,
			VLAN:       netConfig.Spec.VLAN,
			Bridge:     netConfig.Spec.Bridge,
		}
		configs = append(configs, config)
	}
	return configs
}
Cloud-Init Processing
# Machine specification with cloud-init
spec:
  cloudInit:
    userData: |
      #cloud-config
      users:
        - name: admin
          sudo: ALL=(ALL) NOPASSWD:ALL
          ssh_authorized_keys:
            - ssh-rsa AAAAB3...
    secretRef:
      name: "machine-secrets"
      key: "userdata"
# Generated VirtualMachine volume
volumes:
  - name: "cloudinitdisk"
    cloudInitNoCloud:
      userData: |
        #cloud-config
        users: ...
Error Handling Reference
Reconciliation Error Types
| Error Type | Condition | Recovery Action |
|---|---|---|
| TemplateNotFound | Invalid template reference | Fix template name in Machine spec |
| NetworkConfigurationNotFound | Missing NetworkConfiguration CRD | Create required NetworkConfiguration |
| StorageClassNotFound | Invalid StorageClass | Update storageClass or create StorageClass |
| InsufficientResources | Node resource exhaustion | Scale cluster or reduce resource requests |
| KubeVirtApiError | KubeVirt API failure | Check KubeVirt installation and permissions |
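In Go, error categories like these are commonly modelled as sentinel errors that callers match with errors.Is, even after the error has been wrapped with context. The sketch below pairs each documented error type with its recovery action. The variable names, RecoveryAction, and Wrap are hypothetical and only illustrate the pattern; the operator's real error types may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel errors named after the documented error types (illustrative only).
var (
	ErrTemplateNotFound             = errors.New("TemplateNotFound")
	ErrNetworkConfigurationNotFound = errors.New("NetworkConfigurationNotFound")
	ErrStorageClassNotFound         = errors.New("StorageClassNotFound")
	ErrInsufficientResources        = errors.New("InsufficientResources")
	ErrKubeVirtAPI                  = errors.New("KubeVirtApiError")
)

// recoveries pairs each error type with the recovery action from the table.
var recoveries = []struct {
	kind   error
	action string
}{
	{ErrTemplateNotFound, "Fix template name in Machine spec"},
	{ErrNetworkConfigurationNotFound, "Create required NetworkConfiguration"},
	{ErrStorageClassNotFound, "Update storageClass or create StorageClass"},
	{ErrInsufficientResources, "Scale cluster or reduce resource requests"},
	{ErrKubeVirtAPI, "Check KubeVirt installation and permissions"},
}

// RecoveryAction returns the documented action for err, or "" when err does
// not match any known error type. errors.Is also matches wrapped errors.
func RecoveryAction(err error) string {
	for _, r := range recoveries {
		if errors.Is(err, r.kind) {
			return r.action
		}
	}
	return ""
}

// Wrap adds context to err while keeping it matchable with errors.Is.
func Wrap(context string, err error) error {
	return fmt.Errorf("%s: %w", context, err)
}

func main() {
	err := Wrap("reconciling machine default/test-vm", ErrStorageClassNotFound)
	fmt.Println(err)
	fmt.Println("recovery:", RecoveryAction(err))
}
```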
Status Conditions
| Condition Type | Status | Reason | Description |
|---|---|---|---|
| Ready | True/False | Various | Overall machine readiness |
| VirtualMachineReady | True/False | VMCreated/VMFailed | VirtualMachine resource status |
| StorageReady | True/False | PVCBound/PVCPending | Storage provisioning status |
| NetworkReady | True/False | NetworkConfigured/NetworkFailed | Network configuration status |
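The source does not spell out how the overall Ready condition is computed. A plausible reading is that Ready is true only when the three component conditions are all true, and that it otherwise carries the reason of the first component that is not ready. The Go sketch below implements that assumed rule. The Condition struct, the OverallReady function, and the AllComponentsReady and ConditionMissing reasons are invented for this example.

```go
package main

import "fmt"

// Condition is a simplified status condition.
type Condition struct {
	Type   string
	Status bool
	Reason string
}

// OverallReady derives the Ready condition from the component conditions.
// Assumed rule: every required condition must be present and true.
func OverallReady(conds []Condition) Condition {
	required := []string{"VirtualMachineReady", "StorageReady", "NetworkReady"}
	byType := make(map[string]Condition, len(conds))
	for _, c := range conds {
		byType[c.Type] = c
	}
	for _, t := range required {
		c, ok := byType[t]
		if !ok {
			// Hypothetical reason for a condition that has not been reported yet.
			return Condition{Type: "Ready", Status: false, Reason: "ConditionMissing"}
		}
		if !c.Status {
			return Condition{Type: "Ready", Status: false, Reason: c.Reason}
		}
	}
	// Hypothetical reason for the all-true case.
	return Condition{Type: "Ready", Status: true, Reason: "AllComponentsReady"}
}

func main() {
	conds := []Condition{
		{Type: "VirtualMachineReady", Status: true, Reason: "VMCreated"},
		{Type: "StorageReady", Status: false, Reason: "PVCPending"},
		{Type: "NetworkReady", Status: true, Reason: "NetworkConfigured"},
	}
	fmt.Printf("%+v\n", OverallReady(conds))
}
```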
Finalizer Management
The operator uses finalizers for proper cleanup:
metadata:
  finalizers:
    - machine.vitistack.io/cleanup
Cleanup Process:
- Delete VirtualMachine and VirtualMachineInstance
- Delete PersistentVolumeClaims
- Clean up NetworkConfiguration references
- Remove finalizer
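The important property of the cleanup process is that the finalizer is removed only after every deletion step has succeeded; otherwise Kubernetes would delete the Machine while dependent resources still exist. The Go sketch below shows that ordering with plain functions in place of API calls. Cleanup, RemoveFinalizer, and the noop/failing helpers are hypothetical names for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// CleanupFinalizer is the finalizer name documented above.
const CleanupFinalizer = "machine.vitistack.io/cleanup"

// RemoveFinalizer returns a copy of finalizers without name.
func RemoveFinalizer(finalizers []string, name string) []string {
	var kept []string
	for _, f := range finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	return kept
}

// Cleanup runs the deletion steps in order. The finalizer is removed only if
// every step succeeds; on failure the original list is returned unchanged so
// the next reconciliation can retry.
func Cleanup(finalizers []string, steps ...func() error) ([]string, error) {
	for _, step := range steps {
		if err := step(); err != nil {
			return finalizers, err
		}
	}
	return RemoveFinalizer(finalizers, CleanupFinalizer), nil
}

// noop and failing stand in for real deletion calls.
func noop() error    { return nil }
func failing() error { return errors.New("simulated deletion failure") }

func main() {
	finalizers := []string{CleanupFinalizer, "example.com/other"}

	after, err := Cleanup(finalizers, noop, noop, noop)
	fmt.Println(after, err)

	after, err = Cleanup(finalizers, noop, failing)
	fmt.Println(after, err)
}
```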
Monitoring Reference
Prometheus Metrics
| Metric Name | Type | Labels | Description |
|---|---|---|---|
| kubevirt_operator_machines_total | Gauge | phase, template | Total machines by phase |
| kubevirt_operator_reconciliation_duration_seconds | Histogram | controller | Reconciliation duration |
| kubevirt_operator_reconciliation_errors_total | Counter | controller, error_type | Reconciliation errors |
| kubevirt_operator_virtual_machines_total | Gauge | status | Created VirtualMachine resources |
| kubevirt_operator_storage_provisioning_duration_seconds | Histogram | storage_class | Storage provisioning time |
| kubevirt_operator_network_configuration_errors_total | Counter | network_name | Network configuration errors |
Health Endpoints
| Endpoint | Purpose | Status Codes |
|---|---|---|
| /healthz | Liveness probe | 200 (healthy), 500 (unhealthy) |
| /readyz | Readiness probe | 200 (ready), 500 (not ready) |
| /metrics | Prometheus metrics | 200 (metrics available) |
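The probe contract in the table (200 on success, 500 on failure) can be reproduced with the standard net/http package. The sketch below is not the operator's implementation, which may rely on its controller framework's built-in health checks; checkHandler, newMux, and probe are hypothetical helpers, and the /metrics endpoint is left out. It exercises the handlers in-process with httptest instead of opening a port.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// errNotReady is a sample failure used in the demonstration below.
var errNotReady = errors.New("caches not synced")

// checkHandler answers 200 when check succeeds and 500 when it fails.
func checkHandler(check func() error) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if err := check(); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
		fmt.Fprintln(w, "ok")
	}
}

// newMux wires the liveness and readiness checks to their documented paths.
func newMux(live, ready func() error) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/healthz", checkHandler(live))
	mux.Handle("/readyz", checkHandler(ready))
	return mux
}

// probe sends an in-memory GET request to h and returns the status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newMux(
		func() error { return nil },         // process is alive
		func() error { return errNotReady }, // but not ready yet
	)
	fmt.Println("/healthz:", probe(mux, "/healthz"))
	fmt.Println("/readyz:", probe(mux, "/readyz"))
}
```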
Logging Reference
Structured Logging Fields
{
"timestamp": "2024-01-01T12:00:00Z",
"level": "info",
"controller": "Machine",
"machine": "test-vm",
"namespace": "default",
"phase": "Creating",
"message": "Creating VirtualMachine resource"
}
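For consumers that parse these logs, the entry maps naturally onto a Go struct with JSON tags. The sketch below uses only encoding/json. The LogEntry type and Encode function are hypothetical, and the operator itself may produce this output through a structured logging library instead.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LogEntry mirrors the documented log fields, in the documented order.
type LogEntry struct {
	Timestamp  string `json:"timestamp"`
	Level      string `json:"level"`
	Controller string `json:"controller"`
	Machine    string `json:"machine"`
	Namespace  string `json:"namespace"`
	Phase      string `json:"phase"`
	Message    string `json:"message"`
}

// Encode renders the entry as a single-line JSON document.
// json.Marshal emits struct fields in declaration order.
func Encode(e LogEntry) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	line, err := Encode(LogEntry{
		Timestamp:  "2024-01-01T12:00:00Z",
		Level:      "info",
		Controller: "Machine",
		Machine:    "test-vm",
		Namespace:  "default",
		Phase:      "Creating",
		Message:    "Creating VirtualMachine resource",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```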
Security Reference
RBAC Requirements
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubevirt-operator
rules:
  - apiGroups: ["vitistack.io"]
    resources: ["machines", "networkconfigurations"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["kubevirt.io"]
    resources: ["virtualmachines", "virtualmachineinstances"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims", "secrets", "configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch"]
Service Account Configuration
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubevirt-operator-controller-manager
  namespace: kubevirt-operator-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubevirt-operator-manager-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubevirt-operator
subjects:
  - kind: ServiceAccount
    name: kubevirt-operator-controller-manager
    namespace: kubevirt-operator-system
Deployment Reference
Helm Chart Configuration
| Parameter | Default | Description |
|---|---|---|
| image.repository | ghcr.io/vitistack/kubevirt-operator | Container image |
| image.tag | Chart version | Image tag |
| image.pullPolicy | IfNotPresent | Image pull policy |
| replicaCount | 1 | Operator replicas |
| resources.limits.cpu | 500m | CPU limit |
| resources.limits.memory | 512Mi | Memory limit |
| resources.requests.cpu | 100m | CPU request |
| resources.requests.memory | 256Mi | Memory request |
| nodeSelector | {} | Node selection constraints |
| tolerations | [] | Pod tolerations |
| affinity | {} | Pod affinity rules |
Prerequisites
KubeVirt Installation
The operator requires KubeVirt to be installed in the cluster:
# Install KubeVirt operator
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/v0.59.0/kubevirt-operator.yaml
# Create KubeVirt custom resource
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/v0.59.0/kubevirt-cr.yaml
# Verify installation
kubectl get pods -n kubevirt
Storage Requirements
- At least one StorageClass with dynamic provisioning
- RWO (ReadWriteOnce) access mode support
- Sufficient storage capacity for VM disks
Installation Methods
Helm Installation
# Install the operator directly from the OCI registry
# (helm repo add does not accept oci:// URLs, so no repository is added first)
helm install kubevirt-operator oci://ghcr.io/vitistack/helm/kubevirt-operator \
  --namespace kubevirt-operator-system \
  --create-namespace
Manual Installation
# Apply CRDs and operator
kubectl apply -f config/crd/
kubectl apply -f config/rbac/
kubectl apply -f config/manager/
Example Configurations
Basic Virtual Machine
apiVersion: vitistack.io/v1alpha1
kind: Machine
metadata:
  name: basic-vm
  namespace: default
spec:
  template: medium
Virtual Machine with Overrides
apiVersion: vitistack.io/v1alpha1
kind: Machine
metadata:
  name: custom-vm
  namespace: default
spec:
  template: small
  resources:
    cpu:
      cores: 4
    memory:
      size: "8Gi"
  disks:
    - name: "data"
      size: "100Gi"
      storageClass: "fast-ssd"
Virtual Machine with Networking
apiVersion: vitistack.io/v1alpha1
kind: Machine
metadata:
  name: networked-vm
  namespace: default
spec:
  template: medium
  networks:
    - name: "eth0"
      networkName: "prod-network"
      model: "virtio"
    - name: "eth1"
      networkName: "storage-network"
      model: "virtio"
Virtual Machine with Cloud-Init
apiVersion: vitistack.io/v1alpha1
kind: Machine
metadata:
  name: cloud-init-vm
  namespace: default
spec:
  template: large
  cloudInit:
    userData: |
      #cloud-config
      users:
        - name: admin
          sudo: ALL=(ALL) NOPASSWD:ALL
          ssh_authorized_keys:
            - ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB...
      packages:
        - curl
        - vim
      runcmd:
        - systemctl enable docker
        - systemctl start docker
Troubleshooting Reference
Common Issues
| Issue | Symptom | Resolution |
|---|---|---|
| VM not starting | Machine stuck in Creating phase | Check KubeVirt installation and node resources |
| Storage provisioning failure | PVC in Pending state | Verify StorageClass exists and has available capacity |
| Network configuration error | VM created but no network access | Check NetworkConfiguration CRD and multus installation |
| Template not found | Machine validation error | Use valid template name: small, medium, or large |
Debug Commands
Check Machine Status:
kubectl get machines -A
kubectl describe machine <machine-name>
Check Generated Resources:
kubectl get vm,vmi,pvc -l vitistack.io/machine=<machine-name>
View Operator Logs:
kubectl logs -n kubevirt-operator-system deployment/kubevirt-operator-controller-manager -f
Check KubeVirt Status:
kubectl get pods -n kubevirt
kubectl get vmi -A
This reference documentation provides comprehensive technical details for system administrators and developers working with the KubeVirt Operator, assuming familiarity with Kubernetes, KubeVirt, and virtualization concepts.