Deploy Portefaix on Homelab
This guide shows you how to deploy a Portefaix platform on a Raspberry Pi cluster using Talos Linux as the immutable OS, talhelper for declarative cluster configuration, Cilium as the CNI, SOPS + age for secret encryption, and Cloudflare R2 for Terraform state storage.
Goal: a multi-node Talos cluster on Raspberry Pi hardware, with Cilium providing cluster networking, secrets encrypted at rest with SOPS, and Portefaix stacks reconciled by ArgoCD.
Why Talos? Unlike conventional Linux distributions, Talos has no SSH daemon,
no package manager, and no shell. Every node is configured exclusively through its API via
talosctl. The result is a minimal, read-only OS surface that is far harder to
misconfigure or compromise than a general-purpose Linux install.
Prerequisites
- 2+ Raspberry Pi 4 or 5 boards (8 GB RAM recommended for the control plane)
- SD cards or USB SSDs (32 GB+ per node)
- talosctl installed — curl -sL https://talos.dev/install | sh
- talhelper installed — curl -fsSL https://i.jpillora.com/budimanjojo/talhelper! | bash
- age and sops installed — available via Homebrew or your system package manager
- Cloudflare account with R2 enabled and an API token
- Terraform ≥ 1.5, kubectl, and Helm installed locally
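A quick preflight sketch to confirm the required CLIs are on your PATH before starting (the tool list mirrors the prerequisites above; the aws CLI is added because it is used later for R2):

```shell
# Preflight: report any missing CLI from the prerequisites list.
# The aws CLI is used in step 11 for Cloudflare R2 bucket creation.
for tool in talosctl talhelper sops age kubectl helm terraform aws; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done
```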
1. Set up SOPS + age encryption
All secrets in the Portefaix homelab repository are encrypted with age via SOPS. Generate a key pair and store the private key securely:
# Generate a new age key pair
age-keygen -o $HOME/.config/portefaix/portefaix.homelab.txt
# The output will show the public key:
# Public key: age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Export the public key for use in .sops.yaml
age-keygen -y $HOME/.config/portefaix/portefaix.homelab.txt
Add a .sops.yaml at the root of the repository to tell SOPS which key to use for
encryption. Replace the age1... value with your actual public key:
creation_rules:
  - path_regex: .*\.yaml$
    age: age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
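If you prefer to script this step, a small sketch (using the key path from above; the grep pattern relies on age public keys starting with age1) that writes .sops.yaml from the key file:

```shell
# Sketch: generate .sops.yaml from the public key recorded in the age
# key file. KEY_FILE matches the path used above; adjust if yours differs.
KEY_FILE="${KEY_FILE:-$HOME/.config/portefaix/portefaix.homelab.txt}"
PUBKEY=$(grep -o 'age1[0-9a-z]*' "$KEY_FILE" 2>/dev/null | head -n1)
cat > .sops.yaml <<EOF
creation_rules:
  - path_regex: .*\.yaml\$
    age: $PUBKEY
EOF
```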
Set the environment variable so talosctl and SOPS can find your private key:
export SOPS_AGE_KEY_FILE="$HOME/.config/portefaix/portefaix.homelab.txt"

2. Build a custom Talos image for Raspberry Pi
Talos Linux requires hardware-specific extensions for Raspberry Pi. Use the Talos Image Factory to build a customized image with the correct firmware extension:
- Raspberry Pi 4: select the siderolabs/rpi4-firmware extension
- Raspberry Pi 5: select the siderolabs/rpi5-firmware extension
The factory produces a schematic ID. Download the metal disk image for your board:
# Example: Raspberry Pi 4, Talos v1.9.x
# Replace SCHEMATIC_ID with the ID from factory.talos.dev
SCHEMATIC_ID="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
TALOS_VERSION="v1.9.5"
curl -LO "https://factory.talos.dev/image/$SCHEMATIC_ID/$TALOS_VERSION/metal-arm64.raw.xz"
# Flash each SD card or SSD (replace /dev/sdX with your device)
xz -d -c metal-arm64.raw.xz | sudo dd of=/dev/sdX bs=4M status=progress conv=fsync
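A corrupted download produces an unbootable card, so it can be worth testing the archive before writing it; xz -t verifies integrity without extracting (verify_image is a hypothetical helper name, not part of the Talos tooling):

```shell
# Sketch: test the downloaded image archive before flashing it.
verify_image() {
  xz -t "$1" && echo "archive OK: $1"
}
# verify_image metal-arm64.raw.xz
```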
Boot each board from the flashed media. Talos will start in maintenance mode and listen for
configuration on port 50000. Note the IP addresses assigned to each node (check your router
DHCP table or use nmap).
3. Inspect available disks (optional)
Before generating configuration, confirm the disk path Talos should install to. While the node is in maintenance mode, query it without authentication:
talosctl disks \
--nodes 192.168.0.61 \
--endpoints 192.168.0.61 \
--insecure
Update installDisk in talconfig.yaml with the path shown (typically
/dev/mmcblk0 for SD cards or /dev/sda for USB SSDs).
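For reference, a minimal talconfig.yaml sketch showing where installDisk lives (field names follow the talhelper schema; the hostnames and addresses here are illustrative, not the full Portefaix topology):

```yaml
# Illustrative talconfig.yaml fragment for talhelper.
clusterName: portefaix-homelab
endpoint: https://192.168.0.61:6443
nodes:
  - hostname: portefaix
    ipAddress: 192.168.0.61
    controlPlane: true
    installDisk: /dev/mmcblk0
  - hostname: portefaix-1
    ipAddress: 192.168.0.208
    controlPlane: false
    installDisk: /dev/sda
```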
4. Generate cluster secrets
talhelper generates a talsecret.sops.yaml containing Talos bootstrap secrets
(CA certificates, join tokens, etc.) and encrypts it immediately with SOPS:
cd portefaix-infrastructure/talos/homelab
# Generate Talos cluster secrets
talhelper gensecret > talsecret.sops.yaml
# Encrypt in-place using your .sops.yaml configuration
sops --encrypt --in-place talsecret.sops.yaml

Never commit an unencrypted talsecret.sops.yaml. Verify encryption before committing: sops --decrypt talsecret.sops.yaml should require your age private key. The encrypted file is safe to commit.
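A guard along these lines can catch plaintext secrets before they reach Git (is_encrypted is a hypothetical helper; SOPS-encrypted YAML carries a top-level sops: metadata block and ENC[...] values):

```shell
# Sketch: refuse to proceed if a secrets file is not SOPS-encrypted.
# Checks for the sops: metadata block and at least one ENC[...] value.
is_encrypted() {
  grep -q '^sops:' "$1" && grep -q 'ENC\[' "$1"
}
# is_encrypted talsecret.sops.yaml || echo "refusing to commit plaintext"
```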
5. Generate per-node Talos configurations
talhelper reads talconfig.yaml (your cluster topology) and
talsecret.sops.yaml (the encrypted secrets) to produce a machine config file for
each node:
talhelper genconfig
# Output: clusterconfig/
# portefaix-homelab-controlplane.yaml
# portefaix-homelab-worker-1.yaml
# portefaix-homelab-worker-2.yaml
# talosconfig
Copy the generated talosconfig to the default location so
talosctl finds it automatically:
mkdir -p $HOME/.talos
cp clusterconfig/talosconfig $HOME/.talos/config

6. Apply configuration to each node
Push the machine configuration to each node while it is in maintenance mode. The node will apply the config and reboot into Talos:
# Control plane node
talosctl apply-config \
--nodes 192.168.0.61 \
--endpoints 192.168.0.61 \
--insecure \
--file clusterconfig/portefaix-homelab-controlplane.yaml
# Worker node 1
talosctl apply-config \
--nodes 192.168.0.208 \
--endpoints 192.168.0.61 \
--insecure \
--file clusterconfig/portefaix-homelab-worker-1.yaml
# Worker node 2
talosctl apply-config \
--nodes 192.168.0.116 \
--endpoints 192.168.0.61 \
--insecure \
--file clusterconfig/portefaix-homelab-worker-2.yaml

Wait for the nodes to reboot (~2 minutes), then verify they are reachable:

talosctl --nodes 192.168.0.61 version

7. Bootstrap the Talos cluster
Bootstrap must be run once on the control plane node. This initialises etcd and makes the API server available:
talosctl bootstrap --nodes 192.168.0.61

Monitor the bootstrap progress:
talosctl --nodes 192.168.0.61 dashboard
# Or watch services directly
talosctl --nodes 192.168.0.61 services
NODE           SERVICE      STATE     HEALTH   LAST CHANGE
192.168.0.61   apid         Running   OK       35s ago
192.168.0.61   containerd   Running   OK       38s ago
192.168.0.61   etcd         Running   OK       12s ago
192.168.0.61   kubelet      Running   OK       8s ago

8. Fetch cluster credentials
talosctl kubeconfig \
--nodes 192.168.0.61 \
--endpoints 192.168.0.61 \
$HOME/.kube/portefaix-homelab
export KUBECONFIG="$HOME/.kube/portefaix-homelab"
kubectl get nodes
NAME          STATUS     ROLES           AGE   VERSION
portefaix     NotReady   control-plane   2m    v1.31.4
portefaix-1   NotReady   <none>          90s   v1.31.4
portefaix-2   NotReady   <none>          85s   v1.31.4

NotReady is expected — no CNI has been installed yet. Cilium is installed in the next step.
9. Install Cilium
Install Cilium with Hubble enabled for network observability. Talos needs specific security-context capabilities and cgroup settings; pass them via Helm values:
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium \
--namespace kube-system \
--set ipam.mode=kubernetes \
--set kubeProxyReplacement=true \
--set securityContext.capabilities.ciliumAgent="{CHOWN,KILL,NET_ADMIN,NET_RAW,IPC_LOCK,SYS_ADMIN,SYS_RESOURCE,DAC_OVERRIDE,FOWNER,SETGID,SETUID}" \
--set securityContext.capabilities.cleanCiliumState="{NET_ADMIN,SYS_ADMIN,SYS_RESOURCE}" \
--set cgroup.autoMount.enabled=false \
--set cgroup.hostRoot=/sys/fs/cgroup \
--set hubble.relay.enabled=true \
--set hubble.ui.enabled=true \
--wait

Verify Cilium is healthy and nodes are Ready:
cilium status
kubectl get nodes -o wide
NAME          STATUS   ROLES           AGE     VERSION   INTERNAL-IP     EXTERNAL-IP
portefaix     Ready    control-plane   5m      v1.31.4   192.168.0.61    <none>
portefaix-1   Ready    <none>          4m30s   v1.31.4   192.168.0.208   <none>
portefaix-2   Ready    <none>          4m25s   v1.31.4   192.168.0.116   <none>

10. Label nodes by role
Portefaix uses node labels to schedule platform components on appropriate hardware:
# Application / low-cost workload nodes
kubectl label node portefaix-1 portefaix-2 \
node-role.kubernetes.io/worker=true \
node-role.kubernetes.io/lowcost=true
# Infrastructure nodes (ingress, monitoring, cert-manager);
# adjust the node names to match your cluster
kubectl label node portefaix-6 portefaix-7 \
node-role.kubernetes.io/worker=true \
node-role.kubernetes.io/infra=true

| Label | Description |
|---|---|
| node-role.kubernetes.io/infra=true | Platform components (ingress, monitoring, cert-manager) |
| node-role.kubernetes.io/lowcost=true | PoCs, development workloads, batch jobs |
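Workloads opt into these pools through standard Kubernetes scheduling; a sketch of a Deployment pod template pinned to the lowcost nodes:

```yaml
# Fragment of a Deployment spec; only the nodeSelector is the point here.
spec:
  template:
    spec:
      nodeSelector:
        node-role.kubernetes.io/lowcost: "true"
```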
11. Configure Cloudflare R2 for Terraform state
Cloudflare R2 stores Terraform state files and provides S3-compatible object storage for observability components. Configure credentials in your Portefaix config:
function setup_cloudflare() {
export CLOUDFLARE_ACCOUNT_ID="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export AWS_ACCESS_KEY_ID="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export AWS_SECRET_ACCESS_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
}

. ./portefaix.sh talos
# Create the R2 bucket for Terraform state
aws s3api create-bucket \
--bucket portefaix-homelab-tfstates \
--endpoint-url "https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com"
# Create observability storage bucket
aws s3api create-bucket \
--bucket portefaix-homelab-observability \
--endpoint-url "https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com"

12. Provision DNS and observability storage with Terraform
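The terraform init flags used below complete a partial S3 backend defined in the module; a sketch of what that backend block typically looks like for R2 (the exact skip_* settings needed vary by Terraform version, so treat this as illustrative):

```hcl
terraform {
  backend "s3" {
    # bucket, key, endpoint, and region are supplied via -backend-config.
    # R2 is S3-compatible but does not implement these AWS-specific checks:
    skip_credentials_validation = true
    skip_region_validation      = true
    skip_metadata_api_check     = true
  }
}
```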
cd portefaix-infrastructure/terraform/talos/dns
terraform init \
-backend-config="bucket=portefaix-homelab-tfstates" \
-backend-config="key=dns/homelab.tfstate" \
-backend-config="endpoint=https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com" \
-backend-config="region=auto"
terraform apply

cd portefaix-infrastructure/terraform/talos/observability
terraform init \
-backend-config="bucket=portefaix-homelab-tfstates" \
-backend-config="key=observability/homelab.tfstate" \
-backend-config="endpoint=https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com" \
-backend-config="region=auto"
terraform apply

13. Deploy Portefaix stacks via ArgoCD
helm repo add argo https://argoproj.github.io/argo-helm
helm repo update
helm install argocd argo/argo-cd \
--namespace argocd --create-namespace \
--values portefaix-kubernetes/gitops/argocd/values-talos.yaml \
--wait
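To log in to the ArgoCD UI or CLI, the initial admin password lives in ArgoCD's standard bootstrap secret (a sketch; change the password and delete this secret afterwards):

```shell
# Sketch: read the admin password ArgoCD generates at install time
# (stored in the standard argocd-initial-admin-secret).
argocd_initial_password() {
  kubectl -n argocd get secret argocd-initial-admin-secret \
    -o jsonpath='{.data.password}' | base64 -d
}
# argocd_initial_password
```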
kubectl apply -f portefaix-kubernetes/gitops/argocd/bootstrap/app-of-apps-talos-homelab.yaml
argocd app wait portefaix-bootstrap --health --timeout 600

Stacks available on Homelab
| Stack | Description | Storage / service used |
|---|---|---|
| Networking | Cilium + Hubble | In-cluster eBPF-based CNI |
| Observability | Prometheus, Grafana, Loki, Tempo | Cloudflare R2 for long-term storage |
| DNS management | External DNS | Cloudflare DNS |
| TLS certificates | cert-manager | Cloudflare DNS for DNS-01 challenges |
| Secret management | External Secrets Operator | SOPS-encrypted secrets in Git |
| Policy enforcement | Kyverno | — |
Troubleshooting
Node stuck in maintenance mode after applying config
If talosctl version times out after applying config, the node may not have
rebooted cleanly. Check the node is accessible and in the correct boot mode:
# Query node state without authentication (maintenance mode only)
talosctl --nodes 192.168.0.61 --insecure get machinestatus
# Force reboot if needed
talosctl --nodes 192.168.0.61 reboot

Nodes stuck in NotReady after Cilium install
Check that Cilium pods started correctly and that the kernel extensions in the Talos image include the required eBPF capabilities:
kubectl -n kube-system get pods -l k8s-app=cilium
kubectl -n kube-system logs -l k8s-app=cilium --tail=50
# Verify Talos has the correct extensions installed
talosctl --nodes 192.168.0.61 get extensions

R2 endpoint authentication errors
# Verify R2 credentials work
aws s3 ls \
--endpoint-url "https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com"

Ensure your R2 API token has Object Read & Write permissions on the target bucket.
SOPS decryption fails
# Verify your age key is set correctly
echo $SOPS_AGE_KEY_FILE
cat $SOPS_AGE_KEY_FILE | head -2
# Test decryption
sops --decrypt talsecret.sops.yaml | head -5
If you have multiple age keys, ensure the public key in .sops.yaml matches
the one in your key file.