
Deploy Portefaix on Homelab

This guide shows you how to deploy a Portefaix platform on a Raspberry Pi cluster using Talos Linux as the immutable OS, talhelper for declarative cluster configuration, Cilium as the CNI, SOPS + age for secret encryption, and Cloudflare R2 for Terraform state storage.

Goal: a multi-node Talos cluster on Raspberry Pi hardware, with Cilium providing cluster networking, secrets encrypted at rest with SOPS, and Portefaix stacks reconciled by ArgoCD.

Why Talos? Unlike conventional Linux distributions, Talos has no SSH daemon, no package manager, and no shell. Every node is configured exclusively through its API via talosctl. The result is a minimal, read-only OS surface that is far harder to misconfigure or compromise than a general-purpose Linux install.

Prerequisites

  • 2+ Raspberry Pi 4 or 5 boards (8 GB RAM recommended for the control plane)
  • SD cards or USB SSDs (32 GB+ per node)
  • talosctl installed — curl -sL https://talos.dev/install | sh
  • talhelper installed — curl -fsSL https://i.jpillora.com/budimanjojo/talhelper! | bash
  • age and sops installed — available via Homebrew or your system package manager
  • Cloudflare account with R2 enabled and an API token
  • Terraform ≥ 1.5, kubectl, and Helm installed locally
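
Before starting, a quick sanity check that the tooling is on PATH (a small sketch using only POSIX shell; the aws CLI is included because later steps use it against R2):

```shell
# Report which of the required CLIs are installed
for tool in talosctl talhelper age sops terraform kubectl helm aws; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done
```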

1. Set up SOPS + age encryption

All secrets in the Portefaix homelab repository are encrypted with age via SOPS. Generate a key pair and store the private key securely:

# Generate a new age key pair
age-keygen -o $HOME/.config/portefaix/portefaix.homelab.txt

# The output will show the public key:
# Public key: age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# Export the public key for use in .sops.yaml
age-keygen -y $HOME/.config/portefaix/portefaix.homelab.txt

Add a .sops.yaml at the root of the repository to tell SOPS which key to use for encryption. Replace the age1... value with your actual public key:

creation_rules:
  - path_regex: .*\.yaml$
    age: age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Set the environment variable so talosctl and SOPS can find your private key:

export SOPS_AGE_KEY_FILE="$HOME/.config/portefaix/portefaix.homelab.txt"

2. Build a custom Talos image for Raspberry Pi

Talos Linux requires hardware-specific extensions for Raspberry Pi. Use the Talos Image Factory to build a customized image with the correct firmware extension:

  • Raspberry Pi 4: select the siderolabs/rpi4-firmware extension
  • Raspberry Pi 5: select the siderolabs/rpi5-firmware extension
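
If you prefer the API to the web UI, the same customization can be written as a schematic file and POSTed to the factory's /schematics endpoint, which returns the schematic ID as JSON. A sketch for a Raspberry Pi 4 (swap the extension for a Pi 5):

```yaml
# schematic.yaml — Image Factory customization for a Raspberry Pi 4.
# Submit with:
#   curl -s -X POST --data-binary @schematic.yaml https://factory.talos.dev/schematics
customization:
  systemExtensions:
    officialExtensions:
      - siderolabs/rpi4-firmware
```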

The factory produces a schematic ID. Download the metal disk image for your board:

# Example: Raspberry Pi 4, Talos v1.9.x
# Replace SCHEMATIC_ID with the ID from factory.talos.dev
SCHEMATIC_ID="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
TALOS_VERSION="v1.9.5"

curl -LO "https://factory.talos.dev/image/$SCHEMATIC_ID/$TALOS_VERSION/metal-arm64.raw.xz"

# Flash each SD card or SSD (replace /dev/sdX with your device)
xz -d -c metal-arm64.raw.xz | sudo dd of=/dev/sdX bs=4M status=progress conv=fsync

Boot each board from the flashed media. Talos will start in maintenance mode and listen for configuration on port 50000. Note the IP addresses assigned to each node (check your router DHCP table or use nmap).

3. Inspect available disks (optional)

Before generating configuration, confirm the disk path Talos should install to. While the node is in maintenance mode, query it without authentication:

talosctl disks \
  --nodes 192.168.0.61 \
  --endpoints 192.168.0.61 \
  --insecure

Update installDisk in talconfig.yaml with the path shown (typically /dev/mmcblk0 for SD cards or /dev/sda for USB SSDs).

4. Generate cluster secrets

talhelper generates a talsecret.sops.yaml containing Talos bootstrap secrets (CA certificates, join tokens, etc.) and encrypts it immediately with SOPS:

cd portefaix-infrastructure/talos/homelab

# Generate Talos cluster secrets
talhelper gensecret > talsecret.sops.yaml

# Encrypt in-place using your .sops.yaml configuration
sops --encrypt --in-place talsecret.sops.yaml

Never commit an unencrypted talsecret.sops.yaml. Verify encryption before committing: sops --decrypt talsecret.sops.yaml should require your age private key. The encrypted file is safe to commit.
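
A small guard can make the "never commit plaintext" rule mechanical: on encryption, SOPS adds a top-level sops: metadata section to the file. A sketch (check_encrypted is a hypothetical helper, demonstrated here on a dummy file):

```shell
# Fail if a file lacks the "sops:" metadata section that SOPS adds on encryption
check_encrypted() {
  if grep -q '^sops:' "$1"; then
    echo "OK: $1 is SOPS-encrypted"
  else
    echo "REFUSING: $1 looks like plaintext" >&2
    return 1
  fi
}

# Demonstration on a dummy file; run it on talsecret.sops.yaml for real
printf 'sops:\n    age: []\n' > /tmp/demo.sops.yaml
check_encrypted /tmp/demo.sops.yaml
```

Wiring this into a git pre-commit hook catches an unencrypted secret before it ever reaches history.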

5. Generate per-node Talos configurations

talhelper reads talconfig.yaml (your cluster topology) and talsecret.sops.yaml (the encrypted secrets) to produce a machine config file for each node:
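
For reference, a minimal talconfig.yaml for the three-node topology used in this guide might look like the following (a sketch — hostnames, IPs, versions, and install disks are illustrative and must match your environment):

```yaml
clusterName: portefaix-homelab
talosVersion: v1.9.5
kubernetesVersion: v1.31.4
endpoint: https://192.168.0.61:6443
nodes:
  - hostname: portefaix
    ipAddress: 192.168.0.61
    controlPlane: true
    installDisk: /dev/mmcblk0
  - hostname: portefaix-1
    ipAddress: 192.168.0.208
    controlPlane: false
    installDisk: /dev/mmcblk0
  - hostname: portefaix-2
    ipAddress: 192.168.0.116
    controlPlane: false
    installDisk: /dev/mmcblk0
```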

talhelper genconfig

# Output: clusterconfig/
#   portefaix-homelab-controlplane.yaml
#   portefaix-homelab-worker-1.yaml
#   portefaix-homelab-worker-2.yaml
#   talosconfig

Copy the generated talosconfig to the default location so talosctl finds it automatically:

mkdir -p $HOME/.talos
cp clusterconfig/talosconfig $HOME/.talos/config

6. Apply configuration to each node

Push the machine configuration to each node while it is in maintenance mode. The node will apply the config and reboot into Talos:

# Control plane node
talosctl apply-config \
  --nodes 192.168.0.61 \
  --endpoints 192.168.0.61 \
  --insecure \
  --file clusterconfig/portefaix-homelab-controlplane.yaml

# Worker node 1
talosctl apply-config \
  --nodes 192.168.0.208 \
  --endpoints 192.168.0.61 \
  --insecure \
  --file clusterconfig/portefaix-homelab-worker-1.yaml

# Worker node 2
talosctl apply-config \
  --nodes 192.168.0.116 \
  --endpoints 192.168.0.61 \
  --insecure \
  --file clusterconfig/portefaix-homelab-worker-2.yaml

Wait for the nodes to reboot (~2 minutes), then verify they are reachable:

talosctl --nodes 192.168.0.61 version

7. Bootstrap the Talos cluster

Bootstrap must be run once on the control plane node. This initialises etcd and makes the API server available:

talosctl bootstrap --nodes 192.168.0.61

Monitor the bootstrap progress:

talosctl --nodes 192.168.0.61 dashboard

# Or watch services directly
talosctl --nodes 192.168.0.61 services
NODE           SERVICE    STATE     HEALTH   LAST CHANGE
192.168.0.61   apid       Running   OK       35s ago
192.168.0.61   containerd Running   OK       38s ago
192.168.0.61   etcd       Running   OK       12s ago
192.168.0.61   kubelet    Running   OK       8s ago

8. Fetch cluster credentials

talosctl kubeconfig \
  --nodes 192.168.0.61 \
  --endpoints 192.168.0.61 \
  $HOME/.kube/portefaix-homelab

export KUBECONFIG="$HOME/.kube/portefaix-homelab"

kubectl get nodes
NAME            STATUS     ROLES           AGE   VERSION
portefaix       NotReady   control-plane   2m    v1.31.4
portefaix-1     NotReady   <none>          90s   v1.31.4
portefaix-2     NotReady   <none>          85s   v1.31.4

NotReady is expected — no CNI has been installed yet. Cilium is installed in the next step.

9. Install Cilium

Install Cilium with Hubble enabled for network observability. On Talos, Cilium needs specific security-context capabilities and cgroup settings — pass them via Helm values:
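
Note that with kubeProxyReplacement enabled, the Talos machine configs must also ship without the default CNI (Flannel) and without kube-proxy, otherwise Cilium conflicts with them. With talhelper this is a patch in talconfig.yaml (a sketch):

```yaml
# talconfig.yaml excerpt — hand pod networking and service
# load-balancing entirely to Cilium
patches:
  - |-
    cluster:
      network:
        cni:
          name: none
      proxy:
        disabled: true
```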

helm repo add cilium https://helm.cilium.io/
helm repo update

helm install cilium cilium/cilium \
  --namespace kube-system \
  --set ipam.mode=kubernetes \
  --set kubeProxyReplacement=true \
  --set securityContext.capabilities.ciliumAgent="{CHOWN,KILL,NET_ADMIN,NET_RAW,IPC_LOCK,SYS_ADMIN,SYS_RESOURCE,DAC_OVERRIDE,FOWNER,SETGID,SETUID}" \
  --set securityContext.capabilities.cleanCiliumState="{NET_ADMIN,SYS_ADMIN,SYS_RESOURCE}" \
  --set cgroup.autoMount.enabled=false \
  --set cgroup.hostRoot=/sys/fs/cgroup \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --wait

Verify Cilium is healthy and nodes are Ready:

cilium status

kubectl get nodes -o wide
NAME          STATUS   ROLES           AGE     VERSION   INTERNAL-IP     EXTERNAL-IP
portefaix     Ready    control-plane   5m      v1.31.4   192.168.0.61    <none>
portefaix-1   Ready    <none>          4m30s   v1.31.4   192.168.0.208   <none>
portefaix-2   Ready    <none>          4m25s   v1.31.4   192.168.0.116   <none>

10. Label nodes by role

Portefaix uses node labels to schedule platform components on appropriate hardware:

# Application / low-cost workload nodes
kubectl label node portefaix-1 portefaix-2 \
  node-role.kubernetes.io/worker=true \
  node-role.kubernetes.io/lowcost=true

# Infrastructure nodes (ingress, monitoring, cert-manager) — only
# relevant if your cluster has dedicated infra workers, e.g.:
kubectl label node portefaix-6 portefaix-7 \
  node-role.kubernetes.io/worker=true \
  node-role.kubernetes.io/infra=true

Label                                  Description
node-role.kubernetes.io/infra=true     Platform components (ingress, monitoring, cert-manager)
node-role.kubernetes.io/lowcost=true   PoCs, development workloads, batch jobs
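
Workloads can then opt into a tier with a nodeSelector; for example, to pin a Deployment's pods to the low-cost nodes (a sketch):

```yaml
# Pod template excerpt: schedule only on nodes labelled lowcost=true
spec:
  template:
    spec:
      nodeSelector:
        node-role.kubernetes.io/lowcost: "true"
```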

11. Configure Cloudflare R2 for Terraform state

Cloudflare R2 stores Terraform state files and provides S3-compatible object storage for observability components. Configure credentials in your Portefaix config:

# In your Portefaix config (e.g. portefaix.sh), export the R2 credentials:
function setup_cloudflare() {
    export CLOUDFLARE_ACCOUNT_ID="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    export AWS_ACCESS_KEY_ID="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    export AWS_SECRET_ACCESS_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
}

# Source the config for the talos environment
. ./portefaix.sh talos

# Create the R2 bucket for Terraform state
aws s3api create-bucket \
  --bucket portefaix-homelab-tfstates \
  --endpoint-url "https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com"

# Create observability storage bucket
aws s3api create-bucket \
  --bucket portefaix-homelab-observability \
  --endpoint-url "https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com"

12. Provision DNS and observability storage with Terraform

cd portefaix-infrastructure/terraform/talos/dns
terraform init \
  -backend-config="bucket=portefaix-homelab-tfstates" \
  -backend-config="key=dns/homelab.tfstate" \
  -backend-config="endpoint=https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com" \
  -backend-config="region=auto"
terraform apply
cd portefaix-infrastructure/terraform/talos/observability
terraform init \
  -backend-config="bucket=portefaix-homelab-tfstates" \
  -backend-config="key=observability/homelab.tfstate" \
  -backend-config="endpoint=https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com" \
  -backend-config="region=auto"
terraform apply
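
The -backend-config flags above assume each module declares a partial S3 backend. Because R2 is S3-compatible but not AWS, the backend needs its AWS-specific validations disabled (a sketch; exact option names vary slightly across Terraform versions — in 1.6+ the endpoint moved under an endpoints block):

```hcl
terraform {
  backend "s3" {
    # bucket, key, endpoint and region are supplied via -backend-config
    skip_credentials_validation = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
    skip_metadata_api_check     = true
    skip_s3_checksum            = true
    use_path_style              = true
  }
}
```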

13. Deploy Portefaix stacks via ArgoCD

helm repo add argo https://argoproj.github.io/argo-helm
helm repo update

helm install argocd argo/argo-cd \
  --namespace argocd --create-namespace \
  --values portefaix-kubernetes/gitops/argocd/values-talos.yaml \
  --wait

kubectl apply -f portefaix-kubernetes/gitops/argocd/bootstrap/app-of-apps-talos-homelab.yaml

# The argocd CLI needs a logged-in session first; the initial admin
# password is stored in the argocd-initial-admin-secret Secret:
kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath='{.data.password}' | base64 -d; echo

argocd app wait portefaix-bootstrap --health --timeout 600

Stacks available on Homelab

Stack                Description                        Storage / service used
Networking           Cilium + Hubble                    In-cluster eBPF-based CNI
Observability        Prometheus, Grafana, Loki, Tempo   Cloudflare R2 for long-term storage
DNS management       External DNS                       Cloudflare DNS
TLS certificates     cert-manager                       Cloudflare DNS for DNS-01 challenges
Secret management    External Secrets Operator          SOPS-encrypted secrets in Git
Policy enforcement   Kyverno

Troubleshooting

Node stuck in maintenance mode after applying config

If talosctl version times out after applying config, the node may not have rebooted cleanly. Check the node is accessible and in the correct boot mode:

# Query node state without authentication (maintenance mode only)
talosctl --nodes 192.168.0.61 --insecure get machinestatus

# Force reboot if needed
talosctl --nodes 192.168.0.61 reboot

Nodes stuck in NotReady after Cilium install

Check that Cilium pods started correctly and that the kernel extensions in the Talos image include the required eBPF capabilities:

kubectl -n kube-system get pods -l k8s-app=cilium
kubectl -n kube-system logs -l k8s-app=cilium --tail=50

# Verify Talos has the correct extensions installed
talosctl --nodes 192.168.0.61 get extensions

R2 endpoint authentication errors

# Verify R2 credentials work
aws s3 ls \
  --endpoint-url "https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com"

Ensure your R2 API token has Object Read & Write permissions on the target bucket.

SOPS decryption fails

# Verify your age key is set correctly
echo "$SOPS_AGE_KEY_FILE"
head -2 "$SOPS_AGE_KEY_FILE"   # shows only the comment lines, not the private key

# Test decryption
sops --decrypt talsecret.sops.yaml | head -5

If you have multiple age keys, ensure the public key in .sops.yaml matches the one in your key file.