The fab.yaml file is the configuration file for the fabric. It supplies the configuration of the users, their credentials, logging, telemetry, and other non-wiring-related settings. The fab.yaml file is composed of multiple YAML documents inside a single file. Per the YAML spec, three hyphens (---) on a line by themselves separate the end of one document from the beginning of the next. There are two YAML documents in the fab.yaml file. For more information about how to use hhfab init, run hhfab init --help.
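As a rough illustration of the multi-document layout, the sketch below writes a two-document skeleton to a temporary file and counts the documents in it. The temporary file and the doc_count variable are hypothetical helpers for this example only, and the skeleton omits the spec sections a real fab.yaml needs.

```shell
# Write a skeleton holding the two documents, separated by "---".
# Only the identifying fields are shown; this is not a usable configuration.
skeleton="$(mktemp)"
cat > "$skeleton" <<'EOF'
apiVersion: fabricator.githedgehog.com/v1beta1
kind: Fabricator
metadata:
  name: default
  namespace: fab
---
apiVersion: fabricator.githedgehog.com/v1beta1
kind: ControlNode
metadata:
  name: control-1
  namespace: fab
EOF

# Each document has exactly one top-level "kind:" line, so counting
# those lines counts the documents.
doc_count="$(grep -c '^kind:' "$skeleton")"
echo "documents: $doc_count"
```

Run against this skeleton, the count is 2: one Fabricator document and one ControlNode document.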
Typical HHFAB workflows
HHFAB for VLAB
For a VLAB user, the typical workflow with hhfab is:
hhfab init --dev
hhfab vlab gen
hhfab vlab up
The above workflow will get a user up and running with a spine-leaf VLAB.
HHFAB for Physical Machines
It's possible to start from scratch:
hhfab init (see different flags to customize initial configuration)
After the above workflow, a user will have a .img file suitable for installing the control node and then bringing up the switches that comprise the fabric.
Fab.yaml
Configure control node and switch users
Control node and switch users are configured either by passing --default-password-hash to hhfab init or by editing the fab.yaml file emitted by hhfab init. You can specify users to be configured on the control node(s) and switches in the following format:
spec:
  config:
    control:
      defaultUser: # user 'core' on all control nodes
        password: "hashhashhashhashhash" # password hash
        authorizedKeys:
          - "ssh-ed25519 SecREKeyJumblE"
    fabric:
      mode: spine-leaf # "spine-leaf" or "collapsed-core"
      defaultSwitchUsers:
        admin: # at least one user with name 'admin' and role 'admin'
          role: admin
          #password: "$5$8nAYPGcl4..." # password hash
          #authorizedKeys: # optional SSH authorized keys
          # - "ssh-ed25519 AAAAC3Nza..."
        op: # optional read-only user
          role: operator
          #password: "$5$8nAYPGcl4..." # password hash
          #authorizedKeys: # optional SSH authorized keys
          # - "ssh-ed25519 AAAAC3Nza..."
The control node user is always named core.
The operator role gives a user read-only access to the sonic-cli command on the switches. To avoid conflicts, do not use the following usernames: operator, hhagent, netops.
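The password fields above hold a hash rather than a plain-text password. The sample hashes begin with $5$, which is the SHA-256 crypt format, so one possible way to produce a value is with openssl. This is a minimal sketch that assumes OpenSSL 1.1.1 or newer is installed; the password 'change-me' and the hash variable are placeholders for this example.

```shell
# Generate a SHA-256 crypt hash (the "$5$" prefix seen in the sample config).
# 'change-me' is a placeholder; openssl picks a random salt on each run.
hash="$(openssl passwd -5 'change-me')"
echo "$hash"

# The resulting value could then be supplied at init time, for example:
#   hhfab init --default-password-hash "$hash"
# or pasted into the password fields of fab.yaml.
```

Because the salt is random, the output differs on every run even for the same password.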
NTP and DHCP
The control node uses public NTP servers from Cloudflare and Google by default. The control node runs a DHCP server on the management network. See the example file.
Control Node
The control node is the host that manages all the switches, runs k3s, and serves images. This YAML document configures the control node:
apiVersion: fabricator.githedgehog.com/v1beta1
kind: ControlNode
metadata:
  name: control-1
  namespace: fab
spec:
  bootstrap:
    disk: "/dev/sda" # disk to install OS on, e.g. "sda" or "nvme0n1"
  external:
    interface: enp2s0 # interface for external
    ip: dhcp # IP address for external interface
  management:
    interface: enp2s1 # interface for management

# Currently only one ControlNode is supported
The management interface is for the control node to manage the fabric switches, not for end-user management of the control node. For end-user management of the control node, specify the external interface name.
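The interface and disk names in the ControlNode document have to match what the control node hardware actually exposes. One way to look for candidate names is to list what the Linux kernel reports under /sys, as in the sketch below. It assumes a Linux environment booted on the target machine; names seen under a different OS or kernel may not match the ones the installed control node will use, so treat the output as a starting point only.

```shell
# Network interfaces known to the kernel: candidates for
# spec.external.interface and spec.management.interface.
ifaces="$(ls /sys/class/net)"
echo "interfaces:"
echo "$ifaces"

# Block devices known to the kernel: candidates for spec.bootstrap.disk.
disks="$(ls /sys/block)"
echo "disks:"
echo "$disks"
```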
Forward switch metrics and logs
There is an option to enable Grafana Alloy on all switches to forward metrics and logs to the configured targets using
Prometheus Remote-Write API and Loki API. If those APIs are available from Control Node(s), but not from the switches,
it's possible to enable HTTP Proxy on Control Node(s) that will be used by Grafana Alloy running on the switches to
access the configured targets. This can be done by passing --control-proxy=true to hhfab init.
Metrics include port speeds, counters, errors, operational status, transceivers, fans, power supplies, temperature
sensors, BGP neighbors, LLDP neighbors, and more. Logs include agent logs.
Configuring the exporters and targets is currently only possible by editing the fab.yaml configuration file. An example configuration is provided below:
spec:
  config:
    ...
    defaultAlloyConfig:
      agentScrapeIntervalSeconds: 120
      unixScrapeIntervalSeconds: 120
      unixExporterEnabled: true
      lokiTargets:
        grafana_cloud: # target name, multiple targets can be configured
          basicAuth: # optional
            password: "<password>"
            username: "<username>"
          labels: # labels to be added to all logs
            env: env-1
          url: https://logs-prod-021.grafana.net/loki/api/v1/push
          useControlProxy: true # if the Loki API is not available from the switches directly, use the Control Node as a proxy
      prometheusTargets:
        grafana_cloud: # target name, multiple targets can be configured
          basicAuth: # optional
            password: "<password>"
            username: "<username>"
          labels: # labels to be added to all metrics
            env: env-1
          sendIntervalSeconds: 120
          url: https://prometheus-prod-36-prod-us-west-0.grafana.net/api/prom/push
          useControlProxy: true # if the Prometheus API is not available from the switches directly, use the Control Node as a proxy
      unixExporterCollectors: # list of node-exporter collectors to enable, https://grafana.com/docs/alloy/latest/reference/components/prometheus.exporter.unix/#collectors-list
        - cpu
        - filesystem
        - loadavg
        - meminfo
      collectSyslogEnabled: true # collect /var/log/syslog on switches and forward to the lokiTargets
Complete example file

apiVersion: fabricator.githedgehog.com/v1beta1
kind: Fabricator
metadata:
  name: default
  namespace: fab
spec:
  config:
    control:
      tlsSAN: # IPs and DNS names to access API
        - "customer.site.io"
      ntpServers:
        - time.cloudflare.com
        - time1.google.com
      defaultUser: # user 'core' on all control nodes
        password: "hash..." # password hash
        authorizedKeys:
          - "ssh-ed25519 hash..."
    fabric:
      mode: spine-leaf # "spine-leaf" or "collapsed-core"
      includeONIE: true
      defaultSwitchUsers:
        admin: # at least one user with name 'admin' and role 'admin'
          role: admin
          password: "hash..." # password hash
          authorizedKeys:
            - "ssh-ed25519 hash..."
        op: # optional read-only user
          role: operator
          password: "hash..." # password hash
          authorizedKeys:
            - "ssh-ed25519 hash..."
      defaultAlloyConfig:
        agentScrapeIntervalSeconds: 120
        unixScrapeIntervalSeconds: 120
        unixExporterEnabled: true
        collectSyslogEnabled: true
        lokiTargets:
          lab:
            url: http://url.io:3100/loki/api/v1/push
            useControlProxy: true
            labels:
              descriptive: name
        prometheusTargets:
          lab:
            url: http://url.io:9100/api/v1/push
            useControlProxy: true
            labels:
              descriptive: name
            sendIntervalSeconds: 120
---
apiVersion: fabricator.githedgehog.com/v1beta1
kind: ControlNode
metadata:
  name: control-1
  namespace: fab
spec:
  bootstrap:
    disk: "/dev/sda" # disk to install OS on, e.g. "sda" or "nvme0n1"
  external:
    interface: eno2 # interface for external
    ip: dhcp # IP address for external interface
  management:
    interface: eno1

# Currently only one ControlNode is supported