Conductor Documentation

NIC Bonding at Cloud Platform HLD

Introduction

NIC bonding aggregates network interfaces on a host within a WRCP installation, specifically in All-in-One Duplex configurations.

The network interface aggregation can operate in modes such as “Active-backup,” “Balanced XOR,” or “802.3ad.”
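For orientation, the three modes named above correspond to standard Linux bonding driver modes. The sketch below records that correspondence. The dictionary keys and the helper name are illustrative only and are not taken from the WRCP code base; the exact spelling a CIQ accepts may differ.

```python
# Linux bonding driver mode names and numbers for the modes listed above.
# The keys are normalized, illustrative spellings (an assumption of this
# sketch), not the values required by any CIQ schema.
LINUX_BOND_MODES = {
    "active-backup": ("active-backup", 1),
    "balanced xor": ("balance-xor", 2),
    "802.3ad": ("802.3ad", 4),
}


def linux_bond_mode(name):
    """Return (kernel mode name, kernel mode number) for a documented mode."""
    key = name.strip().lower()
    if key not in LINUX_BOND_MODES:
        raise ValueError(f"unsupported bond mode: {name!r}")
    return LINUX_BOND_MODES[key]
```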

Workflows

This feature is enabled at provisioning time, through the deployment config. The relevant workflow is provision_sysctrl.

An indicative high-level sequence of events is shown in the sequence diagram below:

Flow

Input

Updated CIQ with NIC Bonding Spec / Server Golden Config.

Output

NIC bonding deployed successfully on all 15G and 16G compute nodes in the compute/workload cluster, or on the management/WR Systems controller running on 15G or 16G hardware.

Sample Logical Network Address Configuration (IPv4)

floating IP: 128.224.54.34
controller-0 IP: 128.224.54.35
controller-1 IP: 128.224.54.36
oam_gateway IP: 128.224.54.1

dns_servers: 128.224.144.130

management_subnet: 10.9.32.0/24
management_start_address: 10.9.32.2
management_end_address: 10.9.32.254
management_multicast_subnet:
management_gateway_address: 10.9.32.1

cluster_host_subnet: 192.168.206.0/24
cluster_pod_subnet:
cluster_service_subnet:

external_oam_subnet: 128.224.54.0/23
external_oam_gateway_address: 128.224.54.1
external_oam_floating_address: 128.224.54.34
external_oam_node_0_address: 128.224.54.35
external_oam_node_1_address: 128.224.54.36

pxeboot_subnet: 192.168.202.0/24
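A quick consistency check on the sample addresses above can catch an address that falls outside its subnet before provisioning. The following is a minimal sketch using only Python's standard `ipaddress` module. The helper name and the grouping of keys into dictionaries are assumptions made for illustration and are not part of the CIQ tooling.

```python
import ipaddress

# Values copied from the sample configuration above.
SUBNETS = {
    "external_oam": "128.224.54.0/23",
    "management": "10.9.32.0/24",
}
ADDRESSES = {
    "external_oam": {
        "external_oam_gateway_address": "128.224.54.1",
        "external_oam_floating_address": "128.224.54.34",
        "external_oam_node_0_address": "128.224.54.35",
        "external_oam_node_1_address": "128.224.54.36",
    },
    "management": {
        "management_start_address": "10.9.32.2",
        "management_end_address": "10.9.32.254",
        "management_gateway_address": "10.9.32.1",
    },
}


def misplaced_addresses(subnets, addresses):
    """Return the names of addresses that fall outside their declared subnet."""
    bad = []
    for net_name, cidr in subnets.items():
        # ip_network raises ValueError if the CIQ value has host bits set.
        network = ipaddress.ip_network(cidr)
        for addr_name, addr in addresses.get(net_name, {}).items():
            if ipaddress.ip_address(addr) not in network:
                bad.append(addr_name)
    return bad
```

Calling `misplaced_addresses(SUBNETS, ADDRESSES)` on the sample values returns an empty list, since every listed address lies inside its subnet.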

The configuration also includes the definition of the network interface bond name. For example, for a bond named pxeboot0, the bond state can be inspected with cat /proc/net/bonding/pxeboot0.
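To check a bond after deployment, the text exposed under /proc/net/bonding can be summarized programmatically. The sketch below parses the key/value layout the Linux bonding driver uses for that file. The `SAMPLE` text is illustrative and was not captured from a real host, and the function name is hypothetical.

```python
# Illustrative excerpt in the layout of /proc/net/bonding/<bond>.
# Not captured from a real system; the interface names are taken from the
# port mapping example in this document.
SAMPLE = """\
Ethernet Channel Bonding Driver: v5.10.0

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: enp23s0f0
MII Status: up

Slave Interface: enp202s0f0
MII Status: down
"""


def parse_bond_status(text):
    """Summarize bond mode, bond link state, and per-member link state."""
    summary = {"mode": None, "mii_status": None, "members": {}}
    current = None  # member section currently being read, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a key/value pair
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            summary["mode"] = value
        elif key == "Slave Interface":
            current = value
            summary["members"][current] = None
        elif key == "MII Status":
            if current is None:
                summary["mii_status"] = value  # bond-level status
            else:
                summary["members"][current] = value  # member-level status
    return summary
```

On a host, the same function could be given the contents of the bond's file under /proc/net/bonding instead of `SAMPLE`.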

CIQ Structure for NIC Bonding

In the installation CIQs, the structure for each service_tag will be as follows:

  
server_list:
- service_tag: "14hxys3"
  hostname: "controller-1"
  role: controller-std
  id: 1
  bmc_endpoint: "https://100.76.27.140"
  bonded_interfaces:
    - NIC.Bond.1-1-1
    - NIC.Bond.1-1-2
  bond_mode: 802.3ad
  transmitHashPolicy: layer2
  main_interface: NIC.Slot.2-1-1
  # If the plugin fails to map main_interface to an interface name and MAC,
  # fall back to the legacy provisioning of bootstrap_interface or bootstrap_mac.
  boot_device: /dev/disk/by-path/pci-0000:67:00.0-scsi-0:3:111:0
  osds:
  - /dev/disk/by-path/pci-0000:67:00.0-scsi-0:2:3:0
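A small pre-flight check on each server_list entry can flag inconsistent bonding fields before the workflow runs. The following is a sketch only. The function name is hypothetical, and the sets of accepted values are assumptions that should be confirmed against the platform release in use.

```python
# Accepted values are ASSUMPTIONS for illustration; confirm them against the
# platform release before relying on this check.
ALLOWED_BOND_MODES = {"active_standby", "balanced", "802.3ad"}
ALLOWED_HASH_POLICIES = {"layer2", "layer2+3", "layer3+4"}


def check_bond_fields(server):
    """Return a list of problems found in one server_list entry (a dict)."""
    problems = []
    members = server.get("bonded_interfaces") or []
    mode = server.get("bond_mode")
    # The template keys off bond_mode and iterates the members, so the two
    # fields are only meaningful together.
    if bool(members) != bool(mode):
        problems.append("bonded_interfaces and bond_mode must be set together")
    if len(set(members)) != len(members):
        problems.append("bonded_interfaces contains duplicates")
    if mode and mode not in ALLOWED_BOND_MODES:
        problems.append(f"unknown bond_mode: {mode}")
    policy = server.get("transmitHashPolicy")
    if policy and policy not in ALLOWED_HASH_POLICIES:
        problems.append(f"unknown transmitHashPolicy: {policy}")
    return problems


# The bonding-related fields of the example entry above, as a Python dict.
EXAMPLE_SERVER = {
    "service_tag": "14hxys3",
    "bonded_interfaces": ["NIC.Bond.1-1-1", "NIC.Bond.1-1-2"],
    "bond_mode": "802.3ad",
    "transmitHashPolicy": "layer2",
    "main_interface": "NIC.Slot.2-1-1",
}
```

Returning a list of problems rather than raising on the first one lets a CIQ author see every issue in a single pass.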

NIC.Bond.1-1-1 and NIC.Bond.1-1-2 must be present in the FQDD mapping CIQ, for example:

device_mappers:
- filter:
    models:
    - 'PowerEdge R750'
  disk_mappers:
  - fqdd: HBA355i
    prefix: pci-0000:67:00.0-sas-0x
    suffix: -lun-0
  port_mappers:
  - fqdd: 'NIC.Bond.1-1-1'
    ethifname: 'enp23s0f0'
  - fqdd: 'NIC.Bond.1-1-2'
    ethifname: 'enp202s0f0'
  - fqdd: 'NIC.Embedded.1-1-1'
    ethifname: 'eno8303'
  - fqdd: 'NIC.Embedded.2-1-1'
    ethifname: 'eno8403'
    # R750, Broadcom Adv Quad 25Gb Ethernet, Broadcom Corp
  - fqdd: 'NIC.Integrated.1-1-1'
    ethifname: 'eno12399'
  - fqdd: 'NIC.Integrated.1-2-1'
    ethifname: 'eno12409'
  - fqdd: 'NIC.Integrated.1-3-1'
    ethifname: 'eno12419'
  - fqdd: 'NIC.Integrated.1-4-1'
    ethifname: 'eno12429'
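The mapping above is what allows the FQDDs listed under bonded_interfaces to be translated into Linux interface names. This document does not describe how the plugin performs that lookup, so the following is only a sketch of the idea. The function name and the in-memory representation of port_mappers are assumptions.

```python
# A subset of the port_mappers entries shown above, as Python dicts.
PORT_MAPPERS = [
    {"fqdd": "NIC.Bond.1-1-1", "ethifname": "enp23s0f0"},
    {"fqdd": "NIC.Bond.1-1-2", "ethifname": "enp202s0f0"},
    {"fqdd": "NIC.Embedded.1-1-1", "ethifname": "eno8303"},
]


def resolve_interfaces(fqdds, port_mappers):
    """Map each FQDD to its ethifname, failing loudly on unmapped entries."""
    table = {m["fqdd"]: m["ethifname"] for m in port_mappers}
    missing = [f for f in fqdds if f not in table]
    if missing:
        raise KeyError(f"no port_mappers entry for: {', '.join(missing)}")
    # Preserve the order in which the CIQ lists the bond members.
    return [table[f] for f in fqdds]
```

With the example data, resolving `["NIC.Bond.1-1-1", "NIC.Bond.1-1-2"]` yields the two interface names that appear as ethernet ports in the deployment config.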

The deployment config Jinja template changes as follows:

---
apiVersion: starlingx.windriver.com/v1
kind: HostProfile
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: controller-{{server.id}}-profile
  namespace: deployment
spec:
  {% if server.tag_subfunction_lowlatency_ %}
  base: controller-aio-ll-profile
  {% elif server.tag_subfunction_worker_ %}
  base: controller-aio-profile
  {% else %}
  base: controller-profile
  {% endif %}
  bootDevice: {{server.boot_device}}
  rootDevice: {{server.boot_device}}
  boardManagement:
    credentials:
      password:
        secret: bmc-secret-{{server.service_tag | lower}}
    type: {{server.bmc_type or 'dynamic'}}
    address: {{server.bmc_ip_}}
  interfaces:
    {% if server.bond_mode %}
    bond:
    - class: platform
      dataNetworks: []
      members:
      {% for bond_if in server.bonded_interfaces_ %}
      - {{bond_if}}
      {% endfor %}
      mode: {{server.bond_mode}}
      {% if server.transmitHashPolicy %}
      transmitHashPolicy: {{server.transmitHashPolicy}}
      {% endif %}
      {% if server.primaryReselect %}
      primaryReselect: {{server.primaryReselect}}
      {% endif %}
      name: apxeboot0
      platformNetworks:
      - pxeboot
    {# Assumed placement: closes the server.bond_mode conditional opened above. #}
    {% endif %}
    ethernet:
    - class: none
      dataNetworks: []
      name: enp23s0f0
      platformNetworks: []
      port:
        name: enp23s0f0
    - class: none
      dataNetworks: []
      name: enp202s0f0
      platformNetworks: []
      port:
        name: enp202s0f0
    {% if server.id == 0 %}
    - class: none
      dataNetworks: []
      mtu: 1500
      name: lo
      platformNetworks: []
      port:
        name: lo
      ptpRole: none
    {% endif %}
    vlan:
    - class: platform
      dataNetworks: []
      lower: apxeboot0
      name: oam0
      platformNetworks:
      - oam
      ptpRole: none
      vid: {{ oam_vlan }}
    - class: platform
      dataNetworks: []
      lower: apxeboot0
      name: mgmt0
      platformNetworks:
      - mgmt
      ptpRole: none
      vid: {{ management_vlan }}
    - class: platform
      dataNetworks: []
      lower: apxeboot0
      name: cluster0
      platformNetworks:
      - cluster-host
      ptpRole: none
      vid: {{ cluster_host_vlan }}
  {% if server.osds %}
  storage:
    osds:
    {% for osd_device in server.osds %}
    - cluster: ceph_cluster
      function: osd
      path: {{osd_device}}
    {% endfor %}
  {% endif %}
---
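The conditional logic of the bond section in the template above can be restated in plain Python, which may help when reviewing which keys end up in the rendered HostProfile. This is a sketch that mirrors the template and is not the rendering code itself. The function name is hypothetical, and `bonded_interfaces_` is assumed to hold the already resolved interface names.

```python
def bond_interface(server, name="apxeboot0"):
    """Build the bond entry the template emits, or None when no bond is set."""
    if not server.get("bond_mode"):
        return None
    bond = {
        "class": "platform",
        "dataNetworks": [],
        "members": list(server.get("bonded_interfaces_", [])),
        "mode": server["bond_mode"],
        "name": name,
        "platformNetworks": ["pxeboot"],
    }
    # Optional keys are emitted only when the CIQ supplies a value,
    # matching the {% if %} guards in the template.
    for key in ("transmitHashPolicy", "primaryReselect"):
        if server.get(key):
            bond[key] = server[key]
    return bond


# Example input using values from this document.
EXAMPLE = {
    "bond_mode": "802.3ad",
    "bonded_interfaces_": ["enp23s0f0", "enp202s0f0"],
    "transmitHashPolicy": "layer2",
}
```

For `EXAMPLE`, the result contains `transmitHashPolicy` but no `primaryReselect` key, because the entry does not set one.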