Conductor Documentation

WRCP Plugin

Introduction

This plugin has two main capabilities: discovering and managing existing Wind River Cloud Platform (WRCP) installations, and creating new WRCP installations. For management, it can discover system controllers and subclouds, upgrade Kubernetes, upgrade Trident, audit certificates, “upload and apply” patches, and manage WRA. For installation, the plugin can provision a fully working WRCP installation for AIO-SX and AIO-DX configurations.

NOTE:

WRCP Management

Each item on the following list is materialized as a workflow in the plugin.

Prerequisites

All stories require that you provide secrets for the following values:

You are free to change the names of the secrets. However, you must provide the secret names in the deployment inputs when enrolling a new system.

Node Types

cloudify.nodes.WRCP.WRCP This node represents a WRCP System. A system can be a System Controller, Standalone system, or a Subcloud.

Properties

Runtime Properties:

Labels

The following are possible for a WRCP system deployment:

Sites

During installation, the plugin requests a system’s location (latitude and longitude) from the WRCP API and creates a Site in Conductor. You can see the site locations on the map on the Dashboard screen.

discover_and_deploy

This workflow discovers and deploys subclouds. The steps followed are listed next.

  1. Checks if there are discovered subclouds using the “Discover Subclouds” workflow, then gets the controller node instance, and uses the parent deployment’s capabilities to get auth data.

  2. Checks if the deployment_id passed from the workflow is valid and if there are 2 subclouds or more in the controller node instance runtime properties. If not, raises an Exception.

  3. Then, for each of the subclouds, checks if the subcloud has already been discovered and deployed. If not, checks if the address of the subcloud is IPv4 or IPv6, formats a dict of “inputs”, adds the labels, and adds the ID of each deployment to a list, together with its inputs and labels.

  4. To complete, calls deploy_subclouds with the data gathered in the previous step.
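
The steps above can be illustrated with a minimal sketch. It is not the plugin's implementation: the function name, the shape of the `subclouds` mapping, and the keys of the generated inputs are assumptions made for this example. Only the IPv4/IPv6 check, the "already deployed" skip, and the minimum of two subclouds come from the description above.

```python
import ipaddress


def build_subcloud_inputs(subclouds, deployed_ids, parent_deployment_id):
    """Collect deployment IDs, inputs and labels for subclouds not yet deployed.

    `subclouds` maps a subcloud name to a dict with an "address" key
    (hypothetical shape, chosen for this sketch only).
    """
    if len(subclouds) < 2:
        # Step 2: fewer than two discovered subclouds is treated as an error.
        raise ValueError("expected at least 2 discovered subclouds")

    deployment_ids, inputs, labels = [], [], []
    for name, info in sorted(subclouds.items()):
        if name in deployed_ids:
            continue  # Step 3: skip subclouds that were already deployed.
        address = ipaddress.ip_address(info["address"])
        # IPv6 literals are wrapped in brackets so they can be embedded in URLs.
        host = f"[{address}]" if address.version == 6 else str(address)
        deployment_ids.append(name)
        inputs.append({"address": host, "ip_version": address.version})
        labels.append({"csys-obj-parent": parent_deployment_id})
    return deployment_ids, inputs, labels
```

The three returned lists correspond to the data that step 4 hands over to deploy_subclouds.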

upload_and_apply_patch

Workflow to install patches.

Notices:

It includes the following steps:

  • Upload patch
  • Apply patch
  • Create subcloud patch strategy
  • Execute apply action on created strategy

Parameters:

      type_names:
        default: []
        description: |
          Type names for which the workflow will execute the update operation.
          By default the operation will execute for nodes of type
          cloudify.nodes.WRCP.WRCP
      node_ids:
        default: []
        description: |
          Node template IDs for which the workflow will execute the update
          operation
      node_instance_ids:
        default: []
        description: |
          Node instance IDs for which the workflow will execute the update
          operation
      patch_dir:
        default: ''
        description: |
          The path to a directory on the manager where the patches are located.
          The patches will be uploaded and applied.
      max_parallel_subclouds:
        default: 1
        description: |
          The maximum number of workers within a subcloud to update in
          parallel
      stop_on_failure:
        default: True
        description: |
          Flag to indicate if the update should stop
          updating additional subclouds if a failure
          is encountered
      subcloud_apply_type:
        default: 'serial'
        description: |
          The apply type for the update. serial or parallel
      strategy_action:
        default: 'apply'
        description: |
          Perform an action on the update strategy.
          Valid values are: apply, or abort
      apply_patches:
        type: boolean
        default: True
        description: |
          Flag to indicate if the patches should be applied.
          When the flag is False, the update will also be skipped
      install_patches:
        type: boolean
        default: True
        description: |
          Flag to indicate if the patches should be installed.
      subcloud_upload_only:
        type: boolean
        default: False
        description: |
          Flag to indicate if the patch operation should only upload the patches on the subclouds
      check_status:
        type: boolean
        default: True
        description: |
          Flag to indicate if status of installation should be checked.
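
As a rough illustration of how these parameters fit together, the sketch below merges user overrides with the defaults listed above and rejects values that the descriptions rule out. The helper name is hypothetical and the selection parameters (type_names, node_ids, node_instance_ids) are left out for brevity; the plugin may validate its inputs differently.

```python
# Defaults copied from the parameter list above.
DEFAULTS = {
    "patch_dir": "",
    "max_parallel_subclouds": 1,
    "stop_on_failure": True,
    "subcloud_apply_type": "serial",
    "strategy_action": "apply",
    "apply_patches": True,
    "install_patches": True,
    "subcloud_upload_only": False,
    "check_status": True,
}

# Values that the parameter descriptions name as valid.
ALLOWED = {
    "subcloud_apply_type": {"serial", "parallel"},
    "strategy_action": {"apply", "abort"},
}


def merge_patch_parameters(overrides):
    """Return the defaults updated with `overrides`, after basic checks."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    for key, allowed in ALLOWED.items():
        if params[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")
    return params
```

For example, `merge_patch_parameters({"patch_dir": "/opt/patches"})` keeps every other value at its documented default; the path is a placeholder.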

check_update_status

Checks the status of the strategy steps and updates the labels if the strategy has failed or completed.

Parameters:

      type_names:
        default: [ ]
        description: |
          Type names for which the workflow will execute the update
          check_update_status operation.
          By default the operation will execute for nodes of type
          cloudify.nodes.WRCP.WRCP
      node_ids:
        default: [ ]
        description: |
          Node template IDs for which the workflow will execute the
          check_update_status operation
      node_instance_ids:
        default: [ ]
        description: |
          Node instance IDs for which the workflow will execute the
          check_update_status operation

refresh_status

This workflow starts check_update_status on each subcloud.

Parameters:

      type_names:
        default: []
        description: |
          Type names for which the workflow will execute the refresh_status operation.
          By default the operation will execute for nodes of type
          cloudify.nodes.WRCP.WRCP
      node_ids:
        default: []
        description: |
          Node template IDs for which the workflow will execute the refresh_status
          operation
      node_instance_ids:
        default: []
        description: |
          Node instance IDs for which the workflow will execute the refresh_status
          operation

Upgrade

ATTENTION: Before starting the workflow, make sure that the files below do not exist on controller-0. They will be copied automatically from the load installed during the upgrade.

  • ~/wind-river-cloud-platform-deployment-manager-overrides.yaml
  • ~/wind-river-cloud-platform-deployment-manager.tgz
  • ~/wind-river-cloud-platform-deployment-manager.yaml
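
The check on the files listed above can be scripted. The following is a minimal sketch, assuming it is run on controller-0 as the user whose home directory is being inspected; the function name is hypothetical and the plugin itself does not necessarily perform this check.

```python
from pathlib import Path

# File names taken from the list above; they are looked up in the home directory.
STALE_FILES = (
    "wind-river-cloud-platform-deployment-manager-overrides.yaml",
    "wind-river-cloud-platform-deployment-manager.tgz",
    "wind-river-cloud-platform-deployment-manager.yaml",
)


def find_stale_files(home=None):
    """Return the names of listed files that still exist under `home`."""
    home = Path(home) if home is not None else Path.home()
    return [name for name in STALE_FILES if (home / name).exists()]
```

An empty result means that none of the listed files is present and the upgrade workflow can be started.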

Workflow to upgrade the software version on nodes, deployment managers, and subclouds. It contains the following steps:

  • Upgrade controllers:
    1. Install license for new release
    2. Upload ISO and SIG files to controller-0
    3. Start upgrade process
    4. Verify that upgrade has started
    5. Lock controller-1
    6. Check upgrade status
    7. Unlock controller-1
    In case of HA:
    8. Set controller-1 as active controller
    9. Wait until all services on controller-1 are enabled-active and the swact is complete
    10. Lock controller-0
    11. Upgrade controller-0
    12. Unlock controller-0
    13. Upgrade workers and storage hosts (if present)
    14. Upgrade deployment manager
    15. upgrade each
  • Upgrade subclouds:
    • create subcloud upgrade strategy
    • execute apply action on created strategy

Parameters:

      type_names:
        default: []
        description: |
          Type names for which the workflow will execute the upgrade operation.
          By default the operation will execute for nodes of type
          cloudify.nodes.WRCP.WRCP
      node_ids:
        default: []
        description: |
          Node template IDs for which the workflow will execute the upgrade
          operation
      node_instance_ids:
        default: []
        description: |
          Node instance IDs for which the workflow will execute the upgrade
          operation
      sw_version:
        default: ''
        description: |
          SW version to upgrade to. Example: 21.12
      license_file_path:
        default: ''
        description: |
          File path on the manager where the license file is located.
          This license file will be applied as a part of upgrade process.
      iso_path:
        default: ''
        description: |
          File path on the manager where the iso with new SW version is
          located. Example: /home/user/21.12/bootimage.iso
          cfy_user needs to be able to access this file.
      sig_path:
        default: ''
        description: |
          File path on the manager where the ISO signature is
          located. Example: /home/user/21.12/bootimage.sig
          cfy_user needs to be able to access this file.
      type_of_strategy:
        default: 'upgrade'
        description: |
          The Subcloud update strategy to be used for subclouds.
          One of: firmware, kube-rootca-update, kubernetes, patch, prestage,
          or upgrade
      subcloud_apply_type:
        default: 'parallel'
        description: |
          The apply type for the update. serial or parallel
      strategy_action:
        default: 'apply'
        description: |
           Perform an action on the update strategy.
           Valid values are: apply, or abort
      max_parallel_subclouds:
        default: 1
        description: |
           The maximum number of workers within a subcloud to update in
           parallel
      stop_on_failure:
        default: True
        description: |
          Flag to indicate if the update should stop
          updating additional subclouds if a failure
          is encountered
      force_flag:
        default: True
        description: |
          Force upgrade to run. Required if the workflow needs to run while
          there are active alarms.

run_on_subclouds

The run_on_subclouds workflow creates a temporary execution group based on the provided parameters, starts a batch execution on that group (running the workflow named workflow_id on all matched deployments), and deletes the group once all executions have completed.

Currently, the workflow looks for matches across all deployments (including environments and unrelated deployments), so labels and filters must be chosen carefully.

Parameters:

      workflow_id:
        description: |
          The name of the workflow to execute on all matched deployments (deployments group)
        default: ''
      workflow_inputs:
        description: The workflow parameters required during workflow execution
        default: {}
      labels:
        description: The labels on the basis of which deployments are selected to create
          a deployment group and start batch execution of workflow on each matched subcloud.
          The labels are optional and can be provided together with filter_ids
        default: {}
      filter_ids:
        description: |
          List of filter ids on the basis of which deployments are selected to create
          a deployment group and start batch execution of workflow on each matched subcloud.
          The filter_ids can be provided together with labels.
          Each provided ID must be the ID of an existing filter
        default: []
      run_on_all_subclouds:
        type: boolean
        description: |
          You can run a workflow on all sub environments.
          When this parameter is True, the deployments will be selected based on the rule:
          {"csys-obj-parent": "<parent_deployment_id>"}
          In this case, labels rules and filter_ids will be skipped
        default: False
      type_names:
        default: [ ]
        description: |
          Type names for which the workflow will execute the workflow,
          especially start_subclouds_executions and
          wait_for_execution_end_and_delete_deployment_group operation.
          By default the operation will execute for nodes of type
          cloudify.nodes.WRCP.WRCP
      node_ids:
        default: [ ]
        description: |
          Node template IDs for which the workflow will execute the workflow,
          especially start_subclouds_executions and
          wait_for_execution_end_and_delete_deployment_group operation.
      node_instance_ids:
        default: [ ]
        description: |
          Node instance IDs for which the workflow will execute the workflow,
          especially start_subclouds_executions and
          wait_for_execution_end_and_delete_deployment_group operation.
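
The interaction between labels, filter_ids and run_on_all_subclouds can be summarized in a short sketch. The function name and the returned structure are hypothetical, and raising an error when no selector is given is an assumption of this example, not documented behaviour.

```python
def build_group_selection(parent_deployment_id, labels=None, filter_ids=None,
                          run_on_all_subclouds=False):
    """Return the selection used to build the temporary deployment group."""
    if run_on_all_subclouds:
        # labels and filter_ids are skipped; only the parent rule applies.
        return {
            "labels": {"csys-obj-parent": parent_deployment_id},
            "filter_ids": [],
        }
    labels = dict(labels or {})
    filter_ids = list(filter_ids or [])
    if not labels and not filter_ids:
        # Assumption: an empty selection is rejected rather than matching everything.
        raise ValueError("provide labels, filter_ids, or run_on_all_subclouds")
    return {"labels": labels, "filter_ids": filter_ids}
```

Because matching currently runs across all deployments, narrow labels or a dedicated filter keep the batch execution limited to the intended subclouds.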

Kubernetes Cluster Upgrade Automation (upgrade_kubernetes)

Upgrade the Kubernetes version in a standalone or a DC system.

Parameters:

  • upgrade_controllers (boolean): whether to upgrade the system controller in a DC system, or all hosts in a non-DC system.
  • upgrade_subclouds (boolean): whether to upgrade the subclouds in a DC system.
  • version (string): the target version to upgrade to. Do not include the letter v, e.g. if upgrading to v1.21.8, this input should be 1.21.8.
  • alarm_restrictions (string): sets how the Kubernetes version upgrade orchestration behaves when alarms are present:
    • STRICT: some basic alarms are ignored;
    • RELAXED: non-management-affecting alarms are ignored.
  • instance_action (string): only applies when the wr-openstack application is loaded, with virtual machines (instances) running on worker hosts. It specifies how the strategy handles those instances during its execution.
    • STOP-START: instances will be stopped before the host lock operation following the upgrade and then started again following the host unlock;
    • MIGRATE: instances will be migrated to another host while their host is being upgraded.
  • max_parallel_subclouds (integer): sets the maximum number of subclouds that can be upgraded in parallel. This parameter is overridden if subcloud_group is used.
  • max_parallel_worker (integer): this option applies to the parallel worker apply type selection, to specify the maximum worker hosts to upgrade in parallel.
  • subcloud_apply_type (string): determines whether the subclouds are upgraded in parallel, or serially. This parameter is overridden if subcloud_group is used.
    • SERIAL: subclouds will be upgraded one at a time;
    • PARALLEL: subclouds will be upgraded in parallel.
  • subcloud_force (boolean): ignore the audit status of subclouds when selecting them for orchestration. This allows subclouds that are in-sync to be orchestrated.
  • subcloud_group (string): optionally pass the name or ID of a subcloud group to the dcmanager kube-upgrade-strategy command. This results in a strategy that is only applied to all subclouds in the specified group. If not specified, all subcloud groups are upgraded.
  • worker_apply_type (string): this option specifies the host concurrency of the Kubernetes version upgrade strategy.
    • SERIAL: worker hosts will be upgraded one at a time;
    • PARALLEL: worker hosts will be upgraded in parallel.

Note: most of these parameters are directly inserted into WRCP commands, so additional details about them can be found in the platform documentation.

Summary:

The workflow will invoke the sw-manager and dcmanager (if in a DC system) WRCP APIs to run a Kubernetes upgrade following the chosen parameters. The target version must be available in the platform and higher than the currently installed version by one step, i.e. it’s not possible to skip a Kubernetes version when upgrading.
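
The version rule can be sketched as follows. This is an illustration only: it assumes that "one step" means the next entry in the list of versions available on the platform, and the function names and the way that list is obtained are not taken from the plugin.

```python
def parse_version(text):
    """Turn '1.21.8' (or 'v1.21.8') into a comparable tuple of integers."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def is_valid_target(current, target, available):
    """Return True if `target` is the next available version after `current`."""
    cur, tgt = parse_version(current), parse_version(target)
    # Keep only versions newer than the installed one, in ascending order.
    newer = sorted({parse_version(v) for v in available if parse_version(v) > cur})
    return bool(newer) and newer[0] == tgt
```

With available versions 1.21.8, 1.22.5 and 1.23.1 and 1.21.8 installed, only 1.22.5 is accepted as a target; 1.23.1 would skip a version.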

For DC systems, this workflow should only be executed from the System Controller environment in Conductor, as it leverages the DC APIs to apply the upgrade on its subclouds.

As the Kubernetes upgrade is a time-consuming process, expect the workflow to retry some operations, such as polling the upgrade progress in the platform, during its execution.

Analytics Deployment Automation (install_wra, upgrade_wra, uninstall_wra)

Install WRA:

The install_wra workflow performs the following steps:

  • upload_wra: Retrieves the WRA tarball and uploads it to the system.
  • update_oidc: Sets up OIDC DEX.
  • apply_security_overrides: (Subcloud only) Copies and applies the security overrides from the system controller.
  • set_labels_to_hosts: Assigns host labels to controllers and workers.
  • allocate_resources_and_apply: Performs resource allocations based on the provided helm overrides.

Parameters:

      wra_tgz_url:
        description: URL containing the path to a WRA .tar.gz file.
        type: string
        default: ''
      oidc:
        description: Whether the OIDC DEX login setup is updated.
        type: boolean
        default: False
      storage_resource_allocations:
        description: >
          The list of helm overrides for storage resources. For example:
          [
            {
                "file_url": "elasticsearch-data.yaml",
                "helm_chart_name": "elasticsearch-data"
            },
            {
                "file_url": "arbitrary-overrides.yaml",
                "helm_chart_name": "placeholder"
            }
          ]
        type: list
        default: []

Upgrade WRA:

The upgrade_wra workflow supports both the “update” and “upgrade” capabilities. An “update” is defined by an increase of the minor version number, for example from 21.12-0 to 21.12-1, while an “upgrade” is defined by an increase of the major version number, for example from 21.12 to 22.06.

The operation is chosen based on the provided WRA tarball, relative to the currently installed WRA version.
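
The distinction between the two operations can be expressed as a small comparison. The sketch below assumes version strings of the form shown in the examples (a major part such as 21.12, optionally followed by a dash and a minor number); the function names are hypothetical and the plugin may derive the versions differently, for instance from the tarball metadata.

```python
def split_wra_version(version):
    """Split '21.12-1' into ((21, 12), 1); a missing minor part counts as 0."""
    major, _, minor = version.partition("-")
    major_key = tuple(int(part) for part in major.split("."))
    return major_key, (int(minor) if minor else 0)


def classify_wra_change(installed, candidate):
    """Return 'upgrade', 'update', or 'none' for the candidate version."""
    inst_major, inst_minor = split_wra_version(installed)
    cand_major, cand_minor = split_wra_version(candidate)
    if cand_major > inst_major:
        return "upgrade"  # major version increased, e.g. 21.12 to 22.06
    if cand_major == inst_major and cand_minor > inst_minor:
        return "update"   # only the minor number increased, e.g. -0 to -1
    return "none"         # same or older version
```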

These are the steps to update WRA:

  • Retrieve the Wind River Studio Analytics application tarball.
  • Update the application.
  • Verify that the update process was successful.

These are the steps to upgrade WRA:

  • Retrieve the Wind River Studio Analytics application tarball.
  • Check if the installed WRA application is up to date with the latest minor version.
  • Apply helm overrides.
  • If upgrading a subcloud, copy the security overrides from the System Controller and apply them to the subcloud.
  • Upgrade the application.
  • Monitor the upgrade process.
  • Run applicable post-requisites.

WRA upgrades require the same inputs as a clean install (install_wra), while updates only require the URL of the WRA tarball to be installed.

Parameters:

      wra_tgz_url:
        description: URL containing the path to a WRA .tar.gz file.
        type: string
        default: ''
      oidc:
        description: Whether the OIDC DEX login setup is updated.
        type: boolean
        default: False
      storage_resource_allocations:
        description: >
          The list of helm overrides for storage resources. For example:
          [
            {
                "file_url": "elasticsearch-data.yaml",
                "helm_chart_name": "elasticsearch-data"
            },
            {
                "file_url": "arbitrary-overrides.yaml",
                "helm_chart_name": "placeholder"
            }
          ]
        type: list
        default: []

Uninstall WRA:

The uninstall_wra workflow performs the following steps, which apply to both system controllers and subclouds:

  • Removes the application.
  • Deletes the application after the removal is finished.
  • Removes the labels from the controllers.
  • Cleans up unused docker containers.

The workflow does not require any parameters.

NetApp Trident Storage Upgrade Automation (upgrade_trident)

Overview

Astra Trident deploys in Kubernetes clusters as pods and provides dynamic storage orchestration services for your Kubernetes workloads. It enables your containerized applications to quickly and easily consume persistent storage from NetApp’s broad portfolio that includes ONTAP (AFF/FAS/Select/Cloud/Amazon FSx for NetApp ONTAP), Element software (NetApp HCI/SolidFire), as well as the Azure NetApp Files service, and Cloud Volumes Service on Google Cloud.

The NetApp Trident upgrade process is integrated as a new workflow. This workflow can be broken in a series of relatively independent tasks that are called in order:

  1. Trident Check: Checks whether Trident is being used and also checks whether it is up to date. This step decides whether we should continue.
  2. Trident Health Check: Checks whether Trident is in a working state just to make sure it can be safely upgraded.
  3. Trident Upgrade: Reinstalls Trident to perform an upgrade. When you uninstall Trident, the Persistent Volume Claim (PVC) and Persistent Volume (PV) used by the Astra Trident deployment are not deleted. PVs that have already been provisioned will remain available while Astra Trident is offline, and Astra Trident will provision volumes for any PVCs that are created in the interim once it is back online.
  4. Trident Health Check: Checks whether Trident is in a working state just to make sure the upgrade worked as intended.

These steps are shown in the diagram below:

[Figure: NetApp Trident Update]
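
The ordering of the four tasks can also be written down as a short sketch. The function and its callable arguments are hypothetical stand-ins for the plugin's operations; only the sequence of checks comes from the description above.

```python
def run_trident_upgrade(in_use, up_to_date, is_healthy, reinstall):
    """Run the four tasks in order; each argument is a zero-argument callable."""
    # 1. Trident Check: decide whether there is anything to do.
    if not in_use() or up_to_date():
        return "skipped"
    # 2. Trident Health Check: refuse to touch an unhealthy installation.
    if not is_healthy():
        raise RuntimeError("Trident is not healthy; upgrade not started")
    # 3. Trident Upgrade: reinstall Trident.
    reinstall()
    # 4. Trident Health Check: confirm that the upgrade worked.
    if not is_healthy():
        raise RuntimeError("Trident is not healthy after the upgrade")
    return "upgraded"
```

Passing the checks as callables keeps the ordering logic separate from how each check is actually carried out on the system.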

How to run the upgrade workflow

Upgrades can be performed via the “upgrade_trident” workflow. The user only needs to provide the target’s deployment ID:

cfy executions start upgrade_trident --deployment-id <DEPLOYMENT_ID> 

When a platform upgrade is performed, the newer WRCP version already comes with the newer version of tridentctl which is then used to upgrade Trident. The workflow does not come with any rollback functionality as that would require downgrading tridentctl, which might make it incompatible with the Kubernetes version being used in the system.

Certificate Audit (audit_certificates)

Summary

This workflow will scan for all certificates it can find in a WRCP system, including those stored in the Kubernetes cluster’s secrets. These certificates can include the local registry, SSL, DC Admin, etcd, cert-manager, kube-system certificates and more, depending on which certificates are stored in the platform.

After the workflow is done executing, the certificate data can be verified in each node instance’s runtime properties, or in the Certificates Audit page in the web GUI. For each certificate, the following fields will be visible:

  • Name
  • Expiry date
  • CA
  • Region
  • Deployment
  • Site
  • Type
  • Auto-renew

Usage

To run this workflow, simply choose the desired environment(s) and run the audit_certificates workflow from either the environment’s page or the environments list page (bulk action).

The workflow does not require any parameters.

WRCP Provisioning (install_wrcp)

This workflow can provision a new WRCP installation on servers with Hewlett Packard Enterprise (HPE) hardware only. The supported WRCP configurations are AIO-SX and AIO-DX. The requirements are:

  • A WRCP base ISO to be customized
  • An instance of CFS (FileServer, CustomISO API) accessible by the target server
  • An HPE server with full support for the Redfish API (Gen9 or above) with iLO 4 (version 2.30 or later) or iLO 5.

Note: The WRCP Plugin is only supported on IPv4 installations.

This workflow includes the following 4 steps:

  1. Custom ISO: Using the input files, creates a new custom WRCP ISO to be used in the installation.
  2. Redfish: Inserts the custom ISO in the virtual DVD drive and reboots the server.
  3. Bootstrap and DM: After the initial boot, starts the WRCP bootstrapping and then the Deployment Manager.
  4. Custom ISO cleanup: After all the steps, the ISO created in the first step is deleted.

Node Types

windriver.nodes.CustomISO This node represents the step that builds a custom ISO from a base ISO together with the information provided in the CIQ files. Since this step builds the ISO, it is supposed to be the first one in the System Install blueprint.

windriver.nodes.RedfishBMC This node is responsible for communicating with the BMC. Using the information present in the CIQ files, this step inserts the ISO previously generated by the CustomISO step in the Virtual Drive and then reboots the server.

windriver.nodes.controller.DM This node defines the operations that will run the initial bootstrap in WRCP, and then run the Deployment Manager. This step still uses the information contained in the CIQ files.

windriver.nodes.controller.SystemController This node will define the operations necessary for subcloud installation in a future release.

windriver.nodes.CustomISOCleaner This node deletes the ISO created by the CustomISO node. It is supposed to be the last node in the blueprint.

Runtime Properties:

  • installer_image: WRCP ISO image url
  • client_config: It contains client properties like auth_url, region_name, username, api_key, etc.
  • client_config_https: Almost the same as above, but treats https endpoints specially.
  • bootstrap_config: The information necessary for the WRCP bootstrapping process, e.g. ssh_config, bootstrap_values, deployment_config, etc.
  • bmc_config: BMC’s access information and credentials
  • iso_config: Data necessary to find and customize the WRCP ISO.
  • configure_config: Special information about the custom
  • ciq_file_path: Path to find CIQ files inside CFS server
  • golden_templates_config: Path to find Golden Templates files inside the CFS server
  • dm_state: Stores DM’s last state.
  • bootstrap_state: Stores the last state of bootstrap process
  • bmc_state: Relevant information about BMC state like power state, redfish state, virtual media state, etc…
  • ssh_state: Information about SSH state.

How to use

  1. Upload the blueprint to Conductor
  2. Fill in the CIQ files
  3. Provide valid localhost.yaml and deployment-config.yaml files and rename them, respectively, to bootstrap_values.yaml.v1.jinja2 and deployment_config.yaml.v2.jinja2. Place them in the golden_configs directory.
  4. Upload the CIQ files and golden configs to CFS (FileServer and CustomISO API)
  5. Upload Credentials
  • Create a secret named wrcp_license
    • It must contain a full copy of a compatible WRCP license
    • It must be named wrcp_license
  • Create a secret named ‘global_secrets’ and fill in the following fields:
# This CIQ secrets Template applies to a single National Datacenter (NDC)
site_name: "global"               # (String) required, Name of the NDC site for this secrets file

registry_username: username       # (String) required, username for access to docker images registry
registry_password: password       # (String) required, password for access to docker images registry

# This section lists BMC secrets per server
server_list:
- service_tag: "ndc_server_1"     # (String) required with quotes, value should be lower case only, e.g. '1a2b3c4'
  bmc_user: user1                 # (String) required, username of the BMC account to use
  bmc_password: password1         # (String) required, password of the BMC account to use

- service_tag: "ndc_server_2"
  bmc_user: user1
  bmc_password: password1

- service_tag: "ndc_server_3"
  bmc_user: user1
  bmc_password: password1
  • Create a secret named ‘CUSTOM_NAME_secrets’ and fill in the following fields:
    • CUSTOM_NAME is the name of the YAML file (without its extension) given in the ciq_file_list blueprint input. Using the example ciq_file_list below, the secret should be named wr-duplex-1_2_secrets.
# This CIQ secrets Template applies to a single Regional Datacenter (RDC)
# RDC CIQ schema: schemas/regional_datacenter_ciq_secrets.spec.v1.json

site_name: wr-duplex-1      # (String) required, Name of the RDC site for this secrets file

server_list:
- service_tag: ''                   # (String) required with quotes, value should be lower case only, e.g. '1a2b3c4'
  bmc_user: user                    # (String) required, username of the BMC account to use
  bmc_password: password            # (String) required, password of the BMC account to use

- service_tag: ''
  bmc_user: user
  bmc_password: password

initial_sysadmin_password: syspwd*  # (String) required, the initial password of controller-0 of this cell site
  6. Create Deployment. While filling in the inputs, pay special attention to the following two fields. Here is an example:
ciq_file_list:
 [
     {
         "file_name": "wr-duplex-1_2_ndc.yaml",    
         "file_type": "NDC"
     },
     {
         "file_name": "wr-duplex-1_2.yaml",
         "file_type": "RDC"
     }
 ]

golden_config_file_list:
 [
     {
         "file_name":"bootstrap_values.yaml.v1.jinja2",
         "file_type":"RDCboostrapValues"
     },
     {
         "file_name":"deployment_config.yaml.v2.jinja2",
         "file_type":"RDCdeploymentConfig"
     }
 ]
  7. Shut down the controller-1 target server.
  8. Then, run the “install_wrcp” workflow.
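
The naming rule for the ‘CUSTOM_NAME_secrets’ secret in step 5 can be checked with a few lines of code. This is a sketch under two assumptions: the secret name is the CIQ file name without its extension plus the suffix _secrets, and only entries with file_type RDC need such a secret (the NDC file uses ‘global_secrets’). The function names are hypothetical.

```python
from pathlib import PurePosixPath


def secret_name_for(ciq_file_name):
    """Derive the secret name from a CIQ file name, e.g. 'x.yaml' -> 'x_secrets'."""
    return f"{PurePosixPath(ciq_file_name).stem}_secrets"


def rdc_secret_names(ciq_file_list):
    """Return the expected secret names for all RDC entries of ciq_file_list."""
    return [
        secret_name_for(entry["file_name"])
        for entry in ciq_file_list
        if entry["file_type"] == "RDC"
    ]
```

Applied to the example ciq_file_list from step 6, the result is a single name, wr-duplex-1_2_secrets.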