Keeping the nodes in SUSE OpenStack Cloud up-to-date requires an appropriate setup of the update and pool repositories and the deployment of either the Updater barclamp or the SUSE Manager barclamp. For details, see Section 5.2, “Update and Pool Repositories”, Section 9.4.1, “Deploying Node Updates with the Updater Barclamp”, and Section 9.4.2, “Configuring Node Updates with the SUSE Manager Barclamp”.
If one of those barclamps is deployed, patches are installed on the nodes. Installing patches that do not require a reboot of a node does not cause any service interruption. If a patch (for example, a kernel update) requires a reboot after installation, services running on the rebooted machine will not be available within SUSE OpenStack Cloud. Therefore, it is strongly recommended to install those patches during a maintenance window.
As of SUSE OpenStack Cloud 7 it is not possible to put SUSE OpenStack Cloud into “Maintenance Mode”.
While the Administration Server is offline, it is not possible to deploy new nodes. However, rebooting the Administration Server has no effect on starting instances or on instances already running.
The consequences of rebooting a Control Node depend on the services running on that node:
Database, Keystone, RabbitMQ, Glance, Nova: No new instances can be started.
Swift: No object storage data is available. If Glance uses Swift, it will not be possible to start new instances.
Cinder, Ceph: No block storage data is available.
Neutron: No new instances can be started. On running instances the network will be unavailable.
Horizon: Horizon will be unavailable. Starting and managing instances can be done with the command line tools (see the example below).
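If only the Control Node running Horizon is down, the other OpenStack APIs remain reachable and the command line clients can take over basic instance management. A minimal sketch, assuming a credentials file (here hypothetically named ~/.openrc) is available on the machine you work from:
source ~/.openrc                  # load OpenStack credentials (hypothetical file name)
openstack server list             # show instances and their current states
openstack server reboot INSTANCE  # reboot a single instance by name or ID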
Whenever a Compute Node is rebooted, all instances running on that particular node will be shut down and must be manually restarted. Therefore it is recommended to “evacuate” the node by migrating instances to another node, before rebooting it.
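How instances are best evacuated depends on your setup; the following is only a sketch, assuming live migration is configured and NODE stands for the Compute Node's host name as known to Nova:
openstack server list --all-projects --host NODE  # list the instances currently running on the node
nova host-evacuate-live NODE                      # live-migrate all of them to other Compute Nodes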
In case you need to restart your complete SUSE OpenStack Cloud (after a complete shut down or a power outage), the nodes and services need to be started in the following order:
Control Node/Cluster on which the Database is deployed
Control Node/Cluster on which RabbitMQ is deployed
Control Node/Cluster on which Keystone is deployed
For Swift:
Storage Node on which the swift-storage role is deployed
Storage Node on which the swift-proxy role is deployed
For Ceph:
Storage Node on which the ceph-mon role is deployed
Storage Node on which the ceph-osd role is deployed
Storage Node on which the ceph-radosgw and ceph-mds roles are deployed (if deployed on different nodes: in either order)
Any remaining Control Node/Cluster. The following additional rules apply:
The Control Node/Cluster on which the neutron-server role is deployed needs to be started before starting the node/cluster on which the neutron-l3 role is deployed.
The Control Node/Cluster on which the nova-controller role is deployed needs to be started before starting the node/cluster on which Heat is deployed.
Compute Nodes
If multiple roles are deployed on a single Control Node, the services are automatically started in the correct order on that node. If you have more than one node with multiple roles, make sure they are started as closely as possible to the order listed above.
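Before starting the next group of nodes, it is worth verifying that the previous one actually came up. A minimal spot check, assuming an HA setup with Pacemaker and systemd-managed OpenStack services (the service name below is only an example):
crm_mon -1                           # on a cluster node: all resources should be started
systemctl status openstack-nova-api  # on a non-clustered node: spot-check individual services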
If you need to shut down SUSE OpenStack Cloud, the nodes and services need to be terminated in the reverse order of the start-up sequence:
Compute Nodes
Control Node/Cluster on which Heat is deployed
Control Node/Cluster on which the nova-controller role is deployed
Control Node/Cluster on which the neutron-l3 role is deployed
All Control Node(s)/Cluster(s) on which none of the following services is deployed: Database, RabbitMQ, or Keystone.
For Swift:
Storage Node on which the swift-proxy role is deployed
Storage Node on which the swift-storage role is deployed
For Ceph:
Storage Node on which the ceph-radosgw and ceph-mds roles are deployed (if deployed on different nodes: in either order)
Storage Node on which the ceph-osd role is deployed
Storage Node on which the ceph-mon role is deployed
Control Node/Cluster on which Keystone is deployed
Control Node/Cluster on which RabbitMQ is deployed
Control Node/Cluster on which the Database is deployed
Upgrading from SUSE OpenStack Cloud 6 to SUSE OpenStack Cloud 7 can either be done via a Web interface or from the command line. Starting with SUSE OpenStack Cloud 7, a “non-disruptive” upgrade is supported when the requirements listed at Non-Disruptive Upgrade Requirements are met. The non-disruptive upgrade guarantees fully functional SUSE OpenStack Cloud operation during the upgrade procedure. The only feature that is not supported during the non-disruptive upgrade procedure is the deployment of additional nodes.
If the requirements for a non-disruptive upgrade are not met, the upgrade procedure is done in “normal mode”. When live-migration is set up, instances are migrated to another node before the respective Compute Node is updated, to ensure continuous operation. However, you will not be able to access instances during the upgrade of the Control Nodes.
When starting the upgrade process, several checks are performed to determine whether the SUSE OpenStack Cloud is in an upgradeable state and whether a non-disruptive update would be supported:
All nodes need to have the latest SUSE OpenStack Cloud 6 updates and the latest SUSE Linux Enterprise Server 12 SP2 updates installed. If this is not the case, refer to Section 9.4.1, “Deploying Node Updates with the Updater Barclamp” for instructions on how to update.
All allocated nodes need to be turned on and have to be in state “ready”.
All barclamp proposals need to have been successfully deployed. In case a proposal is in state “failed”, the upgrade procedure will refuse to start. Fix the issue or—if possible—remove the proposal.
In case the pacemaker barclamp is deployed, all clusters need to be in a healthy state.
The dns-server role must be applied to the Administration Server.
The following repositories need to be available on a server that is accessible from the Administration Server. The HA repositories are only needed if you have an HA setup. It is recommended to use the same server that also hosts the respective repositories of the current version.
SUSE-OpenStack-Cloud-7-Pool
SUSE-OpenStack-Cloud-7-Update
SLES12-SP2-Pool
SLES12-SP2-Update
SLE-HA12-SP2-Pool (for HA setups only)
SLE-HA12-SP2-Update (for HA setups only)
Do not add these repositories to the SUSE OpenStack Cloud repository configuration, yet. This needs to be done during the upgrade procedure.
All Control Nodes need to be set up highly available.
Live-migration support needs to be configured and enabled for the Compute Nodes. The amount of free resources (CPU and RAM) on the Compute Nodes needs to be sufficient to evacuate the nodes one by one (see the sketch below for a quick capacity check).
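A rough way to estimate whether the remaining Compute Nodes can absorb an evacuated node's instances is to look at the aggregate hypervisor statistics; the exact headroom you need depends on your flavors and overcommit settings:
openstack hypervisor stats show  # aggregate vCPU, RAM, and disk usage across all hypervisors
openstack hypervisor list        # per-node overview of the Compute Nodes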
TO BE DONE
The upgrade procedure on the command line is performed by using the program crowbarctl. For general help, run crowbarctl help. To get help on a certain subcommand, run crowbarctl COMMAND help.
To review the progress of the upgrade procedure, you may call crowbarctl upgrade status at any time. Steps may have three states: pending, running, and passed.
To start the upgrade procedure from the command line, log in to the Administration Server.
Perform the preliminary checks to determine whether the upgrade requirements are met:
crowbarctl upgrade prechecks
The command's result is shown in a table. Make sure the Errors column is empty for all checks marked as required; if it is not, fix the reported issues and run the precheck command again afterwards. Do not proceed before all checks are passed.
crowbarctl upgrade prechecks
+-------------------------------+--------+----------+--------+------+
| Check ID                      | Passed | Required | Errors | Help |
+-------------------------------+--------+----------+--------+------+
| network_checks                | true   | true     |        |      |
| cloud_healthy                 | true   | true     |        |      |
| maintenance_updates_installed | true   | true     |        |      |
| compute_status                | true   | false    |        |      |
| ha_configured                 | true   | false    |        |      |
| clusters_healthy              | true   | true     |        |      |
+-------------------------------+--------+----------+--------+------+
Depending on the outcome of the checks, it is automatically decided whether the upgrade procedure will continue in non-disruptive or in normal mode.
Prepare the nodes by transitioning them into the “upgrade” state and stopping the chef daemon:
crowbarctl upgrade prepare
Depending on the size of your SUSE OpenStack Cloud deployment, this step may take some time. Use the command crowbarctl upgrade status to monitor the status of the process named steps.prepare.status. It needs to be in state passed before you proceed:
crowbarctl upgrade status
+--------------------------------+----------------+
| Status                         | Value          |
+--------------------------------+----------------+
| current_step                   | backup_crowbar |
| current_substep                |                |
| current_node                   |                |
| remaining_nodes                |                |
| upgraded_nodes                 |                |
| crowbar_backup                 |                |
| openstack_backup               |                |
| steps.prechecks.status         | passed         |
| steps.prepare.status           | passed         |
| steps.backup_crowbar.status    | pending        |
| steps.repocheck_crowbar.status | pending        |
| steps.admin.status             | pending        |
| steps.database.status          | pending        |
| steps.repocheck_nodes.status   | pending        |
| steps.services.status          | pending        |
| steps.backup_openstack.status  | pending        |
| steps.nodes.status             | pending        |
+--------------------------------+----------------+
Create a backup of the existing Administration Server installation. In case something goes wrong during the upgrade procedure of the Administration Server, you can restore the original state from this backup with the command crowbarctl backup restore NAME:
crowbarctl upgrade backup crowbar
To list all existing backups including the one you have just created, run the following command:
crowbarctl backup list
+----------------------------+--------------------------+--------+---------+
| Name                       | Created                  | Size   | Version |
+----------------------------+--------------------------+--------+---------+
| crowbar_upgrade_1486116507 | 2017-02-03T10:08:30.721Z | 209 KB | 3.0     |
+----------------------------+--------------------------+--------+---------+
This step prepares the upgrade of the Administration Server by checking the availability of the update and pool repositories for SUSE OpenStack Cloud 7 and SUSE Linux Enterprise Server 12 SP2. Run the following command:
crowbarctl upgrade repocheck crowbar
+---------------------------------+--------------------------------+
| Status                          | Value                          |
+---------------------------------+--------------------------------+
| os.available                    | false                          |
| os.repos                        | SLES12-SP2-Pool                |
|                                 | SLES12-SP2-Updates             |
| os.errors.x86_64.missing        | SLES12-SP2-Pool                |
|                                 | SLES12-SP2-Updates             |
| openstack.available             | false                          |
| openstack.repos                 | SUSE-OpenStack-Cloud-7-Pool    |
|                                 | SUSE-OpenStack-Cloud-7-Updates |
| openstack.errors.x86_64.missing | SUSE-OpenStack-Cloud-7-Pool    |
|                                 | SUSE-OpenStack-Cloud-7-Updates |
+---------------------------------+--------------------------------+
All four required repositories are reported as missing, because they have not yet been added to the Crowbar configuration. To add them to the Administration Server proceed as follows.
Note that this step is for setting up the repositories for the Administration Server, not for the nodes in SUSE OpenStack Cloud (this will be done in a subsequent step).
Start yast repositories and replace the repositories SLES12-SP1-Pool and SLES12-SP1-Updates with the respective SP2 repositories. If you prefer to use zypper over YaST, you may alternatively make the change with zypper; one possible sequence is sketched after the next paragraph.
Next, replace the SUSE-OpenStack-Cloud-6 update and pool repositories with the respective SUSE OpenStack Cloud 7 versions.
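For the zypper-based route, one possible approach is to remove the outdated repositories and add their successors. The aliases below follow the naming used in this section, and the URLs are placeholders for wherever you host the repositories:
zypper lr -u                                  # review the current repository setup
zypper rr SLES12-SP1-Pool SLES12-SP1-Updates  # drop the SP1 repositories
zypper ar http://repo.example.com/SLES12-SP2-Pool SLES12-SP2-Pool
zypper ar http://repo.example.com/SLES12-SP2-Updates SLES12-SP2-Updates
zypper rr SUSE-OpenStack-Cloud-6-Pool SUSE-OpenStack-Cloud-6-Updates
zypper ar http://repo.example.com/SUSE-OpenStack-Cloud-7-Pool SUSE-OpenStack-Cloud-7-Pool
zypper ar http://repo.example.com/SUSE-OpenStack-Cloud-7-Updates SUSE-OpenStack-Cloud-7-Updates
zypper ref                                    # refresh the newly added repositories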
Once the repository configuration on the Administration Server has been updated, run the command to check the repositories again. If the configuration is correct, the result should look like the following:
crowbarctl upgrade repocheck crowbar
+---------------------+--------------------------------+
| Status              | Value                          |
+---------------------+--------------------------------+
| os.available        | true                           |
| os.repos            | SLES12-SP2-Pool                |
|                     | SLES12-SP2-Updates             |
| openstack.available | true                           |
| openstack.repos     | SUSE-OpenStack-Cloud-7-Pool    |
|                     | SUSE-OpenStack-Cloud-7-Updates |
+---------------------+--------------------------------+
Now that the repositories are available, the Administration Server itself will be upgraded. The update will run in the background using zypper dup. Once all packages have been upgraded, the Administration Server will be rebooted and you will be logged out. To start the upgrade, run:
crowbarctl upgrade admin
Starting with SUSE OpenStack Cloud 7, Crowbar uses a PostgreSQL database to store its data. With this step, the database is created on the Administration Server. Alternatively a database on a remote host can be used.
To create the database on the Administration Server proceed as follows:
Log in to the Administration Server.
To create the database on the Administration Server with the default credentials (crowbar/crowbar) for the database, run:
crowbarctl upgrade database new
To use a different user name and password, run the following command instead:
crowbarctl upgrade database new \
  --db-username=USERNAME --db-password=PASSWORD
To connect to an existing PostgreSQL database, use the following command rather than creating a new database:
crowbarctl upgrade database connect --db-username=USERNAME \
  --db-password=PASSWORD --database=DBNAME \
  --host=IP_or_FQDN --port=PORT
After the Administration Server has been successfully updated, the Control Nodes and Compute Nodes will be upgraded. First, the availability of the repositories used to provide packages for the SUSE OpenStack Cloud nodes is tested.
Note that the configuration for these repositories differs from the one for the Administration Server that was already done in a previous step. In this step the repository locations are made available to Crowbar rather than to libzypp on the Administration Server. To check the repository configuration run the following command:
crowbarctl upgrade repocheck nodes
+---------------------------------+--------------------------------+
| Status                          | Value                          |
+---------------------------------+--------------------------------+
| ha.available                    | false                          |
| ha.repos                        | SLES12-SP2-HA-Pool             |
|                                 | SLES12-SP2-HA-Updates          |
| ha.errors.x86_64.missing        | SLES12-SP2-HA-Pool             |
|                                 | SLES12-SP2-HA-Updates          |
| os.available                    | false                          |
| os.repos                        | SLES12-SP2-Pool                |
|                                 | SLES12-SP2-Updates             |
| os.errors.x86_64.missing        | SLES12-SP2-Pool                |
|                                 | SLES12-SP2-Updates             |
| openstack.available             | false                          |
| openstack.repos                 | SUSE-OpenStack-Cloud-7-Pool    |
|                                 | SUSE-OpenStack-Cloud-7-Updates |
| openstack.errors.x86_64.missing | SUSE-OpenStack-Cloud-7-Pool    |
|                                 | SUSE-OpenStack-Cloud-7-Updates |
+---------------------------------+--------------------------------+
To update the locations for the listed repositories, start yast crowbar and proceed as described in Section 7.4.
Once the repository configuration for Crowbar has been updated, run the command to check the repositories again to determine whether the current configuration is correct.
crowbarctl upgrade repocheck nodes
+---------------------+--------------------------------+
| Status              | Value                          |
+---------------------+--------------------------------+
| ha.available        | true                           |
| ha.repos            | SLE12-SP2-HA-Pool              |
|                     | SLE12-SP2-HA-Updates           |
| os.available        | true                           |
| os.repos            | SLES12-SP2-Pool                |
|                     | SLES12-SP2-Updates             |
| openstack.available | true                           |
| openstack.repos     | SUSE-OpenStack-Cloud-7-Pool    |
|                     | SUSE-OpenStack-Cloud-7-Updates |
+---------------------+--------------------------------+
To PXE boot new nodes, an additional SUSE Linux Enterprise Server 12 SP2 repository (a copy of the installation system) is required. Although not needed during the upgrade procedure, it is recommended to set up this directory now. Refer to Section 5.1, “Copying the Product Media Repositories” for details. If you had also copied the SUSE OpenStack Cloud 6 installation media (optional), you may want to provide the SUSE OpenStack Cloud 7 media the same way.
Once the upgrade procedure has been successfully finished, you may delete the previous copies of the installation media in /srv/tftpboot/suse-12.1/x86_64/install and /srv/tftpboot/suse-12.1/x86_64/repos/Cloud.
To ensure the status of the nodes does not change during the upgrade process, the majority of the OpenStack services will be stopped on the nodes now. As a result, the OpenStack API will no longer be accessible. The instances, however, will continue to run and will also be accessible. Run the following command:
crowbarctl upgrade services
This step takes a while to finish. Monitor the process by running crowbarctl upgrade status. Do not proceed before steps.services.status is set to passed.
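Instead of re-running the status command by hand, you can poll it with the standard watch utility, for example:
watch -n 30 crowbarctl upgrade status  # refresh the status table every 30 seconds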
The last step before upgrading the nodes is to make a backup of the OpenStack PostgreSQL database. The database dump will be stored on the Administration Server and can be used to restore the database in case something goes wrong during the upgrade.
crowbarctl upgrade backup openstack
The final step of the upgrade procedure is upgrading the nodes. To start the process, enter:
crowbarctl upgrade nodes all
The upgrade process runs in the background and can be queried with crowbarctl upgrade status. Depending on the size of your SUSE OpenStack Cloud it may take several hours, especially when performing a non-disruptive update. In that case, the Compute Nodes are updated one by one after instances have been live-migrated to other nodes.
Instead of upgrading all nodes, you may also upgrade the Control Nodes first and individual Compute Nodes afterwards. Refer to crowbarctl upgrade nodes --help for details; a possible sequence is sketched below.
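The following staged sequence assumes that crowbarctl upgrade nodes accepts controllers and individual node names as arguments, so verify the exact syntax with crowbarctl upgrade nodes --help first:
crowbarctl upgrade nodes controllers  # upgrade the Control Nodes/clusters first
crowbarctl upgrade nodes NODE_NAME    # then upgrade Compute Nodes individually
crowbarctl upgrade status             # monitor progress between the steps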
There are a few issues to pay attention to when making an existing SUSE OpenStack Cloud deployment highly available (by setting up HA clusters and moving roles to these clusters). To make existing services highly available, proceed as follows. Note that moving to an HA setup cannot be done without SUSE OpenStack Cloud service interruption, because it requires OpenStack components to be restarted.
Teaming network mode is required for an HA setup of SUSE OpenStack Cloud. If you are planning to move your cloud to an HA setup at a later point in time, make sure to deploy SUSE OpenStack Cloud with teaming network mode from the beginning. Otherwise a migration to an HA setup is not supported.
Make sure you have read Section 1.5, “HA Setup” and Section 2.6, “High Availability” of this manual and have taken any appropriate action.
Make the HA repositories available on the Administration Server as described in Section 5.2, “Update and Pool Repositories”. Run the command chef-client afterward.
Set up your cluster(s) as described in Section 10.1, “Deploying Pacemaker (Optional, HA Setup Only)”.
To move a particular role from a regular control node to a cluster, you need to stop the associated service(s) before re-deploying the role on a cluster:
Log in to each node on which the role is deployed and stop its associated service(s) (a role can have multiple services). Do so by running the service's start/stop script with the stop argument, for example:
rcopenstack-keystone stop
See Appendix C, Roles and Services in SUSE OpenStack Cloud for a list of roles, services and start/stop scripts.
The following roles need additional treatment:
Stop the database on the node on which the Database barclamp is deployed with the command:
rcpostgresql stop
Copy /var/lib/pgsql to a temporary location on the node, for example:
cp -ax /var/lib/pgsql /tmp
Redeploy the Database barclamp to the cluster. The original node may also be part of this cluster.
Log in to a cluster node and run the following command to determine which cluster node runs the postgresql service:
crm_mon -1
Log in to the cluster node running postgresql.
Stop the postgresql service:
crm resource stop postgresql
Copy the data backed up earlier to the cluster node:
rsync -av --delete NODE_WITH_BACKUP:/tmp/pgsql/ /var/lib/pgsql/
Restart the postgresql service:
crm resource start postgresql
Copy the content of /var/lib/pgsql/data/ from the original database node to the cluster node with DRBD or shared storage.
If using Keystone with PKI tokens, the PKI keys on all nodes need to be re-generated. This can be achieved by removing the contents of /var/cache/*/keystone-signing/ on the nodes. Use a command similar to the following on the Administration Server as root:
for NODE in NODE1 NODE2 NODE3; do
  ssh $NODE rm /var/cache/*/keystone-signing/*
done
Go to the barclamp featuring the role you want to move to the cluster and re-deploy the role on the cluster. Whether the role has been taken over by the cluster can be verified with the crm or crm_mon CLI tools.
Repeat these steps for all roles you want to move to a cluster. See Section 2.6.2.1, “Control Node(s)—Avoiding Points of Failure” for a list of services with HA support.
Moving to an HA setup also requires creating SSL certificates for nodes in the cluster that run services using SSL. Certificates need to be issued for the generated names (see Important: Proposal Name) and for all public names you have configured in the cluster.
After a role has been deployed on a cluster, its services are managed by the HA software. You must never manually start or stop an HA-managed service or configure it to start on boot. Services may only be started or stopped by using the cluster management tools Hawk or the crm shell. See http://www.suse.com/documentation/sle-ha-12/book_sleha/data/sec_ha_config_basics_resources.html for more information.
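In day-to-day operation this means using the cluster tooling already shown above instead of systemctl or the rc scripts; for example, for the postgresql resource:
crm_mon -1                     # show resource status and the node each resource runs on
crm resource stop postgresql   # stop an HA-managed resource cluster-wide
crm resource start postgresql  # start it again under cluster control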
Backing up and restoring the Administration Server can either be done via the Crowbar Web interface or on the Administration Server's command line via the crowbarctl backup command. Both tools provide the same functionality.
To use the Web interface for backing up and restoring the Administration Server, go to the Crowbar Web interface on the Administration Server, for example http://192.168.124.10/. Log in as user crowbar. The password is crowbar by default, if you have not changed it. Go to the backup and restore section of the Web interface.
To create a backup, click the respective button. Provide a descriptive name (allowed characters are letters, numbers, dashes, and underscores) and confirm. Alternatively, you can upload a backup, for example from a previous installation.
Existing backups are listed with name and creation date. For each backup, three actions are available:
Download a copy of the backup file. The TAR archive you receive with this download can be uploaded again.
Restore the backup.
Delete the backup.
Backing up and restoring the Administration Server from the command line can be done with the command crowbarctl backup. For general help, run the command crowbarctl backup --help; help on a subcommand is available by running crowbarctl SUBCOMMAND --help.
The following commands for creating and managing backups exist:
crowbarctl backup create NAME
Create a new backup named NAME. It will be stored at /var/lib/crowbar/backup.
crowbarctl backup restore [--yes] NAME
Restore the backup named NAME. You will be asked for confirmation before any existing proposals get overwritten. If using the option --yes, confirmations are turned off and the restore is forced.
crowbarctl backup delete NAME
Delete the backup named NAME.
crowbarctl backup download NAME [FILE]
Download the backup named NAME. If you specify the optional [FILE], the download is written to the specified file. Otherwise it is saved to the current working directory with an automatically generated file name. If specifying - for [FILE], the output is written to STDOUT.
crowbarctl backup list
List existing backups. You can optionally specify different output formats and filters; refer to crowbarctl backup list --help for details.
crowbarctl backup upload FILE
Upload a backup from FILE.
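A possible end-to-end workflow combining these subcommands, using a hypothetical backup name:
crowbarctl backup create before-maintenance
crowbarctl backup list
crowbarctl backup download before-maintenance /root/before-maintenance.tar.gz
crowbarctl backup restore before-maintenance  # only if the Administration Server needs to be rolled back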