hacluster #133

Supports: xenial bionic eoan focal trusty

Description

Corosync/Pacemaker


Overview

The hacluster charm provides high availability for OpenStack applications that
lack native (built-in) HA functionality. The clustering solution is based on
Corosync and Pacemaker.

It is a subordinate charm that works in conjunction with a principal charm that
supports the 'hacluster' interface. The current list of such charms can be
obtained from the Charm Store (the charms
officially supported by the OpenStack Charms project are published by
'openstack-charmers').

Note: The hacluster charm is generally intended to be used with
MAAS-based clouds.

Usage

High availability can be configured in two mutually exclusive ways:

  • virtual IP(s)
  • DNS

The virtual IP method of implementing HA requires that all units of the
clustered OpenStack application are on the same subnet.

The DNS method of implementing HA requires that MAAS is used
as the backing cloud. The clustered nodes must have static or "reserved" IP
addresses registered in MAAS. If using a version of MAAS earlier than 2.3 the
DNS hostname(s) should be pre-registered in MAAS before use with DNS HA.

Configuration

This section covers common configuration options. See file config.yaml for
the full list of options, along with their descriptions and default values.

cluster_count

The cluster_count option sets the number of hacluster units required to form
the principal application cluster (the default is 3). It is best practice to
provide a value explicitly, as doing so ensures that the hacluster charm will
wait until all relations are made to the principal application before building
the Corosync/Pacemaker cluster, thereby avoiding a race condition.

Deployment

At deploy time an application name should be set, based on the principal
charm name (for organisational purposes):

juju deploy hacluster <principal-charm-name>-hacluster

A relation is then added between the hacluster application and the principal
application.

The example below takes the VIP approach. These commands will deploy a
three-node Keystone HA cluster with a VIP of 10.246.114.11. Each unit will
reside in a container on existing machines 0, 1, and 2:

juju deploy -n 3 --to lxd:0,lxd:1,lxd:2 --config vip=10.246.114.11 keystone
juju deploy --config cluster_count=3 hacluster keystone-hacluster
juju add-relation keystone-hacluster:ha keystone:ha
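
For comparison, a DNS HA deployment might look like the following sketch. It
assumes a MAAS-backed cloud and a principal charm that exposes dns-ha and
os-public-hostname options (as the OpenStack API charms generally do); the
hostname shown is hypothetical:

```shell
# Hypothetical DNS HA deployment; the hostname and the principal charm's
# dns-ha/os-public-hostname option names are assumptions, not taken from
# this README.
juju deploy -n 3 --to lxd:0,lxd:1,lxd:2 \
    --config dns-ha=true \
    --config os-public-hostname=keystone.example.maas keystone
juju deploy --config cluster_count=3 hacluster keystone-hacluster
juju add-relation keystone-hacluster:ha keystone:ha
```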

Actions

This section lists Juju actions supported by the charm.
Actions allow specific operations to be performed on a per-unit basis.

  • pause
  • resume
  • status
  • cleanup

To display action descriptions run juju actions hacluster. If the charm is
not deployed then see file actions.yaml.
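
For example, the status action could be run on a single unit as follows (the
unit name is hypothetical, and on Juju 3.x the juju run command replaces
juju run-action):

```shell
# Query cluster status on one unit; the unit name is illustrative.
juju run-action --wait keystone-hacluster/0 status
```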

Bugs

Please report bugs on Launchpad.

For general charm questions refer to the OpenStack Charm Guide.


Configuration

cluster_count
(int) Number of peer units required to bootstrap cluster services. If fewer than 3 are specified, the cluster will be configured to ignore any quorum problems; with 3 or more units, quorum will be enforced and services will be stopped in the event of a loss of quorum. It is best practice to set this value to the expected number of units to avoid potential race conditions.
Default: 3
corosync_bindiface
(string) Default network interface on which the HA cluster will bind for communication with the other members of the HA cluster. Defaults to the network interface hosting the unit's private-address. Only used when corosync_transport = multicast.
corosync_key
(string) This value will become the Corosync authentication key. To generate a suitable value use:

  sudo corosync-keygen
  sudo cat /etc/corosync/authkey | base64 -w 0

This configuration element is mandatory and the service will fail on install if it is not provided. The value must be base64 encoded.
Default: 64RxJNcCkwo8EJYBsaacitUvbQp5AW4YolJi5/2urYZYp2jfLxY+3IUCOaAUJHPle4Yqfy+WBXO0I/6ASSAjj9jaiHVNaxmVhhjcmyBqy2vtPf+m+0VxVjUXlkTyYsODwobeDdO3SIkbIABGfjLTu29yqPTsfbvSYr6skRb9ne0=
corosync_mcastaddr
(string) Multicast IP address to use for exchanging messages over the network. If multiple clusters are on the same bindnetaddr network, this value can be changed. Only used when corosync_transport = multicast.
Default: 226.94.1.1
corosync_mcastport
(int) Default multicast port number that will be used to communicate between HA Cluster nodes. Only used when corosync_transport = multicast.
corosync_transport
(string) The two supported modes are multicast (udp) and unicast (udpu).
Default: unicast
debug
(boolean) Enable debug logging
failed_actions_alert_type
(string) If the CRM status has recorded failed actions in any of the registered resource agents, check_crm can optionally generate an alert. Valid options: ignore/warning/critical.
Default: critical
failed_actions_threshold
(int) check_crm will not generate an alert unless the number of recorded failed actions meets this threshold. Has no effect if failed_actions_alert_type is set to 'ignore'.
Default: 1
failure_timeout
(int) Sets the pacemaker default resource meta-attribute value for failure_timeout. This value represents the duration in seconds to wait before resetting failcount to 0. In practice, this is measured as the time elapsed since the most recent failure. Setting this to 0 disables the feature.
maas_credentials
(string) MAAS credentials (required for STONITH).
maas_source
(string) PPA for python3-maas-client:

  - ppa:maas/stable
  - ppa:maas/next

The last option should be used in conjunction with the maas_source_key configuration option. Used when service_dns is set on the principal charm for DNS HA.
Default: ppa:maas/stable
maas_source_key
(string) PPA public key for python3-maas-client. Used to specify the PPA public key when nodes are offline.
maas_url
(string) MAAS API endpoint (required for STONITH).
maintenance-mode
(boolean) When enabled, Pacemaker will be put in maintenance mode. This allows administrators to manipulate cluster resources (e.g. stop daemons, reboot machines). Pacemaker will not monitor the resources while maintenance mode is enabled.
monitor_host
(string) One or more IPs, separated by spaces, that will be used as a safety check to avoid split-brain situations. Nodes in the cluster will ping these IPs periodically. Nodes that cannot ping monitor_host will not run shared resources (VIP, shared disk...).
monitor_interval
(string) Time period between checks of resource health. It consists of a number and a time factor, e.g. 5s = 5 seconds, 2m = 2 minutes.
Default: 5s
nagios_context
(string) Used by the nrpe-external-master subordinate charm. A string that will be prepended to the instance name to set the host name in Nagios, e.g. juju-postgresql-0. If you're running multiple environments with the same services in them, this allows you to differentiate between them.
Default: juju
nagios_servicegroups
(string) A comma-separated list of nagios servicegroups. If left empty, the nagios_context will be used as the servicegroup.
netmtu
(int) Specifies the corosync.conf network mtu. If unset, the default corosync.conf value is used (currently 1500). See 'man corosync.conf' for detailed information on this config option.
pacemaker_key
(string) This value will become the Pacemaker authentication key. To generate a suitable value use:

  dd if=/dev/urandom of=/tmp/authkey bs=2048 count=1
  cat /tmp/authkey | base64 -w 0

If this configuration element is not set then the Corosync key will be reused as the Pacemaker key.
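
As a sketch, the two key-generation commands above can be scripted end to end; the file path is illustrative and any 2048-byte random value works:

```shell
# Generate a 2048-byte random key and base64-encode it for pacemaker_key.
# /tmp/authkey is an illustrative scratch path.
dd if=/dev/urandom of=/tmp/authkey bs=2048 count=1 2>/dev/null
PACEMAKER_KEY=$(base64 -w 0 < /tmp/authkey)
# The value could then be set with, e.g.:
#   juju config <app> pacemaker_key="$PACEMAKER_KEY"
```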
prefer-ipv6
(boolean) If True, enables IPv6 support. The charm will expect network interfaces to be configured with an IPv6 address. If set to False (default), IPv4 is expected. NOTE: these charms do not currently support the IPv6 privacy extension. In order for this charm to function correctly, the privacy extension must be disabled and a non-temporary address must be configured/available on your network interface.
service_start_timeout
(int) Systemd override value for the corosync and pacemaker service start timeout, in seconds. Set the value to -1 to turn off the timeout for the services.
Default: 180
service_stop_timeout
(int) Systemd override value for the corosync and pacemaker service stop timeout, in seconds. Set the value to -1 to turn off the timeout for the services.
Default: 60
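
For reference, the default start/stop timeouts above correspond to a systemd drop-in roughly like the following (an illustration of the effect, not the charm's literal output file):

```
[Service]
TimeoutStartSec=180
TimeoutStopSec=60
```

A value of -1 disables the timeout, which in systemd terms corresponds to a timeout of "infinity".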
stonith_enabled
(string) Enable resource fencing (aka STONITH) for every node in the cluster. This requires that MAAS credentials be provided and that each node's power parameters are properly configured in its inventory.
Default: False