Etcd

By Canonical Kubernetes

Channel	Revision	Published	Runs on
latest/stable	760	06 Mar 2024	Ubuntu 22.04 Ubuntu 20.04
latest/candidate	764	15 Apr 2024	Ubuntu 22.04 Ubuntu 20.04
latest/beta	760	15 Dec 2023	Ubuntu 22.04 Ubuntu 20.04
latest/edge	765	16 Apr 2024	Ubuntu 22.04 Ubuntu 20.04
1.30/edge	765	16 Apr 2024	Ubuntu 22.04 Ubuntu 20.04
1.29/stable	760	12 Feb 2024	Ubuntu 22.04 Ubuntu 20.04
1.29/candidate	764	15 Apr 2024	Ubuntu 22.04 Ubuntu 20.04
1.29/beta	760	14 Dec 2023	Ubuntu 22.04 Ubuntu 20.04
1.29/edge	759	25 Oct 2023	Ubuntu 22.04 Ubuntu 20.04
1.28/stable	748	22 Aug 2023	Ubuntu 22.04 Ubuntu 20.04
1.28/candidate	742	07 Jun 2023	Ubuntu 22.04 Ubuntu 20.04
1.28/beta	748	07 Aug 2023	Ubuntu 22.04 Ubuntu 20.04
1.28/edge	752	21 Aug 2023	Ubuntu 22.04 Ubuntu 20.04
1.27/stable	742	12 Jun 2023	Ubuntu 22.04 Ubuntu 20.04
1.27/candidate	742	12 Jun 2023	Ubuntu 22.04 Ubuntu 20.04
1.27/beta	736	10 Apr 2023	Ubuntu 22.04 Ubuntu 20.04
1.27/edge	737	10 Apr 2023	Ubuntu 22.04 Ubuntu 20.04
1.26/stable	728	20 Mar 2023	Ubuntu 22.04 Ubuntu 20.04
1.26/candidate	728	16 Mar 2023	Ubuntu 22.04 Ubuntu 20.04
1.26/beta	720	09 Apr 2023	Ubuntu 22.04 Ubuntu 20.04
1.26/edge	720	19 Nov 2022	Ubuntu 22.04 Ubuntu 20.04
1.25/stable	718	30 Sep 2022	Ubuntu 22.04 Ubuntu 20.04 Ubuntu 18.04
1.25/candidate	718	28 Sep 2022	Ubuntu 22.04 Ubuntu 20.04 Ubuntu 18.04
1.25/beta	721	01 Dec 2022	Ubuntu 22.04 Ubuntu 20.04
1.25/edge	708	09 Sep 2022	Ubuntu 22.04 Ubuntu 20.04 Ubuntu 18.04
1.24/stable	701	04 Aug 2022	Ubuntu 22.04 Ubuntu 20.04 Ubuntu 18.04
1.24/candidate	701	01 Aug 2022	Ubuntu 22.04 Ubuntu 20.04 Ubuntu 18.04
1.24/beta	691	03 May 2022	Ubuntu 20.04 Ubuntu 18.04 Ubuntu 16.04
1.24/edge	700	22 Jul 2022	Ubuntu 22.04 Ubuntu 20.04 Ubuntu 18.04
1.23/beta	682	22 Mar 2022	Ubuntu 20.04 Ubuntu 18.04 Ubuntu 16.04
1.23/edge	680	24 Feb 2022	Ubuntu 20.04 Ubuntu 18.04 Ubuntu 16.04

Etcd is a highly available distributed key-value store that provides a reliable way to store data across a cluster of machines. Etcd gracefully handles master elections during network partitions and will tolerate machine failure, including the master.

Your applications can read and write data into etcd. A simple use-case is to store database connection details or feature flags in etcd as key-value pairs. These values can be watched, allowing your app to reconfigure itself when they change.

Advanced uses take advantage of the consistency guarantees to implement database master elections or do distributed locking across a cluster of workers.

Etcd stores data in a distributed hierarchical database whose keys can be watched for changes.

Usage

We can deploy a single node with the following commands:

juju deploy easyrsa
juju deploy etcd
juju add-relation etcd easyrsa

And add capacity with:

juju add-unit -n 2 etcd

It’s recommended to run an odd number of machines, because adding a machine to reach an even number does not improve fault tolerance (e.g. with 3 or 4, you can lose only 1 before quorum is lost, whereas with 5, you can lose 2).
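The arithmetic behind this recommendation can be sketched in a few lines of shell. This is only an illustration of majority quorum, not part of the charm; the function names quorum and tolerated_failures are hypothetical.

```shell
# Quorum is a strict majority of the members: floor(n / 2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Members that can fail while a majority still remains.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Print the figures for some common cluster sizes.
for n in 1 2 3 4 5 6 7; do
    printf '%s members: quorum %s, tolerates %s failure(s)\n' \
        "$n" "$(quorum "$n")" "$(tolerated_failures "$n")"
done
```

Each even size tolerates exactly as many failures as the odd size below it, which is why the extra unit adds cost without adding redundancy.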

Notes about cluster turn-up

The etcd charm initializes a cluster using the static configuration method, which is the most “flexible” of the installation options because it allows etcd to be self-discovering through the peering relationships provided by Juju.

Health

Health of the cluster can be checked by running a juju action.

juju run-action --wait etcd/0 health

Health is also reported continuously via juju status. During initial cluster turn-up, it is normal for the health checks to fail, and this is no cause for alarm: the checks run before the cluster has stabilized, and the status should settle once the members start to come online and the update-status hook runs again.

The status message gives you insight into the cluster on a 5-minute interval and reports healthy versus unhealthy nodes.

For example:

Unit        Workload  Agent  Machine  Public address  Ports     Message
etcd/0*     active    idle   1        54.227.0.225    2379/tcp  Healthy with 3 known peers
etcd/1      active    idle   2        184.72.191.212  2379/tcp  Healthy with 3 known peers
etcd/2      active    idle   3        34.207.195.139  2379/tcp  Healthy with 3 known peers
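If you want to check this from a script, a rough sketch is to count the unit lines carrying the healthy message. The helper name count_healthy is hypothetical, and the pattern assumes the message wording shown in the sample above ("Healthy with N known peers"); other wording would not be counted.

```shell
# Read `juju status` text on stdin and count the lines whose
# message matches the healthy wording from the sample output.
count_healthy() {
    awk '/Healthy with [0-9]+ known peers/ { n++ } END { print n + 0 }'
}
```

Assuming the status table has the layout above, it could be used as: juju status etcd | count_healthy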

TLS

The etcd charm supports TLS-terminated endpoints by default. All efforts have been made to ensure the PKI is as robust as possible.

Client certificates can be obtained by running an action on any of the cluster members:

juju run-action --wait etcd/0 package-client-credentials
juju scp etcd/0:etcd_credentials.tar.gz etcd_credentials.tar.gz

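This places the credentials archive in the current working directory; the examples below assume its contents (client.key, client.crt and ca.crt) have been extracted there. To use etcdctl from outside the cluster machines, you’ll need to expose the charm and export some environment variables so that etcdctl can consume the client credentials.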

If you are using etcd <=3.2.x:

juju expose etcd
export ETCDCTL_KEY_FILE=$(pwd)/client.key
export ETCDCTL_CERT_FILE=$(pwd)/client.crt
export ETCDCTL_CA_FILE=$(pwd)/ca.crt
export ETCDCTL_ENDPOINT=https://{ip of etcd host}:2379
etcdctl member list

Or if you’re using etcd >=3.3.x:

juju expose etcd
export ETCDCTL_KEY=$(pwd)/client.key
export ETCDCTL_CERT=$(pwd)/client.crt
export ETCDCTL_CACERT=$(pwd)/ca.crt
export ETCDCTL_ENDPOINTS=https://{ip of etcd host}:2379
etcdctl member list

If in doubt, you can always export all the env vars from both.
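A minimal sketch of exporting both sets at once is shown here. The function name etcdctl_env is hypothetical; it assumes the three credential files sit in one directory and that you pass the endpoint URL yourself.

```shell
# Export both generations of etcdctl environment variables.
#   $1 = directory holding client.key, client.crt and ca.crt
#   $2 = endpoint URL of an etcd unit (https, port 2379)
etcdctl_env() {
    # Names read by etcd <=3.2.x
    export ETCDCTL_KEY_FILE="$1/client.key"
    export ETCDCTL_CERT_FILE="$1/client.crt"
    export ETCDCTL_CA_FILE="$1/ca.crt"
    export ETCDCTL_ENDPOINT="$2"
    # Names read by etcd >=3.3.x
    export ETCDCTL_KEY="$1/client.key"
    export ETCDCTL_CERT="$1/client.crt"
    export ETCDCTL_CACERT="$1/ca.crt"
    export ETCDCTL_ENDPOINTS="$2"
}
```

After defining it, you would call it with the current directory and the address of an exposed etcd unit, then run etcdctl member list as above.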

Persistent Storage

Many cloud providers use ephemeral storage. When using cloud provider infrastructure, it is recommended to place any data stores on persistent volumes that exist outside of the ephemeral storage on the unit.

Juju abstracts this with the storage provider.

To add a unit of storage we’ll first need to discover what storage types the cloud provides to us, which can be listed:

juju list-storage-pools

AWS Storage example

To add SSD backed EBS storage from AWS, the following example provisions a single 10GB SSD EBS instance and attaches it to the etcd/0 unit.

juju add-storage etcd/0 data=ebs-ssd,10G

GCE Storage example

To add Persistent Disk storage from GCE, the following example provisions a single 10GB PD instance and attaches it to the etcd/0 unit.

juju add-storage etcd/0 data=gce,10G

Cinder Storage example

To add block storage from OpenStack Cinder, the following example provisions a single 10GB Cinder volume and attaches it to the etcd/0 unit.

juju add-storage etcd/0 data=cinder,10G

Charm Actions

Restore

Allows the operator to restore the data from a cluster-data snapshot.

juju attach etcd snapshot=/path/to/etcd-backup
juju run-action --wait etcd/leader restore

param target: destination directory to save the existing data.
param skip-backup: Don’t back up any existing data. (defaults to True)

See the section “Restoring a snapshot” below for the full restore procedure.

Snapshot

Allows the operator to snapshot a running cluster’s data for use in cloning, backing up, or migrating etcd clusters.

juju run-action --wait etcd/0 snapshot target=/mnt/etcd-backups keys-version=v3

param target: destination directory to save the resulting snapshot archive.
param keys-version: etcd keys-version to snapshot: v3 and v2 are valid options here

NOTE: etcd supports multiple key versions (presently v2 and v3), and data for each version is separate, so you must specify which set of data you wish to snapshot. If your etcd is deployed for Kubernetes versions post 1.10, data will be stored in v3 format; if you are snapshotting an etcd deployed for Kubernetes 1.9 or older, you may want keys-version=v2. Both sets of data are often present, so you may need to run the action twice to obtain a snapshot of both the v2 and v3 data.

Restoring a snapshot

Warning: Restoring a snapshot should not be performed when there is more than one unit of etcd running. These instructions detail deploying a new instance first.

As restoring only works when there is a single unit of etcd, it is usual to deploy a new instance of the application first. For example:

juju deploy etcd new-etcd --series=focal --config channel=3.4/stable

The --series option is included here to illustrate how to specify which series the new unit should be running on. The --config option is required to specify the same channel of etcd as the original unit.

Next we upload and identify the snapshot file to this new unit:

juju attach-resource new-etcd snapshot=./etcd-snapshot-2022-09-26-18.04.02.tar.gz

Then run the restore action:

juju run new-etcd/0 restore

Once the restore action has finished, you should see output confirming that the operation is completed.

Note: If you have snapshots for both v2 and v3 data, you will need to rerun the actions above, attaching the resource and then running the restore action again.

The new etcd application will need to be connected to the rest of the deployment:

juju integrate new-etcd [calico|flannel|$cni]
juju integrate new-etcd kubernetes-control-plane

To restore the cluster capabilities of etcd, you can now add more units:

juju add-unit new-etcd -n 2

Once the deployment has settled and all new-etcd units report ready, verify the cluster health with:

juju run new-etcd/0 health

which should return something similar to:

unit-new-etcd-0:
  id: 27fe2081-6513-4968-869d-6c2c092210a1
  results:
    result-map:
      message: |-
        member 3c149609bfcf7692 is healthy: got healthy result from https://172.31.18.7:2379
        cluster is healthy
  status: completed
  timing:
    completed: 2022-10-26 15:16:33 +0000 UTC
    enqueued: 2022-10-26 15:16:32 +0000 UTC
    started: 2022-10-26 15:16:33 +0000 UTC
  unit: new-etcd/0

Migrating etcd

Migrating the etcd data is a fairly easy task. Use the following steps:

Step 1: Snapshot your existing cluster. This is encapsulated in the snapshot action.

juju run-action --wait etcd/0 snapshot keys-version=v3

Action queued with id: b46d5d6f-5625-4320-8cda-b611c6ae580c

Step 2: Check the status of the action so you can verify the hash sum of the resulting file. The output contains results.copy.cmd; its value can be copied and used to download the snapshot that you just created.

Download the snapshot tar archive from the unit that created the snapshot and verify the sha256 hash sum.

juju show-action-output b46d5d6f-5625-4320-8cda-b611c6ae580c
results:
  copy:
    cmd: juju scp etcd/0:/home/ubuntu/etcd-snapshots/etcd-snapshot-2016-11-09-02.41.47.tar.gz
      .
  snapshot:
    path: /home/ubuntu/etcd-snapshots/etcd-snapshot-2016-11-09-02.41.47.tar.gz
    sha256: 1dea04627812397c51ee87e313433f3102f617a9cab1d1b79698323f6459953d
    size: 68K
status: completed

juju scp etcd/0:/home/ubuntu/etcd-snapshots/etcd-snapshot-2016-11-09-02.41.47.tar.gz .

sha256sum etcd-snapshot-2016-11-09-02.41.47.tar.gz
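The comparison against the sha256 value reported by the action can also be scripted rather than checked by eye. The following is a sketch only: the helper name verify_sha256 is hypothetical, and it assumes the GNU sha256sum tool is available on the machine where the snapshot was downloaded.

```shell
# verify_sha256 FILE EXPECTED
# Succeeds (exit status 0) only when the sha256 digest of FILE
# equals EXPECTED, the value taken from the action output.
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}
```

You would pass the downloaded archive as the first argument and the sha256 value from results.snapshot as the second, and only continue with the migration if the function succeeds.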

Step 3: Deploy the new cluster leader, and attach the snapshot as a resource.

juju deploy etcd new-etcd --resource snapshot=./etcd-snapshot-2016-11-09-02.41.47.tar.gz

Step 4: Re-initialize the etcd leader with the data by running the restore action, which uses the resource that was attached in step 3.

juju run-action --wait new-etcd/0 restore

Step 5: Scale and operate as required, verify the data was restored.

Limited egress operations

The etcd charm installs the etcd application as a snap package. You can supply an etcd.snap resource to make this charm easily installable behind a firewall.

juju deploy /path/to/etcd
juju attach etcd etcd=/path/to/etcd.snap

Post Deployment Snap Upgrades (if using the resource)

The charm, if installed from a locally supplied resource, will be locked into that resource version until another is supplied and explicitly installed.

juju attach etcd etcd=/path/to/new/etcd.snap
juju run-action etcd/0 install
juju run-action etcd/1 install

Migrate from Deb to Snap

This section only applies if you are upgrading an existing etcd charm deployment. This migration should only be needed once because new deployments of etcd will default to snap delivery.

In revision 24 and earlier, the etcd charm installed the etcd application from Debian packages. Revisions 25+ install from the snap store (or resource). During the migration process, you will be notified that a classic installation exists and a manual migration action must be run.

Before migrating, take the opportunity to ensure state has been captured and to plan for downtime, as the migration process will stop and restart the etcd application. This service interruption can disrupt other dependent applications.

Starting the migration

The deb to snap migration process has been automated as much as possible. Despite the automatic backup taken during the migration process, you are still encouraged to run a snapshot before executing the upgrade.

Once the snapshot is completed, begin the migration process. You first need to upgrade the charm to revision 25 or later.

juju upgrade-charm etcd

For your convenience, the snap-upgrade action removes the Debian package and installs the snap package. Each etcd unit needs to be upgraded individually. Best practice is to migrate one unit at a time to ensure the cluster upgrades completely.

juju run-action etcd/0 snap-upgrade
# Repeat this command for other etcd units in your cluster.
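To keep the one-unit-at-a-time order explicit, the per-unit commands can be generated up front and reviewed before they are run. This is a sketch under assumptions: the helper name snap_upgrade_commands is hypothetical, the unit names are examples, and --wait is added (as used with run-action elsewhere in this document) so that each action finishes before the next begins.

```shell
# Print one snap-upgrade command per unit, in the order given.
# Nothing is executed here; review the output, then run the
# commands one by one.
snap_upgrade_commands() {
    for unit in "$@"; do
        echo "juju run-action --wait $unit snap-upgrade"
    done
}

snap_upgrade_commands etcd/0 etcd/1 etcd/2
```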

Once the unit has completed the upgrade, the unit’s status message will return to its normal health check messaging.

Unit        Workload  Agent  Machine  Public address  Ports     Message
etcd/0*     active    idle   1        54.89.190.93    2379/tcp  Healthy with 3 known peers

Once you have the snap package, you can upgrade to different versions of etcd by setting the channel configuration option on the charm.

juju config etcd channel=3.0/stable

Known Limitations

Moving from 2.x to 3.x and beyond

The etcd charm relies heavily on the snap package for etcd. In order to properly migrate a 2.x series etcd deployment into 3.1 and beyond you will need to follow a proper channel migration path. The initial deb to snap upgrade process will place you in a 2.3 deployment.

You can migrate from 2.3 to 3.0:

juju config etcd channel=3.0/stable

From the 3.0 channel you can migrate to 3.1 (the latest at the time of writing):

juju config etcd channel=3.1/stable

You MUST perform the 2.3 => 3.0 migration before moving from 3.0 => 3.1. A direct migration from 2.3 => 3.1 is not supported at this time.

Multiple snapd refresh timers

The etcd charm exposes a snapd_refresh config option that is used to control how often snapd checks for updates to installed snaps. By default, this is set to max, which scans for refreshes once per month. If a subordinate charm based on layer-snap is related to an etcd principal unit, the refresh timer may be inadvertently changed.

The best practice for deploying multiple layer-snap charms onto a single machine is to ensure snapd_refresh is consistent among those charms. As an example, set an explicit refresh timer for the last Friday of the month with:

juju config etcd snapd_refresh='fri5'

TLS Defaults Warning (for Trusty etcd charm users)

This charm has no backwards-compatible upgrade path across the Trusty/Xenial series boundary. Xenial and later series enable TLS by default. This is an incompatible break due to the nature of peer relationships and how the certificates are generated and passed off.

To migrate from Trusty to Xenial, the operator is responsible for deploying the Xenial etcd cluster, then issuing an etcd data dump on the Trusty series and importing that data into the new cluster. This can only be performed on a single node due to the nature of how replicas work in etcd.

Any issues with the above process should be filed against the charm layer on GitHub.

Last updated 9 months ago.