k8s cassandra #0

Supports: kubernetes


Description

A CAAS charm to deploy Cassandra.


Overview

The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.

See cassandra.apache.org for more information.

Editions

This charm supports Apache Cassandra 3.x.

Deployment

Cassandra deployments are relatively simple in that they consist of a set of Cassandra nodes which seed from each other to create a ring of servers:

juju bootstrap microk8s k8s-cloud
juju add-model cassandramodel
juju deploy -n3 cs:narindergupta/k8s-cassandra cassandra

The service units will deploy and will form a single ring.

New nodes can be added to scale up:

juju add-unit cassandra

Warning: nodes must be manually decommissioned before a unit is removed:

microk8s.kubectl exec -n cassandramodel -it cassandra-2 -- nodetool decommission
# Wait until Mode is DECOMMISSIONED
microk8s.kubectl exec -n cassandramodel -it cassandra-2 -- nodetool netstats
juju remove-unit cassandra/2
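
The manual steps above could be wrapped in a small script. The following is only a sketch: the helper names `is_decommissioned` and `decommission_unit` are hypothetical, and the retry count and sleep interval are arbitrary choices. It assumes `nodetool netstats` reports the node state on a line starting with `Mode:`, as the comment above describes.

```shell
# Succeeds when `nodetool netstats` output (read from stdin) reports
# that the node has finished decommissioning.
is_decommissioned() {
    grep -q '^Mode: DECOMMISSIONED'
}

# Hypothetical helper: decommission a pod, wait for it to finish, then
# remove the matching Juju unit.
# Arguments: namespace (model name), pod name, Juju unit name.
decommission_unit() (
    ns="$1"; pod="$2"; unit="$3"
    microk8s.kubectl exec -n "$ns" "$pod" -- nodetool decommission || exit 1
    tries=0
    until microk8s.kubectl exec -n "$ns" "$pod" -- nodetool netstats | is_decommissioned; do
        tries=$((tries + 1))
        if [ "$tries" -ge 60 ]; then
            echo "timed out waiting for $pod to decommission" >&2
            exit 1
        fi
        sleep 10
    done
    juju remove-unit "$unit"
)

# Usage (not run here):
#   decommission_unit cassandramodel cassandra-2 cassandra/2
```

The unit is only removed once the DECOMMISSIONED mode has been observed, so a failed or slow decommission leaves the unit in place.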

Deploy at least three nodes and configure every keyspace with a replication factor of three. Using fewer nodes, or neglecting to set your keyspaces' replication settings, puts your data at risk and lowers availability, as a failed unit may take the only copy of the data with it.

Production systems will normally want to set max_heap_size and heap_newsize to the empty string, to enable automatic memory size tuning. The defaults have been chosen to be suitable for development environments but will perform poorly with real workloads.
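
As a sketch of the setting described above, both options can be cleared in a single `juju config` call. The function name `enable_auto_heap_tuning` is hypothetical, and the application name is assumed to be `cassandra`, as in the deploy command earlier.

```shell
# Set both heap options to the empty string so that memory sizes are
# tuned automatically. The optional first argument is the application
# name (assumed to default to "cassandra").
enable_auto_heap_tuning() {
    juju config "${1:-cassandra}" max_heap_size="" heap_newsize=""
}

# Usage (not run here):
#   enable_auto_heap_tuning
```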

Planning

  • Do not attempt to store too much data per node. If you need more space, add more nodes. Most workloads work best with a capacity under 1TB per node, so take care with larger deployments. Recommended capacities are vague and version dependent.

  • You need to keep 50% of your disk space free for Cassandra maintenance operations. If you expect your nodes to hold 500GB of data each, you will need a 1TB partition. Using non-default compaction such as LeveledCompactionStrategy can lower this waste.

  • Much more information can be found in the Cassandra 2.2 documentation.
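
The sizing rules above reduce to simple arithmetic. A minimal sketch follows, with hypothetical helper names and sizes expressed in whole gigabytes; it assumes the default compaction strategy, where 50% of the disk is kept free.

```shell
# Print the partition size (GB) needed for the given data size (GB)
# while keeping 50% of the disk free, i.e. twice the expected data.
required_partition_gb() {
    echo $(( $1 * 2 ))
}

# Succeeds if the planned per-node data size (GB) stays under the
# rough 1TB-per-node guideline.
within_node_guideline() {
    [ "$1" -lt 1000 ]
}

required_partition_gb 500   # prints 1000
```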

Usage

To relate the Cassandra charm to a service that understands how to talk to Cassandra using Thrift or the native Cassandra protocol:

juju deploy cs:~cassandra-charmers/cqlsh
juju add-relation cqlsh cassandra:database

Alternatively, if you require a superuser connection, use the database-admin relation instead of database:

juju deploy cs:~cassandra-charmers/cqlsh cqlsh-admin
juju add-relation cqlsh-admin cassandra:database-admin

The cluster is configured to use the recommended 'snitch' (GossipingPropertyFileSnitch), so you will need to configure replication of your keyspaces using the NetworkTopologyStrategy replica placement strategy. The datacenter is set in the Cassandra charm configuration, and is provided by the client interface if clients need to do this programmatically. For example, using the default datacenter named 'DC1':

CREATE KEYSPACE IF NOT EXISTS mydata WITH REPLICATION =
{ 'class': 'NetworkTopologyStrategy', 'DC1': 3};
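
When the datacenter name or replication factor varies between deployments, the statement above could be generated rather than hard-coded. This is only a sketch: `keyspace_cql` is a hypothetical helper, and the usage line assumes a `cqlsh` client that can already reach the cluster.

```shell
# Print a CREATE KEYSPACE statement using NetworkTopologyStrategy.
# Arguments: keyspace name, datacenter name, replication factor.
keyspace_cql() {
    printf "CREATE KEYSPACE IF NOT EXISTS %s WITH REPLICATION = { 'class': 'NetworkTopologyStrategy', '%s': %s };\n" "$1" "$2" "$3"
}

# Usage (not run here):
#   cqlsh -e "$(keyspace_cql mydata DC1 3)"
```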

Although authentication is configured using the standard PasswordAuthenticator, by default no authorization is configured and the provided credentials will have access to all data on the cluster. For more granular permissions, you will need to set the authorizer in the service configuration to CassandraAuthorizer and manually grant permissions to the users.
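
The two steps described above might look like the following sketch. It assumes the charm exposes a configuration option named `authorizer`, which is not listed under Configuration on this page, so check `juju config cassandra` first. The helper names and the `reader` role are hypothetical.

```shell
# Assumption: the charm has an "authorizer" option. The optional first
# argument is the application name (assumed to default to "cassandra").
enable_authorizer() {
    juju config "${1:-cassandra}" authorizer=CassandraAuthorizer
}

# Print a CQL statement granting read-only access to one keyspace.
# Arguments: keyspace name, role or user name.
grant_select_cql() {
    printf 'GRANT SELECT ON KEYSPACE %s TO %s;\n' "$1" "$2"
}

# Usage (not run here), from a superuser connection:
#   enable_authorizer
#   cqlsh -e "$(grant_select_cql mydata reader)"
```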

Contact Information

General

The Juju mailing list

Charm

Cassandra

DataStax Enterprise


Configuration

cluster_name
(string) Name of the Cassandra cluster. This is mainly used to prevent machines in one logical cluster from joining another. All Cassandra services you wish to cluster together must have the same cluster_name. This setting cannot be changed after service deployment.
Default: juju

datacenter
(string) The node's datacenter used by the endpoint_snitch, e.g. "DC1". It cannot be changed after service deployment.
Default: DC1

heap_newsize
(string) The size of the JVM's young generation in the heap. If you set this, you should also set max_heap_size. If in doubt, go with 100M per physical CPU core. When set to the empty string, the size is tuned automatically.
Default: 32M

image
(string) OCI image to deploy.
Default: datastax/cassandra:4.0

max_heap_size
(string) Total size of the Java memory heap, for example 1G or 512M. If you set this, you should also set heap_newsize. When set to the empty string, the size is tuned automatically.
Default: 384M

rack
(string) The rack used by the endpoint_snitch for all units in this service, e.g. "Rack1". This cannot be changed after deployment. It defaults to the service name. Cassandra will store replicated data in different racks whenever possible.
Default: Rack1
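
Because cluster_name, datacenter and rack cannot be changed after deployment, it may be convenient to set them in the deploy command itself. The following is a sketch only: `deploy_cassandra` is a hypothetical wrapper, and it assumes a Juju version whose `deploy` command accepts `--config key=value` pairs.

```shell
# Deploy three units with the immutable settings fixed up front.
# Arguments: cluster name, datacenter, rack.
deploy_cassandra() {
    juju deploy -n3 cs:narindergupta/k8s-cassandra cassandra \
        --config cluster_name="$1" \
        --config datacenter="$2" \
        --config rack="$3"
}

# Usage (not run here):
#   deploy_cassandra mycluster DC1 Rack1
```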