mongodb
Description
MongoDB is a high-performance, open source, schema-free document-oriented data store that's easy to deploy, manage and use. It's network accessible, written in C++ and offers the following features:
- Collection oriented storage - easy storage of object-style data
- Full index support, including on inner objects
- Query profiling
- Replication and fail-over support
- Efficient storage of binary data including large objects (e.g. videos)
- Auto-sharding for cloud-level scalability (Q209)
High performance, scalability, and reasonable depth of functionality are the goals for the project.
Overview
This charm deploys MongoDB in three configurations:
- Single node
- Replica set
- Sharded clusters
By default, the MongoDB application is installed from the Ubuntu archive, except for arm64 platforms. The version of MongoDB in the archive is known to have issues on arm64, so by default this charm will use ppa:mongodb-arm64/ppa which contains backported fixes for this architecture.
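If you want to install from a particular archive or PPA yourself, the charm's source (and optionally key) options, described in the Configuration section below, can be supplied at deploy time. A minimal sketch, using an illustrative file name:
cat > mongodb-source.yaml <<EOF
mongodb:
  source: ppa:mongodb-arm64/ppa
EOF
juju deploy mongodb --config mongodb-source.yaml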
Usage
Review the configurable options
The MongoDB charm allows certain values to be configured via a config.yaml file. The options provided are extensive, so you should review them before deploying.
Specifically, the following options are important:
- replicaset
  - e.g. myreplicaset
  - Each replica set has a unique name to distinguish its members from other replica sets available on the network.
  - The default value of "myset" should be fine for most single-cluster scenarios.
- web_admin_ui
  - MongoDB comes with a basic but very informative web user interface that provides health and status information on the database node as well as the cluster.
  - The default value of yes will start the admin web UI on port 28017.
Most of the options in config.yaml have been modeled after the default configuration file for mongodb (normally in /etc/mongodb.conf) and should be familiar to most mongodb admins. Each option in this charm has a brief description of what it does.
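For example, to review the current settings and change one of them (turning off the web admin UI here is purely illustrative):
juju get mongodb
juju set mongodb web_admin_ui=false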
Single Node
Deploy the first MongoDB instance
juju deploy mongodb
juju expose mongodb
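Once the unit has started you can check that the database is reachable; assuming the mongo client is installed on your machine, take the public address reported by juju status:
juju status mongodb
mongo --host <public-address>:27017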
Replica Sets
Deploying
Deploy the first two MongoDB instances that will form the replica set:
juju deploy mongodb -n 2
Deploying three or more units at the start can sometimes lead to unexpected race conditions, so it's best to start with two nodes.
Your deployment should look similar to this (juju status):
environment: amazon
machines:
  "0":
    agent-state: started
    agent-version: 1.16.5
    dns-name: ec2-184-73-7-172.compute-1.amazonaws.com
    instance-id: i-cb55cceb
    instance-state: running
    series: precise
    hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
  "1":
    agent-state: pending
    dns-name: ec2-54-196-181-161.compute-1.amazonaws.com
    instance-id: i-974bd2b7
    instance-state: pending
    series: precise
    hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
services:
  mongodb:
    charm: cs:precise/mongodb-20
    exposed: false
    relations:
      replica-set:
      - mongodb
    units:
      mongodb/0:
        agent-state: pending
        machine: "1"
        public-address: ec2-54-196-181-161.compute-1.amazonaws.com
In addition, the MongoDB web interface should also be accessible via the service's public address on port 28017 (e.g. http://ec2-50-17-73-255.compute-1.amazonaws.com:28017).
(Optional) Change the replicaset name
juju set mongodb replicaset=<new_replicaset_name>
Add one or more nodes to your replicaset
juju add-unit mongodb
juju add-unit mongodb -n2
We now have a working MongoDB replica-set.
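To confirm that every member has joined and to see which one is PRIMARY, you can query rs.status() from any unit (the unit name below is just an example):
juju ssh mongodb/0
mongo --eval 'rs.status().members.forEach(function (m) { print(m.name + " " + m.stateStr) })'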
Caveats
Keep in mind that you need an odd number of nodes for a properly formed replica set.
A replica set can't function with only one available node: should this happen, the remaining node switches to 'read-only' until at least one of the broken nodes is restored.
More information can be found in the MongoDB documentation on their website.
Removing a failed node
Working units can be removed from the replica set using the 'juju remove-unit' command. If the unit being removed is the primary, it will automatically be stepped down (so that re-election of a new primary is performed) before being removed. However, if a unit fails (freezes, gets destroyed and is unbootable), the operator needs to remove it manually: connect to the primary unit and issue rs.remove() for the failed unit, then run 'juju remove-unit --force' to remove the failed unit from juju.
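For example, assuming mongodb/2 has failed and its address was 10.0.0.12 (both illustrative), the manual clean-up could look like this:
juju ssh mongodb/0 ## connect to the unit that is currently primary
mongo --eval 'rs.remove("10.0.0.12:27017")'
exit
juju remove-unit --force mongodb/2 ## tell juju to forget the dead unit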
Recovering from degraded replicaset
If two members go down, the replica set is in a read-only state. That is because the remaining node is in SECONDARY state (it can't get promoted/voted to PRIMARY because there is no majority in the replica set). If the failed nodes can't be brought back to life, we need to manually force the remaining node to become primary. Here is how:
- Connect to the node that's alive.
- Start 'mongo', the CLI utility.
- Upon connecting you'll see that the node is SECONDARY.
- Display the current configuration with rs.config(); this will show the alive node as well as the nodes that are unreachable.
- Store the configuration in a temporary JSON document: cfg = rs.config()
- Change the cfg document so that its members array contains only the unit that is alive: cfg.members = [cfg.members[0]] (adjust the index if the surviving node is not listed first).
- Force reconfiguration of the replica set: rs.reconfig(cfg, {force: true})
- Wait a few seconds and press ENTER. You should see that your node becomes PRIMARY.
After this, clean up the unavailable machines from juju: juju remove-machine --force XX ## XX is the machine number
Then add more units to form a proper replica set. (To avoid race conditions it is best to add units one by one.)
juju add-unit mongodb
Sharding (Scale Out Usage)
According to the MongoDB documentation found on their website, one way of deploying a shard cluster is as follows:
- deploy config servers
- deploy a mongo router (mongos)
- deploy shards
- connect the config servers to the mongo router
- add the shards to the mongo router
Using Juju we can deploy a sharded cluster using the following commands:
Prepare a configuration file similar to the following:
shard1:
  replicaset: shard1
shard2:
  replicaset: shard2
shard3:
  replicaset: shard3
configsvr:
  replicaset: configsvr
We'll save this one as ~/mongodb-shard.yaml.
Bootstrap the environment
juju bootstrap
Config Servers ( we'll deploy 3 of them )
juju deploy mongodb configsvr --config ~/mongodb-shard.yaml -n3
Mongo Router ( we'll just deploy one for now )
juju deploy mongodb mongos
Shards ( We'll deploy three replica-sets )
juju deploy mongodb shard1 --config ~/mongodb-shard.yaml -n3
juju deploy mongodb shard2 --config ~/mongodb-shard.yaml -n3
juju deploy mongodb shard3 --config ~/mongodb-shard.yaml -n3
Connect the Config Servers to the Mongo router (mongos)
juju add-relation mongos:mongos-cfg configsvr:configsvr
Connect each Shard to the Mongo router (mongos)
juju add-relation mongos:mongos shard1:database
juju add-relation mongos:mongos shard2:database
juju add-relation mongos:mongos shard3:database
With the above commands, we should now have a three replica-set sharded cluster running. Using the default configuration, here are some details of our sharded cluster:
- mongos is running on port 27021
- configsvr is running on port 27019
- the shards are running on the default mongodb port of 27017
- The web admin is turned on by default and accessible with your browser on port 28017 on each of the shards.
To verify that your sharded cluster is running, connect to the mongo shell and run sh.status():
mongo --host <mongos_host>:<mongos_port>
run sh.status()
You should see the hosts for your shards in the status output.
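As a further check you can shard a test collection through mongos; the database and collection names here are purely illustrative:
mongo --host <mongos_host>:27021
sh.enableSharding("mydb")
sh.shardCollection("mydb.mycollection", {_id: 1})
sh.status()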
Use the storage subordinate to store mongodb data on a permanent OpenStack or Amazon EBS volume
The storage subordinate and block-storage-broker service can automatically handle attaching the volume and mounting it on the unit before MongoDB is set up to use it.
For example, if you've created the volumes vol-id-00001 and vol-id-00002 and want to attach them to your two mongo units, with your OpenStack or AWS credentials in a credentials.yaml file:
juju deploy block-storage-broker --config credentials.yaml
juju deploy storage
juju add-relation block-storage-broker storage
juju set storage provider=block-storage-broker
juju set storage volume_map="{mongodb/0: vol-id-00001, mongodb/1: vol-id-00002}"
juju add-relation storage mongodb
Use a permanent OpenStack volume to store mongodb data. (DEPRECATED)
Note: Although these steps will still work, they are now deprecated; you should use the storage subordinate above instead.
To deploy mongodb using a permanent volume on OpenStack, the permanent volume should be attached to the mongodb unit just after deployment, then the configuration should be updated as follows.
juju set mongodb volume-dev-regexp="/dev/vdc" volume-map='{"mongodb/0": "vol-id-00000000000000"}' volume-ephemeral-storage=false
Backups
Backups can be enabled via config. Note that destroying the service cannot currently remove the backup cron job, so it will continue to run. There is, however, a setting for the number of backups to keep, to prevent the backups from filling disk space.
To fetch the backups, scp the files down from the path given in the config.
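For example, to turn backups on, keep two weeks of copies and later pull one down from the default backup directory (the file name is illustrative):
juju set mongodb backups_enabled=true backup_copies_kept=14
juju scp mongodb/0:/home/ubuntu/backups/<backup-file> .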
Benchmarking
Mongo units can be benchmarked via the perf juju action, available beginning with juju 1.23.
$ juju action defined mongodb
perf: The standard mongoperf benchmark.
$ juju action do mongodb/0 perf
Action queued with id: 23532149-15c2-47f0-8d97-115fb7dfa1cd
$ juju action fetch --wait 0 23532149-15c2-47f0-8d97-115fb7dfa1cd
results:
  meta:
    composite:
      direction: desc
      units: ops/sec
      value: "7736507.70391"
    start: 2015-05-07T16:36:04Z
    stop: 2015-05-07T16:39:05Z
  results:
    average:
      units: ops/sec
      value: "7736507.70391"
    iterations:
      units: iterations
      value: "179"
    max:
      units: ops/sec
      value: "10282496"
    min:
      units: ops/sec
      value: "3874546"
    total:
      units: ops
      value: "1384834879"
status: completed
timing:
  completed: 2015-05-07 16:39:06 +0000 UTC
  enqueued: 2015-05-07 16:36:01 +0000 UTC
  started: 2015-05-07 16:36:04 +0000 UTC
Known Limitations and Issues
- If your master/slave/replicaset deployment is not updating correctly, check the log files at /var/log/mongodb/mongodb.log to see if there is an obvious reason (port not open, etc.).
- Ensure that TCP port 27017 is accessible from all of the nodes in the deployment.
- If you are trying to access your MongoDB instance from outside your deployment, ensure that the service has been exposed (juju expose mongodb).
- Make sure that the mongod process is running (ps -ef | grep mongo).
- Try restarting the database (restart mongodb).
- If all else fails, remove the data directory on the slave (rm -fr /var/log/mongodb/data/*) and restart the mongodb-slave daemon (restart mongodb).
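A quick way to run these checks on a unit from your workstation (the unit name is illustrative, and restart may need sudo depending on the image):
juju ssh mongodb/0 'ps -ef | grep [m]ongod'
juju ssh mongodb/0 'tail -n 50 /var/log/mongodb/mongodb.log'
juju ssh mongodb/0 'sudo restart mongodb'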
Contact Information
MongoDB Contact Information
Configuration
- arbiter (string): Enable arbiter mode. Possible values are 'disabled' for no arbiter, 'enable' to become an arbiter or 'host:port' to declare another host as an arbiter. replicaset_master must be set for this option to work. Default: disabled
- auth (boolean): Turn on/off security.
- autoresync (boolean): Automatically resync if slave data is stale.
- backup_copies_kept (int): Number of backups to keep. Keeps one week's worth by default. Default: 7
- backup_directory (string): Where the backups can be found. Default: /home/ubuntu/backups
- backups_enabled (boolean): Enable daily backups to disk.
- bind_ip (string): IP address that mongodb should listen on for connections. Default: all
- config_server_dbpath (string): The path where the config server data files will be kept. Default: /mnt/var/lib/mongodb/configsvr
- config_server_logpath (string): The path where to send config server log data. Default: /mnt/var/log/mongodb/configsvr.log
- config_server_port (int): Port number to use for the config server. Default: 27019
- cpu (boolean): Enables periodic logging of CPU utilization and I/O wait.
- dbpath (string): The path where the data files will be kept. Default: /var/lib/mongodb
- diaglog (int): Set oplogging level, where n is 0=off (default), 1=W, 2=R, 3=both, 7=W+some reads.
- extra_config_options (string): Extra options (comma separated) to be included at the end of the mongodb.conf file. Default: none
- extra_daemon_options (string): Extra options (exactly as you would type them on the command line) to be added via the command line to the mongodb daemon. Default: none
- journal (boolean): Enable journaling, http://www.mongodb.org/display/DOCS/Journaling. Default: True
- key (string): Key ID to import to the apt keyring to support use with arbitrary source configuration from outside of Launchpad archives or PPAs.
- logappend (boolean): Append log entries to the existing log file. Default: True
- logpath (string): The path where to send log data. Default: /var/log/mongodb/mongodb.log
- logrotate-frequency (string): How often the logs should be rotated. Use values from logrotate. Default: daily
- logrotate-maxsize (string): Maximum log size before rotating. Default: 500M
- logrotate-rotate (int): Number of log files to keep. Default: 5
- master (string): Who is the master DB. If not "self", put the master DB here as "host:port". Default: self
- mms-interval (string): Ping interval for the Mongo monitoring server (in seconds). Default: disabled
- mms-name (string): Server name for the Mongo monitoring server. Default: disabled
- mms-token (string): Account token for the Mongo monitoring server. Default: disabled
- mongos_logpath (string): The path where to send log data from the mongo router. Default: /mnt/var/log/mongodb/mongos.log
- mongos_port (int): Port number to use for the mongo router. Default: 27021
- nagios_context (string): Used by the nrpe-external-master subordinate charm. A string that will be prepended to the instance name to set the host name in nagios, so for instance the hostname would be something like juju-myservice-0. If you're running multiple environments with the same services in them, this allows you to differentiate between them. Default: juju
- nagios_servicegroups (string): A comma-separated list of nagios servicegroups. If left empty, the nagios_context will be used as the servicegroup.
- nocursors (boolean): Diagnostic/debugging option.
- nohints (boolean): Ignore query hints.
- noprealloc (boolean): Disable data file preallocation.
- noscripting (boolean): Turns off server-side scripting. This will result in greatly limited functionality.
- notablescan (boolean): Turns off table scans. Any query that would do a table scan fails.
- nssize (string): Specify .ns file size for new databases. Default: default
- objcheck (boolean): Inspect all client data for validity on receipt (useful for developing drivers).
- opIdMem (string): Size limit for in-memory storage of op ids. Default: default
- oplogSize (string): Custom size for the replication operation log. Default: default
- port (int): Default MongoDB port. Default: 27017
- quota (boolean): Enable db quota management.
- replicaset (string): Name of the replica set. Default: myset
- replicaset_master (string): Replica set master (optional). Possible values are 'auto' for automatic detection based on install time or 'host:port' to connect to 'host' on 'port' and register as a member. Default: auto
- source (string): Optional configuration to support use of additional sources such as ppa:myteam/ppa, cloud:precise-proposed/icehouse, or http://my.archive.com/ubuntu main. The last option should be used in conjunction with the key configuration option. Default: None
- verbose (boolean): Verbose logging output.
- volume-dev-regexp (string): Deprecated, use the storage subordinate. Block device for attached volumes as seen by the VM; will be "scanned" for an unused device when "volume-map" is valid for the unit. Default: /dev/vd[b-z]
- volume-ephemeral-storage (boolean): Deprecated, use the storage subordinate. If false, a configure-error state will be raised if volume-map[$JUJU_UNIT_NAME] is not set (see "volume-map" below). If true, service units won't try to use "volume-map" (and related variables) to mount and use external (EBS) volumes, thus storage lifetime will equal the VM's, thus ephemeral. YOU'VE BEEN WARNED. Default: True
- volume-map (string): Deprecated, use the storage subordinate. YAML map, e.g. "{ mongodb/0: vol-0000010, mongodb/1: vol-0000016 }". Service units will raise a "configure-error" condition if no volume-map value is set for them - they expect a human to set it properly to resolve it.
- web_admin_ui (boolean): Replica Set Admin UI (accessible via default_port + 1000). Default: True