ibm platform lsf master #1

Supports: trusty

Description

IBM Platform LSF is a powerful workload management platform for demanding, distributed High Performance Computing (HPC) environments.
It provides a comprehensive set of intelligent, policy-driven scheduling features that enable you to utilize all of your compute infrastructure resources
and ensure optimal application performance. The LSF Master is the LSF server host that acts as the overall coordinator for the cluster. Each cluster has one master host that does
all job scheduling and dispatch.


Overview

IBM Platform LSF Master v9.1.3

IBM Platform LSF is a powerful workload management platform for demanding, distributed High Performance Computing (HPC) environments.
It provides a comprehensive set of intelligent, policy-driven scheduling features that enable you to utilize
all of your compute infrastructure resources and ensure optimal application performance.

An LSF Master host is a server that acts as the overall coordinator for the cluster. Each cluster has one master host that does all
job scheduling and dispatch. If the master host goes down, another LSF server in the cluster (called a Master Candidate) becomes the master host.

More information is available at the IBM Knowledge Center.

Please note that the LSF charms (LSF Storage, LSF Master and LSF Server) create an LSF cluster in which every host has the same host type (either x86 or ppcle). LSF itself supports clusters that mix host types, but the LSF charms currently support only the single-host-type model, so all hosts (master and server) must have the same machine architecture, either x86 or ppcle.

Usage

This charm does not download or require any LSF product binaries; these are handled by the IBM Platform LSF Storage charm.
The LSF installation and configuration files are shared over NFS when a relation is added between LSF Storage and LSF Master.

Deploy

This charm deploys the LSF Master and Master Candidates. The LSF Master acts as the overall master for the LSF Cluster.
On its own, this charm does nothing functionally. To have a working LSF Cluster, you need to add a relation between LSF Master and LSF Storage.

To deploy the IBM Platform LSF Master charm, follow the steps below:

1) Deploy the LSF Storage charm

juju deploy ibm-platform-lsf-storage \
  --resource ibm_lsf_installer=</path/to/installer.tar.Z> \
  --resource ibm_lsf_installation_x86=</path/to/distribution.tar.Z> \
  --resource ibm_lsf_entitlement_file=</path/to/entitlement.dat>

Note: The resource that supplies the installation/distribution archive is different for x86 and Power. If you are deploying on x86 machines, use the resource name
ibm_lsf_installation_x86; on Power, use ibm_lsf_installation_ppcle.
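
The resource name can be derived from the machine architecture. The following is a minimal sketch, not part of the charm: the function name lsf_installation_resource is hypothetical, and it assumes that `uname -m` reports x86_64 on x86 machines and ppc64le on Power machines.

```shell
# Hypothetical helper (not part of the charm): map a machine architecture,
# as reported by `uname -m`, to the LSF Storage resource name from the note.
lsf_installation_resource() {
  case "$1" in
    x86_64)  echo "ibm_lsf_installation_x86" ;;
    ppc64le) echo "ibm_lsf_installation_ppcle" ;;
    *)       echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}
```

For example, `lsf_installation_resource "$(uname -m)"` prints the resource name to use for the machine it runs on, or fails for any other architecture.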

2) Deploy the LSF Master charm

juju deploy ibm-platform-lsf-master
          or
juju deploy -n <num> ibm-platform-lsf-master

where <num> is the number of LSF Master units you want to deploy.

3) Add a relation between LSF Storage and LSF Master

juju add-relation ibm-platform-lsf-storage ibm-platform-lsf-master

This creates a Platform LSF Master host to which more hosts can then be added to form the LSF Cluster.
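
The three steps above can be collected into one script. The following is a minimal sketch, not part of the charm: the names run and deploy_lsf_cluster and the DRY_RUN variable are hypothetical, and it uses the x86 resource name (on Power, substitute ibm_lsf_installation_ppcle as noted above). By default it only prints the commands so they can be reviewed first.

```shell
# With DRY_RUN=1 (the default in this sketch) commands are printed, not run.
# Set DRY_RUN=0 to execute them against your Juju model.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: the three deploy steps, in order.
# Arguments: paths to the installer, the x86 distribution, and the
# entitlement file.
deploy_lsf_cluster() {
  installer="$1"
  distribution="$2"
  entitlement="$3"

  # 1) Deploy the LSF Storage charm with its three resources.
  run juju deploy ibm-platform-lsf-storage \
    --resource "ibm_lsf_installer=$installer" \
    --resource "ibm_lsf_installation_x86=$distribution" \
    --resource "ibm_lsf_entitlement_file=$entitlement" || return 1

  # 2) Deploy the LSF Master charm.
  run juju deploy ibm-platform-lsf-master || return 1

  # 3) Relate LSF Storage and LSF Master.
  run juju add-relation ibm-platform-lsf-storage ibm-platform-lsf-master
}
```

Calling `deploy_lsf_cluster` with the three file paths prints the three juju commands in order; rerun with DRY_RUN=0 once the output looks right.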

Add units of LSF Master

To add more LSF Master hosts to your LSF Cluster, use add-unit. Each added unit is designated as an LSF Master Candidate host.

juju add-unit ibm-platform-lsf-master

This command adds an LSF Master Candidate to your existing LSF Cluster.

Remove units of LSF Master

juju remove-unit <unit name of the LSF Master you want to remove>

This removes the unit from the existing LSF Cluster. The LSF configuration files are updated and the shared LSF files are unmounted.

Note: If you remove the unit that is the LSF Master host, your LSF Cluster keeps working, because the next LSF Master Candidate becomes the new Master. You can add a new LSF Master host by adding a unit to the existing cluster.

Removing Relations

The IBM Platform LSF Master charm can be related to the LSF Storage charm as well as the LSF Server charm. To remove either relation, use the commands below:

juju remove-relation ibm-platform-lsf-master ibm-platform-lsf-server

This removes the server host information from the LSF configuration files, and the server host is no longer part of the existing LSF Cluster.

juju remove-relation ibm-platform-lsf-storage ibm-platform-lsf-master

This stops the LSF daemons on the Master, removes its entries from the LSF configuration files, and then unmounts the shared LSF installation files.

Verify the LSF Cluster

Once the IBM Platform LSF Master is deployed successfully and the relation between LSF Storage and LSF Master is established, you can verify your LSF cluster by running a few LSF commands:

  • Log in to the machine where the LSF Master or an LSF Master Candidate is installed, as the LSF administrator user lsfadmin.

  • Source the LSF profile: . /usr/share/lsf/conf/profile.lsf

  • Go to the directory /usr/share/lsf/9.1/linux2.6-glibc2.3-x86_64/bin. The distribution directory name depends on the operating system and architecture; on Power the path is /usr/share/lsf/9.1/linux3.10-glibc2.17-ppc64le/bin. From there, run commands such as lsid, lshosts and bhosts to verify that your cluster is up and running.
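
The verification steps above can be sketched as a small script to run as lsfadmin on the Master or a Master Candidate. This is a minimal sketch, not part of the charm: the function names lsf_bin_dir and verify_lsf_cluster are hypothetical, and it assumes `uname -m` reports x86_64 or ppc64le and that the paths match the ones listed above.

```shell
# Hypothetical helper: pick the LSF bin directory for an architecture,
# using the two paths given in the steps above.
lsf_bin_dir() {
  case "$1" in
    x86_64)  echo "/usr/share/lsf/9.1/linux2.6-glibc2.3-x86_64/bin" ;;
    ppc64le) echo "/usr/share/lsf/9.1/linux3.10-glibc2.17-ppc64le/bin" ;;
    *)       echo "unknown architecture: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: source the LSF profile, then run lsid, lshosts and
# bhosts, stopping at the first command that fails.
verify_lsf_cluster() {
  profile=/usr/share/lsf/conf/profile.lsf
  if [ ! -r "$profile" ]; then
    echo "cannot read $profile (is the LSF Storage relation in place?)" >&2
    return 1
  fi
  . "$profile"

  bin_dir="$(lsf_bin_dir "$(uname -m)")" || return 1
  for cmd in lsid lshosts bhosts; do
    "$bin_dir/$cmd" || return 1
  done
}
```

Running `verify_lsf_cluster` returns a non-zero status if the profile is missing or any of the three commands fails, which makes it easy to use from other scripts.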

IBM Platform LSF Information

(1) General Information

Information on IBM Platform LSF is available at the IBM Knowledge Center.

(2) Contact Information

For issues with this charm, please contact the IBM Juju Support Team at jujusupp@us.ibm.com

(3) Known Limitations

This charm depends on the LSF Storage charm, which makes use of Juju features that are only available
in Juju version 2.0 or greater.
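
The version requirement can be checked before deploying. The following is a minimal sketch, not part of the charm: the function name juju_version_supported is hypothetical, and it assumes the version string starts with the major version followed by a dot, for example 2.0.1-xenial-amd64.

```shell
# Hypothetical helper: succeed if a Juju version string such as
# "2.0.1-xenial-amd64" has a major version of 2 or greater.
juju_version_supported() {
  major="${1%%.*}"              # text before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;    # empty or not a plain number
  esac
  [ "$major" -ge 2 ]
}
```

A typical use would be `juju_version_supported "$(juju version)" || echo "Juju 2.0 or greater is required" >&2`.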