kanif - a TakTuk wrapper for cluster management
kash|kaget|kaput [-aHhimsV] [-f conf-file] [-l login] [-M machines-list] [-n|-w nodes] [-o options] [-p level] [-r command] [-T options] [-t timeout] [-u timeout] [-x nodes] [machines specifications] [command body]
kanif is a tool for cluster management and administration. It combines the main features of well-known cluster management tools such as c3, pdsh and dsh, and mimics their syntax. For the actual cluster management it relies on TakTuk, a tool for large-scale remote execution deployment.
For simple parallel tasks that have to be executed on regular machines such as clusters, TakTuk syntax is too complicated. The goal of kanif is to provide an easier and more familiar syntax to cluster administrators while still taking advantage of TakTuk's characteristics and features (adaptivity, scalability, portability, autopropagation and information redirection).
To work, kanif needs to find the taktuk command (version 3.3 and above) in the user path. The other requirements are the same as TakTuk's: it requires, on all the nodes of the cluster, a working Perl interpreter (version 5.8 and above) and a command to log in without a password (such as ssh with proper RSA keys installed).
kanif provides three simple commands for cluster administration and management: kash, kaput and kaget.
kanif combines the advantages of several cluster management tools. Its main features can be summarized as follows:
As with pdsh, kanif deployment can be monitored and controlled by signals. When kanif receives a SIGINT (usually sent by typing Ctrl-C), it displays a brief summary of its deployment state and command execution progress. After this first SIGINT, if kanif receives a second signal within one second:
At the end of execution, kanif also reports a quick summary of failures, covering both connections and command executions.
To help administrators in their task, kanif's option syntax is as close as possible to that of the well-known tools C3, pdsh and dsh.
conf-file as configuration file instead of default. Several possibilities are examined for the default configuration file, in order: $HOME/.kanif.conf, /etc/kanif.conf, /etc/c3.conf.
head node (using local interface) for all specified clusters.
login to connect to remote hosts.
machines-list. kanif accepts as many -M options as you wish.
nodes to the deployment. See section HOSTNAMES SPECIFICATION for more information about nodes syntax. kanif accepts as many -n options as you wish.
command used to contact remote hosts (default is ssh -o StrictHostKeyChecking=no -o BatchMode=yes).
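The search order for the default configuration file can be illustrated with a minimal Python sketch. It is not kanif's code: the function name find_config and its parameters are hypothetical, and it assumes that the first existing file in the list is the one used, which the text above suggests but does not state explicitly.

```python
import os

# Candidate locations, in the order given in the option description.
DEFAULT_CANDIDATES = ["$HOME/.kanif.conf", "/etc/kanif.conf", "/etc/c3.conf"]


def find_config(explicit=None, candidates=DEFAULT_CANDIDATES, exists=os.path.isfile):
    """Return the configuration file to use, or None if none is found.

    explicit stands for a file given on the command line (-f conf-file).
    Assumption: the first candidate that exists wins.
    """
    if explicit is not None:
        return explicit
    for candidate in candidates:
        path = os.path.expandvars(candidate)  # expands $HOME when it is set
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    print(find_config())
```

The exists parameter only serves to make the lookup easy to try out without touching the file system.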
Usually all kanif options can be set by environment variables. The rule is that boolean options take a 0/1 value and that environment settings are overridden by command line switches.
The name of an environment variable used by kanif is made of the long option name capitalized with dashes replaced by underscores and KANIF_ prepended (for instance KANIF_ALL, KANIF_HEAD, and so on).
This rule admits the following exceptions (that have been chosen to mimic C3/dsh behavior):
Notice also that the variable KANIF_WCOLL has no meaning to kanif.
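The general naming rule (leaving the exceptions aside) can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the long option names "all" and "head" are inferred from the KANIF_ALL and KANIF_HEAD examples, "some-option" is a made-up name used to show dash replacement, and treating an unset boolean as false is an assumption.

```python
import os


def env_var_name(long_option):
    """Map a long option name (without leading dashes) to its variable name."""
    return "KANIF_" + long_option.upper().replace("-", "_")


def resolve_boolean(long_option, cli_value=None, environ=None):
    """Resolve a boolean option: a command line switch overrides the environment.

    cli_value is True/False when the switch was given, None otherwise.
    Assumption: an option set nowhere is considered false.
    """
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    return environ.get(env_var_name(long_option), "0") == "1"


if __name__ == "__main__":
    print(env_var_name("all"))          # KANIF_ALL
    print(env_var_name("some-option"))  # KANIF_SOME_OPTION (hypothetical option)
```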
Hostnames given to kanif might be simple machine names or complex host list specifications. In its general form, a hostname is made of a host set and an optional exclusion set separated by a slash. Each of those sets is a comma separated list of host templates. Each of these templates is made of constant parts (characters outside brackets) and optional range parts (characters inside brackets). Each range part is a comma separated list of intervals or single values. Each interval is made of two single values separated by a dash. This is true for all hostnames given to kanif (both with -M or -n/-w options).
In other words, the following expressions are valid host specifications:
    node1
    node[19]
    node[1-3]
    node[1-3],otherhost/node2
    node[1-3,5]part[a-b]/node[3-5]parta,node1partb
They respectively expand to:
    node1
    node19
    node1 node2 node3
    node1 node3 otherhost
    node1parta node2parta node2partb node3partb node5partb
Notice that these lists of values are not regular expressions (node[19] is node19 and not node1, node2, ..., node9). Intervals are implemented using the Perl magical auto increment feature, thus you can use alphanumeric values as interval bounds (see the Perl documentation, operator ++, for the limitations of this auto increment).
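The expansion rules described in this section can be illustrated with a minimal Python sketch. It is not kanif's implementation: all function names are hypothetical and, as a simplifying assumption, intervals are only supported between plain decimal integers or between single letters, which covers less than Perl's auto increment.

```python
import itertools
import re


def split_top_level(text, sep=","):
    """Split on sep, ignoring separators that appear inside [brackets]."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def expand_range(body):
    """Expand the inside of one bracket pair, e.g. '1-3,5' -> 1, 2, 3, 5."""
    values = []
    for item in body.split(","):
        if "-" not in item:
            values.append(item)  # single value
            continue
        low, high = item.split("-", 1)
        if low.isdigit() and high.isdigit():
            values.extend(str(i) for i in range(int(low), int(high) + 1))
        elif len(low) == 1 and len(high) == 1 and low.isalpha() and high.isalpha():
            values.extend(chr(c) for c in range(ord(low), ord(high) + 1))
        else:
            raise ValueError(f"interval not supported by this sketch: {item!r}")
    return values


def expand_template(template):
    """Expand one host template into the host names it denotes."""
    # With a capturing group, re.split alternates constant parts (even
    # indexes) and range bodies (odd indexes).
    pieces = re.split(r"\[([^\]]*)\]", template)
    choices = [
        expand_range(piece) if index % 2 else [piece]
        for index, piece in enumerate(pieces)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


def expand_hostname(spec):
    """Expand 'host set[/exclusion set]' into an ordered list of host names."""
    host_part, _, excluded_part = spec.partition("/")
    excluded = set()
    if excluded_part:
        for template in split_top_level(excluded_part):
            excluded.update(expand_template(template))
    hosts = []
    for template in split_top_level(host_part):
        for name in expand_template(template):
            if name not in excluded and name not in hosts:
                hosts.append(name)
    return hosts


if __name__ == "__main__":
    print(expand_hostname("node[1-3,5]part[a-b]/node[3-5]parta,node1partb"))
    # ['node1parta', 'node2parta', 'node2partb', 'node3partb', 'node5partb']
```

Commas are split at the top level only, so that a comma inside brackets separates range values rather than host templates.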
With kanif, you can specify the remote nodes on which you want to operate using the command line switches (-n and -x, pdsh/dsh style), using machine specifications (C3 style) or both. Thus, this part of the documentation may be skipped if you do not want to use C3 style node management.
To use machine specifications you must describe your cluster in a configuration file (see -f option and kanif.conf(5)). Machine specifications are node intervals taken from clusters defined in this file.
A machine specification is an optional cluster name followed by a colon and an optional range. The default cluster is taken if no cluster name is given. All the nodes of the cluster are taken if no range is given. Notice that if none of -n/-w, -M or machine specification is given on the command line, the remote hosts are assumed to be all the nodes of the default cluster.
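The shape of a machine specification can be illustrated with a small Python sketch. The function name is hypothetical and the range is kept as an uninterpreted string, since its exact syntax is described in kanif.conf(5) rather than here. None stands for "default cluster" or "all the nodes".

```python
def parse_machine_spec(spec):
    """Split 'cluster:range' into (cluster, range); missing parts become None."""
    cluster, colon, node_range = spec.partition(":")
    if not colon:
        raise ValueError(f"not a machine specification (no colon): {spec!r}")
    return (cluster or None, node_range or None)


if __name__ == "__main__":
    for spec in ("megacluster:", ":3-6", "megacluster:2-5"):
        print(spec, "->", parse_machine_spec(spec))
```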
Depending on the name used to invoke it (kash, kaput or kaget), kanif performs a different task. Its behaviors are as follows:
cp(1).
Notice that when using kaget or kaput each file or directory is completely copied before proceeding to the next one.
When a configuration file exists on the system or is given on the command line (see option -f), remote machines can be specified via cluster names. For instance, the simple execution of the command ls -l on all the nodes of the cluster named megacluster can be written:
kash megacluster: ls -l
Intervals can also be given. The following command copies the local .cshrc file to the login directory of a subset of the default cluster and another subset of the megacluster:
kaput :3-6 megacluster:2-5 $HOME/.cshrc .
Finally, one can take advantage of the default behavior to gather a file named results.txt placed in the /tmp directory on all the nodes of the default cluster to the local directory results:
kaget /tmp/results.txt results
When a user does not want to write a configuration file or just wants to deploy on some other nodes, it is possible to give remote hosts on the command line:
kash -n localhost,supernode uptime
This last command will just execute uptime on localhost and supernode.
Giving intervals and exclusion lists is also possible on the command line. The following command copies the file /tmp/temporary.txt to the remote /tmp directories of node1 and node5:
kaput -n node[1-6] -x node[2-4],node6 /tmp/temporary.txt /tmp
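As a quick check of the node arithmetic in this example, the following Python lines recompute the target set by hand. This only illustrates the inclusion and exclusion lists, not how kanif computes them.

```python
included = [f"node{i}" for i in range(1, 7)]              # -n node[1-6]
excluded = {f"node{i}" for i in range(2, 5)} | {"node6"}  # -x node[2-4],node6
targets = [host for host in included if host not in excluded]
print(targets)  # ['node1', 'node5']
```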
Finally, without entering into the details of each option, the final command illustrates the -u option. It runs a ping to gateway from 5 nodes for 5 seconds:
kash -n node[1-2],node[4-6] -u 5 ping gateway
Missing features:
Performance issues:
taktuk(1), kanif.conf(5)
The author of kanif and current maintainer of the package is Guillaume Huard. Acknowledgements to Lucas Nussbaum for the idea of the name ``kanif''.
kanif is provided under the terms of the GNU General Public License version 2 or later.