4. Charliecloud command reference

This section is a comprehensive description of the usage and arguments of the Charliecloud commands. Its content is identical to the commands’ man pages.

4.1. ch-build

Wrapper for docker build that works around some of its annoying behaviors.

4.1.1. Synopsis

$ ch-build -t TAG [ARGS ...] CONTEXT

4.1.2. Description

Build a Docker image named TAG from the Dockerfile ./Dockerfile, or from the file given with --file. This is a wrapper for docker build with various enhancements.

Sudo privileges are required to run the docker command.

Arguments:

--file
Dockerfile to use (default: ./Dockerfile)
-t
name (tag) of Docker image to build
--help
print help and exit
--version
print version and exit

Additional arguments are accepted and passed unchanged to docker build.

4.1.3. Improvements over plain docker build

ch-build adds the following features to docker build:

  • If there is a file Dockerfile in the current working directory and -f is not already specified, add -f $PWD/Dockerfile.
  • Pass the HTTP proxy environment variables through with --build-arg.
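As a rough illustration of these two features, the following sketch prints (rather than runs) the kind of docker build command line such a wrapper might assemble. The function name proxy_build_args and the list of proxy variable names are assumptions made for this example; they are not taken from ch-build itself.

```shell
# Sketch only: assemble --build-arg flags for proxy variables that are set.
# The variable names listed here are an assumption for illustration.
proxy_build_args() {
    for var in http_proxy https_proxy no_proxy; do
        # Look up the value of the variable named by $var (empty if unset).
        eval "val=\${$var-}"
        if [ -n "$val" ]; then
            printf ' --build-arg %s=%s' "$var" "$val"
        fi
    done
}

# Print the command instead of running it, so Docker is not needed here.
echo "docker build -t foo -f $PWD/Dockerfile$(proxy_build_args) /bar"
```

The printed command depends on which proxy variables are set in the calling shell; with none set, no --build-arg flags appear.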

Note

The suffix :latest is somewhat misleading, as neither ch-build nor bare docker build will notice if the base FROM image has been updated. Use --no-cache to make sure you have the latest base image, at the cost of rebuilding every layer.

4.1.4. Examples

Create a Docker image tagged foo and specified by the file Dockerfile located in the current working directory. Use /bar as the Docker context directory:

$ ch-build -t foo /bar

Equivalent to above:

$ ch-build -t foo --file=./Dockerfile /bar

Instead, use the Dockerfile /baz/qux.docker:

$ ch-build -t foo --file=/baz/qux.docker /bar

Note that calling your Dockerfile anything other than Dockerfile will confuse people.

4.2. ch-build2dir

Build a Charliecloud image from Dockerfile and unpack it.

4.2.1. Synopsis

$ ch-build2dir CONTEXT DEST [ARGS ...]

4.2.2. Description

Build a Docker image as specified by the file Dockerfile in the current working directory and context directory CONTEXT. Unpack it in DEST.

Sudo privileges are required to run the docker command.

This runs the command sequence ch-build, ch-docker2tar, ch-tar2dir, but provides less flexibility than running the individual commands.
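The sequence can be sketched as the three commands below. The sketch only prints them. The tag parameter is hypothetical, because the synopsis above has no TAG argument and this text does not say how ch-build2dir names the intermediate image. The tarball name follows the ch-docker2tar example (IMAGE.tar.gz in OUTDIR).

```shell
# Sketch only: print individual commands roughly equivalent to ch-build2dir.
# "tag" is a hypothetical parameter introduced for illustration.
build2dir_steps() {
    tag=$1; context=$2; dest=$3
    echo "ch-build -t $tag $context"
    echo "ch-docker2tar $tag $dest"
    echo "ch-tar2dir $dest/$tag.tar.gz $dest"
}

build2dir_steps foo /bar /var/tmp
```

Running the steps individually allows, for example, keeping the tarball or unpacking it on a different host.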

Arguments:

CONTEXT
Docker context directory
DEST
directory in which to place image tarball and directory
ARGS
additional arguments passed to ch-build
--help
print help and exit
--version
print version and exit

4.3. ch-docker2tar

Flatten a Docker image into a Charliecloud image tarball.

4.3.1. Synopsis

$ ch-docker2tar IMAGE OUTDIR

4.3.2. Description

Flattens the Docker image tagged IMAGE into a Charliecloud tarball in directory OUTDIR.

Sudo privileges are required to run docker export.

Additional arguments:

--help
print help and exit
--version
print version and exit

4.3.3. Example

$ ch-docker2tar hello /var/tmp
57M /var/tmp/hello.tar.gz
$ ls -lh /var/tmp
-rw-r-----  1 reidpr reidpr  57M Feb 13 16:14 hello.tar.gz

4.4. ch-fromhost

Inject files from the host into an image directory.

4.4.1. Synopsis

$ ch-fromhost [OPTION ...] [FILE_OPTION ...] IMGDIR

4.4.2. Description

Note

This command is experimental. Features may be incomplete and/or buggy. Please report any issues you find, so we can fix them!

Inject files from the host into the Charliecloud image directory IMGDIR.

The purpose of this command is to provide host-specific files, such as GPU libraries, to a container. It should be run after ch-tar2dir and before ch-run. After invocation, the image is no longer portable to other hosts.

Injection is not atomic; if an error occurs partway through injection, the image is left in an undefined state. Injection is currently implemented using a simple file copy, but that may change in the future.

By default, file paths that contain the strings /bin or /sbin are assumed to be executables and placed in /usr/bin within the container. File paths that contain the strings /lib or .so are assumed to be shared libraries and are placed in the first-priority directory reported by ldconfig (see --lib-path below). Other files are placed in the directory specified by --dest.
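The inference rules above can be sketched as a small shell function. This is an illustration, not the ch-fromhost implementation: the function name and parameters are hypothetical, lib_path stands in for the directory reported by --lib-path, and the order in which the two substring tests are tried is an assumption.

```shell
# Sketch only. $1 = host path, $2 = shared library directory (hypothetical
# stand-in for --lib-path output), $3 = --dest argument, or empty if none.
infer_dest() {
    path=$1; lib_path=$2; dest=$3
    if [ -n "$dest" ]; then
        echo "$dest"                 # --dest overrides any inference
        return 0
    fi
    case $path in
        */bin*|*/sbin*) echo /usr/bin ;;       # assumed to be an executable
        */lib*|*.so*)   echo "$lib_path" ;;    # assumed to be a shared library
        *)
            echo "cannot infer destination for $path" >&2
            return 1 ;;
    esac
}

infer_dest /bin/bar /usr/lib ""               # /usr/bin
infer_dest /usr/lib64/libfoo.so /usr/lib ""   # /usr/lib
infer_dest /etc/quux /usr/lib /etc            # /etc
```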

If any shared libraries are injected, run ldconfig inside the container (using ch-run -w) after injection.

4.4.3. Options

4.4.3.1. To specify which files to inject

-c, --cmd CMD
Inject files listed in the standard output of command CMD.
-f, --file FILE
Inject files listed in the file FILE.
-p, --path PATH
Inject the file at PATH.
--cray-mpi
Cray-enable an MPICH installed inside the image. See important details below.
--nvidia
Use nvidia-container-cli list (from libnvidia-container) to find executables and libraries to inject.

These can be repeated, and at least one must be specified.

4.4.3.2. To specify the destination within the image

-d, --dest DST
Place files specified later in directory IMGDIR/DST, overriding the inferred destination, if any. If a file’s destination cannot be inferred and --dest has not been specified, exit with an error. This can be repeated to place files in varying destinations.
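Because --dest affects only the files named after it, option order matters. The following sketch walks an argument list from left to right and reports where each --path file would go. plan_injection is a hypothetical helper; only --dest and --path are modelled, and well-formed arguments are assumed.

```shell
# Sketch only: show which destination applies to each --path argument.
# Assumes every --dest and --path is followed by its argument.
plan_injection() {
    dest=""
    while [ $# -gt 0 ]; do
        case $1 in
            --dest)
                dest=$2              # applies to files specified later
                shift 2 ;;
            --path)
                if [ -n "$dest" ]; then
                    echo "$2 -> $dest"
                else
                    echo "$2 -> (inferred)"
                fi
                shift 2 ;;
            *)
                shift ;;
        esac
    done
}

plan_injection --path /bin/bar --dest /etc --path /etc/quux
# /bin/bar -> (inferred)
# /etc/quux -> /etc
```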

4.4.3.3. Additional arguments

--lib-path
Print the guest destination path for shared libraries inferred as described above.
--no-ldconfig
Don’t run ldconfig even if we appear to have injected shared libraries.
-h, --help
Print help and exit.
-v, --verbose
List the injected files.
--version
Print version and exit.

4.4.4. --cray-mpi prerequisites and quirks

The implementation of --cray-mpi for MPICH is messy, foul smelling, and brittle. It replaces or overrides the open source MPICH libraries installed in the container. Users should be aware of the following.

  1. Containers must have the following software installed:
    1. Open source MPICH.
    2. PatchELF with our patches. Use the shrink-soname branch.
    3. libgfortran.so.3, because Cray’s libmpi.so.12 links to it.
  2. Applications must be linked to libmpi.so.12 (not e.g. libmpich.so.12). How to configure MPICH to accomplish this is not yet clear to us; test/Dockerfile.mpich does it, while the Debian packages do not.
  3. One of the cray-mpich-abi modules must be loaded when ch-fromhost is invoked.
  4. Tested only for C programs compiled with GCC, and it probably won’t work otherwise. If you’d like to use another compiler or another programming language, please get in touch so we can implement the necessary support.

Please file a bug if we missed anything above or if you know how to make the code better.

4.4.5. Notes

Symbolic links are dereferenced, i.e., the files pointed to are injected, not the links themselves.

As a corollary, do not include symlinks to shared libraries. These will be re-created by ldconfig.

There are two alternate approaches for nVidia GPU libraries:

  1. Link libnvidia-container into ch-run and call the library functions directly. However, this would mean that Charliecloud would either (a) need to be compiled differently on machines with and without nVidia GPUs or (b) have libnvidia-container available even on machines without nVidia GPUs. Neither of these is consistent with Charliecloud’s philosophies of simplicity and minimal dependencies.
  2. Use nvidia-container-cli configure to do the injecting. This would require that containers have a half-started state, where the namespaces are active and everything is mounted but pivot_root(2) has not been performed. This is not feasible because Charliecloud has no notion of a half-started container.

Further, while these alternate approaches would simplify or eliminate this script for nVidia GPUs, they would not solve the problem for other situations.

4.4.6. Bugs

File paths may not contain colons or newlines.

4.4.7. Examples

Within the image /var/tmp/baz, place shared library /usr/lib64/libfoo.so at path /usr/lib/libfoo.so (assuming /usr/lib is the first directory searched by the dynamic loader in the image) and executable /bin/bar at path /usr/bin/bar. Then, create appropriate symlinks to libfoo and update the ld.so cache.

$ cat qux.txt
/bin/bar
/usr/lib64/libfoo.so
$ ch-fromhost --file qux.txt /var/tmp/baz

Same as above:

$ ch-fromhost --cmd 'cat qux.txt' /var/tmp/baz

Same as above:

$ ch-fromhost --path /bin/bar --path /usr/lib64/libfoo.so /var/tmp/baz

Same as above, but place the files into /corge instead (and the shared library will not be found by ldconfig):

$ ch-fromhost --dest /corge --file qux.txt /var/tmp/baz

Same as above, and also place file /etc/quux at /etc/quux within the container:

$ ch-fromhost --file qux.txt --dest /etc --path /etc/quux /var/tmp/baz

Inject the executables and libraries recommended by nVidia into the image, and then run ldconfig:

$ ch-fromhost --nvidia /var/tmp/baz

4.4.8. Acknowledgements

This command was inspired by the similar Shifter feature that allows Shifter containers to use the Cray Aries network. We particularly appreciate the help provided by Shane Canon and Doug Jacobsen during our implementation of --cray-mpi.

We appreciate the advice of Ryan Olson at nVidia on implementing --nvidia.

4.5. ch-pull2dir

Download image via docker pull and unpack it into directory.

4.5.1. Synopsis

$ ch-pull2dir IMAGE[:TAG] DIR

4.5.2. Description

Pull Docker image named IMAGE[:TAG] from Docker Hub and extract it into a subdirectory of DIR. A temporary tarball is stored in DIR.

Sudo privileges are required to run the docker pull command.

This runs the following command sequence: ch-pull2tar, ch-tar2dir. See warning in the documentation for ch-tar2dir.

Additional arguments:

--help
print help and exit
--version
print version and exit

4.5.3. Examples

$ ch-pull2dir alpine /var/tmp
Using default tag: latest
latest: Pulling from library/alpine
Digest: sha256:621c2f39f8133acb8e64023a94dbdf0d5ca81896102b9e57c0dc184cadaf5528
Status: Image is up to date for alpine:latest
-rw-r--r--. 1 charlie charlie 2.1M Oct  5 19:52 /var/tmp/alpine.tar.gz
creating new image /var/tmp/alpine
/var/tmp/alpine unpacked ok
removed '/var/tmp/alpine.tar.gz'

Same as above, except optional TAG is specified:

$ ch-pull2dir alpine:3.6 /var/tmp
3.6: Pulling from library/alpine
Digest: sha256:cc24af836d1377e092ecb4e8f0a4324c3b1aa2b5295c2239edcc7bbc86a9cbc6
Status: Image is up to date for alpine:3.6
-rw-r--r--. 1 charlie charlie 2.1M Oct  5 19:54 /var/tmp/alpine:3.6.tar.gz
creating new image /var/tmp/alpine:3.6
/var/tmp/alpine:3.6 unpacked ok
removed '/var/tmp/alpine:3.6.tar.gz'

4.6. ch-pull2tar

Download image via docker pull and flatten it to tarball.

4.6.1. Synopsis

$ ch-pull2tar IMAGE[:TAG] OUTDIR

4.6.2. Description

Pull a Docker image named IMAGE[:TAG] from Docker Hub and flatten it into a Charliecloud tarball in directory OUTDIR.

This runs the command sequence docker pull, ch-docker2tar, but provides less flexibility than running the individual commands.

Sudo privileges are required for docker pull.

Additional arguments:

--help
print help and exit
--version
print version and exit

4.6.3. Examples

$ ch-pull2tar alpine /var/tmp
Using default tag: latest
latest: Pulling from library/alpine
Digest: sha256:621c2f39f8133acb8e64023a94dbdf0d5ca81896102b9e57c0dc184cadaf5528
Status: Image is up to date for alpine:latest
-rw-r--r--. 1 charlie charlie 2.1M Oct  5 19:52 /var/tmp/alpine.tar.gz

Same as above, except optional TAG is specified:

$ ch-pull2tar alpine:3.6 /var/tmp
3.6: Pulling from library/alpine
Digest: sha256:cc24af836d1377e092ecb4e8f0a4324c3b1aa2b5295c2239edcc7bbc86a9cbc6
Status: Image is up to date for alpine:3.6
-rw-r--r--. 1 charlie charlie 2.1M Oct  5 19:54 /var/tmp/alpine:3.6.tar.gz

4.7. ch-run

Run a command in a Charliecloud container.

4.7.1. Synopsis

$ ch-run [OPTION...] NEWROOT CMD [ARG...]

4.7.2. Description

Run command CMD in a Charliecloud container using the flattened and unpacked image directory located at NEWROOT.

-b, --bind=SRC[:DST]
mount SRC at guest DST (default /mnt/0, /mnt/1, etc.)
-c, --cd=DIR
initial working directory in container
--ch-ssh
bind ch-ssh(1) into container at /usr/bin/ch-ssh
-g, --gid=GID
run as group GID within container
-j, --join
use the same container (namespaces) as peer ch-run invocations
--join-pid=PID
join the namespaces of an existing process
--join-ct=N
number of ch-run peers (implies --join; default: see below)
--join-tag=TAG
label for ch-run peer group (implies --join; default: see below)
--no-home
do not bind-mount your home directory (by default, your home directory is mounted at /home/$USER in the container)
-t, --private-tmp
use container-private /tmp (by default, /tmp is shared with the host)
--set-env=FILE
set environment variables as specified in host path FILE
-u, --uid=UID
run as user UID within container
--unset-env=GLOB
unset environment variables whose names match GLOB
-v, --verbose
be more verbose (debug if repeated)
-w, --write
mount image read-write (by default, the image is mounted read-only)
-?, --help
print help and exit
--usage
print a short usage message and exit
-V, --version
print version and exit

4.7.3. Host files and directories available in container via bind mounts

In addition to any directories specified by the user with --bind, ch-run has standard host files and directories that are bind-mounted in as well.

The following host files and directories are bind-mounted at the same location in the container. These cannot be disabled.

  • /dev
  • /etc/passwd
  • /etc/group
  • /etc/hosts
  • /etc/resolv.conf
  • /proc
  • /sys

Three additional bind mounts can be disabled by the user:

  • Your home directory (i.e., $HOME) is mounted at guest /home/$USER by default. This is accomplished by mounting a new tmpfs at /home, which hides any image content under that path. If --no-home is specified, neither of these things happens and the image’s /home is exposed unaltered.
  • /tmp is shared with the host by default. If --private-tmp is specified, a new tmpfs is mounted on the guest’s /tmp instead.
  • If file /usr/bin/ch-ssh is present in the image, it is over-mounted with the ch-ssh binary in the same directory as ch-run.

4.7.4. Multiple processes in the same container with --join

By default, different ch-run invocations use different user and mount namespaces (i.e., different containers). While this has no impact on sharing most resources between invocations, there are a few important exceptions. These include:

  1. ptrace(2), used by debuggers and related tools. One can attach a debugger to processes in descendant namespaces, but not sibling namespaces. The practical effect of this is that (without --join), you can’t run a command with ch-run and then attach to it with a debugger also run with ch-run.
  2. Cross-memory attach (CMA) is used by cooperating processes to communicate by simply reading and writing one another’s memory. This is also not permitted between sibling namespaces. This affects various MPI implementations that use CMA to pass messages between ranks on the same node, because it’s faster than traditional shared memory.

--join is designed to address this by placing related ch-run commands (the “peer group”) in the same container. This is done by one of the peers creating the namespaces with unshare(2) and the others joining with setns(2).

To do so, we need to know the number of peers and a name for the group. These are specified by additional arguments that can (hopefully) be left at default values in most cases:

  • --join-ct sets the number of peers. The default is the value of the first of the following environment variables that is defined: OMPI_COMM_WORLD_LOCAL_SIZE, SLURM_STEP_TASKS_PER_NODE, SLURM_CPUS_ON_NODE.
  • --join-tag sets the tag that names the peer group. The default is environment variable SLURM_STEP_ID, if defined; otherwise, the PID of ch-run’s parent. Tags can be re-used for peer groups that start at different times, i.e., once all peer ch-run have replaced themselves with the user command, the tag can be re-used.
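The default selection described above can be sketched as two shell functions. This is an illustration only: the function names are hypothetical, the parent PID is passed in as a parameter, and failing when none of the three variables is defined is an assumption (the text above does not say what happens in that case).

```shell
# Sketch only: print the value of the first of the three variables that is
# defined. Returning an error when none is defined is an assumption.
default_join_ct() {
    for var in OMPI_COMM_WORLD_LOCAL_SIZE SLURM_STEP_TASKS_PER_NODE \
               SLURM_CPUS_ON_NODE; do
        eval "is_set=\${$var+yes}"          # "yes" if $var names a set variable
        if [ -n "$is_set" ]; then
            eval "printf '%s\n' \"\$$var\""
            return 0
        fi
    done
    echo "no default for --join-ct" >&2
    return 1
}

# Sketch only: SLURM_STEP_ID if defined, otherwise the given parent PID.
default_join_tag() {
    if [ -n "${SLURM_STEP_ID+yes}" ]; then
        echo "$SLURM_STEP_ID"
    else
        echo "$1"
    fi
}

unset OMPI_COMM_WORLD_LOCAL_SIZE
SLURM_STEP_TASKS_PER_NODE=4
SLURM_CPUS_ON_NODE=16
default_join_ct             # prints 4: the earlier variable in the list wins
default_join_tag "$PPID"
```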

Caveats:

  • One cannot currently add peers after the fact, for example, if one decides to start a debugger after the fact. (This is only required for code with bugs and is thus an unusual use case.)
  • ch-run instances race. The winner of this race sets up the namespaces, and the other peers use the winner to find the namespaces to join. Therefore, if the user command of the winner exits, any remaining peers will not be able to join the namespaces, even if they are still active. There is currently no general way to specify which ch-run should be the winner.
  • If --join-ct is too high, the winning ch-run’s user command exits before all peers join, or ch-run itself crashes, IPC resources such as semaphores and shared memory segments will be leaked. These appear as files in /dev/shm/ and can be removed with rm(1).
  • Many of the arguments given to the race losers, such as the image path and --bind, will be ignored in favor of what was given to the winner.

4.7.5. Environment variables

ch-run leaves environment variables unchanged, i.e. the host environment is passed through unaltered, except:

  • limited tweaks to avoid significant guest breakage;
  • user-set variables via --set-env; and
  • user-unset variables via --unset-env.

This section describes these features.

The default tweaks happen first, and then --set-env and --unset-env in the order specified on the command line. The latter two can be repeated arbitrarily many times, e.g. to add/remove multiple variable sets or add only some variables in a file.

4.7.5.1. Default behavior

By default, ch-run makes the following environment variable changes:

  • $HOME: If the path to your home directory is not /home/$USER on the host, then an inherited $HOME will be incorrect inside the guest. This confuses some software, such as Spack.

    Thus, we change $HOME to /home/$USER, unless --no-home is specified, in which case it is left unchanged.

  • $PATH: Newer Linux distributions replace some root-level directories, such as /bin, with symlinks to their counterparts in /usr.

    Some of these distributions (e.g., Fedora 24) have also dropped /bin from the default $PATH. This is a problem when the guest OS does not have a merged /usr (e.g., Debian 8 “Jessie”). Thus, we add /bin to $PATH if it’s not already present.
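The two default tweaks can be sketched as follows. guest_home and guest_path are hypothetical helpers written for this example. Whether /bin is appended or prepended is not stated above, so appending is an assumption.

```shell
# Sketch only. $1 = user name, $2 = host value of HOME,
# $3 = "no-home" if --no-home was given.
guest_home() {
    if [ "${3-}" = no-home ]; then
        echo "$2"                       # left unchanged
    else
        echo "/home/$1"
    fi
}

# Sketch only. $1 = host value of PATH.
guest_path() {
    case ":$1:" in
        *:/bin:*) echo "$1" ;;          # /bin is already an element
        *)        echo "$1:/bin" ;;     # appending is an assumption
    esac
}

guest_home charlie /users/charlie             # /home/charlie
guest_home charlie /users/charlie no-home     # /users/charlie
guest_path /usr/local/bin:/usr/bin            # /usr/local/bin:/usr/bin:/bin
guest_path /usr/bin:/bin                      # /usr/bin:/bin
```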

4.7.5.2. Setting variables with --set-env

The purpose of --set-env=FILE is to set environment variables that cannot be inherited from the host shell, e.g. Dockerfile ENV directives or other build-time configuration. FILE is a host path to provide the greatest flexibility; guest paths can be specified by prepending the image path.

Variable values in FILE replace any already set. If a variable is repeated, the last value wins.

The syntax of FILE is key-value pairs separated by the first equals character (=, ASCII 61), one per line, with optional single straight quotes (', ASCII 39) around the value. Empty lines are ignored. Newlines (ASCII 10) are not permitted in either key or value. No variable expansion, comments, etc. are provided. The value may be empty, but not the key. (This syntax is designed to accept the output of printenv and be easily produced by other simple mechanisms.) Examples of valid lines:

Line                            Key    Value
FOO=bar                         FOO    bar
FOO=bar=baz                     FOO    bar=baz
FLAGS=-march=foo -mtune=bar     FLAGS  -march=foo -mtune=bar
FLAGS='-march=foo -mtune=bar'   FLAGS  -march=foo -mtune=bar
FOO=                            FOO    (empty string)
FOO=''                          FOO    (empty string)
FOO=''''                        FOO    '' (two single quotes)

Example invalid lines:

Line     Problem
FOO bar  no separator
=bar     key cannot be empty

Example valid lines that are probably not what you want:

Line                 Key   Value           Problem
FOO="bar"            FOO   "bar"           double quotes aren’t stripped
FOO=bar # baz        FOO   bar # baz       comments not supported
PATH=$PATH:/opt/bin  PATH  $PATH:/opt/bin  variables not expanded
␣FOO=bar             ␣FOO  bar             leading space in key
FOO=␣bar             FOO   ␣bar            leading space in value

(␣ denotes a single space character.)
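The syntax rules and tables above can be sketched as a small parser. This is an illustration of the described syntax, not ch-run's own code; the function names are hypothetical. How unbalanced quotes are treated is not specified above, so this sketch leaves them untouched.

```shell
# Sketch only: parse one line and print KEY and VALUE separated by a tab,
# or report one of the problems listed above.
parse_env_line() {
    line=$1
    case $line in
        =*)  echo "key cannot be empty" >&2; return 1 ;;
        *=*) ;;
        *)   echo "no separator" >&2; return 1 ;;
    esac
    key=${line%%=*}     # everything before the first "="
    value=${line#*=}    # everything after the first "="
    case $value in
        \'*\')          # strip one pair of surrounding single quotes
            value=${value#\'}
            value=${value%\'} ;;
    esac
    printf '%s\t%s\n' "$key" "$value"
}

# Parse standard input line by line, ignoring empty lines.
parse_env_file() {
    while IFS= read -r line || [ -n "$line" ]; do
        if [ -n "$line" ]; then
            parse_env_line "$line" || return 1
        fi
    done
}

printf '%s\n' "FOO=bar=baz" "" "FLAGS='-march=foo -mtune=bar'" | parse_env_file
```

No expansion or comment handling is attempted, so a line such as PATH=$PATH:/opt/bin yields the literal value $PATH:/opt/bin, as in the last table.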

Example Docker command to produce a valid FILE:

$ docker inspect $TAG --format='{{range .Config.Env}}{{println .}}{{end}}'

4.7.5.3. Removing variables with --unset-env

The purpose of --unset-env=GLOB is to remove unwanted environment variables. The argument GLOB is a glob pattern (dialect fnmatch(3) with no flags); all variables with matching names are removed from the environment.

Warning

Because the shell also interprets glob patterns, if any wildcard characters are in GLOB, it is important to put it in single quotes to avoid surprises.

GLOB must be a non-empty string.
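To get a feel for which names a given GLOB selects, the sketch below uses shell case patterns, which follow closely the same matching rules; treating them as equivalent to fnmatch(3) with no flags is an assumption of this example. matching_names is a hypothetical helper, and the pattern must match the whole name.

```shell
# Sketch only: print those names (arguments after the first) that match the
# glob given as the first argument.
matching_names() {
    glob=$1
    shift
    for name in "$@"; do
        # An unquoted expansion in a case pattern is treated as a glob.
        case $name in
            $glob) echo "$name" ;;
        esac
    done
}

matching_names 'SLURM*' SLURM_JOB_ID SLURM_NNODES HOME MY_SLURM
# SLURM_JOB_ID
# SLURM_NNODES
```

MY_SLURM is not printed because the pattern is anchored at the start of the name; a pattern without wildcards, such as FOO, matches only the variable named exactly FOO.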

Example 1: Remove the single environment variable FOO:

$ export FOO=bar
$ env | fgrep FOO
FOO=bar
$ ch-run --unset-env=FOO $CH_TEST_IMGDIR/chtest -- env | fgrep FOO
$

Example 2: Hide from a container the fact that it’s running in a Slurm allocation, by removing all variables beginning with SLURM. You might want to do this to test an MPI program with one rank and no launcher:

$ salloc -N1
$ env | egrep '^SLURM' | wc
   44      44    1092
$ ch-run $CH_TEST_IMGDIR/mpihello-openmpi -- /hello/hello
[... long error message ...]
$ ch-run --unset-env='SLURM*' $CH_TEST_IMGDIR/mpihello-openmpi -- /hello/hello
0: MPI version:
Open MPI v3.1.3, package: Open MPI root@c897a83f6f92 Distribution, ident: 3.1.3, repo rev: v3.1.3, Oct 29, 2018
0: init ok cn001.localdomain, 1 ranks, userns 4026532530
0: send/receive ok
0: finalize ok

Example 3: Clear the environment completely (remove all variables):

$ ch-run --unset-env='*' $CH_TEST_IMGDIR/chtest -- env
$

Note that some programs, such as shells, set some environment variables even if started with no init files:

$ ch-run --unset-env='*' $CH_TEST_IMGDIR/debian9 -- bash --noprofile --norc -c env
SHLVL=1
PWD=/
_=/usr/bin/env
$

4.7.6. Examples

Run the command echo hello inside a Charliecloud container using the unpacked image at /data/foo:

$ ch-run /data/foo -- echo hello
hello

Run an MPI job that can use CMA to communicate:

$ srun ch-run --join /data/foo -- bar

4.8. ch-ssh

Run a remote command in a Charliecloud container.

4.8.1. Synopsis

$ CH_RUN_ARGS="NEWROOT [ARG...]"
$ ch-ssh [OPTION...] HOST CMD [ARG...]

4.8.2. Description

Run command CMD in a Charliecloud container on remote host HOST. The content of environment variable CH_RUN_ARGS is used as the arguments to ch-run on the remote host.

Note

Words in CH_RUN_ARGS are delimited by spaces only; it is not shell syntax.
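The practical consequence is that quotes inside CH_RUN_ARGS do not group words. The sketch below imitates space-only splitting with the shell's IFS to count the resulting words; count_words is a hypothetical helper, and the treatment of repeated spaces here (collapsed by the shell) may differ from what ch-ssh does.

```shell
# Sketch only: split $1 on spaces, with globbing disabled, and print the
# number of words. The body runs in a subshell so IFS is not changed for
# the caller.
count_words() (
    set -f
    IFS=' '
    set -- $1
    echo $#
)

count_words "--cd /baz /data/foo"          # 3
count_words "--cd '/my dir' /data/foo"     # 4: the quotes do not group words
```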

4.8.3. Example

On host bar.example.com, run the command hostname inside a Charliecloud container using the unpacked image at /data/foo with starting directory /baz:

$ hostname
foo
$ export CH_RUN_ARGS='--cd /baz /data/foo'
$ ch-ssh bar.example.com -- hostname
bar

4.9. ch-tar2dir

Unpack an image tarball into a directory.

4.9.1. Synopsis

$ ch-tar2dir TARBALL DIR

4.9.2. Description

Extract the tarball TARBALL into a subdirectory of DIR. TARBALL must contain a Linux filesystem image, e.g. as created by ch-docker2tar, and be compressed with gzip or xz. If TARBALL has no extension, try appending .tar.gz and .tar.xz.

Inside DIR, a subdirectory will be created whose name corresponds to the name of the tarball with .tar.gz or other suffix removed. If such a directory exists already and appears to be a Charliecloud container image, it is removed and replaced. If the existing directory doesn’t appear to be a container image, the script aborts with an error.
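The naming rule can be sketched with shell parameter expansion. image_dir is a hypothetical helper; it strips only the .tar.gz and .tar.xz suffixes named above, since the other suffixes handled by ch-tar2dir are not listed in this text.

```shell
# Sketch only: print the image directory implied by TARBALL ($1) and DIR ($2).
image_dir() {
    tarball=$1; dir=$2
    name=${tarball##*/}      # drop leading directories
    name=${name%.tar.gz}     # drop a .tar.gz suffix, if present
    name=${name%.tar.xz}     # drop a .tar.xz suffix, if present
    echo "$dir/$name"
}

image_dir /var/tmp/hello.tar.gz /var/tmp          # /var/tmp/hello
image_dir /var/tmp/alpine:3.6.tar.gz /var/tmp     # /var/tmp/alpine:3.6
```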

Additional arguments:

--help
print help and exit
--verbose
be more verbose
--version
print version and exit

Warning

Placing DIR on a shared file system can cause significant metadata load on the file system servers. This can result in poor performance for you and all your colleagues who use the same file system. Please consult your site admin for a suitable location.

4.9.3. Example

$ ls -lh /var/tmp
total 57M
-rw-r-----  1 reidpr reidpr  57M Feb 13 16:14 hello.tar.gz
$ ch-tar2dir /var/tmp/hello.tar.gz /var/tmp
creating new image /var/tmp/hello
/var/tmp/hello unpacked ok
$ ls -lh /var/tmp
total 57M
drwxr-x--- 22 reidpr reidpr 4.0K Feb 13 16:29 hello
-rw-r-----  1 reidpr reidpr  57M Feb 13 16:14 hello.tar.gz