Docker container (base) for the converter software
This directory contains all the files needed to create the base Docker image used for the converter software in projects.
A specific project's translation script can be executed in three ways: in an app-like manner by calling it directly from within the container, as a local copy executed by the container, or from an interactive shell within the container.
Building & running the Docker container
This is a quick tour of how to build and run the Docker container in different scenarios. It is not a lecture on containerisation in general, nor on Linux/Unix, shell scripting or programming. If you encounter a specific problem, feel free to ping the MA team.
This section describes four use cases of the Docker container (including build instructions per use case) but starts with a short primer of what is common to all scenarios described here.
Prerequisites
For building the Docker image, you need a local copy of the Git repository. After that, this guide assumes you are in the projects subdirectory (we skip the output of the commands here):
$ git clone https://git.scicore.unibas.ch/schwede/modelcif-converters.git modelcif-converters.git
$ cd modelcif-converters.git/projects
$
Since the Docker container runs as a dedicated, non-root user internally, it is advisable to create this user with the ID of your local user. That way, file permission issues are avoided. Get your user ID with the following commands and write it down; it will be needed in the build steps:
$ whoami
localuser
$ id
uid=1234(localuser) ...
$
Look for the uid in the output of id. In the example above, 1234 is the ID of user localuser, currently logged in and executing the commands.
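As a shortcut to reading the uid from the full output of id, the numeric ID can be captured directly. The following is a minimal sketch, assuming a POSIX shell; the variable name LOCAL_UID is purely illustrative and is not used anywhere in the repository.

```shell
# Capture the numeric ID of the current user.
# "id -u" prints only the effective UID, without the rest of the id output.
LOCAL_UID="$(id -u)"

# Show the value in the form of the MMCIF_USER_ID build argument.
# LOCAL_UID is a hypothetical helper variable for this example.
echo "MMCIF_USER_ID=${LOCAL_UID}"
```

The value could then be handed to docker build as --build-arg MMCIF_USER_ID="$LOCAL_UID" instead of typing the number by hand.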
One last thing that is needed for the example runs of the Docker container is data. For simplicity, we assume that a directory /home/user/models, full of modelling data, exists on the local computer executing a converter.
Run a fixed converter from within the Docker container (app-like)
This use case comes closest to having a Docker container that works like a ModelCIF converter app that you can
- hand over to others
- send to a compute cluster
- turn into a Singularity image
and the conversion to ModelCIF works out of the box.
The idea is to copy a translation script from one of the projects into the Docker image along with all the software needed to run it. That enables you to start the script as a command with docker run.
The whole build of the Docker image, including installing necessary software and copying the translation script, is covered by our Dockerfile. You just need to specify the translation script with the build-time argument CONVERTERSCRIPT during docker build. By default, the translation script is renamed to convert2modelcif so it can be called as a command. This name can be overridden using the build-time argument CONVERTERCMD. There is also an immutable alias, 2cif, for the converter command.
The following command builds a Docker image named converter (with tag latest). The translation script is copied from USDA-ASFVG/translate2modelcif.py and made available as convert2modelcif in the Docker image (and thus to docker run). With MMCIF_USER_ID, we set UID 1234 for the internal user of the Docker image, so mounted files have the right owner inside and outside of the Docker container (assuming the example user from above). Pay attention to the alternative Dockerfile location specified by -f docker/Dockerfile, as we are calling docker build from the projects subdirectory to get the right build context:
$ # DOCKER_BUILDKIT=1 is only needed for older versions of Docker.
$ DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile --build-arg MMCIF_USER_ID=1234 --build-arg CONVERTERSCRIPT=USDA-ASFVG/translate2modelcif.py -t converter:latest .
$
After building the Docker image, it's time to run the translation command. To do so, we need to make the data available inside the Docker container. This is achieved by mounting the model-data directory into the Docker container. In the following example, the local /home/user/models directory of the host machine is made available as /data inside the container. So the command sent to docker run has to use /data, as it is executed inside the container:
$ docker run --rm -v /home/user/models:/data -t converter:latest convert2modelcif /data/ /data/proteome_accessions.csv
$
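Because arguments are interpreted inside the container, every host path below the mounted directory has to be rewritten to its /data counterpart. The following sketch illustrates that mapping for the example mount; the function host_to_container and the variables MODEL_DIR and MOUNT_POINT are hypothetical helpers for this illustration and are not part of the repository.

```shell
# Host directory and mount point as used in the example above.
MODEL_DIR="/home/user/models"
MOUNT_POINT="/data"

# Hypothetical helper: translate a host path below MODEL_DIR into the
# path under which the same file is visible inside the container.
# Returns non-zero for paths outside of the mounted directory.
host_to_container() {
    case "$1" in
        "$MODEL_DIR")
            printf '%s\n' "$MOUNT_POINT" ;;
        "$MODEL_DIR"/*)
            # Strip the host prefix and prepend the mount point.
            printf '%s%s\n' "$MOUNT_POINT" "${1#"$MODEL_DIR"}" ;;
        *)
            return 1 ;;
    esac
}

host_to_container /home/user/models/proteome_accessions.csv
# prints /data/proteome_accessions.csv
```

Paths outside of /home/user/models are rejected, since they are not visible inside the container unless they are mounted separately.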
Run a local converter script with the Docker container
Instead of using the script that is statically copied into the Docker image, you can also use the Docker container as a runtime environment that executes a translation script from disk. This comes in handy when converting many different modelling project types to ModelCIF. Rather than building individual Docker images per modelling variant, use a single one (which guarantees that all ModelCIF files are built with exactly the same software stack) and iterate through the various translation scripts.
That already works with the Docker image built for app-like execution. But in the build example here, we copy a file that cannot be executed as convert2modelcif. This makes sure that the Docker container is not accidentally run without declaring a specific script (note: don't try to use docker/README.md for CONVERTERSCRIPT, as it is excluded from the build context by .dockerignore):
$ # DOCKER_BUILDKIT=1 is only needed for older versions of Docker.
$ DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile --build-arg MMCIF_USER_ID=1234 --build-arg CONVERTERSCRIPT=docker/requirements.txt -t converter:latest .
$
The dedicated translation script is made available inside the Docker container by bind-mounting it directly over the installed converter command. The remaining parameters are the same as for the app-like docker run command:
$ docker run --rm -v /home/user/models:/data -v $(pwd)/USDA-ASFVG/translate2modelcif.py:/usr/local/bin/convert2modelcif -t converter:latest convert2modelcif /data/ /data/proteome_accessions.csv
$
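Iterating through several translation scripts with a single image could be sketched as a dry run that only prints one docker run call per script. This is an illustration under assumptions: it presumes every project directory holds a script named translate2modelcif.py (only USDA-ASFVG is confirmed by this guide), and the function name list_converter_runs is hypothetical. The arguments after convert2modelcif are project-specific and therefore left out.

```shell
# Hypothetical helper: print (do not execute) a docker run command for each
# translation script found one level below the given projects directory.
# The directory should be an absolute path, as needed for bind mounts.
list_converter_runs() {
    projects_dir="$1"
    for script in "$projects_dir"/*/translate2modelcif.py; do
        # Skip the unexpanded pattern if no project matches.
        [ -f "$script" ] || continue
        printf 'docker run --rm -v /home/user/models:/data -v %s:/usr/local/bin/convert2modelcif -t converter:latest convert2modelcif\n' "$script"
    done
}

# Dry run for the current directory (expected to be the projects subdirectory).
list_converter_runs "$(pwd)"
```

Each printed line still needs the arguments of the respective translation script appended before it can be executed.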
Run the converter command in an interactive shell from within the Docker container
As the Docker image comes with a shell (bash) installed, the translation script can be started from an interactive session within the Docker container.
The Docker image does not need a special build command to allow interactive sessions; an image built for either the app-like or the local variant will work. The magic comes in with the docker run command.
A drawback of running bash inside a Docker container is the lack of your personal configuration, which stays outside of the container by default. That can be mended with a bind mount. In the example call, we also add a bash history file so that complex command lines are not lost:
$ touch .history
$ docker run --rm -i -t -v /home/user/models:/data -v $HOME/.bashrc:/home/mmcif/.bashrc -v $(pwd)/.history:/home/mmcif/.bash_history -t converter:latest bash
$
Any script or data mounted by docker run -v ... is available in the interactive shell, as is the convert2modelcif command.
Note the touch .history command before docker run. It makes sure a file .history exists before starting the Docker container. If the file does not exist, Docker will create a directory .history itself, but a directory cannot record the bash command history.
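The check described above can be made explicit before starting the container. The following is a minimal sketch, assuming a POSIX-like shell with mktemp available; the function name ensure_history_file is hypothetical and only serves this example.

```shell
# Hypothetical helper: make sure the given path exists as a regular file
# before it is used as a bind mount. Fails if the path already exists as
# something else, e.g. a directory created by an earlier docker run.
ensure_history_file() {
    history_path="$1"
    if [ -e "$history_path" ] && [ ! -f "$history_path" ]; then
        echo "error: ${history_path} exists but is not a regular file" >&2
        return 1
    fi
    touch "$history_path"
}

# Demonstrate on a throwaway directory instead of the real working directory.
demo_dir="$(mktemp -d)"
ensure_history_file "${demo_dir}/.history" && echo "history file ready"
```

If the helper reports an error for .history in your working directory, remove the stray directory and run touch .history again.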
Build the development Docker container
The Dockerfile has an additional build argument, ADD_DEV. If set to YES, the following development tools are added to the Docker image:
None of these are needed to run a translation script.
The build argument is simply added to the docker build call:
$ # DOCKER_BUILDKIT=1 is only needed for older versions of Docker.
$ DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile --build-arg MMCIF_USER_ID=1234 --build-arg ADD_DEV=YES -t converter:latest .
$
For working on a translation script, it is convenient to mount the complete Git repository when running the Docker container interactively. This makes sure pyproject.toml is available from the repository root to black and pylint:
$ touch .history
$ docker run --rm -i -t -v /home/user/models:/data -v $HOME/.bashrc:/home/mmcif/.bashrc -v $(pwd)/.history:/home/mmcif/.bash_history -v $(pwd)/../:/develop -t converter:latest bash
$
In the session, the Git repository can be found in /develop.