Docker container (base) for the converter software
This directory contains all the files needed to create the base Docker image used for the converter software in projects.
A specific project's translation script can be executed in one of three ways: app-like, by calling it directly from within the container; as a local copy executed by the container; or in an interactive shell within the container (I.O.U. a link to the anchor here).
Building & running the Docker container
This is a quick tour of how to build and run the Docker container in different scenarios. It is not a lecture on containerisation in general, nor on Linux/Unix, shell scripting or programming. If you encounter a specific problem, feel free to ping the MA team.
This section describes four use cases of the Docker container (including build instructions per use case) but starts with a short primer on what is common to all scenarios described here.
Prerequisites
For building the Docker image, you need a local copy of the Git repository. After cloning, this guide assumes you are in the projects subdirectory (we skip the output of the commands here):
$ git clone https://git.scicore.unibas.ch/schwede/modelcif-converters.git modelcif-converters.git
$ cd modelcif-converters.git/projects
$
Since the Docker container will run a dedicated, non-root user internally, it is advisable to create this user with the ID of your local user. That way, file permission issues will be avoided. Get your user ID with the following command and write it down - it will be needed in the build steps:
$ whoami
localuser
$ id
uid=1234(localuser) ...
$
Look for the uid in the output of id. In the example above, 1234 is the ID of user localuser, currently logged in and executing the commands.
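Instead of reading the ID off the full output of id, the numeric value can be captured directly with id -u, which prints only the user ID. The following is a minimal sketch; storing the value in a shell variable named MMCIF_USER_ID is our own choice here, made so that it matches the build argument used later.

```shell
# `id -u` prints only the numeric user ID of the current user.
# The variable name is an assumption, chosen to match the build argument.
MMCIF_USER_ID="$(id -u)"
echo "Building with MMCIF_USER_ID=${MMCIF_USER_ID}"
```

The variable can then be passed on to the build, for example as --build-arg MMCIF_USER_ID="${MMCIF_USER_ID}".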
One last thing that is needed for the example runs of the Docker container is data. For simplicity, we assume that a directory /home/user/models, full of modelling data, exists on the local computer executing the converter.
Run a fixed converter from within the Docker container (app-like)
This use case comes closest to having a Docker container that works like a ModelCIF converter app that you can
- hand over to others
- send to a compute cluster
- turn into a Singularity image
and the conversion to ModelCIF works out of the box.
The idea is to copy a translation script from one of the projects into the Docker image along with all the software needed to run it. That enables you to start the script as a command with docker run.
The whole build of the Docker image, including installing the necessary software and copying the translation script, is covered by our Dockerfile. You just need to specify the translation script with the build-time argument CONVERTERSCRIPT during docker build. By default, the translation script is renamed to convert2modelcif so it can be called as a command. This name can be overridden using the build-time argument CONVERTERCMD. There is also a fixed alias, 2cif, for the converter command; it cannot be changed.
The following command builds a Docker image named converter (with tag latest). The translation script is copied from USDA-ASFVG/translate2modelcif.py and made available as convert2modelcif in the Docker image and to docker run. With MMCIF_USER_ID, we set UID 1234 for the internal user of the Docker image, so mounted files have the right owner inside and outside of the Docker container (assuming the example user from above). Pay attention to the alternative Dockerfile location specified by -f docker/Dockerfile, since we call docker build from the projects subdirectory to get the right build context:
$ # DOCKER_BUILDKIT=1 is only needed for older versions of Docker.
$ DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile --build-arg MMCIF_USER_ID=1234 --build-arg CONVERTERSCRIPT=USDA-ASFVG/translate2modelcif.py -t converter:latest .
$
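If the build is repeated for different translation scripts, the call can be wrapped in a small shell function. The sketch below is only an illustration: the function name build_converter, the DOCKER variable (set it to echo for a dry run that prints the command instead of executing it) and the example command name asfv2modelcif are hypothetical and not part of the repository. It uses only the build arguments described above and takes the UID from id -u.

```shell
# Hypothetical helper, to be run from the projects subdirectory.
# Usage: build_converter <translation script> [<command name>]
build_converter() {
    script="$1"    # e.g. USDA-ASFVG/translate2modelcif.py
    cmd="${2:-}"   # optional value for CONVERTERCMD
    # Collect the build arguments in the positional parameters.
    set -- --build-arg MMCIF_USER_ID="$(id -u)" \
           --build-arg CONVERTERSCRIPT="$script"
    if [ -n "$cmd" ]; then
        # Only override the command name when one was given.
        set -- "$@" --build-arg CONVERTERCMD="$cmd"
    fi
    # DOCKER=echo prints the command instead of running it (dry run).
    # Older Docker versions may additionally need DOCKER_BUILDKIT=1.
    ${DOCKER:-docker} build -f docker/Dockerfile "$@" -t converter:latest .
}
```

Calling build_converter USDA-ASFVG/translate2modelcif.py asfv2modelcif would then request an image in which the script is installed under the (made-up) name asfv2modelcif instead of convert2modelcif.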
After building the Docker image, it's time to run the translation command. To do so, we need to make the data available inside the Docker container. This is achieved by mounting the model-data directory into the Docker container. In the following example, the local /home/user/models directory of the host machine is made available as /data inside the container. The command sent to docker run therefore has to use /data paths, as it is executed inside the container:
$ docker run --rm -v /home/user/models:/data -t converter:latest convert2modelcif /data/ /data/proteome_accessions.csv
$
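The split between host paths and container paths is easy to get wrong, so it can help to hide the mount in a wrapper. The following sketch assumes an image built as above; the function name run_converter and the DOCKER variable (set to echo for a dry run) are hypothetical. The first argument is a directory on the host, all remaining arguments are passed to the converter unchanged and therefore have to be paths below /data.

```shell
# Hypothetical wrapper around the app-like docker run call.
# Usage: run_converter <host models dir> <converter arguments...>
run_converter() {
    models_dir="$1"
    shift
    # Fail early if the host directory to be mounted does not exist.
    if [ ! -d "$models_dir" ]; then
        echo "no such directory: $models_dir" >&2
        return 1
    fi
    # The host directory is visible as /data inside the container.
    ${DOCKER:-docker} run --rm -v "${models_dir}:/data" \
        -t converter:latest convert2modelcif "$@"
}
```

With this, run_converter /home/user/models /data/ /data/proteome_accessions.csv corresponds to the docker run call shown above.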
Run a local converter script with the Docker container
Instead of using the script which is statically copied into the Docker image, you can also use the Docker container as a runtime environment executing a translation script from disk. This comes in handy when converting many different modelling project types to ModelCIF. Rather than building individual Docker images per modelling variant, use a single one (that guarantees that all ModelCIF files are built with exactly the same software stack) and iterate through the various translation scripts.
That already works with the Docker image built for app-like execution. But in the build example here, we copy a file that cannot be executed as convert2modelcif. This makes sure that the Docker container is not accidentally run without declaring a specific script (note: don't try to use docker/README.md for CONVERTERSCRIPT; it is excluded from the build context by .dockerignore):
$ # DOCKER_BUILDKIT=1 is only needed for older versions of Docker.
$ DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile --build-arg MMCIF_USER_ID=1234 --build-arg CONVERTERSCRIPT=docker/requirements.txt -t converter:latest .
$
The dedicated translation script is made available inside the Docker container as a direct bind mount to the installed converter command. The remaining parameters are the same as for the app-like docker run command:
$ docker run --rm -v /home/user/models:/data -v $(pwd)/USDA-ASFVG/translate2modelcif.py:/usr/local/bin/convert2modelcif -t converter:latest convert2modelcif /data/ /data/proteome_accessions.csv
$
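When switching between several translation scripts, the two bind mounts can be assembled by a small function. This is a sketch under assumptions: the function name run_local_converter and the DOCKER variable (set to echo for a dry run) are hypothetical, and the mount target /usr/local/bin/convert2modelcif is taken from the command above. A relative script path is made absolute first, mirroring the $(pwd) prefix used above.

```shell
# Hypothetical wrapper: run a translation script from disk in the container.
# Usage: run_local_converter <script> <host models dir> <converter arguments...>
run_local_converter() {
    script="$1"
    models_dir="$2"
    shift 2
    # Turn a relative script path into an absolute one for the bind mount.
    case "$script" in
        /*) ;;
        *)  script="$(pwd)/$script" ;;
    esac
    # Mount the data as /data and the script over the installed command.
    ${DOCKER:-docker} run --rm \
        -v "${models_dir}:/data" \
        -v "${script}:/usr/local/bin/convert2modelcif" \
        -t converter:latest convert2modelcif "$@"
}
```

For the example project this would be called as run_local_converter USDA-ASFVG/translate2modelcif.py /home/user/models /data/ /data/proteome_accessions.csv. The arguments after the models directory depend on the individual translation script.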