
Bitextor Docker

Leopoldo Pla edited this page Dec 2, 2021 · 7 revisions

Installing and running Bitextor via Docker

Docker installation

To install Docker refer to Docker documentation (recommended), or install via snap:

sudo snap install docker

Bitextor docker image

A Docker image of Bitextor is available on Docker Hub, in both release and nightly versions:

docker pull bitextor/bitextor # latest release
docker pull bitextor/bitextor:edge # nightlies from the GitHub master branch

Usage

After pulling the Bitextor image, Bitextor can be run with:

docker run bitextor/bitextor

The command above will launch bitextor.sh and show the help message. Any arguments that are provided to this command will be passed to bitextor.sh.

To run Bitextor you need to provide some resources to it, such as the config file, translation system, dictionaries, bicleaner model, etc. The Docker container must be able to read these files. Likewise, the output of Bitextor ideally should also be shared with the host system in a similar way. One way to achieve this is to use Docker volumes.

The following snippet is an example of Docker volume usage. The folder /home/user/bitextor-files of the host system will be mounted to the /home/docker/data folder of the container.

docker run -v /home/user/bitextor-files:/home/docker/data bitextor/bitextor

# multiple volumes are also allowed:
docker run -v /home/user/bitextor_input:/home/docker/bitextor_input -v /home/user/bitextor_output:/home/docker/bitextor_output bitextor/bitextor
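
The volume mapping above can be sketched as a small shell helper that resolves the host folder to an absolute path and prints the resulting command instead of running it. The function name bitextor_mount_cmd is hypothetical and not part of Bitextor; the resolution step is there because, depending on the Docker version, a -v source that is not an absolute path may be interpreted as a named volume rather than a host folder.

```shell
# Hypothetical helper (not part of Bitextor): print a `docker run` command
# that mounts HOST_DIR at CONTAINER_DIR and forwards any extra arguments.
# usage: bitextor_mount_cmd HOST_DIR CONTAINER_DIR [bitextor.sh arguments...]
bitextor_mount_cmd() (
    host_dir="$1"        # folder on the host system
    container_dir="$2"   # where the folder appears inside the container
    shift 2
    # Resolve the host folder to an absolute path; fail if it does not exist.
    host_dir="$(cd -- "$host_dir" >/dev/null && pwd)" || exit 1
    # Assemble the command word by word and print it on one line.
    set -- docker run -v "${host_dir}:${container_dir}" bitextor/bitextor "$@"
    echo "$*"
)
```

For example, `bitextor_mount_cmd ~/bitextor-files /home/docker/data` prints a command that can be reviewed and then pasted into the terminal.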

Inside the image, the Bitextor folder and its dependencies are located at /home/docker. All dependencies are already installed and compiled.

$ ls /home/docker
bitextor  go  heritrix-3.4.0-SNAPSHOT  protobuf-3.10.1

Note that the paths to input files and scripts specified in the configuration file must refer to locations inside the Docker container, that is, under the folders where the volumes are mounted, not to paths on the host. The same applies to the path of the config file passed to Bitextor on the command line.

$ # ~/bitextor-data folder contains the config file and the bicleaner model
$ ls ~/bitextor-data 
bitextor_config.yaml  en-es

$ # the config file argument is a path inside the container
$ docker run -v ~/bitextor-data:/home/docker/corpus bitextor/bitextor -s /home/docker/corpus/bitextor_config.yaml

# in bitextor_config.yaml:

# make sure output files are in the volume,
# so that they can be accessed directly from the host machine
permanentDir: /home/docker/corpus/permanent
transientDir: /home/docker/corpus/transient
dataDir: /home/docker/corpus/data

# bicleaner model path, also as seen from inside the container
bicleaner: /home/docker/corpus/en-es/en-es.yaml
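
A quick way to catch host paths left in the config is to check it before launching the container. The following is a minimal sketch, assuming the config uses simple one-line `key: path` entries as in the example above; check_config_paths is a hypothetical name, and it only inspects the four keys shown in that example.

```shell
# Hypothetical helper (not part of Bitextor): report config entries whose
# path does not live under the container mount point.
# usage: check_config_paths CONFIG_FILE MOUNT_POINT
check_config_paths() (
    config="$1"   # config file as seen from the host
    mount="$2"    # mount point inside the container, e.g. /home/docker/corpus
    awk -v mount="$mount" '
        # Only the keys from the example above are checked (an assumption).
        /^(permanentDir|transientDir|dataDir|bicleaner):/ {
            path = $2
            if (index(path, mount "/") != 1) {
                print "outside volume: " $0
                bad = 1
            }
        }
        END { exit bad + 0 }   # non-zero exit status if any path was flagged
    ' "$config"
)
```

Running `check_config_paths ~/bitextor-data/bitextor_config.yaml /home/docker/corpus` prints nothing and returns 0 when all four paths are under the mounted folder.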

Launching Bitextor with interactive shell

By default, the Bitextor Docker image launches the bitextor.sh script. To launch an interactive shell instead, change the entrypoint of the container.

When launching the container for the first time:

docker pull bitextor/bitextor
docker run -it --entrypoint /bin/bash bitextor/bitextor # will open an interactive shell

To run a shell in an existing Bitextor container (i.e. after you have already run Bitextor in the default way), you first have to create an image from it, and then run that image with a different entrypoint, as in the snippet above. To create an image based on an existing container, first find out its name or ID:

docker ps -a # will list your containers with some basic info
docker commit <CONTAINER_ID> bitextor/new_image # create new image
docker run -it --entrypoint /bin/bash bitextor/new_image # run shell in the new image
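
The commit-and-run steps above can be wrapped in one function. This is a sketch only: bitextor_shell_from_container and the DOCKER variable are hypothetical names introduced here, and the default image tag simply reuses bitextor/new_image from the snippet above. Setting DOCKER=echo turns the function into a dry run that prints the docker arguments instead of executing them.

```shell
# Hypothetical helper (not part of Bitextor): snapshot an existing container
# and open an interactive shell in the snapshot.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo for a dry run

# usage: bitextor_shell_from_container CONTAINER [IMAGE_TAG]
bitextor_shell_from_container() (
    container="$1"                     # name or ID taken from `docker ps -a`
    image="${2:-bitextor/new_image}"   # tag for the committed image
    # Create the image first; only start the shell if the commit succeeded.
    "$DOCKER" commit "$container" "$image" &&
    "$DOCKER" run -it --entrypoint /bin/bash "$image"
)
```

Files written inside the shell stay in the new container unless they are placed in a mounted volume.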