RapidMiner Go documentation, version 9.7
RapidMiner Job Container
Job Containers (JCs) are the back-end components of RapidMiner Go that execute CPU-heavy computations such as model training and prediction. The default docker-compose-services starts only one Job Container, on the same host as RapidMiner Go, but in a production environment multiple Job Containers should be started on separate machines. The AMQ service handles load balancing between JC instances. A JC instance performs only one job at a time, so the next job in the queue is picked up by whichever JC instance becomes idle first.
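The "first idle JC takes the next job" behaviour described above can be illustrated with a small simulation. This is only a sketch of the scheduling idea, not RapidMiner or AMQ code: the function name schedule, the job durations, and the time units are all hypothetical.

```python
import heapq

def schedule(job_durations, num_containers):
    """Assign queued jobs, in order, to whichever container becomes idle first.

    Returns a list of (job_id, container_id, start_time, end_time) tuples.
    """
    # Min-heap of (time at which the container becomes idle, container_id).
    idle_at = [(0, c) for c in range(num_containers)]
    heapq.heapify(idle_at)

    plan = []
    for job_id, duration in enumerate(job_durations):
        # The container that frees up earliest picks the next job in the queue.
        start, container = heapq.heappop(idle_at)
        end = start + duration
        plan.append((job_id, container, start, end))
        # Each container runs one job at a time, so it is busy until `end`.
        heapq.heappush(idle_at, (end, container))
    return plan

if __name__ == "__main__":
    # Three hypothetical jobs on two Job Containers.
    for row in schedule([5, 2, 4], 2):
        print(row)
```

With two containers, the third job waits for the container that finishes its short job first, rather than for a fixed container.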
Licensing
Job Containers depend on the license file in the licenses/rapidminer-go-on-prem directory; if it is not present, the JC will not start. This folder is automatically mounted into the file system of every RapidMiner Go and Job Container instance, so there is no need to copy it manually.
Configuration using environment variables
A Job Container is a Spring Boot application. It currently has a single valid Spring profile value: broker-amq.
Table of default environment variables:
Environment variable name | Description |
---|---|
JOB_QUEUE | AMQ job queue name |
JOB_STATUS_QUEUE | AMQ status queue name |
JOB_COMMAND_TOPIC | AMQ topic name |
AMQ_URL | AMQ URL |
AMQ_USERNAME | AMQ username |
AMQ_PASSWORD | AMQ password |
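Before starting a Job Container, it can be useful to check that the variables listed in the table are actually set. The following is a minimal sketch of such a check. It assumes that all six variables must be non-empty, which the table does not state, and the helper name missing_variables is hypothetical.

```python
import os

# Variable names taken from the table above.
JC_VARIABLES = [
    "JOB_QUEUE",
    "JOB_STATUS_QUEUE",
    "JOB_COMMAND_TOPIC",
    "AMQ_URL",
    "AMQ_USERNAME",
    "AMQ_PASSWORD",
]

def missing_variables(env=None):
    """Return the names of listed variables that are unset or empty.

    `env` defaults to the current process environment; pass a dict to
    check an arbitrary set of values instead.
    """
    env = os.environ if env is None else env
    return [name for name in JC_VARIABLES if not env.get(name)]

if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing variables: " + ", ".join(missing))
    else:
        print("All Job Container variables are set.")
```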
Multiple Job Containers and per-user job limits
Multiple Job Container instances can be run by increasing the JOB_CONTAINERS variable in the .env file.
In this case, make sure the host machine has enough available RAM to allocate to these instances.
The default value of MEMORY_PER_JOB_CONTAINER requires 4 GB per Job Container.
For instance, running 2 JCs with the default memory settings requires 4 + 4 * 2 = 12 GB of RAM in total.
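The memory estimate above can be written as a small helper. This sketch assumes that the leading 4 in the example is a fixed amount needed regardless of the number of JCs; the source gives only the formula, so the parameter name base_gb and that interpretation are assumptions, as is the function name total_ram_gb.

```python
def total_ram_gb(job_containers, memory_per_jc_gb=4, base_gb=4):
    """Estimate the total RAM (in GB) needed on the host.

    job_containers   -- value of JOB_CONTAINERS
    memory_per_jc_gb -- memory per Job Container (4 GB by default)
    base_gb          -- assumed fixed part, taken from the leading 4
                        in the documented example
    """
    if job_containers < 1:
        raise ValueError("at least one Job Container is required")
    return base_gb + memory_per_jc_gb * job_containers

if __name__ == "__main__":
    # Reproduces the documented example: 4 + 4 * 2 = 12 GB.
    print(total_ram_gb(2))
```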
With multiple JCs available, you can also increase AUTOMODELER_EXECUTION_QUEUE_LIMIT_PER_USER in the AutoModeler settings.
If this setting equals the number of JCs, one user's jobs can run in parallel on all JCs, so another user who submits a job later has to wait until the JCs finish their current jobs.
By decreasing the queue limit, you can restrict every user to a fraction of the JCs, preserving execution resources for other concurrent users.
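The trade-off can be made concrete with a short calculation. This sketch assumes that the per-user limit directly caps how many JCs a single user can occupy at once, as the paragraph above suggests; the function name jc_share is hypothetical.

```python
def jc_share(num_jcs, limit_per_user):
    """Return (JCs one user can occupy, JCs left for other users).

    num_jcs        -- number of running Job Containers
    limit_per_user -- value of AUTOMODELER_EXECUTION_QUEUE_LIMIT_PER_USER
    """
    if num_jcs < 1 or limit_per_user < 1:
        raise ValueError("both values must be at least 1")
    # A user cannot occupy more JCs than exist.
    occupied = min(num_jcs, limit_per_user)
    return occupied, num_jcs - occupied

if __name__ == "__main__":
    # Limit equal to the number of JCs: one user can block everyone else.
    print(jc_share(2, 2))
    # Lower limit: part of the JCs stays available to other users.
    print(jc_share(4, 2))
```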