You are viewing the RapidMiner Server documentation for version 9.5.
Kubernetes
Our Docker Images are ready to deploy to any Kubernetes Cluster. Here we provide example deployment configurations and tutorials, but the final deployment depends on your requirements.
The following guide requires a running Kubernetes cluster. We tested our example configuration with these Kubernetes services:
Deployment architecture and definition
In our example, we deploy a PostgreSQL database server, RapidMiner Server, and some Job Agents on Kubernetes.
To deploy RapidMiner Server on Kubernetes, you need to define the services, volumes and pods.
Volumes
Our example configuration uses two persistent volumes:
- A volume for the PostgreSQL database data storage
- A volume for the RapidMiner Home of the RapidMiner Server
To define the volumes, you can apply the following Kubernetes Object Configuration YAML file.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgvolume-claim
  labels:
    app: database
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rmsvolume-claim
  labels:
    app: rapidminer-server
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
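The claims above do not name a storage class, so they rely on the cluster having a default StorageClass (or a matching pre-created PersistentVolume). Without one, a claim stays in the Pending state. The following is a minimal sketch of the database claim with an explicit class. The class name `standard` is an assumption for illustration only and is not part of the original example.

```yaml
# Sketch only: the database claim from above with an explicit storage class.
# "standard" is a hypothetical class name; use one that exists on your
# cluster (see `kubectl get storageclass`).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgvolume-claim
  labels:
    app: database
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
```

The same change would apply to rmsvolume-claim if your cluster needs it.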
Services
To deploy the example configuration, we specify three Kubernetes Service Endpoints:
- The ActiveMQ service endpoint is an internal endpoint that is used by the Job Agents (port: 5672)
- The database service endpoint is an internal endpoint that RapidMiner Server uses to connect to the database (port: 5432)
- The RapidMiner Server service endpoint represents the public web interface of the RapidMiner Server (port: 8080)
Note: the public endpoint definition may differ between Kubernetes clusters.
Public cloud providers support the LoadBalancer type, but the MicroK8s implementation requires setting up an Ingress to enable public access.
To define the service endpoints, you can apply the following Kubernetes Object Configuration YAML file:
kind: Service
apiVersion: v1
metadata:
  name: rapidminer-server-amq-svc
  labels:
    app: rapidminer-server-amq-svc
    role: server
spec:
  ports:
    - port: 5672
      targetPort: amq
  selector:
    app: rapidminer-server
    role: server
---
kind: Service
apiVersion: v1
metadata:
  name: postgres-svc
  labels:
    app: database
spec:
  ports:
    - port: 5432
      targetPort: postgresport
  selector:
    app: database
---
kind: Service
apiVersion: v1
metadata:
  name: rapidminer-server-svc
  labels:
    app: rapidminer-server-svc
    role: server
spec:
  ports:
    - port: 8080
      targetPort: rmswebui
  selector:
    app: rapidminer-server
    role: server
  type: LoadBalancer
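For clusters where the LoadBalancer type is not available, such as the MicroK8s case mentioned in the note above, an Ingress can expose the web interface instead. The following is a minimal sketch, not a tested configuration. It assumes that an ingress controller is already enabled on the cluster and that the cluster serves the `networking.k8s.io/v1` Ingress API. The object name `rapidminer-server-ingress` is hypothetical.

```yaml
# Sketch only: routes all HTTP traffic to the rapidminer-server-svc service.
# Assumes an ingress controller is running on the cluster.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rapidminer-server-ingress   # hypothetical name
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: rapidminer-server-svc   # service defined above
                port:
                  number: 8080
```

Older clusters serve Ingress under a beta API version (`extensions/v1beta1` or `networking.k8s.io/v1beta1`) with a different backend syntax (`serviceName` and `servicePort`), so adjust the definition to the API versions your cluster reports.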
Pods / Containers
Our example configuration defines the following three workloads (two Pods and one Deployment):
- The Database pod contains the PostgreSQL container. The pgvolume-claim is used as the persistent volume. We also defined a subPath to ensure an empty mount point for the postgres container.
kind: Pod
apiVersion: v1
metadata:
  name: database
  labels:
    app: database
spec:
  containers:
    - name: database
      image: postgres:9.6
      ports:
        - name: postgresport
          containerPort: 5432
      env:
        - name: POSTGRES_DB
          value: rmsdb
        - name: POSTGRES_USER
          value: rmsdbuser
        - name: POSTGRES_PASSWORD
          value: rmsdbpassword
      volumeMounts:
        - name: pgvolume
          mountPath: /var/lib/postgresql/data
          subPath: postgres
  volumes:
    - name: pgvolume
      persistentVolumeClaim:
        claimName: pgvolume-claim
- The RapidMiner Server container is defined with the following configuration. The environment variables are defined based on our Docker Image documentation. The rmsvolume-claim is used to provide the persistent RapidMiner Home Folder. We also defined a subPath on the volume to ensure an empty mount point on the first startup, so that the RapidMiner Server container can initialize the RapidMiner Home Folder.
kind: Pod
apiVersion: v1
metadata:
  name: rapidminer-server
  labels:
    app: rapidminer-server
    role: server
spec:
  containers:
    - name: rapidminer-server
      image: rapidminer/rapidminer-server:latest
      ports:
        - name: rmswebui
          containerPort: 8080
        - name: amq
          containerPort: 5672
      env:
        - name: JOBSERVICE_QUEUE_ACTIVEMQ_USERNAME
          value: amq-user
        - name: JOBSERVICE_QUEUE_ACTIVEMQ_PASSWORD
          value: amq-pass
        - name: JOBSERVICE_AUTH_SECRET
          value: c29tZS1hdXRoLXNlY3JldAo=
        - name: DBHOST
          value: postgres-svc
        - name: DBSCHEMA
          value: rmsdb
        - name: DBUSER
          value: rmsdbuser
        - name: DBPASS
          value: rmsdbpassword
      volumeMounts:
        - name: rmsvolume
          mountPath: /persistent-rapidminer-home
          subPath: rapidminer-home
  volumes:
    - name: rmsvolume
      persistentVolumeClaim:
        claimName: rmsvolume-claim
- The Job Agent containers are deployed using a Kubernetes Deployment object, which provides replication and starts three instances in our example.
kind: Deployment
apiVersion: apps/v1
metadata:
  name: job-agent
  labels:
    app: job-agent
    role: execution
spec:
  replicas: 3
  selector:
    matchLabels:
      app: job-agent
  template:
    metadata:
      labels:
        app: job-agent
        role: execution
    spec:
      containers:
        - name: job-agent
          image: rapidminer/rapidminer-execution-jobagent:latest
          env:
            - name: RAPIDMINER_SERVER_HOST
              value: rapidminer-server-svc
            - name: RAPIDMINER_SERVER_PORT
              value: '8080'
            - name: JOBAGENT_QUEUE_ACTIVEMQ_URI
              value: failover:(tcp://rapidminer-server-amq-svc:5672)
            - name: JOBAGENT_QUEUE_ACTIVEMQ_USERNAME
              value: amq-user
            - name: JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD
              value: amq-pass
            - name: JOBAGENT_AUTH_SECRET
              value: c29tZS1hdXRoLXNlY3JldAo=
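The definitions above pass the database password, the ActiveMQ password, and the auth secret as literal values, and the same values are repeated across the server and Job Agent definitions. One possible alternative is to keep them in a single Kubernetes Secret and reference it from each container. The following is a minimal sketch. The Secret name `rapidminer-credentials` and its key names are hypothetical and not part of the original example.

```yaml
# Sketch only: one Secret holding the credentials used in the examples above.
# The object name and the key names are hypothetical.
apiVersion: v1
kind: Secret
metadata:
  name: rapidminer-credentials
type: Opaque
stringData:
  db-password: rmsdbpassword
  amq-password: amq-pass
  auth-secret: c29tZS1hdXRoLXNlY3JldAo=
```

A container can then read a value from the Secret instead of a literal. This fragment shows only the `env` part of a container definition, here for the DBPASS variable of the RapidMiner Server container:

```yaml
env:
  - name: DBPASS
    valueFrom:
      secretKeyRef:
        name: rapidminer-credentials   # hypothetical Secret from the sketch above
        key: db-password
```

The other password and secret variables could be wired the same way, so that each value is defined in one place only.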
Deployment process
Based on the object definitions shown above, you can deploy RapidMiner Server on a Kubernetes cluster together with its database and Job Agent dependencies:
- Make sure that the connection to your Kubernetes Cluster is working
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
- Create and check the volumes
$ kubectl apply -f volumes.yaml
persistentvolumeclaim/pgvolume-claim created
persistentvolumeclaim/rmsvolume-claim created
$ kubectl get pvc
$ kubectl get pv
- Create and check services
$ kubectl apply -f services.yaml
service/rapidminer-server-amq-svc created
service/postgres-svc created
service/rapidminer-server-svc created
$ kubectl get svc
NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
postgres-svc                ClusterIP      10.152.183.3     <none>        5432/TCP         72s
rapidminer-server-amq-svc   ClusterIP      10.152.183.128   <none>        5672/TCP         72s
rapidminer-server-svc       LoadBalancer   10.152.183.252   ******        8080:30661/TCP   72s
- Deploy the database, RapidMiner Server, and Job Agents
$ kubectl apply -f database.yaml
pod/database created
$ kubectl apply -f rapidminer-server.yaml
pod/rapidminer-server created
$ kubectl apply -f job-agent.yaml
deployment.apps/job-agent created
- Check the running pods
$ kubectl get pod
NAME                             READY   STATUS    RESTARTS   AGE
pod/database                     1/1     Running   0          41m
pod/job-agent-556b49567b-5cm8n   1/1     Running   0          44s
pod/job-agent-556b49567b-6585h   1/1     Running   0          44s
pod/job-agent-556b49567b-zk44g   1/1     Running   0          44s
pod/rapidminer-server            1/1     Running   0          40m