This is the RapidMiner Server documentation for version 9.4.
Kubernetes
Our Docker Images are ready to deploy to any Kubernetes Cluster. Here we provide example deployment configurations and tutorials, but the final deployment depends on your requirements.
The following guide requires a running Kubernetes cluster. We tested our example configuration with these Kubernetes services:
Deployment architecture and definition
In our example, we deploy a PostgreSQL database server, RapidMiner Server, and some Job Agents on Kubernetes.
To deploy RapidMiner Server on Kubernetes, you need to define the services, volumes and pods.
Volumes
Our example configuration uses two persistent volumes:
- A volume for the PostgreSQL database data storage
- A volume for the RapidMiner Home of the RapidMiner Server
To define the volumes, you can apply the following Kubernetes Object Configuration YAML file.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgvolume-claim
  labels:
    app: database
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rmsvolume-claim
  labels:
    app: rapidminer-server
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
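Neither claim above names a storage class, so both depend on the cluster having a default StorageClass for dynamic provisioning. If your cluster has no default, or you want a specific class, the following minimal sketch shows the database claim with an explicit `storageClassName`. The class name `standard` is a placeholder and not part of the original example; `kubectl get storageclass` lists the names available on your cluster.

```yaml
# Sketch only: the database claim from above, pinned to an explicit storage class.
# "standard" is a placeholder name; replace it with a class that exists on your cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgvolume-claim
  labels:
    app: database
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
```

The same field can be added to `rmsvolume-claim` if both volumes should use that class.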
Services
To deploy the example configuration, we specify three Kubernetes Service Endpoints:
- The ActiveMQ service endpoint is an internal endpoint used by the Job Agents (port: 5672).
- The database service endpoint is an internal endpoint that RapidMiner Server uses to connect to the database (port: 5432).
- The RapidMiner Server service endpoint represents the public web interface of RapidMiner Server (port: 8080).
Note: the public endpoint definition may differ between Kubernetes clusters.
Public cloud providers support the LoadBalancer type, but MicroK8s requires setting up an Ingress to enable public access.
To define the service endpoints, you can apply the following Kubernetes Object Configuration YAML file:
kind: Service
apiVersion: v1
metadata:
  name: rapidminer-server-amq-svc
  labels:
    app: rapidminer-server-amq-svc
    role: server
spec:
  ports:
  - port: 5672
    targetPort: amq
  selector:
    app: rapidminer-server
    role: server
---
kind: Service
apiVersion: v1
metadata:
  name: postgres-svc
  labels:
    app: database
spec:
  ports:
  - port: 5432
    targetPort: postgresport
  selector:
    app: database
---
kind: Service
apiVersion: v1
metadata:
  name: rapidminer-server-svc
  labels:
    app: rapidminer-server-svc
    role: server
spec:
  ports:
  - port: 8080
    targetPort: rmswebui
  selector:
    app: rapidminer-server
    role: server
  type: LoadBalancer
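On clusters without LoadBalancer support, such as MicroK8s, an Ingress can route public HTTP traffic to `rapidminer-server-svc`. The following is a minimal sketch, not a tested part of the example configuration. It assumes that an Ingress controller is already enabled on the cluster and that the cluster runs Kubernetes 1.19 or newer, where the `networking.k8s.io/v1` API is available. Older clusters use a beta Ingress API with a different backend syntax. The object name `rapidminer-server-ingress` is illustrative.

```yaml
# Sketch only: routes all HTTP paths to the RapidMiner Server web interface.
# Assumes an Ingress controller is running and the cluster serves networking.k8s.io/v1.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rapidminer-server-ingress   # illustrative name, not from the original example
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: rapidminer-server-svc   # the Service defined above
            port:
              number: 8080
```

With an Ingress in place, the `type: LoadBalancer` line of `rapidminer-server-svc` can be dropped so that the Service falls back to the default ClusterIP type.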
Pods / Containers
Our example configuration defines the following three workloads (two Pods and one Deployment):
- The Database pod contains the PostgreSQL container. The pgvolume-claim is used as its persistent volume. We also define a subPath so that the postgres container gets an empty mount point.
kind: Pod
apiVersion: v1
metadata:
  name: database
  labels:
    app: database
spec:
  containers:
  - name: database
    image: postgres:9.6
    ports:
    - name: postgresport
      containerPort: 5432
    env:
    - name: POSTGRES_DB
      value: rmsdb
    - name: POSTGRES_USER
      value: rmsdbuser
    - name: POSTGRES_PASSWORD
      value: rmsdbpassword
    volumeMounts:
    - name: pgvolume
      mountPath: /var/lib/postgresql/data
      subPath: postgres
  volumes:
  - name: pgvolume
    persistentVolumeClaim:
      claimName: pgvolume-claim
- The RapidMiner Server container is defined with the following configuration. The environment variables are based on our Docker Image documentation. The rmsvolume-claim provides the persistent RapidMiner Home folder. We also define a subPath on the volume so that the mount point is empty on first startup, which lets the RapidMiner Server container initialize the RapidMiner Home folder.
kind: Pod
apiVersion: v1
metadata:
  name: rapidminer-server
  labels:
    app: rapidminer-server
    role: server
spec:
  containers:
  - name: rapidminer-server
    image: rapidminer/rapidminer-server:9.4.1
    ports:
    - name: rmswebui
      containerPort: 8080
    - name: amq
      containerPort: 5672
    env:
    - name: JOBSERVICE_QUEUE_ACTIVEMQ_USERNAME
      value: amq-user
    - name: JOBSERVICE_QUEUE_ACTIVEMQ_PASSWORD
      value: amq-pass
    - name: JOBSERVICE_AUTH_SECRET
      value: c29tZS1hdXRoLXNlY3JldAo=
    - name: DBHOST
      value: postgres-svc
    - name: DBSCHEMA
      value: rmsdb
    - name: DBUSER
      value: rmsdbuser
    - name: DBPASS
      value: rmsdbpassword
    volumeMounts:
    - name: rmsvolume
      mountPath: /persistent-rapidminer-home
      subPath: rapidminer-home
  volumes:
  - name: rmsvolume
    persistentVolumeClaim:
      claimName: rmsvolume-claim
- The Job Agent containers are deployed using a Kubernetes Deployment object, which provides replication and starts three instances in our example.
kind: Deployment
apiVersion: apps/v1
metadata:
  name: job-agent
  labels:
    app: job-agent
    role: execution
spec:
  replicas: 3
  selector:
    matchLabels:
      app: job-agent
  template:
    metadata:
      labels:
        app: job-agent
        role: execution
    spec:
      containers:
      - name: job-agent
        image: rapidminer/rapidminer-execution-jobagent:9.4.1
        env:
        - name: RAPIDMINER_SERVER_HOST
          value: rapidminer-server-svc
        - name: RAPIDMINER_SERVER_PORT
          value: '8080'
        - name: JOBAGENT_QUEUE_ACTIVEMQ_URI
          value: failover:(tcp://rapidminer-server-amq-svc:5672)
        - name: JOBAGENT_QUEUE_ACTIVEMQ_USERNAME
          value: amq-user
        - name: JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD
          value: amq-pass
        - name: JOBAGENT_AUTH_SECRET
          value: c29tZS1hdXRoLXNlY3JldAo=
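The manifests above write passwords and the shared auth secret directly into `env` values. As an alternative, the same values can be stored once in a Kubernetes Secret and referenced from the containers. The following is a minimal sketch under that assumption. The Secret name `rms-credentials` and its key names are illustrative and not part of the original example. The second document repeats the database Pod from above with only the `POSTGRES_PASSWORD` entry changed.

```yaml
# Sketch only: the Secret name and key names are illustrative.
apiVersion: v1
kind: Secret
metadata:
  name: rms-credentials
type: Opaque
stringData:
  # stringData stores each value verbatim, so containers receive the same strings as before
  db-password: rmsdbpassword
  amq-password: amq-pass
  auth-secret: c29tZS1hdXRoLXNlY3JldAo=
---
# The database Pod from above, reading its password from the Secret.
kind: Pod
apiVersion: v1
metadata:
  name: database
  labels:
    app: database
spec:
  containers:
  - name: database
    image: postgres:9.6
    ports:
    - name: postgresport
      containerPort: 5432
    env:
    - name: POSTGRES_DB
      value: rmsdb
    - name: POSTGRES_USER
      value: rmsdbuser
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          name: rms-credentials
          key: db-password
    volumeMounts:
    - name: pgvolume
      mountPath: /var/lib/postgresql/data
      subPath: postgres
  volumes:
  - name: pgvolume
    persistentVolumeClaim:
      claimName: pgvolume-claim
```

The `DBPASS`, `JOBSERVICE_QUEUE_ACTIVEMQ_PASSWORD`, `JOBSERVICE_AUTH_SECRET`, `JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD`, and `JOBAGENT_AUTH_SECRET` entries can reference the matching keys in the same way. The Secret has to be applied before the Pods that use it.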
Deployment process
Based on the object definitions shown above, you can deploy RapidMiner Server on a Kubernetes cluster together with its database and Job Agent dependencies:
- Make sure that the connection to your Kubernetes Cluster is working
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
- Create and check the volumes
$ kubectl apply -f volumes.yaml
persistentvolumeclaim/pgvolume-claim created
persistentvolumeclaim/rmsvolume-claim created
$ kubectl get pvc
$ kubectl get pv
- Create and check services
$ kubectl apply -f services.yaml
service/rapidminer-server-amq-svc created
service/postgres-svc created
service/rapidminer-server-svc created
$ kubectl get svc
NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
postgres-svc                ClusterIP      10.152.183.3     <none>        5432/TCP         72s
rapidminer-server-amq-svc   ClusterIP      10.152.183.128   <none>        5672/TCP         72s
rapidminer-server-svc       LoadBalancer   10.152.183.252   ******        8080:30661/TCP   72s
- Deploy the database, RapidMiner Server, and Job Agents
$ kubectl apply -f database.yaml
pod/database created
$ kubectl apply -f rapidminer-server.yaml
pod/rapidminer-server created
$ kubectl apply -f job-agent.yaml
deployment.apps/job-agent created
- Check the running pods
$ kubectl get pod
NAME                             READY   STATUS    RESTARTS   AGE
pod/database                     1/1     Running   0          41m
pod/job-agent-556b49567b-5cm8n   1/1     Running   0          44s
pod/job-agent-556b49567b-6585h   1/1     Running   0          44s
pod/job-agent-556b49567b-zk44g   1/1     Running   0          44s
pod/rapidminer-server            1/1     Running   0          40m