Running MapR Data Science Refinery as a Kubernetes Service

This topic contains sample snippets from YAML files that enable you to run MapR Data Science Refinery as a Kubernetes service. The samples show you how to use various Kubernetes features. This includes specifying your MapR Data Platform ticket as a Kubernetes secret, using a ConfigMap to define environment variables, mapping ports to route external traffic, passing the container password as a Kubernetes secret, exposing MapR Data Science Refinery as a service, and running MapR Data Science Refinery as a deployment for high availability.

The simplest way to run MapR Data Science Refinery as a Kubernetes service is to expose it to Kubernetes by including it in your Kubernetes pod manifest file. The following shows a sample file:

apiVersion: v1
kind: Pod
metadata:
  name: dsr-kube
  labels:
      app: dsr-svc
spec:
  containers:
  - name: dsr
    imagePullPolicy: Always
    image: maprtech/data-science-refinery:v1.4.1_6.1.0_6.3.0_centos7
    securityContext:
      capabilities:
        add: ["SYS_ADMIN" , "SYS_RESOURCE"]
    resources:
      requests:
        memory: "2Gi"
        cpu: "500m"
    env:
    - name: MAPR_MOUNT_PATH
      value: /mapr
    - name: MAPR_CLUSTER
      value: cluster1
    - name: MAPR_CLDB_HOSTS
      value: 10.10.102.95
    - name: MAPR_CONTAINER_USER
      value: mapr
    - name: MAPR_CONTAINER_GROUP
      value: mapr
    - name: MAPR_CONTAINER_PASSWORD
      value: mapr
    - name: HOST_IP
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
    - name: ZEPPELIN_DEPLOY_MODE
      value: kubernetes
    volumeMounts:
    - mountPath: /dev/fuse
      name: fuse
  volumes:
  - name: fuse
    hostPath:
      path: /dev/fuse

Make sure you include the following lines in your manifest file:

- name: ZEPPELIN_DEPLOY_MODE
      value: kubernetes

You can run MapR Data Science Refinery with Kubernetes 1.9 and later.