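Create a Development Machine

This topic describes the steps and parameter settings for creating a development machine through the console and through the command line.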

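Prerequisites

  • A cluster has been created. For details, see: Create a cluster
  • A parallel file storage has been created. Skip this step if you do not need one. For details, see: Create storage
  • The account balance is sufficient.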

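Create via the console

Note: Legacy development machines currently cannot enable the Docker feature. To use Docker, stop and delete the legacy development machine, then create a new one.

  1. Log in to the 英博云 console.
  2. In the left navigation pane, select Development Machines to open the development machine list page.
  3. On the development machine list page, click Create Development Machine in the upper-left corner and configure the required parameters.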

Development machine configuration

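| Parameter | Description |
|---|---|
| Cluster | The cluster the development machine belongs to. After selecting a cluster, you must also select a namespace. |
| Namespace | The namespace in which the development machine is deployed. You can select a custom namespace or a namespace created by Kubernetes itself. |
| Instance name | A custom name for the development machine that follows the naming rule shown in the prompt: 2-8 characters, starting with a lowercase letter and containing only lowercase letters and digits. |
| Resource type | Select a GPU card type or the CPU type. |
| Image | The image used by the container. Both preset images and custom images are supported. |
| Docker container | Switch for Docker containers, off by default. When it is on, you can start Docker containers inside the development machine with docker run, and push and pull images with docker push and docker pull. |
| Storage | Each development machine has a 100 GB system disk by default. Multiple shared storage volumes can be mounted through PVCs, billed by capacity, with SSD and HDD storage types. Multiple block storage volumes can be mounted through PVCs, billed by capacity. |
| Root password | The root password of the development machine. It must be entered twice for confirmation. |
| Quantity | Multiple development machines with the same configuration can be created at once. |
| Remarks | Remarks for the development machine. |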
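
The instance name rule described above can be checked locally before filling in the console form. The following is a minimal shell sketch, assuming the rule is exactly as stated (2-8 characters, a lowercase first letter, then only lowercase letters or digits). The function name is_valid_instance_name is hypothetical and not part of the platform.

```shell
# Hypothetical helper (not a platform tool): checks a candidate name against
# the console rule: 2-8 characters, lowercase first letter, then lowercase
# letters or digits only. LC_ALL=C keeps [a-z] from matching uppercase letters.
is_valid_instance_name() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9]{1,7}$'
}

# Try a few candidate names and report which ones pass.
for name in dev01 a 1dev Dev01 cs-gpu abcdefghi; do
    if is_valid_instance_name "$name"; then
        echo "$name: valid"
    else
        echo "$name: invalid"
    fi
done
```

This models only the console rule. The names used in the kubectl examples in this topic, such as cs-gpu-docker, contain hyphens and are longer than 8 characters, so the check should not be applied to YAML-based creation.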

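Note: The platform supports building your own Docker Registry to store and distribute container images. For details, see Image registry.

Create via the kubectl command line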

Prerequisites:

  1. kubectl is installed locally. For details, see: Install and Set Up kubectl
  2. kubectl is connected to the target cluster. For details, see: Connect to a cluster
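
Both prerequisites can be verified from the local shell before applying any YAML. The sketch below is one possible check, not an official procedure: check_kubectl is a hypothetical function name, and it relies only on the standard kubectl version --client and kubectl cluster-info subcommands.

```shell
# Hypothetical helper: returns 0 only when kubectl is installed and the
# currently configured cluster is reachable.
check_kubectl() {
    # Prerequisite 1: the kubectl binary is on PATH and runs.
    if ! command -v kubectl >/dev/null 2>&1; then
        echo "kubectl not found on PATH" >&2
        return 1
    fi
    if ! kubectl version --client >/dev/null 2>&1; then
        echo "kubectl is installed but failed to run" >&2
        return 1
    fi
    # Prerequisite 2: the current kubeconfig context can reach the cluster.
    if ! kubectl cluster-info >/dev/null 2>&1; then
        echo "kubectl cannot reach the target cluster" >&2
        return 2
    fi
    echo "kubectl is installed and connected"
}

# Example usage (commented out so that defining the function has no side effects):
# check_kubectl && kubectl apply -f gpu-example.yaml
```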

GPU type labels:

  • H800: H800_NVLINK_80GB
  • A800: A800_NVLINK_80GB
  • 4090: RTX_4090
  • 4090D: RTX_4090D
  • A16: A16
  • A40: A40
  • CPU: amd-epyc-milan
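
When YAML files are generated by a script, the label list above can be kept in one place as a lookup. The following is a minimal sketch; gpu_label is a hypothetical helper name, and the mapping simply restates the list above (for GPU types, the result is the kind of value used in spec.resources.gpu.type in the examples in this topic).

```shell
# Hypothetical helper: maps a short resource name to the label string
# listed above. Unknown names produce an error and a non-zero status.
gpu_label() {
    case $1 in
        H800)  echo "H800_NVLINK_80GB" ;;
        A800)  echo "A800_NVLINK_80GB" ;;
        4090)  echo "RTX_4090" ;;
        4090D) echo "RTX_4090D" ;;
        A16)   echo "A16" ;;
        A40)   echo "A40" ;;
        CPU)   echo "amd-epyc-milan" ;;
        *)     echo "unknown resource type: $1" >&2; return 1 ;;
    esac
}

# Example: look up the label for a single H800 card.
gpu_label H800
```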

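Create a YAML file for the development machine. This example creates a single-GPU H800 instance with Docker enabled and no public IP, and mounts the shared storage volumes named t256g and train. The example file gpu-example.yaml is as follows: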

########################################################
apiVersion: apps.ebcloud.com/v1alpha1
kind: ContainerServer
metadata:
  name: cs-gpu-docker
  namespace: default
spec:
  command:
  - sh
  - -c
  - |-
    if [ -z "${EBSYS_INITIALIZED}" ] || [ "$(echo "${EBSYS_INITIALIZED}" | tr '[:upper:]' '[:lower:]')" = "false" ]; then
        echo "================ Customized initialization commands go here. ==============================="
        echo "**** When using a custom image, make sure to install sshd, chpasswd and docker-client."
        echo "**** Root user ssh login and the sftp subsystem should also be enabled."
        mkdir -p /etc/ssh
        echo "PasswordAuthentication yes" >> /etc/ssh/sshd_config
        echo "PermitRootLogin yes" >> /etc/ssh/sshd_config
        echo "ROOT User SSH PASSWORD Login Enabled!"
        echo "Subsystem sftp /usr/lib/openssh/sftp-server" >> /etc/ssh/sshd_config
        dpkg -i /public/shared-resources/openssh-server/ubuntu_22.04_amd64/*.deb
        cp /public/shared-resources/docker-build/docker /usr/bin/docker
        echo "================ End of customized initialization commands. =================================="
        echo "root:$INIT_ROOT_PASSWORD" | chpasswd
        echo 'Root password initialization complete.'
        if [ -f /proc/1/environ ]; then
            echo 'while IFS= read -r line; do export "$line"; done < <(tr "\\0" "\\n" < /proc/1/environ)' | tee -a /etc/profile
            echo "K8s env >> /etc/profile DONE"
        fi
    fi
    if [ -n "$DOCKER_CONFJSON" ]; then
        mkdir -p ~/.docker
        echo "$DOCKER_CONFJSON" > ~/.docker/config.json
        chmod 600 ~/.docker/config.json
        echo 'Docker config initialization complete.'
    fi
    if service ssh start -D; then
        echo "SSHD exited."
    else
        /usr/sbin/sshd -D
        echo "SSHD failed to start."
    fi
  enableDocker: true
  image: registry-cn-huabei1-internal.ebcloud.com/nvcr.io/nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04
  initRootPassword: nT6#bI53
  volumeMounts:
  - name: data-storage
    mountPath: /data # Mount path of storage volume 1
    persistentVolumeClaim:
      claimName: t256g # Name of storage volume 1
  - name: data-storage1
    mountPath: /data1 # Mount path of storage volume 2
    persistentVolumeClaim:
      claimName: train # Name of storage volume 2
#  network: # Public IP configuration; uncomment this section if needed
#    public: true
#    tcp:
#      ports:
#      - 22
#      - 80
#      - 443
  power: "ON"
  resources:
    cpu:
      count: "20"
    gpu:
      count: "1"
      type: H800_NVLINK_80GB # GPU card type
    memory:
      count: 200Gi
  sshAccess:
    enable: true
    targetPort: 22
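# Run the following command to create the development machine.
kubectl apply -f gpu-example.yaml

# Run the following command to check whether the development machine was created successfully.
kubectl get pod -n default

# To delete the development machine, run the following command.
kubectl delete -f gpu-example.yaml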

Configuration example 1

  • Create a GPU development machine with Docker disabled and no public IP, and mount the shared storage volume named t256g.
########################################################
apiVersion: apps.ebcloud.com/v1alpha1
kind: ContainerServer
metadata:
  name: cs-demo-gpu
  namespace: default
spec:
  command:
  - sh
  - -c
  - |-
    if [ -z "${EBSYS_INITIALIZED}" ] || [ "$(echo "${EBSYS_INITIALIZED}" | tr '[:upper:]' '[:lower:]')" = "false" ]; then
        echo "================ Customized initialization commands go here. ==============================="
        echo "**** When using a custom image, make sure to install sshd, chpasswd and docker-client."
        echo "**** Root user ssh login and the sftp subsystem should also be enabled."
        mkdir -p /etc/ssh
        echo "PasswordAuthentication yes" >> /etc/ssh/sshd_config
        echo "PermitRootLogin yes" >> /etc/ssh/sshd_config
        echo "ROOT User SSH PASSWORD Login Enabled!"
        echo "Subsystem sftp /usr/lib/openssh/sftp-server" >> /etc/ssh/sshd_config
        dpkg -i /public/shared-resources/openssh-server/ubuntu_22.04_amd64/*.deb
        cp /public/shared-resources/docker-build/docker /usr/bin/docker
        echo "================ End of customized initialization commands. =================================="
        echo "root:$INIT_ROOT_PASSWORD" | chpasswd
        echo 'Root password initialization complete.'
        if [ -f /proc/1/environ ]; then
            echo 'while IFS= read -r line; do export "$line"; done < <(tr "\\0" "\\n" < /proc/1/environ)' | tee -a /etc/profile
            echo "K8s env >> /etc/profile DONE"
        fi
    fi
    if [ -n "$DOCKER_CONFJSON" ]; then
        mkdir -p ~/.docker
        echo "$DOCKER_CONFJSON" > ~/.docker/config.json
        chmod 600 ~/.docker/config.json
        echo 'Docker config initialization complete.'
    fi
    if service ssh start -D; then
        echo "SSHD exited."
    else
        /usr/sbin/sshd -D
        echo "SSHD failed to start."
    fi
  enableDocker: false
  image: registry-cn-huabei1-internal.ebcloud.com/nvcr.io/nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04
  initRootPassword: nT6#bI53
  volumeMounts:
  - name: data-storage
    mountPath: /data # Mount path of storage volume 1
    persistentVolumeClaim:
      claimName: t256g # Name of storage volume 1
#  network: 
#    public: true
#    tcp:
#      ports:
#      - 22
#      - 80
#      - 443
  power: "ON"
  resources:
    cpu:
      count: "20"
    gpu:
      count: "1"
      type: H800_NVLINK_80GB
    memory:
      count: 200Gi
  sshAccess:
    enable: true
    targetPort: 22

Configuration example 2

  • Create a CPU development machine with Docker enabled and no public IP, and mount the shared storage volume named t256g.
apiVersion: apps.ebcloud.com/v1alpha1
kind: ContainerServer
metadata:
  name: cs-demo-cpu
  namespace: default
spec:
  command:
  - sh
  - -c
  - |-
    if [ -z "${EBSYS_INITIALIZED}" ] || [ "$(echo "${EBSYS_INITIALIZED}" | tr '[:upper:]' '[:lower:]')" = "false" ]; then
        echo "================ Customized initialization commands go here. ==============================="
        echo "**** When using a custom image, make sure to install sshd, chpasswd and docker-client."
        echo "**** Root user ssh login and the sftp subsystem should also be enabled."
        mkdir -p /etc/ssh
        echo "PasswordAuthentication yes" >> /etc/ssh/sshd_config
        echo "PermitRootLogin yes" >> /etc/ssh/sshd_config
        echo "ROOT User SSH PASSWORD Login Enabled!"
        echo "Subsystem sftp /usr/lib/openssh/sftp-server" >> /etc/ssh/sshd_config
        dpkg -i /public/shared-resources/openssh-server/ubuntu_22.04_amd64/*.deb
        cp /public/shared-resources/docker-build/docker /usr/bin/docker
        echo "================ End of customized initialization commands. =================================="
        echo "root:$INIT_ROOT_PASSWORD" | chpasswd
        echo 'Root password initialization complete.'
        if [ -f /proc/1/environ ]; then
            echo 'while IFS= read -r line; do export "$line"; done < <(tr "\\0" "\\n" < /proc/1/environ)' | tee -a /etc/profile
            echo "K8s env >> /etc/profile DONE"
        fi
    fi
    if [ -n "$DOCKER_CONFJSON" ]; then
        mkdir -p ~/.docker
        echo "$DOCKER_CONFJSON" > ~/.docker/config.json
        chmod 600 ~/.docker/config.json
        echo 'Docker config initialization complete.'
    fi
    if service ssh start -D; then
        echo "SSHD exited."
    else
        /usr/sbin/sshd -D
        echo "SSHD failed to start."
    fi
  enableDocker: true
  image: registry-cn-huabei1-internal.ebcloud.com/docker.io/ubuntu:22.04
  initRootPassword: nT6#bI53
  volumeMounts:
  - name: t256g
    mountPath: /data
    persistentVolumeClaim:
      claimName: t256g
#  network: 
#    public: true
#    tcp:
#      ports:
#      - 22
#      - 80
#      - 443
  power: "ON"
  resources:
    cpu:
      count: "1"
    memory:
      count: 2Gi
  sshAccess:
    enable: true
    targetPort: 22