Installing gpu-mon GPU monitoring for open-falcon

1. Install DCGM

# lspci | grep NVIDIA
03:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)

Download datacenter-gpu-manager_1.4.2_amd64.deb and install it:

# dpkg -i datacenter-gpu-manager_1.4.2_amd64.deb
# dcgmi --version

dcgmi  version: 1.4.2
# nvvs --version

nvvs  version: 396.27
# nvidia-smi
# nvtop
  • Start the host engine so that it listens for connections
# nv-hostengine
Started host engine version 1.4.2 using port number: 5555
  • List the GPU devices
# dcgmi discovery -l
1 GPU found.
+--------+-------------------------------------------------------------------+
| GPU ID | Device Information                                                |
+========+===================================================================+
| 0      |  Name: GeForce GTX 1080                                           |
|        |  PCI Bus ID: 00000000:03:00.0                                     |
|        |  Device UUID: GPU-ccb935ed-c798-ab9b-7c4e-0bde0d6ed354            |
+--------+-------------------------------------------------------------------+
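
When scripting this check, it can help to confirm that DCGM actually sees the expected number of GPUs before moving on. The following is a minimal sketch, not part of DCGM or gpu-mon: the helper name count_gpus is hypothetical, and it assumes the table layout shown above, where each GPU contributes exactly one row that begins with a numeric GPU ID.

```shell
#!/bin/sh
# count_gpus (hypothetical helper): reads `dcgmi discovery -l` output on stdin
# and prints the number of table rows whose first column is a numeric GPU ID.
# Assumption: the table layout matches the sample output shown above.
count_gpus() {
  # grep -c prints the match count; it exits non-zero when the count is 0,
  # so `|| true` keeps the function from failing in that case.
  grep -cE '^\|[[:space:]]*[0-9]+[[:space:]]*\|' || true
}
```

A typical use would be to pipe the discovery command into it (dcgmi discovery -l | count_gpus) and compare the result with the number of cards reported by lspci.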

2. Install gpu-mon

# go get -u github.com/open-falcon/gpu-mon
# pwd
/root/go/src/github.com/open-falcon/gpu-mon
# make
gofmt -s -w ./args.go ./fetch/metrics.go ./fetch/dcgm.go ./fetch/fetch.go ./common/config.go ./common/log.go ./common/log_test.go ./common/utils.go ./common/config_test.go ./common/common.go ./send/send_test.go ./send/send.go ./send/utils.go ./send/utils_test.go ./main.go
building gpu-mon ...

2.1 Using the plugin

  • The open-falcon agent's plugin feature must be enabled
# cp gpu-mon cfg.example.json 60_gpuMonitor.sh /root/open-falcon/agent/plugin/
# pwd
/root/open-falcon/agent/plugin
# mv cfg.example.json cfg.json
  • /root/open-falcon/plugin is the plugin path
# pwd
/root/open-falcon/plugin
# ls
60_gpuMonitor.sh  cfg.json  gpu-mon  logs
  • Create a group with a name of your choice
  • Edit the group's plugins
  • Enter the path ./
  • The hostname must be added to the group manually
  • The GPU power metric can now be viewed
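
The plugin script listed above, 60_gpuMonitor.sh, follows the open-falcon plugin naming convention in which the numeric prefix before the first underscore is the run interval in seconds. The sketch below checks that a file name follows this pattern before it is dropped into the plugin directory. The helper name plugin_step is hypothetical and is not part of open-falcon or gpu-mon.

```shell
#!/bin/sh
# plugin_step (hypothetical helper): prints the numeric step prefix of an
# open-falcon plugin file name such as 60_gpuMonitor.sh, or returns 1 if the
# name does not have the form <digits>_<rest>.
plugin_step() {
  name=$(basename "$1")
  step=${name%%_*}                      # text before the first underscore
  [ "$step" != "$name" ] || return 1    # no underscore in the name at all
  case "$step" in
    ''|*[!0-9]*) return 1 ;;            # empty or non-numeric prefix
  esac
  printf '%s\n' "$step"
}
```

For example, calling plugin_step on /root/open-falcon/plugin/60_gpuMonitor.sh prints 60, while calling it on cfg.json prints nothing and returns a non-zero status.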
