Netdata Monitoring

Introduction

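Netdata is a real-time performance monitoring tool for Linux. It shows the live state of the system and its applications (CPU, memory, disk I/O, network, and other Linux performance data) as web-based visualizations. The web front end responds quickly and needs no Flash plugin. The UI is clean, in keeping with Netdata's style. The dashboard opens with a large number of charts, and the most commonly used ones (CPU, RAM, network, and disk) sit at the top.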

https://github.com/netdata/netdata/blob/master/backends/prometheus/README.md

By default, Netdata listens on port 19999.
To have Prometheus scrape it, add a scrape job for that port to the Prometheus configuration file.
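
A minimal sketch of such a scrape job, modeled on the Netdata Prometheus documentation linked above. The job name and the target address are placeholders, not values from this article; replace them with your own.

```yaml
# prometheus.yml (excerpt)
scrape_configs:
  - job_name: 'netdata'                  # placeholder job name
    metrics_path: '/api/v1/allmetrics'   # Netdata's metrics export endpoint
    params:
      format: [prometheus]               # ask Netdata for the Prometheus text format
    honor_labels: true                   # keep the labels Netdata attaches to each series
    static_configs:
      - targets: ['localhost:19999']     # placeholder host; 19999 is Netdata's default port
```

With Netdata's default export settings the series names end in `_average`, which matches the metric names used in the rules below.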

Alerting rules

Netdata high CPU usage

Netdata reports high CPU usage (> 80%)

  - alert: NetdataHighCpuUsage
    # The *_average series are gauges, so rate() does not apply; usage = 100 - idle.
    expr: 100 - avg_over_time(netdata_cpu_cpu_percentage_average{dimension="idle"}[1m]) > 80
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: Netdata high cpu usage (instance {{ $labels.instance }})
      description: "Netdata high CPU usage (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

Host CPU steal (noisy neighbor)

CPU steal is above 10%. A noisy neighbor is degrading VM performance, or a spot instance may have run out of credit.
For background, see: https://www.gl.sh.cn/2022/03/10/zhu_ji_ying_jian_jian_kong_node_exporter.html#_CPU-2

  - alert: HostCpuStealNoisyNeighbor
    # Gauge metric: smooth it with avg_over_time() instead of rate().
    expr: avg_over_time(netdata_cpu_cpu_percentage_average{dimension="steal"}[1m]) > 10
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: Host CPU steal noisy neighbor (instance {{ $labels.instance }})
      description: "CPU steal is > 10%. A noisy neighbor is killing VM performance or a spot instance may be out of credit.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

Netdata high memory usage

Netdata reports high memory usage (> 80%)

  - alert: NetdataHighMemoryUsage
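    # Free plus cached memory as a share of all RAM dimensions, per instance.
    expr: 100 * sum by (instance) (netdata_system_ram_MB_average{dimension=~"free|cached"}) / sum by (instance) (netdata_system_ram_MB_average) < 20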
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: Netdata high memory usage (instance {{ $labels.instance }})
      description: "Netdata high memory usage (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

Netdata low disk space

Netdata reports low disk space (> 80% used)

  - alert: NetdataLowDiskSpace
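    # Available space as a share of all dimensions of the same disk chart (grouped by Netdata's chart label).
    expr: 100 * sum by (instance, chart) (netdata_disk_space_GB_average{dimension=~"avail|cached"}) / sum by (instance, chart) (netdata_disk_space_GB_average) < 20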
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: Netdata low disk space (instance {{ $labels.instance }})
      description: "Netdata low disk space (> 80%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

Netdata predicted disk full

Netdata predicts that the disk will be full within 24 hours

  - alert: NetdataPredictedDiskFull
    expr: predict_linear(netdata_disk_space_GB_average{dimension=~"avail|cached"}[3h], 24 * 3600) < 0
    for: 0m
    labels:
      severity: warning
    annotations:
      summary: Netdata predicted disk full (instance {{ $labels.instance }})
      description: "Netdata predicted disk full in 24 hours\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

Netdata MD mismatch count (unsynchronized blocks)

The RAID array has unsynchronized blocks

  - alert: NetdataMdMismatchCntUnsynchronizedBlocks
    expr: netdata_md_mismatch_cnt_unsynchronized_blocks_average > 1024
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: Netdata MD mismatch cnt unsynchronized blocks (instance {{ $labels.instance }})
      description: "RAID array has unsynchronized blocks\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

Netdata disk reallocated sectors

The disk has reallocated sectors

  - alert: NetdataDiskReallocatedSectors
    expr: increase(netdata_smartd_log_reallocated_sectors_count_sectors_average[1m]) > 0
    for: 0m
    labels:
      severity: info
    annotations:
      summary: Netdata disk reallocated sectors (instance {{ $labels.instance }})
      description: "Reallocated sectors on disk\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

Netdata disk current pending sectors

The disk currently has pending sectors

  - alert: NetdataDiskCurrentPendingSector
    expr: netdata_smartd_log_current_pending_sector_count_sectors_average > 0
    for: 0m
    labels:
      severity: warning
    annotations:
      summary: Netdata disk current pending sector (instance {{ $labels.instance }})
      description: "Disk current pending sector\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

Netdata reported uncorrectable disk sectors

The disk reports uncorrectable sectors

  - alert: NetdataReportedUncorrectableDiskSectors
    expr: increase(netdata_smartd_log_offline_uncorrectable_sector_count_sectors_average[2m]) > 0
    for: 0m
    labels:
      severity: warning
    annotations:
      summary: Netdata reported uncorrectable disk sectors (instance {{ $labels.instance }})
      description: "Reported uncorrectable disk sectors\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
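
Each rule above starts at `- alert:`, so it is a list item that has to sit under the `rules:` key of a rule group before Prometheus can load it. A minimal sketch of that wrapping, using one of the rules above unchanged; the file name and group name are hypothetical.

```yaml
# netdata_rules.yml (hypothetical file name)
groups:
  - name: netdata            # hypothetical group name
    rules:
      # Paste the rules from this article here, indented to this level.
      - alert: NetdataDiskCurrentPendingSector
        expr: netdata_smartd_log_current_pending_sector_count_sectors_average > 0
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: Netdata disk current pending sector (instance {{ $labels.instance }})
          description: "Disk current pending sector\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
```

Prometheus picks the file up once its path is listed under `rule_files:` in prometheus.yml. Running `promtool check rules netdata_rules.yml` validates the syntax before a reload.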
Reproduction without permission is prohibited: 陈桂林博客 (Chen Guilin's blog) » Netdata Monitoring