
Installing Prometheus in a k8s Cluster

Published: 2025-02-19 Author: 千家信息网 editor

千家信息网, last updated 2025-02-19


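Early versions of Kubernetes shipped the combination of Heapster, InfluxDB, and Grafana for monitoring. Today the more popular monitoring tool is Prometheus, an open-source monitoring and alerting system that was originally built at SoundCloud and modeled on Borgmon, Google's internal monitoring system.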

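The main monitoring options for a Kubernetes cluster are currently:
1. Heapster: a cluster-wide monitoring and data aggregation tool that runs in the cluster as a Pod.
2. metrics-server: also a cluster-wide resource data aggregator, and the replacement for Heapster. Like Heapster, metrics-server only exposes the data and does not provide storage for it.
3. cAdvisor: Google's open-source tool for container resource monitoring and performance analysis. It was built specifically for containers and supports Docker containers natively. In Kubernetes there is no need to install it separately, because cAdvisor is built into the kubelet and can be used directly.
4. kube-state-metrics: listens to the API Server and generates state metrics about resource objects such as Deployments, Nodes, and Pods. Note that kube-state-metrics only exposes these metrics and does not store them, so we can use Prometheus to scrape and store the data.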

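Compared with traditional monitoring tools, Prometheus has the following main features:
A multi-dimensional data model, with time series identified by a metric name and key/value pairs
A flexible query language (PromQL)
No dependence on distributed storage; each server relies only on its local disk
Time series are collected by pulling over HTTP
Pushing time series is also supported
Targets are found through service discovery or static configuration
Support for multiple kinds of graphs and dashboards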
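
To make the data model in the list above more concrete, here is a minimal Python sketch that splits one line of Prometheus text exposition format into its metric name, label set, and value. The sample line and the helper name parse_sample are made up for illustration, and the parser is deliberately simplified (it ignores timestamps, escaped quotes, and comment lines).

```python
import re

# Simplified pattern: metric name, optional {label="value",...} block, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_sample(line):
    """Split one exposition-format line into (name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    name, label_block, value = match.groups()
    labels = dict(LABEL_RE.findall(label_block or ""))
    return name, labels, float(value)


# Hypothetical sample line, not taken from a real server.
line = 'http_requests_total{method="GET",code="200"} 1027'
name, labels, value = parse_sample(line)
print(name, labels, value)
```

Two samples that share a metric name but differ in any label value belong to different time series, which is what makes the model multi-dimensional.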

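Prometheus consists of several components, many of which are optional:
Prometheus Server: scrapes metrics and stores the time series data
exporter: exposes metrics for scrape jobs to collect
pushgateway: a gateway that receives metrics sent to it by push
alertmanager: the component that handles alerts
adhoc: used for querying data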

1. Create a dedicated namespace

apiVersion: v1
kind: Namespace
metadata:
  name: kube-ops

2. Manage the prometheus.yml configuration file as a ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-ops
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']

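A prometheus.yml configuration file is usually organized into three sections: global, rule_files, and scrape_configs. The ConfigMap above uses only global and scrape_configs.
The global section controls the global configuration of the Prometheus Server.
The rule_files section specifies where rule files are located. Prometheus loads rules from these locations and uses them to generate new time series or alerts. We have not configured any rules yet.
The scrape_configs section controls which resources Prometheus monitors.
The configuration contains a single job named prometheus, which collects time series data from the Prometheus server itself. The job has a single, statically configured target: port 9090 on localhost.
By default, Prometheus collects metrics from the /metrics path of each target, so this job scrapes the URL http://localhost:9090/metrics.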
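
How a scrape URL follows from a job definition can be sketched in a few lines of Python. This is only an illustration of the defaults described above (the http scheme and the /metrics path), not how Prometheus itself is implemented; the function name scrape_urls and the dictionary form of the job are assumptions made for the example.

```python
def scrape_urls(job):
    """Return the URLs a statically configured job would be scraped at."""
    scheme = job.get("scheme", "http")            # default scheme
    path = job.get("metrics_path", "/metrics")    # default metrics path
    urls = []
    for group in job.get("static_configs", []):
        for target in group.get("targets", []):
            urls.append(f"{scheme}://{target}{path}")
    return urls


# The job from the ConfigMap above, written as a Python dict.
job = {
    "job_name": "prometheus",
    "static_configs": [{"targets": ["localhost:9090"]}],
}
print(scrape_urls(job))  # ['http://localhost:9090/metrics']
```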

3. Configure RBAC authorization

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-ops
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
# rbac.authorization.k8s.io/v1beta1 was removed in Kubernetes 1.22; v1 matches the ClusterRole above
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: kube-ops

4. Configure a PV and PVC for data persistence

apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    server: 192.168.1.244
    path: /data/k8s
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus
  namespace: kube-ops
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

5. Create the Prometheus Pod resource
$ docker pull prom/prometheus:v2.4.3

apiVersion: apps/v1  # extensions/v1beta1 no longer serves Deployments on Kubernetes 1.16 and later
kind: Deployment
metadata:
  name: prometheus
  namespace: kube-ops
  labels:
    app: prometheus
spec:
  selector:  # required by apps/v1; must match the Pod template labels
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
      - image: prom/prometheus:v2.4.3
        name: prometheus
        command:
        - "/bin/prometheus"
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus"
        - "--storage.tsdb.retention=24h"
        - "--web.enable-admin-api"  # enables the admin HTTP API, which includes operations such as deleting time series
        - "--web.enable-lifecycle"  # enables hot reload: a POST to localhost:9090/-/reload applies config changes immediately
        ports:
        - containerPort: 9090
          protocol: TCP
          name: http
        volumeMounts:
        - mountPath: "/prometheus"
          subPath: prometheus
          name: data
        - mountPath: "/etc/prometheus"
          name: config-volume
        resources:
          requests:
            cpu: 100m
            memory: 512Mi
          limits:
            cpu: 100m
            memory: 512Mi
      securityContext:
        runAsUser: 0
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: prometheus
      - configMap:
          name: prometheus-config
        name: config-volume

$ kubectl get pod -n kube-ops
prometheus-77d968648-w5j6z 1/1 Running 53 82d
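
Because the Deployment passes --web.enable-lifecycle, the running server can be asked to re-read its configuration over HTTP after the ConfigMap changes. The following Python sketch shows one way to issue that request with the standard library. The helper names are hypothetical, and the base URL is an assumption: http://localhost:9090 only works from inside the Pod or through a port-forward, so substitute whatever address reaches your server.

```python
import urllib.request


def build_reload_request(base_url):
    """Build (but do not send) the POST request for the /-/reload endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_config(base_url, timeout=10):
    """Send the reload request and return the HTTP status code."""
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Only builds the request here; call reload_config(...) against a reachable server.
request = build_reload_request("http://localhost:9090")
print(request.get_method(), request.full_url)
```

Keep in mind that an updated ConfigMap can take a short while to appear in the mounted volume, so a reload sent immediately after editing it may still pick up the old file.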

6. Create a Service for the Prometheus Pod

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: kube-ops
  labels:
    app: prometheus
spec:
  selector:
    app: prometheus
  type: NodePort
  ports:
    - name: web
      port: 9090
      targetPort: http

$ kubectl get svc -n kube-ops
prometheus NodePort 10.102.197.83 9090:32619/TCP
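The Prometheus web UI is then reachable on the NodePort, here http://192.168.1.243:32619
Click Status > Targets to view the state of the monitored targets.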
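
The same target state is also available from the HTTP API at /api/v1/targets, which is handy for scripted checks. Below is a small Python sketch that summarizes the health of each active target from such a response. The helper name target_health is hypothetical, and the JSON payload is a hand-written, trimmed-down sample for illustration rather than output captured from this cluster.

```python
import json


def target_health(payload):
    """Map each active target's scrape URL to its reported health."""
    body = json.loads(payload)
    return {
        target["scrapeUrl"]: target["health"]
        for target in body["data"]["activeTargets"]
    }


# Hand-written sample response (assumption), reduced to the fields used here.
sample = json.dumps({
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "prometheus", "instance": "localhost:9090"},
                "scrapeUrl": "http://localhost:9090/metrics",
                "health": "up",
            }
        ],
        "droppedTargets": [],
    },
})
print(target_health(sample))  # {'http://localhost:9090/metrics': 'up'}
```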
