
The EFK log collection system in a k8s cluster


Kubernetes itself does not ship with a log collection solution. Broadly speaking, there are three common approaches:
1. Run an agent on every node to collect logs
Since such an agent has to run on every node, the natural way to deploy it is with a DaemonSet controller.
This approach only captures application logs written to stdout and stderr.
In short, one log-agent container runs on each node and collects the logs under that node's /var/log and /var/lib/docker/containers/ directories.
2. Include a sidecar container in every Pod to collect the application's logs
Running a log-collection agent in a sidecar container consumes a lot of resources, because you need one agent for every Pod being collected; in addition, those logs can no longer be accessed with kubectl logs (a minimal sidecar sketch follows this list).
3. Have the application push its logs directly to the collection backend
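
As a rough illustration of approach 2, here is a minimal sidecar sketch. The pod name, volume name, and log path are invented for the example, and the busybox tail container merely stands in for a real log agent such as fluentd:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-sidecar
spec:
  containers:
  # the application writes its log to a file on a shared emptyDir volume
  - name: app
    image: busybox
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)" >> /var/log/app/app.log; i=$((i+1)); sleep 1; done']
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  # the sidecar streams the same file; a real setup would run a log agent here instead of tail
  - name: log-sidecar
    image: busybox
    args: [/bin/sh, -c, 'touch /var/log/app/app.log; tail -n+1 -f /var/log/app/app.log']
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  volumes:
  - name: app-logs
    emptyDir: {}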

The most popular log collection solution for Kubernetes is the Elasticsearch, Fluentd and Kibana (EFK) stack, which is also one of the officially recommended approaches.
Elasticsearch is a real-time, distributed, scalable search engine that supports full-text and structured search. It is typically used to index and search large volumes of log data, but it can also be used to search many other kinds of documents.

Creating the Elasticsearch cluster
We use 3 Elasticsearch Pods to avoid the "split-brain" problem that can occur in a highly available multi-node cluster, and we create the Elasticsearch Pods with a StatefulSet controller. With 3 master-eligible nodes the quorum is 3/2 + 1 = 2 (integer division), which is the discovery.zen.minimum_master_nodes value used in the StatefulSet below.
When creating the StatefulSet pods, the PVC template references a StorageClass so that PVs and PVCs are generated automatically, giving us persistent data storage; the nfs-client-provisioner has already been prepared in advance.
1. Create a dedicated namespace

apiVersion: v1
kind: Namespace
metadata:
  name: logging

2. Create a StorageClass (an existing StorageClass can also be used)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: es-data-db
provisioner: fuseim.pri/ifs      # must match the provisioner name configured for nfs-client-provisioner

3. Before creating the StatefulSet pods, create a headless service. The headless service (clusterIP: None) gives each StatefulSet pod a stable DNS name such as es-cluster-0.elasticsearch, which the discovery settings below rely on.

kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: logging
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node

4. Create the Elasticsearch StatefulSet pods
$ docker pull docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.3
$ docker pull busybox

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: logging
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.3
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
          - name: cluster.name
            value: k8s-logs
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: discovery.zen.ping.unicast.hosts
            value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
          - name: discovery.zen.minimum_master_nodes
            value: "2"
          - name: ES_JAVA_OPTS
            value: "-Xms512m -Xmx512m"
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: es-data-db
      resources:
        requests:
          storage: 100Gi

$ kubectl get pod -n logging
NAME READY STATUS RESTARTS AGE
es-cluster-0 1/1 Running 0 42s
es-cluster-1 1/1 Running 0 10m
es-cluster-2 1/1 Running 0 9m49s
On the NFS server, three directories are generated automatically, one for each pod's data:
$ cd /data/k8s
$ ls
logging-data-es-cluster-0-pvc-98c87fc5-c581-11e9-964d-000c29d8512b/
logging-data-es-cluster-1-pvc-07872570-c590-11e9-964d-000c29d8512b/
logging-data-es-cluster-2-pvc-27e15977-c590-11e9-964d-000c29d8512b/
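As an optional check (not part of the original steps), you can also confirm that the PVCs generated from the volumeClaimTemplates are all Bound:
$ kubectl get pvc -n logging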
Check the status of the ES cluster
$ kubectl port-forward es-cluster-0 9200:9200 --namespace=logging
In another terminal window, run
$ curl http://localhost:9200/_cluster/state?pretty
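Besides the full cluster state, the standard Elasticsearch health and node-list endpoints are a quick way to confirm that all three nodes have joined (same port-forward window):
$ curl http://localhost:9200/_cluster/health?pretty
$ curl http://localhost:9200/_cat/nodes?v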

Create Kibana with a Deployment controller

apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  ports:
  - port: 5601
  type: NodePort
  selector:
    app: kibana
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana-oss:6.4.3
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: ELASTICSEARCH_URL
            value: http://elasticsearch:9200
        ports:
        - containerPort: 5601

$ kubectl get svc -n logging |grep kibana
kibana   NodePort   10.111.239.0   <none>   5601:32081/TCP   114m
Access Kibana at
http://192.168.1.243:32081

Install and configure Fluentd
1. Provide the Fluentd configuration file through a ConfigMap object

kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  system.conf: |-
    <system>
      root_dir /tmp/fluentd-buffers/
    </system>
  containers.input.conf: |-
    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      localtime
      tag raw.kubernetes.*
      format json
      read_from_head true
    </source>
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>
  system.input.conf: |-
    <source>
      @id journald-docker
      @type systemd
      filters [{ "_SYSTEMD_UNIT": "docker.service" }]
      <storage>
        @type local
        persistent true
      </storage>
      read_from_head true
      tag docker
    </source>
    <source>
      @id journald-kubelet
      @type systemd
      filters [{ "_SYSTEMD_UNIT": "kubelet.service" }]
      <storage>
        @type local
        persistent true
      </storage>
      read_from_head true
      tag kubelet
    </source>
  forward.input.conf: |-
    <source>
      @type forward
    </source>
  output.conf: |-
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      include_tag_key true
      host elasticsearch
      port 9200
      logstash_format true
      request_timeout 30s
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>

In the configuration above we collect the Docker container log directory as well as the logs of the docker and kubelet services; after processing, the data is sent to the elasticsearch:9200 service. Because logstash_format is enabled, records end up in daily indices named logstash-*, which is what the Kibana index pattern configured later will match.
2. Create the fluentd pods with a DaemonSet
$ docker pull cnych/fluentd-elasticsearch:v2.0.4
Check the Docker Root Dir, since the DaemonSet below mounts /var/lib/docker/containers from the host and the path has to match:
$ docker info
Docker Root Dir: /var/lib/docker

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd-es
  namespace: logging
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "namespaces"
  - "pods"
  verbs:
  - "get"
  - "watch"
  - "list"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: fluentd-es
  namespace: logging
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: fluentd-es
  apiGroup: ""
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-es
  namespace: logging
  labels:
    k8s-app: fluentd-es
    version: v2.0.4
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-es
      version: v2.0.4
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        version: v2.0.4
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: fluentd-es
      containers:
      - name: fluentd-es
        image: cnych/fluentd-elasticsearch:v2.0.4
        env:
        - name: FLUENTD_ARGS
          value: --no-supervisor -q
        resources:
          limits:
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: config-volume
          mountPath: /etc/fluent/config.d
      nodeSelector:
        beta.kubernetes.io/fluentd-ds-ready: "true"
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: config-volume
        configMap:
          name: fluentd-config

This collects the logs under /var/log, /var/log/containers and /var/lib/docker/containers,
as well as the logs of the docker and kubelet services.
To keep flexible control over which nodes' logs are collected, a nodeSelector attribute is also added:

nodeSelector:
  beta.kubernetes.io/fluentd-ds-ready: "true"

So every node whose logs should be collected has to be labeled:
$ kubectl get node
$ kubectl label nodes server243.example.com beta.kubernetes.io/fluentd-ds-ready=true
$ kubectl get nodes --show-labels
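If logs should be collected from every node, the per-node labeling above can be replaced with a single command (a shortcut, not part of the original steps):
$ kubectl label nodes --all beta.kubernetes.io/fluentd-ds-ready=true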
Because our cluster was built with kubeadm, the master node is tainted by default, so to also collect logs from the master node we need to add a toleration:

tolerations:
- key: node-role.kubernetes.io/master
  operator: Exists
  effect: NoSchedule

$ kubectl get pod -n logging
NAME READY STATUS RESTARTS AGE
es-cluster-0 1/1 Running 0 10h
es-cluster-1 1/1 Running 0 10h
es-cluster-2 1/1 Running 0 10h
fluentd-es-rf6p6 1/1 Running 0 9h
fluentd-es-s99r2 1/1 Running 0 9h
fluentd-es-snmtt 1/1 Running 0 9h
kibana-bd6f49775-qsxb2 1/1 Running 0 11h
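To double-check that the DaemonSet scheduled one fluentd pod on every labeled (and tolerated) node, you can also look at its status:
$ kubectl get ds fluentd-es -n logging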
3. Configure Kibana
http://192.168.1.243:32081
Create index pattern: in the first step enter logstash-*, in the second step choose @timestamp
4. Create a test pod and view its logs in Kibana

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args: [/bin/sh, -c,
            'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']

Go back to the Kibana Dashboard page and, in the search bar on the Discover page, enter kubernetes.pod_name:counter
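To verify that what Kibana shows matches the raw container output, you can compare it against the pod's log stream directly (the counter pod runs in the default namespace):
$ kubectl logs counter --tail=5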
