千家信息网

Elasticsearch Startup Analysis and Troubleshooting: bootstrap checks

Published: 2024-10-11  Author: 千家信息网 editorial staff



0 Overview

This article uses es (Elasticsearch) 5.6 on CentOS 6.5.

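1 Elasticsearch bootstrap checks

1.1 Development environment

If network.host is not set to a usable IP address in the es configuration, es binds to localhost by default and assumes it is only being used in a development environment. es still runs the bootstrap checks, which compare the user's configuration against the safe values es expects, but in keeping with its out-of-the-box principle the results are only reported:

  • If the configuration matches, no warnings are logged and es starts normally;
  • If it does not match, warnings are logged, but because this is a development environment es still starts normally.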

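1.2 Production environment

Once network.host is set to a usable non-loopback address, es assumes it is being started in a production environment. The same checks run, but any failed check is raised from a warning to an error, so es fails to start.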
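The distinction between the two modes can be sketched as a small shell function. This is a simplified illustration, not how es itself decides: the function name bootstrap_mode is hypothetical, only literal loopback forms are recognised, and 192.168.1.10 is a made-up example address.

```shell
# Classify a network.host value the way the text above describes:
#   empty or loopback -> development (failed checks only produce warnings)
#   anything else     -> production  (failed checks abort startup)
# Simplified sketch: only literal loopback forms are recognised.
bootstrap_mode() {
  case "$1" in
    ''|localhost|127.*|::1|_local_) echo development ;;
    *) echo production ;;
  esac
}

bootstrap_mode ""             # network.host not set
bootstrap_mode 192.168.1.10   # hypothetical non-loopback address
```

The first call stands for an elasticsearch.yml without network.host, the second for one that binds to a routable address.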

2 Analysis of bootstrap checks at startup in a development environment

Starting es without configuring network.host produces the following warnings:

[2018-12-07T04:15:44,735][INFO ][o.e.d.DiscoveryModule    ] [PQ85ukj] using discovery type [zen]
[2018-12-07T04:15:45,702][INFO ][o.e.n.Node               ] initialized
[2018-12-07T04:15:45,703][INFO ][o.e.n.Node               ] [PQ85ukj] starting ...
[2018-12-07T04:15:46,071][INFO ][o.e.t.TransportService   ] [PQ85ukj] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2018-12-07T04:15:46,090][WARN ][o.e.b.BootstrapChecks    ] [PQ85ukj] max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2018-12-07T04:15:46,090][WARN ][o.e.b.BootstrapChecks    ] [PQ85ukj] max number of threads [1024] for user [hadoop] is too low, increase to at least [2048]
[2018-12-07T04:15:46,090][WARN ][o.e.b.BootstrapChecks    ] [PQ85ukj] max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2018-12-07T04:15:46,090][WARN ][o.e.b.BootstrapChecks    ] [PQ85ukj] system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[2018-12-07T04:15:49,269][INFO ][o.e.c.s.ClusterService   ] [PQ85ukj] new_master {PQ85ukj}{PQ85ukjdSoeVEpSpByAjMw}{Dbb3lzTWTN-eUEKXO8z-sw}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2018-12-07T04:15:49,313][INFO ][o.e.h.n.Netty4HttpServerTransport] [PQ85ukj] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2018-12-07T04:15:49,313][INFO ][o.e.n.Node               ] [PQ85ukj] started
[2018-12-07T04:15:49,553][INFO ][o.e.g.GatewayService     ] [PQ85ukj] recovered [0] indices into cluster_state

The warnings extracted from this log are:

File descriptors: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
Threads: max number of threads [1024] for user [hadoop] is too low, increase to at least [2048]
Virtual memory: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
system call filters: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk

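There are four problems: file descriptors, thread count, virtual memory, and system call filters.

Despite the warnings, es treats this as a development environment and, in keeping with its out-of-the-box behavior, still starts normally.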
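Pulling the bootstrap-check warnings out of a longer log can be sketched with grep. The function name bootstrap_warnings and the log path in the comment are assumptions for illustration; the match strings are taken from the log format shown in this section.

```shell
# Print only the bootstrap-check warnings from an Elasticsearch log read on stdin.
# Matches the WARN level and the logger name o.e.b.BootstrapChecks as fixed strings.
bootstrap_warnings() {
  grep -F '[WARN ]' | grep -F '[o.e.b.BootstrapChecks'
}

# Hypothetical log location; adjust to your installation:
# bootstrap_warnings < logs/elasticsearch.log
```

Fixed-string matching (-F) avoids having to escape the square brackets that the log format uses.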

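3 Analysis of bootstrap checks at startup in a production environment

After binding to an IP address and starting es again, the following errors appear: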

ERROR: [4] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2]: max number of threads [1024] for user [hadoop] is too low, increase to at least [2048]
[3]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[4]: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk

These are reported as errors, so startup fails until the settings above are changed to meet the required values.

4 Configuration for a normal start in a production environment

Resolving the problems above requires the following configuration.

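4.1 File descriptors

  • Temporary change:
 ulimit -n 65536

This applies only to the current session and reverts to the default after logging in again. A non-root user also cannot raise the value above the current hard limit.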

  • Permanent change

Edit /etc/security/limits.conf as follows:

hadoop          soft    nofile  65536   # soft: the limit in effect by default; a process may raise it up to the hard limit
hadoop          hard    nofile  100000  # hard: the ceiling that cannot be exceeded

After logging in again, verify the new value with ulimit -n.
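The verification step can be sketched as a small script that compares the current soft limit with the 65536 minimum from the es message. The helper name limit_ok is hypothetical; the "unlimited" case is handled because ulimit can print that word instead of a number.

```shell
# Succeeds when a limit value satisfies a required minimum.
# "unlimited" (as printed by ulimit) always passes.
limit_ok() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

current=$(ulimit -n)   # soft limit on open files for this shell
if limit_ok "$current" 65536; then
  echo "nofile OK: $current"
else
  echo "nofile too low: $current (need at least 65536)"
fi
```

Run it as the user that starts es (hadoop in this article), since limits are applied per login session.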

4.2 Thread count

Edit /etc/security/limits.conf as follows:

hadoop          soft    nproc   2048
hadoop          hard    nproc   4096

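The configuration file itself describes nproc as the number of processes rather than threads. On Linux each thread counts toward this limit, which is why es reports it as the maximum number of threads: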

# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - an user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open files
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#
#<domain>      <type>  <item>         <value>
#
#*               soft    core            0
#*               hard    rss             10000
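To confirm which nproc value a session actually received, one option is to read the "Max processes" row of /proc/self/limits. The sketch below assumes a Linux system with /proc mounted; the function name soft_nproc is hypothetical.

```shell
# Print the soft "Max processes" limit from a /proc/<pid>/limits listing on stdin.
# In that listing the row reads: Max processes <soft> <hard> processes,
# so the soft value is the third whitespace-separated field.
soft_nproc() {
  awk '/^Max processes/ { print $3 }'
}

if [ -r /proc/self/limits ]; then
  echo "soft nproc for this session: $(soft_nproc < /proc/self/limits)"
fi
```

Pointing the same function at /proc/<pid>/limits for the pid of a running es process shows the limit that process actually started with, which can differ from the current shell's.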

4.3 Virtual memory

  • View the current value
sysctl vm.max_map_count
  • Temporary setting
sysctl -w vm.max_map_count=262144

This setting is lost when the system reboots.

  • Permanent setting

Edit the configuration file /etc/sysctl.conf as follows:

vm.max_map_count=262144

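The file is read at boot, so the setting takes effect after a restart. Running sysctl -p as root applies it immediately without rebooting.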
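Because the temporary and permanent settings live in different places, it can help to compare the running value with the one persisted in /etc/sysctl.conf. The following is a minimal sketch; the function name persisted_map_count is hypothetical, and it only inspects the single file given to it, not any other sysctl configuration directories.

```shell
# Print the last vm.max_map_count value assigned in a sysctl.conf-style file,
# or nothing if the file does not set it. Commented-out lines are ignored
# because the pattern must match from the start of the line.
persisted_map_count() {
  sed -n 's/^[[:space:]]*vm\.max_map_count[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

conf=/etc/sysctl.conf
if [ -r "$conf" ] && [ -r /proc/sys/vm/max_map_count ]; then
  echo "running:   $(cat /proc/sys/vm/max_map_count)"
  echo "persisted: $(persisted_map_count "$conf")"
fi
```

If the running value is 262144 but the persisted line is empty, the change was made only with sysctl -w and will be lost at the next reboot.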

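4.4 system call filters

  • Cause
    CentOS 6 does not support SecComp, while ES 5.4.0 sets bootstrap.system_call_filter to true by default and checks for it. The check therefore fails, and the failure prevents ES from starting.

  • Solution
    Set bootstrap.system_call_filter to false in elasticsearch.yml, placing it below the Memory setting: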
    bootstrap.memory_lock: false
    bootstrap.system_call_filter: false
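Whether a host is likely to hit this problem can be estimated from its kernel version. The sketch below rests on an assumption not stated in the article: that seccomp filter mode arrived in Linux 3.5, whereas CentOS 6 ships a 2.6.32 kernel. The function name kernel_at_least_3_5 is hypothetical, and a version number is only a heuristic because kernels can be built without seccomp support.

```shell
# Succeeds when a kernel release string (as printed by `uname -r`) is 3.5 or newer.
kernel_at_least_3_5() {
  major=${1%%.*}            # text before the first dot
  rest=${1#*.}              # text after the first dot
  minor=${rest%%.*}         # text before the next dot
  minor=${minor%%[!0-9]*}   # drop suffixes such as "-rc1"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "${minor:-0}" -ge 5 ]; }
}

if kernel_at_least_3_5 "$(uname -r)"; then
  echo "kernel is 3.5 or newer: seccomp filters may be available"
else
  echo "kernel is older than 3.5: expect the system call filter check to fail"
fi
```

On a newer kernel it is usually preferable to leave bootstrap.system_call_filter at its default rather than disabling it.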

Reference: https://www.jianshu.com/p/89f8099a6d09
