千家信息网

Debugging a Ceph Bug with gdb: A Worked Example

Published: 2025-02-05  Author: 千家信息网 editor

This article shares a worked example of debugging a Ceph bug with gdb. It is offered as a reference for readers who have not yet done this kind of debugging.

Environment: Ceph 10.2.3 on 32-bit ARMv7; Ceph is built in a Yocto environment.

Problem description: when testing Ceph on ARM development boards, the mon process crashes as soon as the mds process is started.

Ceph is compiled with -g by default, so the resulting binaries can be debugged with gdb directly.
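
Whether a given binary really carries debug information can be checked before starting gdb. The following is a minimal sketch, assuming the readelf tool from binutils is installed on the board. The helper name has_debug_info is made up for illustration, and /usr/bin/ceph-mon is the path that appears in the gdb output later in this article. On Yocto images the DWARF data may instead live in a separate debug package, in which case this check on the installed binary can fail even though gdb still finds symbols.

```shell
#!/bin/sh
# has_debug_info is a hypothetical helper name, not part of Ceph or gdb.
# It succeeds when the ELF file contains a .debug_info section (DWARF data).
has_debug_info() {
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}

# Default path taken from the gdb output in this article.
BIN=${1:-/usr/bin/ceph-mon}

if has_debug_info "$BIN"; then
    echo "$BIN: debug info present"
else
    echo "$BIN: no .debug_info section found (or file missing)"
fi
```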

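Next, debug the mon process with gdb. ceph-mon is a multi-process program (it forks), so gdb's fork-following settings must be configured in order to debug the child processes.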

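follow-fork-mode   detach-on-fork   Behavior
parent             on               Debug only the parent process (GDB default)
child              on               Debug only the child process
parent             off              Debug both processes; gdb follows the parent, and the child is held at the fork
child              off              Debug both processes; gdb follows the child, and the parent is held at the fork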

1. Start gdb

root@node32:~# gdb
GNU gdb (GDB) 7.12.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-ap-linux-gnueabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:.
Find the GDB manual and other documentation resources online at:.
For help, type "help".
Type "apropos word" to search for commands related to "word".

2. Load the ceph-mon executable

(gdb) file ceph-mon
Reading symbols from ceph-mon...done.

3. Set the run arguments

(gdb) set args -i node32
(gdb) show args
Argument list to give program being debugged when it is started is "-i node32".

4. Enable multi-process debugging; gdb will follow the child and hold the parent process at the fork

(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".
(gdb) show detach-on-fork
Whether gdb will detach the child of a fork is on.
(gdb) set follow-fork-mode child
(gdb) set detach-on-fork off
(gdb)
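
The settings from steps 2 to 4 can also be kept in a gdb command file so that they do not have to be retyped for every session. This is a minimal sketch, assuming gdb is started on the board itself. The file name ceph-mon.gdb and the CMDFILE variable are arbitrary choices for illustration; the gdb commands are the ones typed interactively above, with the full /usr/bin/ceph-mon path that gdb reports when the program starts.

```shell
#!/bin/sh
# Write the interactive commands from steps 2-4 into a gdb command file.
# The file name and location are arbitrary choices for this sketch.
CMDFILE=${CMDFILE:-${TMPDIR:-/tmp}/ceph-mon.gdb}

cat > "$CMDFILE" <<'EOF'
# Load the executable and set its arguments
file /usr/bin/ceph-mon
set args -i node32
# Follow the child after fork and keep the parent attached (held at the fork)
set follow-fork-mode child
set detach-on-fork off
# Print the resulting settings for confirmation
show follow-fork-mode
show detach-on-fork
EOF

echo "wrote $CMDFILE; start the session with: gdb -x $CMDFILE"
```

Started this way, gdb reads the file and then stays at its prompt, so the program can be launched with run exactly as in the next step.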

5. Run the program

(gdb) run
Starting program: /usr/bin/ceph-mon -i node32
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
[New Thread 0xb68d5ce0 (LWP 2036)]
[Thread 0xb68d5ce0 (LWP 2036) exited]
[New process 2037]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
[New Thread 0xb68d5ce0 (LWP 2038)]
Reading symbols from /usr/lib/libtcmalloc.so.4...done.
Reading symbols from /usr/lib/libbz2.so.1...done.
Reading symbols from /lib/libz.so.1...done.
Reading symbols from /usr/lib/libleveldb.so.1...done.
Reading symbols from /usr/lib/libsnappy.so.1...done.
Reading symbols from /usr/lib/libnss3.so...done.
Reading symbols from /usr/lib/libnspr4.so...done.
Reading symbols from /lib/libpthread.so.0...done.
Reading symbols from /lib/libdl.so.2...done.
Reading symbols from /usr/lib/libboost_thread.so.1.63.0...done.
Reading symbols from /usr/lib/libboost_random.so.1.63.0...done.
Reading symbols from /lib/librt.so.1...done.
Reading symbols from /usr/lib/libboost_iostreams.so.1.63.0...done.
Reading symbols from /usr/lib/libboost_system.so.1.63.0...done.
Reading symbols from /usr/lib/libstdc++.so.6...done.
Reading symbols from /lib/libm.so.6...done.
Reading symbols from /lib/libgcc_s.so.1...done.
Reading symbols from /lib/libc.so.6...done.
Reading symbols from /lib/ld-linux-armhf.so.3...done.
Reading symbols from /usr/lib/libunwind.so.8...done.
Reading symbols from /usr/lib/libnssutil3.so...done.
Reading symbols from /usr/lib/libplc4.so...done.
Reading symbols from /usr/lib/libplds4.so...done.
[New Thread 0xb48c5ce0 (LWP 2039)]
[New Thread 0xb40c5ce0 (LWP 2040)]
[New Thread 0xb38c5ce0 (LWP 2041)]
[New Thread 0xb30c5ce0 (LWP 2042)]
[New Thread 0xb28c5ce0 (LWP 2043)]
[New Thread 0xb20c5ce0 (LWP 2044)]
[New Thread 0xb18c5ce0 (LWP 2045)]
[New Thread 0xb10c5ce0 (LWP 2046)]
[New process 2047]

6. List the inferiors (processes) that gdb is tracking

(gdb) info inferiors
  Num  Description       Executable
  1    process 2033      /usr/bin/ceph-mon
  2    process 2037      /usr/bin/ceph-mon
* 3                      /bin/bash.bash
(gdb)

gdb is currently on inferior 3. Inferior 3 has no process ID, which means it has already exited.

7. Inferiors 1 and 2 are both suspended at this point. Switch to inferior 1 and continue.

(gdb) inferior 1
[Switching to inferior 1 [process 2033] (/usr/bin/ceph-mon)]
[Switching to thread 1.1 (Thread 0xb6ff1010 (LWP 2033))]
#0  0xb6a15648 in __libc_fork () at /usr/src/debug/glibc/2.25-r0/git/sysdeps/nptl/fork.c:139
warning: Source file is more recent than executable.
139       pid = ARCH_FORK ();
(gdb) where
#0  0xb6a15648 in __libc_fork () at /usr/src/debug/glibc/2.25-r0/git/sysdeps/nptl/fork.c:139
#1  0x7f6eba7c in Preforker::prefork (this=0xbeffeb70, err=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/common/Preforker.h:52
#2  0x7f692058 in main (argc=, argv=0x0) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/ceph_mon.cc:500
(gdb) c
Continuing.
[Inferior 1 (process 2033) exited normally]
(gdb) info inferiors
  Num  Description       Executable
* 1                      /usr/bin/ceph-mon
  2    process 2037      /usr/bin/ceph-mon
(gdb)

Inferior 1 has now exited as well. Next, switch to inferior 2.

8. Switch to inferior 2 (inferior 2) and continue (c). The process then blocks, waiting for clients to send messages.

9. Start the mds process on another development board

10. mon receives the message and hits a segmentation fault

[New Thread 0xb08c5ce0 (LWP 2087)]
[New Thread 0xb05c5ce0 (LWP 2088)]
Thread 2.8 "ms_dispatch" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb20c5ce0 (LWP 2044)]
0xb6befe1c in std::local_Rb_tree_decrement (__x=0x7fc14b24 <_ZStL19piecewise_construct>)
    at ../../../../../../../../../../work-shared/gcc-5.4.0-r0/gcc-5.4.0/libstdc++-v3/src/c++98/tree.cc:98
98      ../../../../../../../../../../work-shared/gcc-5.4.0-r0/gcc-5.4.0/libstdc++-v3/src/c++98/tree.cc: No such file or directory.
(gdb)
Continuing.
Thread 2.8 "ms_dispatch" received signal SIGSEGV, Segmentation fault.
raise (sig=sig@entry=11) at /usr/src/debug/glibc/2.25-r0/git/sysdeps/unix/sysv/linux/raise.c:51
51      /usr/src/debug/glibc/2.25-r0/git/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  raise (sig=sig@entry=11) at /usr/src/debug/glibc/2.25-r0/git/sysdeps/unix/sysv/linux/raise.c:51
#1  0x7f970930 in reraise_fatal (signum=11) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/global/signal_handler.cc:71
#2  handle_fatal_signal (signum=11) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/global/signal_handler.cc:133
#3
#4  0xb6befe1c in std::local_Rb_tree_decrement (__x=0x7fc14b24 <_ZStL19piecewise_construct>)
    at ../../../../../../../../../../work-shared/gcc-5.4.0-r0/gcc-5.4.0/libstdc++-v3/src/c++98/tree.cc:98
#5  0x7f7e585c in std::_Rb_tree_iterator >::operator-- (this=) at /usr/include/c++/5.4.0/bits/stl_tree.h:220
#6  std::_Rb_tree, std::_Select1st >, std::less, std::allocator > >::_M_get_insert_hint_unique_pos (__k=..., __position=..., this=0x855265dc) at /usr/include/c++/5.4.0/bits/stl_tree.h:1924
#7  std::_Rb_tree, std::_Select1st >, std::less, std::allocator > >::_M_emplace_hint_unique, std::tuple<> >(std::_Rb_tree_const_iterator >, std::piecewise_construct_t const&, std::tuple&&, std::tuple<>&&) (this=this@entry=0x855265dc, __pos=...)
    at /usr/include/c++/5.4.0/bits/stl_tree.h:2174
#8  0x7f9b538c in std::map, std::allocator > >::operator[] (__k=..., this=0x855265dc)
    at /usr/include/c++/5.4.0/bits/stl_map.h:483
#9  FSMap::insert (this=this@entry=0x85526518, new_info=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/mds/FSMap.cc:794
#10 0x7f7d4c94 in MDSMonitor::prepare_beacon (this=this@entry=0x85526340, op=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/mon/MDSMonitor.cc:549
#11 0x7f7da428 in MDSMonitor::prepare_update (this=this@entry=0x85526340, op=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/mon/MDSMonitor.cc:469
#12 0x7f75bd20 in PaxosService::dispatch (this=this@entry=0x85526340, op=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/mon/PaxosService.cc:96
#13 0x7f72021c in Monitor::dispatch_op (this=this@entry=0x855bc000, op=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/mon/Monitor.cc:3605
#14 0x7f721078 in Monitor::_ms_dispatch (this=this@entry=0x855bc000, m=m@entry=0x855ff980) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/mon/Monitor.cc:3532
#15 0x7f743414 in Monitor::ms_dispatch (this=0x855bc000, m=0x855ff980) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/mon/Monitor.h:905
#16 0x7fb769b4 in Messenger::ms_deliver_dispatch (m=0x855ff980, this=0x855b0b00) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/msg/Messenger.h:584
#17 DispatchQueue::entry (this=0x855b0c80) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/msg/simple/DispatchQueue.cc:185
#18 0x7fa71d24 in DispatchQueue::DispatchThread::entry (this=) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/msg/simple/DispatchQueue.h:103
#19 0xb6d55f28 in start_thread (arg=0xb20c5ce0) at /usr/src/debug/glibc/2.25-r0/git/nptl/pthread_create.c:458
#20 0xb6a49968 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:76 from /lib/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
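
Once gdb has stopped on the SIGSEGV, a few more commands help to preserve the evidence and look at the Ceph frame closest to the fault. The sketch below writes them to a command file that can be loaded at the (gdb) prompt with source. The file name, the OUT variable, and the log path are arbitrary choices for illustration, and frame 9 refers to the FSMap::insert frame in the backtrace above. Which arguments and locals are actually readable depends on how the code was optimized, so no output is shown here.

```shell
#!/bin/sh
# Write a small gdb command file for inspecting the stopped process.
# File name and log path are arbitrary choices for this sketch.
OUT=${OUT:-${TMPDIR:-/tmp}/segv-inspect.gdb}

cat > "$OUT" <<'EOF'
# Do not pause long output, and copy everything to a log file
set pagination off
set logging file /tmp/ceph-mon-segv.log
set logging on
# Backtrace of the faulting thread, then of every thread
bt
thread apply all bt
# Select the FSMap::insert frame (#9 above) and dump its arguments and locals
frame 9
info args
info locals
set logging off
EOF

echo "at the (gdb) prompt, run: source $OUT"
```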

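The backtrace shows that the segmentation fault was raised at line 98 of work-shared/gcc-5.4.0-r0/gcc-5.4.0/libstdc++-v3/src/c++98/tree.cc, inside libstdc++'s red-black tree code. The call came from std::map::operator[] in FSMap::insert (FSMap.cc:794), reached through MDSMonitor::prepare_beacon. The node pointer passed to the decrement function resolves to the static symbol _ZStL19piecewise_construct, which suggests it was not pointing at a valid tree node.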

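Further debugging showed that the cause was an incompatibility between Ceph and the gcc 5.4.0 compiler. Switching the compiler to 6.3.0 fixed this crash, but Ceph built with 6.3.0 then crashed during pg scrub. After the Yocto toolchain was later changed to 4.9.2, neither the pg scrub problem nor the crash triggered by starting mds occurred again.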
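
When a crash is suspected to come from the toolchain, it helps to confirm which compiler actually produced each binary on the board. GCC normally records its version string in the ELF .comment section, which readelf can dump. This is a minimal sketch, assuming the readelf tool from binutils is available and that the section has not been stripped. The helper name gcc_ident is hypothetical, and the two paths are the ones that appear in the gdb output above.

```shell
#!/bin/sh
# gcc_ident is a hypothetical helper name used only in this sketch.
# It prints the unique "GCC: ..." identification strings found in the
# .comment section of an ELF file, or nothing if there are none.
gcc_ident() {
    readelf -p .comment "$1" 2>/dev/null | grep -o 'GCC: .*' | sort -u
}

# Paths taken from the gdb output in this article.
for f in /usr/bin/ceph-mon /usr/lib/libstdc++.so.6; do
    echo "== $f"
    gcc_ident "$f"
done
```

Comparing the strings before and after a toolchain change is a quick way to make sure that the rebuilt packages, and not stale ones, are what the board is running.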
