Platform: AArch32, Linux 4.14, Zynq-7000
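The system randomly hangs and reboots. With the watchdog disabled, the system simply hangs: the serial console prints no error log and accepts no input.
To determine the system state, I added an LED blink to the camera interrupt handler. When the hang occurs the LED stops blinking, so the interrupt is no longer being serviced. A software crash normally leaves at least some log on the serial console, so it is strange that this hang prints nothing. My guess was a BUG_ON or a crash inside an interrupt handler, or a hardware cause (bus hang, unstable DRAM).
After enabling the kernel config options below, a DEBUG_ATOMIC_SLEEP call stack did show up, but the problem still reproduced after that was fixed.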
    CONFIG_KALLSYMS_ALL=y
    CONFIG_DEBUG_INFO=y
    # SysRq trigger 't' dumps the stacks of all tasks
    CONFIG_MAGIC_SYSRQ=y
    CONFIG_SOFTLOCKUP_DETECTOR=y
    CONFIG_DEBUG_ATOMIC_SLEEP=y
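A round of pattern hunting and added log statements produced little. I suspected that the relevant messages were sitting in the log buffer without ever reaching the console, so I decided to connect JTAG and dump the log buffer at the moment of the hang.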
Notes on using JTAG with the zynq7020 (taken from the Zynq TRM): the TRM describes the JTAG registers and configuration (cascade mode) quoted below. First check that your own system configuration is correct (MIO 2, 3, 4 and 5 must all be 0); other caveats about the JTAG connection are worth looking up as well.

JTAG Enable/Disable Control

The DAP and TAP controllers are controlled by a few mechanisms.
• Cascade versus Independent mode
• Enable/disable DAP and TAP controllers
• Permanently disable JTAG
The JTAG connections can be enabled and disabled using the devcfg.CTRL [JTAG_CHAIN_DIS] bit. It is set = 1 to disable JTAG to protect the PL from unwanted JTAG accesses. The DAP controller is enabled by setting the devcfg.CTRL [DAP_EN] bit = 111. Any other value causes the DAP controller to be bypassed. This bit is lockable by setting the devcfg.LOCK [DBG_LOCK] bit = 1. Once locked, it can only be unlocked with a POR reset.

If the indicator LED on the JTAG USB cable is green, the hardware configuration and the driver are working.
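The GUI of my Xilinx SDK installation would not open, so I could only work from the command line.
The Xilinx debug command xsdb can attach to the ARM cores over JTAG for debugging. wiki: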
    source settings64.sh
    xsdb
    rlwrap: warning: your $TERM is 'xterm-256color' but rlwrap couldn't find it in the terminfo database. Expect some problems.

    ****** Xilinx System Debugger (XSDB) v2016.4
      **** Build date : Jan 23 2017-19:28:44
        ** Copyright 1986-2016 Xilinx, Inc. All Rights Reserved.

    xsdb% connect
    attempting to launch hw_server

    ****** Xilinx hw_server v2016.4
      **** Build date : Jan 23 2017-19:28:34
        ** Copyright 1986-2016 Xilinx, Inc. All Rights Reserved.

    INFO: hw_server application started
    INFO: Use Ctrl-C to exit hw_server application
    INFO: To connect to this hw_server instance use url: TCP:127.0.0.1:3121

    tcfchan#0
    xsdb% targets
      1  APU
         2  ARM Cortex-A9 MPCore #0 (Running)
         3  ARM Cortex-A9 MPCore #1 (Running)
      4  xc7z020
    xsdb%
When targets lists both cores, the debugger has attached to the ARM DAP. The next step is to find the address of the log buffer and dump its contents.
Physical address of log_buf = virtual address - kernel start virtual address + kernel start physical address. This works because kernel-space virtual addresses map linearly (one-to-one) onto physical addresses.

    readelf -s vmlinux | grep __log_buf
     11475: c0e9251c 0x40000 OBJECT  LOCAL  DEFAULT   29 __log_buf
    # or look it up on the board
    cat /proc/kallsyms | grep __log_buf

    # get the physical load address of the kernel: 0x8000
    mkimage -l uImage
    Image Name:   Linux-4.14.0-xilinx-99556-gc12b6
    Created:      Fri Apr 24 20:03:19 2020
    Image Type:   ARM Linux Kernel Image (uncompressed)
    Data Size:    5821352 Bytes = 5684.91 kB = 5.55 MB
    Load Address: 00008000
    Entry Point:  00008000

    # get the Linux start virtual address from vmlinux.lds: 0xc0008000
    # so the physical address of log_buf = 0xc0e9251c - 0xc0008000 + 0x00008000 = 0xe9251c

    # detailed usage of the mrd command:
    # https://www.xilinx.com/html_docs/xilinx2018_1/SDK_Doc/xsct/memory/reference_memory_mrd.html

    # xsdb console
    # attach to target 1
    xsdb% target 1
    xsdb% mrd -bin -file log.bin 0xe9251c 0x40000

    # shell console
    strings log.bin > log.txt
    cat log.txt
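The address arithmetic above can be wrapped in a small shell helper so that it is easy to redo for another symbol or another kernel build. This is only a minimal sketch: the function name `virt_to_phys` is made up for illustration, and the formula holds only for linearly mapped kernel addresses such as `__log_buf`.

```shell
# virt_to_phys is a hypothetical helper name, not a kernel or Xilinx tool.
# Usage: virt_to_phys <symbol_virt> <kernel_start_virt> <kernel_start_phys>
virt_to_phys() {
    # physical = virtual - start virtual + start physical
    printf '0x%x\n' "$(( $1 - $2 + $3 ))"
}

# Values taken from this article: __log_buf from readelf, the start virtual
# address from vmlinux.lds, and the load address from mkimage -l.
virt_to_phys 0xc0e9251c 0xc0008000 0x00008000
```

With the values from this article the helper prints 0xe9251c, the address passed to mrd.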
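The log buffer contained a NULL pointer dereference message, which confirms that the log had been missing only because the log buffer was never flushed to the console in time. Running addr2line on the faulting address locates the offending code. The crash turned out to happen inside spi_lock_irq.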
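The addr2line step can be sketched roughly as follows. The helper name `oops_pc` is hypothetical, and it assumes the recovered text contains the usual 32-bit ARM oops register line of the form `pc : [<xxxxxxxx>]`; adjust the pattern if your log looks different.

```shell
# oops_pc is a hypothetical helper: print the first faulting PC found in a
# text log, assuming the 32-bit ARM oops line "pc : [<xxxxxxxx>] ...".
# Usage: oops_pc <log file>
oops_pc() {
    sed -n 's/.*pc : \[<\([0-9a-fA-F]\{8\}\)>].*/0x\1/p' "$1" | head -n 1
}
```

The resulting address can then be resolved with something like `addr2line -f -i -e vmlinux "$(oops_pc log.txt)"`, using the addr2line from the cross toolchain that built the kernel. This needs a vmlinux built with CONFIG_DEBUG_INFO=y.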
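Besides the log buffer, you can also dump the kernel text segment to check whether it has been overwritten, and dump the CPU registers to see whether the CPU is still running.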
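One possible way to check the text segment is to compare the JTAG dump byte by byte against a reference extracted from vmlinux. The sketch below only covers the comparison. The helper name `count_diff_bytes` and the file names are assumptions, and the reference image is assumed to come from the cross toolchain's objcopy, for example `objcopy -O binary -j .text vmlinux text_ref.bin`.

```shell
# count_diff_bytes is a hypothetical helper: print the number of byte
# positions at which two binary files differ (compared up to the length
# of the shorter file).
# Usage: count_diff_bytes <reference image> <dumped image>
count_diff_bytes() {
    cmp -l "$1" "$2" 2>/dev/null | wc -l | tr -d ' '
}
```

The dump itself can be taken with mrd in the same way as the log buffer, using the physical address and size of the .text section instead. A handful of differing bytes does not by itself prove corruption, because the kernel patches some of its own instructions at boot; long runs of differing bytes are the more suspicious sign.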
Reposted from: http://vxbji.baihongyu.com/