White Paper: Red Hat Crash Utility |
||
![]() |
||
< Prev | Contents | Next > |
![]() |
Crash Usage: A Case StudyThe steps taken to debug a kernel crash dump are not etched in stone, and the crash commands used to debug a kernel issue vary according to the problem exhibited. The section contains of a case study that shows how the capabilities of the crash utility were used to to debug a specific kernel problem. However, before doing so, it should be noted that the following commands are typically the most commonly-used:
A Case Study: "kernel BUG at pipe.c:120!"Upon bringing up a crash session, a great deal of information can be gained just by the invocation data. Here is what what displayed in this particular case:
In this case the PANIC string "kernel BUG at pipe.c:120!" points to the exact kernel source code line at which the panic occurred. Then, getting a backtrace of panicking task is typically the first order of the day:
The backtrace shows that the call to die() was generated by an invalid_op exception. The exception was caused by the BUG() call in the pipe_read() function:
In the code segment above, the pipe_read() code has previously down'd the semaphore of the inode associated with the pipe, giving it exclusive access. It had read all data in the pipe, but still needed more to satisfy the count requested. Finding that there was a writer with more data -- and who was waiting on the semaphore -- it woke up the writer. However, after doing the wakeup, it did a sanity-check on the pipe contents, and found that it was no longer empty -- which is theoretically impossible since it was still holding the semaphore. It appeared that the writer process wrote to the pipe while the reader process still had exclusive access -- somehow overriding the semaphore. Since the semaphore mechanism was seemingly not working, it was first necessary to look at the actual semaphore structure associated with the pipe's inode. This first required looking at the first argument to the pipe_read() function; the whatis command shows that it is a struct file pointer:
Using the bt -f option, each frame in the backtrace is expanded to show all stack data in the frame. Looking at the expansion of the sys_read() frame, we can see that the last thing pushed on the stack before calling pipe_read() was the file pointer address of edf3f740:
The task at hand is finding the inode containing the suspect semaphore from the file structure address. The file structure's f_dentry member points to its dentry structure, whose d_inode member in turn points to the pipe's inode. The struct command can be used to dump the complete contents of a data structure at a given address; by tagging the .member onto the structure name, we can print just the member desired. By following the structure chain, the inode address can be determined like so:
The dump of the semaphore structure above showed the problem: the counter value of 2 is illegal. It should never be greater than 1; in this case a value of 2 allows two successful down operations, i.e., giving two tasks access to the pipe at the same time. (As an aside, determining the inode address above could also be accomplished by using the context-sensitive files command, which dumps the associated file, dentry and inode structure addresses for each open file descriptor of a task. The dumped file descriptor list would contain one with a reference to the file structure at edf3f740, and would also show the associated inode address of f640e740.) Before getting a dumpfile, this same panic had occurred several times. It was erroneously presumed that the problem was in the pipe-handling code, but it was eventually determined not to be the case. By instrumenting a kernel with debug code, the starting counter value of a pipe was found to be 3. Compounding that problem was the fact that the inode slab cache is one of a few special cases that presume that the freed inode's contents are left in a legitimate state so that they do not have to be completely reinitialized with each subsequent reallocation. So when the pipe's inode was created, it received an inode with a bogus counter value. Confirming the existence of bogus inode structures in the slab cache was a multi-stepped procedure. Using the command kmem command to access the inode slab cache, we can get the addresses of all free and currently-allocated inodes. Since there are typically several thousand inodes, the output is extremely verbose, but here is the beginning of it:
In the truncated output above, all of the inode address in the slab cache are dumped; the ones currently in use are surrounded by brackets, the free ones are not. So, for example, the inodes at addresses f4e52040 and f4e52200 are free; the others are not. The full output was piped to a script that pulled out just the free inode addresses (i.e., output lines starting with three spaces), and redirected them into a file. The file was modified to be a crash input file by making each extracted inode address to be the arguments of the struct command, using its short-cut method that allows the dropping of the struct command name; therefore the input file contained hundreds of crash commands of the form:
Note that the struct command would be used by default above, as documented in its help page; if the first command line argument is not a crash or gdb command, but it is the name of a known data structure, it passes the arguments to the struct command. Using the capability of feeding an input file, in this case consisting of hundreds of short-cut struct commands like those above, the output was again quite verbose, consisting of structure member dumps of the form:
However, it was a simple matter of piping the output to grep, and looking for counter values not equal to 1:
This turned out to be the smoking gun. Another round of debugging with an instrumented kernel that trapped attempts to free an inode with a semaphore counter of 3 caught the perpetrator in the act. |
![]() |
||
< Prev | Contents | Next > |
![]() |