linux-kernel - Re: debugging rmmod failures

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <c2f59560-f925-8f43-f967-a8b33c9abd36@oracle.com>
Date:   Mon, 19 Sep 2022 09:48:48 -0700
From:   Stephen Brennan <stephen.s.brennan@...cle.com>
To:     Steve French <smfrench@...il.com>,
        LKML <linux-kernel@...r.kernel.org>
Cc:     linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: debugging rmmod failures

On 9/16/22 13:13, Steve French wrote:
> Any suggestions on how to debug why rmmod fails with EBUSY?
> 
> I was trying to debug a problem where rmmod fails during an xfstest
> run but couldn't find useful information on how to get ideas 'why'
> rmmod is failing (in this case with EBUSY) and I could only find two
> places where EBUSY would plausibly be returned but don't see the
> pr_debug message in the log (e.g. "already dying") that I would expect
> to get in those cases.
> 
> It also fails in this test scenario (which takes a while to reproduce
> so isn't trivial to repro) with EBUSY when doing "rmmod --verbose
> --force"  and the --verbose didn't display any additional info.
> 
> I also tried "trace-cmd record -e *module*" which showed it (one call)
> returning 0xFFFFFFF0 but nothing useful that I could see.

In situations like this I sometimes find it useful to use the ftrace
function_graph tracer. It can show you every single function call,
starting from some root function. Of course it's a bit of a footgun, so
be careful with it :)

I used it to trace the delete_module system call like so:

# trace-cmd record -p function_graph -g __x64_sys_delete_module \
                    -F rmmod $MODULE_TO_REMOVE
...
# trace-cmd report
            rmmod-41656 [007]  5506.963846: funcgraph_entry:                   |  __x64_sys_delete_module() {
            rmmod-41656 [007]  5506.963846: funcgraph_entry:                   |    capable() {
            rmmod-41656 [007]  5506.963846: funcgraph_entry:                   |      security_capable() {
...                                                                                ...
            rmmod-41656 [007]  5506.965521: funcgraph_entry:                   |    free_module() {
            rmmod-41656 [007]  5506.965521: funcgraph_entry:                   |      mod_sysfs_teardown() {
            rmmod-41656 [007]  5506.965522: funcgraph_entry:                   |        mutex_lock() {


Of course, my module teardown succeeded, but you should be able to
cross reference your function calls with the source code to see where
an error may have occurred.

Regards,
Stephen

> 
> Any ideas on how to debug *why* an rmmod fails?