[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4D545B93.9010109@oracle.com>
Date: Thu, 10 Feb 2011 13:41:39 -0800
From: Randy Dunlap <randy.dunlap@...cle.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: lkml <linux-kernel@...r.kernel.org>,
Corey Minyard <minyard@....org>,
Bjorn Helgaas <bjorn.helgaas@...com>
Subject: Re: Linux 2.6.38-rc4 (other bugs: ipmi Oops)
On 02/10/11 12:03, Linus Torvalds wrote:
> On Thu, Feb 10, 2011 at 11:34 AM, Randy Dunlap <randy.dunlap@...cle.com> wrote:
>>
>> Loading ipmi_si module a second time causes an Oops:
>>
>> [ 68.120143] RIP: 0010:[<ffffffff813fc579>] [<ffffffff813fc579>] put_driver+0x10/0x22
>
> The disassembly is
>
> 55 push %rbp
> 48 89 e5 mov %rsp,%rbp
> 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 48 ff 05 c7 af 80 01 incq 0x180afc7(%rip) # 0x180aff2
> * 48 8b 7f 60 mov 0x60(%rdi),%rdi <-- trapping instruction
> e8 38 27 ec ff callq 0xffffffffffec276c
> 48 ff 05 bf af 80 01 incq 0x180afbf(%rip) # 0x180affa
> c9 leaveq
> c3 retq
>
> which is the access of "drv->p" in that function:
>
> kobject_put(&drv->p->kobj);
>
> so "drv" that was passed in was just bogus. (it's
> "0xffffffffa06a8430", looks like it's the DEBUG_PAGEALLOC that has
> caused the page to be free'd).
>
>> [ 68.340115] Call Trace:
>> [ 68.340115] [<ffffffff813fc64b>] driver_register+0xc0/0x1b2
>> [ 68.340115] [<ffffffff8137f5de>] pnp_register_driver+0x28/0x31
>> [ 68.340115] [<ffffffffa06b888d>] init_ipmi_si+0x1a4/0x4cd [ipmi_si]
>> [ 68.340115] [<ffffffff810020a6>] do_one_initcall+0x6c/0x1e3
>> [ 68.340115] [<ffffffff810d4998>] sys_init_module+0x12b/0x307
>
> And I think that - as usual - the problem is that the damn driver
> cleanup is very ugly, and has this duplicate set of code to unregister
> all the random crap. Except one of the duplicates is missing one case.
> I think the bug was introduced by Gjorn Helgaas in commit 9e368fa011d4
> ("ipmi: add PNP discovery (ACPI namespace via PNPACPI)") which added
> the acpi pnp case, but only unregistered it on the regular module exit
> path, not on the "module loaded with no pnp devices" path.
>
> Does this patch fix it? And Corey - this is a good example of why the
> code shouldn't duplicate the "unregister stuff" in the module load
> error case vs the module exit path, and there should be a shared
> "cleanup()" function that is called by both. Can this be cleaned up,
> please?
>
> PATCH IS UNTESTED!
That works.
Acked-and-tested-by: Randy Dunlap <randy.dunlap@...cle.com>
thanks,
--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists