[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <613dc158-e3de-f713-019e-cf8c190e72be@acm.org>
Date: Fri, 16 Jun 2017 07:15:00 -0500
From: Corey Minyard <minyard@....org>
To: Tony Camuso <tcamuso@...hat.com>,
openipmi-developer@...ts.sourceforge.net
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ipmi: use rcu lock around call to
intf->handlers->sender()
On 06/15/2017 10:54 AM, Corey Minyard wrote:
> On 06/13/2017 09:54 AM, Tony Camuso wrote:
>> A vendor with a system having more than 128 CPUs occasionally
>> encounters a
>> crash during shutdown. This is not an easily reproduceable event, but
>> the
>> vendor was able to provide the following analysis of the crash, which
>> exhibits the same footprint each time.
>>
>> crash> bt
>> PID: 0 TASK: ffff88017c70ce70 CPU: 5 COMMAND: "swapper/5"
>> #0 [ffff88085c143ac8] machine_kexec at ffffffff81059c8b
>> #1 [ffff88085c143b28] __crash_kexec at ffffffff811052e2
>> #2 [ffff88085c143bf8] crash_kexec at ffffffff811053d0
>> #3 [ffff88085c143c10] oops_end at ffffffff8168ef88
>> #4 [ffff88085c143c38] no_context at ffffffff8167ebb3
>> #5 [ffff88085c143c88] __bad_area_nosemaphore at ffffffff8167ec49
>> #6 [ffff88085c143cd0] bad_area_nosemaphore at ffffffff8167edb3
>> #7 [ffff88085c143ce0] __do_page_fault at ffffffff81691d1e
>> #8 [ffff88085c143d40] do_page_fault at ffffffff81691ec5
>> #9 [ffff88085c143d70] page_fault at ffffffff8168e188
>> [exception RIP: unknown or invalid address]
>> RIP: ffffffffa053c800 RSP: ffff88085c143e28 RFLAGS: 00010206
>> RAX: ffff88017c72bfd8 RBX: ffff88017a8dc000 RCX: ffff8810588b5ac8
>> RDX: ffff8810588b5a00 RSI: ffffffffa053c800 RDI: ffff8810588b5a00
>> RBP: ffff88085c143e58 R8: ffff88017c70d408 R9: ffff88017a8dc000
>> R10: 0000000000000002 R11: ffff88085c143da0 R12: ffff8810588b5ac8
>> R13: 0000000000000100 R14: ffffffffa053c800 R15: ffff8810588b5a00
>> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>> --- <IRQ stack> ---
>> [exception RIP: cpuidle_enter_state+82]
>> RIP: ffffffff81514192 RSP: ffff88017c72be50 RFLAGS: 00000202
>> RAX: 0000001e4c3c6f16 RBX: 000000000000f8a0 RCX: 0000000000000018
>> RDX: 0000000225c17d03 RSI: ffff88017c72bfd8 RDI: 0000001e4c3c6f16
>> RBP: ffff88017c72be78 R8: 000000000000237e R9: 0000000000000018
>> R10: 0000000000002494 R11: 0000000000000001 R12: ffff88017c72be20
>> R13: ffff88085c14f8e0 R14: 0000000000000082 R15: 0000001e4c3bb400
>> ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018
>>
>> This is the corresponding stack trace
>>
>> It has crashed because the area pointed with RIP extracted from timer
>> element is already removed during a shutdown process.
>>
>> The function is smi_timeout().
>>
>> And we think ffff8810588b5a00 in RDX is a parameter struct smi_info
>>
>> crash> rd ffff8810588b5a00 20
>> ffff8810588b5a00: ffff8810588b6000 0000000000000000 .`.X............
>> ffff8810588b5a10: ffff880853264400 ffffffffa05417e0 .D&S......T.....
>> ffff8810588b5a20: 24a024a000000000 0000000000000000 .....$.$........
>> ffff8810588b5a30: 0000000000000000 0000000000000000 ................
>> ffff8810588b5a40: ffffffffa053a040 ffffffffa053a060 @.S.....`.S.....
>> ffff8810588b5a50: 0000000000000000 0000000100000001 ................
>> ffff8810588b5a60: 0000000000000000 0000000000000e00 ................
>> ffff8810588b5a70: ffffffffa053a580 ffffffffa053a6e0 ..S.......S.....
>> ffff8810588b5a80: ffffffffa053a4a0 ffffffffa053a250 ..S.....P.S.....
>> ffff8810588b5a90: 0000000500000002 0000000000000000 ................
>>
>> Unfortunately the top of this area is already detroyed by someone.
>> But because of two reasonns we think this is struct smi_info
>> 1) The address included in between ffff8810588b5a70 and
>> ffff8810588b5a80:
>> are inside of ipmi_si_intf.c see crash> module ffff88085779d2c0
>>
>> 2) We've found the area which point this.
>> It is offset 0x68 of ffff880859df4000
>>
>> crash> rd ffff880859df4000 100
>> ffff880859df4000: 0000000000000000 0000000000000001 ................
>> ffff880859df4010: ffffffffa0535290 dead000000000200 .RS.............
>> ffff880859df4020: ffff880859df4020 ffff880859df4020 @.Y.... @.Y....
>> ffff880859df4030: 0000000000000002 0000000000100010 ................
>> ffff880859df4040: ffff880859df4040 ffff880859df4040 @@.Y....@@.Y....
>> ffff880859df4050: 0000000000000000 0000000000000000 ................
>> ffff880859df4060: 0000000000000000 ffff8810588b5a00 .........Z.X....
>> ffff880859df4070: 0000000000000001 ffff880859df4078 ........x@......
>>
>> If we regards it as struct ipmi_smi in shutdown process
>> it looks consistent.
>>
>> The remedy for this apparent race is affixed below.
>
> I think you are right about this problem, but in_shutdown is checked
> already
> a bit before when newmsg is extracted from the list. Wouldn't it be
> better
> to add the rcu_read_lock() region starting right before the previous
> in_shutdown check to after the send? That would avoid a leak in this
> case.
While lying awake unable to sleep, I realized that you can't call the
sender function while holding rcu_read_lock(). That will break RT,
because you can't claim a mutex while holding rcu_read_lock(),
and the sender function will claim normal spinlocks.
So I need to think about this a bit.
-corey
>
> Thanks,
>
> -corey
>
>> Signed-off-by: Tony Camuso <tcamuso@...hat.com>
>> ---
>> drivers/char/ipmi/ipmi_msghandler.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/char/ipmi/ipmi_msghandler.c
>> b/drivers/char/ipmi/ipmi_msghandler.c
>> index 9f69995..577509f 100644
>> --- a/drivers/char/ipmi/ipmi_msghandler.c
>> +++ b/drivers/char/ipmi/ipmi_msghandler.c
>> @@ -3897,8 +3897,13 @@ static void smi_recv_tasklet(unsigned long val)
>> }
>> if (!run_to_completion)
>> spin_unlock_irqrestore(&intf->xmit_msgs_lock, flags);
>> - if (newmsg)
>> - intf->handlers->sender(intf->send_info, newmsg);
>> +
>> + if (newmsg) {
>> + rcu_read_lock();
>> + if (!intf->in_shutdown)
>> + intf->handlers->sender(intf->send_info, newmsg);
>> + rcu_read_unlock();
>> + }
>> handle_new_recv_msgs(intf);
>> }
>
>
Powered by blists - more mailing lists