linux-kernel - Re: [PATCH] ipmi: use rcu lock around call to intf->handlers->sender()

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c036b3d9-d83b-ae2e-e450-4a0b97c6bcde@redhat.com>
Date:   Mon, 19 Jun 2017 10:42:41 -0400
From:   Tony Camuso <tcamuso@...hat.com>
To:     minyard@....org, openipmi-developer@...ts.sourceforge.net
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ipmi: use rcu lock around call to
 intf->handlers->sender()

On 06/19/2017 10:32 AM, Corey Minyard wrote:
> On 06/19/2017 09:29 AM, Tony Camuso wrote:
>> On 06/19/2017 09:31 AM, Corey Minyard wrote:
>>> On 06/16/2017 07:15 AM, Corey Minyard wrote:
>>>> On 06/15/2017 10:54 AM, Corey Minyard wrote:
>>>>> On 06/13/2017 09:54 AM, Tony Camuso wrote:
>>>>>> A vendor with a system having more than 128 CPUs occasionally encounters a
>>>>>> crash during shutdown. This is not an easily reproduceable event, but the
>>>>>> vendor was able to provide the following analysis of the crash, which
>>>>>> exhibits the same footprint each time.
>>>>>>
>>>>>> crash> bt
>>>>>> PID: 0      TASK: ffff88017c70ce70  CPU: 5   COMMAND: "swapper/5"
>>>>>>   #0 [ffff88085c143ac8] machine_kexec at ffffffff81059c8b
>>>>>>   #1 [ffff88085c143b28] __crash_kexec at ffffffff811052e2
>>>>>>   #2 [ffff88085c143bf8] crash_kexec at ffffffff811053d0
>>>>>>   #3 [ffff88085c143c10] oops_end at ffffffff8168ef88
>>>>>>   #4 [ffff88085c143c38] no_context at ffffffff8167ebb3
>>>>>>   #5 [ffff88085c143c88] __bad_area_nosemaphore at ffffffff8167ec49
>>>>>>   #6 [ffff88085c143cd0] bad_area_nosemaphore at ffffffff8167edb3
>>>>>>   #7 [ffff88085c143ce0] __do_page_fault at ffffffff81691d1e
>>>>>>   #8 [ffff88085c143d40] do_page_fault at ffffffff81691ec5
>>>>>>   #9 [ffff88085c143d70] page_fault at ffffffff8168e188
>>>>>>      [exception RIP: unknown or invalid address]
>>>>>>      RIP: ffffffffa053c800  RSP: ffff88085c143e28  RFLAGS: 00010206
>>>>>>      RAX: ffff88017c72bfd8  RBX: ffff88017a8dc000  RCX: ffff8810588b5ac8
>>>>>>      RDX: ffff8810588b5a00  RSI: ffffffffa053c800  RDI: ffff8810588b5a00
>>>>>>      RBP: ffff88085c143e58   R8: ffff88017c70d408   R9: ffff88017a8dc000
>>>>>>      R10: 0000000000000002  R11: ffff88085c143da0  R12: ffff8810588b5ac8
>>>>>>      R13: 0000000000000100  R14: ffffffffa053c800  R15: ffff8810588b5a00
>>>>>>      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>>>>> --- <IRQ stack> ---
>>>>>>      [exception RIP: cpuidle_enter_state+82]
>>>>>>      RIP: ffffffff81514192  RSP: ffff88017c72be50  RFLAGS: 00000202
>>>>>>      RAX: 0000001e4c3c6f16  RBX: 000000000000f8a0  RCX: 0000000000000018
>>>>>>      RDX: 0000000225c17d03  RSI: ffff88017c72bfd8  RDI: 0000001e4c3c6f16
>>>>>>      RBP: ffff88017c72be78   R8: 000000000000237e   R9: 0000000000000018
>>>>>>      R10: 0000000000002494  R11: 0000000000000001  R12: ffff88017c72be20
>>>>>>      R13: ffff88085c14f8e0  R14: 0000000000000082  R15: 0000001e4c3bb400
>>>>>>      ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
>>>>>>
>>>>>> This is the corresponding stack trace
>>>>>>
>>>>>> It has crashed because the area pointed with RIP extracted from timer
>>>>>> element is already removed during a shutdown process.
>>>>>>
>>>>>> The function is smi_timeout().
>>>>>>
>>>>>> And we think ffff8810588b5a00 in RDX is a parameter struct smi_info
>>>>>>
>>>>>> crash> rd ffff8810588b5a00 20
>>>>>> ffff8810588b5a00:  ffff8810588b6000 0000000000000000 .`.X............
>>>>>> ffff8810588b5a10:  ffff880853264400 ffffffffa05417e0 .D&S......T.....
>>>>>> ffff8810588b5a20:  24a024a000000000 0000000000000000 .....$.$........
>>>>>> ffff8810588b5a30:  0000000000000000 0000000000000000 ................
>>>>>> ffff8810588b5a40:  ffffffffa053a040 ffffffffa053a060 @.S.....`.S.....
>>>>>> ffff8810588b5a50:  0000000000000000 0000000100000001 ................
>>>>>> ffff8810588b5a60:  0000000000000000 0000000000000e00 ................
>>>>>> ffff8810588b5a70:  ffffffffa053a580 ffffffffa053a6e0 ..S.......S.....
>>>>>> ffff8810588b5a80:  ffffffffa053a4a0 ffffffffa053a250 ..S.....P.S.....
>>>>>> ffff8810588b5a90:  0000000500000002 0000000000000000 ................
>>>>>>
>>>>>> Unfortunately the top of this area is already detroyed by someone.
>>>>>> But because of two reasonns we think this is struct smi_info
>>>>>>   1) The address included in between  ffff8810588b5a70 and ffff8810588b5a80:
>>>>>>    are inside of ipmi_si_intf.c  see crash> module ffff88085779d2c0
>>>>>>
>>>>>>   2) We've found the area which point this.
>>>>>>    It is offset 0x68 of  ffff880859df4000
>>>>>>
>>>>>> crash> rd  ffff880859df4000 100
>>>>>> ffff880859df4000:  0000000000000000 0000000000000001 ................
>>>>>> ffff880859df4010:  ffffffffa0535290 dead000000000200 .RS.............
>>>>>> ffff880859df4020:  ffff880859df4020 ffff880859df4020 @.Y.... @.Y....
>>>>>> ffff880859df4030:  0000000000000002 0000000000100010 ................
>>>>>> ffff880859df4040:  ffff880859df4040 ffff880859df4040 @@.Y....@@.Y....
>>>>>> ffff880859df4050:  0000000000000000 0000000000000000 ................
>>>>>> ffff880859df4060:  0000000000000000 ffff8810588b5a00 .........Z.X....
>>>>>> ffff880859df4070:  0000000000000001 ffff880859df4078 ........x@......
>>>>>>
>>>>>>   If we regards it as struct ipmi_smi in shutdown process
>>>>>>   it looks consistent.
>>>>>>
>>>>>> The remedy for this apparent race is affixed below.
>>>>>
>>>>> I think you are right about this problem, but in_shutdown is checked already
>>>>> a bit before when newmsg is extracted from the list. Wouldn't it be better
>>>>> to add the rcu_read_lock() region starting right before the previous
>>>>> in_shutdown check to after the send?  That would avoid a leak in this
>>>>> case.
>>>>
>>>> While lying awake unable to sleep, I realized that you can't call the
>>>> sender function while holding rcu_read_lock().  That will break RT,
>>>> because you can't claim a mutex while holding rcu_read_lock(),
>>>> and the sender function will claim normal spinlocks.
>>>>
>>>> So I need to think about this a bit.
>>>>
>>>
>>> I was wrong about this.  An rcu_read_lock() around the whole thing should
>>> be all that is required to fix this.  I can do a patch, or you can, if you like.
>>>
>>> Thanks again for pointing this out.
>>>
>>> -corey
>>
>> Is this what you have in mind?
>>
>> ---
>>  drivers/char/ipmi/ipmi_msghandler.c | 11 +++++++++--
>>  1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
>> index 9f69995..e20f8d7 100644
>> --- a/drivers/char/ipmi/ipmi_msghandler.c
>> +++ b/drivers/char/ipmi/ipmi_msghandler.c
>> @@ -3880,6 +3880,9 @@ static void smi_recv_tasklet(unsigned long val)
>>       */
>>      if (!run_to_completion)
>>          spin_lock_irqsave(&intf->xmit_msgs_lock, flags);
>> +
>> +    rcu_read_lock();
>> +
>>      if (intf->curr_msg == NULL && !intf->in_shutdown) {
>>          struct list_head *entry = NULL;
>>
>> @@ -3894,11 +3897,15 @@ static void smi_recv_tasklet(unsigned long val)
>>              newmsg = list_entry(entry, struct ipmi_smi_msg, link);
>>              intf->curr_msg = newmsg;
>>          }
>> +
>> +        if (newmsg)
>> +            intf->handlers->sender(intf->send_info, newmsg);
>>      }
>> +
>> +    rcu_read_unlock();
>> +
>>      if (!run_to_completion)
>>          spin_unlock_irqrestore(&intf->xmit_msgs_lock, flags);
>> -    if (newmsg)
>> -        intf->handlers->sender(intf->send_info, newmsg);
>>
>>      handle_new_recv_msgs(intf);
>>  }
> 
> No, you definitely cannot call the sender function while holding the lock.
> 
> I was talk about adding rcu_read_lock() before the spin_lock_irqsave()
> and rcu_read_unlock() after the sender() call.

More like this, then.

diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
index 9f69995..49a7e96 100644
--- a/drivers/char/ipmi/ipmi_msghandler.c
+++ b/drivers/char/ipmi/ipmi_msghandler.c
@@ -3878,6 +3878,9 @@ static void smi_recv_tasklet(unsigned long val)
  	 * because the lower layer is allowed to hold locks while calling
  	 * message delivery.
  	 */
+
+	rcu_read_lock();
+
  	if (!run_to_completion)
  		spin_lock_irqsave(&intf->xmit_msgs_lock, flags);
  	if (intf->curr_msg == NULL && !intf->in_shutdown) {
@@ -3900,6 +3903,8 @@ static void smi_recv_tasklet(unsigned long val)
  	if (newmsg)
  		intf->handlers->sender(intf->send_info, newmsg);
  
+	rcu_read_unlock();
+
  	handle_new_recv_msgs(intf);
  }

> 
> IIRC, this code was moved from a section that was under the rcu read
> lock, but the lock was not added when moved.
> 
> -corey
>