linux-kernel - Re: [PATCH] Drivers: hv: vmbus: handle various crash scenarios

Message-ID: <87mvpq7gtg.fsf@vitty.brq.redhat.com>
Date:	Tue, 22 Mar 2016 15:00:59 +0100
From:	Vitaly Kuznetsov <vkuznets@...hat.com>
To:	KY Srinivasan <kys@...rosoft.com>
Cc:	"devel\@linuxdriverproject.org" <devel@...uxdriverproject.org>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
	Haiyang Zhang <haiyangz@...rosoft.com>,
	"Alex Ng \(LIS\)" <alexng@...rosoft.com>,
	"Radim Krcmar" <rkrcmar@...hat.com>,
	Cathy Avery <cavery@...hat.com>
Subject: Re: [PATCH] Drivers: hv: vmbus: handle various crash scenarios

KY Srinivasan <kys@...rosoft.com> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@...hat.com]
>> Sent: Monday, March 21, 2016 12:52 AM
>> To: KY Srinivasan <kys@...rosoft.com>
>> Cc: devel@...uxdriverproject.org; linux-kernel@...r.kernel.org; Haiyang
>> Zhang <haiyangz@...rosoft.com>; Alex Ng (LIS) <alexng@...rosoft.com>;
>> Radim Krcmar <rkrcmar@...hat.com>; Cathy Avery <cavery@...hat.com>
>> Subject: Re: [PATCH] Drivers: hv: vmbus: handle various crash scenarios
>> 
>> KY Srinivasan <kys@...rosoft.com> writes:
>> 
>> >> -----Original Message-----
>> >> From: Vitaly Kuznetsov [mailto:vkuznets@...hat.com]
>> >> Sent: Friday, March 18, 2016 5:33 AM
>> >> To: devel@...uxdriverproject.org
>> >> Cc: linux-kernel@...r.kernel.org; KY Srinivasan <kys@...rosoft.com>;
>> >> Haiyang Zhang <haiyangz@...rosoft.com>; Alex Ng (LIS)
>> >> <alexng@...rosoft.com>; Radim Krcmar <rkrcmar@...hat.com>; Cathy
>> >> Avery <cavery@...hat.com>
>> >> Subject: [PATCH] Drivers: hv: vmbus: handle various crash scenarios
>> >>
>> >> Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always
>> >> delivered to CPU0 regardless of what CPU we're sending CHANNELMSG_UNLOAD
>> >> from. vmbus_wait_for_unload() doesn't account for the fact that in case
>> >> we're crashing on some other CPU and CPU0 is still alive and operational
>> >> CHANNELMSG_UNLOAD_RESPONSE will be delivered there completing
>> >> vmbus_connection.unload_event, our wait on the current CPU will never
>> >> end.
>> >
>> > What was the host you were testing on?
>> >
>> 
>> I was testing on both 2012R2 and 2016TP4. The bug is easily reproducible
>> by forcing crash on a secondary CPU, e.g.:
>
> Prior to 2012 R2, all messages would be delivered on CPU0 and this includes CHANNELMSG_UNLOAD_RESPONSE.
> For this reason we don't support kexec on pre-2012 R2 hosts. From 2012 R2 on, all vmbus
> messages (responses) will be delivered on the CPU that we initially set up - look at the code in
> vmbus_negotiate_version(). So on 2012 R2 and later hosts, the response to CHANNELMSG_UNLOAD
> will be delivered on the CPU where we initiate the contact with the
> host - CHANNELMSG_INITIATE_CONTACT message.

Unfortunately there is a discrepancy between WS2012R2 and WS2016TP4. On
WS2012R2 what you're saying is true and all messages including
CHANNELMSG_UNLOAD_RESPONSE are delivered to the CPU we used for initial
contact. On WS2016TP4 CHANNELMSG_UNLOAD_RESPONSE seems to be a special
case and it is always delivered to CPU0, no matter which CPU we used for
initial contact. This may be a host bug. You can use the attached patch
to see the issue.

For now I can suggest we check message pages for all CPUs from
vmbus_wait_for_unload(). We can race with other CPUs again but we don't
care as we're checking for completion_done() in the loop as well. I'll
try this approach.
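
A minimal userspace sketch of the idea above, not the actual patch: the
waiting CPU scans a message slot for every CPU instead of only its own,
and also checks a completion flag in case another CPU's handler got there
first. All names here (msg_page, unload_done, wait_for_unload(), the
msg_type values, NR_CPUS = 4) are hypothetical stand-ins for illustration;
the real code works on the per-CPU SynIC message pages and
vmbus_connection.unload_event.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* assumption: fixed CPU count for the model */

/* Hypothetical message types, not the kernel's CHANNELMSG_* values. */
enum msg_type { MSG_NONE = 0, MSG_UNLOAD_RESPONSE, MSG_OTHER };

struct msg_slot {
	enum msg_type type;
};

/* One pending-message slot per CPU (models the per-CPU message page). */
static struct msg_slot msg_page[NR_CPUS];

/* Models completion_done(&vmbus_connection.unload_event). */
static bool unload_done;

/*
 * Poll up to max_polls times. Each pass first checks whether some other
 * CPU already completed the unload, then scans every CPU's slot for the
 * unload response. Returns true once the response has been observed.
 */
static bool wait_for_unload(int max_polls)
{
	assert(max_polls > 0);

	for (int i = 0; i < max_polls; i++) {
		if (unload_done)
			return true;

		for (int cpu = 0; cpu < NR_CPUS; cpu++) {
			if (msg_page[cpu].type == MSG_UNLOAD_RESPONSE) {
				/* Consume the message and stop waiting. */
				msg_page[cpu].type = MSG_NONE;
				return true;
			}
		}
	}
	return false;
}
```

In this model it does not matter which slot the response lands in, so a
host that always targets CPU0 and one that targets the initial-contact
CPU are both handled. Unrelated messages in other slots are left alone.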

-- 
  Vitaly


View attachment "0001-Drivers-hv-vmbus-handle-various-crash-scenarios.patch" of type "text/x-patch" (6177 bytes)