linux-kernel - RE: [PATCH 2/4] Drivers: hv: balloon: account for gaps in hot add regions


Message-ID: <BN3PR03MB2146DADD381A3822CDB0F49AD8190@BN3PR03MB2146.namprd03.prod.outlook.com>
Date:	Sat, 6 Aug 2016 00:07:24 +0000
From:	"Alex Ng (LIS)" <alexng@...rosoft.com>
To:	Vitaly Kuznetsov <vkuznets@...hat.com>,
	"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>
CC:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Haiyang Zhang" <haiyangz@...rosoft.com>,
	KY Srinivasan <kys@...rosoft.com>
Subject: RE: [PATCH 2/4] Drivers: hv: balloon: account for gaps in hot add
 regions

> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@...hat.com]
> Sent: Friday, August 5, 2016 3:49 AM
> To: devel@...uxdriverproject.org
> Cc: linux-kernel@...r.kernel.org; Haiyang Zhang <haiyangz@...rosoft.com>;
> KY Srinivasan <kys@...rosoft.com>; Alex Ng (LIS) <alexng@...rosoft.com>
> Subject: [PATCH 2/4] Drivers: hv: balloon: account for gaps in hot add regions
> 
> I'm observing the following hot add requests from the WS2012 host:
> 
> hot_add_req: start_pfn = 0x108200 count = 330752
> hot_add_req: start_pfn = 0x158e00 count = 193536
> hot_add_req: start_pfn = 0x188400 count = 239616
> 
> As the host doesn't specify hot add regions, we try to create a
> 128 MB-aligned region covering the first request: we create the 0x108000 -
> 0x160000 region and we add 0x108000 - 0x158e00 memory. The second
> request passes the pfn_covered() check, we enlarge the region to 0x108000 -
> 0x190000 and add 0x158e00 - 0x188200 memory. The problem emerges with
> the third request as it starts at 0x188400, so there is a 0x200 gap which is not
> covered. As the end of our region is 0x190000 now, it again passes the
> pfn_covered() check, where we just adjust the covered_end_pfn and make it
> 0x188400 instead of 0x188200, which means that we'll try to online
> 0x188200-0x188400 pages, but these pages were never assigned to us and we
> crash.

The fact that the host sent a request that's non-contiguous with the previous
request is unexpected. Could we check to see the number of pages we returned
in our response, after each request?

I'm wondering if we may have given a wrong response that caused the host to
follow up with a gapped request.
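
For reference, the gap described in the quoted text can be reproduced from the three logged requests alone. The following is only a minimal sketch of that arithmetic: the start_pfn and count values are copied from the hot_add_req lines above, while the helper name find_gaps is hypothetical and is not part of the hv_balloon driver.

```python
# (start_pfn, count) pairs copied from the hot_add_req log lines quoted above.
REQUESTS = [
    (0x108200, 330752),
    (0x158e00, 193536),
    (0x188400, 239616),
]


def find_gaps(requests):
    """Return (gap_start, gap_end) PFN pairs between consecutive requests.

    Hypothetical helper for illustration only; it assumes the requests
    arrive in ascending PFN order, as they do in the log.
    """
    gaps = []
    prev_end = None
    for start_pfn, count in requests:
        # A gap exists when a request starts past the end of the previous one.
        if prev_end is not None and start_pfn > prev_end:
            gaps.append((prev_end, start_pfn))
        prev_end = start_pfn + count
    return gaps


for gap_start, gap_end in find_gaps(REQUESTS):
    print(f"gap: {gap_start:#x} - {gap_end:#x} ({gap_end - gap_start:#x} pages)")
```

The first two requests are contiguous (the first ends at 0x158e00, where the second starts), so the only gap reported is 0x188200 - 0x188400, the 0x200 pages mentioned in the patch description.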