Message-ID: <DM6PR21MB1337424D893B60F48F45A289CAD30@DM6PR21MB1337.namprd21.prod.outlook.com>
Date: Mon, 12 Aug 2019 15:56:05 +0000
From: Haiyang Zhang <haiyangz@...rosoft.com>
To: Lorenzo Pieralisi <lorenzo.pieralisi@....com>
CC: "sashal@...nel.org" <sashal@...nel.org>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
KY Srinivasan <kys@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
"olaf@...fle.de" <olaf@...fle.de>, vkuznets <vkuznets@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v2] PCI: hv: Detect and fix Hyper-V PCI domain number
collision
> -----Original Message-----
> From: Lorenzo Pieralisi <lorenzo.pieralisi@....com>
> Sent: Monday, August 12, 2019 11:39 AM
> To: Haiyang Zhang <haiyangz@...rosoft.com>
> Cc: sashal@...nel.org; bhelgaas@...gle.com; linux-
> hyperv@...r.kernel.org; linux-pci@...r.kernel.org; KY Srinivasan
> <kys@...rosoft.com>; Stephen Hemminger <sthemmin@...rosoft.com>;
> olaf@...fle.de; vkuznets <vkuznets@...hat.com>; linux-
> kernel@...r.kernel.org
> Subject: Re: [PATCH v2] PCI: hv: Detect and fix Hyper-V PCI domain number
> collision
>
> On Tue, Aug 06, 2019 at 11:52:11PM +0000, Haiyang Zhang wrote:
> > Currently in Azure cloud, for passthrough devices including GPU, the
> > host sets the device instance ID's bytes 8 - 15 to a value derived from
> > the host HWID, which is the same on all devices in a VM. So, the device
> > instance ID's bytes 8 and 9 provided by the host are no longer unique.
> >
> > This can cause device passthrough to VMs to fail because the bytes 8 and
> > 9 is used as PCI domain number. So, as recommended by Azure host team,
> > we now use the bytes 4 and 5 which usually contain unique numbers as PCI
> > domain. The chance of collision is greatly reduced. In the rare cases of
> > collision, we will detect and find another number that is not in use.
>
> This is not clear at all. Why "finding another number" is fine with
> this patch while it is not with current kernel code ? Also does this
> have backward compatibility issues ?
Bytes 4 and 5 have more uniqueness (information entropy) than bytes 8 and 9,
so we use bytes 4 and 5. They can also be used on older hosts, so there are no
backward compatibility issues.
> I do not understand if a collision is a problem or not from the
> log above.
A collision causes the second device with the same domain number to fail to load.
I will include this information in the patch description.
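
As a rough illustration of the detect-and-reassign idea discussed here, the
following is a user-space sketch only, not the code from the patch. The names
(dom_in_use, dom_from_instance_id, dom_claim, dom_release), the little-endian
combination of bytes 4 and 5, and the choice to reserve 0 as "invalid" are all
assumptions made for the example.

```c
#include <stdint.h>

#define DOM_MAP_SIZE 65536u  /* one slot per possible 16-bit domain number */
#define DOM_INVALID  0u      /* assumption: 0 is reserved, never handed out */

/* Hypothetical in-use table; a real driver would more likely use a bitmap. */
static uint8_t dom_in_use[DOM_MAP_SIZE];

/* Build the requested domain number from bytes 4 and 5 of the instance ID.
 * The byte order (byte 5 as the high byte) is an assumption of this sketch. */
static uint16_t dom_from_instance_id(const uint8_t id[16])
{
    return (uint16_t)((id[5] << 8) | id[4]);
}

/* Claim the requested number if it is free; on collision, fall back to the
 * lowest free non-zero number. Returns DOM_INVALID if every slot is taken. */
static uint16_t dom_claim(uint16_t requested)
{
    uint32_t i;

    if (requested != DOM_INVALID && !dom_in_use[requested]) {
        dom_in_use[requested] = 1;
        return requested;
    }

    for (i = 1; i < DOM_MAP_SIZE; i++) {
        if (!dom_in_use[i]) {
            dom_in_use[i] = 1;
            return (uint16_t)i;
        }
    }

    return DOM_INVALID;
}

/* Give a number back when the device goes away. */
static void dom_release(uint16_t dom)
{
    if (dom != DOM_INVALID)
        dom_in_use[dom] = 0;
}
```

With this sketch, two devices that both request the same number end up with
different domain numbers: the first keeps the requested value and the second
gets the lowest free one, instead of failing to load.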
>
> > Thanks to Michael Kelley <mikelley@...rosoft.com> for proposing this
> idea.
>
> Add it as Suggested-by: tag.
I will add this line.
Thanks,
- Haiyang