lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM6PR21MB1337D4F34CAA49BE369FB793CAD20@DM6PR21MB1337.namprd21.prod.outlook.com>
Date:   Tue, 13 Aug 2019 12:55:59 +0000
From:   Haiyang Zhang <haiyangz@...rosoft.com>
To:     Lorenzo Pieralisi <lorenzo.pieralisi@....com>
CC:     "sashal@...nel.org" <sashal@...nel.org>,
        "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        KY Srinivasan <kys@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        "olaf@...fle.de" <olaf@...fle.de>, vkuznets <vkuznets@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v3] PCI: hv: Detect and fix Hyper-V PCI domain number
 collision



> -----Original Message-----
> From: Lorenzo Pieralisi <lorenzo.pieralisi@....com>
> Sent: Tuesday, August 13, 2019 6:14 AM
> To: Haiyang Zhang <haiyangz@...rosoft.com>
> Cc: sashal@...nel.org; bhelgaas@...gle.com; linux-
> hyperv@...r.kernel.org; linux-pci@...r.kernel.org; KY Srinivasan
> <kys@...rosoft.com>; Stephen Hemminger <sthemmin@...rosoft.com>;
> olaf@...fle.de; vkuznets <vkuznets@...hat.com>; linux-
> kernel@...r.kernel.org
> Subject: Re: [PATCH v3] PCI: hv: Detect and fix Hyper-V PCI domain number
> collision
> 
> On Mon, Aug 12, 2019 at 06:20:53PM +0000, Haiyang Zhang wrote:
> > Currently in Azure cloud, for passthrough devices including GPU, the host
> > sets the device instance ID's bytes 8 - 15 to a value derived from the host
> > HWID, which is the same on all devices in a VM. So, the device instance
> > ID's bytes 8 and 9 provided by the host are no longer unique. This can
> > cause device passthrough to VMs to fail because the bytes 8 and 9 are used
> > as PCI domain number. Collision of domain numbers will cause the second
> > device with the same domain number fail to load.
> >
> > As recommended by Azure host team, the bytes 4, 5 have more uniqueness
> > (info entropy) than bytes 8, 9. So now we use bytes 4, 5 as the PCI domain
> > numbers. On older hosts, bytes 4, 5 can also be used -- no backward
> > compatibility issues here. The chance of collision is greatly reduced. In
> > the rare cases of collision, we will detect and find another number that is
> > not in use.
> 
> I have not explained what I meant correctly. This patch fixes an
> issue and the "find another number" fallback can be also applied
> to the current kernel without changing the bytes you use for
> domain numbers.
> 
> This patch would leave old kernels susceptible to breakage.
> 
> Again, I have no Azure knowledge but it seems better to me to
> add a fallback "find another number" allocation on top of mainline
> and send it to stable kernels. Then we can add another patch to
> change the bytes you use to reduce the number of collision.
> 
> Please let me know what you think, thanks.

Thanks for your clarification.
Actually, I hope the stable kernel will be patched to use bytes 4,5 too,
because host provided numbers are persistent across reboots, we like
to use them if possible.

I think we can either --
1) Apply this patch for mainline and stable kernels as well.
2) Or, break this patch into two patches, and apply both of them for 
Mainline and stable kernels.

Which way do you prefer?

Thanks,
- Haiyang

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ