Date:   Thu, 15 Mar 2018 18:28:21 +0000
From:   Dexuan Cui <decui@...rosoft.com>
To:     'Lorenzo Pieralisi' <lorenzo.pieralisi@....com>
CC:     "'bhelgaas@...gle.com'" <bhelgaas@...gle.com>,
        "'linux-pci@...r.kernel.org'" <linux-pci@...r.kernel.org>,
        KY Srinivasan <kys@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        "'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>,
        "'driverdev-devel@...uxdriverproject.org'" 
        <driverdev-devel@...uxdriverproject.org>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        "'olaf@...fle.de'" <olaf@...fle.de>,
        "'apw@...onical.com'" <apw@...onical.com>,
        "'jasowang@...hat.com'" <jasowang@...hat.com>,
        "'vkuznets@...hat.com'" <vkuznets@...hat.com>,
        "'marcelo.cerri@...onical.com'" <marcelo.cerri@...onical.com>,
        "Michael Kelley (EOSG)" <Michael.H.Kelley@...rosoft.com>
Subject: RE: [PATCH v4 1/2] PCI: hv: Serialize the present and eject work items

> From: Dexuan Cui
> > From: Lorenzo Pieralisi <lorenzo.pieralisi@....com>
> > I need to know either what commit you are fixing (ie Fixes: tag - which
> > is preferable) or you tell me which kernel versions we are targeting
> > for the stable backport.
> > Lorenzo
> 
> Sorry.  Here I was hesitant to add a "Fixes:" because the bug has been there
> since the first day, when the driver was introduced.
> 
> Please use
> Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft
> Hyper-V VMs")
> or
> Cc: <stable@...r.kernel.org> # v4.6+

BTW, the bug here is a race condition which couldn't be easily hit in the past,
probably because most of the time only one PCI device was added to the VM, and
only once. But now it's becoming typical that a VM can have 4 GPU devices, so
we have started to notice this bug. With 7 Mellanox VFs assigned to a VM, we can
easily reproduce the bug with a "hot-remove and hot-add VFs" test:

general protection fault: 0000 [#1] SMP
...
Workqueue: events hv_eject_device_work [pci_hyperv]
task: ffff8800ed5e5400 ti: ffff8800ee674000 task.ti: ffff8800ee674000
RIP: 0010:[<ffffffffc025c5ce>]  ... hv_eject_device_work+0xbe/0x160 [pci_hyperv]
...
Call Trace:
 [<ffffffff8183c3a6>] ? __schedule+0x3b6/0xa30
 [<ffffffff8109a585>] process_one_work+0x165/0x480
 [<ffffffff8109a8eb>] worker_thread+0x4b/0x4c0
 [<ffffffff8109a8a0>] ? process_one_work+0x480/0x480
 [<ffffffff810a0c25>] kthread+0xe5/0x100
 [<ffffffff810a0b40>] ? kthread_create_on_node+0x1e0/0x1e0
 [<ffffffff81840f0f>] ret_from_fork+0x3f/0x70
 [<ffffffff810a0b40>] ? kthread_create_on_node+0x1e0/0x1e0
Code: ...
RIP  [<ffffffffc025c5ce>] hv_eject_device_work+0xbe/0x160 [pci_hyperv]
...
BUG: unable to handle kernel paging request at ffffffffffffffd8
IP: [<ffffffff810a12d0>] kthread_data+0x10/0x20
PGD 1e0d067 PUD 1e0f067 PMD 0
Oops: 0000 [#2] SMP
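
For anyone who wants a feel for why serializing the present and eject work
items helps, here is a minimal userspace sketch, not the driver code. It
assumes a single worker thread that runs queued items strictly in submission
order, so a "present" item and an "eject" item can never run concurrently or
out of order. Every name in it (ordered_wq, fake_bus, present_work,
eject_work, run_demo) is hypothetical and made up for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a serialized work queue: one worker
 * thread runs queued items strictly in FIFO order. Not pci-hyperv code. */
#define MAX_ITEMS 16

typedef void (*work_fn)(void *arg);

struct ordered_wq {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    struct { work_fn fn; void *arg; } items[MAX_ITEMS];
    size_t head, tail;
    int stopping;
    pthread_t worker;
};

static void *worker_main(void *p)
{
    struct ordered_wq *wq = p;

    for (;;) {
        pthread_mutex_lock(&wq->lock);
        while (wq->head == wq->tail && !wq->stopping)
            pthread_cond_wait(&wq->cond, &wq->lock);
        if (wq->head == wq->tail) {      /* stopping and fully drained */
            pthread_mutex_unlock(&wq->lock);
            return NULL;
        }
        work_fn fn = wq->items[wq->head].fn;
        void *arg = wq->items[wq->head].arg;
        wq->head++;
        pthread_mutex_unlock(&wq->lock);
        fn(arg);    /* only this thread runs items, so they never overlap */
    }
}

static int wq_start(struct ordered_wq *wq)
{
    pthread_mutex_init(&wq->lock, NULL);
    pthread_cond_init(&wq->cond, NULL);
    wq->head = wq->tail = 0;
    wq->stopping = 0;
    return pthread_create(&wq->worker, NULL, worker_main, wq);
}

static int wq_queue(struct ordered_wq *wq, work_fn fn, void *arg)
{
    int ret = -1;

    assert(fn != NULL);
    pthread_mutex_lock(&wq->lock);
    if (wq->tail < MAX_ITEMS && !wq->stopping) {
        wq->items[wq->tail].fn = fn;
        wq->items[wq->tail].arg = arg;
        wq->tail++;
        ret = 0;
        pthread_cond_signal(&wq->cond);
    }
    pthread_mutex_unlock(&wq->lock);
    return ret;
}

static void wq_drain_and_stop(struct ordered_wq *wq)
{
    pthread_mutex_lock(&wq->lock);
    wq->stopping = 1;
    pthread_cond_signal(&wq->cond);
    pthread_mutex_unlock(&wq->lock);
    pthread_join(wq->worker, NULL);     /* returns once all items have run */
}

/* Toy bus state: touched only by the worker, read by the caller after join. */
struct fake_bus { int devices; int underflows; };

static void present_work(void *arg)
{
    struct fake_bus *bus = arg;
    bus->devices++;
}

static void eject_work(void *arg)
{
    struct fake_bus *bus = arg;
    if (bus->devices == 0)
        bus->underflows++;  /* eject ran before its matching present */
    else
        bus->devices--;
}

/* Queue 7 hot-add/hot-remove pairs, loosely mirroring the 7-VF test. */
static int run_demo(struct fake_bus *bus)
{
    struct ordered_wq wq;

    if (wq_start(&wq) != 0)
        return -1;
    for (int i = 0; i < 7; i++) {
        if (wq_queue(&wq, present_work, bus) != 0 ||
            wq_queue(&wq, eject_work, bus) != 0) {
            wq_drain_and_stop(&wq);
            return -1;
        }
    }
    wq_drain_and_stop(&wq);
    return 0;
}
```

Calling run_demo() on a zero-initialized struct fake_bus should leave both
counters at zero, because each eject runs only after the present queued before
it. In the kernel the analogous tool would be an ordered workqueue, but treat
that mapping as my assumption here rather than a description of the patch.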
 
Thanks,
-- Dexuan
