lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170823135140.GA8498@bhelgaas-glaptop.roam.corp.google.com>
Date:   Wed, 23 Aug 2017 08:51:40 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Oza Oza <oza.oza@...adcom.com>
Cc:     Bjorn Helgaas <bhelgaas@...gle.com>,
        Rob Herring <robh+dt@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Ray Jui <rjui@...adcom.com>,
        Scott Branden <sbranden@...adcom.com>,
        Jon Mason <jonmason@...adcom.com>,
        BCM Kernel Feedback <bcm-kernel-feedback-list@...adcom.com>,
        Andy Gospodarek <gospo@...adcom.com>,
        linux-pci <linux-pci@...r.kernel.org>,
        linux-kernel@...r.kernel.org,
        Oza Pawandeep <oza.pawandeep@...il.com>,
        Sinan Kaya <okaya@...eaurora.org>,
        Timur Tabi <timur@...eaurora.org>,
        Alex Williamson <alex.williamson@...hat.com>
Subject: Re: [PATCH v7 1/2] PCI: iproc: Retry request when CRS returned from
 EP

On Wed, Aug 23, 2017 at 03:57:02PM +0530, Oza Oza wrote:
> On Wed, Aug 23, 2017 at 2:03 AM, Bjorn Helgaas <helgaas@...nel.org> wrote:
> > [+cc Sinan, Timur, Alex]
> >
> > Hi Oza,
> >
> > On Mon, Aug 21, 2017 at 09:28:43PM +0530, Oza Pawandeep wrote:
> >> PCIe spec r3.1, sec 2.3.2
> >> If CRS software visibility is not enabled, the RC must reissue the
> >> config request as a new request.
> >>
> >> - If CRS software visibility is enabled,
> >> - for a config read of Vendor ID, the RC must return 0x0001 data
> >> - for all other config reads/writes, the RC must reissue the
> >>   request
> >
> > You mentioned early on that this fixes an issue with NVMe after a
> > reset.  Is that reset an FLR?  I'm wondering if this overlaps with the
> > work Sinan is doing:
> >
> >   http://lkml.kernel.org/r/20170818212310.15145.21732.stgit@bhelgaas-glaptop.roam.corp.google.com
> >
> > With the current code in the tree, pci_flr_wait() probably waits only
> > 100ms on your system.  It currently reads PCI_COMMAND and probably
> > sees 0xffff0001 because the NVMe device returns a CRS completion
> > (which this iproc RC incorrectly turns into 0xffff0001 data), so we
> > exit the loop after a single msleep(100).
> >
> > Sinan is extending pci_flr_wait() to read the Vendor ID and look for
> > the CRS SV indication.  If this is where you're seeing the NVMe issue,
> > I bet we can fix this by simply changing iproc_pcie_config_read() to
> > return the correct data when it sees a CRS completion.  Then the
> > retry loop in pci_flr_wait() will work as intended.
> >
> 
> The issue with User Space poll mode drivers resulting into CRS.
> 
> 
> root@...958742k:~# /usr/share/spdk/examples/nvme/perf -r 'trtype:PCIe
> traddr:0000:01:00.0' -q 128 -s 4096 -w read -d 2048 -t 30 -c 0x1
> Starting DPDK 16.11.1 initialization...
> [ DPDK EAL parameters: perf -c 1 -m 2048 --file-prefix=spdk_pid2202 ]
> EAL: Detected 8 lcore(s)
> EAL: Probing VFIO support...
> EAL: VFIO support initialized
> EAL: cannot open /proc/self/numa_maps, consider that all memory is in
> socket_id 0
> Initializing NVMe Controllers
> EAL: PCI device 0000:01:00.0 on NUMA socket 0
> EAL:   probe driver: 8086:953 spdk_nvme
> EAL:   using IOMMU type 1 (Type 1)
> [   45.811353] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@...a0
> Attaching to NVMe Controller at 0000:01:00.0 [8086:0953]
> [   46.486601] vfio_bar_restore: 0000:01:00.0 reset recovery - restoring bars
> nvme_ctrlr.c:1237:nvme_ctrlr_process_init: ***ERROR*** Initialization
> timed out in state 3
> nvme_ctrlr.c: 414:nvme_ctrlr_shutdown: ***ERROR*** did not shutdown
> within 5 seconds
> EAL: Hotplug doesn't support vfio yet
> spdk_nvme_probe() failed for transport address '0000:01:00.0'
> /usr/share/spdk/examples/nvme/perf: errors occured
> 
> ok, so this is in the context of
> [   52.500579] [<ffff000008088498>] dump_backtrace+0x0/0x208
> [   52.506147] [<ffff0000080886b4>] show_stack+0x14/0x20
> [   52.511357] [<ffff0000083ace34>] dump_stack+0x8c/0xb0
> [   52.516567] [<ffff00000840a550>] pci_flr_wait+0x18/0xc8
> [   52.521955] [<ffff00000840bf9c>] pcie_flr+0x34/0x68
> [   52.526986] [<ffff00000840e3d0>] __pci_dev_reset+0x100/0x2c0
> [   52.532823] [<ffff00000840e60c>] pci_try_reset_function+0x64/0x80
> [   52.539108] [<ffff00000856e4b8>] vfio_pci_ioctl+0x398/0xb78
> [   52.544857] [<ffff000008568378>] vfio_device_fops_unl_ioctl+0x20/0x30
> [   52.551501] [<ffff0000081d4e24>] do_vfs_ioctl+0xa4/0x738
> [   52.556980] [<ffff0000081d5544>] SyS_ioctl+0x8c/0xa0
> [   52.562100] [<ffff000008082f8c>] __sys_trace_return+0x0/0x4
> 
> 
> So, I have two things to do here.
> 
> 1) remove retry funciton and just do following.
>    data = readl(cfg_data_p);
>    if (data == CFG_RETRY_STATUS)    /* 0xffff0001 */
>      data = 0xffffffff;
> 
> 2) refactor the code with separate patch.
> 
> but for that Sinan's patch is required.
> then with both the patches I think, things will work out correctly for
> iproc SOC.

It looks like you are using the pci_flr_wait() path as I suspected.

If you do the check as you mentioned in 1), this should work even
without Sinan's patch.

It's possible you'd have to increase the pci_flr_wait() timeout
(currently 1sec total).  I think we'll end up doing that as part of
Sinan's work, so we use the same timeout whether or not the RC
supports CRS SV.

> 
> Let me know how this sounds and if you agree with above two points, I
> will be posting final set of patches.
> 
> let me know if you want me to change anything in the commit description.
> in my opinion I should remove config write description which is irrelevant here.

I agree.  I don't think you need to say anything special about write,
since I think an RC is probably allowed to drop (with no retries) a
config write that receives a CRS completion.  In this case, I expect
the RC would log an error, so you should see something in PCI_STATUS.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ