Message-ID: <20210315183226.GA14801@raphael-debian-dev>
Date:   Mon, 15 Mar 2021 18:32:32 +0000
From:   Raphael Norwitz <raphael.norwitz@...anix.com>
To:     Alex Williamson <alex.williamson@...hat.com>
CC:     Amey Narkhede <ameynarkhede03@...il.com>,
        Leon Romanovsky <leon@...nel.org>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
        Raphael Norwitz <raphael.norwitz@...anix.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Alay Shah <alay.shah@...anix.com>,
        Suresh Gumpula <suresh.gumpula@...anix.com>,
        Shyam Rajendran <shyam.rajendran@...anix.com>,
        Felipe Franciosi <felipe@...anix.com>
Subject: Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device
 reset mechanism

On Mon, Mar 15, 2021 at 10:29:50AM -0600, Alex Williamson wrote:
> On Mon, 15 Mar 2021 21:03:41 +0530
> Amey Narkhede <ameynarkhede03@...il.com> wrote:
> 
> > On 21/03/15 05:07PM, Leon Romanovsky wrote:
> > > On Mon, Mar 15, 2021 at 08:34:09AM -0600, Alex Williamson wrote:  
> > > > On Mon, 15 Mar 2021 14:52:26 +0100
> > > > Pali Rohár <pali@...nel.org> wrote:
> > > >  
> > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote:  
> > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus
> > > > > > > reset (pci_parent_bus_reset) which I think are hot reset and
> > > > > > > warm reset respectively.
> > > > >
> > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another
> > > > > > type of reset, which is currently implemented only for PCIe hot plug
> > > > > > bridges and for the PowerPC PowerNV platform, and it just calls PCI
> > > > > > secondary bus reset with some other hook. PCIe Warm Reset does not have
> > > > > > an API in the kernel and therefore drivers do not export this type of
> > > > > > reset via any kernel function (yet).
> > > >
> > > > Warm reset is beyond the scope of this series, but could be implemented
> > > > in a compatible way to fit within the pci_reset_fn_methods[] array
> > > > defined here.  Note that with this series the resets available through
> > > > > pci_reset_function() and the per device reset attribute in sysfs remain
> > > > exactly the same as they are currently.  The bus and slot reset
> > > > methods used here are limited to devices where only a single function is
> > > > affected by the reset, therefore it is not like the patch you proposed
> > > > which performed a reset irrespective of the downstream devices.  This
> > > > series only enables selection of the existing methods.  Thanks,  
> > >
> > > Alex,
> > >
> > > I asked the patch author here [1], but didn't get any response, maybe
> > > you can answer me. What is the use case scenario for this functionality?
> > >
> > > Thanks
> > >
> > > [1] https://lore.kernel.org/lkml/YE389lAqjJSeTolM@unreal/ 
> > >  
> > > Sorry for not responding immediately. There were some buggy wifi cards
> > > which needed FLR explicitly; I am not sure whether that behavior has been
> > > fixed in the drivers. There is also a use case at Nutanix, but the
> > > engineer involved is on PTO, which is why I did not respond immediately:
> > > I don't know the details yet.
> 
> And more generally, devices continue to have reset issues and we
> impose a fixed priority in our ordering.  We can and probably should
> continue to quirk devices when we find broken resets so that we have
> the best default behavior, but it's currently not easy for an end user
> to experiment, i.e. this reset works, that one doesn't.  We might also
> have platform issues where a given reset works better on a certain
> platform.  Exposing a way to test these things might lead to better
> quirks.  In the case I think Pali was looking for, they wanted a
> mechanism to force a bus reset.  If this was in reference to a single
> function device, this could be accomplished by setting a priority for
> that mechanism, which would translate to not only the sysfs reset
> attribute, but also the reset mechanism used by vfio-pci.  Thanks,
> 
> Alex
>

To confirm from our end - we have seen many instances where the default
reset methods have not worked well on our platform. Debugging these
issues is painful in practice, and this interface would make it far
easier.
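
As a rough illustration of the kind of experiment this enables, here is a
minimal sketch in Python. It is only a model of the idea, not the patch
itself: the method names, their default order, and the helpers
order_with_preference(), reset_device() and simulated_try_reset() are all
hypothetical and do not correspond to any kernel or sysfs interface.

```python
# Hypothetical model only: names and ordering are illustrative, not the kernel's.
DEFAULT_ORDER = ["device_specific", "flr", "pm", "slot", "bus"]


def order_with_preference(default_order, preferred=None):
    """Return the methods to try, with `preferred` (if any) moved to the front."""
    if preferred is None:
        return list(default_order)
    if preferred not in default_order:
        raise ValueError(f"unknown reset method: {preferred}")
    return [preferred] + [m for m in default_order if m != preferred]


def reset_device(order, try_reset):
    """Try each method in order; return the first that reports success, else None."""
    for method in order:
        if try_reset(method):
            return method
    return None


def simulated_try_reset(method):
    """Simulated device: both FLR and bus reset report success."""
    return method in {"flr", "bus"}


# With the fixed default order, FLR is always chosen on this device.
print(reset_device(order_with_preference(DEFAULT_ORDER), simulated_try_reset))
# Giving bus reset priority lets a user test that mechanism instead.
print(reset_device(order_with_preference(DEFAULT_ORDER, "bus"), simulated_try_reset))
```

With a fixed ordering, the only way to try the second case on real hardware
is to patch the kernel or add a quirk, which is the pain point described
above.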

Having an interface like this would also help us better communicate the
issues we find to upstream. Allowing others to more easily test our
(or other entities') findings should give better visibility into
which issues apply to a device in general and which are platform
specific. By disambiguating the former from the latter, we should be
able to better quirk devices for everyone, and in the platform-specific
cases, this interface allows for a safer and more elegant solution than
any of the current alternatives.

CC Alay, Suresh, Shyam and Felipe in case they have anything to add.
