lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 05 Jun 2013 07:22:00 -0600
From:	Betty Dall <betty.dall@...com>
To:	Bjorn Helgaas <bhelgaas@...gle.com>
Cc:	Chen Gong <gong.chen@...ux.intel.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Huang Ying <ying.huang@...el.com>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>
Subject: Re: [PATCH v2 3/3] PCI/AER: Provide reset_link for firmware first
 root port

On Tue, 2013-06-04 at 16:15 -0600, Bjorn Helgaas wrote:
> On Tue, Jun 4, 2013 at 3:38 PM, Betty Dall <betty.dall@...com> wrote:
> > On Tue, 2013-06-04 at 04:36 -0400, Chen Gong wrote:
> >> On Thu, May 30, 2013 at 08:39:29AM -0600, Betty Dall wrote:
> >> > Date:       Thu, 30 May 2013 08:39:29 -0600
> >> > From: Betty Dall <betty.dall@...com>
> >> > To: rjw@...k.pl, bhelgaas@...gle.com
> >> > Cc: ying.huang@...el.com, linux-acpi@...r.kernel.org,
> >> >  linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org, Betty Dall
> >> >  <betty.dall@...com>
> >> > Subject: [PATCH v2 3/3] PCI/AER: Provide reset_link for firmware first root
> >> >  port
> >> > X-Mailer: git-send-email 1.7.7.6
> >> >
> >> > The firmware first path does not install the aerdrv root port
> >> > service driver, so the firmware first root port does not have
> >> > a reset_link callback. When a firmware first root port does not have
> >> > a reset_link callback, use a new default reset_link similar to what
> >> > we already do for the default_downstream_reset_link(). The firmware
> >> > first default reset_link brings the root port out of SBR if firmware
> >> > left it in SBR.
> >> >
> >> > Changes since v1:
> >> > Removed incorrect setting of p2p_ctrl after the read.
> >> >
> >> > Signed-off-by: Betty Dall <betty.dall@...com>
> >> > ---
> >> >
> >> >  drivers/pci/pcie/aer/aerdrv_core.c |   36 ++++++++++++++++++++++++++++++++++++
> >> >  1 files changed, 36 insertions(+), 0 deletions(-)
> >> >
> >> >
> >> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> >> > index 8ec8b4f..72f76cd 100644
> >> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
> >> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> >> > @@ -413,6 +413,39 @@ static pci_ers_result_t default_downstream_reset_link(struct pci_dev *dev)
> >> >     return PCI_ERS_RESULT_RECOVERED;
> >> >  }
> >> >
> >> > +/**
> >> > + * default_ff_root_port_reset_link - default reset function for firmware
> >> > + *         first Root Port
> >> > + * @dev: pointer to root port's pci_dev data structure
> >> > + *
> >> > + * Invoked when performing link reset at Root Port w/ no aer driver.
> >> > + * This happens through the firmware first path. Firmware may leave
> >> > + * the Root Port in SBR and it is the OS responsiblity to bring it out
> >> > + * of SBR.
> >> > + */
> >> > +static pci_ers_result_t default_ff_root_port_reset_link(struct pci_dev *dev)
> >> > +{
> >> > +   u16 p2p_ctrl;
> >> > +
> >> > +   /* Read Secondary Bus Reset */
> >> > +   pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &p2p_ctrl);
> >> > +
> >> > +   /* De-assert Secondary Bus Reset, if it is set */
> >> > +   if (p2p_ctrl & PCI_BRIDGE_CTL_BUS_RESET) {
> >> > +           p2p_ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> >> > +           pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
> >> > +
> >> > +           /*
> >> > +            * System software must wait for at least 100ms from the end
> >> > +            * of a reset of one or more device before it is permitted
> >> > +            * to issue Configuration Requests to those devices.
> >> > +            */
> >> > +           msleep(200);
> >> > +           dev_dbg(&dev->dev, "Root Port link has been reset\n");
> >> > +   }
> >> > +   return PCI_ERS_RESULT_RECOVERED;
> >> > +}
> >>
> >> I don't think this function is OK.
> >> 1) You don't really reset the 2nd Bus but just checking its status.
> >> I think you should have following steps to reset 2nd Bus:
> >>
> >> a. Assert 2nd Bus Reset
> >> b. wait for some time until this message has been broadcasted well
> >> c. De-assert 2nd Bus Reset
> >> d. wait for Trst time
> >>
> >> IOW, since we have aer_do_secondary_bus_reset to perform secondary bus
> >> reset, why you repeat it again?
> >>
> >> 2) msleep(200) is too long for kernel. You'd better yield the CPU when
> >> sleep.
> >
> > The firmware first path currently has no reset_link. I want to make a
> > minimal change since resetting the link could be considered firmware's
> > job in the firmware first path. This change is to just check if SBR is
> > set, and bring the link out of reset only if it is in SBR.  This way, if
> > a another firmware first platform is already resetting the link, it
> > won't be done twice. I took the msleep and the code from
> > aer_do_sercondary_bus_reset() as you noticed.
> 
> I understand the desire to make a minimal change, and we certainly
> don't want to make changes bigger than they need to be.
> 
> However, my goal is really to have clean, maintainable code at the
> end.  If that means bigger patches where you can't test every affected
> platform, that's a risk I'm willing to take.
> 
> In this particular case, we could try to limit the change to just the
> platforms you can test, as you've done.  But it looks to me like we
> actually want to reset the link in *all* cases, regardless of whether
> firmware-first is involved.  We're handling a fatal error for a
> device.  That device might be below a PCIe switch (where we currently
> reset the link), or it might be directly below a root port (where we
> currently don't).  Why should those cases be different?  I don't think
> the device or the driver should be able to tell the difference.
> 
> My *guess* is that this is an oversight in the original code, and that
> doing the link reset for both downstream ports and root ports will
> make error handling for devices below root ports work better.
> 
> The *last* thing I want is code littered with special cases that are
> only there because "they only fix the platforms I could actually
> test."  It's just about impossible to clean those up later.
> 
> Bjorn

That makes sense to me. I will work on a new version of the patch and
test it today. 

-Betty

> >> > +
> >> >  static int find_aer_service_iter(struct device *device, void *data)
> >> >  {
> >> >     struct pcie_port_service_driver *service_driver, **drv;
> >> > @@ -460,6 +493,9 @@ static pci_ers_result_t reset_link(struct pci_dev *dev)
> >> >             status = driver->reset_link(udev);
> >> >     } else if (pci_pcie_type(udev) == PCI_EXP_TYPE_DOWNSTREAM) {
> >> >             status = default_downstream_reset_link(udev);
> >> > +   } else if (pci_pcie_type(udev) == PCI_EXP_TYPE_ROOT_PORT &&
> >> > +           pcie_aer_get_firmware_first(udev)) {
> >> > +           status = default_ff_root_port_reset_link(udev);
> >> >     } else {
> >> >             dev_printk(KERN_DEBUG, &dev->dev,
> >> >                     "no link-reset support at upstream device %s\n",
> >> > --
> >> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> > the body of a message to majordomo@...r.kernel.org
> >> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> > Please read the FAQ at  http://www.tux.org/lkml/
> >
> >


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ