lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20060724204801.GE7448@austin.ibm.com>
Date:	Mon, 24 Jul 2006 15:48:01 -0500
From:	linas@...tin.ibm.com (Linas Vepstas)
To:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	linux-pci maillist <linux-pci@...ey.karlin.mff.cuni.cz>,
	Greg KH <greg@...ah.com>,
	Tom Long Nguyen <tom.l.nguyen@...el.com>
Subject: Re: [PATCH 1/5] PCI-Express AER implemetation: aer howto document

Hi,

More late commentary ...

On Fri, Jul 14, 2006 at 01:25:49PM +0800, Zhang, Yanmin wrote:
> --- linux-2.6.17/Documentation/pcieaer-howto.txt	1970-01-01 08:00:00.000000000 +0800
> +++ linux-2.6.17_aer/Documentation/pcieaer-howto.txt	2006-07-14 11:09:37.000000000 +0800
> +6.1. Configuring the AER capability structure
> +
> +AER aware drivers of PCI Express component need change the device
> +control registers to enable AER. They also could change AER registers,
> +including mask and severity registers.

Hmm. Why not just enable error reporting for everything? Why make 
the device driver jump through this extra hoop?

If there is some really good reason not to enable reporting by default,
(which I cannot think of at the moment), then there's another
possiblity: enable error reporting if and only if the device
driver has struct pci_driver -> err_handler != NULL.

> +6.2. Provide PCI error-recovery callbacks
> +
> +If an error message indicates a non-fatal error, performing link reset
> +at upstream is not required. The AER driver calls error_detected(dev,
> +pci_channel_io_normal) to all drivers associated within a hierarchy in

Hmm. I would rather extend enum pci_channel_state to include a non-fatal 
error notification. That is, add pci_channel_io_nonfatal_error=4; to the enum.  

> +If an error message indicates a fatal error, kernel will broadcast
> +error_detected(dev, pci_channel_io_frozen) to all drivers within
> +a hierarchy in question. 

"The hierarchy in question" -- does that meen all drivers attached to
the root port?  Or only drivers that aree using some particular link?
You don't want to notify/reset every PCI slot in the system (that would
hurt!); ideally one rests only the one PCI slot that was affected.

> +As different kinds of devices might use different approaches
> +to reset link, AER port service driver is required to provide the
> +function to reset link. Firstly, kernel looks for if the upstream
> +component has an aer driver. If it has, kernel uses the reset_link
> +callback of the aer driver. 

I don't yet entirely understand link reset. However, the original
pci error recovery spec was written by assuming that it would be 
the aer root port driver that performs the link reset. The callback
link_reset() was to notify the device driver that the link was reset.

> +8. Frequent Asked Questions
> +
> +Q: What happens if a PCI Express device driver does not provide an
> +error recovery handle?

What's an "error recovery handle"? Does this refer to the 
struct pci_driver {
   struct pci_error_handlers *err_handler;
}

pointer?  I think it does, but at first this is unclear. 

> +Q: How does this infrastructure deal with driver that is not PCI
> +Express aware?
> +
> +A: This infrastructure calls the error callback functions of the
> +driver when an error happens. But if the driver is not aware of
> +PCI Express, the device might not report its own errors to root
> +port.

Which is a good reason to enable eror reporting by default, or at 
least, to enable error reporting when 
struct pci_driver->err_handler != NULL


--linas
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ