Message-ID: <C27F8246C663564A84BB7AB3439772421B78147576@IRVEXCHCCR01.corp.ad.broadcom.com>
Date:	Sun, 30 May 2010 21:43:05 -0700
From:	"Michael Chan" <mchan@...adcom.com>
To:	"'Andi Kleen'" <andi@...stfloor.org>
cc:	"'davem@...emloft.net'" <davem@...emloft.net>,
	"'netdev@...r.kernel.org'" <netdev@...r.kernel.org>,
	"'linux-pci@...r.kernel.org'" <linux-pci@...r.kernel.org>
Subject: Re: [PATCH] bnx2: Fix IRQ failures during kdump.

Andi Kleen wrote:

> On Sun, May 30, 2010 at 09:12:15AM -0700, Michael Chan wrote:
> > Andi Kleen wrote:
> >
> > > "Michael Chan" <mchan@...adcom.com> writes:
> > >
> > > > When switching from the crashed kernel to the kdump kernel without going
> > > > through PCI reset, IRQs may not work if a different IRQ mode is used on
> > >
> > > PCIe with AER actually does support per link root port reset
> > > (e.g. used for AER)
> >
> > Do you mean the slot_reset function in the pci_error_handlers?  This
> 
> Well the fallback code in the PCIE root port driver
> that does the actual resets.

aer_root_reset() in aerdrv.c?

> 
> It could be called directly before kexec.
> 
> > needs to be called in the context of the crashed kernel, right?
> 
> It could be done on kexec, however of course you would rely
> on PCI root port data structures still being intact on a crash
> (I guess that's reasonable, they are not very complicated)
> 
> >
> > >
> > > I've been wondering for some time if kexec should not simply
> > > use that to reset all the devices, instead of adding hacks
> > > around this to all drivers.
> > >
> > > That would fix your problems too, right?
> >
> > If it is called in the context of the crashed kernel, it won't work.
> > We would reset it and put it back into the same IRQ mode.
> 
> Who would put it back? Your driver wouldn't be called anymore.

The bnx2 driver, like many other drivers, has a slot_reset function in the
pci_driver struct's err_handler.  If the AER code calls this function,
we would reset the chip and put it back into the same IRQ mode.  Without
calling this per-driver reset function, I'm not sure you can reset the
device at all if it does not support Function Level Reset.
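
To make the concern concrete, here is a rough userspace model of the two
reset paths, not kernel code.  The model_* types and functions are
hypothetical stand-ins that only mimic the shape of an err_handler with a
slot_reset callback.  It also assumes a bare reset leaves the chip in
legacy INTx mode.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: none of these are the real kernel definitions. */
enum model_irq_mode { MODEL_INTX, MODEL_MSI, MODEL_MSIX };

struct model_dev {
	enum model_irq_mode saved_mode; /* mode recorded in driver state */
	enum model_irq_mode hw_mode;    /* mode the chip is programmed for */
};

struct model_err_handlers {
	void (*slot_reset)(struct model_dev *dev);
};

/* Hypothetical driver callback: reset the chip, then re-init from saved state. */
static void model_slot_reset(struct model_dev *dev)
{
	dev->hw_mode = MODEL_INTX;      /* assumed post-reset default */
	dev->hw_mode = dev->saved_mode; /* driver re-init restores the old mode */
}

static const struct model_err_handlers model_handlers = {
	.slot_reset = model_slot_reset,
};

/* Reset that goes through the driver callback, as in the crashed kernel. */
static enum model_irq_mode reset_via_driver(struct model_dev *dev)
{
	assert(model_handlers.slot_reset != NULL);
	model_handlers.slot_reset(dev);
	return dev->hw_mode;
}

/* Reset with no driver involvement: the chip stays at the assumed default. */
static enum model_irq_mode reset_without_driver(struct model_dev *dev)
{
	dev->hw_mode = MODEL_INTX;
	return dev->hw_mode;
}
```

In this model, a device that was running in MSI-X mode is still in MSI-X
mode after reset_via_driver(), which is the case I am worried about.  Only
reset_without_driver() leaves it in the default mode for the kdump kernel.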

> 
> >
> > >
> > > The question is just if AER is widely enough supported for this.
> > >
> >
> > Some newer PCIe devices support Function Level Reset, and that would
> > be ideal.  But most existing devices including bnx2 devices don't have
> > this feature.
> 
> Root port reset should be fine for this case. Even if some
> innocent device on the same root port gets reset too that shouldn't
> matter.
> Only drawback for the NIC would be that you have to renegotiate links I think.
> 
> Also there are systems without AER support.
> 
> -Andi
> --
> ak@...ux.intel.com -- Speaking for myself only.

