Date:	Fri, 24 Apr 2009 18:09:44 +0200
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Thadeu Lima de Souza Cascardo <cascardo@...oscopio.com>
Cc:	Jiri Slaby <jirislaby@...il.com>,
	e1000-devel@...ts.sourceforge.net, Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	Jesse Barnes <jbarnes@...tuousgeek.org>
Subject: Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec

On Thursday 23 April 2009, Thadeu Lima de Souza Cascardo wrote:
> On Thu, Apr 23, 2009 at 10:40:14PM +0200, Jiri Slaby wrote:
> > On 04/23/2009 04:41 PM, Thadeu Lima de Souza Cascardo wrote:
> > > On Thu, Apr 23, 2009 at 04:30:01PM +0200, Jiri Slaby wrote:
> > >> On 04/23/2009 04:10 PM, Thadeu Lima de Souza Cascardo wrote:
> > >>> Have you tried b43fcd7dc7b, found in v2.6.30-rc3?
> > >> I've tried 2.6.30-rc3-next-20090423 without success.
> > > 
> > > You mean next-20090423. The patch is really found there.
> > > 
> > > But, then, I realize you mean reverting these patches for the kernel
> > > that is running or the kernel that is being kexec'd?
> > 
> > The latter.
> > 
> > > If b43fcd7dc7b is applied to the running kernel, it fixes the shutdown
> > > issue, and the next loaded kernel probes e1000 fine.
> > 
> > Makes sense.
> > 
> > > If you are reverting 4a865905f in the kexec'd kernel and the running
> > > kernel does not have b43fcd7dc7b, then I'd like to test the revert for
> > > my case here, which is e100.
> > 
> > To make things clear: on that machine, there was stock opensuse 11.1
> > distro kernel which is 2.6.27-based (no b43fcd7dc7b). I needed to debug
> > a wireless bug, so I kexec'ed wireless-testing (contains 4a865905f already).
> > 
> > So in fact, 4a865905f from the testing kernel triggered a bug fixed in
> > near past by b43fcd7dc7b.
> > 
> > Did the other two e100* drivers suffer from the same and were fixed
> > recently? It would render kexec pretty unusable from the older kernels
> > if this is not going to be fixed anyhow :(.
> 
> Yes, as well as some other network drivers, it seems. My fix for e100
> should be in Jeffrey Kirsher's tree by now and go into netdev and rc4
> soon, I expect.
> 
> But, since I also thought that it would be good to fix that and allow
> people to kexec from earlier kernels, I did a followup to e100-devel,
> linux-pci, netdev and Rafael Wysocki. I didn't include linux-kernel,
> which I have just fixed, bouncing the message (oops!). I may bounce it
> to you too, if you want that.
> 
> Your findings shed a light into that problem. But I could find it in
> very early kernels too for some configurations, and these commits you
> are reverting may only fix the issue for the most common configurations
> out there. That is, it was very easy to trigger the shutdown bug with
> these patches. But I think there are some other bugs out there that will
> trigger it, and they are not that easy bisecting, it seems, since only
> some very particular configurations trigger it.
> 
> I will do some tests with the commits you mention and reproduce the
> problem using as earlier kernels as I can and send the config.

Cascardo, Jiri, can you tell me please what the status here is?

My understanding is that the commit pointed to by Jiri caused a problem
if the current mainline kernel was kexeced from an older kernel (2.6.27.x from
openSUSE-11.1 in this particular case), because the older kernel didn't
have the recent network driver fixes applied.  Is this correct?

Also, I'm still interested in whether or not removing the following three lines:

        /* Check if we're already there */
        if (dev->current_state == state)
                return 0;

from pci_set_power_state() in the current mainline kernel fixes the problem
in the configuration where it is readily reproducible.
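
To illustrate why that check might matter, here is a minimal userspace
sketch. It is not kernel code. It assumes, as a hypothesis only, that after
kexec the state cached in dev->current_state can disagree with the state the
hardware was really left in. The names fake_dev, hw_state and skip_if_cached
are made up for the illustration, and the assignment to hw_state stands in
for the real register write.

```c
#include <stdbool.h>

/* Hypothetical model, not the kernel's struct pci_dev. */
enum power_state { D0 = 0, D3hot = 3 };

struct fake_dev {
	enum power_state hw_state;      /* state the hardware is really in */
	enum power_state current_state; /* state the kernel believes it is in */
};

/*
 * Simplified stand-in for pci_set_power_state().
 * skip_if_cached == true  models the code with the three lines present,
 * skip_if_cached == false models the code with them removed.
 */
static int set_power_state(struct fake_dev *dev, enum power_state state,
			   bool skip_if_cached)
{
	/* Check if we're already there */
	if (skip_if_cached && dev->current_state == state)
		return 0;

	dev->hw_state = state;      /* stand-in for programming the device */
	dev->current_state = state; /* update the cached value */
	return 0;
}
```

In this model, a device with hw_state == D3hot and a stale
current_state == D0 stays in D3hot when D0 is requested with the check in
place, and is only brought to D0 once the check is skipped.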

Thanks,
Rafael
