Message-ID: <20110805140047.GA19758@colin.search.kasperd.net>
Date:	Fri, 5 Aug 2011 16:08:15 +0200
From:	Kasper Dupont <kasperd@...hh.24.jul.2011.kasperd.net>
To:	Francois Romieu <romieu@...zoreil.com>
Cc:	ivecera@...hat.com, hayeswang@...ltek.com, gregkh@...e.de,
	netdev@...r.kernel.org
Subject: Re: r8169 driver crashes in 2.6.32.43

I did a few more experiments. I took the unmodified
2.6.32.43 kernel and added printk statements to see when
it entered the interrupt handler and when it left it.

That way I was able to confirm that the system locked
up inside the interrupt handler.

Next I added printk statements to see how many times the
loop in the interrupt handler was run. It seemed that
when it locked up inside the handler it would run the
loop just two times and then lock up before leaving the
handler.
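
For illustration, the instrumentation described above can be sketched in
user space. This is not the r8169 code: `status_trace`, `read_status` and
`traced_handler` are hypothetical names, `fprintf` stands in for `printk`,
and the register reads are replaced by a caller-supplied list of values.
The loop condition (stop on 0 or 0xffff) is an assumption about how the
handler's loop is shaped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the interrupt status register: each read
 * returns the next value from a recorded trace, then 0 once exhausted. */
struct status_trace {
    const unsigned short *values;
    size_t len;
    size_t pos;
};

unsigned short read_status(struct status_trace *t)
{
    assert(t != NULL);
    return t->pos < t->len ? t->values[t->pos++] : 0;
}

/* Handler-shaped loop with logging at entry, on every iteration and at
 * exit. Returns how many times the loop body ran. */
unsigned int traced_handler(struct status_trace *t)
{
    unsigned int iterations = 0;
    unsigned short status;

    fprintf(stderr, "irq: enter\n");
    status = read_status(t);
    while (status && status != 0xffff) {
        iterations++;
        fprintf(stderr, "irq: iteration %u status=0x%04x\n",
                iterations, (unsigned int)status);
        /* The real handler would process the pending events here. */
        status = read_status(t);
    }
    fprintf(stderr, "irq: leave after %u iteration(s)\n", iterations);
    return iterations;
}
```

If the "enter" line and some iteration lines appear in the log but the
"leave" line never does, the hang is somewhere between the last logged
iteration and the return.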

I added more printk statements to see which branches were
taken inside the loop. Unfortunately those printk
statements changed the timing enough that the crashes
were no longer as reproducible.

I saw a pattern repeating. It would do the stop queue
thing, then leave the handler and while not inside this
interrupt handler there would be a message about the
interface coming up again. Seems like it was doing stop
queue calls much more frequently than it should be.

After a few attempts I managed to get it to lock up again
with all the printk statements in place. What I found was
that in the beginning of the loop status was 0x85. It
would then call the napi event code. At the end of the
first iteration of the loop status was 0.
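
As an aid to reading the 0x85 value, here is a small user-space decoder.
The bit names and masks are written from memory of the interrupt status
enum in r8169.c and should be checked against the tree being debugged;
`intr_bit` and `decode_intr_status` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct intr_bit {
    unsigned short mask;
    const char *name;
};

/* Assumed interrupt status bits; verify against the driver source. */
static const struct intr_bit intr_bits[] = {
    { 0x0001, "RxOK" },       { 0x0002, "RxErr" },
    { 0x0004, "TxOK" },       { 0x0008, "TxErr" },
    { 0x0010, "RxOverflow" }, { 0x0020, "LinkChg" },
    { 0x0040, "RxFIFOOver" }, { 0x0080, "TxDescUnavail" },
    { 0x0100, "SWInt" },      { 0x4000, "PCSTimeout" },
    { 0x8000, "SYSErr" },
};

/* Writes the names of the known bits set in status into buf, joined by
 * '|', lowest bit first. Returns the number of names written; stops
 * early, dropping the partial name, if buf is too small. */
int decode_intr_status(unsigned short status, char *buf, size_t buflen)
{
    int count = 0;
    size_t used = 0;
    size_t i;

    assert(buf != NULL);
    if (buflen == 0)
        return 0;
    buf[0] = '\0';
    for (i = 0; i < sizeof(intr_bits) / sizeof(intr_bits[0]); i++) {
        int n;

        if (!(status & intr_bits[i].mask))
            continue;
        n = snprintf(buf + used, buflen - used, "%s%s",
                     count ? "|" : "", intr_bits[i].name);
        if (n < 0 || (size_t)n >= buflen - used) {
            buf[used] = '\0';   /* discard the truncated name */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Under those assumed masks, 0x85 has three bits set: RxOK, TxOK and
TxDescUnavail.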

At that point it did not iterate through the loop again
and it did not leave the interrupt handler either. I'll
power cycle the machine and take a closer look at the
source to see what could possibly be happening at that
point.

I also did a bit of testing with the patches that cause
it to drop the network instead of crashing. On those I
am able to bring up the second interface and get data off
the machine for debugging, so if there is any debug info
you think would be useful in those cases, let me know.

-- 
Kasper Dupont -- Rigtige mænd skriver deres egne backupprogrammer
#define _(_)"d.%.4s%."_"2s" /* This is my email address */
char*_="@2kaspner"_()"%03"_("4s%.")"t\n";printf(_+11,_+6,_,11,_+2,_+7,_+6);