Message-ID: <4F4FF712.2030400@windriver.com>
Date: Thu, 1 Mar 2012 16:24:18 -0600
From: Jason Wessel <jason.wessel@...driver.com>
To: Andrei Warkentin <awarkentin@...are.com>
CC: <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
Andrei Warkentin <andreiw@...are.com>,
<kgdb-bugreport@...ts.sourceforge.net>,
Matt Mackall <mpm@...enic.com>,
Andrei Warkentin <andrey.warkentin@...il.com>
Subject: Re: [PATCHv3 1/3] NETPOLL: Extend rx_hook support.
On 03/01/2012 03:04 PM, Andrei Warkentin wrote:
> ----- Original Message -----
>> From: "Andrei Warkentin" <awarkentin@...are.com>
>> To: "Jason Wessel" <jason.wessel@...driver.com>
>> Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org, "Andrei Warkentin" <andreiw@...are.com>,
>> kgdb-bugreport@...ts.sourceforge.net, "Matt Mackall" <mpm@...enic.com>, "Andrei Warkentin"
>> <andrey.warkentin@...il.com>
>> Sent: Tuesday, February 28, 2012 12:43:52 PM
>> Subject: Re: [PATCHv3 1/3] NETPOLL: Extend rx_hook support.
>>
>>> All that netpoll_poll() did was to call netpoll_poll_dev(). I have
>>> not yet looked at the differences between kgdboe and the netkdb code
>>> you proposed but I would have suspected it also falls victim to the
>>> ethernet preemption problem which prevented kgdboe from ever being
>>> considered for a mainline merge. Certainly there are ways to fix this
>>> problem but most involved changes to scheduling, core net code, or
>>> substantial driver specific changes.
>>>
>> I see, I read up on the issues w.r.t. preemption. Could this be worked
>> around by modifying affected drivers to bypass locking if they are
>> used in KDB context? Make some accessor netdev-specific lock/unlocks
>> that won't do anything if running in KDB context.
>>
>>
> By the way, is there a good way to repro the preemption case? Hopefully this doesn't
> involve some crazy hardware...
I have several cases which will usually hang the machine fairly quickly, but they all involve using gdb and an SMP target. Most often it is as simple as this:
* Use an SMP system with at least 2 cores
* Start two shell loops that rapidly spawn processes:
    while [ 1 ] ; do date > /dev/null ; done &
    while [ 1 ] ; do date > /dev/null ; done &
* Connect with gdb to kgdb and set a breakpoint at do_fork
* Now do "c"
* Now do "c 1000"
Generally the system will hang long before you get 1000 breakpoint hits. The hang will be a condition where a lock is needed to create an skb, or where the ethernet driver or some part of the network stack has been preempted (or is holding a lock) on a non-master cpu.
There is another condition that is hard to catch that involves a task migrating from one cpu to the next, but we'll stick to the simple test case I described above for now.
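The load-generation step of the simple test case can be wrapped in a small helper so the loops are easy to start and stop between gdb sessions. This is only a sketch: the function names start_fork_load and stop_fork_load and the LOAD_PIDS variable are made up for illustration, and it assumes a POSIX shell on the target.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any existing tool).
# They start and stop background loops that spawn `date` as fast as
# possible, which keeps the fork path busy on the target.

LOAD_PIDS=""

start_fork_load() {
    n="${1:-2}"          # number of loops; the test case above uses two
    i=0
    while [ "$i" -lt "$n" ]; do
        # Each subshell spawns a short-lived process in a tight loop.
        ( while :; do date > /dev/null; done ) &
        LOAD_PIDS="$LOAD_PIDS $!"
        i=$((i + 1))
    done
}

stop_fork_load() {
    for pid in $LOAD_PIDS; do
        kill "$pid" 2>/dev/null
    done
    wait 2>/dev/null     # reap the terminated loops
    LOAD_PIDS=""
}
```

Running start_fork_load 2 before attaching gdb, and stop_fork_load once the session is over, mirrors the two "while [ 1 ]" loops above without leaving stray background jobs behind.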
I did have a question, because it seems you were using qemu / kvm. I have a number of test cases that use kvm, but netkgdb does not seem to work with nc. My question is how am I supposed to actually use netkgdb?
Here is what I observe on the target system:
insmod netkgdb.ko netkgdb=@/,@10.0.2.2/
echo g > /proc/sysrq-trigger
On my host system:
nc.traditional -l -u -p 7777
I will type help, and then the netkgdb is toast. It doesn't seem to respond anymore.
Jason.