linux-kernel - Re: linux-next: manual merge of the kgdb tree with Linus' tree

Date:	Mon, 09 Aug 2010 00:13:01 -0500
From:	Jason Wessel <jason.wessel@...driver.com>
To:	paulmck@...ux.vnet.ibm.com
CC:	Stephen Rothwell <sfr@...b.auug.org.au>,
	linux-next@...r.kernel.org, linux-kernel@...r.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: Re: linux-next: manual merge of the kgdb tree with Linus' tree

On 08/07/2010 04:17 PM, Paul E. McKenney wrote:
> On Sat, Aug 07, 2010 at 02:05:42PM +1000, Stephen Rothwell wrote:
>   
>> Hi Jason,
>>
>> Today's linux-next merge of the kgdb tree got a conflict in
>> include/linux/rcupdate.h between commits
>> 551d55a944b143ef26fbd482d1c463199d6f65cf ("tree/tiny rcu: Add debug RCU
>> head objects") and f5155b33277c9678041a27869165619bb34f722f ("rcu: add an
>> rcu_dereference_index_check()") from Linus' tree and commit
>> 9e213357d0aeaeb81e213cfd3b9415db5fccc1b5 ("rcu,debug_core: allow the
>> kernel debugger to reset the rcu stall timer") from the kgdb tree.
>>     
>
> Hello, Jason,
>
> Just trying to make sure I understand this...
>
> This cannot be a "stop the machine" debugger, because otherwise the
> jiffies counter would stop and you would not get RCU CPU stall warnings.
>
> It might be a "stop the machine" debugger, but where the jiffies counter
> catches up quickly as soon as the machine restarts.  In this case,
> your patch would be a reasonable approach, but RCU CPU stall warnings
> are going to be the least of your problems. 

You should have the patches now, as I posted them to LKML as an RFC.
If there are other problems in this area, I would like to understand
what issues still remain to be dealt with.

The general idea is that the kernel can take an exception and execute
for a short period of time with all the processors spinning in a wait
loop and then resume kernel execution.  As you might guess, the debugger
is a "multipurpose" tool, and there are quite a few circumstances where
a trip into the debugger is really a one-way trip to a reboot when
you are done inspecting.
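
To make the stall-timer concern concrete, here is a minimal userspace
sketch of the idea. It is not the code from the patch; every name in it
(fake_jiffies, stall_deadline, stall_timer_reset, debugger_session,
STALL_TIMEOUT) is hypothetical and exists only for illustration. It
assumes the tick counter jumps forward when the machine resumes, which is
the "catches up quickly" case described in the quoted text.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define STALL_TIMEOUT 100UL            /* ticks allowed before a stall is reported */

static unsigned long fake_jiffies;     /* stand-in for the kernel tick counter */
static unsigned long stall_deadline;   /* tick at which a stall would be reported */

/* Push the stall deadline out to STALL_TIMEOUT ticks from "now". */
static void stall_timer_reset(void)
{
    stall_deadline = fake_jiffies + STALL_TIMEOUT;
}

/* Wraparound-safe check: has the tick counter reached the deadline? */
static bool stall_detected(void)
{
    return (long)(fake_jiffies - stall_deadline) >= 0;
}

/*
 * Model one trip into the debugger. All CPUs spin for ticks_stopped
 * ticks, and the tick counter catches up in one jump on resume.
 * Returns true if a stall warning would fire after resuming.
 */
static bool debugger_session(unsigned long ticks_stopped, bool reset_on_resume)
{
    fake_jiffies += ticks_stopped;     /* time "catches up" on resume */
    if (reset_on_resume)
        stall_timer_reset();           /* debugger tells the detector to start over */
    return stall_detected();
}
```

In this model, a session that lasts longer than STALL_TIMEOUT ticks
reports a stall unless the deadline is reset on the way out, and a
session shorter than the timeout reports nothing either way.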

>  Actually, I have only seen
> one piece of your patch.  Could you please send me the rest of it?
>
> If you are permitting some tasks to run while others are halted,
> then the RCU CPU stall is simply a symptom of an underlying problem,
> namely that if you halt a task in an RCU read-side critical section
> for long enough, you will OOM the system.
>
>   

We are definitely not "partially running".  Picking and choosing threads
to run without a complete integration with the scheduler and all other
related systems like RCU would be a _really_ bad idea.  :-)

Jason.