linux-kernel - Re: [git pull] kgdb-light -v10

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.LFD.1.00.0802120829410.2920@woody.linux-foundation.org>
Date:	Tue, 12 Feb 2008 08:46:51 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Andi Kleen <andi@...stfloor.org>, linux-kernel@...r.kernel.org,
	"Frank Ch. Eigler" <fche@...hat.com>,
	Roland McGrath <roland@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [git pull] kgdb-light -v10

On Tue, 12 Feb 2008, Ingo Molnar wrote:
> 
> * Andi Kleen <andi@...stfloor.org> wrote:
> > 
> > Stopping all CPUs for indefinite time very much seems like "breaking a 
> > correctly working system" to me. [...]
> 
> well, this is a small detail, but still you are wrong, and on a 
> correctly working system this will not occur. (if yes, tell me how)

Quite frankly, I don't see why the kernel kgdb layer should have *any* 
code like this at all.

The one who is actually debugging is the one who should decide which CPU's 
get stopped, and which don't.

I realize that the gdb remote protocol is probably a piece of crap and 
cannot handle that, but hey, that's not my problem, and more importantly, 
I don't think it's even a *remotely* valid reason for making bad decisions 
in the kernel. gdb was still open source last time I saw, and I think it's 
reasonable to just say:

 - the kgdb commands should always act on the *current* CPU only
 - add one command that says "switch over to CPU #n" which just releases 
   the current CPU and sends an IPI to that CPU #n (no timeouts, no 
   synchronous waiting, no nothing - it's like a "continue", but with a 
   "try to get the other CPU to stop"

Yes, other CPU's will obviously often end up stopping due to waiting for 
some spinlock or other if we stop one, but that's a separate issue, and 
quite often it might be sufficient - and what we want.

And yes, you'd likely have to add some support to gdb to make this 
_usable_, but now all that usability crap, all those timeouts for "stop 
all CPU's" are now in user space on the _debugger_ side. That can be as 
fancy as it wants to be.

And maybe this isn't realistic. I'm not saying "we _must_ do it this way", 
I just want to say that the kernel kgdb layer should be as thin ass 
humanly possible, and maybe the right thing to do is to simply totally 
punt on the whole "stop other cpu's" issue and make it a debugger-side 
question.

In other words, is it perhaps possible to just *get*rid*of* that 
"kgdb_active" and "nmicallback" and the whole multi-CPU roundup? Just use 
a kgdb spinlock around the stuff that actually sends and receives 
individual packets, and expect the debugger side to sort them out (yeah, I 
suspect this involves having to add the CPU ID to each packet).

			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/