Message-ID: <alpine.LFD.2.01.0908210845540.3158@localhost.localdomain>
Date:	Fri, 21 Aug 2009 09:05:25 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	mingo@...hat.com, "H. Peter Anvin" <hpa@...or.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	a.p.zijlstra@...llo.nl, catalin.marinas@....com,
	Jens Axboe <jens.axboe@...cle.com>, fweisbec@...il.com,
	srostedt@...hat.com, tglx@...utronix.de,
	Ingo Molnar <mingo@...e.hu>,
	Arjan van de Ven <arjan@...ux.intel.com>
Subject: Re: [tip:tracing/urgent] tracing: Fix too large stack usage in
 do_one_initcall()

So I obviously agree with fixing do_one_initcall(), but..

Looking at the other cases, I do note (once more) what a horrible thing 
SCSI is, and that the callchains are not only way too deep, but the SCSI 
routines stand out among the cases that have 100+ bytes of stack frame.

We _really_ should fix these:

>   5)     3444     116   __alloc_pages_nodemask+0xd7/0x550   
>  10)     3216     108   create_object+0x28/0x250
>  18)     2896     128   sd_prep_fn+0x332/0xa70
>  23)     2640     172   blk_execute_rq+0x6b/0xb0
>  46)     1532     108   scsi_add_lun+0x44b/0x460
>  47)     1424     116   scsi_probe_and_add_lun+0x182/0x4e0
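As a rough illustration of how the 100+ byte frames above could be picked out of a stack tracer dump automatically, here is a minimal Python sketch. It assumes each line has the layout "index) cumulative-depth frame-size symbol+offset/length", optionally prefixed by a ">" quote marker; the names large_frames, LINE_RE and SAMPLE are hypothetical helpers made up for this example, not part of any kernel tool.

```python
import re

# Assumed line layout: "<index>) <depth> <size> <symbol>+<offset>/<length>",
# optionally preceded by a ">" quote marker from the email.
LINE_RE = re.compile(r"^\s*>?\s*(\d+)\)\s+(\d+)\s+(\d+)\s+(\S+)")


def large_frames(trace_text, threshold=100):
    """Return (index, depth, size, symbol) for frames of at least `threshold` bytes."""
    frames = []
    for line in trace_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip headers, blank lines, and anything unparseable
        index, depth, size = (int(match.group(i)) for i in (1, 2, 3))
        symbol = match.group(4).split("+")[0]  # drop the +offset/length suffix
        if size >= threshold:
            frames.append((index, depth, size, symbol))
    return frames


# The six entries quoted in this message.
SAMPLE = """\
  5)     3444     116   __alloc_pages_nodemask+0xd7/0x550
 10)     3216     108   create_object+0x28/0x250
 18)     2896     128   sd_prep_fn+0x332/0xa70
 23)     2640     172   blk_execute_rq+0x6b/0xb0
 46)     1532     108   scsi_add_lun+0x44b/0x460
 47)     1424     116   scsi_probe_and_add_lun+0x182/0x4e0
"""

if __name__ == "__main__":
    # Print the largest frames first.
    for index, depth, size, symbol in sorted(large_frames(SAMPLE), key=lambda f: -f[2]):
        print(f"{size:5d} bytes  {symbol}")
```

Raising the threshold (for example large_frames(SAMPLE, threshold=120)) narrows the list to the worst offenders.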

I also note that in this case, we'd have gotten rid of a _lot_ of the 
callchain if we had actually just executed this thing asynchronously. 
Because we clearly have that __async_schedule() there in the callchain in 
two places: before the port probing and the disk probing.

But it looks like we hit the MAX_WORK limit. Which sounds odd, since that 
is set to 32768, but I guess it can happen. It sounds a bit unlikely. 
Ingo, do you have something set to disable that?

I do wonder, though. Maybe we should never have that MAX_WORK limit, and 
instead limit the parallelism by actively trying to yield when there's too 
much work? That bootup sequence _does_ tend to have deep callchains (with 
all the crazy device register crud), and maybe we should actively see the 
async work code as not just a way to speed up boot, but also as a way to 
avoid deep callchains.
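To make the callchain argument concrete, here is a toy Python model, not kernel code. It assumes that a scheduling call which is disabled or over its pending-work limit simply runs the function inline in the caller, while a queued call later runs from a flat worker loop. ToyAsync, probe and max_nesting are hypothetical names invented for this sketch; the default limit of 32768 is the MAX_WORK value mentioned above.

```python
from collections import deque


class ToyAsync:
    """Toy model: queue work when possible, otherwise run it inline."""

    def __init__(self, max_work=32768, enabled=True):
        self.max_work = max_work
        self.enabled = enabled
        self.pending = deque()
        self.depth = 0       # current nesting of running work items
        self.max_depth = 0   # deepest nesting observed so far

    def schedule(self, func, *args):
        # Assumed fallback: run synchronously when disabled or over the limit.
        if not self.enabled or len(self.pending) > self.max_work:
            self._run(func, *args)
        else:
            self.pending.append((func, args))

    def _run(self, func, *args):
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)
        try:
            func(self, *args)
        finally:
            self.depth -= 1

    def drain(self):
        # Flat worker loop: every queued item starts from nesting depth zero.
        while self.pending:
            func, args = self.pending.popleft()
            self._run(func, *args)


def probe(sched, levels):
    """Stand-in for a probe step that schedules further probing."""
    if levels > 0:
        sched.schedule(probe, levels - 1)


def max_nesting(enabled):
    sched = ToyAsync(enabled=enabled)
    sched.schedule(probe, 3)
    sched.drain()
    return sched.max_depth


if __name__ == "__main__":
    print("queued nesting:", max_nesting(True))
    print("inline nesting:", max_nesting(False))
```

In this model the queued case never nests one work item inside another, whereas the inline fallback stacks every probe step on top of its caller, which is the effect on callchain depth being discussed.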

Hmm?  Comments?

			Linus