
Message-ID: <48754A08.1060302@sgi.com>
Date:	Wed, 09 Jul 2008 16:30:16 -0700
From:	Mike Travis <travis@....com>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
CC:	"H. Peter Anvin" <hpa@...or.com>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jack Steiner <steiner@....com>
Subject: Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu
 area

Eric W. Biederman wrote:
> Mike Travis <travis@....com> writes:
> 
>> ...  (I have been using the trick
>> to replace printk with early_printk so messages come out immediately instead
>> of from the log buf.)
> 
> Just passing early_printk=xxx on the command line should have that effect.
> Although I do admit you have to be a little bit into the boot before early_printk
> is setup.

What I meant was using early_printk in place of printk.  printk seems to stuff the
messages into the log buf until the serial console is set up fairly late in start_kernel.
I did this by removing printk() and renaming early_printk() to printk (plus a couple of
other things like #define early_printk printk ...).

> 
>> I've been able to make some more progress.  I've gotten to a point where it
>> panics from stack overflow.  I've verified this by bumping THREAD_ORDER and
>> it boots fine.  Now tracking down stack usages.  (I have found a couple of new
>> functions using set_cpus_allowed(..., CPU_MASK_ALL) instead of
>> set_cpus_allowed_ptr(... , CPU_MASK_ALL_PTR).  But these are not in the calling
>> sequence so consequently are not the cause.)
> 
> Is stack overflow the only problem you are seeing or are there still other mysteries?

I'm not entirely sure it's a stack overflow, the fault has a NULL dereference and
then the stack overflow message.

> 
>> One weird thing is early_idt_handler seems to have been called and that's one
>> thing our simulator does not mimic for standard Intel FSB systems - early
>> pending interrupts.  (It's designed after all to mimic our h/w, and of course
>> it's been booting fine under that environment.)
> 
> That usually indicates you are taking an exception during boot not that you
> have received an external interrupt.  Something like a page fault or a
> division by 0 error.

I was thinking maybe an RTC interrupt?  But a fault does sound more likely.

> 
>> Only a few of these though I would think might get called early in
>> the boot, that might also be contributing to the stack overflow.
> 
> Still the call chain depth shouldn't really be changing.  So why should it
> matter?  Ah.  The high cpu count is growing cpumask_t so when you put
> it on the stack.  That makes sense.  So what starts out as a 4 byte
> variable on the stack in a normal setup winds up being a 1k variable
> with 4k cpus.

Yes, it's definitely the combination of these three:

NR_CPUS  Patch_Applied  THREAD_ORDER  Results
   256        NO             1        works (obviously ;-)
   256        YES            1        works
  4096        NO             1        works
  4096        YES            1        panics
  4096        YES            3        works (just happened to pick 3;
                                      2 will probably work as well)

> Reasonable.  The practical problem is you are mixing a lot of changes
> simultaneously and it confuses things.  Compiling with NR_CPUS=4096
> and working out the bugs from a growing cpumask_t, putting the per cpu
> area in a zero based segment, and putting the pda into the
> per cpu area all at the same time.

I've been testing NR_CPUS=4096 for quite a while and it's been very
reliable.  It's just weird that this config fails with this new patch
applied.  (default configs and some fairly normal distro configs also
work fine.)  And with the zillion config straws we now have, spotting
the arbitrary needle is proving difficult. ;-) 

> Who knows maybe the only difference between 4.2.0 and 4.2.4 is that
> 4.2.4 optimizes its stack usage a little better and you don't see
> a stack overflow.

I haven't tried the THREAD_ORDER=3 (or 2) under 4.2.0, but that would
seem to indicate this may be true.

> It would be very very good if we could separate out these issues
> especially the segment for the per cpu variables.  We need something
> like that.

One reason I've been sticking with 4.2.4.

Thanks again for your help.

Mike