linux-kernel - Re: Make sure we populate the initroot filesystem late enough


Message-Id: <D963EE1E-7B36-4074-95B9-A312D92678EB@kernel.crashing.org>
Date:	Mon, 12 Mar 2007 22:03:08 -0500
From:	Kumar Gala <galak@...nel.crashing.org>
To:	Paul TBBle Hampson <Paul.Hampson@...ox.com>
Cc:	Michael Ellerman <michael@...erman.id.au>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linuxppc-dev@...abs.org, john stultz <johnstul@...ibm.com>,
	torvalds@...ux-foundation.org,
	David Woodhouse <dwmw2@...radead.org>
Subject: Re: Make sure we populate the initroot filesystem late enough


On Mar 12, 2007, at 6:01 PM, Paul TBBle Hampson wrote:

> On Thu, Mar 01, 2007 at 09:30:56AM +0900, Michael Ellerman wrote:
>> On Wed, 2007-02-28 at 10:13 +0000, David Woodhouse wrote:
>>> On Wed, 2007-02-28 at 07:43 +0100, Benjamin Herrenschmidt wrote:
>>>> I wouldn't be that sure ... I've had problems in the past with PMU based
>>>> cpufreq... looks like flushing all caches and hard-resetting the
>>>> processor on the fly when there can be pending DMAs might be a source of
>>>> trouble... especially on CPUs that don't have working cache flush HW
>>>> assist.
>>>
>>> I've seen it on a PowerMac3,1 (400MHz G4) where we don't have cpufreq.
>>> I've also seen it on the latest 1.5GHz Mac Mini, and on my shinybook.
>>> They all fall over with the latest kernel, although the shinybook only
>>> does so immediately when booted with mem=512M. The shinybook does crash
>>> later with new kernels though; I don't yet know why. It could be the
>>> same thing, or it could be something different. That one seemed to
>>> appear between Fedora's 2.6.19-1.2913 and 2.6.19-1.2914 kernels, where
>>> we did nothing but turned CONFIG_SYSFS_DEPRECATED on.
>>>
>>> I don't blame cpufreq. At various times I've been equally convinced that
>>> it was due to CONFIG_KPROBES, and Linus' initrd-moving patch.
>
>> Is there any pattern to the way it dies? Or is it just randomly dying
>> somewhere depending on which config options you have enabled?
>
>> This is starting to sound reminiscent of a bug I chased for a while last
>> year on Power5, but didn't find. It was "fixed" on some machines by
>> disabling CONFIG_KEXEC, and/or other random unrelated CONFIG options.
>> Unfortunately it magically stopped reproducing so I never caught it :/
>
> Hmm. The crash came back after I booted into Mac OS X and back. It was however
> a different crash, I believe it was coming from the USB modules (as it would
> keep going when it happened, and get another crash, which tended to scroll away
> too fast for me to capture) but I believe it was still getting down into the
> slab code and actually dying there.
>
> However, reverting the reversion of
> 8d610dd52dd1da696e199e4b4545f33a2a5de5c6 and instead applying
> the following patch:
>
> diff -ru linux-source-2.6.20.orig/arch/powerpc/mm/init_32.c linux-source-2.6.20/arch/powerpc/mm/init_32.c
> --- linux-source-2.6.20.orig/arch/powerpc/mm/init_32.c  2007-02-05 05:44:54.000000000 +1100
> +++ linux-source-2.6.20/arch/powerpc/mm/init_32.c       2007-03-10 11:03:56.000000000 +1100
> @@ -244,7 +244,8 @@
>  void free_initrd_mem(unsigned long start, unsigned long end)
>  {
>         if (start < end)
> -               printk ("Freeing initrd memory: %ldk freed\n", (end - start) >> 10);
> +               printk ("NOT Freeing initrd memory: %ldk freed\n", (end - start) >> 10);
> +       return;
>         for (; start < end; start += PAGE_SIZE) {
>                 ClearPageReserved(virt_to_page(start));
>                 init_page_count(virt_to_page(start));
>
> which if I recall correctly David Woodhouse posted to this thread,
> seems to have fixed it.
>
> I dunno if it's relevant, but my initrd.img is 13193315 bytes long,
> (ie 99 bytes over 12884k) and the above logs:
> "NOT Freeing initrd memory: 12888k freed"
> which makes sense...
>
> I of course completely failed to think to check this with the crashing
> kernel, if it seems relevant I can roll back to it and get the numbers.
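
The size figures quoted above can be checked by hand. A minimal sketch, assuming 4 KiB pages, a page-aligned initrd start address, and an end address rounded up to the next page boundary (none of which the message states); the helper names page_align_up() and reported_kib() are made up for this illustration and are not kernel functions:

```c
/* Assumption: 4 KiB pages, as is typical on 32-bit powerpc. */
#define PAGE_SHIFT 12UL
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Round a byte count up to the next page boundary. */
static unsigned long page_align_up(unsigned long bytes)
{
    return (bytes + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/*
 * KiB figure that a "%ldk" print of (end - start) >> 10 would show,
 * assuming start is page-aligned and end is rounded up to a page.
 */
static unsigned long reported_kib(unsigned long image_bytes)
{
    return page_align_up(image_bytes) >> 10;
}
```

With these assumptions, 13193315 bytes is 99 bytes more than 12884 KiB (12884 * 1024 = 13193216), so the image spills into one extra 4 KiB page and reported_kib(13193315UL) gives 12888, matching the logged "12888k".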

Have you tried 2.6.20.2? There was a significant bug in get_order()
that was deemed to be causing these issues.
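
For reference, get_order() is meant to return the smallest allocation order whose block covers a given size. A user-space sketch of that intended result, assuming 4 KiB pages; order_for_size() is a made-up name for illustration, and this is neither the kernel's code nor a description of the bug itself:

```c
/* Assumption: 4 KiB pages. */
#define PAGE_SHIFT 12

/*
 * Smallest order such that (PAGE_SIZE << order) >= size.
 * Only meaningful for size >= 1.
 */
static int order_for_size(unsigned long size)
{
    int order = 0;
    /* Number of whole pages needed, minus one. */
    unsigned long pages = (size - 1) >> PAGE_SHIFT;

    /* Count the bits needed to represent that value. */
    while (pages) {
        pages >>= 1;
        order++;
    }
    return order;
}
```

Under this definition a 4096-byte request is order 0, 4097 bytes is order 1, and 8193 bytes is order 2. A wrong answer here means callers get a block of the wrong size, which is the kind of fault that can surface as seemingly unrelated memory corruption.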

- k