linux-kernel - Re: 2.6.21-rc4-mm1

Message-ID: <4602C83B.20608@shadowen.org>
Date:	Thu, 22 Mar 2007 18:17:31 +0000
From:	Andy Whitcroft <apw@...dowen.org>
To:	Con Kolivas <kernel@...ivas.org>
CC:	Andy Whitcroft <apw@...dowen.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, Steve Fox <drfickle@...ibm.com>,
	"Martin J. Bligh" <mbligh@...igh.org>
Subject: Re: 2.6.21-rc4-mm1

Andy Whitcroft wrote:
> Con Kolivas wrote:
>> On Thursday 22 March 2007 20:48, Andy Whitcroft wrote:
>>> Andy Whitcroft wrote:
>>>> Andy Whitcroft wrote:
>>>>> Andrew Morton wrote:
>>>>>> Temporarily at
>>>>>>
>>>>>>   http://userweb.kernel.org/~akpm/2.6.21-rc4-mm1/
>>>>>>
>>>>>> Will appear later at
>>>>>>
>>>>>>   ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc4/2.6.21-rc4-mm1/
>>>>> [All of the below is from the pre hot-fix runs.  The very few results
>>>>> which are in for the hot-fix runs seem worse if anything.  :(  All
>>>>> results should be out on TKO.]
>>>>>
>>>>>> - Restored the RSDL CPU scheduler (a new version thereof)
>>>>> Unsure if the above is the culprit but there seems to be a smattering of
>>>>> BUGs in kernbench from the scheduler on several systems, and panics
>>>>> which do not fully dump out.
>>>>>
>>>>> elm3b239 is about 2/4 kernbench being the test in progress when we
>>>>> blammo in both failed tests, elm3b234 doesn't boot at all.
>>>> Well I have one result through for backing RSDL out on elm3b239 and that
>>>> does indeed seem to give us a successful boot and test.  peterz has
>>>> pointed me to an incremental patch from Con which I'll push through
>>>> testing and see if that sorts it out.
>>> Ok, tested the patch below on top of 2.6.21-rc4-mm1 and this seems to
>>> fix the problem:
>>>
>>> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc4-mm1-rsdl-0.32.patch
>>>
>>> Hard to tell from that patch whether it will be fixed in the changes
>>> already committed to the next -mm.
>>>
>>> It's possible that it may be fixed by the following patch:
>>>
>>>     sched-rsdl-improvements.patch
>>>
>>> Which has the following slipped in at the end of the changelog:
>>>
>>>     A tiny change checking for MAX_PRIO in normal_prio()
>>>     may prevent oopses on bootup on large SMP due to
>>>     forking off the idle task.
>>>
>>> Con, are all the changes in the 0.32 patch above with akpm?
>> Yes he's queued everything in that patch you tested for the next -mm. Thanks 
>> very much for testing it.
> 
> No worries.  I've just got through the results on the other machine in
> the mix.  That machine seems to be fixed by backing out RSDL and not by
> the fixup 0.32 patch ...
> 
> This second machine seems to hang hard very soon after user space starts
> executing but without a panic.  I can't say that the symptoms are very
> definitive, but I do have a good result from that machine without RSDL
> and not with rsdl-0.32.
> 
> The machine is a dual-core x86_64 machine: Dual Core AMD Opteron(tm)
> Processor 275.
> 
> I'll let you know if I find out anything else.  Shout if you want any
> information or have anything you want poked or tested.

Ok, I have yet a third x86_64 machine which is blowing up with the latest
2.6.21-rc4-mm1+hotfixes+rsdl-0.32 but working with
2.6.21-rc4-mm1+hotfixes-RSDL.  I have results on various hotfix levels,
so I have just fired off a set of tests across the affected machines on
that latest hotfix stack plus the RSDL backout, and the results should be
in within the next hour or two.

I think there is a strong correlation between RSDL and these hangs.  Any
suggestions as to the next step?

-apw