Date:	Sun, 25 Mar 2007 13:27:22 +0100
From:	Andy Whitcroft <apw@...dowen.org>
To:	Con Kolivas <kernel@...ivas.org>
CC:	William Lee Irwin III <wli@...omorphy.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, Steve Fox <drfickle@...ibm.com>,
	"Martin J. Bligh" <mbligh@...igh.org>
Subject: Re: debug rsdl 0.33

Con Kolivas wrote:
> On Saturday 24 March 2007 08:45, Con Kolivas wrote:
>> On Friday 23 March 2007 23:28, Andy Whitcroft wrote:
>>> Andy Whitcroft wrote:
>>>> Con Kolivas wrote:
>>>>> On Friday 23 March 2007 05:17, Andy Whitcroft wrote:
>>>>>> Ok, I have yet a third x86_64 machine that is blowing up with the
>>>>>> latest 2.6.21-rc4-mm1+hotfixes+rsdl-0.32 but working with
>>>>>> 2.6.21-rc4-mm1+hotfixes-RSDL.  I have results on various hotfix
>>>>>> levels so I have just fired off a set of tests across the affected
>>>>>> machines on that latest hotfix stack plus the RSDL backout and the
>>>>>> results should be in in the next hour or two.
>>>>>>
>>>>>> I think there is a strong correlation between RSDL and these hangs.
>>>>>> Any suggestions as to the next step?
>>>>> Found a nasty in requeue_task
>>>>> +	if (list_empty(old_array->queue + old_prio))
>>>>> +		__clear_bit(old_prio, p->array->prio_bitmap);
>>>>>
>>>>> see anything wrong there? I do :P
>>>>>
>>>>> I'll queue that up with the other changes pending and hopefully that
>>>>> will fix your bug.
>>>> Tests queued with your rsdl-0.33 patch (I am assuming it's in there).
>>>> Will let you know how it looks.
>>> Hmmm, this is good for the original machine (as was 0.32) but not for
>>> either of the other two.  I am seeing panics as below on those two.
>> This machine seems most sensitive to it (first column):
>> elm3b6
>> amd64
>> newisys
>> 4cpu
>> config: amd64
>>
>> Can you throw this debugging patch at it please? The console output might
>> be very helpful. On top of sched-rsdl-0.33 thanks!
> 
> Better yet this one which checks the expired array as well and after 
> pull_task.
> 
> If anyone's getting a bug they think might be due to rsdl please try this (on 
> rsdl 0.33).

Ok, new round of tests across the sensitive machines with 0.33 plus the
above debug patch are in the queue.  Will let you know how they pan out.

The tests with -rc4 + 0.33 are also in, and they fail there too.  Both
fail out of __sched_text_start, so I'd guess the same cause, and the
scheduler is implicated.

-apw
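
Editorial sketch, not part of the original message: the two requeue_task
lines quoted above test one array's queue (old_array) but clear a bit in
another array's bitmap (p->array). The thread does not say what the actual
defect was, so the following is only one possible reading, modelled with
hypothetical names (struct prio_array here is a simplified stand-in with
per-level counters instead of list heads, and NPRIO, enqueue, dequeue,
dequeue_buggy and array_consistent are all invented for illustration). It
shows the invariant a debugging check could assert: a bitmap bit is set if
and only if the matching queue is non-empty.

```c
#include <assert.h>
#include <stdbool.h>

#define NPRIO 8 /* hypothetical; the real scheduler has many more levels */

/* Simplified stand-in: a counter per level instead of a task list. */
struct prio_array {
    unsigned long prio_bitmap; /* bit p set <=> level p has queued tasks */
    int nr_queued[NPRIO];
};

static void enqueue(struct prio_array *a, int prio)
{
    a->nr_queued[prio]++;
    a->prio_bitmap |= 1UL << prio;
}

/* Correct form: clear the bit in the same array whose queue just emptied. */
static void dequeue(struct prio_array *a, int prio)
{
    if (--a->nr_queued[prio] == 0)
        a->prio_bitmap &= ~(1UL << prio);
}

/*
 * Assumed buggy form: the emptiness test looks at old_array, but the bit
 * is cleared in a different array, leaving both arrays inconsistent.
 */
static void dequeue_buggy(struct prio_array *old_array,
                          struct prio_array *other, int prio)
{
    if (--old_array->nr_queued[prio] == 0)
        other->prio_bitmap &= ~(1UL << prio);
}

/* Invariant check: every bitmap bit must match its queue's emptiness. */
static bool array_consistent(const struct prio_array *a)
{
    for (int p = 0; p < NPRIO; p++) {
        bool bit = (a->prio_bitmap >> p) & 1UL;
        if (bit != (a->nr_queued[p] > 0))
            return false;
    }
    return true;
}
```

With one task queued at the same level in two arrays, calling
dequeue_buggy(&a, &b, prio) leaves a with a stale set bit over an empty
queue and b with a cleared bit over a non-empty queue, so
array_consistent() returns false for both. A scheduler that trusts the
bitmap would then pick from an empty queue or never find a runnable task.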
