Message-ID: <3efb10970908300234q60e56418kbab64ef9a1dd23e3@mail.gmail.com>
Date:	Sun, 30 Aug 2009 11:34:21 +0200
From:	Remy Bohmer <linux@...mer.net>
To:	Daniel Walker <dwalker@...o99.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-rt-users <linux-rt-users@...r.kernel.org>
Subject: Re: System lockup with 2.6.26.8-rt16 on ARM9 [Solved]

Hi Daniel,


2009/8/29 Daniel Walker <dwalker@...o99.com>:
> On Sat, 2009-08-29 at 09:47 +0200, Remy Bohmer wrote:
>
>> Well, we found the root cause of this problem.
>> It turned out to be caused by sched_clock() making discontinuous time jumps.
>> This caused this check to become true in kernel/sched_rt.c:370:
>>         if (rt_rq->rt_time > runtime) {
>>                 rt_rq->rt_throttled = 1;
>>                 if (rt_rq_throttled(rt_rq)) {
>>                         sched_rt_rq_dequeue(rt_rq);
>>                         return 1;
>>                 }
>>         }
>>
>> The end result is that all realtime tasks got throttled for a long
>> time, and that time got extended every time sched_clock() made such a
>> jump. I would never have expected the scheduler would show this kind
>> of behaviour while CONFIG_RT_GROUP_SCHED is _not_ set...
>>
>> The root cause of the faulty sched_clock() was a synchronisation
>> issue between 2 clock domains: the CPU clock and the clock domain of
>> the peripheral (GPT) on which the sched_clock() implementation was
>> based. The GPT made jumps backwards, which triggered a false wraparound
>> detection in the conversion of 32->64 bit timestamps, causing the time
>> to jump about 356 seconds into the future...
>>
>
> Can you tell us more about what type of board this was? I've never heard
> of an ARM board having an unstable clocksource before ..

It was a Freescale iMX25 based board... (Hmm, looking at it, it was a
driver built by Montavista that configured the GPT as clocksource, so
it might be interesting info for you too. Note that the EPIT turns
out to be much more stable in this processor. I have also never seen this
problem on iMX35 based boards, for which the same driver was used.)
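
To illustrate the false wraparound detection in the 32->64 bit
conversion, here is a minimal sketch. It is not the actual driver code:
struct cnt_ext, extend_naive() and extend_tolerant() are hypothetical
names, and the tolerant variant is only one possible way to cope with a
counter that occasionally steps backwards. It assumes the counter is
read at least once per half wrap period.

```c
#include <stdint.h>

/* Hypothetical state for extending a free-running 32-bit counter to 64 bits. */
struct cnt_ext {
	uint32_t last;	/* last raw 32-bit value accepted */
	uint64_t high;	/* accumulated wraparounds, kept in bits 63..32 */
};

/*
 * Naive extension: every raw value lower than the previous one is taken
 * as a wraparound. A small backward glitch therefore adds 2^32 ticks.
 */
static uint64_t extend_naive(struct cnt_ext *s, uint32_t raw)
{
	if (raw < s->last)
		s->high += (uint64_t)1 << 32;
	s->last = raw;
	return s->high | raw;
}

/*
 * Tolerant extension (an assumption, not the fix that was applied):
 * look at the signed 32-bit difference. A negative difference means the
 * counter stepped backwards, so the last good value is held instead of
 * counting a wraparound. The cast to int32_t relies on two's complement
 * conversion, which is what the usual compilers provide.
 */
static uint64_t extend_tolerant(struct cnt_ext *s, uint32_t raw)
{
	int32_t delta = (int32_t)(raw - s->last);

	if (delta < 0)
		return s->high | s->last;	/* backward glitch: hold */
	if (raw < s->last)
		s->high += (uint64_t)1 << 32;	/* genuine wraparound */
	s->last = raw;
	return s->high | raw;
}
```

With the naive variant, reading 1000 and then 998 yields a 64-bit value
that is 2^32 ticks ahead, and every later glitch adds another 2^32. The
tolerant variant keeps returning 1000 until the counter moves forward
again.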

Remy
