Message-ID: <5020A220.2030305@linux.vnet.ibm.com>
Date:	Tue, 07 Aug 2012 13:05:36 +0800
From:	Michael Wang <wangyun@...ux.vnet.ibm.com>
To:	Sasha Levin <levinsasha928@...il.com>
CC:	John Stultz <johnstul@...ibm.com>, Avi Kivity <avi@...hat.com>,
	paulmck@...ux.vnet.ibm.com,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	mingo@...nel.org, a.p.zijlstra@...llo.nl, prarit@...hat.com,
	tglx@...utronix.de, Dave Jones <davej@...hat.com>
Subject: Re: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks on v3.6

On 08/07/2012 04:35 AM, Sasha Levin wrote:
> On 08/06/2012 10:31 PM, John Stultz wrote:
>> On 08/06/2012 11:28 AM, Sasha Levin wrote:
>>> On 08/06/2012 08:20 PM, John Stultz wrote:
>>>> On 08/06/2012 10:21 AM, John Stultz wrote:
>>>>> On 08/05/2012 09:55 AM, Sasha Levin wrote:
>>>>>> On 07/30/2012 03:17 PM, Avi Kivity wrote:
>>>>>>> Possible causes:
>>>>>>>    - the APIC calibration in the guest failed, so it is programming too
>>>>>>> low values into the timer
>>>>>>>    - it actually needs 1 us wakeups and then can't keep up (esp. as kvm
>>>>>>> interrupt injection is slowing it down)
>>>>>>>
>>>>>>> You can try to find out by changing
>>>>>>> arch/x86/kvm/lapic.c:start_lapic_timer() to impose a minimum wakeup of
>>>>>>> (say) 20 microseconds which will let the guest live long enough for you
>>>>>>> to ftrace it and see what kind of timers it is programming.
>>>>>> I've kept trying to narrow it down, and found out it's triggerable using adjtimex().
>>>> Sorry, one more question: Could you provide details on how it is triggerable using adjtimex?
>>> It triggers after a while of fuzzing just adjtimex with trinity ('./trinity --quiet -l off -cadjtimex').
>>>
>>> Trinity is available here: http://git.codemonkey.org.uk/?p=trinity.git .
>>>
>>> Let me know if I can help further with reproducing this, I can probably copy over my testing environment to some other host if you'd like.
>> So far no luck. Dmesg mostly just gets filled up with trinity-child OOMs.   How much memory are you running with?
>>
>> Are you running trinity as root or as some user that has CAP_SYS_TIME and can actually change values via adjtimex? Or does it trip just by reading the values?
> 
> As root in a disposable vm. It triggers at a random point, not after a specific call.

I have tested again with a 3.6.0-rc1 guest, running the command:

./trinity --quiet -l off -cadjtimex --dangerous

for normal user:
	only oom info
for root:
	the guest hung without any stall info printed
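
For reference, the user/root difference above matches the distinction John
asked about: a read-only adjtimex() call (modes == 0) needs no privilege,
while changing any value needs CAP_SYS_TIME. Below is a minimal sketch of
the two cases, assuming glibc's adjtimex() wrapper from <sys/timex.h>. The
helper names query_clock() and try_set_freq() are made up for illustration;
this is not what trinity does internally.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/timex.h>

/*
 * Read-only query: with modes == 0 the kernel changes nothing and only
 * fills in the current values, so no privilege is required.
 * Returns the clock state (TIME_OK ... TIME_ERROR) or -1 with errno set.
 * query_clock() is a hypothetical helper name.
 */
static int query_clock(struct timex *tx)
{
    memset(tx, 0, sizeof(*tx));
    tx->modes = 0;
    return adjtimex(tx);
}

/*
 * Write attempt: ask the kernel to set the frequency offset.
 * Without CAP_SYS_TIME this is expected to fail with EPERM.
 * Returns 0 on success, otherwise the errno value.
 * try_set_freq() is a hypothetical helper name.
 */
static int try_set_freq(long freq)
{
    struct timex tx;

    memset(&tx, 0, sizeof(tx));
    tx.modes = ADJ_FREQUENCY;
    tx.freq = freq;
    if (adjtimex(&tx) == -1)
        return errno;
    return 0;
}
```

Calling query_clock() and then passing the returned tx.freq to
try_set_freq() re-applies the current frequency, so it should not disturb
the clock even when run as root. A normal user is expected to get EPERM
from the second call.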

I'm not sure how this trinity tool is implemented, but at least it does
help to reproduce some rare kernel bugs...

Could you please also describe how you start the guest? Are there any
special options?

Regards,
Michael Wang

> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

