Date:	Wed, 02 Dec 2015 00:30:44 +0100
From:	Sander Eikelenboom <linux@...elenboom.it>
To:	Boris Ostrovsky <boris.ostrovsky@...cle.com>
Cc:	linux-kernel@...r.kernel.org, xen-devel@...ts.xen.org,
	david.vrabel@...rix.com
Subject: Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv
 guest under Xen with single vcpu.

On 2015-12-02 00:19, Boris Ostrovsky wrote:
> On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:
>> On 2015-12-01 23:47, Boris Ostrovsky wrote:
>>> On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:
>>>> On 2015-11-30 23:54, Boris Ostrovsky wrote:
>>>>> On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:
>>>>>> On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
>>>>>>> On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom 
>>>>>>> wrote:
>>>>>>>> Hi all,
>>>>>>>> 
>>>>>>>> I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
>>>>>>>> pulled on top.
>>>>>>>>
>>>>>>>> Running this kernel under Xen on PV-guests with multiple vcpus goes well
>>>>>>>> (on idle < 10% cpu usage), but a guest with only a single vcpu doesn't
>>>>>>>> idle at all, it seems a kworker thread is stuck:
>>>>>>>> root       569 98.0  0.0      0     0 ?        R 16:02 12:47 [kworker/0:1]
>>>>>>>> 
>>>>>>>> Running a 4.3 kernel works fine with a single vcpu. Bisecting would
>>>>>>>> probably be quite painful, since there were some breakages this merge
>>>>>>>> window with respect to Xen pv-guests.
>>>>>>>>
>>>>>>>> There are some differences in the diffs between booting 4.3,
>>>>>>>> 4.4 with a single vcpu and 4.4 with multiple vcpus:
>>>>>>> 
>>>>>>> Boris has been tracking a bunch of them. I am attaching the 
>>>>>>> latest set of
>>>>>>> patches I've to carry on top of v4.4-rc3.
>>>>>> 
>>>>>> Hi Konrad,
>>>>>> 
>>>>>> i will test those, see if it fixes all my issues and report back
>>>>> 
>>>>> They shouldn't help you ;-( (and I just saw a message from you 
>>>>> confirming this)
>>>>> 
>>>>> The first one fixes a 32-bit bug (on bare metal too). The second 
>>>>> fixes
>>>>> a fatal bug for 32-bit PV guests. The other two are code
>>>>> improvements/cleanup.
>>>> 
>>>> One of these patches also fixes a bug i was having with a
>>>> pci-passthrough device in a HVM that wasn't working (depending on
>>>> which dom0-kernel i was using, 4.3 or 4.4), but which i hadn't
>>>> reported yet.
>>>> 
>>>> Fingers crossed but i think this pv-guest single vcpu issue is the 
>>>> last i'm troubled by for now ;)
>>> 
>>> I could not reproduce this, including with your kernel config file.
>> 
>> Hmm that's unpleasant :-\
>> 
>> Hmm, the other strange thing is that it doesn't seem to affect dom0
>> (which is also a PV guest), only unprivileged ones.
>> All unprivileged pv-guests seem to have the irq issue, but only with a
>> single vcpu do i seem to get the stuck kworker thread that got my
>> attention. With 2 vcpus that doesn't seem to happen (but you still
>> get the dmesg output and warnings about hvc).
>> 
>> Could it be that:
>> 
>> arch/x86/include/asm/i8259.h
>> static inline int nr_legacy_irqs(void)
>> {
>>         return legacy_pic->nr_legacy_irqs;
>> }
>> 
>> returns something different in some circumstances ?
> 
> It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0
> after that commit.
> 
> This is the last number that you see in
>     NR_IRQS:4352 nr_irqs:48 0
> line.
> 
> I think you should be able to safely revert both
> b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
> 8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
> difference.
> 
> 
> -boris
> 

That was already underway compiling :)

And it shows that reverting both fixes the issue: no stuck kworker
thread, and no more of these messages:
    genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0)
    hvc_open: request_irq failed with rc -16.

What i did get was a conflict when reverting
b4ff8389ed14b849354b59ce9b360bdefcdbf99c, in
arch/arm64/include/asm/irq.h, although that shouldn't matter because we
are on x86 and not on arm.

--
Sander


>> 
>> -- Sander
>> 
>>> 
>>> -boris
>> 
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@...ts.xen.org
>> http://lists.xen.org/xen-devel
