Message-ID: <565E27C5.9050703@oracle.com>
Date:	Tue, 1 Dec 2015 18:05:41 -0500
From:	Boris Ostrovsky <boris.ostrovsky@...cle.com>
To:	Sander Eikelenboom <linux@...elenboom.it>
Cc:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	david.vrabel@...rix.com, linux-kernel@...r.kernel.org,
	xen-devel@...ts.xen.org
Subject: Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest
 under Xen with single vcpu.

On 12/01/2015 05:51 PM, Sander Eikelenboom wrote:
> On 2015-11-30 23:54, Boris Ostrovsky wrote:
>> On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:
>>> On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
>>>> On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:
>>>>> Hi all,
>>>>>
>>>>> I have just tested a 4.4-rc2 kernel (current linus tree) + the tip 
>>>>> tree
>>>>> pulled on top.
>>>>>
>>>>> Running this kernel under Xen on PV-guests with multiple vcpus 
>>>>> goes well (on
>>>>> idle < 10% cpu usage),
>>>>> but a guest with only a single vcpu doesn't idle at all, it seems 
>>>>> a kworker
>>>>> thread is stuck:
>>>>> root       569 98.0  0.0      0     0 ?        R    16:02 12:47
>>>>> [kworker/0:1]
>>>>>
>>>>> Running a 4.3 kernel works fine with a single vcpu; bisecting
>>>>> would probably be quite painful since there were some breakages
>>>>> this merge window with respect to Xen pv-guests.
>>>>>
>>>>> There are some differences in the diffs of the boot logs between
>>>>> a 4.3, a 4.4-single and a 4.4-multi vcpu boot:
>>>>
>>>> Boris has been tracking a bunch of them. I am attaching the latest
>>>> set of patches I have to carry on top of v4.4-rc3.
>>>
>>> Hi Konrad,
>>>
>>> I will test those, see if they fix all my issues, and report back.
>>
>> They shouldn't help you ;-( (and I just saw a message from you 
>> confirming this)
>>
>> The first one fixes a 32-bit bug (on bare metal too). The second fixes
>> a fatal bug for 32-bit PV guests. The other two are code
>> improvements/cleanup.
>>
>>
>>>
>>> Thanks :)
>>>
>>> -- Sander
>>>
>>>>> Between 4.3 and 4.4-single:
>>>>>
>>>>> -NR_IRQS:4352 nr_irqs:32 16
>>>>> +Using NULL legacy PIC
>>>>> +NR_IRQS:4352 nr_irqs:32 0
>>
>> This is fine, as long as you have 
>> b4ff8389ed14b849354b59ce9b360bdefcdbf99c.
>>
>>>>>
>>>>> -cpu 0 spinlock event irq 17
>>>>> +cpu 0 spinlock event irq 1
>>
>> This is strange. I wouldn't expect spinlocks to use legacy irqs.
>>
>
> Could it be that with your fixup:
>     xen/events: Always allocate legacy interrupts on PV guests
>     (b4ff8389ed14b849354b59ce9b360bdefcdbf99c)
> for commit:
>     x86/irq: Probe for PIC presence before allocating descs for legacy 
> IRQs
>     (8c058b0b9c34d8c8d7912880956543769323e2d8)
>
> that we now have the situation described in the commit message of 
> 8c058b0b9c, but now for Xen PV instead of
> Hyper-V ?
> (it seems both Xen and Hyper-V want to achieve the same thing but
> have competing implementations?)
>
> (BTW 8c058b0b9c has a CC for stable ... so it could be destined to
> cause more trouble.)


You mean my statement that irq 1 looks bad? That was a red herring, it 
should be fine.

-boris


>
> -- 
> Sander
>
>
>>>>>
>>>>> and later on:
>>>>>
>>>>> -hctosys: unable to open rtc device (rtc0)
>>>>> +rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
>>>>>
>>>>> +genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 
>>>>> (rtc0)
>>>>> +hvc_open: request_irq failed with rc -16.
>>>>> +Warning: unable to open an initial console.
>>>>>
>>>>>
>>>>> between 4.4-single and 4.4-multi:
>>>>>
>>>>>  Using NULL legacy PIC
>>>>> -NR_IRQS:4352 nr_irqs:32 0
>>>>> +NR_IRQS:4352 nr_irqs:48 0
>>
>> This is probably OK too, since nr_irqs depends on the number of CPUs.
>>
>> I think something is messed up with IRQs. Last week I saw
>> setup_irq() generate a stack dump (warning) for rtc_cmos, but it
>> appeared harmless at the time and now I don't see it anymore.
>>
>> -boris
>>
>>
>>>>>
>>>>> and later on:
>>>>>
>>>>> -rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
>>>>> +hctosys: unable to open rtc device (rtc0)
>>>>>
>>>>> -genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 
>>>>> (rtc0)
>>>>> -hvc_open: request_irq failed with rc -16.
>>>>> -Warning: unable to open an initial console.
>>>>>
>>>>> attached:
>>>>>     - dmesg with 4.3 kernel with 1 vcpu
>>>>>     - dmesg with 4.4 kernel with 1 vcpu
>>>>>     - dmesg with 4.4 kernel with 2 vcpus
>>>>>     - .config of the 4.4 kernel is attached.
>>>>>
>>>>> -- Sander
>>>>>
>>>>>

