Message-ID: <9f451583066900c7c7db345c103f6530@eikelenboom.it>
Date:	Wed, 02 Dec 2015 00:44:54 +0100
From:	Sander Eikelenboom <linux@...elenboom.it>
To:	Boris Ostrovsky <boris.ostrovsky@...cle.com>
Cc:	linux-kernel@...r.kernel.org, xen-devel@...ts.xen.org,
	david.vrabel@...rix.com
Subject: Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv
 guest under Xen with single vcpu.

On 2015-12-02 00:41, Boris Ostrovsky wrote:
> On 12/01/2015 06:30 PM, Sander Eikelenboom wrote:
>> On 2015-12-02 00:19, Boris Ostrovsky wrote:
>>> On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:
>>>> On 2015-12-01 23:47, Boris Ostrovsky wrote:
>>>>> On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:
>>>>>> On 2015-11-30 23:54, Boris Ostrovsky wrote:
>>>>>>> On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:
>>>>>>>> On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
>>>>>>>>> On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>> 
>>>>>>>>>> I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
>>>>>>>>>> pulled on top.
>>>>>>>>>> 
>>>>>>>>>> Running this kernel under Xen on PV-guests with multiple vcpus goes well
>>>>>>>>>> (on idle < 10% cpu usage), but a guest with only a single vcpu doesn't
>>>>>>>>>> idle at all, it seems a kworker thread is stuck:
>>>>>>>>>> root       569 98.0  0.0      0     0 ?        R 16:02 12:47 [kworker/0:1]
>>>>>>>>>> 
>>>>>>>>>> Running a 4.3 kernel works fine with a single vcpu; bisecting would
>>>>>>>>>> probably be quite painful since there were some breakages this merge
>>>>>>>>>> window with respect to Xen pv-guests.
>>>>>>>>>> 
>>>>>>>>>> There are some differences in the diffs from booting a 4.3, 4.4-single,
>>>>>>>>>> 4.4-multi cpu boot:
>>>>>>>>> 
>>>>>>>>> Boris has been tracking a bunch of them. I am attaching the 
>>>>>>>>> latest set of
>>>>>>>>> patches I've to carry on top of v4.4-rc3.
>>>>>>>> 
>>>>>>>> Hi Konrad,
>>>>>>>> 
>>>>>>>> i will test those, see if it fixes all my issues and report back
>>>>>>> 
>>>>>>> They shouldn't help you ;-( (and I just saw a message from you 
>>>>>>> confirming this)
>>>>>>> 
>>>>>>> The first one fixes a 32-bit bug (on bare metal too). The second 
>>>>>>> fixes
>>>>>>> a fatal bug for 32-bit PV guests. The other two are code
>>>>>>> improvements/cleanup.
>>>>>> 
>>>>>> One of these patches also fixes a bug i was having with a 
>>>>>> pci-passthrough device in
>>>>>> a HVM that wasn't working (depending on which dom0-kernel i was 
>>>>>> using (4.3 or 4.4)),
>>>>>> but didn't report yet.
>>>>>> 
>>>>>> Fingers crossed but i think this pv-guest single vcpu issue is the 
>>>>>> last i'm troubled by for now ;)
>>>>> 
>>>>> I could not reproduce this, including with your kernel config file.
>>>> 
>>>> Hmm that's unpleasant :-\
>>>> 
>>>> Another strange thing is that it doesn't seem to affect dom0 (which is
>>>> also a PV guest), only unprivileged ones.
>>>> All unprivileged pv-guests seem to have the irq issue, but only with
>>>> a single vcpu do i get the stuck kworker thread that got my
>>>> attention. With 2 vcpus that doesn't seem to happen (but you still
>>>> get the dmesg output and warnings about hvc).
>>>> 
>>>> Could it be that:
>>>> 
>>>> arch/x86/include/asm/i8259.h
>>>> static inline int nr_legacy_irqs(void)
>>>> {
>>>>         return legacy_pic->nr_legacy_irqs;
>>>> }
>>>> 
>>>> returns something different in some circumstances ?
>>> 
>>> It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and
>>> 0 after that commit.
>>> 
>>> This is the last number that you see in
>>>     NR_IRQS:4352 nr_irqs:48 0
>>> line.
>>> 
>>> I think you should be able to safely revert both
>>> b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
>>> 8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
>>> difference.
>>> 
>>> 
>>> -boris
>>> 
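
As a rough illustration of reading the boot line Boris quotes, here is a minimal C sketch that pulls out the trailing number, which he identifies as the nr_legacy_irqs() value. It assumes the message has exactly the form shown above; the helper name parse_irq_summary is hypothetical and is not a kernel function.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse an early-boot IRQ summary line such as
 *     "NR_IRQS:4352 nr_irqs:48 0"
 * (format assumed from the line quoted in this thread).
 * Returns 0 and fills the three fields on success, -1 on a malformed line.
 */
static int parse_irq_summary(const char *line, int *nr_irqs_max,
                             int *nr_irqs, int *legacy)
{
    assert(line != NULL);

    /* The leading space in the format skips any indentation in dmesg output. */
    if (sscanf(line, " NR_IRQS:%d nr_irqs:%d %d",
               nr_irqs_max, nr_irqs, legacy) != 3)
        return -1;
    return 0;
}
```

Feeding it the line above leaves the last field in `legacy`, so a guest log can be checked for 16 versus 0 without reading it by eye.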
>> 
>> That was already underway compiling :)
>> 
>> And it does reveal that reverting both fixes the issue, no stuck 
>> kworker thread .. and no:
>>    genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0)
>>    hvc_open: request_irq failed with rc -16.
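
For reference on the "rc -16" above: on Linux, errno 16 is EBUSY, which fits request_irq being refused because irq 8 was already claimed (here by rtc0). A minimal user-space C sketch for decoding such a code follows; the helper name rc_to_text is hypothetical, and the sketch only translates the number, it does not reproduce the kernel's check.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Map a negative kernel-style return code (e.g. -16) to the
 * corresponding errno description from the C library.
 */
static const char *rc_to_text(int rc)
{
    assert(rc <= 0);        /* kernel-style codes are zero or negative */
    return strerror(-rc);   /* strerror() expects the positive errno value */
}
```

Calling rc_to_text(-16) gives the C library's text for EBUSY.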
> 
> 
> Let me try it again tomorrow. Can you post your guest config file, Xen
> version and host HW (Intel or AMD)? 'xl info' maybe?
> 
> -boris

Guest config file == dom0 config file == the one i sent you earlier.
Host is an AMD Phenom X6.

# xl info
host                   : serveerstertje
release                : 4.4.0-rc3-20151201-linus-doflr-boris+
version                : #1 SMP Tue Dec 1 19:02:58 CET 2015
machine                : x86_64
nr_cpus                : 6
max_cpu_id             : 5
nr_nodes               : 1
cores_per_socket       : 6
threads_per_core       : 1
cpu_mhz                : 3200
hw_caps                : 178bf3ff:efd3fbff:00000000:00011300:00802001:00000000:000037ff:00000000
virt_caps              : hvm hvm_directio
total_memory           : 20479
free_memory            : 7745
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 7
xen_extra              : -unstable
xen_version            : 4.7-unstable
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : Thu Nov 26 20:58:13 2015 +0100 git:5252636-dirty
xen_commandline        : dom0_mem=1536M,max:1536M loglvl=all loglvl_guest=all console_timestamps=datems vga=gfx-1280x1024x32 cpuidle cpufreq=xen com1=38400,8n1 console=vga,com1 ivrs_ioapic[6]=00:14.0 iommu=on,verbose,debug,amd-iommu-debug conring_size=128k ucode=-1
cc_compiler            : gcc-4.9.real (Debian 4.9.2-10) 4.9.2
cc_compile_by          : root
cc_compile_domain      : dyndns.org
cc_compile_date        : Thu Nov 26 21:18:41 CET 2015
xend_config_format     : 4

If you need more info and can get it by having me run a debug patch 
(since you can't reproduce this), don't hesitate to send one :)

Thanks so far !

--
Sander

> 
> 
>> 
>> What i did get was a conflict when reverting
>> b4ff8389ed14b849354b59ce9b360bdefcdbf99c:
>> arch/arm64/include/asm/irq.h, although that shouldn't matter because
>> we are on x86 and not on arm.
>> 
>> -- Sander
>> 
>> 
>>>> 
>>>> -- Sander
>>>> 
>>>>> 
>>>>> -boris
>>>> 
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@...ts.xen.org
>>>> http://lists.xen.org/xen-devel