Message-ID: <e5cb48fc-081a-f47c-810f-cc649c765ae6@huawei.com>
Date:   Tue, 23 Jul 2019 16:29:15 +0100
From:   John Garry <john.garry@...wei.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     Marc Zyngier <maz@...nel.org>, <bigeasy@...utronix.de>,
        chenxiang <chenxiang66@...ilicon.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: About threaded interrupt handler CPU affinity

>
>> Probably because the other CPU(s)
>> in the affinity set are less loaded than the one which handles the hard
>> interrupt.
>
> I will look to get some figures for CPU loading to back this up.
>

As promised, here are some CPU loading figures before and after the 
change to make the thread CPU affinity the same as the interrupt affinity:

Before:
CPU	%usr	%sys	%irq	%soft	%idle
all	2.9	13.1	1.2	4.6	78.2				
0	0.0	29.3	10.1	58.6	2.0
1	18.2	39.4	0.0	1.0	41.4
2	0.0	2.0	0.0	0.0	98.0
3	16.0	40.0	0.0	0.0	44.0
4	9.7	56.3	0.0	0.0	34.0
5	0.0	0.0	0.0	0.0	100.0
6	0.0	36.0	12.0	45.0	7.0
7	12.5	35.4	0.0	0.0	52.1
8	10.3	38.1	0.0	0.0	51.6
9	0.0	0.0	0.0	0.0	100.0
10	8.2	41.8	0.0	0.0	50.0
11	0.0	0.0	0.0	0.0	100.0
					
After:
CPU	%usr	%sys	%irq	%soft	%idle
all	3.5	18.4	2.7	6.8	68.6
0	0.0	20.6	29.9	29.9	19.6
1	0.0	39.8	0.0	50.0	10.2
2	18.6	45.4	0.0	0.0	36.1
3	19.6	48.9	0.0	0.0	31.5
4	0.0	0.0	0.0	0.0	100.0
5	14.9	51.1	0.0	0.0	34.0
6	0.0	20.4	24.5	36.7	18.4
7	0.0	36.0	0.0	63.0	1.0
8	12.2	55.1	0.0	0.0	32.7
9	15.0	57.0	0.0	0.0	28.0
10	13.0	54.0	0.0	0.0	33.0
11	14.6	52.1	0.0	0.0	33.3
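
(For reference, per-CPU breakdowns in this format can be gathered with 
the sysstat "mpstat" tool, e.g. "mpstat -P ALL 5 1"; only the 
%usr/%sys/%irq/%soft/%idle columns are shown above.)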


The system has 96 cores, and we have 6 CPUs per interrupt affinity 
mask. I'm only showing 2 clusters of 6 CPUs here, but the loading 
pattern is common across all clusters, although the higher-numbered 
clusters are generally much less loaded.

We can see that before the change, CPU0 and CPU6 are almost 100% 
loaded, taking on all of the irq and softirq processing.

With the change, CPU0 and CPU6 are much less loaded, and CPU1 and CPU7 
take on much of the softirq processing.

In total, irq and softirq processing has increased - I suppose the 
reason is that we're simply pumping through more IO.

We'll do some more testing at lower loads, but from the limited 
testing so far we see no regression here. In the above test we're 
using many disks (> 20) on the storage controller.
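
For anyone not following the earlier discussion: the change under test 
isn't attached here, but the idea is simply to bind the irq thread to 
the effective affinity of the hard interrupt (i.e. the CPU actually 
taking it) rather than to the full affinity mask. A rough, untested 
sketch of a helper doing that - illustrative only, not the actual 
patch, and the function name is made up - could sit alongside 
irq_thread_check_affinity() in kernel/irq/manage.c:

/* Rough illustration only - not the patch under test. */
static void irq_thread_bind_to_effective_affinity(struct irq_desc *desc)
{
	cpumask_var_t mask;

	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
		return;

	raw_spin_lock_irq(&desc->lock);
	/* Effective affinity = where the hard interrupt is actually routed */
	cpumask_copy(mask, irq_data_get_effective_affinity_mask(&desc->irq_data));
	raw_spin_unlock_irq(&desc->lock);

	/* Runs in the context of the irq thread, so 'current' is the thread */
	set_cpus_allowed_ptr(current, mask);
	free_cpumask_var(mask);
}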

Please let me know your thoughts.

Cheers,
John

>>
>> This is heavily use case dependent I assume, so making this a general
>> change is perhaps not a good idea, but we could surely make this
>> optional.
>
> That sounds reasonable. So would the idea be to enable this optionally
> at the request threaded irq call?
>
> Thanks,
> John
>
>>
>> Thanks,
>>
>>     tglx
>>
>> .
>>
>

