Message-ID: <963e38b0-a7d6-0b13-af89-81b03028d1ae@huawei.com>
Date: Mon, 10 May 2021 16:48:31 +0800
From: xuyihang <xuyihang@...wei.com>
To: Thomas Gleixner <tglx@...utronix.de>,
Ming Lei <ming.lei@...hat.com>
CC: Peter Xu <peterx@...hat.com>, Christoph Hellwig <hch@....de>,
Jason Wang <jasowang@...hat.com>,
Luiz Capitulino <lcapitulino@...hat.com>,
"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
"Michael S. Tsirkin" <mst@...hat.com>, <minlei@...hat.com>,
<liaochang1@...wei.com>
Subject: Re: Virtio-scsi multiqueue irq affinity
Thomas,
On 2021/5/8 20:26, Thomas Gleixner wrote:
> Yihang,
>
> On Sat, May 08 2021 at 15:52, xuyihang wrote:
>> We are dealing with a scenario which may need to assign a default
>> irqaffinity for managed IRQ.
>>
>> Assume we have an RT thread with 100% CPU usage bound to a specific
>> CPU.
>>
>> Meanwhile, the interrupt handler registered by a device, which is
>> ksoftirqd, may never have a chance to run. (And we don't want to use
>> an isolated CPU.)
> A device cannot register an interrupt handler in ksoftirqd.
>
>> There could be a couple of ways to deal with this problem:
>>
>> 1. Adjust the priority of ksoftirqd or of the RT thread, so that the
>> interrupt handler could preempt the RT thread. However, I am not sure
>> whether that could have side effects.
>>
>> 2. Adjust the interrupt CPU affinity or the RT thread affinity. But managed
>> IRQ seems designed to forbid users from manipulating interrupt affinity.
>>
>> It seems to me that managed IRQ is coupled to the user-side application.
>>
>> Would you share your thoughts about this issue please?
> Can you please provide a more detailed description of your system?
>
> - Number of CPUs
It's a 4-CPU x86 VM.
> - Kernel version
This experiment was run on linux-4.19.
> - Is NOHZ full enabled?
nohz=off
> - Any isolation mechanisms enabled, and if so how are they
> configured (e.g. on the kernel command line)?
One core is isolated via the command line (e.g. isolcpus=3) and the RT
thread is bound to it; there is no other isolation configured.
> - Number of queues in the multiqueue device
Only one queue.
[root@...alhost ~]# cat /proc/interrupts | grep request
 27:       5499          0          0          0   PCI-MSI 65539-edge   virtio1-request
This environment is a virtual machine and it's a virtio device; I guess that
should not make any difference in this case.
> - Is the RT thread issuing I/O to the multiqueue device?
The RT thread doesn't issue I/O.
We simplified the reproduction procedure:
1. Start a busy-looping program with near-100% CPU usage, named print:
./print 1 1 &
2. Make the program a real-time application:
chrt -f -p 1 11514
3. Bind the RT process to the **managed irq** core:
taskset -cpa 0 11514
4. Use dd to write to the hard drive; dd never finishes and returns:
dd if=/dev/zero of=/test.img bs=1K count=1 oflag=direct,sync &
Since the CPU is fully utilized by the RT application, and the disk driver
chooses CPU0 to handle its softirq, dd never gets a chance to run.
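The step-1 busy looper can be sketched as a trivial shell spin loop (a
hypothetical stand-in for the print binary; bounded by an iteration count
here so it terminates, whereas the real reproducer spins forever and its
PID, 11514 above, is then passed to chrt/taskset):

```shell
# Minimal stand-in for the "print" busy-looper (hypothetical; the real
# reproducer spins forever). Bounded to $1 iterations so it terminates.
spin() {
    local i=0 n=${1:-1000000}
    while [ "$i" -lt "$n" ]; do
        i=$((i + 1))        # pure CPU spin, no sleeping or blocking
    done
    echo "$i"               # print the iteration count when done
}

spin 100000 &               # backgrounded like "./print 1 1 &"
wait $!
```

Steps 2-3 (chrt -f -p 1 <pid>, taskset -cpa 0 <pid>) would then pin this
spinner to the managed-irq core exactly as in the reproducer above.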
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
11514 root      -2   0    2228    740    676 R 100.0  0.0   3:26.70 print
If we make some changes to this experiment:
1. If the RT application uses less than 100% CPU time, the problem
disappears.
2. If we change rq_affinity to 2, so that the softirq is not handled on the
same core as the RT thread, the problem also disappears. However, this
approach results in roughly a 10%-30% random-write performance reduction
compared to rq_affinity = 1, which may have better cache utilization.
echo 2 > /sys/block/sda/queue/rq_affinity
Therefore, I want to exclude some CPUs from managed IRQs via a boot
parameter, similar in approach to 11ea68f553e2 ("genirq, sched/isolation:
Isolate from handling managed interrupts").
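For reference, the knob that commit adds would look like this on the kernel
command line (a sketch assuming CPU 3 as in this VM; the flag is
upstream-only, so linux-4.19 would need a backport):

```
# Kernel command-line fragment (not available in linux-4.19 without a
# backport of 11ea68f553e2): exclude CPU 3 from the default affinity of
# managed interrupts as long as housekeeping CPUs remain online.
isolcpus=managed_irq,domain,3
```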
Thanks,
Yihang