linux-kernel - Re: Virtio-scsi multiqueue irq affinity

Message-ID: <eb893e8e-4805-1a04-d934-b7f821c64a8e@huawei.com>
Date:   Tue, 18 May 2021 09:37:28 +0800
From:   "liaochang (A)" <liaochang1@...wei.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        xuyihang <xuyihang@...wei.com>, "Ming Lei" <ming.lei@...hat.com>
CC:     Peter Xu <peterx@...hat.com>, Christoph Hellwig <hch@....de>,
        Jason Wang <jasowang@...hat.com>,
        Luiz Capitulino <lcapitulino@...hat.com>,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
        "Michael S. Tsirkin" <mst@...hat.com>, <minlei@...hat.com>
Subject: Re: Virtio-scsi multiqueue irq affinity

Thomas,

On 2021/5/10 15:54, Thomas Gleixner wrote:
> Liao,
> 
> On Mon, May 10 2021 at 11:19, liaochang wrote:
>> 1.We have a machine with 36 CPUs,and assign several RT threads to last
>> two CPUs(CPU-34, CPU-35).
> 
> Which kind of machine? x86?
> 
>> 2.I/O device driver create single managed irq, the affinity of which
>> includes CPU-34 and CPU-35.
> 
> If that driver creates only a single managed interrupt, then the
> possible affinity of that interrupt spans CPUs 0 - 35.
> 
> That's expected, but what is the effective affinity of that interrupt?
> 
> # cat /proc/irq/$N/effective_affinity
> 
> Also please provide the full output of
> 
> # cat /proc/interrupts
> 
> and point out which device we are talking about.

The managed irq in question is registered by the virtio-scsi driver over PCI (on an x86 platform,
in a VM with 4 vCPUs), as shown below.

# lspci -vvv
...
00:04.0 SCSI storage controller: Virtio: Virtio SCSI
        Subsystem: Virtio: Device 0008
        Physical Slot: 4
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at c140 [size=64]
        Region 1: Memory at febd2000 (32-bit, non-prefetchable) [size=4K]
        Region 4: Memory at fe004000 (64-bit, prefetchable) [size=16K]
        Capabilities: [98] MSI-X: Enable+ Count=4 Masked-
                Vector table: BAR=1 offset=00000000
                PBA: BAR=1 offset=00000800

# ls /sys/bus/pci/devices/0000:00:04.0/msi_irqs
33 34 35 36

# cat /proc/interrupts
...
 33:          0          0          0          0   PCI-MSI 65536-edge      virtio1-config
 34:          0          0          0          0   PCI-MSI 65537-edge      virtio1-control
 35:          0          0          0          0   PCI-MSI 65538-edge      virtio1-event
 36:      10637          0          0          0   PCI-MSI 65539-edge      virtio1-request

As you can see, virtio-scsi allocates four MSI-X interrupts, 33 to 36. The last one is supposed to
be triggered when data in the virtqueue is ready to be received; its interrupt handler then wakes
ksoftirqd to process the I/O. If I assign a FIFO RT thread to CPU0, a simple I/O operation issued by
the command "dd if=/dev/zero of=/test.img bs=1K count=1 oflag=direct,sync" will never finish.
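To confirm which CPU actually services the request interrupt, the per-CPU counters in
/proc/interrupts can be parsed. The following is only a minimal Python sketch, assuming the
four-column layout shown above; the function name irq_counts is hypothetical, and the sample
text is copied from the output above rather than read from a live system.

```python
def irq_counts(text, irq):
    """Return the per-CPU counts for one IRQ from /proc/interrupts-style text.

    Assumes each row is "<irq>: <count per CPU>... <chip> <name>", and that
    the first non-numeric field ends the counter columns.
    """
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == f"{irq}:":
            counts = []
            for token in fields[1:]:
                if not token.isdigit():
                    break  # reached the interrupt chip name, e.g. "PCI-MSI"
                counts.append(int(token))
            return counts
    return None  # IRQ not present in the text


# Two rows taken from the /proc/interrupts output above.
sample = """\
 33:          0          0          0          0   PCI-MSI 65536-edge      virtio1-config
 36:      10637          0          0          0   PCI-MSI 65539-edge      virtio1-request
"""

counts = irq_counts(sample, 36)
# Index of the CPU that has handled the most interrupts for IRQ 36.
busiest = max(range(len(counts)), key=counts.__getitem__)
print(busiest, counts)
```

For the sample rows, all 10637 interrupts of virtio1-request were counted on CPU0, which is the
CPU the RT thread is pinned to.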

Although that is expected, do you think it poses a risk to Linux availability? In a cloud
environment, services from different teams may seriously affect each other because of insufficient
communication or a poor understanding of the infrastructure. Thanks.

This problem arises when the RT thread and ksoftirqd are scheduled on the same CPU. Besides placing
the RT thread carefully, I also tried setting "rq_affinity" to 2, but that costs a 10%~30%
performance degradation in some I/O benchmarks. So I wonder whether the affinity of a managed irq
can be configured from user space or via kernel boot arguments? Thanks.
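When comparing the possible and the effective affinity of such an interrupt, the masks under
/proc/irq/<N>/ are hexadecimal CPU bitmaps, written as comma-separated 32-bit words with the most
significant word first. A minimal Python sketch for decoding one into a CPU list; the helper name
cpus_from_mask is hypothetical and the mask strings are made-up examples, not output from the
machines discussed here.

```python
def cpus_from_mask(mask):
    """Decode a hex CPU bitmap such as "f" or "0000000c,00000000" into CPU numbers."""
    # Words after the first are zero-padded to 8 hex digits, so dropping the
    # commas yields one large hexadecimal number where bit N stands for CPU N.
    value = int(mask.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


print(cpus_from_mask("1"))                  # a mask with only bit 0 set
print(cpus_from_mask("f"))                  # all CPUs of a 4-vCPU guest
print(cpus_from_mask("0000000c,00000000"))  # two CPUs above CPU 31
```

The third example corresponds to CPU-34 and CPU-35, the two CPUs used for RT threads in the
36-CPU scenario quoted earlier in this thread.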

> 
> Thanks,
> 
>         tglx
> .
> 
BR,
Liao, Chang