linux-kernel

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20230915093758.31397-1-jacky_gam_2001@163.com>
Date:   Fri, 15 Sep 2023 17:37:57 +0800
From:   Ping Gan <jacky_gam_2001@....com>
To:     chaitanyak@...dia.com
Cc:     ping_gan@...l.com, kbusch@...nel.org,
        linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
        hch@....de, sagi@...mberg.me, axboe@...nel.dk,
        jacky_gam_2001@....com
Subject: 

> On 9/13/2023 1:34 AM, Ping Gan wrote:
> > Since nvme target currently does not support to submit bio to a
> > polling
> > queue, the bio's completion relies on system interrupt. But when there
> > is high workload in system and the competition is very high, so it
> > makes
> > sense to add polling queue task to submit bio to disk's polling queue
> > and poll the completion queue of disk.
> > 
> >
>
> I did some work in the past for nvmet polling and saw good
> performance improvement.
>
> Can you please share performance numbers for this series ?
> 
> -ck

hi,
I have verified this patch on two testbeds one for host and the other
for target. I used tcp as transport protocol, spdk perf as initiator. 
I do two group tests. The IO size of first is 4K, and the other is 2M. 
Both includ randrw, randwrite and randrw.  Both have same prerequisites.
At the initiator side I used 1 qp, 32 queue depth,and 1 spdk perf 
application, and for target side I bound tcp queue to 1 target core.
And I get below results.
iosize_4k        polling queue                        interrupt
randrw           NIC_rx:338M/s NIC_tx:335M/s      NIC_rx:260M/s
NIC_tx:258M/s
randwrite        NIC_rx:587M/s                    NIC_rx:431M/s
randread         NIC_tx:873M/s                    NIC_tx:654M/s

iosize_2M        polling queue                        interrupt
randrw           NIC_rx:738M/s NIC_tx:741M/s      NIC_rx:674M/s
NIC_tx:674M/s
randwrite        NIC_rx:1199M/s                   NIC_rx:1146M/s
randread         NIC_tx:2226M/s                   NIC_tx:2119M/s

For iosize 4k the NIC's bandwidth of poling queue is more than 30% than
bandwidth of interrupt. But for iosize 2M the improvement is not obvious,
the randrw of polling queue is about 9% more than interrupt; randwrite 
and randread of polling queue is about 5% more than interrupt.

Thanks,
Ping