linux-kernel - Re: Re: [PATCH] vduse: Use fixed 4KB bounce pages for arm64 64KB page size

Message-ID: <18e7bf9b-eb4c-4c4c-a00e-bfe0bc07e81c@bytedance.com>
Date: Tue, 16 Sep 2025 21:54:19 +0800
From: Sheng Zhao <sheng.zhao@...edance.com>
To: Jason Wang <jasowang@...hat.com>
Cc: mst@...hat.com, xuanzhuo@...ux.alibaba.com, eperezma@...hat.com,
 virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org,
 xieyongji@...edance.com
Subject: Re: Re: [PATCH] vduse: Use fixed 4KB bounce pages for arm64 64KB page
 size



On 2025/9/16 15:34, Jason Wang wrote:
> On Mon, Sep 15, 2025 at 7:07 PM Sheng Zhao <sheng.zhao@...edance.com> wrote:
>>
>>
>>
>> On 2025/9/15 16:21, Jason Wang wrote:
>>> On Mon, Sep 15, 2025 at 3:34 PM <sheng.zhao@...edance.com> wrote:
>>>>
>>>> From: Sheng Zhao <sheng.zhao@...edance.com>
>>>>
>>>> The allocation granularity of bounce pages is PAGE_SIZE. This may cause
>>>> even small IO requests to occupy an entire bounce page exclusively.
>>>
>>> This sounds more like an issue of the IOVA allocator using the
>>> wrong granularity?
>>>
>>
>> Sorry, the previous email has a slight formatting issue.
>>
>> The granularity of the IOVA allocator is customized during the
>> initialization of the vduse domain, and this value is also modified in
>> this commit.
> 
> Ok, let's add this to the changelog.
> 
> Btw, do you have perf numbers to demonstrate the benefit?
> 
> Thanks
> 

On arm64 with 64KB base pages, native-size bounce pages are more likely
than fixed 4KB bounce pages to fill up the bounce buffer (64MB by default),
resulting in I/O performance bottlenecks.
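
To make the capacity difference concrete, here is a minimal sketch of the
arithmetic. It is not code from the patch. It assumes the default 64MB
bounce buffer mentioned above and that each in-flight request occupies one
whole bounce page. The helper names bounce_slots() and wasted_bytes() are
hypothetical, and the 512-byte request size used in the note below is only
an illustrative assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of bounce pages a buffer provides when it is carved into
 * fixed-size bounce pages. Both sizes are in bytes.
 * Hypothetical helper for illustration, not a vduse function.
 */
uint64_t bounce_slots(uint64_t buf_bytes, uint64_t bounce_page_bytes)
{
	assert(bounce_page_bytes != 0);
	return buf_bytes / bounce_page_bytes;
}

/*
 * Bytes left unused in a bounce page when a single small request
 * occupies that page exclusively (the assumption stated above).
 */
uint64_t wasted_bytes(uint64_t io_bytes, uint64_t bounce_page_bytes)
{
	assert(io_bytes <= bounce_page_bytes);
	return bounce_page_bytes - io_bytes;
}
```

Under these assumptions, a 64MB buffer holds 1024 bounce pages at 64KB but
16384 at 4KB, and a 512-byte request leaves 65024 bytes of a 64KB page
unused versus 3584 bytes of a 4KB page.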

I used QEMU vduse-blk as the backend for testing write performance. 
Below are the fio test results:


           | native       | fixed-4k
-----------+--------------+-------------
numjobs=2  | bw=44.4MiB/s | bw=47.0MiB/s
iodepth=4  | iops=90.9k   | iops=96.1k
-----------+--------------+-------------
numjobs=4  | bw=58.8MiB/s | bw=61.1MiB/s
iodepth=4  | iops=120.3k  | iops=125.4k
-----------+--------------+-------------
numjobs=8  | bw=64.0MiB/s | bw=74.7MiB/s
iodepth=8  | iops=131.1k  | iops=153.1k
-----------+--------------+-------------
numjobs=16 | bw=69.8MiB/s | bw=92.7MiB/s
iodepth=8  | iops=143.0k  | iops=190.0k
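
The relative gain in each row can be computed with a small helper like the
one below. This is only a sketch for reading the table; pct_gain() is a
hypothetical name and not part of fio or vduse.

```c
#include <assert.h>

/*
 * Relative improvement of the fixed-4k result over the native result,
 * in percent. Works for either the bw or the iops column as long as
 * both arguments use the same unit.
 */
double pct_gain(double native, double fixed_4k)
{
	assert(native > 0.0);
	return (fixed_4k - native) / native * 100.0;
}
```

Applied to the iops column, this gives roughly 5.7% at numjobs=2 and
roughly 32.9% at numjobs=16, so the gain grows with the number of requests
in flight.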


Thanks

>>
>> Thanks
>>
>>>> This
>>>> kind of memory waste will be more significant on arm64 with 64KB pages.
>>>>
>>>> So, optimize it by using fixed 4KB bounce pages.
>>>>
>>>> Signed-off-by: Sheng Zhao <sheng.zhao@...edance.com>
>>>
>>> Thanks
>>>
>>
>