Message-ID: <18e7bf9b-eb4c-4c4c-a00e-bfe0bc07e81c@bytedance.com>
Date: Tue, 16 Sep 2025 21:54:19 +0800
From: Sheng Zhao <sheng.zhao@...edance.com>
To: Jason Wang <jasowang@...hat.com>
Cc: mst@...hat.com, xuanzhuo@...ux.alibaba.com, eperezma@...hat.com,
virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org,
xieyongji@...edance.com
Subject: Re: Re: [PATCH] vduse: Use fixed 4KB bounce pages for arm64 64KB page
size
On 2025/9/16 15:34, Jason Wang wrote:
> On Mon, Sep 15, 2025 at 7:07 PM Sheng Zhao <sheng.zhao@...edance.com> wrote:
>>
>>
>>
>> On 2025/9/15 16:21, Jason Wang wrote:
>>> On Mon, Sep 15, 2025 at 3:34 PM <sheng.zhao@...edance.com> wrote:
>>>>
>>>> From: Sheng Zhao <sheng.zhao@...edance.com>
>>>>
>>>> The allocation granularity of bounce pages is PAGE_SIZE. This may cause
>>>> even small IO requests to occupy an entire bounce page exclusively.
>>>
>>> This sounds more like an issue of the IOVA allocator using the
>>> wrong granularity?
>>>
>>
>> Sorry, the previous email has a slight formatting issue.
>>
>> The granularity of the IOVA allocator is customized during the
>> initialization of the vduse domain, and this value is also modified in
>> this commit.
>
> Ok, let's add this to the changelog.
>
> Btw, do you have perf numbers to demonstrate the benefit?
>
> Thanks
>
On arm64 with 64KB base pages, native-size bounce pages are more likely
than fixed 4KB bounce pages to fill up the bounce buffer (64MB by
default), which results in an I/O performance bottleneck.
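
As a rough illustration of why the buffer fills up sooner, here is a
back-of-the-envelope sketch. It is not taken from the vduse driver; the
function name is hypothetical, and it assumes the worst case described in
the changelog, where each small request occupies one whole bounce page.

```python
# Back-of-the-envelope sketch only; not code from the vduse driver.
# Assumption: every in-flight small request occupies one whole bounce page.
BOUNCE_BUFFER_SIZE = 64 * 1024 * 1024  # default bounce buffer size (64MB)


def max_small_requests(bounce_page_size: int) -> int:
    """Number of single-page requests that fit in the bounce buffer."""
    return BOUNCE_BUFFER_SIZE // bounce_page_size


native = max_small_requests(64 * 1024)   # arm64 64KB base pages
fixed_4k = max_small_requests(4 * 1024)  # fixed 4KB bounce pages
print(native, fixed_4k)  # 1024 16384
```

Under that assumption, fixed 4KB bounce pages leave room for 16 times as
many small requests before the bounce buffer is exhausted.
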
I used QEMU vduse-blk as the backend for testing write performance.
Below are the fio test results:
          | native       | fixed-4k
----------+--------------+-------------
numjobs=2 | bw=44.4MiB/s | bw=47.0MiB/s
iodepth=4 | iops=90.9k   | iops=96.1k
----------+--------------+-------------
numjobs=4 | bw=58.8MiB/s | bw=61.1MiB/s
iodepth=4 | iops=120.3k  | iops=125.4k
----------+--------------+-------------
numjobs=8 | bw=64.0MiB/s | bw=74.7MiB/s
iodepth=8 | iops=131.1k  | iops=153.1k
----------+--------------+-------------
numjobs=16| bw=69.8MiB/s | bw=92.7MiB/s
iodepth=8 | iops=143.0k  | iops=190.0k
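
For reference, the relative bandwidth gain in each configuration can be
derived from the numbers above with a short script. The values are copied
from the fio results; the script and the helper name gain_percent are only
an illustration and were not part of the test setup.

```python
# Bandwidth in MiB/s copied from the fio results above:
# (numjobs, iodepth) -> (native, fixed-4k)
results = {
    (2, 4): (44.4, 47.0),
    (4, 4): (58.8, 61.1),
    (8, 8): (64.0, 74.7),
    (16, 8): (69.8, 92.7),
}


def gain_percent(native: float, fixed: float) -> float:
    """Relative improvement of fixed-4k over native, in percent."""
    return (fixed - native) / native * 100.0


for (numjobs, iodepth), (native, fixed) in results.items():
    gain = gain_percent(native, fixed)
    print(f"numjobs={numjobs} iodepth={iodepth}: {gain:+.1f}%")
```

The gain grows with concurrency, from roughly 4-6% at the two lighter
settings to about 33% at numjobs=16, iodepth=8.
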
Thanks
>>
>> Thanks
>>
>>>> This
>>>> kind of memory waste is more significant on arm64 with 64KB pages.
>>>>
>>>> So, optimize it by using fixed 4KB bounce pages.
>>>>
>>>> Signed-off-by: Sheng Zhao <sheng.zhao@...edance.com>
>>>
>>> Thanks
>>>
>>
>