lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1a899ea7-c58f-4b65-9888-c26adc8da00f@bytedance.com>
Date: Thu, 25 Sep 2025 11:13:43 +0800
From: Sheng Zhao <sheng.zhao@...edance.com>
To: Jason Wang <jasowang@...hat.com>
Cc: mst@...hat.com, xuanzhuo@...ux.alibaba.com, eperezma@...hat.com,
 virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org,
 xieyongji@...edance.com
Subject: Re: Re: [PATCH] vduse: Use fixed 4KB bounce pages for arm64 64KB page
 size



On 2025/9/25 08:20, Jason Wang wrote:
> On Wed, Sep 24, 2025 at 2:38 PM Sheng Zhao <sheng.zhao@...edance.com> wrote:
>>
>>
>>
>> On 2025/9/24 12:15, Jason Wang wrote:
>>> On Wed, Sep 24, 2025 at 12:05 PM Sheng Zhao <sheng.zhao@...edance.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2025/9/24 08:57, Jason Wang wrote:
>>>>> On Tue, Sep 23, 2025 at 8:37 PM Sheng Zhao <sheng.zhao@...edance.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2025/9/17 16:16, Jason Wang wrote:
>>>>>>> On Mon, Sep 15, 2025 at 3:34 PM <sheng.zhao@...edance.com> wrote:
>>>>>>>>
>>>>>>>> From: Sheng Zhao <sheng.zhao@...edance.com>
>>>>>>>>
>>>>>>>> The allocation granularity of bounce pages is PAGE_SIZE. This may cause
>>>>>>>> even small IO requests to occupy an entire bounce page exclusively. The
>>>>>>>> kind of memory waste will be more significant on arm64 with 64KB pages.
>>>>>>>
>>>>>>> Let's tweak the title as there are archs that are using non 4KB pages
>>>>>>> other than arm.
>>>>>>>
>>>>>>
>>>>>> Got it. I will modify this in v2.
>>>>>>
>>>>>>>>
>>>>>>>> So, optimize it by using fixed 4KB bounce pages.
>>>>>>>>
>>>>>>>> Signed-off-by: Sheng Zhao <sheng.zhao@...edance.com>
>>>>>>>> ---
>>>>>>>>      drivers/vdpa/vdpa_user/iova_domain.c | 120 +++++++++++++++++----------
>>>>>>>>      drivers/vdpa/vdpa_user/iova_domain.h |   5 ++
>>>>>>>>      2 files changed, 83 insertions(+), 42 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
>>>>>>>> index 58116f89d8da..768313c80b62 100644
>>>>>>>> --- a/drivers/vdpa/vdpa_user/iova_domain.c
>>>>>>>> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
>>>>>>>> @@ -103,19 +103,26 @@ void vduse_domain_clear_map(struct vduse_iova_domain *domain,
>>>>>>>>      static int vduse_domain_map_bounce_page(struct vduse_iova_domain *domain,
>>>>>>>>                                              u64 iova, u64 size, u64 paddr)
>>>>>>>>      {
>>>>>>>> -       struct vduse_bounce_map *map;
>>>>>>>> +       struct vduse_bounce_map *map, *head_map;
>>>>>>>> +       struct page *tmp_page;
>>>>>>>>             u64 last = iova + size - 1;
>>>>>>>>
>>>>>>>>             while (iova <= last) {
>>>>>>>> -               map = &domain->bounce_maps[iova >> PAGE_SHIFT];
>>>>>>>> +               map = &domain->bounce_maps[iova >> BOUNCE_PAGE_SHIFT];
>>>>>>>
>>>>>>> BOUNCE_PAGE_SIZE is kind of confusing as it's not the size of any page
>>>>>>> at all when PAGE_SIZE is not 4K.
>>>>>>>
>>>>>>
>>>>>> How about BOUNCE_MAP_SIZE?
>>>>>
>>>>> Fine with me.
>>>>>
>>>>>>
>>>>>>>>                     if (!map->bounce_page) {
>>>>>>>> -                       map->bounce_page = alloc_page(GFP_ATOMIC);
>>>>>>>> -                       if (!map->bounce_page)
>>>>>>>> -                               return -ENOMEM;
>>>>>>>> +                       head_map = &domain->bounce_maps[(iova & PAGE_MASK) >> BOUNCE_PAGE_SHIFT];
>>>>>>>> +                       if (!head_map->bounce_page) {
>>>>>>>> +                               tmp_page = alloc_page(GFP_ATOMIC);
>>>>>>>> +                               if (!tmp_page)
>>>>>>>> +                                       return -ENOMEM;
>>>>>>>> +                               if (cmpxchg(&head_map->bounce_page, NULL, tmp_page))
>>>>>>>> +                                       __free_page(tmp_page);
>>>>>>>
>>>>>>> I don't understand why we need cmpxchg() logic.
>>>>>>>
>>>>>>> Btw, it looks like you want to make multiple bounce_map to point to
>>>>>>> the same 64KB page? I wonder what's the advantages of doing this. Can
>>>>>>> we simply keep the 64KB page in bounce_map?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>
>>>>>> That's correct. We use fixed 4KB-sized bounce pages, and there will be a
>>>>>> many-to-one relationship between these 4KB bounce pages and the 64KB
>>>>>> memory pages.
>>>>>>
>>>>>> Bounce pages are allocated on demand. As a result, it may occur that
>>>>>> multiple bounce pages corresponding to the same 64KB memory page attempt
>>>>>> to allocate memory simultaneously, so we use cmpxchg to handle this
>>>>>> concurrency.
>>>>>>
>>>>>> In the current implementation, the bounce_map structure requires no
>>>>>> modification. However, if we keep the 64KB page into a single bounce_map
>>>>>> while still wanting to implement a similar logic, we may need an
>>>>>> additional array to store multiple orig_phys values in order to
>>>>>> accommodate the many-to-one relationship.
>>>>>
>>>>> Or simply having a bitmap is sufficient per bounce_map?
>>>>>
>>>>
>>>> Yes, using a bitmap can mark the usage status of each 4KB, but it may
>>>> not simplify things overall.
>>>>
>>>> - we will inevitably need to add an additional array per bounce_map to
>>>> store the orig_phys corresponding to each 4KB for subsequent copying
>>>> (vduse_domain_bounce).
>>>
>>> I may miss something, the PAGE_SIZE is 64KB in this case, why do we
>>> need to store per 4KB orig_phys?
>>>
>>
>> Since one orig_phys originates from one IO request. If we want the
>> minimum size of bounce pages occupied by an IO request to be 4KB instead
>> of 64KB. we need to store their respective orig_phys values for each 4KB
>> corresponding to the IO request.
>>
>> In other words, we may not be able to guarantee that the orig_phys
>> values of all IO requests within the same 64KB memory page are
>> contiguous, so we need to store them separately.
> 
> Ok, let's leave a comment to explain this design.
> 
> Thanks
> 

Sure, I will add the relevant comments in v2.

Thanks

>>
>> Thanks
>>>>
>>>> - compared to the current commit, this modification may only be a
>>>> structural change and fail to reduce the amount of changes to the code
>>>> logic. For instance, cmpxchg is still required.
>>>
>>> Thanks
>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>>> Thanks
>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>
>>>>
>>>
>>
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ