Message-ID: <ce0de356-1ca1-df5f-c7db-fbe5a7fabff5@hisilicon.com>
Date: Tue, 7 May 2024 17:22:13 +0800
From: Junxian Huang <huangjunxian6@...ilicon.com>
To: Jason Gunthorpe <jgg@...pe.ca>
CC: <leon@...nel.org>, <linux-rdma@...r.kernel.org>, <linuxarm@...wei.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH for-next] RDMA/hns: Support flexible WQE buffer page size
On 2024/4/30 21:41, Jason Gunthorpe wrote:
> On Tue, Apr 30, 2024 at 05:28:45PM +0800, Junxian Huang wrote:
>> From: Chengchang Tang <tangchengchang@...wei.com>
>>
>> Currently, driver fixedly allocates 4K pages for userspace WQE buffer
>> and results in HW reading WQE with a granularity of 4K even in a 64K
>> system. HW has to switch pages every 4K, leading to a loss of performance.
>
>> In order to improve performance, add support for userspace to allocate
>> flexible WQE buffer page size between 4K to system PAGESIZE.
>> @@ -90,7 +90,8 @@ struct hns_roce_ib_create_qp {
>> __u8 log_sq_bb_count;
>> __u8 log_sq_stride;
>> __u8 sq_no_prefetch;
>> - __u8 reserved[5];
>> + __u8 pageshift;
>> + __u8 reserved[4];
>
> It doesn't make any sense to pass in a pageshift from userspace.
>
> Kernel should detect whatever underlying physical contiguity userspace
> has been able to create and configure the hardware optimally. The umem
> already has all the tools to do this trivially.
>
> Why would you need to specify anything?
>
> Jason
Hi Jason. Sorry for the late response.
The WQE buffer of hns HW actually consists of 3 regions: SQ WQE, RQ WQE and
ext SGE. Userspace and the kernel driver both compute the buffer size and
the start offset of each of these 3 regions based on the page shift. The
kernel needs to obtain the page shift from userspace so that both sides
agree on the buffer size and start offsets, which avoids invalid memory access.
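To illustrate the dependency, here is a minimal standalone sketch, not the driver code. It assumes each region is rounded up to the buffer page size and that the regions are laid out back to back; the struct, the helper names and the region order are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout descriptor; not the driver's real structure. */
struct wqe_buf_layout {
	uint32_t sq_offset;   /* start of the SQ WQE region */
	uint32_t sge_offset;  /* start of the ext SGE region */
	uint32_t rq_offset;   /* start of the RQ WQE region */
	uint32_t total_size;  /* size of the whole WQE buffer */
};

/* Round size up to a multiple of (1 << page_shift). */
static uint32_t align_up(uint32_t size, unsigned int page_shift)
{
	uint32_t page_size;

	assert(page_shift < 32);
	page_size = 1u << page_shift;
	return (size + page_size - 1) & ~(page_size - 1);
}

/*
 * Assumed layout: SQ WQE, then ext SGE, then RQ WQE, each region
 * starting on a page boundary. The order is an assumption made for
 * this sketch only.
 */
static struct wqe_buf_layout compute_layout(uint32_t sq_bytes,
					    uint32_t sge_bytes,
					    uint32_t rq_bytes,
					    unsigned int page_shift)
{
	struct wqe_buf_layout l;

	l.sq_offset = 0;
	l.sge_offset = l.sq_offset + align_up(sq_bytes, page_shift);
	l.rq_offset = l.sge_offset + align_up(sge_bytes, page_shift);
	l.total_size = l.rq_offset + align_up(rq_bytes, page_shift);
	return l;
}
```

With region sizes of 8192, 1024 and 2048 bytes, a page shift of 12 gives offsets 0, 8192 and 12288 and a total of 16384 bytes, while a page shift of 16 gives offsets 0, 65536 and 131072 and a total of 196608 bytes. If the two sides used different page shifts, they would disagree on where each region starts.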
By the "tools of umem" I assume you mean ib_umem_find_best_pgsz(). This API
is not guaranteed to return the same page size that userspace used, and in
that case the kernel cannot determine the start offsets of the 3 regions in
the userspace buffer.
Junxian