[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7284da8c-9993-76c4-b495-32c814607a4b@cornelisnetworks.com>
Date: Wed, 12 May 2021 08:45:45 -0400
From: Dennis Dalessandro <dennis.dalessandro@...nelisnetworks.com>
To: Leon Romanovsky <leon@...nel.org>
Cc: "Marciniszyn, Mike" <mike.marciniszyn@...nelisnetworks.com>,
Haakon Bugge <haakon.bugge@...cle.com>,
Doug Ledford <dledford@...hat.com>,
Jason Gunthorpe <jgg@...dia.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
OFED mailing list <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH rdma-next] RDMA/rdmavt: Decouple QP and SGE lists
allocations
On 5/12/21 8:13 AM, Leon Romanovsky wrote:
> On Wed, May 12, 2021 at 12:08:59AM -0400, Dennis Dalessandro wrote:
>>
>> On 5/11/21 3:27 PM, Leon Romanovsky wrote:
>>> On Tue, May 11, 2021 at 07:15:09PM +0000, Marciniszyn, Mike wrote:
>>>>>>
>>>>>> Why not kzalloc_node() here?
>>>>
>>>> I agree here.
>>>>
>>>> Other allocations that have been promoted to the core have lost the node attribute in the allocation.
>>>
>>> Did you notice any performance degradation?
>>>
>>
>> So what's the motivation to change it from the way it was originally
>> designed? It seems to me if the original implementation went to the trouble
>> to allocate the memory on the local node, refactoring the code should
>> respect that.
>
> I have no problem to make rdma_zalloc_*() node aware, but would like to get
> real performance justification. My assumption is that rdmavt use kzalloc_node
> for the control plane based on some internal performance testing and we finally
> can see the difference between kzalloc and kzalloc_node, am I right?
>
> Is the claim of performance degradation backed by data?
Yes, in the past. I don't have access anymore now that I'm not with
Intel. It probably would not have been publishable anyway.
> The main reason (maybe I'm wrong here) is to avoid _node() allocators
> because they increase chances of memory allocation failure due to not
> doing fallback in case node memory is depleted.
Agreed. It's a trade-off that was deemed acceptable.
> Again, I'm suggesting to do plain kzalloc() for control part of QP.
Now I don't recall data for that specifically, but to be on the safe
side I would not want to risk a performance regression.
-Denny
Powered by blists - more mailing lists