linux-kernel - Re: [PATCH rdma-next] RDMA/rdmavt: Decouple QP and SGE lists allocations

Message-ID: <YKdTOEeC55X+SZl+@unreal>
Date:   Fri, 21 May 2021 09:29:12 +0300
From:   Leon Romanovsky <leon@...nel.org>
To:     Dennis Dalessandro <dennis.dalessandro@...nelisnetworks.com>
Cc:     Jason Gunthorpe <jgg@...dia.com>,
        "Marciniszyn, Mike" <mike.marciniszyn@...nelisnetworks.com>,
        Doug Ledford <dledford@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH rdma-next] RDMA/rdmavt: Decouple QP and SGE lists
 allocations

On Thu, May 20, 2021 at 06:02:09PM -0400, Dennis Dalessandro wrote:
> On 5/19/21 4:26 PM, Jason Gunthorpe wrote:
> > On Wed, May 19, 2021 at 03:49:31PM -0400, Dennis Dalessandro wrote:
> > > On 5/19/21 2:29 PM, Jason Gunthorpe wrote:
> > > > On Wed, May 19, 2021 at 07:56:32AM -0400, Dennis Dalessandro wrote:

<...>

> > Especially since for RDMA all of the above is highly situational. The
> > IRQ/WQ processing anything in RDMA should be tied to the comp_vector,
> > so without knowing that information you simply can't do anything
> > correct at allocation time.
> 
> I don't think that's true for our case. The comp_vector may in some cases be
> the right thing to dictate where memory should be, but in our case I don't
> think that's true all the time.

In the verbs world, the comp_vector is always the right thing to dictate
node policy. We can argue about whether it works correctly or not.

https://www.rdmamojo.com/2012/11/03/ibv_create_cq/
comp_vector:
 MSI-X completion vector that will be used for signaling Completion events.
 If the IRQ affinity masks of these interrupts have been configured to spread
 each MSI-X interrupt to be handled by a different core, this parameter can be
 used to spread the completion workload over multiple cores.
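
As a rough illustration of the description above, here is a minimal sketch of
picking a comp_vector from the CPU that will consume the completions. The
helper name pick_comp_vector and the "vector i is handled by CPU i" layout are
assumptions made for this example only; the real mapping depends on how the
IRQ affinity masks are configured on the system.

```c
/*
 * Hypothetical helper (not part of libibverbs or rdmavt): choose a
 * completion vector for a CQ from the CPU that will process its
 * completions.
 *
 * Assumption: the MSI-X IRQ affinity masks were set up so that
 * vector i is handled by CPU i. With a different affinity layout
 * this simple modulo mapping does not give locality.
 */
int pick_comp_vector(int cpu, int num_comp_vectors)
{
    /* Fall back to vector 0 on nonsensical input. */
    if (cpu < 0 || num_comp_vectors <= 0)
        return 0;

    /* Valid vectors are 0 .. num_comp_vectors - 1. */
    return cpu % num_comp_vectors;
}

/*
 * Intended use with libibverbs (sketch, needs an opened device):
 *
 *   int vec = pick_comp_vector(cpu, ctx->num_comp_vectors);
 *   struct ibv_cq *cq = ibv_create_cq(ctx, cqe, NULL, channel, vec);
 */
```

The point of the sketch is only that the vector, and with it the CPU and NUMA
node that handle completions, is chosen by the consumer at CQ creation time.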

> 
> > The idea of allocating everything to the HW's node is simply not correct
> > design. I will grant you it may have made sense ages ago before the
> > NUMA stuff was more complete, but today it does not and you'd be
> > better to remove it all and use memory policy properly than insist we
> > keep it around forever.
> 
> Not insisting on anything. If the trend is to remove this sort of allocation
> and other drivers are no longer doing this "not correct design" we are
> certainly open to change. We just want to understand the impact first rather
> than being strong-armed into accepting a performance regression just so Leon
> can refactor some code.

It is hard to talk without data.

Thanks

> 
> -Denny