Message-ID: <20230412165816.GB182481@unreal>
Date: Wed, 12 Apr 2023 19:58:16 +0300
From: Leon Romanovsky <leon@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Brett Creeley <bcreeley@....com>,
Brett Creeley <brett.creeley@....com>, davem@...emloft.net,
netdev@...r.kernel.org, drivers@...sando.io,
shannon.nelson@....com, neel.patel@....com
Subject: Re: [PATCH net] ionic: Fix allocation of q/cq info structures from
device local node
On Tue, Apr 11, 2023 at 12:49:45PM -0700, Jakub Kicinski wrote:
> On Tue, 11 Apr 2023 15:47:04 +0300 Leon Romanovsky wrote:
> > > We want to allocate memory from the node local to our PCI device, which is
> > > not necessarily the same as the node that the thread is running on where
> > > vzalloc() first tries to alloc.
> >
> > I'm not sure about it as you are running kernel thread which is
> > triggered directly by device and most likely will run on same node as
> > PCI device.
>
> Isn't that true only for bus-side probing?
> If you bind/unbind via sysfs does it still try to move to the right
> node? Same for resources allocated during ifup?

Kernel threads are the more interesting case, as they are not controlled
through mempolicy (maybe that is no longer true in 2023, I'm not sure).
User-triggered threads are subject to mempolicy, and all allocations
are expected to follow it. So users who want specific memory behaviour
should use it.

https://docs.kernel.org/6.1/admin-guide/mm/numa_memory_policy.html

There is a huge chance that the fallback mechanisms proposed here for ionic
and implemented in ENA "break" this interface.
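
For reference, the fallback pattern in question is roughly the following.
This is only a userspace sketch, not the ionic or ENA code:
fake_vzalloc_node(), fake_vzalloc(), alloc_near_device() and the
node_has_memory[] table are hypothetical stand-ins, and the node-specific
allocator is modelled as strict (no fallback), as assumed in this thread.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of two NUMA nodes; node 1 has no free memory. */
#define NR_NODES 2
static int node_has_memory[NR_NODES] = { 1, 0 };

/* Stand-in for vzalloc_node(): zeroed memory from one node only,
 * NULL if that node cannot satisfy the request (assumed strict). */
static void *fake_vzalloc_node(size_t size, int node)
{
	if (node < 0 || node >= NR_NODES || !node_has_memory[node])
		return NULL;
	return calloc(1, size);
}

/* Stand-in for vzalloc(): zeroed memory from wherever it is available. */
static void *fake_vzalloc(size_t size)
{
	return calloc(1, size);
}

/* The pattern under discussion: prefer the device's node, then retry
 * without a node preference. The retry does not consult any policy,
 * which is the concern raised above. */
static void *alloc_near_device(size_t size, int dev_node)
{
	void *p = fake_vzalloc_node(size, dev_node);

	if (!p)
		p = fake_vzalloc(size);
	return p;
}
```

With this model, alloc_near_device(64, 1) still returns memory even though
node 1 is exhausted, because the second call ignores the requested node.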
>
> > > Since it wasn't clear to us that vzalloc_node() does any fallback,
> >
> > vzalloc_node() doesn't do fallback, but vzalloc will find the right node
> > for you.
>
> Sounds like we may want a vzalloc_node_with_fallback or some GFP flag?
> All the _node() helpers which don't fall back lead to unpleasant code
> in the users.

I would challenge the whole idea of having *_node() allocations in
driver code in the first place. Even in RDMA, where we are super focused
on performance and allocating memory in the right place is super
critical, we rely on general kzalloc().

There is one exception in the RDMA world (hfi1), but it is more because of
a legacy implementation than because of a specific need; at least the Intel
folks didn't succeed in convincing me with real data.
>
> > > we followed the example in the ena driver to follow up with a more
> > > generic vzalloc() request.
> >
> > I don't know about ENA implementation, maybe they have right reasons to
> > do it, but maybe they don't.
> >
> > >
> > > Also, the custom message helps us quickly figure out exactly which
> > > allocation failed.
> >
> > If OOM is missing some info to help debug allocation failures, let's add
> > it there, but please do not add any custom prints after alloc failures.
>
> +1