Message-ID: <CAHS8izOaxD6qLHNwuSBSZK1gx7OpW1mY3kwT=H-i9h6Ycap-_Q@mail.gmail.com>
Date: Sat, 19 Aug 2023 10:59:48 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>,
Praveen Kaligineedi <pkaligineedi@...gle.com>, netdev@...r.kernel.org,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Jesper Dangaard Brouer <hawk@...nel.org>, Ilias Apalodimas <ilias.apalodimas@...aro.org>,
Magnus Karlsson <magnus.karlsson@...el.com>, sdf@...gle.com,
Willem de Bruijn <willemb@...gle.com>, Kaiyuan Zhang <kaiyuanz@...gle.com>
Subject: Re: [RFC PATCH v2 02/11] netdev: implement netlink api to bind
dma-buf to netdevice
On Sat, Aug 19, 2023 at 7:19 AM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> On Fri, Aug 18, 2023 at 11:30 PM David Ahern <dsahern@...nel.org> wrote:
> >
> > On 8/18/23 8:06 PM, Jakub Kicinski wrote:
> > > On Fri, 18 Aug 2023 19:34:32 -0600 David Ahern wrote:
> > >> On 8/18/23 3:52 PM, Mina Almasry wrote:
> > >>> The sticking points are:
> > >>> 1. From David: this proposal doesn't give an application the ability
> > >>> to flush an rx queue, which means that we have to rely on a driver
> > >>> reset that affects all queues to refill the rx queue buffers.
> > >>
> > >> Generically, the design needs to be able to flush (or invalidate) all
> > >> references to the dma-buf once the process no longer "owns" it.
> > >
> > > Are we talking about the ability for the app to flush the queue
> > > when it wants to (do no idea what)? Or auto-flush when app crashes?
> >
> > If a buffer reference can be invalidated such that a posted buffer is
> > ignored by H/W, then no flush is needed per se. Either way the key point
> > is that posted buffers can no longer be filled by H/W once a process no
> > longer owns the dma-buf reference. I believe the actual mechanism here
> > will vary by H/W.
>
> Right. Many devices only allow bringing all queues down at the same time.
>

FWIW, I spoke with our Praveen (GVE maintainer) about this. The suspicion
is that bringing up/down individual queues _should_ work with GVE for
the most part, but that's pending me trying it and confirming.

I think if a driver can't support bringing up/down individual queues,
then none of Jakub's proposed per-queue configs (queue_mem_alloc,
queue_mem_free, queue_start, queue_stop) can be implemented on that
driver, and David's concern about the dma-buf being auto-detached when
the application crashes/exits cannot be addressed either. Such a driver
will not be able to support device memory TCP unless there is an option
to make it work with a full driver reset.
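
To make the capability question concrete, here is a rough userspace
sketch (not kernel code, and not from the patch series). Only the four
queue_* names come from this thread; the struct, the signatures, the
full_reset fallback and devmem_pick_mode() are all hypothetical:

```c
#include <stddef.h>

/*
 * Hypothetical per-queue ops table. Only the four queue_* names come
 * from this thread; signatures and full_reset are assumptions.
 */
struct rxq_ops {
	void *(*queue_mem_alloc)(int idx);
	void (*queue_mem_free)(int idx, void *mem);
	int (*queue_start)(int idx, void *mem);
	int (*queue_stop)(int idx);
	int (*full_reset)(void);	/* assumed whole-device fallback */
};

enum devmem_mode {
	DEVMEM_UNSUPPORTED,	/* neither per-queue ops nor a reset path */
	DEVMEM_PER_QUEUE,	/* all four per-queue ops are implemented */
	DEVMEM_FULL_RESET,	/* only a whole-device reset is available */
};

/* Pick the least disruptive mode the driver can actually support. */
enum devmem_mode devmem_pick_mode(const struct rxq_ops *ops)
{
	if (ops->queue_mem_alloc && ops->queue_mem_free &&
	    ops->queue_start && ops->queue_stop)
		return DEVMEM_PER_QUEUE;
	if (ops->full_reset)
		return DEVMEM_FULL_RESET;
	return DEVMEM_UNSUPPORTED;
}

/* Stub callbacks so the example drivers below have something to point at. */
static void *stub_alloc(int idx) { (void)idx; return NULL; }
static void stub_free(int idx, void *mem) { (void)idx; (void)mem; }
static int stub_start(int idx, void *mem) { (void)idx; (void)mem; return 0; }
static int stub_stop(int idx) { (void)idx; return 0; }
static int stub_reset(void) { return 0; }

/* A driver that can bring individual queues up and down. */
const struct rxq_ops per_queue_drv = {
	.queue_mem_alloc = stub_alloc,
	.queue_mem_free = stub_free,
	.queue_start = stub_start,
	.queue_stop = stub_stop,
	.full_reset = stub_reset,
};

/* A driver that can only bring all queues down at the same time. */
const struct rxq_ops reset_only_drv = {
	.full_reset = stub_reset,
};

/* A driver with neither capability. */
const struct rxq_ops no_support_drv = { 0 };
```

With something like this, a driver missing any one of the four
callbacks falls back to the full-reset mode rather than being rejected
outright.
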
> Once a descriptor is posted and the ring head is written, there is no
> way to retract that. Since waiting for the device to catch up is not
> acceptable, the only option is to bring down the queue, right? Which
> will imply bringing down the entire device on many devices. Not ideal,
> but acceptable short term, imho.
>

I also wonder if it may be acceptable to have both modes supported.
I.e. (roughly):

1. Add APIs that create an rx-queue bound to a dma-buf.
2. Add APIs that bind an existing rx-queue to a dma-buf.

Drivers that support per-queue allocation/freeing can support and use
#1 and can work as David likes. Drivers that cannot allocate or bring
up individual queues can only support #2, and have to trigger a driver
reset to refill or release the dma-buf references.

This patch series already implements #2.
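
As a toy illustration of the difference in blast radius between the
two modes (a userspace model only; struct toy_netdev, NUM_RXQ and
toy_unbind_dmabuf() are made-up names, not anything in the series):

```c
#include <stdbool.h>

#define NUM_RXQ 4	/* arbitrary queue count for the toy model */

/* Toy model of a netdev; every field here is hypothetical. */
struct toy_netdev {
	bool per_queue_capable;		/* can restart a single rx queue */
	int dmabuf_bound[NUM_RXQ];	/* 1 if queue i is bound to a dma-buf */
	int restarts[NUM_RXQ];		/* times queue i has been restarted */
};

/*
 * Drop queue q's dma-buf binding and restart whatever is needed so the
 * device no longer holds posted dma-buf buffers.
 * Returns the number of queues disrupted, or -1 if q is not bound.
 */
int toy_unbind_dmabuf(struct toy_netdev *dev, int q)
{
	int i;

	if (q < 0 || q >= NUM_RXQ || !dev->dmabuf_bound[q])
		return -1;

	dev->dmabuf_bound[q] = 0;

	if (dev->per_queue_capable) {
		/* Per-queue capable: only the affected queue is restarted. */
		dev->restarts[q]++;
		return 1;
	}

	/* Reset-only: a driver reset takes every queue down. */
	for (i = 0; i < NUM_RXQ; i++)
		dev->restarts[i]++;
	return NUM_RXQ;
}
```

On a per-queue capable device the unbind disturbs one queue; on a
reset-only device the same unbind disturbs all NUM_RXQ queues.
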
> That may be an incentive for vendors to support per-queue
> start/stop/alloc/free. Maybe the ones that support RDMA already do?
--
Thanks,
Mina