Message-ID: <76266a89-8ec1-6a4c-716b-da422f0b2cd5@huawei.com>
Date: Thu, 7 Mar 2024 20:15:26 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Mina Almasry <almasrymina@...gle.com>
CC: <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-doc@...r.kernel.org>, <linux-alpha@...r.kernel.org>,
<linux-mips@...r.kernel.org>, <linux-parisc@...r.kernel.org>,
<sparclinux@...r.kernel.org>, <linux-trace-kernel@...r.kernel.org>,
<linux-arch@...r.kernel.org>, <bpf@...r.kernel.org>,
<linux-kselftest@...r.kernel.org>, <linux-media@...r.kernel.org>,
<dri-devel@...ts.freedesktop.org>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo
Abeni <pabeni@...hat.com>, Jonathan Corbet <corbet@....net>, Richard
Henderson <richard.henderson@...aro.org>, Ivan Kokshaysky
<ink@...assic.park.msu.ru>, Matt Turner <mattst88@...il.com>, Thomas
Bogendoerfer <tsbogend@...ha.franken.de>, "James E.J. Bottomley"
<James.Bottomley@...senpartnership.com>, Helge Deller <deller@....de>,
Andreas Larsson <andreas@...sler.com>, Jesper Dangaard Brouer
<hawk@...nel.org>, Ilias Apalodimas <ilias.apalodimas@...aro.org>, Steven
Rostedt <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Arnd Bergmann
<arnd@...db.de>, Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann
<daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
<martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>, Song Liu
<song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>, John Fastabend
<john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>, Stanislav Fomichev
<sdf@...gle.com>, Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
David Ahern <dsahern@...nel.org>, Willem de Bruijn
<willemdebruijn.kernel@...il.com>, Shuah Khan <shuah@...nel.org>, Sumit
Semwal <sumit.semwal@...aro.org>, Christian König
<christian.koenig@....com>, Pavel Begunkov <asml.silence@...il.com>, David
Wei <dw@...idwei.uk>, Jason Gunthorpe <jgg@...pe.ca>, Shailend Chand
<shailend@...gle.com>, Harshitha Ramamurthy <hramamurthy@...gle.com>, Shakeel
Butt <shakeelb@...gle.com>, Jeroen de Borst <jeroendb@...gle.com>, Praveen
Kaligineedi <pkaligineedi@...gle.com>, Willem de Bruijn <willemb@...gle.com>,
Kaiyuan Zhang <kaiyuanz@...gle.com>
Subject: Re: [RFC PATCH net-next v6 05/15] netdev: support binding dma-buf to
netdevice
On 2024/3/7 6:10, Mina Almasry wrote:
..
>>>>> +static int netdev_restart_rx_queue(struct net_device *dev, int rxq_idx)
>>>>> +{
>>>>> +        void *new_mem;
>>>>> +        void *old_mem;
>>>>> +        int err;
>>>>> +
>>>>> +        if (!dev || !dev->netdev_ops)
>>>>> +                return -EINVAL;
>>>>> +
>>>>> +        if (!dev->netdev_ops->ndo_queue_stop ||
>>>>> +            !dev->netdev_ops->ndo_queue_mem_free ||
>>>>> +            !dev->netdev_ops->ndo_queue_mem_alloc ||
>>>>> +            !dev->netdev_ops->ndo_queue_start)
>>>>> +                return -EOPNOTSUPP;
>>>>> +
>>>>> +        new_mem = dev->netdev_ops->ndo_queue_mem_alloc(dev, rxq_idx);
>>>>> +        if (!new_mem)
>>>>> +                return -ENOMEM;
>>>>> +
>>>>> +        err = dev->netdev_ops->ndo_queue_stop(dev, rxq_idx, &old_mem);
>>>>> +        if (err)
>>>>> +                goto err_free_new_mem;
>>>>> +
>>>>> +        err = dev->netdev_ops->ndo_queue_start(dev, rxq_idx, new_mem);
>>>>> +        if (err)
>>>>> +                goto err_start_queue;
>>>>> +
>>>>> +        dev->netdev_ops->ndo_queue_mem_free(dev, old_mem);
>>>>> +
>>>>> +        return 0;
>>>>> +
>>>>> +err_start_queue:
>>>>> +        dev->netdev_ops->ndo_queue_start(dev, rxq_idx, old_mem);
>>>>
>>>> It might be worth mentioning why starting the queue with old_mem
>>>> will always succeed here, as the return value seems to be ignored.
>>>>
>>>
>>> We stopped the old queue, and if we fail to bring up the new queue,
>>> we want to start the old queue back up to get it back into a
>>> workable state.
>>>
>>> I don't see what we can do to recover if restarting the old queue
>>> fails. It seems like it should be a requirement that the driver keeps
>>> the old queue restartable as far as possible.
>>
>> Is it possible that 'old_mem' leaks if the driver fails to restart
>> the old queue? How does the driver handle a firmware command failure
>> in ndo_queue_start()? It seems a little tricky to implement.
>>
>
> I'm not sure what we can do to meaningfully recover from a failure to
> restart the old queue, except log it so the error is visible. In
> theory, because we have not modified any queue configuration,
> restarting it should be straightforward, but since it's dealing with
> hardware, any failure is possible.
Yes, we may need clear semantics for how the driver should implement
the above interface, for example whether the driver should free the
memory when it fails to start a queue, or whether it should restart
the queue when it fails to stop one. Otherwise we may end up with
different drivers implementing different behavior.
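
To make this concrete, below is a rough, untested sketch of what a
spelled-out contract might look like in the core helper. The
netdev_WARN() on the fallback path and the comment wording are my own
additions, not something from the patch:

static int netdev_restart_rx_queue(struct net_device *dev, int rxq_idx)
{
        const struct net_device_ops *ops = dev->netdev_ops;
        void *new_mem, *old_mem;
        int err;

        /* NULL/capability checks from the patch omitted for brevity */
        new_mem = ops->ndo_queue_mem_alloc(dev, rxq_idx);
        if (!new_mem)
                return -ENOMEM;

        err = ops->ndo_queue_stop(dev, rxq_idx, &old_mem);
        if (err)
                goto err_free_new_mem;

        err = ops->ndo_queue_start(dev, rxq_idx, new_mem);
        if (err)
                goto err_start_old;

        /* the old memory is only freed once the new queue is up */
        ops->ndo_queue_mem_free(dev, old_mem);
        return 0;

err_start_old:
        /* Contract: starting a stopped queue with the memory it was
         * stopped with must not fail; if it does anyway, the queue is
         * unusable and old_mem is leaked, so at least make it visible.
         */
        if (ops->ndo_queue_start(dev, rxq_idx, old_mem))
                netdev_WARN(dev, "failed to restart rx queue %d\n",
                            rxq_idx);
err_free_new_mem:
        ops->ndo_queue_mem_free(dev, new_mem);
        return err;
}
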
From the discussion you mentioned below, does it make sense to model
this on the rdma subsystem by using create_queue/modify_queue/destroy_queue
semantics instead?
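
Something along these lines, purely hypothetical (the struct and every
name in it are made up for illustration, this is not an existing API):

/* rdma-style queue objects: created, modified and destroyed
 * individually instead of stop/start with opaque memory blobs.
 */
struct netdev_queue_cfg;

struct netdev_queue_obj_ops {
        /* allocate and bring up a queue described by cfg */
        void *  (*ndo_queue_create)(struct net_device *dev, int idx,
                                    const struct netdev_queue_cfg *cfg);
        /* change attributes (e.g. the memory provider) of a live queue */
        int     (*ndo_queue_modify)(struct net_device *dev, void *queue,
                                    const struct netdev_queue_cfg *cfg);
        /* tear the queue down and release its resources */
        void    (*ndo_queue_destroy)(struct net_device *dev, void *queue);
};
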
>
>>>
>>> I can improve this by at least logging or warning if restarting the
>>> old queue fails.
>>
>> Also, the semantics of the above function seem odd: it is not only
>> restarting the rx queue, but also freeing and allocating memory, even
>> though the name only suggests 'restart'. I am a little afraid that it
>> may conflict with a future use case where the user only needs the
>> 'restart' part; perhaps rename it to something more appropriate.
>>
>
> Oh, what we want here is just the 'restart' part. However, Jakub
> mandates that if you restart a queue (or a driver), you do it like
> this, hence the slightly more complicated implementation.
>
> https://patchwork.kernel.org/project/netdevbpf/patch/20231106024413.2801438-13-almasrymina@google.com/#25590262
> https://lore.kernel.org/netdev/20230815171638.4c057dcd@kernel.org/
Thanks for the links.
I like David's idea of "a more generic design where H/W queues are created
and destroyed - e.g., queues unique to a process which makes the cleanup
so much easier", but it seems like a lot of work for networking to
implement that for now.
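
Going back to the naming point above, for illustration only (nothing
below is from the patch, and both names are made up), the memory swap
could be split out so that a future user who only needs 'restart' does
not drag in the alloc/free:

/* Plain restart: stop and start the queue with the memory it already
 * owns; no allocation or freeing involved.  A separate, more honestly
 * named helper, say netdev_rx_queue_swap_mem(), could then carry the
 * alloc/stop/start/free sequence from the patch for the dma-buf case.
 */
static int netdev_restart_rx_queue(struct net_device *dev, int rxq_idx)
{
        const struct net_device_ops *ops = dev->netdev_ops;
        void *mem;
        int err;

        err = ops->ndo_queue_stop(dev, rxq_idx, &mem);
        if (err)
                return err;

        return ops->ndo_queue_start(dev, rxq_idx, mem);
}
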
>