Message-ID: <76266a89-8ec1-6a4c-716b-da422f0b2cd5@huawei.com>
Date: Thu, 7 Mar 2024 20:15:26 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Mina Almasry <almasrymina@...gle.com>
CC: <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-doc@...r.kernel.org>, <linux-alpha@...r.kernel.org>,
<linux-mips@...r.kernel.org>, <linux-parisc@...r.kernel.org>,
<sparclinux@...r.kernel.org>, <linux-trace-kernel@...r.kernel.org>,
<linux-arch@...r.kernel.org>, <bpf@...r.kernel.org>,
<linux-kselftest@...r.kernel.org>, <linux-media@...r.kernel.org>,
<dri-devel@...ts.freedesktop.org>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo
Abeni <pabeni@...hat.com>, Jonathan Corbet <corbet@....net>, Richard
Henderson <richard.henderson@...aro.org>, Ivan Kokshaysky
<ink@...assic.park.msu.ru>, Matt Turner <mattst88@...il.com>, Thomas
Bogendoerfer <tsbogend@...ha.franken.de>, "James E.J. Bottomley"
<James.Bottomley@...senpartnership.com>, Helge Deller <deller@....de>,
Andreas Larsson <andreas@...sler.com>, Jesper Dangaard Brouer
<hawk@...nel.org>, Ilias Apalodimas <ilias.apalodimas@...aro.org>, Steven
Rostedt <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Arnd Bergmann
<arnd@...db.de>, Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann
<daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
<martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>, Song Liu
<song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>, John Fastabend
<john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>, Stanislav Fomichev
<sdf@...gle.com>, Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
David Ahern <dsahern@...nel.org>, Willem de Bruijn
<willemdebruijn.kernel@...il.com>, Shuah Khan <shuah@...nel.org>, Sumit
Semwal <sumit.semwal@...aro.org>, Christian König
<christian.koenig@....com>, Pavel Begunkov <asml.silence@...il.com>, David
Wei <dw@...idwei.uk>, Jason Gunthorpe <jgg@...pe.ca>, Shailend Chand
<shailend@...gle.com>, Harshitha Ramamurthy <hramamurthy@...gle.com>, Shakeel
Butt <shakeelb@...gle.com>, Jeroen de Borst <jeroendb@...gle.com>, Praveen
Kaligineedi <pkaligineedi@...gle.com>, Willem de Bruijn <willemb@...gle.com>,
Kaiyuan Zhang <kaiyuanz@...gle.com>
Subject: Re: [RFC PATCH net-next v6 05/15] netdev: support binding dma-buf to
netdevice
On 2024/3/7 6:10, Mina Almasry wrote:
..
>>>>> +static int netdev_restart_rx_queue(struct net_device *dev, int rxq_idx)
>>>>> +{
>>>>> +        void *new_mem;
>>>>> +        void *old_mem;
>>>>> +        int err;
>>>>> +
>>>>> +        if (!dev || !dev->netdev_ops)
>>>>> +                return -EINVAL;
>>>>> +
>>>>> +        if (!dev->netdev_ops->ndo_queue_stop ||
>>>>> +            !dev->netdev_ops->ndo_queue_mem_free ||
>>>>> +            !dev->netdev_ops->ndo_queue_mem_alloc ||
>>>>> +            !dev->netdev_ops->ndo_queue_start)
>>>>> +                return -EOPNOTSUPP;
>>>>> +
>>>>> +        new_mem = dev->netdev_ops->ndo_queue_mem_alloc(dev, rxq_idx);
>>>>> +        if (!new_mem)
>>>>> +                return -ENOMEM;
>>>>> +
>>>>> +        err = dev->netdev_ops->ndo_queue_stop(dev, rxq_idx, &old_mem);
>>>>> +        if (err)
>>>>> +                goto err_free_new_mem;
>>>>> +
>>>>> +        err = dev->netdev_ops->ndo_queue_start(dev, rxq_idx, new_mem);
>>>>> +        if (err)
>>>>> +                goto err_start_queue;
>>>>> +
>>>>> +        dev->netdev_ops->ndo_queue_mem_free(dev, old_mem);
>>>>> +
>>>>> +        return 0;
>>>>> +
>>>>> +err_start_queue:
>>>>> +        dev->netdev_ops->ndo_queue_start(dev, rxq_idx, old_mem);
>>>>
>>>> It might be worth mentioning why starting the queue with old_mem
>>>> will always succeed here, as the return value seems to be ignored.
>>>>
>>>
>>> We stopped the old queue, and if we fail to bring up the new queue,
>>> we want to start the old queue back up to get it back into a
>>> workable state.
>>>
>>> I don't see what we can do to recover if restarting the old queue
>>> fails. It seems like it should be a requirement that the driver keeps
>>> the old queue restartable as far as possible.
>>
>> Is it possible that 'old_mem' leaks if the driver fails to restart
>> the old queue? How does the driver handle a firmware command failure
>> in ndo_queue_start()? It seems a little tricky to implement.
>>
>
> I'm not sure what we can do to meaningfully recover from a failure to
> restart the old queue, except log it so the error is visible. In
> theory, because we have not modified any queue configuration,
> restarting it should be straightforward, but since it's dealing with
> hardware, any failure is possible.
Yes, we may need clear semantics for how the driver should implement
the above interface, for example whether the driver should free the
memory when it fails to start a queue, or whether it should restart
the queue when it fails to stop one. Otherwise we may end up with
different drivers implementing different behavior.
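
To make this concrete, below is a rough, untested sketch of what a
spelled-out contract might look like in the core helper. The
netdev_WARN() on the fallback path and the comment wording are my own
additions, not something from the patch:

static int netdev_restart_rx_queue(struct net_device *dev, int rxq_idx)
{
        const struct net_device_ops *ops = dev->netdev_ops;
        void *new_mem, *old_mem;
        int err;

        /* NULL/capability checks from the patch omitted for brevity */
        new_mem = ops->ndo_queue_mem_alloc(dev, rxq_idx);
        if (!new_mem)
                return -ENOMEM;

        err = ops->ndo_queue_stop(dev, rxq_idx, &old_mem);
        if (err)
                goto err_free_new_mem;

        err = ops->ndo_queue_start(dev, rxq_idx, new_mem);
        if (err)
                goto err_start_old;

        /* the old memory is only freed once the new queue is up */
        ops->ndo_queue_mem_free(dev, old_mem);
        return 0;

err_start_old:
        /* Contract: starting a stopped queue with the memory it was
         * stopped with must not fail; if it does anyway, the queue is
         * unusable and old_mem is leaked, so at least make it visible.
         */
        if (ops->ndo_queue_start(dev, rxq_idx, old_mem))
                netdev_WARN(dev, "failed to restart rx queue %d\n",
                            rxq_idx);
err_free_new_mem:
        ops->ndo_queue_mem_free(dev, new_mem);
        return err;
}
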
From the discussion you mentioned below, does it make sense to model
this on the rdma subsystem by using create_queue/modify_queue/destroy_queue
semantics instead?
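
Something along these lines, purely hypothetical (the struct and every
name in it are made up for illustration, this is not an existing API):

/* rdma-style queue objects: created, modified and destroyed
 * individually instead of stop/start with opaque memory blobs.
 */
struct netdev_queue_cfg;

struct netdev_queue_obj_ops {
        /* allocate and bring up a queue described by cfg */
        void *  (*ndo_queue_create)(struct net_device *dev, int idx,
                                    const struct netdev_queue_cfg *cfg);
        /* change attributes (e.g. the memory provider) of a live queue */
        int     (*ndo_queue_modify)(struct net_device *dev, void *queue,
                                    const struct netdev_queue_cfg *cfg);
        /* tear the queue down and release its resources */
        void    (*ndo_queue_destroy)(struct net_device *dev, void *queue);
};
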
>
>>>
>>> I can improve this by at least logging or warning if restarting the
>>> old queue fails.
>>
>> Also, the semantics of the above function seem odd: it is not only
>> restarting the rx queue, but also freeing and allocating memory, even
>> though the name only suggests 'restart'. I am a little afraid that it
>> may conflict with a future use case where the user only needs the
>> 'restart' part; perhaps rename it to something more appropriate.
>>
>
> Oh, what we want here is just the 'restart' part. However, Jakub
> mandates that if you restart a queue (or a driver), you do it like
> this, hence the slightly more complicated implementation.
>
> https://patchwork.kernel.org/project/netdevbpf/patch/20231106024413.2801438-13-almasrymina@google.com/#25590262
> https://lore.kernel.org/netdev/20230815171638.4c057dcd@kernel.org/
Thanks for the links.
I like David's idea of "a more generic design where H/W queues are created
and destroyed - e.g., queues unique to a process which makes the cleanup
so much easier", but it seems like a lot of work for networking to
implement that for now.
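
Going back to the naming point above, for illustration only (nothing
below is from the patch, and both names are made up), the memory swap
could be split out so that a future user who only needs 'restart' does
not drag in the alloc/free:

/* Plain restart: stop and start the queue with the memory it already
 * owns; no allocation or freeing involved.  A separate, more honestly
 * named helper, say netdev_rx_queue_swap_mem(), could then carry the
 * alloc/stop/start/free sequence from the patch for the dma-buf case.
 */
static int netdev_restart_rx_queue(struct net_device *dev, int rxq_idx)
{
        const struct net_device_ops *ops = dev->netdev_ops;
        void *mem;
        int err;

        err = ops->ndo_queue_stop(dev, rxq_idx, &mem);
        if (err)
                return err;

        return ops->ndo_queue_start(dev, rxq_idx, mem);
}
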
>