Message-ID: <ZrygE7SjwPpdWM5G@nanopsycho.orion>
Date: Wed, 14 Aug 2024 14:16:19 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Marek Szyprowski <m.szyprowski@...sung.com>, netdev@...r.kernel.org,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, jasowang@...hat.com, xuanzhuo@...ux.alibaba.com,
virtualization@...ts.linux.dev, ast@...nel.org,
daniel@...earbox.net, hawk@...nel.org, john.fastabend@...il.com,
dave.taht@...il.com, kerneljasonxing@...il.com,
hengqi@...ux.alibaba.com
Subject: Re: [PATCH net-next v3] virtio_net: add support for Byte Queue Limits
Wed, Aug 14, 2024 at 11:43:51AM CEST, mst@...hat.com wrote:
>On Wed, Aug 14, 2024 at 10:17:15AM +0200, Jiri Pirko wrote:
>> >diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> >index 3f10c72743e9..c6af18948092 100644
>> >--- a/drivers/net/virtio_net.c
>> >+++ b/drivers/net/virtio_net.c
>> >@@ -2867,8 +2867,8 @@ static int virtnet_enable_queue_pair(struct virtnet_info *vi, int qp_index)
>> > if (err < 0)
>> > goto err_xdp_reg_mem_model;
>> >
>> >- virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi);
>> > netdev_tx_reset_queue(netdev_get_tx_queue(vi->dev, qp_index));
>> >+ virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi);
>> > virtnet_napi_tx_enable(vi, vi->sq[qp_index].vq, &vi->sq[qp_index].napi);
>>
>> Hmm, I have to look at this a bit more. I think this might be an
>> accidental fix. The thing is, napi can be triggered even if it is disabled:
>>
>> ->__local_bh_enable_ip()
>> -> net_rx_action()
>> -> __napi_poll()
>>
>> Here __napi_poll() checks napi_is_scheduled() and calls virtnet_poll_tx()
>> if the napi is scheduled. napi_is_scheduled() tests the NAPI_STATE_SCHED
>> bit in the napi state.
>>
>> However, this bit is set previously by netif_napi_add_weight().
>
>It's actually set in napi_disable too, isn't it?
Yes, in both.
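FWIW, both paths leave NAPI_STATE_SCHED set, so a disabled napi still
looks "scheduled". Roughly (paraphrased from net/core/dev.c, not
verbatim):

	/* netif_napi_add_weight(): a napi starts out disabled,
	 * with the SCHED bit already set */
	set_bit(NAPI_STATE_SCHED, &napi->state);

	/* napi_disable(): grabs the SCHED bit and keeps it until
	 * napi_enable() clears it again */
	while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
		msleep(1);
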
I actually found out exactly what the issue is.
After virtnet_napi_enable() is called, the following path is hit:
__napi_poll()
-> virtnet_poll()
-> virtnet_poll_cleantx()
-> netif_tx_wake_queue()
That wakes the TX queue and allows skbs to be submitted and accounted in
the BQL counters.
Then netdev_tx_reset_queue() is called, which resets the BQL counters and
eventually leads to the BUG in dql_completed().
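For reference, the check that fires is, roughly (paraphrased from
lib/dynamic_queue_limits.c):

	/* dql_completed(): can't complete more than what was queued */
	BUG_ON(count > num_queued - dql->num_completed);

Once the counters are zeroed, a completion for an skb that was queued
before the reset trips this check.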
That's why moving netdev_tx_reset_queue() before virtnet_napi_enable()
helped. Will submit.