Message-ID: <ZrygE7SjwPpdWM5G@nanopsycho.orion>
Date: Wed, 14 Aug 2024 14:16:19 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Marek Szyprowski <m.szyprowski@...sung.com>, netdev@...r.kernel.org,
	davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
	pabeni@...hat.com, jasowang@...hat.com, xuanzhuo@...ux.alibaba.com,
	virtualization@...ts.linux.dev, ast@...nel.org,
	daniel@...earbox.net, hawk@...nel.org, john.fastabend@...il.com,
	dave.taht@...il.com, kerneljasonxing@...il.com,
	hengqi@...ux.alibaba.com
Subject: Re: [PATCH net-next v3] virtio_net: add support for Byte Queue Limits

Wed, Aug 14, 2024 at 11:43:51AM CEST, mst@...hat.com wrote:
>On Wed, Aug 14, 2024 at 10:17:15AM +0200, Jiri Pirko wrote:
>> >diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> >index 3f10c72743e9..c6af18948092 100644
>> >--- a/drivers/net/virtio_net.c
>> >+++ b/drivers/net/virtio_net.c
>> >@@ -2867,8 +2867,8 @@ static int virtnet_enable_queue_pair(struct virtnet_info *vi, int qp_index)
>> > 	if (err < 0)
>> > 		goto err_xdp_reg_mem_model;
>> > 
>> >-	virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi);
>> > 	netdev_tx_reset_queue(netdev_get_tx_queue(vi->dev, qp_index));
>> >+	virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi);
>> > 	virtnet_napi_tx_enable(vi, vi->sq[qp_index].vq, &vi->sq[qp_index].napi);
>> 
>> Hmm, I have to look at this a bit more. I think this might be accidental
>> fix. The thing is, napi can be triggered even if it is disabled:
>> 
>>        ->__local_bh_enable_ip()
>>          -> net_rx_action()
>>            -> __napi_poll()
>> 
>> Here __napi_poll() calls napi_is_scheduled() and calls virtnet_poll_tx()
>> in case napi is scheduled. napi_is_scheduled() checks NAPI_STATE_SCHED
>> bit in napi state.
>> 
>> However, this bit is set previously by netif_napi_add_weight().
>
>It's actually set in napi_disable too, isn't it?

Yes, in both.

I actually found exactly what the issue is.

After virtnet_napi_enable() is called, the following path is hit:
  __napi_poll()
    -> virtnet_poll()
      -> virtnet_poll_cleantx()
        -> netif_tx_wake_queue()

That wakes the TX queue and allows skbs to be submitted and accounted by
BQL counters.

Then netdev_tx_reset_queue() is called, which resets the BQL counters while
those skbs are still in flight. Their later completion eventually leads to
the BUG in dql_completed().

That's why moving virtnet_napi_enable() after netdev_tx_reset_queue() helped.
Will submit.
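
The ordering problem described above can be sketched in user space. This is
only a toy model, not the kernel code: struct toy_dql, the toy_*() helpers and
the 1500-byte skb length are made up for illustration, and the only thing
borrowed from the real dql_completed() is the check that a completion must not
exceed the bytes still accounted as queued.

```c
#include <stdbool.h>

/* Toy stand-in for the two BQL counters involved; not the kernel's struct dql. */
struct toy_dql {
	unsigned int num_queued;    /* bytes accounted as handed to the device */
	unsigned int num_completed; /* bytes accounted as completed */
};

/* Models the accounting done when an skb is submitted on the TX queue. */
static void toy_queued(struct toy_dql *d, unsigned int bytes)
{
	d->num_queued += bytes;
}

/* Models the counter reset performed via netdev_tx_reset_queue(). */
static void toy_reset(struct toy_dql *d)
{
	d->num_queued = 0;
	d->num_completed = 0;
}

/* Returns false where the real dql_completed() would hit its BUG_ON(). */
static bool toy_completed(struct toy_dql *d, unsigned int bytes)
{
	if (bytes > d->num_queued - d->num_completed)
		return false;
	d->num_completed += bytes;
	return true;
}

/*
 * Replays the enable sequence with either ordering (hypothetical helper).
 * reset_before_rx_napi == false: old order, RX NAPI enabled before the reset.
 * reset_before_rx_napi == true:  order after the move in the diff.
 */
bool toy_enable_queue_pair(bool reset_before_rx_napi)
{
	struct toy_dql d = { 0, 0 };
	const unsigned int skb_len = 1500; /* arbitrary example size */

	if (reset_before_rx_napi)
		toy_reset(&d);

	/* RX NAPI is enabled: its poll wakes the TX queue and an skb is queued. */
	toy_queued(&d, skb_len);

	if (!reset_before_rx_napi)
		toy_reset(&d); /* wipes the accounting of the in-flight skb */

	/* The TX completion for that skb arrives later. */
	return toy_completed(&d, skb_len);
}
```

In this model toy_enable_queue_pair(false) returns false, standing in for the
BUG, while toy_enable_queue_pair(true) returns true.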


