Date: Mon, 26 Jun 2023 10:33:05 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Nick Child <nnac123@...ux.ibm.com>
Cc: netdev@...r.kernel.org, haren@...ux.ibm.com, ricklind@...ibm.com
Subject: Re: [PATCH net] ibmvnic: Do not reset dql stats on NON_FATAL err

On Mon, 26 Jun 2023 10:45:38 -0500 Nick Child wrote:
> On 6/24/23 17:19, Jakub Kicinski wrote:
> > On Thu, 22 Jun 2023 14:03:32 -0500 Nick Child wrote:  
>  [...]  
> > 
> > Why are you trying to clear this bit?
> > 
> > If the completions will still come the bit will be cleared (or not)
> > during completion handling (netdev_tx_completed_queue() et al.)
> > 
> > Drivers shouldn't be poking into queue state bits directly.  
> 
> Most likely, yes, there could be some bytes that will get a completion,
> which would clear this bit.
> 
> That being said, it is also possible that all bytes sent to the NIC
> have already completed, in which case we would not get a completion.
> The ibmvnic driver sends its skbs to the NIC in batches; it calls
> netdev_tx_sent_queue every time an skb is added to the batch, which is
> not necessarily every time the batch is sent to the NIC.

If packets can get stuck in the driver that needs a dedicated fix.
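
For reference, the usual BQL pairing looks roughly like this -- an
illustrative sketch only, not ibmvnic code, and the foo_* names are
made up:

#include <linux/netdevice.h>
#include <linux/skbuff.h>

static netdev_tx_t foo_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct netdev_queue *txq = netdev_get_tx_queue(dev,
						       skb_get_queue_mapping(skb));

	/* ... post the skb to the hardware queue ... */

	/* account sent bytes; DQL may set __QUEUE_STATE_STACK_XOFF here */
	netdev_tx_sent_queue(txq, skb->len);

	return NETDEV_TX_OK;
}

static void foo_tx_complete(struct net_device *dev, int qidx,
			    unsigned int pkts, unsigned int bytes)
{
	struct netdev_queue *txq = netdev_get_tx_queue(dev, qidx);

	/* account completed bytes; this is what clears STACK_XOFF and
	 * restarts the queue once enough bytes have drained
	 */
	netdev_tx_completed_queue(txq, pkts, bytes);
}

As long as every netdev_tx_sent_queue() call is eventually matched by a
netdev_tx_completed_queue() call, the flag takes care of itself.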

> I am not sure what number of bytes causes dql to set
> __QUEUE_STATE_STACK_XOFF, but I do know that it is possible for up to
> 15 skbs to be sitting in the queue's batch. If there are no
> outstanding bytes on the NIC (i.e. not expecting a tx completion) and
> the internal batch has 15 references per queue, is this enough to
> trigger dql and set __QUEUE_STATE_STACK_XOFF? If so, then we must
> clear __QUEUE_STATE_STACK_XOFF when resetting.

You should only be doing the batching if xmit_more is set, really.
And xmit_more will not be set if the core is about to set
__QUEUE_STATE_STACK_XOFF.

With a correctly written driver, STACK_XOFF can only be set if we're
expecting a completion.
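
Something along these lines -- again only a sketch, assuming the driver
keeps its own batch; the foo_* helpers are hypothetical -- guarantees
the batch is flushed whenever the stack stops feeding us packets:

static netdev_tx_t foo_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct netdev_queue *txq = netdev_get_tx_queue(dev,
						       skb_get_queue_mapping(skb));

	foo_add_to_batch(dev, skb);		/* hypothetical batching helper */
	netdev_tx_sent_queue(txq, skb->len);

	/* Hold the doorbell only while the stack promises more packets.
	 * When xmit_more is not set -- including the case where the core is
	 * about to stop the queue -- flush the batch now, so STACK_XOFF can
	 * only ever be set while a completion is outstanding.
	 */
	if (!netdev_xmit_more() || netif_xmit_stopped(txq))
		foo_ring_doorbell(dev);		/* hypothetical flush/doorbell */

	return NETDEV_TX_OK;
}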

> I had a feeling this would raise some eyebrows. The main intent is to
> do everything that netdev_tx_reset_queue() does besides resetting
> statistics. Setting a "*STACK*" flag in driver code feels wrong
> (especially since a *DRV* flag exists), but I could not find an
> appropriate upper-level function. I suppose an alternative is to read
> this flag after the device finishes the reset and send the batch if it
> is set. Is this any better?

AFAIU you shouldn't have to clear this flag. We need to reset DQL when
hard resetting the queue because we may lose completions. But if no
completions are lost, STACK_XOFF should be left alone. 

Just clearing that flag is not valid either, because qdiscs will not
be rescheduled, so until some socket tries to xmit the next packet the
data already queued will not move.
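
For reference, this is roughly what the core does at completion time
(paraphrased from netdev_tx_completed_queue() in
include/linux/netdevice.h; details may differ between kernel versions):

	dql_completed(&dev_queue->dql, bytes);
	smp_mb();
	if (dql_avail(&dev_queue->dql) < 0)
		return;
	if (test_and_clear_bit(__QUEUE_STATE_STACK_XOFF, &dev_queue->state))
		netif_schedule_queue(dev_queue);

The netif_schedule_queue() call is what kicks the qdisc again; a bare
clear_bit() in the driver skips that, which is why the already queued
data would just sit there.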
