lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <25e42e2a-4662-e00a-e274-a6887aaae9d6@linux.ibm.com>
Date: Mon, 26 Jun 2023 10:45:38 -0500
From: Nick Child <nnac123@...ux.ibm.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, haren@...ux.ibm.com, ricklind@...ibm.com
Subject: Re: [PATCH net] ibmvnic: Do not reset dql stats on NON_FATAL err



On 6/24/23 17:19, Jakub Kicinski wrote:
> On Thu, 22 Jun 2023 14:03:32 -0500 Nick Child wrote:
>> +		if (adapter->reset_reason == VNIC_RESET_NON_FATAL)
>> +			clear_bit(__QUEUE_STATE_STACK_XOFF,
>> +				  &netdev_get_tx_queue(netdev, i)->state);
> 
> Why are you trying to clear this bit?
> 
> If the completions will still come the bit will be cleared (or not)
> during completion handling (netdev_tx_completed_queue() et al.)
> 
> Drivers shouldn't be poking into queue state bits directly.

Most likely, yes there could be some bytes that will get a completion 
which would clear this bit.

That being said, it is also possible that all bytes sent to the NIC are 
already completed. In which case we would not get a completion. The 
ibmvnic driver sends its skb's to the NIC in batches, it makes a call to 
netdev_tx_sent_queue on every time an skb is added to the batch. This is 
not necessarily every-time that the batch is sent to the NIC.

I am not sure what number of bytes causes dql to set 
__QUEUE_STATE_STACK_XOFF but I do know that it is possible for up to 15 
skb's to be sitting in the queues batch. If there are no outstanding 
bytes on the NIC (ie not expecting a tx completion) and the internal 
batch has 15 references per queue, is this enough to enforce dql and set 
__QUEUE_STATE_STACK_XOFF? If so, then we must clear 
__QUEUE_STATE_STACK_XOFF when resetting.

I had a feeling this would raise some eyebrows. The main intent is to do 
everything that netdev_tx_reset_queue() does besides resetting 
statistics. Setting a "*STACK*" flag in driver code feels wrong 
(especially since a *DRV* flag exists) but I could not find an 
appropriate upper-level function. I suppose an alternative is to read 
this flag after the device finishes the reset and sending the batch if 
it is set. Is this any better?

As always, thanks for the review.
Nick

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ