netdev - Re: [PATCH net-next 1/3] net: provide macros for commonly copied lockless queue stop/wake code

Message-ID: <c5b2f71718e6431ce4bc61ffde3ee16d7b5da260.camel@redhat.com>
Date:   Thu, 30 Mar 2023 16:56:00 +0200
From:   Paolo Abeni <pabeni@...hat.com>
To:     Jakub Kicinski <kuba@...nel.org>,
        Alexander Duyck <alexander.duyck@...il.com>
Cc:     davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
        willemb@...gle.com
Subject: Re: [PATCH net-next 1/3] net: provide macros for commonly copied
 lockless queue stop/wake code

Hi,

On Tue, 2023-03-28 at 17:56 -0700, Jakub Kicinski wrote:
> On Sun, 26 Mar 2023 14:23:07 -0700 Alexander Duyck wrote:
> > > > Except this isn't "stop", this is "maybe stop".  
> > > 
> > > So the return value from try_stop and maybe_stop would be different?
> > > try_stop needs to return 0 if it stopped - the same semantics as
> > > trylock(), AFAIR. Not that I love those semantics, but it's a fairly
> > > strong precedent.  
> > 
> > The problem is this isn't a lock. Ideally with this we aren't taking
> > the action. So if anything this functions in my mind more like the
> > inverse where if this does stop we have to abort more like trylock
> > failing.
> 
> No.. for try_stop we are trying to stop.
> 
> > This is why I mentioned that maybe this should be renamed. I view this
> > more as a check to verify we are good to proceed. In addition there is
> > the problem that there are 3 possible outcomes with maybe_stop versus
> > the two from try_stop.
> 
> I'm open to other names :S
> 
> > > > The thing is in order to make this work for the ixgbe patch you didn't
> > > > use the maybe_stop instead you went with the try_stop. If you replaced
> > > > the ixgbe_maybe_stop_tx with your maybe_stop you would have to do
> > > > something such as the code above to make it work. That is what I am
> > > > getting at. From what I can tell the only real difference between
> > > > ixgbe_maybe_stop_tx and your maybe_stop is that you avoided having to
> > > > move the restart_queue stat increment out.  
> > > 
> > > I can convert ixgbe further, true, but I needed the try_stop, anyway,
> > > because bnxt does:
> > > 
> > > if (/* need to stop */) {
> > >         if (xmit_more())
> > >                 flush_db_write();
> > >         netif_tx_queue_try_stop();
> > > }
> > > 
> > > which seems reasonable.  
> > 
> > I wasn't saying we didn't need try_stop. However the logic here
> > doesn't care about the return value. In the ixgbe case we track the
> > queue restarts so we would want a 0 on success and a non-zero if we
> > have to increment the stat. I would be okay with the 0 (success) / -1
> > (queue restarted) in this case.
> > 
> > > > The general thought is I would prefer to keep it so that 0 is the
> > > > default most likely case in both where the queue is enabled and is
> > > > still enabled. By moving the "take action" items into the 1/-1 values
> > > > then it becomes much easier to sort them out with 1 being a stat
> > > > increment and -1 being an indication to stop transmitting and prep for
> > > > watchdog hang if we don't clear this in the next watchdog period.  
> > > 
> > > Maybe worth taking a step back - the restart stat which ixgbe
> > > maintains made perfect sense when you pioneered this approach but
> > > I think we had a decade of use, and have kprobes now, so we don't
> > > really need to maintain a statistic for a condition with no impact
> > > to the user? New driver should not care 1 vs -1..  
> > 
> > Actually the restart_queue stat is VERY useful for debugging. It tells
> > us we are seeing backlogs develop in the Tx queue. We track it any
> > time we wake up the queue, not just in the maybe_stop case.
> > 
> > Without that we are then having to break out kprobes and the like
> > which we could only add after-the-fact which makes things much harder
> > to debug when issues occur. For example, a common case to use it is to
> > monitor it when we see a system with slow Tx connections. With that
> > stat we can tell if we are building a backlog in the qdisc or if it is
> > something else such as a limited amount of socket memory is limiting
> > the transmits.
> 
> Oh, I missed that wake uses the same stat. Let me clarify - the
> stop/start counter is definitely useful. What I thought the restart
> counter is counting is just the race cases. I don't think the race
> cases are worth counting in any way.
> 
> > > > The thought I had with the enum is to more easily connect the outcomes
> > > > with the sources. It would also help to prevent any confusion on what
> > > > is what. Having the two stop/wake functions return different values is
> > > > a potential source for errors since 0/1 means different things in the
> > > > different functions. Basically since we have 3 possible outcomes using
> > > > the enum would make it very clear what the mapping is between the two.  
> > > 
> > > IMO only two outcomes matter in practice (as mentioned above).
> > > I really like the ability to treat the return value as a bool, if only
> > > we had negative zero we would have a perfect compromise :(  
> > 
> > I think we are just thinking about two different things. I am focusing
> > on the "maybe" calls that have 3 outcomes whereas I think you are
> > mostly focused on the "try" calls. My thought is to treat it something
> > like the msix allocation calls where a negative indicates a failure
> > forcing us to stop since the ring is full, 0 is a success, and a value
> > indicates that there are resources but they are/were limited.
> 
> I don't see a strong analogy to PCI resource allocation :(
> 
> I prefer to keep the 0 vs non-0 distinction to indicate whether 
> the action was performed.
> 
> Paolo, Eric, any opinion? Other than the one likely vs unlikely
> flip -- is this good enough to merge for you?

As you know, I'm usually horrible at naming choices, but you asked,
so...

I'm personally ok with the current naming, and AFAICS the coding style
guidelines suggest returning 0 when imperative functions complete
successfully. 

I think we should apply the guidelines here, even if we are talking
about macros.

That means netif_tx_queue_maybe_stop() and netif_tx_queue_try_stop()
should return 0 when the queue is actually stopped.
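
To make that convention concrete, here is a minimal user-space sketch.
It is not the code from the patch: struct toy_txq, the toy_txq_*()
helpers and the racing_reclaim field are hypothetical names invented for
illustration, and the 1 / -1 values for the non-stopped outcomes are an
assumption taken from the options discussed above. The only point is
that 0 means "the queue is now stopped".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a Tx queue; NOT the kernel's struct netdev_queue. */
struct toy_txq {
	bool stopped;
	int free_desc;       /* descriptors currently free */
	int racing_reclaim;  /* descriptors freed by a simulated "concurrent"
			      * completion between the stop and the re-check */
};

/*
 * Stop the queue, then re-check the free descriptor count.
 * Returns 0 if the queue ended up stopped,
 *         1 if it was re-enabled because space showed up meanwhile.
 */
static int toy_txq_try_stop(struct toy_txq *q, int start_thrs)
{
	assert(q);
	q->stopped = true;
	/* Real lockless code needs a memory barrier here; omitted in this toy. */
	q->free_desc += q->racing_reclaim;  /* simulated racing completion */
	q->racing_reclaim = 0;
	if (q->free_desc < start_thrs)
		return 0;
	q->stopped = false;
	return 1;
}

/*
 * Three outcomes (assumed mapping):
 *    0  stop was needed and the queue is stopped
 *    1  enough descriptors, nothing was done
 *   -1  stop was attempted but undone by the simulated race
 */
static int toy_txq_maybe_stop(struct toy_txq *q, int stop_thrs, int start_thrs)
{
	assert(q);
	if (q->free_desc >= stop_thrs)
		return 1;
	return toy_txq_try_stop(q, start_thrs) ? -1 : 0;
}
```

With this mapping a caller that only cares about "did we stop?" can
test the result against 0, while a driver that wants a restart counter
can still tell the raced case apart from the "nothing to do" case.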

I'm personally fine with the current implementation.

Cheers,

Paolo