netdev - Re: iwlwifi warnings in 5.5-rc1

Message-ID: <87k172gbrn.fsf@toke.dk>
Date:   Wed, 11 Dec 2019 15:47:08 +0100
From:   Toke Høiland-Jørgensen <toke@...hat.com>
To:     Johannes Berg <johannes@...solutions.net>,
        Jens Axboe <axboe@...nel.dk>,
        Emmanuel Grumbach <emmanuel.grumbach@...el.com>,
        Luca Coelho <luciano.coelho@...el.com>
Cc:     "linux-wireless\@vger.kernel.org" <linux-wireless@...r.kernel.org>,
        Networking <netdev@...r.kernel.org>
Subject: Re: iwlwifi warnings in 5.5-rc1

Johannes Berg <johannes@...solutions.net> writes:

> On Wed, 2019-12-11 at 15:04 +0100, Toke Høiland-Jørgensen wrote:
>> Johannes Berg <johannes@...solutions.net> writes:
>> 
>> > Btw, there's *another* issue. You said in the commit log:
>> > 
>> >     This patch does *not* include any mechanism to wake a throttled TXQ again,
>> >     on the assumption that this will happen anyway as a side effect of whatever
>> >     freed the skb (most commonly a TX completion).
>> > 
>> > Thinking about this some more, I'm not convinced that this assumption
>> > holds. You could have been stopped due to the global limit, and now you
>> > wake some queue but the TXQ is empty - now you should reschedule some
>> > *other* TXQ since the global limit had kicked in, not the per-TXQ limit,
>> > and prevented dequeuing, no?
>> 
>> Well if you hit the global limit that means you have 24ms worth of data
>> queued in the hardware; those should be completed in turn, and enable
>> more to be dequeued, no?
>
> Yes, but on which queues?
>
> Say you have some queues - some (Q1-Qn) got a LOT of traffic, and
> another (Q0) just has some interactive traffic.
>
> You could then end up in a situation where you have 24ms queued up on
> Q1-Qn (with n high enough to not have hit the per-queue AQL limit),
> right?
>
> Say also the last frame on Q0 was dequeued by the hardware, but the
> tx_dequeue() got NULL because of the AQL limit having been eaten up by
> all the packets on Q1-Qn.
>
> Now you'll no longer get a new dequeue attempt on Q0 (it was already
> empty last time, so no hardware reclaim to trigger new dequeues), and a
> new dequeue on the *other* queues will not do anything for this queue.

Oh, right, I see; yeah, that could probably happen. I guess we could
either kick all available queues whenever the global limit goes from
"above" to "below"; or we could remove the "return NULL" logic from
tx_dequeue() and rely on next_txq() to throttle. I think the latter is
probably simpler, but I'm a little worried that the throttling will
become too lax (because the driver can keep dequeueing in the same
scheduling round)...

-Toke
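
Illustration (not part of the original message): the stall Johannes describes, and the first remedy Toke mentions (kicking all queues when the global limit goes from "above" to "below"), can be reproduced in a toy model. This is a sketch under stated assumptions, not mac80211 code: ToyTxq, ToyDevice, the kick_all flag, the frame airtimes and the 8 ms per-queue limit are all hypothetical. Only the 24 ms global limit and the "tx_dequeue() returns NULL when throttled" behaviour come from the thread.

```python
from collections import deque

GLOBAL_LIMIT_US = 24_000    # the 24 ms global limit mentioned in the thread
PER_QUEUE_LIMIT_US = 8_000  # hypothetical per-queue limit, for illustration only


class ToyTxq:
    def __init__(self, name):
        self.name = name
        self.backlog = deque()  # airtime (us) of frames still queued in software
        self.in_hw = deque()    # airtime (us) of frames handed to the hardware


class ToyDevice:
    def __init__(self, names, kick_all=False):
        self.txqs = {n: ToyTxq(n) for n in names}
        self.pending_us = 0     # total airtime currently queued in hardware
        self.kick_all = kick_all

    def tx_dequeue(self, txq):
        """Return a frame's airtime, or None if the queue is empty or throttled."""
        if (not txq.backlog
                or self.pending_us >= GLOBAL_LIMIT_US
                or sum(txq.in_hw) >= PER_QUEUE_LIMIT_US):
            return None
        airtime = txq.backlog.popleft()
        txq.in_hw.append(airtime)
        self.pending_us += airtime
        return airtime

    def wake(self, txq):
        """Pull from one TXQ until tx_dequeue() refuses."""
        while self.tx_dequeue(txq) is not None:
            pass

    def enqueue(self, txq, airtime):
        txq.backlog.append(airtime)
        self.wake(txq)

    def tx_complete(self, txq):
        """Hardware finished the oldest frame of txq; reclaim triggers new dequeues."""
        was_above = self.pending_us >= GLOBAL_LIMIT_US
        self.pending_us -= txq.in_hw.popleft()
        crossed = was_above and self.pending_us < GLOBAL_LIMIT_US
        # Without the fix only the completing queue is retried. With it,
        # every queue is kicked when the global limit is released.
        targets = self.txqs.values() if (self.kick_all and crossed) else [txq]
        for t in targets:
            self.wake(t)


def q0_frames_stuck(kick_all):
    dev = ToyDevice(["Q0", "Q1", "Q2", "Q3"], kick_all)
    q0 = dev.txqs["Q0"]
    dev.enqueue(q0, 100)                  # interactive frame, goes to hardware
    for name in ("Q1", "Q2", "Q3"):       # bulk traffic eats up the global limit
        for _ in range(10):
            dev.enqueue(dev.txqs[name], 4_000)
    dev.enqueue(q0, 100)                  # refused: global limit reached
    dev.tx_complete(q0)                   # retry on Q0 is refused again
    # Complete everything the hardware holds, round-robin over the queues.
    while any(t.in_hw for t in dev.txqs.values()):
        for t in dev.txqs.values():
            if t.in_hw:
                dev.tx_complete(t)
    return len(q0.backlog)


print("stuck on Q0 without kick:", q0_frames_stuck(False))
print("stuck on Q0 with kick-all:", q0_frames_stuck(True))
```

In this model, Q0's second frame is never dequeued unless completions on the other queues also wake Q0. The second option from the message (dropping the "return NULL" logic and relying on next_txq() to throttle) is not modelled here.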