Date:   Thu, 3 Jun 2021 14:33:39 +0200
From:   Íñigo Huguet <ihuguet@...hat.com>
To:     Hillf Danton <hdanton@...a.com>
Cc:     rajur@...lsio.com, "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, Ivan Vecera <ivecera@...hat.com>
Subject: Re: [PATCH 1/2] net:cxgb3: replace tasklets with works

On Thu, Jun 3, 2021 at 9:47 AM Hillf Danton <hdanton@...a.com> wrote:
>
> On Thu,  3 Jun 2021 08:34:29 +0200 Inigo Huguet wrote:
> >
> > Moreover, given that the ring is probably not empty yet, so the
> > DMA still has work to do, we don't need to be so fast as to justify
> > using tasklets/softirq instead of running in a thread.
>
> [...]
>
> > -             tasklet_kill(&qs->txq[TXQ_OFLD].qresume_tsk);
> > -             tasklet_kill(&qs->txq[TXQ_CTRL].qresume_tsk);
> > +             cancel_work_sync(&qs->txq[TXQ_OFLD].qresume_task);
> > +             cancel_work_sync(&qs->txq[TXQ_CTRL].qresume_task);
>
> This is a last-minute reminder that figures are needed to support your
> reasoning above.

The OFLD queue has length 1024 and the CTRL queue length 256. If a
queue is so full that the next packet doesn't fit, the driver "stops"
that queue: new outgoing packets are added to a list of packets
pending to be enqueued. Packets that were already in the queue keep
being processed by the DMA. When the queue is at least half free
again, the pending packets are moved from the "pending" list to the
queue, as sketched below.
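
A minimal sketch of that stop/resume logic (illustrative names only;
struct tx_queue, hw_ring_enqueue() and friends are not the actual
sge.c identifiers, and locking is elided for brevity):

#include <linux/skbuff.h>
#include <linux/workqueue.h>

/* Illustrative queue state, not the real cxgb3 struct sge_txq */
struct tx_queue {
        unsigned int in_use, size;      /* descriptors used / ring size */
        bool stopped;                   /* "stopped": deferring new skbs */
        struct sk_buff_head pending;    /* packets waiting for ring space */
        struct work_struct qresume_task;
};

static void hw_ring_enqueue(struct tx_queue *q, struct sk_buff *skb);
                                        /* hypothetical fast-path helper */

/* Transmit path: if the ring is full, stop the queue and park the
 * skb on the software "pending" list instead of dropping it.
 */
static void xmit_or_defer(struct tx_queue *q, struct sk_buff *skb)
{
        if (q->in_use == q->size) {
                q->stopped = true;
                __skb_queue_tail(&q->pending, skb);
                return;
        }
        hw_ring_enqueue(q, skb);
        q->in_use++;
}

/* Completion path: once at least half the ring has drained, schedule
 * the work that moves pending packets back into the ring.
 */
static void tx_reclaim(struct tx_queue *q, unsigned int reclaimed)
{
        q->in_use -= reclaimed;
        if (q->stopped && q->in_use <= q->size / 2)
                queue_work(system_wq, &q->qresume_task);
}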

If the queue got full in the first place, it was because the NIC was
processing packets more slowly than the CPU was trying to send them.
Given that, it is reasonable to assume there is enough time to
enqueue the pending packets before the DMA and the NIC finish
processing the half of the data still in the queue. If for some
reason the situation has changed in the meantime, and the NIC is now
able to send packets REALLY fast, so that the queue empties before
qresume_task runs, the result is a small delay only once, not many
times, so it is not a big problem. But honestly, I don't know whether
this situation can really happen in practice.

In my opinion there are no drawbacks to moving these tasks to
threads, and in exchange we avoid adding latency in softirq context.
Consider that there might be a large number of packets pending to be
enqueued, and these "resume" tasks keep enqueuing pending packets
until none are left or the queue gets full again, so a "resume" task
might run for quite a long time. A sketch of that loop follows.
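
Reusing the illustrative names from the sketch above, the resume work
function would look roughly like this (again a sketch, not the real
sge.c code):

/* The resume work: drain the pending list until it is empty or the
 * ring fills up again.  With many pending packets this loop can run
 * for a long time, which is fine in a kthread but would add latency
 * in softirq context.
 */
static void restart_queue_work(struct work_struct *w)
{
        struct tx_queue *q = container_of(w, struct tx_queue,
                                          qresume_task);
        struct sk_buff *skb;

        while ((skb = __skb_dequeue(&q->pending)) != NULL) {
                if (q->in_use == q->size) {
                        /* ring full again: requeue and stay stopped */
                        __skb_queue_head(&q->pending, skb);
                        return;
                }
                hw_ring_enqueue(q, skb);
                q->in_use++;
        }
        q->stopped = false;     /* pending list fully drained */
}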

cancel_work_sync() is only called when the interface is set to the
down state or the module is removed.
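
For reference, the tasklet-to-work conversion pattern the patch
follows boils down to this (sketch, same illustrative names):

/* init (was tasklet_setup()/tasklet_init()): */
INIT_WORK(&q->qresume_task, restart_queue_work);

/* schedule (was tasklet_schedule()): */
queue_work(system_wq, &q->qresume_task);

/* teardown on ifdown / module unload (was tasklet_kill()):
 * cancel_work_sync() blocks until a running instance finishes, so it
 * must be called from process context and never from the work item
 * itself or under a lock the work takes.
 */
cancel_work_sync(&q->qresume_task);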

--
Íñigo Huguet
