lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 10 Dec 2021 18:43:17 +0100
From:   Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH net-next] net: dev: Always serialize on Qdisc::busylock
 in __dev_xmit_skb() on PREEMPT_RT.

On 2021-12-10 08:52:55 [-0800], Eric Dumazet wrote:
> 
> Problem is that if you have a qdisc, qdisc_dequeue() will not know that the
> packet queued by high prio thread needs to shortcut all prior packets and
> be sent right away.

That is correct.

> Because of that, it seems just better that a high prio thread returns
> immediately and let the dirty work (dequeue packets and send them to devices)
> be done by other threads ?

Ah okay. Lets say you have a task sending a packet every 100ms. And you
want that packet to leave the NIC asap. This is your task with
SCHED_FIFO 70 priority. IMPORTANT otherwise.
Lets say you have a ssh session on the same NIC running at SCHED_OTHER.
Nothing important obviously.

The NIC is idle, ssh sends a keep-alive, 80 bytes, one skb. 
    SCHED_OTHER, SSH, CPU0                                    FIFO 70, IMPORTANT, CPU1
  __dev_xmit_skb()
    -> spin_lock(root_lock);
    -> qdisc_run_begin()
      -> __test_and_set_bit(__QDISC_STATE2_RUNNING);
    -> sch_direct_xmit()
        *PREEMPT* (by something else)
                                                            __dev_xmit_skb()
                                                            -> spin_lock(root_lock);
                                                           <--- PI-boost
      -> spin_unlock(root_lock);   
         *PREEMPT* again     ---> PI boost ends after unlock
                                                            -> qdisc_run_begin()
                                                              -> __test_and_set_bit(__QDISC_STATE2_RUNNING) (nope);
                                                             -> dev_qdisc_enqueue()
                                                             -> qdisc_run_begin() returns false
                                                            -> spin_unlock(root_lock);

at this point, we don't return to the SSH thread, instead other RT tasks
are running. That means that the SSH thread, which owns the qdisc and
the NIC, remains preempted and the NIC idle.

300ms pass by, the IMPORTANT thread queued two additional packets.
Now, a few usecs later, all SCHED_FIFO tasks go idle on CPU0, and the
SSH thread can continue with dev_hard_start_xmit() and send its packet, 
    -> dev_hard_start_xmit()
    __qdisc_run() (yes, and the three additional packets)
    …

By always acquiring the busy-lock, in the previous scenario, the task
with the high-priority allows the low-priority task to run until
qdisc_run_end() at which point it releases the qdisc so the
high-priority task can actually send packets.

One side note: This scenario requires two CPUs (one for low priority
task that gets preempted and owns the qdisc, and one one for high
priority task that wants to sends packets).
With one CPU only the local_bh_disable() invocation would ensure that
low-priority thread left the bh-disable section entirely.

> > > Thanks!
> > >
> > > Paolo
> >

Sebastian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ