Message-ID: <20221001114458.mjt3qkollggmgdwo@skbuf>
Date: Sat, 1 Oct 2022 14:44:58 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: jianghaoran <jianghaoran@...inos.cn>
Cc: linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	Vinicius Costa Gomes <vinicius.gomes@...el.com>,
	Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH V2] taprio: Set the value of picos_per_byte before fill sched_entry

Hi Jianghao,

On Sat, Oct 01, 2022 at 04:06:26PM +0800, jianghaoran wrote:
> If the value of picos_per_byte is set after the sched_entry list is
> filled, the min_duration calculated by length_to_duration is 0, so the
> validity of the input interval cannot be judged, and intervals too
> small to allow any packet to be transmitted are accepted. This is the
> problem described in commit b5b73b26b3ca ("taprio: Fix allowing too
> small intervals"). This is a further fix for that problem.
>
> Example configuration which will not be able to transmit:
>
> tc qdisc replace dev enp5s0f0 parent root handle 100 taprio \
> num_tc 3 \
> map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
> queues 1@0 1@1 2@2 \
> base-time 1528743495910289987 \
> sched-entry S 01 9 \
> sched-entry S 02 9 \
> sched-entry S 04 9 \
> clockid CLOCK_TAI
>
> Fixes: b5b73b26b3ca ("taprio: Fix allowing too small intervals")
> Signed-off-by: jianghaoran <jianghaoran@...inos.cn>
> ---

I think this is just a symptomatic treatment of a bigger problem with
the solution Vinicius tried to implement. One can still change the
qdisc on an interface whose link is down, and the interval validation
logic will still be bypassed, thereby allowing the 9 ns schedule
intervals to be accepted as valid.

Is your problem that the 9 ns intervals will kill the kernel due to the
frequent hrtimers, or that no packets will be dequeued from the qdisc?

If the latter, I was working on a feature called queueMaxSDU, where one
can limit the MTU per traffic class. Packets exceeding the max MTU are
dropped at the enqueue() level (therefore, before being accepted into
the Qdisc queues). The problem here, really, is that we accept packets
in enqueue() which will never be eligible in dequeue(). We have the
exact same problem with gates which are forever closed (in your own
example, that would be gates 3 and higher).

Currently, I have only added support for user space to pass queueMaxSDU
to the kernel over netlink, as well as the basic qdisc_drop() mechanism
based on skb->len. But I was thinking that the kernel should have a
mechanism to automatically reduce the queueMaxSDU to an even lower
value than specified by the user, if the gate intervals don't accept
MTU sized packets. The "operational" queueMaxSDU is determined by the
current link speed and the smallest contiguous interval corresponding
to each traffic class. In fact, if you search for
vsc9959_tas_guard_bands_update(), you'll see most of this logic already
written, but just for an offloading device driver. I was thinking I
should generalize this logic and push it into taprio.

If your problem is the former (9 ns hrtimers kill the kernel, how do we
avoid them?), then it's pretty hard to make a judgement that works for
all link speeds (taprio will still accept the interval as valid for a
100 Gbps interface, because theoretically, the transmission time of
ETH_ZLEN bytes is still below 9 ns). I don't know how one can
realistically deal with that in a generic way.
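To make that last point concrete, here is a rough standalone sketch
(plain userspace C, not the actual taprio code; names, units and
rounding are illustrative) of the arithmetic behind the
length_to_duration(ETH_ZLEN) check, showing why a 9 ns interval is
rejected at 1 Gbps but would pass at 100 Gbps:

/*
 * Standalone sketch, not kernel code: mimics the idea of picos_per_byte
 * and the ETH_ZLEN-based minimum duration check. Exact names, units and
 * rounding in taprio differ; this only illustrates the orders of
 * magnitude involved.
 */
#include <stdio.h>
#include <stdint.h>

#define ETH_ZLEN 60 /* minimum Ethernet frame length, without FCS */

static uint64_t picos_per_byte(uint64_t speed_bps)
{
	/* 8 bits per byte, 10^12 picoseconds per second */
	return (8ULL * 1000000000000ULL) / speed_bps;
}

int main(void)
{
	const uint64_t interval_ps = 9 * 1000ULL; /* the 9 ns sched-entry above */
	const uint64_t speeds_bps[] = { 1000000000ULL, 100000000000ULL };
	const char *names[] = { "1 Gbps", "100 Gbps" };

	for (int i = 0; i < 2; i++) {
		/* time needed to put a minimum-sized frame on the wire */
		uint64_t min_duration_ps = ETH_ZLEN * picos_per_byte(speeds_bps[i]);

		printf("%-8s: min frame time %6llu ps -> 9 ns interval %s\n",
		       names[i], (unsigned long long)min_duration_ps,
		       interval_ps >= min_duration_ps ? "accepted" : "rejected");
	}
	return 0;
}

At 1 Gbps a minimum-sized frame needs roughly 480 ns, so the 9 ns
entries are rejected once picos_per_byte is known; at 100 Gbps it needs
only about 4.8 ns, so the same schedule passes the check.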
Given that it's so easy to bypass taprio's restriction by having the
link down, I don't think it makes much sense to keep pretending that it
works and to submit this as a bug fix :)

I was going to move vsc9959_tas_guard_bands_update() into taprio
anyway, although I'm not sure whether that will happen in this kernel
development cycle. If you're interested, I can keep you on CC.
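For the record, here is a rough sketch of the "operational queueMaxSDU"
clamping described above (hypothetical structures and field names, not
the actual vsc9959_tas_guard_bands_update() code):

/*
 * Illustration only: clamp the per-traffic-class max SDU so that a full
 * frame always fits inside the smallest contiguous open interval for
 * that tc at the current link speed. Names are made up for the example.
 */
#include <stdint.h>

struct tc_gate_info {
	uint64_t min_open_interval_ps;	/* smallest contiguous open window */
	uint32_t user_max_sdu;		/* queueMaxSDU from netlink, 0 = unset */
};

/* Ethernet header + VLAN tag + FCS; preamble and IFG ignored here */
#define L2_OVERHEAD_BYTES	22

static uint32_t operational_max_sdu(const struct tc_gate_info *tc,
				    uint64_t picos_per_byte)
{
	/* largest frame (bytes) that fully fits in the smallest window */
	uint64_t max_frame = tc->min_open_interval_ps / picos_per_byte;
	uint32_t max_sdu;

	if (max_frame <= L2_OVERHEAD_BYTES)
		return 0;	/* window too small for any payload at all */

	max_sdu = max_frame - L2_OVERHEAD_BYTES;

	/* only ever reduce what the user configured, never increase it */
	if (tc->user_max_sdu && tc->user_max_sdu < max_sdu)
		max_sdu = tc->user_max_sdu;

	return max_sdu;
}

An operational max SDU of 0 for a traffic class whose windows never fit
a frame would then mean dropping everything for that tc at enqueue(),
which would also cover the forever-closed gates case mentioned above.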