netdev - Re: [EXT] Re: [net-next] net: dsa: felix: disable always guard band bit for TAS config

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2898c3ae1319756e13b95da2b74ccacc@walle.cc>
Date:   Fri, 07 May 2021 09:35:12 +0200
From:   Michael Walle <michael@...le.cc>
To:     Xiaoliang Yang <xiaoliang.yang_1@....com>
Cc:     Vladimir Oltean <vladimir.oltean@....com>,
        Vladimir Oltean <olteanv@...il.com>,
        UNGLinuxDriver@...rochip.com, alexandre.belloni@...tlin.com,
        allan.nielsen@...rochip.com,
        Claudiu Manoil <claudiu.manoil@....com>, davem@...emloft.net,
        idosch@...lanox.com, joergen.andreasen@...rochip.com,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        Po Liu <po.liu@....com>, vinicius.gomes@...el.com
Subject: Re: [EXT] Re: [net-next] net: dsa: felix: disable always guard band
 bit for TAS config

Hi Xiaoliang,

Am 2021-05-07 09:16, schrieb Xiaoliang Yang:
> On 2021-05-06 21:25, Michael Walle <michael@...le.cc> wrote:
>> Am 2021-05-04 23:33, schrieb Vladimir Oltean:
>> > [ trimmed the CC list, as this is most likely spam for most people ]
>> >
>> > On Tue, May 04, 2021 at 10:23:11PM +0200, Michael Walle wrote:
>> >> Am 2021-05-04 21:17, schrieb Vladimir Oltean:
>> >> > On Tue, May 04, 2021 at 09:08:00PM +0200, Michael Walle wrote:
>> >> > > > > > > As explained in another mail in this thread, all queues
>> >> > > > > > > are marked as scheduled. So this is actually a no-op,
>> >> > > > > > > correct? It doesn't matter if it set or not set for now. Dunno why
>> we even care for this bit then.
>> >> > > > > >
>> >> > > > > > It matters because ALWAYS_GUARD_BAND_SCH_Q reduces the
>> >> > > > > > available throughput when set.
>> >> > > > >
>> >> > > > > Ahh, I see now. All queues are "scheduled" but the guard band
>> >> > > > > only applies for "non-scheduled" -> "scheduled" transitions.
>> >> > > > > So the guard band is never applied, right? Is that really
>> >> > > > > what we want?
>> >> > > >
>> >> > > > Xiaoliang explained that yes, this is what we want. If the end
>> >> > > > user wants a guard band they can explicitly add a "sched-entry
>> >> > > > 00" in the tc-taprio config.
>> >> > >
>> >> > > You're disabling the guard band, then. I figured, but isn't that
>> >> > > suprising for the user? Who else implements taprio? Do they do it
>> >> > > in the same way? I mean this behavior is passed right to the
>> >> > > userspace and have a direct impact on how it is configured. Of
>> >> > > course a user can add it manually, but I'm not sure that is what
>> >> > > we want here. At least it needs to be documented somewhere. Or
>> >> > > maybe it should be a switchable option.
>> >> > >
>> >> > > Consider the following:
>> >> > > sched-entry S 01 25000
>> >> > > sched-entry S fe 175000
>> >> > > basetime 0
>> >> > >
>> >> > > Doesn't guarantee, that queue 0 is available at the beginning of
>> >> > > the cycle, in the worst case it takes up to <begin of cycle> +
>> >> > > ~12.5us until the frame makes it through (given gigabit and 1518b
>> >> > > frames).
>> >> > >
>> >> > > Btw. there are also other implementations which don't need a
>> >> > > guard band (because they are store-and-forward and cound the
>> >> > > remaining bytes). So yes, using a guard band and scheduling is
>> >> > > degrading the performance.
>> >> >
>> >> > What is surprising for the user, and I mentioned this already in
>> >> > another thread on this patch, is that the Felix switch overruns the
>> >> > time gate (a packet taking 2 us to transmit will start transmission
>> >> > even if there is only 1 us left of its time slot, delaying the
>> >> > packets from the next time slot by 1 us). I guess that this is why
>> >> > the ALWAYS_GUARD_BAND_SCH_Q bit exists, as a way to avoid these
>> >> > overruns, but it is a bit of a poor tool for that job. Anyway,
>> >> > right now we disable it and live with the overruns.
>> >>
>> >> We are talking about the same thing here. Why is that a poor tool?
>> >
>> > It is a poor tool because it revolves around the idea of "scheduled
>> > queues" and "non-scheduled queues".
>> >
>> > Consider the following tc-taprio schedule:
>> >
>> >       sched-entry S 81 2000 # TC 7 and 0 open, all others closed
>> >       sched-entry S 82 2000 # TC 7 and 1 open, all others closed
>> >       sched-entry S 84 2000 # TC 7 and 2 open, all others closed
>> >       sched-entry S 88 2000 # TC 7 and 3 open, all others closed
>> >       sched-entry S 90 2000 # TC 7 and 4 open, all others closed
>> >       sched-entry S a0 2000 # TC 7 and 5 open, all others closed
>> >       sched-entry S c0 2000 # TC 7 and 6 open, all others closed
>> >
>> > Otherwise said, traffic class 7 should be able to send any time it
>> > wishes.
>> 
>> What is the use case behind that? TC7 (with the highest priority) may 
>> always
>> take precedence of the other TCs, thus what is the point of having a 
>> dedicated
>> window for the others.
>> 
>> Anyway, I've tried it and there are no hiccups. I've meassured the 
>> delta
>> between the start of successive packets and they are always ~12370ns 
>> for a
>> 1518b frame. TC7 is open all the time, which makes sense. It only 
>> happens if
>> you actually close the gate, eg. you have a sched-entry where a TC7 
>> bit is not
>> set. In this case, I can see a difference between 
>> ALWAYS_GUARD_BAND_SCH_Q
>> set and not set. If it is set, there is up to a ~12.5us delay added 
>> (of course it
>> depends on when the former frame was scheduled).
>> 
>> It seems that also needs to be 1->0 transition.
>> 
>> You've already mentioned that the switch violates the Qbv standard.
>> What makes you think so? IMHO before that patch, it wasn't violated.
>> Now it likely is (still have to confirm that). How can this be 
>> reasonable?
>> 
>> If you have a look at the initial commit message, it is about making 
>> it possible
>> to have a smaller gate window, but that is not possible because of the 
>> current
>> guard band of ~12.5us. It seems to be a shortcut for not having the 
>> MAXSDU
>> (and thus the length of the guard band) configurable. Yes (static) 
>> guard bands
>> will have a performance impact, also described in [1]. You are trading 
>> the
>> correctness of the TAS for performance. And it is the sole purpose of 
>> Qbv to
>> have a determisitc way (in terms of timing) of sending the frames.
>> 
>> And telling the user, hey, we know we violate the Qbv standard, please 
>> insert
>> the guard bands yourself if you really need them is not a real 
>> solution. As
>> already mentioned, (1) it is not documented anywhere, (2) can't be 
>> shared
>> among other switches (unless they do the same workaround) and (3) what 
>> am
>> I supposed to do for TSN compliance testing. Modifying the schedule 
>> that is
>> about to be checked (and thus given by the compliance suite)?
>> 
> I disable the always guard band bit because each gate control list
> needs to reserve a guard band slot, which affects performance. The
> user does not need to set a guard band for each queue transmission.
> For example, "sched-entry S 01 2000 sched-entry S fe 98000". Queue 0
> is protected traffic and has smaller frames, so there is no need to
> reserve a guard band during the open time of queue 0. The user can add
> the following guard band before protected traffic: "sched-entry S 00
> 25000 sched-entry S 01 2000 sched-entry S fe 73000"

Again, you're passing the handling of the guard band to the user,
which is an implementation detail for this switch (unless there is
a new switch for it on the qdisc IMHO). And (1), (2) and (3) from
above is still valid.

Consider the entry
  sched-entry S 01 2000
  sched-entry S 02 20000

A user assumes that TC0 is open for 2us. But with your change
it is bascially open for 2us + 12.5us. And even worse, it is not
deterministic. A frame in the subsequent queue (ie TC1) can be
scheduled anywhere beteeen 0us and 12.5us after opening the gate,
depending on wether there is still a frame transmitting on TC0.

> I checked other devices such as ENETC and it can calculate how long
> the frame transmission will take and determine whether to transmit
> before the gate is closed. The VSC9959 device does not have this
> hardware function. If we implement guard bands on each queue, we need
> to reserve a 12.5us guard band for each GCL, even if it only needs to
> send a small packet. This confuses customers.

How about getting it right and working on how we can set the MAXSDU
per queue and thus making the guard band smaller?

> actually, I'm not sure if this will violate the Qbv standard. I looked
> up the Qbv standard spec, and found it only introduce the guard band
> before protected window (Annex Q (informative)Traffic scheduling). It
> allows to design the schedule to accommodate the intended pattern of
> transmission without overrunning the next gate-close event for the
> traffic classes concerned.

Vladimir already quoted "IEEE 802.1Q-2018 clause 8.6.8.4". I didn't
check it, though.

A static guard band is one of the options you have to fulfill that.
Granted, it is not that efficient, but it is how the switch handles
it.

-michael