[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250610150254.w4gvmbsw6nrhb6k4@skbuf>
Date: Tue, 10 Jun 2025 18:02:54 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: MD Danish Anwar <danishanwar@...com>
Cc: Meghana Malladi <m-malladi@...com>,
Vignesh Raghavendra <vigneshr@...com>,
Simon Horman <horms@...nel.org>,
Guillaume La Roque <glaroque@...libre.com>,
Sascha Hauer <s.hauer@...gutronix.de>,
Roger Quadros <rogerq@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Jakub Kicinski <kuba@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
"David S. Miller" <davem@...emloft.net>,
Andrew Lunn <andrew+netdev@...n.ch>,
linux-arm-kernel@...ts.infradead.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, srk@...com,
Roger Quadros <rogerq@...com>
Subject: Re: [PATCH net-next v10] net: ti: icssg-prueth: add TAPRIO offload
support
On Tue, Jun 10, 2025 at 04:14:29PM +0530, MD Danish Anwar wrote:
> > 1. If there is no "existing" schedule, what does the "extend" variable
> > extend? The custom base-time mechanism has to work even for the first
> > taprio schedule. (this is an unanswered pre-existing question)
>
> The firmware has a cycle-time of 1ms even if there is no schedule. Every
> 1ms, firmware updates a counter. The curr_time is calculated as
> CounterValue * 1ms.
>
> Even if there are no schedule, the default cycle-time will remain 1ms.
> Let's say the first schedule has a base-time of 20.5ms then the default
> cycle will be extended by 0.5 ms and then the schedule will apply. This
> is the reason, the extend feature is also impacting get/set_time
> calculations.
So what is an ICSSG IEP cycle, then? I think this is another case where
the firmware calls something X, taprio (and 802.1Q) also calls something
X, and yet, they are talking about different things.
A schedule (in taprio terms) is an array of gate states and the
respective time intervals for which those states apply. The schedule is
periodic, and a cycle is the period of time after which the schedule
repeats itself. By default, the cycle time (the duration of the cycle)
is equal to the sum of the gate intervals, but it can also be longer or
shorter (extended or truncated schedule).
Whereas in the case of ICSSG, based on a code search for
IEP_DEFAULT_CYCLE_TIME_NS, a cycle seems to be some elementary unit of
timekeeping, used by the firmware API to express other information in
terms of it, like packet timestamps and periodic output. Would that be a
correct description?
When you say that "even if there is no schedule, the default cycle time
will remain 1ms", you are talking about the cycle time in the ICSSG
sense, because in the taprio/802.1Q sense, it is absurd to talk about a
cycle time in absense of a schedule. And implicitly, you are saying that
when the firmware extends the next-to-last cycle in order for unaligned
base times to work, this alters that timekeeping unit. Like stretching
the ruler in order for a mouse and an elephant to measure the same.
I think it needs to be explicitly pointed out that the taprio schedule
is only supposed to affect packet scheduling at the egress of the port,
but taprio is only a reader of the timekeeping process (and PTP time)
and should not alter it. The timekeeping process should be independent
and the taprio portion of the firmware should keep track of its own
"cycles" which may not begin at the beginning of a timekeeping "cycle",
may not end at the end of one, and may span multiple timekeeping
"cycles", respectively.
What if you also have a Qci (tc-gate) schedule on the ingress of the
same port, and that is configured for a different cycle-time or a
different base-time (requiring a different "extend" value)? It will be
too inflexible to apply the restriction that the parameters have to be
the same.
> >>> That's what our first approach was. If it's okay with you I can drop all
> >>> these changes and add below check in driver
> >>>
> >>> if (taprio->base_time % taprio->cycle_time) {
> >>> NL_SET_ERR_MSG_MOD(taprio->extack, "Base-time should be multiple of cycle-time");
> >>> return -EOPNOTSUPP;
> >>> }
> >
> > I don't want to make a definitive statement on this just yet, I don't
> > fully understand what was implemented in the firmware and what was the
> > thinking.
> >
>
> The firmware always expects cycle-time of 1ms and base-time to multiple
> of cycle-time i.e. base-time to be multiple of 1ms.
>
> This way all the schedules will be aligned and that's what current
> implementation is.
>
> To add support for base-time that are not multiple of cycle-time we
> added "extend". However that will also only work as long as cycle-time
> is 1ms. cycle-time other than 1ms is not supported by firmware as of
> now. This is something we discovered recently.
>
> We have a check for TAS_MIN_CYCLE_TIME and TAS_MAX_CYCLE_TIME and they
> are defined as,
>
> /* Minimum cycle time supported by implementation (in ns) */
> #define TAS_MIN_CYCLE_TIME (1000000)
>
> /* Minimum cycle time supported by implementation (in ns) */
> #define TAS_MAX_CYCLE_TIME (4000000000)
>
> But it is wrong. As per current firmware implementation,
> TAS_MIN_CYCLE_TIME = TAS_MAX_CYCLE_TIME = 1ms
>
>
> The ideal use case will be to support,
> 1. Different cycle times
> 2. Different base times which may or may not be multiple of cycle-times.
>
> With the current implementation, we are able to support #2 however #1 is
> still a limitation. Once support for #1 is added, the implementation
> will need to be changed.
(...)
> > As you can see, I still have trouble understanding the concepts proposed
> > by the firmware.
>
> I understand that. I hope this makes it a bit more clear. Let me know
> what needs to be done now.
Was the "extend" feature added to the ICSSG firmware as a result of the
previous taprio review feedback? Because if it was, it's unfortunate
that because of the lack of clarity of the firmware concepts, the way
this feature was implemented is essentially DOA and cannot be used.
I am not very positive that even if adding the extra restrictions
discovered here (cycle-time cannot be != IEP_DEFAULT_CYCLE_TIME_NS),
the implementation will work as expected. I am not sure that our image
of "as expected" is the same.
Given that these don't seem to be hardware limitations, but constraints
imposed by the ICSSG firmware, I guess my suggestion would be to start
with the selftest I mentioned earlier (which may need to be adapted),
and use it to get a better picture of the gaps. Then make a plan to fix
them in the firmware, and see what it takes. If it isn't going to be
sufficient to fix the bugs unless major API changes are introduced, then
maybe it doesn't make sense for Linux to support taprio offload on the
buggy firmware versions.
Or maybe it does (with the appropriate restrictions), but it would still
inspire more trust to see that the developer at least got some version
of the firmware to pass a selftest, and has a valid reference to follow.
Not going to lie, it doesn't look great that we discover during v10 that
taprio offload only works with a cycle time of 1 ms. The schedule is
network-dependent and user-customizable, and maybe the users didn't get
the memo that only 1 ms was tested :-/
Powered by blists - more mailing lists