lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 24 Nov 2020 03:20:05 +0200
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
        Po Liu <po.liu@....com>,
        Claudiu Manoil <claudiu.manoil@....com>
Cc:     Vinicius Costa Gomes <vinicius.gomes@...el.com>
Subject: [PATCH net] enetc: Advance the taprio base time in the future

The tc-taprio base time indicates the beginning of the tc-taprio
schedule, which is cyclic by definition (where the length of the cycle
in nanoseconds is called the cycle time). The base time is a 64-bit PTP
time in the TAI domain.

Logically, the base-time should be a future time. But that imposes some
restrictions to user space, which has to retrieve the current PTP time
first from the NIC first, then calculate a base time that will still be
larger than the base time by the time the kernel driver programs this
value into the hardware. Actually ensuring that the programmed base time
is in the future is still a problem even if the kernel alone deals with
this - what the proposed patch does is to "reserve" 100 ms for potential
delays, but otherwise this is an unsolved problem in the general case.

Nonetheless, what is important for tc-taprio in a LAN is not precisely
the base-time value, but rather the fact that the taprio schedules are
synchronized across all nodes in the network, or at least have a given
phase offset.

Therefore, the expectation for user space is that specifying a base-time
of 0 would mean that the tc-taprio schedule should start "right away",
with one twist: the effective base-time written into the NIC is still
congruent with the originally specified base-time. Otherwise stated,
if the current PTP time of the NIC is 2.123456789, the base-time of the
schedule is 0.000000000 and the cycle-time is 0.500000000, then the
effective base-time should be 2.500000000, since that is the first
beginning of a new cycle starting at base-time 0.000000000, with a cycle
time of 500 ms, that is larger than the current PTP time.

So in short, the kernel driver, or the hardware, should allow user space
to skip the calculation of the future base time, and transparently allow
a PTP time in the past. The formula for advancing the base time should be:

effective-base-time = base-time + N x cycle-time

where N is the smallest integer number of cycles such that
effective-base-time >= now.

Actually, the base-time of 0.000000000 is not special in any way.
Reiterating the example above, just with a base-time of 0.000500000. The
effective base time in this case should be 2.500500000, according to the
formula. There are use cases for applying phase shifts like this.

The enetc driver is not doing that. It special-cases the case where the
specified base time is zero, and it replaces that with a plain "current
PTP time".

Such an implementation is unusable for applications that expect the
phase to be preserved. We already have drivers in the kernel that comply
to the behavior described above (maybe more than just the ones listed
below):
- the software implementation of taprio does it in taprio_get_start_time:

	/* Schedule the start time for the beginning of the next
	 * cycle.
	 */
	n = div64_s64(ktime_sub_ns(now, base), cycle);
	*start = ktime_add_ns(base, (n + 1) * cycle);

- the sja1105 offload does it via future_base_time()
- the ocelot/felix offload does it via vsc9959_new_base_time()

As for the obvious question: doesn't the hardware just "do the right
thing" if passed a time in the past? I've tested and it doesn't look
like it. I cannot determine what base-time it uses in that case, however
traffic does not have the correct phase alignment.

Fixes: 34c6adf1977b ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
---
 .../net/ethernet/freescale/enetc/enetc_qos.c  | 34 +++++++++++++------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc_qos.c b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
index aeb21dc48099..379deef5d9e0 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_qos.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
@@ -45,6 +45,20 @@ void enetc_sched_speed_set(struct enetc_ndev_priv *priv, int speed)
 		      | pspeed);
 }
 
+static inline s64 future_base_time(s64 base_time, s64 cycle_time, s64 now)
+{
+	s64 a, b, n;
+
+	if (base_time >= now)
+		return base_time;
+
+	a = now - base_time;
+	b = cycle_time;
+	n = div_s64(a + b - 1, b);
+
+	return base_time + n * cycle_time;
+}
+
 static int enetc_setup_taprio(struct net_device *ndev,
 			      struct tc_taprio_qopt_offload *admin_conf)
 {
@@ -55,7 +69,9 @@ static int enetc_setup_taprio(struct net_device *ndev,
 	struct gce *gce;
 	dma_addr_t dma;
 	u16 data_size;
+	s64 base_time;
 	u16 gcl_len;
+	u64 now;
 	u32 tge;
 	int err;
 	int i;
@@ -92,18 +108,14 @@ static int enetc_setup_taprio(struct net_device *ndev,
 	gcl_config->atc = 0xff;
 	gcl_config->acl_len = cpu_to_le16(gcl_len);
 
-	if (!admin_conf->base_time) {
-		gcl_data->btl =
-			cpu_to_le32(enetc_rd(&priv->si->hw, ENETC_SICTR0));
-		gcl_data->bth =
-			cpu_to_le32(enetc_rd(&priv->si->hw, ENETC_SICTR1));
-	} else {
-		gcl_data->btl =
-			cpu_to_le32(lower_32_bits(admin_conf->base_time));
-		gcl_data->bth =
-			cpu_to_le32(upper_32_bits(admin_conf->base_time));
-	}
+	now = (u64)enetc_rd(&priv->si->hw, ENETC_SICTR1) << 32;
+	now |= enetc_rd(&priv->si->hw, ENETC_SICTR0);
 
+	base_time = future_base_time(admin_conf->base_time,
+				     admin_conf->cycle_time,
+				     now + NSEC_PER_SEC / 10);
+	gcl_data->btl = cpu_to_le32(lower_32_bits(base_time));
+	gcl_data->bth = cpu_to_le32(upper_32_bits(base_time));
 	gcl_data->ct = cpu_to_le32(admin_conf->cycle_time);
 	gcl_data->cte = cpu_to_le32(admin_conf->cycle_time_extension);
 
-- 
2.25.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ