netdev - Re: [PATCH net-next] selftests: forwarding: add Per-Stream Filtering and Policing test for Ocelot

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220429093845.tyzwcwppsgbjbw2s@skbuf>
Date:   Fri, 29 Apr 2022 09:38:46 +0000
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Kurt Kanzenbach <kurt@...utronix.de>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Paolo Abeni <pabeni@...hat.com>,
        Eric Dumazet <edumazet@...gle.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Andrew Lunn <andrew@...n.ch>,
        Claudiu Manoil <claudiu.manoil@....com>,
        Alexandre Belloni <alexandre.belloni@...tlin.com>,
        "UNGLinuxDriver@...rochip.com" <UNGLinuxDriver@...rochip.com>,
        Vinicius Costa Gomes <vinicius.gomes@...el.com>,
        Gerhard Engleder <gerhard@...leder-embedded.com>,
        "Y.B. Lu" <yangbo.lu@....com>,
        Xiaoliang Yang <xiaoliang.yang_1@....com>,
        Richard Cochran <richardcochran@...il.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Yannick Vignon <yannick.vignon@....com>,
        Rui Sousa <rui.sousa@....com>, Jiri Pirko <jiri@...dia.com>,
        Ido Schimmel <idosch@...dia.com>
Subject: Re: [PATCH net-next] selftests: forwarding: add Per-Stream Filtering
 and Policing test for Ocelot

Hi Kurt,

Thanks for reviewing.

On Fri, Apr 29, 2022 at 08:32:22AM +0200, Kurt Kanzenbach wrote:
> Hi Vladimir,
> 
> On Thu Apr 28 2022, Vladimir Oltean wrote:
> > The Felix VSC9959 switch in NXP LS1028A supports the tc-gate action
> > which enforced time-based access control per stream. A stream as seen by
> > this switch is identified by {MAC DA, VID}.
> >
> > We use the standard forwarding selftest topology with 2 host interfaces
> > and 2 switch interfaces. The host ports must require timestamping non-IP
> > packets and supporting tc-etf offload, for isochron to work. The
> > isochron program monitors network sync status (ptp4l, phc2sys) and
> > deterministically transmits packets to the switch such that the tc-gate
> > action either (a) always accepts them based on its schedule, or
> > (b) always drops them.
> >
> > I tried to keep as much of the logic that isn't specific to the NXP
> > LS1028A in a new tsn_lib.sh, for future reuse. This covers
> > synchronization using ptp4l and phc2sys, and isochron.
> 
> For running this selftest `isochron` tool is required. That's neither
> packaged on Linux distributions or available in the kernel source. I
> guess, it has to be built from your Github account/repository?

This is slightly inconvenient, yes. But for this selftest in particular,
a more specialized setup is required anyway, as it only runs on an NXP
LS1028A based board. So I guess it's only the smaller of several
inconveniences?

A few years ago when I decided to work on isochron, I searched for an
application for detailed network latency testing and I couldn't find
one. I don't think the situation has improved a lot since then. If
isochron is useful for a larger audience, I can look into what I can do
about distribution. It's license-compatible with the kernel, but it's a
large-ish program to just toss into tools/testing/selftests/, plus I
still commit rather frequently to it, and I'd probably annoy the crap
out of everyone if I move its development to netdev@...r.kernel.org.

> >
> > The cycle-time chosen for this selftest isn't particularly impressive
> > (and the focus is the functionality of the switch), but I didn't really
> > know what to do better, considering that it will mostly be run during
> > debugging sessions, various kernel bloatware would be enabled, like
> > lockdep, KASAN, etc, and we certainly can't run any races with those on.
> >
> > I tried to look through the kselftest framework for other real time
> > applications and didn't really find any, so I'm not sure how better to
> > prepare the environment in case we want to go for a lower cycle time.
> > At the moment, the only thing the selftest is ensuring is that dynamic
> > frequency scaling is disabled on the CPU that isochron runs on. It would
> > probably be useful to have a blacklist of kernel config options (checked
> > through zcat /proc/config.gz) and some cyclictest scripts to run
> > beforehand, but I saw none of those.
> >
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> 
> [snip]
> 
> > diff --git a/tools/testing/selftests/net/forwarding/tsn_lib.sh b/tools/testing/selftests/net/forwarding/tsn_lib.sh
> > new file mode 100644
> > index 000000000000..efac5badd5a0
> > --- /dev/null
> > +++ b/tools/testing/selftests/net/forwarding/tsn_lib.sh
> > @@ -0,0 +1,219 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright 2021-2022 NXP
> > +
> > +# Tunables
> > +UTC_TAI_OFFSET=37
> 
> Why do you need the UTC to TAI offset? isochron could just use CLOCK_TAI
> as clockid for the task scheduling.

isochron indeed works in CLOCK_TAI (this is done so that all timestamps
are chronologically ordered when everything is synchronized).

However, not all the input it has to work with is in CLOCK_TAI. For
example, software PTP timestamps are collected by the kernel using
__net_timestamp() -> ktime_get_real(), and that is in CLOCK_REALTIME
domain. So user space converts the CLOCK_REALTIME timestamps to
CLOCK_TAI by factoring in the UTC-to-TAI offset.

I am not in love with specifying this offset via a tunable script value
either. The isochron program has the ability to detect the kernel's TAI
offset and run with that, but sadly, phc2sys in non-automatic mode wants
the "-O" argument to be supplied externally. So regardless, I have to
come up with an offset to give to phc2sys which it will apply when
disciplining the PHC. So I figured why not just supply 37, the current
value.

I'm not sure what you're suggesting?

> > +ISOCHRON_CPU=1
> 
> Seems reasonable to assume two cpus.
> 
> > +
> > +# https://github.com/vladimiroltean/tsn-scripts
> > +# WARNING: isochron versions pre-1.0 are unstable,
> > +# always use the latest version
> > +require_command isochron
> > +require_command phc2sys
> > +require_command ptp4l
> > +
> > +phc2sys_start()
> > +{
> > +	local if_name=$1
> > +	local uds_address=$2
> > +	local extra_args=""
> > +
> > +	if ! [ -z "${uds_address}" ]; then
> > +		extra_args="${extra_args} -z ${uds_address}"
> > +	fi
> > +
> > +	phc2sys_log="$(mktemp)"
> > +
> > +	chrt -f 10 phc2sys -m \
> > +		-c ${if_name} \
> > +		-s CLOCK_REALTIME \
> > +		-O ${UTC_TAI_OFFSET} \
> > +		--step_threshold 0.00002 \
> > +		--first_step_threshold 0.00002 \
> > +		${extra_args} \
> > +		> "${phc2sys_log}" 2>&1 &
> > +	phc2sys_pid=$!
> > +
> > +	echo "phc2sys logs to ${phc2sys_log} and has pid ${phc2sys_pid}"
> > +
> > +	sleep 1
> > +}
> > +
> > +phc2sys_stop()
> > +{
> > +	{ kill ${phc2sys_pid} && wait ${phc2sys_pid}; } 2> /dev/null
> > +	rm "${phc2sys_log}" 2> /dev/null
> > +}
> > +
> > +ptp4l_start()
> > +{
> > +	local if_name=$1
> > +	local slave_only=$2
> > +	local uds_address=$3
> > +	local log="ptp4l_log_${if_name}"
> > +	local pid="ptp4l_pid_${if_name}"
> > +	local extra_args=""
> > +
> > +	if [ "${slave_only}" = true ]; then
> > +		extra_args="${extra_args} -s"
> > +	fi
> > +
> > +	# declare dynamic variables ptp4l_log_${if_name} and ptp4l_pid_${if_name}
> > +	# as global, so that they can be referenced later
> > +	declare -g "${log}=$(mktemp)"
> > +
> > +	chrt -f 10 ptp4l -m -2 -P \
> > +		-i ${if_name} \
> > +		--step_threshold 0.00002 \
> > +		--first_step_threshold 0.00002 \
> > +		--tx_timestamp_timeout 100 \
> > +		--uds_address="${uds_address}" \
> > +		${extra_args} \
> > +		> "${!log}" 2>&1 &
> > +	declare -g "${pid}=$!"
> > +
> > +	echo "ptp4l for interface ${if_name} logs to ${!log} and has pid ${!pid}"
> > +
> > +	sleep 1
> > +}
> > +
> > +ptp4l_stop()
> > +{
> > +	local if_name=$1
> > +	local log="ptp4l_log_${if_name}"
> > +	local pid="ptp4l_pid_${if_name}"
> > +
> > +	{ kill ${!pid} && wait ${!pid}; } 2> /dev/null
> > +	rm "${!log}" 2> /dev/null
> > +}
> > +
> > +cpufreq_max()
> > +{
> > +	local cpu=$1
> > +	local freq="cpu${cpu}_freq"
> > +	local governor="cpu${cpu}_governor"
> > +
> > +	# declare dynamic variables cpu${cpu}_freq and cpu${cpu}_governor as
> > +	# global, so they can be referenced later
> > +	declare -g "${freq}=$(cat /sys/bus/cpu/devices/cpu${cpu}/cpufreq/scaling_min_freq)"
> > +	declare -g "${governor}=$(cat /sys/bus/cpu/devices/cpu${cpu}/cpufreq/scaling_governor)"
> > +
> > +	cat /sys/bus/cpu/devices/cpu${cpu}/cpufreq/scaling_max_freq > \
> > +		/sys/bus/cpu/devices/cpu${cpu}/cpufreq/scaling_min_freq
> > +	echo -n "performance" > \
> > +		/sys/bus/cpu/devices/cpu${cpu}/cpufreq/scaling_governor
> > +}
> > +
> > +cpufreq_restore()
> > +{
> > +	local cpu=$1
> > +	local freq="cpu${cpu}_freq"
> > +	local governor="cpu${cpu}_governor"
> > +
> > +	echo "${!freq}" > /sys/bus/cpu/devices/cpu${cpu}/cpufreq/scaling_min_freq
> > +	echo -n "${!governor}" > \
> > +		/sys/bus/cpu/devices/cpu${cpu}/cpufreq/scaling_governor
> > +}
> > +
> > +isochron_recv_start()
> > +{
> > +	local if_name=$1
> > +	local uds=$2
> > +	local extra_args=$3
> > +
> > +	if ! [ -z "${uds}" ]; then
> > +		extra_args="--unix-domain-socket ${uds}"
> > +	fi
> > +
> > +	isochron rcv \
> > +		--interface ${if_name} \
> > +		--sched-priority 98 \
> > +		--sched-rr \
> 
> Why SCHED_RR?

Because it's not SCHED_OTHER? Why not SCHED_RR?