[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20211125232118.2644060-5-vladimir.oltean@nxp.com>
Date: Fri, 26 Nov 2021 01:21:18 +0200
From: Vladimir Oltean <vladimir.oltean@....com>
To: netdev@...r.kernel.org, Po Liu <po.liu@....com>
Cc: "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Claudiu Manoil <claudiu.manoil@....com>,
Alexandre Belloni <alexandre.belloni@...tlin.com>,
Antoine Tenart <atenart@...nel.org>,
UNGLinuxDriver@...rochip.com, Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...il.com>,
Florian Fainelli <f.fainelli@...il.com>,
Xiaoliang Yang <xiaoliang.yang_1@....com>,
Yangbo Lu <yangbo.lu@....com>, Rui Sousa <rui.sousa@....com>,
Richard Cochran <richardcochran@...il.com>,
"Allan W . Nielsen" <allan.nielsen@...rochip.com>
Subject: [PATCH net-next 4/4] net: mscc: ocelot: set up traps for PTP packets
IEEE 1588 support was declared too soon for the Ocelot switch. Out of
reset, this switch does not apply any special treatment for PTP packets,
i.e. when an event message is received, the natural tendency is to
forward it by MAC DA/VLAN ID. This poses a problem when the ingress port
is under a bridge, since user space application stacks (written
primarily for endpoint ports, not switches) like ptp4l expect that PTP
messages are always received on AF_PACKET / AF_INET sockets (depending
on the PTP transport being used), and never being autonomously
forwarded. Any forwarding, if necessary (for example in Transparent
Clock mode) is handled in software by ptp4l. Having the hardware forward
these packets too will cause duplicates which will confuse endpoints
connected to these switches.
So PTP over L2 barely works, in the sense that PTP packets reach the CPU
port, but they reach it via flooding, and therefore reach lots of other
unwanted destinations too. But PTP over IPv4/IPv6 does not work at all.
This is because the Ocelot switch have a separate destination port mask
for unknown IP multicast (which PTP over IP is) flooding compared to
unknown non-IP multicast (which PTP over L2 is) flooding. Specifically,
the driver allows the CPU port to be in the PGID_MC port group, but not
in PGID_MCIPV4 and PGID_MCIPV6. There are several presentations from
Allan Nielsen which explain that the embedded MIPS CPU on Ocelot
switches is not very powerful at all, so every penny they could save by
not allowing flooding to the CPU port module matters. Unknown IP
multicast did not make it.
The de facto consensus is that when a switch is PTP-aware and an
application stack for PTP is running, switches should have some sort of
trapping mechanism for PTP packets, to extract them from the hardware
data path. This avoids both problems:
(a) PTP packets are no longer flooded to unwanted destinations
(b) PTP over IP packets are no longer denied from reaching the CPU since
they arrive there via a trap and not via flooding
It is not the first time when this change is attempted. Last time, the
feedback from Allan Nielsen and Andrew Lunn was that the traps should
not be installed by default, and that PTP-unaware switching may be
desired for some use cases:
https://patchwork.ozlabs.org/project/netdev/patch/20190813025214.18601-5-yangbo.lu@nxp.com/
To address that feedback, the present patch adds the necessary packet
traps according to the RX filter configuration transmitted by user space
through the SIOCSHWTSTAMP ioctl. Trapping is done via VCAP IS2, where we
keep 5 filters, which are amended each time RX timestamping is enabled
or disabled on a port:
- 1 for PTP over L2
- 2 for PTP over IPv4 (UDP ports 319 and 320)
- 2 for PTP over IPv6 (UDP ports 319 and 320)
The cookie by which these filters (invisible to tc) are identified is
strategically chosen such that it does not collide with the filters used
for the ocelot-8021q tagging protocol by the Felix driver, or with the
MRP traps set up by the Ocelot library.
Other alternatives were considered, like patching user space to do
something, but there are so many ways in which PTP packets could be made
to reach the CPU, generically speaking, that "do what?" is a very valid
question. The ptp4l program from the linuxptp stack already attempts to
do something: it calls setsockopt(IP_ADD_MEMBERSHIP) (and
PACKET_ADD_MEMBERSHIP, respectively) which translates in both cases into
a dev_mc_add() on the interface, in the kernel:
https://github.com/richardcochran/linuxptp/blob/v3.1.1/udp.c#L73
https://github.com/richardcochran/linuxptp/blob/v3.1.1/raw.c
Reality shows that this is not sufficient in case the interface belongs
to a switchdev driver, as dev_mc_add() does not show the intention to
trap a packet to the CPU, but rather the intention to not drop it (it is
strictly for RX filtering, same as promiscuous does not mean to send all
traffic to the CPU, but to not drop traffic with unknown MAC DA). This
topic is a can of worms in itself, and it would be great if user space
could just stay out of it.
On the other hand, setting up PTP traps privately within the driver is
not new by any stretch of the imagination:
https://elixir.bootlin.com/linux/v5.16-rc2/source/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c#L833
https://elixir.bootlin.com/linux/v5.16-rc2/source/drivers/net/dsa/hirschmann/hellcreek.c#L1050
https://elixir.bootlin.com/linux/v5.16-rc2/source/include/linux/dsa/sja1105.h#L21
So this is the approach taken here as well. The difference here being
that we prepare and destroy the traps per port, dynamically at runtime,
as opposed to driver init time, because apparently, PTP-unaware
forwarding is a use case.
Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support")
Reported-by: Po Liu <po.liu@....com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
---
drivers/net/ethernet/mscc/ocelot.c | 241 ++++++++++++++++++++++++++++-
1 file changed, 240 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index f5490533c884..456aeba2b245 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -1347,6 +1347,225 @@ int ocelot_fdb_dump(struct ocelot *ocelot, int port,
}
EXPORT_SYMBOL(ocelot_fdb_dump);
+static void ocelot_populate_l2_ptp_trap_key(struct ocelot_vcap_filter *trap)
+{
+ trap->key_type = OCELOT_VCAP_KEY_ETYPE;
+ *(__be16 *)trap->key.etype.etype.value = htons(ETH_P_1588);
+ *(__be16 *)trap->key.etype.etype.mask = htons(0xffff);
+}
+
+static void
+ocelot_populate_ipv4_ptp_event_trap_key(struct ocelot_vcap_filter *trap)
+{
+ trap->key_type = OCELOT_VCAP_KEY_IPV4;
+ trap->key.ipv4.dport.value = PTP_EV_PORT;
+ trap->key.ipv4.dport.mask = 0xffff;
+}
+
+static void
+ocelot_populate_ipv6_ptp_event_trap_key(struct ocelot_vcap_filter *trap)
+{
+ trap->key_type = OCELOT_VCAP_KEY_IPV6;
+ trap->key.ipv6.dport.value = PTP_EV_PORT;
+ trap->key.ipv6.dport.mask = 0xffff;
+}
+
+static void
+ocelot_populate_ipv4_ptp_general_trap_key(struct ocelot_vcap_filter *trap)
+{
+ trap->key_type = OCELOT_VCAP_KEY_IPV4;
+ trap->key.ipv4.dport.value = PTP_GEN_PORT;
+ trap->key.ipv4.dport.mask = 0xffff;
+}
+
+static void
+ocelot_populate_ipv6_ptp_general_trap_key(struct ocelot_vcap_filter *trap)
+{
+ trap->key_type = OCELOT_VCAP_KEY_IPV6;
+ trap->key.ipv6.dport.value = PTP_GEN_PORT;
+ trap->key.ipv6.dport.mask = 0xffff;
+}
+
+static int ocelot_trap_add(struct ocelot *ocelot, int port,
+ unsigned long cookie,
+ void (*populate)(struct ocelot_vcap_filter *f))
+{
+ struct ocelot_vcap_block *block_vcap_is2;
+ struct ocelot_vcap_filter *trap;
+ bool new = false;
+ int err;
+
+ block_vcap_is2 = &ocelot->block[VCAP_IS2];
+
+ trap = ocelot_vcap_block_find_filter_by_id(block_vcap_is2, cookie,
+ false);
+ if (!trap) {
+ trap = kzalloc(sizeof(*trap), GFP_KERNEL);
+ if (!trap)
+ return -ENOMEM;
+
+ populate(trap);
+ trap->prio = 1;
+ trap->id.cookie = cookie;
+ trap->id.tc_offload = false;
+ trap->block_id = VCAP_IS2;
+ trap->type = OCELOT_VCAP_FILTER_OFFLOAD;
+ trap->lookup = 0;
+ trap->action.cpu_copy_ena = true;
+ trap->action.mask_mode = OCELOT_MASK_MODE_PERMIT_DENY;
+ trap->action.port_mask = 0;
+ new = true;
+ }
+
+ trap->ingress_port_mask |= BIT(port);
+
+ if (new)
+ err = ocelot_vcap_filter_add(ocelot, trap, NULL);
+ else
+ err = ocelot_vcap_filter_replace(ocelot, trap);
+ if (err) {
+ trap->ingress_port_mask &= ~BIT(port);
+ if (!trap->ingress_port_mask)
+ kfree(trap);
+ return err;
+ }
+
+ return 0;
+}
+
+static int ocelot_trap_del(struct ocelot *ocelot, int port,
+ unsigned long cookie)
+{
+ struct ocelot_vcap_block *block_vcap_is2;
+ struct ocelot_vcap_filter *trap;
+
+ block_vcap_is2 = &ocelot->block[VCAP_IS2];
+
+ trap = ocelot_vcap_block_find_filter_by_id(block_vcap_is2, cookie,
+ false);
+ if (!trap)
+ return 0;
+
+ trap->ingress_port_mask &= ~BIT(port);
+ if (!trap->ingress_port_mask)
+ return ocelot_vcap_filter_del(ocelot, trap);
+
+ return ocelot_vcap_filter_replace(ocelot, trap);
+}
+
+static int ocelot_l2_ptp_trap_add(struct ocelot *ocelot, int port)
+{
+ unsigned long l2_cookie = ocelot->num_phys_ports + 1;
+
+ return ocelot_trap_add(ocelot, port, l2_cookie,
+ ocelot_populate_l2_ptp_trap_key);
+}
+
+static int ocelot_l2_ptp_trap_del(struct ocelot *ocelot, int port)
+{
+ unsigned long l2_cookie = ocelot->num_phys_ports + 1;
+
+ return ocelot_trap_del(ocelot, port, l2_cookie);
+}
+
+static int ocelot_ipv4_ptp_trap_add(struct ocelot *ocelot, int port)
+{
+ unsigned long ipv4_gen_cookie = ocelot->num_phys_ports + 2;
+ unsigned long ipv4_ev_cookie = ocelot->num_phys_ports + 3;
+ int err;
+
+ err = ocelot_trap_add(ocelot, port, ipv4_ev_cookie,
+ ocelot_populate_ipv4_ptp_event_trap_key);
+ if (err)
+ return err;
+
+ err = ocelot_trap_add(ocelot, port, ipv4_gen_cookie,
+ ocelot_populate_ipv4_ptp_general_trap_key);
+ if (err)
+ ocelot_trap_del(ocelot, port, ipv4_ev_cookie);
+
+ return err;
+}
+
+static int ocelot_ipv4_ptp_trap_del(struct ocelot *ocelot, int port)
+{
+ unsigned long ipv4_gen_cookie = ocelot->num_phys_ports + 2;
+ unsigned long ipv4_ev_cookie = ocelot->num_phys_ports + 3;
+ int err;
+
+ err = ocelot_trap_del(ocelot, port, ipv4_ev_cookie);
+ err |= ocelot_trap_del(ocelot, port, ipv4_gen_cookie);
+ return err;
+}
+
+static int ocelot_ipv6_ptp_trap_add(struct ocelot *ocelot, int port)
+{
+ unsigned long ipv6_gen_cookie = ocelot->num_phys_ports + 4;
+ unsigned long ipv6_ev_cookie = ocelot->num_phys_ports + 5;
+ int err;
+
+ err = ocelot_trap_add(ocelot, port, ipv6_ev_cookie,
+ ocelot_populate_ipv6_ptp_event_trap_key);
+ if (err)
+ return err;
+
+ err = ocelot_trap_add(ocelot, port, ipv6_gen_cookie,
+ ocelot_populate_ipv6_ptp_general_trap_key);
+ if (err)
+ ocelot_trap_del(ocelot, port, ipv6_ev_cookie);
+
+ return err;
+}
+
+static int ocelot_ipv6_ptp_trap_del(struct ocelot *ocelot, int port)
+{
+ unsigned long ipv6_gen_cookie = ocelot->num_phys_ports + 4;
+ unsigned long ipv6_ev_cookie = ocelot->num_phys_ports + 5;
+ int err;
+
+ err = ocelot_trap_del(ocelot, port, ipv6_ev_cookie);
+ err |= ocelot_trap_del(ocelot, port, ipv6_gen_cookie);
+ return err;
+}
+
+static int ocelot_setup_ptp_traps(struct ocelot *ocelot, int port,
+ bool l2, bool l4)
+{
+ int err;
+
+ if (l2)
+ err = ocelot_l2_ptp_trap_add(ocelot, port);
+ else
+ err = ocelot_l2_ptp_trap_del(ocelot, port);
+ if (err)
+ return err;
+
+ if (l4) {
+ err = ocelot_ipv4_ptp_trap_add(ocelot, port);
+ if (err)
+ goto err_ipv4;
+
+ err = ocelot_ipv6_ptp_trap_add(ocelot, port);
+ if (err)
+ goto err_ipv6;
+ } else {
+ err = ocelot_ipv4_ptp_trap_del(ocelot, port);
+
+ err |= ocelot_ipv6_ptp_trap_del(ocelot, port);
+ }
+ if (err)
+ return err;
+
+ return 0;
+
+err_ipv6:
+ ocelot_ipv4_ptp_trap_del(ocelot, port);
+err_ipv4:
+ if (l2)
+ ocelot_l2_ptp_trap_del(ocelot, port);
+ return err;
+}
+
int ocelot_hwstamp_get(struct ocelot *ocelot, int port, struct ifreq *ifr)
{
return copy_to_user(ifr->ifr_data, &ocelot->hwtstamp_config,
@@ -1357,7 +1576,9 @@ EXPORT_SYMBOL(ocelot_hwstamp_get);
int ocelot_hwstamp_set(struct ocelot *ocelot, int port, struct ifreq *ifr)
{
struct ocelot_port *ocelot_port = ocelot->ports[port];
+ bool l2 = false, l4 = false;
struct hwtstamp_config cfg;
+ int err;
if (copy_from_user(&cfg, ifr->ifr_data, sizeof(cfg)))
return -EFAULT;
@@ -1392,19 +1613,37 @@ int ocelot_hwstamp_set(struct ocelot *ocelot, int port, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
+ l4 = true;
+ break;
case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
+ l2 = true;
+ break;
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
- cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
+ l2 = true;
+ l4 = true;
break;
default:
mutex_unlock(&ocelot->ptp_lock);
return -ERANGE;
}
+ err = ocelot_setup_ptp_traps(ocelot, port, l2, l4);
+ if (err)
+ return err;
+
+ if (l2 && l4)
+ cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
+ else if (l2)
+ cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT;
+ else if (l4)
+ cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_L4_EVENT;
+ else
+ cfg.rx_filter = HWTSTAMP_FILTER_NONE;
+
/* Commit back the result & save it */
memcpy(&ocelot->hwtstamp_config, &cfg, sizeof(cfg));
mutex_unlock(&ocelot->ptp_lock);
--
2.25.1
Powered by blists - more mailing lists