Message-ID: <20200112161017.43b728c8@cakuba>
Date: Sun, 12 Jan 2020 16:10:17 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Ido Schimmel <idosch@...sch.org>
Cc: netdev@...r.kernel.org, davem@...emloft.net, jiri@...lanox.com,
mlxsw@...lanox.com, Shalom Toledo <shalomt@...lanox.com>,
Ido Schimmel <idosch@...lanox.com>
Subject: Re: [PATCH net 2/4] mlxsw: switchx2: Do not modify cloned SKBs during xmit
On Sun, 12 Jan 2020 18:06:39 +0200, Ido Schimmel wrote:
> From: Shalom Toledo <shalomt@...lanox.com>
>
> The driver needs to prepend a Tx header to each packet it is transmitting.
> The header includes information such as the egress port and traffic class.
>
> The addition of the header requires the driver to modify the SKB's data
> buffer and therefore the SKB must be unshared first. Otherwise, we risk
> hitting various race conditions with cloned SKBs.
>
> For example, when a packet is flooded (cloned) by the bridge driver to two
> switch ports swp1 and swp2:
>
> t0 - mlxsw_sp_port_xmit() is called for swp1. Tx header is prepended with
> swp1's port number
> t1 - mlxsw_sp_port_xmit() is called for swp2. Tx header is prepended with
> swp2's port number, overwriting swp1's port number
> t2 - The device processes data buffer from t0. Packet is transmitted via
> swp2
> t3 - The device processes data buffer from t1. Packet is transmitted via
> swp2
>
> Usually, the device is fast enough and transmits the packet before its
> Tx header is overwritten, but this is not the case in emulated
> environments.
>
> Fix this by unsharing the SKB.
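The t0-t3 sequence quoted above can be modeled in user space. This is only a toy sketch, not driver code: struct toy_skb, struct toy_data and every toy_*() helper are hypothetical names invented for illustration, and a single int stands in for the egress-port field of the Tx header. It assumes that clones share one data buffer and that the device reads the header only after both xmit calls have run, as in the emulated case.

```c
#include <stdlib.h>

/* Toy model: clones share one data buffer (all names are hypothetical). */
struct toy_data {
	int users;	/* how many toy_skb structs point at this buffer */
	int tx_port;	/* stands in for the egress port in the Tx header */
};

struct toy_skb {
	struct toy_data *data;
};

/* Clone: a second toy_skb that references the same data buffer. */
static struct toy_skb toy_clone(struct toy_skb *skb)
{
	struct toy_skb c = { skb->data };

	skb->data->users++;
	return c;
}

/* Unshare: give skb a private copy if the buffer is shared. 0 on success. */
static int toy_unshare(struct toy_skb *skb)
{
	struct toy_data *copy;

	if (skb->data->users == 1)
		return 0;
	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;
	*copy = *skb->data;
	copy->users = 1;
	skb->data->users--;
	skb->data = copy;
	return 0;
}

/* xmit: write the header now; the "device" reads it some time later. */
static void toy_xmit(struct toy_skb *skb, int port)
{
	skb->data->tx_port = port;
}

/* Port the first clone leaves from once the device reads its buffer. */
static int toy_flood_first_port(int unshare_first)
{
	struct toy_data *orig = calloc(1, sizeof(*orig));
	struct toy_skb a, b;
	int port_seen;

	if (!orig)
		return -1;
	orig->users = 1;
	a.data = orig;
	b = toy_clone(&a);	/* bridge floods: a to swp1, b to swp2 */

	if (unshare_first && (toy_unshare(&a) || toy_unshare(&b))) {
		free(orig);
		return -1;
	}
	toy_xmit(&a, 1);		/* t0 */
	toy_xmit(&b, 2);		/* t1 */
	port_seen = a.data->tx_port;	/* t2: device reads buffer from t0 */

	if (a.data != b.data)
		free(a.data);
	free(b.data);
	return port_seen;
}
```

In this model toy_flood_first_port(0) reports port 2 for the packet that was meant for swp1, while toy_flood_first_port(1) reports port 1, because each clone writes its header into its own buffer.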
Isn't this what skb_cow_head() is for?
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
> index de6cb22f68b1..47826e905e5c 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
> @@ -299,6 +299,10 @@ static netdev_tx_t mlxsw_sx_port_xmit(struct sk_buff *skb,
>  	u64 len;
>  	int err;
>
> +	skb = skb_unshare(skb, GFP_ATOMIC);
> +	if (unlikely(!skb))
> +		return NETDEV_TX_BUSY;
> +
>  	memset(skb->cb, 0, sizeof(struct mlxsw_skb_cb));
>
>  	if (mlxsw_core_skb_transmit_busy(mlxsw_sx->core, &tx_info))
The next line here is:
		return NETDEV_TX_BUSY;
Is it okay to return NETDEV_TX_BUSY after copying an skb? The reference to
the original skb may already be gone at this point, while the copy is going
to be leaked, right?
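To make the ownership concern concrete, here is a small user-space sketch. It is a toy model, not kernel code: the packet pool and every toy_* name are hypothetical, and toy_unshare() only mimics my reading of skb_unshare(), namely that a cloned packet is replaced by a private copy and the caller's original reference is released either way. The second xmit variant, which checks for a busy queue before touching the packet, is just one possible ordering shown for contrast, not necessarily the right fix for the driver.

```c
#include <string.h>

#define TOY_POOL 4

enum toy_tx { TOY_TX_OK, TOY_TX_BUSY };

/* Packets live in a static pool so a "freed" slot can still be inspected. */
struct toy_pkt { int in_use; int cloned; };
static struct toy_pkt toy_pool[TOY_POOL];

static struct toy_pkt *toy_pkt_new(int cloned)
{
	int i;

	for (i = 0; i < TOY_POOL; i++) {
		if (!toy_pool[i].in_use) {
			toy_pool[i].in_use = 1;
			toy_pool[i].cloned = cloned;
			return &toy_pool[i];
		}
	}
	return NULL;
}

static void toy_pkt_free(struct toy_pkt *p)
{
	p->in_use = 0;
}

static int toy_live(void)
{
	int i, n = 0;

	for (i = 0; i < TOY_POOL; i++)
		n += toy_pool[i].in_use;
	return n;
}

/* Assumed skb_unshare()-like behavior: the original is released either way. */
static struct toy_pkt *toy_unshare(struct toy_pkt *p)
{
	struct toy_pkt *copy;

	if (!p->cloned)
		return p;
	copy = toy_pkt_new(0);
	toy_pkt_free(p);
	return copy;
}

/* Order from the quoted diff: unshare first, then the busy check. */
static enum toy_tx toy_xmit_unshare_first(struct toy_pkt *p, int queue_full)
{
	p = toy_unshare(p);
	if (!p)
		return TOY_TX_BUSY;
	if (queue_full)
		return TOY_TX_BUSY;	/* copy is unreachable for the caller */
	toy_pkt_free(p);		/* "transmitted" */
	return TOY_TX_OK;
}

/* Hypothetical alternative: busy check first, drop on copy failure. */
static enum toy_tx toy_xmit_check_first(struct toy_pkt *p, int queue_full)
{
	if (queue_full)
		return TOY_TX_BUSY;	/* packet untouched, caller still owns it */
	p = toy_unshare(p);
	if (!p)
		return TOY_TX_OK;	/* dropped; original already released */
	toy_pkt_free(p);		/* "transmitted" */
	return TOY_TX_OK;
}

struct toy_outcome { enum toy_tx rc; int orig_valid; int live; };

/* Hand a cloned packet to xmit while the Tx queue is full. */
static struct toy_outcome toy_run(enum toy_tx (*xmit)(struct toy_pkt *, int))
{
	struct toy_outcome out;
	struct toy_pkt *p;

	memset(toy_pool, 0, sizeof(toy_pool));
	p = toy_pkt_new(1);
	out.rc = xmit(p, 1);
	out.orig_valid = p->in_use;	/* can the caller still requeue p? */
	out.live = toy_live();
	return out;
}
```

With toy_run(toy_xmit_unshare_first) the caller gets BUSY, but its pointer refers to a released packet and one packet (the copy) is still allocated with nobody holding it. With toy_run(toy_xmit_check_first) the caller gets BUSY and the one live packet is the one it still owns.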