lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 23 Jun 2019 05:13:20 +0000
From:   Sudarsana Reddy Kalluru <skalluru@...vell.com>
To:     "Guilherme G. Piccoli" <gpiccoli@...onical.com>,
        GR-everest-linux-l2 <GR-everest-linux-l2@...vell.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC:     Ariel Elior <aelior@...vell.com>,
        "jay.vosburgh@...onical.com" <jay.vosburgh@...onical.com>
Subject: RE: [EXT] [PATCH] bnx2x: Prevent ptp_task to be rescheduled
 indefinitely


> -----Original Message-----
> From: Guilherme G. Piccoli <gpiccoli@...onical.com>
> Sent: Saturday, June 22, 2019 2:57 AM
> To: GR-everest-linux-l2 <GR-everest-linux-l2@...vell.com>;
> netdev@...r.kernel.org
> Cc: Ariel Elior <aelior@...vell.com>; Sudarsana Reddy Kalluru
> <skalluru@...vell.com>; gpiccoli@...onical.com;
> jay.vosburgh@...onical.com
> Subject: [EXT] [PATCH] bnx2x: Prevent ptp_task to be rescheduled indefinitely
> 
> External Email
> 
> ----------------------------------------------------------------------
> Currently bnx2x ptp worker tries to read a register with timestamp
> information in case of TX packet timestamping and in case it fails, the routine
> reschedules itself indefinitely. This was reported as a kworker always at 100%
> of CPU usage, which was narrowed down to be bnx2x ptp_task.
> 
> By following the ioctl handler, we could narrow down the problem to an NTP
> tool (chrony) requesting HW timestamping from bnx2x NIC with RX filter
> zeroed; this isn't reproducible for example with linuxptp since this tool
> request a supported RX filter. It seems the NIC HW timestamp mechanism
> cannot work well with RX_FILTER_NONE - in driver's PTP filter initialization
> routine, when there's not a supported filter request the function does not
> perform a specific register write to the adapter.
> 
> This patch addresses the problem of the everlasting reschedule of the ptp
> worker by limiting that to 3 attempts (the first one plus two reschedules), in
> order to prevent the unbound resource consumption from the driver. It's not
> correct behavior for a driver to not take into account potential problems in a
> routine reading a device register, be it an invalid RX filter (leading to a non-
> functional HW clock) or even a potential device FW issue causing the register
> value to be wrong, hence we believe the fix is relevant to ensure proper
> driver behavior.
> 
> This has no functional change in the succeeding path of the HW
> timestamping code in the driver, only portion of code it changes is the error
> path for TX timestamping. It was tested using both linuxptp and chrony.
> 
> Reported-and-tested-by: Przemyslaw Hausman
> <przemyslaw.hausman@...onical.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@...onical.com>
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x.h    |  1 +
>  .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c    |  1 +
>  .../net/ethernet/broadcom/bnx2x/bnx2x_main.c   | 18 +++++++++++++-----
>  3 files changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> index 6026b53137aa..349965135227 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> @@ -1838,6 +1838,7 @@ struct bnx2x {
>  	bool timecounter_init_done;
>  	struct sk_buff *ptp_tx_skb;
>  	unsigned long ptp_tx_start;
> +	u8 ptp_retry_count;
>  	bool hwtstamp_ioctl_called;
>  	u16 tx_type;
>  	u16 rx_filter;
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> index 008ad0ca89ba..990ec049f357 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> @@ -3865,6 +3865,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb,
> struct net_device *dev)
>  			/* schedule check for Tx timestamp */
>  			bp->ptp_tx_skb = skb_get(skb);
>  			bp->ptp_tx_start = jiffies;
> +			bp->ptp_retry_count = 0;
>  			schedule_work(&bp->ptp_task);
>  		}
>  	}
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> index 03ac10b1cd1e..872ae672faaa 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -15233,16 +15233,24 @@ static void bnx2x_ptp_task(struct work_struct
> *work)
>  		memset(&shhwtstamps, 0, sizeof(shhwtstamps));
>  		shhwtstamps.hwtstamp = ns_to_ktime(ns);
>  		skb_tstamp_tx(bp->ptp_tx_skb, &shhwtstamps);
> -		dev_kfree_skb_any(bp->ptp_tx_skb);
> -		bp->ptp_tx_skb = NULL;
> -
>  		DP(BNX2X_MSG_PTP, "Tx timestamp, timestamp cycles =
> %llu, ns = %llu\n",
>  		   timestamp, ns);
> +		goto clear;
>  	} else {
>  		DP(BNX2X_MSG_PTP, "There is no valid Tx timestamp
> yet\n");
> -		/* Reschedule to keep checking for a valid timestamp value
> */
> -		schedule_work(&bp->ptp_task);
> +		/* Reschedule twice to check again for a valid timestamp */
> +		if (++bp->ptp_retry_count < 3) {
> +			schedule_work(&bp->ptp_task);
> +			return;
> +		}
> +		DP(BNX2X_MSG_PTP, "Gave up Tx timestamp, register read
> %u\n", val_seq);
> +		netdev_warn_once(bp->dev,
> +				 "Gave up Tx timestamp, register read %u\n",
> val_seq);
>  	}
> +clear:
> +	dev_kfree_skb_any(bp->ptp_tx_skb);
> +	bp->ptp_tx_skb = NULL;
> +	bp->ptp_retry_count = 0;
>  }
> 
>  void bnx2x_set_rx_ts(struct bnx2x *bp, struct sk_buff *skb)
> --
> 2.21.0

Thanks for uncovering this issue, and the fix.
With the proposed fix, if HW latches the timestamp after 3 iterations then it would lead to erroneous PTP functionality. When driver receives the next PTP packet, driver sees that timestamp is available in the register and would assign this value (which corresponds to last/skipped ptp packet) to the PTP packet. In general, FW takes around 1ms to 100ms to perform the timestamp and may take more. Hence the promising solution for this issue would be to wait for timestamp for particular period of time. If Tx timestamp is not available for a pre-determined max period, then declare it as an error scenario and terminate the thread.

Just for the reference, similar fix was added for Marvell qede driver recently. Patch details are,
9adebac37e7d: (qede: Handle infinite driver spinning for Tx timestamp.)
https://www.spinics.net/lists/netdev/msg572838.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ