netdev - Re: [Intel-wired-lan] [PATCH] ice/ptp: fix the PTP worker retrying indefinitely if the link went down

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <2bdeb975-6d45-67bb-3017-f19df62fe7af@intel.com>
Date:   Tue, 17 Jan 2023 10:46:20 -0800
From:   Jacob Keller <jacob.e.keller@...el.com>
To:     Daniel Vacek <neelx@...hat.com>,
        Jesse Brandeburg <jesse.brandeburg@...el.com>,
        Tony Nguyen <anthony.l.nguyen@...el.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        "Richard Cochran" <richardcochran@...il.com>,
        "Kolacinski, Karol" <karol.kolacinski@...el.com>,
        Siddaraju <siddaraju.dh@...el.com>,
        "Michalik, Michal" <michal.michalik@...el.com>
CC:     <netdev@...r.kernel.org>, <intel-wired-lan@...ts.osuosl.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [Intel-wired-lan] [PATCH] ice/ptp: fix the PTP worker retrying
 indefinitely if the link went down

On 1/17/2023 10:15 AM, Daniel Vacek wrote:
> When the link goes down the ice_ptp_tx_tstamp_work() may loop forever trying
> to process the packets. In this case it makes sense to just drop them.
> 

Hi,

Do you have some more details on what situation caused this state? Or is
this just based on code review?

It won't loop forever because if link is down for more than 2 seconds
we'll discard the old timestamps which we assume are not going to arrive.

The trouble is that if a timestamp *does* arrive late, we need to ensure
that we never assign the captured time to the wrong packet, and that for
E822 devices we always read the correct number of timestamps (otherwise
we can get the logic for timestamp interrupt generation broken).

Consider for example this flow for e810:

1) a tx packet with a timestamp request is scheduled to hw
2) the packet begins being transmitted
3) link goes down
4) interrupt triggers, ice_ptp_tx_tstamp is run
5) link is down, so we skip reading this timestamp. Since we don't read
the timestamp, we just discard the skb and we don't update the cached tx
timestamp value
6) link goes up
7) 2 tx packets with a timestamp request are sent and one of them is on
the same index as the packet in (1)
8) the other tx packet completes and we get an interrupt
9) the loop reads both timestamps. Since the tx packet in slot (1)
doesn't match its cached value it looks "new" so the function reports
the old timestamp to the wrong packet.

Consider this flow for e822:

1) 2 tx packets with a timestamp request are scheduled to hw
2) the packets begin being transmitted
3) link goes down
4) an interrupt for the Tx timestamp is triggered, but we don't read the
timestamps because we have link down and we skipped the ts_read.
5) the internal e822 hardware counter is not decremented due to no reads
6) no more timestamp interrupts will be triggered by hardware until we
read the appropriate number of timestamps

I am not sure if link going up will properly reset and re-initialize the
Tx timestamp block but I suspect it does not. @Karol, @Siddaraju,
@Michal, do you recall more details on this?

I understand the desire to avoid polling when unnecessary, but I am
worried because the hardware and firmware interactions here are pretty
complex and its easy to get this wrong. (see: all of the previous
patches and bug fixes we've been working on... we got this wrong a LOT
already...)

Without a more concrete explanation of what this fixes I'm worried about
this change :(

At a minimum I think I would only set drop_ts but not not goto skip_ts_read.

That way if we happen to have a ready timestamp (for E822) we'll still
read it and avoid the miscounting from not reading a completed timestamp.

This also ensures that on e810 the cached timestamp value is updated and
that we avoid the other situation.

I'd still prefer if you have a bug report or more details on the failure
case. I believe even if we poll it should be no more than 2 seconds for
an old timestamp that never got sent to be discarded.

> Signed-off-by: Daniel Vacek <neelx@...hat.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_ptp.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
> index d63161d73eb16..c313177ba6676 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
> +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
> @@ -680,6 +680,7 @@ static bool ice_ptp_tx_tstamp(struct ice_ptp_tx *tx)
>  	struct ice_pf *pf;
>  	struct ice_hw *hw;
>  	u64 tstamp_ready;
> +	bool link_up;
>  	int err;
>  	u8 idx;
>  
> @@ -695,6 +696,8 @@ static bool ice_ptp_tx_tstamp(struct ice_ptp_tx *tx)
>  	if (err)
>  		return false;
>  
> +	link_up = hw->port_info->phy.link_info.link_info & ICE_AQ_LINK_UP;
> +
>  	for_each_set_bit(idx, tx->in_use, tx->len) {
>  		struct skb_shared_hwtstamps shhwtstamps = {};
>  		u8 phy_idx = idx + tx->offset;
> @@ -702,6 +705,12 @@ static bool ice_ptp_tx_tstamp(struct ice_ptp_tx *tx)
>  		bool drop_ts = false;
>  		struct sk_buff *skb;
>  
> +		/* Drop packets if the link went down */
> +		if (!link_up) {
> +			drop_ts = true;
> +			goto skip_ts_read;
> +		}
> +
>  		/* Drop packets which have waited for more than 2 seconds */
>  		if (time_is_before_jiffies(tx->tstamps[idx].start + 2 * HZ)) {
>  			drop_ts = true;