lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <8d2b0aec270b8cd0111654dc4b361987a112d3ce.camel@sipsolutions.net>
Date: Thu, 04 May 2023 14:10:50 +0200
From: Johannes Berg <johannes@...solutions.net>
To: Sultan Alsawaf <sultan@...neltoast.com>
Cc: "Greenman, Gregory" <gregory.greenman@...el.com>, Kalle Valo
 <kvalo@...nel.org>, "David S. Miller" <davem@...emloft.net>, Eric Dumazet
 <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
 <pabeni@...hat.com>, "Goodstein, Mordechay"
 <mordechay.goodstein@...el.com>,  "Coelho, Luciano"
 <luciano.coelho@...el.com>, "Sisodiya, Mukesh" <mukesh.sisodiya@...el.com>,
  "linux-wireless@...r.kernel.org" <linux-wireless@...r.kernel.org>,
 "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] wifi: iwlwifi: Fix spurious packet drops with RSS

[let's see if my reply will make it to the list, the original seems to
not have]

On Sun, 2023-04-30 at 00:13 +0000, Sultan Alsawaf wrote:
> From: Sultan Alsawaf <sultan@...neltoast.com>
> 
> When RSS is used and one of the RX queues lags behind others by more than
> 2048 frames, then new frames arriving on the lagged RX queue are
> incorrectly treated as old rather than new by the reorder buffer, and are
> thus spuriously dropped. This is because the reorder buffer treats frames
> as old when they have an SN that is more than 2048 away from the head SN,
> which causes the reorder buffer to drop frames that are actually valid.
> 
> The odds of this occurring naturally increase with the number of
> RX queues used, so CPUs with many threads are more susceptible to
> encountering spurious packet drops caused by this issue.
> 
> As it turns out, the firmware already detects when a frame is either old or
> duplicated and exports this information, but it's currently unused. Using
> these firmware bits to decide when frames are old or duplicated fixes the
> spurious drops.

So I assume you tested it now, and it works? Somehow I had been under
the impression we never got it to work back when...

> Johannes mentions that the 9000 series' firmware doesn't support these
> bits, so disable RSS on the 9000 series chipsets since they lack a
> mechanism to properly detect old and duplicated frames.

Indeed, I checked this again, I also somehow thought it was backported
to some versions but doesn't look like. We can either leave those old
ones broken (they only shipped with fewer cores anyway), or just disable
it as you did here, not sure. RSS is probably not as relevant with those
slower speeds anyway.

> +++ b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c
> @@ -918,7 +918,6 @@ static bool iwl_mvm_reorder(struct iwl_mvm *mvm,
>         struct iwl_mvm_sta *mvm_sta;
>         struct iwl_mvm_baid_data *baid_data;
>         struct iwl_mvm_reorder_buffer *buffer;
> -       struct sk_buff *tail;
>         u32 reorder = le32_to_cpu(desc->reorder_data);
>         bool amsdu = desc->mac_flags2 & IWL_RX_MPDU_MFLG2_AMSDU;
>         bool last_subframe =
> @@ -1020,7 +1019,7 @@ static bool iwl_mvm_reorder(struct iwl_mvm *mvm,
>                                  rx_status->device_timestamp, queue);
> 
>         /* drop any oudated packets */
> -       if (ieee80211_sn_less(sn, buffer->head_sn))
> +       if (reorder & IWL_RX_MPDU_REORDER_BA_OLD_SN)
>                 goto drop;
> 
>         /* release immediately if allowed by nssn and no stored frames */
> @@ -1068,24 +1067,12 @@ static bool iwl_mvm_reorder(struct iwl_mvm *mvm,
>                 return false;
>         }

All that "send queue sync" code in the middle that was _meant_ to fix
this issue but I guess never really did can also be removed, no? And the
timer, etc. etc.

johannes

[leaving full quote for the benefit of the mailing list]

> 
> -       index = sn % buffer->buf_size;
> -
> -       /*
> -        * Check if we already stored this frame
> -        * As AMSDU is either received or not as whole, logic is simple:
> -        * If we have frames in that position in the buffer and the last frame
> -        * originated from AMSDU had a different SN then it is a retransmission.
> -        * If it is the same SN then if the subframe index is incrementing it
> -        * is the same AMSDU - otherwise it is a retransmission.
> -        */
> -       tail = skb_peek_tail(&entries[index].e.frames);
> -       if (tail && !amsdu)
> -               goto drop;
> -       else if (tail && (sn != buffer->last_amsdu ||
> -                         buffer->last_sub_index >= sub_frame_idx))
> +       /* drop any duplicated packets */
> +       if (desc->status & cpu_to_le32(IWL_RX_MPDU_STATUS_DUPLICATE))
>                 goto drop;
> 
>         /* put in reorder buffer */
> +       index = sn % buffer->buf_size;
>         __skb_queue_tail(&entries[index].e.frames, skb);
>         buffer->num_stored++;
>         entries[index].e.reorder_time = jiffies;
> --
> 2.40.1
> 


Powered by blists - more mailing lists