Date:   Fri, 4 Mar 2022 09:19:25 +0000
From:   <Claudiu.Beznea@...rochip.com>
To:     <robert.hancock@...ian.com>, <netdev@...r.kernel.org>
CC:     <Nicolas.Ferre@...rochip.com>, <davem@...emloft.net>,
        <kuba@...nel.org>, <soren.brinkmann@...inx.com>,
        <scott.mcnutt@...iusxm.com>, <stable@...r.kernel.org>
Subject: Re: [PATCH net v2] net: macb: Fix lost RX packet wakeup race in NAPI receive

On 03.03.2022 20:10, Robert Hancock wrote:
> There is an oddity in the way the RSR register flags propagate to the
> ISR register (and the actual interrupt output) on this hardware: it
> appears that RSR register bits only result in ISR being asserted if the
> interrupt was actually enabled at the time, so enabling interrupts with
> RSR bits already set doesn't trigger an interrupt to be raised. There
> was already a partial fix for this race in the macb_poll function where
> it checked for RSR bits being set and re-triggered NAPI receive.
> However, there was still a race window between checking RSR and
> actually enabling interrupts, where a lost wakeup could happen. It's
> necessary to check again after enabling interrupts to see if RSR was set
> just prior to the interrupt being enabled, and re-trigger receive in that
> case.
> 
> This issue was noticed in a point-to-point UDP request-response protocol
> which periodically saw timeouts or abnormally high response times due to
> received packets not being processed in a timely fashion. In many
> applications, more packets arriving, including TCP retransmissions, would
> cause the original packet to be processed, thus masking the issue.
> 
> Fixes: 02f7a34f34e3 ("net: macb: Re-enable RX interrupt only when RX is done")
> Cc: stable@...r.kernel.org
> Co-developed-by: Scott McNutt <scott.mcnutt@...iusxm.com>
> Signed-off-by: Scott McNutt <scott.mcnutt@...iusxm.com>
> Signed-off-by: Robert Hancock <robert.hancock@...ian.com>

Tested on SAMA7G5:
Tested-by: Claudiu Beznea <claudiu.beznea@...rochip.com>

> ---
> 
> Changes since v1:
> -removed unrelated cleanup
> -added notes on observed frequency of branches to comments
> 
>  drivers/net/ethernet/cadence/macb_main.c | 25 +++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index 98498a76ae16..d13f06cf0308 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -1573,7 +1573,14 @@ static int macb_poll(struct napi_struct *napi, int budget)
>         if (work_done < budget) {
>                 napi_complete_done(napi, work_done);
> 
> -               /* Packets received while interrupts were disabled */
> +               /* RSR bits only seem to propagate to raise interrupts when
> +                * interrupts are enabled at the time, so if bits are already
> +                * set due to packets received while interrupts were disabled,
> +                * they will not cause another interrupt to be generated when
> +                * interrupts are re-enabled.
> +                * Check for this case here. This has been seen to happen
> +                * around 30% of the time under heavy network load.
> +                */
>                 status = macb_readl(bp, RSR);
>                 if (status) {
>                         if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
> @@ -1581,6 +1588,22 @@ static int macb_poll(struct napi_struct *napi, int budget)
>                         napi_reschedule(napi);
>                 } else {
>                         queue_writel(queue, IER, bp->rx_intr_mask);
> +
> +                       /* In rare cases, packets could have been received in
> +                        * the window between the check above and re-enabling
> +                        * interrupts. Therefore, a double-check is required
> +                        * to avoid losing a wakeup. This can potentially race
> +                        * with the interrupt handler doing the same actions
> +                        * if an interrupt is raised just after enabling them,
> +                        * but this should be harmless.
> +                        */
> +                       status = macb_readl(bp, RSR);
> +                       if (unlikely(status)) {
> +                               queue_writel(queue, IDR, bp->rx_intr_mask);
> +                               if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
> +                                       queue_writel(queue, ISR, MACB_BIT(RCOMP));
> +                               napi_schedule(napi);
> +                       }
>                 }
>         }
> 
> --
> 2.31.1
> 
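
For readers who want to reason about the race without the hardware, here is a minimal user-space sketch of the behaviour described in the commit message. It is a toy model, not driver code: struct toy_mac and the toy_* functions are hypothetical names that do not exist in macb_main.c, and the model assumes only what the commit message states, namely that an RSR bit raises an interrupt only if the interrupt is enabled at the moment the bit is set.

```c
#include <stdbool.h>

/* Toy model only. None of these names come from macb_main.c. */
struct toy_mac {
	bool rsr;            /* receive status: a packet is waiting */
	bool irq_enabled;    /* RX interrupt enable state */
	bool irq_pending;    /* raised only if RSR is set while enabled */
	bool napi_scheduled; /* another poll has been requested */
};

/* A packet arrives: RSR is always set, but the interrupt is raised
 * only if it is enabled at that moment (the assumed hardware oddity).
 */
void toy_packet_arrives(struct toy_mac *m)
{
	m->rsr = true;
	if (m->irq_enabled)
		m->irq_pending = true;
}

/* Models the tail of a poll that finished under budget.
 * packet_in_window: a packet lands between the RSR check and the
 * interrupt enable. recheck: apply the second RSR check from the patch.
 */
void toy_poll_tail(struct toy_mac *m, bool packet_in_window, bool recheck)
{
	m->napi_scheduled = false;

	/* First check: packets received while interrupts were disabled. */
	if (m->rsr) {
		m->napi_scheduled = true;
		return;
	}

	if (packet_in_window)
		toy_packet_arrives(m);

	/* Enabling does not raise an interrupt for an already-set RSR. */
	m->irq_enabled = true;

	/* Second check: catch a packet that arrived in the window. */
	if (recheck && m->rsr) {
		m->irq_enabled = false;
		m->napi_scheduled = true;
	}
}

/* A wakeup is lost when a packet is waiting but neither a pending
 * interrupt nor a scheduled poll will ever process it.
 */
bool toy_wakeup_lost(const struct toy_mac *m)
{
	return m->rsr && !m->irq_pending && !m->napi_scheduled;
}
```

Calling toy_poll_tail() on a zero-initialised struct toy_mac with packet_in_window set and recheck clear leaves toy_wakeup_lost() true, which corresponds to the stalled packet described above. With recheck set, the poll is rescheduled and the interrupt is disabled again, mirroring the IDR write followed by napi_schedule() in the patch.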
