lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f0305064-64d9-4705-9846-cdc0fb103b82@gmail.com>
Date: Fri, 10 May 2024 23:29:13 +0100
From: Ken Milmore <ken.milmore@...il.com>
To: Heiner Kallweit <hkallweit1@...il.com>, netdev@...r.kernel.org
Cc: nic_swsd@...ltek.com
Subject: Re: r8169: transmit queue timeouts and IRQ masking

On 10/05/2024 23:06, Heiner Kallweit wrote:
> 
> Nice idea. The following is a simplified version.
> It's based on the thought that between scheduling NAPI and start of NAPI
> polling interrupts don't hurt.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@...il.com>
> ---
>  drivers/net/ethernet/realtek/r8169_main.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index e5ea827a2..7b04dfecc 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -592,6 +592,7 @@ enum rtl_flag {
>  	RTL_FLAG_TASK_RESET_PENDING,
>  	RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE,
>  	RTL_FLAG_TASK_TX_TIMEOUT,
> +	RTL_FLAG_IRQ_DISABLED,
>  	RTL_FLAG_MAX
>  };
>  
> @@ -4657,10 +4658,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>  		rtl_schedule_task(tp, RTL_FLAG_TASK_RESET_PENDING);
>  	}
>  
> -	if (napi_schedule_prep(&tp->napi)) {
> -		rtl_irq_disable(tp);
> -		__napi_schedule(&tp->napi);
> -	}
> +	napi_schedule(&tp->napi);
>  out:
>  	rtl_ack_events(tp, status);
>  
> @@ -4714,12 +4712,17 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
>  	struct net_device *dev = tp->dev;
>  	int work_done;
>  
> +	if (!test_and_set_bit(RTL_FLAG_IRQ_DISABLED, tp->wk.flags))
> +		rtl_irq_disable(tp);
> +
>  	rtl_tx(dev, tp, budget);
>  
>  	work_done = rtl_rx(dev, tp, budget);
>  
> -	if (work_done < budget && napi_complete_done(napi, work_done))
> +	if (work_done < budget && napi_complete_done(napi, work_done)) {
> +		clear_bit(RTL_FLAG_IRQ_DISABLED, tp->wk.flags);
>  		rtl_irq_enable(tp);
> +	}
>  
>  	return work_done;
>  }

Reading this worries me though:

https://docs.kernel.org/networking/napi.html
"napi_disable() and subsequent calls to the poll method only wait for the ownership of the instance to be released, not for the poll method to exit.
This means that drivers should avoid accessing any data structures after calling napi_complete_done()."

Which seems to imply that the IRQ enable following napi_complete_done() is unguarded, and might race with the disable on an incoming poll.
Is that a possibility?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ