lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8707a2c6-644d-4ccd-989f-1fb66c48d34a@gmail.com>
Date: Fri, 6 Sep 2024 23:16:59 +0200
From: Heiner Kallweit <hkallweit1@...il.com>
To: En-Wei Wu <en-wei.wu@...onical.com>, nic_swsd@...ltek.com,
 davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
 pabeni@...hat.com, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: kuan-ying.lee@...onical.com, kai.heng.feng@...onical.com
Subject: Re: [PATCH net] r8169: correct the reset timing of RTL8125 for
 link-change event

On 06.09.2024 10:35, En-Wei Wu wrote:
> The commit 621735f59064 ("r8169: fix rare issue with broken rx after
> link-down on RTL8125") set a reset work for RTL8125 in
> r8169_phylink_handler() to avoid the MAC from locking up, this
> makes the connection broken after unplugging then re-plugging the
> Ethernet cable.
> 
> This is because the commit mistakenly put the reset work in the
> link-down path rather than the link-up path (The commit message says
> it should be put in the link-up path).
> 
That's not what the commit message is saying. It says vendor driver
r8125 does it in the link-up path.
I moved it intentionally to the link-down path, because traffic may
be flowing already after link-up.

> Moving the reset work from the link-down path to the link-up path fixes
> the issue. Also, remove the unnecessary enum member.
> 
The user who reported the issue at that time confirmed that the original
change fixed the issue for him.
Can you explain, from the NICs perspective, what exactly the difference
is when doing the reset after link-up?
Including an explanation how the original change suppresses the link-up
interrupt. And why that's not the case when doing the reset after link-up.

I simply want to be convinced enough that your change doesn't break
behavior for other users.

> Fixes: 621735f59064 ("r8169: fix rare issue with broken rx after link-down on RTL8125")
> Signed-off-by: En-Wei Wu <en-wei.wu@...onical.com>
> ---
>  drivers/net/ethernet/realtek/r8169_main.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 3507c2e28110..632e661fc74b 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -590,7 +590,6 @@ struct rtl8169_tc_offsets {
>  enum rtl_flag {
>  	RTL_FLAG_TASK_ENABLED = 0,
>  	RTL_FLAG_TASK_RESET_PENDING,
> -	RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE,
>  	RTL_FLAG_TASK_TX_TIMEOUT,
>  	RTL_FLAG_MAX
>  };
> @@ -4698,8 +4697,6 @@ static void rtl_task(struct work_struct *work)
>  reset:
>  		rtl_reset_work(tp);
>  		netif_wake_queue(tp->dev);
> -	} else if (test_and_clear_bit(RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE, tp->wk.flags)) {
> -		rtl_reset_work(tp);
>  	}
>  out_unlock:
>  	rtnl_unlock();
> @@ -4729,11 +4726,13 @@ static void r8169_phylink_handler(struct net_device *ndev)
>  	if (netif_carrier_ok(ndev)) {
>  		rtl_link_chg_patch(tp);
>  		pm_request_resume(d);
> -		netif_wake_queue(tp->dev);
> -	} else {
> +
>  		/* In few cases rx is broken after link-down otherwise */
>  		if (rtl_is_8125(tp))
> -			rtl_schedule_task(tp, RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE);
> +			rtl_schedule_task(tp, RTL_FLAG_TASK_RESET_PENDING);
> +		else
> +			netif_wake_queue(tp->dev);

This call to netif_wake_queue() isn't needed any longer, it was introduced with
the original change only.

> +	} else {
>  		pm_runtime_idle(d);
>  	}
>  


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ