linux-kernel - Re: [PATCH] e1000e: Link flap workaround option for false IRP events

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <50d86329-98b1-4579-9cf1-d974cf7a748d@app.fastmail.com>
Date: Thu, 27 Feb 2025 10:05:41 -0500
From: "Mark Pearson" <mpearson-lenovo@...ebb.ca>
To: "Andrew Lunn" <andrew@...n.ch>
Cc: anthony.l.nguyen@...el.com, przemyslaw.kitszel@...el.com,
 andrew+netdev@...n.ch, davem@...emloft.net, edumazet@...gle.com,
 kuba@...nel.org, pabeni@...hat.com, intel-wired-lan@...ts.osuosl.org,
 netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] e1000e: Link flap workaround option for false IRP events

Hi Andrew,

On Wed, Feb 26, 2025, at 5:52 PM, Andrew Lunn wrote:
> On Wed, Feb 26, 2025 at 02:44:12PM -0500, Mark Pearson wrote:
>> Issue is seen on some Lenovo desktop workstations where there
>> is a false IRP event which triggers a link flap.
>> Condition is rare and only seen on networks where link speed
>> may differ along the path between nodes (e.g 10M/100M)
>> 
>> Intel are not able to determine root cause but provided a
>> workaround that does fix the issue. Tested extensively at Lenovo.
>> 
>> Adding a module option to enable this workaround for users
>> who are impacted by this issue.
>
> Why is a module option needed? Does the workaround itself introduce
> issues? Please describe those issues?
>
> In general, module options are not liked. So please include in the
> commit message why a module option is the only option.
> 

Understood. 

The reason for the module option is I'm playing it safe, as Intel couldn't determine root cause.
The aim of the patch is to keep the effect to a minimum whilst allowing users who are impacted to turn on the workaround, if they are encountering the issue.

Issue details:
We have seen the issue when running high level traffic on a network involving at least two nodes and also having two different network speeds are need. For example:
[Lenovo WS] <---1G link---> Network switch <---100M link--->[traffic source]
The link flap can take a day or two to reproduce - it's rare.

We worked for a long time with the Intel networking team to try and root cause the issue but unfortunately, despite being able to reproduce the issue in their lab, they decided to not pursue the investigation. They suggested the register read as a workaround and we confirmed it fixes the problem (setup ran for weeks without issue - we haven't seen any side issues). Unfortunately nobody can explain why the fix works.

I don't think the workaround should be implemented as a general case without support from Intel. 
I considered a DMI quirk, but without root cause I do worry about unknown side effects.
There is also the possibility of the issue showing up on other platforms we don't know of yet - and I wanted a way to be able to easily enable it if needed (e.g be able to tell a customer - try enabling this and see if it fixes it).

A module option seemed like a good compromise, but I'm happy to consider alternatives if there are any recommendations.

>> Signed-off-by: Mark Pearson <mpearson-lenovo@...ebb.ca>
>> ---
>>  drivers/net/ethernet/intel/e1000e/netdev.c | 19 +++++++++++++++++++
>>  1 file changed, 19 insertions(+)
>> 
>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
>> index 286155efcedf..06774fb4b2dd 100644
>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>> @@ -37,6 +37,10 @@ static int debug = -1;
>>  module_param(debug, int, 0);
>>  MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
>>  
>> +static int false_irp_workaround;
>> +module_param(false_irp_workaround, int, 0);
>> +MODULE_PARM_DESC(false_irp_workaround, "Enable workaround for rare false IRP event causing link flap");
>> +
>>  static const struct e1000_info *e1000_info_tbl[] = {
>>  	[board_82571]		= &e1000_82571_info,
>>  	[board_82572]		= &e1000_82572_info,
>> @@ -1757,6 +1761,21 @@ static irqreturn_t e1000_intr_msi(int __always_unused irq, void *data)
>>  	/* read ICR disables interrupts using IAM */
>>  	if (icr & E1000_ICR_LSC) {
>>  		hw->mac.get_link_status = true;
>> +
>> +		/*
>> +		 * False IRP workaround
>> +		 * Issue seen on Lenovo P5 and P7 workstations where if there
>> +		 * are different link speeds in the network a false IRP event
>> +		 * is received, leading to a link flap.
>> +		 * Intel unable to determine root cause. This read prevents
>> +		 * the issue occurring
>> +		 */
>> +		if (false_irp_workaround) {
>> +			u16 phy_data;
>> +
>> +			e1e_rphy(hw, PHY_REG(772, 26), &phy_data);
>
> Please add some #define for these magic numbers, so we have some idea
> what PHY register you are actually reading. That in itself might help
> explain how the workaround actually works.
>

I don't know what this register does I'm afraid - that's Intel knowledge and has not been shared.

This approach, with magic numbers, is used all over the place in the driver and related modules, presumably contributed previously by Intel engineers. Can I push back on this request with a note that Intel would need to provide the register definitions for their components first.

Thanks for the review. I'll give it a couple of days to see if any other feedback, and push a v2 with updated commit description.

Mark