netdev - Re: [PATCH iwl-next v4] e1000e: Fix real-time violations on link up

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <97ae6e74-3b4e-46ef-a8f2-937bc7fd393c@intel.com>
Date: Sun, 22 Dec 2024 10:49:43 +0200
From: "Lifshits, Vitaly" <vitaly.lifshits@...el.com>
To: Gerhard Engleder <gerhard@...leder-embedded.com>,
	<intel-wired-lan@...ts.osuosl.org>, <netdev@...r.kernel.org>,
	<linux-pci@...r.kernel.org>
CC: <anthony.l.nguyen@...el.com>, <przemyslaw.kitszel@...el.com>,
	<andrew+netdev@...n.ch>, <davem@...emloft.net>, <kuba@...nel.org>,
	<edumazet@...gle.com>, <pabeni@...hat.com>, <bhelgaas@...gle.com>,
	<pmenzel@...gen.mpg.de>, <aleksander.lobakin@...el.com>, Gerhard Engleder
	<eg@...a.com>, Avigail Dahan <avigailx.dahan@...el.com>
Subject: Re: [PATCH iwl-next v4] e1000e: Fix real-time violations on link up



On 12/19/2024 9:27 PM, Gerhard Engleder wrote:
> From: Gerhard Engleder <eg@...a.com>
> 
> Link down and up triggers update of MTA table. This update executes many
> PCIe writes and a final flush. Thus, PCIe will be blocked until all
> writes are flushed. As a result, DMA transfers of other targets suffer
> from delay in the range of 50us. This results in timing violations on
> real-time systems during link down and up of e1000e in combination with
> an Intel i3-2310E Sandy Bridge CPU.
> 
> The i3-2310E is quite old. Launched 2011 by Intel but still in use as
> robot controller. The exact root cause of the problem is unclear and
> this situation won't change as Intel support for this CPU has ended
> years ago. Our experience is that the number of posted PCIe writes needs
> to be limited at least for real-time systems. With posted PCIe writes a
> much higher throughput can be generated than with PCIe reads which
> cannot be posted. Thus, the load on the interconnect is much higher.
> Additionally, a PCIe read waits until all posted PCIe writes are done.
> Therefore, the PCIe read can block the CPU for much more than 10us if a
> lot of PCIe writes were posted before. Both issues are the reason why we
> are limiting the number of posted PCIe writes in row in general for our
> real-time systems, not only for this driver.
> 
> A flush after a low enough number of posted PCIe writes eliminates the
> delay but also increases the time needed for MTA table update. The
> following measurements were done on i3-2310E with e1000e for 128 MTA
> table entries:
> 
> Single flush after all writes: 106us
> Flush after every write:       429us
> Flush after every 2nd write:   266us
> Flush after every 4th write:   180us
> Flush after every 8th write:   141us
> Flush after every 16th write:  121us
> 
> A flush after every 8th write delays the link up by 35us and the
> negative impact to DMA transfers of other targets is still tolerable.
> 
> Execute a flush after every 8th write. This prevents overloading the
> interconnect with posted writes.
> 
> Signed-off-by: Gerhard Engleder <eg@...a.com>
> Link: https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@lunn.ch/T/
> CC: Vitaly Lifshits <vitaly.lifshits@...el.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@...el.com>
> Tested-by: Avigail Dahan <avigailx.dahan@...el.com>

Reviewed-by: Vitaly Lifshits <vitaly.lifshits@...el.com>

> ---
> v4:
> - add PREEMPT_RT dependency again (Vitaly Lifshits)
> - fix comment styple (Alexander Lobakin)
> - add to comment each 8th and explain why (Alexander Lobakin)
> - simplify check for every 8th write (Alexander Lobakin)
> 
> v3:
> - mention problematic platform explicitly (Bjorn Helgaas)
> - improve comment (Paul Menzel)
> 
> v2:
> - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel)
> ---
>   drivers/net/ethernet/intel/e1000e/mac.c | 15 ++++++++++++++-
>   1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
> index d7df2a0ed629..44249dd91bd6 100644
> --- a/drivers/net/ethernet/intel/e1000e/mac.c
> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
> @@ -331,8 +331,21 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
>   	}
>   
>   	/* replace the entire MTA table */
> -	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
> +	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
>   		E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
> +
> +		if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> +			/*
> +			 * Do not queue up too many posted writes to prevent
> +			 * increased latency for other devices on the
> +			 * interconnect. Flush after each 8th posted write,
> +			 * to keep additional execution time low while still
> +			 * preventing increased latency.
> +			 */
> +			if (!(i % 8) && i)
> +				e1e_flush();
> +		}
> +	}
>   	e1e_flush();
>   }
>