netdev - Re: [PATCH iwl-next v3] e1000e: Fix real-time violations on link up

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <e9bd85fc-a682-4c60-9dfd-0829ca886889@engleder-embedded.com>
Date: Wed, 18 Dec 2024 20:43:42 +0100
From: Gerhard Engleder <gerhard@...leder-embedded.com>
To: Alexander Lobakin <aleksander.lobakin@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org,
 linux-pci@...r.kernel.org, anthony.l.nguyen@...el.com,
 przemyslaw.kitszel@...el.com, andrew+netdev@...n.ch, davem@...emloft.net,
 kuba@...nel.org, edumazet@...gle.com, pabeni@...hat.com,
 bhelgaas@...gle.com, pmenzel@...gen.mpg.de, Gerhard Engleder <eg@...a.com>,
 Vitaly Lifshits <vitaly.lifshits@...el.com>
Subject: Re: [PATCH iwl-next v3] e1000e: Fix real-time violations on link up

On 18.12.24 16:23, Alexander Lobakin wrote:
> From: Gerhard Engleder <gerhard@...leder-embedded.com>
> Date: Sat, 14 Dec 2024 20:16:23 +0100
> 
>> From: Gerhard Engleder <eg@...a.com>
>>
>> Link down and up triggers update of MTA table. This update executes many
>> PCIe writes and a final flush. Thus, PCIe will be blocked until all
>> writes are flushed. As a result, DMA transfers of other targets suffer
>> from delay in the range of 50us. This results in timing violations on
>> real-time systems during link down and up of e1000e in combination with
>> an Intel i3-2310E Sandy Bridge CPU.
>>
>> The i3-2310E is quite old. Launched 2011 by Intel but still in use as
>> robot controller. The exact root cause of the problem is unclear and
>> this situation won't change as Intel support for this CPU has ended
>> years ago. Our experience is that the number of posted PCIe writes needs
>> to be limited at least for real-time systems. With posted PCIe writes a
>> much higher throughput can be generated than with PCIe reads which
>> cannot be posted. Thus, the load on the interconnect is much higher.
>> Additionally, a PCIe read waits until all posted PCIe writes are done.
>> Therefore, the PCIe read can block the CPU for much more than 10us if a
>> lot of PCIe writes were posted before. Both issues are the reason why we
>> are limiting the number of posted PCIe writes in row in general for our
>> real-time systems, not only for this driver.
>>
>> A flush after a low enough number of posted PCIe writes eliminates the
>> delay but also increases the time needed for MTA table update. The
>> following measurements were done on i3-2310E with e1000e for 128 MTA
>> table entries:
>>
>> Single flush after all writes: 106us
>> Flush after every write:       429us
>> Flush after every 2nd write:   266us
>> Flush after every 4th write:   180us
>> Flush after every 8th write:   141us
>> Flush after every 16th write:  121us
>>
>> A flush after every 8th write delays the link up by 35us and the
>> negative impact to DMA transfers of other targets is still tolerable.
>>
>> Execute a flush after every 8th write. This prevents overloading the
>> interconnect with posted writes.
>>
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@...el.com>
>> CC: Vitaly Lifshits <vitaly.lifshits@...el.com>
>> Link: https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@lunn.ch/T/
>> Signed-off-by: Gerhard Engleder <eg@...a.com>
>> ---
>> v3:
>> - mention problematic platform explicitly (Bjorn Helgaas)
>> - improve comment (Paul Menzel)
>>
>> v2:
>> - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel)
>> ---
>>   drivers/net/ethernet/intel/e1000e/mac.c | 9 ++++++++-
>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
>> index d7df2a0ed629..0174c16bbb43 100644
>> --- a/drivers/net/ethernet/intel/e1000e/mac.c
>> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
>> @@ -331,8 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
>>   	}
>>   
>>   	/* replace the entire MTA table */
>> -	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
>> +	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
>>   		E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
>> +
>> +		/* do not queue up too many posted writes to prevent increased
>> +		 * latency for other devices on the interconnect
>> +		 */
> 
> I think a multi-line comment should start with a capital letter and have
> a '.' at the end of the sentence.
> 
> + netdev code doesn't have the special rule for multi-line comments,
> they should look the same way as in the rest of the kernel:
> 
> 		/*
> 		 * Do not queue up ...
> 		 * latency ...
> 		 */

Oh the preferred style changed, I missed that. Will be done.

>> +		if ((i % 8) == 0 && i != 0)
>> +			e1e_flush();
> 
> IIRC explicit `== 0` / `!= 0` are considered redundant.
> 
> 		if (!(i % 8) && i)

You are right, will be changed.

> 
> I'd also mention in the comment above that this means "flush each 8th
> write" and why exactly 8.

I will add that information to the comment.

Thank you for the review!

Gerhard