Message-ID: <9fb5f018-7333-421b-8e2d-1f6eb98cffaa@intel.com>
Date: Thu, 19 Jun 2025 15:20:35 +0300
From: "Lifshits, Vitaly" <vitaly.lifshits@...el.com>
To: Christian Heusel <christian@...sel.eu>,
Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>
CC: Paul Menzel <pmenzel@...gen.mpg.de>, Tony Nguyen
<anthony.l.nguyen@...el.com>, Przemek Kitszel <przemyslaw.kitszel@...el.com>,
<netdev@...r.kernel.org>, <intel-wired-lan@...ts.osuosl.org>,
<regressions@...ts.linux.dev>, <stable@...r.kernel.org>, Sasha Levin
<sashal@...nel.org>
Subject: Re: [Intel-wired-lan] [REGRESSION] e1000e heavy packet loss on Meteor
Lake - 6.14.2
On 6/18/2025 4:41 PM, Christian Heusel wrote:
> On 25/06/18 03:28PM, Marek Marczykowski-Górecki wrote:
>> On Fri, May 09, 2025 at 02:17:32AM +0200, Marek Marczykowski-Górecki wrote:
>>> On Fri, May 09, 2025 at 01:28:36AM +0200, Marek Marczykowski-Górecki wrote:
>>>> On Fri, May 09, 2025 at 01:13:28AM +0200, Paul Menzel wrote:
>>>>> Dear Marek, dear Vitaly,
>>>>>
>>>>>
>>>>> Am 09.05.25 um 00:41 schrieb Marek Marczykowski-Górecki:
>>>>>> On Thu, May 08, 2025 at 09:26:18AM +0300, Lifshits, Vitaly wrote:
>>>>>>> On 4/21/2025 4:28 PM, Marek Marczykowski-Górecki wrote:
>>>>>>>> On Mon, Apr 21, 2025 at 03:19:12PM +0200, Marek Marczykowski-Górecki wrote:
>>>>>>>>> On Mon, Apr 21, 2025 at 03:44:02PM +0300, Lifshits, Vitaly wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 4/16/2025 3:43 PM, Marek Marczykowski-Górecki wrote:
>>>>>>>>>>> On Wed, Apr 16, 2025 at 03:09:39PM +0300, Lifshits, Vitaly wrote:
>>>>>>>>>>>> Can you please also share the output of ethtool -i? I would like to know the
>>>>>>>>>>>> NVM version that you have on your device.
>>>>>>>>>>>
>>>>>>>>>>> driver: e1000e
>>>>>>>>>>> version: 6.14.1+
>>>>>>>>>>> firmware-version: 1.1-4
>>>>>>>>>>> expansion-rom-version:
>>>>>>>>>>> bus-info: 0000:00:1f.6
>>>>>>>>>>> supports-statistics: yes
>>>>>>>>>>> supports-test: yes
>>>>>>>>>>> supports-eeprom-access: yes
>>>>>>>>>>> supports-register-dump: yes
>>>>>>>>>>> supports-priv-flags: yes
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Your firmware version is not the latest, can you check with the board
>>>>>>>>>> manufacturer if there is a BIOS update to your system?
>>>>>>>>>
>>>>>>>>> I can check, but still, it's a regression in the Linux driver - old
>>>>>>>>> kernel did work perfectly well on this hw. Maybe new driver tries to use
>>>>>>>>> some feature that is missing (or broken) in the old firmware?
>>>>>>>>
>>>>>>>> A little bit of context: I'm maintaining the kernel package for a Qubes
>>>>>>>> OS distribution. While I can try to update firmware on my test system, I
>>>>>>>> have no influence on what hardware users will use this kernel, and
>>>>>>>> which firmware version they will use (and whether all the vendors
>>>>>>>> provide newer firmware at all). I cannot ship a kernel that is known
>>>>>>>> to break network on some devices.
>>>>>>>>
>>>>>>>>>> Also, you mentioned that on another system this issue doesn't reproduce, do
>>>>>>>>>> they have the same firmware version?
>>>>>>>>>
>>>>>>>>> The other one has also 1.1-4 firmware. And I re-checked, e1000e from
>>>>>>>>> 6.14.2 works fine there.
>>>>>
>>>>>>> Thank you for your detailed feedback and for providing the requested
>>>>>>> information.
>>>>>>>
>>>>>>> We have conducted extensive testing of this patch across multiple systems
>>>>>>> and have not observed any packet loss issues. Upon comparing the mentioned
>>>>>>> setups, we noted that while the LAN controller is similar, the CPU differs.
>>>>>>> We believe that the issue may be related to transitions in the CPU's low
>>>>>>> power states.
>>>>>>>
>>>>>>> Consequently, we kindly request that you disable the CPU low power state
>>>>>>> transitions in the S0 system state and verify if the issue persists. You can
>>>>>>> disable this in the kernel parameters on the command line with idle=poll.
>>>>>>> Please note that this command is intended for debugging purposes only, as it
>>>>>>> may result in higher power consumption.
>>>>>>
>>>>>> I tried with idle=poll, and it didn't help, I still see a lot of packet
>>>>>> losses. But I can also confirm that idle=poll makes the system use
>>>>>> significantly more power (previously at 25-30W, with this option stays
>>>>>> at about 42W).
>>>>>>
>>>>>> Is there any other info I can provide, enable some debug features or
>>>>>> something?
>>>>>>
>>>>>> I see the problem is with receiving packets - in my simple ping test,
>>>>>> the ping target sees all the echo requests (and responds to them), but
>>>>>> the responses aren't reaching ping back (and are not visible on tcpdump
>>>>>> on the problematic system either).
>>>>>
>>>>> As the cause is still unclear, can the commit please be reverted in the
>>>>> master branch to adhere to Linux's no-regression policy, so that it can
>>>>> then be reverted from the stable series?
>>>>>
>>>>> Marek, did you also test 6.15 release candidates?
>>>>
>>>> The last test I did was on 6.15-rc3. I can re-test on -rc5.
>>>
>>> Same with 6.15-rc5.
>>
>> And the same issue still applies to 6.16-rc2. FWIW Qubes OS kernel has
>> this buggy patch reverted and nobody complained (contrary to the version
>> with the patch included). Should I submit the revert patch?

Reverting this patch is not a good idea, as most systems would then hit
the original issues (PHY access failures and packet loss). I introduced
the patch in the first place because large vendors reported the packet
loss issue. See the following reports:
https://answers.launchpad.net/ubuntu/+question/816003
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2066064
https://bugzilla.kernel.org/show_bug.cgi?id=218869

As an interim solution, we can add a private flag to make this behavior
configurable. I will also share a patch that might fix the issue on your
system; please give it a try.
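
If the configurable option ends up being exposed through ethtool's
private flags (an assumption; no flag name has been settled in this
thread), its state could be checked from the output of
`ethtool --show-priv-flags <iface>`. The sketch below only parses that
text. The interface name `eth0` and the sample output are illustrative,
and `s0ix-enabled` is used merely as an example of an existing e1000e
private flag, not as the flag proposed here.

```python
def parse_priv_flags(output: str) -> dict[str, bool]:
    """Parse `ethtool --show-priv-flags` text into {flag_name: enabled}."""
    flags: dict[str, bool] = {}
    for line in output.splitlines():
        # Split on the last colon; the header line ("Private flags for ...:")
        # has nothing after it and is skipped by the on/off check below.
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value in ("on", "off"):
            flags[name.strip()] = value == "on"
    return flags


if __name__ == "__main__":
    # Illustrative sample, not captured from the affected system.
    sample = "Private flags for eth0:\ns0ix-enabled: on\n"
    print(parse_priv_flags(sample))
```

On a real system the text could come from
`subprocess.run(["ethtool", "--show-priv-flags", iface], capture_output=True, text=True).stdout`.
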
FYI, we are currently investigating a similar issue that seems to be due
to a misconfiguration of the system firmware.
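
To compare results with and without the patch, it may help to put a
number on the loss seen in the ping test described earlier in the
thread. The following is a minimal sketch that extracts the counters
from the summary line printed by iputils `ping`; the helper name and the
sample text are hypothetical, and other ping implementations may word
the summary differently.

```python
import re

# Matches e.g. "10 packets transmitted, 4 received" (iputils) and
# "10 packets transmitted, 4 packets received" (some other variants).
_SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def parse_ping_summary(output: str) -> tuple[int, int, float]:
    """Return (transmitted, received, loss_percent) from ping output."""
    match = _SUMMARY.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    loss = 100.0 * (sent - received) / sent if sent else 0.0
    return sent, received, loss


if __name__ == "__main__":
    # Illustrative sample, not a measurement from the affected system.
    sample = "10 packets transmitted, 4 received, 60% packet loss, time 9012ms"
    print(parse_ping_summary(sample))
```

Running the same ping count on a kernel with the patch and on one with
it reverted gives two directly comparable loss figures.
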
>
> Just submit a revert then 👍 I have no authority here, but had good
> experience with just sending a revert patch in the past 🤗
>
> Cheers,
> Chris