Message-ID: <f0552842-7824-424e-af21-ac9eb3c5c14d@kernel.org>
Date: Wed, 21 Aug 2024 07:09:20 +0200
From: Jiri Slaby <jirislaby@...nel.org>
To: Bjorn Helgaas <helgaas@...nel.org>, Petr Valenta <petr@...klidu.cz>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, Len Brown <lenb@...nel.org>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
Linux kernel mailing list <linux-kernel@...r.kernel.org>,
Linux regressions mailing list <regressions@...ts.linux.dev>,
Tony Nguyen <anthony.l.nguyen@...el.com>, przemyslaw.kitszel@...el.com,
intel-wired-lan@...ts.osuosl.org, "Rafael J. Wysocki" <rafael@...nel.org>
Subject: Re: ACPI IRQ storm with 6.10
On 20. 08. 24, 23:30, Bjorn Helgaas wrote:
> On Tue, Aug 20, 2024 at 11:13:54PM +0200, Petr Valenta wrote:
>> On 20. 08. 24 at 20:09, Bjorn Helgaas wrote:
>>> On Mon, Aug 19, 2024 at 07:23:42AM +0200, Jiri Slaby wrote:
>>>> On 19. 08. 24, 6:50, Jiri Slaby wrote:
>>>>> CC e1000e guys + Jesse (due to 75a3f93b5383) + Bjorn (due to b2c289415b2b)
>>>>
>>>> Bjorn,
>>>>
>>>> I am confused by these changes:
>>>> ==========================================
>>>> @@ -291,16 +288,13 @@ static int e1000_set_link_ksettings(struct net_device *netdev,
>>>> * duplex is forced.
>>>> */
>>>> if (cmd->base.eth_tp_mdix_ctrl) {
>>>> - if (hw->phy.media_type != e1000_media_type_copper) {
>>>> - ret_val = -EOPNOTSUPP;
>>>> - goto out;
>>>> - }
>>>> + if (hw->phy.media_type != e1000_media_type_copper)
>>>> + return -EOPNOTSUPP;
>>>>
>>>> if ((cmd->base.eth_tp_mdix_ctrl != ETH_TP_MDI_AUTO) &&
>>>> (cmd->base.autoneg != AUTONEG_ENABLE)) {
>>>> e_err("forcing MDI/MDI-X state is not supported when link speed and/or duplex are forced\n");
>>>> - ret_val = -EINVAL;
>>>> - goto out;
>>>> + return -EINVAL;
>>>> }
>>>> }
>>>>
>>>> @@ -347,7 +341,6 @@ static int e1000_set_link_ksettings(struct net_device *netdev,
>>>> }
>>>>
>>>> out:
>>>> - pm_runtime_put_sync(netdev->dev.parent);
>>>> clear_bit(__E1000_RESETTING, &adapter->state);
>>>> return ret_val;
>>>> }
>>>> ==========================================
>>>>
>>>> So clear_bit(__E1000_RESETTING, &adapter->state) is no longer reached on
>>>> the fail paths above. Is that intentional?
>>>
>>> Not intentional. Petr, do you have the ability to test the patch
>>> below? I'm not sure it's the correct fix, but it reverts the pieces
>>> of b2c289415b2b that Jiri pointed out.
>>
>> I tested the patch below but it didn't help. After the first boot with the new
>> kernel it looked promising, as the IRQ storm only appeared for a few seconds,
>> but on subsequent reboots it was the same as without the patch.
>
> Thank you very much for testing that!
>> To be sure, I am also sending the md5sum of ethtool.c after applying the patch:
>>
>> a25c003257538f16994b4d7c4260e894 ethtool.c
>
> Thanks, that matches what I get when applying the patch on v6.10.
>
> I'm at a loss. You could try reverting the entire b2c289415b2b commit
> (patch for that is below).
FWIW he already tested with b2c289415b2b reverted (I provided him with a
built kernel). It behaves the same. So you are not the breaker.
> If that doesn't help, I guess you could try reverting the other
> commits Jiri mentioned:
>
> 76a0a3f9cc2f e1000e: fix force smbus during suspend flow
> c93a6f62cb1b e1000e: Fix S0ix residency on corporate systems
> bfd546a552e1 e1000e: move force SMBUS near the end of enable_ulp function
> 6918107e2540 net: e1000e & ixgbe: Remove PCI_HEADER_TYPE_MFD duplicates
> 1eb2cded45b3 net: annotate writes on dev->mtu from ndo_change_mtu()
> b2c289415b2b e1000e: Remove redundant runtime resume for ethtool_ops
> 75a3f93b5383 net: intel: implement modern PM ops declarations
>
> If you do this, I would revert 76a0a3f9cc2f, test, then revert
> c93a6f62cb1b in addition, test, then revert bfd546a552e1 in addition,
> etc.
Or perhaps it is easier to bisect directly:
git bisect start v6.10 v6.9 -- drivers/net/ethernet/intel/e1000e/
But that assumes one of the above commits broke it. If they
did not, as a last resort, you can still do a full bisect (without the "--
drivers" part).
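To show what the path limit does, here is a minimal sketch on a throwaway repository (not the kernel tree). The directory names, tags, commit subjects and the grep-based "test" are all made up for illustration; on the real machine the test step is booting the kernel and watching for the IRQ storm.

```shell
#!/bin/sh
# Toy demonstration of a path-limited bisect. Everything here (repo layout,
# tags, commit subjects, the grep test) is hypothetical, not kernel content.
set -e

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .
git config user.email "bisect@example.invalid"
git config user.name "Bisect Demo"

mkdir -p drivers/e1000e other

commit() {
    git add -A
    git commit -q -m "$1"
}

echo "state=ok" > drivers/e1000e/netdev.c
echo "0" > other/file.c
commit "base"
git tag good-tag

# Six commits; only three touch drivers/e1000e/, and one of those breaks it.
echo "1" > other/file.c;                          commit "unrelated 1"
echo "state=ok v2" > drivers/e1000e/netdev.c;     commit "driver: harmless change"
echo "2" > other/file.c;                          commit "unrelated 2"
echo "state=broken" > drivers/e1000e/netdev.c;    commit "driver: breaking change"
echo "3" > other/file.c;                          commit "unrelated 3"
echo "state=broken v2" > drivers/e1000e/netdev.c; commit "driver: later change"
git tag bad-tag

# Bad revision first, then good, then the path limit after "--".
git bisect start bad-tag good-tag -- drivers/e1000e/ >/dev/null

# Exit status 0 marks a commit good, 1 marks it bad.
git bisect run sh -c '! grep -q broken drivers/e1000e/netdev.c' >/dev/null

# refs/bisect/bad now points at the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```

With the path limit, only the three commits touching drivers/e1000e/ are candidates, so fewer kernels need to be built and booted. The same limit is also why the result is only valid if the culprit really is under that directory.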
I would take SUSE's v6.10 config, boot 6.10, and then do:
lsmod > /tmp/lsmod
make LSMOD=/tmp/lsmod localyesconfig
make bzImage
and use that bzImage.
Note that graphics, wireless and other stuff will not work unless you
build in the firmware for them (EXTRA_FIRMWARE config). Alternatively, use
localmodconfig and also build and install the modules (now limited to the
ones your machine uses).
thanks,
--
js
suse labs