lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z92q84PdWVtftxI4@wunner.de>
Date: Fri, 21 Mar 2025 19:07:47 +0100
From: Lukas Wunner <lukas@...ner.de>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
	Mika Westerberg <mika.westerberg@...ux.intel.com>,
	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] PCI/hotplug: Don't enable HPIE in poll mode

On Fri, Mar 21, 2025 at 12:09:19PM -0500, Bjorn Helgaas wrote:
>   - It does make me wonder why we have both pcie_enable_interrupt()
>     and pcie_enable_notification().  Apparently we don't need to
>     restore the other PCI_EXP_SLTCTL bits when resuming?  Maybe we
>     depend on some other restoration, e.g., pci_restore_pcie_state(),
>     that happens first?

Yes and yes.

> 
>     That makes me worry that there's a window between
>     pci_restore_pcie_state() and pcie_enable_interrupt().  I suppose
>     we probably saved the pcie state after pcie_disable_interrupt(),
>     so HPIE would be disabled in the saved state.

Yes.

> 
>   - I also wonder about the fact that pci_restore_pcie_state() doesn't
>     account for Command Completed events, so we might write to Slot
>     Control too fast.

We don't. :)  See commit 469e764c4a3c ("PCI: pciehp: Obey compulsory
command delay after resume").

So this is all a bit subtle and perhaps restoring the Slot Control
registers should not be done in pci_restore_pcie_state() but rather
in pciehp and pnv (which are the only hotplug drivers touching
PCI_EXP_SLTCTL).

In particular, if hotplug control was not granted by the platform,
I don't think the kernel is supposed to touch the Slot Control
registers at all.  So it probably shouldn't restore them either.

We've been doing this since 2006 with commit b56a5a23bfec ("PCI:
Restore PCI Express capability registers after PM event").
Negotiation of _OSC wasn't added until 2010 with 75fb60f26bef
("ACPI/PCI: Negotiate _OSC control bits before requesting them").
But the OSC_PCI_EXPRESS_NATIVE_HP_CONTROL predates the introduction
of git.  I guess noone ever realized that restoring Slot Control
shouldn't be done if hotplug control wasn't granted.  I can't
remember anyone ever reporting issues because of that though.

On the other hand, changing something like this always risks
regressions.

>   - It's annoying that pcie_enable_interrupt() and
>     pcie_disable_interrupt() are global symbols, a consequence of
>     pciehp being split across five files instead of being one, which
>     is also a nuisance for code browsing.

Roughly,
pciehp_core.c contains the interface to the PCI hotplug core
  (registering the hotplug_slot_ops etc),
pciehp_hpc.c contains the interaction with hardware registers,
pciehp_core.c contains the state machine,
pciehp_pci.c contains the interaction with the PCI core
  (enumeration / de-enumeration of devices on slot bringup / bringdown).

The only reason I've refrained from making major adjustments to this
structure in the past was that it would make "git blame" a little more
difficult and applying fixes to stable kernels would also become somewhat
more painful as it would require backporting.

>     Also annoying that they are generically named, with no pciehp
>     connection (probably another consequence of being split into
>     several files).

They can be renamed.  There's no good reason for using the "pcie_"
prefix, it just appears to be a historic artifact.

>   - The eb34da60edee commit log hints at the reason for testing
>     pme_is_native().  Would be nice if there were also a comment in
>     the code about this because it's not 100% obvious why we test PME
>     support in the PCIe native hotplug driver.
> 
>   - Also slightly weird that eb34da60edee added the pme_is_native()
>     tests in pciehp_suspend() and pciehp_resume(), but somewhere along
>     the line the suspend-side one got moved to
>     pciehp_disable_interrupt(), so they're no longer parallel for no
>     obvious reason.
> 
>   - I forgot why we have both pcie_write_cmd() and
>     pcie_write_cmd_nowait() and how to decide which to use.

pcie_write_cmd_nowait() is the "fire and forget" variant,
whereas pcie_write_cmd() can be thought of as the "_sync" variant,
i.e. the control flow doesn't continue until the command has been
processed by the slot.

E.g. pciehp_power_on_slot() waits for the slot command to complete
before making sure the Link Disable bit is clear.  It wouldn't make
much sense to do the latter when the former hasn't been completed yet.

Thanks,

Lukas

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ