lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211126173309.GA12255@wunner.de>
Date:   Fri, 26 Nov 2021 18:33:09 +0100
From:   Lukas Wunner <lukas@...ner.de>
To:     Liguang Zhang <zhangliguang@...ux.alibaba.com>
Cc:     Bjorn Helgaas <bhelgaas@...gle.com>,
        Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        Amey Narkhede <ameynarkhede03@...il.com>,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI: pciehp: clear cmd_busy bit when Command Completed
 in polling mode

On Thu, Nov 11, 2021 at 01:42:58PM +0800, Liguang Zhang wrote:
> This patch fixes this problem that on driver probe from system startup,
> pciehp checks the Presence Detect State bit in the Slot Status register
> to bring up an occupied slot or bring down an unoccupied slot. If empty
> slot's power status is on, turn power off. The Hot-Plug interrupt isn't
> requested yet, so avoid triggering a notification by calling
> pcie_disable_notification().
> 
> Both the CCIE and HPIE bits are masked in pcie_disable_notification(),
> when we issue a hotplug command, pcie_wait_cmd() will polling the
> Command Completed bit instead of waiting for an interrupt. But cmd_busy
> bit was not cleared when Command Completed which results in timeouts
> like this in pciehp_power_off_slot() and pcie_init_notification():
> 
>   pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x01c0
> (issued 2264 msec ago)
>   pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x05c0
> (issued 2288 msec ago)
> 
> Signed-off-by: Liguang Zhang <zhangliguang@...ux.alibaba.com>

Fixes: a5dd4b4b0570 ("PCI: pciehp: Wait for hotplug command completion where necessary")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215143
Reviewed-by: Lukas Wunner <lukas@...ner.de>
Cc: stable@...r.kernel.org # v4.19+

Thanks a lot, that's a really good catch.

It's a somewhat intricate bug, so I'll try to explain in my own words:

If notification is disabled (HPIE or CCIE not set in the Slot Status
register), we rely on pcie_poll_cmd() to poll for Command Completed.
But once it's signaled, we neglect to clear ctrl->cmd_busy.
(Normally it is cleared by the hardirq handler pciehp_isr() if
notification is enabled.)

The result is that starting with the second Slot Control write,
pciehp will gratuitously wait for a command to finish which has
already finished and it will incorrectly report a timeout.

The bug was originally introduced in 2015 by commit a5dd4b4b0570
("PCI: pciehp: Wait for hotplug command completion where necessary"),
but didn't manifest itself because the first Slot Control Write already
enabled notification and from that point on the hardirq handler would
clear ctrl->cmd_busy.  However I think the bug may have manifested
itself with pciehp_poll_mode=1.

It wasn't until commit 4e6a13356f1c ("PCI: pciehp: Deduplicate presence
check on probe & resume") that multiple consecutive Slot Control writes
were performed on ->probe with notification disabled, so that's the
commit which first exposed the bug with pciehp_poll_mode=0.

Thanks,

Lukas

> ---
>  drivers/pci/hotplug/pciehp_hpc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index 83a0fa119cae..8698aefc6041 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -98,6 +98,8 @@ static int pcie_poll_cmd(struct controller *ctrl, int timeout)
>  		if (slot_status & PCI_EXP_SLTSTA_CC) {
>  			pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
>  						   PCI_EXP_SLTSTA_CC);
> +			ctrl->cmd_busy = 0;
> +			smp_mb();
>  			return 1;
>  		}
>  		msleep(10);
> -- 
> 2.19.1.6.gb485710b

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ