Message-ID: <871qlank6o.fsf@n90.eu>
Date: Mon, 27 Mar 2023 17:43:18 +0000
From: Aleksander Trofimowicz <alex@....eu>
To: Keith Busch <kbusch@...nel.org>
Cc: Bjorn Helgaas <helgaas@...nel.org>, Jens Axboe <axboe@...com>,
Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>,
Lukas Wunner <lukas@...ner.de>, linux-pci@...r.kernel.org,
linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [bugzilla-daemon@...nel.org: [Bug 217251] New: pciehp: nvme not
visible after re-insert to tbt port]
Keith Busch <kbusch@...nel.org> writes:
> On Mon, Mar 27, 2023 at 09:33:59AM -0500, Bjorn Helgaas wrote:
>> Forwarding to NVMe folks, lists for visibility.
>>
>> ----- Forwarded message from bugzilla-daemon@...nel.org -----
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=217251
>> ...
>>
>> Created attachment 304031
>> --> https://bugzilla.kernel.org/attachment.cgi?id=304031&action=edit
>> the tracing of nvme_pci_enable() during re-insertion
>>
>> Hi,
>>
>> There is a JHL7540-based device that may host an NVMe device. After the first
>> insertion, the NVMe drive is properly discovered and handled by the relevant
>> modules. Once disconnected, any further insertion attempts fail. The device
>> is visible on the PCI bus, but nvme_pci_enable() ends up calling
>> pci_disable_device() every time; the runtime PM status of the device is
>> "suspended", and the power state of the 04:01.0 PCI bridge is D3. Preventing
>> the device from being power managed (writing "on" to
>> /sys/devices/../power/control) combined with device removal and a PCI rescan
>> changes nothing. A host reboot restores the initial state.
>>
>> I would appreciate any suggestions on how to debug this further.
>
> Sounds the same as this report:
>
> http://lists.infradead.org/pipermail/linux-nvme/2023-March/038259.html
>
> The driver is bailing on the device because we can't read its status register
> out of the remapped BAR. There's nothing we can do about that from the nvme
> driver level. Memory-mapped I/O has to work in order to proceed.
>
Thanks. I can confirm it is the same problem:
a) the platform is Intel Alderlake
b) readl(dev->bar + NVME_REG_CSTS) in nvme_pci_enable() fails
c) reading BAR0 via setpci gives 0x00000004
--
at