Message-ID: <aQEwsdvx8fvPyj5k@aurel32.net>
Date: Tue, 28 Oct 2025 22:08:01 +0100
From: Aurelien Jarno <aurelien@...el32.net>
To: Alex Elder <elder@...cstar.com>
Cc: Johannes Erdfelt <johannes@...felt.com>, robh@...nel.org,
krzk+dt@...nel.org, conor+dt@...nel.org, bhelgaas@...gle.com,
lpieralisi@...nel.org, kwilczynski@...nel.org, mani@...nel.org,
vkoul@...nel.org, kishon@...nel.org, dlan@...too.org,
guodong@...cstar.com, pjw@...nel.org, palmer@...belt.com,
aou@...s.berkeley.edu, alex@...ti.fr, p.zabel@...gutronix.de,
christian.bruel@...s.st.com, shradha.t@...sung.com,
krishna.chundru@....qualcomm.com, qiang.yu@....qualcomm.com,
namcao@...utronix.de, thippeswamy.havalige@....com,
inochiama@...il.com, devicetree@...r.kernel.org,
linux-pci@...r.kernel.org, linux-phy@...ts.infradead.org,
spacemit@...ts.linux.dev, linux-riscv@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/7] Introduce SpacemiT K1 PCIe phy and host controller
On 2025-10-28 14:10, Alex Elder wrote:
> On 10/28/25 1:42 PM, Johannes Erdfelt wrote:
> > On Tue, Oct 28, 2025, Aurelien Jarno <aurelien@...el32.net> wrote:
> > > Hi Alex,
> > >
> > > On 2025-10-17 11:21, Alex Elder wrote:
> > > > On 10/16/25 11:47 AM, Aurelien Jarno wrote:
> > > > > Hi Alex,
> > > > >
> > > > > On 2025-10-13 10:35, Alex Elder wrote:
> > > > > > This series introduces a PHY driver and a PCIe driver to support PCIe
> > > > > > on the SpacemiT K1 SoC. The PCIe implementation is derived from a
> > > > > > Synopsys DesignWare PCIe IP. The PHY driver supports one combination
> > > > > > PCIe/USB PHY as well as two PCIe-only PHYs. The combo PHY port uses
> > > > > > one PCIe lane, and the other two ports each have two lanes. All PCIe
> > > > > > ports operate at 5 GT/second.
> > > > > >
> > > > > > The PCIe PHYs must be configured using a value that can only be
> > > > > > determined using the combo PHY, operating in PCIe mode. To allow
> > > > > > that PHY to be used for USB, the calibration step is performed by
> > > > > > the PHY driver automatically at probe time. Once this step is done,
> > > > > > the PHY can be used for either PCIe or USB.
> > > > > >
> > > > > > Version 2 of this series incorporates suggestions made during the
> > > > > > review of version 1. Specific highlights are detailed below.
> > > > >
> > > > > With the issues mentioned in patch 4 fixed, this patchset works fine for
> > > > > me. That said I had to disable ASPM by passing pcie_aspm=off on the
> > > > > command line, as it is now enabled by default since 6.18-rc1 [1]. At
> > > > > this stage, I am not sure if it is an issue with my NVME drive or an
> > > > > issue with the controller.
> > > >
> > > > Can you describe what symptoms you had that required you to pass
> > > > "pcie_aspm=off" on the kernel command line?
> > > >
> > > > I see these lines in my boot log related to ASPM (and added by
> > > > the commit you link to), for both pcie1 and pcie2:
> > > >
> > > > pci 0000:01:00.0: ASPM: DT platform, enabling L0s-up L0s-dw L1 ASPM-L1.1 ASPM-L1.2 PCI-PM-L1.1 PCI-PM-L1.2
> > > > pci 0000:01:00.0: ASPM: DT platform, enabling ClockPM
> > > >
> > > > . . .
> > > >
> > > > nvme nvme0: pci function 0000:01:00.0
> > > > nvme 0000:01:00.0: enabling device (0000 -> 0002)
> > > > nvme nvme0: allocated 64 MiB host memory buffer (16 segments).
> > > > nvme nvme0: 8/0/0 default/read/poll queues
> > > > nvme0n1: p1
> > > >
> > > > My NVMe drive on pcie1 works correctly.
> > > > https://www.crucial.com/ssd/p3/CT1000P3SSD8
> > > >
> > > > root@...anapif3:~# df /a
> > > > Filesystem 1K-blocks Used Available Use% Mounted on
> > > > /dev/nvme0n1p1 960302804 32063304 879385040 4% /a
> > > > root@...anapif3:~#
> > >
> > > Sorry for the delay, it took me time to test some more things and
> > > different SSDs. First of all I still see the issue with your v3 on top
> > > of v6.18-rc3, which includes some fixes for ASPM support [1].
> > >
> > > I have tried 3 different SSDs. None of them work, and the
> > > symptoms differ, although all are related to ASPM (pcie_aspm=off
> > > works around the issue).
> > >
> > > With a Fox Spirit PM18 SSD (Silicon Motion, Inc. SM2263EN/SM2263XT
> > > controller), I do not have more than this:
> > > [ 5.196723] nvme nvme0: pci function 0000:01:00.0
> > > [ 5.198843] nvme 0000:01:00.0: enabling device (0000 -> 0002)
> > >
> > > With a WD Blue SN570 SSD, I get this:
> > > [ 5.199513] nvme nvme0: pci function 0000:01:00.0
> > > [ 5.201653] nvme 0000:01:00.0: enabling device (0000 -> 0002)
> > > [ 5.270334] nvme nvme0: allocated 32 MiB host memory buffer (8 segments).
> > > [ 5.277624] nvme nvme0: 8/0/0 default/read/poll queues
> > > [ 19.192350] nvme nvme0: using unchecked data buffer
> > > [ 48.108400] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
> > > [ 48.113885] nvme nvme0: Does your device have a faulty power saving mode enabled?
> > > [ 48.121346] nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off" and report a bug
> > > [ 48.176878] nvme0n1: I/O Cmd(0x2) @ LBA 0, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
> > > [ 48.181926] I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
> > > [ 48.243670] nvme 0000:01:00.0: enabling device (0000 -> 0002)
> > > [ 48.246914] nvme nvme0: Disabling device after reset failure: -19
> > > [ 48.280495] Buffer I/O error on dev nvme0n1, logical block 0, async page read
> > >
> > >
> > > Finally with a PNY CS1030 SSD (Phison PS5015-E15 controller), I get this:
> > > [ 5.215631] nvme nvme0: pci function 0000:01:00.0
> > > [ 5.220435] nvme 0000:01:00.0: enabling device (0000 -> 0002)
> > > [ 5.329565] nvme nvme0: allocated 64 MiB host memory buffer (16 segments).
> > > [ 66.540485] nvme nvme0: I/O tag 28 (401c) QID 0 timeout, disable controller
> > > [ 66.585245] nvme 0000:01:00.0: probe with driver nvme failed with error -4
> > >
> > > Note that I also tested this latest SSD on a VisionFive 2 board with exactly
> > > the same kernel (I just moved the SSD and booted), and it works fine with ASPM
> > > enabled (confirmed with lspci).
> >
> > I have been testing this patchset recently as well, but on an Orange Pi
> > RV2 board instead (and an extra RV2 specific patch to enable power to
> > the M.2 slot).
> >
> > I ran into the same symptoms you had ("QID 0 timeout" after about 60
> > seconds). However, I'm using an Intel 600p. I can confirm my NVME drive
> > seems to work fine with the "pcie_aspm=off" workaround as well.
>
> I don't see this problem, and haven't tried to reproduce it yet.
>
> Mani told me I needed to add these lines to ensure the "runtime
> PM hierarchy of PCIe chain" won't be "broken":
>
> pm_runtime_set_active()
> pm_runtime_no_callbacks()
> devm_pm_runtime_enable()
>
> Just out of curiosity, could you try with those lines added
> just before these assignments in k1_pcie_probe()?
>
> k1->pci.dev = dev;
> k1->pci.ops = &k1_pcie_ops;
> dw_pcie_cap_set(&k1->pci, REQ_RES);
>
> I doubt it will fix what you're seeing, but at the moment I'm
> working on something else.
>
Thanks for your fast answer. I have just tried this patch:
--- a/drivers/pci/controller/dwc/pcie-spacemit-k1.c
+++ b/drivers/pci/controller/dwc/pcie-spacemit-k1.c
@@ -271,6 +271,16 @@ static int k1_pcie_probe(struct platform_device *pdev)
 		return dev_err_probe(dev, PTR_ERR(k1->phy),
 				     "failed to get PHY\n");
 
+	ret = pm_runtime_set_active(dev);
+	if (ret < 0)
+		return dev_err_probe(dev, ret, "Failed to activate runtime PM\n");
+
+	pm_runtime_no_callbacks(dev);
+
+	ret = devm_pm_runtime_enable(dev);
+	if (ret < 0)
+		return dev_err_probe(dev, ret, "Failed to enable runtime PM\n");
+
 	k1->pci.dev = dev;
 	k1->pci.ops = &k1_pcie_ops;
 	dw_pcie_cap_set(&k1->pci, REQ_RES);
Unfortunately this doesn't fix the issue. On the positive side, things
still work with it and pcie_aspm=off.
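
For comparing boot logs between boards and SSDs, here is a minimal
sketch (not part of the patch) that collects the ASPM states the kernel
reports enabling for each device. It assumes the "ASPM: DT platform,
enabling ..." message format quoted earlier in this thread; the function
name aspm_states_from_log is made up for illustration.

```python
import re

# Matches lines such as the ones quoted in this thread:
#   pci 0000:01:00.0: ASPM: DT platform, enabling L0s-up L0s-dw L1 ...
# A leading "[    5.196723] " timestamp is tolerated because search() is used.
# Assumption: the message wording is exactly "ASPM: DT platform, enabling".
ASPM_LINE = re.compile(
    r"pci (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"ASPM: DT platform, enabling (?P<states>.+)$"
)


def aspm_states_from_log(log_text):
    """Return {PCI address: [states the kernel said it was enabling]}."""
    result = {}
    for line in log_text.splitlines():
        match = ASPM_LINE.search(line.strip())
        if match:
            # Several lines may refer to the same device (link states, ClockPM).
            states = match.group("states").split()
            result.setdefault(match.group("bdf"), []).extend(states)
    return result
```

Feeding it the output of dmesg from a boot without pcie_aspm=off gives
one entry per device, which makes it easier to diff what was enabled on
a failing setup against a working one.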
Regards,
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@...el32.net http://aurel32.net