lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 25 Apr 2019 16:04:39 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Remi Pommarel <repk@...plefau.lt>
Cc:     Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
        Ellie Reeves <ellierevves@...il.com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v2] PCI: aardvark: Use LTSSM state to build link training
 flag

Hi Remi,

On Sat, Mar 16, 2019 at 05:12:43PM +0100, Remi Pommarel wrote:
> The PCI_EXP_LNKSTA_LT flag in the emulated root device's PCI_EXP_LNKSTA
> config register does not reflect the actual link training state and is
> always cleared. The Link Training and Status State Machine (LTSSM) flag
> in LMI config register could be used as a link training indicator.

LMI?  I assume this is an Aardvark-specific register?  Maybe "Aardvark
LMI register", since the other things here are generic PCIe registers?

Is this a hardware erratum?  I know advk does some software emulation,
but it looks like the Aardvark PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKSTA
register is awfully close to being exactly the PCIe-defined
PCI_EXP_LNKSTA, so the difference seems like a mistake.

> Indeed if the LTSSM is in L0 or upper state then link training has
> completed (see [1]).
> 
> Unfortunately because setting the PCI_EXP_LINCTL_RL flag does not

s/PCI_EXP_LINCTL_RL/PCI_EXP_LNKCTL_RL/

> instantly imply a LTSSM state change (e.g. L0s to recovery state
> transition takes some time), LTSSM can be in L0 but link training has
> not finished yet. Thus a lower L0 LTSSM state followed by a L0 or upper
> state sequence has to be seen to be sure that link training has been
> done.
> 
> Because one may not call a pcie conf register read on LNKSTA after
> doing a retrain link or may miss the link down state due to timing, a
> 20ms timeout is used. Passing this timeout link is considered retrained.

It sounds like reading and/or writing some registers during a retrain
causes some sort of EL1 error?  Is this a separate erratum?  Is there
a list of the registers and operations (read/write) that are affected?
The backtrace below suggests that it's actually a read of LNKCAP or
LNKCTL (not LNKSTA) that caused the error.

It sounds like there are really two problems:

  1) Reading PCI_EXP_LNKSTA (or the Aardvark equivalent) doesn't give
     valid data for PCI_EXP_LNKSTA_LT.

  2) Sometimes config reads cause EL1 errors.

If that's the case and if it's possible, can you split this into a
patch for each issue?

> This fixes boot hang or kernel panic with the following callstack due to
> ASPM setup doing a link re-train and polling for PCI_EXP_LNKSTA_LT flag
> to be cleared before using it.
> 
> -------------------- 8< -------------------
> 	[    0.915389]  dump_backtrace+0x0/0x140
> 	[    0.915391]  show_stack+0x14/0x20
> 	[    0.915393]  dump_stack+0x90/0xb4
> 	[    0.915394]  panic+0x134/0x2c0
> 	[    0.915396]  nmi_panic+0x6c/0x70
> 	[    0.915398]  arm64_serror_panic+0x74/0x80
> 	[    0.915400]  is_valid_bugaddr+0x0/0x8
> 	[    0.915402]  el1_error+0x7c/0xe4
> 	[    0.915404]  advk_pcie_rd_conf+0x4c/0x250
> 	[    0.915406]  pci_bus_read_config_word+0x7c/0xd0
> 	[    0.915408]  pcie_capability_read_word+0x90/0xc8
> 	[    0.915410]  pcie_get_aspm_reg+0x68/0x118
> 	[    0.915412]  pcie_aspm_init_link_state+0x460/0xa98

This backtrace doesn't make sense to me as being related to this
issue.  You said above that the bug was that PCI_EXP_LNKSTA_LT is not
updated.  But apparently even *reading* a register at the wrong time
causes this EL1 error.  And pcie_get_aspm_reg() doesn't even read
LNKSTA; it only reads LNKCAP and LNKCTL.

BTW, if you're including a backtrace in a commit log, you can strip
out the timestamps and the "cut" lines because they don't contribute
information that's relevant in this context.

> 	[    0.915414]  pci_scan_slot+0xe8/0x100
> 	[    0.915416]  pci_scan_child_bus_extend+0x50/0x288
> 	[    0.915418]  pci_scan_bridge_extend+0x348/0x4f0
> 	[    0.915420]  pci_scan_child_bus_extend+0x1dc/0x288
> 	[    0.915423]  pci_scan_root_bus_bridge+0xc4/0xe0
> 	[    0.915424]  pci_host_probe+0x14/0xa8
> 	[    0.915426]  advk_pcie_probe+0x838/0x910
> 	[...]
> -------------------- 8< -------------------
> 
> [1] "PCI Express Base Specification", REV. 2.1
>     PCI Express, March 4 2009, Table 4-7
> 
> Signed-off-by: Remi Pommarel <repk@...plefau.lt>
> ---
> Changes since v1:
>   - Rename retraining flag field
>   - Fix DEVCTL register writing
> ---
>  drivers/pci/controller/pci-aardvark.c | 33 ++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index eb58dfdaba1b..47b707b5fc2c 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -180,6 +180,7 @@
>  #define LINK_WAIT_MAX_RETRIES		10
>  #define LINK_WAIT_USLEEP_MIN		90000
>  #define LINK_WAIT_USLEEP_MAX		100000
> +#define LINK_RETRAIN_DELAY_MAX		(20 * HZ / 1000) /* 20 ms */
>  
>  #define MSI_IRQ_NUM			32
>  
> @@ -199,6 +200,8 @@ struct advk_pcie {
>  	u16 msi_msg;
>  	int root_bus_nr;
>  	struct pci_bridge_emul bridge;
> +	unsigned long rl_deadline; /* Retrain link jiffies deadline */
> +	u8 rl_asked; /* Retraining has been asked and is in transition */
>  };
>  
>  static inline void advk_writel(struct advk_pcie *pcie, u32 val, u64 reg)
> @@ -400,6 +403,19 @@ static int advk_pcie_wait_pio(struct advk_pcie *pcie)
>  	return -ETIMEDOUT;
>  }
>  
> +static int advk_pcie_link_retraining(struct advk_pcie *pcie)
> +{
> +	if (!advk_pcie_link_up(pcie)) {
> +		pcie->rl_asked = 0;
> +		return 1;
> +	}
> +
> +	if (pcie->rl_asked && time_before(jiffies, pcie->rl_deadline))
> +		return 1;
> +
> +	pcie->rl_asked = 0;
> +	return 0;
> +}
>  
>  static pci_bridge_emul_read_status_t
>  advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge,
> @@ -426,11 +442,19 @@ advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge,
>  		return PCI_BRIDGE_EMUL_HANDLED;
>  	}
>  
> +	case PCI_EXP_LNKCTL: {

Don't you mean PCI_EXP_LNKSTA here?

> +		u32 val = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + reg) &
> +			~(PCI_EXP_LNKSTA_LT << 16);
> +		if (advk_pcie_link_retraining(pcie))
> +			val |= (PCI_EXP_LNKSTA_LT << 16);
> +		*value = val;
> +		return PCI_BRIDGE_EMUL_HANDLED;
> +	}
> +
>  	case PCI_CAP_LIST_ID:
>  	case PCI_EXP_DEVCAP:
>  	case PCI_EXP_DEVCTL:
>  	case PCI_EXP_LNKCAP:
> -	case PCI_EXP_LNKCTL:

If you did mean PCI_EXP_LNKSTA above, I suppose you would leave
PCI_EXP_LNKCTL here?

>  		*value = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + reg);
>  		return PCI_BRIDGE_EMUL_HANDLED;
>  	default:
> @@ -447,8 +471,15 @@ advk_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge,
>  
>  	switch (reg) {
>  	case PCI_EXP_DEVCTL:
> +		advk_writel(pcie, new, PCIE_CORE_PCIEXP_CAP + reg);
> +		break;

What's the purpose of this DEVCTL change?  Could it be a separate patch?
I can't tell that it's related to the PCI_EXP_LNKSTA_LT issue.

>  	case PCI_EXP_LNKCTL:
>  		advk_writel(pcie, new, PCIE_CORE_PCIEXP_CAP + reg);
> +		if (new & PCI_EXP_LNKCTL_RL) {
> +			pcie->rl_asked = 1;
> +			pcie->rl_deadline = jiffies + LINK_RETRAIN_DELAY_MAX;
> +		}
>  		break;
>  
>  	case PCI_EXP_RTCTL:
> -- 
> 2.20.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@...ts.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Powered by blists - more mailing lists