Message-ID: <aGaQBghdAl8VGWmV@gmail.com>
Date: Thu, 3 Jul 2025 15:13:26 +0100
From: Breno Leitao <leitao@...ian.org>
To: Mark Rutland <mark.rutland@....com>
Cc: cov@...eaurora.org, rmk+kernel@...linux.org.uk, catalin.marinas@....com,
	linux-serial@...r.kernel.org, rmikey@...a.com,
	linux-arm-kernel@...ts.infradead.org, usamaarif642@...il.com,
	leo.yan@....com, linux-kernel@...r.kernel.org, paulmck@...nel.org
Subject: Re: arm64: csdlock at early boot due to slow serial (?)

On Thu, Jul 03, 2025 at 11:28:50AM +0100, Mark Rutland wrote:
> On Wed, Jul 02, 2025 at 10:10:21AM -0700, Breno Leitao wrote:
> > I'm observing two unusual behaviors during the boot process on my SBSA
> > ARM machine, with upstream kernel (6.16-rc4):
> 
> Can you say which SoC in particular that is? Knowing that would help to
> identify whether there's some known erratum, clocking issue, etc.

This is a custom-made, rack-mounted machine based on the Grace CPU. Here is
some info about the hardware:

	# lscpu:
		Vendor ID:                   ARM
		  Model name:                Neoverse-V2
		    Model:                   0
		    Thread(s) per core:      1
		    Core(s) per socket:      72
		    Socket(s):               1
		    Stepping:                r0p0

	# /proc/cpuinfo
		processor	: 71
		BogoMIPS	: 2000.00
		Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti
		CPU implementer	: 0x41
		CPU architecture: 8
		CPU variant	: 0x0
		CPU part	: 0xd4f
		CPU revision	: 0

	# lshw
	    description: Rack Mount Chassis
	    product: <Internal name>
	    vendor: Quanta
	    version: <Internal name>
	    width: 64 bits
	    capabilities: smbios-3.6.0 dmi-3.6.0 smp sve_default_vector_length tagged_addr_disabled
	    configuration: boot=normal chassis=rackmount family=Default string sku=Default string uuid=...

How do I find the SoC exactly?

> Likewise that might imply more folk to add to Cc.
> 
> [...]
> 
> > At timestamp 9.69 seconds, the serial console is still flushing messages from
> > 0.92 seconds, indicating that the initial 9-second gap is spent looping in
> > cpu_relax(), about 20,000 times per message, which is clearly suboptimal.
> > 
> > Further debugging revealed the following sequence with the pl011 registers:
> > 
> > 	1) uart_console_write()
> > 	2) REG_FR has BUSY | RXFE | TXFF for a while (~1k cpu_relax())
> > 	3) RXFE and TXFF are cleared, and BUSY stays on for another 17k-19k cpu_relax()
> > 
> > Michael has reported a hardware issue where the BUSY bit could get
> > stuck (see commit d8a4995bcea1: "tty: pl011: Work around QDF2400 E44 stuck BUSY
> > bit"), which is very similar. TXFE goes down, but BUSY is(?) still stuck for long.
> 
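
For reference, the FR sequence above can be modelled in a few lines of
Python. This is only a sketch: the bit positions are the standard PL011
UARTFR layout, while decode_fr() and spins_until_idle() are hypothetical
helpers, and the sample values are made up to mirror steps 2 and 3 rather
than captured from the machine.

```python
# PL011 UARTFR flag bits (standard PrimeCell layout).
FR_BITS = {
    "CTS": 1 << 0,
    "DSR": 1 << 1,
    "DCD": 1 << 2,
    "BUSY": 1 << 3,   # transmitter still shifting data out
    "RXFE": 1 << 4,   # receive FIFO empty
    "TXFF": 1 << 5,   # transmit FIFO full
    "RXFF": 1 << 6,   # receive FIFO full
    "TXFE": 1 << 7,   # transmit FIFO empty
}


def decode_fr(value):
    """Return the names of the flags set in an FR value, lowest bit first."""
    return [name for name, bit in FR_BITS.items() if value & bit]


def spins_until_idle(samples):
    """Count how many FR reads happen before BUSY is seen clear.

    `samples` is an iterable of FR values, one per poll (hypothetical
    stand-in for reading the register between cpu_relax() calls).
    Returns None if BUSY never clears within the samples.
    """
    for spins, value in enumerate(samples):
        if not value & FR_BITS["BUSY"]:
            return spins
    return None


# Made-up trace: step 2 (BUSY | RXFE | TXFF), then step 3 (BUSY only),
# then an idle value with RXFE | TXFE set.
trace = [0x38] * 3 + [0x08] * 5 + [0x90]
print(decode_fr(trace[0]), spins_until_idle(trace))
```

Feeding it a real trace of FR reads would show how many polls are spent in
each of the two phases.
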
> Looking at the commit message, that was an issue with a "custom
> (non-PrimeCell) implementation of the SBSA UART" present on QDF2400. I
> assume that was something that Qualcomm Datacenter Technologies designed
> themselves.
> 
> It's possible that your SoC has a similar issue with whatever IP block
> is being used as the UART, but the issue in that commit certainly
> doesn't apply to most PL011 / SBSA-UART implementations.

That makes total sense. Decoding SPCR I see the following:

	# iasl -d spcr.dat
	Intel ACPI Component Architecture
	ASL+ Optimizing Compiler/Disassembler version 20210604
	Copyright (c) 2000 - 2021 Intel Corporation

	File appears to be binary: found 56 non-ASCII characters, disassembling
	Binary file appears to be a valid ACPI table, disassembling
	Input file spcr.dat, Length 0x50 (80) bytes
	ACPI: SPCR 0x0000000000000000 000050 (v02 NVIDIA A M I    00000001 ARMH 00010000)
	Acpi Data Table [SPCR] decoded
	Formatted output:  spcr.dsl - 2624 bytes
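
The iasl log above only shows the table header. A minimal sketch for
pulling the header and the UART interface type straight out of the raw
bytes is below. parse_spcr() and SAMPLE are hypothetical: SAMPLE is a
synthetic 80-byte table built from the header fields iasl printed, with a
zero checksum and an interface type (0x0e) picked purely for illustration.
The real value is whatever spcr.dsl reports.

```python
import struct

# Standard 36-byte ACPI table header, little-endian, no padding.
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")

# Subset of the serial interface types shared by the DBG2 and SPCR tables.
INTERFACE_TYPES = {
    0x00: "Full 16550",
    0x03: "ARM PL011",
    0x0D: "ARM SBSA Generic UART (2.x only, 32-bit access)",
    0x0E: "ARM SBSA Generic UART",
}


def parse_spcr(raw):
    """Decode the ACPI header and interface type of a raw SPCR table.

    The checksum is not validated here.
    """
    (sig, length, revision, _checksum, oem_id, oem_table_id,
     oem_rev, creator_id, creator_rev) = ACPI_HEADER.unpack_from(raw, 0)
    if sig != b"SPCR":
        raise ValueError("not an SPCR table")
    if length != len(raw):
        raise ValueError("length field does not match buffer size")
    iface = raw[ACPI_HEADER.size]  # one byte at offset 36
    return {
        "length": length,
        "revision": revision,
        "oem_id": oem_id.decode("ascii").strip(),
        "oem_table_id": oem_table_id.decode("ascii").strip(),
        "oem_revision": oem_rev,
        "creator_id": creator_id.decode("ascii"),
        "creator_revision": creator_rev,
        "interface_type": iface,
        "interface_name": INTERFACE_TYPES.get(iface, "unknown"),
    }


# Synthetic table: header values from the iasl output, everything after
# the (assumed) interface-type byte zero-filled.
SAMPLE = ACPI_HEADER.pack(b"SPCR", 0x50, 2, 0, b"NVIDIA", b"A M I   ",
                          1, b"ARMH", 0x00010000) + bytes([0x0E]) + bytes(43)

print(parse_spcr(SAMPLE))
```

On a live system the raw table can usually be read from
/sys/firmware/acpi/tables/SPCR and passed to parse_spcr() in place of
SAMPLE.
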

Thanks,
--breno
