Message-ID: <juiog3337iozva23zpf4apdydegj4z7jibqykfvcgnkabemw4w@z5g5hhwrqr2w>
Date: Wed, 9 Jul 2025 07:23:44 -0700
From: Breno Leitao <leitao@...ian.org>
To: Mark Rutland <mark.rutland@....com>, ankita@...dia.com,
bwicaksono@...dia.com
Cc: rmk+kernel@...linux.org.uk, catalin.marinas@....com,
linux-serial@...r.kernel.org, rmikey@...a.com, linux-arm-kernel@...ts.infradead.org,
usamaarif642@...il.com, leo.yan@....com, linux-kernel@...r.kernel.org,
paulmck@...nel.org
Subject: Re: arm64: csdlock at early boot due to slow serial (?)
On Tue, Jul 08, 2025 at 07:00:45AM -0700, Breno Leitao wrote:
> On Thu, Jul 03, 2025 at 05:31:09PM +0100, Mark Rutland wrote:
>
> Here is more information I got about this problem. TL;DR: While the
> machine is booting, it is throttled by the UART speed, while having IRQ
> disabled.
Quick update: I've identified a fix that significantly improves the
situation. The serial issue was heavily affecting boot time, which is
no longer throttled by the UART.
After applying the following fix, the boot speed has improved
dramatically. It's the fastest I've seen, and the CSD lockups are gone.
If no concerns are raised in the next few days, I will send it officially
to the serial maintainers.
Author: Breno Leitao <leitao@...ian.org>
Date: Wed Jul 9 05:57:06 2025 -0700
serial: amba-pl011: Fix boot performance by switching to console_initcall()
Replace arch_initcall() with console_initcall() for PL011 driver initialization
to resolve severe boot performance issues.
The current arch_initcall() registration causes the console to initialize
before the printk subsystem is ready, forcing the driver into atomic mode
during early boot. This results in:
- 5-8 second boot delay while ~700 boot messages are processed
- System freeze with IRQs disabled during message output
- Each character transmitted synchronously with cpu_relax() polling
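As a rough sanity check on the delay above, here is a back-of-the-envelope sketch. The baud rate (115200), the framing (10 bits per character, i.e. 8N1) and the average line length (80 characters) are assumptions for illustration only; the message states just the ~700 messages and the observed 5-8 seconds. uart_drain_seconds() is a hypothetical helper, not kernel code.

```c
#include <assert.h>

/*
 * Estimate how long a polled UART needs to push out a backlog of
 * messages. Every parameter is an assumption supplied by the caller;
 * nothing here is measured on real hardware.
 */
double uart_drain_seconds(unsigned int messages,
			  unsigned int avg_chars_per_msg,
			  unsigned int baud,
			  unsigned int bits_per_char)
{
	double total_bits;

	assert(baud > 0);

	/* Total bits on the wire, including start/stop framing bits. */
	total_bits = (double)messages * avg_chars_per_msg * bits_per_char;

	/* The line moves `baud` bits per second. */
	return total_bits / baud;
}
```

Under these assumptions, uart_drain_seconds(700, 80, 115200, 10) comes out at roughly 4.9 seconds, which is in the same range as the observed delay. Longer average lines or a slower effective rate would push it towards the upper end.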
This is what drives the driver into atomic mode during early boot:
	static inline void printk_get_console_flush_type(struct console_flush_type *ft)
	{
		....
		if (printk_kthreads_running)
			ft->nbcon_offload = true;
		....
	}
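To make the decision above easier to follow, here is a simplified user-space model of it. This is only a sketch: struct flush_type_model and pick_flush_type() are hypothetical names, and the else branch (falling back to atomic flushing when the printer kthreads are not running) is an assumption based on the behaviour described in this message, not a copy of the kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct console_flush_type. */
struct flush_type_model {
	bool nbcon_atomic;	/* flush in the caller's context, busy-waiting */
	bool nbcon_offload;	/* hand the output to a printer kthread */
};

/*
 * Model of the quoted check. Assumption: if the printer kthreads are
 * not running yet, the caller has to flush atomically itself.
 */
void pick_flush_type(bool kthreads_running, struct flush_type_model *ft)
{
	assert(ft != NULL);
	memset(ft, 0, sizeof(*ft));

	if (kthreads_running)
		ft->nbcon_offload = true;
	else
		ft->nbcon_atomic = true;
}
```

In this model, a console that registers before the kthreads exist always takes the nbcon_atomic path, which is the situation the patch tries to avoid.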
The atomic path processes each character individually through
pl011_console_putchar(), waiting for UART transmission completion
before proceeding. With only one CPU online during early boot,
this creates a bottleneck where the system spends excessive time
in interrupt-disabled state.
Here is what the code does:
1) disable interrupts
2) for each of these 700 messages, call pl011_console_write_atomic()
3) for each character in the message, call pl011_console_putchar(),
which waits for the character to be transmitted
4) once the whole line is transmitted, wait for the UART to be idle
5) re-enable interrupts
Here is the code representation of the above:
	pl011_console_write_atomic() {
		...
		// For each char in the message
		pl011_console_putchar() {
			while (pl011_read(uap, REG_FR) & UART01x_FR_TXFF)
				cpu_relax();
		}
		// Once the line is written, wait for the UART to go idle
		while ((pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr) & uap->vendor->fr_busy)
			cpu_relax();
	}
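The cost of this polling pattern can be illustrated with a small user-space simulation. Everything below is hypothetical: the sim_* names, the FIFO depth of 32 and the figure of 100 status reads per character time are assumptions chosen to make the behaviour visible, not values taken from the PL011 driver or hardware.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_FR_TXFF      0x1u	/* TX FIFO full */
#define SIM_FR_BUSY      0x2u	/* UART still has data to send */
#define SIM_FIFO_DEPTH   32u	/* assumed FIFO depth */
#define SIM_POLLS_PER_CH 100u	/* assumed status reads per character time */

struct sim_uart {
	unsigned int fifo_level;	/* characters waiting in the TX FIFO */
	unsigned int poll_credit;	/* reads since the last character left */
	unsigned long polls;		/* total status-register reads */
};

/* One status read. Simulated time only advances when the CPU polls. */
unsigned int sim_read_fr(struct sim_uart *u)
{
	unsigned int fr = 0;

	u->polls++;
	if (u->fifo_level > 0 && ++u->poll_credit == SIM_POLLS_PER_CH) {
		u->fifo_level--;	/* one character left the wire */
		u->poll_credit = 0;
	}
	if (u->fifo_level == SIM_FIFO_DEPTH)
		fr |= SIM_FR_TXFF;
	if (u->fifo_level > 0)
		fr |= SIM_FR_BUSY;
	return fr;
}

/* Spin until the FIFO has room, then queue one character. */
void sim_putchar(struct sim_uart *u, char ch)
{
	(void)ch;	/* the payload does not matter for the timing model */
	while (sim_read_fr(u) & SIM_FR_TXFF)
		;	/* the real driver calls cpu_relax() here */
	u->fifo_level++;
}

/* Queue every character, then spin until the UART reports idle. */
void sim_write_atomic(struct sim_uart *u, const char *s, size_t len)
{
	assert(u != NULL && s != NULL);
	for (size_t i = 0; i < len; i++)
		sim_putchar(u, s[i]);
	while (sim_read_fr(u) & SIM_FR_BUSY)
		;
}
```

In this model, a message of len characters costs 1 + 100 * len status reads before sim_write_atomic() returns, so the time spent spinning grows linearly with the amount of text, and the FIFO only hides the first few characters of that wait.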
Using console_initcall() ensures proper initialization order,
allowing the printk subsystem to use threaded output instead
of atomic mode, eliminating the performance bottleneck.
Performance improvement: 16x faster kernel boot time on my GRACE SoC
machine.
- Before: 10.08s to reach init process
- After: 0.62s to reach init process
Here are more timing details, collected from Linus' upstream, where the
only difference is this patch:
With this patch:
[ 0.616203] printk: legacy console [netcon_ext0] enabled
[ 0.627469] Run /init as init process
[ 0.837477] loop: module loaded
[ 8.354803] Adding 134199360k swap on /swapvol/swapfile.
Linus upstream:
[ 0.305109] ARMH0011:00: ttyAMA0 at MMIO 0xc280000 (irq = 66, base_baud = 0) is a SBSA
[ 10.081742] Run /init as init process
[ 13.288717] loop: module loaded
[ 22.919934] Adding 134199168k swap on /swapvol/swapfile.
Link: https://lore.kernel.org/all/aGVn%2FSnOvwWewkOW@gmail.com/ [1]
Signed-off-by: Breno Leitao <leitao@...ian.org>
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 22939841b1de..0cf251365825 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -3116,7 +3116,7 @@ static void __exit pl011_exit(void)
* While this can be a module, if builtin it's most likely the console
* So let's leave module_exit but move module_init to an earlier place
*/
-arch_initcall(pl011_init);
+console_initcall(pl011_init);
module_exit(pl011_exit);
MODULE_AUTHOR("ARM Ltd/Deep Blue Solutions Ltd");