Message-ID: <be1db8f5-af55-48a4-be7a-5e8a1a5e25c4@sifive.com>
Date: Thu, 15 Aug 2024 09:41:53 -0500
From: Samuel Holland <samuel.holland@...ive.com>
To: Anup Patel <apatel@...tanamicro.com>, Thomas Gleixner
 <tglx@...utronix.de>,
 Emil Renner Berthing <emil.renner.berthing@...onical.com>
Cc: linux-kernel@...r.kernel.org, linux-riscv@...ts.infradead.org,
 Paul Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt
 <palmer@...belt.com>, Albert Ou <aou@...s.berkeley.edu>,
 Daniel Lezcano <daniel.lezcano@...aro.org>
Subject: Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression

On 2024-08-15 9:16 AM, Anup Patel wrote:
> On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@...utronix.de> wrote:
>>
>> On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote:
>>> On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
>>>> Yes. So the riscv timer is not working on this thing or it stops
>>>> somehow.
>>>
>>> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
>>> firmware does not have a timer device, so it does not expose the (optional[1])
>>> SBI time extension, and sbi_set_timer() does nothing.
>>
>> Sigh. Does RISCV really have to repeat all mistakes which have been made
>> by x86, ARM and others before? It's known for decades that the kernel
>> relies on a working timer...
> 
> My apologies for the delay in finding a fix for this issue.
> 
> Almost all RISC-V platforms (except this one) have SBI Timer always
> available and Linux uses a better timer or Sstc extension whenever
> it is available.

So this is the immediate solution: add the CLINT to the firmware devicetree so
that the SBI time extension works, and Linux will boot without any code changes,
albeit with a higher-overhead clockevent device.

Additionally, merging the sun4i timer patch[1] will allow the system to switch to
the better MMIO clocksource later in the boot process.

The reason the CLINT was not added to the devicetree already is that the T-HEAD
version of the CLINT includes an extension to drive SSIP/STIP from a second
S-mode visible set of registers. So it should really have twice as many entries
in its interrupts-extended property as the existing CLINT, and I never got
around to validating that this would work.
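
As a rough, unvalidated sketch of what such a node might look like: the unit
address, size, label names, compatible strings, and the order of the entries
are all assumptions for illustration. Only the interrupt numbers follow the
standard RISC-V cause encoding (3/7 for M-mode software/timer, 1/5 for S-mode
software/timer).

```dts
/* Sketch only: address, size, labels and entry order are assumptions. */
clint: timer@14000000 {
	compatible = "allwinner,sun20i-d1-clint", "thead,c900-clint";
	reg = <0x14000000 0x10000>;		/* assumed base and size */
	interrupts-extended =
		<&cpu0_intc 3>,			/* M-mode software interrupt (MSIP) */
		<&cpu0_intc 7>,			/* M-mode timer interrupt (MTIP) */
		<&cpu0_intc 1>,			/* S-mode software interrupt (SSIP), T-HEAD extension */
		<&cpu0_intc 5>;			/* S-mode timer interrupt (STIP), T-HEAD extension */
};
```

A standard CLINT node would carry only the first two entries per hart, which
is why the T-HEAD variant ends up with twice as many.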

The long-term solution would be adding driver support for the T-HEAD CLINT
extensions, which provide an even better clockevent than the sun4i timer.

[1]: https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com/

> When Emil first reported this issue, I did try to help him root cause
> the issue but unfortunately I don't have this particular platform and
> PLIC on all other RISC-V platforms works fine.
> 
> I am also surprised that none of the Allwinner folks tried helping.

Allwinner D1 support was upstreamed by unpaid hobbyists with very little
first-party assistance.

>>> I wrote a patch (not submitted) to skip registering riscv_clock_event when the
>>> SBI time extension is unavailable, but this doesn't fully solve the issue
>>> either, because then we have no clockevent at all when
>>> check_unaligned_access_all_cpus() is called.
>>
>> check_unaligned_access_all_cpus() is irrelevant.
>>
>>> How early in the boot process are we "required" to have a functional clockevent?
>>> Do we need to refactor check_unaligned_access_all_cpus() so it works on systems
>>> where the only clockevent is provided by a platform device?
>>
>> Right after init/main::late_time_init() everything can depend on a
>> working timer and on jiffies increasing.
>>
>> I'm actually surprised that the boot process gets that far. That's just
>> by pure luck, really.

Thanks for clearing this up!

Regards,
Samuel

