[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <mhng-1183a9e0-5954-4e30-9a69-2fe95ae4f270@palmerdabbelt-glaptop1>
Date: Sat, 26 Sep 2020 22:37:53 -0700 (PDT)
From: Palmer Dabbelt <palmerdabbelt@...gle.com>
To: anup@...infault.org
CC: Damien Le Moal <Damien.LeMoal@....com>,
Paul Walmsley <paul.walmsley@...ive.com>,
Anup Patel <Anup.Patel@....com>, aou@...s.berkeley.edu,
linux-riscv@...ts.infradead.org, Atish Patra <Atish.Patra@....com>,
Alistair Francis <Alistair.Francis@....com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] RISC-V: Check clint_time_val before use
On Sat, 26 Sep 2020 22:35:39 PDT (-0700), anup@...infault.org wrote:
> On Sun, Sep 27, 2020 at 5:50 AM Palmer Dabbelt <palmerdabbelt@...gle.com> wrote:
>>
>> On Sat, 26 Sep 2020 03:31:29 PDT (-0700), Damien Le Moal wrote:
>> > On Sat, 2020-09-26 at 15:51 +0530, Anup Patel wrote:
>> >> The NoMMU kernel is broken for QEMU virt machine from Linux-5.9-rc6
>> >> because the get_cycles() and friends are called very early from
>> >> rand_initialize() before CLINT driver is probed. To fix this, we
>> >> should check clint_time_val before use in get_cycles() and friends.
>>
>> I don't think this is the right way to solve that problem, as we're essentially
>> just lying about the timer rather than informing the system we can't get
>> timer-based entropy right now. MIPS is explicit about this, I don't see any
>> reason why we shouldn't be as well.
>>
>> Does this fix the boot issue (see below for the NULL)? There's some other
>> random-related arch functions so this might not be quite the right way to do
>> it.
>>
>> diff --git a/arch/riscv/include/asm/timex.h b/arch/riscv/include/asm/timex.h
>> index 7f659dda0032..7e39b0068932 100644
>> --- a/arch/riscv/include/asm/timex.h
>> +++ b/arch/riscv/include/asm/timex.h
>> @@ -33,6 +33,18 @@ static inline u32 get_cycles_hi(void)
>> #define get_cycles_hi get_cycles_hi
>> #endif /* CONFIG_64BIT */
>>
>> +/*
>> + * Much like MIPS, we may not have a viable counter to use at an early point in
>> + * the boot process. Unfortunately we don't have a fallback, so instead we
>> + * just return 0.
>> + */
>> +static inline unsigned long random_get_entropy(void)
>> +{
>> + if (unlikely(clint_time_val == NULL))
>> + return 0;
>> + return get_cycles();
>> +}
>> +
>
> Overall, this approach is good but this change is incomplete so does not work.
>
> The linux/timex.h expects random_get_entropy() to be macro so we need a
> "#define" as well.
>
> After fixing rand_initialize() with custom random_get_entropy(), we get another
> issue in boot_init_stack_canary() because boot_init_stack_canary() directly
> calls get_cycles() so we remove use of get_cycles() from
> boot_init_stack_canary()
> and this is similar to ARM, ARM64, and MIPS kernel.
OK, are you going to send that patch or do you want me to?
>
> Regards,
> Anup
>
>> #else /* CONFIG_RISCV_M_MODE */
>>
>> static inline cycles_t get_cycles(void)
>>
>> >> Fixes: d5be89a8d118 ("RISC-V: Resurrect the MMIO timer implementation
>> >> for M-mode systems")
>> >> Signed-off-by: Anup Patel <anup.patel@....com>
>> >> ---
>> >> Changes since v1:
>> >> - Explicitly initialize clint_time_val to NULL in CLINT driver to
>> >> avoid hang on Kendryte K210
>> >> ---
>> >> arch/riscv/include/asm/timex.h | 12 +++++++++---
>> >> drivers/clocksource/timer-clint.c | 2 +-
>> >> 2 files changed, 10 insertions(+), 4 deletions(-)
>> >>
>> >> diff --git a/arch/riscv/include/asm/timex.h b/arch/riscv/include/asm/timex.h
>> >> index 7f659dda0032..6e7b04874755 100644
>> >> --- a/arch/riscv/include/asm/timex.h
>> >> +++ b/arch/riscv/include/asm/timex.h
>> >> @@ -17,18 +17,24 @@ typedef unsigned long cycles_t;
>> >> #ifdef CONFIG_64BIT
>> >> static inline cycles_t get_cycles(void)
>> >> {
>> >> - return readq_relaxed(clint_time_val);
>> >> + if (clint_time_val)
>> >> + return readq_relaxed(clint_time_val);
>> >> + return 0;
>> >> }
>> >> #else /* !CONFIG_64BIT */
>> >> static inline u32 get_cycles(void)
>> >> {
>> >> - return readl_relaxed(((u32 *)clint_time_val));
>> >> + if (clint_time_val)
>> >> + return readl_relaxed(((u32 *)clint_time_val));
>> >> + return 0;
>> >> }
>> >> #define get_cycles get_cycles
>> >>
>> >> static inline u32 get_cycles_hi(void)
>> >> {
>> >> - return readl_relaxed(((u32 *)clint_time_val) + 1);
>> >> + if (clint_time_val)
>> >> + return readl_relaxed(((u32 *)clint_time_val) + 1);
>> >> + return 0;
>> >> }
>> >> #define get_cycles_hi get_cycles_hi
>> >> #endif /* CONFIG_64BIT */
>> >> diff --git a/drivers/clocksource/timer-clint.c b/drivers/clocksource/timer-clint.c
>> >> index d17367dee02c..8dbec85979fd 100644
>> >> --- a/drivers/clocksource/timer-clint.c
>> >> +++ b/drivers/clocksource/timer-clint.c
>> >> @@ -37,7 +37,7 @@ static unsigned long clint_timer_freq;
>> >> static unsigned int clint_timer_irq;
>> >>
>> >> #ifdef CONFIG_RISCV_M_MODE
>> >> -u64 __iomem *clint_time_val;
>> >> +u64 __iomem *clint_time_val = NULL;
>>
>> This one I definately don't get. According the internet, the C standard says
>>
>> If an object that has static storage duration is not initialized
>> explicitly, it is initialized implicitly as if every member that has
>> arithmetic type were assigned 0 and every member that has pointer type were
>> assigned a null pointer constant.
>>
>> so unless I'm missing something there shouldn't be any difference between these
>> two lines. When I just apply this I get exactly the same "objdump -dt" before
>> and after. I do see some difference in assembly, but only when I don't pass
>> "-fno-common" and that ends up being passed during my Linux builds.
>>
>> >> #endif
>> >>
>> >> static void clint_send_ipi(const struct cpumask *target)
>> >
>> > For Kendryte:
>> >
>> > Tested-by: Damien Le Moal <damien.lemoal@....com>
>> >
>> > --
>> > Damien Le Moal
>> > Western Digital
Powered by blists - more mailing lists