Date:   Wed, 25 Oct 2023 11:39:38 -0700
From:   Guenter Roeck <linux@...ck-us.net>
To:     Geert Uytterhoeven <geert@...ux-m68k.org>,
        Pavel Machek <pavel@...x.de>,
        Wolfram Sang <wsa+renesas@...g-engineering.com>,
        Ulf Hansson <ulf.hansson@...aro.org>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        niklas.soderlund+renesas@...natech.se,
        yoshihiro.shimoda.uh@...esas.com, biju.das.jz@...renesas.com,
        Chris.Paterson2@...esas.com, stable@...r.kernel.org,
        patches@...ts.linux.dev, linux-kernel@...r.kernel.org,
        torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
        shuah@...nel.org, patches@...nelci.org,
        lkft-triage@...ts.linaro.org, jonathanh@...dia.com,
        f.fainelli@...il.com, sudipm.mukherjee@...il.com,
        srw@...dewatkins.net, rwarsow@....de, conor@...nel.org,
        linux-reneas-soc@...r.kernel.org,
        Linux MMC List <linux-mmc@...r.kernel.org>
Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226]
 5.10.198-rc1 review

On 10/25/23 10:05, Geert Uytterhoeven wrote:
> On Wed, Oct 25, 2023 at 2:35 PM Geert Uytterhoeven <geert@...ux-m68k.org> wrote:
>> On Wed, Oct 25, 2023 at 12:53 PM Geert Uytterhoeven
>> <geert@...ux-m68k.org> wrote:
>>> On Wed, Oct 25, 2023 at 12:47 PM Geert Uytterhoeven
>>> <geert@...ux-m68k.org> wrote:
>>>> On Tue, Oct 24, 2023 at 9:22 PM Pavel Machek <pavel@...x.de> wrote:
>>>>> But we still have failures on Renesas with 5.10.199-rc2:
>>>>>
>>>>> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1047368849
>>>>>
>>>>> And they still happen during MMC init:
>>>>>
>>>>> [    2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
>>>>> [    2.638846] INFO: trying to register non-static key.
>>>>> [    2.644192] ledtrig-cpu: registered to indicate activity on CPUs
>>>>> [    2.649066] The code is fine but needs lockdep annotation, or maybe
>>>>> [    2.649069] you didn't initialize this object before use?
>>>>> [    2.649071] turning off the locking correctness validator.
>>>>> [    2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
>>>>> [    2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
>>>>> [    2.649086] Call trace:
>>>>> [    2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
>>>>> [    2.661354]  dump_backtrace+0x0/0x194
>>>>> [    2.661361]  show_stack+0x14/0x20
>>>>> [    2.667430] usbcore: registered new interface driver usbhid
>>>>> [    2.672230]  dump_stack+0xe8/0x130
>>>>> [    2.672238]  register_lock_class+0x480/0x514
>>>>> [    2.672244]  __lock_acquire+0x74/0x20ec
>>>>> [    2.681113] usbhid: USB HID core driver
>>>>> [    2.687450]  lock_acquire+0x218/0x350
>>>>> [    2.687456]  _raw_spin_lock+0x58/0x80
>>>>> [    2.687464]  tmio_mmc_irq+0x410/0x9ac
>>>>> [    2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
>>>>> [    2.744936]  __handle_irq_event_percpu+0xbc/0x340
>>>>> [    2.749635]  handle_irq_event+0x60/0x100
>>>>> [    2.753553]  handle_fasteoi_irq+0xa0/0x1ec
>>>>> [    2.757644]  __handle_domain_irq+0x7c/0xdc
>>>>> [    2.761736]  efi_header_end+0x4c/0xd0
>>>>> [    2.765393]  el1_irq+0xcc/0x180
>>>>> [    2.768530]  arch_cpu_idle+0x14/0x2c
>>>>> [    2.772100]  default_idle_call+0x58/0xe4
>>>>> [    2.776019]  do_idle+0x244/0x2c0
>>>>> [    2.779242]  cpu_startup_entry+0x20/0x6c
>>>>> [    2.783160]  rest_init+0x164/0x28c
>>>>> [    2.786561]  arch_call_rest_init+0xc/0x14
>>>>> [    2.790565]  start_kernel+0x4c4/0x4f8
>>>>> [    2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
>>>>> [    2.803011] Mem abort info:
>>>>>
>>>>> from https://lava.ciplatform.org/scheduler/job/1025535
>>>>> from
>>>>> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973735 .
>>>>>
>>>>> Is there something else missing?
>>>>
>>>> I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199
>>>> seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X
>>>> with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.
>>>
>>> Sorry, I looked at the wrong log on R-Car M3-W.
>>> I do see the issue with v5.10.198, but not with v5.10.199.
>>
>> It seems to be an intermittent issue. Investigating...
> 
> After spending too much time on bisecting, the bad guy turns out to
> be commit 6d3745bbc3341d3b ("mmc: renesas_sdhi: register irqs before
> registering controller") in v5.10.198.
> 
> Adding debug information shows the lock is mmc_host.lock.
> 
> It is definitely initialized:
> 
>      renesas_sdhi_probe()
>      {
>          ...
>          tmio_mmc_host_alloc()
>              mmc_alloc_host
>                  spin_lock_init(&host->lock);
>          ...
>          devm_request_irq()
>          -> tmio_mmc_irq
>              tmio_mmc_cmd_irq()
>                  spin_lock(&host->lock);
>          ...
>      }
> 
> That leaves us with a missing lockdep annotation?
> 

Is it possible that the lock initialization is overwritten?
I seem to recall a recent case where this happened.

Also, there is
	spin_lock_init(&_host->lock);
in tmio_mmc_host_probe(), and tmio_mmc_host_probe() is called after
devm_request_irq().
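
To make the ordering concern concrete, here is a minimal userspace sketch. It is
not kernel code: struct model_host, model_host_probe() and model_irq_handler()
are hypothetical stand-ins, and it simply assumes an interrupt arrives between
the two steps, which has not been established for the real hardware.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-host driver state; not the real struct. */
struct model_host {
	bool lock_initialized;	/* models whether spin_lock_init() has run */
};

/* Models the late probe step that initializes the lock. */
static void model_host_probe(struct model_host *h)
{
	h->lock_initialized = true;
}

/* Models the IRQ handler: reports whether the lock it would take is ready. */
static bool model_irq_handler(const struct model_host *h)
{
	return h->lock_initialized;
}

/*
 * irq_before_probe == true models requesting the IRQ (and having it fire)
 * before the probe step that initializes the lock.
 */
static bool handler_sees_initialized_lock(bool irq_before_probe)
{
	struct model_host h = { .lock_initialized = false };
	bool ok;

	if (irq_before_probe) {
		ok = model_irq_handler(&h);	/* IRQ fires right after registration */
		model_host_probe(&h);		/* lock init comes too late */
	} else {
		model_host_probe(&h);		/* lock init first */
		ok = model_irq_handler(&h);
	}
	return ok;
}
```

In this model the handler only finds a usable lock when the probe step runs
before the handler can be invoked.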

Also, how would lockdep annotation help with "Unable to handle
kernel NULL pointer dereference at virtual address 0000000000000014"
in the log above?
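
For reference, a small fault address like that usually means a structure member
was accessed through a NULL base pointer, so the address equals the member
offset. The sketch below only illustrates that arithmetic: struct model_req is
invented so that a member lands at offset 0x14, and the log does not tell us
which pointer or structure was actually involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, chosen only so that 'flags' sits at offset 0x14. */
struct model_req {
	uint32_t words[5];	/* 5 * 4 bytes = 0x14 bytes */
	uint32_t flags;		/* offset 0x14 */
};

/* Address that reading ->flags through a NULL pointer would fault on. */
static size_t fault_address_for_null_base(void)
{
	return offsetof(struct model_req, flags);
}
```

A missing lock annotation would not produce such an access; an object that has
not been set up yet could.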

Guenter
