Message-ID: <CA+G9fYsChy=HzEwkBVydPW4gJhDjkB87dY9FA833H2tZLfSh-w@mail.gmail.com>
Date:   Wed, 11 Jan 2023 15:01:39 +0530
From:   Naresh Kamboju <naresh.kamboju@...aro.org>
To:     Arnd Bergmann <arnd@...db.de>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        stable@...r.kernel.org, patches@...ts.linux.dev,
        linux-kernel@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Guenter Roeck <linux@...ck-us.net>, shuah@...nel.org,
        patches@...nelci.org, lkft-triage@...ts.linaro.org,
        Pavel Machek <pavel@...x.de>,
        Jon Hunter <jonathanh@...dia.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Sudip Mukherjee <sudipm.mukherjee@...il.com>,
        srw@...dewatkins.net, rwarsow@....de,
        Mark Brown <broonie@...nel.org>
Subject: Re: [PATCH 6.0 000/148] 6.0.19-rc1 review

On Wed, 11 Jan 2023 at 13:48, Arnd Bergmann <arnd@...db.de> wrote:
>
> On Wed, Jan 11, 2023, at 07:16, Naresh Kamboju wrote:
> > On Tue, 10 Jan 2023 at 23:36, Greg Kroah-Hartman <gregkh@...uxfoundation.org> wrote:
> >>
> >
> > Results from Linaro’s test farm.
> > Regressions on arm64 Raspberry Pi 4 Model B.
> >
> > Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
> >
> > While running the LTP controllers cgroup_fj_stress_blkio test cases,
> > an "Insufficient stack space to handle exception!" error occurred,
> > followed by a kernel panic, on an arm64 Raspberry Pi 4 Model B with
> > a clang-15 built kernel Image.
> >
> > The full boot and test log is attached to this email, and the build
> > and Kconfig links are provided at the bottom of this email.
> >
> > I will try to reproduce this reported issue and get back to you.
>
> I looked at the log between 6.0.18 and 6.0.19-rc1, but don't see
> any arm64 or memory management patches that could result in this.
> Do you know if 6.0.18 ran successfully?

Yes, it ran successfully on 6.0.18.

The same 6.0.19-rc1 kernel built with gcc-12 did not hit this panic.
The reported issue is specific to the clang-15 build.

> > [ 2893.044339] Insufficient stack space to handle exception!
> > [ 2893.044351] ESR: 0x0000000096000047 -- DABT (current EL)
> > [ 2893.044360] FAR: 0xffff8000128180d0
> > [ 2893.044364] Task stack:     [0xffff800012a18000..0xffff800012a1c000]
> > [ 2893.044370] IRQ stack:      [0xffff80000a798000..0xffff80000a79c000]
> > [ 2893.044375] Overflow stack: [0xffff0000f77c4310..0xffff0000f77c5310]
> ...
> > [ 2893.044413] pc : el1h_64_sync+0x0/0x68
> > [ 2893.044430] lr : wp_page_copy+0xf8/0x90c
> > [ 2893.044445] sp : ffff8000128180d0
> ...
> > [ 2893.044692]  el1h_64_sync+0x0/0x68
> > [ 2893.044700]  do_wp_page+0x4a0/0x5c8
> > [ 2893.044708]  handle_mm_fault+0x7fc/0x14dc
> > [ 2893.044718]  do_page_fault+0x29c/0x450
> > [ 2893.044727]  do_mem_abort+0x4c/0xf8
> > [ 2893.044741]  el0_da+0x48/0xa8
> > [ 2893.044750]  el0t_64_sync_handler+0xcc/0xf0
> > [ 2893.044759]  el0t_64_sync+0x18c/0x190
>
> It claims that the stack overflow happened in do_wp_page(),
> but that has a really short call chain. It would be good
> to have the source line for do_wp_page+0x4a0/0x5c8 and
> wp_page_copy+0xf8/0x90c to see where exactly it was.
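
As a quick sanity check on the numbers in the log, here is a minimal
Python sketch that tests whether the faulting address (FAR, which equals
sp in the register dump) falls inside any of the three reported stack
ranges, and how far below the task stack it sits. The helper names
locate() and bytes_below() are mine, for illustration only; the addresses
are copied from the log above.

```python
# Addresses copied from the panic log; helper names are illustrative only.
FAR = 0xffff8000128180d0  # faulting address, same value as sp in the dump

STACKS = {
    "task": (0xffff800012a18000, 0xffff800012a1c000),
    "irq": (0xffff80000a798000, 0xffff80000a79c000),
    "overflow": (0xffff0000f77c4310, 0xffff0000f77c5310),
}


def locate(addr, stacks):
    """Return the name of the stack range containing addr, or None."""
    for name, (low, high) in stacks.items():
        if low <= addr < high:
            return name
    return None


def bytes_below(addr, stack):
    """Distance in bytes from addr up to the lowest address of the stack."""
    low, _high = stack
    return low - addr


if __name__ == "__main__":
    print("FAR is inside:", locate(FAR, STACKS))
    print("bytes below task stack:", hex(bytes_below(FAR, STACKS["task"])))
```

By this arithmetic the address is in none of the three ranges and lies
0x1fff30 bytes (just under 2 MiB) below the 16 KiB task stack. I have not
drawn any conclusion from that yet; the source lines are still needed.
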
>
>
> > [ 2893.285975] WARNING: CPU: 2 PID: 315758 at kernel/sched/core.c:3119
> > set_task_cpu+0x14c/0x208
> ....
> > [ 2893.286117] CPU: 2 PID: 315758 Comm: cgroup_fj_stres Not tainted
> > [ 2893.286416]  arch_timer_handler_phys+0x44/0x54
> > [ 2893.286427]  handle_percpu_devid_irq+0x90/0x220
> > [ 2893.286439]  generic_handle_domain_irq+0x38/0x50
> > [ 2893.286447]  gic_handle_irq+0x68/0xe8
> > [ 2893.286455]  el1_interrupt+0x88/0xc8
> > [ 2893.286464]  el1h_64_irq_handler+0x18/0x24
> > [ 2893.286474]  el1h_64_irq+0x64/0x68
> > [ 2893.286482]  panic+0x2d8/0x374
>
> This is apparently a second unrelated bug -- it still processes timer
> interrupts after calling panic() and this apparently fails because
> the system is already unusable.
>
> >   artifact-location:
> > https://storage.tuxsuite.com/public/linaro/lkft/builds/2K9JDtix2mHMoYRjNkBef3oR5JT
>

Adding a trailing "/" to the URL works:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2K9JDtix2mHMoYRjNkBef3oR5JT/

> file not found. I tried to get the vmlinux file to look at the disassembly
> but the artifacts appear to be gone already.

System.map:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2K9JDtix2mHMoYRjNkBef3oR5JT/System.map

vmlinux:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2K9JDtix2mHMoYRjNkBef3oR5JT/vmlinux.xz
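
With those two files, the symbol+offset values from the backtrace can be
turned into absolute addresses. Below is a minimal Python sketch, assuming
the usual three-column System.map layout ("address type name"). The
function names load_system_map() and resolve() are hypothetical, and the
addresses in SAMPLE_MAP are placeholders, not values from this build.

```python
import re

# Placeholder entries for illustration; the real file is the System.map above.
SAMPLE_MAP = [
    "ffff800008300000 t wp_page_copy",
    "ffff800008301000 t do_wp_page",
]

FRAME_RE = re.compile(r"(\S+?)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)")


def load_system_map(lines):
    """Build a symbol -> address dict from 'address type name' lines."""
    symbols = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        addr, _type, name = parts[0], parts[1], parts[2]
        # Keep the first occurrence if a local symbol name repeats.
        symbols.setdefault(name, int(addr, 16))
    return symbols


def resolve(frame, symbols):
    """Convert 'sym+0xoff/0xsize' into an absolute address."""
    match = FRAME_RE.fullmatch(frame.strip())
    if match is None:
        raise ValueError("unrecognised frame: %r" % frame)
    name, offset = match.group(1), int(match.group(2), 16)
    return symbols[name] + offset


if __name__ == "__main__":
    symbols = load_system_map(SAMPLE_MAP)
    for frame in ("do_wp_page+0x4a0/0x5c8", "wp_page_copy+0xf8/0x90c"):
        print(frame, "->", hex(resolve(frame, symbols)))
```

The resulting address can be passed to addr2line -e vmlinux after
unpacking vmlinux.xz. Alternatively, scripts/faddr2line in the kernel
tree accepts the symbol+offset/size form together with vmlinux directly.
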

Sorry for the trouble.

- Naresh

>
>      Arnd
