linux-kernel - Re: [PATCH 5.19 000/101] 5.19.13-rc1 review

Message-ID: <CA+G9fYuzs=tVXBzDuYMp6YpCg5_OOgVjjPC+=CdVsVb45bM-3g@mail.gmail.com>
Date:   Thu, 6 Oct 2022 13:15:38 +0530
From:   Naresh Kamboju <naresh.kamboju@...aro.org>
To:     Feng Tang <feng.tang@...el.com>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org,
        torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
        linux@...ck-us.net, shuah@...nel.org, patches@...nelci.org,
        lkft-triage@...ts.linaro.org, pavel@...x.de, jonathanh@...dia.com,
        f.fainelli@...il.com, sudipm.mukherjee@...il.com,
        srw@...dewatkins.net, Hyeonggon Yoo <42.hyeyoo@...il.com>,
        Waiman Long <longman@...hat.com>,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH 5.19 000/101] 5.19.13-rc1 review

On Wed, 5 Oct 2022 at 15:09, Feng Tang <feng.tang@...el.com> wrote:
>
> On Tue, Oct 04, 2022 at 12:18:05PM +0530, Naresh Kamboju wrote:
> > On Mon, 3 Oct 2022 at 12:43, Greg Kroah-Hartman
> > <gregkh@...uxfoundation.org> wrote:
> > >
> > > This is the start of the stable review cycle for the 5.19.13 release.
> > > There are 101 patches in this series, all will be posted as a response
> > > to this one.  If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000.
> > > Anything received after that time might be too late.
> > >
> > > The whole patch series can be found in one patch at:
> > >         https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1.gz
> > > or in the git tree and branch at:
> > >         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y
> > > and the diffstat can be found below.
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> > Results from Linaro's test farm.
> > No regressions on arm64, arm, x86_64, and i386.
> >
> > Tested-by: Linux Kernel Functional Testing <lkft@...aro.org>
> >
> > NOTE:
> > 1) Build warning
> > 2) Boot warning on qemu-arm64 with KASAN and Kunit test
> > >    We suspect that one of the recent commits is causing this warning
> > >    and need to bisect to confirm the commit ID.
> >     mm/slab_common: fix possible double free of kmem_cache
> > [ Upstream commit d71608a877362becdc94191f190902fac1e64d35 ]
>
> Hi Naresh Kamboju,
>
> Thanks for the report!
>
> Could you try reverting the commit and re-test it to confirm?

Anders re-ran the tests multiple times, with and without the patch reverted,
and was not able to reproduce the reported problem.
This suggests that the warning is not easy to reproduce.
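
For anyone repeating this kind of re-run, a minimal sketch of a helper that
runs a reproduce command several times and counts how often the warning text
shows up. The function names are hypothetical, and it assumes the command
prints the kernel log to stdout or stderr, which has not been checked for
every test runner.

```python
import subprocess
import sys

# Marker text taken from the warning quoted in this thread.
MARKER = "Slab cache still has objects"


def count_reproductions(cmd, runs, marker=MARKER):
    """Run cmd (a list of arguments) `runs` times and return how many
    runs printed `marker` on stdout or stderr."""
    hits = 0
    for _ in range(runs):
        # check=False: a failing test run is still a run we want to inspect.
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        if marker in result.stdout or marker in result.stderr:
            hits += 1
    return hits


def main(argv):
    # Hypothetical usage: python3 rerun.py <runs> <command> [args...]
    runs = int(argv[1])
    print(count_reproductions(argv[2:], runs))


if __name__ == "__main__" and len(sys.argv) > 2:
    main(sys.argv)
```

Running the same count once on a build with the suspect commit and once with
it reverted gives two numbers to compare, which is more useful than a single
pass/fail when the problem is intermittent.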

> Also could you provide the kernel dmesg of the failure and the
> kernel config of the test?

dmesg log attached to this email.
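
Since the attached log is large, here is a rough sketch of how the warning
blocks could be pulled out of it. It assumes the usual "cut here" and
"end trace" markers seen in the trace quoted in this thread; the function
names are made up for illustration.

```python
import re

CUT_HERE = "------------[ cut here ]------------"
END_TRACE = re.compile(r"---\[ end trace [0-9a-f]+ \]---")


def extract_warning_blocks(log_text):
    """Return a list of blocks; each block is the list of lines from a
    'cut here' marker through the matching 'end trace' marker.
    A block that never reaches 'end trace' is dropped."""
    blocks = []
    current = None
    for line in log_text.splitlines():
        if CUT_HERE in line:
            current = [line]
        elif current is not None:
            current.append(line)
            if END_TRACE.search(line):
                blocks.append(current)
                current = None
    return blocks


def warning_summary(block):
    """Return the first line of the block containing 'WARNING:', or None."""
    for line in block:
        if "WARNING:" in line:
            return line.strip()
    return None
```

Calling extract_warning_blocks() on the text of the log and printing
warning_summary() for each block gives a one-line overview of every warning
in the boot.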

Here is the build link:
https://builds.tuxbuild.com/2FcCwzbNgR7TlQXzJ0nu32y1CpB/


>
> I locally pulled the linux-stable source and used QEMU to test
> it with kasan/kfence enabled, but could not reproduce it (I
> only have x86 HW at hand).
>
> > 2) Following kernel boot warning noticed on qemu-arm64 with KASAN and
> > KUNIT enabled [1]
> >
> >      [  177.651182] ------------[ cut here ]------------
> >      [  177.652217] kmem_cache_destroy test: Slab cache still has
> > objects when called from test_exit+0x28/0x40
> >      [  177.654849] WARNING: CPU: 0 PID: 1 at mm/slab_common.c:520
> > kmem_cache_destroy+0x1e8/0x20c
> >      [  177.666237] Modules linked in:
> >      [  177.667325] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B
> >        5.19.13-rc1 #1
> >      [  177.668666] Hardware name: linux,dummy-virt (DT)
> >      [  177.669783] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT
> > -SSBS BTYPE=--)
> >      [  177.671120] pc : kmem_cache_destroy+0x1e8/0x20c
> >      [  177.672217] lr : kmem_cache_destroy+0x1e8/0x20c
> >      [  177.691598] Call trace:
> >      [  177.692165]  kmem_cache_destroy+0x1e8/0x20c
> >      [  177.693196]  test_exit+0x28/0x40
> >      [  177.694158]  kunit_catch_run_case+0x5c/0x120
> >      [  177.695177]  kunit_try_catch_run+0x144/0x26c
> >      [  177.696211]  kunit_run_case_catch_errors+0x158/0x1e0
> >      [  177.697353]  kunit_run_tests+0x374/0x750
> >      [  177.698333]  __kunit_test_suites_init+0x74/0xa0
> >      [  177.699386]  kunit_run_all_tests+0x160/0x380
> >      [  177.700428]  kernel_init_freeable+0x32c/0x388
> >      [  177.701497]  kernel_init+0x2c/0x150
> >      [  177.702347]  ret_from_fork+0x10/0x20
> >      [  177.703308] ---[ end trace 0000000000000000 ]---
> >
> > [1] https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2FcCyacq1SusUcnAfamULqzkdUA
>
> I also tried the reproduce command from the above link:
>
> tuxrun --runtime podman --device qemu-arm64 --kernel https://builds.tuxbuild.com/2FcCwzbNgR7TlQXzJ0nu32y1CpB/Image.gz --modules https://builds.tuxbuild.com/2FcCwzbNgR7TlQXzJ0nu32y1CpB/modules.tar.xz --rootfs https://storage.lkft.org/rootfs/oe-kirkstone/20220824-114729/juno/lkft-tux-image-juno-20220824120304.rootfs.ext4.gz --parameters SKIPFILE=skipfile-lkft.yaml --image docker.io/lavasoftware/lava-dispatcher:2022.06 --tests kunit --timeouts boot=30
>
> Which also didn't reproduce it, but had some RCU stall problems
> (could also be related to the x86 HWs)
>
> [  321.006279] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [  321.007281]  ffff0000074c2300: 00 07 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [  321.009283] rcu:      0-...0: (1 GPs behind) idle=40f/1/0x4000000000000000 softirq=436/437 fqs=5
>
> [  321.024995] rcu: rcu_preempt kthread timer wakeup didn't happen for 4464 jiffies! g-207 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
> [  321.026343] rcu:      Possible timer handling issue on cpu=1 timer-softirq=1426
> [  321.027340] rcu: rcu_preempt kthread starved for 4465 jiffies! g-207 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
> [  321.028517] rcu:      Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
> [  321.029488] rcu: RCU grace-period kthread stack dump:
> [  321.030251] task:rcu_preempt     state:I stack:    0 pid:   16 ppid:     2 flags:0x00000008
> [  321.031434] Call trace:
> [  321.031878]  __switch_to+0x140/0x1e0
> [  321.032565]  __schedule+0x4f4/0xc74
> [  321.033228]  schedule+0x88/0x13c
> [  321.033915]  schedule_timeout+0x104/0x2b0
> [  321.034646]  rcu_gp_fqs_loop+0x1a0/0x784
> [  321.035119]  rcu_gp_kthread+0x278/0x3a0
> [  321.035608]  kthread+0x160/0x170
> [  339.882465]  ret_from_fork+0x10/0x20
> [  339.883898] rcu: Stack dump where RCU GP kthread last ran:
>
> The full .xz log is attached.

Thanks for looking into this.

>
> Thanks,
> Feng


- Naresh

View attachment "qemu-arm64-kasan-kfence-kunit-warning-5.19.13-rc1.txt" of type "text/plain" (225808 bytes)