lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+G9fYu=_8vT95Q0MQzzTcCr+RTdMi=6MdSN7Cwo=C215VE1fQ@mail.gmail.com>
Date: Thu, 11 Jul 2024 14:00:17 +0530
From: Naresh Kamboju <naresh.kamboju@...aro.org>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: stable@...r.kernel.org, patches@...ts.linux.dev, 
	linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org, 
	akpm@...ux-foundation.org, linux@...ck-us.net, shuah@...nel.org, 
	patches@...nelci.org, lkft-triage@...ts.linaro.org, pavel@...x.de, 
	jonathanh@...dia.com, f.fainelli@...il.com, sudipm.mukherjee@...il.com, 
	srw@...dewatkins.net, rwarsow@....de, conor@...nel.org, allen.lkml@...il.com, 
	broonie@...nel.org, Anders Roxell <anders.roxell@...aro.org>, 
	Mike Christie <michael.christie@...cle.com>, "Paul E. McKenney" <paulmck@...nel.org>, 
	rcu <rcu@...r.kernel.org>, gnstark@...utedevices.com, 
	linux-mm <linux-mm@...ck.org>, Jinliang Zheng <alexjlzheng@...cent.com>, 
	Dan Carpenter <dan.carpenter@...aro.org>, Arnd Bergmann <arnd@...db.de>, 
	Mickaël Salaün <mic@...ikod.net>, 
	Brendan Higgins <brendanhiggins@...gle.com>, "Eric W. Biederman" <ebiederm@...ssion.com>, 
	Shuah Khan <skhan@...uxfoundation.org>
Subject: Re: [PATCH 6.9 000/197] 6.9.9-rc1 review

On Wed, 10 Jul 2024 at 08:41, Naresh Kamboju <naresh.kamboju@...aro.org> wrote:
>
> On Tue, 9 Jul 2024 at 16:49, Greg Kroah-Hartman
> <gregkh@...uxfoundation.org> wrote:
> >
> > This is the start of the stable review cycle for the 6.9.9 release.
> > There are 197 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 11 Jul 2024 11:06:25 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> >         https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.9.9-rc1.gz
> > or in the git tree and branch at:
> >         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.9.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> Results from Linaro’s test farm.
> The following kernel panic was noticed while running kunit tests.
>
> First seen on
> git commit id: 2471237b27c681c22e5f2b7175584aa7d5c89bfc
> date: on July 9th 2024.
>
>   GOOD: v6.9.7-223-g03247eed042d
>   BAD: v6.9.8-198-g2471237b27c6
>
> Always reproduce: yes.
>
> * qemu-arm64, Juno-r2, rk3399-rock-pi-4b and qemu-x86_64 the kunit-boot failed.
>   - gcc-13-defconfig-kunit
>   - clang-18-defconfig-kunit
>   - clang-nightly-defconfig-kunit
>   - gcc-8-defconfig-kunit
>
> Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
>
> Crash log [1]:
> ---------
> <4>[ 63.601683] kernel_init_freeable (init/main.c:1559)
> <4>[ 63.602335] kernel_init (init/main.c:1446)
> <4>[ 63.602980] ret_from_fork (arch/arm64/kernel/entry.S:861)
> <3>[   63.603703]
> <3>[   63.604057] The buggy address belongs to the object at fff00000c07582e8
> <3>[   63.604057]  which belongs to the cache inode_cache of size 616
> <3>[   63.605281] The buggy address is located 80 bytes to the right of
> <3>[   63.605281]  allocated 616-byte region [fff00000c07582e8,
> fff00000c0758550)
> <3>[   63.606592]
> <3>[   63.607294] The buggy address belongs to the physical page:
> <4>[   63.607948] page: refcount:1 mapcount:0 mapping:0000000000000000
> index:0x0 pfn:0x100758
> <4>[   63.608830] head: order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> <4>[   63.609593] flags:
> 0xbfffe0000000840(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
> <4>[   63.610449] page_type: 0xffffffff()
> <4>[   63.611934] raw: 0bfffe0000000840 fff00000c08d2dc0
> dead000000000122 0000000000000000
> <4>[   63.612810] raw: 0000000000000000 0000000080160016
> 00000001ffffffff 0000000000000000
> <4>[   63.613668] head: 0bfffe0000000840 fff00000c08d2dc0
> dead000000000122 0000000000000000
> <4>[   63.614525] head: 0000000000000000 0000000080160016
> 00000001ffffffff 0000000000000000
> <4>[   63.615694] head: 0bfffe0000000002 ffffc1ffc301d601
> ffffc1ffc301d648 00000000ffffffff
> <4>[   63.616583] head: 0000000400000000 0000000000000000
> 00000000ffffffff 0000000000000000
> <4>[   63.617391] page dumped because: kasan: bad access detected
> <3>[   63.618034]
> <3>[   63.618354] Memory state around the buggy address:
> <3>[   63.619281]  fff00000c0758480: 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00
> <3>[   63.620086]  fff00000c0758500: 00 00 00 00 00 00 00 00 00 00 fc
> fc fc fc fc fc
> <3>[   63.620886] >fff00000c0758580: fc fc fc fc fc fc fc fc fc fc 00
> 00 00 00 00 00
> <3>[   63.621664]                                ^
> <3>[   63.622235]  fff00000c0758600: 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00
> <3>[   63.623362]  fff00000c0758680: 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00
> <3>[   63.624138]
> ==================================================================
> <6>[   63.645148]         not ok 1 block_bits=10 cluster_bits=3
> blocks_per_group=8192 group_count=4 desc_size=64
> <6>[   63.658504]         ok 2 block_bits=12 cluster_bits=3
> blocks_per_group=8192 group_count=4 desc_size=64
> <1>[   63.674531] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000008
> <1>[   63.675691] Mem abort info:
> <1>[   63.676527]   ESR = 0x000000009600006b
> <1>[   63.677658]   EC = 0x25: DABT (current EL), IL = 32 bits
> <1>[   63.678274]   SET = 0, FnV = 0
> <1>[   63.678906]   EA = 0, S1PTW = 0
> <1>[   63.679880]   FSC = 0x2b: level -1 translation fault
> <1>[   63.680879] Data abort info:
> <1>[   63.681606]   ISV = 0, ISS = 0x0000006b, ISS2 = 0x00000000
> <1>[   63.682544]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> <1>[   63.683493]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> <1>[   63.684732] [0000000000000008] user address but active_mm is swapper
> <0>[   63.686080] Internal error: Oops: 000000009600006b [#1] PREEMPT SMP
> <4>[   63.688843] Modules linked in:
> <4>[   63.689662] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B
>     N 6.9.9-rc1 #1
> <4>[   63.690522] Hardware name: linux,dummy-virt (DT)
> <4>[   63.691532] pstate: 224000c9 (nzCv daIF +PAN -UAO +TCO -DIT
> -SSBS BTYPE=--)
> <4>[ 63.692423] pc : _raw_spin_lock_irq
> (arch/arm64/include/asm/atomic_lse.h:271
> arch/arm64/include/asm/cmpxchg.h:120
> arch/arm64/include/asm/cmpxchg.h:169
> include/linux/atomic/atomic-arch-fallback.h:2055
> include/linux/atomic/atomic-arch-fallback.h:2173
> include/linux/atomic/atomic-instrumented.h:1302
> include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187
> include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> <4>[ 63.693097] lr : _raw_spin_lock_irq
> (include/linux/atomic/atomic-arch-fallback.h:2172 (discriminator 1)
> include/linux/atomic/atomic-instrumented.h:1302 (discriminator 1)
> include/asm-generic/qspinlock.h:111 (discriminator 1)
> include/linux/spinlock.h:187 (discriminator 1)
> include/linux/spinlock_api_smp.h:120 (discriminator 1)
> kernel/locking/spinlock.c:170 (discriminator 1))
> <4>[   63.693714] sp : ffff800080087620
> <4>[   63.694519] x29: ffff800080087680 x28: 1ffff00010010f45 x27:
> ffff800080087a20
> <4>[   63.696315] x26: 1ffff00010010f47 x25: ffff800080087a00 x24:
> fff00000c99f0028
> <4>[   63.697334] x23: 0000000000000000 x22: dfff800000000000 x21:
> ffff800080087640
> <4>[   63.698333] x20: 1ffff00010010ec4 x19: 0000000000000008 x18:
> 000000004b9fd0a9
> <4>[   63.699711] x17: 0000000000000000 x16: fff00000da13e180 x15:
> ffffaa8e3330c4c4
> <4>[   63.700732] x14: ffffaa8e33312508 x13: ffffaa8e3583b0ec x12:
> ffff700010010ec9
> <4>[   63.701734] x11: 1ffff00010010ec8 x10: ffff700010010ec8 x9 :
> dfff800000000000
> <4>[   63.702792] x8 : 0000000000000003 x7 : 0000000000000001 x6 :
> ffff700010010ec8
> <4>[   63.704539] x5 : ffff800080087640 x4 : ffff700010010ec8 x3 :
> ffffaa8e3585c520
> <4>[   63.705529] x2 : 0000000000000001 x1 : 0000000000000000 x0 :
> 0000000000000000
> <4>[   63.706625] Call trace:
> <4>[ 63.707408] _raw_spin_lock_irq
> (arch/arm64/include/asm/atomic_lse.h:271
> arch/arm64/include/asm/cmpxchg.h:120
> arch/arm64/include/asm/cmpxchg.h:169
> include/linux/atomic/atomic-arch-fallback.h:2055
> include/linux/atomic/atomic-arch-fallback.h:2173
> include/linux/atomic/atomic-instrumented.h:1302
> include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187
> include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> <4>[ 63.708130] wait_for_completion_timeout
> (kernel/sched/completion.c:84 kernel/sched/completion.c:116
> kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> <4>[ 63.708902] kunit_try_catch_run (lib/kunit/try-catch.c:85)
> <4>[ 63.709710] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> <4>[ 63.710558] kunit_run_tests (lib/kunit/test.c:649)
> <4>[ 63.711555] __kunit_test_suites_init (lib/kunit/test.c:732
> (discriminator 1))
> <4>[ 63.712489] kunit_run_all_tests (lib/kunit/executor.c:276
> lib/kunit/executor.c:392)
> <4>[ 63.713317] kernel_init_freeable (init/main.c:1559)
> <4>[ 63.713971] kernel_init (init/main.c:1446)
> <4>[ 63.714503] ret_from_fork (arch/arm64/kernel/entry.S:861)
> <0>[ 63.716154] Code: 93407c02 d503201f 2a0003e1 52800022 (88e17e62)
> All code
> ========
>    0: 93407c02 sxtw x2, w0
>    4: d503201f nop
>    8: 2a0003e1 mov w1, w0
>    c: 52800022 mov w2, #0x1                    // #1
>   10:* 88e17e62 casa w1, w2, [x19] <-- trapping instruction
>
> Code starting with the faulting instruction
> ===========================================
>    0: 88e17e62 casa w1, w2, [x19]
> <4>[   63.717705] ---[ end trace 0000000000000000 ]---
> <6>[   63.718649] note: swapper/0[1] exited with irqs disabled
> <6>[   63.720758] note: swapper/0[1] exited with preempt_count 1
> <0>[   63.722091] Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x0000000b
> <2>[   63.724608] SMP: stopping secondary CPUs
> <0>[   63.725877] Kernel Offset: 0x2a8db2200000 from 0xffff800080000000
> <0>[   63.726534] PHYS_OFFSET: 0x40000000
> <0>[   63.727664] CPU features: 0x0,00000006,8f17bd7c,6766773f
> <0>[   63.729291] Memory Limit: none
> <0>[   63.731074] ---[ end Kernel panic - not syncing: Attempted to
> kill init! exitcode=0x0000000b ]---
>
> Steps to reproduce on qemu-arm64 link provided [2].
> Build artifacts arm64 link provided [3].
> The Kconfig is built with defconfig+Kunit and the config link provided [4].
> This occurred on following Toolchain builds gcc-13, clang-18,
> clang-nightly and gcc-8.
>
> Reproducible always on following devices and emulators,
>  * rk3399-rock-pi-4b
>  * qemu-arm64
>  * qemu-x86_64
>
> I am bisecting this reported issue.

Anders, bisected this issue and found this as the first bad commit and after
reverting this patch the boot pass and kernel panic were not found.

# first bad commit: [6b91f756f49a16223085b38cedb7e297d134d64d]
    kunit: Handle test faults
    [ Upstream commit 3a35c13007dea132a65f07de05c26b87837fadc2 ]


> Please let me know if you need more information.
>
> Links:
> [1] https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.9.y/build/v6.9.8-198-g2471237b27c6/testrun/24552495/suite/boot/test/gcc-13-lkftconfig-kunit/log
> [2] https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2j0YCwfoW0D9nrPUAeDspHG0iFE/reproducer
> [3] https://storage.tuxsuite.com/public/linaro/lkft/builds/2j0Y8lX6zRNGTiGTZmteMwzwHj2/
> [4] https://storage.tuxsuite.com/public/linaro/lkft/builds/2j0Y8lX6zRNGTiGTZmteMwzwHj2/config


Bisection log:
------
$ git bisect log
# bad: [6ced42af8b2d3a7dfca7121f6f01b19aa818eeab] Linux 6.9.9-rc1
# good: [2106717b5214895a29c038b37f15d78a3202aa2b] Linux 6.9.8
git bisect start 'stable-rc/linux-6.9.y' 'v6.9.8'
# bad: [30dbb09f028b3de9e304e3ab0bbf123613d6b61f] KVM: s390: fix LPSWEY handling
git bisect bad 30dbb09f028b3de9e304e3ab0bbf123613d6b61f
# good: [81d79946a4623d2d2ce21e68ccd1e9997119e8ff]
thermal/drivers/mediatek/lvts_thermal: Check NULL ptr on lvts_data
git bisect good 81d79946a4623d2d2ce21e68ccd1e9997119e8ff
# bad: [9a2f3e9cd7906c3dc5eebcc1d981c3403c742ea7] s390/pkey: Wipe
sensitive data on failure
git bisect bad 9a2f3e9cd7906c3dc5eebcc1d981c3403c742ea7
# good: [1b4c766b624614df4deaf51d20dc596859ec8254] orangefs: fix
out-of-bounds fsid access
git bisect good 1b4c766b624614df4deaf51d20dc596859ec8254
# bad: [867bfb350f89b328fe30537afda02fe4c4ae23f5] bpf: Avoid
uninitialized value in BPF_CORE_READ_BITFIELD
git bisect bad 867bfb350f89b328fe30537afda02fe4c4ae23f5
# bad: [18363e5c5cbf3c51d0d857e4b9706674cbfd5a96] powerpc/xmon: Check
cpu id in commands "c#", "dp#" and "dx#"
git bisect bad 18363e5c5cbf3c51d0d857e4b9706674cbfd5a96
# bad: [6b91f756f49a16223085b38cedb7e297d134d64d] kunit: Handle test faults
git bisect bad 6b91f756f49a16223085b38cedb7e297d134d64d
# good: [3bbd81663033fa72cac3a867e5284b925b445ad6] kunit: Fix timeout message
git bisect good 3bbd81663033fa72cac3a867e5284b925b445ad6
# first bad commit: [6b91f756f49a16223085b38cedb7e297d134d64d] kunit:
Handle test faults

-  Naresh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ