Message-ID: <20240411.eogh8Ainuema@digikod.net>
Date: Thu, 11 Apr 2024 10:48:43 +0200
From: Mickaël Salaün <mic@...ikod.net>
To: David Gow <davidgow@...gle.com>
Cc: Will Deacon <will@...nel.org>, 
	Naresh Kamboju <naresh.kamboju@...aro.org>, keescook@...omium.org, rmoar@...gle.com, 
	lkft-triage@...ts.linaro.org, kunit-dev@...glegroups.com, linux-kernel@...r.kernel.org, 
	peterz@...radead.org, mingo@...hat.com, longman@...hat.com, boqun.feng@...il.com, 
	anders.roxell@...aro.org, dan.carpenter@...aro.org, arnd@...db.de, linux@...ck-us.net, 
	Linux Kernel Functional Testing <lkft@...aro.org>
Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Thu, Apr 11, 2024 at 12:25:40PM +0800, David Gow wrote:
> On Wed, 10 Apr 2024 at 23:23, Will Deacon <will@...nel.org> wrote:
> >
> > On Wed, Apr 10, 2024 at 03:57:10PM +0530, Naresh Kamboju wrote:
> > > Following kernel crash noticed on Linux next-20240410 tag while running
> > > kunit testing on qemu-arm64 and qemu-x86_64.
> > >
> > > Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
> > >
> > > Crash log on qemu-arm64:
> > > ----------------
> > > <3>[ 30.465716] BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > > <3>[   30.467097] Write of size 4 at addr 0000000000000008 by task swapper/0/1
> > > <3>[   30.468059]
> > > <3>[   30.468393] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B            N 6.9.0-rc3-next-20240410 #1
> > > <3>[   30.469209] Hardware name: linux,dummy-virt (DT)
> > > <3>[   30.469645] Call trace:
> > > <3>[ 30.469919] dump_backtrace (arch/arm64/kernel/stacktrace.c:319)
> > > <3>[ 30.471622] show_stack (arch/arm64/kernel/stacktrace.c:326)
> > > <3>[ 30.472124] dump_stack_lvl (lib/dump_stack.c:117)
> > > <3>[ 30.472947] print_report (mm/kasan/report.c:493)
> > > <3>[ 30.473755] kasan_report (mm/kasan/report.c:603)
> > > <3>[ 30.474524] kasan_check_range (mm/kasan/generic.c:175 mm/kasan/generic.c:189)
> > > <3>[ 30.475094] __kasan_check_write (mm/kasan/shadow.c:38)
> > > <3>[ 30.475683] _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > > <3>[ 30.476257] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> > > <3>[ 30.476909] kunit_try_catch_run (lib/kunit/try-catch.c:86)
> > > <3>[ 30.477628] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> > > <3>[ 30.478311] kunit_run_tests (lib/kunit/test.c:635)
> > > <3>[ 30.478865] __kunit_test_suites_init (lib/kunit/test.c:729 (discriminator 1))
> > > <3>[ 30.479482] kunit_run_all_tests (lib/kunit/executor.c:276 lib/kunit/executor.c:392)
> > > <3>[ 30.480079] kernel_init_freeable (init/main.c:1578)
> > > <3>[ 30.480747] kernel_init (init/main.c:1465)
> > > <3>[ 30.481474] ret_from_fork (arch/arm64/kernel/entry.S:861)
> > > <3>[   30.482080] ==================================================================
> > > <1>[   30.484503] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> > > <1>[   30.485369] Mem abort info:
> > > <1>[   30.485923]   ESR = 0x000000009600006b
> > > <1>[   30.486943]   EC = 0x25: DABT (current EL), IL = 32 bits
> > > <1>[   30.487540]   SET = 0, FnV = 0
> > > <1>[   30.488007]   EA = 0, S1PTW = 0
> > > <1>[   30.488509]   FSC = 0x2b: level -1 translation fault
> > > <1>[   30.489150] Data abort info:
> > > <1>[   30.489610]   ISV = 0, ISS = 0x0000006b, ISS2 = 0x00000000
> > > <1>[   30.490360]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> > > <1>[   30.491057]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > > <1>[   30.491822] [0000000000000008] user address but active_mm is swapper
> > > <0>[   30.493008] Internal error: Oops: 000000009600006b [#1] PREEMPT SMP
> > > <4>[   30.494105] Modules linked in:
> > > <4>[   30.496244] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B            N 6.9.0-rc3-next-20240410 #1
> > > <4>[   30.497171] Hardware name: linux,dummy-virt (DT)
> > > <4>[   30.497905] pstate: 224000c9 (nzCv daIF +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> > > <4>[ 30.498895] pc : _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > > <4>[ 30.499542] lr : _raw_spin_lock_irq (include/linux/atomic/atomic-arch-fallback.h:2172 (discriminator 1) include/linux/atomic/atomic-instrumented.h:1302 (discriminator 1) include/asm-generic/qspinlock.h:111 (discriminator 1) include/linux/spinlock.h:187 (discriminator 1) include/linux/spinlock_api_smp.h:120 (discriminator 1) kernel/locking/spinlock.c:170 (discriminator 1))
> > >
> > > <trim>
> >
> > It's a shame that you have trimmed the register dump here.
> >
> > > <4>[   30.511022] Call trace:
> > > <4>[ 30.511437] _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > > <4>[ 30.512013] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> > > <4>[ 30.512627] kunit_try_catch_run (lib/kunit/try-catch.c:86)
> > > <4>[ 30.513188] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> > > <4>[ 30.513801] kunit_run_tests (lib/kunit/test.c:635)
> >
> > Ok, so 'task_struct->vfork_done' is NULL. Looks like this code was added
> > recently, so adding Mickaël to cc.
> >
> 
> Thanks. This looks like a race condition where the KUnit test kthread
> can terminate before we wait on it.
> 
> Mickaël, does this seem like a correct fix to you?
> ---
> From: David Gow <davidgow@...gle.com>
> Date: Thu, 11 Apr 2024 12:07:47 +0800
> Subject: [PATCH] kunit: Fix race condition in try-catch completion
> 
> KUnit's try-catch infrastructure now uses vfork_done, which is always
> set to a valid completion when a kthread is crated, but which is set to

s/crated/created/

> NULL once the thread terminates. This creates a race condition, where
> the kthread exits before we can wait on it.
> 
> Keep a copy of vfork_done, which is taken before we wake_up_process()
> and so valid, and wait on that instead.
> 
> Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
> Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
> Signed-off-by: David Gow <davidgow@...gle.com>

Minor suggestions, but it looks good. Thanks!

Acked-by: Mickaël Salaün <mic@...ikod.net>


> ---
> lib/kunit/try-catch.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
> index fa687278ccc9..fc6cd4d7e80f 100644
> --- a/lib/kunit/try-catch.c
> +++ b/lib/kunit/try-catch.c
> @@ -63,6 +63,7 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
> {
>        struct kunit *test = try_catch->test;
>        struct task_struct *task_struct;
> +       struct completion *task_done;
>        int exit_code, time_remaining;
> 
>        try_catch->context = context;
> @@ -75,13 +76,14 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
>                return;
>        }
>        get_task_struct(task_struct);
> +       task_done = task_struct->vfork_done;
>        wake_up_process(task_struct);

>        /*
>         * As for a vfork(2), task_struct->vfork_done (pointing to the
>         * underlying kthread->exited) can be used to wait for the end of a
>         * kernel thread.

"kernel thread.  It is set to NULL when the thread ends."

>         */

This block comment can now be moved up where task_done is set.

> -       time_remaining = wait_for_completion_timeout(task_struct->vfork_done,
> +       time_remaining = wait_for_completion_timeout(task_done,
>                                                     kunit_test_timeout());
>        if (time_remaining == 0) {
>                try_catch->try_result = -ETIMEDOUT;
> --
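
For readers following the thread, the ordering problem the patch addresses
can be sketched in plain userspace C. This is only an illustration under
stated assumptions, not kernel code: struct fake_completion, struct
fake_task and every fake_*/wait_* function below are hypothetical stand-ins,
and the "thread" is modelled deterministically as running to completion the
moment it is woken, which is the worst case for the race.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; names are illustrative only. */
struct fake_completion {
	bool done;
};

struct fake_task {
	struct fake_completion exited;      /* plays the role of kthread->exited */
	struct fake_completion *vfork_done; /* set to NULL once the task ends */
};

void fake_task_init(struct fake_task *t)
{
	t->exited.done = false;
	t->vfork_done = &t->exited;
}

/*
 * Worst case for the waiter: the woken task finishes immediately, signals
 * its completion and clears vfork_done before the caller gets to wait.
 */
void fake_wake_and_exit(struct fake_task *t)
{
	t->vfork_done->done = true;
	t->vfork_done = NULL;
}

/* Returns 1 if completed, 0 if not yet, -1 where the kernel would oops. */
int fake_wait(const struct fake_completion *c)
{
	if (c == NULL)
		return -1;
	return c->done ? 1 : 0;
}

/* Pre-patch ordering: read task->vfork_done only after the wake-up. */
int wait_racy_order(void)
{
	struct fake_task t;

	fake_task_init(&t);
	fake_wake_and_exit(&t);
	return fake_wait(t.vfork_done);
}

/* Patched ordering: copy the pointer first, then wake, then wait on it. */
int wait_fixed_order(void)
{
	struct fake_task t;
	struct fake_completion *task_done;

	fake_task_init(&t);
	task_done = t.vfork_done;
	fake_wake_and_exit(&t);
	return fake_wait(task_done);
}

/* Small driver: prints both outcomes and checks the patched one. */
int run_demo(void)
{
	int racy = wait_racy_order();
	int fixed = wait_fixed_order();

	printf("racy order: %d, fixed order: %d\n", racy, fixed);
	assert(fixed == 1);
	return 0;
}
```

Calling run_demo() from a main() shows the two outcomes side by side. In the
sketch the saved pointer stays valid because the completion lives inside the
task object itself; in the patch, the get_task_struct() call taken before
the wake-up is presumably what keeps the real completion alive for the
waiter.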


