lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANpmjNMXD3Qzj748CXWtmenxx4cC3Q8Fr70L5PWNe6ZSARcZ9w@mail.gmail.com>
Date:   Tue, 14 Jan 2020 13:51:01 +0100
From:   Marco Elver <elver@...gle.com>
To:     Qian Cai <cai@....pw>
Cc:     LKMM Maintainers -- Akira Yokosawa <akiyks@...il.com>,
        Alan Stern <stern@...land.harvard.edu>,
        Alexander Potapenko <glider@...gle.com>,
        Andrea Parri <parri.andrea@...il.com>,
        Andrey Konovalov <andreyknvl@...gle.com>,
        Andy Lutomirski <luto@...nel.org>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        Arnd Bergmann <arnd@...db.de>,
        Boqun Feng <boqun.feng@...il.com>,
        Borislav Petkov <bp@...en8.de>, Daniel Axtens <dja@...ens.net>,
        Daniel Lustig <dlustig@...dia.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        David Howells <dhowells@...hat.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Joel Fernandes <joel@...lfernandes.org>,
        Jonathan Corbet <corbet@....net>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Mark Rutland <Mark.Rutland@....com>,
        Nicholas Piggin <npiggin@...il.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Will Deacon <will@...nel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        kasan-dev <kasan-dev@...glegroups.com>,
        linux-arch <linux-arch@...r.kernel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        linux-efi@...r.kernel.org,
        Linux Kbuild mailing list <linux-kbuild@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        "the arch/x86 maintainers" <x86@...nel.org>
Subject: Re: [PATCH v4 01/10] kcsan: Add Kernel Concurrency Sanitizer infrastructure

On Tue, 14 Jan 2020 at 12:08, Qian Cai <cai@....pw> wrote:
>
>
>
> > On Jan 6, 2020, at 7:47 AM, Marco Elver <elver@...gle.com> wrote:
> >
> > Thanks, I'll look into KCSAN + lockdep compatibility. It's probably
> > missing some KCSAN_SANITIZE := n in some Makefile.
>
> Can I have a update on fixing this? It looks like more of a problem that kcsan_setup_watchpoint() will disable IRQs and then dive into the page allocator where it would complain because it might sleep.

KCSAN does *not* keep IRQs disabled (we have a clear irqsave / restore
pair kcsan_setup_watchpoint).

If you look closer at the warning you sent in this thread, the warning
is not generated because IRQs are off when it wants to sleep, but
rather because IRQs are enabled but IRQ tracing state is inconsistent:
"DEBUG_LOCKS_WARN_ON(!current->hardirqs_enabled)" in lockdep checks
that if IRQs are enabled, the trace state matches. These are only
checked with LOCKDEP_DEBUG and TRACE_IRQFLAGS.

In other words, IRQ trace flags got corrupted somewhere. AFAIK, this
problem here is only relevant with TRACE_IRQFLAGS -- again, it is
clear that IRQs are enabled but the IRQ tracing logic somehow ended up
corrupting hardirqs_enabled (TRACE_IRQFLAGS).

I believe this patch will take care of this issue:
http://lkml.kernel.org/r/20200114124919.11891-1-elver@google.com

Thanks,
-- Marco

> BTW, I saw Paul sent a pull request for 5.6 but it is ugly to have everybody could trigger a deadlock (sleep function called in atomic context) like this during boot once this hits the mainline not to mention about only recently it is possible to test this feature (thanks to warning ratelimit) with the existing debugging options because it was unable to boot due to the brokenness with debug_pagealloc as mentioned in this thread, so this does sounds like it needs more soak time for the mainline to me.
>
> 0000000000000400
> [   13.416814][    T1] Call Trace:
> [   13.416814][    T1]  lock_is_held_type+0x66/0x160
> [   13.416814][    T1]  ___might_sleep+0xc1/0x1d0
> [   13.416814][    T1]  __might_sleep+0x5b/0xa0
> [   13.416814][    T1]  slab_pre_alloc_hook+0x7b/0xa0
> [   13.416814][    T1]  __kmalloc_node+0x60/0x300
> [   13.416814   T1]  ? alloc_cpumask_var_node+0x44/0x70
> [   13.416814][    T1]  ? topology_phys_to_logical_die+0x7e/0x180
> [   13.416814][    T1]  alloc_cpumask_var_node+0x44/0x70
> [   13.416814][    T1]  zalloc_cpumask_var+0x2a/0x40
> [   13.416814][    T1]  native_smp_prepare_cpus+0x246/0x425
> [   13.416814][    T1]  kernel_init_freeable+0x1b8/0x496
> [   13.416814][    T1]  ? rest_init+0x381/0x381
> [   13.416814][    T1]  kernel_init+0x18/0x17f
> [   13.416814][    T1]  ? rest_init+0x381/0x381
> [   13.416814][    T1]  ret_from_fork+0x3a/0x50
> [   13.416814][    T1] irq event stamp: 910
> [   13.416814][    T1] hardirqs last  enabled at (909): [<ffffffff8d1240f3>] _raw_write_unlock_irqrestore+0x53/0x57
> [   13.416814][    T1] hardirqs last disabled at (910): [<ffffffff8c8bba76>] kcsan_setup_watchpoint+0x96/0x460
> [   13.416814][    T1] softirqs last  enabled at (0): [<ffffffff8c6b697a>] copy_process+0x11fa/0x34f0
> [   13.416814][    T1] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [   13.416814][    T1] ---[ end trace 7d1df66da055aa92 ]---
> [   13.416814][    T1] possible reason: unannotated irqs-on.
> [   13.416814][ent stamp: 910
> [   13.416814][    T1] hardirqs last  enabled at (909): [<ffffffff8d1240f3>] _raw_write_unlock_irqrestore+0x53/0x57
> [   13.416814][    T1] hardirqs last disabled at (910): [<ffffffff8c8bba76>] kcsan_setup_watchpoint+0x96/0x460
> [   13.416814][    T1] softirqs last  enabled at (0): [<ffffffff8c6b697a>] copy_process+0x11fa/0x34f0
> [   13.416814][    T1] softirqs last disabled at (0): [<0000000000000000>] 0x0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ