lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200623191335.GA86296@elver.google.com>
Date:   Tue, 23 Jun 2020 21:13:35 +0200
From:   Marco Elver <elver@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     "Ahmed S. Darwish" <a.darwish@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, Will Deacon <will@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        the arch/x86 maintainers <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Steven Rostedt <rostedt@...dmis.org>, bigeasy@...utronix.de,
        "David S. Miller" <davem@...emloft.net>,
        sparclinux@...r.kernel.org, Michael Ellerman <mpe@...erman.id.au>,
        linuxppc-dev@...ts.ozlabs.org, heiko.carstens@...ibm.com,
        linux-s390@...r.kernel.org, linux@...linux.org.uk,
        Qian Cai <cai@....pw>
Subject: Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to
 per-cpu variables

On Tue, Jun 23, 2020 at 08:39PM +0200, Marco Elver wrote:
> On Tue, 23 Jun 2020 at 20:13, Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > On Tue, Jun 23, 2020 at 07:59:57PM +0200, Marco Elver wrote:
> > > On Tue, Jun 23, 2020 at 06:37PM +0200, Peter Zijlstra wrote:
> > > > On Tue, Jun 23, 2020 at 06:13:21PM +0200, Ahmed S. Darwish wrote:
> > > > > Well, freshly merged code is using it. For example, KCSAN:
> > > > >
> > > > >     => f1bc96210c6a ("kcsan: Make KCSAN compatible with lockdep")
> > > > >     => kernel/kcsan/report.c:
> > > > >
> > > > >     void kcsan_report(...)
> > > > >     {
> > > > >   ...
> > > > >         /*
> > > > >          * With TRACE_IRQFLAGS, lockdep's IRQ trace state becomes corrupted if
> > > > >          * we do not turn off lockdep here; this could happen due to recursion
> > > > >          * into lockdep via KCSAN if we detect a race in utilities used by
> > > > >          * lockdep.
> > > > >          */
> > > > >         lockdep_off();
> > > > >   ...
> > > > >     }
> > > >
> > > > Marco, do you remember what exactly happened there? Because I'm about to
> > > > wreck that. That is, I'm going to make TRACE_IRQFLAGS ignore
> > > > lockdep_off().
> > >
> > > Yeah, I was trying to squash any kind of recursion:
> > >
> > >       lockdep -> other libs ->
> > >               -> KCSAN
> > >               -> print report
> > >               -> dump stack, printk and friends
> > >               -> lockdep -> other libs
> > >                       -> KCSAN ...
> > >
> > > Some history:
> > >
> > > * Initial patch to fix:
> > >       https://lore.kernel.org/lkml/20200115162512.70807-1-elver@google.com/
> >
> > That patch is weird; just :=n on lockdep.c should've cured that, the
> > rest is massive overkill.
> >
> > > * KCSAN+lockdep+ftrace:
> > >       https://lore.kernel.org/lkml/20200214211035.209972-1-elver@google.com/
> >
> > That doesn't really have anything useful..
> >
> > > lockdep now has KCSAN_SANITIZE := n, but we still need to ensure that
> > > there are no paths out of lockdep, or the IRQ flags tracing code, that
> > > might lead through other libs, through KCSAN, libs used to generate a
> > > report, and back to lockdep.
> > >
> > > I never quite figured out the exact trace that led to corruption, but
> > > avoiding any kind of potential for recursion was the only thing that
> > > would avoid the check_flags() warnings.
> >
> > Fair enough; I'll rip it all up and boot a KCSAN kernel, see what if
> > anything happens.
> 
> Thanks!
> 
> This was happening with Qian Cai's (Cc'd) test cases. If the kernel or
> this patch changed things around so this doesn't happen anymore
> regardless, then I don't see a problem.

I see the below report when I boot with your branch + KCSAN and
PROVE_LOCKING. config attached. Trying to make sense of what's
happening.

Thanks,
-- Marco

------ >8 ------

[   10.182354] ------------[ cut here ]------------
[   10.183058] WARNING: CPU: 7 PID: 136 at kernel/locking/lockdep.c:398 lockdep_hardirqs_on_prepare+0x1c6/0x270
[   10.184347] Modules linked in:
[   10.184771] CPU: 7 PID: 136 Comm: systemd-journal Not tainted 5.8.0-rc1+ #3
[   10.185706] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
[   10.186821] RIP: 0010:lockdep_hardirqs_on_prepare+0x1c6/0x270
[   10.187594] Code: 75 28 65 48 8b 04 25 28 00 00 00 48 3b 44 24 08 0f 85 b9 00 00 00 48 83 c4 10 5b 41 5e 41 5f c3 65 48 ff 05 d4 24 4e 75 eb d8 <0f> 0b 90 41 c7 86 c4 08 00 00 00 00 00 00 eb c8 e8 65 09 71 01 85
[   10.190203] RSP: 0018:ffffa7ee802b7848 EFLAGS: 00010017
[   10.190989] RAX: 0000000000000001 RBX: ffff955e92a34ab0 RCX: 0000000000000001
[   10.192053] RDX: 0000000000000006 RSI: ffff955e92a34a88 RDI: ffff955e92a341c0
[   10.193117] RBP: ffffa7ee802b7be8 R08: 0000000000000000 R09: 0000ffffffffffff
[   10.194186] R10: 0000ffffffffffff R11: 0000ffff8d07e268 R12: 0000000000000001
[   10.195249] R13: ffffffff8e41bb10 R14: ffff955e92a341c0 R15: 0000000000000001
[   10.196312] FS:  00007fd6862aa8c0(0000) GS:ffff955e9fd80000(0000) knlGS:0000000000000000
[   10.197513] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   10.198373] CR2: 00007fd6837dd000 CR3: 0000000812acc001 CR4: 0000000000760ee0
[   10.199436] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   10.200494] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   10.201554] PKRU: 55555554
[   10.201967] Call Trace:
[   10.202348]  ? _raw_spin_unlock_irqrestore+0x40/0x70
[   10.203093]  trace_hardirqs_on+0x56/0x60			<----- enter IRQ flags tracing code?
[   10.203686]  _raw_spin_unlock_irqrestore+0x40/0x70  		<----- take report_lock
[   10.204406]  prepare_report+0x11f/0x150
[   10.204986]  kcsan_report+0xca/0x6c0 			<----- generating a KCSAN report
[   10.205529]  ? __this_cpu_preempt_check+0x18/0x20
[   10.206236]  ? validate_chain+0x104/0x3320
[   10.206853]  ? __lock_acquire+0x9b2/0x10b0
[   10.207472]  ? __lock_acquire+0x9b2/0x10b0
[   10.208091]  ? lock_acquire+0xd3/0x2c0
[   10.208665]  ? kernfs_iop_permission+0x3f/0x170
[   10.209347]  ? lockref_get_not_dead+0x14/0x60
[   10.210012]  ? kernfs_iop_permission+0x3f/0x170
[   10.210698]  ? __mutex_trylock_or_owner+0x9a/0xf0
[   10.211404]  ? kernfs_iop_permission+0x3f/0x170
[   10.212083]  ? __mutex_lock+0x226/0x710
[   10.212669]  kcsan_found_watchpoint+0xe5/0x110
[   10.213334]  kernfs_iop_permission+0x72/0x170
[   10.214000]  ? kernfs_evict_inode+0x40/0x40
[   10.214632]  link_path_walk+0x667/0x840
[   10.215215]  path_lookupat+0x7b/0x4b0
[   10.215774]  filename_lookup+0xe9/0x330
[   10.216355]  ? getname_flags+0x7a/0x370
[   10.216939]  ? __might_fault+0x5/0x10
[   10.217495]  ? strncpy_from_user+0x190/0x240
[   10.218149]  user_path_at_empty+0x3b/0x50
[   10.218757]  do_faccessat+0x1e8/0x430
[   10.219313]  __x64_sys_access+0x33/0x40
[   10.219896]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   10.220679]  do_syscall_64+0x65/0xa0
[   10.221221]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   10.221987] RIP: 0033:0x7fd6855679c7
[   10.222528] Code: Bad RIP value.
[   10.223010] RSP: 002b:00007ffcca601df8 EFLAGS: 00000246 ORIG_RAX: 0000000000000015
[   10.224123] RAX: ffffffffffffffda RBX: 00007ffcca601e80 RCX: 00007fd6855679c7
[   10.225173] RDX: 00302e33303a3030 RSI: 0000000000000000 RDI: 00007ffcca601e00
[   10.226229] RBP: 00007ffcca602f20 R08: 000000000000c0c0 R09: 0000000000000120
[   10.227283] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffcca602f40
[   10.228339] R13: 000055878dd137c8 R14: 0000000000000018 R15: 00007ffcca601e00
[   10.229400] irq event stamp: 63012
[   10.229922] hardirqs last  enabled at (63011): [<ffffffff8adf9f68>] step_into+0x328/0xd20
[   10.231130] hardirqs last disabled at (63012): [<ffffffff8c24f99e>] _raw_spin_lock_irqsave+0x2e/0x80
[   10.232415] softirqs last  enabled at (62514): [<ffffffff8c400fd2>] asm_call_on_stack+0x12/0x20
[   10.233564] softirqs last disabled at (62327): [<ffffffff8c400fd2>] asm_call_on_stack+0x12/0x20
[   10.234718] ---[ end trace 696b39497a6233dc ]---

View attachment ".config" of type "text/plain" (130061 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ