Message-ID: <CANpmjNMj8FZuBrZsH62V3bZEhFvT2zXwLusVOLwNuH_-kLhp2g@mail.gmail.com>
Date: Wed, 24 Jun 2020 12:17:56 +0200
From: Marco Elver <elver@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "Ahmed S. Darwish" <a.darwish@...utronix.de>,
Ingo Molnar <mingo@...nel.org>, Will Deacon <will@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
"the arch/x86 maintainers" <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Steven Rostedt <rostedt@...dmis.org>, bigeasy@...utronix.de,
"David S. Miller" <davem@...emloft.net>,
sparclinux@...r.kernel.org, Michael Ellerman <mpe@...erman.id.au>,
linuxppc-dev@...ts.ozlabs.org, heiko.carstens@...ibm.com,
linux-s390@...r.kernel.org, linux@...linux.org.uk,
"Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: [PATCH v4 7/8] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables

On Wed, 24 Jun 2020 at 11:01, Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Jun 23, 2020 at 10:24:04PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 23, 2020 at 08:12:32PM +0200, Peter Zijlstra wrote:
> > > Fair enough; I'll rip it all up and boot a KCSAN kernel, see what if
> > > anything happens.
> >
> > OK, so the below patch doesn't seem to have any nasty recursion issues
> > here. The only 'problem' is that lockdep now sees report_lock can cause
> > deadlocks.
> >
> > It is completely right about it too, but I don't suspect there's much we
> > can do about it, it's pretty much the standard printk() with scheduler
> > locks held report.
>
> So I've been getting tons and tons of this:
>
> [ 60.471348] ==================================================================
> [ 60.479427] BUG: KCSAN: data-race in __rcu_read_lock / __rcu_read_unlock
> [ 60.486909]
> [ 60.488572] write (marked) to 0xffff88840fff1cf0 of 4 bytes by interrupt on cpu 1:
> [ 60.497026] __rcu_read_lock+0x37/0x60
> [ 60.501214] cpuacct_account_field+0x1b/0x170
> [ 60.506081] task_group_account_field+0x32/0x160
> [ 60.511238] account_system_time+0xe6/0x110
> [ 60.515912] update_process_times+0x1d/0xd0
> [ 60.520585] tick_sched_timer+0xfc/0x180
> [ 60.524967] __hrtimer_run_queues+0x271/0x440
> [ 60.529832] hrtimer_interrupt+0x222/0x670
> [ 60.534409] __sysvec_apic_timer_interrupt+0xb3/0x1a0
> [ 60.540052] asm_call_on_stack+0x12/0x20
> [ 60.544434] sysvec_apic_timer_interrupt+0xba/0x130
> [ 60.549882] asm_sysvec_apic_timer_interrupt+0x12/0x20
> [ 60.555621] delay_tsc+0x7d/0xe0
> [ 60.559226] kcsan_setup_watchpoint+0x292/0x4e0
> [ 60.564284] __rcu_read_unlock+0x73/0x2c0
> [ 60.568763] __unlock_page_memcg+0xda/0xf0
> [ 60.573338] unlock_page_memcg+0x32/0x40
> [ 60.577721] page_remove_rmap+0x5c/0x200
> [ 60.582104] unmap_page_range+0x83c/0xc10
> [ 60.586582] unmap_single_vma+0xb0/0x150
> [ 60.590963] unmap_vmas+0x81/0xe0
> [ 60.594663] exit_mmap+0x135/0x2b0
> [ 60.598464] __mmput+0x21/0x150
> [ 60.601970] mmput+0x2a/0x30
> [ 60.605176] exit_mm+0x2fc/0x350
> [ 60.608780] do_exit+0x372/0xff0
> [ 60.612385] do_group_exit+0x139/0x140
> [ 60.616571] __do_sys_exit_group+0xb/0x10
> [ 60.621048] __se_sys_exit_group+0xa/0x10
> [ 60.625524] __x64_sys_exit_group+0x1b/0x20
> [ 60.630189] do_syscall_64+0x6c/0xe0
> [ 60.634182] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 60.639820]
> [ 60.641485] read to 0xffff88840fff1cf0 of 4 bytes by task 2430 on cpu 1:
> [ 60.648969] __rcu_read_unlock+0x73/0x2c0
> [ 60.653446] __unlock_page_memcg+0xda/0xf0
> [ 60.658019] unlock_page_memcg+0x32/0x40
> [ 60.662400] page_remove_rmap+0x5c/0x200
> [ 60.666782] unmap_page_range+0x83c/0xc10
> [ 60.671259] unmap_single_vma+0xb0/0x150
> [ 60.675641] unmap_vmas+0x81/0xe0
> [ 60.679341] exit_mmap+0x135/0x2b0
> [ 60.683141] __mmput+0x21/0x150
> [ 60.686647] mmput+0x2a/0x30
> [ 60.689853] exit_mm+0x2fc/0x350
> [ 60.693458] do_exit+0x372/0xff0
> [ 60.697062] do_group_exit+0x139/0x140
> [ 60.701248] __do_sys_exit_group+0xb/0x10
> [ 60.705724] __se_sys_exit_group+0xa/0x10
> [ 60.710201] __x64_sys_exit_group+0x1b/0x20
> [ 60.714872] do_syscall_64+0x6c/0xe0
> [ 60.718864] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 60.724503]
> [ 60.726156] Reported by Kernel Concurrency Sanitizer on:
> [ 60.732089] CPU: 1 PID: 2430 Comm: sshd Not tainted 5.8.0-rc2-00186-gb4ee11fe08b3-dirty #303
> [ 60.741510] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
> [ 60.752957] ==================================================================
>
> And I figured a quick way to get rid of that would be something like the
> below, seeing how volatile gets auto annotated... but that doesn't seem
> to actually work.
>
> What am I missing?

There's one more in include/linux/rcupdate.h. I suggested this at some point:

  https://lore.kernel.org/lkml/20200220213317.GA35033@google.com/

That would avoid the volatiles, as I don't think they are needed here.
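
[ Illustration only, not taken from the linked patch: a minimal user-space
sketch of marking individual accesses instead of declaring the variable
itself volatile. The READ_ONCE()/WRITE_ONCE() macros here are simplified
stand-ins for the kernel's, and `struct task`, `nesting`, and the `nest_*`
helpers are hypothetical names made up for this example. ]

```c
/* User-space sketch only; simplified stand-ins, not the kernel's definitions. */
#include <assert.h>

/*
 * Each access goes through a volatile-qualified lvalue, so the compiler
 * must emit exactly one load or store for it, while the variable itself
 * keeps its plain (non-volatile) type. Relies on the GCC/Clang
 * __typeof__ extension.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical per-task nesting counter, declared without volatile. */
struct task {
	int nesting;
};

/* Enter a nested section: marked load, then marked store. */
static void nest_inc(struct task *t)
{
	WRITE_ONCE(t->nesting, READ_ONCE(t->nesting) + 1);
}

/* Leave a nested section; an unbalanced call trips the assertion. */
static void nest_dec(struct task *t)
{
	int depth = READ_ONCE(t->nesting);

	assert(depth > 0);
	WRITE_ONCE(t->nesting, depth - 1);
}

/* Report the current depth with a single marked load. */
static int nest_depth(struct task *t)
{
	return READ_ONCE(t->nesting);
}
```

[ With this shape every reader and writer has to go through the helpers;
an access that bypasses them stays unmarked. ]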
[ Still testing your other patches for KCSAN, will send another reply there. ]

Thanks,
-- Marco