linux-kernel - Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <54D50E84.2060703@oracle.com>
Date:	Fri, 06 Feb 2015 13:57:08 -0500
From:	Sasha Levin <sasha.levin@...cle.com>
To:	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
	tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
	peterz@...radead.org, torvalds@...ux-foundation.org
CC:	konrad.wilk@...cle.com, pbonzini@...hat.com,
	paulmck@...ux.vnet.ibm.com, waiman.long@...com, davej@...hat.com,
	oleg@...hat.com, x86@...nel.org, jeremy@...p.org,
	paul.gortmaker@...driver.com, ak@...ux.intel.com,
	jasowang@...hat.com, linux-kernel@...r.kernel.org,
	kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
	xen-devel@...ts.xenproject.org, riel@...hat.com,
	borntraeger@...ibm.com, akpm@...ux-foundation.org,
	a.ryabinin@...sung.com
Subject: Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions

On 02/06/2015 09:49 AM, Raghavendra K T wrote:
> Paravirt spinlock clears slowpath flag after doing unlock.
> As explained by Linus currently it does:
>                 prev = *lock;
>                 add_smp(&lock->tickets.head, TICKET_LOCK_INC);
> 
>                 /* add_smp() is a full mb() */
> 
>                 if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
>                         __ticket_unlock_slowpath(lock, prev);
> 
> 
> which is *exactly* the kind of things you cannot do with spinlocks,
> because after you've done the "add_smp()" and released the spinlock
> for the fast-path, you can't access the spinlock any more.  Exactly
> because a fast-path lock might come in, and release the whole data
> structure.
> 
> Linus suggested that we should not do any writes to lock after unlock(),
> and we can move slowpath clearing to fastpath lock.
> 
> However it brings additional case to be handled, viz., slowpath still
> could be set when somebody does arch_trylock. Handle that too by ignoring
> slowpath flag during lock availability check.
> 
> Reported-by: Sasha Levin <sasha.levin@...cle.com>
> Suggested-by: Linus Torvalds <torvalds@...ux-foundation.org>
> Signed-off-by: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>

With this patch, my VMs lock up quickly after boot with:

[  161.613469] BUG: spinlock lockup suspected on CPU#31, kworker/31:1/5213
[  161.613469]  lock: purge_lock.28981+0x0/0x40, .magic: dead4ead, .owner: kworker/7:1/6400, .owner_cpu: 7
[  161.613469] CPU: 31 PID: 5213 Comm: kworker/31:1 Not tainted 3.19.0-rc7-next-20150204-sasha-00048-gee3a350 #1869
[  161.613469] Workqueue: events bpf_prog_free_deferred
[  161.613469]  0000000000000000 00000000f03dd27f ffff88056b227a88 ffffffffa1702276
[  161.613469]  0000000000000000 ffff88017cf70000 ffff88056b227aa8 ffffffff9e1d009c
[  161.613469]  ffffffffa3edae60 0000000086c3f830 ffff88056b227ad8 ffffffff9e1d01d7
[  161.613469] Call Trace:
[  161.613469] dump_stack (lib/dump_stack.c:52)
[  161.613469] spin_dump (kernel/locking/spinlock_debug.c:68 (discriminator 8))
[  161.613469] do_raw_spin_lock (include/linux/nmi.h:48 kernel/locking/spinlock_debug.c:119 kernel/locking/spinlock_debug.c:137)
[  161.613469] _raw_spin_lock (kernel/locking/spinlock.c:152)
[  161.613469] ? __purge_vmap_area_lazy (mm/vmalloc.c:615)
[  161.613469] __purge_vmap_area_lazy (mm/vmalloc.c:615)
[  161.613469] ? vm_unmap_aliases (mm/vmalloc.c:1021)
[  161.613469] vm_unmap_aliases (mm/vmalloc.c:1052)
[  161.613469] ? vm_unmap_aliases (mm/vmalloc.c:1021)
[  161.613469] ? __lock_acquire (kernel/locking/lockdep.c:2019 kernel/locking/lockdep.c:3184)
[  161.613469] change_page_attr_set_clr (arch/x86/mm/pageattr.c:1394)
[  161.613469] ? debug_object_deactivate (lib/debugobjects.c:463)
[  161.613469] set_memory_rw (arch/x86/mm/pageattr.c:1662)
[  161.613469] ? __lock_is_held (kernel/locking/lockdep.c:3518)
[  161.613469] bpf_jit_free (include/linux/filter.h:377 arch/x86/net/bpf_jit_comp.c:991)
[  161.613469] bpf_prog_free_deferred (kernel/bpf/core.c:646)
[  161.613469] process_one_work (kernel/workqueue.c:2014 include/linux/jump_label.h:114 include/trace/events/workqueue.h:111 kernel/workqueue.c:2019)
[  161.613469] ? process_one_work (./arch/x86/include/asm/atomic64_64.h:33 include/asm-generic/atomic-long.h:38 kernel/workqueue.c:598 kernel/workqueue.c:625 kernel/workqueue.c:2007)
[  161.613469] worker_thread (include/linux/list.h:189 kernel/workqueue.c:2147)
[  161.613469] ? process_one_work (kernel/workqueue.c:2091)
[  161.613469] kthread (kernel/kthread.c:207)
[  161.613469] ? finish_task_switch (./arch/x86/include/asm/current.h:14 kernel/sched/sched.h:1058 kernel/sched/core.c:2258)
[  161.613469] ? flush_kthread_work (kernel/kthread.c:176)
[  161.613469] ret_from_fork (arch/x86/kernel/entry_64.S:283)
[  161.613469] ? flush_kthread_work (kernel/kthread.c:176)

And a few soft lockup messages inside the scheduler after that.


Thanks,
Sasha


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/