Message-Id: <A90E2B85-77CB-4743-AEC3-90D7836C4D47@lca.pw>
Date: Wed, 22 Jan 2020 18:54:43 -0500
From: Qian Cai <cai@....pw>
To: Marco Elver <elver@...gle.com>
Cc: Will Deacon <will@...nel.org>, mingo@...hat.com,
peterz@...radead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] locking/osq_lock: fix a data race in osq_wait_next
> On Jan 22, 2020, at 5:38 PM, Marco Elver <elver@...gle.com> wrote:
>
>
>
> On Wed, 22 Jan 2020, Qian Cai wrote:
>
>>
>>
>>> On Jan 22, 2020, at 11:59 AM, Will Deacon <will@...nel.org> wrote:
>>>
>>> I don't understand this; 'next' is a local variable.
>>>
>>> Not keen on the onslaught of random "add a READ_ONCE() to shut the
>>> sanitiser up" patches we're going to get from kcsan :(
>>
>> My fault. I suspect it is node->next. I’ll do a bit more testing to confirm.
>
> If possible, decode and get the line numbers. I have observed a data
[ 667.817131] Reported by Kernel Concurrency Sanitizer on:
[ 667.823200] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W L 5.5.0-rc7-next-20200121+ #9
[ 667.832839] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
[ 667.842132] ==================================================================
[ 672.299421] ==================================================================
[ 672.307449] BUG: KCSAN: data-race in osq_lock / osq_lock
[ 672.315741] write (marked) to 0xffff8f613013be00 of 8 bytes by task 971 on cpu 59:
[ 672.324085] osq_lock+0x2fb/0x340 kernel/locking/osq_lock.c:200
[ 672.328149] __mutex_lock+0x277/0xd20 kernel/locking/mutex.c:657
[ 672.332561] mutex_lock_nested+0x31/0x40 kernel/locking/mutex.c:1118
[ 672.337236] memcg_create_kmem_cache+0x2e/0x190 mm/slab_common.c:659
[ 672.342534] memcg_kmem_cache_create_func+0x40/0x80
[ 672.348177] process_one_work+0x54c/0xbe0
[ 672.352940] worker_thread+0x80/0x650
[ 672.357351] kthread+0x1e0/0x200
[ 672.361324] ret_from_fork+0x27/0x50
[ 672.367875] read to 0xffff8f613013be00 of 8 bytes by task 708 on cpu 50:
[ 672.375345] osq_lock+0x234/0x340 kernel/locking/osq_lock.c:78
[ 672.379431] __mutex_lock+0x277/0xd20 kernel/locking/mutex.c:657
[ 672.383862] mutex_lock_nested+0x31/0x40 kernel/locking/mutex.c:1118
[ 672.388537] memcg_create_kmem_cache+0x2e/0x190 mm/slab_common.c:659
[ 672.393824] memcg_kmem_cache_create_func+0x40/0x80
[ 672.399461] process_one_work+0x54c/0xbe0
[ 672.404229] worker_thread+0x80/0x650
[ 672.408640] kthread+0x1e0/0x200
[ 672.412613] ret_from_fork+0x27/0x50
Something like this?
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 1f7734949ac8..832e87966dcf 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -75,7 +75,7 @@ osq_wait_next(struct optimistic_spin_queue *lock,
* wait for either @lock to point to us, through its Step-B, or
* wait for a new @node->next from its Step-C.
*/
- if (node->next) {
+ if (READ_ONCE(node->next)) {
next = xchg(&node->next, NULL);
if (next)
break;
> race in osq_lock before, however, this is the only one I have recently
> seen in osq_lock:
>
> read to 0xffff88812c12d3d4 of 4 bytes by task 23304 on cpu 0:
> osq_lock+0x170/0x2f0 kernel/locking/osq_lock.c:143
>
> while (!READ_ONCE(node->locked)) {
> /*
> * If we need to reschedule bail... so we can block.
> * Use vcpu_is_preempted() to avoid waiting for a preempted
> * lock holder:
> */
> --> if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
> goto unqueue;
>
> cpu_relax();
> }
>
> where
>
> static inline int node_cpu(struct optimistic_spin_node *node)
> {
> --> return node->cpu - 1;
> }
>
>
> write to 0xffff88812c12d3d4 of 4 bytes by task 23334 on cpu 1:
> osq_lock+0x89/0x2f0 kernel/locking/osq_lock.c:99
>
> bool osq_lock(struct optimistic_spin_queue *lock)
> {
> struct optimistic_spin_node *node = this_cpu_ptr(&osq_node);
> struct optimistic_spin_node *prev, *next;
> int curr = encode_cpu(smp_processor_id());
> int old;
>
> node->locked = 0;
> node->next = NULL;
> --> node->cpu = curr;
>
>
> Thanks,
> -- Marco
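If one wanted to silence the second report the same way, the analogous annotation would mark both sides of the node->cpu access; this is an untested sketch, not a submitted patch, in the same shape as the hunk above:

```diff
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ static inline int node_cpu(struct optimistic_spin_node *node)
-	return node->cpu - 1;
+	return READ_ONCE(node->cpu) - 1;
@@ bool osq_lock(struct optimistic_spin_queue *lock)
-	node->cpu = curr;
+	WRITE_ONCE(node->cpu, curr);
```

Whether such accesses need marking at all, as opposed to the race being tolerable given how the node is published, is exactly the concern Will raises earlier in the thread.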