[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <153978584719.7579.14050445370493369372.stgit@warthog.procyon.org.uk>
Date: Wed, 17 Oct 2018 15:17:27 +0100
From: David Howells <dhowells@...hat.com>
To: gregkh@...ux-foundation.org
Cc: Kiran Kumar Modukuri <kiran.modukuri@...il.com>,
David Howells <dhowells@...at.com>, viro@...iv.linux.org.uk,
sandeen@...hat.com, dhowells@...hat.com, linux-cachefs@...hat.com,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [PATCH 2/4] fscache: Fix race in fscache_op_complete() due to split
atomic_sub & read
From: kiran.modukuri <kiran.modukuri@...il.com>
The code in fscache_retrieval_complete is using atomic_sub followed by an
atomic_read:
atomic_sub(n_pages, &op->n_pages);
if (atomic_read(&op->n_pages) <= 0)
fscache_op_complete(&op->op, true);
This causes two threads doing a decrement of n_pages to race with each
other seeing the op->refcount 0 at same time - and they end up calling
fscache_op_complete() in both the threads leading to an assertion failure.
Fix this by using atomic_sub_return() instead of two calls.
The oops looks something like:
FS-Cache: Assertion failed
FS-Cache: 0 > 0 is false
...
kernel BUG at /usr/src/linux-4.4.0/fs/fscache/operation.c:449!
...
Workqueue: fscache_operation fscache_op_work_func [fscache]
...
RIP: 0010:[<ffffffffc037eacd>] fscache_op_complete+0x10d/0x180 [fscache]
...
Call Trace:
[<ffffffffc1464cf9>] cachefiles_read_copier+0x3a9/0x410 [cachefiles]
[<ffffffffc037e272>] fscache_op_work_func+0x22/0x50 [fscache]
[<ffffffff81096da0>] process_one_work+0x150/0x3f0
[<ffffffff8109751a>] worker_thread+0x11a/0x470
[<ffffffff81808e59>] ? __schedule+0x359/0x980
[<ffffffff81097400>] ? rescuer_thread+0x310/0x310
[<ffffffff8109cdd6>] kthread+0xd6/0xf0
[<ffffffff8109cd00>] ? kthread_park+0x60/0x60
[<ffffffff8180d0cf>] ret_from_fork+0x3f/0x70
[<ffffffff8109cd00>] ? kthread_park+0x60/0x60
This seen this in 4.4.x kernels and the same bug affects fscache in latest
upstreams kernels.
Fixes: 1bb4b7f98f36 ("FS-Cache: The retrieval remaining-pages counter needs to be atomic_t")
Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@...il.com>
Signed-off-by: David Howells <dhowells@...at.com>
---
include/linux/fscache-cache.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 34cf0fdd7dc7..bf98ed803af2 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -196,11 +196,11 @@ static inline void fscache_enqueue_retrieval(struct fscache_retrieval *op)
static inline void fscache_retrieval_complete(struct fscache_retrieval *op,
int n_pages)
{
- atomic_sub(n_pages, &op->n_pages);
- if (atomic_read(&op->n_pages) <= 0)
+ if (atomic_sub_return(n_pages, &op->n_pages) <= 0)
fscache_op_complete(&op->op, false);
}
+
/**
* fscache_put_retrieval - Drop a reference to a retrieval operation
* @op: The retrieval operation affected
Powered by blists - more mailing lists