Message-ID: <5322D90C.5050207@cn.fujitsu.com>
Date: Fri, 14 Mar 2014 18:25:16 +0800
From: Gu Zheng <guz.fnst@...fujitsu.com>
To: Benjamin LaHaise <bcrl@...ck.org>
CC: Tang Chen <tangchen@...fujitsu.com>, viro@...iv.linux.org.uk,
jmoyer@...hat.com, kosaki.motohiro@...il.com,
kosaki.motohiro@...fujitsu.com, isimatu.yasuaki@...fujitsu.com,
linux-fsdevel@...r.kernel.org, linux-aio@...ck.org,
linux-kernel@...r.kernel.org, miaox@...fujitsu.com
Subject: Re: [RESEND v2 PATCH 1/2] aio, memory-hotplug: Fix confliction when
migrating and accessing ring pages.
Hi Ben,
On 03/13/2014 06:17 AM, Benjamin LaHaise wrote:
> Hello Tang,
>
> On Wed, Mar 12, 2014 at 01:25:26PM +0800, Tang Chen wrote:
> ... <snip> ...
>
>>> Another spot is in aio_read_events_ring() where head and tail are
>>> fetched from the ring without any locking. I also fear we'll be
>>> introducing new performance issues with all the additional spinlock
>>> bouncing, despite the fact that is only ever needed for migration.
>>> I'm going to continue looking into this today and will try to send
>>> out a followup to this email later.
>>
>> In the beginning of aio_read_events_ring(), it reads head and tail,
>> not write. So even if ring pages are migrated, the contents of the
>> pages will not be changed. So reading it is OK, from old page or from
>> the new page, I think.
>
> Your assumption that reading it is okay is incorrect. Since we do not have
> a reference on the page at that point, it is possible that the read of the
> page takes place after the page has been freed and allocated to another part
> of the kernel. This would result in the read returning invalid information.

What about the following patch? It takes an additional reference on the
page so that it cannot be freed while we are reading it.

P.S. It applies on top of linux-next (3-13).

Signed-off-by: Gu Zheng <guz.fnst@...fujitsu.com>
---
 fs/aio.c |   16 ++++++++++++----
 1 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 4133ba9..a4f3a4f 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -283,7 +283,7 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
 {
 	struct kioctx *ctx;
 	unsigned long flags;
-	int rc;
+	int rc, extra_count;
 
 	rc = 0;
 
@@ -311,7 +311,10 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
 	BUG_ON(PageWriteback(old));
 	get_page(new);
 
-	rc = migrate_page_move_mapping(mapping, new, old, NULL, mode, 1);
+	extra_count = page_count(old) - page_has_private(old) - 2;
+
+	rc = migrate_page_move_mapping(mapping, new, old,
+					NULL, mode, extra_count);
 	if (rc != MIGRATEPAGE_SUCCESS) {
 		put_page(new);
 		return rc;
@@ -1047,13 +1050,17 @@ static long aio_read_events_ring(struct kioctx *ctx,
 	unsigned head, tail, pos;
 	long ret = 0;
 	int copy_ret;
+	struct page *page;
 
 	mutex_lock(&ctx->ring_lock);
 
-	ring = kmap_atomic(ctx->ring_pages[0]);
+	page = ctx->ring_pages[0];
+	get_page(page);
+	ring = kmap_atomic(page);
 	head = ring->head;
 	tail = ring->tail;
 	kunmap_atomic(ring);
+	put_page(page);
 
 	pr_debug("h%u t%u m%u\n", head, tail, ctx->nr_events);
 
@@ -1063,7 +1070,6 @@ static long aio_read_events_ring(struct kioctx *ctx,
 	while (ret < nr) {
 		long avail;
 		struct io_event *ev;
-		struct page *page;
 
 		avail = (head <= tail ? tail : ctx->nr_events) - head;
 		if (head == tail)
@@ -1075,6 +1081,7 @@ static long aio_read_events_ring(struct kioctx *ctx,
 
 		pos = head + AIO_EVENTS_OFFSET;
 		page = ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE];
+		get_page(page);
 		pos %= AIO_EVENTS_PER_PAGE;
 
 		/*
@@ -1087,6 +1094,7 @@ static long aio_read_events_ring(struct kioctx *ctx,
 		copy_ret = copy_to_user(event + ret, ev + pos,
 					sizeof(*ev) * avail);
 		kunmap(page);
+		put_page(page);
 
 		if (unlikely(copy_ret)) {
 			ret = -EFAULT;
--
1.7.7
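
To illustrate the idea behind the get_page()/put_page() pairs in
aio_read_events_ring(), here is a minimal user-space sketch. It is only a
toy model: struct toy_page, toy_get_page(), toy_put_page() and
toy_read_is_safe() are hypothetical names, not kernel code. It shows that
a reader holding its own reference keeps the page alive even if the other
holder drops its reference first. It does not model the window between
loading ctx->ring_pages[0] and taking the reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page (hypothetical, user-space only). */
struct toy_page {
	int refcount;		/* number of holders */
	bool freed;		/* set once the last reference is dropped */
	unsigned head, tail;	/* payload the reader wants */
};

static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put_page(struct toy_page *p)
{
	if (--p->refcount == 0)
		p->freed = true;	/* page may now be reused by anyone */
}

/*
 * Simulate a reader whose read happens after the owner drops its
 * reference. Returns true if the page was still live at the moment
 * of the read.
 */
static bool toy_read_is_safe(bool pin)
{
	struct toy_page page = { .refcount = 1, .freed = false,
				 .head = 3, .tail = 7 };
	bool live_at_read;

	if (pin)
		toy_get_page(&page);	/* reader's own reference */

	toy_put_page(&page);		/* owner lets go of the old page */

	live_at_read = !page.freed;	/* head/tail would be read here */

	if (pin)
		toy_put_page(&page);	/* reader done; last put frees */

	return live_at_read;
}
```

In this model toy_read_is_safe(true) reports a live page at the time of
the read, while toy_read_is_safe(false) reports that the page was already
freed, which is the situation Ben describes above.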
>
> -ben
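
The extra_count arithmetic in aio_migratepage() can be checked with a
small worked example. This sketch assumes that, for a page with a
mapping, migrate_page_move_mapping() compares page_count(old) against
2 + page_has_private(old) + extra_count. That formula is an assumption
about the tree this patch targets, not something stated in this mail, and
the toy_* helpers are hypothetical names.

```c
#include <assert.h>

/* Assumed expected refcount for a mapped page (see lead-in). */
static int toy_expected_count(int has_private, int extra_count)
{
	return 2 + has_private + extra_count;
}

/* extra_count exactly as computed in the patch above. */
static int toy_extra_count(int page_count, int has_private)
{
	return page_count - has_private - 2;
}

/* Returns 1 if the assumed refcount comparison would succeed. */
static int toy_check_passes(int page_count, int has_private)
{
	int extra = toy_extra_count(page_count, has_private);

	return page_count == toy_expected_count(has_private, extra);
}
```

For example, a page count of 3 with no private data gives an extra_count
of 1, the constant the original call passed. Under the stated assumption,
the computed value makes the comparison hold for any page count, so a
transient reference taken by a reader would not by itself make the
migration attempt fail.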