Message-ID: <4C323677.9040209@kernel.dk>
Date: Mon, 05 Jul 2010 21:45:59 +0200
From: Jens Axboe <axboe@...nel.dk>
To: Christoph Hellwig <hch@...radead.org>
CC: Ingo Molnar <mingo@...e.hu>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
"Rafael J. Wysocki" <rjw@...k.pl>
Subject: Re: [regression] Crash in wb_clear_pending()
On 05/07/10 21.32, Christoph Hellwig wrote:
> On Mon, Jul 05, 2010 at 09:24:39PM +0200, Jens Axboe wrote:
>> The oops itself looks like a recurrence of the missing RCU grace or
>> too early stack wakeup, which should be a 1-2 liner once it's found.
>
> See the previous thread. There are at least two issues:
>
> - wb_do_writeback checks work->state after it's been freed when we do
> the second test_bit for WS_ONSTACK
> - bdi_work_free accesses work->state after waking up the caller doing
> bdi_wait_on_work_done, which might have re-used the stack space
> allocated for the work item.
>
> The fix for that is to get rid of the fragile work->state stuff and the
> bit wakeups by just using a completion and using that as indicator
> for the stack wait. That's the main change the above patch does. In
> addition it also merges the two structures used for the writeback
> requests. Only doing the completion and earlier list removal would
> be something like the untested patch below:
If those two late WS_ONSTACK checks are the only issues left there,
why not just apply the below for 2.6.35?
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 0609607..15ce6ab 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -90,12 +90,13 @@ int writeback_in_progress(struct backing_dev_info *bdi)
 static void bdi_work_free(struct rcu_head *head)
 {
 	struct bdi_work *work = container_of(head, struct bdi_work, rcu_head);
+	int on_stack = test_bit(WS_ONSTACK, &work->state);
 	clear_bit(WS_INPROGRESS, &work->state);
 	smp_mb__after_clear_bit();
 	wake_up_bit(&work->state, WS_INPROGRESS);
-	if (!test_bit(WS_ONSTACK, &work->state))
+	if (!on_stack)
 		kfree(work);
 }
@@ -854,6 +855,7 @@ long wb_do_writeback(struct bdi_writeback *wb, int force_wait)
 	while ((work = get_next_work_item(bdi, wb)) != NULL) {
 		struct wb_writeback_args args = work->args;
+		int on_stack = test_bit(WS_ONSTACK, &work->state);
 		/*
 		 * Override sync mode, in case we must wait for completion
@@ -865,7 +867,7 @@ long wb_do_writeback(struct bdi_writeback *wb, int force_wait)
 		 * If this isn't a data integrity operation, just notify
 		 * that we have seen this work and we are now starting it.
 		 */
-		if (!test_bit(WS_ONSTACK, &work->state))
+		if (!on_stack)
 			wb_clear_pending(wb, work);
 		wrote += wb_writeback(wb, &args);
@@ -874,7 +876,7 @@ long wb_do_writeback(struct bdi_writeback *wb, int force_wait)
 		 * This is a data integrity writeback, so only do the
 		 * notification when we have completed the work.
 		 */
-		if (test_bit(WS_ONSTACK, &work->state))
+		if (on_stack)
 			wb_clear_pending(wb, work);
 	}
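
For anyone following along, the idea behind the patch above (read WS_ONSTACK into a local before the point where the waiter may be woken, then never touch the work item again) can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct work_item, work_alloc() and work_finish() are hypothetical names invented here, not kernel code, and the real wake-up (clear_bit plus wake_up_bit) is reduced to clearing a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bdi_work; not the kernel definition. */
struct work_item {
	bool on_stack;    /* plays the role of WS_ONSTACK */
	bool in_progress; /* plays the role of WS_INPROGRESS */
};

/* Heap-allocate an in-progress work item; returns NULL on failure. */
static struct work_item *work_alloc(void)
{
	struct work_item *work = malloc(sizeof(*work));

	if (work) {
		work->on_stack = false;
		work->in_progress = true;
	}
	return work;
}

/*
 * Finish a work item. Returns true if the item was freed here.
 *
 * The on_stack flag is copied into a local first. Clearing in_progress
 * is the point at which a waiter could return and reuse its stack
 * frame, so an on-stack item must not be read after that.
 */
static bool work_finish(struct work_item *work)
{
	bool on_stack;

	assert(work != NULL);
	on_stack = work->on_stack;	/* snapshot before the "wake-up" */

	work->in_progress = false;	/* kernel: clear_bit + wake_up_bit */

	/* Only the local copy is consulted from here on. */
	if (!on_stack) {
		free(work);
		return true;
	}
	return false;
}
```

Calling work_finish() on a stack-allocated item with on_stack set leaves the memory alone and returns false; calling it on an item from work_alloc() frees it and returns true. The sketch is single-threaded, so it shows the ordering of the reads, not the race itself.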
--
Jens Axboe