[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFxN2-qZwBywE3b+RYyPLg1-ru5ZwJLHLohy0KJ0fD0vqA@mail.gmail.com>
Date: Tue, 2 Jul 2013 20:28:42 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Dave Chinner <david@...morbit.com>
Cc: Jan Kara <jack@...e.cz>, Dave Jones <davej@...hat.com>,
Oleg Nesterov <oleg@...hat.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Andrey Vagin <avagin@...nvz.org>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: frequent softlockups with 3.10rc6.
On Tue, Jul 2, 2013 at 8:07 PM, Dave Chinner <david@...morbit.com> wrote:
>>
>> Then that test would become
>>
>> if (wbc->sync_mode == WB_SYNC_SINGLE) {
>>
>> instead, and now "sync_mode" would actually describe what mode of
>> syncing the caller wants, without that hacky special "we know what the
>> caller _really_ meant by looking at *which* caller it is".
>
> The problem is that all the code that currently looks for
> WB_SYNC_ALL for it's behavioural cue during writeback now has
> multiple different modes they have to handle. IOWs, it's not a
> straight forward conversion process. WB_SYNC_ALL reaches right down
> into filesystem ->writepages implementations and they all need to be
> changed if we make up a new sync_mode behaviour.
I have to admit that I absolutely detest our current "sync_mode" to
begin with, so I'd personally be happy to see some major surgery in
this area.
For example, maybe we'd be much better off with something that has
various behavioral flags rather than distinct "mode values". So
instead of being an enum of different reasons for syncing, it would be
a set of bitmasks for specific sync behavior. We have a much better
sync model in our sync_file_range() model, where we have flags like
SYNC_FILE_RANGE_WAIT_xxx (where xxx is BEFORE, WRITE, AFTER to
describe whether you should wait for old writes, start new writes, or
wait after the newly started writes).
That's a very powerful model, and it's also much more easy to think
about. So the above test could become
if (wbc->sync_mode & WB_SYNC_AFTER) {
int err = filemap_fdatawait(mapping);
....
in that kind of model, and the code actually looks sensible. It reads
like "if the caller asked us to synchronize after writing, then we do
an fdatawait on the mapping".
So I think something like that might make sense. And there aren't
_that_ many users of WB_SYNC_xxx, and the patch should be pretty
straightforward. WB_SYNC_NONE semantics would presumably be "just
start writeout" (so it would become WB_SYNC_WRITE), while WB_SYNC_ALL
would become (WB_SYNC_BEFORE | WB_SYNC_WRITE | WB_SYNC_AFTER), but
then the "for_sync" case would remove WB_SYNC_AFTER, because it does
its own waiting after.
Sounds fairly sensible and straightforward to me. Much more
self-explanatory than the current "WB_SYNC_NONE/ALL" distinction,
methinks (well, you'd still have to explain what the point of
BEFORE/AFTER is, and how it interacts with starting writeout, but
especially since we already have that concept for file_sync_range(), I
think that's not too bad).
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists