Message-Id: <20180515143407.89e3b0e7d73c89e6071196e0@linux-foundation.org>
Date: Tue, 15 May 2018 14:34:07 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc: syzbot <syzbot+4f2e5f086147d543ab03@...kaller.appspotmail.com>,
syzkaller-bugs@...glegroups.com, Al Viro <viro@...iv.linux.org.uk>,
dhowells@...hat.com, ernesto.mnd.fernandez@...il.com,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
slava@...eyko.com
Subject: Re: [PATCH] hfsplus: stop workqueue when fill_super() failed
On Tue, 15 May 2018 19:11:06 +0900 Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp> wrote:
> From ffd64dcf946502e7bb1d23c021ee9a4fc92f9312 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
> Date: Tue, 15 May 2018 12:23:03 +0900
> Subject: [PATCH] hfsplus: stop workqueue when fill_super() failed
>
> syzbot is reporting ODEBUG messages at hfsplus_fill_super() [1].
> This is because hfsplus_fill_super() forgot to call
> cancel_delayed_work_sync().
>
> As far as I can see, it is hfsplus_mark_mdb_dirty(), called from
> hfsplus_new_inode() in hfsplus_fill_super(), that calls
> queue_delayed_work(). Therefore, I assume that hfsplus_new_inode() does
> not fail once queue_delayed_work() has been called, and that the
> out_put_hidden_dir label is the appropriate place to call
> cancel_delayed_work_sync().
Yes, I was scratching my head over that - it is quite unobvious
where in hfsplus_fill_super() the work actually starts getting
scheduled.
"somewhere after the last goto out_put_root" might be true, for now.
But it isn't at all obvious and it isn't very maintainable.
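For concreteness, my reading is that the fix boils down to something
like the below (untested sketch - I'm assuming the work item in
question is sbi->sync_work, the one which hfsplus_mark_mdb_dirty()
puts on system_long_wq):

 out_put_hidden_dir:
+	/* hfsplus_mark_mdb_dirty() may already have queued sbi->sync_work */
+	cancel_delayed_work_sync(&sbi->sync_work);
 	iput(sbi->hidden_dir);
 out_put_root: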
Perhaps it's simply wrong for hfsplus to be marking things dirty and
performing these complex operations partway through fill_super() before
everything is fully set up.
And I wouldn't be comfortable putting the cancel_work_sync() right at
the end of the hfsplus_fill_super() cleanup because the delayed work
handler might be using things which have already been torn down by that
stage.
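That matters because, if I'm remembering the code right, the handler is
delayed_sync_fs(), which looks roughly like this and dereferences
sbi->alloc_file - an inode which out_put_alloc_file will already have
iput by the time we reach the bottom of that cleanup path:

static void delayed_sync_fs(struct work_struct *work)
{
	int err;
	struct hfsplus_sb_info *sbi =
		container_of(work, struct hfsplus_sb_info, sync_work.work);

	spin_lock(&sbi->work_lock);
	sbi->work_queued = 0;
	spin_unlock(&sbi->work_lock);

	/*
	 * Uses sbi->alloc_file, and hfsplus_sync_fs() goes on to touch
	 * the volume headers which the later cleanup labels free.
	 */
	err = hfsplus_sync_fs(sbi->alloc_file->i_sb, 1);
	if (err)
		pr_err("delayed sync fs err %d\n", err);
}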
So... ugh. A solid shotgun approach would be to put a
cancel_work_sync() immediately after each and every goto target in that
cleanup path, but that's just stupid :(
Nasty. I can't think of anything clever here. I guess we can go with
this patch for now, and if new problems crop up we can look at moving
the cancel_work_sync() down to a later part of the teardown sequence.