Date:   Thu, 14 Sep 2017 19:37:23 +0100
From:   Al Viro <viro@...IV.linux.org.uk>
To:     Jaegeuk Kim <jaegeuk@...nel.org>
Cc:     linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-f2fs-devel@...ts.sourceforge.net
Subject: Re: [PATCH] vfs: introduce UMOUNT_WAIT which waits for umount
 completion

On Thu, Sep 14, 2017 at 02:30:17AM +0100, Al Viro wrote:
> On Wed, Sep 13, 2017 at 06:10:48PM -0700, Jaegeuk Kim wrote:
> 
> > Android triggers umount(2) from the init process, which is definitely not a
> > kernel thread. But we've seen some kernel panics where umount(2) had
> > succeeded, yet ext4 then triggered a kernel panic due to EIO, as shown below.
> > I'm also not sure task_work_run() would be safe enough. May I ask where
> > sys_umount() calls task_work_run()?
> 
> ret_{fast,slow}_syscall ->
> 	slow_work_pending ->
> 		do_work_pending() ->
> 			tracehook_notify_resume() ->
> 				task_work_run()
> 
> It's not sys_umount() (or any other sys_...()) - it's syscall dispatcher after
> having called one of those and before returning to userland.  What is guaranteed
> is that after successful task_work_add() the damn thing will be run in context
> of originating process before it returns from syscall.  So any subsequent
> syscalls from that process are guaranteed to happen after the work has run.
> The same happens if the process exits rather than returns to userland (do_exit() ->
> exit_task_work() -> task_work_run()), but for that you would need it to die in
> umount(2) (e.g. get kill -9 delivered on the way out).
> 
> Please, check if you are seeing task_work_add() failure in there and if you do,
> I would like to see a stack trace.  IOW, slap WARN_ON(1); right after
>                         if (!task_work_add(task, &mnt->mnt_rcu, true))
>                                 return;
> and see what (if anything) gets printed.
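
To illustrate the ordering guarantee described above, here is a minimal
userspace sketch. It is a model only: model_task, model_work, model_syscall()
and the other names are hypothetical stand-ins invented for illustration, not
kernel interfaces, and the real dispatcher path is architecture-specific. It
only shows that work queued during one syscall runs before the return to
userland, and therefore before any later syscall from the same process.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; every name here is hypothetical, not a kernel API. */
struct model_work {
	struct model_work *next;
	void (*func)(struct model_work *);
};

struct model_task {
	struct model_work *pending;	/* work queued during the current syscall */
};

static char event_log[32];		/* one character per event, in order */

static void log_event(char c)
{
	size_t n = strlen(event_log);

	if (n + 1 < sizeof(event_log)) {
		event_log[n] = c;
		event_log[n + 1] = '\0';
	}
}

/* Stand-in for task_work_add(): queue a callback on the calling task. */
static void model_work_add(struct model_task *t, struct model_work *w)
{
	w->next = t->pending;
	t->pending = w;
}

/* Stand-in for task_work_run(): drain everything queued so far. */
static void model_work_run(struct model_task *t)
{
	while (t->pending) {
		struct model_work *w = t->pending;

		t->pending = w->next;
		w->func(w);
	}
}

static void cleanup_cb(struct model_work *w)
{
	(void)w;
	log_event('C');			/* deferred cleanup of the mount */
}

static struct model_work cleanup_work = { NULL, cleanup_cb };

/* Stand-in for sys_umount(): does its job, defers the final cleanup. */
static void model_umount(struct model_task *t)
{
	log_event('U');
	model_work_add(t, &cleanup_work);
}

/* Any later syscall from the same process. */
static void model_other_syscall(struct model_task *t)
{
	(void)t;
	log_event('N');
}

/* Stand-in for the dispatcher: handler, then pending work, then userland. */
static void model_syscall(struct model_task *t,
			  void (*handler)(struct model_task *))
{
	handler(t);
	model_work_run(t);
	log_event('R');			/* return to userland */
}
```

Calling model_syscall() with model_umount and then with model_other_syscall
leaves the cleanup event ('C') in the log between the umount body ('U') and
the first return to userland ('R'), ahead of anything the second call logs.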

AFAICS, for task_work_add() to fail here we need a final mntput() to be run
in context of a thread that already had exit_signals() run *and* subsequent
task_work_run() run to completion (with all pending callbacks executed, along
with all callbacks added by those, etc.)

For that to have happened during umount(2) we would've needed
	* killing signal delivered while going through the syscall
	* final mntput() to have been done *NOT* from sys_umount() (otherwise
the work would've been added before we got to exit_signals())
	* final mntput() to have been done *NOT* from any task_work callbacks
(otherwise it would've been added before we'd observed a combination of empty
list of pending work with PF_EXITING)
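
The conditions above can be sketched as a small userspace model. This is an
illustration under stated assumptions, not the kernel implementation: the
struct and function names are hypothetical, the "exiting" flag stands in for
PF_EXITING, and the "closed" flag stands in for whatever the kernel uses to
mark that a run has observed an empty list on an exiting task. The point it
shows is that adding work keeps succeeding, both before exit_signals() and from
inside a running callback, until a run has completed with nothing left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; every name here is hypothetical, not a kernel API. */
struct model_work {
	struct model_work *next;
	void (*func)(struct model_work *);
};

struct model_task {
	struct model_work *pending;
	bool exiting;	/* stands in for PF_EXITING, set by exit_signals() */
	bool closed;	/* a run has seen an empty list while exiting */
	int ran;	/* callbacks executed so far */
};

static struct model_task *model_current;

/* Returns 0 on success, nonzero once the task no longer accepts work. */
static int model_work_add(struct model_task *t, struct model_work *w)
{
	if (t->closed)
		return -1;
	w->next = t->pending;
	t->pending = w;
	return 0;
}

/* Runs callbacks until none remain, including ones added along the way. */
static void model_work_run(struct model_task *t)
{
	for (;;) {
		struct model_work *w = t->pending;

		if (!w) {
			if (t->exiting)
				t->closed = true;
			return;
		}
		t->pending = w->next;
		t->ran++;
		w->func(w);
	}
}

static void leaf_cb(struct model_work *w)
{
	(void)w;
}

static struct model_work chained = { NULL, leaf_cb };
static int chained_add_result = 1;

/* A callback doing its own "final mntput()": it queues further work. */
static void chaining_cb(struct model_work *w)
{
	(void)w;
	chained_add_result = model_work_add(model_current, &chained);
}
```

In this model, work added before "exiting" is set is accepted, work added by
chaining_cb during the run is accepted and executed, and only an add attempted
after model_work_run() has returned on an exiting task is refused.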

I really want to see the stack trace of that failing task_work_add(), if that's
what actually happens there.  What kind of a reproducer do you have for that?
