linux-kernel - Re: pipe

Date:	Sun, 10 Mar 2013 23:33:18 +0000
From:	Al Viro <viro@...IV.linux.org.uk>
To:	Jörn Engel <joern@...fs.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Dave Jones <davej@...hat.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: pipe_release oops.

On Fri, Mar 08, 2013 at 01:26:49PM -0500, Jörn Engel wrote:
> On Fri, 8 March 2013 10:30:01 -0800, Linus Torvalds wrote:
> > 
> > Hmm. So I've been trying to figure this out, and I really don't see
> > it. Every single pipe open routine *should* make sure that the inode
> > has an inode->i_pipe field. So if the open() has succeeded and you
> > have a valid file descriptor, the inode->i_pipe thing should be there.
> 
> Ok, here is a wild idea that is very likely wrong.  But some
> background first.  I've had problems with process exit times and one
> of the culprits turned out to be exit_files() where one device driver
> went awol for several seconds.  Fixing the device driver is hard, I
> didn't see a good reason not to call exit_files() earlier and
> exit_mm() was the other big offender, so the idea was to run both in
> parallel and I applied the patch below.
> 
> As a result I've gotten a bunch of NULL pointer dereferences that only
> happen in virtual machines, never on real hardware.  For example
>   [<ffffffff81164bf8>] alloc_fd+0x38/0x130
>   [<ffffffff8114857e>] do_sys_open+0xee/0x1f0
>   [<ffffffff811486a1>] sys_open+0x21/0x30
>   [<ffffffff815bea29>] system_call_fastpath+0x16/0x1b
> 
> Now I can easily see how current->files being NULL will result in such
> backtraces.  I can also see how my patch moves the NULLing of
> current->files a bit back in time.  But I could never figure out how
> my patch could have introduced a race that didn't exist before.
> 
> So the wild idea is that we have always had a very unlikely race with
> current->files being NULL and trinity happens to hit it somehow.
> 
> Jörn

> +	files_cookie = async_schedule(exit_files_async, tsk);
>  	exit_mm(tsk);
>  
>  	if (group_dead)
> @@ -990,7 +998,7 @@ void do_exit(long code)
>  
>  	exit_sem(tsk);
>  	exit_shm(tsk);
> -	exit_files(tsk);
> +	async_synchronize_cookie(files_cookie);

That doesn't do what you seem to think it's doing.  It does *not* wait
for the completion of that sucker's execution - only the ones scheduled
before it.  IOW, your exit_files_async() might very well be executed
*after* do_exit() completes and tsk gets reused.
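
To make the cookie semantics concrete, here is a toy user-space model in C.
It is not kernel code. It assumes the wait condition is "lowest still-pending
cookie >= cookie", which matches the behaviour described above; the names
pending_set, lowest_pending and sync_cookie_would_return are made up for this
sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long long cookie_t;

/* Hypothetical stand-in for the set of async calls that have not finished. */
struct pending_set {
	const cookie_t *cookies;	/* cookies of calls still queued or running */
	size_t count;
};

/* Smallest cookie still pending; ULLONG_MAX when nothing is pending. */
static cookie_t lowest_pending(const struct pending_set *p)
{
	cookie_t lowest = ULLONG_MAX;
	size_t i;

	assert(p != NULL);
	for (i = 0; i < p->count; i++)
		if (p->cookies[i] < lowest)
			lowest = p->cookies[i];
	return lowest;
}

/*
 * Assumed wait condition: synchronizing on a cookie returns once every call
 * scheduled *before* that cookie has finished. The call that owns the cookie
 * itself is not waited for.
 */
static bool sync_cookie_would_return(const struct pending_set *p,
				     cookie_t cookie)
{
	return lowest_pending(p) >= cookie;
}
```

In this model, if the only pending call has cookie 7, then
sync_cookie_would_return(&p, 7) is already true: the caller proceeds while the
call owning cookie 7 may still be queued or running. Only a call with a lower
cookie, say 5, would make the caller wait.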