lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 6 Feb 2015 20:55:29 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
Cc:	linux-api@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-kernel@...r.kernel.org,
	Roman Gushchin <klamm@...dex-team.ru>,
	Nikita Vetoshkin <nekto0n@...dex-team.ru>,
	Pavel Emelyanov <xemul@...allels.com>
Subject: Re: [PATCH 1/2] kernel/fork: handle put_user errors for
	CLONE_CHILD_SETTID/CLEARTID

On 02/06, Oleg Nesterov wrote:
>
> On 02/06, Konstantin Khlebnikov wrote:
> >
> > Whole sequence looks like: task calls fork, glibc calls syscall clone with
> > CLONE_CHILD_SETTID and passes pointer to TLS THREAD_SELF->tid as argument.
> > Child task gets read-only copy of VM including TLS. Child calls put_user()
> > to handle CLONE_CHILD_SETTID from schedule_tail(). put_user() trigger page
> > fault and it fails because do_wp_page()  hits memcg limit without invoking
> > OOM-killer because this is page-fault from kernel-space.
> 
> Because of !FAULT_FLAG_USER?
> 
> Perhaps we should fix this? Say mem_cgroup_oom_enable/disable around put_user(),
> I dunno.
> 
> > Put_user returns
> > -EFAULT, which is ignored.  Child returns into user-space and catches here
> > assert (THREAD_GETMEM (self, tid) != ppid),
> 
> If only I understood why else we need CLONE_CHILD_SETTID ;)
> 
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2312,8 +2312,20 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
> >  	post_schedule(rq);
> >  	preempt_enable();
> >
> > -	if (current->set_child_tid)
> > -		put_user(task_pid_vnr(current), current->set_child_tid);
> > +	if (current->set_child_tid &&
> > +	    unlikely(put_user(task_pid_vnr(current), current->set_child_tid))) {
> > +		int dummy;
> > +
> > +		/*
> > +		 * If this address is unreadable then userspace has not set
> > +		 * proper pointer. Application either doesn't care or will
> > +		 * notice this soon. If this address is readable then task
> > +		 * will be mislead about its own tid. It's better to die.
> > +		 */
> > +		if (!get_user(dummy, current->set_child_tid) &&
> > +		    !fatal_signal_pending(current))
> > +			force_sig(SIGSEGV, current);
> > +	}
>
> Well, get_user() can fail the same way? The page we need to cow can be
> swapped out.
>
> At first glance, to me this problem should be solved somewhere else...
> I'll try to reread this all tomorrow.

And in fact I think that this is not set_child_tid/etc-specific. Perhaps
I am totally confused, but I think that put_user() simply should not fail
this way. Say, why a syscall should return -EFAULT if memory allocation
"silently" fails? Confused.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ