Message-ID: <20170924133038.GA8673@tardis>
Date:   Sun, 24 Sep 2017 21:30:38 +0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Andrew Hunter <ahh@...gle.com>,
        maged michael <maged.michael@...il.com>,
        gromer <gromer@...gle.com>, Avi Kivity <avi@...lladb.com>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Dave Watson <davejwatson@...com>,
        Alan Stern <stern@...land.harvard.edu>,
        Will Deacon <will.deacon@....com>,
        Andy Lutomirski <luto@...nel.org>,
        linux-arch <linux-arch@...r.kernel.org>
Subject: Re: [RFC PATCH v3 1/2] membarrier: Provide register expedited
 private command

On Fri, Sep 22, 2017 at 03:10:10PM +0000, Mathieu Desnoyers wrote:
> ----- On Sep 22, 2017, at 4:59 AM, Boqun Feng boqun.feng@...il.com wrote:
> 
> > On Tue, Sep 19, 2017 at 06:13:41PM -0400, Mathieu Desnoyers wrote:
> > [...]
> >> +static inline void membarrier_arch_sched_in(struct task_struct *prev,
> >> +		struct task_struct *next)
> >> +{
> >> +	/*
> >> +	 * Only need the full barrier when switching between processes.
> >> +	 */
> >> +	if (likely(!test_ti_thread_flag(task_thread_info(next),
> >> +			TIF_MEMBARRIER_PRIVATE_EXPEDITED)
> >> +				|| prev->mm == next->mm))
> > 
> > And we also don't need the smp_mb() if !prev->mm, because switching from
> > kernel to user will have a smp_mb() implied by mmdrop()?
> 
> Right. And we also don't need it when switching from userspace to kernel

Yep, but this case is already covered, since I think we don't allow kernel
threads to have TIF_MEMBARRIER_PRIVATE_EXPEDITED set, right?

> thread either. Something like this:
> 
> static inline void membarrier_arch_sched_in(struct task_struct *prev,
>                 struct task_struct *next)
> {
>         /*
>          * Only need the full barrier when switching between processes.
>          * Barrier when switching from kernel to userspace is not
>          * required here, given that it is implied by mmdrop(). Barrier
>          * when switching from userspace to kernel is not needed after
>          * store to rq->curr.
>          */
>         if (likely(!test_ti_thread_flag(task_thread_info(next),
>                         TIF_MEMBARRIER_PRIVATE_EXPEDITED)
>                         || !prev->mm || !next->mm || prev->mm == next->mm))

, so no need to test next->mm here.

>                 return;
> 
>         /*
>          * The membarrier system call requires a full memory barrier
>          * after storing to rq->curr, before going back to user-space.
>          */
>         smp_mb();
> }
> 
> > 
> >> +		return;
> >> +
> >> +	/*
> >> +	 * The membarrier system call requires a full memory barrier
> >> +	 * after storing to rq->curr, before going back to user-space.
> >> +	 */
> >> +	smp_mb();
> >> +}
> > 
> > [...]
> > 
> >> +static inline void membarrier_fork(struct task_struct *t,
> >> +		unsigned long clone_flags)
> >> +{
> >> +	if (!current->mm || !t->mm)
> >> +		return;
> >> +	t->mm->membarrier_private_expedited =
> >> +		current->mm->membarrier_private_expedited;
> > 
> > Have we already done the copy of ->membarrier_private_expedited in
> > copy_mm()?
> 
> copy_mm() is performed without holding current->sighand->siglock, so
> it appears to be racing with concurrent membarrier register cmd.

Speaking of races, I think we currently have a problem if one thread does
a membarrier_register_private_expedited() while another thread (sharing
the same mm) does a membarrier_private_expedited(), as follows:

	{t1,t2,t3 sharing the same ->mm}
	CPU 0				CPU 1				CPU2
	====================		===================		============
	{in thread t1}							
	membarrier_register_private_expedited():
	  ...
	  WRITE_ONCE(->mm->membarrier_private_expedited, 1);
	  membarrier_arch_register_private_expedited():
	    ...
	    <haven't set the TIF for t3 yet>

	    				{in thread t2}
					membarrier_private_expedited():
					  READ_ONCE(->mm->membarrier_private_expedited); // == 1
					  ...
					  for_each_online_cpu()
					    ...
					    <p is cpu_rq(CPU2)->curr>
					    if (p && p->mm == current->mm) // false
					    <so no ipi sent to CPU2>
					    				
									{about to switch to t3}
									rq->curr = t3;
									....
									context_switch():
									  ...
									  finish_task_switch():
									    membarrier_sched_in():
									    <TIF is not set>
									    // no smp_mb() here.

, and we will miss the smp_mb() on CPU2, right? This could even happen
if t2 itself has already done a membarrier_register_private_expedited()
before calling membarrier_private_expedited().
					
Am I missing something subtle here?

Regards,
Boqun


> However, given that it is a single flag updated with WRITE_ONCE()
> and read with READ_ONCE(), it might be OK to rely on copy_mm there.
> If userspace runs registration concurrently with fork, they should
> not expect the child to be specifically registered or unregistered.
> 
> So yes, I think you are right about removing this copy and relying on
> copy_mm() instead. I also think we can improve membarrier_arch_fork()
> on powerpc to test the current thread flag rather than using current->mm.
> 
> Which leads to those two changes:
> 
> static inline void membarrier_fork(struct task_struct *t,
>                 unsigned long clone_flags)
> {
>         /*
>          * Prior copy_mm() copies the membarrier_private_expedited field
>          * from current->mm to t->mm.
>          */
>         membarrier_arch_fork(t, clone_flags);
> }
> 
> And on PowerPC:
> 
> static inline void membarrier_arch_fork(struct task_struct *t,
>                 unsigned long clone_flags)
> {
>         /*
>          * Coherence of TIF_MEMBARRIER_PRIVATE_EXPEDITED against thread
>          * fork is protected by siglock. membarrier_arch_fork is called
>          * with siglock held.
>          */
>         if (test_thread_flag(TIF_MEMBARRIER_PRIVATE_EXPEDITED))
>                 set_ti_thread_flag(task_thread_info(t),
>                                 TIF_MEMBARRIER_PRIVATE_EXPEDITED);
> }
> 
> Thanks,
> 
> Mathieu
> 
> 
> > 
> > Regards,
> > Boqun
> > 
> >> +	membarrier_arch_fork(t, clone_flags);
> >> +}
> >> +static inline void membarrier_execve(struct task_struct *t)
> >> +{
> >> +	t->mm->membarrier_private_expedited = 0;
> >> +	membarrier_arch_execve(t);
> >> +}
> >> +#else
> >> +static inline void membarrier_sched_in(struct task_struct *prev,
> >> +		struct task_struct *next)
> >> +{
> >> +}
> >> +static inline void membarrier_fork(struct task_struct *t,
> >> +		unsigned long clone_flags)
> >> +{
> >> +}
> >> +static inline void membarrier_execve(struct task_struct *t)
> >> +{
> >> +}
> >> +#endif
> >> +
> > [...]
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
