lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201201101637.GU2414@hirez.programming.kicks-ass.net>
Date:   Tue, 1 Dec 2020 11:16:37 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     x86@...nel.org, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Nicholas Piggin <npiggin@...il.com>,
        Arnd Bergmann <arnd@...db.de>,
        Anton Blanchard <anton@...abs.org>
Subject: Re: [PATCH 3/3] membarrier: Propagate SYNC_CORE and RSEQ actions
 more carefully

On Mon, Nov 30, 2020 at 09:50:35AM -0800, Andy Lutomirski wrote:
> membarrier() carefully propagates SYNC_CORE and RSEQ actions to all
> other CPUs, but there are two issues.
> 
>  - membarrier() does not sync_core() or rseq_preempt() the calling
>    CPU.  Aside from the logic being mind-bending, this also means
>    that it may not be safe to modify user code through an alias,
>    call membarrier(), and then jump to a different executable alias
>    of the same code.

I always understood this to be on purpose. The calling CPU can fix up
itself just fine. The pain point is fixing up the other CPUs, and that's
where membarrier() helps.

That said, I don't mind including self, these aren't fast calls by any
means.

>  - membarrier() does not explicitly sync_core() remote CPUs either;
>    instead, it relies on the assumption that an IPI will result in a
>    core sync.  On x86, I think this may be true in practice, but
>    it's not architecturally reliable.  In particular, the SDM and
>    APM do not appear to guarantee that interrupt delivery is
>    serializing.

Right, I don't think we rely on that, we do rely on interrupt delivery
providing order though -- as per the previous email.

>    On a preemptible kernel, IPI return can schedule,
>    thereby switching to another task in the same mm that was
>    sleeping in a syscall.  The new task could then SYSRET back to
>    usermode without ever executing IRET.

This; I think we all overlooked this scenario.

> This patch simplifies the code to treat the calling CPU just like
> all other CPUs, and explicitly sync_core() on all target CPUs.  This
> eliminates the need for the smp_mb() at the end of the function
> except in the special case of a targeted remote membarrier().  This
> patch updates that code and the comments accordingly.
> 
> Signed-off-by: Andy Lutomirski <luto@...nel.org>

> @@ -228,25 +258,33 @@ static int membarrier_private_expedited(int flags, int cpu_id)
>  		rcu_read_unlock();
>  	}
>  
> -	preempt_disable();
> -	if (cpu_id >= 0)
> -		smp_call_function_single(cpu_id, ipi_func, NULL, 1);
> -	else
> -		smp_call_function_many(tmpmask, ipi_func, NULL, 1);
> -	preempt_enable();
> +	if (cpu_id >= 0) {
> +		int cpu = get_cpu();
> +
> +		if (cpu_id == cpu) {
> +			ipi_func(NULL);
> +		} else {
> +			smp_call_function_single(cpu_id, ipi_func, NULL, 1);
> +			/*
> +			 * This is analogous to the smp_mb() at the beginning
> +			 * of the function -- exit from a system call is not a
> +			 * barrier.  We only need this if we're targeting a
> +			 * specific remote CPU, though -- otherwise ipi_func()
> +			 * would serves the same purpose.
> +			 */
> +			smp_mb();

smp_call_function_single(.wait=1) already orders against completion of
the IPI. Do we really need more?

> +		}
> +
> +		put_cpu();
> +	} else {
> +		on_each_cpu_mask(tmpmask, ipi_func, NULL, true);
> +	}
>  
>  out:
>  	if (cpu_id < 0)
>  		free_cpumask_var(tmpmask);
>  	cpus_read_unlock();
>  
> -	/*
> -	 * Memory barrier on the caller thread _after_ we finished
> -	 * waiting for the last IPI. Matches memory barriers around
> -	 * rq->curr modification in scheduler.
> -	 */
> -	smp_mb();	/* exit from system call is not a mb */
> -
>  	return 0;
>  }
>  
> -- 
> 2.28.0
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ