linux-kernel - Re: [PATCH v10 20/40] arm64/gcs: Ensure that new threads have a GCS

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZsM0wkRRguMchecK@arm.com>
Date: Mon, 19 Aug 2024 13:04:18 +0100
From: Catalin Marinas <catalin.marinas@....com>
To: Mark Brown <broonie@...nel.org>
Cc: Will Deacon <will@...nel.org>, Jonathan Corbet <corbet@....net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Marc Zyngier <maz@...nel.org>,
	Oliver Upton <oliver.upton@...ux.dev>,
	James Morse <james.morse@....com>,
	Suzuki K Poulose <suzuki.poulose@....com>,
	Arnd Bergmann <arnd@...db.de>, Oleg Nesterov <oleg@...hat.com>,
	Eric Biederman <ebiederm@...ssion.com>,
	Shuah Khan <shuah@...nel.org>,
	"Rick P. Edgecombe" <rick.p.edgecombe@...el.com>,
	Deepak Gupta <debug@...osinc.com>, Ard Biesheuvel <ardb@...nel.org>,
	Szabolcs Nagy <Szabolcs.Nagy@....com>, Kees Cook <kees@...nel.org>,
	"H.J. Lu" <hjl.tools@...il.com>,
	Paul Walmsley <paul.walmsley@...ive.com>,
	Palmer Dabbelt <palmer@...belt.com>,
	Albert Ou <aou@...s.berkeley.edu>,
	Florian Weimer <fweimer@...hat.com>,
	Christian Brauner <brauner@...nel.org>,
	Thiago Jung Bauermann <thiago.bauermann@...aro.org>,
	Ross Burton <ross.burton@....com>,
	linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
	kvmarm@...ts.linux.dev, linux-fsdevel@...r.kernel.org,
	linux-arch@...r.kernel.org, linux-mm@...ck.org,
	linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-riscv@...ts.infradead.org
Subject: Re: [PATCH v10 20/40] arm64/gcs: Ensure that new threads have a GCS

On Thu, Aug 01, 2024 at 01:06:47PM +0100, Mark Brown wrote:
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 5f00cb0da9c3..d6d3a96cf2e4 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -285,9 +285,32 @@ static void flush_gcs(void)
>  	write_sysreg_s(0, SYS_GCSPR_EL0);
>  }
>  
> +static int copy_thread_gcs(struct task_struct *p,
> +			   const struct kernel_clone_args *args)
> +{
> +	unsigned long gcs;
> +
> +	gcs = gcs_alloc_thread_stack(p, args);
> +	if (IS_ERR_VALUE(gcs))
> +		return PTR_ERR((void *)gcs);

Is 0 an ok value here? I can see further down that
gcs_alloc_thread_stack() may return 0.

> +
> +	p->thread.gcs_el0_mode = current->thread.gcs_el0_mode;
> +	p->thread.gcs_el0_locked = current->thread.gcs_el0_locked;
> +
> +	/* Ensure the current state of the GCS is seen by CoW */
> +	gcsb_dsync();

I don't get this barrier. What does it have to do with CoW, which memory
effects is it trying to order?

> diff --git a/arch/arm64/mm/gcs.c b/arch/arm64/mm/gcs.c
> index b0a67efc522b..b71f6b408513 100644
> --- a/arch/arm64/mm/gcs.c
> +++ b/arch/arm64/mm/gcs.c
> @@ -8,6 +8,138 @@
>  #include <asm/cpufeature.h>
>  #include <asm/page.h>
>  
> +static unsigned long alloc_gcs(unsigned long addr, unsigned long size)
> +{
> +	int flags = MAP_ANONYMOUS | MAP_PRIVATE;
> +	struct mm_struct *mm = current->mm;
> +	unsigned long mapped_addr, unused;
> +
> +	if (addr)
> +		flags |= MAP_FIXED_NOREPLACE;
> +
> +	mmap_write_lock(mm);
> +	mapped_addr = do_mmap(NULL, addr, size, PROT_READ, flags,
> +			      VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL);
> +	mmap_write_unlock(mm);
> +
> +	return mapped_addr;
> +}
> +
> +static unsigned long gcs_size(unsigned long size)
> +{
> +	if (size)
> +		return PAGE_ALIGN(size);
> +
> +	/* Allocate RLIMIT_STACK/2 with limits of PAGE_SIZE..2G */
> +	size = PAGE_ALIGN(min_t(unsigned long long,
> +				rlimit(RLIMIT_STACK) / 2, SZ_2G));
> +	return max(PAGE_SIZE, size);
> +}

So we still have RLIMIT_STACK/2. I thought we got rid of that and just
went with RLIMIT_STACK (or I misremember).

> +
> +static bool gcs_consume_token(struct mm_struct *mm, unsigned long user_addr)
> +{
> +	u64 expected = GCS_CAP(user_addr);
> +	u64 val;
> +	int ret;
> +
> +	/* This should really be an atomic cmpxchg.  It is not. */
> +	ret = access_remote_vm(mm, user_addr, &val, sizeof(val),
> +			       FOLL_FORCE);
> +	if (ret != sizeof(val))
> +		return false;
> +
> +	if (val != expected)
> +		return false;
> +
> +	val = 0;
> +	ret = access_remote_vm(mm, user_addr, &val, sizeof(val),
> +			       FOLL_FORCE | FOLL_WRITE);
> +	if (ret != sizeof(val))
> +		return false;
> +
> +	return true;
> +}

As per the clone3() thread, I think we should try to use
get_user_page_vma_remote() and do a cmpxchg() directly.

How does the user write the initial token? Do we need any barriers
before/after consuming the token?

-- 
Catalin