linux-hardening - Re: [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <CDGVLP8OS8N9.13R0RIGJ1WJ8R@oc8246131445.ibm.com>
Date:   Wed, 11 Aug 2021 12:57:00 -0500
From:   "Christopher M. Riedl" <cmr@...ux.ibm.com>
To:     "Christophe Leroy" <christophe.leroy@...roup.eu>,
        <linuxppc-dev@...ts.ozlabs.org>
Cc:     <keescook@...omium.org>, <peterz@...radead.org>, <x86@...nel.org>,
        <npiggin@...il.com>, <linux-hardening@...r.kernel.org>,
        <tglx@...utronix.de>, <dja@...ens.net>
Subject: Re: [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch
 mapping

On Thu Aug 5, 2021 at 4:13 AM CDT, Christophe Leroy wrote:
>
>
> Le 13/07/2021 à 07:31, Christopher M. Riedl a écrit :
> > When live patching with STRICT_KERNEL_RWX the CPU doing the patching
> > must temporarily remap the page(s) containing the patch site with +W
> > permissions. While this temporary mapping is in use, another CPU could
> > write to the same mapping and maliciously alter kernel text. Implement a
> > LKDTM test to attempt to exploit such an opening during code patching.
> > The test is implemented on powerpc and requires LKDTM built into the
> > kernel (building LKDTM as a module is insufficient).
> > 
> > The LKDTM "hijack" test works as follows:
> > 
> >    1. A CPU executes an infinite loop to patch an instruction. This is
> >       the "patching" CPU.
> >    2. Another CPU attempts to write to the address of the temporary
> >       mapping used by the "patching" CPU. This other CPU is the
> >       "hijacker" CPU. The hijack either fails with a fault/error or
> >       succeeds, in which case some kernel text is now overwritten.
> > 
> > The virtual address of the temporary patch mapping is provided via an
> > LKDTM-specific accessor to the hijacker CPU. This test assumes a
> > hypothetical situation where this address was leaked previously.
> > 
> > How to run the test:
> > 
> > 	mount -t debugfs none /sys/kernel/debug
> > 	(echo HIJACK_PATCH > /sys/kernel/debug/provoke-crash/DIRECT)
> > 
> > A passing test indicates that it is not possible to overwrite kernel
> > text from another CPU by using the temporary mapping established by
> > a CPU for patching.
> > 
> > Signed-off-by: Christopher M. Riedl <cmr@...ux.ibm.com>
> > 
> > ---
> > 
> > v5:  * Use `u32*` instead of `struct ppc_inst*` based on new series in
> >         upstream.
> > 
> > v4:  * Separate the powerpc and x86_64 bits into individual patches.
> >       * Use __put_kernel_nofault() when attempting to hijack the mapping
> >       * Use raw_smp_processor_id() to avoid triggering the BUG() when
> >         calling smp_processor_id() in preemptible code - the only thing
> >         that matters is that one of the threads is bound to a different
> >         CPU - we are not using smp_processor_id() to access any per-cpu
> >         data or similar where preemption should be disabled.
> >       * Rework the patching_cpu() kthread stop condition to avoid:
> >         https://lwn.net/Articles/628628/
> > ---
> >   drivers/misc/lkdtm/core.c  |   1 +
> >   drivers/misc/lkdtm/lkdtm.h |   1 +
> >   drivers/misc/lkdtm/perms.c | 134 +++++++++++++++++++++++++++++++++++++
> >   3 files changed, 136 insertions(+)
> > 
> > diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
> > index 8024b6a5cc7fc..fbcb95eda337b 100644
> > --- a/drivers/misc/lkdtm/core.c
> > +++ b/drivers/misc/lkdtm/core.c
> > @@ -147,6 +147,7 @@ static const struct crashtype crashtypes[] = {
> >   	CRASHTYPE(WRITE_RO),
> >   	CRASHTYPE(WRITE_RO_AFTER_INIT),
> >   	CRASHTYPE(WRITE_KERN),
> > +	CRASHTYPE(HIJACK_PATCH),
> >   	CRASHTYPE(REFCOUNT_INC_OVERFLOW),
> >   	CRASHTYPE(REFCOUNT_ADD_OVERFLOW),
> >   	CRASHTYPE(REFCOUNT_INC_NOT_ZERO_OVERFLOW),
> > diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
> > index 99f90d3e5e9cb..87e7e6136d962 100644
> > --- a/drivers/misc/lkdtm/lkdtm.h
> > +++ b/drivers/misc/lkdtm/lkdtm.h
> > @@ -62,6 +62,7 @@ void lkdtm_EXEC_USERSPACE(void);
> >   void lkdtm_EXEC_NULL(void);
> >   void lkdtm_ACCESS_USERSPACE(void);
> >   void lkdtm_ACCESS_NULL(void);
> > +void lkdtm_HIJACK_PATCH(void);
> >   
> >   /* refcount.c */
> >   void lkdtm_REFCOUNT_INC_OVERFLOW(void);
> > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> > index 2dede2ef658f3..39e7456852229 100644
> > --- a/drivers/misc/lkdtm/perms.c
> > +++ b/drivers/misc/lkdtm/perms.c
> > @@ -9,6 +9,7 @@
> >   #include <linux/vmalloc.h>
> >   #include <linux/mman.h>
> >   #include <linux/uaccess.h>
> > +#include <linux/kthread.h>
> >   #include <asm/cacheflush.h>
> >   
> >   /* Whether or not to fill the target memory area with do_nothing(). */
> > @@ -222,6 +223,139 @@ void lkdtm_ACCESS_NULL(void)
> >   	pr_err("FAIL: survived bad write\n");
> >   }
> >   
> > +#if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
> > +	defined(CONFIG_PPC))
>
>
> I think this test shouldn't be limited to CONFIG_PPC and shouldn't be
> limited to
> CONFIG_STRICT_KERNEL_RWX. It should be there all the time.
>
> Also why limiting it to IS_BUILTIN(CONFIG_LKDTM) ?
>

The test needs read_cpu_patching_addr() which definitely cannot be
exposed outside of the kernel (ie. builtin).

> > +/*
> > + * This is just a dummy location to patch-over.
> > + */
> > +static void patching_target(void)
> > +{
> > +	return;
> > +}
> > +
> > +#include <asm/code-patching.h>
> > +const u32 *patch_site = (const u32 *)&patching_target;
> > +
> > +static inline int lkdtm_do_patch(u32 data)
> > +{
> > +	return patch_instruction((u32 *)patch_site, ppc_inst(data));
> > +}
> > +
> > +static inline u32 lkdtm_read_patch_site(void)
> > +{
> > +	return READ_ONCE(*patch_site);
> > +}
> > +
> > +/* Returns True if the write succeeds */
> > +static inline bool lkdtm_try_write(u32 data, u32 *addr)
> > +{
> > +	__put_kernel_nofault(addr, &data, u32, err);
> > +	return true;
> > +
> > +err:
> > +	return false;
> > +}
> > +
> > +static int lkdtm_patching_cpu(void *data)
> > +{
> > +	int err = 0;
> > +	u32 val = 0xdeadbeef;
> > +
> > +	pr_info("starting patching_cpu=%d\n", raw_smp_processor_id());
> > +
> > +	do {
> > +		err = lkdtm_do_patch(val);
> > +	} while (lkdtm_read_patch_site() == val && !err && !kthread_should_stop());
> > +
> > +	if (err)
> > +		pr_warn("XFAIL: patch_instruction returned error: %d\n", err);
> > +
> > +	while (!kthread_should_stop()) {
> > +		set_current_state(TASK_INTERRUPTIBLE);
> > +		schedule();
> > +	}
> > +
> > +	return err;
> > +}
> > +
> > +void lkdtm_HIJACK_PATCH(void)
> > +{
> > +	struct task_struct *patching_kthrd;
> > +	int patching_cpu, hijacker_cpu, attempts;
> > +	unsigned long addr;
> > +	bool hijacked;
> > +	const u32 bad_data = 0xbad00bad;
> > +	const u32 original_insn = lkdtm_read_patch_site();
> > +
> > +	if (!IS_ENABLED(CONFIG_SMP)) {
> > +		pr_err("XFAIL: this test requires CONFIG_SMP\n");
> > +		return;
> > +	}
> > +
> > +	if (num_online_cpus() < 2) {
> > +		pr_warn("XFAIL: this test requires at least two cpus\n");
> > +		return;
> > +	}
> > +
> > +	hijacker_cpu = raw_smp_processor_id();
> > +	patching_cpu = cpumask_any_but(cpu_online_mask, hijacker_cpu);
> > +
> > +	patching_kthrd = kthread_create_on_node(&lkdtm_patching_cpu, NULL,
> > +						cpu_to_node(patching_cpu),
> > +						"lkdtm_patching_cpu");
> > +	kthread_bind(patching_kthrd, patching_cpu);
> > +	wake_up_process(patching_kthrd);
> > +
> > +	addr = offset_in_page(patch_site) | read_cpu_patching_addr(patching_cpu);
> > +
> > +	pr_info("starting hijacker_cpu=%d\n", hijacker_cpu);
> > +	for (attempts = 0; attempts < 100000; ++attempts) {
> > +		/* Try to write to the other CPU's temp patch mapping */
> > +		hijacked = lkdtm_try_write(bad_data, (u32 *)addr);
> > +
> > +		if (hijacked) {
> > +			if (kthread_stop(patching_kthrd)) {
> > +				pr_info("hijack attempts: %d\n", attempts);
> > +				pr_err("XFAIL: error stopping patching cpu\n");
> > +				return;
> > +			}
> > +			break;
> > +		}
> > +	}
> > +	pr_info("hijack attempts: %d\n", attempts);
> > +
> > +	if (hijacked) {
> > +		if (lkdtm_read_patch_site() == bad_data)
> > +			pr_err("overwrote kernel text\n");
> > +		/*
> > +		 * There are window conditions where the hijacker cpu manages to
> > +		 * write to the patch site but the site gets overwritten again by
> > +		 * the patching cpu. We still consider that a "successful" hijack
> > +		 * since the hijacker cpu did not fault on the write.
> > +		 */
> > +		pr_err("FAIL: wrote to another cpu's patching area\n");
> > +	} else {
> > +		kthread_stop(patching_kthrd);
> > +	}
> > +
> > +	/* Restore the original data to be able to run the test again */
> > +	lkdtm_do_patch(original_insn);
> > +}
> > +
> > +#else
> > +
> > +void lkdtm_HIJACK_PATCH(void)
> > +{
> > +	if (!IS_ENABLED(CONFIG_PPC))
> > +		pr_err("XFAIL: this test only runs on powerpc\n");
> > +	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
> > +		pr_err("XFAIL: this test requires CONFIG_STRICT_KERNEL_RWX\n");
> > +	if (!IS_BUILTIN(CONFIG_LKDTM))
> > +		pr_err("XFAIL: this test requires CONFIG_LKDTM=y (not =m!)\n");
> > +}
> > +
> > +#endif
> > +
> >   void __init lkdtm_perms_init(void)
> >   {
> >   	/* Make sure we can write to __ro_after_init values during __init */
> >