Message-ID: <20080306141855.GA5767@Krystal>
Date:	Thu, 6 Mar 2008 09:18:55 -0500
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Srinivasa DS <srinivasa@...ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, ananth@...ibm.com,
	Jim Keniston <jkenisto@...ibm.com>, srikar@...ux.vnet.ibm.com,
	SystemTAP <systemtap@...rces.redhat.com>,
	Andi Kleen <andi@...stfloor.org>, pageexec@...email.hu,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	Arjan van de Ven <arjan@...radead.org>
Subject: Re: [PATCH] x86 - Enhance DEBUG_RODATA support - alternatives

* Ingo Molnar (mingo@...e.hu) wrote:
> 
> [ Cc:-ed Arjan - he wrote and handles the DEBUG_RODATA bits. ]
> 
> * Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:
> 
> > Hrm, yeah, that's because http://lkml.org/lkml/2008/2/2/227
> > 
> > x86 - Enhance DEBUG_RODATA support - alternatives
> > 
> > has been pulled out of the x86 tree. It implements the correct 
> > text_poke required to support this. It's been removed because Xen does 
> > not support the WP bit in the guest kernels.
> > 
> > Can you try my new replacement implementation ? It creates another 
> > mapping instead of modifying the WP bit.
> 
> thanks, applied it to x86.git.
> 
> note, there were rejects and i had to hand-merge it - in general it's 
> best to send development x86 patches against:
> 
>     http://people.redhat.com/mingo/x86.git/README
> 
> find the merged patch below - i hope i merged it right ;)
> 

Yep, it looks good. Ok, in the future I'll try to rebase my patches against
the x86 tree before submitting them.

Given the number of trees I work in (mainline for LTTng, vm.git for slub
development, x86...), I may occasionally forget to port a patch to the right
tree, but I'll do my best.
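
For anyone who wants to play with the idea outside the kernel, here is a
minimal user-space sketch of the aliased-mapping approach the patch takes:
instead of making the read-only mapping writable (the old WP-bit trick), write
through a second, writable mapping of the same pages. It is only an
illustration built on memfd_create()/mmap(); the names and the nop bytes are
made up for the example, and it is not the kernel code:

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);

	/* Backing object standing in for the patched kernel page.
	 * Error handling is omitted for brevity. */
	int fd = memfd_create("fake_text", 0);
	ftruncate(fd, page);

	/* "Kernel text": mapped read-only, like a DEBUG_RODATA section. */
	unsigned char *ro = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);

	/* Writable alias of the same pages, playing the role of the
	 * temporary mapping that text_poke() sets up. */
	unsigned char *rw = mmap(NULL, page, PROT_READ | PROT_WRITE,
				 MAP_SHARED, fd, 0);

	memcpy(rw, "\x90\x90\x90", 3);	/* "patch" three nop bytes via the alias */
	munmap(rw, page);		/* tear the alias down again */

	printf("byte seen through the read-only mapping: %#x\n", ro[0]);
	munmap(ro, page);
	close(fd);
	return 0;
}

In the kernel the alias is created with vmap() over the one or two pages that
contain the target and is removed right after the copy, which is what the
text_poke() in the patch below does.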

Thanks,

Mathieu

> 	Ingo
> 
> ------------->
> Subject: x86: enhance DEBUG_RODATA support - alternatives
> From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
> Date: Thu, 6 Mar 2008 08:48:49 -0500
> 
> Fix a memcpy that should be a text_poke (in apply_alternatives).
> 
> Make text_poke write through a temporary second mapping of the target pages,
> so that DEBUG_RODATA is supported correctly and the CPU HOTPLUG special case
> can be removed.
> 
> Add text_poke_early, for alternatives and paravirt boot-time and module load
> time patching.
> 
> Changelog:
> 
> - Fix the text_set and text_poke alignment check (bitwise AND and OR were mixed up)
> - Remove text_set
> - Export add_nops, so it can be used by others.
> - Document text_poke_early.
> - Remove the clflush, since it hangs some VIA CPUs and is not strictly
>   necessary.
> - Add kerneldoc to text_poke and text_poke_early.
> - Create a second vmap instead of using the WP bit to support Xen and VMI.
> - Move the local_irq disable inside text_poke and text_poke_early, so the
>   functions can sleep (e.g. in vmap) outside the short irq-off copy.
> 
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
> CC: Andi Kleen <andi@...stfloor.org>
> CC: pageexec@...email.hu
> CC: H. Peter Anvin <hpa@...or.com>
> CC: Jeremy Fitzhardinge <jeremy@...p.org>
> Signed-off-by: Ingo Molnar <mingo@...e.hu>
> ---
>  arch/x86/kernel/alternative.c |   88 +++++++++++++++++++++++++++++++-----------
>  include/asm-x86/alternative.h |   23 ++++++++++
>  2 files changed, 87 insertions(+), 24 deletions(-)
> 
> Index: linux-x86.q/arch/x86/kernel/alternative.c
> ===================================================================
> --- linux-x86.q.orig/arch/x86/kernel/alternative.c
> +++ linux-x86.q/arch/x86/kernel/alternative.c
> @@ -11,6 +11,8 @@
>  #include <asm/mce.h>
>  #include <asm/nmi.h>
>  #include <asm/vsyscall.h>
> +#include <asm/cacheflush.h>
> +#include <asm/io.h>
>  
>  #define MAX_PATCH_LEN (255-1)
>  
> @@ -173,7 +175,7 @@ static const unsigned char*const * find_
>  #endif /* CONFIG_X86_64 */
>  
>  /* Use this to add nops to a buffer, then text_poke the whole buffer. */
> -static void add_nops(void *insns, unsigned int len)
> +void add_nops(void *insns, unsigned int len)
>  {
>  	const unsigned char *const *noptable = find_nop_table();
>  
> @@ -186,6 +188,7 @@ static void add_nops(void *insns, unsign
>  		len -= noplen;
>  	}
>  }
> +EXPORT_SYMBOL_GPL(add_nops);
>  
>  extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
>  extern u8 *__smp_locks[], *__smp_locks_end[];
> @@ -219,7 +222,7 @@ void apply_alternatives(struct alt_instr
>  		memcpy(insnbuf, a->replacement, a->replacementlen);
>  		add_nops(insnbuf + a->replacementlen,
>  			 a->instrlen - a->replacementlen);
> -		text_poke(instr, insnbuf, a->instrlen);
> +		text_poke_early(instr, insnbuf, a->instrlen);
>  	}
>  }
>  
> @@ -280,7 +283,6 @@ void alternatives_smp_module_add(struct 
>  				 void *text,  void *text_end)
>  {
>  	struct smp_alt_module *smp;
> -	unsigned long flags;
>  
>  	if (noreplace_smp)
>  		return;
> @@ -306,39 +308,37 @@ void alternatives_smp_module_add(struct 
>  		__func__, smp->locks, smp->locks_end,
>  		smp->text, smp->text_end, smp->name);
>  
> -	spin_lock_irqsave(&smp_alt, flags);
> +	spin_lock(&smp_alt);
>  	list_add_tail(&smp->next, &smp_alt_modules);
>  	if (boot_cpu_has(X86_FEATURE_UP))
>  		alternatives_smp_unlock(smp->locks, smp->locks_end,
>  					smp->text, smp->text_end);
> -	spin_unlock_irqrestore(&smp_alt, flags);
> +	spin_unlock(&smp_alt);
>  }
>  
>  void alternatives_smp_module_del(struct module *mod)
>  {
>  	struct smp_alt_module *item;
> -	unsigned long flags;
>  
>  	if (smp_alt_once || noreplace_smp)
>  		return;
>  
> -	spin_lock_irqsave(&smp_alt, flags);
> +	spin_lock(&smp_alt);
>  	list_for_each_entry(item, &smp_alt_modules, next) {
>  		if (mod != item->mod)
>  			continue;
>  		list_del(&item->next);
> -		spin_unlock_irqrestore(&smp_alt, flags);
> +		spin_unlock(&smp_alt);
>  		DPRINTK("%s: %s\n", __func__, item->name);
>  		kfree(item);
>  		return;
>  	}
> -	spin_unlock_irqrestore(&smp_alt, flags);
> +	spin_unlock(&smp_alt);
>  }
>  
>  void alternatives_smp_switch(int smp)
>  {
>  	struct smp_alt_module *mod;
> -	unsigned long flags;
>  
>  #ifdef CONFIG_LOCKDEP
>  	/*
> @@ -355,7 +355,7 @@ void alternatives_smp_switch(int smp)
>  		return;
>  	BUG_ON(!smp && (num_online_cpus() > 1));
>  
> -	spin_lock_irqsave(&smp_alt, flags);
> +	spin_lock(&smp_alt);
>  
>  	/*
>  	 * Avoid unnecessary switches because it forces JIT based VMs to
> @@ -379,7 +379,7 @@ void alternatives_smp_switch(int smp)
>  						mod->text, mod->text_end);
>  	}
>  	smp_mode = smp;
> -	spin_unlock_irqrestore(&smp_alt, flags);
> +	spin_unlock(&smp_alt);
>  }
>  
>  #endif
> @@ -407,7 +407,7 @@ void apply_paravirt(struct paravirt_patc
>  
>  		/* Pad the rest with nops */
>  		add_nops(insnbuf + used, p->len - used);
> -		text_poke(p->instr, insnbuf, p->len);
> +		text_poke_early(p->instr, insnbuf, p->len);
>  	}
>  }
>  extern struct paravirt_patch_site __start_parainstructions[],
> @@ -416,8 +416,6 @@ extern struct paravirt_patch_site __star
>  
>  void __init alternative_instructions(void)
>  {
> -	unsigned long flags;
> -
>  	/* The patching is not fully atomic, so try to avoid local interruptions
>  	   that might execute the to be patched code.
>  	   Other CPUs are not running. */
> @@ -426,7 +424,6 @@ void __init alternative_instructions(voi
>  	stop_mce();
>  #endif
>  
> -	local_irq_save(flags);
>  	apply_alternatives(__alt_instructions, __alt_instructions_end);
>  
>  	/* switch to patch-once-at-boottime-only mode and free the
> @@ -458,7 +455,6 @@ void __init alternative_instructions(voi
>  	}
>  #endif
>   	apply_paravirt(__parainstructions, __parainstructions_end);
> -	local_irq_restore(flags);
>  
>  	if (smp_alt_once)
>  		free_init_pages("SMP alternatives",
> @@ -471,18 +467,64 @@ void __init alternative_instructions(voi
>  #endif
>  }
>  
> -/*
> - * Warning:
> +/**
> + * text_poke_early - Update instructions on a live kernel at boot time
> + * @addr: address to modify
> + * @opcode: source of the copy
> + * @len: length to copy
> + *
>   * When you use this code to patch more than one byte of an instruction
>   * you need to make sure that other CPUs cannot execute this code in parallel.
> - * Also no thread must be currently preempted in the middle of these instructions.
> - * And on the local CPU you need to be protected again NMI or MCE handlers
> - * seeing an inconsistent instruction while you patch.
> + * Also no thread must be currently preempted in the middle of these
> + * instructions. And on the local CPU you need to be protected against NMI or MCE
> + * handlers seeing an inconsistent instruction while you patch.
>   */
> -void __kprobes text_poke(void *addr, unsigned char *opcode, int len)
> +void *text_poke_early(void *addr, const void *opcode, size_t len)
>  {
> +	unsigned long flags;
> +	local_irq_save(flags);
>  	memcpy(addr, opcode, len);
> +	local_irq_restore(flags);
> +	sync_core();
> +	/* Could also do a CLFLUSH here to speed up CPU recovery; but
> +	   that causes hangs on some VIA CPUs. */
> +	return addr;
> +}
> +
> +/**
> + * text_poke - Update instructions on a live kernel
> + * @addr: address to modify
> + * @opcode: source of the copy
> + * @len: length to copy
> + *
> + * Only atomic text poke/set should be allowed when not doing early patching.
> + * This means the patched region must be small enough to be written atomically,
> + * and the address must be aligned so that a single atomic write covers it. It
> + * also makes sure we fit on a single page.
> + */
> +void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
> +{
> +	unsigned long flags;
> +	char *vaddr;
> +	int nr_pages = 2;
> +
> +	BUG_ON(len > sizeof(long));
> +	BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
> +		- ((long)addr & ~(sizeof(long) - 1)));
> +	{
> +		struct page *pages[2] = { virt_to_page(addr),
> +			virt_to_page(addr + PAGE_SIZE) };
> +		if (!pages[1])
> +			nr_pages = 1;
> +		vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
> +		WARN_ON(!vaddr);
> +		local_irq_save(flags);
> +		memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
> +		local_irq_restore(flags);
> +		vunmap(vaddr);
> +	}
>  	sync_core();
>  	/* Could also do a CLFLUSH here to speed up CPU recovery; but
>  	   that causes hangs on some VIA CPUs. */
> +	return addr;
>  }
> Index: linux-x86.q/include/asm-x86/alternative.h
> ===================================================================
> --- linux-x86.q.orig/include/asm-x86/alternative.h
> +++ linux-x86.q/include/asm-x86/alternative.h
> @@ -156,6 +156,27 @@ apply_paravirt(struct paravirt_patch_sit
>  #define __parainstructions_end	NULL
>  #endif
>  
> -extern void text_poke(void *addr, unsigned char *opcode, int len);
> +extern void add_nops(void *insns, unsigned int len);
> +
> +/*
> + * Kernel text patching: text_poke() writes through a temporary writable
> + * mapping of the target pages, so read-only (DEBUG_RODATA) kernel text can be
> + * modified without touching the WP bit and without leaving the pages writable
> + * to anything else while patching.
> + *
> + * Warning:
> + * Code patching in the UP case is safe if NMIs and MCE handlers are stopped and
> + * no thread can be preempted in the instructions being modified (no iret to an
> + * invalid instruction possible) or if the instructions are changed from a
> + * consistent state to another consistent state atomically.
> + * More care must be taken when modifying code in the SMP case because of
> + * Intel's errata.
> + * On the local CPU you need to be protected against NMI or MCE handlers seeing an
> + * inconsistent instruction while you patch.
> + * The _early version expects the memory to already be RW.
> + */
> +
> +extern void *text_poke(void *addr, const void *opcode, size_t len);
> +extern void *text_poke_early(void *addr, const void *opcode, size_t len);
>  
>  #endif /* _ASM_X86_ALTERNATIVE_H */
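
For completeness, a caller of the new interface would look roughly like the
sketch below. It uses only the prototypes added to asm/alternative.h above;
arm_probe() and the int3 byte are just an example I made up, not part of the
patch:

#include <asm/alternative.h>

/* Arm a software breakpoint by patching a single byte. A one-byte write
 * trivially satisfies text_poke()'s constraints: it is atomic, fits within
 * sizeof(long) and cannot cross a page boundary. */
static const unsigned char int3 = 0xcc;

static void arm_probe(void *ip)
{
	text_poke(ip, &int3, 1);
}

At boot or module-load time, where the text is still RW and nothing else can
execute it, text_poke_early() would be the call to use instead.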

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
