Message-ID: <499A1A43.6000009@redhat.com>
Date:	Mon, 16 Feb 2009 21:00:35 -0500
From:	Masami Hiramatsu <mhiramat@...hat.com>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	Nick Piggin <npiggin@...e.de>
CC:	Peter Zijlstra <peterz@...radead.org>,
	akpm <akpm@...ux-foundation.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Jim Keniston <jkenisto@...ibm.com>
Subject: Re: irq-disabled vs vmap vs text_poke

Mathieu Desnoyers wrote:
> * Nick Piggin (npiggin@...e.de) wrote:
>> On Mon, Feb 16, 2009 at 10:04:43AM -0500, Masami Hiramatsu wrote:
>>>>>>>> BTW, what about using map_vm_area() in text_poke() instead of vmap()?
>>>>>>>> Since text_poke() just maps text pages to alias pages temporarily,
>>>>>>>> I think we don't need to use delayed vunmap().
>> [...] 
>>
>>> Here is the patch which replace v(un)map with (un)map_vm_area.
>> I don't quite understand the point of this... delayed vunmap() is
>> just an implementation detail of vmap subsystem. Callers should not
>> have to care.
>>
> 
> AFAIK, map_vm_area/unmap_vm_area is faster than vmap/vunmap.  This is
> the point of this patch. Masami, could you provide a quick benchmark of
> text_poke()/seconds before and after this optimization is applied to
> confirm this ?

Sure, here is the result of calling text_poke() 2^14 times.

<Without this patch>
Total: 3634133356(cycles), 221809(cycles/text_poke)
Total: 3699532690(cycles), 225801(cycles/text_poke)
Total: 3249855588(cycles), 198355(cycles/text_poke)

<With this patch>
Total: 483467579(cycles), 29508(cycles/text_poke)
Total: 497441301(cycles), 30361(cycles/text_poke)
Total: 497604548(cycles), 30371(cycles/text_poke)
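
For reference, the per-call figures above are just each total divided by the 2^14 calls. The sketch below redoes that arithmetic and derives a before/after ratio. The helper names (cycles_per_call, mean_cycles_per_call, speedup) are made up for illustration and are not part of the patch or of the benchmark code, which was not posted.

```c
#include <stdint.h>

#define POKE_CALLS (1u << 14)	/* 2^14 calls per run, as stated above */

/* Totals copied from the results above. */
static const uint64_t totals_without[3] = {
	3634133356ULL, 3699532690ULL, 3249855588ULL
};
static const uint64_t totals_with[3] = {
	483467579ULL, 497441301ULL, 497604548ULL
};

/* Cycles per text_poke() for one run: total cycles / number of calls. */
static uint64_t cycles_per_call(uint64_t total_cycles)
{
	return total_cycles / POKE_CALLS;
}

/* Mean of the per-call figures over several runs. */
static double mean_cycles_per_call(const uint64_t *totals, int runs)
{
	double sum = 0.0;
	int i;

	for (i = 0; i < runs; i++)
		sum += (double)cycles_per_call(totals[i]);
	return sum / runs;
}

/* Ratio of the mean per-call cost without the patch to that with it. */
static double speedup(void)
{
	return mean_cycles_per_call(totals_without, 3) /
	       mean_cycles_per_call(totals_with, 3);
}
```

With these three runs each, speedup() comes out at roughly 7x in favor of the patched version.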

BTW, this is not only about performance, but also about simplicity and
what text_poke() actually needs.
vmap() may allocate a new vm_area. However, since text_poke() only needs
to map pages temporarily (for a very short time), we don't want to call
kmalloc() or any other memory allocator.
And since text_poke() creates WRITABLE aliases of READ-ONLY pages, we
want to tear down these aliases ASAP.
So I think reserving a small vm_area for text_poke() and reusing it
is enough.
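
To illustrate the "reserve once, reuse" idea outside the kernel, here is a rough userspace analogy using mmap(). It is only a sketch under assumptions: poke_window, poke_window_init(), make_text_page() and poke() are hypothetical names, a temporary file stands in for the text page, and it is single-threaded (the patch serializes use of the shared area with a spinlock). It is not the kernel implementation.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void *poke_window;	/* address range reserved once, reused per poke */
static long page_size;

/* Reserve the window up front, so later pokes allocate no address space. */
static int poke_window_init(void)
{
	page_size = sysconf(_SC_PAGESIZE);
	poke_window = mmap(NULL, page_size, PROT_NONE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return poke_window == MAP_FAILED ? -1 : 0;
}

/* Stand-in for a text page: one page of a temp file, mapped read-only. */
static const char *make_text_page(int *fd_out)
{
	FILE *f = tmpfile();
	void *ro;

	if (!f || ftruncate(fileno(f), page_size) != 0)
		return NULL;
	ro = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fileno(f), 0);
	if (ro == MAP_FAILED)
		return NULL;
	*fd_out = fileno(f);
	return ro;
}

/* Map the page writable into the reserved window, write, drop the alias. */
static int poke(int fd, size_t off_in_page, const void *data, size_t len)
{
	char *w = mmap(poke_window, page_size, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_FIXED, fd, 0);

	if (w == MAP_FAILED)
		return -1;
	memcpy(w + off_in_page, data, len);
	/* Replace the writable alias with an inaccessible mapping at once. */
	if (mmap(poke_window, page_size, PROT_NONE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
		return -1;
	return 0;
}
```

After poke_window_init() and make_text_page(), each poke() call reuses the same window, and the write shows up through the read-only mapping because both map the same shared file page.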

Thanks,

> 
> Thanks,
> 
> Mathieu
> 
>>  
>>> Signed-off-by: Masami Hiramatsu <mhiramat@...hat.com>
>>> ---
>>>  arch/x86/include/asm/alternative.h |    1 +
>>>  arch/x86/kernel/alternative.c      |   25 ++++++++++++++++++++-----
>>>  init/main.c                        |    3 +++
>>>  3 files changed, 24 insertions(+), 5 deletions(-)
>>>
>>> Index: linux-2.6/arch/x86/include/asm/alternative.h
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/include/asm/alternative.h
>>> +++ linux-2.6/arch/x86/include/asm/alternative.h
>>> @@ -177,6 +177,7 @@ extern void add_nops(void *insns, unsign
>>>   * The _early version expects the memory to already be RW.
>>>   */
>>>
>>> +extern void text_poke_init(void);
>>>  extern void *text_poke(void *addr, const void *opcode, size_t len);
>>>  extern void *text_poke_early(void *addr, const void *opcode, size_t len);
>>>
>>> Index: linux-2.6/arch/x86/kernel/alternative.c
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/kernel/alternative.c
>>> +++ linux-2.6/arch/x86/kernel/alternative.c
>>> @@ -485,6 +485,16 @@ void *text_poke_early(void *addr, const
>>>  	return addr;
>>>  }
>>>
>>> +static struct vm_struct *text_poke_area[2];
>>> +static DEFINE_SPINLOCK(text_poke_lock);
>>> +
>>> +void __init text_poke_init(void)
>>> +{
>>> +	text_poke_area[0] = get_vm_area(PAGE_SIZE, VM_ALLOC);
>>> +	text_poke_area[1] = get_vm_area(2 * PAGE_SIZE, VM_ALLOC);
>>> +	BUG_ON(!text_poke_area[0] || !text_poke_area[1]);
>>> +}
>>> +
>>>  /**
>>>   * text_poke - Update instructions on a live kernel
>>>   * @addr: address to modify
>>> @@ -501,8 +511,9 @@ void *__kprobes text_poke(void *addr, co
>>>  	unsigned long flags;
>>>  	char *vaddr;
>>>  	int nr_pages = 2;
>>> -	struct page *pages[2];
>>> -	int i;
>>> +	struct page *pages[2], **pgp = pages;
>>> +	int i, ret;
>>> +	struct vm_struct *vma;
>>>
>>>  	if (!core_kernel_text((unsigned long)addr)) {
>>>  		pages[0] = vmalloc_to_page(addr);
>>> @@ -515,12 +526,16 @@ void *__kprobes text_poke(void *addr, co
>>>  	BUG_ON(!pages[0]);
>>>  	if (!pages[1])
>>>  		nr_pages = 1;
>>> -	vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
>>> -	BUG_ON(!vaddr);
>>> +	spin_lock(&text_poke_lock);
>>> +	vma = text_poke_area[nr_pages-1];
>>> +	ret = map_vm_area(vma, PAGE_KERNEL, &pgp);
>>> +	BUG_ON(ret);
>>> +	vaddr = vma->addr;
>>>  	local_irq_save(flags);
>>>  	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
>>>  	local_irq_restore(flags);
>>> -	vunmap(vaddr);
>>> +	unmap_kernel_range((unsigned long)vma->addr, (unsigned long)vma->size);
>>> +	spin_unlock(&text_poke_lock);
>>>  	sync_core();
>>>  	/* Could also do a CLFLUSH here to speed up CPU recovery; but
>>>  	   that causes hangs on some VIA CPUs. */
>>> Index: linux-2.6/init/main.c
>>> ===================================================================
>>> --- linux-2.6.orig/init/main.c
>>> +++ linux-2.6/init/main.c
>>> @@ -676,6 +676,9 @@ asmlinkage void __init start_kernel(void
>>>  	taskstats_init_early();
>>>  	delayacct_init();
>>>
>>> +#ifdef CONFIG_X86
>>> +	text_poke_init();
>>> +#endif
>>>  	check_bugs();
>>>
>>>  	acpi_early_init(); /* before LAPIC and SMP init */
>>>
>>> -- 
>>> Masami Hiramatsu
>>>
>>> Software Engineer
>>> Hitachi Computer Products (America) Inc.
>>> Software Solutions Division
>>>
>>> e-mail: mhiramat@...hat.com
> 

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@...hat.com
