[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20080324170243.GB14908@Krystal>
Date: Mon, 24 Mar 2008 13:02:43 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: Ingo Molnar <mingo@...e.hu>
Cc: linux-kernel@...r.kernel.org,
Arjan van de Ven <arjan@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] x86 Fix text_poke for vmalloced pages
* Ingo Molnar (mingo@...e.hu) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:
>
> > The shadow vmap for DEBUG_RODATA kernel text modification uses
> > virt_to_page to get the pages from the pointer address.
> >
> > However, I think vmalloc_to_page would be required in case the page is
> > used for modules.
> >
> > Since only the core kernel text is marked read-only, use
> > core_kernel_text() to make sure we only shadow map the core kernel
> > text, not modules.
> >
> > This is an incremental change to make the DEBUG_RODATA and text_poke
> > play together nicely. A future step will be to make the module text
> > read-only too, which will require changes to load module, module free
> > and text_poke. The idea is to fix the current x86 git tree quickly.
>
> > + if (core_kernel_text((unsigned long)addr)) {
> > struct page *pages[2] = { virt_to_page(addr),
> > virt_to_page(addr + PAGE_SIZE) };
> > if (!pages[1])
> > @@ -522,6 +522,13 @@ void *__kprobes text_poke(void *addr, co
> > memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
> > local_irq_restore(flags);
> > vunmap(vaddr);
> > + } else {
> > + /*
> > + * modules are in vmalloc'ed memory, always writable.
> > + */
> > + local_irq_save(flags);
> > + memcpy(addr, opcode, len);
> > + local_irq_restore(flags);
>
> hm, this looks ugly, and the whole text_poke() function looks ugly. For
> example why the extra code block + indentation here:
>
> +void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
> +{
> + unsigned long flags;
> + char *vaddr;
> + int nr_pages = 2;
> +
> + BUG_ON(len > sizeof(long));
> + BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
> + - ((long)addr & ~(sizeof(long) - 1)));
> + {
> + struct page *pages[2] = { virt_to_page(addr),
> + virt_to_page(addr + PAGE_SIZE) };
>
The extra indentation is there so we can declare
struct page *pages[2] = { };
Otherwise, we would be located after the BUG_ON lines and would have to
do the following, which takes extra lines...
unsigned long flags;
char *vaddr;
int nr_pages = 2;
struct page *pages[2];
BUG_ONs...
pages[0] = virt_to_page(addr);
pages[1] = virt_to_page(addr + PAGE_SIZE);
but then extra indentation in the new text_poke is within a block
depending on a "if" statement, so it becomes less curbersome. But it all
goes away in the update below.
> also, more fundamentally - why not introduce a proper, generic "look up
> kernel text struct page *" method, instead of open-coding various
> assumptions about which kernel text is readonly and which isnt?
>
I totally agree with you, that's the correct way to do it. And why
should we special-case the "kernel text is not read-only" case ?
Considering this is a slow path and that it would just potentially make
bugs harder to detect, I would simply consider _always_ doing a shadow
map to modify the kernel text. New patch attached.
Mathieu
> Ingo
x86 Fix text_poke for vmalloced pages
The shadow vmap for DEBUG_RODATA kernel text modification uses virt_to_page to
get the pages from the pointer address and vmalloc_to_page for vmalloc'ed
pages.
However, I think vmalloc_to_page would be required in case the page is used for
modules.
- Changelog:
Deal with read-only module text.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
CC: Ingo Molnar <mingo@...e.hu>
---
arch/x86/kernel/alternative.c | 27 ++++++++++++++++-----------
1 file changed, 16 insertions(+), 11 deletions(-)
Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c 2008-03-24 10:00:30.000000000 -0400
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c 2008-03-24 10:18:19.000000000 -0400
@@ -507,22 +507,27 @@ void *__kprobes text_poke(void *addr, co
unsigned long flags;
char *vaddr;
int nr_pages = 2;
+ struct page *pages[2];
BUG_ON(len > sizeof(long));
BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
- ((long)addr & ~(sizeof(long) - 1)));
- {
- struct page *pages[2] = { virt_to_page(addr),
- virt_to_page(addr + PAGE_SIZE) };
- if (!pages[1])
- nr_pages = 1;
- vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
- WARN_ON(!vaddr);
- local_irq_save(flags);
- memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
- local_irq_restore(flags);
- vunmap(vaddr);
+ if (is_vmalloc_addr(addr)) {
+ pages[0] = vmalloc_to_page(addr);
+ pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
+ } else {
+ pages[0] = virt_to_page(addr);
+ pages[1] = virt_to_page(addr + PAGE_SIZE);
}
+ BUG_ON(!pages[0]);
+ if (!pages[1])
+ nr_pages = 1;
+ vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
+ BUG_ON(!vaddr);
+ local_irq_save(flags);
+ memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
+ local_irq_restore(flags);
+ vunmap(vaddr);
sync_core();
/* Could also do a CLFLUSH here to speed up CPU recovery; but
that causes hangs on some VIA CPUs. */
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists