Message-ID: <20091108113546.GN11372@elte.hu>
Date: Sun, 8 Nov 2009 12:35:46 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Stefani Seibold <stefani@...bold.net>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
"H. Peter Anvin" <hpa@...or.com>,
Americo Wang <xiyou.wangcong@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Andi Kleen <andi@...stfloor.org>
Subject: Re: [PATCH] RFC x86_64 more accurate KSTK_ESP implementation

* Stefani Seibold <stefani@...bold.net> wrote:

> Hi,
>
> this is an RFC for a more accurate KSTK_ESP implementation for the x86_64
> architecture.
>
> Because usersp is only updated on a context switch, its value is
> outdated most of the time. This patch also updates the per-CPU variable
> old_rsp in the device and timer interrupts.
>
> In my opinion this can be done safely if the current stack pointer is
> outside the kernel stack of the current task and the instruction pointer
> is not inside the kernel.
>
> The old_rsp value will be stored in usersp on a context switch.
>
> KSTK_ESP will get the value from old_rsp if the task is the
> current task; otherwise it will read usersp.
>
> I know about the performance cost, which is why I am asking for comments.
>
> Stefani
>
> Signed-off-by: Stefani Seibold <stefani@...bold.net>
>
> include/asm/processor.h | 4 +++-
> kernel/apic/apic.c | 3 +++
> kernel/irq_64.c | 1 +
> kernel/process_64.c | 20 ++++++++++++++++++++
> 4 files changed, 27 insertions(+), 1 deletion(-)
>
> --- linux-2.6.32-rc5.old/arch/x86/include/asm/processor.h 2009-10-16 02:41:50.000000000 +0200
> +++ linux-2.6.32-rc5.new/arch/x86/include/asm/processor.h 2009-11-05 08:28:23.765300812 +0100
> @@ -1000,7 +1000,7 @@
> #define thread_saved_pc(t) (*(unsigned long *)((t)->thread.sp - 8))
>
> #define task_pt_regs(tsk) ((struct pt_regs *)(tsk)->thread.sp0 - 1)
> -#define KSTK_ESP(tsk) -1 /* sorry. doesn't work for syscall. */
> +extern unsigned long KSTK_ESP(struct task_struct *task);
> #endif /* CONFIG_X86_64 */
>
> extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
> @@ -1052,4 +1052,6 @@
> return ratio;
> }
>
> +extern void update_usersp(struct pt_regs *regs);
> +
> #endif /* _ASM_X86_PROCESSOR_H */
> --- linux-2.6.32-rc5.old/arch/x86/kernel/process_64.c 2009-10-16 02:41:50.000000000 +0200
> +++ linux-2.6.32-rc5.new/arch/x86/kernel/process_64.c 2009-11-05 08:52:39.965227285 +0100
> @@ -664,3 +664,23 @@
> return do_arch_prctl(current, code, addr);
> }
>
> +void update_usersp(struct pt_regs *regs)
> +{
> + unsigned long stk = (unsigned long)task_stack_page(current);
> + unsigned long stkp = (regs)->sp;

Cleanliness: no need for the parentheses around regs.

> +
> + if (((stkp < stk) || (stkp >= stk + THREAD_SIZE))
> + && regs->ip < PAGE_OFFSET)
> + percpu_write(old_rsp, stkp);
> +}

That check for regs->ip looks imprecise - why don't you use
user_mode_vm()?

It's true that the value itself is statistical, but we still don't want
to expose a kernel-space regs->sp - that would be an information leak.
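
To illustrate the point about the check, here is a minimal user-space sketch, not kernel code. It assumes that user mode can be recognised from the privilege bits in the saved CS, which is how I understand the 64-bit user_mode() test to work. The struct, the old_rsp stand-in and every name ending in _sketch are hypothetical and exist only for this example.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct pt_regs; only the fields used here. */
struct pt_regs_sketch {
	unsigned long ip;
	unsigned long cs;
	unsigned long sp;
};

/* Stand-in for the per-CPU old_rsp variable (assumption: one CPU). */
static unsigned long old_rsp_sketch;

/*
 * Assumed user-mode test: a non-zero privilege level in the low two
 * bits of the saved CS means the interrupt arrived from user space.
 */
static bool user_mode_sketch(const struct pt_regs_sketch *regs)
{
	return (regs->cs & 3) != 0;
}

/* Record the stack pointer only when it is known to be a user-space one. */
static void update_usersp_sketch(const struct pt_regs_sketch *regs)
{
	if (user_mode_sketch(regs))
		old_rsp_sketch = regs->sp;
}
```

With a test on the saved CS, a kernel-mode regs->sp is never copied, regardless of what regs->ip or the stack bounds look like.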
> +
> +unsigned long KSTK_ESP(struct task_struct *task)
> +{
> + if (test_tsk_thread_flag(task, TIF_IA32))
> + return task_pt_regs(task)->sp;
> +
> + if (task != current)
> + return task->thread.usersp;
> +
> + return percpu_read(old_rsp);
> +}
> --- linux-2.6.32-rc5.old/arch/x86/kernel/irq_64.c 2009-10-16 02:41:50.000000000 +0200
> +++ linux-2.6.32-rc5.new/arch/x86/kernel/irq_64.c 2009-11-04 22:29:55.762951577 +0100
> @@ -53,6 +53,7 @@
> struct irq_desc *desc;
>
> stack_overflow_check(regs);
> + update_usersp(regs);
>
>
> desc = irq_to_desc(irq);
> if (unlikely(!desc))
> --- linux-2.6.32-rc5.old/arch/x86/kernel/apic/apic.c 2009-10-16 02:41:50.000000000 +0200
> +++ linux-2.6.32-rc5.new/arch/x86/kernel/apic/apic.c 2009-11-04 23:12:32.805086991 +0100
> @@ -831,6 +831,9 @@
> {
> struct pt_regs *old_regs = set_irq_regs(regs);
>
> +#ifndef CONFIG_X86_32
> + update_usersp(regs);
> +#endif

Cleanliness: please eliminate this #ifdef by defining update_usersp() on
32-bit as well, as an empty inline function.
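
A rough sketch of what such a header declaration could look like. The exact config symbol and placement are assumptions, not taken from the patch:

```c
struct pt_regs;	/* a forward declaration is enough for a pointer parameter */

#ifdef CONFIG_X86_64
/* 64-bit: real implementation lives in process_64.c */
extern void update_usersp(struct pt_regs *regs);
#else
/* 32-bit: nothing to update, so the call compiles away to nothing */
static inline void update_usersp(struct pt_regs *regs) { }
#endif
```

The call site can then invoke update_usersp(regs) unconditionally, with no #ifdef around it.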

But I don't like this patch, because it adds overhead to the IRQ
fastpath.

I'd suggest a completely different method: why don't you use an IPI to
sample the SP whenever someone wants to read it from /proc and we see
that the task is running on a CPU right now?

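
The control flow of that idea can be sketched as a small user-space simulation. This is only an illustration under loud assumptions: all types and names ending in _sketch are hypothetical, and the "IPI" is mocked by calling the handler directly. In the kernel the cross-CPU call would presumably be something like smp_call_function_single(), with the handler looking at the interrupted registers.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 2

/* Hypothetical stand-ins; none of these are the kernel's real types. */
struct regs_sketch { unsigned long cs, sp; };
struct task_sketch { int cpu; unsigned long usersp; };
struct cpu_sketch {
	struct task_sketch *curr;	/* task running on this CPU, or NULL */
	struct regs_sketch irq_regs;	/* registers at the time the IPI arrived */
};

static struct cpu_sketch cpus[NR_CPUS_SKETCH];

struct sp_sample { struct task_sketch *task; unsigned long sp; bool valid; };

/* Would run on the target CPU in IPI context. */
static void sample_sp_ipi(struct cpu_sketch *cpu, void *info)
{
	struct sp_sample *s = info;

	/* Trust the sample only if the task is still running there and
	 * was interrupted in user mode (non-zero privilege bits in CS). */
	if (cpu->curr == s->task && (cpu->irq_regs.cs & 3)) {
		s->sp = cpu->irq_regs.sp;
		s->valid = true;
	}
}

/* Mock of a synchronous cross-CPU call: it just calls the handler. */
static void call_on_cpu_sketch(int cpu,
			       void (*fn)(struct cpu_sketch *, void *),
			       void *info)
{
	fn(&cpus[cpu], info);
}

/* Slow path, taken only when someone reads the value (e.g. via /proc). */
static unsigned long read_user_sp_sketch(struct task_sketch *task)
{
	struct sp_sample s = { .task = task, .sp = 0, .valid = false };

	if (cpus[task->cpu].curr == task)
		call_on_cpu_sketch(task->cpu, sample_sp_ipi, &s);

	/* Otherwise fall back to the value saved at the last context switch. */
	return s.valid ? s.sp : task->usersp;
}
```

The cost is paid only by the reader of the value, and the IRQ fastpath stays untouched.
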
Ingo