[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4B9F9E02.6030007@gmail.com>
Date: Tue, 16 Mar 2010 17:04:34 +0200
From: Török Edwin <edwintorok@...il.com>
To: Frederic Weisbecker <fweisbec@...il.com>
CC: Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Paul Mackerras <paulus@...ba.org>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf: x86: fix callgraphs of 32-bit processes on 64-bit
kernels.
On 03/16/2010 04:49 PM, Frederic Weisbecker wrote:
> On Mon, Mar 15, 2010 at 05:34:20PM +0200, Török Edwin wrote:
>> When profiling a 32-bit process on a 64-bit kernel, callgraph tracing
>> stopped after the first function, because it has seen a garbage memory address
>> (tried to interpret the frame pointer, and return address as a 64-bit pointer).
>>
>> Fix this by using a struct stack_frame with 32-bit pointers when the TIF_IA32 flag is set.
>>
>> Note that TIF_IA32 flag must be used, and not is_compat_task(), because the
>> latter is only set when the 32-bit process is executing a syscall,
>> which may not always be the case (when tracing page fault events for example).
>>
>> Signed-off-by: Török Edwin <edwintorok@...il.com>
>> ---
>> arch/x86/kernel/cpu/perf_event.c | 33 +++++++++++++++++++++++++++++++++
>> 1 files changed, 33 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
>> index 8c1c070..13ee83a 100644
>> --- a/arch/x86/kernel/cpu/perf_event.c
>> +++ b/arch/x86/kernel/cpu/perf_event.c
>> @@ -2401,6 +2401,20 @@ static int copy_stack_frame(const void __user *fp, struct stack_frame *frame)
>> return bytes == sizeof(*frame);
>> }
>>
>> +struct stack_frame_ia32 {
>> + u32 next_frame;
>> + u32 return_address;
>> +};
>> +
>> +static int copy_stack_frame_ia32(u32 fp, struct stack_frame_ia32 *frame)
>> +{
>> + unsigned long bytes;
>> +
>> + bytes = copy_from_user_nmi(frame, (const void __user*)(unsigned long)fp, sizeof(*frame));
>> +
>> + return bytes == sizeof(*frame);
>> +}
>> +
>> static void
>> perf_callchain_user(struct pt_regs *regs, struct perf_callchain_entry *entry)
>> {
>> @@ -2414,6 +2428,25 @@ perf_callchain_user(struct pt_regs *regs, struct perf_callchain_entry *entry)
>>
>> callchain_store(entry, PERF_CONTEXT_USER);
>> callchain_store(entry, regs->ip);
>> + if (test_thread_flag(TIF_IA32)) {
>> + /* 32-bit process in 64-bit kernel. */
>> + u32 fp = regs->bp;
>> + struct stack_frame_ia32 frame;
>> + while (entry->nr < PERF_MAX_STACK_DEPTH) {
>> + frame.next_frame = 0;
>> + frame.return_address = 0;
>> +
>> + if (!copy_stack_frame_ia32(fp, &frame))
>> + break;
>> +
>> + if ((unsigned long)fp < regs->sp)
>> + break;
>
>
>
> Shouldn't it be this?
>
> if (fp < (u32)regs->sp)
>
> As the high part of fp is zeroed in the cast but
> the high part of regs->sp remains. I don't know what could be
> there, but since the user doesn't use it, perhaps just garbage?
> And that could messup the comparison.
Agreed. I sent a new patch with that change.
>
> Otherwise the rest looks pretty good to me, nice catch.
>
Thanks.
Best regards,
--Edwin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists