[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c62985530812231844w7bdc493secbbeaf4372afa52@mail.gmail.com>
Date: Wed, 24 Dec 2008 03:44:23 +0100
From: "Frédéric Weisbecker" <fweisbec@...il.com>
To: "Steven Rostedt" <rostedt@...dmis.org>
Cc: "Ingo Molnar" <mingo@...e.hu>,
"Linux Kernel" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] tracing/ftrace: don't trace on early stage of secondary cpu boot
2008/12/24 Steven Rostedt <rostedt@...dmis.org>:
>
> On Wed, 24 Dec 2008, Frederic Weisbecker wrote:
>
>> Impact: fix a crash/hard reboot while enabling cpu on runtime
>>
>> On some archs, the boot of a secondary cpu can have an early fragile state.
>> On x86-64, the pda is not initialized on the first stage of a cpu boot but
>> it is needed to get the cpu number and the current task pointer. These datas
>> are needed during tracing. As they were dereferenced at this stage, we got a
>> crash while turning on a cpu on runtime while tracing.
>>
>> Some other archs like ia64 can have such kind of issue too.
>>
>> This patch implements a function to verify if the cpu has been correctly initialized
>> before tracing it. By default, the check does nothing but archs can override it (as does
>> x86-64 in this patch).
>>
>> Signed-off-by: Frederic Weisbecker <fweisbec@...il.com>
>> ---
>> arch/x86/kernel/ftrace.c | 13 +++++++++++++
>> include/linux/ftrace.h | 8 ++++++++
>> kernel/trace/ftrace.c | 6 ++++++
>> kernel/trace/trace.c | 6 +++---
>> 4 files changed, 30 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
>> index 1b43086..7693c53 100644
>> --- a/arch/x86/kernel/ftrace.c
>> +++ b/arch/x86/kernel/ftrace.c
>> @@ -24,6 +24,19 @@
>> #include <asm/nmi.h>
>>
>>
>> +#ifdef CONFIG_X86_64
>> +/*
>> + * A cpu doesn't want to be traced if pda hasn't yet been initialized.
>> + * Pda contains the cpu number and current task pointer. A lot
>> + * of functions use them, not only ftrace.
>> + * We use stack_smp_processor_id() here to avoid using the pda.
>> + */
>> +bool ftrace_in_early_cpuinit(void)
>> +{
>> + return !cpu_isset(stack_smp_processor_id(), cpu_online_map);
>> +}
>> +#endif
>
> Is there something lighter weight than this? This is called during the
> function tracer and that is called at ever function. I would also rather
> this be an inline function defined in asm/ftrace.h if possible.
>
I first thought about annotate those first functions with "notrace".
But the code inside
cpu_init/pda_init could change later and call external/traced function.
The problem with inlining is that I would loose the weak weight of the
function....
>> +
>> #ifdef CONFIG_DYNAMIC_FTRACE
>>
>> union ftrace_code_union {
>> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
>> index 677432b..9a86e3d 100644
>> --- a/include/linux/ftrace.h
>> +++ b/include/linux/ftrace.h
>> @@ -78,6 +78,14 @@ void clear_ftrace_function(void);
>>
>> extern void ftrace_stub(unsigned long a0, unsigned long a1);
>>
>> +/*
>> + * On some archs, the secondary cpu boot is sometimes fragile.
>> + * Some important datas could be not yet initialized. In such states,
>> + * tracing is too dangerous and we use this function to detect these
>> + * kinds of context. Archs which have this sort of issues can override it.
>> + */
>> +extern bool ftrace_in_early_cpuinit(void);
>> +
>> #else /* !CONFIG_FUNCTION_TRACER */
>> # define register_ftrace_function(ops) do { } while (0)
>> # define unregister_ftrace_function(ops) do { } while (0)
>> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
>> index 2f32969..463e1eb 100644
>> --- a/kernel/trace/ftrace.c
>> +++ b/kernel/trace/ftrace.c
>> @@ -1962,6 +1962,12 @@ ftrace_enable_sysctl(struct ctl_table *table, int write,
>> return ret;
>> }
>>
>> +/* Overriden by archs if needed */
>> +bool __attribute__((weak)) ftrace_in_early_cpuinit(void)
>> +{
>> + return false;
>> +}
>> +
>> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>>
>> static atomic_t ftrace_graph_active;
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index b07e34e..1ac9b98 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -1175,7 +1175,7 @@ function_trace_call_preempt_only(unsigned long ip, unsigned long parent_ip)
>> int cpu, resched;
>> int pc;
>>
>> - if (unlikely(!ftrace_function_enabled))
>> + if (unlikely(!ftrace_function_enabled || ftrace_in_early_cpuinit()))
>> return;
>>
>> pc = preempt_count();
>> @@ -1202,7 +1202,7 @@ function_trace_call(unsigned long ip, unsigned long parent_ip)
>> int cpu;
>> int pc;
>>
>> - if (unlikely(!ftrace_function_enabled))
>> + if (unlikely(!ftrace_function_enabled || ftrace_in_early_cpuinit()))
>> return;
>>
>> /*
>> @@ -1233,7 +1233,7 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
>> int cpu;
>> int pc;
>>
>> - if (!ftrace_trace_task(current))
>> + if (!ftrace_trace_task(current) || ftrace_in_early_cpuinit())
>> return 0;
>
> Actually, didn't you say current is not available either? We are testing
> current first.
Oops. Strange, my tests didn't give me any problem.
I will fix it.
> I still wonder if there's a better way to find out if it is safe to run.
> Perhaps the arch code can test the %gs register to see if it is actually
> something valid?
>
> -- Steve
I could test if MSR_GS_BASE references cpu_pda(cpu) but that remains
as much heavy...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists