[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ffbad99ce24a54ce51dbcd9557f3a993.squirrel@mail.sublimeip.com>
Date: Tue, 20 Nov 2012 21:33:42 +1100
From: u3557@...o.sublimeip.com
To: "Steven Rostedt" <rostedt@...dmis.org>
Cc: "Oleg Nesterov" <oleg@...hat.com>,
"Frederic Weisbecker" <fweisbec@...il.com>,
"Ingo Molnar" <mingo@...hat.com>,
"Peter Zijlstra" <a.p.zijlstra@...llo.nl>,
"Amnon Shiloh" <u3557@...o.sublimeip.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arch_check_bp_in_kernelspace: fix the range check
Dear Steve,
> But here, there's no prejudice between tasks. All tasks will now hit the
> breakpoint regardless of if it is being traced or not.
Just to clarify, there is no intention to allow conventional breakpoints
in the vsyscall page - that would indeed be a disaster affecting all other
processes.
The aim of this patch is to allow ptracers to set the x86 debug-registers
of their traced process, so that it stops when it reaches the entry points
of those (few) system-calls that are in the vsyscall page. Note that those
debug registers are anyway swapped at context-switch, so no other processes
are affected.
> Is this really what we want? I mean, isn't syscall tracing in the kernel
> done by a flag set to the task's structure. If the task has a flag set,
> then it does the slow (ptrace) path which handles the breakpoint for the
> user.
Most system-calls can be trapped by the PTRACE_SYSCALL option (and the
later PTRACE_SYSEMU), but those few system-calls on the vsyscall page
defy the intention to trap ALL system-calls. They also cannot be
checkpointed by inserting a trap-instruction (as that, if allowed, would
break all other processes), hence the only alternative left is to have
this patch that fixes the oversight in the design of
PTRACE_SYSCALL/PTRACE_SYSEMU in the presence of a vsyscall page.
Best Regards,
Amnon.
> On Mon, 2012-11-19 at 18:47 +0100, Oleg Nesterov wrote:
>> On 11/09, Oleg Nesterov wrote:
>> >
>> > On 11/09, Oleg Nesterov wrote:
>> > >
>> > > Note: TASK_SIZE doesn't look right at least on x86, I think it
>> should
>> > > be replaced by TASK_SIZE_MAX.
>> > > ...
>> > > --- x/arch/x86/kernel/hw_breakpoint.c
>> > > +++ x/arch/x86/kernel/hw_breakpoint.c
>> > > @@ -200,7 +200,7 @@ int arch_check_bp_in_kernelspace(struct
>> > > va = info->address;
>> > > len = get_hbp_len(info->len);
>> > >
>> > > - return (va >= TASK_SIZE) && ((va + len - 1) >= TASK_SIZE);
>> > > + return (va >= TASK_SIZE) || ((va + len - 1) >= TASK_SIZE);
>> >
>> > But actully I'd like to ask another question.
>> >
>> > Can't we add the additional !in_gate_area_no_mm() check to allow
>> > the hw breakpoints in vsyscall?
>> >
>> > Quoting Amnon:
>> >
>> > My service needs to catch the system-calls of its traced son.
>> > Almost all system-calls are caught with PTRACE_SYSCALL, but not those
>> > using the vsyscall page - especially "gettimeofday()" and "time()".
>> >
>> > ...> Is this really what we want? I mean, isn't syscall tracing in
the kernel
> done by a flag set to the task's structure. If the task has a flag set,
> then it does the slow (ptrace) path which handles the breakpoint for the
> user.
>> >
>> > However, the code in "arch/x86/kernel/ptrace.c" forbids catching
>> kernel
>> > addresses.
>> >
>> > I tend to agree with Amnon...
>> >
>> > What do you think?
>>
>> ping ;)
>>
>> I agree the patch I sent is very minor (although it looks like the
>> trivial
>> bugfix to me).
>>
>> Just I think Amnon has a point, shouldn't we change this change like
>> below?
>> (on top of this fixlet).
>>
>> Oleg.
>>
>> --- x/arch/x86/kernel/hw_breakpoint.c
>> +++ x/arch/x86/kernel/hw_breakpoint.c
>> @@ -188,6 +188,11 @@ static int get_hbp_len(u8 hbp_len)
>> return len_in_bytes;
>> }
>>
>> +static inline bool bp_in_gate(unsigned long start, unsigned long end)
>> +{
>> + return in_gate_area_no_mm(start) && in_gate_area_no_mm(end);
>> +}
>> +
>> /*
>> * Check for virtual address in kernel space.
>> */
>> @@ -200,7 +205,8 @@ int arch_check_bp_in_kernelspace(struct
>> va = info->address;
>> len = get_hbp_len(info->len);
>>
>> - return (va >= TASK_SIZE) || ((va + len - 1) >= TASK_SIZE);
>> + return ((va >= TASK_SIZE) || ((va + len - 1) >= TASK_SIZE)) &&
>> + !bp_in_gate(va, va + len - 1);
>
> So we want to allow non privileged to add a breakpoint in a vsyscall
> area even if it happens to lay in kernel space?
>
> Is this really what we want? I mean, isn't syscall tracing in the kernel
> done by a flag set to the task's structure. If the task has a flag set,
> then it does the slow (ptrace) path which handles the breakpoint for the
> user.
>
> But here, there's no prejudice between tasks. All tasks will now hit the
> breakpoint regardless of if it is being traced or not.
>
> -- Steve
>
>
>> }
>>
>> int arch_bp_generic_fields(int x86_len, int x86_type,
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists