linux-kernel - Re: [PATCH] arch_check_bp_in_kernelspace: fix the range check

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ffbad99ce24a54ce51dbcd9557f3a993.squirrel@mail.sublimeip.com>
Date:	Tue, 20 Nov 2012 21:33:42 +1100
From:	u3557@...o.sublimeip.com
To:	"Steven Rostedt" <rostedt@...dmis.org>
Cc:	"Oleg Nesterov" <oleg@...hat.com>,
	"Frederic Weisbecker" <fweisbec@...il.com>,
	"Ingo Molnar" <mingo@...hat.com>,
	"Peter Zijlstra" <a.p.zijlstra@...llo.nl>,
	"Amnon Shiloh" <u3557@...o.sublimeip.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arch_check_bp_in_kernelspace: fix the range check

Dear Steve,

> But here, there's no prejudice between tasks. All tasks will now hit the
> breakpoint regardless of if it is being traced or not.

Just to clarify, there is no intention to allow conventional breakpoints
in the vsyscall page - that would indeed be a disaster affecting all other
processes.

The aim of this patch is to allow ptracers to set the x86 debug-registers
of their traced process, so that it stops when it reaches the entry points
of those (few) system-calls that are in the vsyscall page.  Note that those
debug registers are anyway swapped at context-switch, so no other processes
are affected.

> Is this really what we want? I mean, isn't syscall tracing in the kernel
> done by a flag set to the task's structure. If the task has a flag set,
> then it does the slow (ptrace) path which handles the breakpoint for the
> user.

Most system-calls can be trapped by the PTRACE_SYSCALL option (and the
later PTRACE_SYSEMU), but those few system-calls on the vsyscall page
defy the intention to trap ALL system-calls.  They also cannot be
checkpointed by inserting a trap-instruction (as that, if allowed, would
break all other processes), hence the only alternative left is to have
this patch that fixes the oversight in the design of
PTRACE_SYSCALL/PTRACE_SYSEMU in the presence of a vsyscall page.

Best Regards,
Amnon.

> On Mon, 2012-11-19 at 18:47 +0100, Oleg Nesterov wrote:
>> On 11/09, Oleg Nesterov wrote:
>> >
>> > On 11/09, Oleg Nesterov wrote:
>> > >
>> > > Note: TASK_SIZE doesn't look right at least on x86, I think it
>> should
>> > > be replaced by TASK_SIZE_MAX.
>> > > ...
>> > > --- x/arch/x86/kernel/hw_breakpoint.c
>> > > +++ x/arch/x86/kernel/hw_breakpoint.c
>> > > @@ -200,7 +200,7 @@ int arch_check_bp_in_kernelspace(struct
>> > >  	va = info->address;
>> > >  	len = get_hbp_len(info->len);
>> > >
>> > > -	return (va >= TASK_SIZE) && ((va + len - 1) >= TASK_SIZE);
>> > > +	return (va >= TASK_SIZE) || ((va + len - 1) >= TASK_SIZE);
>> >
>> > But actully I'd like to ask another question.
>> >
>> > Can't we add the additional !in_gate_area_no_mm() check to allow
>> > the hw breakpoints in vsyscall?
>> >
>> > Quoting Amnon:
>> >
>> > 	My service needs to catch the system-calls of its traced son.
>> > 	Almost all system-calls are caught with PTRACE_SYSCALL, but not those
>> > 	using the vsyscall page - especially "gettimeofday()" and "time()".
>> >
>> > 	...> Is this really what we want? I mean, isn't syscall tracing in
the kernel
> done by a flag set to the task's structure. If the task has a flag set,
> then it does the slow (ptrace) path which handles the breakpoint for the
> user.
>> >
>> > 	However, the code in "arch/x86/kernel/ptrace.c" forbids catching
>> kernel
>> > 	addresses.
>> >
>> > I tend to agree with Amnon...
>> >
>> > What do you think?
>>
>> ping ;)
>>
>> I agree the patch I sent is very minor (although it looks like the
>> trivial
>> bugfix to me).
>>
>> Just I think Amnon has a point, shouldn't we change this change like
>> below?
>> (on top of this fixlet).
>>
>> Oleg.
>>
>> --- x/arch/x86/kernel/hw_breakpoint.c
>> +++ x/arch/x86/kernel/hw_breakpoint.c
>> @@ -188,6 +188,11 @@ static int get_hbp_len(u8 hbp_len)
>>  	return len_in_bytes;
>>  }
>>
>> +static inline bool bp_in_gate(unsigned long start, unsigned long end)
>> +{
>> +	return in_gate_area_no_mm(start) && in_gate_area_no_mm(end);
>> +}
>> +
>>  /*
>>   * Check for virtual address in kernel space.
>>   */
>> @@ -200,7 +205,8 @@ int arch_check_bp_in_kernelspace(struct
>>  	va = info->address;
>>  	len = get_hbp_len(info->len);
>>
>> -	return (va >= TASK_SIZE) || ((va + len - 1) >= TASK_SIZE);
>> +	return ((va >= TASK_SIZE) || ((va + len - 1) >= TASK_SIZE)) &&
>> +		!bp_in_gate(va, va + len - 1);
>
> So we want to allow non privileged to add a breakpoint in a vsyscall
> area even if it happens to lay in kernel space?
>
> Is this really what we want? I mean, isn't syscall tracing in the kernel
> done by a flag set to the task's structure. If the task has a flag set,
> then it does the slow (ptrace) path which handles the breakpoint for the
> user.
>
> But here, there's no prejudice between tasks. All tasks will now hit the
> breakpoint regardless of if it is being traced or not.
>
> -- Steve
>
>
>>  }
>>
>>  int arch_bp_generic_fields(int x86_len, int x86_type,
>
>
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/