linux-kernel - Re: [PATCH 3/5] hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf counters

Message-ID: <4AADE19D.7030808@web.de>
Date:	Mon, 14 Sep 2009 08:24:29 +0200
From:	Jan Kiszka <jan.kiszka@....de>
To:	Frederic Weisbecker <fweisbec@...il.com>
CC:	Ingo Molnar <mingo@...e.hu>, LKML <linux-kernel@...r.kernel.org>,
	Prasad <prasad@...ux.vnet.ibm.com>,
	Alan Stern <stern@...land.harvard.edu>,
	Peter Zijlstra <peterz@...radead.org>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Jiri Slaby <jirislaby@...il.com>,
	Li Zefan <lizf@...fujitsu.com>, Avi Kivity <avi@...hat.com>,
	Paul Mackerras <paulus@...ba.org>,
	Mike Galbraith <efault@....de>,
	Masami Hiramatsu <mhiramat@...hat.com>
Subject: Re: [PATCH 3/5] hw-breakpoints: Rewrite the hw-breakpoints layer
 on top of perf counters

Frederic Weisbecker wrote:
> On Sat, Sep 12, 2009 at 12:09:40AM +0200, Jan Kiszka wrote:
>> Frederic Weisbecker wrote:
>>> This patch rebases the implementation of the breakpoints API on top of
>>> perf counter instances.
>>>
>>> The core breakpoint API has changed a bit:
>>>
>>> - register_kernel_hw_breakpoint() now takes a cpu as a parameter. For
>>>   now it doesn't support breakpoints that span all cpus, but this may
>>>   be implemented soon.
>>>
>>> - unregister_kernel_hw_breakpoint() and unregister_user_hw_breakpoint()
>>>   have been unified in a single unregister_hw_breakpoint()
>>>
>>> Each breakpoint now matches a perf counter, which handles the
>>> register scheduling, thread/cpu attachment, etc.
>>>
>>> The new layering is now made as follows:
>>>
>>>        ptrace       kgdb      ftrace   perf syscall
>>>           \          |          /         /
>>>            \         |         /         /
>> kgdb doesn't fit here as it requires nmi-safe services.
>>
>> I don't think you want to make the whole stack nmi-safe but rather
>> provide a separate interface that allows kgdb to announce to the kernel
>> when it uses some slot. Those slots should simply be excluded from
>> hardware updates. That's roughly the logic we use in KVM for guest
>> debugging: when the host starts to use debug registers for that purpose,
>> the guest's settings will no longer affect the real hardware.
> 
> 
> 
> I don't quite understand what must be NMI-safe here. Is it when
> we request a breakpoint or when we hit one?
> 

Both. With kgdb, the kernel may be interrupted (almost) everywhere, and
then the operator may decide to add/remove hardware breakpoints during
this interruption.
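
A rough user-space model of the slot reservation idea quoted above, assuming four address slots as on x86 (DR0-DR3). All names here (debugger_slots, debugger_claim_slot(), debugger_release_slot(), pick_free_slot()) are hypothetical and not part of the patch. The claim and release paths use only lock-free atomic operations, which is the kind of primitive a path callable from NMI context would be restricted to.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_DEBUG_SLOTS 4   /* x86 has four address debug registers, DR0-DR3 */

/* Hypothetical: bit n set means slot n is owned by the debugger and must
 * be skipped by the regular breakpoint scheduler. */
static atomic_uint debugger_slots;

/* Claim a slot without taking any lock. Returns false if the slot is out
 * of range or was already claimed. */
static bool debugger_claim_slot(unsigned int slot)
{
    if (slot >= NR_DEBUG_SLOTS)
        return false;

    unsigned int bit = 1u << slot;
    unsigned int old = atomic_fetch_or(&debugger_slots, bit);

    return !(old & bit);
}

/* Hand a slot back to the regular scheduler. */
static void debugger_release_slot(unsigned int slot)
{
    if (slot < NR_DEBUG_SLOTS)
        atomic_fetch_and(&debugger_slots, ~(1u << slot));
}

/* Pick the lowest slot that is neither in use (in_use_mask) nor reserved
 * by the debugger. Returns -1 if no slot is available. */
static int pick_free_slot(unsigned int in_use_mask)
{
    unsigned int blocked = in_use_mask | atomic_load(&debugger_slots);

    for (int i = 0; i < NR_DEBUG_SLOTS; i++) {
        if (!(blocked & (1u << i)))
            return i;
    }
    return -1;
}
```

With slot 0 claimed by the debugger, pick_free_slot(0) returns 1. After debugger_release_slot(0) it returns 0 again.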

> 
>  
>> Still on my wishlist for KVM is a cheap & easy way to obtain the current
>> register content or to refresh it in hardware. It's not yet clear to me
>> where to hook this in the given design. It looks like this information
>> can be scattered over the current thread and some perf counters.
> 
> 
> With this design approach, the debug registers are no longer stored
> in the thread structure. In fact, they are not stored anywhere.
> 
> That is mainly because a breakpoint is no longer assigned to a
> specific address register. The register is chosen when the counter
> is enabled, and the counter is often toggled on and off, depending
> on whether we start or stop profiling the desired context. That context
> can be a single task, in which case the counter is enabled while the
> task is sched in and disabled when it is sched out.
> Between two sched atoms, the register used for a breakpoint
> can be different.
> 
> The arch information about the breakpoints (len/type/addr) is stored
> in the counter structure, and the address/control register contents
> are now computed dynamically.
> 
> For your needs, basically the control must be done from perfcounters.
> When you switch from host to guest, the counter must be sched out.
> And in the reverse direction, it must be sched in.
> Then perf will take care of that by itself.

Actually, we wanted to avoid sched-out activity, and so far this is
possible. But if both steps are cheap enough, specifically if the
sched-out does _not_ touch the hardware and is very cheap if no
breakpoints are set, KVM will likely be a happy user.
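
To make the requirement concrete, here is a minimal user-space sketch of such a lazy scheme. Everything in it (struct bp_context, bp_sched_in(), bp_sched_out(), the fake_dr[] array standing in for the debug registers) is hypothetical and does not describe the perf counters code. It deliberately ignores the question of when stale breakpoints have to be disarmed, which a real implementation would need to answer.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_DEBUG_SLOTS 4

/* Hypothetical per-context breakpoint state. */
struct bp_context {
    unsigned long addr[NR_DEBUG_SLOTS];
    unsigned int active_mask;  /* bit n set: slot n holds a breakpoint */
    bool loaded;               /* context currently considered scheduled in */
};

/* Stand-ins for the hardware: simulated registers and a write counter. */
static unsigned long fake_dr[NR_DEBUG_SLOTS];
static unsigned long hw_writes;
static const struct bp_context *hw_owner;  /* whose values the registers hold */

/* Sched-out never touches the registers. With no breakpoints set it
 * returns immediately; otherwise it only updates bookkeeping. */
static void bp_sched_out(struct bp_context *ctx)
{
    if (ctx->active_mask == 0)
        return;
    ctx->loaded = false;
}

/* Sched-in writes the registers only if this context has breakpoints
 * and the registers do not already hold its values. */
static void bp_sched_in(struct bp_context *ctx)
{
    if (ctx->active_mask == 0)
        return;

    if (hw_owner != ctx) {
        for (int i = 0; i < NR_DEBUG_SLOTS; i++) {
            if (ctx->active_mask & (1u << i)) {
                fake_dr[i] = ctx->addr[i];
                hw_writes++;
            }
        }
        hw_owner = ctx;
    }
    ctx->loaded = true;
}
```

In this model a context without breakpoints costs one comparison per switch, and switching the same context out and back in performs no register writes at all.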

Does that API already exist or what additional work is required?

Jan

