Message-ID: <4AADE19D.7030808@web.de>
Date: Mon, 14 Sep 2009 08:24:29 +0200
From: Jan Kiszka <jan.kiszka@....de>
To: Frederic Weisbecker <fweisbec@...il.com>
CC: Ingo Molnar <mingo@...e.hu>, LKML <linux-kernel@...r.kernel.org>,
Prasad <prasad@...ux.vnet.ibm.com>,
Alan Stern <stern@...land.harvard.edu>,
Peter Zijlstra <peterz@...radead.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Jiri Slaby <jirislaby@...il.com>,
Li Zefan <lizf@...fujitsu.com>, Avi Kivity <avi@...hat.com>,
Paul Mackerras <paulus@...ba.org>,
Mike Galbraith <efault@....de>,
Masami Hiramatsu <mhiramat@...hat.com>
Subject: Re: [PATCH 3/5] hw-breakpoints: Rewrite the hw-breakpoints layer
on top of perf counters
Frederic Weisbecker wrote:
> On Sat, Sep 12, 2009 at 12:09:40AM +0200, Jan Kiszka wrote:
>> Frederic Weisbecker wrote:
>>> This patch rebases the implementation of the breakpoints API on top of
>>> perf counter instances.
>>>
>>> The core breakpoint API has changed a bit:
>>>
>>> - register_kernel_hw_breakpoint() now takes a cpu as a parameter. For
>>> now it doesn't support breakpoints that cover all cpus, but this may be
>>> implemented soon.
>>>
>>> - unregister_kernel_hw_breakpoint() and unregister_user_hw_breakpoint()
>>> have been unified into a single unregister_hw_breakpoint().
>>>
>>> Each breakpoint now matches a perf counter, which handles the
>>> register scheduling, thread/cpu attachment, etc.
>>>
>>> The new layering is as follows:
>>>
>>>     ptrace       kgdb      ftrace   perf syscall
>>>        \          |          /         /
>>>         \         |         /         /
>> kgdb doesn't fit here as it requires nmi-safe services.
>>
>> I don't think you want to make the whole stack nmi-safe but rather
>> provide a separate interface that allows kgdb to announce to the kernel
>> when it uses some slot. Those slots should simply be excluded from
>> hardware updates. That's roughly the logic we use in KVM for guest
>> debugging: when the host starts to use debug registers for that purpose,
>> the guest's settings will not affect the real hardware anymore.
>
>
>
> I don't quite understand what must be NMI-safe here. Is it when
> we request a breakpoint or when we hit one?
>
Both. With kgdb, the kernel may be interrupted (almost) everywhere, and
then the operator may decide to add/remove hardware breakpoints during
this interruption.
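
The slot-announcement idea from the quoted paragraph could be modeled roughly as below. This is only a userspace sketch under stated assumptions: dbg_reserve_slot(), dbg_release_slot(), hbp_write_slot() and the hw_slot[] array are hypothetical names, not an existing kernel, kgdb or KVM interface, and concurrency is ignored (a real version would presumably need atomic bit operations to be usable from NMI context).

```c
#include <stdbool.h>
#include <stdint.h>

#define HBP_NUM 4	/* x86 has four address debug registers, DR0..DR3 */

/* Hypothetical bookkeeping; nothing here is an existing kernel API. */
static unsigned int dbg_reserved_mask;	/* slots announced by the debugger */
static uintptr_t hw_slot[HBP_NUM];	/* stand-in for the real DR0..DR3 */

/* Debugger side: announce that a slot is in use. Only flips a bit. */
static bool dbg_reserve_slot(unsigned int slot)
{
	if (slot >= HBP_NUM || (dbg_reserved_mask & (1u << slot)))
		return false;
	dbg_reserved_mask |= 1u << slot;
	return true;
}

/* Debugger side: hand the slot back to the breakpoint layer. */
static void dbg_release_slot(unsigned int slot)
{
	if (slot < HBP_NUM)
		dbg_reserved_mask &= ~(1u << slot);
}

/* Breakpoint layer side: reserved slots are excluded from updates. */
static bool hbp_write_slot(unsigned int slot, uintptr_t addr)
{
	if (slot >= HBP_NUM || (dbg_reserved_mask & (1u << slot)))
		return false;
	hw_slot[slot] = addr;
	return true;
}
```

The point of the split is that the debugger path never takes locks or allocates; the regular breakpoint layer only has to check the mask before each hardware update.
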
>
>
>> Still on my wishlist for KVM is a cheap & easy way to obtain the current
>> register content or to refresh it in hardware. It's not yet clear to me
>> where to hook this in the given design. It looks like this information
>> can be scattered over the current thread and some perf counters.
>
>
> With this design approach, the debug registers are no longer stored
> in the thread structure. Actually, they are no longer stored at all.
>
> Especially because the breakpoints are no longer assigned to a
> specific address register. The register is chosen when the counter
> is enabled. And the counter is often toggled on/off, depending on
> whether we start/end profiling the desired context. It can be a single
> task, in which case the counter is enabled while the task is sched in,
> and disabled when it is sched out.
> And between two sched atoms, the register used for a breakpoint
> can be different.
>
> The arch information about the breakpoints (len/type/addr) is stored
> in the counter structure, and the address/control register contents
> are now dynamically computed.
>
> For your needs, basically the control must be done from perfcounters.
> When you switch from host to guest, the counter must be sched out.
> And in the reverse direction, it must be sched in.
> Then perf will take care of that by itself.
Actually, we wanted to avoid sched-out activity, and so far this is
possible. But if both steps are cheap enough, specifically if the
sched-out does _not_ touch the hardware and is very cheap if no
breakpoints are set, KVM will likely be a happy user.
Does that API already exist or what additional work is required?
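
For illustration, the cost model asked for above (sched-out that never touches the hardware, and a no-op path when no breakpoints are set) could look like the following userspace sketch. All names (struct bp_ctx, bp_sched_in(), bp_sched_out(), guest_took_registers(), hw_writes) are hypothetical and do not describe the perf counter implementation. It also assumes that stale register values are harmless while another context runs, which is an assumption that would need checking.

```c
#include <stddef.h>

/* Hypothetical per-context state: how many breakpoints are set. */
struct bp_ctx {
	unsigned int nr_active;
};

static const struct bp_ctx *hw_owner;	/* whose values sit in the registers */
static unsigned long hw_writes;		/* counts simulated register writes */

/* Sched-in: load registers only if they do not already hold our values. */
static void bp_sched_in(const struct bp_ctx *ctx)
{
	if (ctx->nr_active == 0 || hw_owner == ctx)
		return;
	hw_writes += ctx->nr_active;	/* one write per active breakpoint */
	hw_owner = ctx;
}

/* Sched-out: deliberately does nothing, the registers are left as is. */
static void bp_sched_out(const struct bp_ctx *ctx)
{
	(void)ctx;
}

/* Called when someone else (e.g. a guest) overwrote the registers. */
static void guest_took_registers(void)
{
	hw_owner = NULL;
}
```

In this model a context without breakpoints costs one comparison per switch, and a context with breakpoints only pays for register writes after somebody else has actually used the registers.
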
Jan