Date:	Tue, 30 Mar 2010 21:14:03 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Ingo Molnar <mingo@...e.hu>, LKML <linux-kernel@...r.kernel.org>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Paul Mackerras <paulus@...ba.org>,
	David Miller <davem@...emloft.net>
Subject: Re: [PATCH 2/2] perf: Use hot regs with software sched
	switch/migrate events

On Tue, Mar 30, 2010 at 08:54:52PM +0200, Peter Zijlstra wrote:
> On Tue, 2010-03-30 at 00:43 +0200, Frederic Weisbecker wrote:
> 
> > Actually I have doubts about what should be the strict sense
> > of exclude_kernel.
> > 
> > Does that mean we exclude any event that happened in the kernel?
> > Or does that mean we exclude the part that happened in the kernel?
> > 
> > Depending on the case, we do either.
> > 
> > In perf_swevent_hrtimer(), we simply go back to task_pt_regs()
> > if exclude_kernel.
> > 
> > But in other software events, we don't do such a fix; we actually
> > filter out the event if it is not user_mode().
> > 
> > So, I'm a bit confused on what to do.
> > I'm tempted to adopt the meaning from perf_swevent_hrtimer()
> > for software events too, I'm not sure...
> 
> Yes, that is indeed a good point. Problem is that perf_swevent_hrtimer()
> is not quite correct either, since strictly speaking its timeline should
> stop on the excluded region, but implementing that would make context
> switches horribly expensive.



No, we wouldn't need that. We would just need to change the regs
check.

Currently we have this:

	regs = get_irq_regs();
	/*
	 * In case we exclude kernel IPs or are somehow not in interrupt
	 * context, provide the next best thing, the user IP.
	 */
	if ((event->attr.exclude_kernel || !regs) &&
			!event->attr.exclude_user)
		regs = task_pt_regs(current);


According to the strict meaning of exclude_kernel (only events that
happened in userspace are counted), we should have this:


	regs = get_irq_regs();

	if (event->attr.exclude_kernel && regs)
		return ret;

	if (!regs && !event->attr.exclude_user && current->mm)
		regs = task_pt_regs(current);

	if (regs)
		overflow();


Note the current code is also buggy because we call task_pt_regs()
whether or not we are a kernel thread.
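
To make the difference between the two checks concrete, here is a small
userspace model of them. It is only a sketch: struct model_regs,
struct model_ctx and the two pick_regs_*() helpers are made-up names, and
the fields merely stand in for get_irq_regs(), task_pt_regs(current) and
current->mm. A NULL return models "no sample is emitted".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs; only identity matters here. */
struct model_regs {
	unsigned long ip;
};

/* Hypothetical snapshot of the state the hrtimer handler looks at. */
struct model_ctx {
	bool exclude_kernel;                 /* models event->attr.exclude_kernel */
	bool exclude_user;                   /* models event->attr.exclude_user */
	const struct model_regs *irq_regs;   /* models get_irq_regs(), may be NULL */
	const struct model_regs *task_regs;  /* models task_pt_regs(current) */
	bool has_mm;                         /* models current->mm != NULL */
};

/* Current check: fall back to the user regs instead of dropping the event. */
static const struct model_regs *pick_regs_current(const struct model_ctx *c)
{
	const struct model_regs *regs = c->irq_regs;

	if ((c->exclude_kernel || !regs) && !c->exclude_user)
		regs = c->task_regs;

	return regs;
}

/* Proposed check: drop the event, and never use task regs without an mm. */
static const struct model_regs *pick_regs_strict(const struct model_ctx *c)
{
	const struct model_regs *regs = c->irq_regs;

	if (c->exclude_kernel && regs)
		return NULL;

	if (!regs && !c->exclude_user && c->has_mm)
		regs = c->task_regs;

	return regs;
}
```

With exclude_kernel set and irq regs available, the first helper returns
the task regs while the second returns NULL. With no irq regs and no mm
(a kernel thread), the first helper still hands back the task regs, which
is the bug mentioned above.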


> 
> That said, the option that would be most correct is to simply not count
> these events, and in that respect the current behaviour seems best.


Ok. But in this case I'm not sure what to do with the context switch
software event. The new hot regs thing now captures the kernel context,
whereas before it was only capturing the userspace exit point.

Are you fine with that? The callchain will still go to userspace too.


> Maybe we can make a new perf feature that would for each kernel event
> (hw pmu included) report on the userspace state, would that be useful?


I'm not sure it would be useful...
