lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1290770183.2145.122.camel@laptop>
Date:	Fri, 26 Nov 2010 12:16:23 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Stephane Eranian <eranian@...gle.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...e.hu, paulus@...ba.org,
	davem@...emloft.net, fweisbec@...il.com,
	perfmon2-devel@...ts.sf.net, eranian@...il.com,
	robert.richter@....com, acme@...hat.com, lizf@...fujitsu.com
Subject: Re: [PATCH 1/2] perf_events: add support for per-cpu per-cgroup
 monitoring (v5)

On Thu, 2010-11-25 at 22:32 +0100, Stephane Eranian wrote:
> On Thu, Nov 25, 2010 at 4:02 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Thu, 2010-11-25 at 15:51 +0100, Stephane Eranian wrote:
> >>
> >>
> >> On Thu, Nov 25, 2010 at 12:20 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> >>         On Thu, 2010-11-18 at 12:40 +0200, Stephane Eranian wrote:
> >>         > @@ -919,6 +945,10 @@ static inline void perf_event_task_sched_in(struct task_struct *task)
> >>         >  static inline
> >>         >  void perf_event_task_sched_out(struct task_struct *task, struct task_struct *next)
> >>         >  {
> >>         > +#ifdef CONFIG_CGROUPS
> >>         > +       atomic_t *cgroup_events = &__get_cpu_var(perf_cgroup_events);
> >>         > +       COND_STMT(cgroup_events, perf_cgroup_switch(task, next));
> >>         > +#endif
> >>         >         COND_STMT(&perf_task_events, __perf_event_task_sched_out(task, next));
> >>         >  }
> >>
> >>
> >>         I don't think that'll actually work, the jump label stuff needs a static
> >>         address.
> >>
> >> I did not know that.
> >
> > Yeah, its unfortunate the fallback code doesn't mandate this :/
> >
> >
> >>         Why not simply: s/perf_task_events/perf_sched_events/ and
> >>         increment it
> >>         for cgroup events as well?
> >>
> >> But you would need to demultiplex. that's not because perf_sched_events is
> >> set that you want BOTH perf_cgroup_switch() AND perf_event_task_sched_out().
> >
> > The main purpose of the jump-label stuff is to optimize the function
> > call and conditional into the perf code away, the moment we a function
> > call we might as well do everything, at that point its only a single
> > conditional.
> >
> > Jump labels are supposed to work like (they don't actually work like
> > this yet):
> >
> > my_func:
> >    asm-foo
> > addr_of_nop:
> >    nop5
> > after_nop:
> >    more-asm-foo
> >    iret
> >
> > out_of_line:
> >    do-special-foo
> >    jmp after_nop
> >
> >
> > We then keep a section of tuples:
> >
> > __jump_labels:
> >
> >  &perf_task_events,addr_of_nop
> >
> > Then when we flip perf_task_events from 0 -> !0 we rewrite the nop5 at
> > addr_of_nop to "jmp out_of_line" (5 bytes on x86, hence nop5), or the
> > reverse on !0 -> 0.
> >
> >
> > So 1) we need the 'key' (&perf_task_events) to be a static address
> > because the compiler needs to place the address in the special section
> > -- otherwise we can never find the nop location again, this also means
> > per-cpu variables don't make sense, there's only 1 copy of the text.
> >
> > and 2) the moment we take the out-of-line branch we incur the icache hit
> > and already set up a call, so optimizing away another conditional at the
> > cost of an extra function call doesn't really make sense.
> >
> >
> Ok, I understand. Thanks for the explanation.
> 
> So perf_sched_events would indicate that there may be some perf
> ctxsw work to do. It would set as soon as there is at least one
> event defined and not just a per-thread event.
> 
> BTW, I suspect, isn't  there a bug with
> PERF_COUNT_SW_CONTEXT_SWITCHES not being updated
> if you don't have at least one per-thread event. You need to
> hoist that perf_sw_event() into the header file before the COND().

Ah, indeed, that needs fixing.

> That implies you need to maintain:
>    - perf_sched_events: number of active events system-wide
>    - perf_task_events: number of per-thread events system-wide
>    - cgroup_events: per-cpu variable representing the number of cgroup
> events on a CPU
> 
> Is that what you are thinking?

Not sure you need the last two, once we have the jump label enabled and
call into the perf context switch handler the rest are simple conditions
on the current task state, no need to add yet more conditions to check
those conditions.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ