Message-ID: <20191106204419.GI3079@worktop.programming.kicks-ass.net>
Date: Wed, 6 Nov 2019 21:44:19 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Song Liu <songliubraving@...com>
Cc: open list <linux-kernel@...r.kernel.org>,
Kernel Team <Kernel-team@...com>,
"acme@...nel.org" <acme@...nel.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Jiri Olsa <jolsa@...nel.org>,
Alexey Budankov <alexey.budankov@...ux.intel.com>,
Namhyung Kim <namhyung@...nel.org>, Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH v6] perf: Sharing PMU counters across compatible events
On Wed, Nov 06, 2019 at 05:40:29PM +0000, Song Liu wrote:
> > On Nov 6, 2019, at 1:14 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> >> OTOH, a non-cgroup event could also be inactive. For example, when we
> >> have to rotate events, we may schedule the slave before the master.
> >
> > Right, although I suppose in that case you can do what you did in your
> > patch here. If someone did IOC_DISABLE on the master, we'd have to
> > re-elect a master -- obviously (and IOC_ENABLE).
>
> Re-electing the master on IOC_DISABLE is good. But we still need to
> handle ctx rotation. Otherwise, we need to keep the master on at all
> times.
I meant to say that for the rotation case we can do as you did here: if
we do add() on a slave, add the master if it wasn't add()'ed yet.
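To make that concrete, here is a minimal userspace model of the lazy-add
idea (the `toy_*` names are illustrative only, not the actual kernel
API; the point is just the control flow):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of counter sharing; not the real struct perf_event. */
struct toy_event {
	struct toy_event *dup_master; /* NULL, or the elected master */
	bool active;                  /* "has been add()'ed" */
	int add_calls;                /* times the counter was programmed */
};

/* Program the hardware counter (stub). */
static void toy_hw_add(struct toy_event *e)
{
	e->active = true;
	e->add_calls++;
}

/*
 * add() path: if this event is a slave and its master is not yet
 * active (e.g. rotation scheduled the slave first), add the master
 * on the slave's behalf; slaves then read the master's count rather
 * than owning a counter of their own.
 */
static void toy_event_pmu_add(struct toy_event *e)
{
	struct toy_event *m = e->dup_master;

	if (m && !m->active)
		toy_hw_add(m);	/* lazily bring the master online */

	if (!m)
		toy_hw_add(e);	/* no sharing: program the counter directly */
}
```

With this shape, only one counter is ever programmed per master,
regardless of which shared event the scheduler happens to add() first.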
> >> And if the master is in an event group, it will be more complicated...
> >
> > Hurmph, do you actually have that use-case? And yes, this one is tricky.
> >
> > Would it be sufficient if we disallow group events to be master (but
> > allow them to be slaves) ?
>
> Maybe we can solve this with an extra "first_active" pointer in
> perf_event. first_active points to the first event that is added by
> event_pmu_add().
> Then we need something like:
>
> event_pmu_add(event)
> {
>         if (event->dup_master->first_active) {
>                 /* sync with first_active */
>         } else {
>                 /* this event will be the first_active */
>                 event->dup_master->first_active = event;
>                 pmu->add(event);
>         }
> }
I'm confused about what exactly you're trying to solve with the
first_active thing. The problem with a group event as master is that
you then _must_ schedule the whole group, which is obviously difficult.
> >> If we do GFP_ATOMIC in perf_event_alloc(), maybe with an extra option, we
> >> don't need the tmp_master hack. So we only allocate master when we will
> >> use it.
> >
> > You can't, that's broken on -RT. ctx->lock is a raw_spinlock_t and
> > allocator locks are spinlock_t.
>
> How about we add another step in __perf_install_in_context(), like
>
> __perf_install_in_context()
> {
>         bool alloc_master;
>
>         perf_ctx_lock();
>         alloc_master = find_new_sharing(event, ctx);
>         perf_ctx_unlock();
>
>         if (alloc_master)
>                 event->dup_master = perf_event_alloc();
>
>         /* existing logic of __perf_install_in_context() */
> }
>
> In this way, we only allocate the master event when necessary, and it
> is outside of the locks.
It's still broken on -RT, because __perf_install_in_context() is in
hardirq context (IPI) and the allocator locks are spinlock_t.