Message-ID: <CAM9d7ciZQVkfFhaEZ8aNyTXUVhuVtATtpXt1t6Ya2Wazgwr-vg@mail.gmail.com>
Date: Sun, 9 May 2021 00:13:48 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...nel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>,
Stephane Eranian <eranian@...gle.com>,
Andi Kleen <ak@...ux.intel.com>,
Ian Rogers <irogers@...gle.com>,
Song Liu <songliubraving@...com>, Tejun Heo <tj@...nel.org>,
kernel test robot <lkp@...el.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v3 1/2] perf/core: Share an event with multiple cgroups
Hi Peter,
Thinking about the interface a bit more...
On Fri, Apr 16, 2021 at 4:59 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Fri, Apr 16, 2021 at 08:22:38PM +0900, Namhyung Kim wrote:
> > On Fri, Apr 16, 2021 at 7:28 PM Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > On Fri, Apr 16, 2021 at 11:29:30AM +0200, Peter Zijlstra wrote:
> > >
> > > > > So I think we've had proposals for being able to close fds in the past;
> > > > > while preserving groups etc. We've always pushed back on that because of
> > > > > the resource limit issue. By having each counter be a filedesc we get a
> > > > > natural limit on the amount of resources you can consume. And in that
> > > > > respect, having to use 400k fds is things working as designed.
> > > > >
> > > > > Anyway, there might be a way around this..
> > >
> > > So how about we flip the whole thing sideways, instead of doing one
> > > event for multiple cgroups, do an event for multiple-cpus.
> > >
> > > Basically, allow:
> > >
> > > perf_event_open(.pid=fd, cpu=-1, .flag=PID_CGROUP);
> > >
> > > Which would have the kernel create nr_cpus events [the corollary is that
> > > we'd probably also allow: (.pid=-1, cpu=-1) ].
> >
> > Do you mean it'd have separate perf_events per cpu internally?
> > From a cpu's perspective, there's nothing changed, right?
> > Then it will have the same performance problem as of now.
>
> Yes, but we'll not end up in ioctl() hell. The interface is sooo much
> better. The performance thing just means we need to think harder.
I'd like to start with vector support for cgroups only, in a way that
can be extended to other types later. The event would be opened with a
flag saying that it accepts a vector:
fd = perf_event_open(.pid=-1, .cpu=N, .flag=VECTOR);
It would still need an additional interface (probably an ioctl) to
set (or append to) the vector:
ioctl(fd, ADD_VECTOR, { .type = VEC_CGROUP, .nr = N, ... });
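To make the ioctl argument a bit more concrete, here is a rough
userspace sketch of how it might be laid out and filled in. Nothing
here exists in the perf UAPI today: struct perf_vec_arg,
PERF_VEC_CGROUP and perf_vec_alloc_cgroups() are names made up for
illustration, and carrying one cgroup directory fd per entry is an
assumption.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical: neither this enum nor the struct is part of the perf UAPI. */
enum perf_vec_type {
	PERF_VEC_CGROUP = 1,
};

/* Hypothetical argument for the proposed ADD_VECTOR ioctl. */
struct perf_vec_arg {
	uint32_t type;       /* meaning of the entries, e.g. PERF_VEC_CGROUP */
	uint32_t nr;         /* number of entries that follow */
	uint64_t entries[];  /* assumed: one cgroup directory fd per entry */
};

/*
 * Build a vector argument from an array of cgroup fds.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct perf_vec_arg *perf_vec_alloc_cgroups(const int *cgrp_fds,
						   uint32_t nr)
{
	struct perf_vec_arg *vec;
	uint32_t i;

	vec = malloc(sizeof(*vec) + (size_t)nr * sizeof(vec->entries[0]));
	if (!vec)
		return NULL;

	vec->type = PERF_VEC_CGROUP;
	vec->nr = nr;
	for (i = 0; i < nr; i++)
		vec->entries[i] = (uint64_t)cgrp_fds[i];

	return vec;
}
```

The result would then be handed to the proposed ioctl on the event fd.
Using a type plus a count followed by a flexible array keeps the same
layout usable for vector types other than cgroups later on.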
Maybe we also need to add FORMAT_VECTOR and use read(v) or friends
to read the contents of each entry. It would be nice if each entry
could carry vector-specific info, such as the cgroup id in this case.
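As an illustration only, one possible FORMAT_VECTOR read layout would
be a u64 entry count followed by { value, id } pairs, where id is the
cgroup id. The layout, struct perf_vec_read_entry and
perf_vec_read_lookup() below are all assumptions for the sake of
discussion, not an existing format. A userspace helper that picks one
cgroup's count out of such a buffer might look like this:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-entry record in a FORMAT_VECTOR read buffer. */
struct perf_vec_read_entry {
	uint64_t value;  /* counter value for this entry */
	uint64_t id;     /* vector-specific info, assumed to be the cgroup id */
};

/*
 * Assumed buffer layout: u64 nr, then nr perf_vec_read_entry records.
 * Returns 0 and stores the count in *value if id is found,
 * -1 if the buffer is too short or the id is not present.
 */
static int perf_vec_read_lookup(const void *buf, size_t len,
				uint64_t id, uint64_t *value)
{
	const unsigned char *p = buf;
	struct perf_vec_read_entry ent;
	uint64_t nr, i;

	if (len < sizeof(nr))
		return -1;
	memcpy(&nr, p, sizeof(nr));
	p += sizeof(nr);
	len -= sizeof(nr);

	/* Reject a count that claims more entries than the buffer holds. */
	if (nr > len / sizeof(ent))
		return -1;

	for (i = 0; i < nr; i++) {
		memcpy(&ent, p + i * sizeof(ent), sizeof(ent));
		if (ent.id == id) {
			*value = ent.value;
			return 0;
		}
	}
	return -1;
}
```

Putting the id next to each value means the reader does not have to
rely on the order in which entries were added to the vector.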
What do you think?
Thanks,
Namhyung