[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z65BLfBlWpvp2xr4@google.com>
Date: Thu, 13 Feb 2025 10:59:57 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>, Hao Ge <gehao@...inos.cn>,
James Clark <james.clark@...aro.org>,
Howard Chu <howardchu95@...il.com>,
Dominique Martinet <asmadeus@...ewreck.org>,
Levi Yun <yeoreum.yun@....com>, Xu Yang <xu.yang_2@....com>,
Tengda Wu <wutengda@...weicloud.com>,
Yang Jihong <yangjihong1@...wei.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 00/10] Move uid filtering to BPF filters
On Thu, Feb 13, 2025 at 10:13:57AM -0800, Ian Rogers wrote:
> On Thu, Feb 13, 2025 at 9:47 AM Namhyung Kim <namhyung@...nel.org> wrote:
> >
> > On Wed, Feb 12, 2025 at 11:27:38PM -0800, Ian Rogers wrote:
> > > On Wed, Feb 12, 2025 at 5:44 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > > >
> > > > On Wed, Feb 12, 2025 at 03:17:35PM -0800, Ian Rogers wrote:
> > > > > On Wed, Feb 12, 2025 at 2:56 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > > > > >
> > > > > > On Wed, Feb 12, 2025 at 12:00:42PM -0800, Ian Rogers wrote:
> > > > > > > On Wed, Feb 12, 2025 at 10:46 AM Namhyung Kim <namhyung@...nel.org> wrote:
> > > > > > > > It's not completely broken and works sometimes.
> > > > > > >
> > > > > > > No this is the definition of completely broken. If it only works
> > > > > > > sometimes then you can't use it, we can't put a test on it, there is
> > > > > > > no point in it. Even when it doesn't fail on perf_event_open, does it
> > > > > > > work for processes that start after /proc is scanned? No, it is
> > > > > > > completely broken.
> > > > > >
> > > > > > Ok, we have a different definition for it. Let's ignore the imaginary
> > > > > > users of the broken features. Have you added a test for this change?
> > > > > >
> > > > > > Anyway I've tested your change and found some issues:
> > > > > >
> > > > > > 1. It silently exited when BPF-skel is not built. Better to put some
> > > > > > messages at least.
> > > > > >
> > > > > > $ sudo ./perf record -u $(id -u) -- sleep 1
> > > > > >
> > > > > > 2. Even with BPF-skel, perf record doesn't work well. It did something
> > > > > > but failed to get sample data for some reason.
> > > > > >
> > > > > > $ sudo ./perf record -u $(id -u) -- sleep 1
> > > > > > [ perf record: Woken up 1 times to write data ]
> > > > > > [ perf record: Captured and wrote 0.045 MB perf.data ]
> > > > > >
> > > > > > Oh, I think you now need to pass -a because it now works in
> > > > > > system-wide mode and drops samples for other users.
> > > > >
> > > > > This is a pre-existing problem, no?
> > > >
> > > > No, it worked without -a in the past. Please see my previous reply.
> > > > I think -u/--uid is one of the supported target in the perf tool (not
> > > > for the system call) and it used to disable system-wide mode if -u is
> > > > used at the same time.
> > >
> > > As I keep repeating the '-u' option has never worked in the past, it
> > > either early exited or missed recording new processes.
> >
> > I agree the early exit of existing process is a problem of '-u'. But
> > I'm not sure it would miss all new processes. IIUC '-u' is a shortcut
> > for '-p' with all processes belong to the user. That means it'll have
> > perf_event with attr.inherit set then it can track new processes from
> > them. Of course, there's a race between /proc scanning and process
> > creation so it will miss some. But even '-p' has the same race.
> >
> > >
> > > The documentation for `perf record` is:
> > > ```
> > > ...
> > > -a, --all-cpus
> > > System-wide collection from all CPUs (default if no target
> > > is specified).
> > > ...
> > > -u, --uid=
> > > Record events in threads owned by uid. Name or number.
> > > ...
> > > ```
> > > So `-a` is implied without a target/workload but you are specifying a
> > > workload and `-u`. So do you want samples from the workload or from
> > > the user's processes? I can tell from your command line being sleep
> > > that you want it to imply `-a` but would that be true if it were a
> > > benchmark? For the benchmark case you probably don't want `-a`
> > > implied. If you specify `-p` and `-u` should the processes that don't
> > > match -u be sampled or are you expecting implicit system wide
> > > profiling now?
> >
> > I don't follow. You cannot specify two targets at the same time. If
> > you use -a and -u together currently, you will see:
> >
> > $ perf record -a -u $(id -u)
> > Warning:
> > UID switch overriding SYSTEM
> >
> > And for -p and -u:
> >
> > $ perf record -p 1 -u $(id -u)
> > Warning:
> > PID/TID switch overriding UID
> >
> > And specifying a target and a workload means it will profile the target
> > while the workload is running.
> >
> > $ perf record -u $(id -u) -- sleep 1
> >
> > So the above command will profile all processes belong to me for 1
> > second. As you're doing system wide for 1 second with below command.
> >
> > $ perf record -a -- sleep 1
> >
> > You can change sleep with any other workload but it's not the target of
> > the profile unless it matches to the target specifically.
> >
> > >
> > > I agree the behavior in the patch series doesn't match the existing
> > > behavior, but I'm not sure the existing behavior agrees with the
> > > documentation nor working as expected. Having the `-u` not select
> > > system wide, as in the patch, seems to agree with the documentation
> > > better. You have specified a target workload/process/eventual pid and
> > > you want samples owned by the uid, why should you now start getting
> > > samples from other processes? With `-p` you wouldn't expect `-a` to
> > > suddenly get implied, I'm not sure it makes any more sense with any
> > > target/workload.
> >
> > As I said, the existing behavior doesn't imply system wide. But your
> > change now requires it.
>
> The existing behavior doesn't work. If we can't put a test on it, it
> doesn't work.
Ok, then I'd be happy to see a test case with this series.
>
> I'm asking what should the behavior be? You are asking to match
> something that doesn't work, why? Pointing out warnings ignores my
> question, you just seem to be out to prove you can find fault with the
> patch series as-is.
Sorry if you felt that way. I just pointed out the behavior changes in
the series. You can freely use BPF filters if the condition is met.
Otherwise it can fallback to the existing behavior or at least fail with
a message explaining the situation and hopefully how to fix it.
>
> The major use-case for `-u` I find being perf top and not perf record,
> largely because if you are trying to do perf record you are less
> tolerant to the brokeness that is the existing -u behavior. perf top
> doesn't have a workload and system wide is implied.
I agree it'd be useful to perf top more than perf record. And it will
be run as root mostly so it should be good to go. But that doesn't mean
you can leave perf record in a broken state.
>
> Should the behavior be with -u to imply system wide or should it be a
> filter to the user's other options or should we scan /proc and have
> that brokenness again? The latter has been your expressed preference
> as you don't see -u as broken.
I think it's natural to imply system-wide. And as I pointed out, it's
not (used to be) possible to mix -u with other target options.
But it now requires higher privilege and/or perf_event_paranoid.
Thanks,
Namhyung
> >
> > >
> > > > > > 3. With BPF-skel, non-root users will see this.
> > > > > >
> > > > > > $ ./perf record -u $(id -u) -- sleep 1
> > > > > > cannot get fd for 'filters' map
> > > > > > failed to set filter "BPF" on event cycles:P with 13 (Permission denied)
> > > > > >
> > > > > > I think it's confusing and better to tell user that you need to run
> > > > > > 'perf record --setup-filter pin' as root first. But maybe due to the
> > > > > > issue #2, you still need to run it as root.
> > >
> > > I think that is an okay addition to BPF filters in general. I'm wary
> > > of having patch series feature crept into requiring the entire tool
> > > being reimplemented.
> > >
> > > Thanks,
> > > Ian
Powered by blists - more mailing lists