linux-kernel - Re: [PATCH v3 00/10] Move uid filtering to BPF filters

Message-ID: <CAP-5=fUEYqHLYwGWn_BUdk2rZngEVH79=96yVdQT0P+vR6=tQw@mail.gmail.com>
Date: Tue, 3 Jun 2025 17:01:59 -0700
From: Ian Rogers <irogers@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, 
	James Clark <james.clark@...aro.org>, Dapeng Mi <dapeng1.mi@...ux.intel.com>, 
	Thomas Richter <tmricht@...ux.ibm.com>, Veronika Molnarova <vmolnaro@...hat.com>, Hao Ge <gehao@...inos.cn>, 
	Howard Chu <howardchu95@...il.com>, Weilin Wang <weilin.wang@...el.com>, 
	Levi Yun <yeoreum.yun@....com>, "Dr. David Alan Gilbert" <linux@...blig.org>, 
	Dominique Martinet <asmadeus@...ewreck.org>, Xu Yang <xu.yang_2@....com>, 
	Tengda Wu <wutengda@...weicloud.com>, linux-perf-users@...r.kernel.org, 
	linux-kernel@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: [PATCH v3 00/10] Move uid filtering to BPF filters

On Tue, Jun 3, 2025 at 4:41 PM Namhyung Kim <namhyung@...nel.org> wrote:
>
> On Tue, Jun 03, 2025 at 04:22:53PM -0700, Ian Rogers wrote:
> > On Tue, Jun 3, 2025 at 3:32 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > >
> > > On Mon, Jun 02, 2025 at 11:26:12PM -0700, Ian Rogers wrote:
> > > > On Mon, Jun 2, 2025 at 9:41 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > > > >
> > > > > Hi Ian,
> > > > >
> > > > > On Tue, May 27, 2025 at 01:39:21PM -0700, Ian Rogers wrote:
> > > > > > On Fri, Apr 25, 2025 at 2:40 PM Ian Rogers <irogers@...gle.com> wrote:
> > > > > > >
> > > > > > > Rather than scanning /proc and skipping PIDs based on their UIDs, use
> > > > > > > BPF filters for uid filtering. The /proc scanning in thread_map is
> > > > > > > racy as the PID may exit before the perf_event_open causing perf to
> > > > > > > abort. BPF UID filters are more robust as they avoid the race. The
> > > > > > > /proc scanning also misses processes starting after the perf
> > > > > > > command. Add a helper for commands that support UID filtering and wire
> > > > > > > up. Remove the non-BPF UID filtering support given it doesn't work.
> > > > > > >
> > > > > > > v3: Add lengthier commit messages as requested by Arnaldo. Rebase on
> > > > > > >     tmp.perf-tools-next.
> > > > > > >
> > > > > > > v2: Add a perf record uid test (Namhyung) and force setting
> > > > > > >     system-wide for perf trace and perf record (Namhyung). Ensure the
> > > > > > >     uid filter isn't set on tracepoint evsels.
> > > > > > >
> > > > > > > v1: https://lore.kernel.org/lkml/20250111190143.1029906-1-irogers@google.com/
> > > > > >
> > > > > > Ping. Thanks,
> > > > >
> > > > > I'm ok with preferring BPF over /proc scanning, but still hesitate to
> > > > > remove it since some people don't use BPF.  Can you please drop that
> > > > > part and make parse_uid_filter() conditional on BPF?
> > > >
> > > > Hi Namhyung,
> > > >
> > > > The approach of scanning /proc fails as:
> > > > 1) processes that start after perf starts will be missed,
> > > > 2) processes that terminate between being scanned in /proc and
> > > > perf_event_open will cause perf to fail (essentially the -u option is
> > > > just sugar to scan /proc and then provide the processes as if they
> > > > were a -p option - such an approach doesn't need building into the
> > > > tool).
> > >
> > > Yeah, I remember we had this discussion before.  I think (1) is not true
> > > as perf events will be inherited to children (but there is a race).
> >
> > If you log in from another terminal? Anything that creates a new
> > process for that user but isn't inherited will be missed, which isn't
> > merely a race.
>
> As long as the other terminal is owned by the same user, any new
> process from that terminal will inherit events, no?
>
> >
> > >  And
> > > (2) is a real problem but it's also about a race and it can succeed.
> > >
> > > Maybe we could change it to skip failed events when the target is a
> > > user but that's not the direction you want.
> >
> > We could have other events and try to discover new processes via them,
> > do things like dummy events to cover races. It is just a lot of
> > complexity for something that is a trivial amount of BPF. In something
> > like 10 years nobody has bothered to fix this up.
>
> I don't want any complex solution for this.  Let's not touch this.
>
> >
> > > >
> > > > This patch series adds a test [1] and perf test has lots of processes
> > > > starting and exiting, matching condition (2) above*. If this series
> > > > were changed to an approach that uses BPF and falls back on /proc
> > > > scanning then the -u option would be broken for both reasons above but
> > > > also prove a constant source of test flakes.
> > > >
> > > > Rather than give the users something both frustrating to use (keeps
> > > > quitting due to failed opens) and broken (missing processes) I think
> > > > it is better to quit perf at that point informing the user they need
> > > > more permissions to load the BPF program. This also makes the -u
> > > > option testable.
> > > >
> > > > So the request for a change I don't think is sensible as it provides a
> > > > worse user and testing experience. There is also the cognitive load of
> > > > having the /proc scanning code in the code base, whereas the BPF
> > > > filter is largely isolated.
> > >
> > > But I think the problem is that it has different requirements - BPF and
> > > root privilege.  So it should be used after checking the requirements
> > > and fail or fallback.
> > >
> > > Does it print proper error messages if not?  With that we can deprecate
> > > the existing behavior and remove it later.
> >
> > For `perf top` with TUI you get an error message in a box of:
> > ```
> > failed to set filter "BPF" on event cpu_atom/cycles/P with 1
> > (Operation not permitted)
> > ```
> > With --stdio you get:
> > ```
> > libbpf: Error in bpf_object__probe_loading(): -EPERM. Couldn't load
> > trivial BPF program. Make sure your kernel supports BPF
> > (CONFIG_BPF_SYSCALL=y) and/or that RLIMIT_MEMLOCK is set to big enough
> > value.
> > libbpf: failed to load object 'sample_filter_bpf'
> > libbpf: failed to load BPF skeleton 'sample_filter_bpf': -EPERM
> > Failed to load perf sample-filter BPF skeleton
> > failed to set filter "BPF" on event cpu_atom/cycles/P with 1
> > (Operation not permitted)
> > ```
> > This matches the existing behavior if you put a filter on an event.
>
> But that's different, as the user directly asked for the BPF filter.
> The following message would be better (unless you fall back to the old
> behavior).
>
> "-u/--uid option is using BPF filter but perf is not built with BPF.
> Please make sure to build with libbpf and bpf skeletons."
>
> and/or
>
> "-u/--uid option is using BPF filter which requires root privilege."
>
> You may check if the filter program and map are pinned already.

I agree that these error messages would be better, but shouldn't the
existing BPF filter code also produce these more user-friendly error
messages? Once it does, we can adapt it for the case where the caller
is the '-u' option, so that the error message doesn't imply '--filter'
was used.
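
A rough sketch of that idea, for illustration only: one helper formats
the failure message and takes the origin of the filter as a parameter,
so the '-u' path and the '--filter' path share the same wording. The
names `filter_origin`, `bpf_filter_strerror` and `built_with_bpf` are
made up for this sketch and are not taken from the perf sources; the
first two message strings are the ones suggested above.
```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: which command-line option caused the BPF filter to be set. */
enum filter_origin {
	FILTER_ORIGIN_FILTER_OPT,	/* user passed --filter directly */
	FILTER_ORIGIN_UID_OPT,		/* filter created on behalf of -u/--uid */
};

/*
 * Hypothetical helper: format a user-facing message for a failed BPF
 * filter setup. 'err' is a positive errno value. 'built_with_bpf' is an
 * assumption standing in for whatever build-time check the tool uses.
 * Returns the snprintf() result.
 */
static int bpf_filter_strerror(enum filter_origin origin, bool built_with_bpf,
			       int err, char *buf, size_t size)
{
	const char *who = origin == FILTER_ORIGIN_UID_OPT ?
			  "-u/--uid option" : "--filter option";

	if (!built_with_bpf)
		return snprintf(buf, size,
				"%s is using BPF filter but perf is not built with BPF.\n"
				"Please make sure to build with libbpf and bpf skeletons.",
				who);

	if (err == EPERM || err == EACCES)
		return snprintf(buf, size,
				"%s is using BPF filter which requires root privilege.",
				who);

	/* Anything else: fall back to the generic errno description. */
	return snprintf(buf, size, "%s failed to set BPF filter: %s",
			who, strerror(err));
}
```
With something along these lines, the '-u' code would pass
FILTER_ORIGIN_UID_OPT and the existing filter code would pass
FILTER_ORIGIN_FILTER_OPT, so neither message mentions the wrong option.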

Thanks,
Ian

> Thanks,
> Namhyung
>