Message-ID: <CAM9d7cjkUJ0wKi30winkDz=MKGB0Fhpp6Qnp2kSxC1eL+ZWNwA@mail.gmail.com>
Date:   Wed, 22 Feb 2023 11:42:57 -0800
From:   Namhyung Kim <namhyung@...nel.org>
To:     Jiri Olsa <olsajiri@...il.com>
Cc:     Ian Rogers <irogers@...gle.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        Song Liu <song@...nel.org>,
        Stephane Eranian <eranian@...gle.com>,
        Ravi Bangoria <ravi.bangoria@....com>,
        Leo Yan <leo.yan@...aro.org>,
        James Clark <james.clark@....com>, Hao Luo <haoluo@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-perf-users@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: [RFC/PATCHSET 0/7] perf record: Implement BPF sample filter (v1)

Hi Jiri,

On Tue, Feb 21, 2023 at 3:54 AM Jiri Olsa <olsajiri@...il.com> wrote:
>
> On Tue, Feb 14, 2023 at 10:01:41AM -0800, Namhyung Kim wrote:
> > Hi Ian,
> >
> > On Tue, Feb 14, 2023 at 8:58 AM Ian Rogers <irogers@...gle.com> wrote:
> > >
> > > On Mon, Feb 13, 2023 at 9:05 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > > >
> > > > Hello,
> > > >
> > > > There have been requests for more sophisticated perf event sample
> > > > > filtering based on the sample data.  Recently the kernel added
> > > > > support for BPF programs to access perf sample data, and this is
> > > > > the userspace part that enables such filtering.
> > > >
> > > > > This still has some rough edges and needs more improvements.  But
> > > > > I'd like to share the current work and get some feedback on the
> > > > > direction and ideas for further improvements.
> > > >
> > > > The kernel changes are in the tip.git tree (perf/core branch) for now.
> > > > perf record has --filter option to set filters on the last specified
> > > > event in the command line.  It worked only for tracepoints and Intel
> > > > PT events so far.  This patchset extends it to have 'bpf:' prefix in
> > > > order to enable the general sample filters using BPF for any events.
> > > >
> > > > A new filter expression parser was added (using flex/bison) to process
> > > > the filter string.  Right now, it only accepts very simple expressions
> > > > separated by comma.  I'd like to keep the filter expression as simple
> > > > as possible.
> > > >
> > > > > It requires samples to satisfy all the filter expressions; otherwise
> > > > > it drops the sample.  IOW filter expressions are connected with
> > > > > logical AND operations implicitly.
> > > >
> > > > Essentially the BPF filter expression is:
> > > >
> > > >   "bpf:" <term> <operator> <value> ("," <term> <operator> <value>)*
> > > >
> > > > The <term> can be one of:
> > > >   ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr,
> > > >   code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat,
> > > >   p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock,
> > > >   mem_dtlb, mem_blk, mem_hops
> > > >
> > > > The <operator> can be one of:
> > > >   ==, !=, >, >=, <, <=, &
> > > >
> > > > The <value> can be one of:
> > > >   <number> (for any term)
> > > >   na, load, store, pfetch, exec (for mem_op)
> > > >   l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl)
> > > >   na, none, hit, miss, hitm, fwd, peer (for mem_snoop)
> > > >   remote (for mem_remote)
> > > > >   na, locked (for mem_lock)
> > > >   na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb)
> > > >   na, by_data, by_addr (for mem_blk)
> > > >   hops0, hops1, hops2, hops3 (for mem_hops)
> > > >
> > > > > I plan to improve it with range expressions, e.g. for ip or addr,
> > > > > and it should support symbols like the existing addr-filters.  Also
> > > > > cgroup should understand and convert cgroup names to IDs.
>
> this seems similar to what ftrace is doing in filter_match_preds,
> I checked the code briefly and I wonder if we should be able to write
> that function logic in bpf, assuming that the filter is prepared in
> user space
>
> it might solve the 'part' data problem in generic way.. but I might be
> missing some blocker of course.. just an idea ;-)
>
> could replace the tracepoint filters.. if we actually care

I'm not sure about replacing tracepoint filters.  IIRC BPF is optional,
so tracepoints should work without it.  From BPF's perspective, it has
its own way of handling tracepoints, so there is no need to deal with
perf or event tracing (ftrace) for that.

From perf's perspective, I think it can use either the existing ftrace
filters or build a new BPF filter for each event.  But it cannot use BTF
for perf tracepoint events at least for now.  Certainly it can use RAW
sample data and parse the event format to access the fields but I'm
not sure it's worth doing that. :)
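
For reference, the "parse the event format" approach would look roughly like
the Python sketch below.  It assumes the field lines of a tracepoint format
file have the shape "field:<decl>; offset:<n>; size:<n>; signed:<0|1>;", and
it only handles scalar integer fields.  parse_format() and read_field() are
hypothetical helpers, not anything in perf.

```python
import re

# Assumed shape of one field line in a tracepoint "format" file.
FIELD_RE = re.compile(
    r"field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);"
    r"\s*size:(?P<size>\d+);\s*signed:(?P<signed>\d+);"
)


def parse_format(text):
    """Map field name -> (offset, size, signed) from the format text."""
    fields = {}
    for m in FIELD_RE.finditer(text):
        # The name is the last token of the C declaration; drop any
        # array suffix such as "[16]".
        name = m.group("decl").split()[-1].split("[")[0]
        fields[name] = (
            int(m.group("offset")),
            int(m.group("size")),
            m.group("signed") == "1",
        )
    return fields


def read_field(fields, raw, name, byteorder="little"):
    """Extract one scalar integer field from the RAW sample bytes."""
    offset, size, signed = fields[name]
    return int.from_bytes(raw[offset:offset + size], byteorder, signed=signed)
```

With a field map in hand, a filter on a tracepoint field reduces to comparing
read_field(...) against a constant, but doing the same inside a BPF program
would mean passing the offsets and sizes in from userspace.
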

Thanks,
Namhyung
