Message-ID: <CAP-5=fUD=RNqD-7229J5fgaUCMtNiu-urp-9B3LDq8JnP2sGBg@mail.gmail.com>
Date:   Mon, 17 Apr 2023 09:37:37 -0700
From:   Ian Rogers <irogers@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Adrian Hunter <adrian.hunter@...el.com>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 0/5] perf: Add ioctl to emit sideband events

On Mon, Apr 17, 2023 at 4:02 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Fri, Apr 14, 2023 at 11:22:55AM +0300, Adrian Hunter wrote:
> > Hi
> >
> > Here is a stab at adding an ioctl for sideband events.
> >
> > This is to overcome races when reading the same information
> > from /proc.
>
> What races? Are you talking about reading old state in /proc the kernel
> delivering a sideband event for the new state, and then you writing the
> old state out?
>
> Surely that's something perf tool can fix without kernel changes?

My reading is that during event synthesis there are races between
reading the different /proc files. I believe there is still a related
race in perf record/top with uid filtering. With uid filtering we scan
/proc to find the processes (pids) for a uid and then synthesize the
maps for each of those pids, but if a pid starts or exits in between
we either error out or don't sample that pid. I believe the error-out
behavior is easy to hit 100% of the time, making uid mode of limited
use.
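
To make the uid race concrete, here is a minimal Python sketch of the
scan-then-synthesize pattern. It is not perf's code: the function names
and the proc_root parameter are made up for illustration, and it only
shows how a pid that exits between the scan and the maps read can be
skipped rather than treated as a fatal error.

```python
import os


def pids_for_uid(uid, proc_root="/proc"):
    """Return the pids whose /proc/<pid> directory is owned by uid."""
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        try:
            if os.stat(os.path.join(proc_root, name)).st_uid == uid:
                pids.append(int(name))
        except FileNotFoundError:
            continue  # the process exited between listdir() and stat()
    return pids


def read_maps(pids, proc_root="/proc"):
    """Read maps for each pid, collecting vanished pids instead of failing."""
    maps, gone = {}, []
    for pid in pids:
        try:
            with open(os.path.join(proc_root, str(pid), "maps")) as f:
                maps[pid] = f.read().splitlines()
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            gone.append(pid)  # exited (or became unreadable) after the scan
    return maps, gone
```

Note that this only covers the "pid exits" half of the race. A pid that
starts after the /proc scan is never seen by this loop at all, so user
space alone cannot close that half this way.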

This may be for something other than synthesis, but for synthesis a
few points are:
 - as servers get bigger and consequently more jobs get consolidated
on them, synthesis is slow (hence --num-thread-synthesize) and also
the events dominate the perf.data file - perhaps >90% of the file
size, and a lot of that will be for processes with no samples in them.
Another issue here is that all those file descriptors don't come for
free in the kernel.
 - BPF has buildid+offset stack traces that remove the need for
synthesis at the cost of more expensive stack generation. I believe
this is unpopular because adding it as a variant for every kind of
event would be hard, but perhaps we can pick off some low-hanging
fruit like instructions and cycles.
 - I believe Jiri looked at doing synthesis with BPF. Perhaps we could
do something similar to the off-cpu and tail-synthesize, where more
things happen at the tail end of perf. Off-cpu records data in maps
that it then synthesizes into samples.
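
As a rough illustration of the tail-end idea, the sketch below defers
synthesis until after sampling and emits mmap-style events only for pids
that actually have samples. Everything here is an assumption made for
illustration: the function name, the tuple layouts, and the premise that
map information was recorded somewhere (for example in BPF maps) during
the run. It is not how perf or the off-cpu code is implemented.

```python
def synthesize_at_tail(samples, recorded_maps):
    """Build synthesized events only for pids that were sampled.

    samples:       iterable of (pid, ip) tuples collected during the run
    recorded_maps: {pid: [(start, end, path), ...]} captured during the run
    """
    sampled_pids = {pid for pid, _ip in samples}
    events = []
    for pid in sorted(sampled_pids):
        # pids without samples never reach this loop, so they add
        # nothing to the output
        for start, end, path in recorded_maps.get(pid, []):
            events.append(("MMAP", pid, start, end, path))
    return events
```

With one sampled pid and one idle pid in recorded_maps, only the sampled
pid's mappings end up in the returned list.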

There is also a long-standing issue around not sampling munmap (or
mremap) that causes plenty of problems. Perhaps if we had fewer mmap
events in the perf.data file we could add these.

Thanks,
Ian
