lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aAeQcgmL-iqGbG_g@gmail.com>
Date: Tue, 22 Apr 2025 14:49:54 +0200
From: Ingo Molnar <mingo@...nel.org>
To: James Clark <james.clark@...aro.org>
Cc: Yabin Cui <yabinc@...gle.com>, coresight@...ts.linaro.org,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	linux-perf-users@...r.kernel.org,
	Suzuki K Poulose <suzuki.poulose@....com>,
	Mike Leach <mike.leach@...aro.org>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Mark Rutland <mark.rutland@....com>, Jiri Olsa <jolsa@...nel.org>,
	Ian Rogers <irogers@...gle.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Liang Kan <kan.liang@...ux.intel.com>
Subject: Re: [PATCH 1/2] perf: Allow non-contiguous AUX buffer pages via PMU
 capability


* James Clark <james.clark@...aro.org> wrote:

> 
> 
> On 21/04/2025 10:58 pm, Yabin Cui wrote:
> > For PMUs like ARM ETM/ETE, contiguous AUX buffers are unnecessary
> > and increase memory fragmentation.
> > 
> > This patch introduces PERF_PMU_CAP_AUX_NON_CONTIGUOUS_PAGES, allowing
> > PMUs to request non-contiguous pages for their AUX buffers.
> > 
> > Signed-off-by: Yabin Cui <yabinc@...gle.com>
> > ---
> >   include/linux/perf_event.h  | 1 +
> >   kernel/events/ring_buffer.c | 6 ++++++
> >   2 files changed, 7 insertions(+)
> > 
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 0069ba6866a4..26ca35d6a9f2 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -301,6 +301,7 @@ struct perf_event_pmu_context;
> >   #define PERF_PMU_CAP_AUX_OUTPUT			0x0080
> >   #define PERF_PMU_CAP_EXTENDED_HW_TYPE		0x0100
> >   #define PERF_PMU_CAP_AUX_PAUSE			0x0200
> > +#define PERF_PMU_CAP_AUX_NON_CONTIGUOUS_PAGES	0x0400
> >   /**
> >    * pmu::scope
> > diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> > index 5130b119d0ae..87f42f4e8edc 100644
> > --- a/kernel/events/ring_buffer.c
> > +++ b/kernel/events/ring_buffer.c
> > @@ -710,6 +710,12 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event,
> >   		max_order = ilog2(nr_pages);
> >   		watermark = 0;
> >   	}
> > +	/*
> > +	 * When the PMU doesn't prefer contiguous AUX buffer pages, favor
> > +	 * low-order allocations to reduce memory fragmentation.
> > +	 */
> > +	if (event->pmu->capabilities & PERF_PMU_CAP_AUX_NON_CONTIGUOUS_PAGES)
> > +		max_order = 0;
> >   	/*
> >   	 * kcalloc_node() is unable to allocate buffer if the size is larger
> 
> Hi Yabin,
> 
> I was wondering if this is just the opposite of 
> PERF_PMU_CAP_AUX_NO_SG, and that order 0 should be used by default 
> for all devices to solve the issue you describe. Because we already 
> have PERF_PMU_CAP_AUX_NO_SG for devices that need contiguous pages. 
> Then I found commit 5768402fd9c6 ("perf/ring_buffer: Use high order 
> allocations for AUX buffers optimistically") that explains that the 
> current allocation strategy is an optimization.
> 
> Your change seems to decide that for certain devices we want to 
> optimize for fragmentation rather than performance. If these are 
> rarely used features specifically when looking at performance should 
> we not continue to optimize for performance? Or at least make it user 
> configurable?

So there seems to be 3 categories:

 - 1) Must have physically contiguous AUX buffers, it's a hardware ABI. 
      (PERF_PMU_CAP_AUX_NO_SG for Intel BTS and PT.)

 - 2) Would be nice to have continguous AUX buffers, for a bit more 
      performance.

 - 3) Doesn't really care.

So we do have #1, and it appears Yabin's usecase is #3?

I strongly suspect that #2 and #3 are mostly the same in practice, and 
that we don't really need a lot of differentiation and complexity here, 
just the AUX_NO_SG flag that must have a max-order allocation - all 
other cases should allocate the AUX buffer in a default-nice, 
MM-friendly way.

Thanks,

	Ingo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ