Message-ID: <CAP-5=fVQNDtKUWVgWPJ8i0+8G7+CZg-FEPWTsW6kcvSdLg2v3w@mail.gmail.com>
Date: Mon, 9 Dec 2024 13:47:53 -0800
From: Ian Rogers <irogers@...gle.com>
To: David Laight <David.Laight@...lab.com>
Cc: Leo Yan <leo.yan@....com>, Peter Zijlstra <peterz@...radead.org>, 
	Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, 
	James Clark <james.clark@...aro.org>, Kyle Meyer <kyle.meyer@....com>, 
	Ben Gainey <ben.gainey@....com>, 
	"linux-perf-users@...r.kernel.org" <linux-perf-users@...r.kernel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1 1/8] perf: Increase MAX_NR_CPUS to 4096

On Mon, Dec 9, 2024 at 1:36 PM David Laight <David.Laight@...lab.com> wrote:
>
> ..
> > > > Just changing the int to be a s16 would lower the memory overhead,
> > > > which is why I'd kind of like the abstraction to be minimal.
> > >
> > > Here I am not clear on what "changing the int to be a s16" means.
> > > Could you elaborate a bit on this?
> >
> > I meant this :-)
> > https://lore.kernel.org/lkml/20241207052133.102829-1-irogers@google.com/
>
> How many times is this allocated?
> If it is 2 bytes in a larger structure it is likely to be noise.
> For a local the code is likely to be worse.
> Any maths and you start forcing the compiler to mask the value
> (on pretty much anything except x86).

The data structure is a sorted array of ints; the patch changes the
elements to int16s. On the 32-socket GNR with > 2048 logical CPUs, the
array for all online CPUs would be over 8 KB before and over 4 KB
after. On my more modest desktop with 72 logical cores, the size goes
from 288 bytes down to 144, a reduction of 2 cache lines. I'm not super
excited about the memory savings, but the patch is only an 8-line diff.
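
The arithmetic above can be checked with a small sketch. This is not code from the patch: the helper names (cpu_array_bytes, cache_lines, cache_lines_saved) are hypothetical, the 64-byte cache line is an assumption that holds on typical x86 parts, and the array is assumed to start on a cache-line boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is typical on x86 but not universal. */
#define CACHE_LINE_BYTES 64

/* Bytes needed for a flat array of nr_cpus entries of elem_size bytes each. */
static size_t cpu_array_bytes(size_t nr_cpus, size_t elem_size)
{
	return nr_cpus * elem_size;
}

/* Cache lines covered by a buffer of this size, assuming it is line-aligned. */
static size_t cache_lines(size_t bytes)
{
	return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}

/* Cache lines saved by storing CPU numbers as int16_t instead of int32_t. */
static size_t cache_lines_saved(size_t nr_cpus)
{
	size_t before, after;

	/* Every CPU number (0 .. nr_cpus - 1) must fit in an int16_t. */
	assert(nr_cpus <= (size_t)INT16_MAX + 1);

	before = cpu_array_bytes(nr_cpus, sizeof(int32_t));
	after = cpu_array_bytes(nr_cpus, sizeof(int16_t));
	return cache_lines(before) - cache_lines(after);
}
```

For 72 CPUs, cpu_array_bytes() gives 288 bytes with 4-byte elements and 144 bytes with 2-byte elements, which is 5 cache lines versus 3 under the assumptions above. For 2048 CPUs the same helpers give 8192 and 4096 bytes.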

Thanks,
Ian
