Date:   Tue, 19 Nov 2019 12:06:41 +0100
From:   Jiri Olsa <jolsa@...hat.com>
To:     Arnaldo Carvalho de Melo <acme@...nel.org>
Cc:     Jiri Olsa <jolsa@...nel.org>, lkml <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Michael Petlan <mpetlan@...hat.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Stephane Eranian <eranian@...gle.com>,
        Jin Yao <yao.jin@...ux.intel.com>,
        Alexey Budankov <alexey.budankov@...ux.intel.com>
Subject: Re: [PATCHv2 0/2] perf tools: Share struct map after clone

On Mon, Nov 18, 2019 at 06:48:51PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Nov 18, 2019 at 01:14:00PM +0100, Jiri Olsa escreveu:
> > On Tue, Oct 29, 2019 at 09:58:55PM +0100, Jiri Olsa wrote:
> > > > > 
> > > > > Also available in here:
> > > > >   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> > > > >   perf/map_shared
> 
> > > > I rebased to latest perf/core and pushed the branch out
> 
> > > rebased and pushed out
>  
> > heya,
> > I lost track of this.. what's the status, are you going with your
> > version, or is this one still in? I don't see any of them in latest
> > code..
> 
> So, I'm still working on and off on this, current status is at:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf/map_share
> 
> It's just one patch more than perf/core, the one that does the sharing.
> 
> The thing is, as I'm going over all the fields in 'struct map', it seems
> that we'll end up with just one cacheline per instance, as there are
> things there that are not strictly related to a map, but to a map_group
> (unmap_ip/map_ip), or to a dso (maj, min, ino, ino_generation), and some
> need less than what is allocated to them.
> 
> Current status is:
> 
> [root@...co ~]# pahole -C map ~acme/bin/perf
> struct map {
> 	union {
> 		struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
> 		struct list_head node;                   /*     0    16 */
> 	} __attribute__((__aligned__(8)));                                               /*     0    24 */
> 	u64                        start;                /*    24     8 */
> 	u64                        end;                  /*    32     8 */
> 	_Bool                      erange_warned:1;      /*    40: 0  1 */
> 	_Bool                      priv:1;               /*    40: 1  1 */
> 
> 	/* XXX 6 bits hole, try to pack */
> 	/* XXX 3 bytes hole, try to pack */
> 
> 	u32                        prot;                 /*    44     4 */
> 	u64                        pgoff;                /*    48     8 */
> 	u64                        reloc;                /*    56     8 */
> 	/* --- cacheline 1 boundary (64 bytes) --- */
> 	u64                        (*map_ip)(struct map *, u64); /*    64     8 */
> 	u64                        (*unmap_ip)(struct map *, u64); /*    72     8 */
> 	struct dso *               dso;                  /*    80     8 */
> 	refcount_t                 refcnt;               /*    88     4 */
> 	u32                        flags;                /*    92     4 */
> 
> 	/* size: 96, cachelines: 2, members: 13 */
> 	/* sum members: 92, holes: 1, sum holes: 3 */
> 	/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
> 	/* forced alignments: 1 */
> 	/* last cacheline: 32 bytes */
> } __attribute__((__aligned__(8)));
> [root@...co ~]#
> 
> This is with the tentative move of maj/min/ino/ino_generation to 'struct
> dso', but that needs more work to match the sort order that touches it
> "dcacheline", i.e. a map that comes with the same backing DSO but
> different values for those fields is not the same DSO, right?
> 
> Right now with moving the maj/min/etc to dso, in the map_share patch we
> get the structure used to keep shared entries in the rb_tree at 40
> bytes, under one cacheline, while the full 'struct map' is 32 bytes more
> than one cacheline, so still good for sharing:
> 
> [acme@...co perf]$ pahole -C map_node ~/bin/perf
> struct map_node {
> 	union {
> 		struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
> 		struct list_head node;                   /*     0    16 */
> 	} __attribute__((__aligned__(8)));               /*     0    24 */
> 	refcount_t                 refcnt;               /*    24     4 */
> 	_Bool                      is_node:1;            /*    28: 0  1 */
> 
> 	/* XXX 7 bits hole, try to pack */
> 	/* XXX 3 bytes hole, try to pack */
> 
> 	struct map *               map;                  /*    32     8 */
> 
> 	/* size: 40, cachelines: 1, members: 4 */
> 	/* sum members: 36, holes: 1, sum holes: 3 */
> 	/* sum bitfield members: 1 bits, bit holes: 1, sum bit holes: 7 bits */
> 	/* forced alignments: 1 */
> 	/* last cacheline: 40 bytes */
> } __attribute__((__aligned__(8)));
> [acme@...co perf]$ pahole -C map ~/bin/perf
> struct map {
> 	union {
> 		struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
> 		struct list_head node;                   /*     0    16 */
> 	} __attribute__((__aligned__(8)));               /*     0    24 */
> 	refcount_t                 refcnt;               /*    24     4 */
> 	_Bool                      is_node:1;            /*    28: 0  1 */
> 	_Bool                      erange_warned:1;      /*    28: 1  1 */
> 	_Bool                      priv:1;               /*    28: 2  1 */
> 
> 	/* XXX 5 bits hole, try to pack */
> 	/* XXX 3 bytes hole, try to pack */
> 
> 	u64                        start;                /*    32     8 */
> 	u64                        end;                  /*    40     8 */
> 	u64                        pgoff;                /*    48     8 */
> 	u64                        reloc;                /*    56     8 */
> 	/* --- cacheline 1 boundary (64 bytes) --- */
> 	u64                        (*map_ip)(struct map *, u64); /*    64     8 */
> 	u64                        (*unmap_ip)(struct map *, u64); /*    72     8 */
> 	struct dso *               dso;                  /*    80     8 */
> 	u32                        flags;                /*    88     4 */
> 	u32                        prot;                 /*    92     4 */
> 
> 	/* size: 96, cachelines: 2, members: 14 */
> 	/* sum members: 92, holes: 1, sum holes: 3 */
> 	/* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 5 bits */
> 	/* forced alignments: 1 */
> 	/* last cacheline: 32 bytes */
> } __attribute__((__aligned__(8)));
> [acme@...co perf]$
> 
> So give me some more time, please :-)

sure ;-) I just did not want to lose track of this

thanks,
jirka
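
For readers who want to reproduce the quoted layouts without a perf build, here is a minimal sketch of the two structures from the second pair of pahole dumps. It is not the perf source: the rb_node, list_head and refcount_t definitions are simplified stand-ins, and the numbers assume an LP64 target (8-byte pointers and long) built with GCC or Clang. The static assertions only check the sizes and offsets that pahole reported above.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

typedef uint64_t u64;
typedef uint32_t u32;

/* Simplified stand-ins (assumption): same sizes as the real types on LP64. */
struct rb_node {
	unsigned long   rb_parent_color;
	struct rb_node *rb_right;
	struct rb_node *rb_left;
} __attribute__((__aligned__(8)));

struct list_head { struct list_head *next, *prev; };
typedef struct { int counter; } refcount_t;

struct dso;   /* opaque here, only used through a pointer */
struct map;

/* Small entry kept in the rb_tree, pointing at the shared map. */
struct map_node {
	union {
		struct rb_node   rb_node;
		struct list_head node;
	};
	refcount_t  refcnt;
	_Bool       is_node:1;
	struct map *map;
};

/* Full map, same header as map_node followed by the map data. */
struct map {
	union {
		struct rb_node   rb_node;
		struct list_head node;
	};
	refcount_t  refcnt;
	_Bool       is_node:1;
	_Bool       erange_warned:1;
	_Bool       priv:1;
	u64         start;
	u64         end;
	u64         pgoff;
	u64         reloc;
	u64       (*map_ip)(struct map *, u64);
	u64       (*unmap_ip)(struct map *, u64);
	struct dso *dso;
	u32         flags;
	u32         prot;
};

/* Values taken from the pahole output quoted in this message. */
static_assert(sizeof(struct map_node) == 40, "map_node fits in one cacheline");
static_assert(offsetof(struct map_node, map) == 32, "3-byte hole before map");
static_assert(sizeof(struct map) == 96, "map is 32 bytes past one cacheline");
static_assert(offsetof(struct map, start) == 32, "start follows the bitfield hole");
static_assert(offsetof(struct map, map_ip) == 64, "map_ip starts cacheline 1");
```

Compiling this as a translation unit (for example with `cc -std=c11 -c`) is enough to run the checks; a failure would mean the target ABI differs from the one assumed here.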
