Date: Mon, 12 Feb 2024 12:26:12 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Song Liu <song@...nel.org>, 
	Colin Ian King <colin.i.king@...il.com>, Liam Howlett <liam.howlett@...cle.com>, 
	K Prateek Nayak <kprateek.nayak@....com>, Artem Savkov <asavkov@...hat.com>, 
	Changbin Du <changbin.du@...wei.com>, Masami Hiramatsu <mhiramat@...nel.org>, 
	Athira Rajeev <atrajeev@...ux.vnet.ibm.com>, Alexey Dobriyan <adobriyan@...il.com>, 
	James Clark <james.clark@....com>, Vincent Whitchurch <vincent.whitchurch@...s.com>, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org, 
	bpf@...r.kernel.org, leo.yan@...ux.dev
Subject: Re: [PATCH v3 1/6] perf maps: Switch from rbtree to lazily sorted
 array for addresses

On Mon, Feb 12, 2024 at 12:19 PM Ian Rogers <irogers@...gle.com> wrote:
>
> On Mon, Feb 12, 2024 at 12:15 PM Namhyung Kim <namhyung@...nel.org> wrote:
> >
> > On Fri, Feb 9, 2024 at 7:18 PM Ian Rogers <irogers@...gle.com> wrote:
> > >
> > > Maps is a collection of maps primarily sorted by the starting address
> > > of the map. Prior to this change the maps were held in an rbtree
> > > requiring 4 pointers per node. Prior to reference count checking, the
> > > rbnode was embedded in the map, so 3 pointers per node were
> > > necessary. This change switches the rbtree to an array lazily sorted
> > > by address, much like the existing array sorting nodes by name. Only
> > > 1 pointer is needed per node, but to avoid excessive resizing the
> > > backing array may be twice the number of used elements, meaning the
> > > memory overhead is roughly half that of the rbtree. For a perf record
> > > of 'true' with "--no-bpf-event -g -a", the memory overhead of perf
> > > inject is reduced from 3.3MB to 3MB, so 10%, or 300KB, is saved.
> > >
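
[Editorial note: the rough per-node arithmetic behind the memory claim
above, assuming 8-byte pointers: an rbtree node costs 4 pointers x 8 =
32 bytes per map, while an array entry costs 1 pointer, with the
backing array holding at most 2x the number of used elements, i.e. at
most 2 x 8 = 16 bytes per map, which is roughly half.]
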
> > > Map inserts always happen at the end of the array. The code tracks
> > > whether the insertion violates the sorting property. O(log n) rb-tree
> > > complexity is switched to O(1).
> > >
> > > A remove slides the remaining elements of the array down, so the
> > > O(log n) rbtree complexity degrades to O(n).
> > >
> > > A find may need to sort the array using qsort, which is O(n*log n),
> > > but in general the maps should already be sorted, so average
> > > performance should be O(log n), as with the rbtree.
> > >
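
[Editorial sketch: a minimal, self-contained illustration of the lazily
sorted array idea described above: appends go at the end in O(1) and may
clear a "sorted" flag, removal slides the tail in O(n), and a find sorts
on demand with qsort before a binary search. The names (struct
lazy_array, la_insert, la_find, la_remove) are hypothetical and are not
the perf maps API.]

#include <stdlib.h>
#include <string.h>

struct entry { unsigned long start, end; };	/* half-open [start, end) */

struct lazy_array {
	struct entry *data;
	size_t nr, cap;
	int sorted;		/* cleared when an append breaks the order */
};

static int cmp_start(const void *a, const void *b)
{
	const struct entry *ea = a, *eb = b;

	return ea->start < eb->start ? -1 : ea->start > eb->start ? 1 : 0;
}

/* O(1) amortized: append at the end, doubling the backing array when full. */
int la_insert(struct lazy_array *la, struct entry e)
{
	if (la->nr == la->cap) {
		size_t cap = la->cap ? la->cap * 2 : 4;
		struct entry *d = realloc(la->data, cap * sizeof(*d));

		if (!d)
			return -1;
		la->data = d;
		la->cap = cap;
	}
	if (la->nr && la->data[la->nr - 1].start > e.start)
		la->sorted = 0;	/* sorting property violated */
	la->data[la->nr++] = e;
	return 0;
}

/*
 * Sort lazily, then binary search by address (assuming non-overlapping
 * entries): O(n*log n) when a sort is needed, O(log n) when the array
 * is already sorted.
 */
struct entry *la_find(struct lazy_array *la, unsigned long addr)
{
	size_t lo = 0, hi = la->nr;

	if (!la->sorted) {
		qsort(la->data, la->nr, sizeof(*la->data), cmp_start);
		la->sorted = 1;
	}
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < la->data[mid].start)
			hi = mid;
		else if (addr >= la->data[mid].end)
			lo = mid + 1;
		else
			return &la->data[mid];
	}
	return NULL;
}

/* O(n): removal slides the tail of the array down by one slot. */
void la_remove(struct lazy_array *la, size_t idx)
{
	memmove(&la->data[idx], &la->data[idx + 1],
		(la->nr - idx - 1) * sizeof(*la->data));
	la->nr--;
}
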
> > > An rbtree node consumes a cache line, but with the array 4 nodes fit
> > > on a cache line. Iteration is simplified to scanning an array rather
> > > than pointer chasing.
> > >
> > > Overall, performance after the change is expected to be comparable
> > > to before, but with half of the memory consumed.
> > >
> > > To avoid a list and repeated logic around splitting maps,
> > > maps__merge_in is rewritten in terms of
> > > maps__fixup_overlap_and_insert. maps__merge_in splits the given
> > > mapping, inserting the remaining gaps. maps__fixup_overlap_and_insert
> > > splits the existing mappings, then adds the incoming mapping. By
> > > adding the new mapping first, then re-inserting the existing
> > > mappings, the splitting behavior matches.
> > >
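
[Editorial sketch: the general overlap-splitting step mentioned above,
with hypothetical names; when an incoming mapping overlaps an existing
one, the existing mapping is split into the pieces falling before and
after the incoming range. This is an illustration only, not the actual
maps__fixup_overlap_and_insert implementation.]

/* A mapping covering the half-open address range [start, end). */
struct mapping { unsigned long start, end; };

/*
 * Split an existing mapping around an incoming one: the surviving
 * pieces (at most two) are written to *before and *after; a piece
 * whose start >= end is empty and should be discarded by the caller.
 */
void split_overlap(struct mapping existing, struct mapping incoming,
		   struct mapping *before, struct mapping *after)
{
	before->start = existing.start;
	before->end = existing.end < incoming.start ? existing.end
						    : incoming.start;
	after->start = existing.start > incoming.end ? existing.start
						     : incoming.end;
	after->end = existing.end;
}

In the rewritten flow described above, the incoming mapping is added
first and the surviving pieces of each overlapped existing mapping are
then re-inserted, which matches maps__merge_in's splitting behavior.
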
> > > Signed-off-by: Ian Rogers <irogers@...gle.com>
> > > Acked-by: Namhyung Kim <namhyung@...nel.org>
> > > ---
> > [SNIP]
> > >  int maps__for_each_map(struct maps *maps, int (*cb)(struct map *map, void *data), void *data)
> > >  {
> > > -       struct map_rb_node *pos;
> > > +       bool done = false;
> > >         int ret = 0;
> > >
> > > -       down_read(maps__lock(maps));
> > > -       maps__for_each_entry(maps, pos) {
> > > -               ret = cb(pos->map, data);
> > > -               if (ret)
> > > -                       break;
> > > +       /* See locking/sorting note. */
> > > +       while (!done) {
> > > +               down_read(maps__lock(maps));
> > > +               if (maps__maps_by_address_sorted(maps)) {
> > > +                       /*
> > > +                        * maps__for_each_map callbacks may buggily/unsafely
> > > +                        * insert into maps_by_address. Deliberately reload
> > > +                        * maps__nr_maps and maps_by_address on each iteration
> > > +                        * to avoid using memory freed by maps__insert growing
> > > +                        * the array - this may cause maps to be skipped or
> > > +                        * repeated.
> > > +                        */
> > > +                       for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> > > +                               struct map **maps_by_address = maps__maps_by_address(maps);
> >
> > Any chance they can be moved out of the loop?  I guess not, as they
> > are not marked as const/pure functions...
>
> They can't be, because the cb(...) call below will potentially modify
> maps_by_address by inserting maps and reallocating the array. Having
> it outside the loop was what caused the original bug.

Oh, I meant whether the compiler can move them automatically.

Thanks,
Namhyung

> >
> >
> > > +                               struct map *map = maps_by_address[i];
> > > +
> > > +                               ret = cb(map, data);
> > > +                               if (ret)
> > > +                                       break;
> > > +                       }
> > > +                       done = true;
> > > +               }
> > > +               up_read(maps__lock(maps));
> > > +               if (!done)
> > > +                       maps__sort_by_address(maps);
> > >         }
> > > -       up_read(maps__lock(maps));
> > >         return ret;
> > >  }
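
[Editorial sketch: why the hunk above deliberately re-reads
maps__nr_maps() and maps__maps_by_address() on every iteration. If the
callback inserts into the array, the backing storage may be
reallocated, so a pointer hoisted out of the loop can dangle. This is a
generic illustration with hypothetical names (struct parray,
parray_append), not the perf code itself.]

#include <stdlib.h>

struct parray {
	void **items;
	size_t nr, cap;
};

/* May call realloc(), invalidating any previously saved items pointer. */
int parray_append(struct parray *a, void *item)
{
	if (a->nr == a->cap) {
		size_t cap = a->cap ? a->cap * 2 : 4;
		void **items = realloc(a->items, cap * sizeof(*items));

		if (!items)
			return -1;
		a->items = items;
		a->cap = cap;
	}
	a->items[a->nr++] = item;
	return 0;
}

/*
 * BUGGY: hoisting the items pointer out of the loop reads memory that
 * may have been freed if the callback appended and triggered a realloc.
 */
void for_each_buggy(struct parray *a, void (*cb)(struct parray *, void *))
{
	void **items = a->items;	/* stale after any realloc in cb() */

	for (size_t i = 0; i < a->nr; i++)
		cb(a, items[i]);	/* potential use after free */
}

/* SAFER: re-read the pointer and length on every iteration instead. */
void for_each_safe(struct parray *a, void (*cb)(struct parray *, void *))
{
	for (size_t i = 0; i < a->nr; i++)
		cb(a, a->items[i]);
}

As the comment in the quoted hunk notes, reloading only avoids touching
freed memory; a callback that mutates the array may still cause entries
to be skipped or visited twice.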
