lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 1 Feb 2024 18:48:27 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Nick Terrell <terrelln@...com>, 
	Kan Liang <kan.liang@...ux.intel.com>, Andi Kleen <ak@...ux.intel.com>, 
	Kajol Jain <kjain@...ux.ibm.com>, Athira Rajeev <atrajeev@...ux.vnet.ibm.com>, 
	Huacai Chen <chenhuacai@...nel.org>, Masami Hiramatsu <mhiramat@...nel.org>, 
	Vincent Whitchurch <vincent.whitchurch@...s.com>, "Steinar H. Gunderson" <sesse@...gle.com>, 
	Liam Howlett <liam.howlett@...cle.com>, Miguel Ojeda <ojeda@...nel.org>, 
	Colin Ian King <colin.i.king@...il.com>, Dmitrii Dolgov <9erthalion6@...il.com>, 
	Yang Jihong <yangjihong1@...wei.com>, Ming Wang <wangming01@...ngson.cn>, 
	James Clark <james.clark@....com>, K Prateek Nayak <kprateek.nayak@....com>, 
	Sean Christopherson <seanjc@...gle.com>, Leo Yan <leo.yan@...aro.org>, 
	Ravi Bangoria <ravi.bangoria@....com>, German Gomez <german.gomez@....com>, 
	Changbin Du <changbin.du@...wei.com>, Paolo Bonzini <pbonzini@...hat.com>, Li Dong <lidong@...o.com>, 
	Sandipan Das <sandipan.das@....com>, liuwenyu <liuwenyu7@...wei.com>, 
	linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org, 
	Guilherme Amadio <amadio@...too.org>
Subject: Re: [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted
 array for addresses

Hi Ian,

On Tue, Jan 2, 2024 at 9:07 PM Ian Rogers <irogers@...gle.com> wrote:
>
> Maps is a collection of maps primarily sorted by the starting address
> of the map. Prior to this change the maps were held in an rbtree
> requiring 4 pointers per node. Prior to reference count checking, the
> rbnode was embedded in the map so 3 pointers per node were
> necessary. This change switches the rbtree to an array lazily sorted
> by address, much as the array sorting nodes by name. 1 pointer is
> needed per node, but to avoid excessive resizing the backing array may
> be twice the number of used elements. Meaning the memory overhead is
> roughly half that of the rbtree. For a perf record with
> "--no-bpf-event -g -a" of true, the memory overhead of perf inject is
> reduce fom 3.3MB to 3MB, so 10% or 300KB is saved.
>
> Map inserts always happen at the end of the array. The code tracks
> whether the insertion violates the sorting property. O(log n) rb-tree
> complexity is switched to O(1).
>
> Remove slides the array, so O(log n) rb-tree complexity is degraded to
> O(n).
>
> A find may need to sort the array using qsort which is O(n*log n), but
> in general the maps should be sorted and so average performance should
> be O(log n) as with the rbtree.
>
> An rbtree node consumes a cache line, but with the array 4 nodes fit
> on a cache line. Iteration is simplified to scanning an array rather
> than pointer chasing.
>
> Overall it is expected the performance after the change should be
> comparable to before, but with half of the memory consumed.

I don't know how much performance impact it would have but I guess
search/iteration would be the most frequent operation.  So I like the
memory saving it can bring.

>
> To avoid a list and repeated logic around splitting maps,
> maps__merge_in is rewritten in terms of
> maps__fixup_overlap_and_insert. maps_merge_in splits the given mapping
> inserting remaining gaps. maps__fixup_overlap_and_insert splits the
> existing mappings, then adds the incoming mapping. By adding the new
> mapping first, then re-inserting the existing mappings the splitting
> behavior matches.
>
> Signed-off-by: Ian Rogers <irogers@...gle.com>
> ---
>  tools/perf/tests/maps.c |    3 +
>  tools/perf/util/map.c   |    1 +
>  tools/perf/util/maps.c  | 1183 +++++++++++++++++++++++----------------
>  tools/perf/util/maps.h  |   54 +-
>  4 files changed, 757 insertions(+), 484 deletions(-)
>
> diff --git a/tools/perf/tests/maps.c b/tools/perf/tests/maps.c
> index bb3fbfe5a73e..b15417a0d617 100644
> --- a/tools/perf/tests/maps.c
> +++ b/tools/perf/tests/maps.c
> @@ -156,6 +156,9 @@ static int test__maps__merge_in(struct test_suite *t __maybe_unused, int subtest
>         TEST_ASSERT_VAL("merge check failed", !ret);
>
>         maps__zput(maps);
> +       map__zput(map_kcore1);
> +       map__zput(map_kcore2);
> +       map__zput(map_kcore3);
>         return TEST_OK;
>  }
>
> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> index 54c67cb7ecef..cf5a15db3a1f 100644
> --- a/tools/perf/util/map.c
> +++ b/tools/perf/util/map.c
> @@ -168,6 +168,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>                 if (dso == NULL)
>                         goto out_delete;
>
> +               assert(!dso->kernel);
>                 map__init(result, start, start + len, pgoff, dso);
>
>                 if (anon || no_dso) {
> diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
> index 0334fc18d9c6..6ee81160cdab 100644
> --- a/tools/perf/util/maps.c
> +++ b/tools/perf/util/maps.c
> @@ -10,286 +10,477 @@
>  #include "ui/ui.h"
>  #include "unwind.h"
>
> -struct map_rb_node {
> -       struct rb_node rb_node;
> -       struct map *map;
> -};
> -
> -#define maps__for_each_entry(maps, map) \
> -       for (map = maps__first(maps); map; map = map_rb_node__next(map))
> +static void check_invariants(const struct maps *maps __maybe_unused)
> +{
> +#ifndef NDEBUG
> +       assert(RC_CHK_ACCESS(maps)->nr_maps <= RC_CHK_ACCESS(maps)->nr_maps_allocated);
> +       for (unsigned int i = 0; i < RC_CHK_ACCESS(maps)->nr_maps; i++) {
> +               struct map *map = RC_CHK_ACCESS(maps)->maps_by_address[i];
> +
> +               /* Check map is well-formed. */
> +               assert(map__end(map) == 0 || map__start(map) <= map__end(map));
> +               /* Expect at least 1 reference count. */
> +               assert(refcount_read(map__refcnt(map)) > 0);
> +
> +               if (map__dso(map) && map__dso(map)->kernel)
> +                       assert(RC_CHK_EQUAL(map__kmap(map)->kmaps, maps));
> +
> +               if (i > 0) {
> +                       struct map *prev = RC_CHK_ACCESS(maps)->maps_by_address[i - 1];
> +
> +                       /* If addresses are sorted... */
> +                       if (RC_CHK_ACCESS(maps)->maps_by_address_sorted) {
> +                               /* Maps should be in start address order. */
> +                               assert(map__start(prev) <= map__start(map));
> +                               /*
> +                                * If the ends of maps aren't broken (during
> +                                * construction) then they should be ordered
> +                                * too.
> +                                */
> +                               if (!RC_CHK_ACCESS(maps)->ends_broken) {
> +                                       assert(map__end(prev) <= map__end(map));
> +                                       assert(map__end(prev) <= map__start(map) ||
> +                                              map__start(prev) == map__start(map));
> +                               }
> +                       }
> +               }
> +       }
> +       if (RC_CHK_ACCESS(maps)->maps_by_name) {
> +               for (unsigned int i = 0; i < RC_CHK_ACCESS(maps)->nr_maps; i++) {
> +                       struct map *map = RC_CHK_ACCESS(maps)->maps_by_name[i];
>
> -#define maps__for_each_entry_safe(maps, map, next) \
> -       for (map = maps__first(maps), next = map_rb_node__next(map); map; \
> -            map = next, next = map_rb_node__next(map))
> +                       /*
> +                        * Maps by name maps should be in maps_by_address, so
> +                        * the reference count should be higher.
> +                        */
> +                       assert(refcount_read(map__refcnt(map)) > 1);
> +               }
> +       }
> +#endif
> +}
>
> -static struct rb_root *maps__entries(struct maps *maps)
> +static struct map **maps__maps_by_address(const struct maps *maps)
>  {
> -       return &RC_CHK_ACCESS(maps)->entries;
> +       return RC_CHK_ACCESS(maps)->maps_by_address;
>  }
>
> -static struct rw_semaphore *maps__lock(struct maps *maps)
> +static void maps__set_maps_by_address(struct maps *maps, struct map **new)
>  {
> -       return &RC_CHK_ACCESS(maps)->lock;
> +       RC_CHK_ACCESS(maps)->maps_by_address = new;
> +
>  }
>
> -static struct map **maps__maps_by_name(struct maps *maps)
> +/* Not in the header, to aid reference counting. */
> +static struct map **maps__maps_by_name(const struct maps *maps)
>  {
>         return RC_CHK_ACCESS(maps)->maps_by_name;
> +
>  }
>
> -static struct map_rb_node *maps__first(struct maps *maps)
> +static void maps__set_maps_by_name(struct maps *maps, struct map **new)
>  {
> -       struct rb_node *first = rb_first(maps__entries(maps));
> +       RC_CHK_ACCESS(maps)->maps_by_name = new;
>
> -       if (first)
> -               return rb_entry(first, struct map_rb_node, rb_node);
> -       return NULL;
>  }
>
> -static struct map_rb_node *map_rb_node__next(struct map_rb_node *node)
> +static bool maps__maps_by_address_sorted(const struct maps *maps)
>  {
> -       struct rb_node *next;
> -
> -       if (!node)
> -               return NULL;
> -
> -       next = rb_next(&node->rb_node);
> +       return RC_CHK_ACCESS(maps)->maps_by_address_sorted;
> +}
>
> -       if (!next)
> -               return NULL;
> +static void maps__set_maps_by_address_sorted(struct maps *maps, bool value)
> +{
> +       RC_CHK_ACCESS(maps)->maps_by_address_sorted = value;
> +}
>
> -       return rb_entry(next, struct map_rb_node, rb_node);
> +static bool maps__maps_by_name_sorted(const struct maps *maps)
> +{
> +       return RC_CHK_ACCESS(maps)->maps_by_name_sorted;
>  }
>
> -static struct map_rb_node *maps__find_node(struct maps *maps, struct map *map)
> +static void maps__set_maps_by_name_sorted(struct maps *maps, bool value)
>  {
> -       struct map_rb_node *rb_node;
> +       RC_CHK_ACCESS(maps)->maps_by_name_sorted = value;
> +}
>
> -       maps__for_each_entry(maps, rb_node) {
> -               if (rb_node->RC_CHK_ACCESS(map) == RC_CHK_ACCESS(map))
> -                       return rb_node;
> -       }
> -       return NULL;
> +static struct rw_semaphore *maps__lock(struct maps *maps)
> +{
> +       /*
> +        * When the lock is acquired or released the maps invariants should
> +        * hold.
> +        */
> +       check_invariants(maps);
> +       return &RC_CHK_ACCESS(maps)->lock;
>  }
>
>  static void maps__init(struct maps *maps, struct machine *machine)
>  {
> -       refcount_set(maps__refcnt(maps), 1);
>         init_rwsem(maps__lock(maps));
> -       RC_CHK_ACCESS(maps)->entries = RB_ROOT;
> +       RC_CHK_ACCESS(maps)->maps_by_address = NULL;
> +       RC_CHK_ACCESS(maps)->maps_by_name = NULL;
>         RC_CHK_ACCESS(maps)->machine = machine;
> -       RC_CHK_ACCESS(maps)->last_search_by_name = NULL;
> +#ifdef HAVE_LIBUNWIND_SUPPORT
> +       RC_CHK_ACCESS(maps)->addr_space = NULL;
> +       RC_CHK_ACCESS(maps)->unwind_libunwind_ops = NULL;
> +#endif
> +       refcount_set(maps__refcnt(maps), 1);
>         RC_CHK_ACCESS(maps)->nr_maps = 0;
> -       RC_CHK_ACCESS(maps)->maps_by_name = NULL;
> +       RC_CHK_ACCESS(maps)->nr_maps_allocated = 0;
> +       RC_CHK_ACCESS(maps)->last_search_by_name_idx = 0;
> +       RC_CHK_ACCESS(maps)->maps_by_address_sorted = true;
> +       RC_CHK_ACCESS(maps)->maps_by_name_sorted = false;
>  }
>
> -static void __maps__free_maps_by_name(struct maps *maps)
> +static void maps__exit(struct maps *maps)
>  {
> -       /*
> -        * Free everything to try to do it from the rbtree in the next search
> -        */
> -       for (unsigned int i = 0; i < maps__nr_maps(maps); i++)
> -               map__put(maps__maps_by_name(maps)[i]);
> +       struct map **maps_by_address = maps__maps_by_address(maps);
> +       struct map **maps_by_name = maps__maps_by_name(maps);
>
> -       zfree(&RC_CHK_ACCESS(maps)->maps_by_name);
> -       RC_CHK_ACCESS(maps)->nr_maps_allocated = 0;
> +       for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> +               map__zput(maps_by_address[i]);
> +               if (maps_by_name)
> +                       map__zput(maps_by_name[i]);
> +       }
> +       zfree(&maps_by_address);
> +       zfree(&maps_by_name);
> +       unwind__finish_access(maps);
>  }
>
> -static int __maps__insert(struct maps *maps, struct map *map)
> +struct maps *maps__new(struct machine *machine)
>  {
> -       struct rb_node **p = &maps__entries(maps)->rb_node;
> -       struct rb_node *parent = NULL;
> -       const u64 ip = map__start(map);
> -       struct map_rb_node *m, *new_rb_node;
> -
> -       new_rb_node = malloc(sizeof(*new_rb_node));
> -       if (!new_rb_node)
> -               return -ENOMEM;
> -
> -       RB_CLEAR_NODE(&new_rb_node->rb_node);
> -       new_rb_node->map = map__get(map);
> +       struct maps *result;
> +       RC_STRUCT(maps) *maps = zalloc(sizeof(*maps));
>
> -       while (*p != NULL) {
> -               parent = *p;
> -               m = rb_entry(parent, struct map_rb_node, rb_node);
> -               if (ip < map__start(m->map))
> -                       p = &(*p)->rb_left;
> -               else
> -                       p = &(*p)->rb_right;
> -       }
> +       if (ADD_RC_CHK(result, maps))
> +               maps__init(result, machine);
>
> -       rb_link_node(&new_rb_node->rb_node, parent, p);
> -       rb_insert_color(&new_rb_node->rb_node, maps__entries(maps));
> -       return 0;
> +       return result;
>  }
>
> -int maps__insert(struct maps *maps, struct map *map)
> +static void maps__delete(struct maps *maps)
>  {
> -       int err;
> -       const struct dso *dso = map__dso(map);
> -
> -       down_write(maps__lock(maps));
> -       err = __maps__insert(maps, map);
> -       if (err)
> -               goto out;
> +       maps__exit(maps);
> +       RC_CHK_FREE(maps);
> +}
>
> -       ++RC_CHK_ACCESS(maps)->nr_maps;
> +struct maps *maps__get(struct maps *maps)
> +{
> +       struct maps *result;
>
> -       if (dso && dso->kernel) {
> -               struct kmap *kmap = map__kmap(map);
> +       if (RC_CHK_GET(result, maps))
> +               refcount_inc(maps__refcnt(maps));
>
> -               if (kmap)
> -                       kmap->kmaps = maps;
> -               else
> -                       pr_err("Internal error: kernel dso with non kernel map\n");
> -       }
> +       return result;
> +}
>
> +void maps__put(struct maps *maps)
> +{
> +       if (maps && refcount_dec_and_test(maps__refcnt(maps)))
> +               maps__delete(maps);
> +       else
> +               RC_CHK_PUT(maps);
> +}
>
> +static void __maps__free_maps_by_name(struct maps *maps)
> +{
>         /*
> -        * If we already performed some search by name, then we need to add the just
> -        * inserted map and resort.
> +        * Free everything to try to do it from the rbtree in the next search
>          */
> -       if (maps__maps_by_name(maps)) {
> -               if (maps__nr_maps(maps) > RC_CHK_ACCESS(maps)->nr_maps_allocated) {
> -                       int nr_allocate = maps__nr_maps(maps) * 2;
> -                       struct map **maps_by_name = realloc(maps__maps_by_name(maps),
> -                                                           nr_allocate * sizeof(map));
> +       for (unsigned int i = 0; i < maps__nr_maps(maps); i++)
> +               map__put(maps__maps_by_name(maps)[i]);
>
> -                       if (maps_by_name == NULL) {
> -                               __maps__free_maps_by_name(maps);
> -                               err = -ENOMEM;
> -                               goto out;
> -                       }
> +       zfree(&RC_CHK_ACCESS(maps)->maps_by_name);
> +}
>
> -                       RC_CHK_ACCESS(maps)->maps_by_name = maps_by_name;
> -                       RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
> +static int map__start_cmp(const void *a, const void *b)
> +{
> +       const struct map *map_a = *(const struct map * const *)a;
> +       const struct map *map_b = *(const struct map * const *)b;
> +       u64 map_a_start = map__start(map_a);
> +       u64 map_b_start = map__start(map_b);
> +
> +       if (map_a_start == map_b_start) {
> +               u64 map_a_end = map__end(map_a);
> +               u64 map_b_end = map__end(map_b);
> +
> +               if  (map_a_end == map_b_end) {
> +                       /* Ensure maps with the same addresses have a fixed order. */
> +                       if (RC_CHK_ACCESS(map_a) == RC_CHK_ACCESS(map_b))
> +                               return 0;
> +                       return (intptr_t)RC_CHK_ACCESS(map_a) > (intptr_t)RC_CHK_ACCESS(map_b)
> +                               ? 1 : -1;
>                 }
> -               maps__maps_by_name(maps)[maps__nr_maps(maps) - 1] = map__get(map);
> -               __maps__sort_by_name(maps);
> +               return map_a_end > map_b_end ? 1 : -1;
>         }
> - out:
> -       up_write(maps__lock(maps));
> -       return err;
> +       return map_a_start > map_b_start ? 1 : -1;
>  }
>
> -static void __maps__remove(struct maps *maps, struct map_rb_node *rb_node)
> +static void __maps__sort_by_address(struct maps *maps)
>  {
> -       rb_erase_init(&rb_node->rb_node, maps__entries(maps));
> -       map__put(rb_node->map);
> -       free(rb_node);
> +       if (maps__maps_by_address_sorted(maps))
> +               return;
> +
> +       qsort(maps__maps_by_address(maps),
> +               maps__nr_maps(maps),
> +               sizeof(struct map *),
> +               map__start_cmp);
> +       maps__set_maps_by_address_sorted(maps, true);
>  }
>
> -void maps__remove(struct maps *maps, struct map *map)
> +static void maps__sort_by_address(struct maps *maps)
>  {
> -       struct map_rb_node *rb_node;
> -
>         down_write(maps__lock(maps));
> -       if (RC_CHK_ACCESS(maps)->last_search_by_name == map)
> -               RC_CHK_ACCESS(maps)->last_search_by_name = NULL;
> -
> -       rb_node = maps__find_node(maps, map);
> -       assert(rb_node->RC_CHK_ACCESS(map) == RC_CHK_ACCESS(map));
> -       __maps__remove(maps, rb_node);
> -       if (maps__maps_by_name(maps))
> -               __maps__free_maps_by_name(maps);
> -       --RC_CHK_ACCESS(maps)->nr_maps;
> +       __maps__sort_by_address(maps);
>         up_write(maps__lock(maps));
>  }
>
> -static void __maps__purge(struct maps *maps)
> +static int map__strcmp(const void *a, const void *b)
>  {
> -       struct map_rb_node *pos, *next;
> -
> -       if (maps__maps_by_name(maps))
> -               __maps__free_maps_by_name(maps);
> +       const struct map *map_a = *(const struct map * const *)a;
> +       const struct map *map_b = *(const struct map * const *)b;
> +       const struct dso *dso_a = map__dso(map_a);
> +       const struct dso *dso_b = map__dso(map_b);
> +       int ret = strcmp(dso_a->short_name, dso_b->short_name);
>
> -       maps__for_each_entry_safe(maps, pos, next) {
> -               rb_erase_init(&pos->rb_node,  maps__entries(maps));
> -               map__put(pos->map);
> -               free(pos);
> +       if (ret == 0 && RC_CHK_ACCESS(map_a) != RC_CHK_ACCESS(map_b)) {
> +               /* Ensure distinct but name equal maps have an order. */
> +               return map__start_cmp(a, b);
>         }
> +       return ret;
>  }
>
> -static void maps__exit(struct maps *maps)
> +static int maps__sort_by_name(struct maps *maps)
>  {
> +       int err = 0;
>         down_write(maps__lock(maps));
> -       __maps__purge(maps);
> +       if (!maps__maps_by_name_sorted(maps)) {
> +               struct map **maps_by_name = maps__maps_by_name(maps);
> +
> +               if (!maps_by_name) {
> +                       maps_by_name = malloc(RC_CHK_ACCESS(maps)->nr_maps_allocated *
> +                                       sizeof(*maps_by_name));
> +                       if (!maps_by_name)
> +                               err = -ENOMEM;
> +                       else {
> +                               struct map **maps_by_address = maps__maps_by_address(maps);
> +                               unsigned int n = maps__nr_maps(maps);
> +
> +                               maps__set_maps_by_name(maps, maps_by_name);
> +                               for (unsigned int i = 0; i < n; i++)
> +                                       maps_by_name[i] = map__get(maps_by_address[i]);
> +                       }
> +               }
> +               if (!err) {
> +                       qsort(maps_by_name,
> +                               maps__nr_maps(maps),
> +                               sizeof(struct map *),
> +                               map__strcmp);
> +                       maps__set_maps_by_name_sorted(maps, true);
> +               }
> +       }
>         up_write(maps__lock(maps));
> +       return err;
>  }
>
> -bool maps__empty(struct maps *maps)
> +static unsigned int maps__by_address_index(const struct maps *maps, const struct map *map)
>  {
> -       return !maps__first(maps);
> +       struct map **maps_by_address = maps__maps_by_address(maps);
> +
> +       if (maps__maps_by_address_sorted(maps)) {
> +               struct map **mapp =
> +                       bsearch(&map, maps__maps_by_address(maps), maps__nr_maps(maps),
> +                               sizeof(*mapp), map__start_cmp);
> +
> +               if (mapp)
> +                       return mapp - maps_by_address;
> +       } else {
> +               for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> +                       if (RC_CHK_ACCESS(maps_by_address[i]) == RC_CHK_ACCESS(map))
> +                               return i;
> +               }
> +       }
> +       pr_err("Map missing from maps");
> +       return -1;
>  }
>
> -struct maps *maps__new(struct machine *machine)
> +static unsigned int maps__by_name_index(const struct maps *maps, const struct map *map)
>  {
> -       struct maps *result;
> -       RC_STRUCT(maps) *maps = zalloc(sizeof(*maps));
> +       struct map **maps_by_name = maps__maps_by_name(maps);
> +
> +       if (maps__maps_by_name_sorted(maps)) {
> +               struct map **mapp =
> +                       bsearch(&map, maps_by_name, maps__nr_maps(maps),
> +                               sizeof(*mapp), map__strcmp);
> +
> +               if (mapp)
> +                       return mapp - maps_by_name;
> +       } else {
> +               for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> +                       if (RC_CHK_ACCESS(maps_by_name[i]) == RC_CHK_ACCESS(map))
> +                               return i;
> +               }
> +       }
> +       pr_err("Map missing from maps");
> +       return -1;
> +}
>
> -       if (ADD_RC_CHK(result, maps))
> -               maps__init(result, machine);
> +static int __maps__insert(struct maps *maps, struct map *new)
> +{
> +       struct map **maps_by_address = maps__maps_by_address(maps);
> +       struct map **maps_by_name = maps__maps_by_name(maps);
> +       const struct dso *dso = map__dso(new);
> +       unsigned int nr_maps = maps__nr_maps(maps);
> +       unsigned int nr_allocate = RC_CHK_ACCESS(maps)->nr_maps_allocated;
> +
> +       if (nr_maps + 1 > nr_allocate) {
> +               nr_allocate = !nr_allocate ? 32 : nr_allocate * 2;
> +
> +               maps_by_address = realloc(maps_by_address, nr_allocate * sizeof(new));
> +               if (!maps_by_address)
> +                       return -ENOMEM;
> +
> +               maps__set_maps_by_address(maps, maps_by_address);
> +               if (maps_by_name) {
> +                       maps_by_name = realloc(maps_by_name, nr_allocate * sizeof(new));
> +                       if (!maps_by_name) {
> +                               /*
> +                                * If by name fails, just disable by name and it will
> +                                * recompute next time it is required.
> +                                */
> +                               __maps__free_maps_by_name(maps);
> +                       }
> +                       maps__set_maps_by_name(maps, maps_by_name);
> +               }
> +               RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
> +       }
> +       /* Insert the value at the end. */
> +       maps_by_address[nr_maps] = map__get(new);
> +       if (maps_by_name)
> +               maps_by_name[nr_maps] = map__get(new);
>
> -       return result;
> +       nr_maps++;
> +       RC_CHK_ACCESS(maps)->nr_maps = nr_maps;
> +
> +       /*
> +        * Recompute if things are sorted. If things are inserted in a sorted
> +        * manner, for example by processing /proc/pid/maps, then no
> +        * sorting/resorting will be necessary.
> +        */
> +       if (nr_maps == 1) {
> +               /* If there's just 1 entry then maps are sorted. */
> +               maps__set_maps_by_address_sorted(maps, true);
> +               maps__set_maps_by_name_sorted(maps, maps_by_name != NULL);
> +       } else {
> +               /* Sorted if maps were already sorted and this map starts after the last one. */
> +               maps__set_maps_by_address_sorted(maps,
> +                       maps__maps_by_address_sorted(maps) &&
> +                       map__end(maps_by_address[nr_maps - 2]) <= map__start(new));
> +               maps__set_maps_by_name_sorted(maps, false);
> +       }
> +       if (map__end(new) < map__start(new))
> +               RC_CHK_ACCESS(maps)->ends_broken = true;
> +       if (dso && dso->kernel) {
> +               struct kmap *kmap = map__kmap(new);
> +
> +               if (kmap)
> +                       kmap->kmaps = maps;
> +               else
> +                       pr_err("Internal error: kernel dso with non kernel map\n");
> +       }
> +       check_invariants(maps);

Probably not needed as it's checked when you get the lock below.


> +       return 0;
>  }
>
> -static void maps__delete(struct maps *maps)
> +int maps__insert(struct maps *maps, struct map *map)
>  {
> -       maps__exit(maps);
> -       unwind__finish_access(maps);
> -       RC_CHK_FREE(maps);
> +       int ret;
> +
> +       down_write(maps__lock(maps));
> +       ret = __maps__insert(maps, map);
> +       up_write(maps__lock(maps));
> +       return ret;
>  }
>
> -struct maps *maps__get(struct maps *maps)
> +static void __maps__remove(struct maps *maps, struct map *map)
>  {
> -       struct maps *result;
> +       struct map **maps_by_address = maps__maps_by_address(maps);
> +       struct map **maps_by_name = maps__maps_by_name(maps);
> +       unsigned int nr_maps = maps__nr_maps(maps);
> +       unsigned int address_idx;
> +
> +       /* Slide later mappings over the one to remove */
> +       address_idx = maps__by_address_index(maps, map);
> +       map__put(maps_by_address[address_idx]);
> +       memmove(&maps_by_address[address_idx],
> +               &maps_by_address[address_idx + 1],
> +               (nr_maps - address_idx - 1) * sizeof(*maps_by_address));
> +
> +       if (maps_by_name) {
> +               unsigned int name_idx = maps__by_name_index(maps, map);
> +
> +               map__put(maps_by_name[name_idx]);
> +               memmove(&maps_by_name[name_idx],
> +                       &maps_by_name[name_idx + 1],
> +                       (nr_maps - name_idx - 1) *  sizeof(*maps_by_name));
> +       }
>
> -       if (RC_CHK_GET(result, maps))
> -               refcount_inc(maps__refcnt(maps));
> +       --RC_CHK_ACCESS(maps)->nr_maps;
> +       check_invariants(maps);

Ditto.

> +}
>
> -       return result;
> +void maps__remove(struct maps *maps, struct map *map)
> +{
> +       down_write(maps__lock(maps));
> +       __maps__remove(maps, map);
> +       up_write(maps__lock(maps));
>  }
>
> -void maps__put(struct maps *maps)
> +bool maps__empty(struct maps *maps)
>  {
> -       if (maps && refcount_dec_and_test(maps__refcnt(maps)))
> -               maps__delete(maps);
> -       else
> -               RC_CHK_PUT(maps);
> +       return maps__nr_maps(maps) == 0;
>  }
>
>  int maps__for_each_map(struct maps *maps, int (*cb)(struct map *map, void *data), void *data)
>  {
> -       struct map_rb_node *pos;
> +       bool done = false;
>         int ret = 0;
>
> -       down_read(maps__lock(maps));
> -       maps__for_each_entry(maps, pos) {
> -               ret = cb(pos->map, data);
> -               if (ret)
> -                       break;
> +       /* See locking/sorting note. */
> +       while (!done) {
> +               down_read(maps__lock(maps));
> +               if (maps__maps_by_address_sorted(maps)) {
> +                       struct map **maps_by_address = maps__maps_by_address(maps);
> +                       unsigned int n = maps__nr_maps(maps);
> +
> +                       for (unsigned int i = 0; i < n; i++) {
> +                               struct map *map = maps_by_address[i];
> +
> +                               ret = cb(map, data);
> +                               if (ret)
> +                                       break;
> +                       }
> +                       done = true;
> +               }
> +               up_read(maps__lock(maps));
> +               if (!done)
> +                       maps__sort_by_address(maps);
>         }
> -       up_read(maps__lock(maps));
>         return ret;
>  }
>
>  void maps__remove_maps(struct maps *maps, bool (*cb)(struct map *map, void *data), void *data)
>  {
> -       struct map_rb_node *pos, *next;
> -       unsigned int start_nr_maps;
> +       struct map **maps_by_address;
>
>         down_write(maps__lock(maps));
>
> -       start_nr_maps = maps__nr_maps(maps);
> -       maps__for_each_entry_safe(maps, pos, next)      {
> -               if (cb(pos->map, data)) {
> -                       __maps__remove(maps, pos);
> -                       --RC_CHK_ACCESS(maps)->nr_maps;
> -               }
> +       maps_by_address = maps__maps_by_address(maps);
> +       for (unsigned int i = 0; i < maps__nr_maps(maps);) {
> +               if (cb(maps_by_address[i], data))
> +                       __maps__remove(maps, maps_by_address[i]);
> +               else
> +                       i++;
>         }
> -       if (maps__maps_by_name(maps) && start_nr_maps != maps__nr_maps(maps))
> -               __maps__free_maps_by_name(maps);
> -
>         up_write(maps__lock(maps));
>  }
>
> @@ -300,7 +491,7 @@ struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp)
>         /* Ensure map is loaded before using map->map_ip */
>         if (map != NULL && map__load(map) >= 0) {
>                 if (mapp != NULL)
> -                       *mapp = map;
> +                       *mapp = map; // TODO: map_put on else path when find returns a get.
>                 return map__find_symbol(map, map__map_ip(map, addr));
>         }
>
> @@ -348,7 +539,7 @@ int maps__find_ams(struct maps *maps, struct addr_map_symbol *ams)
>         if (ams->addr < map__start(ams->ms.map) || ams->addr >= map__end(ams->ms.map)) {
>                 if (maps == NULL)
>                         return -1;
> -               ams->ms.map = maps__find(maps, ams->addr);
> +               ams->ms.map = maps__find(maps, ams->addr);  // TODO: map_get
>                 if (ams->ms.map == NULL)
>                         return -1;
>         }
> @@ -393,24 +584,28 @@ size_t maps__fprintf(struct maps *maps, FILE *fp)
>   * Find first map where end > map->start.
>   * Same as find_vma() in kernel.
>   */
> -static struct rb_node *first_ending_after(struct maps *maps, const struct map *map)
> +static unsigned int first_ending_after(struct maps *maps, const struct map *map)
>  {
> -       struct rb_root *root;
> -       struct rb_node *next, *first;
> +       struct map **maps_by_address = maps__maps_by_address(maps);
> +       int low = 0, high = (int)maps__nr_maps(maps) - 1, first = high + 1;
> +
> +       assert(maps__maps_by_address_sorted(maps));
> +       if (low <= high && map__end(maps_by_address[0]) > map__start(map))
> +               return 0;
>
> -       root = maps__entries(maps);
> -       next = root->rb_node;
> -       first = NULL;
> -       while (next) {
> -               struct map_rb_node *pos = rb_entry(next, struct map_rb_node, rb_node);
> +       while (low <= high) {
> +               int mid = (low + high) / 2;
> +               struct map *pos = maps_by_address[mid];
>
> -               if (map__end(pos->map) > map__start(map)) {
> -                       first = next;
> -                       if (map__start(pos->map) <= map__start(map))
> +               if (map__end(pos) > map__start(map)) {
> +                       first = mid;
> +                       if (map__start(pos) <= map__start(map)) {
> +                               /* Entry overlaps map. */
>                                 break;
> -                       next = next->rb_left;
> +                       }
> +                       high = mid - 1;
>                 } else
> -                       next = next->rb_right;
> +                       low = mid + 1;
>         }
>         return first;
>  }
> @@ -419,171 +614,249 @@ static struct rb_node *first_ending_after(struct maps *maps, const struct map *m
>   * Adds new to maps, if new overlaps existing entries then the existing maps are
>   * adjusted or removed so that new fits without overlapping any entries.
>   */
> -int maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
> +static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
>  {
> -
> -       struct rb_node *next;
> +       struct map **maps_by_address;
>         int err = 0;
>         FILE *fp = debug_file();
>
> -       down_write(maps__lock(maps));
> +sort_again:
> +       if (!maps__maps_by_address_sorted(maps))
> +               __maps__sort_by_address(maps);
>
> -       next = first_ending_after(maps, new);
> -       while (next && !err) {
> -               struct map_rb_node *pos = rb_entry(next, struct map_rb_node, rb_node);
> -               next = rb_next(&pos->rb_node);
> +       maps_by_address = maps__maps_by_address(maps);
> +       /*
> +        * Iterate through entries where the end of the existing entry is
> +        * greater-than the new map's start.
> +        */
> +       for (unsigned int i = first_ending_after(maps, new); i < maps__nr_maps(maps); ) {
> +               struct map *pos = maps_by_address[i];
> +               struct map *before = NULL, *after = NULL;
>
>                 /*
>                  * Stop if current map starts after map->end.
>                  * Maps are ordered by start: next will not overlap for sure.
>                  */
> -               if (map__start(pos->map) >= map__end(new))
> +               if (map__start(pos) >= map__end(new))
>                         break;
>
> -               if (verbose >= 2) {
> -
> -                       if (use_browser) {
> -                               pr_debug("overlapping maps in %s (disable tui for more info)\n",
> -                                        map__dso(new)->name);
> -                       } else {
> -                               pr_debug("overlapping maps:\n");
> -                               map__fprintf(new, fp);
> -                               map__fprintf(pos->map, fp);
> -                       }
> +               if (use_browser) {
> +                       pr_debug("overlapping maps in %s (disable tui for more info)\n",
> +                               map__dso(new)->name);
> +               } else if (verbose >= 2) {
> +                       pr_debug("overlapping maps:\n");
> +                       map__fprintf(new, fp);
> +                       map__fprintf(pos, fp);
>                 }
>
> -               rb_erase_init(&pos->rb_node, maps__entries(maps));
>                 /*
>                  * Now check if we need to create new maps for areas not
>                  * overlapped by the new map:
>                  */
> -               if (map__start(new) > map__start(pos->map)) {
> -                       struct map *before = map__clone(pos->map);
> +               if (map__start(new) > map__start(pos)) {
> +                       /* Map starts within existing map. Need to shorten the existing map. */
> +                       before = map__clone(pos);
>
>                         if (before == NULL) {
>                                 err = -ENOMEM;
> -                               goto put_map;
> +                               goto out_err;
>                         }
> -
>                         map__set_end(before, map__start(new));
> -                       err = __maps__insert(maps, before);
> -                       if (err) {
> -                               map__put(before);
> -                               goto put_map;
> -                       }
>
>                         if (verbose >= 2 && !use_browser)
>                                 map__fprintf(before, fp);
> -                       map__put(before);
>                 }
> -
> -               if (map__end(new) < map__end(pos->map)) {
> -                       struct map *after = map__clone(pos->map);
> +               if (map__end(new) < map__end(pos)) {
> +                       /* The new map isn't as long as the existing map. */
> +                       after = map__clone(pos);
>
>                         if (after == NULL) {
> +                               map__zput(before);
>                                 err = -ENOMEM;
> -                               goto put_map;
> +                               goto out_err;
>                         }
>
>                         map__set_start(after, map__end(new));
> -                       map__add_pgoff(after, map__end(new) - map__start(pos->map));
> -                       assert(map__map_ip(pos->map, map__end(new)) ==
> -                               map__map_ip(after, map__end(new)));
> -                       err = __maps__insert(maps, after);
> -                       if (err) {
> -                               map__put(after);
> -                               goto put_map;
> -                       }
> +                       map__add_pgoff(after, map__end(new) - map__start(pos));
> +                       assert(map__map_ip(pos, map__end(new)) ==
> +                              map__map_ip(after, map__end(new)));
> +
>                         if (verbose >= 2 && !use_browser)
>                                 map__fprintf(after, fp);
> -                       map__put(after);
>                 }
> -put_map:
> -               map__put(pos->map);
> -               free(pos);
> +               /*
> +                * If adding one entry, for `before` or `after`, we can replace
> +                * the existing entry. If both `before` and `after` are
> +                * necessary than an insert is needed. If the existing entry
> +                * entirely overlaps the existing entry it can just be removed.
> +                */
> +               if (before) {
> +                       map__put(maps_by_address[i]);
> +                       maps_by_address[i] = before;
> +                       /* Maps are still ordered, go to next one. */
> +                       i++;
> +                       if (after) {
> +                               __maps__insert(maps, after);
> +                               map__put(after);
> +                               if (!maps__maps_by_address_sorted(maps)) {
> +                                       /*
> +                                        * Sorting broken so invariants don't
> +                                        * hold, sort and go again.
> +                                        */
> +                                       goto sort_again;
> +                               }
> +                               /*
> +                                * Maps are still ordered, skip after and go to
> +                                * next one (terminate loop).
> +                                */
> +                               i++;
> +                       }
> +               } else if (after) {
> +                       map__put(maps_by_address[i]);
> +                       maps_by_address[i] = after;
> +                       /* Maps are ordered, go to next one. */
> +                       i++;
> +               } else {
> +                       __maps__remove(maps, pos);
> +                       /*
> +                        * Maps are ordered but no need to increase `i` as the
> +                        * later maps were moved down.
> +                        */
> +               }
> +               check_invariants(maps);
>         }
>         /* Add the map. */
> -       err = __maps__insert(maps, new);
> -       up_write(maps__lock(maps));
> +       __maps__insert(maps, new);
> +out_err:
>         return err;
>  }
>
> -int maps__copy_from(struct maps *maps, struct maps *parent)
> +int maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
>  {
>         int err;
> -       struct map_rb_node *rb_node;
>
> +       down_write(maps__lock(maps));
> +       err =  __maps__fixup_overlap_and_insert(maps, new);
> +       up_write(maps__lock(maps));
> +       return err;
> +}
> +
> +int maps__copy_from(struct maps *dest, struct maps *parent)
> +{
> +       /* Note, if struct map were immutable then cloning could use ref counts. */
> +       struct map **parent_maps_by_address;
> +       int err = 0;
> +       unsigned int n;
> +
> +       down_write(maps__lock(dest));
>         down_read(maps__lock(parent));
>
> -       maps__for_each_entry(parent, rb_node) {
> -               struct map *new = map__clone(rb_node->map);
> +       parent_maps_by_address = maps__maps_by_address(parent);
> +       n = maps__nr_maps(parent);
> +       if (maps__empty(dest)) {
> +               /* No existing mappings so just copy from parent to avoid reallocs in insert. */
> +               unsigned int nr_maps_allocated = RC_CHK_ACCESS(parent)->nr_maps_allocated;
> +               struct map **dest_maps_by_address =
> +                       malloc(nr_maps_allocated * sizeof(struct map *));
> +               struct map **dest_maps_by_name = NULL;
>
> -               if (new == NULL) {
> +               if (!dest_maps_by_address)
>                         err = -ENOMEM;
> -                       goto out_unlock;
> +               else {
> +                       if (maps__maps_by_name(parent)) {
> +                               dest_maps_by_name =
> +                                       malloc(nr_maps_allocated * sizeof(struct map *));
> +                       }
> +
> +                       RC_CHK_ACCESS(dest)->maps_by_address = dest_maps_by_address;
> +                       RC_CHK_ACCESS(dest)->maps_by_name = dest_maps_by_name;
> +                       RC_CHK_ACCESS(dest)->nr_maps_allocated = nr_maps_allocated;
>                 }
>
> -               err = unwind__prepare_access(maps, new, NULL);
> -               if (err)
> -                       goto out_unlock;
> +               for (unsigned int i = 0; !err && i < n; i++) {
> +                       struct map *pos = parent_maps_by_address[i];
> +                       struct map *new = map__clone(pos);
>
> -               err = maps__insert(maps, new);
> -               if (err)
> -                       goto out_unlock;
> +                       if (!new)
> +                               err = -ENOMEM;
> +                       else {
> +                               err = unwind__prepare_access(dest, new, NULL);
> +                               if (!err) {
> +                                       dest_maps_by_address[i] = new;
> +                                       if (dest_maps_by_name)
> +                                               dest_maps_by_name[i] = map__get(new);
> +                                       RC_CHK_ACCESS(dest)->nr_maps = i + 1;
> +                               }
> +                       }
> +                       if (err)
> +                               map__put(new);
> +               }
> +               maps__set_maps_by_address_sorted(dest, maps__maps_by_address_sorted(parent));
> +               if (!err) {
> +                       RC_CHK_ACCESS(dest)->last_search_by_name_idx =
> +                               RC_CHK_ACCESS(parent)->last_search_by_name_idx;
> +                       maps__set_maps_by_name_sorted(dest,
> +                                               dest_maps_by_name &&
> +                                               maps__maps_by_name_sorted(parent));
> +               } else {
> +                       RC_CHK_ACCESS(dest)->last_search_by_name_idx = 0;
> +                       maps__set_maps_by_name_sorted(dest, false);
> +               }
> +       } else {
> +               /* Unexpected copying to a maps containing entries. */
> +               for (unsigned int i = 0; !err && i < n; i++) {
> +                       struct map *pos = parent_maps_by_address[i];
> +                       struct map *new = map__clone(pos);
>
> -               map__put(new);
> +                       if (!new)
> +                               err = -ENOMEM;
> +                       else {
> +                               err = unwind__prepare_access(dest, new, NULL);
> +                               if (!err)
> +                                       err = maps__insert(dest, new);

Shouldn't it be __maps__insert()?


> +                       }
> +                       map__put(new);
> +               }
>         }
> -
> -       err = 0;
> -out_unlock:
>         up_read(maps__lock(parent));
> +       up_write(maps__lock(dest));
>         return err;
>  }
>
> -struct map *maps__find(struct maps *maps, u64 ip)
> +static int map__addr_cmp(const void *key, const void *entry)
>  {
> -       struct rb_node *p;
> -       struct map_rb_node *m;
> -
> +       const u64 ip = *(const u64 *)key;
> +       const struct map *map = *(const struct map * const *)entry;
>
> -       down_read(maps__lock(maps));
> -
> -       p = maps__entries(maps)->rb_node;
> -       while (p != NULL) {
> -               m = rb_entry(p, struct map_rb_node, rb_node);
> -               if (ip < map__start(m->map))
> -                       p = p->rb_left;
> -               else if (ip >= map__end(m->map))
> -                       p = p->rb_right;
> -               else
> -                       goto out;
> -       }
> -
> -       m = NULL;
> -out:
> -       up_read(maps__lock(maps));
> -       return m ? m->map : NULL;
> +       if (ip < map__start(map))
> +               return -1;
> +       if (ip >= map__end(map))
> +               return 1;
> +       return 0;
>  }
>
> -static int map__strcmp(const void *a, const void *b)
> +struct map *maps__find(struct maps *maps, u64 ip)
>  {
> -       const struct map *map_a = *(const struct map **)a;
> -       const struct map *map_b = *(const struct map **)b;
> -       const struct dso *dso_a = map__dso(map_a);
> -       const struct dso *dso_b = map__dso(map_b);
> -       int ret = strcmp(dso_a->short_name, dso_b->short_name);
> -
> -       if (ret == 0 && map_a != map_b) {
> -               /*
> -                * Ensure distinct but name equal maps have an order in part to
> -                * aid reference counting.
> -                */
> -               ret = (int)map__start(map_a) - (int)map__start(map_b);
> -               if (ret == 0)
> -                       ret = (int)((intptr_t)map_a - (intptr_t)map_b);
> +       struct map *result = NULL;
> +       bool done = false;
> +
> +       /* See locking/sorting note. */
> +       while (!done) {
> +               down_read(maps__lock(maps));
> +               if (maps__maps_by_address_sorted(maps)) {
> +                       struct map **mapp =
> +                               bsearch(&ip, maps__maps_by_address(maps), maps__nr_maps(maps),
> +                                       sizeof(*mapp), map__addr_cmp);
> +
> +                       if (mapp)
> +                               result = *mapp; // map__get(*mapp);
> +                       done = true;
> +               }
> +               up_read(maps__lock(maps));
> +               if (!done)
> +                       maps__sort_by_address(maps);
>         }
> -
> -       return ret;
> +       return result;
>  }
>
>  static int map__strcmp_name(const void *name, const void *b)
> @@ -593,126 +866,113 @@ static int map__strcmp_name(const void *name, const void *b)
>         return strcmp(name, dso->short_name);
>  }
>
> -void __maps__sort_by_name(struct maps *maps)
> -{
> -       qsort(maps__maps_by_name(maps), maps__nr_maps(maps), sizeof(struct map *), map__strcmp);
> -}
> -
> -static int map__groups__sort_by_name_from_rbtree(struct maps *maps)
> -{
> -       struct map_rb_node *rb_node;
> -       struct map **maps_by_name = realloc(maps__maps_by_name(maps),
> -                                           maps__nr_maps(maps) * sizeof(struct map *));
> -       int i = 0;
> -
> -       if (maps_by_name == NULL)
> -               return -1;
> -
> -       up_read(maps__lock(maps));
> -       down_write(maps__lock(maps));
> -
> -       RC_CHK_ACCESS(maps)->maps_by_name = maps_by_name;
> -       RC_CHK_ACCESS(maps)->nr_maps_allocated = maps__nr_maps(maps);
> -
> -       maps__for_each_entry(maps, rb_node)
> -               maps_by_name[i++] = map__get(rb_node->map);
> -
> -       __maps__sort_by_name(maps);
> -
> -       up_write(maps__lock(maps));
> -       down_read(maps__lock(maps));
> -
> -       return 0;
> -}
> -
> -static struct map *__maps__find_by_name(struct maps *maps, const char *name)
> +struct map *maps__find_by_name(struct maps *maps, const char *name)
>  {
> -       struct map **mapp;
> +       struct map *result = NULL;
> +       bool done = false;
>
> -       if (maps__maps_by_name(maps) == NULL &&
> -           map__groups__sort_by_name_from_rbtree(maps))
> -               return NULL;
> +       /* See locking/sorting note. */
> +       while (!done) {
> +               unsigned int i;
>
> -       mapp = bsearch(name, maps__maps_by_name(maps), maps__nr_maps(maps),
> -                      sizeof(*mapp), map__strcmp_name);
> -       if (mapp)
> -               return *mapp;
> -       return NULL;
> -}
> +               down_read(maps__lock(maps));
>
> -struct map *maps__find_by_name(struct maps *maps, const char *name)
> -{
> -       struct map_rb_node *rb_node;
> -       struct map *map;
> -
> -       down_read(maps__lock(maps));
> +               /* First check last found entry. */
> +               i = RC_CHK_ACCESS(maps)->last_search_by_name_idx;
> +               if (i < maps__nr_maps(maps) && maps__maps_by_name(maps)) {
> +                       struct dso *dso = map__dso(maps__maps_by_name(maps)[i]);
>
> +                       if (dso && strcmp(dso->short_name, name) == 0) {
> +                               result = maps__maps_by_name(maps)[i]; // TODO: map__get
> +                               done = true;
> +                       }
> +               }
>
> -       if (RC_CHK_ACCESS(maps)->last_search_by_name) {
> -               const struct dso *dso = map__dso(RC_CHK_ACCESS(maps)->last_search_by_name);
> +               /* Second search sorted array. */
> +               if (!done && maps__maps_by_name_sorted(maps)) {
> +                       struct map **mapp =
> +                               bsearch(name, maps__maps_by_name(maps), maps__nr_maps(maps),
> +                                       sizeof(*mapp), map__strcmp_name);
>
> -               if (strcmp(dso->short_name, name) == 0) {
> -                       map = RC_CHK_ACCESS(maps)->last_search_by_name;
> -                       goto out_unlock;
> +                       if (mapp) {
> +                               result = *mapp; // TODO: map__get
> +                               i = mapp - maps__maps_by_name(maps);
> +                               RC_CHK_ACCESS(maps)->last_search_by_name_idx = i;
> +                       }
> +                       done = true;
>                 }
> -       }
> -       /*
> -        * If we have maps->maps_by_name, then the name isn't in the rbtree,
> -        * as maps->maps_by_name mirrors the rbtree when lookups by name are
> -        * made.
> -        */
> -       map = __maps__find_by_name(maps, name);
> -       if (map || maps__maps_by_name(maps) != NULL)
> -               goto out_unlock;
> -
> -       /* Fallback to traversing the rbtree... */
> -       maps__for_each_entry(maps, rb_node) {
> -               struct dso *dso;
> -
> -               map = rb_node->map;
> -               dso = map__dso(map);
> -               if (strcmp(dso->short_name, name) == 0) {
> -                       RC_CHK_ACCESS(maps)->last_search_by_name = map;
> -                       goto out_unlock;
> +               up_read(maps__lock(maps));
> +               if (!done) {
> +                       /* Sort and retry binary search. */
> +                       if (maps__sort_by_name(maps)) {
> +                               /*
> +                                * Memory allocation failed do linear search
> +                                * through address sorted maps.
> +                                */
> +                               struct map **maps_by_address;
> +                               unsigned int n;
> +
> +                               down_read(maps__lock(maps));
> +                               maps_by_address =  maps__maps_by_address(maps);
> +                               n = maps__nr_maps(maps);
> +                               for (i = 0; i < n; i++) {
> +                                       struct map *pos = maps_by_address[i];
> +                                       struct dso *dso = map__dso(pos);
> +
> +                                       if (dso && strcmp(dso->short_name, name) == 0) {
> +                                               result = pos; // TODO: map__get
> +                                               break;
> +                                       }
> +                               }
> +                               up_read(maps__lock(maps));
> +                               done = true;
> +                       }
>                 }
>         }
> -       map = NULL;
> -
> -out_unlock:
> -       up_read(maps__lock(maps));
> -       return map;
> +       return result;
>  }
>
>  struct map *maps__find_next_entry(struct maps *maps, struct map *map)
>  {
> -       struct map_rb_node *rb_node = maps__find_node(maps, map);
> -       struct map_rb_node *next = map_rb_node__next(rb_node);
> +       unsigned int i;
> +       struct map *result = NULL;
>
> -       if (next)
> -               return next->map;
> +       down_read(maps__lock(maps));
> +       i = maps__by_address_index(maps, map);
> +       if (i < maps__nr_maps(maps))
> +               result = maps__maps_by_address(maps)[i]; // TODO: map__get
>
> -       return NULL;
> +       up_read(maps__lock(maps));
> +       return result;
>  }
>
>  void maps__fixup_end(struct maps *maps)
>  {
> -       struct map_rb_node *prev = NULL, *curr;
> +       struct map **maps_by_address;
> +       unsigned int n;
>
>         down_write(maps__lock(maps));
> +       if (!maps__maps_by_address_sorted(maps))
> +               __maps__sort_by_address(maps);
>
> -       maps__for_each_entry(maps, curr) {
> -               if (prev && (!map__end(prev->map) || map__end(prev->map) > map__start(curr->map)))
> -                       map__set_end(prev->map, map__start(curr->map));
> +       maps_by_address = maps__maps_by_address(maps);
> +       n = maps__nr_maps(maps);
> +       for (unsigned int i = 1; i < n; i++) {
> +               struct map *prev = maps_by_address[i - 1];
> +               struct map *curr = maps_by_address[i];
>
> -               prev = curr;
> +               if (!map__end(prev) || map__end(prev) > map__start(curr))
> +                       map__set_end(prev, map__start(curr));
>         }
>
>         /*
>          * We still haven't the actual symbols, so guess the
>          * last map final address.
>          */
> -       if (curr && !map__end(curr->map))
> -               map__set_end(curr->map, ~0ULL);
> +       if (n > 0 && !map__end(maps_by_address[n - 1]))
> +               map__set_end(maps_by_address[n - 1], ~0ULL);
> +
> +       RC_CHK_ACCESS(maps)->ends_broken = false;
>
>         up_write(maps__lock(maps));
>  }
> @@ -723,117 +983,92 @@ void maps__fixup_end(struct maps *maps)
>   */
>  int maps__merge_in(struct maps *kmaps, struct map *new_map)
>  {
> -       struct map_rb_node *rb_node;
> -       struct rb_node *first;
> -       bool overlaps;
> -       LIST_HEAD(merged);
> -       int err = 0;
> +       unsigned int first_after_, kmaps__nr_maps;
> +       struct map **kmaps_maps_by_address;
> +       struct map **merged_maps_by_address;
> +       unsigned int merged_nr_maps_allocated;
> +
> +       /* First try under a read lock. */
> +       while (true) {
> +               down_read(maps__lock(kmaps));
> +               if (maps__maps_by_address_sorted(kmaps))
> +                       break;
>
> -       down_read(maps__lock(kmaps));
> -       first = first_ending_after(kmaps, new_map);
> -       rb_node = first ? rb_entry(first, struct map_rb_node, rb_node) : NULL;
> -       overlaps = rb_node && map__start(rb_node->map) < map__end(new_map);
> -       up_read(maps__lock(kmaps));
> +               up_read(maps__lock(kmaps));
> +
> +               /* First after binary search requires sorted maps. Sort and try again. */
> +               maps__sort_by_address(kmaps);
> +       }
> +       first_after_ = first_ending_after(kmaps, new_map);
> +       kmaps_maps_by_address = maps__maps_by_address(kmaps);
>
> -       if (!overlaps)
> +       if (first_after_ >= maps__nr_maps(kmaps) ||
> +           map__start(kmaps_maps_by_address[first_after_]) >= map__end(new_map)) {
> +               /* No overlap so regular insert suffices. */
> +               up_read(maps__lock(kmaps));
>                 return maps__insert(kmaps, new_map);
> +       }
> +       up_read(maps__lock(kmaps));
>
> -       maps__for_each_entry(kmaps, rb_node) {
> -               struct map *old_map = rb_node->map;
> +       /* Plain insert with a read-lock failed, try again now with the write lock. */
> +       down_write(maps__lock(kmaps));
> +       if (!maps__maps_by_address_sorted(kmaps))
> +               __maps__sort_by_address(kmaps);
>
> -               /* no overload with this one */
> -               if (map__end(new_map) < map__start(old_map) ||
> -                   map__start(new_map) >= map__end(old_map))
> -                       continue;
> +       first_after_ = first_ending_after(kmaps, new_map);
> +       kmaps_maps_by_address = maps__maps_by_address(kmaps);
> +       kmaps__nr_maps = maps__nr_maps(kmaps);
>
> -               if (map__start(new_map) < map__start(old_map)) {
> -                       /*
> -                        * |new......
> -                        *       |old....
> -                        */
> -                       if (map__end(new_map) < map__end(old_map)) {
> -                               /*
> -                                * |new......|     -> |new..|
> -                                *       |old....| ->       |old....|
> -                                */
> -                               map__set_end(new_map, map__start(old_map));
> -                       } else {
> -                               /*
> -                                * |new.............| -> |new..|       |new..|
> -                                *       |old....|    ->       |old....|
> -                                */
> -                               struct map_list_node *m = map_list_node__new();
> +       if (first_after_ >= kmaps__nr_maps ||
> +           map__start(kmaps_maps_by_address[first_after_]) >= map__end(new_map)) {
> +               /* No overlap so regular insert suffices. */
> +               up_write(maps__lock(kmaps));
> +               return maps__insert(kmaps, new_map);

I think it could be:

        ret = __maps__insert(kmaps, new_map);
        up_write(maps__lock(kmaps));
        return ret;


> +       }
> +       /* Array to merge into, possibly 1 more for the sake of new_map. */
> +       merged_nr_maps_allocated = RC_CHK_ACCESS(kmaps)->nr_maps_allocated;
> +       if (kmaps__nr_maps + 1 == merged_nr_maps_allocated)
> +               merged_nr_maps_allocated++;
> +
> +       merged_maps_by_address = malloc(merged_nr_maps_allocated * sizeof(*merged_maps_by_address));
> +       if (!merged_maps_by_address) {
> +               up_write(maps__lock(kmaps));
> +               return -ENOMEM;
> +       }
> +       RC_CHK_ACCESS(kmaps)->maps_by_address = merged_maps_by_address;
> +       RC_CHK_ACCESS(kmaps)->maps_by_address_sorted = true;
> +       zfree(&RC_CHK_ACCESS(kmaps)->maps_by_name);
> +       RC_CHK_ACCESS(kmaps)->maps_by_name_sorted = false;
> +       RC_CHK_ACCESS(kmaps)->nr_maps_allocated = merged_nr_maps_allocated;

Why not use the accessor functions?

Thanks,
Namhyung

>
> -                               if (!m) {
> -                                       err = -ENOMEM;
> -                                       goto out;
> -                               }
> +       /* Copy entries before the new_map that can't overlap. */
> +       for (unsigned int i = 0; i < first_after_; i++)
> +               merged_maps_by_address[i] = map__get(kmaps_maps_by_address[i]);
>
> -                               m->map = map__clone(new_map);
> -                               if (!m->map) {
> -                                       free(m);
> -                                       err = -ENOMEM;
> -                                       goto out;
> -                               }
> +       RC_CHK_ACCESS(kmaps)->nr_maps = first_after_;
>
> -                               map__set_end(m->map, map__start(old_map));
> -                               list_add_tail(&m->node, &merged);
> -                               map__add_pgoff(new_map, map__end(old_map) - map__start(new_map));
> -                               map__set_start(new_map, map__end(old_map));
> -                       }
> -               } else {
> -                       /*
> -                        *      |new......
> -                        * |old....
> -                        */
> -                       if (map__end(new_map) < map__end(old_map)) {
> -                               /*
> -                                *      |new..|   -> x
> -                                * |old.........| -> |old.........|
> -                                */
> -                               map__put(new_map);
> -                               new_map = NULL;
> -                               break;
> -                       } else {
> -                               /*
> -                                *      |new......| ->         |new...|
> -                                * |old....|        -> |old....|
> -                                */
> -                               map__add_pgoff(new_map, map__end(old_map) - map__start(new_map));
> -                               map__set_start(new_map, map__end(old_map));
> -                       }
> -               }
> -       }
> +       /* Add the new map, it will be split when the later overlapping mappings are added. */
> +       __maps__insert(kmaps, new_map);
>
> -out:
> -       while (!list_empty(&merged)) {
> -               struct map_list_node *old_node;
> +       /* Insert mappings after new_map, splitting new_map in the process. */
> +       for (unsigned int i = first_after_; i < kmaps__nr_maps; i++)
> +               __maps__fixup_overlap_and_insert(kmaps, kmaps_maps_by_address[i]);
>
> -               old_node = list_entry(merged.next, struct map_list_node, node);
> -               list_del_init(&old_node->node);
> -               if (!err)
> -                       err = maps__insert(kmaps, old_node->map);
> -               map__put(old_node->map);
> -               free(old_node);
> -       }
> +       /* Copy the maps from merged into kmaps. */
> +       for (unsigned int i = 0; i < kmaps__nr_maps; i++)
> +               map__zput(kmaps_maps_by_address[i]);
>
> -       if (new_map) {
> -               if (!err)
> -                       err = maps__insert(kmaps, new_map);
> -               map__put(new_map);
> -       }
> -       return err;
> +       free(kmaps_maps_by_address);
> +       up_write(maps__lock(kmaps));
> +       return 0;
>  }
>
>  void maps__load_first(struct maps *maps)
>  {
> -       struct map_rb_node *first;
> -
>         down_read(maps__lock(maps));
>
> -       first = maps__first(maps);
> -       if (first)
> -               map__load(first->map);
> +       if (maps__nr_maps(maps) > 0)
> +               map__load(maps__maps_by_address(maps)[0]);
>
>         up_read(maps__lock(maps));
>  }
> diff --git a/tools/perf/util/maps.h b/tools/perf/util/maps.h
> index d836d04c9402..df9dd5a0e3c0 100644
> --- a/tools/perf/util/maps.h
> +++ b/tools/perf/util/maps.h
> @@ -25,21 +25,56 @@ static inline struct map_list_node *map_list_node__new(void)
>         return malloc(sizeof(struct map_list_node));
>  }
>
> -struct map *maps__find(struct maps *maps, u64 addr);
> +/*
> + * Locking/sorting note:
> + *
> + * Sorting is done with the write lock, iteration and binary searching happens
> + * under the read lock requiring being sorted. There is a race between sorting
> + * releasing the write lock and acquiring the read lock for iteration/searching
> + * where another thread could insert and break the sorting of the maps. In
> + * practice inserting maps should be rare meaning that the race shouldn't lead
> + * to live lock. Removal of maps doesn't break being sorted.
> + */
>
>  DECLARE_RC_STRUCT(maps) {
> -       struct rb_root      entries;
>         struct rw_semaphore lock;
> -       struct machine   *machine;
> -       struct map       *last_search_by_name;
> +       /**
> +        * @maps_by_address: array of maps sorted by their starting address if
> +        * maps_by_address_sorted is true.
> +        */
> +       struct map       **maps_by_address;
> +       /**
> +        * @maps_by_name: optional array of maps sorted by their dso name if
> +        * maps_by_name_sorted is true.
> +        */
>         struct map       **maps_by_name;
> -       refcount_t       refcnt;
> -       unsigned int     nr_maps;
> -       unsigned int     nr_maps_allocated;
> +       struct machine   *machine;
>  #ifdef HAVE_LIBUNWIND_SUPPORT
> -       void                            *addr_space;
> +       void            *addr_space;
>         const struct unwind_libunwind_ops *unwind_libunwind_ops;
>  #endif
> +       refcount_t       refcnt;
> +       /**
> +        * @nr_maps: number of maps_by_address, and possibly maps_by_name,
> +        * entries that contain maps.
> +        */
> +       unsigned int     nr_maps;
> +       /**
> +        * @nr_maps_allocated: number of entries in maps_by_address and possibly
> +        * maps_by_name.
> +        */
> +       unsigned int     nr_maps_allocated;
> +       /**
> +        * @last_search_by_name_idx: cache of last found by name entry's index
> +        * as frequent searches for the same dso name are common.
> +        */
> +       unsigned int     last_search_by_name_idx;
> +       /** @maps_by_address_sorted: is maps_by_address sorted. */
> +       bool             maps_by_address_sorted;
> +       /** @maps_by_name_sorted: is maps_by_name sorted. */
> +       bool             maps_by_name_sorted;
> +       /** @ends_broken: does the map contain a map where end values are unset/unsorted? */
> +       bool             ends_broken;
>  };
>
>  #define KMAP_NAME_LEN 256
> @@ -102,6 +137,7 @@ size_t maps__fprintf(struct maps *maps, FILE *fp);
>  int maps__insert(struct maps *maps, struct map *map);
>  void maps__remove(struct maps *maps, struct map *map);
>
> +struct map *maps__find(struct maps *maps, u64 addr);
>  struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp);
>  struct symbol *maps__find_symbol_by_name(struct maps *maps, const char *name, struct map **mapp);
>
> @@ -117,8 +153,6 @@ struct map *maps__find_next_entry(struct maps *maps, struct map *map);
>
>  int maps__merge_in(struct maps *kmaps, struct map *new_map);
>
> -void __maps__sort_by_name(struct maps *maps);
> -
>  void maps__fixup_end(struct maps *maps);
>
>  void maps__load_first(struct maps *maps);
> --
> 2.43.0.472.g3155946c3a-goog
>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ