linux-kernel - Re: [PATCH] modpost: Optimize symbol search from linear to binary search

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAK7LNATHTjzU4dG8ES6BjtUQv-vq50Lh1ZmBW3kv4K3zkF-SGg@mail.gmail.com>
Date:   Mon, 25 Sep 2023 23:23:59 +0900
From:   Masahiro Yamada <masahiroy@...nel.org>
To:     Jack Brennen <jbrennen@...gle.com>
Cc:     Nathan Chancellor <nathan@...nel.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Nicolas Schier <nicolas@...sle.eu>, Tom Rix <trix@...hat.com>,
        linux-kernel@...r.kernel.org, linux-kbuild@...r.kernel.org,
        llvm@...ts.linux.dev
Subject: Re: [PATCH] modpost: Optimize symbol search from linear to binary search

On Sun, Sep 24, 2023 at 2:21 AM Jack Brennen <jbrennen@...gle.com> wrote:
>
> Mainly just clarifying that we're willing to change the behavior in corner
> cases?


Yes.

I do not see a sensible reason in the previous quirk
(take the first for the exact match, but the last for others).




> The existing behavior has one set of quirks about which symtab entry
> is returned, and your proposed behavior has another different set of quirks.
>
> It's probably OK to break builds which have an undocumented assumption
> about the order of symtab entries; personally, I'd rather not risk that myself,
> but if somebody with more experience is willing to back that decision, I'm
> OK with it.
>
> Also, there's an alternative approach that uses bsearch from the standard
> library and a common comparison function between qsort and bsearch. I
> considered this alternative earlier; maybe you would prefer it because it
> eliminates having to reimplement a binary search algorithm.
> I chose not to do it this way because of trying to duplicate the quirks.
> If no duplication of the quirks is needed, this becomes easier.
>
> The idea for that is to build a sorted array of syminfo that look like this:
>
> (section_index, addr_lo, addr_hi, sym_lo, sym_hi)
>
> What this represents is the situation where for any lookup in the
> range from (section_index, addr_lo) to (section_index, addr_hi)
> inclusive, the nearest symbol will be either sym_lo or sym_hi.
> There are four different meanings for (sym_lo, sym_hi):
>
> (sym_lo = 0)
> This is a placeholder for a duplicated address, and it cannot
> compare equal to anything. After we sort the array, we set all
> of the duplicated addresses except for the last one to sym_lo = 0.
>
> (sym_lo = MAX_SYM)
> This is used to designate an address being looked up. When this
> is seen, it compares equal to any other syminfo that has an
> overlapping range.
>
> (sym_lo != 0, sym_hi = 0)
> This represents the last range in a section. There's no following
> address that could match. Should also have addr_hi = MAX.
>
> (sym_lo != 0, sym_hi != 0)
> This represents a range in a section that's not the last range.
> sym_hi may be usable to satisfy the lookup, but only if it's
> closer than sym_lo and if allow_negative is true. Note that
> the address of sym_hi will be addr_hi+1, so we don't need any
> additional code to fetch that address.
>
> Here's a sample comparison function:
> int syminfo_compare(const void *a, const void *b) {
>   const struct syminfo *sym1 = a;
>   const struct syminfo *sym2 = b;
>
>   if (sym1->section_index > sym2->section_index)
>     return 1;
>   if (sym1->section_index < sym2->section_index)
>     return -1;
>   if ((sym1->sym_lo == MAX_SYM && sym2->sym_lo != 0) ||
>       (sym2->sym_lo == MAX_SYM && sym1->sym_lo != 0)) {
>     /* Overlap is equality - test for it */
>     if (sym1->addr_hi >= sym2->addr_lo &&
>         sym2->addr_hi >= sym1->addr_lo) {
>       return 0;
>     }
>     /* No overlap, fall through */
>   }
>   if (sym1->addr_lo > sym2->addr_lo)
>     return 1;
>   if (sym1->addr_lo < sym2->addr_lo)
>     return -1;
>   /* Note that if we are comparing a lookup (MAX_SYM) with
>      a placeholder (0), the lookup always compares greater.
>      This causes us to search to the "right" of the placeholder
>      for a match, which is what we want. */
>   if (sym1->sym_lo > sym2->sym_lo)
>     return 1;
>   if (sym1->sym_lo < sym2->sym_lo)
>     return -1;
>   return 0;
> }
>
> So this greatly simplifies the back-end searching. It's a bsearch()
> which gives you either a miss, or one or two alternatives for the result.
> On the front end, you have an extra step after sorting which massages the
> search array into the right configuration.  There's actually not much code
> needed to do that.
>
> Is that of interest?  The leveraging of bsearch() in that way?


I am curious about how to use bsearch() if the whole implementation
is short, but I could not understand that logic fully.




> A few other responses below...
>
> On Sat, Sep 23, 2023 at 4:50 AM Masahiro Yamada <masahiroy@...nel.org> wrote:
> > > +/* Populate the search array that we just allocated.
> > > + * Be slightly paranoid here.  If the ELF file changes during processing,
> >
> > I could not understand. In which case, the ELF file changes?
> >
> > modpost loads the entire file to memory first..
> >
> > In which scenario, the memory content changes?
> >
>
> modpost doesn't load the entire file, it uses mmap to map it into the
> address space.The easiest way to imagine this being an issue is that some
> buggy parallelization happens and something is modifying vmlinux.o while
> modpost is processing it.  Of course it's probably acceptable to say,
> "Don't do that!"
>
> There are two alternatives here: actually read in the entire file, which
> is certainly suboptimal, or just live with the fact that mmap makes no
> guarantees about whether changes in the file are reflected in the memory map.

You are right. I missed the fact that grab_file() used mmap.

I am OK with the careful check.

Yet another possible alternative is, as Nick suggested, cut the first loop.
Use realloc() to grow the array as needed, but this is also suboptimal
if memory copy occurs.




> > > +       /* A bit of paranoia; make sure that the end sentinel's address is
> > > +        * different than its predecessor.  Not doing this could cause
> > > +        * possible undefined behavior if anybody ever inserts a symbol
> > > +        * with section_index and addr both at their max values.
> >
> > I could not understand this comment.
> >
> > If section_index and addr both at their max values at [table_size - 2],
> > ->table[table_size - 2].addr + 1 wraps to zero.
> >
> > The table is not sorted any longer?
> >
>
> That's correct, the table would not be sorted any longer. But by design,
> we never do a relational comparison against the end sentinel. But if
> we rework the search, either by your suggestion or by using bsearch(),
> this sentinel goes away.

Understood.
With mid = lo + (hi - lo) / 2,
the last sentinel is never set as mid.



--
Best Regards
Masahiro Yamada