Message-ID: <CAMuHMdVgmb8eZ7bD+jye8DG7cQRywvVmwo_XNJ00f198pDcwxQ@mail.gmail.com>
Date:   Fri, 15 Mar 2019 13:57:05 +0100
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     George Spelvin <lkml@....org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Daniel Wagner <daniel.wagner@...mens.com>,
        Dave Chinner <dchinner@...hat.com>,
        Don Mullis <don.mullis@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Andrey Abramov <st5pub@...dex.ru>
Subject: Re: [PATCH 4/5] lib/list_sort: Simplify and remove MAX_LIST_LENGTH_BITS

Hi George,

On Fri, Mar 15, 2019 at 11:23 AM George Spelvin <lkml@....org> wrote:
> On Fri, 15 Mar 2019 at 09:20:58 +0100, Geert Uytterhoeven wrote:
> > On Fri, Mar 15, 2019 at 5:33 AM George Spelvin <lkml@....org> wrote:
> >> On Thu, 14 Mar 2019 at 11:10:41 +0200, Andy Shevchenko wrote:
> >>> On Tue, Mar 05, 2019 at 03:06:44AM +0000, George Spelvin wrote:
> >>>> +            for (bit = 1; count & bit; bit <<= 1) {
> >>>> +                    cur = merge(priv, (cmp_func)cmp, pending, cur);
> >>>> +                    pending = pending->prev;  /* Untouched by merge() */
> >>>>              }
> >>>
> >>> Wouldn't it be the same to
> >>>
> >>>       bit = ffz(count);
> >>>       while (bit--) {
> >>>               ...
> >>>       }
> >>> ?
> >>>
> >>> Though I dunno which one is generating better code.
> >>
> >> One question I should ask everyone: should "count" be 32 or 64 bits
> >> on 64-bit machines?  That would let x86 save a few REX bytes.  (815
> >> vs. 813 byte code, if anyone cares.)
> >>
> >> Allegedly ARM can save a few pJ by gating the high 32
> >> bits of the ALU.
> >>
> >> Most other 64-bit processors would prefer 64-bit operations as
> >> it saves masking operations.
> >>
> >> If we never sort a list with more than 2^32 entries, it
> >> makes no difference.
> >>
> >> If we use a 32-bit count and we *do* sort a list with more than
> >> 2^32 entries, then it still sorts, but the performance degrades to
> >> O((n/2^32)^2).
> >>
> >> Just how often do we expect the kernel to face lists that long?
> >> (Note that the old code was O((n/2^20)^2).)
> >
> > Using size_t sounds most logical to me (argument of least surprise).
>
> Yes, it is the obvious solution, which is why that's my default choice.
>
> But a bit of thought shows that a list long enough to break a
> 32-bit implementation is beyond ridiculous.
>
> The list must be at least 3 * 2^32 elements long to make the sort
> merge non-optimally.  That's 1.5 * 2^37 bytes (192 GiB) of list_head
> structures alone; at least double that for any practical application.
> And 32 * 3 * 2^32 + (2 + 3) * 2^32 = 101 * 2^32 = 1.57 * 2^38
> compares.
>
> That seems like a lot, but that's not the bottleneck.  Each compare
> reads from a new list element, and pretty soon, they'll miss all
> caches and go to main memory.
>
> Since the memory locations are random, for any small subset of the
> list, you'll get only one element per cache line.  A 32 MiB L3
> cache is 2^19 cache lines (assuming 64B lines).  So merge levels
> 20 through 33 will go to main memory.
>
> That's (12 * 3 + 5) * 2^32 = 1.28 * 2^37 cache misses.  At 60 ns each (typical
> local DRAM access time on i7 & Xeon according to Intel), that's a
> hard minimum of 10565 seconds = 2h 56m 05s in one list_sort call.
>
> This is definitely the scale of problem where a multithreaded sort is
> called for.
>
> It's *so* impossible that maybe it's worth trading that capability
> for a couple of bytes in the inner loop.
>
> >> In the code, I could do something like
> >>
> >> #ifdef CONFIG_X86_64
> >> /* Comment explaining why */
> >> typedef uint32_t count_t;
> >> #else
> >> typedef size_t count_t;
> >> #endif
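
For anyone re-deriving the cost estimate above, here is a quick standalone
re-check of the arithmetic (a sketch only -- the 3 * 2^32 elements, 16-byte
list_heads, 41 * 2^32 misses and 60 ns per miss are all taken from the quoted
message, nothing here is measured):

/*
 * Standalone re-check of the quoted arithmetic; every input comes
 * from the message above.
 */
#include <stdio.h>

int main(void)
{
	unsigned long long n = 3ULL << 32;		/* list elements */
	unsigned long long bytes = n * 16;		/* list_heads alone */
	unsigned long long compares = 101ULL << 32;	/* (32*3 + 2+3) * 2^32 */
	unsigned long long misses = 41ULL << 32;	/* (12*3 + 5) * 2^32 */
	double seconds = misses * 60e-9;		/* 60 ns per miss */

	printf("list_head bytes: %llu (%.0f GiB)\n",
	       bytes, bytes / (double)(1ULL << 30));
	printf("compares: %llu\n", compares);
	printf("cache misses: %llu\n", misses);
	printf("time: %.1f s (~%dh %02dm)\n", seconds,
	       (int)(seconds / 3600), (int)(seconds / 60) % 60);
	return 0;
}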

So just make it unsigned int, unconditionally.
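
For concreteness, a minimal standalone sketch (not the kernel code) of that
loop with a plain unsigned int count, next to the ffz() form Andy asked
about -- merge() is reduced to a counter here, so this only demonstrates
that both shapes do one merge step per trailing 1 bit of count:

#include <stdio.h>

/* Loop shape from the patch: merge once per trailing 1 bit of count. */
static unsigned int merges_bit_loop(unsigned int count)
{
	unsigned int bit, n = 0;

	for (bit = 1; count & bit; bit <<= 1)
		n++;	/* stands in for merge(priv, cmp, pending, cur) */
	return n;
}

/* Andy's suggestion: ffz(count) is the index of the lowest clear bit. */
static unsigned int merges_ffz_loop(unsigned int count)
{
	unsigned int bit = __builtin_ctz(~count);	/* ffz() stand-in */
	unsigned int n = 0;

	while (bit--)
		n++;
	return n;
}

int main(void)
{
	unsigned int count;

	for (count = 0; count < 1024; count++)
		if (merges_bit_loop(count) != merges_ffz_loop(count))
			printf("mismatch at count=%u\n", count);
	return 0;
}

Either way the number of merge steps per insertion is the same; the only
difference is what the compiler generates for the loop control.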

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
