Message-ID: <jx6yufij6gygb6ypsoq2yhw3eb3nobr4ytnb7phgmbpn5gmtws@23hu2rwhm4mt>
Date: Sun, 7 Apr 2024 00:12:22 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Kuan-Wei Chiu <visitorckw@...il.com>
Cc: bfoster@...hat.com, jserv@...s.ncku.edu.tw, 
	linux-bcachefs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] bcachefs: Optimize eytzinger0_sort() with bottom-up
 heapsort

On Sun, Apr 07, 2024 at 11:39:04AM +0800, Kuan-Wei Chiu wrote:
> This optimization reduces the average number of comparisons required
> from 2*n*log2(n) - 3*n + o(n) to n*log2(n) + 0.37*n + o(n). When n is
> sufficiently large, it results in approximately 50% fewer comparisons.
> 
> Currently, eytzinger0_sort employs the textbook version of heapsort,
> where during the heapify process, each level requires two comparisons
> to determine the maximum among three elements. In contrast, the
> bottom-up heapsort, during heapify, only compares two children at each
> level until reaching a leaf node. Then, it backtracks from the leaf
> node to find the correct position. Since heapify typically continues
> until very close to the leaf node, the standard heapify requires about
> 2*log2(n) comparisons, while the bottom-up variant only needs log2(n)
> comparisons.
> 
> The experimental data presented below is based on an array generated
> by get_random_u32().
> 
> |   N   | comparisons(old) | comparisons(new) | time(old) | time(new) |
> |-------|------------------|------------------|-----------|-----------|
> | 10000 |     235381       |     136615       |  25545 us |  20366 us |
> | 20000 |     510694       |     293425       |  31336 us |  18312 us |
> | 30000 |     800384       |     457412       |  35042 us |  27386 us |
> | 40000 |    1101617       |     626831       |  48779 us |  38253 us |
> | 50000 |    1409762       |     799637       |  62238 us |  46950 us |
> | 60000 |    1721191       |     974521       |  75588 us |  58367 us |
> | 70000 |    2038536       |    1152171       |  90823 us |  68778 us |
> | 80000 |    2362958       |    1333472       | 104165 us |  78625 us |
> | 90000 |    2690900       |    1516065       | 116111 us |  89573 us |
> | 100000|    3019413       |    1699879       | 133638 us | 100998 us |
> 
> Refs:
>   BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average,
>   QUICKSORT (if n is not very small)
>   Ingo Wegener
>   Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993
>   https://doi.org/10.1016/0304-3975(93)90364-Y
> 
> Signed-off-by: Kuan-Wei Chiu <visitorckw@...il.com>

Thanks - applied
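
For readers following the quoted description, here is a minimal sketch of the bottom-up sift-down it describes. It is not the applied patch: it works on a plain int array with ordinary 0-based heap indexing rather than the eytzinger0 layout, and the names less(), cmp_count, sift_down_bottom_up() and heapsort_bottom_up() are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical comparison counter, for illustration only. */
static unsigned long cmp_count;

static int less(int a, int b)
{
	cmp_count++;
	return a < b;
}

static void swap_int(int *a, int *b)
{
	int t = *a;

	*a = *b;
	*b = t;
}

/* Bottom-up sift-down on the max-heap v[0..n-1], starting at index i. */
static void sift_down_bottom_up(int *v, size_t n, size_t i)
{
	size_t j = i, c;

	/* Descend to a leaf along the larger child: one comparison per level. */
	while ((c = 2 * j + 1) + 1 < n)
		j = less(v[c], v[c + 1]) ? c + 1 : c;

	/* The last internal node may have only a left child. */
	if (c < n)
		j = c;

	/* Backtrack from the leaf until v[i] fits below v[j]'s parent. */
	while (j != i && !less(v[i], v[j]))
		j = (j - 1) / 2;

	/* Move v[i] to slot j, shifting the path above it up one level. */
	c = j;
	while (j != i) {
		j = (j - 1) / 2;
		swap_int(&v[j], &v[c]);
	}
}

static void heapsort_bottom_up(int *v, size_t n)
{
	size_t i;

	if (n < 2)
		return;

	/* Build the max-heap from the last internal node down to the root. */
	for (i = n / 2; i-- > 0; )
		sift_down_bottom_up(v, n, i);

	/* Repeatedly move the maximum to the end and restore the heap. */
	for (i = n - 1; i > 0; i--) {
		swap_int(&v[0], &v[i]);
		sift_down_bottom_up(v, i, 0);
	}
}
```

Calling heapsort_bottom_up(a, n) sorts a[] in ascending order; reading cmp_count afterwards gives the number of element comparisons performed, which is the quantity the quoted table reports.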

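As a rough cross-check of the quoted formulas against the quoted measurements, the leading terms can be evaluated at n = 100000, dropping the o(n) terms (an approximation on my part, so the figures are indicative only):

```latex
\begin{align*}
\log_2(10^5) &\approx 16.61 \\
2n\log_2 n - 3n &\approx 3{,}321{,}928 - 300{,}000 = 3{,}021{,}928
  && \text{(measured: } 3{,}019{,}413\text{)} \\
n\log_2 n + 0.37n &\approx 1{,}660{,}964 + 37{,}000 = 1{,}697{,}964
  && \text{(measured: } 1{,}699{,}879\text{)} \\
\frac{1{,}697{,}964}{3{,}021{,}928} &\approx 0.56
\end{align*}
```

So at this size both the formulas and the table give roughly 44% fewer comparisons; the 50% figure is the limit as n grows and the linear terms become negligible.
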