lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CABZAGREQwVUfRQkkTTXUYD_Uvkf0Wxa=dj7r_r85vMsTnUn67A@mail.gmail.com>
Date: Tue, 3 Feb 2026 19:03:09 +0800
From: Nick Huang <sef1548@...il.com>
To: David Laight <david.laight.linux@...il.com>
Cc: "Rafael J . Wysocki" <rafael@...nel.org>, Robert Moore <robert.moore@...el.com>, 
	Len Brown <lenb@...nel.org>, linux-acpi@...r.kernel.org, acpica-devel@...ts.linux.dev, 
	linux-kernel@...r.kernel.org, paladin@...b.edu.tw, kusogame68@...il.com, 
	ceyanglab@...il.com, n1136402@...b.edu.tw
Subject: Re: [PATCH 1/2] ACPI: nsrepair2: Replace O(n²) bubble sort with O(n log n) sort_r()

David Laight <david.laight.linux@...il.com> 於 2026年2月2日週一 上午6:48寫道:
>
> On Sun,  1 Feb 2026 13:03:33 +0000
> Nick Huang <sef1548@...il.com> wrote:
>
> >    Replace the O(n²) bubble sort implementation in acpi_ns_sort_list()
> >    with the kernel's sort_r() function which uses heapsort, providing
> >    O(n log n) time complexity.
> >
> >    This improves performance for large ACPI package lists while also
> >    reducing code complexity by leveraging the existing kernel sort API.
>
> What is the break even size?
> While the heapsort is O(n long n) it is also more complicated.
> There is also the cost of the function call - especially with all the
> mitigations that distro kernels are likely to enable.
>
> For large datasets the d-cache locality of both sorts is particularly horrid.
> It is almost certainly better to allocate an array of index:value pairs
> and sort that.
> For very big datasets you want to sort small sections (that fit in the
> d-cache) and then use merge sorts (also O(n log n)) to combine them.
> (Yes - this is how you sort data with 3 mag-tape drives....)
>
> Oh, in any case, write separate functions for ascending/descending.
>
>         David
>
Hi David

  Thank you for your reply and the detailed feedback.


   I ran KUnit benchmarks to investigate your questions.


   === Break-Even Point ===


   I compared bubble sort vs heapsort across various array sizes:


     N  | Bubble(ns) | Heap(ns) | Faster

    ----|------------|----------|--------

      2 |         61 |       99 | bubble

      3 |         63 |      115 | bubble

      4 |         78 |      163 | bubble

      5 |         96 |      215 | bubble

      6 |        119 |      260 | bubble

      8 |        177 |      388 | bubble

     10 |        257 |      539 | bubble

     12 |        415 |      721 | bubble

     16 |        726 |     1044 | bubble

     20 |       1106 |     1484 | bubble

     32 |       2854 |     3091 | bubble


   Bubble sort is faster for all tested sizes. The break-even point was not

   reached within N=32, which covers all realistic ACPI use cases:


     - _ALR: 2-10 entries (ambient light response)

     - _CST: 2-8 entries (C-states)

     - _PSS: 5-20 entries (P-states)

     - _TSS: 2-16 entries (T-states)


   As you noted, heapsort has additional overhead from function calls and

   mitigations that outweigh its O(n log n) advantage at small N.


   === Separate Ascending/Descending Functions ===


   I also tested combined vs separate sort functions as you suggested:


     N  | Combined(ns) | Separate(ns) | Saved

    ----|--------------|--------------|------

      4 |           84 |           75 |  11%

      8 |          179 |          175 |   3%

     16 |          737 |          806 |  -9%


   The results are mixed. Separate functions show marginal improvement at

   small N, but the combined function performs better at N=16, possibly

   due to instruction cache effects.


   === Conclusion ===


   Given these results, replacing bubble sort with heapsort would likely

   degrade performance for typical ACPI workloads. The existing bubble

   sort implementation appears to be the right choice for this use case.



-- 
Regards,
Nick Huang

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ