linux-kernel - Re: [PATCH 1/2] ACPI: nsrepair2: Replace O(n²) bubble sort with O(n log n) sort

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CABZAGREQwVUfRQkkTTXUYD_Uvkf0Wxa=dj7r_r85vMsTnUn67A@mail.gmail.com>
Date: Tue, 3 Feb 2026 19:03:09 +0800
From: Nick Huang <sef1548@...il.com>
To: David Laight <david.laight.linux@...il.com>
Cc: "Rafael J . Wysocki" <rafael@...nel.org>, Robert Moore <robert.moore@...el.com>, 
	Len Brown <lenb@...nel.org>, linux-acpi@...r.kernel.org, acpica-devel@...ts.linux.dev, 
	linux-kernel@...r.kernel.org, paladin@...b.edu.tw, kusogame68@...il.com, 
	ceyanglab@...il.com, n1136402@...b.edu.tw
Subject: Re: [PATCH 1/2] ACPI: nsrepair2: Replace O(n²) bubble sort with O(n log n) sort_r()

David Laight <david.laight.linux@...il.com> 於 2026年2月2日週一 上午6:48寫道：
>
> On Sun,  1 Feb 2026 13:03:33 +0000
> Nick Huang <sef1548@...il.com> wrote:
>
> >    Replace the O(n²) bubble sort implementation in acpi_ns_sort_list()
> >    with the kernel's sort_r() function which uses heapsort, providing
> >    O(n log n) time complexity.
> >
> >    This improves performance for large ACPI package lists while also
> >    reducing code complexity by leveraging the existing kernel sort API.
>
> What is the break even size?
> While the heapsort is O(n long n) it is also more complicated.
> There is also the cost of the function call - especially with all the
> mitigations that distro kernels are likely to enable.
>
> For large datasets the d-cache locality of both sorts is particularly horrid.
> It is almost certainly better to allocate an array of index:value pairs
> and sort that.
> For very big datasets you want to sort small sections (that fit in the
> d-cache) and then use merge sorts (also O(n log n)) to combine them.
> (Yes - this is how you sort data with 3 mag-tape drives....)
>
> Oh, in any case, write separate functions for ascending/descending.
>
>         David
>
Hi David

  Thank you for your reply and the detailed feedback.


   I ran KUnit benchmarks to investigate your questions.


   === Break-Even Point ===


   I compared bubble sort vs heapsort across various array sizes:


     N  | Bubble(ns) | Heap(ns) | Faster

    ----|------------|----------|--------

      2 |         61 |       99 | bubble

      3 |         63 |      115 | bubble

      4 |         78 |      163 | bubble

      5 |         96 |      215 | bubble

      6 |        119 |      260 | bubble

      8 |        177 |      388 | bubble

     10 |        257 |      539 | bubble

     12 |        415 |      721 | bubble

     16 |        726 |     1044 | bubble

     20 |       1106 |     1484 | bubble

     32 |       2854 |     3091 | bubble


   Bubble sort is faster for all tested sizes. The break-even point was not

   reached within N=32, which covers all realistic ACPI use cases:


     - _ALR: 2-10 entries (ambient light response)

     - _CST: 2-8 entries (C-states)

     - _PSS: 5-20 entries (P-states)

     - _TSS: 2-16 entries (T-states)


   As you noted, heapsort has additional overhead from function calls and

   mitigations that outweigh its O(n log n) advantage at small N.


   === Separate Ascending/Descending Functions ===


   I also tested combined vs separate sort functions as you suggested:


     N  | Combined(ns) | Separate(ns) | Saved

    ----|--------------|--------------|------

      4 |           84 |           75 |  11%

      8 |          179 |          175 |   3%

     16 |          737 |          806 |  -9%


   The results are mixed. Separate functions show marginal improvement at

   small N, but the combined function performs better at N=16, possibly

   due to instruction cache effects.


   === Conclusion ===


   Given these results, replacing bubble sort with heapsort would likely

   degrade performance for typical ACPI workloads. The existing bubble

   sort implementation appears to be the right choice for this use case.



-- 
Regards,
Nick Huang