linux-kernel - Re: [PATCH 12/12] docs: kdoc: Improve the output text accumulation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250710101931.202953d1@foz.lan>
Date: Thu, 10 Jul 2025 10:19:31 +0200
From: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
To: Jonathan Corbet <corbet@....net>
Cc: linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, Akira Yokosawa
 <akiyks@...il.com>
Subject: Re: [PATCH 12/12] docs: kdoc: Improve the output text accumulation

Em Thu, 10 Jul 2025 09:13:52 +0200
Mauro Carvalho Chehab <mchehab+huawei@...nel.org> escreveu:

> Em Thu, 10 Jul 2025 08:41:19 +0200
> Mauro Carvalho Chehab <mchehab+huawei@...nel.org> escreveu:
> 
> > Em Wed,  2 Jul 2025 16:35:24 -0600
> > Jonathan Corbet <corbet@....net> escreveu:
> >   
> > > Building strings with repeated concatenation is somewhat inefficient in
> > > Python; it is better to make a list and glom them all together at the end.
> > > Add a small set of methods to the OutputFormat superclass to manage the
> > > output string, and use them throughout.
> > > 
> > > Signed-off-by: Jonathan Corbet <corbet@....net>    
> > 
> > The patch looks good to me. Just a minor nit below.
> >   
> > > ---
> > >  scripts/lib/kdoc/kdoc_output.py | 185 +++++++++++++++++---------------
> > >  1 file changed, 98 insertions(+), 87 deletions(-)
> > > 
> > > diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
> > > index ea8914537ba0..d4aabdaa9c51 100644
> > > --- a/scripts/lib/kdoc/kdoc_output.py
> > > +++ b/scripts/lib/kdoc/kdoc_output.py
> > > @@ -73,7 +73,19 @@ class OutputFormat:
> > >          self.config = None
> > >          self.no_doc_sections = False
> > >  
> > > -        self.data = ""
> > > +    #
> > > +    # Accumulation and management of the output text.
> > > +    #
> > > +    def reset_output(self):
> > > +        self._output = []
> > > +
> > > +    def emit(self, text):
> > > +        """Add a string to out output text"""
> > > +        self._output.append(text)
> > > +
> > > +    def output(self):
> > > +        """Obtain the accumulated output text"""
> > > +        return ''.join(self._output)    
> > 
> > I would prefer to use a more Pythonic name for this function:
> > 
> > 	def __str__(self)
> > 
> > This way, all it takes to get the final string is to use str():
> > 
> > 	out_str = str(out)
> > 
> > With that:
> > 
> > Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>  
> 
> 
> Hmm... actually, I would code it on a different way, using something like:
> 
> class OutputString:
>     def __init__(self):
> 	"""Initialize internal list"""
>         self._output = []
>     
>     # Probably not needed - The code can simply do, instead:
>     # a = OutputString() to create a new string.
>     def reset(self):
>         """Reset the output text"""
>         self._output = []
>     
>     def __add__(self, text):
> 	"""Add a string to out output text"""
>         if not isinstance(text, str):
>             raise TypeError("Can only append strings")
>         self._output.append(text)
>         return self
> 
>     def __str__(self):
>         return ''.join(self._output)
>     
>     # and, if needed, add a getter/setter:
> 
>     @property
>     def data(self):
>         """Getter for the current output"""
>         return ''.join(self._output)
> 
>     @data.setter
>     def data(self, new_value):
>         if isinstance(new_value, str):
> 	    self._output = [new_value]
> 	elif isinstance(new_value, list):
>             self._output = new_value
>         else:
>             raise TypeError("Value should be either list or string")
> 
> That would allow things like:
> 
> 	out = OutputString()
> 	out = out + "Foo" + " " + "Bar"
> 	print(out)
> 
> 	out = OutputString()
> 	out += "Foo"
> 	out += " "
> 	out += "Bar"
> 	return(str(out))
> 
> and won't require much changes at the output logic, and IMO will
> provide a cleaner code.
> 
> Thanks,
> Mauro

Heh, on those times where LLM can quickly code trivial things for us,
I actually decided to test 3 different variants:

- using string +=
- using list append
- using __add__
- using __iadd__

Except if the LLM-generated did something wrong (I double checked, and
was unable to find any issues), the results on Python 3.13.5 are:

$ /tmp/bench.py
Benchmarking 1,000 ops × 1000 strings = 1,000,000 appends

Run    str+=        ExplicitList __add__      __iadd__    
------------------------------------------------------------
1      25.26        29.44        53.42        50.71       
2      29.34        29.35        53.45        50.61       
3      29.44        29.56        53.41        50.67       
4      29.28        29.23        53.26        50.64       
5      29.28        29.20        45.90        40.47       
6      23.53        23.62        42.74        40.61       
7      23.43        23.76        42.97        40.78       
8      23.51        23.59        42.67        40.61       
9      23.43        23.52        42.77        40.72       
10     23.53        23.54        42.78        40.67       
11     23.83        23.63        42.98        40.87       
12     23.49        23.45        42.67        40.53       
13     23.43        23.69        42.75        40.66       
14     23.47        23.49        42.70        40.56       
15     23.44        23.63        42.72        40.52       
16     23.51        23.56        42.65        40.66       
17     23.48        23.60        42.86        40.81       
18     23.67        23.53        42.73        40.59       
19     23.75        23.62        42.78        40.58       
20     23.68        23.55        42.77        40.54       
21     23.65        23.67        42.76        40.59       
22     23.73        23.49        42.78        40.61       
23     23.61        23.59        42.78        40.58       
24     23.66        23.51        42.73        40.55       
------------------------------------------------------------
Avg    24.60        24.78        44.67        42.30       

Summary:
ExplicitList : 100.74% slower than str+=
__add__      : 181.56% slower than str+=
__iadd__     : 171.93% slower than str+=

(running it a couple of times sometimes it sometimes show list
 addition a little bit better, bu all at the +/- 1% range)

In practice, it means that my suggestion of using __add__ (or even
using the __iadd__ variant) was not good, but it also showed
that Python 3.13 implementation is actually very efficient
with str += operations.

With that, I would just drop this patch, as the performance is
almost identical, and using "emit()" instead of "+=" IMO makes
the code less clear.

-

Btw, with Python 3.9, "".join(list) is a lot worse than str += :

$ python3.9 /tmp/bench.py
Benchmarking 1,000 ops × 1000 strings = 1,000,000 appends

Run    str+=        ExplicitList __add__      __iadd__    
------------------------------------------------------------
1      28.27        87.24        96.03        88.81       
2      32.76        87.35        87.40        88.92       
3      32.69        85.98        73.01        70.87       
4      26.28        69.80        70.62        71.90       
5      27.21        70.54        71.04        72.00       
6      27.77        70.06        70.22        70.92       
7      27.03        69.75        70.30        70.89       
8      33.31        72.63        70.57        70.59       
9      26.33        70.15        70.27        70.97       
10     26.29        69.84        71.60        70.94       
11     26.59        69.60        70.16        71.26       
12     26.38        69.57        71.64        70.95       
13     26.41        69.89        70.11        70.85       
14     26.38        69.86        70.36        70.93       
15     26.43        69.57        70.18        70.90       
16     26.38        70.04        70.26        71.19       
17     26.40        70.02        80.50        71.01       
18     26.41        71.74        80.39        71.90       
19     28.06        69.60        71.95        70.88       
20     28.28        69.90        71.12        71.07       
21     26.34        69.74        72.42        71.02       
22     26.33        69.86        70.25        70.97       
23     26.40        69.78        71.64        71.10       
24     26.44        69.73        70.23        70.83       
------------------------------------------------------------
Avg    27.55        72.18        73.43        72.57       

Summary:
ExplicitList : 262.00% slower than str+=
__add__      : 266.54% slower than str+=
__iadd__     : 263.42% slower than str+=


Thanks,
Mauro

---

#!/usr/bin/env python3

import timeit

class ExplicitList:
    def __init__(self):
        self._output = []

    def emit(self, text):
        self._output.append(text)

    def output(self):
        return ''.join(self._output)

class OutputStringAdd:
    def __init__(self):
        self._output = []

    def __add__(self, text):
        self._output.append(text)
        return self

    def __str__(self):
        return ''.join(self._output)

class OutputStringIAdd:
    def __init__(self):
        self._output = []

    def __iadd__(self, text):
        self._output.append(text)
        return self

    def __str__(self):
        return ''.join(self._output)

def calculate_comparison(base_time, compare_time):
    """Returns tuple of (is_faster, percentage)"""
    if compare_time < base_time:
        return (True, (1 - compare_time/base_time)*100)
    else:
        return (False, (compare_time/base_time)*100)

def benchmark():
    N = 1000       # Operations
    STRINGS_PER_OP = 1000
    REPEATS = 24

    # Generate test data (1000 unique 10-character strings)
    test_strings = [f"string_{i:03d}" for i in range(STRINGS_PER_OP)]

    print(f"Benchmarking {N:,} ops × {STRINGS_PER_OP} strings = {N*STRINGS_PER_OP:,} appends\n")
    headers = ['Run', 'str+=', 'ExplicitList', '__add__', '__iadd__']
    print(f"{headers[0]:<6} {headers[1]:<12} {headers[2]:<12} {headers[3]:<12} {headers[4]:<12}")
    print("-" * 60)

    results = []

    for i in range(REPEATS):
        # Benchmark normal string +=
        t_str = timeit.timeit(
            'result = ""\nfor s in test_strings: result += s',
            globals={'test_strings': test_strings},
            number=N
        ) * 1000

        # Benchmark ExplicitList
        t_explicit = timeit.timeit(
            'obj = ExplicitList()\nfor s in test_strings: obj.emit(s)',
            globals={'test_strings': test_strings, 'ExplicitList': ExplicitList},
            number=N
        ) * 1000

        # Benchmark __add__ version
        t_add = timeit.timeit(
            'obj = OutputStringAdd()\nfor s in test_strings: obj += s',
            globals={'test_strings': test_strings, 'OutputStringAdd': OutputStringAdd},
            number=N
        ) * 1000

        # Benchmark __iadd__ version
        t_iadd = timeit.timeit(
            'obj = OutputStringIAdd()\nfor s in test_strings: obj += s',
            globals={'test_strings': test_strings, 'OutputStringIAdd': OutputStringIAdd},
            number=N
        ) * 1000

        results.append((t_str, t_explicit, t_add, t_iadd))
        print(f"{i+1:<6} {t_str:<12.2f} {t_explicit:<12.2f} {t_add:<12.2f} {t_iadd:<12.2f}")

    # Calculate averages
    avg_str = sum(r[0] for r in results) / REPEATS
    avg_explicit = sum(r[1] for r in results) / REPEATS
    avg_add = sum(r[2] for r in results) / REPEATS
    avg_iadd = sum(r[3] for r in results) / REPEATS

    print("-" * 60)
    print(f"{'Avg':<6} {avg_str:<12.2f} {avg_explicit:<12.2f} {avg_add:<12.2f} {avg_iadd:<12.2f}")

    print()
    print("Summary:")
    # Calculate and print comparisons
    for name, time in [("ExplicitList", avg_explicit),
                      ("__add__", avg_add),
                      ("__iadd__", avg_iadd)]:
        is_faster, percentage = calculate_comparison(avg_str, time)
        if is_faster:
            print(f"{name:<12} : {percentage:.2f}% faster than str+=")
        else:
            print(f"{name:<12} : {percentage:.2f}% slower than str+=")


if __name__ == "__main__":
    benchmark()