Message-ID: <20250708101303.142ea3ee@foz.lan>
Date: Tue, 8 Jul 2025 10:13:03 +0200
From: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
To: Jonathan Corbet <corbet@....net>
Cc: linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, Akira Yokosawa
 <akiyks@...il.com>
Subject: Re: [PATCH v2 2/7] docs: kdoc: micro-optimize KernRe

On Fri, 04 Jul 2025 08:59:45 -0600
Jonathan Corbet <corbet@....net> wrote:

> Mauro Carvalho Chehab <mchehab+huawei@...nel.org> writes:
> 
> > On Thu, 03 Jul 2025 17:47:13 -0600
> > Jonathan Corbet <corbet@....net> wrote:
> >  
> >> Mauro Carvalho Chehab <mchehab+huawei@...nel.org> writes:
> >>   
> >> > On Thu,  3 Jul 2025 12:43:58 -0600
> >> > Jonathan Corbet <corbet@....net> wrote:
> >> >    
> >> >> Rework _add_regex() to avoid doing the lookup twice for the (hopefully
> >> >> common) cache-hit case.
> >> >> 
> >> >> Signed-off-by: Jonathan Corbet <corbet@....net>
> >> >> ---
> >> >>  scripts/lib/kdoc/kdoc_re.py | 7 ++-----
> >> >>  1 file changed, 2 insertions(+), 5 deletions(-)
> >> >> 
> >> >> diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py
> >> >> index e81695b273bf..612223e1e723 100644
> >> >> --- a/scripts/lib/kdoc/kdoc_re.py
> >> >> +++ b/scripts/lib/kdoc/kdoc_re.py
> >> >> @@ -29,12 +29,9 @@ class KernRe:
> >> >>          """
> >> >>          Adds a new regex or re-use it from the cache.
> >> >>          """
> >> >> -
> >> >> -        if string in re_cache:
> >> >> -            self.regex = re_cache[string]
> >> >> -        else:
> >> >> +        self.regex = re_cache.get(string, None)    
> >> >
> >> > With get, None is default...
> >> >    
> >> >> +        if not self.regex:
> >> >>              self.regex = re.compile(string, flags=flags)    
> >> >
> >> > ... yet, as you're using get, better to code it as:
> >> >
> >> > 	self.regex = re_cache.get(string, re.compile(string, flags=flags))    
> >> 
> >> ...but that will recompile the regex each time, defeating the purpose of
> >> the cache, no?  
> >
> > No. It should do exactly like the previous code:
> >
> > - if re_cache[string] exists, it returns it. 
> > - Otherwise, it returns re.compile(string, flags=flags).
> >
> > https://www.w3schools.com/python/ref_dictionary_get.asp  
> 
> The re.compile() call is evaluated before the call to get() - just like
> it would be in C.  This is easy enough to prove to yourself in the REPL
> if you doubt me...

You're right!

I tested it with a small code snippet, which printed:

	# test.py
	inner called
	Inner will be called: True
	inner called
	Inner should  not be called: False

I guess I expected too much from Python's optimizer ;-) My fault.

Your patch looks OK to me.

Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>

-

For reference, this was the test code:

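#!/usr/bin/env python3

def inner():
    # Announce every invocation so that eager evaluation is visible.
    print("inner called")

    return True

# Empty dict: the key is missing, so get() returns the default.
c = {}

print(f"Inner will be called: {c.get('a', inner())}")

# The key exists, yet inner() still runs: the default argument is
# evaluated before get() is called.
c = {"a": "False"}

print(f"Inner should  not be called: {c.get('a', inner())}")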
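
For comparison, here is a minimal sketch of the two lookup styles discussed in this thread, with a counter showing how often the compile step really runs. Only re_cache is a name taken from kdoc_re.py; counting_compile(), eager_lookup(), lazy_lookup() and the counter are hypothetical helpers for illustration. Storing the compiled regex back into the cache on a miss is an assumption based on the docstring, since the quoted hunk does not show that line.

```python
#!/usr/bin/env python3
import re

re_cache = {}        # name from kdoc_re.py; everything else is hypothetical
compile_calls = 0    # how many times the compile step actually ran


def counting_compile(string, flags=0):
    """Wrap re.compile() and record each invocation."""
    global compile_calls
    compile_calls += 1
    return re.compile(string, flags=flags)


def eager_lookup(string, flags=0):
    # The default argument is evaluated before get() runs, so the
    # compile step happens on every call, cache hit or not.
    return re_cache.get(string, counting_compile(string, flags))


def lazy_lookup(string, flags=0):
    # One dictionary lookup; compile only when the key is missing.
    regex = re_cache.get(string)
    if regex is None:
        regex = counting_compile(string, flags)
        re_cache[string] = regex  # assumption: a miss populates the cache
    return regex


# Two lazy lookups of the same pattern: one miss, then one hit.
lazy_lookup(r"\d+")
lazy_lookup(r"\d+")
lazy_calls = compile_calls

# Two eager lookups: both are cache hits, yet both compile.
compile_calls = 0
eager_lookup(r"\d+")
eager_lookup(r"\d+")
eager_calls = compile_calls

print(f"lazy: {lazy_calls} compile call(s), eager: {eager_calls}")
```

The lazy variant does a single dictionary lookup on a hit, which is the case the patch optimizes for, while the eager variant pays for the compile step even when the cached object is what gets returned.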




Thanks,
Mauro
