[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAMRc=McFmcFVyVE+b1EPASdQAuuBrXds11o+=QVXY+HocXOGBA@mail.gmail.com>
Date: Tue, 18 Nov 2025 13:25:56 +0100
From: Bartosz Golaszewski <brgl@...ev.pl>
To: Andy Shevchenko <andy.shevchenko@...il.com>
Cc: Kees Cook <kees@...nel.org>, Andy Shevchenko <andy@...nel.org>,
Linus Walleij <linus.walleij@...aro.org>, linux-hardening@...r.kernel.org,
linux-kernel@...r.kernel.org,
Bartosz Golaszewski <bartosz.golaszewski@...aro.org>, Stephen Rothwell <sfr@...b.auug.org.au>
Subject: Re: [PATCH 0/2] string: strends() follow-ups
On Tue, Nov 18, 2025 at 12:58 PM Andy Shevchenko
<andy.shevchenko@...il.com> wrote:
>
> On Tue, Nov 18, 2025 at 12:43 PM Bartosz Golaszewski <brgl@...ev.pl> wrote:
> > On Tue, Nov 18, 2025 at 11:32 AM Andy Shevchenko
> > <andy.shevchenko@...il.com> wrote:
> > > On Tue, Nov 18, 2025 at 12:14 PM Bartosz Golaszewski <brgl@...ev.pl> wrote:
> > > > On Tue, Nov 18, 2025 at 11:09 AM Andy Shevchenko
> > > > <andy.shevchenko@...il.com> wrote:
> > > > > On Tue, Nov 18, 2025 at 12:04 PM Bartosz Golaszewski <brgl@...ev.pl> wrote:
>
> ...
>
> > > > > > A couple follow-up changes to the new strends() string helper. This
> > > > > > needs to go through the GPIO tree as this is where the strends()
> > > > > > currently is.
> > >
> > > > > For the
> > > > > strends() I proposed to get rid of strlen() calls by
> > > > >
> > > > > char *p;
> > > > >
> > > > > p = strrchr(str, suffix[0[);
> > > > > if (!p)
> > > > > return false;
> > > > >
> > > > > return strcmp(p, suffix) == 0;
> > > >
> > > > IMO that's a bit less readable. Unless you benchmark it and show it's
> > > > faster than the current version, I'd say: let's keep the current
> > > > implementation.
> > >
> > > For the static suffixes the second strlen() becomes a hardcoded value,
> > > and I expect the benchmark will be closer to the variant I propose.
> > > Otherwise it will be definitely faster as the strrchr() implies
> > > partial strlen() and strcmp() is the same or even faster in my case as
> > > here we don't do the additional calculations with the pointers. Do you
> > > really need a benchmark for this?
> >
> > Ok, I don't want to load too much string-related stuff into my tree.
> > I'm adding it to my TODO list for the next release where I already
> > have an item to replace the OF-specific implementation of suffix
> > comparator with strends().
>
> Okay, I have done a little benchmark (but on GLibC, a.k.a. in user
> space) for both.
>
> TL;DR: if the suffix is known (string literal) ahead, yours beats
> mine, and vice versa. Since I suppose that most of the time the suffix
> will be known, I think your current implementation is fine. And I
> assume that the kernel implementations of the respective C functions
> are on par with GLibC.
>
Thanks for taking the time to measure it. Ok, I will keep my
implementation for now then.
Bart
> RUN 1
> // static to compare with '-0c' out of 24 strings with different
> endings, 23 do not match
> strends1 took 0.919427 seconds to execute. (last 00-45-76-09-01-02-03-0c)
> strends2 took 1.100318 seconds to execute. (last 00-45-76-09-01-02-03-0c)
>
> // same, but the (random) suffix came from a variable, so the compiler
> can't optimise that
> strends1 took 1.398531 seconds to execute. (last 10-f5-06-08-04-02-03-0d)
> strends2 took 1.055180 seconds to execute. (last 10-f5-06-08-04-02-03-0d)
>
> RUN2
> strends1 took 0.945234 seconds to execute. (last 00-45-76-09-01-02-03-0c)
> strends2 took 1.012759 seconds to execute. (last 00-45-76-09-01-02-03-0c)
> strends1 took 1.210770 seconds to execute. (last 00-45-76-09-01-02-03-04)
> strends2 took 0.933389 seconds to execute. (last 00-45-76-09-01-02-03-04)
>
> RUN3
> strends1 took 0.858516 seconds to execute. (last 00-45-76-09-01-02-03-0c)
> strends2 took 1.011493 seconds to execute. (last 00-45-76-09-01-02-03-0c)
> strends1 took 1.221119 seconds to execute. (last 00-45-76-09-01-02-03-14)
> strends2 took 0.953013 seconds to execute. (last 00-45-76-09-01-02-03-14)
>
> Each run is 10mil over an array of 24 strings each of 24 bytes, so, I
> assume cache lines are drained properly.
>
> So, if you can see my algo beats the dynamic case well (bigger time
> delta for the static comparison). If you want, it can be done both
> ways (each algo for its own cases).
>
Powered by blists - more mailing lists