lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAH0uvoj-aS44f78iQiVqDZiR1mN5SHcXgR17_Lnu_b5CC-ZsPQ@mail.gmail.com>
Date: Thu, 6 Feb 2025 17:04:28 -0800
From: Howard Chu <howardchu95@...il.com>
To: Krzysztof Łopatowski <krzysztof.m.lopatowski@...il.com>
Cc: namhyung@...nel.org, acme@...nel.org, irogers@...gle.com, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf: Improve startup time by reducing unnecessary stat() calls

Hello Krzysztof,

On Thu, Feb 6, 2025 at 3:36 AM Krzysztof Łopatowski
<krzysztof.m.lopatowski@...il.com> wrote:
>
> When testing perf trace on NixOS, I noticed significant startup delays:
> - `ls`: ~2ms
> - `strace ls`: ~10ms
> - `perf trace ls`: ~550ms

Thank you so much for this insight.

>
> Profiling showed that 51% of the time is spent reading files,
> 26% in loading BPF programs, and 11% in `newfstatat`.

That's a valuable piece of info, thank you.

>
> This patch optimizes module path exploration by avoiding `stat()` calls
> unless necessary. For filesystems that do not implement `d_type`
> (DT_UNKNOWN), it falls back to the old behavior.
> See `readdir(3)` for details.
>
> This reduces `perf trace ls` time to ~500ms.

Before:

perf $ time sudo ./perf trace -a --max-events=1 -S 2 >&1 1>/dev/null
         ? (         ): :2278675/2278675  ... [continued]: read())
                                        = 1
real    0m0.833s

So 833ms.

After:

perf $ time sudo ./perf trace -a --max-events=1 -S 2 >&1 1>/dev/null
         ? (         ): :2274650/2274650  ... [2: No such file or directory
<SNIP>
real    0m0.805s
user    0m0.004s
sys     0m0.011s

805ms.

And the results are consistent. Apparently works for me, thank you :)

>
> A more thorough startup optimization based on command parameters would
> be ideal, but that is a larger effort.
>
> Signed-off-by: Krzysztof Łopatowski <krzysztof.m.lopatowski@...il.com>
> ---
>  tools/perf/util/machine.c | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 2d51badfbf2e..76b857fd752b 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1354,7 +1354,7 @@ static int maps__set_module_path(struct maps *maps, const char *path, struct kmo
>
>  static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, int depth)
>  {
> -       struct dirent *dent;
> +       const struct dirent *dent;

Is it necessary to constify dent?

>         DIR *dir = opendir(dir_name);
>         int ret = 0;
>
> @@ -1365,14 +1365,20 @@ static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, i
>
>         while ((dent = readdir(dir)) != NULL) {
>                 char path[PATH_MAX];
> -               struct stat st;
> +               unsigned char d_type = dent->d_type;
>
> -               /*sshfs might return bad dent->d_type, so we have to stat*/
>                 path__join(path, sizeof(path), dir_name, dent->d_name);
> -               if (stat(path, &st))
> -                       continue;
>
> -               if (S_ISDIR(st.st_mode)) {
> +               if (d_type == DT_UNKNOWN) {
> +                       struct stat st;
> +
> +                       if (stat(path, &st))
> +                               continue;
> +                       if (S_ISDIR(st.st_mode))
> +                               d_type = DT_DIR;
> +               }
> +
> +               if (d_type == DT_DIR) {
>                         if (!strcmp(dent->d_name, ".") ||
>                             !strcmp(dent->d_name, ".."))
>                                 continue;
> --
> 2.47.2
>
>

Acked-by: Howard Chu <howardchu95@...il.com>

Thanks,
Howard

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ