Message-ID: <20231115103241.GD3818@noisy.programming.kicks-ass.net>
Date:   Wed, 15 Nov 2023 11:32:41 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     David Wang <00107082@....com>
Cc:     mingo@...hat.com, acme@...nel.org, mark.rutland@....com,
        alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
        namhyung@...nel.org, irogers@...gle.com, adrian.hunter@...el.com,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [Regression or Fix] perf: profiling stats significantly changed
 for aio_write/read(ext4) between 6.7.0-rc1 and 6.6.0


Namhyung, could you please take a look, you know how to operate this
cgroup stuff.

On Tue, Nov 14, 2023 at 11:20:32PM +0800, David Wang wrote:
> 
> 
> At 2023-11-14 12:31:12, "David Wang" <00107082@....com> wrote:
> >Hi,
> >
> >I was profiling the kernel to identify changes between revisions under a fio benchmark load,
> 
> >But based on the fio report I did not notice a significant change in the overhead of perf tracing,
> >which is why I am not sure whether this is a regression or an improvement...
> >I am just reporting the change.
> >
> 
> I ran the same test on another disk which is almost idle all the time, and the perf sample stats still change significantly:
> 
> 6.7.0-rc1
> 
> aio_write(49.340% 131501/266521)
>     ext4_file_write_iter(94.567% 124356/131501)
>         iomap_dio_rw(78.494% 97612/124356)
>             __iomap_dio_rw(99.550% 97173/97612)
>                 iomap_dio_bio_iter(47.773% 46422/97173)   <<----
>                 blk_finish_plug(29.931% 29085/97173)      <<----
>                 iomap_iter(14.082% 13684/97173)
>                 kmalloc_trace(1.814% 1763/97173)
>                 kiocb_invalidate_pages(0.631% 613/97173)
>                 invalidate_inode_pages2_range(0.448% 435/97173)
>                 srso_return_thunk(0.320% 311/97173)
>                 blk_start_plug(0.167% 162/97173)
> 
> 
> 6.7.0-rc1 with f06cc667f79 reverted
> 
> aio_write(49.071% 155873/317649)
>     ext4_file_write_iter(95.211% 148409/155873)
>         iomap_dio_rw(81.816% 121422/148409)
>             __iomap_dio_rw(99.684% 121038/121422)
>                 iomap_dio_bio_iter(40.280% 48754/121038)   <<---
>                 blk_finish_plug(36.760% 44494/121038)      <<---
>                 iomap_iter(14.657% 17740/121038)
>                 kmalloc_trace(1.775% 2149/121038)
>                 kiocb_invalidate_pages(0.599% 725/121038)
>                 invalidate_inode_pages2_range(0.330% 399/121038)
>                 srso_return_thunk(0.263% 318/121038)
>                 blk_start_plug(0.169% 204/121038)
> 
> 
> I removed the 4 occurrences of the nr_cgroups check:
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 683dc086ef10..7583418e94d9 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -690,8 +690,8 @@ static void perf_ctx_disable(struct perf_event_context *ctx, bool cgroup)
>         struct perf_event_pmu_context *pmu_ctx;
>  
>         list_for_each_entry(pmu_ctx, &ctx->pmu_ctx_list, pmu_ctx_entry) {
> -               if (cgroup && !pmu_ctx->nr_cgroups)
> -                       continue;
> +       //      if (cgroup && !pmu_ctx->nr_cgroups)
> +       //              continue;
>                 perf_pmu_disable(pmu_ctx->pmu);
>         }
>  }
> @@ -701,8 +701,8 @@ static void perf_ctx_enable(struct perf_event_context *ctx, bool cgroup)
>         struct perf_event_pmu_context *pmu_ctx;
>  
>         list_for_each_entry(pmu_ctx, &ctx->pmu_ctx_list, pmu_ctx_entry) {
> -               if (cgroup && !pmu_ctx->nr_cgroups)
> -                       continue;
> +       //      if (cgroup && !pmu_ctx->nr_cgroups)
> +       //              continue;
>                 perf_pmu_enable(pmu_ctx->pmu);
>         }
>  }
> @@ -3307,8 +3307,8 @@ ctx_sched_out(struct perf_event_context *ctx, enum event_type_t event_type)
>         is_active ^= ctx->is_active; /* changed bits */
>  
>         list_for_each_entry(pmu_ctx, &ctx->pmu_ctx_list, pmu_ctx_entry) {
> -               if (cgroup && !pmu_ctx->nr_cgroups)
> -                       continue;
> +       //      if (cgroup && !pmu_ctx->nr_cgroups)
> +       //              continue;
>                 __pmu_ctx_sched_out(pmu_ctx, is_active);
>         }
>  }
> 
> 
> The resulting profiling stats are similar to those with f06cc667f79 reverted.
> I think there may be cases where pmu_ctx->nr_cgroups is zero but there is still a cgroup event within, so some samples are missed, which would explain the change in the stats; just a guess.
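> 
> To make the guess concrete, here is a toy userspace sketch (plain C,
> not the kernel code; toy_pmu_ctx and has_cgroup_event are simplified
> stand-ins made up for illustration) of how the nr_cgroups filter
> could skip a PMU context that still carries an event to pause:
> 
> #include <stdbool.h>
> #include <stdio.h>
> 
> /* Loose stand-in for struct perf_event_pmu_context. */
> struct toy_pmu_ctx {
>         const char *name;
>         int nr_cgroups;         /* counter the upstream check looks at */
>         bool has_cgroup_event;  /* hypothetical leftover cgroup event */
> };
> 
> /* Mirrors the shape of perf_ctx_disable()'s loop, nothing more. */
> static void toy_ctx_disable(struct toy_pmu_ctx *ctxs, int n, bool cgroup)
> {
>         for (int i = 0; i < n; i++) {
>                 if (cgroup && !ctxs[i].nr_cgroups)
>                         continue;  /* skipped: this PMU is never paused */
>                 printf("perf_pmu_disable(%s)\n", ctxs[i].name);
>         }
> }
> 
> int main(void)
> {
>         struct toy_pmu_ctx ctxs[] = {
>                 { "cpu", 1, true },
>                 { "aux", 0, true },  /* nr_cgroups == 0, event still there */
>         };
> 
>         /* cgroup switch path: the "aux" context is silently skipped */
>         toy_ctx_disable(ctxs, 2, true);
>         return 0;
> }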
> 
> 
> David Wang
> 
