lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 20 Jun 2023 23:18:05 -0700
From:   Stephane Eranian <eranian@...gle.com>
To:     Eric Lin <eric.lin@...ive.com>
Cc:     peterz@...radead.org, mingo@...hat.com, acme@...nel.org,
        mark.rutland@....com, alexander.shishkin@...ux.intel.com,
        jolsa@...nel.org, namhyung@...nel.org, irogers@...gle.com,
        adrian.hunter@...el.com, palmer@...belt.com,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-riscv@...ts.infradead.org, greentime.hu@...ive.com,
        vincent.chen@...ive.com
Subject: Re: [PATCH] perf/core: Add pmu stop before unthrottling to prevent WARNING

On Tue, Jun 20, 2023 at 9:25 PM Eric Lin <eric.lin@...ive.com> wrote:
>
> CC: Stephane Eranian
>
> On Fri, Jun 2, 2023 at 5:49 PM Eric Lin <eric.lin@...ive.com> wrote:
> >
> > Currently, during the perf sampling, if the perf interrupt takes too long,
> > perf framework will lower the perf_event_max_sample_rate. This will limit
> > the number of samples per timer tick (max_samples_per_tick) and set hwc->interrupts
> > to MAX_INTERRUPTS within the __perf_event_account_interrupt() function.
> >
> > Afterward, the perf framework will unthrottle the event in the timer interrupt
> > handler, which triggers the driver's *_pmu_start() function. Most of the driver's
> > *_pmu_start() functions will check the event->hw.state to determine whether this
> > event has stopped. If the event has not stopped, a WARN_ON_ONCE() warning
> > will be triggered as shown below:
> >
> > [ 2110.224723] ------------[ cut here ]------------
> > [ 2110.224851] WARNING: CPU: 0 PID: 240 at drivers/perf/riscv_pmu.c:184 riscv_pmu_start+0x7c/0x8e
> > [ 2110.225242] Modules linked in:
> > [ 2110.225380] CPU: 0 PID: 240 Comm: ls Not tainted 6.4-rc4-g19d0788e9ef2 #1
> > [ 2110.225574] Hardware name: SiFive (DT)
> > [ 2110.225657] epc : riscv_pmu_start+0x7c/0x8e
> > [ 2110.225834]  ra : riscv_pmu_start+0x28/0x8e
> > [ 2110.225998] epc : ffffffff80aef864 ra : ffffffff80aef810 sp : ffff8f80004db6f0
> > [ 2110.226135]  gp : ffffffff81c83750 tp : ffffaf80069f9bc0 t0 : ffff8f80004db6c0
> > [ 2110.226245]  t1 : 0000000000000000 t2 : 000000000000001f s0 : ffff8f80004db720
> > [ 2110.226367]  s1 : ffffaf8008ca1068 a0 : 0000ffffffffffff a1 : 0000000000000000
> > [ 2110.226488]  a2 : 0000000000000001 a3 : 0000000000000870 a4 : 0000000000000000
> > [ 2110.226605]  a5 : 0000000000000000 a6 : 0000000000000840 a7 : 0000000000000030
> > [ 2110.226721]  s2 : 0000000000000000 s3 : ffffaf8005165800 s4 : ffffaf800424da00
> > [ 2110.226838]  s5 : ffffffffffffffff s6 : ffffffff81cc7590 s7 : 0000000000000000
> > [ 2110.226955]  s8 : 0000000000000006 s9 : 0000000000000001 s10: ffffaf807efbc340
> > [ 2110.227064]  s11: ffffaf807efbbf00 t3 : ffffaf8006a16028 t4 : 00000000dbfbb796
> > [ 2110.227180]  t5 : 0000000700000000 t6 : ffffaf8005269870
> > [ 2110.227277] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
> > [ 2110.227407] [<ffffffff80aef864>] riscv_pmu_start+0x7c/0x8e
> > [ 2110.227622] [<ffffffff80185b56>] perf_adjust_freq_unthr_context+0x15e/0x174
> > [ 2110.227961] [<ffffffff80188642>] perf_event_task_tick+0x88/0x9c
> > [ 2110.228235] [<ffffffff800626a8>] scheduler_tick+0xfe/0x27c
> > [ 2110.228463] [<ffffffff800b5640>] update_process_times+0x9a/0xba
> > [ 2110.228690] [<ffffffff800c5bd4>] tick_sched_handle+0x32/0x66
> > [ 2110.229007] [<ffffffff800c5e0c>] tick_sched_timer+0x64/0xb0
> > [ 2110.229253] [<ffffffff800b5e50>] __hrtimer_run_queues+0x156/0x2f4
> > [ 2110.229446] [<ffffffff800b6bdc>] hrtimer_interrupt+0xe2/0x1fe
> > [ 2110.229637] [<ffffffff80acc9e8>] riscv_timer_interrupt+0x38/0x42
> > [ 2110.229984] [<ffffffff80090a16>] handle_percpu_devid_irq+0x90/0x1d2
> > [ 2110.230162] [<ffffffff8008a9f4>] generic_handle_domain_irq+0x28/0x36
> >
> > To prevent this warning, we should call the driver's *_pmu_stop() function before unthrottling
> >
> > Signed-off-by: Eric Lin <eric.lin@...ive.com>
> > ---
> >  kernel/events/core.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index db016e418931..098c875abe88 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -4128,6 +4128,7 @@ perf_adjust_freq_unthr_context(struct perf_event_context *ctx, bool unthrottle)
> >
> >                 if (hwc->interrupts == MAX_INTERRUPTS) {
> >                         hwc->interrupts = 0;
> > +                       event->pmu->stop(event, 0);

But how could the event have been stopped with a call to pmu->stop()
during throttling?

>
> >                         perf_log_throttle(event, 1);
> >                         event->pmu->start(event, 0);
> >                 }
> > --
> > 2.17.1
> >

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ