Date:   Tue, 24 Nov 2020 14:01:39 +0900
From:   Namhyung Kim <namhyung@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
        LKML <linux-kernel@...r.kernel.org>,
        Stephane Eranian <eranian@...gle.com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        John Sperbeck <jsperbeck@...gle.com>,
        "Lendacky, Thomas" <Thomas.Lendacky@....com>
Subject: Re: [RFC] perf/x86: Fix a warning on x86_pmu_stop()

Hi Peter,

On Mon, Nov 23, 2020 at 11:23 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Sat, Nov 21, 2020 at 11:50:11AM +0900, Namhyung Kim wrote:
> > When large PEBS is enabled, the below warning is triggered:
> >
> >   [6070379.453697] WARNING: CPU: 23 PID: 42379 at arch/x86/events/core.c:1466 x86_pmu_stop+0x95/0xa0
> >   ...
> >   [6070379.453831] Call Trace:
> >   [6070379.453840]  x86_pmu_del+0x50/0x150
> >   [6070379.453845]  event_sched_out.isra.0+0x95/0x200
> >   [6070379.453848]  group_sched_out.part.0+0x53/0xd0
> >   [6070379.453851]  __perf_event_disable+0xee/0x1e0
> >   [6070379.453854]  event_function+0x89/0xd0
> >   [6070379.453859]  remote_function+0x3e/0x50
> >   [6070379.453866]  generic_exec_single+0x91/0xd0
> >   [6070379.453870]  smp_call_function_single+0xd1/0x110
> >   [6070379.453874]  event_function_call+0x11c/0x130
> >   [6070379.453877]  ? task_ctx_sched_out+0x20/0x20
> >   [6070379.453880]  ? perf_mux_hrtimer_handler+0x370/0x370
> >   [6070379.453882]  ? event_function_call+0x130/0x130
> >   [6070379.453886]  perf_event_for_each_child+0x34/0x80
> >   [6070379.453889]  ? event_function_call+0x130/0x130
> >   [6070379.453891]  _perf_ioctl+0x24b/0x6a0
> >   [6070379.453898]  ? sched_setaffinity+0x1ad/0x2a0
> >   [6070379.453904]  ? _cond_resched+0x15/0x30
> >   [6070379.453906]  perf_ioctl+0x3d/0x60
> >   [6070379.453912]  ksys_ioctl+0x87/0xc0
> >   [6070379.453917]  __x64_sys_ioctl+0x16/0x20
> >   [6070379.453923]  do_syscall_64+0x52/0x180
> >   [6070379.453928]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > The commit 3966c3feca3f ("x86/perf/amd: Remove need to check "running"
> > bit in NMI handler") introduced this.  It seems x86_pmu_stop can be
> > called recursively (for example, when it loses some samples) as below:
> >
> >   x86_pmu_stop
> >     intel_pmu_disable_event  (x86_pmu_disable)
> >       intel_pmu_pebs_disable
> >         intel_pmu_drain_pebs_buffer
> >           x86_pmu_stop
> >
>
> This shouldn't be possible; intel_pmu_drain_pebs_buffer() calls
> drain_pebs(.iregs=NULL), which means that __intel_pmu_pebs_event()
> should not end up x86_pmu_stop().
>
> Are you running some old kernel?

It's actually 5.7.17, but I think the latest version has the same problem.

Right, the path does not go through __intel_pmu_pebs_event().  I'm looking
at intel_pmu_drain_pebs_nhm() specifically.  It has code like this:

        /* log dropped samples number */
        if (error[bit]) {
            perf_log_lost_samples(event, error[bit]);

            if (perf_event_account_interrupt(event))
                x86_pmu_stop(event, 0);
        }

        if (counts[bit]) {
            __intel_pmu_pebs_event(event, iregs, base,
                           top, bit, counts[bit],
                           setup_pebs_fixed_sample_data);
        }

So there is a path to x86_pmu_stop() when an error bit is set and
perf_event_account_interrupt() returns non-zero.
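
To make the re-entry easier to see, here is a rough user-space model of it.
This is only a sketch, not kernel code: every model_* name is made up for
illustration, and it assumes that the stop routine disables the event before
it clears the active bit, and that the drain resets the buffer before it
processes the records (so a nested drain finds nothing to do).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of these are real symbols. */
struct model_event {
	bool active;       /* models the event's bit in the active mask */
	bool stopped;      /* models the PERF_HES_STOPPED state flag */
	int  lost_samples; /* models error[bit] seen while draining */
	int  warnings;     /* counts how often the warning would fire */
};

static void model_stop(struct model_event *ev);

/* Models the PEBS drain: reset the buffer first, then handle lost samples. */
static void model_drain(struct model_event *ev)
{
	int lost = ev->lost_samples;

	ev->lost_samples = 0;   /* a nested drain sees an empty buffer */
	if (lost)
		model_stop(ev); /* assumes the throttling check said "stop" */
}

/* Models the disable callback, which drains the buffer on its way out. */
static void model_disable(struct model_event *ev)
{
	model_drain(ev);
}

/* Models the stop routine: disable first, clear the active bit afterwards. */
static void model_stop(struct model_event *ev)
{
	if (ev->active) {
		model_disable(ev);      /* may re-enter model_stop() */
		ev->active = false;
		if (ev->stopped)        /* already stopped by the nested call */
			ev->warnings++;
		ev->stopped = true;
	}
}

/* Runs one stop on an active event and returns the number of warnings. */
int model_run(int lost_samples)
{
	struct model_event ev = {
		.active = true,
		.stopped = false,
		.lost_samples = lost_samples,
		.warnings = 0,
	};

	model_stop(&ev);
	assert(!ev.active && ev.stopped);
	return ev.warnings;
}
```

In this model, model_run(0) returns 0, while model_run(1) returns 1: the
nested call marks the event stopped while the outer call still sees it as
active, so the outer call hits the check afterwards.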

Thanks,
Namhyung
