[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <bf8a41ed-3d93-9f60-5e2d-780abc177325@oracle.com>
Date: Tue, 22 Mar 2022 16:00:43 -0400
From: Boris Ostrovsky <boris.ostrovsky@...cle.com>
To: Juergen Gross <jgross@...e.com>, xen-devel@...ts.xenproject.org,
x86@...nel.org, linux-kernel@...r.kernel.org
Cc: Stefano Stabellini <sstabellini@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>,
Marek Marczykowski-Górecki
<marmarek@...isiblethingslab.com>
Subject: Re: [PATCH] xen: fix is_xen_pmu()
On 3/22/22 11:50 AM, Juergen Gross wrote:
> is_xen_pmu() is taking the cpu number as parameter, but it is not using
> it. Instead it just tests whether the Xen PMU initialization on the
> current cpu did succeed. As this test is done by checking a percpu
> pointer, preemption needs to be disabled in order to avoid switching
> the cpu while doing the test. While resuming from suspend() this seems
> not to be the case:
>
> [ 88.082751] ACPI: PM: Low-level resume complete
> [ 88.087933] ACPI: EC: EC started
> [ 88.091464] ACPI: PM: Restoring platform NVS memory
> [ 88.097166] xen_acpi_processor: Uploading Xen processor PM info
> [ 88.103850] Enabling non-boot CPUs ...
> [ 88.108128] installing Xen timer for CPU 1
> [ 88.112763] BUG: using smp_processor_id() in preemptible [00000000] code: systemd-sleep/7138
> [ 88.122256] caller is is_xen_pmu+0x12/0x30
> [ 88.126937] CPU: 0 PID: 7138 Comm: systemd-sleep Tainted: G W 5.16.13-2.fc32.qubes.x86_64 #1
> [ 88.137939] Hardware name: Star Labs StarBook/StarBook, BIOS 7.97 03/21/2022
> [ 88.145930] Call Trace:
> [ 88.148757] <TASK>
> [ 88.151193] dump_stack_lvl+0x48/0x5e
> [ 88.155381] check_preemption_disabled+0xde/0xe0
> [ 88.160641] is_xen_pmu+0x12/0x30
> [ 88.164441] xen_smp_intr_init_pv+0x75/0x100
There is actually another PMU-related problem on restore which was caused (or, rather, highlighted) by ff083a2d972f56bebfd82409ca62e5dfce950961:
[ 116.861637] ------------[ cut here ]------------
[ 116.861651] WARNING: CPU: 1 PID: 31 at kernel/events/core.c:6614 perf_register_guest_info_callbacks+0x68/0x70
[ 116.861673] Modules linked in:
[ 116.861682] CPU: 1 PID: 31 Comm: xenwatch Not tainted 5.17.0-rc7ostr #103
[ 116.861695] RIP: e030:perf_register_guest_info_callbacks+0x68/0x70
[ 116.861706] Code: c7 c7 40 e1 86 82 e8 d7 e7 ff ff 48 8b 53 10 48 85 d2 74 14 48 c7 c6 f0 0a c0 81 48 c7 c7 30 e1 86 82 5b e9 ba e7 ff ff 5b c3 <0f> 0b c3 0f 1f 44 00 00 0f 1f 44 00 00 48 8b 05 54 fd 0b 02 48 39
[ 116.861747] RSP: e02b:ffffc9004016fe18 EFLAGS: 00010286
[ 116.861758] RAX: ffffffff82432850 RBX: 0000000000000003 RCX: ffff888079c00000
[ 116.861768] RDX: ffff888079c00000 RSI: ffffc9004016fe30 RDI: ffffffff82432850
[ 116.861778] RBP: 0000000000000000 R08: 0000160000000000 R09: ffffea00000ed340
[ 116.861788] R10: 0000000000000758 R11: 0000000000000000 R12: ffff888003b4d000
[ 116.861797] R13: 0000000000000003 R14: ffffffff8162cf10 R15: 0000000000000000
[ 116.861819] FS: 0000000000000000(0000) GS:ffff888079c80000(0000) knlGS:0000000000000000
[ 116.861830] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 116.861839] CR2: 0000000000000000 CR3: 00000000062b6000 CR4: 0000000000040660
[ 116.861853] Call Trace:
[ 116.861861] <TASK>
[ 116.861866] xen_pmu_init+0x187/0x280
[ 116.861879] xen_arch_resume+0x30/0x50
[ 116.861888] do_suspend.cold+0x132/0x147
[ 116.861899] shutdown_handler+0x12e/0x140
[ 116.861910] xenwatch_thread+0x94/0x180
[ 116.861919] ? finish_wait+0x80/0x80
[ 116.861928] kthread+0xe7/0x110
[ 116.861938] ? kthread_complete_and_exit+0x20/0x20
[ 116.861948] ret_from_fork+0x22/0x30
[ 116.861959] </TASK>
[ 116.861964] ---[ end trace 0000000000000000 ]---
I was going to send a patch but I think yours can be slightly modified to take care of this problem as well.
> @@ -542,6 +539,7 @@ void xen_pmu_init(int cpu)
> per_cpu(xenpmu_shared, cpu).flags = 0;
>
> if (cpu == 0) {
if (!is_xen_pmu)
> + is_xen_pmu = true;
> perf_register_guest_info_callbacks(&xen_guest_cbs);
> xen_pmu_arch_init();
> }
> @@ -572,4 +570,7 @@ void xen_pmu_finish(int cpu)
>
> free_pages((unsigned long)per_cpu(xenpmu_shared, cpu).xenpmu_data, 0);
> per_cpu(xenpmu_shared, cpu).xenpmu_data = NULL;
> +
> + if (cpu == 0)
> + is_xen_pmu = false;
And drop this hunk.
-boris
Powered by blists - more mailing lists