Message-ID: <0022df64-b7f5-43b4-87ed-5df5d47c5c6a@amd.com>
Date: Tue, 24 Oct 2023 13:30:59 -0500
From: Mario Limonciello <mario.limonciello@....com>
To: Ingo Molnar <mingo@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>,
Thomas Gleixner <tglx@...utronix.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Sandipan Das <sandipan.das@....com>,
"H . Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
x86@...nel.org, linux-pm@...r.kernel.org, rafael@...nel.org,
pavel@....cz, linux-perf-users@...r.kernel.org,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [PATCH 2/2] perf/x86/amd: Don't allow pre-emption in amd_pmu_lbr_reset()

On 10/24/2023 11:51, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@...nel.org> wrote:
>
>>
>> * Mario Limonciello <mario.limonciello@....com> wrote:
>>
>>> Fixes a BUG reported during suspend to ram testing.
>>>
>>> ```
>>> [ 478.274752] BUG: using smp_processor_id() in preemptible [00000000] code: rtcwake/2948
>>> [ 478.274754] caller is amd_pmu_lbr_reset+0x19/0xc0
>>> ```
>>>
>>> Cc: stable@...r.kernel.org # 6.1+
>>> Fixes: ca5b7c0d9621 ("perf/x86/amd/lbr: Add LbrExtV2 branch record support")
>>> Signed-off-by: Mario Limonciello <mario.limonciello@....com>
>>> ---
>>> arch/x86/events/amd/lbr.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/events/amd/lbr.c b/arch/x86/events/amd/lbr.c
>>> index eb31f850841a..5b98e8c7d8b7 100644
>>> --- a/arch/x86/events/amd/lbr.c
>>> +++ b/arch/x86/events/amd/lbr.c
>>> @@ -321,7 +321,7 @@ int amd_pmu_lbr_hw_config(struct perf_event *event)
>>>
>>>  void amd_pmu_lbr_reset(void)
>>>  {
>>> -	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>>> +	struct cpu_hw_events *cpuc = get_cpu_ptr(&cpu_hw_events);
>>>  	int i;
>>>
>>>  	if (!x86_pmu.lbr_nr)
>>> @@ -335,6 +335,7 @@ void amd_pmu_lbr_reset(void)
>>>
>>>  	cpuc->last_task_ctx = NULL;
>>>  	cpuc->last_log_id = 0;
>>> +	put_cpu_ptr(&cpu_hw_events);
>>>  	wrmsrl(MSR_AMD64_LBR_SELECT, 0);
>>>  }
>>
>> Weird, amd_pmu_lbr_reset() is called from these places:
>>
>>  - amd_pmu_lbr_sched_task(): during task sched-in during
>>    context-switching, this should already have preemption disabled.
>>
>>  - amd_pmu_lbr_add(): this gets indirectly called by amd_pmu::add
>>    (amd_pmu_add_event()), called by event_sched_in(), which too should have
>>    preemption disabled.
>>
>> I clearly must have missed some additional place it gets called in.
>
> Just for completeness, the additional place I missed is
> amd_pmu_cpu_reset():
>
> static_call(amd_pmu_branch_reset)();
>
> ... and the amd_pmu_branch_reset static call is set up with
> amd_pmu_lbr_reset, which is why git grep missed it.
>
> Anyway, amd_pmu_cpu_reset() is very much something that should run
> non-preemptable to begin with, so your patch only papers over the real
> problem AFAICS.
>
> Thanks,
>
> Ingo

In that case, should preemption be disabled for all of
x86_pmu_dying_cpu()?
And for good measure, for x86_pmu_starting_cpu() too?
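
To make the question concrete, here is a minimal userspace sketch of the
idea. It is not kernel code and not a proposed patch. Every model_* name is
hypothetical: a plain counter stands in for the preempt count, a function
pointer stands in for the amd_pmu_branch_reset static call, and
model_hotplug_callback_*() stands in for whichever hotplug callback ends up
reaching amd_pmu_cpu_reset(). It only illustrates that guarding the caller
silences the smp_processor_id() check without touching the callee.

```c
#include <assert.h>

/* Userspace model only; nothing here is the kernel's real implementation. */

static int model_preempt_count;	/* > 0 means "preemption disabled" */
static int model_bug_count;	/* times the debug check would have fired */

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	/* Catch an unbalanced enable in the model itself. */
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Stand-in for the debug check behind smp_processor_id(). */
static int model_smp_processor_id(void)
{
	if (model_preempt_count == 0)
		model_bug_count++;	/* "BUG: using smp_processor_id() in preemptible" */
	return 0;			/* the model has a single CPU */
}

/* Stand-in for amd_pmu_lbr_reset(): it touches per-CPU state. */
static void model_lbr_reset(void)
{
	(void)model_smp_processor_id();
}

/*
 * Stand-in for the static call: the callee is reached indirectly, so
 * searching for callers by the callee's name does not find this site.
 */
static void (*model_branch_reset)(void) = model_lbr_reset;

/* Stand-in for amd_pmu_cpu_reset(). */
static void model_cpu_reset(void)
{
	model_branch_reset();
}

/* Caller as in the report: runs with preemption enabled. */
static void model_hotplug_callback_unguarded(void)
{
	model_cpu_reset();
}

/* Caller with the guard being asked about: callee is left unchanged. */
static void model_hotplug_callback_guarded(void)
{
	model_preempt_disable();
	model_cpu_reset();
	model_preempt_enable();
}
```

In this model the unguarded callback trips the check once and the guarded
one does not. Whether the guard belongs in x86_pmu_dying_cpu() and
x86_pmu_starting_cpu(), or further up, is exactly the open question.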