Message-ID: <5c77d242-2b03-d9b9-7208-b1d4fdc8796a@amd.com>
Date: Tue, 12 Jul 2022 22:10:03 -0500
From: Mario Limonciello <mario.limonciello@....com>
To: "Yuan, Perry" <Perry.Yuan@....com>,
Oleksandr Natalenko <oleksandr@...alenko.name>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Huang, Ray" <Ray.Huang@....com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
Sasha Levin <sashal@...nel.org>,
"x86@...nel.org" <x86@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [REGRESSION] amd-pstate doesn't work since v5.18.11
On 7/12/22 21:40, Yuan, Perry wrote:
> [AMD Official Use Only - General]
>
> Hi Mario.
>
>> -----Original Message-----
>> From: Limonciello, Mario <Mario.Limonciello@....com>
>> Sent: Wednesday, July 13, 2022 4:07 AM
>> To: Oleksandr Natalenko <oleksandr@...alenko.name>; Yuan, Perry
>> <Perry.Yuan@....com>; linux-kernel@...r.kernel.org; Huang, Ray
>> <Ray.Huang@....com>
>> Cc: Rafael J. Wysocki <rafael.j.wysocki@...el.com>; Sasha Levin
>> <sashal@...nel.org>; x86@...nel.org; H. Peter Anvin <hpa@...or.com>;
>> Greg Kroah-Hartman <gregkh@...uxfoundation.org>
>> Subject: Re: [REGRESSION] amd-pstate doesn't work since v5.18.11
>>
>> On 7/12/2022 12:54, Oleksandr Natalenko wrote:
>>> Hello.
>>>
>>> On Tuesday, 12 July 2022 19:50:33 CEST Limonciello, Mario wrote:
>>>> [Public]
>>>>
>>>> + Ray
>>>>
>>>>> -----Original Message-----
>>>>> From: Yuan, Perry <Perry.Yuan@....com>
>>>>> Sent: Tuesday, July 12, 2022 12:50
>>>>> To: Oleksandr Natalenko <oleksandr@...alenko.name>; Limonciello,
>>>>> Mario <Mario.Limonciello@....com>; linux-kernel@...r.kernel.org
>>>>> Cc: Rafael J. Wysocki <rafael.j.wysocki@...el.com>; Sasha Levin
>>>>> <sashal@...nel.org>; x86@...nel.org; H. Peter Anvin <hpa@...or.com>;
>>>>> Greg Kroah-Hartman <gregkh@...uxfoundation.org>
>>>>> Subject: RE: [REGRESSION] amd-pstate doesn't work since v5.18.11
>>>>>
>>>>> [AMD Official Use Only - General]
>>>>>
>>>>> Hi Oleksandr:
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Oleksandr Natalenko <oleksandr@...alenko.name>
>>>>>> Sent: Wednesday, July 13, 2022 1:40 AM
>>>>>> To: Limonciello, Mario <Mario.Limonciello@....com>; linux-
>>>>>> kernel@...r.kernel.org
>>>>>> Cc: Yuan, Perry <Perry.Yuan@....com>; Rafael J. Wysocki
>>>>>> <rafael.j.wysocki@...el.com>; Sasha Levin <sashal@...nel.org>;
>>>>>> x86@...nel.org; H. Peter Anvin <hpa@...or.com>; Greg Kroah-Hartman
>>>>>> <gregkh@...uxfoundation.org>
>>>>>> Subject: [REGRESSION] amd-pstate doesn't work since v5.18.11
>>>>>>
>>>>>> [CAUTION: External Email]
>>>>>>
>>>>>> Hello Mario.
>>>>>>
>>>>>> The following commits were pulled into v5.18.11:
>>>>>>
>>>>>> ```
>>>>>> $ git log --oneline --no-merges v5.18.10..v5.18.11 | grep ACPI
>>>>>> 2783414e6ef7 ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported
>>>>>> 3068cfeca3b5 ACPI: CPPC: Only probe for _CPC if CPPC v2 is acked
>>>>>> 8beb71759cc8 ACPI: bus: Set CPPC _OSC bits for all and when CPPC_LIB is supported
>>>>>> 13bb696dd2f3 ACPI: CPPC: Check _OSC for flexible address space
>>>>>> ```
>>>>>>
>>>>>> and now this happens:
>>>>>>
>>>>>> ```
>>>>>> $ sudo modprobe amd-pstate shared_mem=1
>>>>>> modprobe: ERROR: could not insert 'amd_pstate': No such device
>>>>>> ```
>>>>>>
>>>>>> With v5.18.10 this worked just fine.
>>>>>>
>>>>>> In your upstream commit 8b356e536e69f3a4d6778ae9f0858a1beadabb1f
>>>>>> you write:
>>>>>>
>>>>>> ```
>>>>>> If there is additional breakage on the shared memory designs also
>>>>>> missing this _OSC, additional follow up changes may be needed.
>>>>>> ```
>>>>>>
>>>>>> So the question is what else should be pulled into the stable tree
>>>>>> to unbreak amd-pstate?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> --
>>>>>> Oleksandr Natalenko (post-factum)
>>>>>>
>>>>>
>>>>> Could you share the lscpu output ?
>>>
>>> Here's my `lscpu`:
>>>
>>> ```
>>> Architecture: x86_64
>>> CPU op-mode(s): 32-bit, 64-bit
>>> Address sizes: 43 bits physical, 48 bits virtual
>>> Byte Order: Little Endian
>>> CPU(s): 24
>>> On-line CPU(s) list: 0-23
>>> Vendor ID: AuthenticAMD
>>> Model name: AMD Ryzen 9 3900XT 12-Core Processor
>>> CPU family: 23
>>> Model: 113
>>> Thread(s) per core: 2
>>> Core(s) per socket: 12
>>> Socket(s): 1
>>> Stepping: 0
>>> Frequency boost: enabled
>>> CPU(s) scaling MHz: 59%
>>> CPU max MHz: 3800,0000
>>> CPU min MHz: 2200,0000
>>> BogoMIPS: 7589.71
>>> Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sev sev_es
>>> Virtualization: AMD-V
>>> L1d cache: 384 KiB (12 instances)
>>> L1i cache: 384 KiB (12 instances)
>>> L2 cache: 6 MiB (12 instances)
>>> L3 cache: 64 MiB (4 instances)
>>> NUMA node(s): 1
>>> NUMA node0 CPU(s): 0-23
>>> Vulnerability Itlb multihit: Not affected
>>> Vulnerability L1tf: Not affected
>>> Vulnerability Mds: Not affected
>>> Vulnerability Meltdown: Not affected
>>> Vulnerability Mmio stale data: Not affected
>>> Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
>>> Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
>>> Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, STIBP conditional, RSB filling
>>> Vulnerability Srbds: Not affected
>>> Vulnerability Tsx async abort: Not affected
>>>
>>> ```
>>>
>>>>> Perry.
>>>>
>>>> Thanks, this is the sort of thing I was worried might happen as a
>>>> result of requiring the _OSC. It was introduced as part of that commit
>>>> 8beb71759cc8.
>>>>
>>>> To solve it I think we need to add more things to
>>>> cpc_supported_by_cpu
>>>> (https://github.com/torvalds/linux/blob/525496a030de4ae64bb9e1d6bfc88eec6f5fe6e2/arch/x86/kernel/acpi/cppc.c#L19)
>>>>
>>>> The question is how do we safely detect the shared memory designs?
>>>> These are a fixed quantity as newer designs /should/ be using the MSR.
>>>>
>>>> I am tending to think that unfortunately we need to have an
>>>> allow-list of shared memory designs here unless someone has other ideas.
>>>
>>> Happy to test any patches as needed.
>>>
>>
>> See if this helps out:
>>
>> diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c
>> index 734b96454896..88a81e6b9228 100644
>> --- a/arch/x86/kernel/acpi/cppc.c
>> +++ b/arch/x86/kernel/acpi/cppc.c
>> @@ -16,6 +16,13 @@ bool cpc_supported_by_cpu(void)
>>  	switch (boot_cpu_data.x86_vendor) {
>>  	case X86_VENDOR_AMD:
>>  	case X86_VENDOR_HYGON:
>> +		if (boot_cpu_data.x86 == 0x19 &&
>> +		    ((boot_cpu_data.x86_model >= 0x00 && boot_cpu_data.x86_model <= 0x0f) ||
>> +		     (boot_cpu_data.x86_model >= 0x20 && boot_cpu_data.x86_model <= 0x2f)))
>> +			return true;
>> +		else if (boot_cpu_data.x86 == 0x17 &&
>> +			 boot_cpu_data.x86_model >= 0x70 && boot_cpu_data.x86_model <= 0x7f)
>> +			return true;
>>  		return boot_cpu_has(X86_FEATURE_CPPC);
>>  	}
>>  	return false;
>>
>> If that works and no one has a better idea how to do it for these systems I'll
>> send out a proper patch tomorrow.
>
> This could be a short-term solution. I would prefer to add a CPU ID check so that we can maintain a list with
> all the model info, including MSR and shared mem types.
What's the longer-term solution when it comes to shared mem? It seems like
it's either a list of IDs or a heuristic. Given it's a fixed list and
new designs use the MSR, I would think the list of IDs is preferable.
Furthermore, I would argue that if another design were introduced
for some reason that uses shared mem instead of the MSR, it should use
_OSC to indicate CPPCv2 support, not this list. This list only needs to
exist because the requirement for CPPC support in the _OSC is very
recent in the kernel.
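
To make the "list of IDs" option concrete, here is a minimal userspace sketch of a table-driven lookup. The struct, table and function names are hypothetical and do not exist in the kernel; the family/model ranges are simply the ones from the test diff quoted above, not a vetted final list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry: one CPU family plus an inclusive model range. */
struct shmem_cppc_range {
	unsigned int family;
	unsigned int model_first;
	unsigned int model_last;
};

/* Ranges taken from the test diff in this thread; illustrative only. */
static const struct shmem_cppc_range shmem_cppc_allow_list[] = {
	{ 0x19, 0x00, 0x0f },
	{ 0x19, 0x20, 0x2f },
	{ 0x17, 0x70, 0x7f },
};

/* Return true if (family, model) falls inside any listed range. */
static bool shmem_cppc_allowed(unsigned int family, unsigned int model)
{
	size_t n = sizeof(shmem_cppc_allow_list) / sizeof(shmem_cppc_allow_list[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		const struct shmem_cppc_range *r = &shmem_cppc_allow_list[i];

		if (family == r->family &&
		    model >= r->model_first && model <= r->model_last)
			return true;
	}
	return false;
}
```

With this table, the reporter's CPU from the lscpu output (family 23, model 113, i.e. 0x17/0x71) matches the third entry. A table keeps the fixed set of shared mem designs in one place instead of growing the if/else chain.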
Regarding MSR:
boot_cpu_has(X86_FEATURE_CPPC) indicates MSR support. There
shouldn't be any need to maintain a list of MSR designs in this _OSC override check.
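
As a rough userspace illustration of why no MSR list is needed: the MSR case reduces to a single feature-bit test, and only the shared mem case needs an ID lookup. This sketch assumes X86_FEATURE_CPPC maps to CPUID leaf 0x80000008, EBX bit 27 (check cpufeatures.h before relying on that), and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the CPPC MSR feature is CPUID 0x80000008 EBX bit 27. */
#define CPPC_MSR_EBX_BIT 27u

/* Test the (assumed) CPPC bit in a raw EBX value from leaf 0x80000008. */
static bool ebx_has_cppc_msr(uint32_t ebx)
{
	return ((ebx >> CPPC_MSR_EBX_BIT) & 1u) != 0;
}

/*
 * Hypothetical combined decision: MSR designs are identified by the
 * feature bit alone; on_shmem_allow_list stands in for the result of a
 * family/model lookup for the older shared mem designs.
 */
static bool cpc_supported_sketch(uint32_t ebx, bool on_shmem_allow_list)
{
	return ebx_has_cppc_msr(ebx) || on_shmem_allow_list;
}
```

The order of the two checks does not change the result, since either condition alone is enough to skip the _OSC requirement.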
>
> Meanwhile, I have a similar concern for the upcoming EPP driver: some systems don't support EPP, and we cannot identify them without an ID list.
That will be localized into the EPP driver source at least.
>
> Perry.
>