Message-ID: <CAGwozwGELOK_4KNADW6OvGi4TyiJsrbtEceaXRpk4CRNpqweZw@mail.gmail.com>
Date: Mon, 27 Oct 2025 09:31:24 +0100
From: Antheas Kapenekakis <lkml@...heas.dev>
To: Shyam Sundar S K <Shyam-sundar.S-k@....com>
Cc: Mario Limonciello <mario.limonciello@....com>,
Alex Deucher <alexander.deucher@....com>,
Perry Yuan <perry.yuan@....com>, amd-gfx@...ts.freedesktop.org,
dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
platform-driver-x86@...r.kernel.org, Sanket Goswami <Sanket.Goswami@....com>
Subject: Re: [PATCH v1 1/3] platform/x86/amd/pmc: Add support for Van Gogh SoC
On Mon, 27 Oct 2025 at 09:22, Shyam Sundar S K <Shyam-sundar.S-k@....com> wrote:
>
>
>
> On 10/24/2025 22:02, Mario Limonciello wrote:
> >
> >
> > On 10/24/2025 11:08 AM, Antheas Kapenekakis wrote:
> >> On Fri, 24 Oct 2025 at 17:43, Mario Limonciello
> >> <mario.limonciello@....com> wrote:
> >>>
> >>>
> >>>
> >>> On 10/24/2025 10:21 AM, Antheas Kapenekakis wrote:
> >>>> The ROG Xbox Ally (non-X) SoC features a similar architecture to the
> >>>> Steam Deck. While the Steam Deck supports S3 (s2idle causes a crash),
> >>>> this support was dropped by the Xbox Ally, which only supports S0ix suspend.
> >>>>
> >>>> Since the handler is missing here, this causes the device to not suspend
> >>>> and the AMD GPU driver to crash while trying to resume afterwards due to
> >>>> a power hang.
> >>>>
> >>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4659
> >>>> Signed-off-by: Antheas Kapenekakis <lkml@...heas.dev>
> >>>> ---
> >>>> drivers/platform/x86/amd/pmc/pmc.c | 3 +++
> >>>> drivers/platform/x86/amd/pmc/pmc.h | 1 +
> >>>> 2 files changed, 4 insertions(+)
> >>>>
> >>>> diff --git a/drivers/platform/x86/amd/pmc/pmc.c b/drivers/platform/x86/amd/pmc/pmc.c
> >>>> index bd318fd02ccf..cae3fcafd4d7 100644
> >>>> --- a/drivers/platform/x86/amd/pmc/pmc.c
> >>>> +++ b/drivers/platform/x86/amd/pmc/pmc.c
> >>>> @@ -106,6 +106,7 @@ static void amd_pmc_get_ip_info(struct amd_pmc_dev *dev)
> >>>> switch (dev->cpu_id) {
> >>>> case AMD_CPU_ID_PCO:
> >>>> case AMD_CPU_ID_RN:
> >>>> + case AMD_CPU_ID_VG:
> >>>> case AMD_CPU_ID_YC:
> >>>> case AMD_CPU_ID_CB:
> >>>> dev->num_ips = 12;
> >>>> @@ -517,6 +518,7 @@ static int amd_pmc_get_os_hint(struct amd_pmc_dev *dev)
> >>>> case AMD_CPU_ID_PCO:
> >>>> return MSG_OS_HINT_PCO;
> >>>> case AMD_CPU_ID_RN:
> >>>> + case AMD_CPU_ID_VG:
> >>>> case AMD_CPU_ID_YC:
> >>>> case AMD_CPU_ID_CB:
> >>>> case AMD_CPU_ID_PS:
> >>>> @@ -717,6 +719,7 @@ static const struct pci_device_id pmc_pci_ids[] = {
> >>>> { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_RV) },
> >>>> { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_SP) },
> >>>> { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_SHP) },
> >>>> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_VG) },
> >>>> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M20H_ROOT) },
> >>>> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M60H_ROOT) },
> >>>> { }
> >>>> diff --git a/drivers/platform/x86/amd/pmc/pmc.h b/drivers/platform/x86/amd/pmc/pmc.h
> >>>> index 62f3e51020fd..fe3f53eb5955 100644
> >>>> --- a/drivers/platform/x86/amd/pmc/pmc.h
> >>>> +++ b/drivers/platform/x86/amd/pmc/pmc.h
> >>>> @@ -156,6 +156,7 @@ void amd_mp2_stb_deinit(struct amd_pmc_dev *dev);
> >>>> #define AMD_CPU_ID_RN 0x1630
> >>>> #define AMD_CPU_ID_PCO AMD_CPU_ID_RV
> >>>> #define AMD_CPU_ID_CZN AMD_CPU_ID_RN
> >>>> +#define AMD_CPU_ID_VG 0x1645
> >>>
> >>> Can you see if 0xF14 gives you a reasonable value for the idle mask if
> >>> you add it to amd_pmc_idlemask_read()? Make a new define for it though;
> >>> it shouldn't use the same define as 0x1a platforms.
> >>
> >> It does not work. Reports 0. I also tested the other ones, but the
> >> 0x1a was the same as you said. All report 0x0.
> >
> > It's possible the platform doesn't report an idle mask.
> >
> > 0xF14 is where I would have expected it to report.
> >
> > Shyam - can you look into this to see if it's in a different place
> > than 0xF14 for Van Gogh?
>
> Van Gogh is before Cezanne? I am a bit surprised that the pmc driver is
> getting loaded there.
The device only came out last week, so I suppose they had to add it.
> Antheas - what is the output of
>
> #lspci -s 00:00.0
>
> 0xF14 index is meant for 1Ah (i.e. Strix and above)
lspci -s 00:00.0 -nn
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] VanGogh Root Complex [1022:1645]
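
For illustration, the [1022:1645] pair above is what the new AMD_CPU_ID_VG
entry in pmc_pci_ids[] is meant to match. A minimal userspace sketch of that
lookup, assuming a simplified table: struct pci_id_sketch, pmc_ids_sketch[]
and pmc_id_matches() are hypothetical names, not the kernel's pci_device_id
machinery. Only the ID values come from the patch and the lspci output.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_ID_AMD 0x1022 /* vendor half of [1022:1645] */
#define AMD_CPU_ID_RN     0x1630 /* existing define shown in the diff */
#define AMD_CPU_ID_VG     0x1645 /* define added by the patch */

/* Hypothetical, simplified stand-in for a PCI ID table entry. */
struct pci_id_sketch {
	uint16_t vendor;
	uint16_t device;
};

/* Illustrative subset of an ID table, terminated by a zeroed entry. */
static const struct pci_id_sketch pmc_ids_sketch[] = {
	{ PCI_VENDOR_ID_AMD, AMD_CPU_ID_RN },
	{ PCI_VENDOR_ID_AMD, AMD_CPU_ID_VG },
	{ 0, 0 },
};

/* Return true if the vendor:device pair appears in the sketch table. */
static bool pmc_id_matches(uint16_t vendor, uint16_t device)
{
	for (size_t i = 0; pmc_ids_sketch[i].vendor != 0; i++) {
		if (pmc_ids_sketch[i].vendor == vendor &&
		    pmc_ids_sketch[i].device == device)
			return true;
	}
	return false;
}
```

With this table, pmc_id_matches(0x1022, 0x1645) is true, while an ID that is
not listed is rejected.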
> >
> >>
> >> Any idea why the OS hint only works 90% of the time?
>
> What is the output of amd_pmc_dump_registers() in the 10% of cases
> when the OS_HINT is not working?
First sleep with initial data:
[ 63.569557] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 63.569581] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:73f
[ 63.569597] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:9
[ 63.583472] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 63.583497] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:735677a0
[ 63.583513] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:5
[ 63.607472] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 63.607496] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 63.607512] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:4
[ 63.607687] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 63.607702] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 63.607709] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:7
[ 63.608417] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 63.608436] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 63.608452] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:6
[ 63.608603] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 63.608621] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:1
[ 63.608637] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:3
[ 64.764466] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 64.764490] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 64.764506] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:3
[ 64.764631] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 64.764646] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 64.764660] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:8
Second sleep (successful):
[ 235.211752] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 235.211776] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 235.211790] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:7
[ 235.211931] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 235.211946] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 235.211960] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:6
[ 235.212083] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 235.212096] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:1
[ 235.212109] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:3
[ 236.520156] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 236.520177] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 236.520192] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:3
[ 236.520330] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 236.520346] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 236.520360] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:8
Failed sleep:
[ 152.839926] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 152.839951] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 152.839965] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:7
[ 152.840115] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 152.840134] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 152.840148] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:6
[ 152.840270] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 152.840276] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:1
[ 152.840280] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:3
[ 158.037073] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 158.037097] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 158.037111] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:3
[ 158.037252] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_RESPONSE:1
[ 158.037268] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_ARGUMENT:0
[ 158.037282] amd_pmc AMDI0005:00: AMD_PMC_REGISTER_MESSAGE:8
So the register dumps are the same for the successful and the failed sleep.
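
To make the comparison concrete, here is a small sketch that encodes the
RESPONSE/ARGUMENT/MESSAGE triples from the successful and failed dumps above
and checks that they match entry by entry. The struct and function names are
made up for illustration; only the values are taken from the logs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One dumped triple: RESPONSE, ARGUMENT and MESSAGE register values. */
struct pmc_dump_entry {
	uint32_t response;
	uint32_t argument;
	uint32_t message;
};

/* Values copied from the "Second sleep (successful)" dump. */
static const struct pmc_dump_entry ok_sleep[] = {
	{ 1, 0x0, 0x7 }, { 1, 0x0, 0x6 }, { 1, 0x1, 0x3 },
	{ 1, 0x0, 0x3 }, { 1, 0x0, 0x8 },
};

/* Values copied from the "Failed sleep" dump. */
static const struct pmc_dump_entry failed_sleep[] = {
	{ 1, 0x0, 0x7 }, { 1, 0x0, 0x6 }, { 1, 0x1, 0x3 },
	{ 1, 0x0, 0x3 }, { 1, 0x0, 0x8 },
};

/* Return true if both dumps hold the same triples in the same order. */
static bool same_dump(const struct pmc_dump_entry *a, size_t na,
		      const struct pmc_dump_entry *b, size_t nb)
{
	if (na != nb)
		return false;
	for (size_t i = 0; i < na; i++) {
		if (a[i].response != b[i].response ||
		    a[i].argument != b[i].argument ||
		    a[i].message != b[i].message)
			return false;
	}
	return true;
}
```

Calling same_dump(ok_sleep, 5, failed_sleep, 5) returns true, so the register
contents alone do not distinguish the two cases.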
> What I can surmise is that, though the pmc driver is sending the hint,
> PMFW is not taking any action (since the support in FW is missing).
The hint is working... 90% of the time. Without the hint in the patch,
sleep never works.
> >
> > If we get the idle mask reporting working we would have a better idea
> > if that is what is reported wrong.
> >
>
> IIRC, the concept of idlemask came only after Cezanne, and even then only
> after a certain PMFW version. So I am not sure if idlemask actually exists.
>
>
> > If I were to guess, though, maybe GFX is still active.
> >
> > Depending upon what's going wrong smu_fw_info might have some more
> > information too.
>
> That's a good point to try it out.
>
> Thanks,
> Shyam
>
>