Message-ID: <CAGwozwFRWiR4xQ422tp6H0R9knLjNkn4ewERyYtZgzOYfnJWxw@mail.gmail.com>
Date: Wed, 5 Nov 2025 12:34:59 +0100
From: Antheas Kapenekakis <lkml@...heas.dev>
To: Shyam Sundar S K <Shyam-sundar.S-k@....com>
Cc: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
	Mario Limonciello <mario.limonciello@....com>,
 Alex Deucher <alexander.deucher@....com>,
	Perry Yuan <perry.yuan@....com>, amd-gfx@...ts.freedesktop.org,
	dri-devel@...ts.freedesktop.org, LKML <linux-kernel@...r.kernel.org>,
	platform-driver-x86@...r.kernel.org, Sanket Goswami <Sanket.Goswami@....com>
Subject: Re: [PATCH v1 1/3] platform/x86/amd/pmc: Add support for Van Gogh SoC

On Wed, 5 Nov 2025 at 12:28, Shyam Sundar S K <Shyam-sundar.S-k@....com> wrote:
>
> Hi Ilpo,
>
> On 11/5/2025 16:43, Ilpo Järvinen wrote:
> > On Mon, 27 Oct 2025, Antheas Kapenekakis wrote:
> >
> >> On Mon, 27 Oct 2025 at 09:36, Shyam Sundar S K <Shyam-sundar.S-k@....com> wrote:
> >>>
> >>>
> >>>
> >>> On 10/27/2025 13:52, Shyam Sundar S K wrote:
> >>>>
> >>>>
> >>>> On 10/24/2025 22:02, Mario Limonciello wrote:
> >>>>>
> >>>>>
> >>>>> On 10/24/2025 11:08 AM, Antheas Kapenekakis wrote:
> >>>>>> On Fri, 24 Oct 2025 at 17:43, Mario Limonciello
> >>>>>> <mario.limonciello@....com> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 10/24/2025 10:21 AM, Antheas Kapenekakis wrote:
> >>>>>>>> The ROG Xbox Ally (non-X) SoC features a similar architecture to the
> >>>>>>>> Steam Deck. While the Steam Deck supports S3 (s2idle causes a crash),
> >>>>>>>> this support was dropped by the Xbox Ally, which only supports S0ix suspend.
> >>>>>>>>
> >>>>>>>> Since the handler is missing here, this causes the device to not suspend
> >>>>>>>> and the AMD GPU driver to crash while trying to resume afterwards due to
> >>>>>>>> a power hang.
> >>>>>>>>
> >>>>>>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4659
> >>>>>>>> Signed-off-by: Antheas Kapenekakis <lkml@...heas.dev>
> >>>>>>>> ---
> >>>>>>>>    drivers/platform/x86/amd/pmc/pmc.c | 3 +++
> >>>>>>>>    drivers/platform/x86/amd/pmc/pmc.h | 1 +
> >>>>>>>>    2 files changed, 4 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/platform/x86/amd/pmc/pmc.c b/drivers/platform/x86/amd/pmc/pmc.c
> >>>>>>>> index bd318fd02ccf..cae3fcafd4d7 100644
> >>>>>>>> --- a/drivers/platform/x86/amd/pmc/pmc.c
> >>>>>>>> +++ b/drivers/platform/x86/amd/pmc/pmc.c
> >>>>>>>> @@ -106,6 +106,7 @@ static void amd_pmc_get_ip_info(struct amd_pmc_dev *dev)
> >>>>>>>>        switch (dev->cpu_id) {
> >>>>>>>>        case AMD_CPU_ID_PCO:
> >>>>>>>>        case AMD_CPU_ID_RN:
> >>>>>>>> +     case AMD_CPU_ID_VG:
> >>>>>>>>        case AMD_CPU_ID_YC:
> >>>>>>>>        case AMD_CPU_ID_CB:
> >>>>>>>>                dev->num_ips = 12;
> >>>>>>>> @@ -517,6 +518,7 @@ static int amd_pmc_get_os_hint(struct amd_pmc_dev *dev)
> >>>>>>>>        case AMD_CPU_ID_PCO:
> >>>>>>>>                return MSG_OS_HINT_PCO;
> >>>>>>>>        case AMD_CPU_ID_RN:
> >>>>>>>> +     case AMD_CPU_ID_VG:
> >>>>>>>>        case AMD_CPU_ID_YC:
> >>>>>>>>        case AMD_CPU_ID_CB:
> >>>>>>>>        case AMD_CPU_ID_PS:
> >>>>>>>> @@ -717,6 +719,7 @@ static const struct pci_device_id pmc_pci_ids[] = {
> >>>>>>>>        { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_RV) },
> >>>>>>>>        { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_SP) },
> >>>>>>>>        { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_SHP) },
> >>>>>>>> +     { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_VG) },
> >>>>>>>>        { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M20H_ROOT) },
> >>>>>>>>        { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M60H_ROOT) },
> >>>>>>>>        { }
> >>>>>>>> diff --git a/drivers/platform/x86/amd/pmc/pmc.h b/drivers/platform/x86/amd/pmc/pmc.h
> >>>>>>>> index 62f3e51020fd..fe3f53eb5955 100644
> >>>>>>>> --- a/drivers/platform/x86/amd/pmc/pmc.h
> >>>>>>>> +++ b/drivers/platform/x86/amd/pmc/pmc.h
> >>>>>>>> @@ -156,6 +156,7 @@ void amd_mp2_stb_deinit(struct amd_pmc_dev *dev);
> >>>>>>>>    #define AMD_CPU_ID_RN                       0x1630
> >>>>>>>>    #define AMD_CPU_ID_PCO                      AMD_CPU_ID_RV
> >>>>>>>>    #define AMD_CPU_ID_CZN                      AMD_CPU_ID_RN
> >>>>>>>> +#define AMD_CPU_ID_VG                        0x1645
> >>>>>>>
> >>>>>>> Can you see if 0xF14 gives you a reasonable value for the idle mask if
> >>>>>>> you add it to amd_pmc_idlemask_read()?  Make a new define for it though;
> >>>>>>> it shouldn't use the same define as 0x1a platforms.
> >>>>>>
> >>>>>> It does not work. Reports 0. I also tested the other ones, but the
> >>>>>> 0x1a was the same as you said. All report 0x0.
> >>>>>
> >>>>> It's possible the platform doesn't report an idle mask.
> >>>>>
> >>>>> 0xF14 is where I would have expected it to report.
> >>>>>
> >>>>> Shyam - can you look into this to see if it's in a different place
> >>>>> than 0xF14 for Van Gogh?
> >>>>
> >>>> Van Gogh is before Cezanne? I am a bit surprised that the pmc driver is
> >>>> getting loaded there.
> >>>>
> >>>> Antheas - what is the output of
> >>>>
> >>>> #lspci -s 00:00.0
> >>>
> >>> OK. I get it from the diff.
> >>>
> >>> +#define AMD_CPU_ID_VG                        0x1645
> >>>
> >>> So it's 0x1645, which indicates the SoC is family 17h, model 90h.
> >>>
> >>> What is the PMFW version running on your system?
> >>> amd_pmc_get_smu_version() tells you that information.
> >>
> >> cat /sys/devices/platform/AMDI0005:00/smu_fw_version
> >> 63.18.0
> >> cat /sys/devices/platform/AMDI0005:00/smu_program
> >> 7
> >>
> >>> Can you set the scratch register to the same one as Cezanne and see if
> >>> that works? i.e.
> >>>
> >>> AMD_PMC_SCRATCH_REG_CZN(0x94) instead of AMD_PMC_SCRATCH_REG_1AH(0xF14)
> >>
> >> I tried all idle masks and they return 0
> >
> > Hi Shyam & Antheas,
> >
> > This discussion seems to have died down without clear indication what's
> > the best course of action here. Should I still wait?
> >
> > There's no particular hurry from my side but it seems Mario gave his
> > Reviewed-by already and there hasn't been any follow-ups between you two,
> > I'm left a bit unsure how to interpret that.
> >
>
> The thought process was to understand how we debug the remaining 5% of
> failures when we do not have the idlemask concept, which was introduced
> some time later. But both patches should work independently, so I
> am OK with both patch 1/3 and 2/3.
>
> Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@....com>
>
>
> >
> > In addition, is patch 3/3 entirely independent from these two PMC ones?
> > (If yes, I don't know why they were submitted as a series as that just
> > manages to add a little bit of uncertainty when combined into a series.)
>
> I see a note from Mario on the cover letter that the patch 3/3 can be
> dropped from this series and a newer approach is being planned.

To be more specific, patch 3 became two separate patches that went through drm.

For the rare failure, it would be an additional patch (if appropriate)
that does not affect 1 and 2.

Do you have any idea of where the failure for the other 5% of cases
comes from? I noticed that after I hibernated my device and it booted
up, it would never go into LPS0; the OS hint stopped working. Would
that be a hint?
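
For reference, the idle mask probing discussed in the quoted thread can be
sketched roughly as below. This is only an illustration, not the driver's
code: idlemask_offset_for() and AMD_PMC_SCRATCH_REG_VG_GUESS are hypothetical
names, and 0xF14 for Van Gogh is just the guess that was tested (every offset
tried read back 0x0 on this device). The CPU IDs and the 0x94/0xF14 offsets
are the values quoted in this thread.

```c
#include <stdint.h>

/* Values quoted in this thread. */
#define AMD_CPU_ID_RN             0x1630
#define AMD_CPU_ID_CZN            AMD_CPU_ID_RN
#define AMD_CPU_ID_VG             0x1645
#define AMD_PMC_SCRATCH_REG_CZN   0x94
#define AMD_PMC_SCRATCH_REG_1AH   0xF14

/*
 * Hypothetical: a separate define for the Van Gogh guess, so that it does
 * not reuse the 1Ah define. The value is an assumption under test, not a
 * documented register location.
 */
#define AMD_PMC_SCRATCH_REG_VG_GUESS  0xF14

/*
 * Hypothetical helper: return the scratch offset to probe for the idle
 * mask on a given CPU ID, or -1 if this sketch has no candidate for it.
 */
static int idlemask_offset_for(uint16_t cpu_id)
{
	switch (cpu_id) {
	case AMD_CPU_ID_CZN:
		return AMD_PMC_SCRATCH_REG_CZN;
	case AMD_CPU_ID_VG:
		return AMD_PMC_SCRATCH_REG_VG_GUESS;
	default:
		return -1;
	}
}

/* A value of 0 is what every probed offset returned on this device. */
static int idlemask_looks_valid(uint32_t val)
{
	return val != 0;
}
```

Keeping the Van Gogh offset behind its own define means it can be changed or
dropped later without touching the 1Ah platforms.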

Antheas

> So, 1/3 and 2/3 of this series can be taken.
>
> Thanks,
> Shyam
> >
> > Thanks in advance,
> >
> > --
> >  i.
> >
> >> Antheas
> >>
> >>> Thanks,
> >>> Shyam
> >>>
> >>>
> >>>>
> >>>> 0xF14 index is meant for 1Ah (i.e. Strix and above)
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Any idea why the OS hint only works 90% of the time?
> >>>>
> >>>> What is the output of amd_pmc_dump_registers() during the 10% of the
> >>>> time when the OS_HINT is not working?
> >>>>
> >>>> What I can surmise is that, though the pmc driver is sending the hint,
> >>>> PMFW is not taking any action (since the support in FW is missing).
> >>>>
> >>>>>
> >>>>> If we get the idle mask reporting working we would have a better idea
> >>>>> if that is what is reported wrong.
> >>>>>
> >>>>
> >>>> IIRC, the concept of idlemask came only after Cezanne, and even then only
> >>>> after a certain PMFW version. So I am not sure if idlemask actually exists.
> >>>>
> >>>>
> >>>>> If I was to guess though; maybe GFX is still active.
> >>>>>
> >>>>> Depending upon what's going wrong smu_fw_info might have some more
> >>>>> information too.
> >>>>
> >>>> That's a good point to try it out.
> >>>>
> >>>> Thanks,
> >>>> Shyam
> >>>>
> >>>
> >>>
> >>
> >>
> >
>
>

