Message-ID: <CAJZ5v0ggYTPnYVMGy1uWGCPQNOu=xGFDxm=qf2G05hvZ999s_g@mail.gmail.com>
Date: Wed, 14 Jan 2026 12:41:39 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Feng Tang <feng.tang@...ux.alibaba.com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Sudeep Holla <sudeep.holla@....com>, Len Brown <lenb@...nel.org>,
Jeremy Linton <jeremy.linton@....com>, Hanjun Guo <guohanjun@...wei.com>,
James Morse <james.morse@....com>, Joanthan Cameron <Jonathan.Cameron@...wei.com>,
linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] ACPI: PPTT: Dump PPTT table when error detected
On Wed, Jan 14, 2026 at 8:48 AM Feng Tang <feng.tang@...ux.alibaba.com> wrote:
>
> Hi Rafael,
>
> Thanks for the review!
>
> On Tue, Jan 13, 2026 at 05:21:07PM +0100, Rafael J. Wysocki wrote:
> > On Mon, Jan 12, 2026 at 6:03 PM Sudeep Holla <sudeep.holla@....com> wrote:
> > >
> > > On Wed, Dec 31, 2025 at 06:49:09PM +0800, Feng Tang wrote:
> > > > There was warning message about PPTT table:
> > > >
> > > > "ACPI PPTT: PPTT table found, but unable to locate core 1 (1)",
> > > >
> > > > and it in turn caused scheduler warnings when building up the system.
> > > > It took a while to root-cause the problem, which turned out to be a
> > > > broken PPTT table with wrong cache information.
> > > >
> > > > To speed up debugging similar issues, dump the PPTT table, which makes
> > > > the warning more noticeable and helps bug hunting.
> > > >
> > > > The dumped info format on an ARM server is like:
> > > >
> > > > ACPI PPTT: Processors:
> > > > P[ 0][0x0024]: parent=0x0000 acpi_proc_id= 0 num_res=1 flags=0x11(package)
> > > > P[ 1][0x005a]: parent=0x0024 acpi_proc_id= 0 num_res=1 flags=0x12()
> > > > P[ 2][0x008a]: parent=0x005a acpi_proc_id= 0 num_res=3 flags=0x1a(leaf)
> > > > P[ 3][0x00f2]: parent=0x005a acpi_proc_id= 1 num_res=3 flags=0x1a(leaf)
> > > > P[ 4][0x015a]: parent=0x005a acpi_proc_id= 2 num_res=3 flags=0x1a(leaf)
> > > > ...
> > > > ACPI PPTT: Caches:
> > > > C[ 0][0x0072]: flags=0x7f next_level=0x0000 size=0x4000000 sets=65536 way=16 attribute=0xa line_size=64
> > > > C[ 1][0x00aa]: flags=0x7f next_level=0x00da size=0x10000 sets=256 way=4 attribute=0x4 line_size=64
> > > > C[ 2][0x00c2]: flags=0x7f next_level=0x00da size=0x10000 sets=256 way=4 attribute=0x2 line_size=64
> > > > C[ 3][0x00da]: flags=0x7f next_level=0x0000 size=0x100000 sets=2048 way=8 attribute=0xa line_size=64
> > > > ...
> > > >
> > > > It provides a global and straightforward view of the hierarchy of the
> > > > processor and caches info of the platform, and from the offset info
> > > > (the 3rd column), the child-parent relation could be checked.
> > > >
> > > > With this, the root cause of the original issue was pretty obvious:
> > > > some cache items were missing, which caused the issue when
> > > > building up the scheduler domains.
> > > >
> > >
> > > While this may sound like a good idea, it deviates from how errors in other
> > > table-parsing code are handled. Instead of dumping the entire table, it would
> > > be preferable to report the specific issue encountered during parsing.
> > >
> > > I do not have a strong objection if Rafael is comfortable with this approach;
> >
> > I'm not a big fan of it TBH.
> >
> > > however, it does differ from the established pattern used by similar code.
> > > Dumping the entire table in a custom manner is not the standard way of
> > > handling parsing errors. Just my opinion.
> >
> > I agree.
>
> I understand the concern that this could be kind of special; Hanjun and
> Sudeep have the same feeling.
>
> The reason for the patch is:
> * The acpidump tool follows the standard general format to dump each item,
> without grouping them by type. Its output is about 20X longer than this
> dump, making it harder to parse

But you can develop a PPTT parser.
All ACPI tables are exposed verbatim via /sys/firmware/acpi/tables/.

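A minimal sketch of what such a user-space parser could look like, assuming
the processor-node (type 0) and cache (type 1) layouts from the ACPI
specification. It is an illustration, not code from this thread: the names
parse_pptt() and dangling_refs() are hypothetical, other subtable types are
skipped, and reading the sysfs file typically requires root.

```python
import struct
import sys

ACPI_HEADER_LEN = 36  # size of the standard ACPI table header


def parse_pptt(data):
    """Parse raw PPTT bytes into (processors, caches), keyed by table offset."""
    if data[:4] != b"PPTT":
        raise ValueError("not a PPTT table")
    (table_len,) = struct.unpack_from("<I", data, 4)
    end = min(table_len, len(data))
    processors, caches = {}, {}
    off = ACPI_HEADER_LEN
    while off + 2 <= end:
        kind, length = struct.unpack_from("<BB", data, off)
        if length < 2 or off + length > end:
            break  # malformed entry: stop rather than loop forever
        if kind == 0 and length >= 20:  # processor hierarchy node
            flags, parent, proc_id, num_res = struct.unpack_from(
                "<IIII", data, off + 4)
            # Never read more resource offsets than the entry can hold.
            count = min(num_res, (length - 20) // 4)
            resources = struct.unpack_from("<%dI" % count, data, off + 20)
            processors[off] = {"flags": flags, "parent": parent,
                               "acpi_proc_id": proc_id,
                               "resources": list(resources)}
        elif kind == 1 and length >= 24:  # cache type structure
            flags, next_level, size, sets, ways, attr, line_size = \
                struct.unpack_from("<IIIIBBH", data, off + 4)
            caches[off] = {"flags": flags, "next_level": next_level,
                           "size": size, "sets": sets, "ways": ways,
                           "attributes": attr, "line_size": line_size}
        off += length  # other subtable types are skipped
    return processors, caches


def dangling_refs(processors, caches):
    """List (offset, field, target) for references to entries that do not exist."""
    problems = []
    for off, proc in processors.items():
        if proc["parent"] and proc["parent"] not in processors:
            problems.append((off, "parent", proc["parent"]))
    for off, cache in caches.items():
        if cache["next_level"] and cache["next_level"] not in caches:
            problems.append((off, "next_level", cache["next_level"]))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python3 pptt.py /sys/firmware/acpi/tables/PPTT
    with open(sys.argv[1], "rb") as f:
        procs, cache_map = parse_pptt(f.read())
    for off, p in procs.items():
        print("P[0x%04x]: parent=0x%04x acpi_proc_id=%d num_res=%d flags=0x%x"
              % (off, p["parent"], p["acpi_proc_id"],
                 len(p["resources"]), p["flags"]))
    for off, c in cache_map.items():
        print("C[0x%04x]: next_level=0x%04x size=0x%x sets=%d way=%d line_size=%d"
              % (off, c["next_level"], c["size"], c["sets"],
                 c["ways"], c["line_size"]))
    for off, field, target in dangling_refs(procs, cache_map):
        print("0x%04x: %s points at missing entry 0x%04x" % (off, field, target))
```

Grouping entries by type and keying them by offset gives the same kind of
compact view as the proposed kernel dump, and the final loop reports a
specific broken reference instead of requiring the whole table to be read.
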
> * In rare cases, such as silicon enabling, the kernel can sometimes fail
> early, before user-space checking is available. If a HW debugger is
> not available either, the kernel dump is the only way to debug.

But I don't think you need to dump the entire table in those cases.

> Does the proposal of putting it under a kernel config look doable to you?

That would mean extra code that's almost never used and needs to be
taken into account when making changes that may affect it. Thanks,
but no thanks.