Message-ID: <aWZZeD496CPi20Gc@bogus>
Date: Tue, 13 Jan 2026 14:40:56 +0000
From: Sudeep Holla <sudeep.holla@....com>
To: Feng Tang <feng.tang@...ux.alibaba.com>
Cc: "Rafael J . Wysocki" <rafael@...nel.org>, Len Brown <lenb@...nel.org>,
Jeremy Linton <jeremy.linton@....com>,
Sudeep Holla <sudeep.holla@....com>,
Hanjun Guo <guohanjun@...wei.com>,
James Morse <james.morse@....com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] ACPI: PPTT: Dump PPTT table when error detected

On Tue, Jan 13, 2026 at 04:25:29PM +0800, Feng Tang wrote:
> Hi Sudeep,
>
> Thanks for the reviews!
>
> On Mon, Jan 12, 2026 at 05:02:59PM +0000, Sudeep Holla wrote:
> > On Wed, Dec 31, 2025 at 06:49:09PM +0800, Feng Tang wrote:
> > > There was a warning message about the PPTT table:
> > >
> > > "ACPI PPTT: PPTT table found, but unable to locate core 1 (1)",
> > >
> > > and it in turn caused scheduler warnings when building up the system.
> > > It took a while to root-cause the problem, which turned out to be a broken
> > > PPTT table with wrong cache information.
> > >
> > > To speed up debugging similar issues, dump the PPTT table, which makes
> > > the warning more noticeable and helps bug hunting.
> > >
> > > The dumped info format on an ARM server is like:
> > >
> > > ACPI PPTT: Processors:
> > > P[ 0][0x0024]: parent=0x0000 acpi_proc_id= 0 num_res=1 flags=0x11(package)
> > > P[ 1][0x005a]: parent=0x0024 acpi_proc_id= 0 num_res=1 flags=0x12()
> > > P[ 2][0x008a]: parent=0x005a acpi_proc_id= 0 num_res=3 flags=0x1a(leaf)
> > > P[ 3][0x00f2]: parent=0x005a acpi_proc_id= 1 num_res=3 flags=0x1a(leaf)
> > > P[ 4][0x015a]: parent=0x005a acpi_proc_id= 2 num_res=3 flags=0x1a(leaf)
> > > ...
> > > ACPI PPTT: Caches:
> > > C[ 0][0x0072]: flags=0x7f next_level=0x0000 size=0x4000000 sets=65536 way=16 attribute=0xa line_size=64
> > > C[ 1][0x00aa]: flags=0x7f next_level=0x00da size=0x10000 sets=256 way=4 attribute=0x4 line_size=64
> > > C[ 2][0x00c2]: flags=0x7f next_level=0x00da size=0x10000 sets=256 way=4 attribute=0x2 line_size=64
> > > C[ 3][0x00da]: flags=0x7f next_level=0x0000 size=0x100000 sets=2048 way=8 attribute=0xa line_size=64
> > > ...
> > >
> > > It provides a global and straightforward view of the hierarchy of the
> > > processor and caches info of the platform, and from the offset info
> > > (the 3rd column), the child-parent relation could be checked.
> > >
> > > With this, the root cause of the original issue was pretty obvious:
> > > some cache items were missing, which caused the issue when
> > > building up the scheduler domains.
> > >
> >
> > While this may sound like a good idea, it deviates from how errors in other
> > table-parsing code are handled. Instead of dumping the entire table, it would
> > be preferable to report the specific issue encountered during parsing.
> >
> > I do not have a strong objection if Rafael is comfortable with this approach;
> > however, it does differ from the established pattern used by similar code.
> > Dumping the entire table in a custom manner is not the standard way of
> > handling parsing errors. Just my opinion.
>
> Yes, it's a fair point about the error handling. Actually, for the issue
> we met, the PPTT table complies nicely with the ACPI spec and the PPTT table
> spec: it has no checksum or format issue; the only problem is that some
> items are missing.
>
Agreed, but how is this any different from other tables that contain optional
entries the ASL compiler cannot detect?

> So I would say the dump itself doesn't break any existing ACPI table error
> handling, or change anything. As Hanjun suggested, it could be put under a
> CONFIG_ACPI_PPTT_ERR_DUMP option as a PPTT specific debug method, and not
> related to general ACPI table error handling.
>
Sure, that could be an option, as long as CONFIG_ACPI_PPTT_ERR_DUMP defaults to
off and is enabled only when debugging, not always-on as it would be in distro
images. Does that work for you?
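
A minimal sketch of that default-off gating, written as a userspace C stand-in
rather than kernel code. PPTT_ERR_DUMP_ENABLED, pptt_dump_table() and
pptt_report_missing_core() are hypothetical names used only for illustration;
in the kernel the check would presumably be IS_ENABLED(CONFIG_ACPI_PPTT_ERR_DUMP)
or an #ifdef, and the dump helper would be whatever the patch provides.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for IS_ENABLED(CONFIG_ACPI_PPTT_ERR_DUMP):
 * 1 only when the build defines the option, 0 otherwise (off by default).
 */
#ifdef CONFIG_ACPI_PPTT_ERR_DUMP
#define PPTT_ERR_DUMP_ENABLED 1
#else
#define PPTT_ERR_DUMP_ENABLED 0
#endif

/* Hypothetical placeholder for the table dump proposed in the patch. */
static int pptt_dump_table(FILE *out)
{
	fprintf(out, "ACPI PPTT: Processors:\n");
	fprintf(out, "ACPI PPTT: Caches:\n");
	return 2; /* number of lines written */
}

/*
 * Always report the specific failure; append the full dump only when the
 * debug option was enabled at build time. Returns the number of lines written.
 */
static int pptt_report_missing_core(FILE *out, unsigned int cpu,
				    unsigned int acpi_id)
{
	int lines = 1;

	fprintf(out,
		"ACPI PPTT: PPTT table found, but unable to locate core %u (%u)\n",
		cpu, acpi_id);
	if (PPTT_ERR_DUMP_ENABLED)
		lines += pptt_dump_table(out);
	return lines;
}
```

Built without -DCONFIG_ACPI_PPTT_ERR_DUMP, only the one-line warning is
emitted, which is the behaviour a distro image would keep.
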
> We have had this in our tree for a while, and the good part is that it gives a
> direct overview of all the processors and caches in the system: you get to
> know the rough number of them from the index, and items are listed side
> by side, so even a minor error can be very obvious in this side-by-side
> comparison.
>
Agreed, but all this info is already available to userspace in some form.
What does this dump give beyond help with debugging a broken PPTT?
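
As one illustration of the userspace view, the per-CPU cache attributes are
exposed under /sys/devices/system/cpu/cpuN/cache/indexM/ (for example level,
type, size, number_of_sets, ways_of_associativity, coherency_line_size). A
minimal sketch of reading one of them follows; read_cache_attr() is a
hypothetical helper name, and the sysfs_root parameter exists only so the
sketch can be pointed at a different tree.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read one cache attribute for a CPU from sysfs into buf and strip the
 * trailing newline. Returns 0 on success, -1 if the arguments are invalid,
 * the path does not fit, or the file is missing or unreadable.
 */
static int read_cache_attr(const char *sysfs_root, unsigned int cpu,
			   unsigned int index, const char *attr,
			   char *buf, size_t len)
{
	char path[256];
	FILE *f;
	int n;

	if (!buf || len == 0)
		return -1;

	n = snprintf(path, sizeof(path),
		     "%s/devices/system/cpu/cpu%u/cache/index%u/%s",
		     sysfs_root, cpu, index, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

On a running system, read_cache_attr("/sys", 0, 0, "size", buf, sizeof(buf))
would fetch the size of CPU 0's first cache index, where that file exists.
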
--
Regards,
Sudeep