Message-ID: <aWdKLClEHQ1cg34-@U-2FWC9VHC-2323.local>
Date: Wed, 14 Jan 2026 15:47:56 +0800
From: Feng Tang <feng.tang@...ux.alibaba.com>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Sudeep Holla <sudeep.holla@....com>, Len Brown <lenb@...nel.org>,
	Jeremy Linton <jeremy.linton@....com>,
	Hanjun Guo <guohanjun@...wei.com>,
	James Morse <james.morse@....com>,
	Jonathan Cameron <Jonathan.Cameron@...wei.com>,
	linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] ACPI: PPTT: Dump PPTT table when error detected

Hi Rafael,

Thanks for the review!

On Tue, Jan 13, 2026 at 05:21:07PM +0100, Rafael J. Wysocki wrote:
> On Mon, Jan 12, 2026 at 6:03 PM Sudeep Holla <sudeep.holla@....com> wrote:
> >
> > On Wed, Dec 31, 2025 at 06:49:09PM +0800, Feng Tang wrote:
> > > There was a warning message about the PPTT table:
> > >
> > >       "ACPI PPTT: PPTT table found, but unable to locate core 1 (1)",
> > >
> > > and it in turn caused scheduler warnings when building up the system.
> > > It took a while to root-cause the problem, which turned out to be a
> > > broken PPTT table with wrong cache information.
> > >
> > > To speed up debugging of similar issues, dump the PPTT table, which
> > > makes the warning more noticeable and helps bug hunting.
> > >
> > > The dumped info format on an ARM server is like:
> > >
> > >     ACPI PPTT: Processors:
> > >     P[  0][0x0024]: parent=0x0000 acpi_proc_id=  0 num_res=1 flags=0x11(package)
> > >     P[  1][0x005a]: parent=0x0024 acpi_proc_id=  0 num_res=1 flags=0x12()
> > >     P[  2][0x008a]: parent=0x005a acpi_proc_id=  0 num_res=3 flags=0x1a(leaf)
> > >     P[  3][0x00f2]: parent=0x005a acpi_proc_id=  1 num_res=3 flags=0x1a(leaf)
> > >     P[  4][0x015a]: parent=0x005a acpi_proc_id=  2 num_res=3 flags=0x1a(leaf)
> > >     ...
> > >     ACPI PPTT: Caches:
> > >     C[   0][0x0072]: flags=0x7f next_level=0x0000 size=0x4000000  sets=65536  way=16 attribute=0xa  line_size=64
> > >     C[   1][0x00aa]: flags=0x7f next_level=0x00da size=0x10000    sets=256    way=4  attribute=0x4  line_size=64
> > >     C[   2][0x00c2]: flags=0x7f next_level=0x00da size=0x10000    sets=256    way=4  attribute=0x2  line_size=64
> > >     C[   3][0x00da]: flags=0x7f next_level=0x0000 size=0x100000   sets=2048   way=8  attribute=0xa  line_size=64
> > >     ...
> > >
> > > It provides a global and straightforward view of the hierarchy of the
> > > processor and caches info of the platform, and from the offset info
> > > (the 3rd column), the child-parent relation could be checked.
> > >
> > > With this, the root cause of the original issue was pretty obvious:
> > > some cache items were missing, which caused the issue when building
> > > up the scheduler domains.
> > >
> >
> > While this may sound like a good idea, it deviates from how errors in other
> > table-parsing code are handled. Instead of dumping the entire table, it would
> > be preferable to report the specific issue encountered during parsing.
> >
> > I do not have a strong objection if Rafael is comfortable with this approach;
> 
> I'm not a big fan of it TBH.
> 
> > however, it does differ from the established pattern used by similar code.
> > Dumping the entire table in a custom manner is not the standard way of
> > handling parsing errors. Just my opinion.
> 
> I agree.

I understand the concern that this approach is somewhat unusual; Hanjun and
Sudeep have the same feeling.

The reasons for the patch are:
* The acpidump tool follows the standard generic format and dumps each item
  without grouping items by type. Its output is about 20X longer than this
  dump, which makes it harder to parse.
* In rare cases, such as silicon enabling, the kernel can fail early, before
  any user-space checking is available. If a HW debugger is not available
  either, the kernel dump is the only way to debug.
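
To illustrate why the grouped format is easy to check, here is a minimal
user-space sketch (not part of the patch) that reads the dumped lines and
reports any parent= or next_level= offset that does not point at a dumped
node of the same type. The sample lines are copied from the truncated dump
quoted above; the name find_dangling() and the regular expressions are only
for illustration, and treating offset 0 as "no parent / no next level" is an
assumption of this sketch.

```python
import re

# Sample lines copied from the dump quoted above (which is itself truncated).
SAMPLE = """\
ACPI PPTT: Processors:
P[  0][0x0024]: parent=0x0000 acpi_proc_id=  0 num_res=1 flags=0x11(package)
P[  1][0x005a]: parent=0x0024 acpi_proc_id=  0 num_res=1 flags=0x12()
P[  2][0x008a]: parent=0x005a acpi_proc_id=  0 num_res=3 flags=0x1a(leaf)
P[  3][0x00f2]: parent=0x005a acpi_proc_id=  1 num_res=3 flags=0x1a(leaf)
P[  4][0x015a]: parent=0x005a acpi_proc_id=  2 num_res=3 flags=0x1a(leaf)
ACPI PPTT: Caches:
C[   0][0x0072]: flags=0x7f next_level=0x0000 size=0x4000000  sets=65536  way=16 attribute=0xa  line_size=64
C[   1][0x00aa]: flags=0x7f next_level=0x00da size=0x10000    sets=256    way=4  attribute=0x4  line_size=64
C[   2][0x00c2]: flags=0x7f next_level=0x00da size=0x10000    sets=256    way=4  attribute=0x2  line_size=64
C[   3][0x00da]: flags=0x7f next_level=0x0000 size=0x100000   sets=2048   way=8  attribute=0xa  line_size=64
"""

# Matches "P[ idx][0xOFFSET]: rest" or "C[ idx][0xOFFSET]: rest" anywhere in a
# line, so a log prefix in front of the entry does not matter.
NODE_RE = re.compile(r"([PC])\[\s*\d+\]\[(0x[0-9a-fA-F]+)\]:\s*(.*)")
# The field that links a node to another node of the same type.
LINK_RE = re.compile(r"\b(parent|next_level)=(0x[0-9a-fA-F]+)")


def find_dangling(dump_text):
    """Return (offset, field, target) for links that point at no dumped node."""
    nodes = {}  # offset -> (kind, link field name, link target offset)
    for line in dump_text.splitlines():
        m = NODE_RE.search(line)
        if not m:
            continue  # group headers, "..." lines, unrelated log lines
        kind, offset, rest = m.group(1), int(m.group(2), 16), m.group(3)
        link = LINK_RE.search(rest)
        field = link.group(1) if link else ""
        target = int(link.group(2), 16) if link else 0
        nodes[offset] = (kind, field, target)

    dangling = []
    for offset, (kind, field, target) in sorted(nodes.items()):
        if target == 0:
            continue  # assumption: 0 means "no parent" / "no next level"
        if target not in nodes or nodes[target][0] != kind:
            dangling.append((offset, field, target))
    return dangling


if __name__ == "__main__":
    for offset, field, target in find_dangling(SAMPLE):
        print(f"node at {offset:#06x}: {field}={target:#06x} not found in dump")
```

On the sample lines above nothing is reported. If the cache entry at offset
0x00da is dropped from the input, the two caches that reference it through
next_level are flagged. A real dump is longer than the excerpt, so the check
only makes sense on the complete output.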

Does the proposal of putting it under a kernel config option look doable to
you? If not, I will keep the code local for now.

Thanks,
Feng


