Message-ID: <674fd2b4942f1_3e0f629420@dwillia2-mobl3.amr.corp.intel.com.notmuch>
Date: Tue, 3 Dec 2024 19:55:32 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Raghavendra K T <raghavendra.kt@....com>, <linux-kernel@...r.kernel.org>,
<linux-cxl@...r.kernel.org>
CC: <bharata@....com>, Raghavendra K T <raghavendra.kt@....com>, Huang Ying
<ying.huang@...el.com>, Andrew Morton <akpm@...ux-foundation.org>, "Dan
Williams" <dan.j.williams@...el.com>, David Hildenbrand <david@...hat.com>,
Davidlohr Bueso <dave@...olabs.net>, Jonathan Cameron
<jonathan.cameron@...wei.com>, Dave Jiang <dave.jiang@...el.com>, "Alison
Schofield" <alison.schofield@...el.com>, Vishal Verma
<vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>, Alistair Popple
<apopple@...dia.com>, Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Bjorn Helgaas <bhelgaas@...gle.com>, Baoquan He <bhe@...hat.com>,
<ilpo.jarvinen@...ux.intel.com>, Mika Westerberg
<mika.westerberg@...ux.intel.com>, Fontenot Nathan <Nathan.Fontenot@....com>,
Wei Huang <wei.huang2@....com>, <regressions@...ts.linux.dev>
Subject: Re: [RFC PATCH] resource: Fix CXL node not populated issue
[ add regressions@...ts.linux.dev ]

Next time make the subject of the patch:

    Revert "resource: fix region_intersects() vs add_memory_driver_managed()"

...to make it clear that this is a revert, not a fix.

The revert should be applied if a fix does not materialize in the next few weeks.

Raghavendra K T wrote:
> Before:
> ~]$ numastat -m
> ...
>                           Node 0          Node 1           Total
>                  --------------- --------------- ---------------
> MemTotal               128096.18       128838.48       256934.65
>
> After:
> $ numastat -m
> .....
>                           Node 0          Node 1          Node 2           Total
>                  --------------- --------------- --------------- ---------------
> MemTotal               128054.16       128880.51       129024.00       385958.67
>
> Current patch reverts the effect of first commit where the issue is seen.

Might you be able to dig a bit further into the details, like the memory
map for this platform and its ACPI SRAT tables? A dmesg comparison of the
good and bad cases would be useful (those can be shared via a github gist).
Even better would be some debug instrumentation to identify which call
to __region_intersects() started behaving differently, resulting in a
whole node disappearing.
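
For the dmesg comparison, a minimal sketch in Python, assuming the two
logs were captured with the default "[  seconds.micros]" timestamp
prefix. The function names and the good/bad labels are illustrative
only; nothing here comes from the patch under discussion. Stripping the
timestamps keeps boot-to-boot timing jitter out of the diff so that only
changed messages show up:

```python
import difflib
import re

# Matches the leading "[   12.345678] " timestamp that dmesg prints by default.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(lines):
    """Strip timestamps and trailing whitespace so only message text is compared."""
    return [TIMESTAMP.sub("", line.rstrip()) for line in lines]


def dmesg_diff(good_lines, bad_lines):
    """Return a unified diff (no context lines) of two dmesg captures."""
    return list(difflib.unified_diff(
        normalize(good_lines), normalize(bad_lines),
        fromfile="good", tofile="bad", lineterm="", n=0))
```

Feeding it the lines of the two saved logs (for example from
open(path).read().splitlines(), with paths of your choosing) and
printing the result line by line gives output small enough to paste
into a gist next to the raw logs.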

In terms of the urgency of fixing this, it would also help to know how
common the system this was found on is in the wild.