Message-ID: <8dd1e8f6-f96d-4d36-ac2a-c258ac842f75@redhat.com>
Date: Wed, 23 Jul 2025 10:42:47 +0200
From: David Hildenbrand <david@...hat.com>
To: Xuanye Liu <liuqiye2025@....com>, Kees Cook <kees@...nel.org>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
<vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: add stack trace when bad rss-counter state is
detected
On 23.07.25 10:05, David Hildenbrand wrote:
> On 23.07.25 09:45, Xuanye Liu wrote:
>>
>> On 2025/7/23 15:31, Kees Cook wrote:
>>> On Wed, Jul 23, 2025 at 03:23:49PM +0800, Xuanye Liu wrote:
>>>> The check_mm() function verifies the correctness of rss counters in
>>>> struct mm_struct. Currently, it only prints an alert when a bad
>>>> rss-counter state is detected, but lacks sufficient context for
>>>> debugging.
>>>>
>>>> This patch adds a dump_stack() call to provide a stack trace when
>>>> the rss-counter state is invalid. This helps developers identify
>>>> where the corrupted mm_struct is being checked and trace the
>>>> underlying cause of the inconsistency.
>>> Why not just convert the pr_alert to a WARN?
>> Good idea! I'll gather more feedback from others and then update to v2.
>
> Makes sense to me.
After discussing this with Lorenzo off-list: isn't the stack trace
completely misleading/useless in that case?
Whatever caused the RSS counter mismatch (e.g., unmapping the wrong
pages, failing to unmap pages) quite possibly happened in a different
context, way, way earlier.
Why would you think the stack trace would be of any value when
destroying an MM (__mmdrop)?
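To illustrate the point, here is a minimal userspace sketch, not kernel
code. All names in it (toy_mm, map_pages, buggy_unmap_pages,
toy_check_mm, toy_mmdrop) are hypothetical stand-ins, and the off-by-one
in the unmap path is an assumed bug chosen purely for demonstration. It
shows that the check at teardown only sees the final counter value and
has no record of which earlier call left it wrong.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the per-mm RSS counter types. */
enum { TOY_FILE_PAGES, TOY_ANON_PAGES, TOY_NR_COUNTERS };

struct toy_mm {
    long rss[TOY_NR_COUNTERS];
};

/* Correct accounting: mapping n pages adds n to the counter. */
static void map_pages(struct toy_mm *mm, int type, long n)
{
    mm->rss[type] += n;
}

/*
 * Assumed bug for illustration: the unmap path misses one page,
 * so the counter is left one too high. Nothing is reported here.
 */
static void buggy_unmap_pages(struct toy_mm *mm, int type, long n)
{
    mm->rss[type] -= n - 1;
}

/*
 * Teardown-time check, loosely modelled on the idea behind check_mm():
 * every counter should be zero. Returns the number of bad counters.
 * A stack trace taken here would only show the teardown path.
 */
static int toy_check_mm(const struct toy_mm *mm)
{
    int bad = 0;

    for (int i = 0; i < TOY_NR_COUNTERS; i++) {
        if (mm->rss[i] != 0) {
            fprintf(stderr, "bad rss-counter state type:%d val:%ld\n",
                    i, mm->rss[i]);
            bad++;
        }
    }
    return bad;
}

/* Last reference dropped: the only place the mismatch becomes visible. */
static int toy_mmdrop(struct toy_mm *mm)
{
    return toy_check_mm(mm);
}
```

Calling map_pages() and buggy_unmap_pages() with the same count and then
toy_mmdrop() reports one bad counter, but the call chain at that moment
contains only toy_mmdrop() and toy_check_mm(), not the buggy unmap.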
That said, I really hate these pr_*("BUG: ...") messages with a passion.
We'd probably want to invoke the panic_on_warn machinery, because
something unexpected happened.
--
Cheers,
David / dhildenb