Message-ID: <01b44e0f-ea2e-406f-9f65-b698b5504f42@kernel.org>
Date: Tue, 4 Nov 2025 10:38:54 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Xie Yuanbin <xieyuanbin1@...wei.com>, david@...hat.com,
dave.hansen@...el.com, bp@...en8.de, tglx@...utronix.de, mingo@...hat.com,
dave.hansen@...ux.intel.com, hpa@...or.com, akpm@...ux-foundation.org,
lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, vbabka@...e.cz,
rppt@...nel.org, surenb@...gle.com, mhocko@...e.com, linmiaohe@...wei.com,
nao.horiguchi@...il.com, luto@...nel.org, peterz@...radead.org,
tony.luck@...el.com
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-edac@...r.kernel.org, will@...nel.org, liaohua4@...wei.com,
lilinjie8@...wei.com
Subject: Re: [PATCH v2 2/2] mm/memory-failure: remove the selection of RAS
On 04.11.25 08:23, Xie Yuanbin wrote:
> The commit 97f0b13452198290799f ("tracing: add trace event for
> memory-failure") introduces the selection of RAS in memory-failure.
> This commit is just a tracing feature; in reality, there is no dependency
> between memory-failure and RAS. RAS increases the size of the bzImage
> image by 8k, which is very valuable for embedded devices.
>
> Move the memory-failure tracing code from ras_event.h to
> memory-failure.h and remove the selection of RAS.
>
> Signed-off-by: Xie Yuanbin <xieyuanbin1@...wei.com>
> Cc: David Hildenbrand <david@...hat.com>
> Cc: Borislav Petkov <bp@...en8.de>
> ---
[...]
> +++ b/include/trace/events/memory-failure.h
> @@ -0,0 +1,97 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM ras
This trace system should not be called "ras". All RAS terminology should
be removed from this header. For example:

#define TRACE_SYSTEM memory_failure
> +#define TRACE_INCLUDE_FILE memory-failure
> +
> +#if !defined(_TRACE_MEMORY_FAILURE_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_MEMORY_FAILURE_H
> +
> +#include <linux/tracepoint.h>
> +#include <linux/mm.h>
> +
> +/*
> + * memory-failure recovery action result event
> + *
> + * unsigned long pfn - Page Frame Number of the corrupted page
> + * int type - Page types of the corrupted page
> + * int result - Result of recovery action
> + */
> +
> +#define MF_ACTION_RESULT \
> + EM ( MF_IGNORED, "Ignored" ) \
> + EM ( MF_FAILED, "Failed" ) \
> + EM ( MF_DELAYED, "Delayed" ) \
> + EMe ( MF_RECOVERED, "Recovered" )
> +
> +#define MF_PAGE_TYPE \
> + EM ( MF_MSG_KERNEL, "reserved kernel page" ) \
> + EM ( MF_MSG_KERNEL_HIGH_ORDER, "high-order kernel page" ) \
> + EM ( MF_MSG_HUGE, "huge page" ) \
> + EM ( MF_MSG_FREE_HUGE, "free huge page" ) \
> + EM ( MF_MSG_GET_HWPOISON, "get hwpoison page" ) \
> + EM ( MF_MSG_UNMAP_FAILED, "unmapping failed page" ) \
> + EM ( MF_MSG_DIRTY_SWAPCACHE, "dirty swapcache page" ) \
> + EM ( MF_MSG_CLEAN_SWAPCACHE, "clean swapcache page" ) \
> + EM ( MF_MSG_DIRTY_MLOCKED_LRU, "dirty mlocked LRU page" ) \
> + EM ( MF_MSG_CLEAN_MLOCKED_LRU, "clean mlocked LRU page" ) \
> + EM ( MF_MSG_DIRTY_UNEVICTABLE_LRU, "dirty unevictable LRU page" ) \
> + EM ( MF_MSG_CLEAN_UNEVICTABLE_LRU, "clean unevictable LRU page" ) \
> + EM ( MF_MSG_DIRTY_LRU, "dirty LRU page" ) \
> + EM ( MF_MSG_CLEAN_LRU, "clean LRU page" ) \
> + EM ( MF_MSG_TRUNCATED_LRU, "already truncated LRU page" ) \
> + EM ( MF_MSG_BUDDY, "free buddy page" ) \
> + EM ( MF_MSG_DAX, "dax page" ) \
> + EM ( MF_MSG_UNSPLIT_THP, "unsplit thp" ) \
> + EM ( MF_MSG_ALREADY_POISONED, "already poisoned" ) \
> + EMe ( MF_MSG_UNKNOWN, "unknown page" )
> +
> +/*
> + * First define the enums in MM_ACTION_RESULT to be exported to userspace
> + * via TRACE_DEFINE_ENUM().
> + */
> +#undef EM
> +#undef EMe
> +#define EM(a, b) TRACE_DEFINE_ENUM(a);
> +#define EMe(a, b) TRACE_DEFINE_ENUM(a);
> +
> +MF_ACTION_RESULT
> +MF_PAGE_TYPE
> +
> +/*
> + * Now redefine the EM() and EMe() macros to map the enums to the strings
> + * that will be printed in the output.
> + */
> +#undef EM
> +#undef EMe
> +#define EM(a, b) { a, b },
> +#define EMe(a, b) { a, b }
> +
> +TRACE_EVENT(memory_failure_event,
> + TP_PROTO(unsigned long pfn,
> + int type,
> + int result),
> +
> + TP_ARGS(pfn, type, result),
> +
> + TP_STRUCT__entry(
> + __field(unsigned long, pfn)
> + __field(int, type)
> + __field(int, result)
> + ),
> +
> + TP_fast_assign(
> + __entry->pfn = pfn;
> + __entry->type = type;
> + __entry->result = result;
> + ),
> +
> + TP_printk("pfn %#lx: recovery action for %s: %s",
> + __entry->pfn,
> + __print_symbolic(__entry->type, MF_PAGE_TYPE),
> + __print_symbolic(__entry->result, MF_ACTION_RESULT)
> + )
> +);
> +#endif /* _TRACE_MEMORY_FAILURE_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
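For anyone unfamiliar with the EM()/EMe() pattern in the header above, here is a minimal userspace sketch of the same two-pass idea. It is not kernel code: the local enum, the counting pass (standing in for TRACE_DEFINE_ENUM(), which only exists in the kernel), struct sym and mf_result_name() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's enum; values are illustrative only. */
enum mf_result { MF_IGNORED, MF_FAILED, MF_DELAYED, MF_RECOVERED };

/* One list, written once, expanded several times with different EM/EMe. */
#define MF_ACTION_RESULT \
	EM ( MF_IGNORED, "Ignored" ) \
	EM ( MF_FAILED, "Failed" ) \
	EM ( MF_DELAYED, "Delayed" ) \
	EMe ( MF_RECOVERED, "Recovered" )

/*
 * Pass 1: each entry expands to "+ 1", so the list counts itself.
 * (The kernel header uses this pass for TRACE_DEFINE_ENUM() instead.)
 */
#define EM(a, b) + 1
#define EMe(a, b) + 1
enum { MF_RESULT_COUNT = 0 MF_ACTION_RESULT };
static_assert(MF_RESULT_COUNT == 4, "list and enum out of sync");

/* Pass 2: redefine EM/EMe so the same list becomes { value, name } pairs. */
#undef EM
#undef EMe
#define EM(a, b) { a, b },
#define EMe(a, b) { a, b }

/* Hypothetical table type, loosely mirroring what __print_symbolic() consumes. */
struct sym {
	int value;
	const char *name;
};

static const struct sym mf_results[] = { MF_ACTION_RESULT };

/* Hypothetical helper: linear lookup of the string for a given value. */
static const char *mf_result_name(int value)
{
	for (size_t i = 0; i < sizeof(mf_results) / sizeof(mf_results[0]); i++)
		if (mf_results[i].value == value)
			return mf_results[i].name;
	return "unknown";
}
```

With this, mf_result_name(MF_DELAYED) yields "Delayed", and a value not in the list falls through to "unknown". EMe() exists only so the last entry can omit the trailing comma.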
We want to add that new file to the "HWPOISON MEMORY FAILURE HANDLING"
section in MAINTAINERS.
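Assuming the section keeps its usual "F:" file-pattern entries, that would presumably be a single extra line along these lines (where it sorts among the existing entries is not checked here):

```
F:	include/trace/events/memory-failure.h
```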
Nothing else jumped out at me.
--
Cheers
David