Message-ID: <5599808.DvuYhMxLoT@redhat.com>
Date: Wed, 07 Sep 2022 08:15:06 +0200
From: Oleksandr Natalenko <oleksandr@...hat.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
linux-fsdevel@...r.kernel.org, Jonathan Corbet <corbet@....net>,
Alexander Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
Huang Ying <ying.huang@...el.com>,
"Jason A . Donenfeld" <Jason@...c4.com>,
Will Deacon <will@...nel.org>,
"Guilherme G . Piccoli" <gpiccoli@...lia.com>,
Laurent Dufour <ldufour@...ux.ibm.com>,
Stephen Kitt <steve@....org>, Rob Herring <robh@...nel.org>,
Joel Savitz <jsavitz@...hat.com>,
Kees Cook <keescook@...omium.org>,
Xiaoming Ni <nixiaoming@...wei.com>,
Luis Chamberlain <mcgrof@...nel.org>,
Renaud Métrich <rmetrich@...hat.com>,
Oleg Nesterov <oleg@...hat.com>,
Grzegorz Halat <ghalat@...hat.com>, Qi Guo <qguo@...hat.com>
Subject: Re: [PATCH] core_pattern: add CPU specifier
Hello.
On Wednesday, 7 September 2022 0:22:42 CEST Eric W. Biederman wrote:
> Oleksandr Natalenko <oleksandr@...hat.com> writes:
>
> > Statistically, in a large deployment regular segfaults may indicate a CPU issue.
> >
> > Currently, it is not possible to find out what CPU the segfault happened on.
> > There are at least two attempts to improve segfault logging in this regard,
> > but they do not help in case the logs rotate.
> >
> > Hence, let's make sure it is possible to permanently record a CPU
> > the task ran on using a new core_pattern specifier.
>
> I am puzzled why make it part of the file name, and not part of the
> core file? Say an elf note?
This might be a good idea too, and one approach doesn't exclude the other one.
> The big advantage is that you could always capture the cpu and
> will not need to take special care configuring your system to
> capture that information.
The advantage of having the CPU recorded in the file name is that, when there are multiple core dumps, one can summarise them with a simple ls+grep, without invoking a fully-featured debugger, to find out whether the segfaults happened on the same CPU.
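To illustrate the kind of summary I mean, here is a rough sketch only, not part of the patch. It assumes a hypothetical core_pattern of "core.%e.%p.cpu%C", so that a dump would be named something like "core.myapp.1234.cpu7". The ".cpu" prefix, the helper name count_cores_by_cpu, and the directory handling are all my assumptions for the example.

```python
import os
import re
import sys
from collections import Counter

# Assumption (hypothetical): kernel.core_pattern = core.%e.%p.cpu%C,
# so file names end in ".cpu<N>", e.g. "core.myapp.1234.cpu7".
CPU_SUFFIX = re.compile(r"\.cpu(\d+)$")


def count_cores_by_cpu(filenames):
    """Count core files per CPU number parsed from the file name suffix."""
    counts = Counter()
    for name in filenames:
        match = CPU_SUFFIX.search(name)
        if match:  # skip files that do not follow the assumed naming
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Directory holding the core files; defaults to the current directory.
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    summary = count_cores_by_cpu(os.listdir(directory))
    # Most frequently seen CPUs first.
    for cpu, n in summary.most_common():
        print(f"CPU {cpu}: {n} core file(s)")
```

If one CPU number dominates the output across unrelated executables, that would be a hint worth following up on, and it takes nothing more than the file names to see it.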
Thanks.
> Eric
>
> > Suggested-by: Renaud Métrich <rmetrich@...hat.com>
> > Signed-off-by: Oleksandr Natalenko <oleksandr@...hat.com>
> > ---
> > Documentation/admin-guide/sysctl/kernel.rst | 1 +
> > fs/coredump.c | 5 +++++
> > include/linux/coredump.h | 1 +
> > 3 files changed, 7 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> > index 835c8844bba48..b566fff04946b 100644
> > --- a/Documentation/admin-guide/sysctl/kernel.rst
> > +++ b/Documentation/admin-guide/sysctl/kernel.rst
> > @@ -169,6 +169,7 @@ core_pattern
> > %f executable filename
> > %E executable path
> > %c maximum size of core file by resource limit RLIMIT_CORE
> > + %C CPU the task ran on
> > %<OTHER> both are dropped
> > ======== ==========================================
> >
> > diff --git a/fs/coredump.c b/fs/coredump.c
> > index a8661874ac5b6..166d1f84a9b17 100644
> > --- a/fs/coredump.c
> > +++ b/fs/coredump.c
> > @@ -325,6 +325,10 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
> > err = cn_printf(cn, "%lu",
> > rlimit(RLIMIT_CORE));
> > break;
> > + /* CPU the task ran on */
> > + case 'C':
> > + err = cn_printf(cn, "%d", cprm->cpu);
> > + break;
> > default:
> > break;
> > }
> > @@ -535,6 +539,7 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> > */
> > .mm_flags = mm->flags,
> > .vma_meta = NULL,
> > + .cpu = raw_smp_processor_id(),
> > };
> >
> > audit_core_dumps(siginfo->si_signo);
> > diff --git a/include/linux/coredump.h b/include/linux/coredump.h
> > index 08a1d3e7e46d0..191dcf5af6cb9 100644
> > --- a/include/linux/coredump.h
> > +++ b/include/linux/coredump.h
> > @@ -22,6 +22,7 @@ struct coredump_params {
> > struct file *file;
> > unsigned long limit;
> > unsigned long mm_flags;
> > + int cpu;
> > loff_t written;
> > loff_t pos;
> > loff_t to_skip;
--
Oleksandr Natalenko (post-factum)
Principal Software Maintenance Engineer