linux-kernel - Re: [RFC PATCH 1/1] drivers: base: Expose probe failures via debugfs

Message-ID: <YLojgGvjAO0v/4l2@kroah.com>
Date:   Fri, 4 Jun 2021 14:58:40 +0200
From:   Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To:     Adrian Ratiu <adrian.ratiu@...labora.com>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        kernel@...labora.com, linux-kernel@...r.kernel.org,
        Guillaume Tucker <gtucker.collabora@...il.com>,
        Enric Balletbò <enric.balletbo@...labora.com>
Subject: Re: [RFC PATCH 1/1] drivers: base: Expose probe failures via debugfs

On Thu, Jun 03, 2021 at 11:00:14PM +0300, Adrian Ratiu wrote:
> On Thu, 03 Jun 2021, Greg Kroah-Hartman <gregkh@...uxfoundation.org> wrote:
> > On Thu, Jun 03, 2021 at 03:55:34PM +0300, Adrian Ratiu wrote:
> > > This adds a new devices_failed debugfs attribute to list driver
> > > probe failures excepting -EPROBE_DEFER which are still handled as
> > > before via their own devices_deferred attribute.
> > 
> > Who is going to use this?
> > 
> 
> It's for KernelCI testing, I explained the background in my other reply.
> 
> > > This is useful on automated test systems like KernelCI to avoid
> > > filtering dmesg dev_err() messages to extract potential probe
> > > failures.
> > 
> > I thought we listed these already some other way today?
> > 
> 
> The only other place is the printk buffer via dev_err(), and only the
> -EPROBE_DEFER subset of that info is exported via debugfs.
> 
> An additional problem with this new interface implementation is that it is
> based on the new-ish driver core "dev_err_probe" helper to which not all
> drivers have been converted (yet...), so there will be a mismatch between
> printk and this new interface.

Then why not move to use the new interface :)

> > > Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> > > Cc: "Rafael J. Wysocki" <rafael@...nel.org>
> > > Cc: Guillaume Tucker <gtucker.collabora@...il.com>
> > > Suggested-by: Enric Balletbò <enric.balletbo@...labora.com>
> > > Signed-off-by: Adrian Ratiu <adrian.ratiu@...labora.com>
> > > ---
> > >  drivers/base/core.c | 76 +++++++++++++++++++++++++++++++++++++++++++--
> > >  lib/Kconfig.debug   | 23 ++++++++++++++
> > >  2 files changed, 96 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > index b8a8c96dca58..74bf057234b8 100644
> > > --- a/drivers/base/core.c
> > > +++ b/drivers/base/core.c
> > > @@ -9,7 +9,9 @@
> > >   */
> > >
> > >  #include <linux/acpi.h>
> > > +#include <linux/circ_buf.h>
> > >  #include <linux/cpufreq.h>
> > > +#include <linux/debugfs.h>
> > >  #include <linux/device.h>
> > >  #include <linux/err.h>
> > >  #include <linux/fwnode.h>
> > > @@ -53,6 +55,15 @@ static DEFINE_MUTEX(fwnode_link_lock);
> > >  static bool fw_devlink_is_permissive(void);
> > >  static bool fw_devlink_drv_reg_done;
> > >
> > > +#ifdef CONFIG_DEBUG_FS_PROBE_ERR
> > > +#define PROBE_ERR_BUF_ELEM_SIZE	64
> > > +#define PROBE_ERR_BUF_SIZE	(1 << CONFIG_DEBUG_FS_PROBE_ERR_BUF_SHIFT)
> > > +static struct circ_buf probe_err_crbuf;
> > > +static char failed_probe_buffer[PROBE_ERR_BUF_SIZE];
> > > +static DEFINE_MUTEX(failed_probe_mutex);
> > > +static struct dentry *devices_failed_probe;
> > > +#endif
> > 
> > All of this just for a log buffer?  The kernel provides a great one,
> > printk, let's not create yet-another-log-buffer if at all possible
> > please.
> 
> Yes, that is correct, this is essentially duplicating information already
> exposed via the printk buffer.

Not good, I will not take this for that reason alone.  Also I don't want
to maintain something like this for the next 10+ years for no good
reason.

> > If the existing messages are "hard to parse", what can we do to make
> > them "easier" that would allow systems to do something with them?
> > 
> > What _do_ systems want to do with this information anyway?  What does it
> > help with exactly?
> > 
> 
> I know driver core probe error message formats are unlikely to change over
> time and debugfs in theory is as "stable" as printk, but I think the main
> concern is to find a more reliable way than parsing printk to extract
> probe errors, as is done for the existing devices_deferred in debugfs.

But what exactly are you trying to detect?  And what are you going to do
if you detect it?

> The idea in my specific case is to be able to reliably run driver tests in
> KernelCI for expected or unexpected probe errors like -EINVAL.

How about making a "real" test for this type of thing and we add that to
the kernel itself?  Wouldn't that be a better thing to have?

thanks,

greg k-h
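
Editorial sketch (not part of the original message): the dmesg filtering
that the thread wants to avoid can be illustrated with a small userspace
parser. It assumes the driver core reports failures in the form
"<driver>: probe of <device> failed with error <errno>", optionally
preceded by a "[ timestamp]" prefix; that format is an assumption about
kernels of this era and may differ elsewhere. The names probe_failure and
parse_probe_failure are hypothetical, chosen only for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one parsed probe failure. */
struct probe_failure {
	char driver[64];
	char device[64];
	int err;
};

/*
 * Parse one log line of the ASSUMED form
 *   "<driver>: probe of <device> failed with error <errno>"
 * with an optional leading "[ timestamp] " prefix.
 * Returns 1 and fills *out on a match; returns 0 and leaves *out
 * untouched otherwise.
 */
static int parse_probe_failure(const char *line, struct probe_failure *out)
{
	struct probe_failure tmp;
	const char *p = line;

	/* Skip a leading "[    1.234567] " timestamp if present. */
	if (*p == '[') {
		const char *end = strchr(p, ']');

		if (!end)
			return 0;
		p = end + 1;
		while (*p == ' ')
			p++;
	}

	/* All three fields must be converted for the line to count. */
	if (sscanf(p, "%63[^:]: probe of %63s failed with error %d",
		   tmp.driver, tmp.device, &tmp.err) != 3)
		return 0;

	*out = tmp;
	return 1;
}
```

A test harness could feed each line of the kernel log to
parse_probe_failure() and compare the collected (driver, device, errno)
triples against an expected list. The fragility of matching on message
text like this is what the thread is debating.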