Message-ID: <20250428204840.GB1572343-robh@kernel.org>
Date: Mon, 28 Apr 2025 15:48:40 -0500
From: Rob Herring <robh@...nel.org>
To: Danilo Krummrich <dakr@...nel.org>
Cc: Dirk Behme <dirk.behme@...bosch.com>, Dirk Behme <dirk.behme@...il.com>,
	Remo Senekowitsch <remo@...nzli.dev>,
	Saravana Kannan <saravanak@...gle.com>,
	Miguel Ojeda <ojeda@...nel.org>,
	Alex Gaynor <alex.gaynor@...il.com>,
	Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
	Björn Roy Baron <bjorn3_gh@...tonmail.com>,
	Benno Lossin <benno.lossin@...ton.me>,
	Andreas Hindborg <a.hindborg@...nel.org>,
	Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	"Rafael J. Wysocki" <rafael@...nel.org>,
	linux-kernel@...r.kernel.org, devicetree@...r.kernel.org,
	rust-for-linux@...r.kernel.org
Subject: Re: [PATCH v3 3/7] rust: property: Introduce PropertyGuard

On Mon, Apr 28, 2025 at 06:09:36PM +0200, Danilo Krummrich wrote:
> On Mon, Apr 28, 2025 at 07:03:07AM +0200, Dirk Behme wrote:
> > On 27/04/2025 14:23, Danilo Krummrich wrote:
> > > On Sun, Apr 27, 2025 at 08:11:58AM +0200, Dirk Behme wrote:
> > >> On 26.04.25 12:15, Danilo Krummrich wrote:
> > >>> On Sat, Apr 26, 2025 at 08:19:09AM +0200, Dirk Behme wrote:
> > >>>> On 25.04.25 17:35, Danilo Krummrich wrote:
> > >>>>> On Fri, Apr 25, 2025 at 05:01:26PM +0200, Remo Senekowitsch wrote:
> > >>>>>> +impl<T> PropertyGuard<'_, '_, T> {
> > >>>>>> +    /// Access the property, indicating it is required.
> > >>>>>> +    ///
> > >>>>>> +    /// If the property is not present, the error is automatically logged. If a
> > >>>>>> +    /// missing property is not an error, use [`Self::optional`] instead.
> > >>>>>> +    pub fn required(self) -> Result<T> {
> > >>>>>> +        if self.inner.is_err() {
> > >>>>>> +            pr_err!(
> > >>>>>> +                "{}: property '{}' is missing\n",
> > >>>>>> +                self.fwnode.display_path(),
> > >>>>>> +                self.name
> > >>>>>> +            );
> > >>>>>
> > >>>>> Hm, we can't use the device pointer of the fwnode_handle, since it is not
> > >>>>> guaranteed to be valid, hence the pr_*() print...
> > >>>>>
> > >>>>> Anyways, I'm not sure we need to print here at all. If a driver wants to print
> > >>>>> that it is unhappy about a missing required property it can do so by itself, I
> > >>>>> think.
> > >>>>
> > >>>> Hmm, the driver said by using 'required' that it *is* required. So a
> > >>>> missing property is definitely an error here. Else it would have used
> > >>>> 'optional'. Which doesn't print in case the property is missing.
> > >>>>
> > >>>> If I remember correctly having 'required' and 'optional' is the result
> > >>>> of some discussion on Zulip. And one conclusion of that discussion was
> > >>>> to move checking & printing the error out of the individual drivers
> > >>>> into a central place to avoid this error checking & printing in each
> > >>>> and every driver. I think the idea is that the drivers just have to do
> > >>>> ...required()?; and that's it, then.
> > >>>
> > >>> Yes, I get the idea.
> > >>>
> > >>> If it'd be possible to use dev_err!() instead I wouldn't object in this specific
> > >>> case. But this code is used by drivers from probe(), hence printing the error
> > >>> without saying for which device it did occur is a bit pointless.
> > >>
> > >> Thinking a little about this, yes, we don't know the device here. But:
> > >> Does the device matter here?
> > > 
> > > If the above fails it means that for a (specific) device a driver expects that
> > > a specific property of some firmware node is present. So, yes, I think it does
> > > matter.
> > > 
> > >> There is nothing wrong with the (unknown)
> > >> device, no? What is wrong here is the firmware (node). It misses
> > >> something.
> > > 
> > > How do we know the firmware node is wrong? Maybe the driver has wrong
> > > expectations for this device?
> > > 
> > >> And this is exactly what the message tells: "There is an
> > >> error due to the missing node 'name' in 'path', please fix it". That
> > >> should be sufficient to identify the firmware/device tree description
> > >> and fix it.
> > > 
> > > I think we can't always fix them, even if they're wrong. How do we fix ACPI
> > > firmware nodes for instance?
> > 
> > So the argument here is that the device (driver) expecting something
> > to be "required" is what is wrong and might need to be fixed. Not the firmware.
> > Yes, ok, that is a valid argument. I have a device tree background and
> > there in 99% of the cases the device tree needs a fix ;)
> > 
> > But let me ask the other way around, then: What will it hurt or break if
> > we keep the pr_err() like Remo did? Even knowing that it's not perfect?
> > But knowing that it will give at least a note that something is wrong
> > with at least a starting point for searching what needs to be fixed. I
> > mean even if we don't get the device, we will get the affected node we
> > can search for which device uses it as "required".
> > 
> > Could we somehow agree that in 90% of the cases this should be caught
> > at device (driver) development time, already?
> 
> I don't see why *catching* such errors needs pr_err() in core code; without it
> you still get a Result as return value that you need to handle in some way.
> 
> > And therefore it should be
> > beneficial if we don't require each and every driver to be "bloated"
> > with checking this individually?
> 
> I guess you mean "bloated with *printing* this individually", rather than
> "checking".
> 
> This is where we disagree: I think it is "bloating" the core kernel instead if
> we start adding error prints to core code, where a proper error code is
> propagated up to the driver.

One or more error strings in every single driver is what bloats the
kernel, not one string in core code. It's all kernel code and memory
usage, whether it's core or drivers.

> I did say that I would agree to a certain extent with this specific one if we
> could print it properly, since it is designed to leave no doubt that returning
> an error code from required() is fatal for the driver. But I'm not even sure
> about this anymore.
> 
> I still haven't read a reason why this one is so crucial to print from core
> code, while for other things that are always fatal (e.g. request_irq()) we
> don't.

request_irq() is not always fatal. Some drivers fall back to polling. In
general, we've been adding _optional() variants of functions that return
NULL rather than an error, and subsequent API calls handle that NULL
silently. Secondarily, the non-optional variants print errors. We're not
really consistent in this area, but it's something we should improve.

> However, if you really think we need a common helper that prints something in
> the error case, maybe we can add an *additional* helper
> 
> 	pub fn required_by(self, dev: &Device) -> Result<T>
> 
> and document that it is the same as required(), with an additional error print
> in case of failure for the given device.
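
For reference, the split being discussed could look roughly like the
following. This is a userspace sketch under stated assumptions, not the
kernel API: `Device`, `Error::Missing`, the guard's fields, `new()` and
`missing_message()` are hypothetical stand-ins, and eprintln!() stands in
for dev_err!().

```rust
// Userspace sketch only. `Device`, `Error`, and the guard's fields are
// hypothetical stand-ins, not the kernel crate's types.
#[derive(Debug, PartialEq)]
pub enum Error {
    Missing,
}

pub struct Device {
    pub name: String,
}

pub struct PropertyGuard<'a, T> {
    inner: Result<T, Error>,
    node_path: &'a str,
    name: &'a str,
}

impl<'a, T> PropertyGuard<'a, T> {
    pub fn new(inner: Result<T, Error>, node_path: &'a str, name: &'a str) -> Self {
        Self { inner, node_path, name }
    }

    /// A missing property is not an error: no print, just `None`.
    pub fn optional(self) -> Option<T> {
        self.inner.ok()
    }

    /// Silent variant: the error is only propagated to the caller.
    pub fn required(self) -> Result<T, Error> {
        self.inner
    }

    /// Builds the message that `required_by()` prints on failure.
    pub fn missing_message(&self, dev: &Device) -> String {
        format!(
            "{}: {}: property '{}' is missing",
            dev.name, self.node_path, self.name
        )
    }

    /// Same as `required()`, plus an error print attributed to `dev`.
    pub fn required_by(self, dev: &Device) -> Result<T, Error> {
        if self.inner.is_err() {
            // Stand-in for dev_err!(dev, ...) in kernel code.
            eprintln!("{}", self.missing_message(dev));
        }
        self.inner
    }
}
```

With this shape a driver picks the behaviour at the call site: required()
stays quiet, and required_by(dev) names the device in the message.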

One thing that's really hard to debug in C drivers is where an
error came from. You can, for example, turn on initcall_debug and see
that a driver probe returned an error. It's virtually impossible to tell
where that originated from. The only way to tell is with prints. That is
probably the root of why probe has so many error prints. I think we can
do a lot better with Rust, given that Result can hold more than just an
int. We obviously can't get back to the origin if that was C code, but
just knowing exactly which call from probe failed would be a huge
improvement.
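
As a rough illustration of a Result carrying more than an int, here is a
userspace sketch, not a proposal for the actual kernel error type. It
assumes #[track_caller] and std::panic::Location (core::panic::Location
without std) are acceptable for this; `TracedError`, `required_property()`
and the -22 code are hypothetical placeholders.

```rust
use std::panic::Location;

/// Hypothetical error type: an errno-style code plus the source location
/// of the call that failed.
#[derive(Debug)]
pub struct TracedError {
    pub code: i32,
    pub origin: &'static Location<'static>,
}

impl TracedError {
    /// Records the caller's location. Because callers may themselves be
    /// `#[track_caller]`, the location propagates up through them.
    #[track_caller]
    pub fn new(code: i32) -> Self {
        TracedError {
            code,
            origin: Location::caller(),
        }
    }
}

/// Stand-in for a property accessor. `#[track_caller]` makes the recorded
/// origin the line in probe() that called this function.
#[track_caller]
pub fn required_property(present: bool) -> Result<u32, TracedError> {
    if present {
        Ok(42)
    } else {
        Err(TracedError::new(-22)) // placeholder error code
    }
}

/// Stand-in for a driver probe with two required properties.
pub fn probe(have_clock: bool, have_reg: bool) -> Result<(), TracedError> {
    let _clock = required_property(have_clock)?;
    let _reg = required_property(have_reg)?;
    Ok(())
}
```

Whoever receives the error from probe() can then report origin.file() and
origin.line() once, instead of every driver printing at every call site.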

Rob
