lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <18dbe345-e268-b640-3ab6-d7381b236edf@gmail.com>
Date:   Tue, 16 Nov 2021 10:07:13 -0500
From:   Frank Rowand <frowand.list@...il.com>
To:     Matthias Schiffer <matthias.schiffer@...tq-group.com>
Cc:     devicetree@...r.kernel.org, linux-kernel@...r.kernel.org,
        Rob Herring <robh+dt@...nel.org>
Subject: Re: [PATCH] of: base: Skip CPU nodes with non-"okay"/"disabled"
 status

Hi Mathias,

On 11/16/21 3:32 AM, Matthias Schiffer wrote:
> On Mon, 2021-11-15 at 12:23 -0500, Frank Rowand wrote:
>> Hi Matthias,
>>
>> On 11/15/21 3:13 AM, Matthias Schiffer wrote:
>>> On Sun, 2021-11-14 at 14:41 -0500, Frank Rowand wrote:
>>>> On 11/8/21 3:48 AM, Matthias Schiffer wrote:

It turns out I confused myself a bit...

The first email provided the clue that I needed:

>>>>> Allow fully disabling CPU nodes using status = "fail". Having no status
>>>>> property at all is still interpreted as "okay" as usual.

I managed to forget that sentence before I looked at the code in the patch.

>>>>>
>>>>> This allows a bootloader to change the number of available CPUs (for
>>>>> example when a common DTS is used for SoC variants with different numbers
>>>>> of cores) without deleting the nodes altogether, which could require
>>>>> additional fixups to avoid dangling phandle references.
>>>>>
>>>>> References:
>>>>> - https://www.lkml.org/lkml/2020/8/26/1237
>>>>> - https://www.spinics.net/lists/devicetree-spec/msg01007.html
>>>>> - https://github.com/devicetree-org/dt-schema/pull/61
>>>>>
>>>>> Signed-off-by: Matthias Schiffer <matthias.schiffer@...tq-group.com>
>>>>> ---
>>>>>  drivers/of/base.c | 29 +++++++++++++++++++++++++++++
>>>>>  1 file changed, 29 insertions(+)
>>>>>
>>>>> diff --git a/drivers/of/base.c b/drivers/of/base.c
>>>>> index 61de453b885c..4e9973627c8d 100644
>>>>> --- a/drivers/of/base.c
>>>>> +++ b/drivers/of/base.c
>>>>> @@ -650,6 +650,32 @@ bool of_device_is_available(const struct device_node *device)
>>>>>  }
>>>>>  EXPORT_SYMBOL(of_device_is_available);
>>>>>  
>>>>> +/**
>>>>> + *  __of_device_is_disabled - check if a device has status "disabled"
>>>>> + *
>>>>> + *  @device: Node to check status for, with locks already held
>>>>> + *
>>>>> + *  Return: True if the status property is set to "disabled",
>>>>> + *  false otherwise
>>>>> + *
>>>>> + *  Most callers should use __of_device_is_available() instead, this function
>>>>> + *  only exists due to the special interpretation of the "disabled" status for
>>>>> + *  CPU nodes.
>>>>> + */
>>>>> +static bool __of_device_is_disabled(const struct device_node *device)
>>>>> +{
>>>>> +	const char *status;
>>>>> +
>>>>> +	if (!device)
>>>>> +		return false;
>>>>> +
>>>>> +	status = __of_get_property(device, "status", NULL);
>>>>> +	if (status == NULL)
>>>>> +		return false;
>>>>> +
>>>>> +	return !strcmp(status, "disabled");
>>>>> +}
>>>>> +
>>>>>  /**
>>>>>   *  of_device_is_big_endian - check if a device has BE registers
>>>>>   *
>>>>> @@ -817,6 +843,9 @@ struct device_node *of_get_next_cpu_node(struct device_node *prev)
>>>>>  		of_node_put(node);
>>>>>  	}

So in the following test:

>>>>>  	for (; next; next = next->sibling) {
>>>>> +		if (!__of_device_is_available(next) &&
>>>>> +		    !__of_device_is_disabled(next))

   (!__of_device_is_available(next) &&
    !__of_device_is_disabled(next))

is meant to detect a status of "fail" or "fail-sss".  Since I forget the sentence
about "fail" from early in the email, I had difficulty in interpreting the intent
of the (!ok && !disabled) form of the test.  The intent of the test would be more
clear if it was checking _for_ "fail" instead of checking for _not_ the other
possible status values.

So you _are_ checking for the status of "fail" check (Rob's choice #1 below)
and I just did not understand that was the intent of the patch.

So I am fine with the patch if you change the above logic to check for
"fail" or "fail-sss".

It would also be good to add a comment to the of_get_next_cpu_node() header
comment that "fail" nodes are excluded.

Sorry for the confusion.

-Frank

>>>>
>>>> Shouldn't that just be a check to continue if the device is disabled?
>>>>
>>>> If adding a check for available, then all of the callers of for_each_of_cpu_node()
>>>> need to be checked.  There is at least one that is suspicious - arch/arm/mach-imx/platsmp.c
>>>> has a comment:
>>>>
>>>>  * Initialise the CPU possible map early - this describes the CPUs
>>>>  * which may be present or become present in the system.
>>
>> Thanks for the links to previous discussion you provided below.  I had
>> forgotten the previous discussion.
>>
>> In [2] Rob ended up saying that there were two options he was fine with.
>> Either (or both), in of_get_next_cpu_node(),
>>
>>   (1) use status of "fail" as the check or
>>
>>   (2) use status of "disabled" as the check, conditional on !IS_ENABLED(CONFIG_PPC)
>>       "this would need some spec update"
>>       "Need to double check MIPS and Sparc though."
>>
>> Neither of these two options are what this patch does.  It seems to me that
>> option 1 is probably the easiest and least intrusive method.
> 
> My intuition is that a device with an unknown status value should not
> be used. For most devices this is already handled by treating any value
> that is not unset, "okay" or "ok" the same. For CPU nodes, this would
> be the case by treating such values like "fail".
> 
> I did a quick grep through the in-tree Device Trees, and I did find a
> few unusual status properties (none of them in CPU nodes though):
> 
> - Typo "failed" (4 cases)
> - Typo "disable" (17 cases)
> - "reserved" (19 cases)
> 
> "fail" appears 2 times, the rest is "okay", "ok" or "disabled".
> 
> I do not have a strong opinion on this though - for our concrete
> usecase, checking for "fail" is fine, and we can treat unknown values
> like "disabled" if you prefer that solution. Should "fail-*" status
> values also be treated like "fail" then?
> 
> 
> 
> 
>>
>> -Frank
>>
>>> Previously, there were two option for the (effective) value of the
>>> status of a device_node:
>>>
>>> - "okay", "ok" or unset
>>> - anything else (which includes "disabled" and "fail")
>>>
>>> __of_device_is_available() checks which of these two is the case.
>>>
>>> With the new code, we have 3 cases for the status of CPU nodes:
>>>
>>> - "okay", "ok" or unset
>>> - "disabled"
>>> - anything else ("fail", ...)
>>>
>>> My patch will only change the behaviour in one case: When a CPU node
>>> has a status that is not "okay", "ok", "disabled" or unset - which is
>>> exactly the point of my patch.
>>>
>>> See also the change [1], which removed the !available check a while
>>> ago, and the discussion in [2], which led us to the conclusion that 
>>> of_get_next_cpu_node() must not distinguish "okay" and "disabled" CPU
>>> nodes and we instead need a third status to disable a CPU for real.
>>>
>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/of/base.c?id=c961cb3be9064d1097ccc019390f8b5739daafc6
>>> [2] https://www.lkml.org/lkml/2020/8/26/1237
>>>
>>>
>>>>
>>>> -Frank
>>>>
>>>>> +			continue;
>>>>>  		if (!(of_node_name_eq(next, "cpu") ||
>>>>>  		      __of_node_is_type(next, "cpu")))
>>>>>  			continue;
>>>>>
>>>>
>>>>
>>
>>
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ