linux-kernel - Re: [PATCH 3/3] tpm: make locality request return value consistent

Message-ID: <33838539-62ec-43d0-8223-b3d5df4bf8f6@infineon.com>
Date: Tue, 20 Feb 2024 19:57:49 +0100
From: Alexander Steffen <Alexander.Steffen@...ineon.com>
To: "Daniel P. Smith" <dpsmith@...rtussolutions.com>, Jarkko Sakkinen
	<jarkko@...nel.org>, Jason Gunthorpe <jgg@...pe.ca>,
	<linux-integrity@...r.kernel.org>, <linux-kernel@...r.kernel.org>
CC: Ross Philipson <ross.philipson@...cle.com>, Peter Huewe
	<peterhuewe@....de>
Subject: Re: [PATCH 3/3] tpm: make locality request return value consistent

On 19.02.2024 21:29, Daniel P. Smith wrote:
> On 2/1/24 17:49, Jarkko Sakkinen wrote:
>> On Wed Jan 31, 2024 at 7:08 PM EET, Daniel P. Smith wrote:
>>> The function tpm_tis_request_locality() is expected to return the locality
>>> value that was requested, or a negative error code upon failure. If it is called
>>> while locality_count of struct tis_data is non-zero, no actual locality request
>>> will be sent. Because the ret variable is initially set to 0, the
>>> locality_count will still get increased, and the function will return 0. For a
>>> caller, this would indicate that locality 0 was successfully requested and not
>>> the state changes just mentioned.
>>>
>>> Additionally, the function __tpm_tis_request_locality() provides inconsistent
>>> error codes. It will provide either a failed IO write or a -1 should it have
>>> timed out waiting for locality request to succeed.
>>>
>>> This commit changes __tpm_tis_request_locality() to return valid negative error
>>> codes to reflect the reason it fails. It then adjusts the return value check in
>>> tpm_tis_request_locality() to check for a non-negative return value before
>>> incrementing locality_count. In addition, the initial value of the ret variable is
>>> set to a negative error to ensure the check does not pass if
>>> __tpm_tis_request_locality() is not called.
>>
>> This is way way too abstract an explanation and since I don't honestly
>> understand what I'm reading, the code changes look like a bunch of arbitrary
>> changes with no sound logic as a whole.
> 
> In simpler terms, the interface is inconsistent with its return
> values. To be specific, here are the sources for the possible values
> tpm_tis_request_locality() will return:
> 1. 0 - 4: __tpm_tis_request_locality() was able to set the locality
> 2. 0: a locality already open, no locality request made
> 3. -1: if timeout happens in __tpm_tis_request_locality()
> 4. -EINVAL: unlikely, returned by IO write for incorrect sized write
> 
> As can easily be seen, tpm_tis_request_locality() will return 0 for both
> a successful (1) and a non-successful (2) request. And to be explicit for
> (2), if tpm_tis_request_locality() is called for a non-zero locality and
> the locality counter is not zero, it will return 0. Thus, the value 0
> reads as success when locality 0 is successfully requested and as
> failure when a locality is requested with a locality already open.

There is a potential problem here, but I think it is slightly different 
from what you describe: Currently, the kernel uses only locality 0, so 
case 1 and 2 are indistinguishable for the caller. Getting a return 
value of 0 simply means that the requested locality is now active. The 
callers don't care whether it had already been active before or not, so 
it is not a problem that the callers cannot distinguish case 1 and 2, 
and a return value of 0 always indicates "success".

It might only become a problem once you make the kernel use localities 
!= 0. Then a caller can get either 0 as the return value (if the 
locality was already active before) or the requested locality, and both 
values mean "success". In practice, this shouldn't cause any problems as 
far as I can tell, because all existing callers either check only for 
failures (negative return values), e.g. 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/char/tpm/tpm_tis_core.c?h=v6.8-rc5#n1214, 
or explicitly request locality 0 and check for a return value of 0, e.g. 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/char/tpm/tpm_tis_core.c?h=v6.8-rc5#n750. 
There is no caller that would be confused by case 2 because it requests 
an arbitrary locality and always expects that locality to be returned in 
order to indicate "success".
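
To illustrate the caller patterns above, here is a minimal userspace 
sketch. It models only the return values described in this thread, not 
the driver itself; current_return_value() and the three caller helpers 
are hypothetical names made up for this illustration.

```c
#include <stdbool.h>

/*
 * Return value of the current code as described in this thread
 * (assumption: the hardware request itself succeeds).
 */
static int current_return_value(bool already_active, int l)
{
	if (already_active)
		return 0;	/* case 2: no request sent, ret keeps its initial 0 */
	return l;		/* case 1: the requested locality is returned */
}

/* Pattern 1: caller only checks for failure (negative values). */
static bool caller_checks_failure_only(int ret)
{
	return ret >= 0;
}

/* Pattern 2: caller requests locality 0 and expects 0 back. */
static bool caller_expects_zero(int ret)
{
	return ret == 0;
}

/* Hypothetical pattern that no existing caller uses: expects l back. */
static bool hypothetical_caller_expects_l(int ret, int l)
{
	return ret == l;
}
```

With already_active set and l == 2, the first pattern still reports 
success, while only the hypothetical third pattern would misread the 
returned 0 as a failure.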

Still, such an inconsistency is not nice and should be fixed, but if I 
read your patch correctly, this is not what it does: In 
tpm_tis_request_locality(), you initialize ret with -EBUSY. For 
locality_count != 0, you never assign to ret again and therefore return 
-EBUSY, even though the locality is active and can be used. The correct 
fix would be to initialize ret with l, so that no error is returned in 
such cases.
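
A rough userspace model of the difference, written only from the 
descriptions in this thread and not copied from the driver or the patch: 
struct tis_model, its hw_fail knob, model_hw_request() and both 
request_locality_*() functions are hypothetical stand-ins, and locking 
is left out.

```c
#include <assert.h>
#include <errno.h>

/* Simplified stand-in for the driver state; not the real structure. */
struct tis_model {
	unsigned int locality_count;	/* users of the active locality */
	int hw_fail;			/* hypothetical knob: non-zero fails the request */
};

/* Stand-in for __tpm_tis_request_locality(): returns l or a negative errno. */
static int model_hw_request(struct tis_model *m, int l)
{
	assert(l >= 0 && l <= 4);
	return m->hw_fail ? -EBUSY : l;
}

/* Variant as I read the patch: ret starts out as -EBUSY. */
static int request_locality_ebusy_init(struct tis_model *m, int l)
{
	int ret = -EBUSY;

	if (m->locality_count == 0)
		ret = model_hw_request(m, l);
	if (ret >= 0)
		m->locality_count++;
	return ret;	/* -EBUSY whenever the locality was already active */
}

/* Suggested variant: ret starts out as l. */
static int request_locality_l_init(struct tis_model *m, int l)
{
	int ret = l;

	if (m->locality_count == 0)
		ret = model_hw_request(m, l);
	if (ret >= 0)
		m->locality_count++;
	return ret;	/* l whenever the locality is active, old or new */
}
```

In this model, a second request for locality 0 returns -EBUSY from the 
first variant and 0 from the second, with locality_count incremented 
only in the second case.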

> As for failures, correct me if I am wrong, but if a function is
> returning negative error codes, it should not be using a hard coded -1
> as a generic error code. As I note, it is unlikely for the -EINVAL to be
> delivered, but the code path is still available should something in the
> future change the backing call logic.
> 
> After this change, the possible return values for
> tpm_tis_request_locality() become:
> 1. 0 - 4: the locality that was successfully requested
> 2. -EBUSY: tpm busy, unable to request locality
> 3. -EINVAL: invalid parameter
> 
> With this more consistent interface, I updated the return value checks
> at the call sites to check for negative error as the means to catch
> failures.
> 
> v/r,
> dps
>