Date: Mon, 26 Feb 2024 13:43:55 +0100
From: Alexander Steffen <Alexander.Steffen@...ineon.com>
To: "Daniel P. Smith" <dpsmith@...rtussolutions.com>, Lino Sanfilippo
	<l.sanfilippo@...bus.com>, Jarkko Sakkinen <jarkko@...nel.org>, "Jason
 Gunthorpe" <jgg@...pe.ca>, Sasha Levin <sashal@...nel.org>,
	<linux-integrity@...r.kernel.org>, <linux-kernel@...r.kernel.org>
CC: Ross Philipson <ross.philipson@...cle.com>, Kanth Ghatraju
	<kanth.ghatraju@...cle.com>, Peter Huewe <peterhuewe@....de>
Subject: Re: [PATCH 1/3] tpm: protect against locality counter underflow

On 23.02.2024 02:55, Daniel P. Smith wrote:
> On 2/20/24 13:42, Alexander Steffen wrote:
>> On 02.02.2024 04:08, Lino Sanfilippo wrote:
>>> On 01.02.24 23:21, Jarkko Sakkinen wrote:
>>>
>>>>
>>>> On Wed Jan 31, 2024 at 7:08 PM EET, Daniel P. Smith wrote:
>>>>> Commit 933bfc5ad213 introduced the use of a locality counter to
>>>>> control when a locality request is allowed to be sent to the TPM.
>>>>> In the commit, the counter is indiscriminately decremented, thus
>>>>> creating a situation for an integer underflow of the counter.
>>>>
>>>> What is the sequence of events that leads to this triggering the
>>>> underflow? This information should be represented in the commit message.
>>>>
>>>
>>> AFAIU this is:
>>>
>>> 1. We start with a locality_counter of 0 and then we call
>>> tpm_tis_request_locality() for the first time, but since a locality is
>>> (unexpectedly) already active, check_locality() and consequently
>>> __tpm_tis_request_locality() return "true".
>>
>> check_locality() returns true, but __tpm_tis_request_locality() returns
>> the requested locality. Currently, this is always 0, so the check for
>> !ret will always correctly indicate success and increment the
>> locality_count.
>>
>> But since theoretically a locality != 0 could be requested, the correct
>> fix would be to check for something like ret >= 0 or ret == l instead of
>> !ret. Then the counter will also be incremented correctly for localities
>> != 0, and no underflow will happen later on. Therefore, explicitly
>> checking for an underflow is unnecessary and hides the real problem.
>>
> 
> My apologies, but I will have to humbly disagree on a fundamental
> level here. If a state variable has bounds, then those bounds should be
> enforced when the variable is being manipulated.

That's fine, but that is not what your proposed fix does.

tpm_tis_request_locality and tpm_tis_relinquish_locality are meant to be 
called in pairs: for every (successful) call to tpm_tis_request_locality 
there *must* be a corresponding call to tpm_tis_relinquish_locality 
afterwards. Unfortunately, in C there is no language construct to 
enforce that (nothing like a Python context manager), so instead 
locality_count is used to count the number of successful calls to 
tpm_tis_request_locality, so that tpm_tis_relinquish_locality can wait 
to actually relinquish the locality until the last expected call has 
happened (you can think of that as a Python RLock, to stay with the 
Python analogies).
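
As a rough illustration of that contract (a simplified usage sketch, not 
actual driver code; error handling omitted):

    ret = tpm_tis_request_locality(chip, 0);   /* locality_count: 0 -> 1 */
    ret = tpm_tis_request_locality(chip, 0);   /* locality_count: 1 -> 2 */
    /* ... access the TPM ... */
    tpm_tis_relinquish_locality(chip, 0);      /* locality_count: 2 -> 1 */
    tpm_tis_relinquish_locality(chip, 0);      /* 1 -> 0: locality actually released */

Only the last relinquish in such a nested sequence touches the hardware; 
the inner ones just decrement the counter.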

So if locality_count ever gets negative, that is certainly a bug 
somewhere. But your proposed fix hides this bug, by allowing 
tpm_tis_relinquish_locality to be called more often than 
tpm_tis_request_locality. You could have added something like 
BUG_ON(priv->locality_count == 0) before decrementing the counter. That 
would really enforce the bounds, without hiding the bug, and I would be 
fine with that.
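
Roughly like this, as a sketch of the tpm_tis_relinquish_locality() body 
(simplified; the locking and helper names may not match the driver exactly):

    mutex_lock(&priv->locality_count_mutex);
    BUG_ON(!priv->locality_count);  /* relinquish without a matching request is a bug */
    priv->locality_count--;
    if (!priv->locality_count)      /* last outstanding request: really release */
        __tpm_tis_relinquish_locality(priv, l);
    mutex_unlock(&priv->locality_count_mutex);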

Of course, that still leaves the actual bug to be fixed. In this case, 
there is no mismatch between the calls to tpm_tis_request_locality and 
tpm_tis_relinquish_locality. It is just (as I said before) that the 
counting of successful calls in tpm_tis_request_locality is broken for 
localities != 0, so that is what you need to fix.
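
I.e. something along these lines in tpm_tis_request_locality() (again just 
a sketch, assuming __tpm_tis_request_locality() keeps returning the 
requested locality on success and a negative value on failure):

    int ret = 0;

    mutex_lock(&priv->locality_count_mutex);
    if (priv->locality_count == 0)
        ret = __tpm_tis_request_locality(chip, l);
    if (ret >= 0)   /* was: if (!ret), which only counts successes for locality 0 */
        priv->locality_count++;
    mutex_unlock(&priv->locality_count_mutex);
    return ret;

With that, the counter is incremented exactly once per successful request, 
for any locality, and the decrement in tpm_tis_relinquish_locality() stays 
balanced.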

> Assuming that every
> path leading to the variable manipulation code has ensured proper
> manipulation is just that, an assumption. Failed assumptions are how
> bugs and vulnerabilities occur.
> 
> To your point, does this fully address the situation experienced? I would
> say it does not. IMHO, the situation is really a combination of both
> patch 1 and patch 2, but the request was to split the changes for
> individual discussion. We selected this one as the fix for two reasons.
> First, it blocks the underflow such that when the Secure Launch series
> opens Locality 2, the counter will get incremented at that time and the
> internal locality tracking state variables will end up with the correct
> values, thus leading to the relinquish succeeding at kernel shutdown.
> Second, it provides a stronger defensive coding practice.
> 
> Another reason that this works as a fix is that the TPM specification
> requires the registers to be mirrored across all localities, regardless
> of the active locality. Even though the requests/relinquishes for Locality
> 0 sent by the early code do not succeed, obtaining the values via the
> Locality 0 registers is still guaranteed to be correct.
> 
> v/r,
> dps

