Message-ID: <f954e9ef-295a-a8ce-0ff8-a88ad81b01a3@apertussolutions.com>
Date: Mon, 1 Jun 2020 20:13:31 -0400
From: "Daniel P. Smith" <dpsmith@...rtussolutions.com>
To: Andy Lutomirski <luto@...capital.net>
Cc: Andy Lutomirski <luto@...nel.org>,
Daniel Kiper <daniel.kiper@...cle.com>,
Lukasz Hawrylko <lukasz.hawrylko@...ux.intel.com>,
grub-devel@....org, LKML <linux-kernel@...r.kernel.org>,
trenchboot-devel@...glegroups.com, X86 ML <x86@...nel.org>,
alexander.burmashev@...cle.com,
Andrew Cooper <andrew.cooper3@...rix.com>,
Ard Biesheuvel <ard.biesheuvel@...aro.org>,
eric.snowberg@...cle.com, javierm@...hat.com,
kanth.ghatraju@...cle.com,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
krystian.hebel@...eb.com, michal.zygowski@...eb.com,
Matthew Garrett <mjg59@...gle.com>, phcoder@...il.com,
piotr.krol@...eb.com, Peter Jones <pjones@...hat.com>,
Ross Philipson <ross.philipson@...cle.com>
Subject: Re: [GRUB PATCH RFC 00/18] i386: Intel TXT secure launcher
On 6/1/20 3:39 PM, Andy Lutomirski wrote:
>
>> On Jun 1, 2020, at 10:56 AM, Daniel P. Smith <dpsmith@...rtussolutions.com> wrote:
>>
>> On 6/1/20 12:51 PM, Andy Lutomirski wrote:
>>>> On Mon, Jun 1, 2020 at 8:33 AM Daniel P. Smith
>>>> <dpsmith@...rtussolutions.com> wrote:
>>>>
>>>> On 5/7/20 7:06 AM, Daniel Kiper wrote:
>>>>> Hi Łukasz,
>>>>>
>>>>> On Tue, May 05, 2020 at 04:38:02PM +0200, Lukasz Hawrylko wrote:
>>>>>>> On Tue, 2020-05-05 at 01:21 +0200, Daniel Kiper wrote:
>>>>
>>>> ...
>>>>
>>>>>> In OS-MLE table there is a buffer for TPM event log, however I see that
>>>>>> you are not using it, but instead allocate space somewhere in the
>>>>>
>>>>> I think that this part requires more discussion. In my opinion we should
>>>>> have this region dynamically allocated because it gives us more flexibility.
>>>>> Of course there is a question about the size of this buffer too. I am
>>>>> not sure about that because I have not checked yet how many log entries
>>>>> are created by the SINIT ACM. Though probably it should not be large...
>>>>>
>>>>>> memory. I am just wondering if, from security perspective, it will be
>>>>>> better to use memory from TXT heap for event log, like we do in TBOOT.
>>>>>
>>>>> Appendix F, TPM Event Log, has following sentence: There are no
>>>>> requirements for event log to be in DMA protected memory – SINIT will
>>>>> not enforce it.
>>>>>
>>>>> I was thinking about it and it seems to me that the TPM event log does
>>>>> not require any special protections. Any changes in it can be quickly
>>>>> detected by comparing hashes with the TPM PCRs. Does not it?
>>>>>
>>>>
>>>> I think it would be beneficial to consider the following in deciding
>>>> where the log is placed. There are two areas of attack/manipulation that
>>>> need to be considered. The first area is the log contents itself, which
>>>> as Daniel has pointed out, the log contents do not really need to be
>>>> protected from tampering as that would/should be detected during
>>>> verification by the attestor. The second area that we need to consider
>>>> is the log descriptors themselves. If these are in an area that can be
>>>> manipulated, it is an opportunity for an attacker to attempt to
>>>> influence the ACM's execution. For a little perspective, the ACM
>>>> executes from CRAM to take the most possible precaution to ensure that
>>>> it cannot be tampered with during execution. This is very important
>>>> since it runs a CPU mode (ACM Mode) that I would consider to be higher
>>>> (or lower depending on how you view it) than SMM. As a result, the txt
>>>> heap is also included in what is mapped into CRAM. If the event log is
>>>> place in the heap, this ensures that the ACM is not using memory outside
>>>> of CRAM during execution. Now as Daniel has pointed out, the down side
>>>> to this is that it greatly restricts the log size and can only be
>>>> managed by a combination of limiting the number of events and
>>>> restricting what content is carried in the event data field.
>>>
>>> Can you clarify what the actual flow of control is? If I had to guess, it's:
>>>
>>> GRUB (or other bootloader) writes log.
>>>
>>> GRUB transfers control to the ACM. At this point, GRUB is done
>>> running and GRUB code will not run again.
>>>
>>> ACM validates system configuration and updates TPM state using magic
>>> privileged TPM access.
>>>
>>> ACM transfers control to the shiny new Linux secure launch entry point
>>>
>>> Maybe this is right, and maybe this is wrong. But I have some
>>> questions about this whole setup. Is the ACM code going to inspect
>>> this log at all? If so, why? Who supplies the ACM code? If the ACM
>>> can be attacked by putting its inputs (e.g. this log) in the wrong
>>> place in memory, why should this be considered anything other than a
>>> bug in the ACM?
>>
>> There is a lot behind that, so to get a complete detail of the event
>> sequence I would recommend looking at Section Vol. 2D 6.2.3 (pg Vol. 2D
>> 6-5/ pdf pg 2531), 6.3 GETSEC[ENTERACCS](pg 6-10 Vol. 2D/pdf pg 2546),
>> and 6.3 GETSEC[SENTER](pg Vol. 2D 6-21/pdf pg 2557) in the Intel SDM[1].
>> Section 6.2.3 gives a slightly detailed overview. Section
>> GETSEC[ENTERACCS] details the requirements/procedures for entering AC
>> execution mode and ACRAM (Authenticated CRAM) and section GETSEC[SENTER]
>> will detail requirements/procedures for SENTER.
>>
>> To answer you additional questions I would say if you look at all the
>> work that goes into protecting the ACM execution, why would you want to
>> then turn around and have it use memory outside of the protected region.
>> On the other hand, you are right, if the Developer's Guide says it
>> doesn't need to be protected and someone somehow finds a way to cause a
>> failure in the ACM through the use of a log outside of CRAM, then
>> rightfully that is a bug in the ACM. This is why I asked about making it
>> configurable, paranoid people could set the configuration to use the
>> heap and all others could just use an external location.
>
> And this provides no protection whatsoever to paranoid people AFAICS, unless the location gets hashed before any processing occurs.
Apologies, but that is exactly what it says. From section 6.2.3:
"After the GETSEC[SENTER] rendezvous handshake is performed between all
logical processors in the platform, the ILP loads the chipset
authenticated code module (SINIT) and performs an authentication check.
If the check passes, the processor hashes the SINIT AC module and stores
the result into TPM PCR 17. It then switches execution context to the
SINIT AC module."
To get a little into the details, the ACM is signed with an RSA key that
is deeply embedded in the CPU, which is why there is an ACM per
architecture: each one gets its own key. The authentication check is
detailed in the GETSEC[ENTERACCS] section, but in the end the ACM has
had its cryptographic signature checked by the CPU (not firmware) and
has also been cryptographically hashed by the CPU (again, not firmware).
On top of this, both operations are executed after all interrupts have
been disabled and the ACM has been loaded into ACRAM from memory
protected by the IOMMU. Only after all of this succeeds is the ACM
allowed to execute.
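For readers unfamiliar with the extend semantics, here is a minimal
sketch of what "hashes the SINIT AC module and stores the result into
TPM PCR 17" amounts to, assuming a SHA-256 PCR bank. The real operation
is carried out by the CPU and TPM, and DRTM reset values and localities
are glossed over; the SINIT blob below is a placeholder, not a real ACM.

```python
import hashlib

PCR_SIZE = 32  # SHA-256 PCR bank

def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    """TPM extend: new PCR value = H(old PCR value || measurement digest)."""
    return hashlib.sha256(pcr + measurement).digest()

# The DRTM PCRs start from a known reset value when SENTER runs
# (simplified here to all zeros).
pcr17 = bytes(PCR_SIZE)

# The CPU hashes the SINIT AC module and extends the digest into PCR 17
# before switching execution to SINIT.
sinit_blob = b"placeholder SINIT ACM image"  # hypothetical stand-in
sinit_digest = hashlib.sha256(sinit_blob).digest()
pcr17 = pcr_extend(pcr17, sinit_digest)
```

Because extend is one-way and ordered, no later software can rewrite
PCR 17 to hide which SINIT module actually ran.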
> But you haven’t answered the most important question: what is the ACM doing with this log? I feel like a lot of details are being covered but the big picture is missing.
To the ACM, this is just an allocated buffer in which to record a TPM
event log for all the measurements it takes. It has been a while since I
have manually reviewed a TXT event log, but I want to say there are
about five measurements taken before the ACM exits, including recording
the CRTM taken by the CPU. As such, it initializes the beginning of the
buffer with a TXT log header and records a UEFI TPM event for each
measurement it takes. In theory it will only ever write to this memory,
but seeing that the ACM is a binary blob, I have never verified
programmatically whether it ever reads from that memory.
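To make "records a UEFI TPM event for each measurement" concrete, here
is a hedged sketch of walking such a buffer. The exact container header
the ACM writes is defined by the TXT Software Development Guide; this
sketch assumes the legacy TCG PCClient event layout (u32 PCR index,
u32 event type, 20-byte SHA-1 digest, u32 data size, then the data)
purely for illustration.

```python
import struct

def parse_tcg_events(buf: bytes):
    """Walk legacy TCG PCClient log entries and return
    (pcr_index, event_type, digest, event_data) tuples."""
    events, off = [], 0
    while off + 32 <= len(buf):
        pcr, etype = struct.unpack_from("<II", buf, off)
        digest = buf[off + 8:off + 28]            # SHA-1 digest field
        (size,) = struct.unpack_from("<I", buf, off + 28)
        events.append((pcr, etype, digest, buf[off + 32:off + 32 + size]))
        off += 32 + size
    return events

# Synthetic single-entry log: PCR 17, event type 1, zero digest, 4-byte data.
log = struct.pack("<II", 17, 1) + b"\x00" * 20 + struct.pack("<I", 4) + b"test"
```

Parsing `log` above yields a single event for PCR 17 carrying the data
b"test"; a real log would contain the handful of ACM measurements
described in the text.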
>>
>>> If GRUB is indeed done by the time anyone consumes the log, why does
>>> GRUB care where the log ends up?
>>
>> This is because the log buffer allocation was made the responsibility of
>> the pre-launch environment, in this case GRUB, and is communicated to
>> the ACM via the os_to_mle structure.
>>
>>> And finally, the log presumably has nonzero size, and it would be nice
>>> not to pin some physical memory forever for the log. Could the kernel
>>> copy it into tmpfs during boot so it's at least swappable and then
>>> allow userspace to delete it when userspace is done with it?
>>>
>>
>> Actually yes we will want to do that because when we move to enabling
>> relaunching, an implementation may want to preserve the log from the
>> last launch before triggering the new launch which will result in a
>> reset of the DRTM PCRs and an overwriting the log.
>
> I’m having a bit of trouble understanding how this log is useful for a relaunch. At the point that you’re relaunching, the log is even less trustworthy than on first launch. At least on first launch, if you dial up your secure and verified boot settings tight enough, you can have some degree of confidence in the boot environment. But on a relaunch, I don’t see how the log is useful for anything more than any other piece of kernel memory.
>
> What am I missing?
>
Before relaunch you can have the TPM do a signed quote of the log to
bind its contents to the state of the PCRs. Why you would do that is
more about enterprise use cases concerned with the lifecycle of
enterprise devices. When relaunch occurs, the DRTM PCRs will be reset by
the CPU before the CRTM for the relaunch is taken by the CPU, and the
ACM will overwrite the existing log with new log entries. As highlighted
above, the CRTM and all measurements taken by the ACM will have an
extremely low risk of external/attacker influence. When the MLE takes
control, interrupts will still be disabled, and any measurements it
takes prior to enabling them will have a low risk of external/attacker
influence. Once the MLE enables interrupts, you are in the situation
where the integrity you just established about the runtime can be
compromised by hostile firmware (UEFI runtime services, SMI handlers, EC
firmware, PCI devices, etc.), hostile applications, and a hostile
network. At this point we could devolve into a discourse over how long
load-time integrity measurements can be considered trustworthy, but I
don't think that is relevant to the issue at hand.
In other words, the log for the relaunch, used to attest what is
currently running, is really no less useful than using the first-launch
log to attest to what was running in the first launch.
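To illustrate why a signed quote binds the log: an attestor replays the
log's digests through the extend operation and compares the result to
the quoted PCR value, so any edit to the log makes the replay diverge.
A sketch assuming a SHA-256 bank and synthetic digests (real DRTM reset
values and event formats are simplified away):

```python
import hashlib

def replay_log(digests):
    """Recompute a DRTM PCR value from a list of measurement digests."""
    pcr = bytes(32)  # DRTM PCR reset value, simplified to all zeros
    for d in digests:
        pcr = hashlib.sha256(pcr + d).digest()
    return pcr

# Synthetic log: what the ACM/MLE would have extended on launch.
log = [hashlib.sha256(b"sinit").digest(), hashlib.sha256(b"mle").digest()]
quoted_pcr = replay_log(log)  # stands in for the value from a TPM quote

# An untampered log replays to the quoted value...
assert replay_log(log) == quoted_pcr
# ...while a modified entry is immediately detectable.
tampered = [log[0], hashlib.sha256(b"evil").digest()]
assert replay_log(tampered) != quoted_pcr
```

This is the mechanism behind the earlier point that the log itself needs
no special protection: its trustworthiness comes from the quote over the
PCRs, not from where the log bytes live.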