linux-kernel - Re: [PATCH] efi: expose TPM event log to userspace via sysfs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZiopXE6-AucAB9NM@gardel-login>
Date: Thu, 25 Apr 2024 11:58:52 +0200
From: Lennart Poettering <mzxreary@...inter.de>
To: Ard Biesheuvel <ardb@...nel.org>
Cc: Ilias Apalodimas <ilias.apalodimas@...aro.org>,
	James Bottomley <James.Bottomley@...senpartnership.com>,
	Mikko Rapeli <mikko.rapeli@...aro.org>, linux-efi@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-integrity@...r.kernel.org
Subject: Re: [PATCH] efi: expose TPM event log to userspace via sysfs

On Mi, 24.04.24 19:15, Ard Biesheuvel (ardb@...nel.org) wrote:

> > > > > > > [    0.000000] efi: ESRT=0xf0ea5040 TPMFinalLog=0xf0ea9040
> > > > > > > RTPROP=0xf0ea7040 SMBIOS=0xf0ea3000 TPMEventLog=0xeb3b3040
> > > > > > > INITRD=0xeb3b2040 RNG=0xe5c0f040 MEMRESERVE=0xe5c0e040
> > > > > > >
> > > > > > > Different boards use different TPM HW and drivers so compiling
> > > > > > > all these in is possible but a bit ugly. systemd recently
> > > > > > > gained support for a specific tpm2.target which makes TPM
> > > > > > > support modular and also works with kernel modules for some TPM
> > > > > > > use cases but not rootfs encryption.
> > > > > > >
> > > > > > > In my test case we have a kernel+initramfs uki binary which is
> > > > > > > loaded by EFI firmware as a secure boot binary. TPM support on
> > > > > > > various boards is visible in devicetree but not as ACPI table
> > > > > > > entries. systemd currently detect TPM2 support either via ACPI
> > > > > > > table /sys/firmware/acpi/tables/TPM2 or TPM entry or via
> > > > > > > firmware measurement via
> > > > > > > /sys/kernel/security/tpm0/binary_bios_measurements
> > > > > > > .
> > > > > >
> > > > > > One corner case worth noting here is that scanning the device
> > > > > > tree won't always work for non-ACPI systems... The reason is that
> > > > > > a firmware TPM (running in OP-TEE) might or might not have a DT
> > > > > > entry, since OP-TEE can discover the device dynamically and
> > > > > > doesn't always rely on a DT entry.
> > > > > >
> > > > > > I don't particularly love the idea that an EventLog existence
> > > > > > automatically means a TPM will be there, but it seems that
> > > > > > systemd already relies on that and it does solve the problem we
> > > > > > have.
> > > > >
> > > > > Well, quite. That's why the question I was interested in, perhaps
> > > > > not asked as clearly as it could be is: since all the TPM devices
> > > > > rely on discovery mechanisms like ACPI or DT or the like which are
> > > > > ready quite early, should we simply be auto loading the TPM drivers
> > > > > earlier?
> > > >
> > > > This would be an elegant way to solve this and on top of that, we
> > > > have a single discovery mechanism from userspace -- e.g ls /dev/tpmX.
> > > > But to answer that we need more feedback from systemd. What 'earlier'
> > > > means? Autload it from the kernel before we go into launching the
> > > > initrd?
> > >
> > > Right, so this is another timing problem: we can't autoload modules
> > > *before* they appear in the filesystem and presumably they're on the
> > > initrd, so auto loading must be post initrd mount (and init execution)
> > > but otherwise quite early?
> >
> > Exactly. But is that enough?

General purpose distros typically don't build all TPM drivers into the
kernel, but ship some in the initrd instead. Then, udev is responsible
for iterating all buses/devices and auto-loading the necessary
drivers. Each loaded bus driver might make more devices available for
which more drivers then need to be loaded, and so on. Some of the
busses are "slow" in the sense that we don't really know a precise
time when we know that all devices have now shown up, there might
always be slow devices that haven't popped up yet. Iterating through
the entire tree of devices in sysfs is often quite slow in itself too,
it's one of the most time consuming parts of the boot in fact. This
all is done asynchronously hence: we enumerate/trigger/kmod all
devices as quickly as we can, but we continue doing other stuff at the
same time.

Of course that means that other stuff sometimes has to *wait* for
devices to show up. For example, if a harddisk shall be mounted, it
needs to be found/probed/kmod'ed first. Hence that's what we do: the
fscking/mounting of a file system is delayed exactly as long as it
takes for the block device it is for to show up.

systemd these days makes use of the TPM — if available — for various
purposes, such as disk encryption, measuring boot phases and system
identity and various other things. Now, for the purpose of disk
encryption, we need to wait for two things: the hard drive, and the
TPM to be probed/driver loaded/accessible. /etc/fstab tells us pretty
explicitly what bloock device to wait for, hence it's easy. But
waiting for a TPM is harder: we might need it for disk encryption, but
we don't know right-away if there actually *is* a TPM device to show
up, and hence don't know whether to wait for it or not.

In systemd we hence check early if firmware reported that it measured
something into a TPM during early boot. If not, then we have no
measured boot, and we don't have to wait for the TPM. (Of course,
there could possibly be a TPM that the firmware didn't use, but even
if, there's no real point in waiting for it, since a TPM is not that
interesting if your firmware didn't protect the boot with it/filled
the PCRs with data). If however we see that firmware measured
something into the TPM, then we deduce from that that Linux will probably
find the device too sooner or later, we just need to give the udev
enumerate/trigger/kmod logic enough time. And so we tell the disk
encryption stuff to wait for a TPM to show up before we continue. This
works quite well in the x86/acpi/uefi world.

There, we check for the efi tpm event log + the ACPI TPM table as
indication whether TPM was used by firmware. Note that we are not
interested in the *contents* of the log or the table to determine if
we shall wait for a tpm device, we just care about the
*existence*. i.e. it's two access() calls that check if the two
things exist in sysfs.

On ACPI-less arm platforms this check doesn't work. And we are looking
for a sensible replacement: i.e. something that tells us that the
firmware found, initialized, and measured stuff into a TPM and that is
enough for us to assume that sooner or later it will also be probed
and be accessible in Linux/udev/…, too.

This check is done very very early in userspace (early initrd), at a
moment in time where no devices have been probed yet.

How precisely we do the check is up for discussion. All we care about
is an answer to the question "Did firmware find and make use/measure
into a TPM device? yes/no". And we want to be able to answer that
question very early in userspace, without having to actually probe
hardware/busses/drivers/….

And yes, the test we implemented on x86/acpu/uefi might actually come
to the wrong conclusion, because Linux might lack a driver for a TPM
the firmware supports. But right now there are few reports this is
actually a problem, Linux TPM support seems to be quite OK
IRL. Moreover, there's an override via the kernel cmdline
(systemd.tpm2_wait=0) for the cases where the logic fails.

> What I would like to know is which API systemd is attempting to use,
> and which -apparently- may never become available if no TPM is exposed
> by the kernel.

We want a trivial check whether it's likely Linux is going to expose a
TPM device eventually.

And yes, as mentioned, there might be cases where Linux lacks a driver
for something the firmware support. But as long as this is not the
common case we are good.

> Ideally, we should be able to take inspiration from the probe deferral
> work, and return -EAGAIN to convey that it is too early to signal
> either success or permanent failure.

We need an early check, so that we can dealy FDE setup, that we can
delay our first boot phase measurements and so on. An kernel interface
that just tells us "we don't have a tpm yet, we might have one one
day, or not, we don't know yet" is useless to us.

> Exposing random firmware assets directly to user space to make guesses
> about this doesn't seem like a very robust approach to this issue.

If you give us a generic flag file that says "firmware found and used
a tpm" somewhere in sysfs that abstracts the details how it detects
that is enough for us. i.e. i don't care if the kenrel abstracts this
or if we do more explicit checks in userspace. All i care is that it's
just a few access(F_OK) checks away for us.

Lennart

--
Lennart Poettering, Berlin