Message-ID: <B085ADB4-4B8C-4998-BB33-DA67C45483E9@kernel.org>
Date: Mon, 28 Aug 2023 18:28:25 -0700
From: Kees Cook <kees@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
Kees Cook <keescook@...omium.org>
CC: linux-kernel@...r.kernel.org, Ard Biesheuvel <ardb@...nel.org>,
Enlin Mu <enlin.mu@...soc.com>,
Eric Biggers <ebiggers@...gle.com>,
"Guilherme G. Piccoli" <gpiccoli@...lia.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Yunlong Xing <yunlong.xing@...soc.com>,
Yuxiao Zhang <yuxiaozhang@...gle.com>
Subject: Re: [GIT PULL] pstore updates for v6.6-rc1

On August 28, 2023 4:56:00 PM PDT, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>On Mon, 28 Aug 2023 at 11:21, Kees Cook <keescook@...omium.org> wrote:
>>
>> Please pull these pstore updates for v6.6-rc1. This contains a fair bit
>> of code _removal_ which is always nice.
>
>Hmm. The diffstat certainly looks good, but the end result isn't great..
>
>I now get 124 lines of
>
> pstore: zlib_inflate() failed, ret = -5!
>
>in my bootup dmesg.
>
>Considering that there's no reason for pstore to even be active on
>this machine, I think it's because pstore now goes and tries to
>uncompress something entirely invalid.
>
>The message itself does not seem to be new, but with the switch from
>the crypto code, it apparently used to be
>
> crypto_comp_decompress failed, ret = %d!
>
>but the key word here is *apparently*. I never got that message
>before. So something else has changed, and I'm thinking that the old
>code probably didn't even try to decompress the bogus data it found?
>
>I dunno. But 124 lines of insane garbage in the kernel messages is not
>a good thing.

Oh dear! That's obviously unexpected. I have so many questions. :P

- does this happen at every boot? (I assume yes.)
- what CONFIG are you built with?
- what was the prior CONFIG?
- what backend is in use? (Or better yet, what does "dmesg | grep pstore" report?)
- are you using systemd?
Decompression is only attempted on records that look valid. If the records aren't removed after boot (i.e. unlinked from /sys/fs/pstore), they won't get cleared. Normally systemd-pstore moves everything to /var/lib/systemd/pstore, but that must not be happening, since you keep seeing the warnings.
That you have 124 of these makes me think you've got the EFI backend (CONFIG_EFI_VARS_PSTORE) built and enabled by default (CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE=n). The latter config was created to keep the EFI backend from filling the EFI variable space. I think distros started setting it to "n" once systemd-pstore was added, since systemd-pstore keeps the EFI variables from piling up...
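To make the config question concrete, here is a minimal sketch of checking those two options in a kernel config file. The two option names come from the paragraph above; the helper names are made up for illustration, and where the config text comes from (commonly /boot/config-$(uname -r) or a decompressed /proc/config.gz) is an assumption about the machine, not something this thread establishes.

```python
import re

# Option names as mentioned in this message.
OPTIONS = ("CONFIG_EFI_VARS_PSTORE", "CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE")


def read_options(config_text, names=OPTIONS):
    """Return {name: value} for each option.

    A kernel .config writes disabled options as "# CONFIG_FOO is not set",
    so anything without an explicit CONFIG_FOO=... line is reported as "n".
    """
    found = {name: "n" for name in names}
    for line in config_text.splitlines():
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if m and m.group(1) in found:
            found[m.group(1)] = m.group(2)
    return found


def efi_pstore_active_by_default(config_text):
    """True if the EFI backend is built (y or m) and not default-disabled."""
    opts = read_options(config_text)
    built = opts["CONFIG_EFI_VARS_PSTORE"] in ("y", "m")
    default_disabled = opts["CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE"] == "y"
    return built and not default_disabled


# Hypothetical config excerpt, only to show the call.
sample = (
    "CONFIG_EFI_VARS_PSTORE=y\n"
    "# CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE is not set\n"
)
print(efi_pstore_active_by_default(sample))
```

This only reflects the build-time default; it says nothing about whether the backend was turned off at boot or at runtime.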
So, I assume either systemd-pstore isn't running for you or something has gone sideways with it. And since I tested the "changed compression type" case without systemd-pstore, I bet systemd-pstore ignores the failed records...
https://github.com/systemd/systemd/blob/599a3124849819ba5af0a71b7572e87256814881/src/pstore/pstore.c#L225
Yup. Ugh. (Though I still find it odd that you have 124 records...)
Let me think about the best way to deal with this. I expect I'll have pstore wipe the failed records, since reading records back is expressly not expected to work across differing configs or kernel versions. And permanently spewing errors is not ok.
In the meantime, you can make the warnings go away with:
    rm /sys/fs/pstore/*enc.z
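For anyone who would rather see what that glob matches before deleting anything, here is a rough sketch. The function names are made up for illustration. It assumes the leftover records are regular files in /sys/fs/pstore whose names end in "enc.z", as the glob above implies. Unlinking them would normally need root.

```python
from pathlib import Path


def undecompressed_records(pstore_dir="/sys/fs/pstore"):
    """Return the sorted paths in pstore_dir that match the *enc.z glob."""
    root = Path(pstore_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.glob("*enc.z") if p.is_file())


def remove_records(paths):
    """Unlink each given record and return how many were removed."""
    removed = 0
    for p in paths:
        p.unlink()
        removed += 1
    return removed


if __name__ == "__main__":
    # Dry run: only list the matches; call remove_records() to delete them.
    records = undecompressed_records()
    for p in records:
        print(p)
    print(f"{len(records)} record(s) match *enc.z")
```

Listing first keeps a record of which entries were stale in case they are worth looking at later.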
--
Kees Cook