linux-kernel - Re: [PATCH v5 6/7] acpi/ghes: add support for generic error injection via QAPI

Message-ID: <20240813205911.1719db56@foz.lan>
Date: Tue, 13 Aug 2024 20:59:11 +0200
From: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
To: Igor Mammedov <imammedo@...hat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@...wei.com>, Shiju Jose
 <shiju.jose@...wei.com>, "Michael S. Tsirkin" <mst@...hat.com>, Ani Sinha
 <anisinha@...hat.com>, Dongjiu Geng <gengdongjiu1@...il.com>,
 <linux-kernel@...r.kernel.org>, <qemu-arm@...gnu.org>,
 <qemu-devel@...gnu.org>
Subject: Re: [PATCH v5 6/7] acpi/ghes: add support for generic error
 injection via QAPI

On Mon, 12 Aug 2024 11:39:00 +0200
Igor Mammedov <imammedo@...hat.com> wrote:

> > We may also store cper_offset there via bios_linker_loader_add_pointer()
> > and/or use bios_linker_loader_write_pointer(), but I can't see how the
> > data stored there can be retrieved, nor any advantage of using it instead
> > of the current code, as, in the end, we'll have 3 addresses that will be
> > used:
> > 
> > 	- an address where a pointer to CPER record will be stored;
> > 	- an address where the ack will be stored;
> > 	- an address where the actual CPER record will be stored.
> > 
> > And those are calculated in a single function and are all stored in the
> > ACPI table files.
> >
> > What am I missing?  
> 
> That's basically (2) approach and it works to some degree,
> unfortunately it's fragile when we start talking about migration
> and changing layout in the future.
> 
> Let's take as an example increasing the size of 1) 'Generic Error Status Block',
> which we are considering. Old QEMU will tell firmware to allocate a 1K buffer
> for it, and the calculated offsets to [1] (that you've stored/calculated) will
> include this assumption.
> Then in newer QEMU we increase the size of [1], and all hardcoded offsets will
> account for the new size. But if we migrate a guest from old QEMU to this newer
> one, the layout of all HEST tables within the guest will match old QEMU's
> assumptions, and as a result newer QEMU with the larger block size will write
> CPERs at the wrong address, since we are still running a guest from old QEMU.
> That's just one example.
> 
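
As a rough sketch of the failure mode described above (not QEMU code), assume
a simplified layout with one 8-byte error-address register and one 8-byte
read-ack register per source, followed by the status blocks. The helper name
`status_block_offset` and the 4K size for the larger block are hypothetical;
only the 1K figure comes from the text above.

```python
ADDR_SIZE = 8  # one 64-bit register (assumed layout, for illustration only)


def status_block_offset(index, n_sources, block_size):
    """Offset of status block `index` in the assumed layout:
    n error-address registers, n read-ack registers, then the blocks."""
    return 2 * ADDR_SIZE * n_sources + index * block_size


OLD_BLOCK = 1024  # 1K block, as old QEMU told firmware to allocate
NEW_BLOCK = 4096  # hypothetical larger block in a newer QEMU

# The guest was booted on old QEMU, so its tables follow the old layout.
guest_layout = status_block_offset(1, n_sources=2, block_size=OLD_BLOCK)

# After migration, newer QEMU recomputes the offset with its own block size.
new_qemu_guess = status_block_offset(1, n_sources=2, block_size=NEW_BLOCK)

# The two disagree, so a CPER for source 1 would land at the wrong address.
print(guest_layout, new_qemu_guess)
```

In this layout the first source happens to get the same offset either way, so
the mismatch only shows up from the second source onwards, which is part of
why such a bug could go unnoticed.
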
> There are a number of ways to make it work, but the ultimate goal is to pick
> the one that's the least fragile and won't snowball into a maintenance
> nightmare as the number of GHES sources increases over time.
> 
> This series tries to solve the problem of mapping a GHES source to
> a corresponding 'Generic Error Status Block' and related registers.
> However, we are missing access to this mapping since it only
> exists in the guest-patched HEST (i.e. in guest RAM only).
> 
> The robust way to make it work would be for QEMU to get a pointer
> to the whole HEST table and then enumerate GHES sources and related
> error/ack registers directly from guest RAM (sidestepping layout
> change issues this way).
> 
> What I'm proposing is to use bios_linker_loader_write_pointer()
> (only once) so that firmware can tell QEMU the address of the HEST table,
> in which one can find a GHES source and always-correct error/ack
> pointers (regardless of table[s] layout changes).

OK, got it. That change was not easy, but I finally figured out how
to make it actually work.

Tomorrow I'll address your comment on patch 5/10 about also using raw data
for the other parts of CPER (generic error status and generic error data).

If you want a sneak peek, I'm keeping the latest development
version here:

	https://gitlab.com/mchehab_kernel/qemu/-/commits/qemu_submission?ref_type=heads

In particular, the patch changing from /etc/hardware_errors offset to
a HEST offset is at:

	https://gitlab.com/mchehab_kernel/qemu/-/commit/9197d22de09df97ce3d6725cb21bd2114c2eb43c

It contains several cleanups to make the logic clearer and more robust.

Thanks,
Mauro