Message-ID: <20240807152357.0d2dc466@foz.lan>
Date: Wed, 7 Aug 2024 15:23:57 +0200
From: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc: Igor Mammedov <imammedo@...hat.com>, Shiju Jose <shiju.jose@...wei.com>,
"Michael S. Tsirkin" <mst@...hat.com>, Ani Sinha <anisinha@...hat.com>,
Dongjiu Geng <gengdongjiu1@...il.com>, <linux-kernel@...r.kernel.org>,
<qemu-arm@...gnu.org>, <qemu-devel@...gnu.org>
Subject: Re: [PATCH v5 6/7] acpi/ghes: add support for generic error injection via QAPI
On Wed, 7 Aug 2024 10:34:36 +0100
Jonathan Cameron <Jonathan.Cameron@...wei.com> wrote:
> On Wed, 7 Aug 2024 09:47:50 +0200
> Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:
>
> > On Tue, 6 Aug 2024 16:31:13 +0200
> > Igor Mammedov <imammedo@...hat.com> wrote:
> >
> > > PS:
> > > looking at the code, ACPI_GHES_MAX_RAW_DATA_LENGTH is 1K
> > > and it is the total size of an error block for an error source.
> > >
> > > However acpi_hest_ghes.rst (3) says it should be 4K,
> > > am I mistaken?
> >
> > Maybe Jonathan knows better, but I guess the 1K was just some
> > arbitrary limit to prevent an overly large CPER. The 4K limit described
> > in acpi_hest_ghes.rst could be just a limit to cope with
> > the current BIOS implementation, but I didn't check myself how
> > this is implemented there.
> >
> > I was unable to find any limit in the specs. Yet, if you look at:
> >
> > https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section
>
> I think both limits are just made up. You can in theory log huge
> error records. It's just that no one does.
If both are made up, I would sync them, either patching the
documentation or the ghes driver.
>
> >
> > The processor Error Information Structure, starting at offset
> > 40, can go up to 255*32, meaning an offset of 8200, which is
> > bigger than 4K.
> >
> > Going further, processor context can have up to 65535 entries (the
> > spec actually says 65536, but that sounds like a typo, as the size
> > is stored in a uint16_t), each containing multiple register values
> > (the spec calls its length "P").
> >
> > So, the CPER record could, in theory, have:
> > 8200 + (65535 * P) + sizeof(vendor-specific-info)
> >
> > The CPER length is stored in Section Length record, which is
> > uint32_t.
> >
> > So, I'd say that the GHES record can theoretically be a lot
> > bigger than 4K.
> Agreed - but I don't think we care for testing as long as it's
> big enough for plausible records. Unless you really want
> to fuzz the limits?
Fuzzing the limits could be interesting, but it is not in my
current plans.

Yet, 1K could be a little short for an ARM CPER.

See N.26, ARMv8 AArch64 GPRs (Type 4): it has 256 bytes for
registers, plus 8 bytes for the header, so a total of
264 bytes for a single context register dump. I would expect
that, in real life, type 4 is always reported on aarch64
on BIOSes with context register support. Other types might
also be dumped together with it (like context registers for EL1,
EL2 and/or EL3).

If just one type 4 context is encoded, 1K has
space for 23 errors (out of a maximum of 255).

Just looking at the maximum number, my feeling is that 1K
might be too short to simulate some real-life reports,
but that depends on how firmware actually groups
such events.

So, maybe this could be expanded to, let's say, 4K, thus
aligning with the ReST documentation.
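
To make the arithmetic above easier to check, here is a rough sketch
in Python. It uses only the sizes quoted in this thread (40-byte
header, 32 bytes per Error Information Structure, 264 bytes per type 4
context). The helper name max_err_info() and its count_header switch
are made up for illustration, and the sketch ignores any GHES wrapping
around the CPER section as well as vendor-specific data, so take the
numbers as estimates only.

```python
# Sizes taken from the discussion above; the layout is simplified.
SECTION_HEADER = 40      # offset where the Error Information Structures start
ERR_INFO_SIZE = 32       # one processor Error Information Structure
MAX_ERR_INFO = 255       # maximum number of such structures
TYPE4_CONTEXT = 8 + 256  # context header + AArch64 GPR dump


def max_err_info(limit, contexts=1, count_header=True):
    """Hypothetical helper: error info entries that fit in `limit` bytes.

    `contexts` is the number of type 4 context dumps assumed to be present.
    `count_header` selects whether the 40-byte header is charged against
    the limit (an assumption; the thread does not say either way).
    """
    fixed = contexts * TYPE4_CONTEXT
    if count_header:
        fixed += SECTION_HEADER
    return min(MAX_ERR_INFO, max(0, (limit - fixed) // ERR_INFO_SIZE))


# Offset reached when all 255 error info entries are present.
full_err_info_end = SECTION_HEADER + MAX_ERR_INFO * ERR_INFO_SIZE

if __name__ == "__main__":
    print("end of 255 error info entries:", full_err_info_end)
    for limit in (1024, 4096):
        print(limit,
              "with header:", max_err_info(limit),
              "without header:", max_err_info(limit, count_header=False))
```

Under these assumptions, 1K fits 23 entries if the 40-byte header is
not counted and 22 if it is, while 4K leaves room for more than a
hundred entries next to a single type 4 context.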
Regards,
Mauro