Message-ID: <20250227102255.6843705e@imammedo.users.ipa.redhat.com>
Date: Thu, 27 Feb 2025 10:22:55 +0100
From: Igor Mammedov <imammedo@...hat.com>
To: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
Cc: "Michael S . Tsirkin" <mst@...hat.com>, Jonathan Cameron
 <Jonathan.Cameron@...wei.com>, Shiju Jose <shiju.jose@...wei.com>,
 qemu-arm@...gnu.org, qemu-devel@...gnu.org, Ani Sinha
 <anisinha@...hat.com>, Dongjiu Geng <gengdongjiu1@...il.com>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 03/14] acpi/ghes: Use HEST table offsets when
 preparing GHES records

On Wed, 26 Feb 2025 17:14:06 +0100
Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:

> On Tue, 25 Feb 2025 10:43:27 +0100
> Igor Mammedov <imammedo@...hat.com> wrote:
> 
> > On Fri, 21 Feb 2025 07:02:21 +0100
> > Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:
> >   
> > > On Mon, 3 Feb 2025 15:34:23 +0100
> > > Igor Mammedov <imammedo@...hat.com> wrote:
> > >     
> > > > On Fri, 31 Jan 2025 18:42:44 +0100
> > > > Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:
> > > >       
> > > > > There are two pointers that are needed during error injection:
> > > > > 
> > > > > 1. The start address of the CPER block to be stored;
> > > > > 2. The address of the ack.
> > > > > 
> > > > > It is preferable to calculate them from the HEST table.  This allows
> > > > > checking the source ID, the size of the table and the type of the
> > > > > HEST error block structures.
> > > > > 
> > > > > Yet, keep the old code, as this is needed for migration purposes.
> > > > > 
> > > > > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
> > > > > ---
> > > > >  hw/acpi/ghes.c         | 132 ++++++++++++++++++++++++++++++++++++-----
> > > > >  include/hw/acpi/ghes.h |   1 +
> > > > >  2 files changed, 119 insertions(+), 14 deletions(-)
> > > > > 
> > > > > diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> > > > > index 27478f2d5674..8f284fd191a6 100644
> > > > > --- a/hw/acpi/ghes.c
> > > > > +++ b/hw/acpi/ghes.c
> > > > > @@ -41,6 +41,12 @@
> > > > >  /* Address offset in Generic Address Structure(GAS) */
> > > > >  #define GAS_ADDR_OFFSET 4
> > > > >  
> > > > > +/*
> > > > > + * ACPI spec 1.0b
> > > > > + * 5.2.3 System Description Table Header
> > > > > + */
> > > > > +#define ACPI_DESC_HEADER_OFFSET     36
> > > > > +
> > > > >  /*
> > > > >   * The total size of Generic Error Data Entry
> > > > >   * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
> > > > > @@ -61,6 +67,25 @@
> > > > >   */
> > > > >  #define ACPI_GHES_GESB_SIZE                 20
> > > > >  
> > > > > +/*
> > > > > + * Offsets with regards to the start of the HEST table stored at
> > > > > + * ags->hest_addr_le,        
> > > > 
> > > > If I read this literally, then the offsets above are not what is
> > > > declared later in this patch.
> > > > I'd really drop this comment altogether, as it's confusing,
> > > > and rather get the variable/macro naming right.
> > > >       
> > > > > according with the memory layout map at
> > > > > + * docs/specs/acpi_hest_ghes.rst.
> > > > > + */        
> > > > 
> > > > What we need is an update to the above doc, describing the new
> > > > and old ways, as a separate patch.
> > > 
> > > I can't see anything that should be changed at
> > > docs/specs/acpi_hest_ghes.rst, as this series doesn't change the
> > > firmware layout: we're still using two firmware tables:
> > > 
> > > - etc/acpi/tables, with HEST on it;
> > > - etc/hardware_errors, with:
> > > 	- error block addresses;
> > > 	- read_ack registers;
> > > 	- CPER records.
> > > 
> > > The only changes that this series introduces are related to how
> > > the error generation logic navigates between HEST and hw_errors
> > > firmware. This is not described in acpi_hest_ghes.rst, and both
> > > ways follow the ACPI specs to the letter.
> > > 
> > > The only difference is that the code which populates the CPER
> > > record and the error/read offsets doesn't need to know how
> > > the HEST table generation placed the offsets, as it will basically
> > > reproduce what OSPM firmware does when handling HEST events.
> > 
> > Section 8 describes the old way to get the address at which to record
> > the CPER, so it needs to be amended to also describe the new approach
> > and say which way is used for which version.
> > 
> > possibly section 11 might need some messaging as well.  
> 
> Ok, I'll modify it and place it at the end of the series. Please
> see below whether the new text is ok for you.
> 
> ---
> 
> [PATCH] docs/specs/acpi_hest_ghes.rst: update it to reflect some changes

s/^^^/docs: hest: add new "etc/acpi_table_hest_addr" and update workflow/

> 
> While the HEST layout didn't change, there are some internal
> changes related to how offsets are calculated and how memory error
> events are triggered.
> 
> Update specs to reflect such changes.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
> 
> diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
> index c3e9f8d9a702..f22d2eefdec7 100644
> --- a/docs/specs/acpi_hest_ghes.rst
> +++ b/docs/specs/acpi_hest_ghes.rst
> @@ -89,12 +89,21 @@ Design Details
>      addresses in the "error_block_address" fields with a pointer to the
>      respective "Error Status Data Block" in the "etc/hardware_errors" blob.
>  
> -(8) QEMU defines a third and write-only fw_cfg blob which is called
> -    "etc/hardware_errors_addr". Through that blob, the firmware can send back
> -    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
> -    blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER command
> -    for the firmware. The firmware will write back the start address of
> -    "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
> +(8) QEMU defines a third and write-only fw_cfg blob to store the location
> +    where the error block offsets, read ack registers and CPER records are
> +    stored.
> +
> +    Up to QEMU 9.2, the location was at "etc/hardware_errors_addr", and
> +    contains an offset for the beginning of "etc/hardware_errors".
> +
> +    Newer versions place the location at "etc/acpi_table_hest_addr",
> +    pointing to the beginning of the HEST table.
> +
> +    Through that such offsets, the firmware can send back the guest-side
       ^^^^^^^^^^^^^^^^^^^^^^^^^ can't parse that, suggest to just drop the phrase

> +    allocation addresses to QEMU. They contain a 8-byte entry. QEMU generates
> +    a single WRITE_POINTER command for the firmware. The firmware will write
> +    back the start address of either "etc/hardware_errors" or HEST table at
                ^^^^ drop this?

> +    the corresponding address firmware.
>  
>  (9) When QEMU gets a SIGBUS from the kernel, QEMU writes CPER into corresponding
>      "Error Status Data Block", guest memory, and then injects platform specific
> @@ -105,8 +114,6 @@ Design Details
>       kernel, on receiving notification, guest APEI driver could read the CPER error
>       and take appropriate action.
>  
> -(11) kvm_arch_on_sigbus_vcpu() uses source_id as index in "etc/hardware_errors" to
> -     find out "Error Status Data Block" entry corresponding to error source. So supported
> -     source_id values should be assigned here and not be changed afterwards to make sure
> -     that guest will write error into expected "Error Status Data Block" even if guest was
> -     migrated to a newer QEMU.
> +(11) kvm_arch_on_sigbus_vcpu() report RAS errors via a SEA notifications,
> +     when a SIGBUS event is triggered.
 
>       The logic to convert a SEA notification
> +     into a source ID is defined inside ghes.c source file.
that's cheating and not documentation by any means
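
For illustration only, here is a minimal sketch of the kind of lookup the
doc could describe: walking a HEST table to find the entry for a given
source ID and reading the two addresses needed for injection. This is not
the code from the patch. The helper name hest_find_ghes_v2() is
hypothetical, the sketch assumes every error source is a 92-byte GHESv2
structure, and the field offsets are my reading of the ACPI 6.x GHESv2
layout, so treat them as assumptions to be checked against the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout constants (ACPI 6.x, GHESv2); not copied from the patch. */
#define ACPI_DESC_HEADER_OFFSET  36  /* standard table header size */
#define HEST_SRC_COUNT_SIZE       4  /* "Error Source Count" field */
#define HEST_TYPE_GHES_V2        10  /* error source type of GHESv2 */
#define GHES_V2_ENTRY_SIZE       92  /* size of one GHESv2 structure */
#define GHES_ERR_STATUS_GAS_OFF  20  /* "Error Status Address" GAS */
#define GHES_READ_ACK_GAS_OFF    64  /* "Read Ack Register" GAS */
#define GAS_ADDR_OFFSET           4  /* address field inside a GAS */

/* Read a little-endian integer of 'size' bytes (size <= 8). */
static uint64_t rd_le(const uint8_t *p, size_t size)
{
    uint64_t v = 0;

    for (size_t i = 0; i < size; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/*
 * Hypothetical helper: scan the HEST image for 'source_id'. On success,
 * store the values of the two GAS address fields and return 0.
 * Return -1 on truncation, a non-GHESv2 entry, or no match.
 */
int hest_find_ghes_v2(const uint8_t *hest, size_t len, uint16_t source_id,
                      uint64_t *err_status_addr, uint64_t *read_ack_addr)
{
    size_t off = ACPI_DESC_HEADER_OFFSET;
    uint32_t count;

    assert(hest && err_status_addr && read_ack_addr);
    if (len < off + HEST_SRC_COUNT_SIZE) {
        return -1;
    }
    count = (uint32_t)rd_le(hest + off, HEST_SRC_COUNT_SIZE);
    off += HEST_SRC_COUNT_SIZE;

    for (uint32_t i = 0; i < count; i++) {
        const uint8_t *e = hest + off;

        if (len - off < GHES_V2_ENTRY_SIZE) {
            return -1;                  /* table is truncated */
        }
        if (rd_le(e, 2) != HEST_TYPE_GHES_V2) {
            return -1;                  /* sketch handles GHESv2 only */
        }
        if (rd_le(e + 2, 2) == source_id) {
            *err_status_addr =
                rd_le(e + GHES_ERR_STATUS_GAS_OFF + GAS_ADDR_OFFSET, 8);
            *read_ack_addr =
                rd_le(e + GHES_READ_ACK_GAS_OFF + GAS_ADDR_OFFSET, 8);
            return 0;
        }
        off += GHES_V2_ENTRY_SIZE;
    }
    return -1;                          /* source ID not present */
}
```

The point of the sketch is that the source ID is matched against what the
table itself declares, so the size and type of each entry can be checked
along the way instead of relying on a fixed index into
"etc/hardware_errors".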

> 
> 
> 

