lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240916130506.293ab543@imammedo.users.ipa.redhat.com>
Date: Mon, 16 Sep 2024 13:05:06 +0200
From: Igor Mammedov <imammedo@...hat.com>
To: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
Cc: "Michael S. Tsirkin" <mst@...hat.com>, Ani Sinha <anisinha@...hat.com>,
 Dongjiu Geng <gengdongjiu1@...il.com>, linux-kernel@...r.kernel.org,
 qemu-arm@...gnu.org, qemu-devel@...gnu.org
Subject: Re: [PATCH v9 01/12] acpi/ghes: add a firmware file with HEST
 address

On Sat, 14 Sep 2024 07:33:14 +0200
Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:

> Hi Igor,
> 
> Em Fri, 13 Sep 2024 15:25:18 +0200
> Igor Mammedov <imammedo@...hat.com> escreveu:
> 
> > > > in addition to this, it needs a patch on top to make sure
> > > > that we migrate hest_addr_le.
> > > > See a08a64627b6b 'ACPI: Record the Generic Error Status Block address'
> > > > and fixes on top of that for an example.      
> > > 
> > > Hmm... If I understood such change well, vmstate_ghes_state() will
> > > use this structure as basis to do migration:
> > > 
> > > 	/* ghes.h */
> > > 	typedef struct AcpiGhesState {
> > > 	    uint64_t hest_addr_le;
> > > 	    uint64_t ghes_addr_le;
> > > 	    bool present; /* True if GHES is present at all on this board */
> > > 	} AcpiGhesState;
> > > 
> > > 	/* generic_event_device.c */
> > > 	static const VMStateDescription vmstate_ghes_state = {
> > > 	    .name = "acpi-ged/ghes",
> > > 	    .version_id = 1,
> > > 	    .minimum_version_id = 1,
> > > 	    .needed = ghes_needed,
> > > 	    .fields      = (VMStateField[]) {
> > > 	        VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
> > > 	                       vmstate_ghes_state, AcpiGhesState),
> > > 	        VMSTATE_END_OF_LIST()
> > > 	    }
> > > 	};    
> > 
> > current code looks like that:
> >                                                                                  
> > static const VMStateDescription vmstate_ghes = {                                 
> >     .name = "acpi-ghes",                                                         
> >     .version_id = 1,                                                             
> >     .minimum_version_id = 1,                                                     
> >     .fields = (const VMStateField[]) {                                           
> >         VMSTATE_UINT64(ghes_addr_le, AcpiGhesState),   <<===                         
> >         VMSTATE_END_OF_LIST()                                                    
> >     },                                                                           
> > };                                                                               
> >                                                                                  
> > static bool ghes_needed(void *opaque)                                            
> > {                                                                                
> >     AcpiGedState *s = opaque;                                                    
> >     return s->ghes_state.ghes_addr_le;                                           
> > }                                                                                
> >                                                                                  
> > static const VMStateDescription vmstate_ghes_state = {                           
> >     .name = "acpi-ged/ghes",                                                     
> >     .version_id = 1,                                                             
> >     .minimum_version_id = 1,                                                     
> >     .needed = ghes_needed,                                                       
> >     .fields = (const VMStateField[]) {                                           
> >         VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,                              
> >                        vmstate_ghes, AcpiGhesState),                             
> >         VMSTATE_END_OF_LIST()                                                    
> >     }                                                                            
> > };  
> > 
> > where 
> >     VMSTATE_UINT64(ghes_addr_le, AcpiGhesState),
> > explicitly defines field(s) within structure to be sent over wire.
> > 
> > we need to add a conditional field for ghes_addr_le
> > which will be sent only with new machine types, but not with old ones
> > to avoid migration breakage.
> > 
> > I don't know much about migration, but maybe we can get away with
> > similar condition as in ghes_needed(), or enabling QMP error injection
> > based on machine type version.
> > 
> > Or maybe adding a CLI option to enable QMP error injection in which
> > case the explicit option would serve as a trigger enable QMP command and
> > to migrate hest_addr_le.
> > It might be even better this way, since machine wouldn't need to
> > carry extra error source that will be used only for testing
> > and practically never in production VMs (aka reduced attack surface).
> > 
> > You can easily test it locally:
> >   new-qemu: with your patches
> >   old-qemu: qemu-9.1
> > 
> > and then try to do forth & back migration for following cases:
> >   1. (ping-pong case with OLD firmware/ACPI tables)
> >      start old-qemu with 9.1 machine type ->
> >        migrate to file ->
> >        start new-qemu with 9.1 machine type -> restore from file ->
> >        migrate to file ->  
> 
> As I never used migration, I'm a little stuck with the command line
> parameters.
> 
> I guess I got the one to do the migration at the monitor:
> 
> 	(qemu) migrate file://tmp/migrate
> 
> But no idea how to start a machine using a saved state.

see https://www.linux-kvm.org/page/Migration
'savevm/loadvm to an external state file (using pseudo-migration)' section

> 
> >        start old-qemu with 9.1 machine type ->restore from file ->
> >        
> >   2.  (ping-pong case with NEW firmware/ACPI tables)
> >       do the same as #1 but starting with new-qemu binary
> > 
> > (from upstream pov #2 is optional, but not implementing it
> > is pain for downstream so it's better to have it if it's not
> > too much work)  
> 
> If I understood the migration documentation, every when new fields
> are added, we should increment .version_id. If new version is
> not backward-compatible, .minimum_version_id is also incremented.
> 
> So, for a migration-compatible code with a 9.1 VM, the code needs to
> handle the case where hest_addr_le is not defined, e. g. use offsets
> relative to ghes_addr_le, just like the current version, e.g.:
> 
>     uint64_t cper_addr, read_ack_start_addr;
> 
>     AcpiGedState *acpi_ged_state =
>         ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED, NULL));
>     AcpiGhesState *ags = &acpi_ged_state->ghes_state;
> 
>     if (!ags->hest_addr_le) {
>         // Backward-compatible migration code
>         uint64_t base = le64_to_cpu(ags->ghes_addr_le);
> 
>         *read_ack_start_addr = base +
>             ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t) +
>             error_source_to_index[notify] * sizeof(uint64_t);
> 
>         *cper_addr = base +
>             ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t) +
>             ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t) +
>             error_source_to_index[notify] * ACPI_GHES_MAX_RAW_DATA_LENGTH;
>    } else {
>         // Use the new logic from ags->hest_addr_le
>    }
> 
> There are two problems with that:
> 
> 1. On your reviews, if I understood right, the code above is not
>    migration safe. So, while implementing it would be technically
>    correct, migration still won't work;
> 
> 2. With the new code, ACPI_GHES_ERROR_SOURCE_COUNT is not
>    defined anymore, as the size of the error source structure can
>    be different on different architectures, being 2 for new
>    VMs and 1 for old ones.
> 
>    Basically the new code gets it right because it can see a
>    pointer to the HEST table, so it can get the number from there:
> 
> 	hest_addr = le64_to_cpu(ags->hest_addr_le);
> 	cpu_physical_memory_read(hest_addr, &num_sources, sizeof(num_sources));
> 
>    But, without hest_addr_le, getting num_sources is not possible.
> 
>    An alternative would be to add a hacky code that works only for
>    arm machines (as new versions may support more archs).
> 
>    Something like:
> 	#define V1_ARM_ACPI_GHES_ERROR_SOURCE_COUNT 1
> 	#define V2_ARM_ACPI_GHES_ERROR_SOURCE_COUNT 2
> 
>    And have a hardcoded logic that would work before/after this
>    changeset but may break on newer versions, if the number of
>    source IDs change, if we add other HEST types, etc.
> 
>    Now, assuming that such hack would work, it sounds too hacky to 
>    my taste.
> 
> So, IMO it is a lot safer to not support migrations from v1 (only
> ghes_addr_le), using a patch like the enclosed one to ensure that.
> 
> Btw, checking existing migration structs, it sounds that for almost all
> structures, .version_id is identical to .minimum_version_id, meaning that
> migration between different versions aren't supported on most cases.

let me try to find more examples on how to implement migration bits
hopefully more comprehensible. 

> 
> Thanks,
> Mauro
> 
> ---
> 
> [PATCH] acpi/generic_event_device: Update GHES migration to cover hest addr
> 
> The GHES migration logic at GED should now support HEST table
> location too.
> 
> Increase migration version and change needed to check for both
> ghes_addr_le and hest_addr_le.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
> 
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index b4c83a089a02..efae0ff62c7b 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -351,10 +351,11 @@ static const VMStateDescription vmstate_ged_state = {
>  
>  static const VMStateDescription vmstate_ghes = {
>      .name = "acpi-ghes",
> -    .version_id = 1,
> -    .minimum_version_id = 1,
> +    .version_id = 2,
> +    .minimum_version_id = 2,
>      .fields = (const VMStateField[]) {
>          VMSTATE_UINT64(ghes_addr_le, AcpiGhesState),
> +        VMSTATE_UINT64(hest_addr_le, AcpiGhesState),
>          VMSTATE_END_OF_LIST()
>      },
>  };
> @@ -362,13 +363,13 @@ static const VMStateDescription vmstate_ghes = {
>  static bool ghes_needed(void *opaque)
>  {
>      AcpiGedState *s = opaque;
> -    return s->ghes_state.ghes_addr_le;
> +    return s->ghes_state.ghes_addr_le && s->ghes_state.hest_addr_le;
>  }
>  
>  static const VMStateDescription vmstate_ghes_state = {
>      .name = "acpi-ged/ghes",
> -    .version_id = 1,
> -    .minimum_version_id = 1,
> +    .version_id = 2,
> +    .minimum_version_id = 2,
>      .needed = ghes_needed,
>      .fields = (const VMStateField[]) {
>          VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
> 
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ