lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250226132905.47aa50d2@imammedo.users.ipa.redhat.com>
Date: Wed, 26 Feb 2025 13:29:05 +0100
From: Igor Mammedov <imammedo@...hat.com>
To: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@...wei.com>, "Michael S . Tsirkin"
 <mst@...hat.com>, Shiju Jose <shiju.jose@...wei.com>,
 <qemu-arm@...gnu.org>, <qemu-devel@...gnu.org>, Philippe Mathieu-Daudé <philmd@...aro.org>, Ani Sinha
 <anisinha@...hat.com>, Cleber Rosa <crosa@...hat.com>, Dongjiu Geng
 <gengdongjiu1@...il.com>, Eduardo Habkost <eduardo@...kost.net>, Eric Blake
 <eblake@...hat.com>, John Snow <jsnow@...hat.com>, Marcel Apfelbaum
 <marcel.apfelbaum@...il.com>, "Markus Armbruster" <armbru@...hat.com>,
 Michael Roth <michael.roth@....com>, "Paolo Bonzini" <pbonzini@...hat.com>,
 Peter Maydell <peter.maydell@...aro.org>, Shannon Zhao
 <shannon.zhaosl@...il.com>, Yanan Wang <wangyanan55@...wei.com>, Zhao Liu
 <zhao1.liu@...el.com>, <kvm@...r.kernel.org>,
 <linux-kernel@...r.kernel.org>, Gavin Shan <gshan@...hat.com>
Subject: Re: [PATCH v3 00/14] Change ghes to use HEST-based offsets and add
 support for error inject

On Wed, 26 Feb 2025 10:56:28 +0100
Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:

> Em Tue, 25 Feb 2025 11:01:15 +0100
> Igor Mammedov <imammedo@...hat.com> escreveu:
> 
> > On Fri, 21 Feb 2025 10:21:27 +0000
> > Jonathan Cameron <Jonathan.Cameron@...wei.com> wrote:
> >   
> > > On Fri, 21 Feb 2025 07:38:23 +0100
> > > Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:
> > >     
> > > > Em Mon, 3 Feb 2025 16:22:36 +0100
> > > > Igor Mammedov <imammedo@...hat.com> escreveu:
> > > >       
> > > > > On Mon, 3 Feb 2025 11:09:34 +0000
> > > > > Jonathan Cameron <Jonathan.Cameron@...wei.com> wrote:
> > > > >         
> > > > > > On Fri, 31 Jan 2025 18:42:41 +0100
> > > > > > Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:
> > > > > >           
> > > > > > > Now that the ghes preparation patches were merged, let's add support
> > > > > > > for error injection.
> > > > > > > 
> > > > > > > On this series, the first 6 patches chang to the math used to calculate offsets at HEST
> > > > > > > table and hardware_error firmware file, together with its migration code. Migration tested
> > > > > > > with both latest QEMU released kernel and upstream, on both directions.
> > > > > > > 
> > > > > > > The next patches add a new QAPI to allow injecting GHESv2 errors, and a script using such QAPI
> > > > > > >    to inject ARM Processor Error records.
> > > > > > > 
> > > > > > > If I'm counting well, this is the 19th submission of my error inject patches.            
> > > > > > 
> > > > > > Looks good to me. All remaining trivial things are in the category
> > > > > > of things to consider only if you are doing another spin.  The code
> > > > > > ends up how I'd like it at the end of the series anyway, just
> > > > > > a question of the precise path to that state!          
> > > > > 
> > > > > if you look at series as a whole it's more or less fine (I guess you
> > > > > and me got used to it)
> > > > > 
> > > > > however if you take it patch by patch (as if you've never seen it)
> > > > > ordering is messed up (the same would apply to everyone after a while
> > > > > when it's forgotten)
> > > > > 
> > > > > So I'd strongly suggest to restructure the series (especially 2-6/14).
> > > > > re sum up my comments wrt ordering:
> > > > > 
> > > > > 0  add testcase for HEST table with current HEST as expected blob
> > > > >    (currently missing), so that we can be sure that we haven't messed
> > > > >    existing tables during refactoring.        
> > > 
> > > To potentially save time I think Igor is asking that before you do anything
> > > at all you plug the existing test hole which is that we don't test HEST
> > > at all.   Even after this series I think we don't test HEST.  You add
> > > a stub hest and exclusion but then in patch 12 the HEST stub is deleted whereas
> > > it should be replaced with the example data for the test.    
> > 
> > that's what I was saying.
> > HEST table should be in DSDT, but it's optional and one has to use
> > 'ras=on' option to enable that, which we aren't doing ATM.
> > So whatever changes are happening we aren't seeing them in tests
> > nor will we see any regression for the same reason.
> > 
> > While white listing tables before change should happen and then updating them
> > is the right thing to do, it's not sufficient since none of tests
> > run with 'ras' enabled, hence code is not actually executed.   
> 
> Ok. Well, again we're not modifying HEST table structure on this
> changeset. The only change affecting HEST is when the number of entries
> increased from 1 to 2.
> 
> Now, looking at bios-tables-test.c, if I got it right, I should be doing
> something similar to the enclosed patch, right?
> 
> If so, I have a couple of questions:
> 
> 1. from where should I get the HEST table? dumping the table from the
>    running VM?


> 
> 2. what values should I use to fill those variables:
> 
> 	int hest_offset = 40 /* HEST */;
> 	int hest_entry_size = 4;
you don't need to do that,
bios-tables-test will dump all ACPI tables for you automatically,
you only need to add or extend a test with ras=on option.

   1: 1st add empty table and whitelist it ("tests/data/acpi/aarch64/virt/HEST")
   2: enable ras in existing tescase

--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -2123,7 +2123,8 @@ static void test_acpi_aarch64_virt_tcg(void)
     data.smbios_cpu_max_speed = 2900;
     data.smbios_cpu_curr_speed = 2700;
     test_acpi_one("-cpu cortex-a57 "
-                  "-smbios type=4,max-speed=2900,current-speed=2700", &data);
+                  "-smbios type=4,max-speed=2900,current-speed=2700 "
+                  "-machine ras=on", &data);
     free_test_data(&data);
 }
     
  then with installed IASL run
    V=1 QTEST_QEMU_BINARY=./qemu-system-aarch64  ./tests/qtest/bios-tables-test
  to see diff

  3: rebuild tables and follow the rest of procedure to update expected blobs
     as described in comment at the top of (tests/qtest/bios-tables-test.c)

I'd recommend to add 3 patches as the beginning of the series,
that way we can be sure that if something changes unintentionally
it won't go unnoticed.

> 
> >   
> > > 
> > > That indeed doesn't address testing the error data storage which would be
> > > a different problem.    
> > 
> > I'd skip hardware_errors/CPER testing from QEMU unit tests.
> > That's basically requires functioning 'APEI driver' to test that.
> > 
> > Maybe we can use Ani's framework to parse HEST and all the way
> > towards CPER record(s) traversal, but that's certainly out of
> > scope of this series.
> > It could be done on top, but I won't insist even on that
> > since Mauro's out of tree error injection testing will
> > cover that using actual guest (which I assume he would
> > like to run periodically).  
> 
> Yeah, my plan is to periodically test it. I intend to setup somewhere
> a CI to test Kernel, QEMU and rasdaemon altogether.
> 
> Thanks,
> Mauro
> 
> ---
> 
> diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
> index 0a333ec43536..31e69d906db4 100644
> --- a/tests/qtest/bios-tables-test.c
> +++ b/tests/qtest/bios-tables-test.c
> @@ -210,6 +210,8 @@ static void test_acpi_fadt_table(test_data *data)
>      uint32_t val;
>      int dsdt_offset = 40 /* DSDT */;
>      int dsdt_entry_size = 4;
> +    int hest_offset = 40 /* HEST */;
> +    int hest_entry_size = 4;
>  
>      g_assert(compare_signature(&table, "FACP"));
>  
> @@ -242,6 +244,12 @@ static void test_acpi_fadt_table(test_data *data)
>      /* update checksum */
>      fadt_aml[9 /* Checksum */] = 0;
>      fadt_aml[9 /* Checksum */] -= acpi_calc_checksum(fadt_aml, fadt_len);
> +
> +
> +
> +    acpi_fetch_table(data->qts, &table.aml, &table.aml_len,
> +                     fadt_aml + hest_offset, hest_entry_size, "HEST", true);
> +    g_array_append_val(data->tables, table);
>  }
>  
>  static void dump_aml_files(test_data *data, bool rebuild)
> @@ -2411,7 +2419,7 @@ static void test_acpi_aarch64_virt_oem_fields(void)
>      };
>      char *args;
>  
> -    args = test_acpi_create_args(&data, "-cpu cortex-a57 "OEM_TEST_ARGS);
> +    args = test_acpi_create_args(&data, "-ras on -cpu cortex-a57 "OEM_TEST_ARGS);
>      data.qts = qtest_init(args);
>      test_acpi_load_tables(&data);
>      test_oem_fields(&data);
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ