lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241204102413.31c8d76d@imammedo.users.ipa.redhat.com>
Date: Wed, 4 Dec 2024 10:24:13 +0100
From: Igor Mammedov <imammedo@...hat.com>
To: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@...wei.com>, Shiju Jose
 <shiju.jose@...wei.com>, "Michael S. Tsirkin" <mst@...hat.com>, Ani Sinha
 <anisinha@...hat.com>, Dongjiu Geng <gengdongjiu1@...il.com>,
 linux-kernel@...r.kernel.org, qemu-arm@...gnu.org, qemu-devel@...gnu.org
Subject: Re: [PATCH v4 13/15] acpi/ghes: move offset calculus to a separate
 function

On Wed, 4 Dec 2024 09:56:35 +0100
Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:

> Em Wed, 4 Dec 2024 08:54:40 +0100
> Igor Mammedov <imammedo@...hat.com> escreveu:
> 
> > On Tue, 3 Dec 2024 14:47:30 +0100
> > Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:
> >   
> > > Em Tue, 3 Dec 2024 12:51:43 +0100
> > > Igor Mammedov <imammedo@...hat.com> escreveu:
> > >     
> > > > On Fri, 22 Nov 2024 10:11:30 +0100
> > > > Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:
> > > >       
> > > > > Currently, CPER address location is calculated as an offset of
> > > > > the hardware_errors table. It is also badly named, as the
> > > > > offset actually used is the address where the CPER data starts,
> > > > > and not the beginning of the error source.
> > > > > 
> > > > > Move the logic which calculates such offset to a separate
> > > > > function, in preparation for a patch that will be changing the
> > > > > logic to calculate it from the HEST table.
> > > > > 
> > > > > While here, properly name the variable which stores the cper
> > > > > address.
> > > > > 
> > > > > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
> > > > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> > > > > ---
> > > > >  hw/acpi/ghes.c | 41 ++++++++++++++++++++++++++++++++---------
> > > > >  1 file changed, 32 insertions(+), 9 deletions(-)
> > > > > 
> > > > > diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> > > > > index 87fd3feedd2a..d99697b20164 100644
> > > > > --- a/hw/acpi/ghes.c
> > > > > +++ b/hw/acpi/ghes.c
> > > > > @@ -364,10 +364,37 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> > > > >      ags->present = true;
> > > > >  }
> > > > >  
> > > > > +static void get_hw_error_offsets(uint64_t ghes_addr,
> > > > > +                                 uint64_t *cper_addr,
> > > > > +                                 uint64_t *read_ack_register_addr)
> > > > > +{        
> > > > 
> > > >       
> > > > > +    if (!ghes_addr) {
> > > > > +        return;
> > > > > +    }        
> > > > 
> > > > why do we need this check?      
> > > 
> > > It is a safeguard measure to avoid crashes and OOM access. If fw_cfg 
> > > callback doesn't fill it properly, this will be zero.    
> > 
> > shouldn't happen, but yeah it firmware job to write back addr
> > which might happen for whatever reason (a bug for example).
> >  
> 
> The main reason I added it is that, after the second series, it could 
> also happen if there's something wrong with the backward compat logic.
> 
> So, both here and after switching to HEST-based offsets, I opted
> to explicitly test.
> 
> > Perhaps push this up to the stack, so we don't have to deal
> > with scattered checks in ghes code.
> > 
> > kvm_arch_on_sigbus_vcpu() looks like a goo candidate for check
> > and warn_once if that ever happens.
> > It already calls acpi_ghes_present() which resolves GED device
> > and then later we duplicate this job in ghes_record_cper_errors()
> > 
> > so maybe rename acpi_ghes_present to something like AcpiGhesState* acpi_ghes_get_state()
> > and call it instead. And then move ghes_addr check/warn_once there.
> > This way the rest of ghes code won't have to deal handling practically
> > impossible error conditions that cause reader to wonder why it might happen.  
> 
> I'll look on it. Yet, if ok for you, I would prefer dealing with this
> once we have a bigger picture, e.g. once we merge those tree series:
> 
> 	- cleanup series (this one);
> 	- HEST offset (I'll be sending a new version today);
ok, lets revisit this point after this series.
Since at this point we should have a clean picture of how new code
works.

> 	- error_inject.
> 
> Thanks,
> Mauro
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ