Message-ID: <20240807154350.5907e4ed@imammedo.users.ipa.redhat.com>
Date: Wed, 7 Aug 2024 15:43:50 +0200
From: Igor Mammedov <imammedo@...hat.com>
To: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@...wei.com>, Shiju Jose
 <shiju.jose@...wei.com>, "Michael S. Tsirkin" <mst@...hat.com>, Ani Sinha
 <anisinha@...hat.com>, Dongjiu Geng <gengdongjiu1@...il.com>,
 <linux-kernel@...r.kernel.org>, <qemu-arm@...gnu.org>,
 <qemu-devel@...gnu.org>
Subject: Re: [PATCH v5 6/7] acpi/ghes: add support for generic error
 injection via QAPI

On Wed, 7 Aug 2024 15:23:57 +0200
Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:

> On Wed, 7 Aug 2024 10:34:36 +0100
> Jonathan Cameron <Jonathan.Cameron@...wei.com> wrote:
> 
> > On Wed, 7 Aug 2024 09:47:50 +0200
> > Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:
> >   
> > > On Tue, 6 Aug 2024 16:31:13 +0200
> > > Igor Mammedov <imammedo@...hat.com> wrote:
> > >     
> > > > PS:
> > > > looking at the code, ACPI_GHES_MAX_RAW_DATA_LENGTH is 1K
> > > > and it is the total size of an error block for an error source.
> > > > 
> > > > However acpi_hest_ghes.rst (3) says it should be 4K,
> > > > am I mistaken?      
> > > 
> > > Maybe Jonathan knows better, but I guess the 1K was just some
> > > arbitrary limit to prevent an overly large CPER. The 4K limit described
> > > in acpi_hest_ghes.rst could be just some limit to cope with
> > > the current BIOS implementation, but I didn't check myself how
> > > this is implemented there.
> > > 
> > > I was unable to find any limit in the specs. Yet, if you look at:
> > > 
> > > https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section    
> > 
> > I think both limits are just made up.  You can in theory log huge
> > error records.  Just no one does.
> 
> If both are made up, I would sync them, either patching the
> documentation or the ghes driver.
> 
> >   
> > > 
> > > The Processor Error Information Structure, starting at offset
> > > 40, can go up to 255*32, meaning an offset of 8200, which is
> > > bigger than 4K.
> > > 
> > > Going further, processor context can have up to 65535 entries
> > > (the spec actually says 65536, but that sounds like a typo, as the
> > > size is stored in a uint16_t), containing multiple register values
> > > there (the spec calls its length "P").
> > > 
> > > So, the CPER record could, in theory, have:
> > > 	8200 + (65535 * P) + sizeof(vendor-specific-info)
> > > 
> > > The CPER length is stored in the Section Length field, which is
> > > a uint32_t.
> > > 
> > > So, I'd say that the GHES record can theoretically be a lot
> > > bigger than 4K.	    
> > Agreed - but I don't think we care for testing as long as it's
> > big enough for plausible records.   Unless you really want
> > to fuzz the limits?  
> 
> Fuzzing the limits could be interesting, but it is not in my
> current plans.
> 
> Yet, 1K could be a little bit short for ARM CPER.
> 
> See: N.26 ARMv8 AArch64 GPRs (Type 4) has 256 bytes for
> registers, plus 8 bytes for the header. So, a total size of
> 264 bytes for a single context register dump. I would expect
> that, in real life, type 4 is always reported on aarch64,
> on BIOSes with context register support. Maybe other types could
> also be dumped together (like context registers for EL1,
> EL2 and/or EL3).
> 
> If just one type 4 context is encoded, it means that 1K has
> space for 23 errors (out of a maximum of 255).
> 
> Just looking at the maximum number, my feeling is that 1K
> might be too short to simulate some real life reports,
> but that depends on how firmware is actually grouping
> such events.

As far as I know, firmware is out of the picture here, since all
it does in the HEST case is allocate contiguous space for the
'etc/hardware_errors' blob as QEMU told it.

> 
> So, maybe this could be expanded to, let's say, 4K, thus
> aligning with the ReST documentation.
Maybe, to get moving, first get your series in with the docs fixed
to today's limit.
Then increase the error_block size to the desired value on top of that,
as it's really not relevant to what you are doing here.

> Regards,
> Mauro
> 
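
For reference, the size arithmetic discussed in this thread can be sketched as below. This is only an illustration, not QEMU code: the constant and function names are made up for the example, and the sizes (40-byte section header, 32 bytes per Processor Error Information Structure, at most 255 of them, 8 + 256 bytes for a Type 4 context) are the ones quoted above from UEFI 2.10 Appendix N. It looks at the ARM processor error section alone and ignores any GHES wrapping around the CPER, so how many entries fit under a given limit depends on which fixed overheads are counted.

```python
# Illustrative sketch only; names are hypothetical, sizes are the ones
# quoted in the thread from UEFI 2.10 Appendix N (ARM processor error section).
ARM_SECTION_HEADER = 40        # offset at which the error info structures start
ERR_INFO_SIZE = 32             # bytes per Processor Error Information Structure
MAX_ERR_INFO = 255             # maximum number of error info structures
TYPE4_CONTEXT_SIZE = 8 + 256   # context header + AArch64 GPRs (Type 4)


def arm_section_size(err_info_count, context_sizes=(), vendor_bytes=0):
    """Size in bytes of one ARM processor error section."""
    if not 0 <= err_info_count <= MAX_ERR_INFO:
        raise ValueError("err_info_count must be between 0 and 255")
    return (ARM_SECTION_HEADER
            + err_info_count * ERR_INFO_SIZE
            + sum(context_sizes)
            + vendor_bytes)


def err_info_that_fit(limit, context_sizes=(), vendor_bytes=0):
    """How many error info structures fit in `limit` bytes, given the
    context dumps and vendor data that must also be stored."""
    room = limit - ARM_SECTION_HEADER - sum(context_sizes) - vendor_bytes
    return max(0, min(MAX_ERR_INFO, room // ERR_INFO_SIZE))


if __name__ == "__main__":
    # Error info structures alone, at the maximum count.
    print(arm_section_size(MAX_ERR_INFO))
    # Capacity of a 1K and a 4K block with one Type 4 context dump.
    for limit in (1024, 4096):
        print(limit, err_info_that_fit(limit, [TYPE4_CONTEXT_SIZE]))
```

With the maximum of 255 error info structures and no context at all, the section already reaches the 8200 bytes mentioned above, which is past both the 1K and the 4K limits.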

