Date:   Fri, 30 Jun 2017 10:47:17 -0600
From:   "Baicar, Tyler" <tbaicar@...eaurora.org>
To:     Robert Richter <robert.richter@...ium.com>
Cc:     christoffer.dall@...aro.org, marc.zyngier@....com,
        pbonzini@...hat.com, rkrcmar@...hat.com, linux@...linux.org.uk,
        catalin.marinas@....com, will.deacon@....com, rjw@...ysocki.net,
        lenb@...nel.org, matt@...eblueprint.co.uk, robert.moore@...el.com,
        lv.zheng@...el.com, nkaje@...eaurora.org, zjzhang@...eaurora.org,
        mark.rutland@....com, james.morse@....com,
        akpm@...ux-foundation.org, eun.taik.lee@...sung.com,
        sandeepa.s.prabhu@...il.com, labbott@...hat.com,
        shijie.huang@....com, rruigrok@...eaurora.org,
        paul.gortmaker@...driver.com, tn@...ihalf.com, fu.wei@...aro.org,
        rostedt@...dmis.org, bristot@...hat.com,
        linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-acpi@...r.kernel.org, linux-efi@...r.kernel.org,
        Suzuki.Poulose@....com, punit.agrawal@....com, astone@...hat.com,
        harba@...eaurora.org, hanjun.guo@...aro.org, john.garry@...wei.com,
        shiju.jose@...wei.com, joe@...ches.com, bp@...en8.de,
        rafael@...nel.org, tony.luck@...el.com, gengdongjiu@...wei.com,
        xiexiuqi@...wei.com
Subject: Re: [PATCH V17 01/11] acpi: apei: read ack upon ghes record
 consumption

On 6/30/2017 4:10 AM, Robert Richter wrote:
> Tyler,
>
> On 19.05.17 14:32:03, Tyler Baicar wrote:
>> A RAS (Reliability, Availability, Serviceability) controller
>> may be a separate processor running in parallel with OS
>> execution, and may generate error records for consumption by
>> the OS. If the RAS controller produces multiple error records,
>> then they may be overwritten before the OS has consumed them.
>>
>> The Generic Hardware Error Source (GHES) v2 structure
>> introduces the capability for the OS to acknowledge the
>> consumption of the error record generated by the RAS
>> controller. A RAS controller supporting GHESv2 shall wait for
>> the acknowledgment before writing a new error record, thus
>> eliminating the race condition.
>>
>> Add support for parsing of GHESv2 sub-tables as well.
>>
>> Signed-off-by: Tyler Baicar <tbaicar@...eaurora.org>
>> CC: Jonathan (Zhixiong) Zhang <zjzhang@...eaurora.org>
>> Reviewed-by: James Morse <james.morse@....com>
>> ---
>>   drivers/acpi/apei/ghes.c | 59 +++++++++++++++++++++++++++++++++++++++++++++---
>>   drivers/acpi/apei/hest.c |  7 ++++--
>>   include/acpi/ghes.h      |  5 +++-
>>   3 files changed, 65 insertions(+), 6 deletions(-)
>>   static int ghes_proc(struct ghes *ghes)
>>   {
>>   	int rc;
>> @@ -661,6 +704,16 @@ static int ghes_proc(struct ghes *ghes)
>>   			ghes_estatus_cache_add(ghes->generic, ghes->estatus);
>>   	}
>>   	ghes_do_proc(ghes, ghes->estatus);
>> +
>> +	/*
>> +	 * GHESv2 type HEST entries introduce support for error acknowledgment,
>> +	 * so only acknowledge the error if this support is present.
>> +	 */
>> +	if (is_hest_type_generic_v2(ghes)) {
>> +		rc = ghes_ack_error(ghes->generic_v2);
>> +		if (rc)
>> +			return rc;
>> +	}
>>   out:
>>   	ghes_clear_estatus(ghes);
>>   	return rc;
> Was there any specific reason why the ack is sent before clearing the
> block status? The spec says the ack should be sent last.
>
> Also, the block is never cleared if ghes_ack_error() returns an error.
> IMO we should fall through and clear the block status (this will
> change anyway if the block status is cleared first).
Hello Robert,

Thank you for pointing this out. I will send a patch to move the ack
after ghes_clear_estatus(). This is probably the right thing to do
anyway: right now, if the FW populates an invalid estatus, we fail to
read the estatus, jump to 'out:', and never send the ack.
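
To illustrate the ordering being discussed, here is a rough standalone
sketch. It is not the follow-up patch and not kernel code: struct
fake_ghes and every fake_* helper are hypothetical stand-ins for
struct ghes, ghes_read_estatus(), ghes_clear_estatus() and
ghes_ack_error(). It only models the control flow: the block status is
always cleared first, and the ack is then attempted on GHESv2 sources
even when reading the estatus failed. Keeping the first error code as
the return value is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a GHES source; not the kernel's struct ghes. */
struct fake_ghes {
	bool is_v2;         /* models is_hest_type_generic_v2() */
	bool estatus_valid; /* false models FW populating an invalid estatus */
	bool block_cleared; /* set once the block status has been cleared */
	bool acked;         /* set once the ack has been sent successfully */
	int ack_rc;         /* result the fake ack helper should report */
};

/* Stand-in for ghes_read_estatus(): fails on an invalid estatus. */
static int fake_read_estatus(struct fake_ghes *g)
{
	return g->estatus_valid ? 0 : -EIO;
}

/* Stand-in for ghes_clear_estatus(). */
static void fake_clear_estatus(struct fake_ghes *g)
{
	g->block_cleared = true;
}

/* Stand-in for ghes_ack_error(): records the ack only on success. */
static int fake_ack_error(struct fake_ghes *g)
{
	if (g->ack_rc == 0)
		g->acked = true;
	return g->ack_rc;
}

/* Sketch of the reordered flow: clear the block status, then ack last. */
static int fake_ghes_proc(struct fake_ghes *g)
{
	int rc;

	rc = fake_read_estatus(g);
	if (rc)
		goto out;

	/* Processing of the error record would happen here. */

out:
	fake_clear_estatus(g);

	/* The ack comes after the clear and is reached on the error path too. */
	if (g->is_v2) {
		int ack_rc = fake_ack_error(g);

		/* Assumption of this sketch: report the first failure seen. */
		if (!rc)
			rc = ack_rc;
	}
	return rc;
}
```

With this ordering, a failing ack no longer prevents the block status
from being cleared, and an invalid estatus no longer prevents the ack
from being sent.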

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.