lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 27 Feb 2018 19:34:17 +0100
From:   Borislav Petkov <bp@...e.de>
To:     "Ghannam, Yazen" <Yazen.Ghannam@....com>
Cc:     "linux-efi@...r.kernel.org" <linux-efi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "ard.biesheuvel@...aro.org" <ard.biesheuvel@...aro.org>,
        "x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH v2 2/8] efi: Decode IA32/X64 Processor Error Section

On Tue, Feb 27, 2018 at 06:06:21PM +0000, Ghannam, Yazen wrote:
> It's not just about arches but record types. A single platform can report
> using arch-specific records, memory records, PCIe records, etc.
> 
> So all the other record types only show fields with a valid bit set and this
> has always been the case. There may be users or tools who expect that
> same behavior.

You know we have this thing called tracepoints for that, don't you?

You define one tracepoint which regurgitates an error record and then
all arches which have the same table, share them.

> Please see the other patches where these are decoded further. As I
> mentioned the cover letter, the decoding is applied incrementally rather
> than having one large patch.

Here's what needs to go:

All the validation bits crap:

	printk("%sValidation Bits: 0x%04x\n", pfx, validation_bits);

Instead, you simply dump all entries unconditionally and say whether
they're valid or not. This way you have the same format for each error
record - much easier to work with.

	printk("%sCPUID Info:\n", pfx);
        print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, proc->cpuid,
                       sizeof(proc->cpuid), 0);

Frankly, I have no clue why we should even dump CPUID for *every* error.
It is completely sufficient to dump family/model/stepping like we do in
mce_amd.c. Besides, when debugging, you collect much more info from the
system - CPUID only is not enough.

	printk("%sError Structure Type: %pUl\n", newpfx,
               &err_info->err_type);

Unreadable GUIDs.

What are those even:

                if (err_info->validation_bits & INFO_VALID_TARGET_ID) {
                        printk("%sTarget Identifier: 0x%016llx\n",
                               newpfx, err_info->target_id);
                }

                if (err_info->validation_bits & INFO_VALID_REQUESTOR_ID) {
                        printk("%sRequestor Identifier: 0x%016llx\n",
                               newpfx, err_info->requestor_id);
                }

                if (err_info->validation_bits & INFO_VALID_RESPONDER_ID) {
                        printk("%sResponder Identifier: 0x%016llx\n",
                               newpfx, err_info->responder_id);
                }

some 8-byte ids? Who knows... Either remove them or make them
human-readable, *explaining* what the requestor is and the target is and
so on.

                printk("%sRegister Array Size: 0x%04x\n", newpfx,
                       ctx_info->reg_arr_size);

I have no clue what that is for.

                if (ctx_info->reg_ctx_type == CTX_TYPE_MSR) {
                        groupsize = 8; /* MSRs are 8 bytes wide. */
                        printk("%sMSR Address: 0x%08x\n", newpfx,
                               ctx_info->msr_addr);
                }

What do I need the MSR *addresses* for?

                if (ctx_info->reg_ctx_type == CTX_TYPE_MMREG) {
                        printk("%sMM Register Address: 0x%016llx\n", newpfx,
                               ctx_info->mm_reg_addr);
                }

memory mapped registers?!?!

> Also, like I said in another thread, we should always print the raw value
> followed by the decoding. That way the raw info is there in case a user
> wants to send the data to the vendor or other expert party.

By that logic we shouldn't be decoding at all - we should be dumping a
fat hex blob.

Again, there's this thing called tracepoints which have exactly that,
you know. You can dump human readable info to dmesg and dump a whole raw
record into the tracepoint.

When the error is critical and we're about to die, the tracepoint
contents goes to dmesg too.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ