linux-kernel - Re: [PATCH 3/8] x86, microcode, intel: clarify log messages

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20141029095453.GB17989@pd.tnic>
Date:	Wed, 29 Oct 2014 10:54:53 +0100
From:	Borislav Petkov <bp@...en8.de>
To:	Henrique de Moraes Holschuh <hmh@....eng.br>
Cc:	linux-kernel@...r.kernel.org, H Peter Anvin <hpa@...or.com>
Subject: Re: [PATCH 3/8] x86, microcode, intel: clarify log messages

On Tue, Oct 21, 2014 at 12:13:32PM -0200, Henrique de Moraes Holschuh wrote:
> > > @@ -178,11 +178,11 @@ static int apply_microcode_intel(int cpu)
> > >  	rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
> > >  
> > >  	if (val[1] != mc_intel->hdr.rev) {
> > > -		pr_err("CPU%d update to revision 0x%x failed\n",
> > > +		pr_err("CPU%d: update to revision 0x%x rejected by the processor\n",
> > >  		       cpu_num, mc_intel->hdr.rev);
> > >  		return -1;
> > >  	}
> > > -	pr_info("CPU%d updated to revision 0x%x, date = %04x-%02x-%02x\n",
> > > +	pr_info("CPU%d: entire core updated to revision 0x%x, date %04x-%02x-%02x\n",
> > 
> > Those two above are not really needed IMO.
> 
> They are an attempt to make it more clear.  I've got bug reports about this
> sort of issue before in Debian, people do wonder why some "cpus" are
> updated, and others _apparently_ are not (because there are no messages for
> them).

The fact that people open bugs doesn't always mean it is really a bug :-)

And there's a very simple answer to that:

$ grep microcode /proc/cpuinfo | uniq

should give only a single line.

And I'm pretty sure people will ask about "entire core" too. You simply
can't explain to them that the microcode engine is shared between
threads in a oneliner. :-)

Hrrm, I still don't like that "entire core" thing, how about something
like this:

	pr_info("CPU%d: update to revision 0x%x, date %04x-%02x-%02x\n"

this doesn't say at least that we're updating a CPU to a revision sth
sth but that we're doing the update on CPU%d. This should clearly
state what happens without making any assumptions about what the core
contains.

> We have more reasons why an update fail.  Rejected by the processor is a
> very specific one we want to single out (it basically means we are either
> feeding crap to the processor _or_ violating one of the architectural
> requirements).

That all doesn't matter - you only want to be able to pinpoint which
error message you're looking at and as long as they differ, you're fine.
Otherwise this is just an unnecessary churn.

> It is lowercase because we now prefix all pr_<level>() with "microcode: ",
> so it will look like this:
> 
> "microcode: warning: foo"
> "microcode: error: foo".
> 
> Should I capitalize this and all other messages of this sort?  It doesn't
> look right to me to log "microcode: Error: foo" or "microcode: Warning:
> foo".

Why not? You have the piece of machinery which generates the message,
i.e. "microcode" and then a sentence which starts with a capital letter.
I think this is the accepted practice in the kernel.

> The status-quo confuses people, they wonder why some "cpus" were skipped,
> and they file bugs about it or ask about it in the user forums.  I got a bug
> report for a module-based AMD processor, and it took a few round-trips to
> explain what was happening to the user...
> 
> If "entire core" is unclear, what wording would you suggest?

See above.

I just don't like the quantification "entire core" because later, when
CPU guys decide to split it more, we'll need to change it again.

> However, it isn't from the initrd: it comes from the early initramfs, which
> is a different kind of beast.  We (distros) did have microcode data inside
> the initrd/initramfs for the regular microcode driver, so the previous error
> message could be misleading.

Wait a minute, we do supply the early microcode by adding it to the
initrd: Documentation/x86/early-microcode.txt

Also, what you do on Debian doesn't necessarily mean all distros do it,
does it?

> Would you perfer "warning: failed to save early initramfs microcode update
> data for future use\n" ?

I guess we can say something method-agnostic without explicitly naming
that method...


> > > @@ -72,13 +72,13 @@ int microcode_sanity_check(void *mc, int print_err)
> > >  		if ((ext_table_size < EXT_HEADER_SIZE)
> > >  		 || ((ext_table_size - EXT_HEADER_SIZE) % EXT_SIGNATURE_SIZE)) {
> > >  			if (print_err)
> > > -				pr_err("error! Small exttable size in microcode data file\n");
> > > +				pr_err("error: small exttable size in microcode data file\n");
> > 
> > That doesn't tell me a whole lot - maybe "... truncated exttable in microcode data file" ?
> 
> If it hit this codepath, it will not be due to "normal" truncation.  The

What is a "normal" truncation?

> caller detects truncation earlier, checking the main header total_size
> against the size of the data it got from userspace or the early initramfs.

You mean this:

	if (data_size + MC_HEADER_SIZE > total_size)

?

If so, this is not truncation check - it sanity-checks what's in the
headers or the default sizes.


Btw, there's a typo in show_saved_mc()

pr_debug("mc_saved[%d]: sig=0x%x, pf=0x%x, rev=0x%x, toal size=0x%x, date = %04x-%02x-%02x\n",
						     ^^^^
> The message really means: "exttable size incompatible with the size of a
> valid table".

Let's call this:

	"Error: extended signature table size mismatch!\n"

because this is what this check does.

> I want to differentiate these two, they're caused by different issues, so
> they should not be the same error message.
> 
> Would you prefer "exttable entry counter incompatible with exttable size",
> which is what this message really means?

Yep, that sounds better.

> Well, every other error message in microcode_sanity_check() also means we
> abort just the same.  It doesn't make sense to have "abort" just in this
> one.

Ok.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/