Message-ID: <20130927193657.GA11294@khazad-dum.debian.net>
Date:	Fri, 27 Sep 2013 16:36:58 -0300
From:	Henrique de Moraes Holschuh <hmh@....eng.br>
To:	Sherry Hurwitz <sherry.hurwitz@....com>
Cc:	Borislav Petkov <bp@...en8.de>, Jacob Shin <jacob.shin@....com>,
	Andreas Herrmann <herrmann.der.user@...glemail.com>,
	linux-kernel@...r.kernel.org
Subject: Re: Issues with AMD microcode updates

On Thu, 26 Sep 2013, Sherry Hurwitz wrote:
> We have failed to reproduce a hang while loading microcode.

I got an offer from a Debian user to test it over the weekend, let's hope
he will have more luck(?) at hitting the issue.  If he does, it should give
us sysrq+t dumps of the hung system.
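
For anyone trying to capture the same information without a console
keyboard, the sysrq+t task dump can also be requested from a shell, provided
another shell still responds.  A minimal sketch, assuming root and a kernel
built with CONFIG_MAGIC_SYSRQ; dump_tasks is a hypothetical helper name, and
the trigger path is a parameter only so the function can be dry-run against
a regular file:

```shell
# Hypothetical helper: request a task-state dump (the sysrq+t equivalent).
# By default it writes to /proc/sysrq-trigger; pass another path to dry-run.
dump_tasks() {
    trigger="${1:-/proc/sysrq-trigger}"
    if [ ! -w "$trigger" ] ; then
        echo "cannot write to $trigger (not root, or sysrq not available?)" >&2
        return 1
    fi
    # 't' asks the kernel to log the state of every task to the kernel log
    printf 't\n' > "$trigger"
}
```

The dump itself lands in the kernel log, so something like
"dump_tasks && dmesg > tasks.txt" would be needed to save it.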

> We have tested with kernel and AMD family combinations with
> normal and error condition so error paths were taken.  Obviously
> there are factors we are missing that the users are hitting.

Yeah, and it is not likely to be caused by a distro kernel patch, as the
users hit the issue using non-distro kernels :-(

Maybe it is on the firmware-loader side, but one user did wait 1 hour for
the thing to get unstuck, and that would have taken care of any possible
firmware-loader timeouts.

> Any suggestions on how we improve the test matrix would be
> helpful.  We will continue the investigation but any insights are appreciated.
> 
> NOTE: kernels before 3.0 only load 1 (2k) size of microcode patch and
> therefore do not support microcode loading of family 14h, 15h, and 16h.
> Also, in a test request on another thread you suggested someone with
> family 15h revC0 to load microcode twice with an earlier patch and then
> the latest, but there has only been 1 microcode patch level published for revB2
> so that test won't work.

Well, it is the only thing I could think of, other than some nasty race
condition...

> kernel           cpu family             results             conditions
> ---------------------------------------------------------------------------------
> 2.6.38           fam10h                 load passed         normal
> 2.6.38           fam15h revC0           load failed         2.6.38 can not handle 4k patches
> 3.5.2            fam10h                 load passed         normal
> 3.5.2            fam15h revB2           load passed         loaded 637 then second load 63d
> 3.5.2            fam15h revC0           load passed         normal
> 3.5.2            fam15h revC0           load failed         used a corrupted bin file

I just looked: the 2.6.38 hang happened on i686 with an unidentified 3-core
AMD processor, and the 3.5.2 hang on x86-64 PREEMPT with a fam15h model 2
stepping 0, 32-core AMD processor (Linux 3.5.2 (SMP w/32 CPU cores;
PREEMPT)).  No patterns there.
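
To make such reports easier to compare, the family, model, stepping and
microcode revision could be collected in one line from /proc/cpuinfo.  A
rough sketch, not something the affected users ran: cpu_summary is a
hypothetical name, the "microcode" field is assumed to be present (older
kernels do not export it, in which case "unknown" is printed), and only the
first processor entry is read:

```shell
# Hypothetical helper: one-line summary of the first processor entry.
# The file is a parameter so it can be run on a saved copy of /proc/cpuinfo.
cpu_summary() {
    awk -F'[ \t]*:[ \t]*' '
        /^$/ { exit }                    # blank line ends the first entry
        $1 == "vendor_id"  { vendor = $2 }
        $1 == "cpu family" { family = $2 }
        $1 == "model"      { model = $2 }
        $1 == "stepping"   { stepping = $2 }
        $1 == "microcode"  { ucode = $2 }
        END {
            # family is printed in hex to match the fam10h/fam15h naming
            printf "%s family %xh model %d stepping %d microcode %s\n",
                vendor, family, model, stepping,
                (ucode == "" ? "unknown" : ucode)
        }
    ' "${1:-/proc/cpuinfo}"
}
```

Running cpu_summary with no argument reads the live /proc/cpuinfo.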

BTW, the userspace script that users reported to have hung is this:

grep -q "^vendor_id[[:blank:]]*:[[:blank:]]*.*AuthenticAMD" /proc/cpuinfo && {
if modprobe -q --first-time microcode ; then
    echo "Updating microcode on all online processors..." >&2
else
    # we have to trigger the microcode update manually
    if [ -e /sys/devices/system/cpu/microcode/reload ] ; then
	echo "Updating microcode on all online processors..." >&2
	echo 1 > /sys/devices/system/cpu/microcode/reload || {
	    echo "Kernel reported failure while updating microcode!" >&2
	}
    else
	# Try all online processors, broken kernels need this,
	# fixed kernels will accept it only on the BSP and update
	# all processors anyway, and -EINVAL all others... but we
	# don't know which one is the BSP, so we try all of them
	# and hide errors, the kernel will log any real problem.
	echo "Using per-core interface to update microcode on online processors..." >&2
	find /sys/devices/system/cpu -noleaf -type f -path '/sys/devices/system/cpu/cpu*/microcode/reload' | \
	    while read i ; do echo -n 1 2>/dev/null >"$i" || true ; done
    fi
fi
}


The hangs were reported with the microcode driver already loaded (so the
modprobe line fails and the else branch runs).
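
If the hang happens in the per-core branch of the script above, a variant
that logs each path before writing to it would at least show which core's
reload file was being written when things stopped.  This is only a debugging
sketch under that assumption: reload_per_core is a hypothetical name, the
base directory is a parameter so the loop can be tried on a scratch tree,
and -noleaf assumes GNU find as in the original script:

```shell
# Hypothetical debugging variant of the per-core loop: announce each
# reload file on stderr before writing to it, so the last line logged
# names the write that did not return.
reload_per_core() {
    base="${1:-/sys/devices/system/cpu}"
    find "$base" -noleaf -type f -path "$base/cpu*/microcode/reload" | sort |
        while read -r f ; do
            echo "microcode: writing to $f" >&2
            printf 1 2>/dev/null > "$f" ||
                echo "microcode: write to $f failed" >&2
        done
}
```

Unlike the original, this reports failed writes instead of hiding them, so
the -EINVAL results from non-BSP cores on fixed kernels will show up as
noise.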

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh