Date:	Sat, 21 Nov 2015 13:49:06 +0200
From:	Vassilis Virvilis <vasvir@....demokritos.gr>
To:	Juergen Gross <jgross@...e.com>
Cc:	linux-kernel@...r.kernel.org, Toshi Kani <toshi.kani@...com>,
	"Luis R. Rodriguez" <mcgrof@...e.com>
Subject: Re: Hibernate resume bug around 3,18-rc2 - Full PAT support

On 11/20/2015 02:23 PM, Juergen Gross wrote:
> On 20/11/15 11:04, vasvir@....demokritos.gr wrote:
>>> I've just found a potential issue: In case MTRR is disabled by the BIOS
>>> the PAT register of the boot processor won't be restored after resume.
>>>
>>> Can you check whether pr_info("MTRR: Disabled\n") has been executed in
>>> early boot? If yes, this might be a BIOS option.
>>>
>>
>> I don't have access right now. I will test it later tonight (This is my
>> home machine).
>>
>> Would $dmesg | grep -i mtrr suffice or I need to look for the mtrr
>> somewere else e.g. /proc /sys etc?
>
> I think grepping for MTRR in dmesg should be enough.

Kernel 4.3 with nopat also died on the 4th or the 5th hibernate, at the familiar "Calling lapic..." place (see the previously attached image).

Output of dmesg | grep -i mtr for the 4.3 kernel with nopat:
[    0.189113] calling  mtrr_if_init+0x0/0x5f @ 1
[    0.189116] initcall mtrr_if_init+0x0/0x5f returned 0 after 0 usecs
[    0.189222] pmd_set_huge: Cannot satisfy [mem 0xf8000000-0xf8200000] with a huge-page mapping due to MTRR override.
[    0.189559] calling  mtrr_init_finialize+0x0/0x3a @ 1
[    0.189560] initcall mtrr_init_finialize+0x0/0x3a returned 0 after 0 usecs
[    8.994140] mtrr: type mismatch for e0000000,10000000 old: write-back new: write-combining
[    8.994154] Failed to add WC MTRR for [00000000e0000000-00000000efffffff]; performance may suffer.

Output of dmesg | grep -i mtr for the 4.3 kernel with PAT enabled (the default):
[    0.189368] calling  mtrr_if_init+0x0/0x5f @ 1
[    0.189370] initcall mtrr_if_init+0x0/0x5f returned 0 after 0 usecs
[    0.189478] pmd_set_huge: Cannot satisfy [mem 0xf8000000-0xf8200000] with a huge-page mapping due to MTRR override.
[    0.189814] calling  mtrr_init_finialize+0x0/0x3a @ 1
[    0.189815] initcall mtrr_init_finialize+0x0/0x3a returned 0 after 0 usecs
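
For reference, the check requested above (whether the "MTRR: Disabled" message was printed in early boot) could be scripted roughly as follows. This is only a sketch: the function name check_mtrr_disabled is hypothetical, it reads kernel log text from stdin, and it assumes the message appears in the log exactly as quoted in the pr_info() call.

```shell
# Hypothetical helper: reads kernel log text on stdin and reports
# whether the "MTRR: Disabled" message (as quoted above) is present.
check_mtrr_disabled() {
    if grep -q 'MTRR: Disabled'; then
        echo "MTRR reported as disabled"
    else
        echo "no MTRR: Disabled message found"
    fi
}
```

It would be used as: dmesg | check_mtrr_disabled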


I also checked my BIOS and found nothing about MTRR. My BIOS manual is at ftp://europe.asrock.com/Manual/H97%20Pro4.pdf. Can you see any option about MTRR there?

Question: If we assume your theory about MTRR/PAT is correct, wouldn't the lockup/hang or reboot happen every time the system goes through hibernate/resume? Can the theory explain why the first hibernate/resume cycles in rapid succession after system boot work, while the long ones fail somewhat more consistently?

Note: With PAT enabled the system boots up significantly faster.

Over the weekend I will return to 3.18-rc2 and try to verify that my bisection is correct. Second-guessing yourself is a terrible thing...

I will also try with nopat, run dmesg | grep -i mtr, and post the results.

Unless you have any other suggestions...

     Vassilis

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
