Message-ID: <CALCETrUgb8frLsmaqAEopsf1O-2io7wGvTO1BLFJq8wjtb+G5Q@mail.gmail.com>
Date:   Wed, 6 Sep 2017 15:26:19 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Jiri Kosina <jikos@...nel.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Ingo Molnar <mingo@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andy Lutomirski <luto@...nel.org>,
        Borislav Petkov <bp@...en8.de>
Subject: Re: [GIT PULL] x86/mm changes for v4.14: PCID support, 5-level paging
 support, Secure Memory Encryption support

On Wed, Sep 6, 2017 at 2:16 PM, Jiri Kosina <jikos@...nel.org> wrote:
> On Wed, 6 Sep 2017, Jiri Kosina wrote:
>
>> This is a "me too", observed on my Lenovo thinkpad x270 (so it's not
>> specific to that XPS 13 system at all).
>>
>> The symptom I observe is that an attempt to resume from hibernation
>> proceeds up to reading 100% of the hibernation image, and then reboot
>> happens (IOW looks like triple fault).
>>
>> nopcid cures it, I haven't tried to revert 10af6235e0d3 yet, but looks
>> like it's the same thing.
>
> [ reposting the information again with LKML re-introduced to CC ]
>
> As suggested by Andy off-list, I tested with this change to always force
> ASID 0
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 5ca71d1..c3b0811 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -35,7 +35,7 @@ static void choose_new_asid(struct mm_struct *next, u64 next_tlb_gen,
>  {
>         u16 asid;
>
> -       if (!static_cpu_has(X86_FEATURE_PCID)) {
> +       if (true || !static_cpu_has(X86_FEATURE_PCID)) {
>                 *new_asid = 0;
>                 *need_flush = true;
>                 return;
>
> and that fixes the issue on my system.


I got Linus' config to boot.  The problem was that I ended up with a
root-owned file (not sure which) in my tree that caused an incorrect
build but didn't generate errors.  I don't know how this happened, but
an ill-timed sudo make -j4 modules_install install was probably
involved.  git clean -ffxxxd did *not* fix it or even notice it in
any obvious way.

Anyway, the problem appears to depend on kernel config because it's
dying here on resume on secondary cpus:

    VM_BUG_ON(__read_cr3() != (__sme_pa(real_prev->pgd) | prev_asid));

in switch_mm_irqs_off().
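
For readers following along, here is a minimal sketch of what that
assertion compares.  It is an illustration only, not the kernel code:
the helper names are hypothetical, and it ignores the SME encryption
mask that __sme_pa() applies.  With PCID enabled, the low 12 bits of
CR3 carry the PCID and the rest is the page-table base, so a CR3 value
restored without the tag bits only matches when the expected ASID is 0.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the CR3 value the check expects for a given
 * page-table physical address and ASID.  With CR4.PCIDE set, CR3
 * bits 11:0 hold the PCID, so the ASID is simply ORed into the
 * page-aligned address (SME mask ignored in this sketch).
 */
static uint64_t expected_cr3(uint64_t pgd_pa, uint16_t asid)
{
    return pgd_pa | asid;
}

/*
 * Hypothetical helper mirroring the VM_BUG_ON condition:
 * returns true when the hardware CR3 disagrees with the expected value,
 * i.e. when the assertion would fire.
 */
static bool cr3_check_fails(uint64_t hw_cr3, uint64_t pgd_pa, uint16_t asid)
{
    return hw_cr3 != expected_cr3(pgd_pa, asid);
}
```

For example, a CR3 that holds only the page-table address fails the
check when the tracked ASID is 3 and passes when it is 0.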

What seems to be going on is that the wakeup CPU restores its original
state exactly.  All other CPUs restore swapper_pg_dir but fail to
restore the PCID tag bits, which trips the assertion with probability
5/6 per non-boot CPU.  So, if you have that debug option set, you die
with probability 1 - (1/6)^(cpus - 1), which is pretty large.

I'll come up with a clean fix this evening, I hope.
