linux-kernel - Re: [PATCH v4] x86/power/64: Fix kernel text mapping corruption during image restoration

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALCETrU38QiDYBpZKJOjezsQ==ON=H3SCUhAVbBJuFq2MGwsZA@mail.gmail.com>
Date:	Thu, 30 Jun 2016 08:24:02 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	"Rafael J. Wysocki" <rafael@...nel.org>
Cc:	Borislav Petkov <bp@...en8.de>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Logan Gunthorpe <logang@...tatee.com>,
	Kees Cook <keescook@...omium.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	lkml <linux-kernel@...r.kernel.org>,
	"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
	Andy Lutomirski <luto@...nel.org>,
	Brian Gerst <brgerst@...il.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Linux PM list <linux-pm@...r.kernel.org>,
	Stephen Smalley <sds@...ho.nsa.gov>
Subject: Re: [PATCH v4] x86/power/64: Fix kernel text mapping corruption
 during image restoration

On Thu, Jun 30, 2016 at 8:17 AM, Rafael J. Wysocki <rafael@...nel.org> wrote:
> On Thu, Jun 30, 2016 at 5:05 PM, Borislav Petkov <bp@...en8.de> wrote:
>> On Thu, Jun 30, 2016 at 03:17:20PM +0200, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>>>
>>> Logan Gunthorpe reports that hibernation stopped working reliably for
>>> him after commit ab76f7b4ab23 (x86/mm: Set NX on gap between __ex_table
>>> and rodata).
>>
>> ...
>>
>>> +static int relocate_restore_code(void)
>>> +{
>>> +     pgd_t *pgd;
>>> +     pud_t *pud;
>>> +
>>> +     relocated_restore_code = get_safe_page(GFP_ATOMIC);
>>> +     if (!relocated_restore_code)
>>> +             return -ENOMEM;
>>> +
>>> +     memcpy((void *)relocated_restore_code, &core_restore_code, PAGE_SIZE);
>>> +
>>> +     /* Make the page containing the relocated code executable */
>>> +     pgd = (pgd_t *)__va(read_cr3()) + pgd_index(relocated_restore_code);
>>> +     pud = pud_offset(pgd, relocated_restore_code);
>>> +     if (pud_large(*pud)) {
>>> +             set_pud(pud, __pud(pud_val(*pud) & ~_PAGE_NX));
>>> +     } else {
>>> +             pmd_t *pmd = pmd_offset(pud, relocated_restore_code);
>>> +
>>> +             if (pmd_large(*pmd)) {
>>> +                     set_pmd(pmd, __pmd(pmd_val(*pmd) & ~_PAGE_NX));
>>> +             } else {
>>> +                     pte_t *pte = pte_offset_kernel(pmd, relocated_restore_code);
>>> +
>>> +                     set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_NX));
>>> +             }
>>> +     }
>>> +     flush_tlb_all();
>>
>> I know you want to flush TLBs but this causes the splat below on the
>> resume kernel.
>>
>> Most likely because:
>>
>> resume_target_kernel() does local_irq_disable() and then
>>
>> swsusp_arch_resume() -> relocate_restore_code() -> flush_tlb_all()
>>
>> and smp_call_function_many() doesn't like it when IRQs are disabled.
>
> Right.
>
> Can I invoke __flush_tlb_all() from here to flush the TLB on the local
> CPU?  That should be sufficient IMO.

Yes, unless there's another CPU already up at the time, but that seems unlikely.

FWIW, the CPU can and will cache the NX bit and the CPU will *not*
refresh the TLB before sending a page fault.  You definitely need to
flush.  I wonder if this is also why you're seeing a certain amount of
random corruption.

--Andy