lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180226084700.vpjavqk6e4xfyapa@gmail.com>
Date:   Mon, 26 Feb 2018 09:47:00 +0100
From:   Ingo Molnar <mingo@...nel.org>
To:     "Kirill A. Shutemov" <kirill@...temov.name>
Cc:     Borislav Petkov <bp@...e.de>, tglx@...utronix.de,
        torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
        willy@...radead.org, hpa@...or.com,
        kirill.shutemov@...ux.intel.com, peterz@...radead.org,
        gorcunov@...nvz.org, luto@...capital.net,
        linux-tip-commits@...r.kernel.org
Subject: Re: [tip:x86/boot] x86/boot/compressed/64: Prepare trampoline memory


* Kirill A. Shutemov <kirill@...temov.name> wrote:

> > Also, could do a puts() hexdump of the affected memory area _before_ we overwrite 
> > it? Is it empty? Could we add some debug warning that checks that it's all zeroes?
> 
> The problem is that we don't really have a way get a message out of there.

Is there any memory area we can write to that survives into the real kernel?

> But without a way to print address it's still a black box.

> > 1)
> > 
> >         /* Check if LA57 is desired and supported */
> >         if (IS_ENABLED(CONFIG_X86_5LEVEL) && native_cpuid_eax(0) >= 7 &&
> >                         (native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
> >                 paging_config.l5_required = 1;
> > 
> > ... it isn't explained why this feature CPU check is so complex.
> 
> We check that the CPUID leaf is supported and than check the feature
> itself.
> 
> Maybe the first check is redundant, but I tried to be safe here.

So the explanation that is missing is to explain which usual primitive this is 
equivalent to. This is essentially an early boot variant of 
boot_cpu_has(X86_FEATURE_LA57), right?

> > 2)
> > 
> > +       /* Place the trampoline just below the end of low memory, aligned to 4k */
> > +       paging_config.trampoline_start = bios_start - TRAMPOLINE_32BIT_SIZE;
> > +       paging_config.trampoline_start = round_down(paging_config.trampoline_start, PAGE_SIZE);
> > 
> > placing trampolines just below or just above BIOS images is fragile. Instead a 
> > better heuristic is to use the "middle" of suspected available RAM and work from 
> > there.
> 
> It's not obvious what is lower end of available memory here. Any hints?
> 
> Realtime trampoline is allocated with top-down approach and I tried to
> mimic the approach here.

Yeah, it's all a bit weird - I'm grasping at straws really - and the large size of 
the patch doesn't help.

> > +       /* Clear trampoline memory first */
> > +       memset(trampoline, 0, TRAMPOLINE_32BIT_SIZE);
> > 
> > Memory bootup state is typically all zeroes (except maybe for kexec), so this 
> > should check that what it's clearing doesn't contain any data.
> 
> Hm. I don't see why would we expect this. Do we really have guarantee that
> bootloader would not mess with the memory?

Hm, probably not - what's the typical memory usage of GRUB for example?

> > It should probably also clear this memory _after_ use.
> 
> After use I tired to restore the original content of the memory.
> See cleanup_trampoline(). That looks safer to me.

Yeah, agreed.

BTW., it's still a possibility that we are barking up the wrong tree here: if the 
bug Boris triggered is somehow not fully deterministic (but say kernel build 
alignment dependent - which alignment your commit changed) then this might be the 
wrong commit. A split-up into several patches would help there too: for example if 
the new data structures in the kernel image trigger the bug (with nothing else in 
that commit) then it strongly suggests that it's alignment related, etc.

But it certainly 'feels' to have a chance to be related: stomping on the wrong 
piece of memory can make graphics fail.

> > Why '+ _PAGE_TABLE_NOENC', while not ' |' ?
> 
> It shouldn't really matter, but yeah, '|' is more appropriate.

Readability and using standard patterns is king.

Thanks,

	Ingo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ