linux-kernel - Re: [PATCH RFC] x86: check for and defend against BIOS memory corruption

Message-ID: <48B7A377.8010205@goop.org>
Date:	Fri, 29 Aug 2008 00:21:27 -0700
From:	Jeremy Fitzhardinge <jeremy@...p.org>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Rafał Miłecki <zajec5@...il.com>,
	Alan Jenkins <alan-jenkins@...fmail.co.uk>,
	Hugh Dickens <hugh@...itas.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] x86: check for and defend against BIOS memory corruption

Ingo Molnar wrote:
> * Rafał Miłecki <zajec5@...il.com> wrote:
>
>   
>> 2008/8/28 Jeremy Fitzhardinge <jeremy@...p.org>:
>>     
>>> Some BIOSes have been observed to corrupt memory in the low 64k.  This
>>> patch does three things:
>>>  - Reserves all memory which does not have to be in that area, to
>>>   prevent it from being used as general memory by the kernel.  Things
>>>   like the SMP trampoline are still in that memory, however.
>>>  - Clears the reserved memory so we can observe changes to it.
>>>  - Adds a function check_for_bios_corruption() which checks and reports on
>>>   memory becoming unexpectedly non-zero.  Currently it's called in the
>>>   x86 fault handler, and the power management debug output.
>>>
>>> RFC: What other places should we check for corruption in?
>>>
>>> [ Alan, Rafał: could you check you see:
>>>   1: corruption messages
>>>   2: no crashes
>>>  Thanks -J
>>> ]
>>>       
>> I was trying my best to crash system with this patch applied and failed :)
>>
>> Works great.
>>
>> Just wonder if I should expect any printk from
>> check_for_bios_corruption? I do not see any:
>>
>> zajec@...y:~> dmesg | grep -i corr
>> scanning 2 areas for BIOS corruption
>>     
>
> that's _very_ weird.
>   

No, it's expected.  Rafał only got corruption when plugging in his HDMI
cable, and I didn't put any corruption checks on that path (I'm not even
sure what kernel code would get executed in that case).  Hugh's original
patch put a check in the hot path of the fault handler, so it would
get called regularly, but I put it in the kernel-bug path, which is
fairly pointless given that we expect this patch to prevent the crashes.

It does, however, do the check on pm state changes, so doing a
suspend should make it report any corruption it has found.  Alan's
case would be a better test for that though.

It does raise the question of where the good places to put the check
are.  The call site shouldn't be too hot, given that each check scans
~64k of memory, but it has to run often enough to actually show
something.  I was thinking of putting some calls in the acpi code
itself, but got, erm, discouraged.

Maybe hooking into a sysrq key would be useful (sysrq-m?).
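
For anyone following along, the kind of check being placed here can be
sketched in user-space C roughly as below.  This is only an illustration
under stated assumptions, not the code from the patch: guard_init(),
guard_check() and SCAN_BYTES are hypothetical names, and an ordinary
array stands in for the reserved low memory.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumption: size of the guarded region, matching the "low 64k" above. */
#define SCAN_BYTES (64 * 1024)

/* Zero the guarded area so any later non-zero word shows a stray write. */
static void guard_init(unsigned long *area, size_t bytes)
{
	memset(area, 0, bytes);
}

/*
 * Walk the area one word at a time.  Every non-zero word is reported
 * and then cleared again, so the next scan only shows new corruption.
 * Returns the number of corrupted words found in this pass.
 */
static size_t guard_check(unsigned long *area, size_t bytes)
{
	size_t i, bad = 0;

	for (i = 0; i < bytes / sizeof(*area); i++) {
		if (area[i] == 0)
			continue;
		printf("corrupt word at offset 0x%zx: 0x%lx\n",
		       i * sizeof(*area), area[i]);
		area[i] = 0;
		bad++;
	}
	return bad;
}
```

Each call touches the whole region, which is why the call site can't be
a hot path; a caller would run guard_init() once and then guard_check()
from whichever hooks end up being chosen.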

> maybe the BIOS expects _zeroes_ somewhere? Do you suddenly see crashes 
> if you change this line in Jeremy's patch:
>
> +               memset(__va(addr), 0, size);
>
> to something like:
>
> +               memset(__va(addr), 0x55, size);
>
> If this does not tickle any messages either, then maybe the problem is 
> in the identity of the entities we allocate in the first 64K. Is there a 
> list of allocations that go there when Jeremy's patch is not applied?
>
> but ... i think with an earlier patch you saw corruption, right? 
> Far-fetched idea: maybe it's some CPU erratum during suspend/resume that 
> corrupts pagetables if the pagetables are allocated in the first 64K of 
> RAM? In that case we should use a bootmem allocation for pagetables that 
> give a minimum address of 64K.
>   

Rafał's corruption was definitely non-zero.  I think the corruption is
still happening; it's just not being reported.
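
As a rough illustration of the 0x55 experiment suggested above: a zero
fill cannot detect something writing zeroes into the area, while a
non-zero fill pattern can.  The sketch below is user-space C with
hypothetical helper names (pattern_fill(), pattern_mismatches()); it is
not part of the patch and only assumes a byte-wise compare against the
fill value.

```c
#include <stddef.h>
#include <string.h>

/* Fill the guarded area with a known byte pattern (0 or e.g. 0x55). */
static void pattern_fill(unsigned char *area, size_t bytes,
			 unsigned char pattern)
{
	memset(area, pattern, bytes);
}

/* Count the bytes that no longer match the pattern they were filled with. */
static size_t pattern_mismatches(const unsigned char *area, size_t bytes,
				 unsigned char pattern)
{
	size_t i, bad = 0;

	for (i = 0; i < bytes; i++)
		if (area[i] != pattern)
			bad++;
	return bad;
}
```

With a zero fill, a stray write of zeroes leaves the mismatch count at
0; with a 0x55 fill the same write shows up as one mismatch per byte.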

    J