Message-ID: <54911702.2030809@vmware.com>
Date:	Wed, 17 Dec 2014 06:39:14 +0100
From:	Thomas Hellstrom <thellstrom@...are.com>
To:	<jongman.heo@...sung.com>
CC:	Peter Hurley <peter@...leysoftware.com>,
	Juergen Gross <jgross@...e.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [3.18+] Can't boot with commit bd809af1 ("x86: Enable PAT to
 use cache mode translation tables")

Thanks,

From what I understand, there is indeed a virtual processor bug. It's
fixed in hardware version 11, so that the PAT registers return the
correct value.

Thanks,
Thomas


On 12/17/2014 03:46 AM, Jongman Heo wrote:
> Hi,
>
> I'm using VMWare workstation, version 10.0.3 build-1895310, on Windows 7 64-bit.
> Guest is Fedora 21.
>
> ------- Original Message -------
> Sender : Thomas Hellstrom<thellstrom@...are.com>
> Date : 2014-12-17 00:12 (GMT+09:00)
> Title : Re: [3.18+] Can't boot with commit bd809af1 ("x86: Enable PAT to use cache mode translation tables")
>
> Jongman, what product (player, ws, esx) and version are you using?
>
> Thanks,
> Thomas
>
>
> On 12/16/2014 02:08 PM, Peter Hurley wrote:
>> VMware guys probably already know this but just in case
>>
>> [ +cc Thomas Hellstrom ]
>>
>> Jongman - you need to fix your mailer to use plaintext and not base64.
>>
>> On 12/16/2014 01:46 AM, Jongman Heo wrote:
>>>> Sender : Juergen Gross
>>>> On 12/16/2014 07:29 AM, Jongman Heo wrote:
>>>>>> Sender : Juergen Gross
>>>>>> On 12/16/2014 05:40 AM, Jongman Heo wrote:
>>>>>>>> Sender : Juergen Gross
>>>>>>>> On 12/15/2014 08:52 AM, Jongman Heo wrote:
>>>>>>>>>> Sender : Juergen Gross
>>>>>>>>>> On 12/14/2014 06:07 AM, Jongman Heo wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> My Linux virtual machine on (Windows) VMWare workstation 10 can't boot with following commit.
>>>>>>>>>>>
>>>>>>>>>>> commit bd809af16e3ab1f8d55b3e2928c47c67e2a865d2
>>>>>>>>>>> Author: Juergen Gross
>>>>>>>>>>> Date:   Mon Nov 3 14:02:03 2014 +0100
>>>>>>>>>>>
>>>>>>>>>>>         x86: Enable PAT to use cache mode translation tables
>>>>>>>>>>>
>>>>>>>>>>> Unfortunately I can't see any console log.
>>>>>>>>>> Hmm, weird. Could you provide some more information?
>>>>>>>>>>
>>>>>>>>>> Kernel config, hardware used, /proc/cpuinfo of working kernel?
>>>>>>>>>> Anything you see with earlyprintk enabled?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Juergen
>>>>>>>>> (Sorry for resending this email, previous one bounced from mailing list due to HTML format)
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm using Fedora 21, with custom built kernel.
>>>>>>>>> Host PC is windows 7 64-bit, and running VMWare workstation 10 for guest Fedora Linux.
>>>>>>>>>
>>>>>>>>> With earlyprintk, just following message is printed.
>>>>>>>>>
>>>>>>>>>      early console in setup code
>>>>>>>>>
>>>>>>>>> and nothing more...
>>>>>>>> Can you try the attached diagnostic patch, please? I suspect a problem
>>>>>>>> regarding VMware's PAT emulation...
>>>>>>>>
>>>>>>>>
>>>>>>>> Juergen
>>>>>>> Hi,
>>>>>>>
>>>>>>> With the commit reverted, the patch doesn't apply.
>>>>>> Sure.
>>>>>>
>>>>>>> Without revert, kernel (patch applied) doesn't boot and I can't see any message.
>>>>>> What are your kernel parameters? There must be some message with the
>>>>>> diagnostic patch, as the first pr_info() is called before any other
>>>>>> part of the critical patch is becoming active. Could it be you have
>>>>>> instructed the kernel to be "quiet"? I'd recommend:
>>>>>>
>>>>>> earlyprintk=vga ignore_loglevel
>>>>>>
>>>>>> and no quiet. I don't know VMWare settings, so maybe you can use
>>>>>> earlyprintk=ttyS0 instead of vga.
>>>>>>
>>>>>>> Let me show you my PAT values (the commit reverted)
>>>>>>>
>>>>>>> # dmesg | grep PAT
>>>>>>> [    0.000000] x86 PAT enabled: cpu 0, old 0x0, new 0x7010600070106
>>>>>>> [    0.314631] x86 PAT enabled: cpu 3, old 0x0, new 0x7010600070106
>>>>>>> [    0.314703] x86 PAT enabled: cpu 1, old 0x0, new 0x7010600070106
>>>>>>> [    0.314780] x86 PAT enabled: cpu 2, old 0x0, new 0x7010600070106
>>>>>>> [    0.314852] x86 PAT enabled: cpu 4, old 0x0, new 0x7010600070106
>>>>>>> [    0.314923] x86 PAT enabled: cpu 0, old 0x0, new 0x7010600070106
>>>>>>> [    0.314997] x86 PAT enabled: cpu 6, old 0x0, new 0x7010600070106
>>>>>>> [    0.315069] x86 PAT enabled: cpu 7, old 0x0, new 0x7010600070106
>>>>>>> [    0.315142] x86 PAT enabled: cpu 5, old 0x0, new 0x7010600070106
>>>>>> These are the expected values. But these values are the ones which are
>>>>>> written, not the ones which have been read from the PAT MSR again.
>>>>>>
>>>>>> Without applying the critical patch you could add:
>>>>>>
>>>>>> rdmsrl(MSR_IA32_CR_PAT, pat);
>>>>>> printk(KERN_INFO "PAT read: cpu %d, 0x%Lx\n", smp_processor_id(), pat);
>>>>>>
>>>>>> at the end of pat_init() to verify VMWare is handling reads of the PAT
>>>>>> MSR properly.
>>>>>>
>>>>>> Juergen
>>>>>>
>>>>> Hi,
>>>>>
>>>>> With earlyprintk=vga, I can see the log.
>>>>> But due to call trace, I can't see what the pat value is.
>>>>>
>>>>> Call chain is as follows.
>>>>>
>>>>>    i386_start_kernel -> start_kernel -> setup_arch ->
>>>>>    mtrr_bp_init -> get_mtrr_state -> pat_init ->
>>>>>    pat_init_cache_mode_entry -> update_cache_mode_entry ->
>>>>>    early_idt_handler -> dump_stack
>>>>>
>>>>> So, I blocked update_cache_mode_entry() call like below...
>>>>>
>>>>> --- a/arch/x86/mm/pat.c
>>>>> +++ b/arch/x86/mm/pat.c
>>>>> @@ -182,11 +182,12 @@ void pat_init_cache_modes(void)
>>>>>          u64 pat;
>>>>>   
>>>>>          rdmsrl(MSR_IA32_CR_PAT, pat);
>>>>> +       pr_info("read pat %0llx\n", pat);
>>>>>          pat_msg[32] = 0;
>>>>>          for (i = 7; i >= 0; i--) {
>>>>>                  cache = pat_get_cache_mode((pat >> (i * 8)) & 7,
>>>>>                                             pat_msg + 4 * i);
>>>>> -               update_cache_mode_entry(i, cache);
>>>>> +               //update_cache_mode_entry(i, cache);
>>>>>          }
>>>>>          pr_info("PAT configuration [0-7]: %s\n", pat_msg);
>>>>>   }
>>>>> @@ -238,9 +239,13 @@ void pat_init(void)
>>>>>                  rdmsrl(MSR_IA32_CR_PAT, boot_pat_state);
>>>>>   
>>>>>          wrmsrl(MSR_IA32_CR_PAT, pat);
>>>>> +       pr_info("about to write pat %0llx\n", pat);
>>>>>   
>>>>>          if (boot_cpu)
>>>>>                  pat_init_cache_modes();
>>>>> +
>>>>> +       rdmsrl(MSR_IA32_CR_PAT, pat);
>>>>> +       printk(KERN_INFO "PAT read: cpu %d, 0x%Lx\n", smp_processor_id(), pat);
>>>>>   }
>>>>>   
>>>>>
>>>>> Then boot is fine, and PAT values are as follows.
>>>>>
>>>>>
>>>>> # dmesg|grep -i "pat "
>>>>> [    0.000000] about to write pat 7010600070106
>>>>> [    0.000000] read pat 0
>>>>> [    0.000000] PAT configuration [0-7]: UC  UC  UC  UC  UC  UC  UC  UC
>>>>> [    0.000000] PAT read: cpu 0, 0x0
>>>>> [    0.320559] about to write pat 7010600070106
>>>>> [    0.320876] read pat 0
>>>>> [    0.321090] PAT configuration [0-7]: UC  UC  UC  UC  UC  UC  UC  UC
>>>>> [    0.321260] PAT read: cpu 5, 0x0
>>>>> [    0.321403] about to write pat 7010600070106
>>>>> [    0.321818] read pat 0
>>>>> [    0.322033] PAT configuration [0-7]: UC  UC  UC  UC  UC  UC  UC  UC
>>>>> [    0.322205] PAT read: cpu 6, 0x0
>>>>> [    0.322334] about to write pat 7010600070106
>>>>> [    0.322417] read pat 0
>>>>> [    0.322479] PAT configuration [0-7]: UC  UC  UC  UC  UC  UC  UC  UC
>>>>> [    0.322573] PAT read: cpu 0, 0x0
>>>>> [    0.322703] about to write pat 7010600070106
>>>>> [    0.323012] read pat 0
>>>>> [    0.323228] PAT configuration [0-7]: UC  UC  UC  UC  UC  UC  UC  UC
>>>>> [    0.323400] PAT read: cpu 1, 0x0
>>>>> [    0.323537] about to write pat 7010600070106
>>>>> [    0.323833] read pat 0
>>>>> [    0.324055] PAT configuration [0-7]: UC  UC  UC  UC  UC  UC  UC  UC
>>>>> [    0.324224] PAT read: cpu 7, 0x0
>>>>> [    0.324362] about to write pat 7010600070106
>>>>> [    0.324662] read pat 0
>>>>> [    0.324877] PAT configuration [0-7]: UC  UC  UC  UC  UC  UC  UC  UC
>>>>> [    0.325048] PAT read: cpu 2, 0x0
>>>>> [    0.325185] about to write pat 7010600070106
>>>>> [    0.325483] read pat 0
>>>>> [    0.325695] PAT configuration [0-7]: UC  UC  UC  UC  UC  UC  UC  UC
>>>>> [    0.325863] PAT read: cpu 4, 0x0
>>>>> [    0.325997] about to write pat 7010600070106
>>>>> [    0.326288] read pat 0
>>>>> [    0.326507] PAT configuration [0-7]: UC  UC  UC  UC  UC  UC  UC  UC
>>>>> [    0.326677] PAT read: cpu 3, 0x0
>>>> Okay, so VMWare doesn't seem to return the correct PAT MSR value.
>>>>
>>>> I suggest you try "nopat" as kernel option. This should disable all the
>>>> PAT handling and VMWare can't wreck the kernel this way.
>>>>
>>>> I'll write a patch which detects this VMWare bug by checking the PAT
>>>> value after writing it.
>>>>
>>>> Thanks for reporting that case,
>>>>
>>>>
>>>> Juergen
>>>>
>>>>
>>> OK, my VMWare works with "nopat" option.
>>>
>>> Thanks~
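
A note on the PAT values quoted in this thread: the PAT MSR holds eight entries, one per byte, and the low three bits of each byte select a memory type (0 = UC, 1 = WC, 4 = WT, 5 = WP, 6 = WB, 7 = UC-). The sketch below decodes a 64-bit PAT value the same way the loop in pat_init_cache_modes() walks it, and compares a written value with the value read back, which is the idea behind the check Juergen mentions. It is plain user-space C for illustration only. The names pat_type_name(), pat_describe() and pat_readback_ok() are hypothetical and are not the kernel's helpers, and the comparison is an assumption about what such a check might look like, not the actual patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Map the low 3 bits of one PAT entry to its memory-type name.
 * Encodings 2 and 3 are reserved. (Hypothetical helper name.) */
static const char *pat_type_name(unsigned type)
{
    switch (type & 7) {
    case 0: return "UC";
    case 1: return "WC";
    case 4: return "WT";
    case 5: return "WP";
    case 6: return "WB";
    case 7: return "UC-";
    default: return "??";
    }
}

/* Write the names of PAT entries 0..7, space separated, into buf.
 * Entry i lives in byte i of the 64-bit value. (Hypothetical helper.) */
static void pat_describe(uint64_t pat, char *buf, size_t len)
{
    size_t off = 0;

    if (len == 0)
        return;
    buf[0] = '\0';
    for (int i = 0; i < 8; i++) {
        int n = snprintf(buf + off, len - off, "%s%s",
                         i ? " " : "",
                         pat_type_name((unsigned)(pat >> (i * 8))));
        if (n < 0 || (size_t)n >= len - off)
            break;              /* buffer too small: stop early */
        off += (size_t)n;
    }
}

/* Illustrative read-back check: returns 1 when the value read from the
 * PAT MSR equals the value that was written, 0 otherwise. In this thread
 * the guest wrote 0x7010600070106 and read back 0, so it would return 0. */
static int pat_readback_ok(uint64_t written, uint64_t readback)
{
    return written == readback;
}
```

Calling pat_describe() with the written value 0x7010600070106 gives "WB WC UC- UC WB WC UC- UC", while the read-back value 0 decodes to eight UC entries, matching the "PAT configuration [0-7]" lines in the log above. The mismatch is what pat_readback_ok() flags.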
