linux-kernel - Re: qemu sparc64 runtime crashes in -next

Message-ID: <b46961b3-0b69-7f0f-8d24-402c7c74ac69@oracle.com>
Date:   Wed, 14 Jun 2017 16:53:18 -0400
From:   Pasha Tatashin <pasha.tatashin@...cle.com>
To:     Guenter Roeck <linux@...ck-us.net>,
        David Miller <davem@...emloft.net>
Cc:     linux-kernel@...r.kernel.org, bob.picco@...cle.com,
        steven.sistare@...cle.com
Subject: Re: qemu sparc64 runtime crashes in -next

I think I know the problem and am working on a fix. I will send it out soon.

Thank you,
Pasha

On 06/14/2017 04:42 PM, Guenter Roeck wrote:
> On Wed, Jun 14, 2017 at 03:31:08PM -0400, David Miller wrote:
>> From: Guenter Roeck <linux@...ck-us.net>
>> Date: Wed, 14 Jun 2017 03:13:54 -0700
>>
>>> Hi,
>>>
>>> my sparc qemu tests started failing with next-20170613.
>>> Log output is not very helpful:
>>>
>>> Unhandled Exception 0x0000000000000028
>>> PC = 0x00000000004620f4 NPC = 0x00000000004620f8
>>> Stopping execution
>>>
>>> It looks like 0x00000000004620f4 is in init_tick_ops().
>>>
>>> Bisect points to commit 'sparc64: improve modularity tick options'.
>>> Bisect log is attached.
>>>
>>> No idea if this is a qemu problem. If you think it is, anything to
>>> help tracking it down would be appreciated.
>>
>> Pavel, please look into this.
>>
>> It looks weird that the commit it bisects to would cause a problem.
>> Maybe the change from __read_mostly to __cacheline_aligned causes the
>> issue?
>>
>> Really weird...
> 
> Turns out tick_get_frequency() returns 0. The value is used as divisor
> in clocksource_hz2mult().
> 
> Looking into it further, clock_tick is initialized much later.
> 
> [    0.000000] clock_tick is 0
> 	-> tick_get_frequency()
> [    0.039361] PROMLIB: Sun IEEE Boot Prom 'OBP 3.10.24 1999/01/01 01:01'
> [    0.041646] PROMLIB: Root node compatible: sun4u
> [    0.060500] Linux version 4.12.0-rc5-next-20170614+ (groeck@...s) (gcc version 4.6.3 (GCC) ) #5 SMP Wed Jun 14 13:40:01 PDT 2017
> [    0.893475] bootconsole [earlyprom0] enabled
> [    0.958658] ARCH: SUN4U
> [    1.265007] Ethernet address: 52:54:00:12:34:56
> [    1.340458] MM: PAGE_OFFSET is 0xfffff80000000000 (max_phys_bits == 40)
> [    1.405302] MM: VMALLOC [0x0000000100000000 --> 0x0000060000000000]
> [    1.468992] MM: VMEMMAP [0x0000060000000000 --> 0x00000c0000000000]
> [    3.349070] Kernel: Using 5 locked TLB entries for main kernel image.
> [    3.422093] Remapping the kernel...
> [    4.342159] done.
> [  136.231664] OF stdout device is: /pci@1fe,0/ebus@...u
> [  136.298896] PROM: Built device tree with 60466 bytes of memory.
> [  136.458520] Top of RAM: 0x1fe80000, Total RAM: 0x1fe80000
> [  136.520487] Memory hole size: 0MB
> [  143.705871] Allocated 16384 bytes for kernel page tables.
> [  143.972916] Zone ranges:
> [  144.039046]   Normal   [mem 0x0000000000000000-0x000000001fe7ffff]
> [  144.118654] Movable zone start for each node
> [  144.180797] Early memory node ranges
> [  144.240870]   node   0: [mem 0x0000000000000000-0x000000001fe7ffff]
> [  144.333686] Initmem setup node 0 [mem 0x0000000000000000-0x000000001fe7ffff]
> [  144.943918] Booting Linux...
> [  145.010966] CPU CAPS: [flush,stbar,swap,muldiv,v9,mul32,div32,v8plus]
> [  145.082225] CPU CAPS: [vis]
> [  145.581394] percpu: Embedded 12 pages/cpu @fffff8001f800000 s57024 r8192 d33088 u4194304
> [  145.949412] ###################### fill_in_one_cpu(): CPU 0 clock tick set to 100000000
> 
> That doesn't really take 145 seconds, though :-).
> 
> Guenter
>
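
For illustration, the failure mode Guenter describes can be sketched in
userspace C. This is a minimal sketch, not the kernel source: hz2mult()
mirrors the arithmetic of clocksource_hz2mult() with plain 64-bit division
standing in for do_div(), and hz2mult_checked() is a hypothetical guard
added only for this example. With hz == 0 (clock_tick not yet initialized)
the division has a zero divisor, which would be consistent with the 0x28
exception (division_by_zero in the SPARC V9 trap table).

```c
#include <stdint.h>

/*
 * Userspace sketch of the arithmetic in clocksource_hz2mult():
 *   mult = ((NSEC_PER_SEC << shift) + hz / 2) / hz
 * Plain 64-bit division stands in for the kernel's do_div().
 */
static uint32_t hz2mult(uint32_t hz, uint32_t shift)
{
	uint64_t tmp = (uint64_t)1000000000 << shift;

	tmp += hz / 2;	/* round to nearest */
	tmp /= hz;	/* divides by zero if hz == 0 */
	return (uint32_t)tmp;
}

/*
 * Hypothetical guarded wrapper (not part of the kernel): refuse a zero
 * frequency instead of dividing by it.
 * Returns 0 on success and stores the result in *mult, -1 if hz is 0.
 */
static int hz2mult_checked(uint32_t hz, uint32_t shift, uint32_t *mult)
{
	if (hz == 0)
		return -1;

	*mult = hz2mult(hz, shift);
	return 0;
}
```

With the 100000000 Hz value that fill_in_one_cpu() reports in the log above
and an illustrative shift of 10, hz2mult_checked() yields a multiplier of
10240; with hz == 0 it returns -1 rather than faulting.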