lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 18 Aug 2017 11:15:13 +0100
From:   Julien Grall <julien.grall@....com>
To:     Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        xen-devel@...ts.xen.org
Cc:     jgross@...e.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] xen/events: events_fifo: Don't use {get,put}_cpu() in
 xen_evtchn_fifo_init()

Hi Boris,

On 17/08/17 18:36, Boris Ostrovsky wrote:
> On 08/17/2017 12:14 PM, Julien Grall wrote:
>> When booting Linux as Xen guest with CONFIG_DEBUG_ATOMIC, the following
>> splat appears:
>>
>> [    0.002323] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
>> [    0.019717] ASID allocator initialised with 65536 entries
>> [    0.020019] xen:grant_table: Grant tables using version 1 layout
>> [    0.020051] Grant table initialized
>> [    0.020069] BUG: sleeping function called from invalid context at /data/src/linux/mm/page_alloc.c:4046
>> [    0.020100] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
>> [    0.020123] no locks held by swapper/0/1.
>> [    0.020143] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc5 #598
>> [    0.020166] Hardware name: FVP Base (DT)
>> [    0.020182] Call trace:
>> [    0.020199] [<ffff00000808a5c0>] dump_backtrace+0x0/0x270
>> [    0.020222] [<ffff00000808a95c>] show_stack+0x24/0x30
>> [    0.020244] [<ffff000008c1ef20>] dump_stack+0xb8/0xf0
>> [    0.020267] [<ffff0000081128c0>] ___might_sleep+0x1c8/0x1f8
>> [    0.020291] [<ffff000008112948>] __might_sleep+0x58/0x90
>> [    0.020313] [<ffff0000082171b8>] __alloc_pages_nodemask+0x1c0/0x12e8
>> [    0.020338] [<ffff00000827a110>] alloc_page_interleave+0x38/0x88
>> [    0.020363] [<ffff00000827a904>] alloc_pages_current+0xdc/0xf0
>> [    0.020387] [<ffff000008211f38>] __get_free_pages+0x28/0x50
>> [    0.020411] [<ffff0000086566a4>] evtchn_fifo_alloc_control_block+0x2c/0xa0
>> [    0.020437] [<ffff0000091747b0>] xen_evtchn_fifo_init+0x38/0xb4
>> [    0.020461] [<ffff0000091746c0>] xen_init_IRQ+0x44/0xc8
>> [    0.020484] [<ffff000009128adc>] xen_guest_init+0x250/0x300
>> [    0.020507] [<ffff000008083974>] do_one_initcall+0x44/0x130
>> [    0.020531] [<ffff000009120df8>] kernel_init_freeable+0x120/0x288
>> [    0.020556] [<ffff000008c31ca8>] kernel_init+0x18/0x110
>> [    0.020578] [<ffff000008083710>] ret_from_fork+0x10/0x40
>> [    0.020606] xen:events: Using FIFO-based ABI
>> [    0.020658] Xen: initializing cpu0
>> [    0.027727] Hierarchical SRCU implementation.
>> [    0.036235] EFI services will not be available.
>> [    0.043810] smp: Bringing up secondary CPUs ...
>>
>> This is because get_cpu() in xen_evtchn_fifo_init() will disable
>> preemption, but __get_free_page() might sleep (GFP_ATOMIC is not set).
>>
>> xen_evtchn_fifo_init() will always be called before SMP is initialized,
>> so {get,put}_cpu() could be replaced by a simple smp_processor_id().
>
> On x86 this will be called out of init_IRQ(), which is already preceded
> by preempt_disable().

Well the main problem is preempt_disable() itself. in_atomic() will 
check preempt_count and return 1 if it is non-zero.

__get_free_page might sleep if GFP_ATOMIC is not set and therefore you 
will see the splat when CONFIG_DEBUG_ATOMIC is enabled. However, those 
checks don't happen before the scheduler is setup. Hence why you don't 
see the error on x86.

Cheers,

>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@...cle.com>
>

-- 
Julien Grall

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ