lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 22 Apr 2022 12:03:50 -0700
From:   Elliot Berman <quic_eberman@...cinc.com>
To:     Will Deacon <will@...nel.org>, Juergen Gross <jgross@...e.com>
CC:     "Srivatsa S. Bhat (VMware)" <srivatsa@...il.mit.edu>,
        Alexey Makhalov <amakhalov@...are.com>,
        Catalin Marinas <catalin.marinas@....com>,
        "Prakruthi Deepak Heragu" <quic_pheragu@...cinc.com>,
        <virtualization@...ts.linux-foundation.org>, <x86@...nel.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        <linux-kernel@...r.kernel.org>,
        Murali Nalajala <quic_mnalajal@...cinc.com>
Subject: Re: [PATCH] arm64: paravirt: Disable IRQs during
 stolen_time_cpu_down_prepare



On 4/21/2022 2:10 AM, Will Deacon wrote:
> On Thu, Apr 21, 2022 at 09:44:28AM +0200, Juergen Gross wrote:
>> On 20.04.22 22:44, Elliot Berman wrote:
>>> From: Prakruthi Deepak Heragu <quic_pheragu@...cinc.com>
>>>
>>> During hotplug, the stolen time data structure is unmapped and memset.
>>> There is a possibility of the timer IRQ being triggered before memset
>>> and stolen time is getting updated as part of this timer IRQ handler. This
>>> causes the below crash in timer handler -
>>>
>>>     [ 3457.473139][    C5] Unable to handle kernel paging request at virtual address ffffffc03df05148
>>>     ...
>>>     [ 3458.154398][    C5] Call trace:
>>>     [ 3458.157648][    C5]  para_steal_clock+0x30/0x50
>>>     [ 3458.162319][    C5]  irqtime_account_process_tick+0x30/0x194
>>>     [ 3458.168148][    C5]  account_process_tick+0x3c/0x280
>>>     [ 3458.173274][    C5]  update_process_times+0x5c/0xf4
>>>     [ 3458.178311][    C5]  tick_sched_timer+0x180/0x384
>>>     [ 3458.183164][    C5]  __run_hrtimer+0x160/0x57c
>>>     [ 3458.187744][    C5]  hrtimer_interrupt+0x258/0x684
>>>     [ 3458.192698][    C5]  arch_timer_handler_virt+0x5c/0xa0
>>>     [ 3458.198002][    C5]  handle_percpu_devid_irq+0xdc/0x414
>>>     [ 3458.203385][    C5]  handle_domain_irq+0xa8/0x168
>>>     [ 3458.208241][    C5]  gic_handle_irq.34493+0x54/0x244
>>>     [ 3458.213359][    C5]  call_on_irq_stack+0x40/0x70
>>>     [ 3458.218125][    C5]  do_interrupt_handler+0x60/0x9c
>>>     [ 3458.223156][    C5]  el1_interrupt+0x34/0x64
>>>     [ 3458.227560][    C5]  el1h_64_irq_handler+0x1c/0x2c
>>>     [ 3458.232503][    C5]  el1h_64_irq+0x7c/0x80
>>>     [ 3458.236736][    C5]  free_vmap_area_noflush+0x108/0x39c
>>>     [ 3458.242126][    C5]  remove_vm_area+0xbc/0x118
>>>     [ 3458.246714][    C5]  vm_remove_mappings+0x48/0x2a4
>>>     [ 3458.251656][    C5]  __vunmap+0x154/0x278
>>>     [ 3458.255796][    C5]  stolen_time_cpu_down_prepare+0xc0/0xd8
>>>     [ 3458.261542][    C5]  cpuhp_invoke_callback+0x248/0xc34
>>>     [ 3458.266842][    C5]  cpuhp_thread_fun+0x1c4/0x248
>>>     [ 3458.271696][    C5]  smpboot_thread_fn+0x1b0/0x400
>>>     [ 3458.276638][    C5]  kthread+0x17c/0x1e0
>>>     [ 3458.280691][    C5]  ret_from_fork+0x10/0x20
>>>
>>> As a fix, disable the IRQs during hotplug until we unmap and memset the
>>> stolen time structure.
>>
>> This will work for the call chain of your observed crash, but are
>> you sure that para_steal_clock() can't be called from another cpu
>> concurrently?
> 
> Agreed, this needs checking as update_rq_clock() is called from all over the
> place.
> 
>> In case you verified this can't happen, you can add my:
>>
>> Reviewed-by: Juergen Gross <jgross@...e.com>
>>
>> Otherwise you either need to use RCU for doing the memunmap(), or a
>> lock to protect stolen_time_region.
> 
> Yes, I think RCU would make a lot of sense here, deferring the memunmap
> until there are no longer any readers. The reader is currently doing:
> 
> 	if (!reg->kaddr)
> 		return 0;
> 
> 	return le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time));
> 
> so we'd also want an rcu_dereference() on reg->kaddr to avoid reading it
> twice.
> 
> Will


We have seen this particular stack many times in our testing, but not 
any other way. Will's comments are agreeable, I'll send out an updated 
patch.

Thanks,
Elliot

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ