linux-kernel - Re: [PATCH] arm64: paravirt: Disable IRQs during stolen_time_cpu_down

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <0737998d-3006-5866-5ea1-dcfef0cfeb32@suse.com>
Date:   Thu, 21 Apr 2022 09:44:28 +0200
From:   Juergen Gross <jgross@...e.com>
To:     Elliot Berman <quic_eberman@...cinc.com>,
        "Srivatsa S. Bhat (VMware)" <srivatsa@...il.mit.edu>,
        Alexey Makhalov <amakhalov@...are.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>
Cc:     Prakruthi Deepak Heragu <quic_pheragu@...cinc.com>,
        virtualization@...ts.linux-foundation.org, x86@...nel.org,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Murali Nalajala <quic_mnalajal@...cinc.com>
Subject: Re: [PATCH] arm64: paravirt: Disable IRQs during
 stolen_time_cpu_down_prepare

On 20.04.22 22:44, Elliot Berman wrote:
> From: Prakruthi Deepak Heragu <quic_pheragu@...cinc.com>
> 
> During hotplug, the stolen time data structure is unmapped and memset.
> There is a possibility of the timer IRQ being triggered before memset
> and stolen time is getting updated as part of this timer IRQ handler. This
> causes the below crash in timer handler -
> 
>    [ 3457.473139][    C5] Unable to handle kernel paging request at virtual address ffffffc03df05148
>    ...
>    [ 3458.154398][    C5] Call trace:
>    [ 3458.157648][    C5]  para_steal_clock+0x30/0x50
>    [ 3458.162319][    C5]  irqtime_account_process_tick+0x30/0x194
>    [ 3458.168148][    C5]  account_process_tick+0x3c/0x280
>    [ 3458.173274][    C5]  update_process_times+0x5c/0xf4
>    [ 3458.178311][    C5]  tick_sched_timer+0x180/0x384
>    [ 3458.183164][    C5]  __run_hrtimer+0x160/0x57c
>    [ 3458.187744][    C5]  hrtimer_interrupt+0x258/0x684
>    [ 3458.192698][    C5]  arch_timer_handler_virt+0x5c/0xa0
>    [ 3458.198002][    C5]  handle_percpu_devid_irq+0xdc/0x414
>    [ 3458.203385][    C5]  handle_domain_irq+0xa8/0x168
>    [ 3458.208241][    C5]  gic_handle_irq.34493+0x54/0x244
>    [ 3458.213359][    C5]  call_on_irq_stack+0x40/0x70
>    [ 3458.218125][    C5]  do_interrupt_handler+0x60/0x9c
>    [ 3458.223156][    C5]  el1_interrupt+0x34/0x64
>    [ 3458.227560][    C5]  el1h_64_irq_handler+0x1c/0x2c
>    [ 3458.232503][    C5]  el1h_64_irq+0x7c/0x80
>    [ 3458.236736][    C5]  free_vmap_area_noflush+0x108/0x39c
>    [ 3458.242126][    C5]  remove_vm_area+0xbc/0x118
>    [ 3458.246714][    C5]  vm_remove_mappings+0x48/0x2a4
>    [ 3458.251656][    C5]  __vunmap+0x154/0x278
>    [ 3458.255796][    C5]  stolen_time_cpu_down_prepare+0xc0/0xd8
>    [ 3458.261542][    C5]  cpuhp_invoke_callback+0x248/0xc34
>    [ 3458.266842][    C5]  cpuhp_thread_fun+0x1c4/0x248
>    [ 3458.271696][    C5]  smpboot_thread_fn+0x1b0/0x400
>    [ 3458.276638][    C5]  kthread+0x17c/0x1e0
>    [ 3458.280691][    C5]  ret_from_fork+0x10/0x20
> 
> As a fix, disable the IRQs during hotplug until we unmap and memset the
> stolen time structure.

This will work for the call chain of your observed crash, but are
you sure that para_steal_clock() can't be called from another cpu
concurrently?

In case you verified this can't happen, you can add my:

Reviewed-by: Juergen Gross <jgross@...e.com>

Otherwise you either need to use RCU for doing the memunmap(), or a
lock to protect stolen_time_region.


Juergen

Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3099 bytes)

Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)