lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210201190938.GB15399@p4>
Date:   Mon, 1 Feb 2021 19:09:41 +0000
From:   Giancarlo Ferrari <giancarlo.ferrari89@...il.com>
To:     Mark Rutland <mark.rutland@....com>
Cc:     linux-arm-kernel@...ts.infradead.org, linux@...linux.org.uk,
        linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
        rppt@...nel.org, penberg@...nel.org, geert@...ux-m68k.org,
        giancarlo.ferrari@...ia.com
Subject: Re: [PATCH] ARM: kexec: Fix panic after TLB are invalidated

Hi,

On Mon, Feb 01, 2021 at 03:30:12PM +0000, Mark Rutland wrote:
> Hi,
> 
> On Mon, Feb 01, 2021 at 02:39:46PM +0000, Giancarlo Ferrari wrote:
> > On Mon, Feb 01, 2021 at 12:47:20PM +0000, Mark Rutland wrote:
> > > On Mon, Feb 01, 2021 at 12:44:56AM +0000, Giancarlo Ferrari wrote:
> > > > machine_kexec() need to set rw permission in text and rodata sections
> > > > to assign some variables (e.g. kexec_start_address). To do that at
> > > > the end (after flushing pdm in memory, etc.) it needs to invalidate
> > > > TLB [section] entries.
> > > 
> > > It'd be worth noting explicitly that set_kernel_text_rw() alters
> > > current->active_mm...
> > > 
> > > > If during the TLB invalidation an interrupt occours, which might cause
> > > > a context switch, there is the risk to inject invalid TLBs, with ro
> > > > permissions.
> > > 
> > > ... which is why if there's a context switch things can go wrong, since
> > > active_mm isn't stable, and so it's possible that set_kernel_text_rw()
> > > updates multiple tables, none of which might be the active table at the
> > > point we try to make an access.
> > 
> > Maybe the behaviour causing issue is not completely clear to me, and I do
> > apologize for that (moreover I haven't eougth debug capabilities).
> 
> I think we're in rough agreement that the issue is likely related to the
> context switch, but our understanding of the specifics differs (and I
> think we're missing a detail here).
> 
> > However, current-active_mm is switched among context switches. Correct ?
> 
> In some cases current->active_mm is not switched, and can be inherited
> over a context switch. When switching to a user task, we always switch
> to its mm (which becomes the active_mm), but when switching to a kthread
> we retain the previous task's mm as the active_mm as part of the lazy
> context switch.
> 
> So while a kthread is preemptible, its active_mm (and active ASID) can
> change under its feet. That could happen anywhere while the task is
> preemptible, e.g. in the middle of set_kernel_text_rw(), or
> mid-modification to the kexec variables.
> 

Yes.

In my understanding, even in the case of user process, when current->active_mm is switched,
we can run into trouble. For instance:

- Process A issue kexec (PageTables entry of A has 0x8000_0000-0x8010_0000 with ro
  permission and section is global, no NG bit set).

- A context switch happens in the middle of set_kernel_text_rw(), right after the
  section 0x8000_0000-0x8010_0000 has been invalidated.

- Process B, in its execution, re-inject its own PageTable entry with ro permission, which
  is not shared with Process A (and is not touched by the previous invalidation) in the MMU
  cache.

- When Process A, is rescheduled, it carries on with the invalidation, but unfortunately I have
  "wrong" entries in the MMU.

> > So, in principle, the invalidation, if stopped, is carried on where it
> > left.
> 
> That's true so long as all the processes we switch between share the
> same leaf tables for the region we're altering. If not, then the lazy
> context switch means that those tables can change under our feet.
> 
> I believe the tables mapping the kernel text are shared by all threads,
> and if so this _should_ be true. Russell might be able to confirm that
> or correct me if I have that wrong.
> 

I am not ready to put my hand on the fire, but I seen the behaviour described above.

> > I thought the issue was that the PageTable entry for the section 0x8010_0000
> > is global, thus not indexed by ASID (Address Space ID). By the fact that each
> > process has its own version of that entry, is the cause of the issue, as the
> > schedule process might bringing a spurious entry (with ro permission) in the
> > MMU cache.
> 
> The TLB invalidation performed under set_kernel_text_rw() affects all
> ASIDs on the current CPU, so there shouldn't be any stale RO TLB entries
> to hit unless the kthread is migrated to another CPU.
> 
> > If the entry is not global holds the ASID, and the issue cannot happen.
> 
> I don't think that's true, since switching to a different active_mm
> would also change ASID, and we'd have no additional guarantee.
> 

I agree, my fault.

> Thanks,
> Mark.

Thanks,


GF

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ