linux-kernel - Re: [RFC PATCH 1/1] mm/vmalloc: Modify permission reset procedure to avoid invalid access

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZnF41EAK06FYog27@swlinux02>
Date: Tue, 18 Jun 2024 20:08:52 +0800
From: Leo Liang <ycliang@...estech.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
CC: "lstoakes@...il.com" <lstoakes@...il.com>,
        "akpm@...ux-foundation.org"
	<akpm@...ux-foundation.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "hch@...radead.org" <hch@...radead.org>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>,
        "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
        "urezki@...il.com" <urezki@...il.com>,
        "patrick@...estech.com"
	<patrick@...estech.com>
Subject: Re: [RFC PATCH 1/1] mm/vmalloc: Modify permission reset procedure to
 avoid invalid access

On Tue, Jun 11, 2024 at 02:21:42PM +0000, Edgecombe, Rick P wrote:
> [EXTERNAL MAIL]
> 
> On Tue, 2024-06-11 at 21:13 +0800, Leo Yu-Chi Liang wrote:
> > The previous reset procedure is
> > 1. Set direct map attribute to invalid
> > 2. Flush TLB
> > 3. Reset direct map attribute to default
> >
> > It is possible that kernel forks another process
> > on another core that access the invalid mappings after
> > sync_kernel_mappings.
> >
> > We could reproduce this scenario by running LTP/bpf_prog
> > multiple times on RV32 kernel on QEMU.
> >
> > Therefore, the following procedure is proposed
> > to avoid mappings being invalid.
> > 1. Reset direct map attribute to default
> > 2. Flush TLB
> 
> Can you explain more about what is happening in this scenario? Looking briefly,
> riscv is doing something unique around sync_kernel_mappings(). If a RO mapping
> is copied instead of a NP/invalid mapping, how is the problem avoided?

Hi Edgecombe,

Sorry for the late reply and thank you for taking a look.

What we are seeing at first is that running LTP bpf_prog03 test fails randomly
on RV32 SMP QEMU with kernel 6.1 and the failed cause is a load page fault.

After a bit of inspection, we found that the faulting page is a part of
kernel's page table and the valid bit of that page's PTE is cleared due to this reset procedure.

The scenario of this fault is suspected to be the following:
1. Running bpf_prog03: Creates kernel pages with elevated 'X' permission so that bpf program can be executed.
2. Finishing bpf_prog03: vfree code path to reset permission to default: 
	a. Set the pages to invalid first
	b. Unmap the pages and flush TLB
	c. Reset them to default permission
3. Other core forkes new processes: sync_kernel_mappings copies the kernel page table.

If the 3rd step happens during 2a, then we get a kernel mapping with invalid PTE permission,
Therefore, if the invalid page is accessed, we'd get a page fault exception and the kernel panics.

But despite all of the above conjecture,
we still are wondering if setting the mappings to be invalid first is necessary.
IMHO, "set to invalid --> unmap & flush TLB --> set to default" is identical to "set to default --> unmap & flush TLB".
Could we not just reset them to default first and then flush TLB & free memory?

Best regards,
Leo