linux-kernel - Re: [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <6fe675c4-d22b-22da-ba3c-f6d33419b9ed@arm.com>
Date:   Wed, 25 Aug 2021 10:04:01 +0100
From:   Steven Price <steven.price@....com>
To:     Alyssa Rosenzweig <alyssa.rosenzweig@...labora.com>,
        dri-devel@...ts.freedesktop.org
Cc:     Rob Herring <robh@...nel.org>,
        Tomeu Vizoso <tomeu.vizoso@...labora.com>,
        David Airlie <airlied@...ux.ie>,
        Daniel Vetter <daniel@...ll.ch>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses

On 24/08/2021 18:30, Alyssa Rosenzweig wrote:
> When locking memory, the base address is rounded down to the nearest
> page. The current code does not adjust the size in this case,
> truncating the lock region:
> 
> 	Input:      [----size----]
> 	Round: [----size----]
> 
> To fix the truncation, extend the lock region by the amount rounded off.
> 
> 	Input:      [----size----]
> 	Round: [-------size------]
> 
> This bug is difficult to hit under current assumptions: since the size
> of the lock region is stored as a ceil(log2), the truncation must cause
> us to cross a power-of-two boundary. This is possible, for example if
> the caller tries to lock 65535 bytes starting at iova 0xCAFE0010. The
> existing code rounds down the iova to 0xCAFE0000 and rounds up the lock
> region to 65536 bytes, locking until 0xCAFF0000. This fails to lock the
> last 15 bytes.
> 
> In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
> the bug. Therefore this doesn't need to be backported. Still, that's a
> happy accident and not a precondition of lock_region, so we let's do the
> right thing to future proof.

Actually it's worse than that due to the hardware behaviour, the spec
states (for LOCKADDR_BASE):

> Only the upper bits of the address are used. The address is aligned to a
> multiple of the region size, so a variable number of low-order bits are
> ignored, depending on the selected region size. It is recommended that software
> ensures that these low bits in the address are cleared, to avoid confusion.

It appears that indeed this has caused confusion ;)

So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
= 0x30000) the region width gets rounded up (to 0x40000) which causes
the start address to be effectively rounded down (by the hardware) to
0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.

Interestingly (unless my reading of this is wrong) that means to lock
0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
*at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).

This appears to be broken in kbase (which actually does zero out the low
bits of the address) - I've raised a bug internally so hopefully someone
will tell me if I've read the spec completely wrong here.

Steve

> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@...labora.com>
> Reported-by: Steven Price <steven.price@....com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index dfe5f1d29763..14be32497ec3 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -63,6 +63,9 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
>  	u8 region_width;
>  	u64 region = iova & PAGE_MASK;
>  
> +	/* After rounding the address down, extend the size to lock the end. */
> +	size += (region - iova);
> +
>  	/* The size is encoded as ceil(log2) minus(1), which may be calculated
>  	 * with fls. The size must be clamped to hardware bounds.
>  	 */
>