linux-kernel - Re: [2/3] mm: fix up some user-visible effects of the stack guard page


Message-ID: <1282308887.3170.5439.camel@zakaz.uk.xensource.com>
Date:	Fri, 20 Aug 2010 13:54:47 +0100
From:	Ian Campbell <ijc@...lion.org.uk>
To:	torvalds@...ux-foundation.org
Cc:	linux-kernel@...r.kernel.org, stable@...nel.org,
	stable-review@...nel.org, akpm@...ux-foundation.org,
	alan@...rguk.ukuu.org.uk, Greg KH <gregkh@...e.de>
Subject: Re: [2/3] mm: fix up some user-visible effects of the stack guard
 page

On Wed, 2010-08-18 at 13:30 -0700, Greg KH wrote:
> 2.6.35-stable review patch.  If anyone has any objections, please let us know.
> 

>  - by also teaching the _real_ mlock() functionality not to try to lock
>    the guard page.
> 
>    That would just expand the mapping down to create a new guard page,
>    so there really is no point in trying to lock it in place.

> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -167,6 +167,14 @@ static long __mlock_vma_pages_range(stru
>  	if (vma->vm_flags & VM_WRITE)
>  		gup_flags |= FOLL_WRITE;
>  
> +	/* We don't try to access the guard page of a stack vma */
> +	if (vma->vm_flags & VM_GROWSDOWN) {
> +		if (start == vma->vm_start) {
> +			start += PAGE_SIZE;
> +			nr_pages--;
> +		}
> +	}
> +

Is this really correct?

I have an app which tries to mlock a portion of its stack. With this
patch (and a bunch of debug) in place I get:
        [  170.977782] sys_mlock 0xbfd8b000-0xbfd8c000 4096
        [  170.978200] sys_mlock aligned, range now 0xbfd8b000-0xbfd8c000 4096
        [  170.978209] do_mlock 0xbfd8b000-0xbfd8c000 4096 (locking)
        [  170.978216] do_mlock vma de47d8f0 0xbfd7e000-0xbfd94000
        [  170.978223] mlock_fixup split vma de47d8f0 0xbfd7e000-0xbfd94000 at start 0xbfd8b000
        [  170.978231] mlock_fixup split vma de47d8f0 0xbfd8b000-0xbfd94000 at end 0xbfd8c000
        [  170.978240] __mlock_vma_pages_range locking 0xbfd8b000-0xbfd8c000 (1 pages) in VMA bfd8b000 0xbfd8c000-0x0
        [  170.978248] __mlock_vma_pages_range adjusting start 0xbfd8b000->0xbfd8c000 to avoid guard
        [  170.978256] __mlock_vma_pages_range now locking 0xbfd8c000-0xbfd8c000 (0 pages)
        [  170.978263] do_mlock error = 0

Note how we end up locking 0 pages.
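
To make the arithmetic concrete, here is a minimal user-space model of the adjustment in the quoted hunk. It is only a sketch: `struct vma_model` and `pages_locked_after_guard_skip()` are hypothetical stand-ins invented for illustration, not kernel code, and the flag and page-size values are fixed for the model.

```c
#include <assert.h>

#define PAGE_SIZE     4096UL
#define VM_GROWSDOWN  0x0100UL  /* value fixed for this model only */

/* Hypothetical stand-in for the few vm_area_struct fields used here. */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_flags;
};

/*
 * Applies the same check as the quoted patch and returns how many
 * pages would still be locked for the range [start, end).
 */
long pages_locked_after_guard_skip(const struct vma_model *vma,
				   unsigned long start, unsigned long end)
{
	long nr_pages;

	assert(start % PAGE_SIZE == 0 && end % PAGE_SIZE == 0);
	nr_pages = (long)((end - start) / PAGE_SIZE);

	/* Same condition as the patch: skip the first page of a stack VMA. */
	if (vma->vm_flags & VM_GROWSDOWN) {
		if (start == vma->vm_start) {
			start += PAGE_SIZE;
			nr_pages--;
		}
	}
	return nr_pages;
}
```

Fed the middle VMA from the log (0xbfd8b000-0xbfd8c000, still flagged VM_GROWSDOWN after the split), the one-page request comes back as 0 pages. Against the unsplit VMA starting at 0xbfd7e000 the same request would keep its single page.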

The stack layout is:
         0xbfd94000 stack VMA end / base
        
         0xbfd8c000 mlock requested end
         0xbfd8b000 mlock requested start
        
         0xbfd7f000 stack VMA start / top
        
         0xbfd7e000 guard page

As part of the mlock_fixup the original VMA (0xbfd7e000-0xbfd94000) is
split into 3, 0xbfd7e000-0xbfd8b000 + 0xbfd8b000-0xbfd8c000 +
0xbfd8c000-0xbfd94000 in order to mlock the middle bit.

Since we have split the original VMA into 3, shouldn't only the bottom
one still have VM_GROWSDOWN set? (how can the top two grow down with the
bottom one in the way?) Certainly it seems wrong to enforce a guard page
on anything but the bottom VMA (which is what appears to be happening).
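
One way to express "only the bottom VMA has a guard page" is sketched below as a user-space model. Everything in it is an assumption made for illustration: `struct stack_vma`, its `prev` link and `starts_with_guard_page()` are hypothetical names, and this is not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>

#define VM_GROWSDOWN 0x0100UL  /* value fixed for this model only */

/* Hypothetical model of a VMA with a link to the VMA just below it. */
struct stack_vma {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_flags;
	const struct stack_vma *prev;  /* NULL if nothing is mapped below */
};

/*
 * Returns 1 only if addr is the first page of a grows-down VMA that
 * has no contiguous grows-down VMA directly beneath it, i.e. the
 * bottom piece of a (possibly split) stack.
 */
int starts_with_guard_page(const struct stack_vma *vma, unsigned long addr)
{
	const struct stack_vma *below;

	assert(vma != NULL);
	below = vma->prev;

	if (!(vma->vm_flags & VM_GROWSDOWN) || addr != vma->vm_start)
		return 0;

	/* The stack continues below this VMA, so this is not the guard page. */
	if (below && (below->vm_flags & VM_GROWSDOWN) && below->vm_end == addr)
		return 0;

	return 1;
}
```

With the three-way split described above, this reports a guard page only at 0xbfd7e000 and not at 0xbfd8b000.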

Although perhaps the larger issue is whether or not it is valid to mlock
below the current end of your stack at all. I don't see why it wouldn't
be, but perhaps the above is just completely bogus? Isn't it possible that
a process may try to mlock something on a stack page which hasn't
previously been touched, is therefore not currently mapped, and could
therefore be the guard page?
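
For reference, a user-space call of the kind being discussed might look roughly like the sketch below. This is not the app mentioned earlier; `page_span()` and `mlock_stack_object()` are hypothetical helpers, and only `mlock()` and `sysconf()` are real library calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Round [addr, addr + len) out to whole pages.
 * page_size must be a non-zero power of two.
 */
void page_span(uintptr_t addr, size_t len, uintptr_t page_size,
	       uintptr_t *start, size_t *span)
{
	uintptr_t end;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	end = (addr + len + page_size - 1) & ~(page_size - 1);
	*start = addr & ~(page_size - 1);
	*span = (size_t)(end - *start);
}

/* Lock the page(s) holding an object on the stack; returns mlock()'s result. */
int mlock_stack_object(const void *obj, size_t len)
{
	uintptr_t start;
	size_t span;

	page_span((uintptr_t)obj, len, (uintptr_t)sysconf(_SC_PAGESIZE),
		  &start, &span);
	return mlock((const void *)start, span);
}
```

A caller would pass the address of a local buffer, e.g. `mlock_stack_object(buf, sizeof(buf))`. Whether the resulting range begins at a VMA's vm_start depends on how far the stack has grown at that point.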

Out of interest how does the guard page interact with processes which do
alloca(N*PAGE_SIZE)?

Ian.
-- 
Ian Campbell
Current Noise: Opeth - White Cluster

If we do not change our direction we are likely to end up where we are headed.
