Date:   Wed, 19 Apr 2023 10:38:29 -0400
From:   Paul Moore <paul@...l-moore.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Waiman Long <longman@...hat.com>, Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Joe Mario <jmario@...hat.com>,
        Barry Marson <bmarson@...hat.com>,
        Rafael Aquini <aquini@...hat.com>,
        Stephen Smalley <stephen.smalley.work@...il.com>,
        Eric Paris <eparis@...isplace.org>, selinux@...r.kernel.org,
        James Morris <jmorris@...ei.org>,
        "Serge E. Hallyn" <serge@...lyn.com>,
        linux-security-module@...r.kernel.org
Subject: Re: [PATCH] mm/mmap: Map MAP_STACK to VM_STACK

On Tue, Apr 18, 2023 at 11:24 PM Matthew Wilcox <willy@...radead.org> wrote:
> On Tue, Apr 18, 2023 at 09:45:34PM -0400, Waiman Long wrote:
> > On 4/18/23 21:36, Hugh Dickins wrote:
> > > On Tue, 18 Apr 2023, Waiman Long wrote:
> > > > On 4/18/23 17:18, Andrew Morton wrote:
> > > > > On Tue, 18 Apr 2023 17:02:30 -0400 Waiman Long <longman@...hat.com> wrote:
> > > > >
> > > > > > One of the flags of mmap(2) is MAP_STACK to request a memory segment
> > > > > > suitable for a process or thread stack. The kernel currently ignores
> > > > > > > this flag. Glibc uses MAP_STACK when mmapping a thread stack. However,
> > > > > > selinux has an execstack check in selinux_file_mprotect() which disallows
> > > > > > a stack VMA to be made executable.
> > > > > >
> > > > > > Since MAP_STACK is a noop, it is possible for a stack VMA to be merged
> > > > > > with an adjacent anonymous VMA. With that merging, using mprotect(2)
> > > > > > to change a part of the merged anonymous VMA to make it executable may
> > > > > > fail. This can lead to sporadic failure of applications that need to
> > > > > > make those changes.
> > > > > "Sporadic failure of applications" sounds quite serious.  Can you
> > > > > provide more details?
> > > > The problem boils down to the fact that it is possible for user code to mmap a
> > > > region of memory and then for the kernel to merge the VMA for that memory with
> > > > the VMA for one of the application's thread stacks. This is causing random
> > > > SEGVs with one of our large customer applications.
> > > >
> > > > At a high level, this is what's happening:
> > > >
> > > >   1) App runs creating lots of threads.
> > > >   2) It mmap's 256K pages of anonymous memory.
> > > >   3) It writes executable code to that memory.
> > > >   4) It calls mprotect() with PROT_EXEC on that memory so
> > > >      it can subsequently execute the code.
> > > >
> > > > The above mprotect() will fail if the mmap'd region's VMA gets merged with the
> > > > VMA for one of the thread stacks.  That's because the default RHEL SELinux
> > > > policy is to not allow executable stacks.
> > > Then wouldn't the bug be at the SELinux end?  VMAs may have been merged
> > > already, but the mprotect() with PROT_EXEC of the good non-stack range
> > > will then split that area off from the stack again - maybe the SELinux
> > > check does not understand that must happen?
> >
> > The SELinux check is done per VMA, not a region within a VMA. After VMA
> > merging, SELinux is probably not able to determine which part of a VMA is a
> > stack unless we keep that information somewhere and provide an API for
> > SELinux to query. That can be quite a lot of work. So the easiest way to
> > prevent this problem is to avoid merging a stack VMA with a regular
> > anonymous VMA.
>
> To paraphrase you, "Yes, SELinux is buggy, but we don't want to fix it".
>
> Cc'ing the SELinux people so it can be fixed properly.

SELinux needs some way to determine what memory region is currently
being used by an application's stacks.  The current logic can be found
in selinux_file_mprotect(); the relevant snippet is below:

int selinux_file_mprotect(struct vm_area_struct *vma, ...)
{
  ...
  } else if (!vma->vm_file &&
             ((vma->vm_start <= vma->vm_mm->start_stack &&
               vma->vm_end >= vma->vm_mm->start_stack) ||
              vma_is_stack_for_current(vma))) {
    rc = avc_has_perm(&selinux_state,
                      sid, sid, SECCLASS_PROCESS,
                      PROCESS__EXECSTACK, NULL);
  }
  ...
}
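
As a rough illustration of why VMA merging matters for this check, here
is a small user-space model of the start_stack range test.  It is only a
sketch: struct model_vma, model_needs_execstack() and model_merge() are
hypothetical names made up for this example, not kernel APIs, and the
model ignores the vma_is_stack_for_current() branch that covers thread
stacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vm_area_struct fields the check reads. */
struct model_vma {
    uintptr_t vm_start;  /* first address of the range */
    uintptr_t vm_end;    /* first address past the range */
    bool has_file;       /* models vma->vm_file != NULL */
};

/*
 * Mirrors only the start_stack range test from the snippet above:
 * an anonymous VMA whose range covers start_stack is treated as a stack.
 */
bool model_needs_execstack(const struct model_vma *vma, uintptr_t start_stack)
{
    return !vma->has_file &&
           vma->vm_start <= start_stack &&
           vma->vm_end >= start_stack;
}

/* Models the merge of two adjacent anonymous VMAs into one. */
struct model_vma model_merge(struct model_vma lo, struct model_vma hi)
{
    assert(lo.vm_end == hi.vm_start);
    assert(!lo.has_file && !hi.has_file);
    struct model_vma merged = { lo.vm_start, hi.vm_end, false };
    return merged;
}
```

With an anonymous region at [0x6000, 0x7000) and a stack VMA at
[0x7000, 0x8000) holding start_stack, model_needs_execstack() is false
for the first region on its own but true for the merged VMA, so in this
model a per-VMA check cannot tell that an mprotect() request only touches
the non-stack part.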

If someone has a better, more foolproof way to determine an
application's stack please let us know, or better yet submit a patch
:)

-- 
paul-moore.com
