lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 5 May 2020 14:11:51 +0200
From:   Jann Horn <jannh@...gle.com>
To:     Christoph Hellwig <hch@....de>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        kernel list <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        "Eric W . Biederman" <ebiederm@...ssion.com>,
        Oleg Nesterov <oleg@...hat.com>,
        Russell King <linux@...linux.org.uk>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Mark Salter <msalter@...hat.com>,
        Aurelien Jacquiot <jacquiot.aurelien@...il.com>,
        linux-c6x-dev@...ux-c6x.org,
        Yoshinori Sato <ysato@...rs.sourceforge.jp>,
        Rich Felker <dalias@...c.org>,
        Linux-sh list <linux-sh@...r.kernel.org>
Subject: Re: [PATCH v2 4/5] binfmt_elf, binfmt_elf_fdpic: Use a VMA list snapshot

On Tue, May 5, 2020 at 1:04 PM Christoph Hellwig <hch@....de> wrote:
> On Wed, Apr 29, 2020 at 11:49:53PM +0200, Jann Horn wrote:
> > In both binfmt_elf and binfmt_elf_fdpic, use a new helper
> > dump_vma_snapshot() to take a snapshot of the VMA list (including the gate
> > VMA, if we have one) while protected by the mmap_sem, and then use that
> > snapshot instead of walking the VMA list without locking.
[...]
> > diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> > index fb36469848323..dffe9dc8497ca 100644
> > --- a/fs/binfmt_elf.c
> > +++ b/fs/binfmt_elf.c
> > @@ -1292,8 +1292,12 @@ static bool always_dump_vma(struct vm_area_struct *vma)
> >       return false;
> >  }
> >
> > +#define DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER 1
> > +
> >  /*
> >   * Decide what to dump of a segment, part, all or none.
> > + * The result must be fixed up via vma_dump_size_fixup() once we're in a context
> > + * that's allowed to sleep arbitrarily long.
> >   */
> >  static unsigned long vma_dump_size(struct vm_area_struct *vma,
> >                                  unsigned long mm_flags)
> > @@ -1348,30 +1352,15 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
> >
> >       /*
> >        * If this looks like the beginning of a DSO or executable mapping,
> > -      * check for an ELF header.  If we find one, dump the first page to
> > -      * aid in determining what was mapped here.
> > +      * we'll check for an ELF header. If we find one, we'll dump the first
> > +      * page to aid in determining what was mapped here.
> > +      * However, we shouldn't sleep on userspace reads while holding the
> > +      * mmap_sem, so we just return a placeholder for now that will be fixed
> > +      * up later in vma_dump_size_fixup().
> >        */
> >       if (FILTER(ELF_HEADERS) &&
> > -         vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ)) {
> > -             u32 __user *header = (u32 __user *) vma->vm_start;
> > -             u32 word;
> > -             /*
> > -              * Doing it this way gets the constant folded by GCC.
> > -              */
> > -             union {
> > -                     u32 cmp;
> > -                     char elfmag[SELFMAG];
> > -             } magic;
> > -             BUILD_BUG_ON(SELFMAG != sizeof word);
> > -             magic.elfmag[EI_MAG0] = ELFMAG0;
> > -             magic.elfmag[EI_MAG1] = ELFMAG1;
> > -             magic.elfmag[EI_MAG2] = ELFMAG2;
> > -             magic.elfmag[EI_MAG3] = ELFMAG3;
> > -             if (unlikely(get_user(word, header)))
> > -                     word = 0;
> > -             if (word == magic.cmp)
> > -                     return PAGE_SIZE;
> > -     }
> > +         vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ))
> > +             return DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER;
> >
> >  #undef       FILTER
> >
> > @@ -1381,6 +1370,22 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
> >       return vma->vm_end - vma->vm_start;
> >  }
> >
> > +/* Fix up the result from vma_dump_size(), now that we're allowed to sleep. */
> > +static void vma_dump_size_fixup(struct core_vma_metadata *meta)
> > +{
> > +     char elfmag[SELFMAG];
> > +
> > +     if (meta->dump_size != DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER)
> > +             return;
> > +
> > +     if (copy_from_user(elfmag, (void __user *)meta->start, SELFMAG)) {
> > +             meta->dump_size = 0;
> > +             return;
> > +     }
> > +     meta->dump_size =
> > +             (memcmp(elfmag, ELFMAG, SELFMAG) == 0) ? PAGE_SIZE : 0;
> > +}
>
> While this code looks entirely correct, it took me way too long to
> follow.  I'd take te DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER into the caller,
> and then have a simple function like this:
>
> static void vma_dump_size_fixup(struct core_vma_metadata *meta)
> {
>         char elfmag[SELFMAG];
>
>         if (copy_from_user(elfmag, (void __user *)meta->start, SELFMAG) ||
>             memcmp(elfmag, ELFMAG, sizeof(elfmag)) != 0)
>                 meta->dump_size = 0;
>         else
>                 meta->dump_size = PAGE_SIZE;
> }

I guess I can make that change, even if I personally think it's less
clear if parts of the fixup logic spill over into the caller instead
of being handled locally here. :P

> Also a few comments explaining why we do this fixup would help readers
> of the code.
>
> > -             if (vma->vm_flags & VM_WRITE)
> > -                     phdr.p_flags |= PF_W;
> > -             if (vma->vm_flags & VM_EXEC)
> > -                     phdr.p_flags |= PF_X;
> > +             phdr.p_flags = meta->flags & VM_READ ? PF_R : 0;
> > +             phdr.p_flags |= meta->flags & VM_WRITE ? PF_W : 0;
> > +             phdr.p_flags |= meta->flags & VM_EXEC ? PF_X : 0;
>
> Sorry for another nitpick, but I find the spelled out version with the
> if a lot easier to read.

Huh... I find it easier to scan if it is three lines with the same
pattern, but I'm not too attached to it.

In that case, I guess I should change it like this? The old code had a
ternary for VM_READ and branches for the other two, which didn't seem
very pretty to me.

phdr.p_flags = 0;
if (meta->flags & VM_READ)
        phdr.p_flags |= PF_R;
if (meta->flags & VM_WRITE)
        phdr.p_flags |= PF_W;
if (meta->flags & VM_EXEC)
        phdr.p_flags |= PF_X;

> > +int dump_vma_snapshot(struct coredump_params *cprm, int *vma_count,
> > +     struct core_vma_metadata **vma_meta,
> > +     unsigned long (*dump_size_cb)(struct vm_area_struct *, unsigned long))
>
> Plase don't use single tabs for indentating parameter continuations
> (we actually have two styles - either two tabs or aligned after the
> opening brace, pick your poison :))

I did that because if I use either one of those styles, I'll have to
either move the callback type into a typedef or add line breaks in the
parameters of the callback type. I guess I can write it like this...

int dump_vma_snapshot(struct coredump_params *cprm, int *vma_count,
                      struct core_vma_metadata **vma_meta,
                      unsigned long (*dump_size_cb)(struct vm_area_struct *,
                                                    unsigned long));

but if you also dislike that, let me know and I'll add a typedef instead. :P

> > +     *vma_meta = kvmalloc_array(*vma_count, sizeof(**vma_meta), GFP_KERNEL);
> > +     if (!*vma_meta) {
> > +             up_read(&mm->mmap_sem);
> > +             return -ENOMEM;
> > +     }
> > +
> > +     for (i = 0, vma = first_vma(current, gate_vma); vma != NULL;
> > +                     vma = next_vma(vma, gate_vma)) {
> > +             (*vma_meta)[i++] = (struct core_vma_metadata) {
> > +                     .start = vma->vm_start,
> > +                     .end = vma->vm_end,
> > +                     .flags = vma->vm_flags,
> > +                     .dump_size = dump_size_cb(vma, cprm->mm_flags)
> > +             };
>
> This looks a little weird.  Why not kcalloc + just initialize the four
> fields we actually fill out here?

Yeah, I can just change the syntax here to normal member writes if you
want. I just thought C99-style initialization looked nicer, but I
guess that's unusual in the kernel...

(And I just noticed that that "filesize" member of that struct
core_vma_metadata I'm defining is entirely unused... I'll have to
remove that in the next iteration.)

Powered by blists - more mailing lists