lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CALYGNiO_Cbwqd_eFF-MsJP6nAe0OcF=5ABLzJxLMCSAPQgsh4A@mail.gmail.com>
Date:	Sat, 20 Feb 2016 09:25:57 +0300
From:	Konstantin Khlebnikov <koct9i@...il.com>
To:	Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Vlastimil Babka <vbabka@...e.cz>,
	Andrea Arcangeli <aarcange@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	dvlasenk@...hat.com, Dave Hansen <dave@...1.net>,
	Thomas Gleixner <tglx@...utronix.de>,
	Rik van Riel <riel@...hat.com>, brgerst@...il.com,
	Andy Lutomirski <luto@...capital.net>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	linux-tip-commits@...r.kernel.org
Subject: Re: [tip:x86/pkeys] mm/gup: Introduce get_user_pages_remote()

On Tue, Feb 16, 2016 at 3:14 PM, tip-bot for Dave Hansen
<tipbot@...or.com> wrote:
> Commit-ID:  1e9877902dc7e11d2be038371c6fbf2dfcd469d7
> Gitweb:     http://git.kernel.org/tip/1e9877902dc7e11d2be038371c6fbf2dfcd469d7
> Author:     Dave Hansen <dave.hansen@...ux.intel.com>
> AuthorDate: Fri, 12 Feb 2016 13:01:54 -0800
> Committer:  Ingo Molnar <mingo@...nel.org>
> CommitDate: Tue, 16 Feb 2016 10:04:09 +0100
>
> mm/gup: Introduce get_user_pages_remote()
>
> For protection keys, we need to understand whether protections
> should be enforced in software or not.  In general, we enforce
> protections when working on our own task, but not when on others.
> We call these "current" and "remote" operations.
>
> This patch introduces a new get_user_pages() variant:
>
>         get_user_pages_remote()
>
> Which is a replacement for when get_user_pages() is called on
> non-current tsk/mm.

As I see task-struct argument could be NULL as well as in old api.
They only usage for it is updating task->maj_flt/min_flt.

May be just remove arg and always account major/minor faults into
current task - currently counters are plain unsigned long, so remote
access could corrupt them.

>
> We also introduce a new gup flag: FOLL_REMOTE which can be used
> for the "__" gup variants to get this new behavior.
>
> The uprobes is_trap_at_addr() location holds mmap_sem and
> calls get_user_pages(current->mm) on an instruction address.  This
> makes it a pretty unique gup caller.  Being an instruction access
> and also really originating from the kernel (vs. the app), I opted
> to consider this a 'remote' access where protection keys will not
> be enforced.
>
> Without protection keys, this patch should not change any behavior.
>
> Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
> Reviewed-by: Thomas Gleixner <tglx@...utronix.de>
> Cc: Andrea Arcangeli <aarcange@...hat.com>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Andy Lutomirski <luto@...capital.net>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: Brian Gerst <brgerst@...il.com>
> Cc: Dave Hansen <dave@...1.net>
> Cc: Denys Vlasenko <dvlasenk@...hat.com>
> Cc: H. Peter Anvin <hpa@...or.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Rik van Riel <riel@...hat.com>
> Cc: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
> Cc: Vlastimil Babka <vbabka@...e.cz>
> Cc: jack@...e.cz
> Cc: linux-mm@...ck.org
> Link: http://lkml.kernel.org/r/20160212210154.3F0E51EA@viggo.jf.intel.com
> Signed-off-by: Ingo Molnar <mingo@...nel.org>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_gem.c   |  6 +++---
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 10 +++++-----
>  drivers/infiniband/core/umem_odp.c      |  8 ++++----
>  fs/exec.c                               |  8 ++++++--
>  include/linux/mm.h                      |  5 +++++
>  kernel/events/uprobes.c                 | 10 ++++++++--
>  mm/gup.c                                | 27 ++++++++++++++++++++++-----
>  mm/memory.c                             |  2 +-
>  mm/process_vm_access.c                  | 11 ++++++++---
>  security/tomoyo/domain.c                |  9 ++++++++-
>  virt/kvm/async_pf.c                     |  8 +++++++-
>  11 files changed, 77 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> index 4b519e4..97d4457 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> @@ -753,9 +753,9 @@ static struct page **etnaviv_gem_userptr_do_get_pages(
>
>         down_read(&mm->mmap_sem);
>         while (pinned < npages) {
> -               ret = get_user_pages(task, mm, ptr, npages - pinned,
> -                                    !etnaviv_obj->userptr.ro, 0,
> -                                    pvec + pinned, NULL);
> +               ret = get_user_pages_remote(task, mm, ptr, npages - pinned,
> +                                           !etnaviv_obj->userptr.ro, 0,
> +                                           pvec + pinned, NULL);
>                 if (ret < 0)
>                         break;
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 59e45b3..90dbf81 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -584,11 +584,11 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
>
>                 down_read(&mm->mmap_sem);
>                 while (pinned < npages) {
> -                       ret = get_user_pages(work->task, mm,
> -                                            obj->userptr.ptr + pinned * PAGE_SIZE,
> -                                            npages - pinned,
> -                                            !obj->userptr.read_only, 0,
> -                                            pvec + pinned, NULL);
> +                       ret = get_user_pages_remote(work->task, mm,
> +                                       obj->userptr.ptr + pinned * PAGE_SIZE,
> +                                       npages - pinned,
> +                                       !obj->userptr.read_only, 0,
> +                                       pvec + pinned, NULL);
>                         if (ret < 0)
>                                 break;
>
> diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> index e69bf26..75077a0 100644
> --- a/drivers/infiniband/core/umem_odp.c
> +++ b/drivers/infiniband/core/umem_odp.c
> @@ -572,10 +572,10 @@ int ib_umem_odp_map_dma_pages(struct ib_umem *umem, u64 user_virt, u64 bcnt,
>                  * complex (and doesn't gain us much performance in most use
>                  * cases).
>                  */
> -               npages = get_user_pages(owning_process, owning_mm, user_virt,
> -                                       gup_num_pages,
> -                                       access_mask & ODP_WRITE_ALLOWED_BIT, 0,
> -                                       local_page_list, NULL);
> +               npages = get_user_pages_remote(owning_process, owning_mm,
> +                               user_virt, gup_num_pages,
> +                               access_mask & ODP_WRITE_ALLOWED_BIT,
> +                               0, local_page_list, NULL);
>                 up_read(&owning_mm->mmap_sem);
>
>                 if (npages < 0)
> diff --git a/fs/exec.c b/fs/exec.c
> index dcd4ac7..d885b98 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -198,8 +198,12 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
>                         return NULL;
>         }
>  #endif
> -       ret = get_user_pages(current, bprm->mm, pos,
> -                       1, write, 1, &page, NULL);
> +       /*
> +        * We are doing an exec().  'current' is the process
> +        * doing the exec and bprm->mm is the new process's mm.
> +        */
> +       ret = get_user_pages_remote(current, bprm->mm, pos, 1, write,
> +                       1, &page, NULL);
>         if (ret <= 0)
>                 return NULL;
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index b1d4b8c..faf3b70 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1225,6 +1225,10 @@ long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
>                       unsigned long start, unsigned long nr_pages,
>                       unsigned int foll_flags, struct page **pages,
>                       struct vm_area_struct **vmas, int *nonblocking);
> +long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> +                           unsigned long start, unsigned long nr_pages,
> +                           int write, int force, struct page **pages,
> +                           struct vm_area_struct **vmas);
>  long get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
>                     unsigned long start, unsigned long nr_pages,
>                     int write, int force, struct page **pages,
> @@ -2170,6 +2174,7 @@ static inline struct page *follow_page(struct vm_area_struct *vma,
>  #define FOLL_MIGRATION 0x400   /* wait for page to replace migration entry */
>  #define FOLL_TRIED     0x800   /* a retry, previous pass started an IO */
>  #define FOLL_MLOCK     0x1000  /* lock present pages */
> +#define FOLL_REMOTE    0x2000  /* we are working on non-current tsk/mm */
>
>  typedef int (*pte_fn_t)(pte_t *pte, pgtable_t token, unsigned long addr,
>                         void *data);
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 0167679..8eef5f5 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -299,7 +299,7 @@ int uprobe_write_opcode(struct mm_struct *mm, unsigned long vaddr,
>
>  retry:
>         /* Read the page with vaddr into memory */
> -       ret = get_user_pages(NULL, mm, vaddr, 1, 0, 1, &old_page, &vma);
> +       ret = get_user_pages_remote(NULL, mm, vaddr, 1, 0, 1, &old_page, &vma);
>         if (ret <= 0)
>                 return ret;
>
> @@ -1700,7 +1700,13 @@ static int is_trap_at_addr(struct mm_struct *mm, unsigned long vaddr)
>         if (likely(result == 0))
>                 goto out;
>
> -       result = get_user_pages(NULL, mm, vaddr, 1, 0, 1, &page, NULL);
> +       /*
> +        * The NULL 'tsk' here ensures that any faults that occur here
> +        * will not be accounted to the task.  'mm' *is* current->mm,
> +        * but we treat this as a 'remote' access since it is
> +        * essentially a kernel access to the memory.
> +        */
> +       result = get_user_pages_remote(NULL, mm, vaddr, 1, 0, 1, &page, NULL);
>         if (result < 0)
>                 return result;
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 7bf19ff..36ca850 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -870,7 +870,7 @@ long get_user_pages_unlocked(struct task_struct *tsk, struct mm_struct *mm,
>  EXPORT_SYMBOL(get_user_pages_unlocked);
>
>  /*
> - * get_user_pages() - pin user pages in memory
> + * get_user_pages_remote() - pin user pages in memory
>   * @tsk:       the task_struct to use for page fault accounting, or
>   *             NULL if faults are not to be recorded.
>   * @mm:                mm_struct of target mm
> @@ -924,12 +924,29 @@ EXPORT_SYMBOL(get_user_pages_unlocked);
>   * should use get_user_pages because it cannot pass
>   * FAULT_FLAG_ALLOW_RETRY to handle_mm_fault.
>   */
> -long get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
> -               unsigned long start, unsigned long nr_pages, int write,
> -               int force, struct page **pages, struct vm_area_struct **vmas)
> +long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> +               unsigned long start, unsigned long nr_pages,
> +               int write, int force, struct page **pages,
> +               struct vm_area_struct **vmas)
>  {
>         return __get_user_pages_locked(tsk, mm, start, nr_pages, write, force,
> -                                      pages, vmas, NULL, false, FOLL_TOUCH);
> +                                      pages, vmas, NULL, false,
> +                                      FOLL_TOUCH | FOLL_REMOTE);
> +}
> +EXPORT_SYMBOL(get_user_pages_remote);
> +
> +/*
> + * This is the same as get_user_pages_remote() for the time
> + * being.
> + */
> +long get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
> +               unsigned long start, unsigned long nr_pages,
> +               int write, int force, struct page **pages,
> +               struct vm_area_struct **vmas)
> +{
> +       return __get_user_pages_locked(tsk, mm, start, nr_pages,
> +                                      write, force, pages, vmas, NULL, false,
> +                                      FOLL_TOUCH);
>  }
>  EXPORT_SYMBOL(get_user_pages);
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 38090ca..8bfbad0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3685,7 +3685,7 @@ static int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
>                 void *maddr;
>                 struct page *page = NULL;
>
> -               ret = get_user_pages(tsk, mm, addr, 1,
> +               ret = get_user_pages_remote(tsk, mm, addr, 1,
>                                 write, 1, &page, &vma);
>                 if (ret <= 0) {
>  #ifndef CONFIG_HAVE_IOREMAP_PROT
> diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
> index 5d453e5..07514d4 100644
> --- a/mm/process_vm_access.c
> +++ b/mm/process_vm_access.c
> @@ -98,9 +98,14 @@ static int process_vm_rw_single_vec(unsigned long addr,
>                 int pages = min(nr_pages, max_pages_per_loop);
>                 size_t bytes;
>
> -               /* Get the pages we're interested in */
> -               pages = get_user_pages_unlocked(task, mm, pa, pages,
> -                                               vm_write, 0, process_pages);
> +               /*
> +                * Get the pages we're interested in.  We must
> +                * add FOLL_REMOTE because task/mm might not
> +                * current/current->mm
> +                */
> +               pages = __get_user_pages_unlocked(task, mm, pa, pages,
> +                                                 vm_write, 0, process_pages,
> +                                                 FOLL_REMOTE);
>                 if (pages <= 0)
>                         return -EFAULT;
>
> diff --git a/security/tomoyo/domain.c b/security/tomoyo/domain.c
> index 3865145..ade7c6c 100644
> --- a/security/tomoyo/domain.c
> +++ b/security/tomoyo/domain.c
> @@ -874,7 +874,14 @@ bool tomoyo_dump_page(struct linux_binprm *bprm, unsigned long pos,
>         }
>         /* Same with get_arg_page(bprm, pos, 0) in fs/exec.c */
>  #ifdef CONFIG_MMU
> -       if (get_user_pages(current, bprm->mm, pos, 1, 0, 1, &page, NULL) <= 0)
> +       /*
> +        * This is called at execve() time in order to dig around
> +        * in the argv/environment of the new proceess
> +        * (represented by bprm).  'current' is the process doing
> +        * the execve().
> +        */
> +       if (get_user_pages_remote(current, bprm->mm, pos, 1,
> +                               0, 1, &page, NULL) <= 0)
>                 return false;
>  #else
>         page = bprm->page[pos / PAGE_SIZE];
> diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
> index 3531599..d604e87 100644
> --- a/virt/kvm/async_pf.c
> +++ b/virt/kvm/async_pf.c
> @@ -79,7 +79,13 @@ static void async_pf_execute(struct work_struct *work)
>
>         might_sleep();
>
> -       get_user_pages_unlocked(NULL, mm, addr, 1, 1, 0, NULL);
> +       /*
> +        * This work is run asynchromously to the task which owns
> +        * mm and might be done in another context, so we must
> +        * use FOLL_REMOTE.
> +        */
> +       __get_user_pages_unlocked(NULL, mm, addr, 1, 1, 0, NULL, FOLL_REMOTE);
> +
>         kvm_async_page_present_sync(vcpu, apf);
>
>         spin_lock(&vcpu->async_pf.lock);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ