lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 11 Feb 2019 16:08:10 -0800
From:   Ira Weiny <>
To:     Jason Gunthorpe <>
Cc:     Dan Williams <>,
        John Hubbard <>,
        linux-rdma <>,
        Linux Kernel Mailing List <>,
        Linux MM <>,
        Daniel Borkmann <>,
        Davidlohr Bueso <>,
        Netdev <>,
        Mike Marciniszyn <>,
        Dennis Dalessandro <>,
        Doug Ledford <>,
        Andrew Morton <>,
        "Kirill A. Shutemov" <>
Subject: Re: [PATCH 2/3] mm/gup: Introduce get_user_pages_fast_longterm()

On Mon, Feb 11, 2019 at 04:25:10PM -0700, Jason Gunthorpe wrote:
> On Mon, Feb 11, 2019 at 02:55:10PM -0800, Dan Williams wrote:
> > > I also wonder if someone should think about making fast into a flag
> > > too..
> > >
> > > But I'm not sure when fast should be used vs when it shouldn't :(
> > 
> > Effectively fast should always be used just in case the user cares
> > about performance. It's just that it may fail and need to fall back to
> > requiring the vma.
> But the fall back / slow path is hidden inside the API, so when should
> the caller care? 
> ie when should the caller care to use gup_fast vs gup_unlocked? (the
> comments say they are the same, but this seems to be a mistake)
> Based on some of the comments in the code it looks like this API is
> trying to convert itself into:
> long get_user_pages_locked(struct task_struct *tsk, struct mm_struct *mm,
>                            unsigned long start, unsigned long nr_pages,
> 			   unsigned int gup_flags, struct page **pages,
> 			   struct vm_area_struct **vmas, bool *locked)
> long get_user_pages_unlocked(struct task_struct *tsk, struct mm_struct *mm,
>                              unsigned long start, unsigned long nr_pages,
> 			     unsigned int gup_flags, struct page **pages)
> (and maybe a FOLL_FAST if there is some reason we have _fast and
> _unlocked)
> The reason I ask, is that if there is no reason for fast vs unlocked
> then maybe Ira should convert HFI to use gup_unlocked and move the
> 'fast' code into unlocked?
> ie move incrementally closer to the desired end-state here.

If the pages are not in the page tables then fast is probably going to be
slightly slower because it will have to fall back after walking the tables and
finding something missing.

For PSM2 (MPI) applications are performance improvement was probably because
the memory in question was in the page tables and very much in use.


> Jason

Powered by blists - more mailing lists