lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 22 May 2019 23:32:53 -0700
From:   John Hubbard <>
To:     Jason Gunthorpe <>,
        Ralph Campbell <>
CC:     <>, <>,
        Ira Weiny <>,
        Dan Williams <>,
        Arnd Bergmann <>,
        Balbir Singh <>,
        Dan Carpenter <>,
        Matthew Wilcox <>,
        Souptick Joarder <>,
        Andrew Morton <>
Subject: Re: [PATCH 5/5] mm/hmm: Fix mm stale reference use in hmm_free()

On 5/22/19 6:25 PM, Jason Gunthorpe wrote:
> On Wed, May 22, 2019 at 05:54:17PM -0700, Ralph Campbell wrote:
>> On 5/22/19 4:36 PM, Jason Gunthorpe wrote:
>>> On Mon, May 06, 2019 at 04:35:14PM -0700, wrote:
>>>> From: Ralph Campbell <>
>>>> The last reference to struct hmm may be released long after the mm_struct
>>>> is destroyed because the struct hmm_mirror memory may be part of a
>>>> device driver open file private data pointer. The file descriptor close
>>>> is usually after the mm_struct is destroyed in do_exit(). This is a good
>>>> reason for making struct hmm a kref_t object [1] since its lifetime spans
>>>> the life time of mm_struct and struct hmm_mirror.
>>>> The fix is to not use hmm->mm in hmm_free() and to clear mm->hmm and
>>>> hmm->mm pointers in hmm_destroy() when the mm_struct is
>>>> destroyed.
>>> I think the right way to fix this is to have the struct hmm hold a
>>> mmgrab() on the mm so its memory cannot go away until all of the hmm
>>> users release the struct hmm, hmm_ranges/etc
>>> Then we can properly use mmget_not_zero() instead of the racy/abnormal
>>> 'if (hmm->xmm == NULL || hmm->dead)' pattern (see the other
>>> thread). Actually looking at this, all these tests look very
>>> questionable. If we hold the mmget() for the duration of the range
>>> object, as Jerome suggested, then they all get deleted.
>>> That just leaves mmu_notifier_unregister_no_relase() as the remaining
>>> user of hmm->mm (everyone else is trying to do range->mm) - and it
>>> looks like it currently tries to call
>>> mmu_notifier_unregister_no_release on a NULL hmm->mm and crashes :(
>>> Holding the mmgrab fixes this as we can safely call
>>> mmu_notifier_unregister_no_relase() post exit_mmap on a grab'd mm.
>>> Also we can delete the hmm_mm_destroy() intrustion into fork.c as it
>>> can't be called when the mmgrab is active.
>>> This is the basic pattern we used in ODP when working with mmu
>>> notifiers, I don't know why hmm would need to be different.

+1 for the mmgrab() approach. I have never been able to see how these
various checks can protect anything, and refcounting it into place definitely
sounds like the right answer.

John Hubbard

Powered by blists - more mailing lists