lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 1 Sep 2015 10:58:11 -0400
From:	Jerome Glisse <jglisse@...hat.com>
To:	Mark Hairgrove <mhairgrove@...dia.com>
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, Linus Torvalds <torvalds@...ux-foundation.org>,
	joro@...tes.org, Mel Gorman <mgorman@...e.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Johannes Weiner <jweiner@...hat.com>,
	Larry Woodman <lwoodman@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	Dave Airlie <airlied@...hat.com>,
	Brendan Conoboy <blc@...hat.com>,
	Joe Donohue <jdonohue@...hat.com>,
	Christophe Harle <charle@...dia.com>,
	Duncan Poole <dpoole@...dia.com>,
	Sherry Cheung <SCheung@...dia.com>,
	Subhash Gutti <sgutti@...dia.com>,
	John Hubbard <jhubbard@...dia.com>,
	Lucien Dunning <ldunning@...dia.com>,
	Cameron Buschardt <cabuschardt@...dia.com>,
	Arvind Gopalakrishnan <arvindg@...dia.com>,
	Haggai Eran <haggaie@...lanox.com>,
	Shachar Raindel <raindel@...lanox.com>,
	Liran Liss <liranl@...lanox.com>,
	Roland Dreier <roland@...estorage.com>,
	Ben Sander <ben.sander@....com>,
	Greg Stoner <Greg.Stoner@....com>,
	John Bridgman <John.Bridgman@....com>,
	Michael Mantor <Michael.Mantor@....com>,
	Paul Blinzer <Paul.Blinzer@....com>,
	Leonid Shamis <Leonid.Shamis@....com>,
	Laurent Morichetti <Laurent.Morichetti@....com>,
	Alexander Deucher <Alexander.Deucher@....com>
Subject: Re: [PATCH 02/15] mmu_notifier: keep track of active invalidation
 ranges v4

On Mon, Aug 31, 2015 at 08:27:17PM -0700, Mark Hairgrove wrote:
> On Thu, 13 Aug 2015, Jérôme Glisse wrote:

[...]

Will fix syntax.

[...]

> > +/* mmu_notifier_range_wait_valid() - wait for a range to have no conflict with
> > + * active invalidation.
> > + *
> > + * @mm: The mm struct.
> > + * @start: Start address of the range (inclusive).
> > + * @end: End address of the range (exclusive).
> > + *
> > + * This function wait for any active range invalidation that conflict with the
> > + * given range, to end.
> > + *
> > + * Note by the time this function return a new range invalidation that conflict
> > + * might have started. So you need to atomically block new range and query
> > + * again if range is still valid with mmu_notifier_range_is_valid(). So call
> > + * sequence should be :
> > + *
> > + * again:
> > + * mmu_notifier_range_wait_valid()
> > + * // block new invalidation using that lock inside your range_start callback
> > + * lock_block_new_invalidation()
> > + * if (!mmu_notifier_range_is_valid())
> > + *     goto again;
> > + * unlock()
> 
> I think this example sequence can deadlock so I wouldn't want to encourage 
> its use. New invalidation regions are added to the list before the 
> range_start callback is invoked.
> 
> Thread A                           Thread B
> -----------------                  -----------------
> mmu_notifier_range_wait_valid
> // returns
>                                    __mmu_notifier_invalidate_range_start
>                                      list_add_tail
> lock_block_new_invalidation
>                                      ->invalidate_range_start
>                                        // invalidation blocked in callback
> mmu_notifier_range_is_valid // fails
> goto again
> mmu_notifier_range_wait_valid // deadlock
> 
> mmu_notifier_range_wait_valid can't finish until thread B's callback 
> returns, but thread B's callback can't return because it's blocked.
> 
> I see that HMM in later patches takes the approach of not holding the lock 
> when mmu_notifier_range_is_valid returns false. Instead of stalling new 
> invalidations it returns -EAGAIN to the caller. While that resolves the 
> deadlock, it won't prevent the faulting thread from being starved in the 
> pathological case.

The comment here is not clear, what HMM does is what is intended. If
mmu_notifier_range_is_valid() return false then you drop lock and try
again. I am not sure we should care about the starve case as it can
not happen, it would mean that something keeps invalidating over and
over the same range of address space of a process. I do not see how
such thing would happen.

> 
> Is it out of the question to build a lock into the mmu notifier API 
> directly? It's a little worrisome to me that the complexity for this 
> locking is pushed into the callbacks rather than handled in the core. 
> Something like this:
> 
>     mmu_notifier_range_lock(start, end)
>     mmu_notifier_range_unlock(start, end)

If a range is about to be invalidated it is better to avoid faulting
in memory in the device as that same memory is about to be invalidated.
This is why i always have invalidation take precedence over device fault.


> If that's not feasible and we have to stick with the current approach, 
> then I suggest changing the "valid" name. "valid" doesn't have a clear 
> meaning at first glance because the reader doesn't know what would make a 
> range "valid." How about "active" instead? Then the names would look 
> something like this, assuming the polarity matches their current versions:
> 
>     mmu_notifier_range_inactive_locked
>     mmu_notifier_range_inactive
>     mmu_notifier_range_wait_active

Those names are better i will update the patch accordingly.

Thanks for the review,
Jérôme
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ