Date:   Wed, 4 May 2022 22:04:44 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Matthew Wilcox <willy@...radead.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Waiman Long <longman@...hat.com>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        linux-kernel@...r.kernel.org
Subject: Re: Wait for mutex to become unlocked

On Thu, May 05, 2022 at 03:14:53AM +0200, Thomas Gleixner wrote:
> On Thu, May 05 2022 at 01:38, Matthew Wilcox wrote:
> > On Thu, May 05, 2022 at 02:22:30AM +0200, Thomas Gleixner wrote:
> >> > That is, rwsem_wait_read() puts the thread on the rwsem's wait queue,
> >> > and wakes it up without giving it the lock.  Now this thread will never
> >> > be able to block any thread that tries to acquire mmap_sem for write.
> >> 
> >> Never?
> >> 
> >>  	if (down_read_trylock(&vma->sem)) {
> >> 
> >> ---> preemption by writer
> >
> > Ah!  This is a different semaphore.  Yes, it can be preempted while
> > holding the VMA rwsem and block a thread which is trying to modify the
> > VMA which will then block all threads from faulting _on that VMA_,
> > but it won't affect page faults on any other VMA.
> 
> Ooops. Missed that detail. Too many semaphores here.
> 
> > It's only Better, not Best (the Best approach was proposed on Monday
> > afternoon, and the other MM developers asked us to only go as far as
> > Better and see if that was good enough).
> 
> :)
> 
> >> The information gathered from /proc/pid/smaps is unreliable at the point
> >> where the lock is dropped already today. So it does not make a
> >> difference whether the VMAs have a 'read me if you really think it's
> >> useful' sideband information which gets updated when the VMA changes and
> >> allows to do:
> >
> > Mmm.  I'm not sure that we want to maintain the smaps information on
> > the off chance that somebody wants to query it.
> 
> Fair enough, but then the question is whether it's more reasonable to
> document that if you want to read that nonsense, then you have to live
> with the consequences. The problem with many of those interfaces is that
> they have been added for whatever reasons, became ABI and people are
> suddenly making performance claims which might not be justified at all.
> 
> We really have to make our mind up and make decisions whether we want to
> solve every "I want a pony" complaint just because.
> 
> >> But looking at the stuff which gets recomputed and reevaluated in that
> >> proc/smaps code this makes a lot of sense, because most if not all of
> >> this information is already known at the point where the VMA is modified
> >> while holding mmap_sem for useful reasons, no?
> >
> > I suspect the only way to know is to try to implement it, and then
> > benchmark it.
> 
> Sure. There are other ways than having a RCU protected info, e.g. a
> sequence count which ensures that the to be read information is
> consistent.

So the thought is to maintain the /proc/smaps information separately,
so that it can just be read out, correct?  If so...

As you say, sequence counts can check consistency, but something else
is required to protect any dereferences of pointers to data that might
be freed.  One approach is to place the /proc/smaps information somewhere
that cannot be freed during a /proc/smaps scan.  The place that comes
immediately to mind is the mm_struct, but I suspect that the /proc/smaps
information will need to be variable length, especially on 64-bit systems.
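
To make the sequence-count idea concrete, here is a minimal userspace
sketch using C11 atomics rather than the kernel's seqcount API.  It assumes
a fixed-size summary embedded in a longer-lived structure (standing in for
the mm_struct case above) and a single writer at a time, as would be the
case if updates happened with mmap_sem held for write.  The names
summary_box, smaps_summary, rss_kb, and pss_kb are hypothetical, chosen
only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical snapshot handed back to the reader. */
struct smaps_summary {
	unsigned long rss_kb;
	unsigned long pss_kb;
};

/* Hypothetical summary embedded in a structure that outlives all readers. */
struct summary_box {
	atomic_uint seq;		/* even: stable, odd: update in progress */
	_Atomic unsigned long rss_kb;
	_Atomic unsigned long pss_kb;
};

/* Writer side.  Assumes callers are serialized by some outer lock. */
void summary_write(struct summary_box *b, unsigned long rss, unsigned long pss)
{
	unsigned int s = atomic_load_explicit(&b->seq, memory_order_relaxed);

	assert((s & 1) == 0);	/* no other update may be in flight */
	atomic_store_explicit(&b->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&b->rss_kb, rss, memory_order_relaxed);
	atomic_store_explicit(&b->pss_kb, pss, memory_order_relaxed);
	atomic_store_explicit(&b->seq, s + 2, memory_order_release);
}

/* Reader side.  Returns true only if *out is a consistent snapshot. */
bool summary_try_read(struct summary_box *b, struct smaps_summary *out)
{
	unsigned int s1 = atomic_load_explicit(&b->seq, memory_order_acquire);

	if (s1 & 1)
		return false;	/* writer is mid-update */
	out->rss_kb = atomic_load_explicit(&b->rss_kb, memory_order_relaxed);
	out->pss_kb = atomic_load_explicit(&b->pss_kb, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	return s1 == atomic_load_explicit(&b->seq, memory_order_relaxed);
}

/* Retry until a consistent snapshot is obtained. */
void summary_read(struct summary_box *b, struct smaps_summary *out)
{
	while (!summary_try_read(b, out))
		;
}
```

Note that the sequence count only detects a torn read.  It does nothing to
keep summary_box itself alive, which is why the sketch depends on the
enclosing structure not being freed during the scan.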

Another approach is to allocate space for the /proc/smaps information
dynamically, using RCU to protect reads of that information and nothing else.
But you seem to be thinking of something else.  Or maybe your point is
that the use of RCU can be restricted to this /proc/smaps information?

Yet another approach is to use reference counts, but of course the counts
need to live outside of the structure being protected.  If the summary
information is not to block expansion of the address space (which is
the asked-for pony), this gets tricky due to the need to quickly and
repeatedly enlarge the memory holding the /proc/smaps information.

Or am I missing a trick here?

							Thanx, Paul
