[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120530000315.GW21339@redhat.com>
Date: Wed, 30 May 2012 02:03:15 +0200
From: Andrea Arcangeli <aarcange@...hat.com>
To: "Kirill A. Shutemov" <kirill@...temov.name>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Hillf Danton <dhillf@...il.com>, Dan Smith <danms@...ibm.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>, Paul Turner <pjt@...gle.com>,
Suresh Siddha <suresh.b.siddha@...el.com>,
Mike Galbraith <efault@....de>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Lai Jiangshan <laijs@...fujitsu.com>,
Bharata B Rao <bharata.rao@...il.com>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
Rik van Riel <riel@...hat.com>,
Johannes Weiner <hannes@...xchg.org>,
Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
Christoph Lameter <cl@...ux.com>
Subject: Re: [PATCH 23/35] autonuma: core
On Tue, May 29, 2012 at 02:45:54PM +0300, Kirill A. Shutemov wrote:
> On Fri, May 25, 2012 at 07:02:27PM +0200, Andrea Arcangeli wrote:
>
> > +static int knumad_do_scan(void)
> > +{
>
> ...
>
> > + if (knumad_test_exit(mm) || !vma) {
> > + mm_autonuma = mm->mm_autonuma;
> > + if (mm_autonuma->mm_node.next != &knumad_scan.mm_head) {
> > + mm_autonuma = list_entry(mm_autonuma->mm_node.next,
> > + struct mm_autonuma, mm_node);
> > + knumad_scan.mm = mm_autonuma->mm;
> > + atomic_inc(&knumad_scan.mm->mm_count);
> > + knumad_scan.address = 0;
> > + knumad_scan.mm->mm_autonuma->numa_fault_pass++;
> > + } else
> > + knumad_scan.mm = NULL;
>
> knumad_scan.mm should be nulled only after list_del otherwise you will
> have race with autonuma_exit():
Thanks for noticing I managed to reproduce it by setting
knuma_scand/scan_sleep_millisecs and
knuma_scand/scan_sleep_pass_millisecs both to 0 and running a loop of
"while :; do memhog -r10 10m &>/dev/null; done".
So the problem was that if knuma_scand would change the knumad_scan.mm
after the mm->mm_users was 0 but before autonuma_exit run,
autonuma_exit wouldn't notice that the mm->mm_auotnuma was already
unlinked and it would unlink again.
autonuma_exit itself doesn't need to tell anything to knuma_scand,
because if it notices knuma_scand.mm == mm, it will do nothing and it
_always_ relies on knumad_scan to unlink it.
And if instead knuma_scand.mm is != mm, then autonuma_exit knows the
knuma_scand daemon will never have a chance to see the "mm" in the
list again if it arrived first (setting mm_autonuma->mm = NULL there
is just a debug tweak according to the comment).
The "serialize" event is there only to wait the knuma_scand main loop
before taking down the mm (it's not related to the list management).
The mm_autonuma->mm is useless after the "mm_autonuma" has been
unlinked so it's ok to use that to track if knuma_scand arrives first.
The exit path of the kernel daemon also forgot to check for
knumad_test_exit(mm) before unlinking, but that only runs if
kthread_should_stop() is true, and nobody calls kthread_stop so it's
only a theoretical improvement.
So this seems to fix it.
diff --git a/mm/autonuma.c b/mm/autonuma.c
index c2a5a82..768250a 100644
--- a/mm/autonuma.c
+++ b/mm/autonuma.c
@@ -679,9 +679,12 @@ static int knumad_do_scan(void)
} else
knumad_scan.mm = NULL;
- if (knumad_test_exit(mm))
+ if (knumad_test_exit(mm)) {
list_del(&mm->mm_autonuma->mm_node);
- else
+ /* tell autonuma_exit not to list_del */
+ VM_BUG_ON(mm->mm_autonuma->mm != mm);
+ mm->mm_autonuma->mm = NULL;
+ } else
mm_numa_fault_flush(mm);
mmdrop(mm);
@@ -770,8 +773,12 @@ static int knuma_scand(void *none)
mm = knumad_scan.mm;
knumad_scan.mm = NULL;
- if (mm)
+ if (mm && knumad_test_exit(mm)) {
list_del(&mm->mm_autonuma->mm_node);
+ /* tell autonuma_exit not to list_del */
+ VM_BUG_ON(mm->mm_autonuma->mm != mm);
+ mm->mm_autonuma->mm = NULL;
+ }
mutex_unlock(&knumad_mm_mutex);
if (mm)
@@ -996,11 +1003,15 @@ void autonuma_exit(struct mm_struct *mm)
mutex_lock(&knumad_mm_mutex);
if (knumad_scan.mm == mm)
serialize = true;
- else
+ else if (mm->mm_autonuma->mm) {
+ VM_BUG_ON(mm->mm_autonuma->mm != mm);
+ mm->mm_autonuma->mm = NULL; /* debug */
list_del(&mm->mm_autonuma->mm_node);
+ }
mutex_unlock(&knumad_mm_mutex);
if (serialize) {
+ /* prevent the mm to go away under knumad_do_scan main loop */
down_write(&mm->mmap_sem);
up_write(&mm->mmap_sem);
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists