Date:	Sat, 11 Jun 2011 09:04:14 -0700 (PDT)
From:	Hugh Dickins <hughd@...gle.com>
To:	Andrea Arcangeli <aarcange@...hat.com>
cc:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Hiroyuki Kamezawa <kamezawa.hiroyuki@...il.com>,
	Ying Han <yinghan@...gle.com>, Dave Jones <davej@...hat.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Sat, 11 Jun 2011, Hiroyuki Kamezawa wrote:
> 2011/6/11 Hugh Dickins <hughd@...gle.com>:
> > On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
> >>
> >> I think this can be a fix.
> >
> > Sorry, I think not: I've not digested your rationale,
> > but three things stand out:
> >
> > 1. Why has this only just started happening?  I may not have run that
> >   test on 3.0-rc1, but surely I ran it for hours with 2.6.39;
> >   maybe not with khugepaged, but certainly with ksmd.
> >
> Not sure. I pointed this just by review because I found "charge" in
> khugepaged is out of mmap_sem now.

Right, Andrea's patch cited below.

> 
> > 2. Your hunk below:
> >> -     if (!mm_need_new_owner(mm, p))
> >> +     if (!mm_need_new_owner(mm, p)) {
> >> +             rcu_assign_pointer(mm->owner, NULL);
> >   is now setting mm->owner to NULL at times when we were sure it did not
> >   need updating before (task is not the owner): you're damaging mm->owner.
> >
> Ah, yes. It's my mistake.
> 
> > 3. There's a patch from Andrea in 3.0-rc1 which looks very likely to be
> >   relevant, 692e0b35427a "mm: thp: optimize memcg charge in khugepaged".
> >   I'll try reproducing without that tonight (I crashed in 20 minutes
> >   this morning, so it's not too hard).
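
To illustrate point 2, here is a minimal user-space sketch, not kernel
code.  The struct and function names (task, mm, need_new_owner,
exit_original, exit_patched) are hypothetical stand-ins, the check is a
simplified model of mm_need_new_owner(), and the search for a new owner
is left out.  It only shows that the proposed hunk clears the owner
even when the exiting task was never the owner.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for task_struct and mm_struct. */
struct task { const char *name; };
struct mm   { struct task *owner; int mm_users; };

/*
 * Simplified model of mm_need_new_owner(): a handover is needed only
 * when the exiting task p is the current owner and other users remain.
 */
static int need_new_owner(const struct mm *mm, const struct task *p)
{
	if (mm->mm_users <= 1)
		return 0;
	if (mm->owner != p)
		return 0;
	return 1;
}

/* Behaviour before the proposed hunk: no handover needed, owner untouched. */
static void exit_original(struct mm *mm, struct task *p)
{
	if (!need_new_owner(mm, p))
		return;
	/* search for a new owner omitted in this sketch */
}

/*
 * Behaviour with the proposed hunk: owner is cleared whenever no
 * handover is needed, including when p was not the owner at all.
 */
static void exit_patched(struct mm *mm, struct task *p)
{
	if (!need_new_owner(mm, p)) {
		mm->owner = NULL;
		return;
	}
	/* search for a new owner omitted in this sketch */
}
```

With task A as owner and a non-owner task B exiting, exit_original()
leaves the owner pointing at A, while exit_patched() sets it to NULL
although A is still alive.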

I had another go at reproducing it, and it took 2 hours to crash that
time.  Then I tried with 692e0b35427a reverted: it ran overnight for
9 hours without crashing, at which point I stopped it.

Andrea, please would you ask Linus to revert that commit before -rc3?
Or is there something else you'd like us to try instead?  I admit that
I've not actually taken the time to think through exactly how it goes
wrong, but it does look dangerous.

The way I reproduce it is with my tmpfs kbuilds swapping load,
in this case restricting mem by memcg, and (perhaps the important
detail, not certain) doing concurrent swapoff/swapon repeatedly -
swapoff takes another mm_users reference to the mm it's working on,
which can cause surprises.

Hugh
