Message-ID: <CAGfvh62kn1px+NmV3vJ_-KtJNP7LH48QTiJ6vYrg3AC4jEZPMA@mail.gmail.com>
Date:	Wed, 20 Jan 2016 09:21:11 -0600
From:	Russell Knize <rknize@...orola.com>
To:	Minchan Kim <minchan@...nel.org>
Cc:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
	Junil Lee <junil0814.lee@....com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Nitin Gupta <ngupta@...are.org>, linux-mm@...ck.org,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition

Yes, I saw your v5 and have already started testing it.  I suspect it
will be stable, as the key for us was to set that bit before the
store.  We were only seeing it on ARM32, but those platforms tend to
perform compaction far more often due to the memory pressure.  We
don't see it at all anymore.
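
To illustrate why the ordering matters, here is a minimal user-space
sketch using C11 atomics.  It is not the zsmalloc code: trypin_tag(),
unpin_tag(), record_obj() and free_path_can_race() are simplified
stand-ins written for this example, the handle values 0x1000/0x2000 are
made up, and the sketch assumes bit 0 of the handle word is the pin
(lock) bit and is never used by the object value itself.  It runs the
two paths in sequence on one thread, so it only shows the window, not a
real race.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 0 of the handle word is the pin (lock) bit. */
#define HANDLE_PIN_BIT 0
#define PIN_MASK ((uintptr_t)1 << HANDLE_PIN_BIT)

/* Try to take the per-handle lock by atomically setting the pin bit. */
static bool trypin_tag(_Atomic uintptr_t *handle)
{
    uintptr_t old = atomic_fetch_or_explicit(handle, PIN_MASK,
                                             memory_order_acquire);
    return !(old & PIN_MASK);   /* true if the bit was clear before */
}

/* Drop the lock: clear the pin bit with release ordering. */
static void unpin_tag(_Atomic uintptr_t *handle)
{
    uintptr_t old = atomic_fetch_and_explicit(handle, ~PIN_MASK,
                                              memory_order_release);
    assert(old & PIN_MASK);     /* caller must have held the pin */
    (void)old;
}

/* Overwrite the whole handle word, pin bit included, in one store. */
static void record_obj(_Atomic uintptr_t *handle, uintptr_t obj)
{
    atomic_store_explicit(handle, obj, memory_order_relaxed);
}

/*
 * Returns true if the "free" path could take the pin while the
 * "migration" path still believes it holds it.
 */
static bool free_path_can_race(bool keep_pin_bit)
{
    _Atomic uintptr_t handle = 0x1000;   /* hypothetical old location */
    uintptr_t new_obj = 0x2000;          /* hypothetical new location */

    bool pinned = trypin_tag(&handle);   /* migration path pins */
    assert(pinned);
    (void)pinned;

    if (keep_pin_bit)
        new_obj |= PIN_MASK;             /* set the bit before the store */
    record_obj(&handle, new_obj);

    /* A free path arriving now tries to pin the same handle. */
    bool raced = trypin_tag(&handle);
    if (!raced)
        unpin_tag(&handle);              /* migration path unlocks normally */
    return raced;
}
```

In this sketch free_path_can_race(false) reports that the second pin
attempt succeeds, because the plain store dropped the bit while the
first holder was still working; free_path_can_race(true) reports that
it fails until unpin_tag() runs.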

Honestly, at first I didn't think setting the bit would help that much
as I assumed it was the barrier in the clear_bit_unlock() that
mattered.  Then I saw the same sort of race happening in the page
migration stuff I've been working on.  I had done the same type of
"optimization" there and in fact did not call unpin_tag() at all after
updating the object handles with the bit dropped.

Russ

On Wed, Jan 20, 2016 at 1:00 AM, Minchan Kim <minchan@...nel.org> wrote:
> Hello Russ,
>
> On Tue, Jan 19, 2016 at 09:47:12AM -0600, Russell Knize wrote:
>>    Just wanted to ack this, as we have been seeing the same problem (weird
>>    race conditions during compaction) and fixed it in the same way a few
>>    weeks ago (resetting the pin bit before recording the obj).
>>    Russ
>
> First of all, thanks for your comment.
>
> The patch you tested has a problem, although it's really subtle (i.e.,
> it doesn't do store tearing when I disassemble ARM{32|64}), but it
> could potentially be a problem on other architectures or future ARM.
> For the right fix, I sent v5 - https://lkml.org/lkml/2016/1/18/263.
> If you can prove it fixes your problem, please send a Tested-by to the
> thread.  Testing is really valuable for stable material.
>
> Thanks!
