Date:	Fri, 18 Mar 2016 10:17:41 +0900
From:	Minchan Kim <minchan@...nel.org>
To:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:	Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Joonsoo Kim <js1304@...il.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH v3 1/5] mm/zsmalloc: introduce class auto-compaction

Hi Sergey,

On Thu, Mar 17, 2016 at 10:29:30AM +0900, Sergey Senozhatsky wrote:
> Hello Minchan,
> 
> On (03/15/16 15:17), Minchan Kim wrote:
> [..]
> > > hm, in this scenario both solutions are less than perfect. we jump
> > > X times over 40% margin, we have X*NR_CLASS compaction scans in the
> > > end. the difference is that we queue less works, yes, but we don't
> > > have to use workqueue in the first place; compaction can be done
> > > asynchronously by a pool's dedicated kthread. so we will just
> > > wake_up() the process.
> > 
> > Hmm, kthread is over-engineered to me. If we want to create new kthread
> > in the system, I guess we should persuade many people to merge in.
> > Surely, we should have why it couldn't be done by others(e.g., workqueue).
> > 
> > I think your workqueue approach is good to me.
> > Only problem I can see with it is we cannot start compaction when
> > we want instantly so my conclusion is we need both direct and
> > background compaction.
> 
> well, if we will keep the shrinker callbacks then it's not such a huge
> issue, IMHO. for that type of forward progress guarantees we can have
> our own, dedicated, workqueue with a rescuer thread (WQ_MEM_RECLAIM).

By direct compaction I meant the shrinker, and by background
compaction I meant the workqueue.
So do you mean that you agree to keep the shrinker?
And do you want to use a workqueue with WQ_MEM_RECLAIM rather than
a new kthread?

> 
> > > > If zs_free(or something) realizes current fragment is over 4M,
> > > > kick compaction background job.
> > > 
> > > yes, zs_free() is the only place that introduces fragmentation.
> > > 
> > > > The job scans from highest to lower class and compact zspages
> > > > in each size_class until it meets high watermark(e.g, 4M + 4M /2 =
> > > > 6M fragment ratio).
> 
> just thought... I think it'll be tricky to implement this. We scan classes
> from HIGH class_size to SMALL class_size, counting fragmentation value and
> re-calculating the global fragmentation all the time; once the global
> fragmentation passes the watermark, we start compacting from HIGH to
> SMALL. the problem here is that as soon as we calculated the class B
> fragmentation index and moved to class A we can't trust B anymore. classes
> are not locked and absolutely free to change. so the global fragmentation
> index likely will be inaccurate.
> 

Actually, I don't think such inaccuracy will cause big trouble here.
But how about this simple idea?

If zs_free finds that the wasted space is bigger than a user-defined
threshold (e.g., 10M), zs_free can queue work for background
compaction (should that background compaction workqueue be
WQ_MEM_RECLAIM | WQ_CPU_INTENSIVE?). Once that work is executed, it
compacts every size_class unconditionally.

With it, we get less background compaction and a simpler algorithm,
and with WQ_CPU_INTENSIVE it does not harm other works.
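
A minimal userspace sketch of the idea above, not zsmalloc code. All
names here (model_pool, model_free, model_compact_work, NR_CLASSES) are
hypothetical, the work_pending flag only stands in for a queued work
item, and it assumes for simplicity that compaction recovers all of the
wasted space in a class:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CLASSES 4	/* hypothetical; real zsmalloc has many more */

/* Hypothetical model types, not the zsmalloc data structures. */
struct model_class {
	size_t wasted;		/* bytes lost to fragmentation in this class */
};

struct model_pool {
	struct model_class classes[NR_CLASSES];
	size_t wasted_total;	/* sum of classes[i].wasted */
	size_t threshold;	/* user-defined trigger, e.g. 10M */
	bool work_pending;	/* stands in for a queued work item */
};

/*
 * Free path: account the new hole and kick background compaction once
 * the pool-wide waste exceeds the threshold. Returns true only when
 * this call queued the work.
 */
static bool model_free(struct model_pool *pool, int class_idx, size_t hole)
{
	assert(class_idx >= 0 && class_idx < NR_CLASSES);

	pool->classes[class_idx].wasted += hole;
	pool->wasted_total += hole;

	if (pool->wasted_total > pool->threshold && !pool->work_pending) {
		pool->work_pending = true;
		return true;
	}
	return false;
}

/*
 * Work function: compact every class unconditionally, from the highest
 * class index down. Returns the number of bytes reclaimed.
 */
static size_t model_compact_work(struct model_pool *pool)
{
	size_t reclaimed = 0;

	for (int i = NR_CLASSES - 1; i >= 0; i--) {
		reclaimed += pool->classes[i].wasted;
		pool->classes[i].wasted = 0;
	}
	pool->wasted_total = 0;
	pool->work_pending = false;
	return reclaimed;
}
```

The point of the pending flag is that repeated frees above the
threshold queue one work item, not one per free.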

> so I'm thinking about triggering a global compaction from zs_free() (to
> queue less works), but instead of calculating global watermark and compacting
> afterwards, just compact every class that has fragmentation over XY% (for
> example 30%). "iterate from HI to LO and compact everything that is too
> fragmented".

The problem with that approach is that we may compact only a small
size class whose fragment ratio is higher than a bigger size class's,
even though its compaction benefit is smaller than that of the bigger
size class with the lower fragment ratio.
With that, we continue to need background work until it meets the
user-defined global threshold.
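
To make the concern concrete, here is a small userspace sketch with
made-up numbers. The names frag_class, frag_ratio_pct and
compact_by_ratio are hypothetical, and it assumes compacting a class
recovers all of its wasted bytes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-class accounting, in bytes. */
struct frag_class {
	size_t used;
	size_t wasted;
};

/* Fragment ratio of one class as an integer percentage. */
static unsigned int frag_ratio_pct(const struct frag_class *c)
{
	size_t total = c->used + c->wasted;

	return total ? (unsigned int)(c->wasted * 100 / total) : 0;
}

/*
 * Compact only the classes whose ratio is above ratio_pct and return
 * the wasted bytes that remain in the pool afterwards.
 */
static size_t compact_by_ratio(struct frag_class *classes, int n,
			       unsigned int ratio_pct)
{
	size_t remaining = 0;

	assert(classes != NULL && n >= 0);

	for (int i = n - 1; i >= 0; i--) {
		if (frag_ratio_pct(&classes[i]) > ratio_pct)
			classes[i].wasted = 0;
		remaining += classes[i].wasted;
	}
	return remaining;
}
```

With a small class at 1M used / 1M wasted (50%) and a big class at
32M used / 8M wasted (20%), a 30% per-class limit compacts only the
small class and leaves the 8M of waste in the big class untouched.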

> 
> we still need some sort of a pool->compact_ts timestamp to prevent too
> frequent compaction jobs.

Yes, we need some throttling mechanism. I need time to think more. :)
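
Just as one possible shape for the timestamp idea, not a proposal for
the final design: the sketch below is a userspace model where the
struct, the min_interval field and throttle_allow() are all
hypothetical, and time is a plain millisecond counter assumed to be
monotonic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical throttle state; compact_ts mirrors the name used above. */
struct throttle {
	uint64_t compact_ts;	/* time of the last accepted job, in ms */
	uint64_t min_interval;	/* assumed minimum gap between jobs, in ms */
	bool ever_ran;		/* false until the first job is accepted */
};

/*
 * Return true and record the timestamp if a compaction job may start
 * now; return false if the previous job started too recently.
 */
static bool throttle_allow(struct throttle *t, uint64_t now_ms)
{
	/* The model assumes the clock never goes backwards. */
	assert(!t->ever_ran || now_ms >= t->compact_ts);

	if (t->ever_ran && now_ms - t->compact_ts < t->min_interval)
		return false;

	t->compact_ts = now_ms;
	t->ever_ran = true;
	return true;
}
```

A rejected request does not move the timestamp, so a steady stream of
requests still gets one job per interval.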

Thanks.
