linux-kernel - RE: [GIT PULL] mm: frontswap (for 3.2 window)

Message-ID: <1320219877.3091.22.camel@dabdike>
Date:	Wed, 02 Nov 2011 11:44:37 +0400
From:	James Bottomley <James.Bottomley@...senPartnership.com>
To:	Dan Magenheimer <dan.magenheimer@...cle.com>
Cc:	John Stoffel <john@...ffel.org>,
	Johannes Weiner <jweiner@...hat.com>,
	Pekka Enberg <penberg@...nel.org>,
	Cyclonus J <cyclonusj@...il.com>,
	Sasha Levin <levinsasha928@...il.com>,
	Christoph Hellwig <hch@...radead.org>,
	David Rientjes <rientjes@...gle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-mm@...ck.org, LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Konrad Wilk <konrad.wilk@...cle.com>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	Seth Jennings <sjenning@...ux.vnet.ibm.com>, ngupta@...are.org,
	Chris Mason <chris.mason@...cle.com>, JBeulich@...ell.com,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	Jonathan Corbet <corbet@....net>
Subject: RE: [GIT PULL] mm: frontswap (for 3.2 window)

On Tue, 2011-11-01 at 11:10 -0700, Dan Magenheimer wrote:
[...]
> For clarity and brevity, let's call the three cases:
> 
> Case A) CONFIG_FRONTSWAP=n
> Case B) CONFIG_FRONTSWAP=y and no tmem backend registers
> Case C) CONFIG_FRONTSWAP=y and a tmem backend DOES register
> 
> There are no interactions in Case A, agreed?  I'm not sure
> if it is clear, but in Case B every hook checks to
> see if a tmem backend is registered... if not, the
> hook is a no-op except for the addition of a
> compare-pointer-against-NULL op, so there is no
> interaction there either.
> 
> So the only case where interactions are possible is
> Case C, which currently only can occur if a user
> specifies a kernel boot parameter of "tmem" or "zcache".
> (I know, a bit ugly, but there's a reason for doing
> it this way, at least for now.)

OK, so what I'd like to see is benchmarks for B and C.  B should confirm
your contention of no cost (which is the ideal anyway) and C quantifies
the passive cost to users.
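
To make the Case B claim concrete, here is a minimal sketch of the kind of
hook being described: a single pointer-compare-against-NULL, after which the
caller falls through to the normal swap path.  This is not the code from the
frontswap patch; struct tmem_backend_ops, backend_ops, register_backend()
and hook_store() are hypothetical names invented for the illustration.

```c
#include <stddef.h>

/* Hypothetical backend operations table (illustrative, not the kernel's). */
struct tmem_backend_ops {
    /* Returns 0 if the backend accepted the page, nonzero otherwise. */
    int (*store)(unsigned long offset, const void *page);
};

/* NULL until a backend registers: this is Case B. */
static const struct tmem_backend_ops *backend_ops;

/* Case C begins once a backend calls this. */
static void register_backend(const struct tmem_backend_ops *ops)
{
    backend_ops = ops;
}

/*
 * Hook placed on the swap-out path.  Returns 0 if the backend took the
 * page, -1 if the caller must write the page to the swap device as usual.
 */
static int hook_store(unsigned long offset, const void *page)
{
    if (backend_ops == NULL)    /* Case B: the only added work */
        return -1;
    return backend_ops->store(offset, page);    /* Case C */
}
```

In this sketch the Case B cost is the single comparison in hook_store(); a
benchmark for B would measure whether that is visible at all, and one for C
would measure whatever the registered store callback adds.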

[...]
> Can we agree that if frontswap is doing its job properly on
> any "normal" workload that is swapping, it is improving on a
> bad situation?

No, not without a set of benchmarks ... that's rather the point of doing
them.

> Then let's get back to your implied question about _negative_
> data.  As described above there is NO impact for Case A
> and Case B.  (The zealot will point out that a pointer-compare
> against-NULL per page-swapped-in/out is not "NO" impact,
> but let's ignore him for now.)  In Case C, there are
> demonstrated benefits for SOME workloads... will frontswap
> HARM some workloads?
> 
> I have openly admitted that for _cleancache_ on _zcache_,
> sometimes the cost can exceed the benefits, and this was
> actually demonstrated by one user on lkml.  For _frontswap_
> it's really hard to imagine even a very contrived workload
> where frontswap fails to provide an advantage.  I suppose
> maybe if your swap disk lives on a PCI SSD and your CPU
> is an ancient single-core which does extremely slow copying
> and compression?
> 
> IOW, I feel like you are giving me busywork, and any additional
> evidence I present you will wave away anyway.

Well, OK, so there's a performance issue in some workloads.  What the
above is basically asking is: how bad is it, and how widespread?

> > > I understand that some kernel developers (mostly from one
> > > company) continue to completely discount Xen, and
> > > thus won't even look at the Xen results.  IMHO
> > > that is mudslinging.
> > 
> > OK, so let's look at this another way:  one of the signs of a good ABI is
> > generic applicability.  Any good virtualisation ABI should thus work for
> > all virtualisation systems (including VMware should they choose to take
> > advantage of it).  The fact that transcendent memory only seems to work
> > well for Xen is a red flag in this regard.
> 
> I think the tmem ABI will work fine with any virtualization system,
> and particularly frontswap will.  There are some theoretical arguments
> that KVM will get little or no benefit, but those arguments
> pertain primarily to cleancache.  And I've noted that the ABI
> was designed to be very extensible, so if KVM wants a batching
> interface, they can add one.  To repeat from the LWN KS2011 report:
> 
>   "[Linus] stated that, simply, code that actually is used is
>    code that is actually worth something... code aimed at
>    solving the same problem is just a vague idea that is
>    worthless by comparison...  Even if it truly is crap,
>    we've had crap in the kernel before.  The code does not
>    get better out of tree."
> 
> AND the API/ABI clearly supports other non-virtualization uses
> as well.  The in-kernel hooks are very simple and the layering
> is very clean.  The ABI is extensible, has been published for
> nearly three years, and successfully rev'ed once (to accommodate
> 192-bit exportfs handles for cleancache).  Your arguments are on
> very thin ice here.
> 
> It sounds like you are saying that unless/until KVM has a completed
> measurable implementation... and maybe VMware and Hyper-V as well...
> you don't think the tiny set of hooks that are frontswap should
> be merged.  If so, that "red flag" sounds self-serving, not what I
> would expect from someone like you.  Sorry.

Hm, straw man and ad hominem.  What I said was "one of the signs of a
good ABI is generic applicability".  That doesn't mean you have to apply
an ABI to every situation by coming up with a demonstration for the use
case.  It does mean that people should know how to do it.  I'm not
particularly interested in the hypervisor wars, but it does seem to me
that there are legitimate questions about the applicability of this to
KVM.

[...]
> > > But I have never suggested that every kernel should always
> > > unconditionally compile-time-enable and run-time-enable
> > > frontswap... simply that it should be in-tree so those
> > > who wish to enable it are able to enable it.
> > 
> > In practise, most useful ABIs end up being compiled in ... and useful
> > basically means useful to any constituency, however small.  If your ABI
> > is useless, then fine, we don't have to worry about the configured but
> > inactive case (but then again, we wouldn't have to worry about the ABI
> > at all).  If it has a use, then kernels will end up shipping with it
> > configured in which is why the inactive performance impact is so
> > important to quantify.
> 
> So do you now understand/agree that the inactive performance is zero
> and the interaction of an inactive configuration with the remainder
> of the MM subsystem is zero?  And that you and your users will be
> completely unaffected unless you/they intentionally turn it on,
> not only compiled in, but explicitly at runtime as well?

As I said above, just benchmark it for B and C. As long as nothing nasty
is happening, I'm fine with it.

> So... understanding your preference for more workloads and your
> preference that KVM should be demonstrated as a profitable user
> first... is there anything else that you think should stand
> in the way of merging frontswap so that existing and planned
> kernel developers can build on top of it in-tree?

No, I think that's my list.  The confusion over a KVM interface is
solely because you keep saying it's not a Xen only ABI ... if it were,
I'd be fine for it living in the xen tree.

James

