Message-ID: <20110301194703.GA7736@amt.cnet>
Date:	Tue, 1 Mar 2011 16:47:03 -0300
From:	Marcelo Tosatti <mtosatti@...hat.com>
To:	Avi Kivity <avi@...hat.com>
Cc:	Alex Williamson <alex.williamson@...hat.com>,
	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	xiaoguangrong@...fujitsu.com
Subject: Re: [RFC PATCH 0/3] Weight-balanced binary tree + KVM growable
 memory slots using wbtree

On Sun, Feb 27, 2011 at 11:54:29AM +0200, Avi Kivity wrote:
> On 02/24/2011 07:35 PM, Alex Williamson wrote:
> >On Thu, 2011-02-24 at 12:06 +0200, Avi Kivity wrote:
> >>  On 02/23/2011 09:28 PM, Alex Williamson wrote:
> >>  >  I had forgotten about <1M mem, so actually the slot configuration was:
> >>  >
> >>  >  0:<1M
> >>  >  1: 1M - 3.5G
> >>  >  2: 4G+
> >>  >
> >>  >  I stacked the deck in favor of the static array (0: 4G+, 1: 1M-3.5G, 2:
> >>  >  <1M), and got these kernbench results:
> >>  >
> >>  >               base (stdev)    reorder (stdev)   wbtree (stdev)
> >>  >  --------+-----------------+----------------+----------------+
> >>  >  Elapsed |  42.809 (0.19)  |  42.160 (0.22) |  42.305 (0.23) |
> >>  >  User    | 115.709 (0.22)  | 114.358 (0.40) | 114.720 (0.31) |
> >>  >  System  |  41.605 (0.14)  |  40.741 (0.22) |  40.924 (0.20) |
> >>  >  %cpu    |   366.9 (1.45)  |   367.4 (1.17) |   367.6 (1.51) |
> >>  >  context |  7272.3 (68.6)  |  7248.1 (89.7) |  7249.5 (97.8) |
> >>  >  sleeps  | 14826.2 (110.6) | 14780.7 (86.9) | 14798.5 (63.0) |
> >>  >
> >>  >  So, wbtree is only slightly behind reordering, and the standard
> >>  >  deviation suggests the runs are mostly within the noise of each other.
> >>  >  Thanks,
> >>
> >>  Doesn't this indicate we should use reordering, instead of a new data
> >>  structure?
> >
> >The original problem that brought this on was scaling.  The re-ordered
> >array still has O(N) scaling while the tree should have ~O(logN) (note
> >that it currently doesn't because it needs a compaction algorithm added
> >after insert and remove).  So yes, it's hard to beat the results of a
> >test that hammers on the first couple of entries of a sorted array, but
> >I think the tree offers better-than-current performance that stays
> >predictable as the slot count scales.
> 
> Scaling doesn't matter, only actual performance.  Even a guest with
> 512 slots would still hammer only on the first few slots, since
> these will contain the bulk of memory.
> 
> >If we knew when we were searching for which type of data, it would
> >perhaps be nice if we could use a sorted array for guest memory (since
> >it's nicely bounded into a small number of large chunks), and a tree for
> >mmio (where we expect the scaling to be a factor).  Thanks,
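(Such a split might look roughly like the following; purely illustrative,
reusing the hypothetical search_tree() sketched above.)

/* Hypothetical hybrid lookup: probe the few large RAM slots first
 * (a small sorted array), then fall back to a tree of small mmio
 * slots, where the slot count can actually grow. */
static struct kvm_memory_slot *
search_hybrid(struct kvm_memory_slot *ram, int nram,
	      struct slot_node *mmio_tree, unsigned long gfn)
{
	int i;

	for (i = 0; i < nram; i++)	/* nram is small and bounded */
		if (gfn >= ram[i].base_gfn &&
		    gfn < ram[i].base_gfn + ram[i].npages)
			return &ram[i];

	return search_tree(mmio_tree, gfn);
}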
> 
> We have three types of memory:
> 
> - RAM - a few large slots
> - mapped mmio (for device assignment) - possibly many small slots
> - non-mapped mmio (for emulated devices) - no slots
>
> The first two are handled in exactly the same way - they're just
> memory slots.  We expect a lot more hits into the RAM slots, since
> they're much bigger.  But by far the majority of faults will be for
> the third category - mapped memory will be hit once per page, then
> handled by hardware until Linux memory management does something
> about the page, which should hopefully be rare (with device
> assignment, rare == never, since those pages are pinned).
> 
> Therefore our optimization priorities should be
> - complete miss into the slot list
> - hit into the RAM slots
> - hit into the other slots (trailing far behind)

Whatever ordering is considered optimal for one workload can be
suboptimal for another.  The binary search reduces the number of slots
inspected in the average case, and using slot size as the weight favours
the slots most likely to be hit.
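(As an illustration of the principle, not the patch itself: arrange the
tree treap-style, keyed on base_gfn for the search but heap-ordered on
npages, so the largest and therefore most frequently hit slots sit near
the root and are tested first.)

struct wslot_node {
	unsigned long base_gfn;
	unsigned long npages;		/* doubles as the weight */
	struct wslot_node *left, *right;
};

/* Ordinary BST descent; because every ancestor has npages >= its
 * descendants, big slots are compared before small ones, and a
 * complete miss still terminates in ~O(logN) when balanced. */
static struct wslot_node *
wslot_search(struct wslot_node *node, unsigned long gfn)
{
	while (node) {
		if (gfn < node->base_gfn)
			node = node->left;
		else if (gfn >= node->base_gfn + node->npages)
			node = node->right;
		else
			return node;
	}
	return NULL;
}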

> Of course worst-case performance matters.  For example, we might
> (not sure) be searching the list with the mmu spinlock held.
> 
> I think we still have a bit to go before we can justify the new data
> structure.

Intensive IDE disk I/O on a guest with lots of assigned network devices,
a 3% improvement on netperf with rtl8139, a 1% improvement on kernbench?

I fail to see the justification for not using it.
