Date:	Thu, 02 Feb 2012 12:10:38 +0200
From:	Avi Kivity <avi@...hat.com>
To:	Takuya Yoshikawa <yoshikawa.takuya@....ntt.co.jp>
CC:	Peter Zijlstra <peterz@...radead.org>, paulmck@...ux.vnet.ibm.com,
	Oleg Nesterov <oleg@...hat.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	KVM list <kvm@...r.kernel.org>
Subject: Re: [test result] dirty logging without srcu update -- Re: [RFC][PATCH]
 srcu: Implement call_srcu()

On 02/02/2012 07:46 AM, Takuya Yoshikawa wrote:
> Avi Kivity <avi@...hat.com> wrote:
>
> > >> That'll be great, numbers are better than speculation.
> > >>
> > >
> > >
> > > Yes, I already have some good numbers to show (and some patches).
> > 
> > Looking forward.
>
> I made a patch to see if Avi's suggestion of getting rid of the srcu
> update for dirty logging is practical; I tested it with my unit test.
>
> (I used a function to write protect a range of pages using rmap,
>  which is itself useful for optimizing the current code.)
>
> 1. test result
>
> on a 32-bit host (Core i3 box)   // just for the unit-test ...
> slot size: 256K pages (1GB memory)
>
>
> Measured by dirty-log-perf (executed only once for each case)
>
> Note: dirty pages are completely distributed
>       (no locality: worst case for my patch?)
>
> =========================================================
> # of dirty pages:  kvm.git (ns),  with this patch (ns)
> 1:         102,077 ns      10,105 ns
> 2:          47,197 ns       9,395 ns
> 4:          43,563 ns       9,938 ns
> 8:          41,239 ns      10,618 ns
> 16:         42,988 ns      12,299 ns
> 32:         45,503 ns      14,298 ns
> 64:         50,915 ns      19,895 ns
> 128:        61,087 ns      29,260 ns
> 256:        81,007 ns      49,023 ns
> 512:       132,776 ns      86,670 ns
> 1024:      939,299 ns     131,496 ns
> 2048:      992,209 ns     250,429 ns
> 4096:      891,809 ns     479,280 ns
> 8192:    1,027,280 ns     906,971 ns
> (up to this point, pretty good)
>
> (ah, from here on it's an atomic clear mask for every 32-bit word ...)
> 16384:   1,270,972 ns   6,661,741 ns    //  1  1  1 ...  1
> 32768:   1,581,335 ns   9,673,985 ns    //  ...
> 65536:   2,161,604 ns  11,466,134 ns    //  ...
> 131072:  3,253,027 ns  13,412,954 ns    //  ...
> 262144:  5,663,002 ns  16,309,924 ns    // 31 31 31 ... 31
> =========================================================

On a 64-bit host, this will be twice as fast.  Or if we use cmpxchg16b,
and there are no surprises, four times as fast.  It will still be slower
than the original, but by a smaller margin.
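
To make the word-size effect concrete, here is a minimal sketch (mine, not
the actual KVM code; the names are illustrative only) of harvesting a dirty
bitmap with one atomic exchange per word.  On a 32-bit host each xchg covers
32 pages, on a 64-bit host 64 pages, and a cmpxchg16b-based variant would
cover 128, which is where the 2x/4x above comes from.

#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/*
 * Sketch only: atomically fetch-and-clear each bitmap word, copying the
 * old contents into 'snapshot' and counting the dirty pages seen.
 */
static size_t harvest_dirty_bitmap(unsigned long *bitmap,
				   unsigned long *snapshot,
				   size_t nr_pages)
{
	size_t nr_words = (nr_pages + BITS_PER_LONG - 1) / BITS_PER_LONG;
	size_t i, dirty = 0;

	for (i = 0; i < nr_words; i++) {
		/* One atomic op clears BITS_PER_LONG pages' worth of bits. */
		unsigned long word = __atomic_exchange_n(&bitmap[i], 0UL,
							 __ATOMIC_ACQ_REL);

		snapshot[i] = word;
		dirty += (size_t)__builtin_popcountl(word);
	}
	return dirty;
}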

> According to a 2005 USENIX paper, the WWS with an 8-second window was
> about 50,000 pages for a program with a high dirtying rate.
>
> Taking into account other possible gains from the WWS locality of
> real workloads, these numbers are not so bad IMO.

I agree.

>
> Furthermore, the code was written for this initial test only and I did
> not do any optimization:  I know what I should try.
>
> So this seems worth more testing.
>
>
> The new code also makes it possible to do fine-grained get-dirty-log.
> Live migration could be done like this (not sure yet):
>
> 	until the dirty rate becomes low enough
> 		get dirty log for the first 32K pages (partial return is OK)
> 		while sending
> 		get dirty log for the next 32K pages  (partial return is OK)
> 		while sending
> 		...
> 		get dirty log for the last 32K pages  (partial return is OK)
>
> 	stop the guest and get dirty log (but no need to write protect now)
> 	send the remaining pages
>
> A new API is needed for this, as discussed before!
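
A rough userspace sketch of that chunked loop, for illustration only: the
range-based call, the struct and the helpers below are hypothetical
stand-ins for the new API being discussed, not existing KVM interfaces
(today's KVM_GET_DIRTY_LOG returns the whole slot's bitmap at once).

#include <stdbool.h>
#include <stdint.h>

#define CHUNK_PAGES	(32 * 1024)	/* 32K pages per round, as in the outline */

struct dirty_log_range {		/* hypothetical, not a KVM uapi struct */
	uint32_t slot;
	uint64_t first_page;
	uint64_t num_pages;
	void *dirty_bitmap;		/* caller-provided bitmap for this range */
};

/* Assumed helpers, declared elsewhere in this sketch. */
extern int get_dirty_log_range(int vm_fd, struct dirty_log_range *r);
extern void send_dirty_pages(const void *bitmap, uint64_t first, uint64_t n);
extern bool dirty_rate_low_enough(void);
extern void stop_guest_and_send_remaining(int vm_fd);

static void migrate_slot(int vm_fd, uint32_t slot, uint64_t slot_pages,
			 void *bitmap)
{
	while (!dirty_rate_low_enough()) {
		uint64_t first;

		/* Walk the slot in 32K-page chunks: fetch (and write-protect)
		 * the dirty log for one chunk, transmit it, then move on. */
		for (first = 0; first < slot_pages; first += CHUNK_PAGES) {
			struct dirty_log_range r = {
				.slot		= slot,
				.first_page	= first,
				.num_pages	= slot_pages - first < CHUNK_PAGES ?
						  slot_pages - first : CHUNK_PAGES,
				.dirty_bitmap	= bitmap,
			};

			if (get_dirty_log_range(vm_fd, &r) < 0)
				return;
			send_dirty_pages(bitmap, r.first_page, r.num_pages);
		}
	}

	/* Final pass: the guest is stopped, so no write protection is needed. */
	stop_guest_and_send_remaining(vm_fd);
}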

Yeah.  But I think we should switch to srcu-less dirty logs regardless.
Here are your numbers, normalized by the number of dirty pages.

dirty pages    old (ns/page)    new (ns/page)
          1           102077            10105
          2            23599             4698
          4            10891             2485
          8             5155             1327
         16             2687              769
         32             1422              447
         64              796              311
        128              477              229
        256              316              191
        512              259              169
       1024              917              128
       2048              484              122
       4096              218              117
       8192              125              111
      16384               78              407
      32768               48              295
      65536               33              175
     131072               25              102
     262144               22               62
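
(The per-page column is just the raw figure from the earlier table divided
by the dirty-page count, rounded to the nearest nanosecond; a trivial check
for a few rows:)

#include <stdio.h>

int main(void)
{
	static const struct { long pages, old_ns, new_ns; } raw[] = {
		{      1,  102077,    10105 },	/* -> 102077, 10105 */
		{  16384, 1270972,  6661741 },	/* ->     78,   407 */
		{ 262144, 5663002, 16309924 },	/* ->     22,    62 */
	};

	for (unsigned i = 0; i < sizeof(raw) / sizeof(raw[0]); i++)
		printf("%6ld  %6ld  %6ld\n", raw[i].pages,
		       (raw[i].old_ns + raw[i].pages / 2) / raw[i].pages,
		       (raw[i].new_ns + raw[i].pages / 2) / raw[i].pages);
	return 0;
}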


Your worst case, for a reasonable number of dirty pages, is 407 ns/page,
which is still lower than the time userspace will spend actually processing
the page, so it's reasonable.  By this metric the old method is often a lot
worse than your worst case.



-- 
error compiling committee.c: too many arguments to function

