Message-ID: <8738g44wxp.fsf@vitty.brq.redhat.com>
Date: Tue, 20 May 2014 11:32:50 +0200
From: Vitaly Kuznetsov <vkuznets@...hat.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Cc: gregkh@...uxfoundation.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org, xen-devel@...ts.xenproject.org,
felipe.franciosi@...rix.com, roger.pau@...rix.com,
jerry.snitselaar@...cle.com, axboe@...nel.dk
Subject: Re: [Xen-devel] Backport request to stable of two performance related fixes for xen-blkfront (3.13 fixes to earlier trees)
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com> writes:
> Hey Greg
>
> This email is in regards to backporting two patches to stable that
> fall under the 'performance' rule:
>
> bfe11d6de1c416cea4f3f0f35f864162063ce3fa
> fbe363c476afe8ec992d3baf682670a4bd1b6ce6
>
> I've copied Jerry - the maintainer of Oracle's kernel. I don't have
> the emails of the other distro maintainers, but the bugs associated with it are:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1096909
> (RHEL7)
I was doing tests with the RHEL7 kernel and these patches, and unfortunately
I see a huge performance degradation in some workloads.
I'm in the middle of my testing now, but here are some intermediate
results.
Test environment:
Fedora-20, xen-4.3.2-2.fc20.x86_64, 3.11.10-301.fc20.x86_64
I am testing with 1-9 RHEL7 PVHVM guests running:
1) Unmodified RHEL7 kernel
2) Only fbe363c476afe8ec992d3baf682670a4bd1b6ce6 applied (revoke foreign
access)
3) Both fbe363c476afe8ec992d3baf682670a4bd1b6ce6 and
bfe11d6de1c416cea4f3f0f35f864162063ce3fa
(actually 427bfe07e6744c058ce6fc4aa187cda96b635539 is required as well
to make the build happy; I suggest we backport that to stable as well)
Storage devices are:
1) ramdisks (/dev/ram*) (persistent grants and indirect descriptors disabled)
2) /tmp/img*.img on tmpfs (persistent grants and indirect descriptors disabled)
The test itself is a direct random read with bs=2048k (using fio). ('dd',
'read/write access', ... show the same results.)
fio test file:
[fio_read]
ioengine=libaio
blocksize=2048k
rw=randread
filename=/dev/xvdc
randrepeat=1
fallocate=none
# O_DIRECT: bypass the guest page cache
direct=1
# do not invalidate the buffer cache for the file before starting I/O
invalidate=0
runtime=20
# run for the full runtime regardless of how much data has been read
time_based
I run fio in all guests simultaneously and sum up the results. So, the results are:
1) ramdisks: http://hadoop.ru/pubfiles/b1096909_3.11.10_ramdisk.png
2) tmpfiles: http://hadoop.ru/pubfiles/b1096909_3.11.10_tmpfile.png
In a few words: the patch series has (almost) no effect when persistent grants
are enabled (that was expected) and gives me a performance regression when
persistent grants are disabled (that wasn't expected).
My thoughts are: it seems fbe363c476afe8ec992d3baf682670a4bd1b6ce6
introduces a performance regression in some cases (at least when persistent
grants are disabled). My guess at the moment is that
gnttab_end_foreign_access() (gnttab_end_foreign_access_ref_v1() is being
used here) is the culprit; for some reason it is looping for some time.
bfe11d6de1c416cea4f3f0f35f864162063ce3fa does bring a performance
improvement over fbe363c476afe8ec992d3baf682670a4bd1b6ce6, but the whole
series still results in a regression.
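For reference, the v1 grant-ending helper I'm suspecting looks roughly like
the sketch below (simplified from drivers/xen/grant-table.c of that era, not
the exact upstream code): it refuses to end access while the backend still
holds the grant, and retries the cmpxchg if the flags change underneath it.

  /* Simplified sketch of gnttab_end_foreign_access_ref_v1(); see
   * drivers/xen/grant-table.c for the real implementation. */
  static int end_foreign_access_ref_v1_sketch(grant_ref_t ref, int readonly)
  {
          u16 flags, nflags;
          u16 *pflags = &gnttab_shared.v1[ref].flags;

          nflags = *pflags;
          do {
                  flags = nflags;
                  /* The backend is still reading/writing the page:
                   * access cannot be ended yet. */
                  if (flags & (GTF_reading | GTF_writing))
                          return 0;
                  /* Try to clear the flags; retry if they changed
                   * under us in the meantime. */
          } while ((nflags = sync_cmpxchg(pflags, flags, 0)) != flags);

          return 1;
  }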
I would be glad to hear what could be wrong with my testing, in case I'm
the only one who sees this behavior. Any other pointers are more than
welcome, and please feel free to ask me for any additional
info/testing/whatever.
> https://bugs.launchpad.net/ubuntu/+bug/1319003
> (Ubuntu 13.10)
>
> The following distros are affected:
>
> (x) Ubuntu 13.04 and derivatives (3.8)
> (v) Ubuntu 13.10 and derivatives (3.11), supported until 2014-07
> (x) Fedora 17 (3.8 and 3.9 in updates)
> (x) Fedora 18 (3.8, 3.9, 3.10, 3.11 in updates)
> (v) Fedora 19 (3.9; 3.10, 3.11, 3.12 in updates; fixed with latest update to 3.13), supported until TBA
> (v) Fedora 20 (3.11; 3.12 in updates; fixed with latest update to 3.13), supported until TBA
> (v) RHEL 7 and derivatives (3.10), expected to be supported until about 2025
> (v) openSUSE 13.1 (3.11), expected to be supported until at least 2016-08
> (v) SLES 12 (3.12), expected to be supported until about 2024
> (v) Mageia 3 (3.8), supported until 2014-11-19
> (v) Mageia 4 (3.12), supported until 2015-08-01
> (v) Oracle Enterprise Linux with Unbreakable Enterprise Kernel Release 3 (3.8), supported until TBA
>
> Here is the analysis of the problem and what was put in the RHEL7 bug.
> The Oracle bug does not exist (as I just backport them into the kernel and
> send a GIT PULL to Jerry) - but if you would like, I can certainly furnish
> you with one (it would be identical to what is mentioned below).
>
> If you are OK with the backport, I am volunteering Roger and Felipe to assist
> in jamming^H^H^H^Hbackporting the patches into earlier kernels.
>
> Summary:
> Storage performance regression when Xen backend lacks persistent-grants support
>
> Description of problem:
> When used as a Xen guest, RHEL 7 will be slower than older releases in terms
> of storage performance. This is due to the persistent-grants feature introduced
> into xen-blkfront in the Linux kernel 3.8 series. From 3.8 to 3.12 (inclusive),
> xen-blkfront will add an extra set of memcpy() operations regardless of
> persistent-grants support in the backend (i.e. xen-blkback, qemu, tapdisk).
> This has been identified and fixed in the 3.13 kernel series, but was not
> backported to previous LTS kernels due to the nature of the bug (performance only).
>
> While persistent grants reduce the stress on the Xen grant table and allow
> for much better aggregate throughput (at the cost of an extra set of memcpy
> operations), adding the copy overhead when the feature is unsupported on
> the backend combines the worst of both worlds. This is particularly noticeable
> when intensive storage workloads are active from many guests.
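
[To illustrate the above for anyone reading along: roughly speaking, the
request-mapping path behaves like the pseudocode below. This is a simplified
sketch with illustrative identifiers, not the actual xen-blkfront source.]

  /* Simplified pseudocode, not real xen-blkfront code. */
  if (use_persistent_grant) {
          /* Persistent grants: copy the bio data into a page whose
           * grant stays mapped by the backend across requests. */
          memcpy(persistent_page, bvec_data, bvec_len);
          gref = persistent_gref;
  } else {
          /* What 3.13 (bfe11d6de1c4...) does when the backend lacks
           * persistent-grant support: grant the bio page directly,
           * with no extra memcpy. 3.8-3.12 frontends took the memcpy
           * path above unconditionally, which is the regression
           * described here. */
          gref = gnttab_grant_foreign_access(otherend_id,
                          page_to_pfn(bvec_page), readonly);
  }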
>
> How reproducible:
> This is always reproducible when a RHEL 7 guest is running on Xen and the
> storage backend (i.e. xen-blkback, qemu, tapdisk) does not have support for
> persistent grants.
>
> Steps to Reproduce:
> 1. Install a Xen dom0 running a kernel prior to 3.8 (without
> persistent-grants support) - or run it under Amazon EC2
> 2. Install a set of RHEL 7 guests (which use kernel 3.10).
> 3. Measure aggregate storage throughput from all guests.
>
> NOTE: The storage infrastructure (e.g. local SSDs, network-attached storage)
> cannot be a bottleneck in itself. If tested on a single SATA disk, for
> example, the issue will probably be unnoticeable as the infrastructure will
> be limiting response time and throughput.
>
> Actual results:
> Aggregate storage throughput will be lower than with xen-blkfront
> versions prior to 3.8 or newer than 3.12.
>
> Expected results:
> Aggregate storage throughput should be at least as good as before, or better
> if the backend supports persistent grants.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@...ts.xen.org
> http://lists.xen.org/xen-devel
--
Vitaly
--