Message-ID: <d58faed1e9e35becb80f2737ed4be3d422507d6e.camel@gmail.com>
Date: Fri, 26 Jun 2020 14:55:41 -0300
From: Leonardo Bras <leobras.c@...il.com>
To: Michael Ellerman <mpe@...erman.id.au>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Alexey Kardashevskiy <aik@...abs.ru>,
Thiago Jung Bauermann <bauerman@...ux.ibm.com>,
Ram Pai <linuxram@...ibm.com>
Cc: linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 5/6] powerpc/pseries/iommu: Make use of DDW even if
it does not map the partition
On Fri, 2020-06-26 at 12:23 -0300, Leonardo Bras wrote:
> On Wed, 2020-06-24 at 03:24 -0300, Leonardo Bras wrote:
> > As of today, if a DDW is created and can't map the whole partition, it's
> > removed and the default DMA window "ibm,dma-window" is used instead.
> >
> > Usually this DDW is bigger than the default DMA window, so it would be
> > better to make use of it instead.
> >
> > Signed-off-by: Leonardo Bras <leobras.c@...il.com>
> > ---
>
> I tested this change with a 256GB DDW which did not map the whole
> partition, with a MT27700 Family [ConnectX-4 Virtual Function].
>
> I noticed the performance improvement is about the same as using DDW
> with IOMMU bypass.
>
> 64 thread write throughput: +203.0%
> 64 thread read throughput: +17.5%
> 1 thread write throughput: +20.5%
> 1 thread read throughput: +3.43%
> Average write latency: -23.0%
> Average read latency: -2.26%

The improvements above are relative to the default DMA window, which is
currently used when the DDW can't map the whole partition.

Those values are the average of 20 tests per environment, 30 seconds
per test.
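
For reference, a minimal sketch of how a percentage like the ones quoted
above could be derived from per-run results. It assumes a plain arithmetic
mean over the runs of each environment. The function name and the sample
values are hypothetical placeholders, not the scripts or data used in this
test.

```python
from statistics import mean


def percent_change(baseline_runs, candidate_runs):
    """Relative change of the candidate mean vs. the baseline mean, in percent."""
    base = mean(baseline_runs)
    return (mean(candidate_runs) - base) / base * 100.0


# Placeholder numbers only (hypothetical), NOT measurements from this test.
default_window_runs = [100.0] * 20   # e.g. throughput with the default DMA window
ddw_runs = [120.0] * 20              # e.g. throughput with the DDW in use

print(f"{percent_change(default_window_runs, ddw_runs):+.1f}%")
```

A positive result means the DDW runs were higher than the baseline. For the
latency rows a negative value is the improvement.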

I also ran some intensive tests, 5 hours each:
64 thread write throughput
64 thread read throughput

The throughput values were stable throughout each test, and I noticed no
errors in dmesg / journalctl.