[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <396d343962bb4e3b845bab4f46721a5ebed39cf6.camel@gmail.com>
Date: Wed, 22 Jul 2020 20:37:36 -0300
From: Leonardo Bras <leobras.c@...il.com>
To: Alexey Kardashevskiy <aik@...abs.ru>,
Michael Ellerman <mpe@...erman.id.au>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Joel Stanley <joel@....id.au>,
Christophe Leroy <christophe.leroy@....fr>,
Thiago Jung Bauermann <bauerman@...ux.ibm.com>,
Ram Pai <linuxram@...ibm.com>,
Brian King <brking@...ux.vnet.ibm.com>
Cc: linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 5/7] powerpc/iommu: Move iommu_table cleaning routine
to iommu_table_clean
On Wed, 2020-07-22 at 11:28 +1000, Alexey Kardashevskiy wrote:
>
> On 22/07/2020 08:13, Leonardo Bras wrote:
> > On Tue, 2020-07-21 at 14:59 +1000, Alexey Kardashevskiy wrote:
> > > On 16/07/2020 17:16, Leonardo Bras wrote:
> > > > Move the part of iommu_table_free() that does struct iommu_table cleaning
> > > > into iommu_table_clean, so we can invoke it separately.
> > > >
> > > > This new function is useful for cleaning struct iommu_table before
> > > > initializing it again with a new DMA window, without having it freed and
> > > > allocated again.
> > > >
> > > > Signed-off-by: Leonardo Bras <leobras.c@...il.com>
> > > > ---
> > > > arch/powerpc/kernel/iommu.c | 30 ++++++++++++++++++------------
> > > > 1 file changed, 18 insertions(+), 12 deletions(-)
> > > >
> > > > diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> > > > index 9704f3f76e63..c3242253a4e7 100644
> > > > --- a/arch/powerpc/kernel/iommu.c
> > > > +++ b/arch/powerpc/kernel/iommu.c
> > > > @@ -735,21 +735,10 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
> > > > return tbl;
> > > > }
> > > >
> > > > -static void iommu_table_free(struct kref *kref)
> > > > +static void iommu_table_clean(struct iommu_table *tbl)
> > >
> > > iommu_table_free() + iommu_init_table() + set_iommu_table_base() should
> > > work too, why new helper?
> >
> > iommu_table_free() also frees the tbl, which would need allocate it
> > again (new address) and to fill it up again, unnecessarily.
>
> It is a new table in fact, everything is new there. You are only saving
> kfree+kzalloc which does not seem a huge win.
>
> Also, iommu_table_update() simply assumes 64bit window by passing
> res_start=res_end=0 to iommu_init_table() which is not horribly robust
> either. Yeah, I know, iommu_init_table() is always called with zeroes in
> pseries but this is somewhat ok as those tables are from the device tree
> and those windows don't overlap with 32bit MMIO but under KVM they will
> (well, if we hack QEMU to advertise a single window).
>
> I suggest removing iommu_pseries_table_update() from 6/7 and do
> iommu_table_free() + iommu_init_table() + set_iommu_table_base() with a
> WARN_ON(pdev->dev.archdata.dma_offset>=SZ_4G), may be even do this all
> in enable_ddw() where we know for sure if it is 1:1 mapping or just a
> big window.
Sure, I have yet to understand the full impact of this change, but I
will implement this and give it a try.
>
> Out of curiosity - what page sizes does pHyp advertise in "query"?
64kB (page shift 0x10)
>
>
> > I think it's a better approach to only change what is needed.
> >
> > > There is also iommu_table_clear() which does a different thing so you
> > > need a better name.
> >
> > I agree.
> > I had not noticed this other function before sending the patchset. What
> > would be a better name though? __iommu_table_free()?
> >
> > > Second, iommu_table_free
> > > use and it would be ok as we would only see this when hot-unplugging a
> > > PE because we always kept the default window.
> > > Btw you must be seeing these warnings now every time you create DDW with
> > > these patches as at least the first page is reserved, do not you?
> >
> > It does not print a warning.
> > I noticed other warnings,
>
> And what are these?
tce_freemulti_pSeriesLP: plpar_tce_stuff failed
[...]
It's regarding the change in pagesize.
Some places have the tceshift hardcoded as 12, tce_freemulti_pSeriesLP
is one of them, and that is causing some errors.
I wrote a patch fixing this, and I will include it in the next series.
>
> > but not this one from iommu_table_free():
> > /* verify that table contains no entries */
> > if (!bitmap_empty(tbl->it_ma
> > p, tbl->it_size))
> > pr_warn("%s: Unexpected TCEs\n", __func__);
> >
> > Before that, iommu_table_release_pages(tbl) is supposed to clear the
> > bitmap, so this only tests for a tce that is created in this short period.
>
> iommu_table_release_pages() only clears reserved pages - page 0 (just a
> protection against NULL DMA pointers) and 32bit MMIO (these should not
> be set for 64bit window). The "%s: Unexpected TCEs\n" is what checks for
> actual mapped TCEs.
>
Oh, I haven't noticed that. Thanks for pointing!
> > > Since we are replacing a table for a device which is still in the
> > > system, we should not try messing with its DMA if it already has
> > > mappings so the warning should become an error preventing DDW. It is
> > > rather hard to trigger in practice but I could hack a driver to ask for
> > > 32bit DMA mask first, map few pages and then ask for 64bit DMA mask, it
> > > is not illegal, I think. So this needs a new helper - "bool
> > > iommu_table_in_use(tbl)" - to use in enable_ddw(). Or I am overthinking
> > > this?... Thanks,
> >
> > As of today, there seems to be nothing like that happening in the
> > driver I am testing.
> > I spoke to Brian King on slack, and he mentioned that at the point DDW
> > is created there should be no allocations in place.
>
> Correct, there should not be. But it is also not a huge effort to fall
> back if there are.
True.
>
> > But I suppose some driver could try to do this.
> >
> > Maybe a better approach would be removing the mapping only if the
> > default window is removed (at the end of enable_ddw, as an else to
> > resetting the default DMA window), and having a way to add more
> > mappings to those pools. But this last part doesn't look so simple, and
> > it would be better to understand if it's necessary investing work in
> > this.
> >
> > What do you think?
>
> Add iommu_table_in_use(tbl) and fail DDW if that says "yes".
Seems good, I will include that on the next patchset.
Still, I will try to implement that more complex approach in a future
patchset, as it may come to be useful.
Thank you for the feedback!
Powered by blists - more mailing lists