Message-ID: <20200623133843.GA5499@localhost.localdomain>
Date: Tue, 23 Jun 2020 09:38:43 -0400
From: Konrad Rzeszutek Wilk <konrad@...nok.org>
To: Ashish Kalra <ashish.kalra@....com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>, hch@....de,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
x86@...nel.org, luto@...nel.org, peterz@...radead.org,
dave.hansen@...ux-intel.com, iommu@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, brijesh.singh@....com,
Thomas.Lendacky@....com
Subject: Re: [PATCH v2] swiotlb: Adjust SWIOTBL bounce buffer size for SEV
guests.
On Mon, Apr 27, 2020 at 06:53:18PM +0000, Ashish Kalra wrote:
> Hello Konrad,
>
> On Mon, Mar 30, 2020 at 10:25:51PM +0000, Ashish Kalra wrote:
> > Hello Konrad,
> >
> > On Tue, Mar 03, 2020 at 12:03:53PM -0500, Konrad Rzeszutek Wilk wrote:
> > > On Tue, Feb 04, 2020 at 07:35:00PM +0000, Ashish Kalra wrote:
> > > > Hello Konrad,
> > > >
> > > > Looking fwd. to your feedback regarding support of other memory
> > > > encryption architectures such as Power, S390, etc.
> > > >
> > > > Thanks,
> > > > Ashish
> > > >
> > > > On Fri, Jan 24, 2020 at 11:00:08PM +0000, Ashish Kalra wrote:
> > > > > On Tue, Jan 21, 2020 at 03:54:03PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > > >
> > > > > > > Additional memory calculations based on # of PCI devices and
> > > > > > > their memory ranges will make it more complicated with so
> > > > > > > many other permutations and combinations to explore, it is
> > > > > > > essential to keep this patch as simple as possible by
> > > > > > > adjusting the bounce buffer size simply by determining it
> > > > > > > from the amount of provisioned guest memory.
> > > > > >>
> > > > > >> Please rework the patch to:
> > > > > >>
> > > > > >> - Use a log solution instead of the multiplication.
> > > > > >> Feel free to cap it at a sensible value.
> > > > >
> > > > > Ok.
> > > > >
> > > > > >>
> > > > > >> - Also the code depends on SWIOTLB calling in to the
> > > > > >> adjust_swiotlb_default_size which looks wrong.
> > > > > >>
> > > > > >> You should not adjust io_tlb_nslabs from swiotlb_size_or_default.
> > > > >
> > > > > >> That function's purpose is to report a value.
> > > > > >>
> > > > > >> - Make io_tlb_nslabs be visible outside of the SWIOTLB code.
> > > > > >>
> > > > > >> - Can you utilize the IOMMU_INIT APIs and have your own detect which would
> > > > > >> modify the io_tlb_nslabs (and set swiotbl=1?).
> > > > >
> > > > > This seems to be a nice option, but then IOMMU_INIT APIs are
> > > > > x86-specific and this swiotlb buffer size adjustment is also needed
> > > > > for other memory encryption architectures like Power, S390, etc.
> > >
> > > Oh dear. That I hadn't considered.
> > > > >
> > > > > >>
> > > > > >> Actually you seem to be piggybacking on pci_swiotlb_detect_4gb - so
> > > > > >> perhaps add in this code ? Albeit it really should be in it's own
> > > > > >> file, not in arch/x86/kernel/pci-swiotlb.c
> > > > >
> > > > > Actually, we piggyback on pci_swiotlb_detect_override which sets
> > > > > swiotlb=1 as x86_64_start_kernel() and invocation of sme_early_init()
> > > > > forces swiotlb on, but again this is all x86 architecture specific.
> > >
> > > Then it looks like the best bet is to do it from within swiotlb_init?
> > > We really can't do it from swiotlb_size_or_default - that function
> > > should just return a value and nothing else.
> > >
> >
> > Actually, we need to do it in swiotlb_size_or_default() as this gets called by
> > reserve_crashkernel_low() in arch/x86/kernel/setup.c and used to
> > reserve low crashkernel memory. If we adjust swiotlb size later in
> > swiotlb_init() which gets called later than reserve_crashkernel_low(),
> > then any swiotlb size changes/expansion will conflict/overlap with the
> > low memory reserved for crashkernel.
> >
> and will also potentially cause SWIOTLB buffer allocation failures.
>
> Do you have any feedback, comments on the above ?
The init boot chain looks like this:

initmem_init
	pci_iommu_alloc
		-> pci_swiotlb_detect_4gb
		-> swiotlb_init

reserve_crashkernel
	reserve_crashkernel_low
		-> swiotlb_size_or_default
		..

(rootfs code):
	pci_iommu_init
		-> a bunch of the other IOMMU late_init code gets called..
		-> pci_swiotlb_late_init
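
A toy user-space model of the ordering question, in case it helps frame the discussion. This is not kernel code: every name here (boot_model, model_*) is made up for illustration, and the "reported size plus some slack" sizing rule is an assumption, not a quote of reserve_crashkernel_low. It models the case Ashish describes, where the size is queried first and adjusted afterwards.

```c
#include <assert.h>
#include <stddef.h>

#define MIB(x) ((size_t)(x) << 20)

/* Hypothetical stand-in for boot-time state; not the kernel's variables. */
struct boot_model {
	size_t swiotlb_bytes;   /* size the bounce buffer will end up with */
	size_t crash_low_bytes; /* low memory set aside for the crash kernel */
};

/* Pure query: reports the current size and changes nothing. */
static size_t model_swiotlb_size(const struct boot_model *m)
{
	return m->swiotlb_bytes;
}

/* Assumed rule: the reservation is sized from the reported value plus slack. */
static void model_reserve_crash_low(struct boot_model *m, size_t slack)
{
	m->crash_low_bytes = model_swiotlb_size(m) + slack;
}

/* A later adjustment, e.g. one made from an init hook. Only growth is modeled. */
static void model_adjust_swiotlb(struct boot_model *m, size_t new_bytes)
{
	assert(new_bytes >= m->swiotlb_bytes);
	m->swiotlb_bytes = new_bytes;
}

/* Nonzero when the earlier reservation no longer covers size plus slack. */
static int model_reservation_is_stale(const struct boot_model *m, size_t slack)
{
	return m->crash_low_bytes < model_swiotlb_size(m) + slack;
}
```

Reserving against a 64 MiB buffer and then growing it to 512 MiB makes model_reservation_is_stale() return nonzero; doing the adjustment before the reservation does not. The model says nothing about which function should own the adjustment, only that whoever reads the size must do so after the last writer.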

I have to say I am lost as to how your patch fixes "If we adjust swiotlb
size later .. then any swiotlb size .. will overlap with the low memory
reserved for crashkernel"?

Or are you saying that 'reserve_crashkernel_low' is the _culprit_ and it
is the one changing the size? And hence having it modify the swiotlb size
will fix this problem? Aka _before_ all the other IOMMUs get their hands
on it?

If so, why not create an

	IOMMU_INIT(crashkernel_adjust_swiotlb, pci_swiotlb_detect_override,
		   NULL, NULL);

And crashkernel_adjust_swiotlb would change the size of the swiotlb buffer
if conditions are found to require it.
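
A rough user-space mock of that idea, to show the detect/depend relationship only. Nothing here is the kernel's IOMMU_INIT machinery: struct mock_iommu_entry, the mock_* functions, the starting slab count and the "double it" policy are all hypothetical placeholders, and the table is assumed to be already sorted so that a dependency appears before its dependent.

```c
#include <assert.h>
#include <stddef.h>

/* Mock table entry: a detect routine plus the detect routine it depends on. */
struct mock_iommu_entry {
	int (*detect)(void);
	int (*depend)(void); /* must appear earlier in the table, or NULL */
};

static unsigned long mock_io_tlb_nslabs = 32768; /* hypothetical start value */
static int mock_mem_encrypt_active = 1;          /* hypothetical condition */
static int mock_swiotlb_forced;

/* Stand-in for the override detect: forces swiotlb on when required. */
static int mock_swiotlb_detect_override(void)
{
	mock_swiotlb_forced = mock_mem_encrypt_active;
	return mock_swiotlb_forced;
}

/* Stand-in for crashkernel_adjust_swiotlb: resize only if conditions hold. */
static int mock_crashkernel_adjust_swiotlb(void)
{
	if (!mock_swiotlb_forced)
		return 0;
	mock_io_tlb_nslabs *= 2; /* placeholder policy, not a proposal */
	return 1;
}

/* Walk the table in order, checking that each dependency ran earlier. */
static int mock_run_detect(const struct mock_iommu_entry *tbl, size_t n)
{
	size_t i, j;
	int detected = 0;

	for (i = 0; i < n; i++) {
		if (tbl[i].depend) {
			int seen = 0;

			for (j = 0; j < i; j++)
				if (tbl[j].detect == tbl[i].depend)
					seen = 1;
			assert(seen);
		}
		if (tbl[i].detect())
			detected++;
	}
	return detected;
}
```

With a two-entry table (the override first, the adjust entry depending on it), both entries report detection and the slab count is changed exactly once, after the override has decided whether swiotlb is forced on.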
You also may want to put a #define DEBUG in arch/x86/kernel/pci-iommu_table.c
to check out whether the tree structure of IOMMU entries is correct.

But still I am lost - if, say, the AMD one does decide for some unknown
reason to expand the SWIOTLB, you are still stuck with the 'overlap with
the low memory reserved' problem.

Perhaps add a late_init that gets called as the last one to validate
this? And maybe, if the swiotlb gets turned off, you also take the proper
steps?
> As such i feel, this patch is complete otherwise and can be included as
> it is.
>
> Thanks,
> Ashish