Message-ID: <20200624002357.GA9955@ashkalra_ubuntu_server>
Date:   Wed, 24 Jun 2020 00:23:57 +0000
From:   Ashish Kalra <ashish.kalra@....com>
To:     Konrad Rzeszutek Wilk <konrad@...nok.org>
Cc:     Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>, hch@....de,
        tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
        x86@...nel.org, luto@...nel.org, peterz@...radead.org,
        dave.hansen@...ux-intel.com, iommu@...ts.linux-foundation.org,
        linux-kernel@...r.kernel.org, brijesh.singh@....com,
        Thomas.Lendacky@....com
Subject: Re: [PATCH v2] swiotlb: Adjust SWIOTBL bounce buffer size for SEV
 guests.

Hello Konrad,

On Tue, Jun 23, 2020 at 09:38:43AM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Apr 27, 2020 at 06:53:18PM +0000, Ashish Kalra wrote:
> > Hello Konrad,
> > 
> > On Mon, Mar 30, 2020 at 10:25:51PM +0000, Ashish Kalra wrote:
> > > Hello Konrad,
> > > 
> > > On Tue, Mar 03, 2020 at 12:03:53PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > On Tue, Feb 04, 2020 at 07:35:00PM +0000, Ashish Kalra wrote:
> > > > > Hello Konrad,
> > > > > 
> > > > > Looking fwd. to your feedback regarding support of other memory
> > > > > encryption architectures such as Power, S390, etc.
> > > > > 
> > > > > Thanks,
> > > > > Ashish
> > > > > 
> > > > > On Fri, Jan 24, 2020 at 11:00:08PM +0000, Ashish Kalra wrote:
> > > > > > On Tue, Jan 21, 2020 at 03:54:03PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > > > > 
> > > > > > > > Additional memory calculations based on # of PCI devices and
> > > > > > > > their memory ranges will make it more complicated with so
> > > > > > > > many other permutations and combinations to explore, it is
> > > > > > > > essential to keep this patch as simple as possible by 
> > > > > > > > adjusting the bounce buffer size simply by determining it
> > > > > > > > from the amount of provisioned guest memory.
> > > > > > >> 
> > > > > > >> Please rework the patch to:
> > > > > > >> 
> > > > > > >>  - Use a log solution instead of the multiplication.
> > > > > > >>    Feel free to cap it at a sensible value.
> > > > > > 
> > > > > > Ok.
> > > > > > 
> > > > > > >> 
> > > > > > >>  - Also the code depends on SWIOTLB calling in to the
> > > > > > >>    adjust_swiotlb_default_size which looks wrong.
> > > > > > >> 
> > > > > > >>    You should not adjust io_tlb_nslabs from swiotlb_size_or_default.
> > > > > > 
> > > > > > >>    That function's purpose is to report a value.
> > > > > > >> 
> > > > > > >>  - Make io_tlb_nslabs be visible outside of the SWIOTLB code.
> > > > > > >> 
> > > > > > >>  - Can you utilize the IOMMU_INIT APIs and have your own detect which would
> > > > > > > >>    modify the io_tlb_nslabs (and set swiotlb=1?).
> > > > > > 
> > > > > > This seems to be a nice option, but then IOMMU_INIT APIs are
> > > > > > x86-specific and this swiotlb buffer size adjustment is also needed
> > > > > > for other memory encryption architectures like Power, S390, etc.
> > > > 
> > > > Oh dear. That I hadn't considered.
> > > > > > 
> > > > > > >> 
> > > > > > >>    Actually you seem to be piggybacking on pci_swiotlb_detect_4gb - so
> > > > > > > >>    perhaps add in this code ? Albeit it really should be in its own
> > > > > > >>    file, not in arch/x86/kernel/pci-swiotlb.c
> > > > > > 
> > > > > > Actually, we piggyback on pci_swiotlb_detect_override which sets
> > > > > > swiotlb=1 as x86_64_start_kernel() and invocation of sme_early_init()
> > > > > > forces swiotlb on, but again this is all x86 architecture specific.
> > > > 
> > > > Then it looks like the best bet is to do it from within swiotlb_init?
> > > > We really can't do it from swiotlb_size_or_default - that function
> > > > should just return a value and nothing else.
> > > > 
> > > 
> > > Actually, we need to do it in swiotlb_size_or_default() as this gets called by
> > > reserve_crashkernel_low() in arch/x86/kernel/setup.c and used to
> > > reserve low crashkernel memory. If we adjust swiotlb size later in
> > > swiotlb_init() which gets called later than reserve_crashkernel_low(),
> > > then any swiotlb size changes/expansion will conflict/overlap with the
> > > low memory reserved for crashkernel.
> > > 
> > and will also potentially cause SWIOTLB buffer allocation failures.
> > 
> > Do you have any feedback, comments on the above ?
> 
> 
> The init boot chain looks like this:
> 
> initmem_init
> 	pci_iommu_alloc
> 		-> pci_swiotlb_detect_4gb
> 		-> swiotlb_init
> 
> reserve_crashkernel
> 	reserve_crashkernel_low
> 		-> swiotlb_size_or_default
> 		..
> 
> 
> (rootfs code):
> 	pci_iommu_init
> 		-> a bunch of the other IOMMU late_init code gets called..
> 		->  pci_swiotlb_late_init 
> 
> I have to say I am lost to how your patch fixes "If we adjust swiotlb
> size later .. then any swiotlb size .. will overlap with the low memory
> reserved for crashkernel"?
> 

Actually, as per the boot flow:

setup_arch() calls reserve_crashkernel(), whereas pci_iommu_alloc() is
invoked through mm_init()/mem_init() and not via initmem_init().

start_kernel:
...
setup_arch()
	reserve_crashkernel
		reserve_crashkernel_low
			-> swiotlb_size_or_default

...
...
mm_init()
	mem_init()
		pci_iommu_alloc
			-> pci_swiotlb_detect_4gb
			-> swiotlb_init

So as per the above boot flow, reserve_crashkernel() can get called
before swiotlb detect/init, and hence, if we don't fix up or adjust
the SWIOTLB buffer size in swiotlb_size_or_default(), the crash kernel
will reserve memory which will conflict/overlap with the SWIOTLB bounce
buffer memory once that is adjusted or fixed up later.

Therefore, we need to adjust/fix up the SWIOTLB bounce buffer size in
the swiotlb_size_or_default() function itself, before the swiotlb
detect/init functions get invoked.
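
To illustrate the ordering argument, here is a minimal userspace sketch,
not kernel code. All names and sizes in it (the MODEL_* constants, the
model_* and size_or_default_* helpers, the 64 MB and 512 MB figures) are
assumptions made up for the example. It only models the point that the
value reported to the crashkernel reservation must already match what
the later init step will allocate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
#define MODEL_DEFAULT_SWIOTLB_SIZE  (64UL << 20)   /* unadjusted default */
#define MODEL_ADJUSTED_SWIOTLB_SIZE (512UL << 20)  /* assumed size for an SEV guest */

typedef unsigned long (*size_fn_t)(void);

/* Variant A: the reporting function knows nothing about the adjustment. */
static unsigned long size_or_default_unadjusted(void)
{
	return MODEL_DEFAULT_SWIOTLB_SIZE;
}

/* Variant B: the adjustment is already applied when the size is reported. */
static unsigned long size_or_default_adjusted(void)
{
	return MODEL_ADJUSTED_SWIOTLB_SIZE;
}

/* Size the later init step would actually allocate in this model. */
static unsigned long model_swiotlb_init_size(void)
{
	return MODEL_ADJUSTED_SWIOTLB_SIZE;
}

/*
 * Model of the boot order: the early reservation step samples the
 * reported size first, the init step allocates afterwards. Returns the
 * number of bytes the early step did not account for.
 */
static unsigned long model_shortfall(size_fn_t size_or_default)
{
	assert(size_or_default != NULL);

	unsigned long accounted = size_or_default();        /* early: reservation */
	unsigned long allocated = model_swiotlb_init_size(); /* late: init */

	return allocated > accounted ? allocated - accounted : 0;
}
```

In this model, model_shortfall(size_or_default_unadjusted) is the 448 MB
that the early step never accounted for, while
model_shortfall(size_or_default_adjusted) is 0.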

Thanks,
Ashish

> Or are you saying that 'reserve_crashkernel_low' is the _culprit_ and it
> is the one changing the size? And hence it modifying the swiotlb size
> will fix this problem? Aka _before_ all the other IOMMU get their hand
> on it?
> 
> If so why not create an
> IOMMU_INIT(crashkernel_adjust_swiotlb,pci_swiotlb_detect_override,
> NULL, NULL);
> 
> And crashkernel_adjust_swiotlb would change the size of swiotlb buffer
> if conditions are found to require it.
> 
> You also may want to put a #define DEBUG in arch/x86/kernel/pci-iommu_table.c
> to check out whether the tree structure of IOMMU entries is correct.
> 
> 
> 
> But still I am lost - if say the AMD one does decide for unknown reason
> to expand the SWIOTLB you are still stuck with the 'overlap with
> the low memory reserved' or so.
> 
> Perhaps add a late_init that gets called as the last one to validate
> this ? And maybe if the swiotlb gets turned off you also take proper
> steps?
> 
> > As such i feel, this patch is complete otherwise and can be included as
> > it is. 
> > 
> > Thanks,
> > Ashish
