linux-kernel - Re: [PATCH v3] swiotlb: Adjust SWIOTBL bounce buffer size for SEV guests.


Message-ID: <20201117173829.GA29387@ashkalra_ubuntu_server>
Date:   Tue, 17 Nov 2020 17:38:29 +0000
From:   Ashish Kalra <ashish.kalra@....com>
To:     Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Cc:     hch@....de, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
        hpa@...or.com, x86@...nel.org, luto@...nel.org,
        peterz@...radead.org, dave.hansen@...ux-intel.com,
        iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org,
        brijesh.singh@....com, Thomas.Lendacky@....com,
        ssg.sos.patches@....com, jon.grimm@....com, rientjes@...gle.com
Subject: Re: [PATCH v3] swiotlb: Adjust SWIOTBL bounce buffer size for SEV
 guests.

Hello Konrad, 

On Tue, Nov 17, 2020 at 12:00:03PM -0500, Konrad Rzeszutek Wilk wrote:
> .snip..
> > > > > Let's break this down:
> > > > > 
> > > > > How does the performance improve for one single device if you increase the SWIOTLB?
> > > > > Is there a specific device/driver that you can talk about that improve with this patch?
> > > > > 
> > > > > 
> > > > 
> > > > Yes, these are mainly for multi-queue devices such as NICs or even
> > > > multi-queue virtio. 
> > > > 
> > > > This basically improves performance with concurrent DMA, hence,
> > > > basically multi-queue devices.
> > > 
> > > OK, and for a _1GB_ guest - how many CPUs do the "internal teams/external
> > > customers" use? Please, let's use real use-cases.
> > 
> > >> I am sure you will understand we cannot share any external customer
> > >> data as all that customer information is proprietary.
> > >>
> > >> In similar situation if you have to share Oracle data, you will
> > >> surely have the same concerns and i don't think you will be able
> > >> to share any such information externally, i.e., outside Oracle.
> > >>
> > >I am asking for a simple query - what amount of CPUs does a 1GB
> > >guest have? The reason for this should be fairly obvious - if
> > >it is a 1vCPU, then there is no multi-queue and the existing
> > >SWIOTLB pool size as it is OK.
> > >
> > >If however there are say 2 and multiqueue is enabled, that
> > >gives me an idea of how many you use and I can find out what
> > >the maximum pool size usage of virtio there is with that configuration.
> > 
> > Again we cannot share any customer data.
> > 
> > Also, I don't think there can be a definitive answer to how many vCPUs a
> > 1GB guest will have; it will depend on what kind of configuration we are
> > testing.
> > 
> > For example, I usually set up 4-16 vCPUs for as little as 512M of configured
> > guest memory.
> 
> Sure, but you are not the normal user.
> 
> That is I don't like that for 1GB guests your patch ends up doubling the
> SWIOTLB memory pool. That seems to me we are trying to solve a problem
> that normal users will not hit. That is why I want 'here is the customer
> bug'.
> 
> Here is what I am going to do - I will take out the 1GB and 4GB case out of
> your patch and call it a day. If there are customers who start reporting issues
> we can revisit that. Nothing wrong with 'Reported-by' XZY (we often ask the
> customer if he or she would like to be recognized on upstream bugs).
>

Ok.

> And in the meantime I am going to look at adding ..
> > 
> > I have been also testing with 16 vCPUs configuration for 512M-1G guest
> > memory with Mellanox SRIOV NICs, and this will be a multi-queue NIC
> > device environment.
> 
> .. late SWIOTLB expansion to stitch the DMA pools together as both
> Mellanox and VirtIO are not 32-bit DMA bound.
> 
> > 
> > So we might be having less configured guest memory, but we still might
> > be using that configuration with I/O intensive workloads.
> >

I am going to submit v4 of my current patch set, which uses max() instead
of clamp() and also replaces the constants defined in this patch with the
pre-defined ones in sizes.h.
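
To illustrate the difference between the two helpers, here is a minimal
user-space sketch. It is not the v4 patch. The function names, the 64M floor
and the 1G ceiling are hypothetical values chosen only for the example, and
the SZ_* macros are local stand-ins for the kernel's sizes.h definitions.

```c
/* Local stand-ins for constants the kernel provides in <linux/sizes.h>. */
#define SZ_16M 0x01000000UL
#define SZ_64M 0x04000000UL
#define SZ_1G  0x40000000UL
#define SZ_2G  0x80000000UL

/*
 * Simplified helpers. The kernel's max() and clamp() additionally
 * type-check their arguments; these do not.
 */
#define SKETCH_MAX(a, b)         ((a) > (b) ? (a) : (b))
#define SKETCH_CLAMP(v, lo, hi)  ((v) < (lo) ? (lo) : ((v) > (hi) ? (hi) : (v)))

/* Hypothetical: enforce only a lower bound on the bounce buffer size. */
static unsigned long size_with_max(unsigned long requested,
                                   unsigned long floor)
{
	return SKETCH_MAX(requested, floor);
}

/* Hypothetical: enforce both a lower and an upper bound. */
static unsigned long size_with_clamp(unsigned long requested,
                                     unsigned long floor,
                                     unsigned long ceiling)
{
	return SKETCH_CLAMP(requested, floor, ceiling);
}
```

With these illustrative values, a 2G request stays at 2G under max() but is
cut to 1G under clamp(), while a 16M request is raised to 64M in both cases.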

Thanks,
Ashish