[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20220630024238.GA884@gao-cwp>
Date:   Thu, 30 Jun 2022 10:42:43 +0800
From:   Chao Gao <chao.gao@...el.com>
To:     linux-kernel@...r.kernel.org
Cc:     dave.hansen@...el.com, len.brown@...el.com, tony.luck@...el.com,
        rafael.j.wysocki@...el.com, reinette.chatre@...el.com,
        dan.j.williams@...el.com, kirill.shutemov@...ux.intel.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com,
        ilpo.jarvinen@...ux.intel.com, Andi Kleen <ak@...ux.intel.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Borislav Petkov <bp@...e.de>,
        Muchun Song <songmuchun@...edance.com>,
        Kees Cook <keescook@...omium.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        Damien Le Moal <damien.lemoal@...nsource.wdc.com>,
        linux-doc@...r.kernel.org, linux-pm@...r.kernel.org,
        iommu@...ts.linux-foundation.org
Subject: Re: [PATCH v1 3/3] swiotlb: Split up single swiotlb lock
On Tue, Jun 28, 2022 at 03:01:34PM +0800, Chao Gao wrote:
>From: Andi Kleen <ak@...ux.intel.com>
>
>Traditionally swiotlb was not performance critical because it was only
>used for slow devices. But in some setups, like TDX confidential
>guests, all IO has to go through swiotlb. Currently swiotlb only has a
>single lock. Under high IO load with multiple CPUs this can lead to
>signifiant lock contention on the swiotlb lock. We've seen 20+% CPU
>time in locks in some extreme cases.
>
>This patch splits the swiotlb into individual areas which have their
>own lock. Each CPU tries to allocate in its own area first. Only if
>that fails does it search other areas. On freeing the allocation is
>freed into the area where the memory was originally allocated from.
>
>To avoid doing a full modulo in the main path the number of swiotlb
>areas is always rounded to the next power of two. I believe that's
>not really needed anymore on modern CPUs (which have fast enough
>dividers), but still a good idea on older parts.
>
>The number of areas can be set using the swiotlb option. But to avoid
>every user having to set this option set the default to the number of
>available CPUs. Unfortunately on x86 swiotlb is initialized before
>num_possible_cpus() is available, that is why it uses a custom hook
>called from the early ACPI code.
>
>Signed-off-by: Andi Kleen <ak@...ux.intel.com>
>[ rebase and fix warnings of checkpatch.pl ]
>Signed-off-by: Chao Gao <chao.gao@...el.com>
Just noticed that Tianyu already posted a variant of this patch.
Will drop this one from my series.
Powered by blists - more mailing lists
 
