Date:   Wed, 22 Jan 2020 17:34:31 +0000
From:   Robin Murphy <robin.murphy@....com>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     iommu@...ts.linux-foundation.org,
        LKML <linux-kernel@...r.kernel.org>,
        Joerg Roedel <joro@...tes.org>,
        John Garry <john.garry@...wei.com>
Subject: Re: [Patch v3 2/3] iommu: optimize iova_magazine_free_pfns()

On 21/01/2020 5:29 pm, Cong Wang wrote:
> On Tue, Jan 21, 2020 at 1:52 AM Robin Murphy <robin.murphy@....com> wrote:
>>
>> On 18/12/2019 4:39 am, Cong Wang wrote:
>>> If the magazine is empty, iova_magazine_free_pfns() should
>>> be a nop, however it misses the case of mag->size==0. So we
>>> should just call iova_magazine_empty().
>>>
>>> This should reduce the contention on iovad->iova_rbtree_lock
>>> a little bit, not much at all.
>>
>> Have you measured that in any way? AFAICS the only time this can get
>> called with a non-full magazine is in the CPU hotplug callback, where
>> the impact of taking the rbtree lock and immediately releasing it seems
>> unlikely to be significant on top of everything else involved in that
>> operation.
> 
> This patchset is only tested as a whole, it is not easy to deploy
> each to production and test it separately.
> 
> Is there anything wrong to optimize a CPU hotplug path? :) And,
> it is called in alloc_iova_fast() too when, for example, over-cached.

And if the IOVA space is consumed to the point that we've fallen back to 
that desperate last resort, what do you think the chances are that a 
significant number of percpu magazines will be *empty*? Also bear in 
mind that in that case we've already walked the rbtree once, so any 
notion of still being fast is long, long gone.

As for CPU hotplug, it's a comparatively rare event involving all manner 
of system-wide synchronisation, and the "optimisation" of shaving a few 
dozen CPU cycles off at one point *if* things happen to line up 
correctly amounts to taking a cup of water out of a lake. If the domain is busy 
at the time, then once again chances are the magazines aren't empty and 
having an extra check redundant with the loop condition simply adds 
(trivial, but nonzero) overhead to every call. And if the domain isn't 
busy, then the lock is unlikely to be contended anyway.
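
To make the trade-off concrete, here is a minimal userspace sketch, not the 
kernel code. All toy_* names, the MAG_SIZE value and the counter fields are 
assumptions made up for illustration, and the "lock" is only a counter. It 
shows that when the magazine is empty the loop already does nothing, so an 
early return saves exactly one uncontended lock/unlock pair, while a 
non-empty magazine pays for an extra comparison on every call.

```c
#include <assert.h>
#include <stddef.h>

#define MAG_SIZE 128 /* assumption: illustrative capacity only */

/* Toy stand-in for a per-CPU magazine of cached PFNs. */
struct toy_magazine {
	unsigned long size;
	unsigned long pfns[MAG_SIZE];
};

/* Toy stand-in for the domain; counters replace the real lock and rbtree. */
struct toy_domain {
	unsigned long lock_acquisitions; /* how often the "lock" was taken */
	unsigned long freed;             /* how many entries were released */
};

static void toy_lock(struct toy_domain *d)   { d->lock_acquisitions++; }
static void toy_unlock(struct toy_domain *d) { (void)d; }

/* Variant without the extra check: the loop condition alone handles size == 0. */
static void toy_free_pfns(struct toy_magazine *mag, struct toy_domain *d)
{
	unsigned long i;

	if (!mag)
		return;
	assert(mag->size <= MAG_SIZE);

	toy_lock(d);
	for (i = 0; i < mag->size; ++i)
		d->freed++; /* stands in for looking up and freeing one entry */
	toy_unlock(d);

	mag->size = 0;
}

/* Variant with the proposed early return for an empty magazine. */
static void toy_free_pfns_checked(struct toy_magazine *mag, struct toy_domain *d)
{
	if (!mag || mag->size == 0)
		return; /* skips the lock entirely, but tests size on every call */
	toy_free_pfns(mag, d);
}
```

Calling both variants on an empty magazine frees nothing in either case; the 
only observable difference is one lock acquisition. On a non-empty magazine 
both behave identically.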

Sorry, but without convincing evidence, this change just looks like 
churn for the sake of it.

Robin.
