Date: Wed, 22 Jan 2020 17:34:31 +0000
From: Robin Murphy <robin.murphy@....com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: iommu@...ts.linux-foundation.org, LKML <linux-kernel@...r.kernel.org>,
    Joerg Roedel <joro@...tes.org>, John Garry <john.garry@...wei.com>
Subject: Re: [Patch v3 2/3] iommu: optimize iova_magazine_free_pfns()

On 21/01/2020 5:29 pm, Cong Wang wrote:
> On Tue, Jan 21, 2020 at 1:52 AM Robin Murphy <robin.murphy@....com> wrote:
>>
>> On 18/12/2019 4:39 am, Cong Wang wrote:
>>> If the magazine is empty, iova_magazine_free_pfns() should
>>> be a nop, however it misses the case of mag->size==0. So we
>>> should just call iova_magazine_empty().
>>>
>>> This should reduce the contention on iovad->iova_rbtree_lock
>>> a little bit, not much at all.
>>
>> Have you measured that in any way? AFAICS the only time this can get
>> called with a non-full magazine is in the CPU hotplug callback, where
>> the impact of taking the rbtree lock and immediately releasing it seems
>> unlikely to be significant on top of everything else involved in that
>> operation.
>
> This patchset is only tested as a whole, it is not easy to deploy
> each to production and test it separately.
>
> Is there anything wrong to optimize a CPU hotplug path? :) And,
> it is called in alloc_iova_fast() too when, for example, over-cached.

And if the IOVA space is consumed to the point that we've fallen back to
that desperate last resort, what do you think the chances are that a
significant number of percpu magazines will be *empty*? Also bear in mind
that in that case we've already walked the rbtree once, so any notion of
still being fast is long, long gone.

As for CPU hotplug, it's a comparatively rare event involving all manner
of system-wide synchronisation, and the "optimisation" of shaving a few
dozen CPU cycles off at one point *if* things happen to line up correctly
is taking a cup of water out of a lake. If the domain is busy at the time,
then once again chances are the magazines aren't empty and having an extra
check redundant with the loop condition simply adds (trivial, but nonzero)
overhead to every call. And if the domain isn't busy, then the lock is
unlikely to be contended anyway.

Sorry, but without convincing evidence, this change just looks like churn
for the sake of it.

Robin.
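
For readers following the thread, the trade-off under discussion can be sketched as a small user-space model. This is not the kernel code: the struct names, the lock counter standing in for iovad->iova_rbtree_lock, and MODEL_MAG_SIZE are all assumptions made up for illustration. It only shows that the loop condition already makes the empty case a no-op for the frees, so an early return saves one lock/unlock pair on an empty magazine and costs one extra comparison on every other call.

```c
#include <assert.h>

#define MODEL_MAG_SIZE 128	/* hypothetical capacity, illustration only */

/* Hypothetical stand-in for a per-CPU magazine of cached PFNs. */
struct model_magazine {
	unsigned long size;
	unsigned long pfns[MODEL_MAG_SIZE];
};

/* Hypothetical stand-in for the domain; counts lock acquisitions
 * instead of taking a real spinlock. */
struct model_domain {
	unsigned long lock_acquisitions;
	unsigned long frees;
};

static void model_lock(struct model_domain *d)   { d->lock_acquisitions++; }
static void model_unlock(struct model_domain *d) { (void)d; }

/* No early return: with size == 0 the loop body never runs, but the
 * lock is still taken and released once. */
static void model_free_pfns_plain(struct model_magazine *mag,
				  struct model_domain *d)
{
	unsigned long i;

	model_lock(d);
	for (i = 0; i < mag->size; ++i)
		d->frees++;	/* stands in for freeing one cached PFN */
	model_unlock(d);

	mag->size = 0;
}

/* With the proposed check: an empty magazine skips the lock entirely,
 * a non-empty one pays for one comparison that the loop repeats anyway. */
static void model_free_pfns_checked(struct model_magazine *mag,
				    struct model_domain *d)
{
	if (mag->size == 0)
		return;

	model_free_pfns_plain(mag, d);
}
```

Calling both variants on an empty magazine differs only in the lock counter; on a non-empty one they behave identically, which is why the benefit depends entirely on how often the magazine is actually empty at the call sites.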