Date:   Wed, 23 Sep 2020 08:11:02 +0200
From:   Christoph Hellwig <hch@....de>
To:     Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>
Cc:     Christoph Hellwig <hch@....de>,
        Matthew Wilcox <willy@...radead.org>,
        Juergen Gross <jgross@...e.com>,
        Stefano Stabellini <sstabellini@...nel.org>,
        linux-mm@...ck.org, Peter Zijlstra <peterz@...radead.org>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, Minchan Kim <minchan@...nel.org>,
        dri-devel@...ts.freedesktop.org, xen-devel@...ts.xenproject.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        intel-gfx@...ts.freedesktop.org, Nitin Gupta <ngupta@...are.org>,
        Chris Wilson <chris@...is-wilson.co.uk>,
        Matthew Auld <matthew.auld@...el.com>
Subject: Re: [Intel-gfx] [PATCH 3/6] drm/i915: use vmap in shmem_pin_map

On Tue, Sep 22, 2020 at 06:04:37PM +0100, Tvrtko Ursulin wrote:
> Only reason I can come up with now is if mapping side is on a latency 
> sensitive path, while un-mapping is lazy/delayed so can be more costly. 
> Then fast map and extra cost on unmap may make sense.

In general yes.  But compared to the overall operations a small kmalloc
is in the noise, so I'd really like to see numbers.

> It more applies to the other i915 patch, which implements a much more used 
> API, but whether or not we can demonstrate any difference in the perf 
> profiles I couldn't tell you without trying to collect some.

The other patch keeps the stack, as avoiding it would not simplify the
code as significantly.  I still doubt it is all that useful, though.


>> We could do vmalloc_to_page, but that is fairly expensive (not as bad
>> as reading from the page cache..).  Are you really worried about the
>> allocation?
>
> Not so much given how we don't even use shmem_pin_map outside selftests.
>
> If we start using it I expect it will be for tiny objects anyway. Only if 
> they end up being pinned for the lifetime of the driver, it may be a 
> pointless waste of memory compared to the downsides of vmalloc_to_page. But 
> we can revisit this particular edge case optimization if the need arises.

For tiny objects we could either look into using kmap, or in fact
ensure the shmem files aren't in highmem, in which case you could
always use single-page mappings without any extra mapping.
