linux-kernel - Re: [PATCH 3/7] drm/vc4: Add KMS support for Raspberry Pi.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87zj1uiyfv.fsf@eliezer.anholt.net>
Date:	Thu, 13 Aug 2015 16:03:00 -0700
From:	Eric Anholt <eric@...olt.net>
To:	Russell King - ARM Linux <linux@....linux.org.uk>
Cc:	Daniel Vetter <daniel@...ll.ch>, devicetree@...r.kernel.org,
	Stephen Warren <swarren@...dotorg.org>,
	Lee Jones <lee@...nel.org>, linux-kernel@...r.kernel.org,
	dri-devel@...ts.freedesktop.org,
	linux-rpi-kernel@...ts.infradead.org,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 3/7] drm/vc4: Add KMS support for Raspberry Pi.

Russell King - ARM Linux <linux@....linux.org.uk> writes:

> On Thu, Aug 13, 2015 at 01:44:03PM -0700, Eric Anholt wrote:
>> Struct mutex is here because this code is from the V3D series, with the
>> in-kernel BO cache ripped out (it turns out that the CMA allocator is
>> slow, and you can't just userspace cache since we have to do allocations
>> within the kernel to the tune of a couple per draw and that's too much).
>
> The CMA allocator is fast until you have pinned pages in its region,
> where it becomes _very_ slow to do allocations, sometimes getting up
> to the order of seconds.
>
> The main culpret of this are GFP_HIGHUSER_MOVABLE allocations which
> then pin the page.  It doesn't take many of those to make CMA really
> inefficient.
>
> The problem is that CMA doesn't get any information back from the
> internal page migration about which pages couldn't be moved, so it
> dumbly just tries incrementing the allocation by one page (subject
> to alignment constraints) and retrying again - repeating over the
> entire CMA region.  The bigger the region, the more time this takes.

Ouch.

Since I can workaround the allocation cost, the main problem I have
right now is that I've got a set of small allocations for 3D that all
need to have the same high 4 bits of paddr, because someone cleverly
packed some address bits in a GPU-managed structure.  Any
recommendations for ways to handle this with CMA?

Download attachment "signature.asc" of type "application/pgp-signature" (819 bytes)