Date:	Mon, 29 Sep 2008 18:10:05 -0700
From:	Eric Anholt <eric@...olt.net>
To:	Nick Piggin <npiggin@...e.de>
Cc:	keith.packard@...el.com, hugh@...itas.com, hch@...radead.org,
	airlied@...ux.ie, jbarnes@...tuousgeek.org,
	thomas@...gstengraphics.com, dri-devel@...ts.sourceforge.net,
	Linux Memory Management List <linux-mm@...ck.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [patch] mm: pageable memory allocator (for DRM-GEM?)

On Tue, 2008-09-23 at 11:10 +0200, Nick Piggin wrote:
> Hi,
> 
> So I promised I would look at this again, because I (and others) have some
> issues with exporting shmem_file_setup for DRM-GEM to go off and do things
> with.
> 
> The rationale for using shmem seems to be that pageable "objects" are needed,
> and they can't be created by userspace because that would be ugly for some
> reason, and/or they are required before userland is running.
> 
> I particularly don't like the idea of exposing these vfs objects to random
> drivers because they're likely to get things wrong or become out of synch
> or unreviewed if things change. I suggested a simple pageable object allocator
> that could live in mm and hide the exact details of how shmem / pagecache
> works. So I've coded that up quickly.

Hiding the details of shmem and the pagecache sounds pretty good to me
(since we've got it wrong at least twice so far).  Hopefully the result
isn't even more fragile code on our part.

> Upon actually looking at how "GEM" makes use of its shmem_file_setup filp, I
> see something strange... it seems that userspace actually gets some kind of
> descriptor, a descriptor to an object backed by this shmem file (let's call it
> a "file descriptor"). Anyway, it turns out that userspace sometimes needs to
> pread, pwrite, and mmap these objects, but unfortunately it has no direct way
> to do that, due to not having open(2)ed the files directly. So what GEM does
> is to add some ioctls which take the "file descriptor" things, and derives
> the shmem file from them, and then calls into the vfs to perform the operation.
> 
> If my cursory reading is correct, then my allocator won't work so well as a
> drop in replacement because one isn't allowed to know about the filp behind
> the pageable object. It would also indicate some serious crack smoking by
> anyone who thinks open(2), pread(2), mmap(2), etc is ugly in comparison...

I think the explanation for this got covered in other parts of the
thread, but drm_gem.c comments at the top also cover it.
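
Roughly, the path Nick describes boils down to something like the sketch
below.  The my_gem_* names, the args struct, and the lookup/put helpers
are made up for illustration; this is not the actual drm_gem.c code,
just the shape of it.

#include <linux/types.h>
#include <linux/fs.h>
#include "drmP.h"	/* struct drm_device, struct drm_file */

struct my_gem_object {
	struct file *filp;	/* from shmem_file_setup() */
};

struct my_gem_pread {
	__u32 handle;		/* the object "descriptor" userspace holds */
	__u32 pad;
	__u64 offset;		/* byte offset into the object */
	__u64 size;		/* bytes to read */
	__u64 data_ptr;		/* userspace buffer to read into */
};

static int my_gem_pread_ioctl(struct drm_device *dev, void *data,
			      struct drm_file *file_priv)
{
	struct my_gem_pread *args = data;
	struct my_gem_object *obj;
	loff_t pos = args->offset;
	ssize_t ret;

	/* Derive the object (and its shmem file) from the handle that was
	 * handed to userspace in place of a real fd; the lookup helper is
	 * a placeholder. */
	obj = my_gem_object_lookup(file_priv, args->handle);
	if (!obj)
		return -EINVAL;

	/* Call into the VFS against obj->filp rather than open-coding a
	 * page walk in the driver. */
	ret = vfs_read(obj->filp,
		       (char __user *)(unsigned long)args->data_ptr,
		       args->size, &pos);

	my_gem_object_put(obj);
	return ret < 0 ? ret : 0;
}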

> So please, nobody who worked on that code is allowed to use ugly as an
> argument. Technical arguments are fine, so let's try to cover them.
> 
> BTW. without knowing much of either the GEM or the SPU subsystems, the
> GEM problem seems similar to SPU. Did anyone look at that code? Was it ever
> considered to make the object allocator be a filesystem? That way you could
> control the backing store to the objects yourself, those that want pageable
> memory could use the following allocator, the ioctls could go away,
> you could create your own objects if needed before userspace is up...

Yes, we definitely considered a filesystem (it would be nice for
debugging to be able to look at object contents from a debugger process
easily).  However, once we realized that fds just wouldn't work (we're
allocating objects in a library, so we couldn't just dup2 them up high,
and we couldn't rely on being able to up the open file limit for the
process), shmem seemed to already be exactly what we wanted, and we
assumed that whatever future API changes in the couple of VFS and
pagecache calls we made would be easier to track than duplicating all of
shmem.c into our driver.
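
For the record, this is roughly the per-object fd juggling a library
would be stuck doing; a userspace sketch, not code from our tree, and
stash_object_fd() is a made-up name:

#include <fcntl.h>
#include <unistd.h>
#include <sys/resource.h>

/* Park an object's fd near the top of the fd table so it stays out of
 * the application's way.  Every object still burns a slot, so with
 * thousands of objects this runs straight into RLIMIT_NOFILE, and a
 * library can't reasonably raise that behind the application's back. */
static int stash_object_fd(int fd)
{
	struct rlimit rl;
	int high;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -1;

	high = fcntl(fd, F_DUPFD,
		     (int)(rl.rlim_cur > 256 ? rl.rlim_cur - 256 : 0));
	if (high >= 0)
		close(fd);
	return high;
}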

I'm porting our stuff to test on your API now, and everything looks
straightforward except for mmap.  For that I seem to have three options:

1) Implement a range allocator on the DRM device and have a hashtable of
ranges to objects, then have a GEM hook in the mmap handler of the drm
device when we find we're in one of those ranges.

This was the path that TTM took.  Since we have different paths to
mmapping objects (direct backing store access or aperture access,
though the second isn't in my tree yet), we end up either having
multiple offsets to represent the different mmap types, or multiplying
the size of the range and having the top half of the range mean the
other mmap type.
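
In code, option 1 amounts to something like the sketch below; the range
table and all the my_gem_* helpers are hand-waved placeholders, not
existing functions:

#include <linux/mm.h>

static int my_gem_mmap(struct file *filp, struct vm_area_struct *vma)
{
	unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
	struct my_gem_range *range;

	/* filp->private_data is the drm_file; the table of allocated
	 * ranges hangs off the device behind it. */
	range = my_gem_lookup_range(filp->private_data, offset);
	if (!range)
		return -EINVAL;	/* or fall back to the legacy drm maps */

	/* Either one offset per mapping type, or double the range and let
	 * the top half mean the other type; we dispatch here either way. */
	switch (range->map_type) {
	case MY_GEM_MAP_BACKING:
		return my_gem_mmap_backing(range->obj, vma);
	case MY_GEM_MAP_APERTURE:
		return my_gem_mmap_aperture(range->obj, vma);
	default:
		return -EINVAL;
	}
}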

2) Create a kernel-internal filesystem and get struct files for the
objects.

This is the method that seemed like the right thing to do in the Linux
style, so I've been trying to figure that part out.  I've been assured
that libfs makes my job easy here, but as I look at it I'm less sure.
The sticking point to me is how the page list I get from your API ends
up getting used by simple_file_* and generic_file_*.  And, in the future
where the pageable memory allocator is actually pageable while mmapped,
what does the API I get to consume look like, roughly?
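
To be concrete about that last question, here is pure speculation about
what consuming the allocator from a fault handler might look like once
objects stay pageable while mmapped; pgalloc_get_page() stands in for
whatever your allocator would export and is not a real function:

#include <linux/mm.h>

static int my_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
	struct my_gem_object *obj = vma->vm_private_data;
	struct page *page;

	/* Ask the allocator for the page at this offset, letting it pull
	 * the data back in if it was written out while the object sat
	 * idle.  Assume it hands the page back without a reference. */
	page = pgalloc_get_page(obj->backing, vmf->pgoff);
	if (!page)
		return VM_FAULT_SIGBUS;

	get_page(page);		/* fault handlers return a referenced page */
	vmf->page = page;
	return 0;
}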

3) Use shmem_file_setup()

This was what we originally went with.  It got messy once we wanted a
different mmap path, at which point we had to do 1) or 2) anyway.

Also, I'm looking at a bunch of spu*.c code, and I'm having a hard time
finding something relevant for us, but maybe I'm not looking in the
right place.  Can you elaborate on that comment?

-- 
Eric Anholt
eric@...olt.net                         eric.anholt@...el.com


