[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1224443274.5526.39.camel@vonnegut.anholt.net>
Date: Sun, 19 Oct 2008 12:07:54 -0700
From: Eric Anholt <eric@...olt.net>
To: Ingo Molnar <mingo@...e.hu>
Cc: Keith Packard <keithp@...thp.com>,
Jesse Barnes <jbarnes@...tuousgeek.org>,
Nick Piggin <nickpiggin@...oo.com.au>,
Dave Airlie <airlied@...ux.ie>,
Yinghai Lu <yinghai@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
dri-devel@...ts.sf.net, Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: io resources and cached mappings (was: [git pull] drm patches
for 2.6.27-rc1)
On Sun, 2008-10-19 at 19:53 +0200, Ingo Molnar wrote:
> * Keith Packard <keithp@...thp.com> wrote:
>
> > On Sat, 2008-10-18 at 21:14 -0700, Keith Packard wrote:
> > > On Sun, 2008-10-19 at 00:32 +0200, Ingo Molnar wrote:
> > >
> > > > Mind sending patches for this? :-)
> >
> > Here's a patch for the i915 driver that includes the new API. Tested
> > on x86_32+HIGHMEM and x86_64. I stuck a new 'io_reserve.h' header in
> > the i915 directory for this patch, but it should go elsewhere.
> >
> > The new APIs are:
> >
> > io_reserve_create_wc
> > io_reserve_free
> > io_reserve_map_atomic_wc
> > io_reserve_unmap_atomic
> > io_reserve_map_wc
> > io_reserve_unmap
>
> very nice!
>
> I think we need a somewhat different abstraction though.
>
> Firstly, regarding drivers/gpu/drm/i915/io_reserve.h, that needs to move
> to generic code.
>
> Secondly, wouldnt the right abstraction be to attach this functionality
> to 'struct resource' ? [or at least create a second struct that embedds
> struct resource]
>
> this abstraction is definitely not a PCI thing and not a
> detached-from-everything thing, it's an IO resource thing. We could make
> it a property of struct resource:
>
> struct resource {
> resource_size_t start;
> resource_size_t end;
> const char *name;
> unsigned long flags;
> struct resource *parent, *sibling, *child;
> + void *mapping;
> };
>
> The APIs would be:
>
> int io_resource_init_mapping(struct resource *res);
> void io_resource_free_mapping(struct resource *res);
> void * io_resource_map(struct resource *res, pfn_t pfn, unsigned long offset);
> void io_resource_unmap(struct resource *res, void *kaddr);
>
> Note how simple and consistent it all gets: IO resources already know
> their physical location and their size limits. Being able to cache an
> ioremap in a mapping [and being able to use atomic kmaps on 32-bit] is a
> relatively simple and natural extension to the concept.
>
> i think that would be quite acceptable - and the APIs could just
> transparently work on it. This would also allow the PCI code to
> automatically unmap any cached mappings from resources, when the driver
> deinitializes.
>
> Linus, Jesse, what do you think?
>
> i think we need to finalize the API names and their abstraction level,
> and then could even merge those APIs into v2.6.28 on a fast path, to
> enable you to use it. It does not interact with anything else so it
> should be safe to do.
This API needs the cacheability control, which I don't see in it
currently. In the past we've been relying on an MTRR over the GTT
resulting in all of our UC- mappings getting us the correct WC behavior,
but now there aren't enough MTRRs to go around and graphics loses out
(at about a 20% CPU cost as a conservative estimate). The primary goal
of the new API is to let us eat PAT costs up front so we don't have to
at runtime.
Second, we need to know when we're doing a mapping whether we're
affected by atomic scheduling restrictions. Right now our plan has been
to try doing page-by-page
io_map_atomic_wc()/copy_from_user_inatomic()/io_unmap_atomic(), and if
we fail at that at some point (map returns NULL or we get a partial
completion from copy_from_user_inatomic) then fall back to io_map_wc()
and copy_from_user() the whole thing at once. That gets us good
performance on both x86 with highmem and x86-64, and not too shabby
performance on x86 non-highmem.
Also, while it's rare, there have been graphics cards (looking at you,
S3) where BARs were expensive for some reason and they stuffed both the
framebuffer and registers into one PCI BAR, where you want the FB to be
WC and the registers to be UC. Not sure if they would be supportable
with this API or not. And if it's not, I'm not sure how much we care to
design for them, but it's something to potentially consider.
Finally, I'm confused by the pfn and offset args to io_resource_map,
when I expected something parallel to ioremap but with our resource arg
added.
--
Eric Anholt
eric@...olt.net eric.anholt@...el.com
Download attachment "signature.asc" of type "application/pgp-signature" (198 bytes)
Powered by blists - more mailing lists