linux-kernel - Re: [git pull] drm patches for 2.6.27-rc1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Sat, 18 Oct 2008 18:47:05 -0400
From:	"Jon Smirl" <jonsmirl@...il.com>
To:	"Ingo Molnar" <mingo@...e.hu>
Cc:	"Keith Packard" <keithp@...thp.com>,
	"Linus Torvalds" <torvalds@...ux-foundation.org>,
	"Nick Piggin" <nickpiggin@...oo.com.au>,
	"Dave Airlie" <airlied@...ux.ie>,
	"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
	dri-devel@...ts.sf.net,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	"Yinghai Lu" <yinghai@...nel.org>
Subject: Re: [git pull] drm patches for 2.6.27-rc1

On Sat, Oct 18, 2008 at 6:32 PM, Ingo Molnar <mingo@...e.hu> wrote:
>
> * Keith Packard <keithp@...thp.com> wrote:
>
>> On Sat, 2008-10-18 at 22:37 +0200, Ingo Molnar wrote:
>>
>> > But i think the direction of the new GEM code is subtly wrong here,
>> > because it tries to manage memory even on 64-bit systems. IMO it
>> > should just map the _whole_ graphics aperture (non-cached) and be
>> > done with it. There's no faster method at managing pages than the
>> > CPU doing a TLB fill from pagetables.
>>
>> Yeah, we're stuck thinking that we "can't" map the aperture because
>> it's too large, but with a 64-bit kernel, we should be able to keep it
>> mapped permanently.
>>
>> Of course, the io_reserve_pci_resource and io_map_atomic functions
>> could do precisely that, as kmap_atomic does on non-HIGHMEM systems
>> today.
>
> okay, so basically what we need is a shared API that does per page
> kmap_atomic on 32-bit, and just an ioremap() on 64-bit. I had the
> impression that you were suggesting to extend kmap_atomic() to 64-bit -
> which would be wrong.

Is it possible to use a segment register to map the whole aperture on
32b? A segment register might allow common code on 64b/32b by
eliminating the need to move the mapping window around.

>
> So, in terms of the 4 APIs you suggest:
>
>  struct io_mapping *io_reserve_pci_resource(struct pci_dev *dev,
>                                             int bar,
>                                             int prot);
>  void io_mapping_free(struct io_mapping *mapping);
>
>  void *io_map_atomic(struct io_mapping *mapping, unsigned long pfn);
>  void io_unmap_atomic(struct io_mapping *mapping, unsigned long pfn);
>
> here is what we'd do on 64-bit:
>
>  - io_reserve_pci_resource() would just do an ioremap(), and would save
>    the ioremap-ed memory into struct io_mapping.
>
>  - io_mapping_free() does the iounmap()
>
>  - io_map_atomic(): just arithmetics, returns mapping->base + pfn - no
>    TLB activities at all.
>
>  - io_unmap_atomic(): NOP.
>
> it's as fast as it gets: zero overhead in essence. Note that it's also
> shared between all CPUs and there's no aliasing trouble.
>
> And we could make it even faster: if you think we could even use 2MB
> TLBs for the _linear_ ioremap()s here, hm? There's plenty of address
> space on 64-bit so we can align to 2MB just fine - and aperture sizes
> are 2MB sized anyway.
>
> Or we could go one step further and install these aperture mappings into
> the _kernel linear_ address space. That would be even faster, because
> we'd have a constant offset. We have the (2MB mappings aware) mechanism
> for that already. (Yinghai Cc:-ed - he did a lot of great work to
> generalize this area.)
>
> (In fact if we installed it into the linear kernel address space, and if
> the aperture is 1GB aligned, we will automatically use gbpages for it.
> Were Intel to support gbpages in the future ;-)
>
> the _real_ remapping in a graphics aperture happens on the GPU level
> anyway, you manage an in-RAM GPU pagetable that just works like an
> IOMMU, correct?
>
> on 32-bit we'd have what you use in the GEM code today:
>
>  - io_reserve_pci_resource(): a NOP in essence
>
>  - io_mapping_free(): a NOP
>
>  - io_map_atomic(): does a kmap_atomic(pfn)
>
>  - io_unmap_atomic(): does a kunmap_atomic(pfn)
>
> so on 32-bit we have the INVLPG TLB overhead and preemption restrictions
> - but we knew that. We'd have to allow atomic_kmap() on non-highmem as
> well but that's fair.
>
> Mind sending patches for this? :-)
>
>        Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>



-- 
Jon Smirl
jonsmirl@...il.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/