linux-kernel - Re: [PATCH 0/3]HTLB mapping for drivers (take 2)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20090825104731.GA21335@csn.ul.ie>
Date:	Tue, 25 Aug 2009 11:47:31 +0100
From:	Mel Gorman <mel@....ul.ie>
To:	Alexey Korolev <akorolex@...il.com>
Cc:	Eric Munson <linux-mm@...bm.net>,
	Alexey Korolev <akorolev@...radead.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3]HTLB mapping for drivers (take 2)

On Thu, Aug 20, 2009 at 07:03:28PM +1200, Alexey Korolev wrote:
> Mel,
> 
> >> User level applications process the data.
> >> Device is using a master DMA to send data to the user buffer, buffer
> >> size can be >1GB and performance is very important. (So huge pages
> >> mapping really makes sense.)
> >>
> >
> > Ok, so the DMA may be faster because you have to do less scatter/gather
> > and can DMA in larger chunks and and reading from userspace may be faster
> > because there is less translation overhead. Right?
> >
> Less translation overhead is important. Unfortunately not all devices
> have scatter/gather
> (our case) as having it increase h/w complexity a lot.
> 

Ok.

> >> In addition we have to mention that:
> >> 1. It is hard for user to tell how much huge pages needs to be
> >>    reserved by the driver.
> >
> > I think you have this problem either way. If the buffer is allocated and
> > populated before mmap(), then the driver is going to have to guess how many
> > pages it needs. If the DMA occurs as a result of mmap(), it's easier because
> > you know the number of huge pages to be reserved at that point and you have
> > the option of falling back to small pages if necessary.
> >
> >> 2. Devices add constrains on memory regions. For example it needs to
> >>    be contiguous with in the physical address space. It is necessary to
> >>   have ability to specify special gfp flags.
> >
> > The contiguity constraints are the same for huge pages. Do you mean there
> > are zone restrictions? If so, the hugetlbfs_file_setup() function could be
> > extended to specify a GFP mask that is used for the allocation of hugepages
> > and associated with the hugetlbfs inode. Right now, there is a htlb_alloc_mask
> > mask that is applied to some additional flags so htlb_alloc_mask would be
> > the default mask unless otherwise specified.
> >
> Under contiguous I mean that we need several huge pages being
> physically contiguous.

Why? One hugepage of default size will be one TLB entry. Each hugepage
after that will be additional TLB entries so there is no savings on
translation overhead.

Getting contiguous pages beyond the hugepage boundary is not a matter
for GFP flags.

> To obtain it we allocate pages till not find a contig. region
> (success) or reach a boundary (fail).
> So in our particular case approach based on getting pages from
> hugetlbfs won't work
> because memory region will not be contiguous.

With a direct allocation of hugepages, there is no guarantee they will
be contiguous either. If you need contiguity above hugepages (which in
many cases will also be the largest page the buddy allocator can grant),
you need something else.

> However this approach will give an easy way to support hugetlb
> mapping, it won't cause any complexity
> in accounting. But it will be suitable for hardware with large amount
> of sg regions only.
> 
> >
> > How about;
> >
> >        o Extend Eric's helper slightly to take a GFP mask that is
> >          associated with the inode and used for allocations from
> >          outside the hugepage pool
> >        o A helper that returns the page at a given offset within
> >          a hugetlbfs file for population before the page has been
> >          faulted.
>
> Do you mean get_user_pages call?
> 

If you're willing to call it directly, sure.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/