lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 10 Jan 2008 11:20:38 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Jens Axboe <jens.axboe@...cle.com>
Cc:	Christoph Hellwig <hch@...radead.org>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	linux-kernel@...r.kernel.org, chris.mason@...cle.com,
	linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH][RFC] fast file mapping for loop


On Thu, 2008-01-10 at 11:02 +0100, Jens Axboe wrote:
> On Thu, Jan 10 2008, Peter Zijlstra wrote:
> > 
> > On Thu, 2008-01-10 at 10:49 +0100, Jens Axboe wrote:
> > > On Thu, Jan 10 2008, Peter Zijlstra wrote:
> > > > 
> > > > On Thu, 2008-01-10 at 08:37 +0000, Christoph Hellwig wrote:
> > > > 
> > > > > Peter, any chance you could chime in here?
> > > > 
> > > > I have this patch to add swap_out/_in methods. I expect we can loosen
> > > > the requirement for swapcache pages and change the name a little.
> > > > 
> > > > previously posted here:
> > > >   http://lkml.org/lkml/2007/5/4/143
> > > > 
> > > > --- 
> > > > Subject: mm: add support for non block device backed swap files
> > > > 
> > > > New addres_space_operations methods are added:
> > > >   int swapfile(struct address_space *, int);
> > > >   int swap_out(struct file *, struct page *, struct writeback_control *);
> > > >   int swap_in(struct file *, struct page *);
> > > > 
> > > > When during sys_swapon() the swapfile() method is found and returns no error
> > > > the swapper_space.a_ops will proxy to sis->swap_file->f_mapping->a_ops, and
> > > > make use of swap_{out,in}() to write/read swapcache pages.
> > > > 
> > > > The swapfile method will be used to communicate to the address_space that the
> > > > VM relies on it, and the address_space should take adequate measures (like 
> > > > reserving memory for mempools or the like).
> > > > 
> > > > This new interface can be used to obviate the need for ->bmap in the swapfile
> > > > code. A filesystem would need to load (and maybe even allocate) the full block
> > > > map for a file into memory and pin it there on ->swapfile(,1) so that
> > > > ->swap_{out,in}() have instant access to it. It can be released on
> > > > ->swapfile(,0).
> > > 
> > > So this is where I don't think that's good enough, you cannot require a
> > > full block/extent mapping of a file on setup. It can take quite some
> > > time, a little testing I did here easily took 5 seconds for only a
> > > couple of gigabytes. And that wasn't even worst case for that size. It
> > > also wastes memory by populating extents that we may never read or
> > > write.
> > > 
> > > If you look at the loop addition I did, it populates lazily as needed
> > > with some very simple logic to populate-ahead. In practice that performs
> > > as well as a pre-populated map, the first IO to a given range will just
> > > be a little slower since we have to bmap() it.
> > > 
> > > Do you have plans to improve this area?
> > 
> > Nope, for swap it _must_ be there, there is just no way we can do block
> > allocation on swapout.
> 
> I appreciate that fact, just saying that we could have more flexibility
> for other uses.
> 
> > That said, the swapfile() interface can be used to pre-populate the
> > extend/block mapping, and when using swap_(in/out) without it, it can be
> > done lazily.
> 
> I think the interface is too simple, I would much rather have a way to
> dig into the file mappings and be allowed to request population and so
> on. Without that, it can't be used for eg loop.

I'm open to suggestions :-), if you can propose an interface we both can
use I'm all ears.

Although poking at the extends sounds like bmap and we were just trying
to get rid of that.

Perhaps something like badvise(), where you can suggest loading/dropping
extend information on a voluntary basis.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ