Date:	Tue, 29 Jul 2014 08:12:59 -0400
From:	Matthew Wilcox <willy@...ux.intel.com>
To:	Jan Kara <jack@...e.cz>
Cc:	Matthew Wilcox <matthew.r.wilcox@...el.com>,
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 07/22] Replace the XIP page fault handler with the DAX
 page fault handler

On Wed, Apr 09, 2014 at 11:43:31PM +0200, Jan Kara wrote:
> So there are three places that can fail after we allocate the block:
> 1) We race with truncate reducing i_size
> 2) dax_get_pfn() fails
> 3) vm_insert_mixed() fails
> 
> I would guess that 2) can fail only if the HW has problems and leaking
> block in that case could be acceptable (please correct me if I'm wrong).
> 3) shouldn't fail because of ENOMEM because fault has already allocated all
> the page tables and EBUSY should be handled as well. So the only failure we
> have to care about is 1). And we could move ->get_block() call under
> i_mmap_mutex after the i_size check.  Lock ordering should be fine because
> i_mmap_mutex ranks above page lock under which we do block mapping in
> standard ->page_mkwrite callbacks. The only (big) drawback is that
> i_mmap_mutex will now be held for much longer time and thus the contention
> would be much higher. But hopefully once we resolve our problems with
> mmap_sem and introduce mapping range lock we could scale reasonably.

Lockdep barfs on holding i_mmap_mutex while calling ext4's ->get_block.

Path 1:

ext4_fallocate ->
 ext4_punch_hole ->
  ext4_inode_attach_jinode() -> ... ->
    lock_map_acquire(&handle->h_lockdep_map);
  truncate_pagecache_range() ->
   unmap_mapping_range() ->
    mutex_lock(&mapping->i_mmap_mutex);

Path 2:

do_dax_fault() ->
 mutex_lock(&mapping->i_mmap_mutex);
 ext4_get_block() -> ... ->
  lock_map_acquire(&handle->h_lockdep_map);

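So that idea doesn't work: Path 1 takes the journal handle and then
i_mmap_mutex, while Path 2 takes them in the opposite order.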
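
To make the inversion in the two paths concrete, here is a toy sketch of
the kind of ordering check that trips. It is not how lockdep is actually
implemented; order_graph, record_order() and the JOURNAL_HANDLE class
(standing in for handle->h_lockdep_map) are hypothetical names made up for
illustration. It records "B was taken while holding A" and refuses any new
dependency that would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the traces. */
enum lock_class { I_MMAP_MUTEX, JOURNAL_HANDLE, NR_CLASSES };

/* held_before[a][b] is true once some path took b while holding a. */
struct order_graph {
	bool held_before[NR_CLASSES][NR_CLASSES];
};

void graph_init(struct order_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Is there a chain of recorded dependencies leading from 'from' to 'to'? */
bool reachable(const struct order_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (g->held_before[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquire 'inner' while holding 'outer'".
 * Returns false, and records nothing, if 'outer' is already reachable
 * from 'inner', i.e. the new dependency would close a cycle.
 */
bool record_order(struct order_graph *g, int outer, int inner)
{
	bool seen[NR_CLASSES] = { false };

	assert(outer >= 0 && outer < NR_CLASSES);
	assert(inner >= 0 && inner < NR_CLASSES);

	if (reachable(g, inner, outer, seen))
		return false;	/* inversion: inner -> ... -> outer is known */
	g->held_before[outer][inner] = true;
	return true;
}
```

Feeding it Path 1 first, record_order(&g, JOURNAL_HANDLE, I_MMAP_MUTEX),
is accepted. Path 2 then asks for record_order(&g, I_MMAP_MUTEX,
JOURNAL_HANDLE), which is rejected because the reverse dependency is
already on record.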

We can't exclude truncates by incrementing i_dio_count, because we can't
take i_mutex in the fault path.

I'm stumped.