Message-Id: <1394746786-6397-1-git-send-email-n-horiguchi@ah.jp.nec.com>
Date:	Thu, 13 Mar 2014 17:39:40 -0400
From:	Naoya Horiguchi <n-horiguchi@...jp.nec.com>
To:	linux-kernel@...r.kernel.org
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Andi Kleen <andi@...stfloor.org>,
	Wu Fengguang <fengguang.wu@...el.com>,
	Tony Luck <tony.luck@...el.com>,
	Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
	Dave Chinner <david@...morbit.com>,
	"Jun'ichi Nomura" <j-nomura@...jp.nec.com>, linux-mm@...ck.org
Subject: [PATCH 0/6] memory error report/recovery for dirty pagecache v3

This patchset tries to solve the following issues related to handling memory
errors on dirty pagecache:
 1. stickiness of error info: in the current implementation, a memory error
    on dirty pagecache is recorded as AS_EIO on page_mapping(page), which is
    not sticky (it is cleared once checked). As a result, there is a race
    window in which concurrent accesses can cause the data loss to go
    unnoticed, even if your application can handle the error report by itself.
 2. finer granularity: when a memory error hits one page of a file, the
    error is also reported on accesses to other, healthy pages, which is
    confusing for userspace.
 3. overwrite recovery: with problems 1 and 2 fixed, it becomes possible to
    recover from the memory error if applications recreate the data on the
    error page, or if they are sure that the data on the error page is not
    important.
These problems are solved by introducing a new pagecache tag to remember
memory errors.
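
To illustrate problems 1 and 2, here is a small userspace model (not kernel
code) contrasting a mapping-wide test-and-clear flag with a sticky per-page
tag. Everything named toy_* and the NPAGES constant are hypothetical and
exist only for this sketch; the real flag and tag live in struct
address_space and the pagecache radix tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8	/* hypothetical file size in pages */

/* Toy stand-in for an address_space; not the kernel structure. */
struct toy_mapping {
	bool as_eio;		/* one flag for the whole file, like AS_EIO */
	bool hwpoison[NPAGES];	/* one bit per page, like a pagecache tag */
};

/* Record a memory error on page 'index' in both schemes. */
static void toy_record_error(struct toy_mapping *m, size_t index)
{
	assert(index < NPAGES);
	m->as_eio = true;
	m->hwpoison[index] = true;
}

/*
 * Mapping-wide check: test-and-clear. The index is ignored, so an access
 * to any page consumes the error, and later checkers see nothing.
 */
static bool toy_check_flag(struct toy_mapping *m, size_t index)
{
	bool hit = m->as_eio;

	(void)index;
	m->as_eio = false;
	return hit;
}

/* Per-page check: read-only, so the report is sticky and page-precise. */
static bool toy_check_tag(const struct toy_mapping *m, size_t index)
{
	assert(index < NPAGES);
	return m->hwpoison[index];
}
```

With an error recorded on page 3, a first toy_check_flag() call for healthy
page 5 reports the error and clears it, so a following call for page 3
reports nothing. toy_check_tag() keeps reporting page 3, and only page 3.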

Patch 1 extends some radix_tree operations to support an end parameter,
which is used by the later patches.
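
As a rough picture of what a ranged tagged lookup does, the userspace sketch
below scans a flat tag array instead of a radix tree. The function name,
TOY_NSLOTS, and the choice of an inclusive end index are assumptions made
for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NSLOTS 16	/* hypothetical number of slots in the toy "tree" */

/*
 * Collect up to 'max' tagged indices within [start, end] (end inclusive)
 * into 'out', in ascending order. Returns the number of indices stored.
 */
static size_t toy_lookup_tag_range(const bool *tags, size_t start,
				   size_t end, size_t *out, size_t max)
{
	size_t i, n = 0;

	assert(start <= end && end < TOY_NSLOTS);
	for (i = start; i <= end && n < max; i++)
		if (tags[i])
			out[n++] = i;
	return n;
}
```

Without the end bound, a caller interested in a single file range would have
to keep iterating and filter out entries past the range by itself.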

Patch 2 introduces PAGECACHE_TAG_HWPOISON and solves problems 1 and 2 with it.

Patch 3 implements overwrite recovery to solve problem 3.
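
One way to picture overwrite recovery is the userspace model below. The
policy it uses (a write that covers the whole error page clears the error,
while a partial write fails with -EIO because lost bytes would remain) is an
assumption made for this sketch, not a description of what patch 3 does.
struct toy_page, toy_write_page() and TOY_PAGE_SIZE are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u	/* hypothetical page size */

/* Toy page: a poison marker plus the page contents. */
struct toy_page {
	bool hwpoison;
	unsigned char data[TOY_PAGE_SIZE];
};

/*
 * Write 'len' bytes at offset 'off' within the page.
 * Returns 0 on success, or -EIO if the page is poisoned and the write
 * would leave some of the lost bytes in place (assumed policy).
 */
static int toy_write_page(struct toy_page *p, size_t off,
			  const unsigned char *buf, size_t len)
{
	assert(off <= TOY_PAGE_SIZE && len <= TOY_PAGE_SIZE - off);

	if (p->hwpoison) {
		if (off != 0 || len != TOY_PAGE_SIZE)
			return -EIO;
		/* Every byte is being recreated, so nothing is lost. */
		p->hwpoison = false;
	}
	memcpy(p->data + off, buf, len);
	return 0;
}
```

In this model, an application that knows how to regenerate the page contents
can clear the error just by rewriting the page, and ordinary partial writes
work again afterwards.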

Patches 4-6 add a new interface, /proc/kpagecache, which is helpful when
testing/debugging pagecache related issues like the ones in this patchset.
Some sample userspace code and documentation are also added.

I think that we can straightforwardly replace error reporting for normal
IO errors with the pagecache tag, and doing so has the clear benefit of
finer granularity. Overwrite recovery is also useful there, for example when
dirty data was lost in a write failure. But first I would like review and
feedback on the base idea.

Previous discussions are available from the URLs:
- v1: http://thread.gmane.org/gmane.linux.kernel/1341433
- v2: http://thread.gmane.org/gmane.linux.kernel.mm/84760

Test code:
  https://github.com/Naoya-Horiguchi/test_memory_error_reporting
---
Summary:

Naoya Horiguchi (6):
      radix-tree: add end_index to support ranged iteration
      mm/memory-failure.c: report and recovery for memory error on dirty pagecache
      mm/memory-failure.c: add code to resolve quasi-hwpoisoned page
      fs/proc/page.c: introduce /proc/kpagecache interface
      tools/vm/page-types.c: add file scanning mode
      Documentation: update Documentation/vm/pagemap.txt

 Documentation/vm/pagemap.txt  |  29 ++++++
 drivers/gpu/drm/qxl/qxl_ttm.c |   2 +-
 fs/proc/page.c                | 106 +++++++++++++++++++
 include/linux/fs.h            |  12 ++-
 include/linux/pagemap.h       |  27 +++++
 include/linux/radix-tree.h    |  31 ++++--
 kernel/irq/irqdomain.c        |   2 +-
 lib/radix-tree.c              |   8 +-
 mm/filemap.c                  |  28 ++++-
 mm/memory-failure.c           | 230 +++++++++++++++++++++++++++++++++++-------
 mm/shmem.c                    |   2 +-
 mm/truncate.c                 |   7 ++
 tools/vm/page-types.c         | 117 ++++++++++++++++++---
 13 files changed, 530 insertions(+), 71 deletions(-)