Message-ID: <20230515110315.uqifqgqkzcrrrubv@box.shutemov.name>
Date: Mon, 15 May 2023 14:03:15 +0300
From: "Kirill A . Shutemov" <kirill@...temov.name>
To: Lorenzo Stoakes <lstoakes@...il.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jason Gunthorpe <jgg@...pe.ca>, Jens Axboe <axboe@...nel.dk>,
	Matthew Wilcox <willy@...radead.org>,
	Dennis Dalessandro <dennis.dalessandro@...nelisnetworks.com>,
	Leon Romanovsky <leon@...nel.org>,	Christian Benvenuti <benve@...co.com>,
	Nelson Escobar <neescoba@...co.com>,
	Bernard Metzler <bmt@...ich.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
	Ian Rogers <irogers@...gle.com>,	Adrian Hunter <adrian.hunter@...el.com>,
	Bjorn Topel <bjorn@...nel.org>,
	Magnus Karlsson <magnus.karlsson@...el.com>,
	Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
	Jonathan Lemon <jonathan.lemon@...il.com>,
	"David S . Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,	Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>,	Christian Brauner <brauner@...nel.org>,
	Richard Cochran <richardcochran@...il.com>,
	Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	Jesper Dangaard Brouer <hawk@...nel.org>,
	John Fastabend <john.fastabend@...il.com>,	linux-fsdevel@...r.kernel.org,
 linux-perf-users@...r.kernel.org,	netdev@...r.kernel.org,
 bpf@...r.kernel.org,	Oleg Nesterov <oleg@...hat.com>,
 Jason Gunthorpe <jgg@...dia.com>,	John Hubbard <jhubbard@...dia.com>,
 Jan Kara <jack@...e.cz>,	Pavel Begunkov <asml.silence@...il.com>,
	Mika Penttila <mpenttil@...hat.com>,
	David Hildenbrand <david@...hat.com>,	Dave Chinner <david@...morbit.com>,
 Theodore Ts'o <tytso@....edu>,	Peter Xu <peterx@...hat.com>,
	Matthew Rosato <mjrosato@...ux.ibm.com>,
	"Paul E . McKenney" <paulmck@...nel.org>,
	Christian Borntraeger <borntraeger@...ux.ibm.com>
Subject: Re: [PATCH v9 0/3] mm/gup: disallow GUP writing to file-backed
 mappings by default

On Thu, May 04, 2023 at 10:27:50PM +0100, Lorenzo Stoakes wrote:
> Writing to file-backed mappings which require folio dirty tracking using
> GUP is a fundamentally broken operation, as kernel write access to GUP
> mappings do not adhere to the semantics expected by a file system.
> 
> A GUP caller uses the direct mapping to access the folio, which does not
> cause write notify to trigger, nor does it enforce that the caller marks
> the folio dirty.

Okay, the problem is clear and the patchset looks good to me. But I'm worried
about breaking existing users.

Do we expect the change to be visible to real world users? If yes, are we
okay to break them?

One thing that came to mind is KVM with "qemu -object memory-backend-file,share=on..."
It is mostly used for pmem emulation.

Do we have a plan B?

Just a random/crazy/broken idea:

 - Allow folio_mkclean() (and folio_clear_dirty_for_io()) to fail,
   indicating that the page cannot be cleared because it is pinned;

 - Introduce a new vm_operations_struct::mkclean() that would be called by
   page_vma_mkclean_one() before clearing the range and can fail;

 - On GUP, create an in-kernel fake VMA that represents the file, but with
   custom vm_ops. The VMA is registered in rmap so that it gets notified on
   folio_mkclean() and fails it because of GUP;

 - folio_clear_dirty_for_io() callers will handle the new failure as an
   indication that the page can be written back but will stay dirty, and
   that the fs-specific data associated with the page writeback cannot be
   freed.
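
To make the control flow of the list above a bit more concrete, here is a
userspace toy model in C. It is only a sketch under loud assumptions: every
toy_* type and function is hypothetical, vm_operations_struct has no
mkclean() hook today, and the linked list merely stands in for an rmap walk.
Nothing here is real kernel code.

```c
/* Userspace toy model. All toy_* names are hypothetical, not kernel API. */
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_folio;

struct toy_vm_ops {
	/* Hypothetical hook: return 0 to allow cleaning, -EBUSY to veto it. */
	int (*mkclean)(struct toy_folio *folio);
};

struct toy_vma {
	const struct toy_vm_ops *ops;
	struct toy_vma *next;		/* stand-in for the rmap walk */
};

struct toy_folio {
	bool dirty;
	int pincount;			/* number of outstanding GUP pins */
	struct toy_vma *rmap;		/* VMAs that map this folio */
};

/* mkclean() of the fake VMA created on GUP: refuse while pinned. */
static int toy_gup_mkclean(struct toy_folio *folio)
{
	return folio->pincount > 0 ? -EBUSY : 0;
}

static const struct toy_vm_ops toy_gup_vm_ops = {
	.mkclean = toy_gup_mkclean,
};

/* Walk the mappings; any mkclean() failure fails the whole operation. */
static int toy_folio_mkclean(struct toy_folio *folio)
{
	for (struct toy_vma *vma = folio->rmap; vma; vma = vma->next) {
		if (vma->ops && vma->ops->mkclean) {
			int ret = vma->ops->mkclean(folio);

			if (ret)
				return ret;
		}
	}
	return 0;
}

/*
 * Returns true if the caller should write the folio back. On a mkclean()
 * failure the folio is still written back, but *stays_dirty is set so the
 * caller knows not to free fs-specific writeback data.
 */
static bool toy_clear_dirty_for_io(struct toy_folio *folio, bool *stays_dirty)
{
	*stays_dirty = false;
	if (!folio->dirty)
		return false;
	if (toy_folio_mkclean(folio)) {
		*stays_dirty = true;
		return true;
	}
	folio->dirty = false;
	return true;
}
```

In this model a pinned folio is still handed to writeback but is reported as
staying dirty; once the pin count drops to zero the same call cleans it as
usual.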

I'm sure the idea is broken on many levels (I have never looked closely at
the writeback path). But maybe it is good enough as a conversation starter?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
