lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-Id: <20230502191821.71c86a2c25f19fe342aa72db@linux-foundation.org> Date: Tue, 2 May 2023 19:18:21 -0700 From: Andrew Morton <akpm@...ux-foundation.org> To: Lorenzo Stoakes <lstoakes@...il.com> Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org, Jason Gunthorpe <jgg@...pe.ca>, Jens Axboe <axboe@...nel.dk>, Matthew Wilcox <willy@...radead.org>, Dennis Dalessandro <dennis.dalessandro@...nelisnetworks.com>, Leon Romanovsky <leon@...nel.org>, Christian Benvenuti <benve@...co.com>, Nelson Escobar <neescoba@...co.com>, Bernard Metzler <bmt@...ich.ibm.com>, Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland <mark.rutland@....com>, Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>, Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>, Bjorn Topel <bjorn@...nel.org>, Magnus Karlsson <magnus.karlsson@...el.com>, Maciej Fijalkowski <maciej.fijalkowski@...el.com>, Jonathan Lemon <jonathan.lemon@...il.com>, "David S . Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Christian Brauner <brauner@...nel.org>, Richard Cochran <richardcochran@...il.com>, Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, Jesper Dangaard Brouer <hawk@...nel.org>, John Fastabend <john.fastabend@...il.com>, linux-fsdevel@...r.kernel.org, linux-perf-users@...r.kernel.org, netdev@...r.kernel.org, bpf@...r.kernel.org, Oleg Nesterov <oleg@...hat.com>, Jason Gunthorpe <jgg@...dia.com>, John Hubbard <jhubbard@...dia.com>, Jan Kara <jack@...e.cz>, "Kirill A . Shutemov" <kirill@...temov.name>, Pavel Begunkov <asml.silence@...il.com>, Mika Penttila <mpenttil@...hat.com>, David Hildenbrand <david@...hat.com>, Dave Chinner <david@...morbit.com>, Theodore Ts'o <tytso@....edu>, Peter Xu <peterx@...hat.com>, Matthew Rosato <mjrosato@...ux.ibm.com>, "Paul E . McKenney" <paulmck@...nel.org>, Christian Borntraeger <borntraeger@...ux.ibm.com> Subject: Re: [PATCH v8 3/3] mm/gup: disallow FOLL_LONGTERM GUP-fast writing to file-backed mappings On Tue, 2 May 2023 23:51:35 +0100 Lorenzo Stoakes <lstoakes@...il.com> wrote: > Writing to file-backed dirty-tracked mappings via GUP is inherently broken > as we cannot rule out folios being cleaned and then a GUP user writing to > them again and possibly marking them dirty unexpectedly. > > This is especially egregious for long-term mappings (as indicated by the > use of the FOLL_LONGTERM flag), so we disallow this case in GUP-fast as > we have already done in the slow path. > > We have access to less information in the fast path as we cannot examine > the VMA containing the mapping, however we can determine whether the folio > is anonymous or belonging to a whitelisted filesystem - specifically > hugetlb and shmem mappings. > > We take special care to ensure that both the folio and mapping are safe to > access when performing these checks and document folio_fast_pin_allowed() > accordingly. > > It's important to note that there are no APIs allowing users to specify > FOLL_FAST_ONLY for a PUP-fast let alone with FOLL_LONGTERM, so we can > always rely on the fact that if we fail to pin on the fast path, the code > will fall back to the slow path which can perform the more thorough check. arm allnoconfig said mm/gup.c:115:13: warning: 'folio_fast_pin_allowed' defined but not used [-Wunused-function] 115 | static bool folio_fast_pin_allowed(struct folio *folio, unsigned int flags) | ^~~~~~~~~~~~~~~~~~~~~~ so I moved the definition inside CONFIG_ARCH_HAS_PTE_SPECIAL. mm/gup.c | 154 ++++++++++++++++++++++++++--------------------------- 1 file changed, 77 insertions(+), 77 deletions(-) --- a/mm/gup.c~mm-gup-disallow-foll_longterm-gup-fast-writing-to-file-backed-mappings-fix +++ a/mm/gup.c @@ -96,83 +96,6 @@ retry: return folio; } -/* - * Used in the GUP-fast path to determine whether a pin is permitted for a - * specific folio. - * - * This call assumes the caller has pinned the folio, that the lowest page table - * level still points to this folio, and that interrupts have been disabled. - * - * Writing to pinned file-backed dirty tracked folios is inherently problematic - * (see comment describing the writable_file_mapping_allowed() function). We - * therefore try to avoid the most egregious case of a long-term mapping doing - * so. - * - * This function cannot be as thorough as that one as the VMA is not available - * in the fast path, so instead we whitelist known good cases and if in doubt, - * fall back to the slow path. - */ -static bool folio_fast_pin_allowed(struct folio *folio, unsigned int flags) -{ - struct address_space *mapping; - unsigned long mapping_flags; - - /* - * If we aren't pinning then no problematic write can occur. A long term - * pin is the most egregious case so this is the one we disallow. - */ - if ((flags & (FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE)) != - (FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE)) - return true; - - /* The folio is pinned, so we can safely access folio fields. */ - - /* Neither of these should be possible, but check to be sure. */ - if (unlikely(folio_test_slab(folio) || folio_test_swapcache(folio))) - return false; - - /* hugetlb mappings do not require dirty-tracking. */ - if (folio_test_hugetlb(folio)) - return true; - - /* - * GUP-fast disables IRQs. When IRQS are disabled, RCU grace periods - * cannot proceed, which means no actions performed under RCU can - * proceed either. - * - * inodes and thus their mappings are freed under RCU, which means the - * mapping cannot be freed beneath us and thus we can safely dereference - * it. - */ - lockdep_assert_irqs_disabled(); - - /* - * However, there may be operations which _alter_ the mapping, so ensure - * we read it once and only once. - */ - mapping = READ_ONCE(folio->mapping); - - /* - * The mapping may have been truncated, in any case we cannot determine - * if this mapping is safe - fall back to slow path to determine how to - * proceed. - */ - if (!mapping) - return false; - - /* Anonymous folios are fine, other non-file backed cases are not. */ - mapping_flags = (unsigned long)mapping & PAGE_MAPPING_FLAGS; - if (mapping_flags) - return mapping_flags == PAGE_MAPPING_ANON; - - /* - * At this point, we know the mapping is non-null and points to an - * address_space object. The only remaining whitelisted file system is - * shmem. - */ - return shmem_mapping(mapping); -} - /** * try_grab_folio() - Attempt to get or pin a folio. * @page: pointer to page to be grabbed @@ -2474,6 +2397,83 @@ static void __maybe_unused undo_dev_page #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL /* + * Used in the GUP-fast path to determine whether a pin is permitted for a + * specific folio. + * + * This call assumes the caller has pinned the folio, that the lowest page table + * level still points to this folio, and that interrupts have been disabled. + * + * Writing to pinned file-backed dirty tracked folios is inherently problematic + * (see comment describing the writable_file_mapping_allowed() function). We + * therefore try to avoid the most egregious case of a long-term mapping doing + * so. + * + * This function cannot be as thorough as that one as the VMA is not available + * in the fast path, so instead we whitelist known good cases and if in doubt, + * fall back to the slow path. + */ +static bool folio_fast_pin_allowed(struct folio *folio, unsigned int flags) +{ + struct address_space *mapping; + unsigned long mapping_flags; + + /* + * If we aren't pinning then no problematic write can occur. A long term + * pin is the most egregious case so this is the one we disallow. + */ + if ((flags & (FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE)) != + (FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE)) + return true; + + /* The folio is pinned, so we can safely access folio fields. */ + + /* Neither of these should be possible, but check to be sure. */ + if (unlikely(folio_test_slab(folio) || folio_test_swapcache(folio))) + return false; + + /* hugetlb mappings do not require dirty-tracking. */ + if (folio_test_hugetlb(folio)) + return true; + + /* + * GUP-fast disables IRQs. When IRQS are disabled, RCU grace periods + * cannot proceed, which means no actions performed under RCU can + * proceed either. + * + * inodes and thus their mappings are freed under RCU, which means the + * mapping cannot be freed beneath us and thus we can safely dereference + * it. + */ + lockdep_assert_irqs_disabled(); + + /* + * However, there may be operations which _alter_ the mapping, so ensure + * we read it once and only once. + */ + mapping = READ_ONCE(folio->mapping); + + /* + * The mapping may have been truncated, in any case we cannot determine + * if this mapping is safe - fall back to slow path to determine how to + * proceed. + */ + if (!mapping) + return false; + + /* Anonymous folios are fine, other non-file backed cases are not. */ + mapping_flags = (unsigned long)mapping & PAGE_MAPPING_FLAGS; + if (mapping_flags) + return mapping_flags == PAGE_MAPPING_ANON; + + /* + * At this point, we know the mapping is non-null and points to an + * address_space object. The only remaining whitelisted file system is + * shmem. + */ + return shmem_mapping(mapping); +} + +/* * Fast-gup relies on pte change detection to avoid concurrent pgtable * operations. * _
Powered by blists - more mailing lists