Message-ID: <1bc8a5df-b413-4869-8931-98f5b9e82fe5@suse.cz>
Date: Tue, 16 Jan 2024 15:42:01 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Suren Baghdasaryan <surenb@...gle.com>, akpm@...ux-foundation.org
Cc: viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz,
dchinner@...hat.com, casey@...aufler-ca.com, ben.wolsieffer@...ring.com,
paulmck@...nel.org, david@...hat.com, avagin@...gle.com,
usama.anjum@...labora.com, peterx@...hat.com, hughd@...gle.com,
ryan.roberts@....com, wangkefeng.wang@...wei.com, Liam.Howlett@...cle.com,
yuzhao@...gle.com, axelrasmussen@...gle.com, lstoakes@...il.com,
talumbau@...gle.com, willy@...radead.org, mgorman@...hsingularity.net,
jhubbard@...dia.com, vishal.moola@...il.com, mathieu.desnoyers@...icios.com,
dhowells@...hat.com, jgg@...pe.ca, sidhartha.kumar@...cle.com,
andriy.shevchenko@...ux.intel.com, yangxingui@...wei.com,
keescook@...omium.org, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org, kernel-team@...roid.com
Subject: Re: [RFC 0/3] reading proc/pid/maps under RCU
On 1/15/24 19:38, Suren Baghdasaryan wrote:
Hi,
> The issue this patchset is trying to address is mmap_lock contention when
> a low priority task (monitoring, data collecting, etc.) blocks a higher
> priority task from making updates to the address space. The contention is
> due to the mmap_lock being held for read when reading proc/pid/maps.
> With maple_tree introduction, VMA tree traversals are RCU-safe and per-vma
> locks make VMA access RCU-safe. This provides an opportunity for lock-less
> reading of proc/pid/maps. We still need to overcome a couple of obstacles:
> 1. Make all VMA pointer fields used for proc/pid/maps content generation
> RCU-safe;
> 2. Ensure that proc/pid/maps data tearing, which is currently possible at
> page boundaries only, does not get worse.
Hm, I thought we would only choose this more complicated approach if additional
tearing became a problem, and would at first assume that software which can deal
with page-boundary tearing can deal with sub-page tearing too?
> The patchset deals with these issues but there is a downside which I would
> like to get input on:
> This change introduces unfairness towards the reader of proc/pid/maps,
> which can be blocked by an overly active/malicious address space modifier.
So this is a consequence of the validate() operation, right? We could avoid
this if we allowed sub-page tearing.
> A couple of ways I thought we could address this issue are:
> 1. Fall back to taking mmap_lock after several lock-less retries (or
> some time limit).
> 2. Employ lock-less reading only if the reader has low priority,
> indicating that blocking it is not critical.
> 3. Introduce a separate procfs file which publishes the same data in a
> lock-less manner.
>
> I imagine a combination of these approaches can also be employed.
> I would like to get feedback on this from the Linux community.
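For illustration only, the first option (a bounded number of lock-less attempts, then falling back to the lock) could be sketched in userspace C roughly as follows. This is not the patchset's code: struct range_src, range_read(), range_write() and MAX_LOCKLESS_RETRIES are all hypothetical names, a C11 atomic_flag spinlock stands in for mmap_lock, and a seqcount-style counter stands in for whatever validation the kernel side would use.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_LOCKLESS_RETRIES 3	/* hypothetical tuning knob */

/* Hypothetical stand-in for one published address range. */
struct range_src {
	atomic_flag lock;	/* plays the role of mmap_lock (simple spinlock) */
	atomic_uint seq;	/* odd while a writer is mid-update */
	atomic_ulong start;
	atomic_ulong end;
};

static void range_init(struct range_src *src)
{
	atomic_flag_clear(&src->lock);
	atomic_init(&src->seq, 0);
	atomic_init(&src->start, 0);
	atomic_init(&src->end, 0);
}

static void range_lock(struct range_src *src)
{
	while (atomic_flag_test_and_set_explicit(&src->lock, memory_order_acquire))
		;	/* spin until the lock is free */
}

static void range_unlock(struct range_src *src)
{
	atomic_flag_clear_explicit(&src->lock, memory_order_release);
}

/* Writer: always takes the lock and brackets the update with seq bumps. */
static void range_write(struct range_src *src, unsigned long start,
			unsigned long end)
{
	unsigned int s;

	range_lock(src);
	s = atomic_load_explicit(&src->seq, memory_order_relaxed);
	atomic_store_explicit(&src->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&src->start, start, memory_order_relaxed);
	atomic_store_explicit(&src->end, end, memory_order_relaxed);
	atomic_store_explicit(&src->seq, s + 2, memory_order_release);
	range_unlock(src);
}

/*
 * Reader: try a few lock-less snapshots, then fall back to the lock.
 * Returns true if the lock-less path succeeded, false if it fell back.
 */
static bool range_read(struct range_src *src, unsigned long *start,
		       unsigned long *end)
{
	for (int i = 0; i < MAX_LOCKLESS_RETRIES; i++) {
		unsigned int s1 = atomic_load_explicit(&src->seq,
						       memory_order_acquire);
		unsigned long a, b;

		if (s1 & 1)
			continue;	/* writer in progress, retry */
		a = atomic_load_explicit(&src->start, memory_order_relaxed);
		b = atomic_load_explicit(&src->end, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		if (atomic_load_explicit(&src->seq, memory_order_relaxed) == s1) {
			*start = a;
			*end = b;
			return true;	/* consistent snapshot, no lock taken */
		}
	}

	/* Too many failed attempts: block writers so the reader makes progress. */
	range_lock(src);
	*start = atomic_load_explicit(&src->start, memory_order_relaxed);
	*end = atomic_load_explicit(&src->end, memory_order_relaxed);
	range_unlock(src);
	return false;
}
```

The retry bound is what limits the unfairness: a busy writer can make the reader fail at most MAX_LOCKLESS_RETRIES times before the reader takes the lock, at which point the writer has to wait as it does today.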
>
> Note: mmap_read_lock/mmap_read_unlock sequence inside validate_map()
> can be replaced with the more efficient rwsem_wait() proposed by Matthew
> in [1].
>
> [1] https://lore.kernel.org/all/ZZ1+ZicgN8dZ3zj3@casper.infradead.org/
>
> Suren Baghdasaryan (3):
> mm: make vm_area_struct anon_name field RCU-safe
> seq_file: add validate() operation to seq_operations
> mm/maps: read proc/pid/maps under RCU
>
> fs/proc/internal.h | 3 +
> fs/proc/task_mmu.c | 130 ++++++++++++++++++++++++++++++++++----
> fs/seq_file.c | 24 ++++++-
> include/linux/mm_inline.h | 10 ++-
> include/linux/mm_types.h | 3 +-
> include/linux/seq_file.h | 1 +
> mm/madvise.c | 30 +++++++--
> 7 files changed, 181 insertions(+), 20 deletions(-)
>