[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201104013212.GA82153@lx-t490>
Date: Wed, 4 Nov 2020 02:32:12 +0100
From: "Ahmed S. Darwish" <a.darwish@...utronix.de>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: John Hubbard <jhubbard@...dia.com>,
Jason Gunthorpe <jgg@...dia.com>, Peter Xu <peterx@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
Christoph Hellwig <hch@....de>,
Hugh Dickins <hughd@...gle.com>, Jan Kara <jack@...e.cz>,
Jann Horn <jannh@...gle.com>,
Kirill Shutemov <kirill@...temov.name>,
Kirill Tkhai <ktkhai@...tuozzo.com>,
Leon Romanovsky <leonro@...dia.com>,
Linux-MM <linux-mm@...ck.org>, Michal Hocko <mhocko@...e.com>,
Oleg Nesterov <oleg@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Sebastian Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH v2 2/2] mm: prevent gup_fast from racing with COW during
fork
On Tue, Nov 03, 2020 at 09:40:22AM -0800, Linus Torvalds wrote:
> On Mon, Nov 2, 2020 at 10:52 PM Ahmed S. Darwish
> <a.darwish@...utronix.de> wrote:
> >
> > The problem is, I've already documented seqlock.h to death.... There are
> > more comments than code in there, and there is "seqlock.rst" under
> > Documentation/ to further describe the big picture.
>
> Well, honestly, I think the correct thing to do is to get rid of the
> *_seqcount_t_*() functions entirely.
>
> They add nothing but confusion, and they are entirely misnamed. That's
> not the pattern we use for "internal use only" functions, and they are
> *very* confusing.
>
I see. Would the enclosed patch #1 be OK? It basically uses the "__do_"
prefix instead, with some rationale.
>
> They have other issues too: like raw_write_seqcount_end() not being
> usable on its own when preemptibility isn't an issue like here. You
> basically _have_ to use raw_write_seqcount_t_end(), because otherwise
> it tries to re-enable preemption that was never there.
>
Hmmm, raw_write_seqcount_{begin,end}() *never* disable/enable preemption
for plain seqcount_t. This is why I kept recommending those for this
patch series instead of internal raw_write_seqcount_*t*_{begin,end}().
But..... given that multiple people made the exact same remark by now, I
guess that's due to:
#define raw_write_seqcount_begin(s) \
do { \
if (__seqcount_lock_preemptible(s)) \
preempt_disable(); \
\
... \
} while (0);
#define raw_write_seqcount_end(s) \
do { \
... \
\
if (__seqcount_lock_preemptible(s)) \
preempt_enable(); \
} while (0);
The tricky part is that __seqcount_lock_preemptible() is always false
for plain "seqcount_t". With that data type, the _Generic() selection
makes it resolve to __seqprop_preemptible(), which just returns false.
Originally, __seqcount_lock_preemptible() was called:
__seqcount_associated_lock_exists_and_is_preemptible()
but it got transformed to its current short form in the process of some
pre-mainline refactorings. Looking at it now after all the dust has
settled, maybe the verbose form was much more clear.
Please see the enclosed patch #2... Would that be OK too?
(I will submit these two patches in their own thread after some common
ground is reached.)
Patches
-------
====>
====> patch #1:
====>
Subject: [RFC][PATCH 1/2] seqlock: Use __do_ prefix instead of non-standed
_seqcount_t_ marker
The use of "*_seqcount_t_*" as a marker to denote internal seqlock.h
functions taking only plain seqcount_t instead of the whole
seqcount_LOCKNAME_t family is confusing users, as it's also not the
standard kernel pattern for denoting header file internal functions.
Use the __do_ prefix instead.
Note, a plain "__" prefix is not used since seqlock.h already uses it
for some of its exported functions; e.g. __read_seqcount_begin() and
__read_seqcount_retry().
Reported-by: Jason Gunthorpe <jgg@...dia.com>
Reported-by: John Hubbard <jhubbard@...dia.com>
Reported-by: Linus Torvalds <torvalds@...ux-foundation.org>
Link: https://lkml.kernel.org/r/CAHk-=wgB8nyOQufpn0o6a5BpJCJPnXvH+kRxApujhsgG+7qAwQ@mail.gmail.com
Signed-off-by: Ahmed S. Darwish <a.darwish@...utronix.de>
---
include/linux/seqlock.h | 62 ++++++++++++++++++++---------------------
1 file changed, 31 insertions(+), 31 deletions(-)
diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index cbfc78b92b65..5de043841d33 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -425,9 +425,9 @@ SEQCOUNT_LOCKNAME(ww_mutex, struct ww_mutex, true, &s->lock->base, ww_mu
* Return: true if a read section retry is required, else false
*/
#define __read_seqcount_retry(s, start) \
- __read_seqcount_t_retry(__seqcount_ptr(s), start)
+ __do___read_seqcount_retry(__seqcount_ptr(s), start)
-static inline int __read_seqcount_t_retry(const seqcount_t *s, unsigned start)
+static inline int __do___read_seqcount_retry(const seqcount_t *s, unsigned start)
{
kcsan_atomic_next(0);
return unlikely(READ_ONCE(s->sequence) != start);
@@ -445,12 +445,12 @@ static inline int __read_seqcount_t_retry(const seqcount_t *s, unsigned start)
* Return: true if a read section retry is required, else false
*/
#define read_seqcount_retry(s, start) \
- read_seqcount_t_retry(__seqcount_ptr(s), start)
+ __do_read_seqcount_retry(__seqcount_ptr(s), start)
-static inline int read_seqcount_t_retry(const seqcount_t *s, unsigned start)
+static inline int __do_read_seqcount_retry(const seqcount_t *s, unsigned start)
{
smp_rmb();
- return __read_seqcount_t_retry(s, start);
+ return __do___read_seqcount_retry(s, start);
}
/**
@@ -462,10 +462,10 @@ do { \
if (__seqcount_lock_preemptible(s)) \
preempt_disable(); \
\
- raw_write_seqcount_t_begin(__seqcount_ptr(s)); \
+ __do_raw_write_seqcount_begin(__seqcount_ptr(s)); \
} while (0)
-static inline void raw_write_seqcount_t_begin(seqcount_t *s)
+static inline void __do_raw_write_seqcount_begin(seqcount_t *s)
{
kcsan_nestable_atomic_begin();
s->sequence++;
@@ -478,13 +478,13 @@ static inline void raw_write_seqcount_t_begin(seqcount_t *s)
*/
#define raw_write_seqcount_end(s) \
do { \
- raw_write_seqcount_t_end(__seqcount_ptr(s)); \
+ __do_raw_write_seqcount_end(__seqcount_ptr(s)); \
\
if (__seqcount_lock_preemptible(s)) \
preempt_enable(); \
} while (0)
-static inline void raw_write_seqcount_t_end(seqcount_t *s)
+static inline void __do_raw_write_seqcount_end(seqcount_t *s)
{
smp_wmb();
s->sequence++;
@@ -506,12 +506,12 @@ do { \
if (__seqcount_lock_preemptible(s)) \
preempt_disable(); \
\
- write_seqcount_t_begin_nested(__seqcount_ptr(s), subclass); \
+ __do_write_seqcount_begin_nested(__seqcount_ptr(s), subclass); \
} while (0)
-static inline void write_seqcount_t_begin_nested(seqcount_t *s, int subclass)
+static inline void __do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
{
- raw_write_seqcount_t_begin(s);
+ __do_raw_write_seqcount_begin(s);
seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
}
@@ -533,12 +533,12 @@ do { \
if (__seqcount_lock_preemptible(s)) \
preempt_disable(); \
\
- write_seqcount_t_begin(__seqcount_ptr(s)); \
+ __do_write_seqcount_begin(__seqcount_ptr(s)); \
} while (0)
-static inline void write_seqcount_t_begin(seqcount_t *s)
+static inline void __do_write_seqcount_begin(seqcount_t *s)
{
- write_seqcount_t_begin_nested(s, 0);
+ __do_write_seqcount_begin_nested(s, 0);
}
/**
@@ -549,16 +549,16 @@ static inline void write_seqcount_t_begin(seqcount_t *s)
*/
#define write_seqcount_end(s) \
do { \
- write_seqcount_t_end(__seqcount_ptr(s)); \
+ __do_write_seqcount_end(__seqcount_ptr(s)); \
\
if (__seqcount_lock_preemptible(s)) \
preempt_enable(); \
} while (0)
-static inline void write_seqcount_t_end(seqcount_t *s)
+static inline void __do_write_seqcount_end(seqcount_t *s)
{
seqcount_release(&s->dep_map, _RET_IP_);
- raw_write_seqcount_t_end(s);
+ __do_raw_write_seqcount_end(s);
}
/**
@@ -603,9 +603,9 @@ static inline void write_seqcount_t_end(seqcount_t *s)
* }
*/
#define raw_write_seqcount_barrier(s) \
- raw_write_seqcount_t_barrier(__seqcount_ptr(s))
+ __do_raw_write_seqcount_barrier(__seqcount_ptr(s))
-static inline void raw_write_seqcount_t_barrier(seqcount_t *s)
+static inline void __do_raw_write_seqcount_barrier(seqcount_t *s)
{
kcsan_nestable_atomic_begin();
s->sequence++;
@@ -623,9 +623,9 @@ static inline void raw_write_seqcount_t_barrier(seqcount_t *s)
* will complete successfully and see data older than this.
*/
#define write_seqcount_invalidate(s) \
- write_seqcount_t_invalidate(__seqcount_ptr(s))
+ __do_write_seqcount_invalidate(__seqcount_ptr(s))
-static inline void write_seqcount_t_invalidate(seqcount_t *s)
+static inline void __do_write_seqcount_invalidate(seqcount_t *s)
{
smp_wmb();
kcsan_nestable_atomic_begin();
@@ -865,7 +865,7 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
}
/*
- * For all seqlock_t write side functions, use write_seqcount_*t*_begin()
+ * For all seqlock_t write side functions, use __do_write_seqcount_begin()
* instead of the generic write_seqcount_begin(). This way, no redundant
* lockdep_assert_held() checks are added.
*/
@@ -886,7 +886,7 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
static inline void write_seqlock(seqlock_t *sl)
{
spin_lock(&sl->lock);
- write_seqcount_t_begin(&sl->seqcount.seqcount);
+ __do_write_seqcount_begin(&sl->seqcount.seqcount);
}
/**
@@ -898,7 +898,7 @@ static inline void write_seqlock(seqlock_t *sl)
*/
static inline void write_sequnlock(seqlock_t *sl)
{
- write_seqcount_t_end(&sl->seqcount.seqcount);
+ __do_write_seqcount_end(&sl->seqcount.seqcount);
spin_unlock(&sl->lock);
}
@@ -912,7 +912,7 @@ static inline void write_sequnlock(seqlock_t *sl)
static inline void write_seqlock_bh(seqlock_t *sl)
{
spin_lock_bh(&sl->lock);
- write_seqcount_t_begin(&sl->seqcount.seqcount);
+ __do_write_seqcount_begin(&sl->seqcount.seqcount);
}
/**
@@ -925,7 +925,7 @@ static inline void write_seqlock_bh(seqlock_t *sl)
*/
static inline void write_sequnlock_bh(seqlock_t *sl)
{
- write_seqcount_t_end(&sl->seqcount.seqcount);
+ __do_write_seqcount_end(&sl->seqcount.seqcount);
spin_unlock_bh(&sl->lock);
}
@@ -939,7 +939,7 @@ static inline void write_sequnlock_bh(seqlock_t *sl)
static inline void write_seqlock_irq(seqlock_t *sl)
{
spin_lock_irq(&sl->lock);
- write_seqcount_t_begin(&sl->seqcount.seqcount);
+ __do_write_seqcount_begin(&sl->seqcount.seqcount);
}
/**
@@ -951,7 +951,7 @@ static inline void write_seqlock_irq(seqlock_t *sl)
*/
static inline void write_sequnlock_irq(seqlock_t *sl)
{
- write_seqcount_t_end(&sl->seqcount.seqcount);
+ __do_write_seqcount_end(&sl->seqcount.seqcount);
spin_unlock_irq(&sl->lock);
}
@@ -960,7 +960,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
unsigned long flags;
spin_lock_irqsave(&sl->lock, flags);
- write_seqcount_t_begin(&sl->seqcount.seqcount);
+ __do_write_seqcount_begin(&sl->seqcount.seqcount);
return flags;
}
@@ -989,7 +989,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
static inline void
write_sequnlock_irqrestore(seqlock_t *sl, unsigned long flags)
{
- write_seqcount_t_end(&sl->seqcount.seqcount);
+ __do_write_seqcount_end(&sl->seqcount.seqcount);
spin_unlock_irqrestore(&sl->lock, flags);
}
====>
====> patch #2:
====>
Subject: [PATCH 2/2] seqlock: seqcount_LOCKAME_t: Use more verbose macro names
As evidenced by multiple discussions over LKML so far, it's not clear
that __seqcount_lock_preemptible() is always false for plain seqcount_t.
For that data type, the _Generic() selection resolves to
__seqprop_preemptible(), which just returns false.
Use __seqcount_associated_lock_exists_and_is_preemptible() instead,
which hints that "preemptibility" is for the associated write
serialization lock (if any), not for the seqcount itself.
Similarly, rename __seqcount_assert_lock_held() to
__seqcount_assert_associated_lock_held().
Link: https://lkml.kernel.org/r/CAHk-=wgB8nyOQufpn0o6a5BpJCJPnXvH+kRxApujhsgG+7qAwQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201030235121.GQ2620339@nvidia.com
Link: https://lkml.kernel.org/r/20201103170327.GJ20600@xz-x1
Signed-off-by: Ahmed S. Darwish <a.darwish@...utronix.de>
---
include/linux/seqlock.h | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index 5de043841d33..eb1e5a822e44 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -307,10 +307,10 @@ SEQCOUNT_LOCKNAME(ww_mutex, struct ww_mutex, true, &s->lock->base, ww_mu
__seqprop_case((s), mutex, prop), \
__seqprop_case((s), ww_mutex, prop))
-#define __seqcount_ptr(s) __seqprop(s, ptr)
-#define __seqcount_sequence(s) __seqprop(s, sequence)
-#define __seqcount_lock_preemptible(s) __seqprop(s, preemptible)
-#define __seqcount_assert_lock_held(s) __seqprop(s, assert)
+#define __seqcount_ptr(s) __seqprop(s, ptr)
+#define __seqcount_sequence(s) __seqprop(s, sequence)
+#define __seqcount_associated_lock_exists_and_is_preemptible(s) __seqprop(s, preemptible)
+#define __seqcount_assert_associated_lock_held(s) __seqprop(s, assert)
/**
* __read_seqcount_begin() - begin a seqcount_t read section w/o barrier
@@ -459,7 +459,7 @@ static inline int __do_read_seqcount_retry(const seqcount_t *s, unsigned start)
*/
#define raw_write_seqcount_begin(s) \
do { \
- if (__seqcount_lock_preemptible(s)) \
+ if (__seqcount_associated_lock_exists_and_is_preemptible(s)) \
preempt_disable(); \
\
__do_raw_write_seqcount_begin(__seqcount_ptr(s)); \
@@ -480,7 +480,7 @@ static inline void __do_raw_write_seqcount_begin(seqcount_t *s)
do { \
__do_raw_write_seqcount_end(__seqcount_ptr(s)); \
\
- if (__seqcount_lock_preemptible(s)) \
+ if (__seqcount_associated_lock_exists_and_is_preemptible(s)) \
preempt_enable(); \
} while (0)
@@ -501,9 +501,9 @@ static inline void __do_raw_write_seqcount_end(seqcount_t *s)
*/
#define write_seqcount_begin_nested(s, subclass) \
do { \
- __seqcount_assert_lock_held(s); \
+ __seqcount_assert_associated_lock_held(s); \
\
- if (__seqcount_lock_preemptible(s)) \
+ if (__seqcount_associated_lock_exists_and_is_preemptible(s)) \
preempt_disable(); \
\
__do_write_seqcount_begin_nested(__seqcount_ptr(s), subclass); \
@@ -528,9 +528,9 @@ static inline void __do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
*/
#define write_seqcount_begin(s) \
do { \
- __seqcount_assert_lock_held(s); \
+ __seqcount_assert_associated_lock_held(s); \
\
- if (__seqcount_lock_preemptible(s)) \
+ if (__seqcount_associated_lock_exists_and_is_preemptible(s)) \
preempt_disable(); \
\
__do_write_seqcount_begin(__seqcount_ptr(s)); \
@@ -551,7 +551,7 @@ static inline void __do_write_seqcount_begin(seqcount_t *s)
do { \
__do_write_seqcount_end(__seqcount_ptr(s)); \
\
- if (__seqcount_lock_preemptible(s)) \
+ if (__seqcount_associated_lock_exists_and_is_preemptible(s)) \
preempt_enable(); \
} while (0)
> Linus
Thanks,
--
Ahmed S. Darwish
Linutronix GmbH
Powered by blists - more mailing lists