Message-Id: <1430934639-2131-1-git-send-email-dahi@linux.vnet.ibm.com>
Date: Wed, 6 May 2015 19:50:24 +0200
From: David Hildenbrand <dahi@...ux.vnet.ibm.com>
To: linux-kernel@...r.kernel.org
Cc: dahi@...ux.vnet.ibm.com, mingo@...hat.com, peterz@...radead.org,
yang.shi@...driver.com, bigeasy@...utronix.de,
benh@...nel.crashing.org, paulus@...ba.org,
akpm@...ux-foundation.org, heiko.carstens@...ibm.com,
schwidefsky@...ibm.com, borntraeger@...ibm.com, mst@...hat.com,
tglx@...utronix.de, David.Laight@...LAB.COM, hughd@...gle.com,
hocko@...e.cz, ralf@...ux-mips.org, herbert@...dor.apana.org.au,
linux@....linux.org.uk, airlied@...ux.ie, daniel.vetter@...el.com,
linux-mm@...ck.org, linux-arch@...r.kernel.org
Subject: [PATCH RFC 00/15] decouple pagefault_disable() from preempt_disable()
As Peter asked me to also do the decoupling in one shot, this is
the new series.
I recently discovered that might_fault() doesn't call might_sleep()
anymore. Therefore bugs like:
  spin_lock(&lock);
  rc = copy_to_user(...);
  spin_unlock(&lock);
would not be detected with CONFIG_DEBUG_ATOMIC_SLEEP. The code was
changed to avoid false positives for code like:

  pagefault_disable();
  rc = copy_to_user(...);
  pagefault_enable();

whereby the caller wants to deal with failures itself.
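For illustration, such a fault-tolerant caller typically looks roughly
like this (uaddr, val and the fallback path are made up, loosely
modelled on what e.g. the futex code does):

  pagefault_disable();
  rc = copy_to_user(uaddr, &val, sizeof(val));
  pagefault_enable();

  if (rc) {
          /* the fault was suppressed - retry where sleeping is fine */
          if (copy_to_user(uaddr, &val, sizeof(val)))
                  return -EFAULT;
  }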
If we schedule while holding a spinlock, bad things can happen.
Preemption is disabled, but we still end up scheduling - on s390 this
can lead to spinlocks never getting unlocked (the unlock code checks
the cpu id that locked the spinlock), therefore resulting in a
deadlock.
Until now, pagefault_(dis|en)able() simply modified the preempt count,
thereby telling the pagefault handler that the context is atomic and
sleeping is disallowed.
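Conceptually (simplified, with CONFIG_PREEMPT_COUNT; not the exact
source), that currently means something like:

  static inline void pagefault_disable(void)
  {
          preempt_count_inc();
          /* make sure the increment is visible before a fault can hit */
          barrier();
  }

  static inline void pagefault_enable(void)
  {
          /* finish the user accesses before faults may sleep again */
          barrier();
          preempt_count_dec();
  }

The fault handlers then interpret the raised preempt count (via
in_atomic()) as "pagefaults are disabled".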
With !CONFIG_PREEMPT_COUNT, the preempt count does not exist, so
preempt_disable() as well as pagefault_disable() becomes a NOP.
in_atomic() checks the preempt count, so we cannot detect atomic
context on systems without preemption.
That is also mentioned in the comment of in_atomic():
"WARNING: this macro cannot always detect atomic context; in particular,
it cannot know about held spinlocks in non-preemptible kernels."
We automatically have !CONFIG_PREEMPT_COUNT with !CONFIG_PREEMPT and
!CONFIG_DEBUG_ATOMIC_SLEEP, so on such a system disabled pagefaults
simply cannot be detected this way.
All fault handlers currently rely on in_atomic() to check for disabled
pagefaults.
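In other words (simplified), without a preempt count there is nothing
left for the fault handlers to look at:

  /* !CONFIG_PREEMPT_COUNT, conceptually: */
  #define preempt_disable()       barrier()
  #define preempt_enable()        barrier()

  /*
   * pagefault_disable()/pagefault_enable() degrade the same way, and
   * in_atomic() - which boils down to checking the preempt count -
   * always reports "not atomic", so disabled pagefaults go unnoticed.
   */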
This series therefore does 2 things:
1. Decouple pagefault_disable() from preempt_disable()
pagefault_(dis|en)able() now modify their own counter and no longer
touch the preempt count. The fault handlers now only check for
disabled pagefaults (see the sketch at the end of this item).
I checked (hopefully) every caller of pagefault_disable() to see
whether it implicitly relies on preempt_disable() - and if so, added
explicit preempt_disable()/preempt_enable() calls. I hope I haven't
missed any cases.
I didn't check all users of preempt_disable() to see whether they
rely on it also disabling pagefaults. My assumption is that such code
is broken either way (on non-preemptible systems), because pagefaults
would only be disabled with CONFIG_PREEMPT_COUNT.
I also thought about keeping the in_atomic() check inside the fault
handlers in addition to the new check. In my opinion that would only
hide bugs that show up with !CONFIG_PREEMPT_COUNT. We could even add
a warning at that place, something like

  WARN_ONCE(in_atomic() && !pagefault_disabled(), ...);

but most of these cases should be found by part 2 below.
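Coming back to the counter itself, a rough sketch of the direction
(simplified; names follow the patch subjects, the actual patches may
differ in detail):

  /* pagefault_disabled is a new per-task counter in task_struct */
  static inline void pagefault_disable(void)
  {
          current->pagefault_disabled++;
          barrier();
  }

  static inline void pagefault_enable(void)
  {
          barrier();
          current->pagefault_disabled--;
  }

  #define pagefault_disabled()    (current->pagefault_disabled != 0)

The fault handlers then check pagefault_disabled() instead of
in_atomic(), independent of the preempt count.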
2. Reenable might_sleep() checks for might_fault()
As we can now reliably tell whether pagefaults are really disabled,
we can reenable the might_sleep() check in might_fault() (a sketch
follows after the examples below).
So this should now work:

  spin_lock(&lock); /* also if left away */
  pagefault_disable();
  rc = copy_to_user(...);
  pagefault_enable();
  spin_unlock(&lock);
And this should report a warning again:

  spin_lock(&lock);
  rc = copy_to_user(...);
  spin_unlock(&lock);
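A rough sketch of the corresponding might_fault() logic (simplified;
the actual patch may look different):

  void might_fault(void)
  {
          /*
           * With pagefaults disabled, uaccess won't sleep - it returns
           * an error instead - so there is nothing to complain about.
           */
          if (pagefault_disabled())
                  return;
          might_sleep();
  }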
Cross compiled on powerpc, arm, sparc, sparc64, arm64, x86_64, i386,
mips, alpha, ia64, xtensa, m68k, microblaze.
Tested on s390x.
Any feedback very welcome!
Thanks!
David Hildenbrand (15):
uaccess: count pagefault_disable() levels in pagefault_disabled
mm, uaccess: trigger might_sleep() in might_fault() with disabled
pagefaults
uaccess: clarify that uaccess may only sleep if pagefaults are enabled
mm: explicitly disable/enable preemption in kmap_atomic_*
mips: kmap_coherent relies on disabled preemption
mm: use pagefault_disabled() to check for disabled pagefaults
drm/i915: use pagefault_disabled() to check for disabled pagefaults
futex: UP futex_atomic_op_inuser() relies on disabled preemption
futex: UP futex_atomic_cmpxchg_inatomic() relies on disabled
preemption
arm/futex: UP futex_atomic_cmpxchg_inatomic() relies on disabled
preemption
arm/futex: UP futex_atomic_op_inuser() relies on disabled preemption
futex: clarify that preemption doesn't have to be disabled
powerpc: enable_kernel_altivec() requires disabled preemption
mips: properly lock access to the fpu
uaccess: decouple preemption from the pagefault logic
arch/alpha/mm/fault.c | 5 ++--
arch/arc/include/asm/futex.h | 10 +++----
arch/arc/mm/fault.c | 2 +-
arch/arm/include/asm/futex.h | 13 +++++++--
arch/arm/mm/fault.c | 2 +-
arch/arm/mm/highmem.c | 3 ++
arch/arm64/include/asm/futex.h | 4 +--
arch/arm64/mm/fault.c | 2 +-
arch/avr32/include/asm/uaccess.h | 12 +++++---
arch/avr32/mm/fault.c | 4 +--
arch/cris/mm/fault.c | 4 +--
arch/frv/mm/fault.c | 4 +--
arch/frv/mm/highmem.c | 2 ++
arch/hexagon/include/asm/uaccess.h | 3 +-
arch/ia64/mm/fault.c | 4 +--
arch/m32r/include/asm/uaccess.h | 30 +++++++++++++-------
arch/m32r/mm/fault.c | 4 +--
arch/m68k/mm/fault.c | 4 +--
arch/metag/mm/fault.c | 2 +-
arch/metag/mm/highmem.c | 4 ++-
arch/microblaze/include/asm/uaccess.h | 6 ++--
arch/microblaze/mm/fault.c | 6 ++--
arch/microblaze/mm/highmem.c | 4 ++-
arch/mips/include/asm/uaccess.h | 45 ++++++++++++++++++++----------
arch/mips/kernel/signal-common.h | 9 ++----
arch/mips/mm/fault.c | 4 +--
arch/mips/mm/highmem.c | 5 +++-
arch/mips/mm/init.c | 2 ++
arch/mn10300/include/asm/highmem.h | 3 ++
arch/mn10300/mm/fault.c | 4 +--
arch/nios2/mm/fault.c | 2 +-
arch/parisc/include/asm/cacheflush.h | 2 ++
arch/parisc/kernel/traps.c | 4 +--
arch/parisc/mm/fault.c | 4 +--
arch/powerpc/lib/vmx-helper.c | 11 ++++----
arch/powerpc/mm/fault.c | 9 +++---
arch/powerpc/mm/highmem.c | 4 ++-
arch/s390/include/asm/uaccess.h | 15 ++++++----
arch/s390/mm/fault.c | 2 +-
arch/score/include/asm/uaccess.h | 15 ++++++----
arch/score/mm/fault.c | 3 +-
arch/sh/mm/fault.c | 3 +-
arch/sparc/mm/fault_32.c | 4 +--
arch/sparc/mm/fault_64.c | 4 +--
arch/sparc/mm/highmem.c | 4 ++-
arch/sparc/mm/init_64.c | 2 +-
arch/tile/include/asm/uaccess.h | 18 ++++++++----
arch/tile/mm/fault.c | 4 +--
arch/tile/mm/highmem.c | 3 +-
arch/um/kernel/trap.c | 4 +--
arch/unicore32/mm/fault.c | 2 +-
arch/x86/include/asm/uaccess.h | 15 ++++++----
arch/x86/include/asm/uaccess_32.h | 6 ++--
arch/x86/lib/usercopy_32.c | 6 ++--
arch/x86/mm/fault.c | 5 ++--
arch/x86/mm/highmem_32.c | 3 +-
arch/x86/mm/iomap_32.c | 2 ++
arch/xtensa/mm/fault.c | 4 +--
arch/xtensa/mm/highmem.c | 2 ++
drivers/crypto/vmx/aes.c | 8 +++++-
drivers/crypto/vmx/aes_cbc.c | 6 ++++
drivers/crypto/vmx/ghash.c | 8 ++++++
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +-
include/asm-generic/futex.h | 7 +++--
include/linux/highmem.h | 2 ++
include/linux/io-mapping.h | 2 ++
include/linux/kernel.h | 3 +-
include/linux/sched.h | 1 +
include/linux/uaccess.h | 36 +++++++++++++++---------
kernel/fork.c | 3 ++
lib/strnlen_user.c | 6 ++--
mm/memory.c | 18 ++++--------
72 files changed, 302 insertions(+), 169 deletions(-)
--
2.1.4