lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 9 Dec 2019 08:49:07 -0800
From:   Matthew Wilcox <willy@...radead.org>
To:     Waiman Long <longman@...hat.com>
Cc:     Mike Kravetz <mike.kravetz@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] hugetlbfs: Disable IRQ when taking hugetlb_lock in
 set_max_huge_pages()

On Mon, Dec 09, 2019 at 11:01:50AM -0500, Waiman Long wrote:
> [  613.245273] Call Trace:
> [  613.256273]  <IRQ>
> [  613.265273]  dump_stack+0x9a/0xf0
> [  613.281273]  mark_lock+0xd0c/0x12f0
> [  613.341273]  __lock_acquire+0x146b/0x48c0
> [  613.401273]  lock_acquire+0x14f/0x3b0
> [  613.440273]  _raw_spin_lock+0x30/0x70
> [  613.477273]  free_huge_page+0x36f/0xaa0
> [  613.495273]  bio_check_pages_dirty+0x2fc/0x5c0

Oh, this is fun.  So we kicked off IO to a hugepage, then truncated or
otherwise caused the page to come free.  Then the IO finished and did the
final put on the page ... from interrupt context.  Splat.  Not something
that's going to happen often, but can happen if a process dies during
IO or due to a userspace bug.

Maybe we should schedule work to do the actual freeing of the page
instead of this rather large patch.  It doesn't seem like a case we need
to optimise for.

Further, this path is called from softirq context, not hardirq context.
That means the whole mess could be a spin_lock_bh(), not spin_lock_irq()
which is rather cheaper.  Anyway, I think the schedule-freeing-of-a-page-
if-in-irq-context approach is likely to be better.

> +/*
> + * Check to make sure that IRQ is enabled before calling spin_lock_irq()
> + * so that after a matching spin_unlock_irq() the system won't be in an
> + * incorrect state.
> + */
> +static __always_inline void spin_lock_irq_check(spinlock_t *lock)
> +{
> +	lockdep_assert_irqs_enabled();
> +	spin_lock_irq(lock);
> +}
> +#ifdef spin_lock_irq
> +#undef spin_lock_irq
> +#endif
> +#define spin_lock_irq(lock)	spin_lock_irq_check(lock)

Don't leave your debugging code in the patch you submit for merging.

> @@ -1775,7 +1793,11 @@ static void return_unused_surplus_pages(struct hstate *h,
>  		unused_resv_pages--;
>  		if (!free_pool_huge_page(h, &node_states[N_MEMORY], 1))
>  			goto out;
> -		cond_resched_lock(&hugetlb_lock);
> +		if (need_resched()) {
> +			spin_unlock_irq(&hugetlb_lock);
> +			cond_resched();
> +			spin_lock_irq(&hugetlb_lock);
> +		}

This doesn't work.  need_resched() is only going to be set due to things
happening in interrupt context.  But you've disabled interrupts, so
need_resched() is never going to be set.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ