lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220315155837.2dcef6eb226ad74e37ca31ca@linux-foundation.org>
Date:   Tue, 15 Mar 2022 15:58:37 -0700
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Dong Aisheng <aisheng.dong@....com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, dongas86@...il.com,
        shawnguo@...nel.org, linux-imx@....com, m.szyprowski@...sung.com,
        lecopzer.chen@...iatek.com, david@...hat.com, vbabka@...e.cz,
        stable@...r.kernel.org, shijie.qin@....com
Subject: Re: [PATCH v3 1/2] mm: cma: fix allocation may fail sometimes

On Tue, 15 Mar 2022 22:45:20 +0800 Dong Aisheng <aisheng.dong@....com> wrote:

> --- a/mm/cma.c
> +++ b/mm/cma.c
>
> ...
>
> @@ -457,6 +458,16 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>  				offset);
>  		if (bitmap_no >= bitmap_maxno) {
>  			spin_unlock_irq(&cma->lock);
> +			pr_debug("%s(): alloc fail, retry loop %d\n", __func__, loop++);
> +			/*
> +			 * rescan as others may finish the memory migration
> +			 * and quit if no available CMA memory found finally
> +			 */
> +			if (start) {
> +				schedule();
> +				start = 0;
> +				continue;
> +			}
>  			break;

The schedule() is problematic.  For a start, we'd normally use
cond_resched() here, so we avoid calling the more expensive schedule()
if we know it won't perform any action.

But cond_resched() is problematic if this thread has realtime
scheduling policy and the process we're waiting on does not.  One way
to address that is to use an unconditional msleep(1), but that's still
just a hack.

A much cleaner solution is to use appropriate locking so that various
threads run this code in order, without messing each other up.

And it looks like the way to do that is to simply revert the commit
which caused this regression, a4efc174b382 ("mm/cma.c: remove redundant
cma_mutex lock")?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ