[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAA+hA=Ss=YBt-3f=r1BL1NuO7FK56kTf31zWzNjMBkAKQE41Rg@mail.gmail.com>
Date: Wed, 16 Mar 2022 11:41:37 +0800
From: Dong Aisheng <dongas86@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Dong Aisheng <aisheng.dong@....com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
shawnguo@...nel.org, linux-imx@....com, m.szyprowski@...sung.com,
lecopzer.chen@...iatek.com, david@...hat.com, vbabka@...e.cz,
stable@...r.kernel.org, shijie.qin@....com
Subject: Re: [PATCH v3 1/2] mm: cma: fix allocation may fail sometimes
On Wed, Mar 16, 2022 at 6:58 AM Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> On Tue, 15 Mar 2022 22:45:20 +0800 Dong Aisheng <aisheng.dong@....com> wrote:
>
> > --- a/mm/cma.c
> > +++ b/mm/cma.c
> >
> > ...
> >
> > @@ -457,6 +458,16 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
> > offset);
> > if (bitmap_no >= bitmap_maxno) {
> > spin_unlock_irq(&cma->lock);
> > + pr_debug("%s(): alloc fail, retry loop %d\n", __func__, loop++);
> > + /*
> > + * rescan as others may finish the memory migration
> > + * and quit if no available CMA memory found finally
> > + */
> > + if (start) {
> > + schedule();
> > + start = 0;
> > + continue;
> > + }
> > break;
>
> The schedule() is problematic. For a start, we'd normally use
> cond_resched() here, so we avoid calling the more expensive schedule()
> if we know it won't perform any action.
>
> But cond_resched() is problematic if this thread has realtime
> scheduling policy and the process we're waiting on does not. One way
> to address that is to use an unconditional msleep(1), but that's still
> just a hack.
>
I think we can simply drop schedule() here during the second round of retry
as the estimated delay may not be really needed.
Do you think that's ok?
> A much cleaner solution is to use appropriate locking so that various
> threads run this code in order, without messing each other up.
>
> And it looks like the way to do that is to simply revert the commit
> which caused this regression, a4efc174b382 ("mm/cma.c: remove redundant
> cma_mutex lock")?
Yes, agree it could be a backup solution if not better ideas.
Regards
Aisheng
Powered by blists - more mailing lists