[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YFIZVvQllaZHDgzW@dhcp22.suse.cz>
Date: Wed, 17 Mar 2021 15:59:34 +0100
From: Michal Hocko <mhocko@...e.com>
To: Oscar Salvador <osalvador@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
David Hildenbrand <david@...hat.com>,
Muchun Song <songmuchun@...edance.com>,
Mike Kravetz <mike.kravetz@...cle.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 2/5] mm,compaction: Let
isolate_migratepages_{range,block} return error codes
On Wed 17-03-21 15:38:35, Oscar Salvador wrote:
> On Wed, Mar 17, 2021 at 03:12:29PM +0100, Michal Hocko wrote:
> > > Since isolate_migratepages_block will stop returning the next pfn to be
> > > scanned, we reuse the cc->migrate_pfn field to keep track of that.
> >
> > This looks hakish and I cannot really tell that users of cc->migrate_pfn
> > work as intended.
>
> When discussing this with Vlastimil, I came up with the idea of adding a new
> field in compact_control struct, e.g: next_pfn_scan to keep track of the next
> pfn to be scanned.
>
> But Vlastimil made me realize that since cc->migrate_pfn points to that aleady,
> so we do not need any extra field.
This deserves a big fat comment.
> > > @@ -810,6 +811,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> > > unsigned long next_skip_pfn = 0;
> > > bool skip_updated = false;
> > >
> > > + cc->migrate_pfn = low_pfn;
> > > +
> > > /*
> > > * Ensure that there are not too many pages isolated from the LRU
> > > * list by either parallel reclaimers or compaction. If there are,
> > > @@ -818,16 +821,16 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> > > while (unlikely(too_many_isolated(pgdat))) {
> > > /* stop isolation if there are still pages not migrated */
> > > if (cc->nr_migratepages)
> > > - return 0;
> > > + return -EINTR;
> > >
> > > /* async migration should just abort */
> > > if (cc->mode == MIGRATE_ASYNC)
> > > - return 0;
> > > + return -EINTR;
> >
> > EINTR for anything other than signal based bail out is really confusing.
>
> When coding that, I thought about using -1 for the first two checks, and keep
> -EINTR for the signal check, but isolate_migratepages_block only has two users:
No, do not mix error reporting with different semantic. Either make it
errno or return -1 for all failures if you do not care which error that
is. You do care and hence this patch so make that errno and above two
should simply EAGAIN as this is a congestion situation.
> - isolate_migratepages: Does not care about the return code other than pfn != 0,
> and it does not pass the error down the chain.
> - isolate_migratepages_range: The error is passed down the chain, and !pfn is being
> treated as -EINTR:
>
> static int __alloc_contig_migrate_range(struct compact_control *cc,
> unsigned long start, unsigned long end)
> {
> ...
> ...
> pfn = isolate_migratepages_range(cc, pfn, end);
> if (!pfn) {
> ret = -EINTR;
> break;
> }
> ...
> }
>
> That is why I decided to stick with -EINTR.
I suspect this is only because there was not really a better way to tell
the failure so it went with EINTR which makes alloc_contig_range bail
out. The high level handling there is quite dubious as EAGAIN is already
possible from the page migration path and that shouldn't be a fatal
failure.
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists