[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c5c77b6cadde9032ad1941bd7610cfb3925ac1ab.camel@surriel.com>
Date: Tue, 03 Oct 2023 20:20:02 -0400
From: Rik van Riel <riel@...riel.com>
To: Mike Kravetz <mike.kravetz@...cle.com>
Cc: linux-kernel@...r.kernel.org, kernel-team@...a.com,
linux-mm@...ck.org, akpm@...ux-foundation.org,
muchun.song@...ux.dev, leit@...a.com, willy@...radead.org
Subject: Re: [PATCH 2/3] hugetlbfs: close race between MADV_DONTNEED and
page fault
On Tue, 2023-10-03 at 13:19 -0700, Mike Kravetz wrote:
> On 10/03/23 15:35, Rik van Riel wrote:
> > On Sun, 2023-10-01 at 21:39 -0700, Mike Kravetz wrote:
> > >
> > > Something is not right here. I have not looked closely at the
> > > patch,
> > > but running libhugetlbfs test suite hits this NULL deref in
> > > misalign
> > > (2M: 32).
> >
> > Hi Mike,
> >
> > fixing the null dereference was easy, but I continued running
> > into a test case failure with linkhuge_rw. After tweaking the
> > code in my patches quite a few times, I finally ran out of
> > ideas and tried it on a tree without my patches.
> >
> > I still see the test failure on upstream
> > 2cf0f7156238 ("Merge tag 'nfs-for-6.6-2' of git://git.linux-
> > nfs.org/projects/anna/linux-nfs")
> >
> > This is with a modern glibc, and the __morecore assignments
> > in libhugetlbfs/morecore.c commented out.
> >
> >
> > HUGETLB_ELFMAP=R HUGETLB_SHARE=1 linkhuge_rw (2M: 32): Pool state:
> > (('hugepages-2048kB', (('free_hugepages', 1), ('resv_hugepages',
> > 0),
> > ('surplus_hugepages', 0), ('nr_hugepages_mempolicy', 1),
> > ('nr_hugepages', 1), ('nr_overcommit_hugepages', 0))),)
> > Hugepage pool state not preserved!
> > BEFORE: (('hugepages-2048kB', (('free_hugepages', 1),
> > ('resv_hugepages', 0), ('surplus_hugepages', 0),
> > ('nr_hugepages_mempolicy', 1), ('nr_hugepages', 1),
> > ('nr_overcommit_hugepages', 0))),)
> > AFTER: (('hugepages-2048kB', (('free_hugepages', 0),
> > ('resv_hugepages',
> > 0), ('surplus_hugepages', 0), ('nr_hugepages_mempolicy', 1),
> > ('nr_hugepages', 1), ('nr_overcommit_hugepages', 0))),)
> >
>
> Please consider the above failures normal and expected. That have
> been
> this way for many years. Sorry for any waste of your time.
>
> Of course, if you would like to look into these you are welcome.
I'm not too worried about the test cases returning failure,
but having free_hugepages not go back to 1 after linkhuge_rw
exits looks bad.
In this case it appears that linkhuge_rw simply left behind
a file in /dev/hugepages when it died, and removing that file
returns free_hugepages back to what it should be.
I guess I'll go run the test cases without -c 1 :)
--
All Rights Reversed.
Powered by blists - more mailing lists