Message-ID: <1593498910.3046.2.camel@suse.de>
Date: Tue, 30 Jun 2020 08:35:10 +0200
From: Oscar Salvador <osalvador@...e.de>
To: Qian Cai <cai@....pw>, nao.horiguchi@...il.com
Cc: linux-mm@...ck.org, mhocko@...nel.org, akpm@...ux-foundation.org,
mike.kravetz@...cle.com, tony.luck@...el.com, david@...hat.com,
aneesh.kumar@...ux.vnet.ibm.com, zeil@...dex-team.ru,
naoya.horiguchi@....com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 00/15] HWPOISON: soft offline rework
On Tue, 2020-06-30 at 01:08 -0400, Qian Cai wrote:
> On Wed, Jun 24, 2020 at 03:01:22PM +0000, nao.horiguchi@...il.com wrote:
> > I rebased soft-offline rework patchset [1][2] onto the latest mmotm. The
> > rebasing required some non-trivial changes to adjust, but mainly that was
> > straightforward. I confirmed that the reported problem doesn't reproduce on
> > compaction after soft offline. For more precise description of the problem
> > and the motivation of this patchset, please see [2].
> >
> > I think that the following two patches in v2 are better to be done with
> > separate work of hard-offline rework, so they are not included in this
> > series.
> >
> > - mm,hwpoison: Take pages off the buddy when hard-offlining
> > - mm/hwpoison-inject: Rip off duplicated checks
> >
> > These two are not directly related to the reported problem, so they seem
> > not urgent. And the first one breaks num_poisoned_pages counting in some
> > testcases, and the second patch needs more consideration about the
> > commented point.
> >
> > Any comment/suggestion/help would be appreciated.
>
> Even after applying the compile fix,
>
> https://lore.kernel.org/linux-mm/20200628065409.GA546944@u2004/
>
> madvise(MADV_SOFT_OFFLINE) will fail with EIO with hugetlb where it
> would succeed without this series. Steps:
>
> # git clone https://github.com/cailca/linux-mm
> # cd linux-mm; make
> # ./random 1 (Need at least two NUMA memory nodes)
> start: migrate_huge_offline
> - use NUMA nodes 0,4.
> - mmap and free 8388608 bytes hugepages on node 0
> - mmap and free 8388608 bytes hugepages on node 4
> madvise: Input/output error
I think I know why.

It's been a while since I took a look, but I compared the posted
patchset with the newest version I had ready, and I saw that I had made
some changes with regard to hugetlb pages.

I will take a look, although it might be better to re-post the
patchset instead of adding a fix on top, since the changes are fairly
substantial.

Thanks for reporting.
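
For anyone following along, the failing call in the report reduces to
roughly the sketch below. It is only an illustration, not the test from
the linux-mm repository quoted above: the helper name try_soft_offline()
and the DEMO_MAIN guard are made up here, the 8388608-byte size is taken
from the quoted log, and NUMA binding is left out entirely. It assumes a
kernel built with CONFIG_MEMORY_FAILURE, a caller with CAP_SYS_ADMIN, and
enough reserved hugepages for the mapping.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_SOFT_OFFLINE
/* Fallback for older libc headers; 101 is the value in the Linux uapi
 * headers on x86 and is assumed here for other architectures. */
#define MADV_SOFT_OFFLINE 101
#endif

/*
 * Hypothetical helper: map len bytes of anonymous memory (optionally
 * hugetlb-backed), touch it, and ask the kernel to soft-offline it.
 * Returns 0 on success, otherwise the errno of the first failing call.
 */
static int try_soft_offline(size_t len, int use_hugetlb)
{
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	int rc = 0;
	void *p;

	if (use_hugetlb) {
#ifdef MAP_HUGETLB
		flags |= MAP_HUGETLB;
#else
		return ENOSYS;	/* headers without hugetlb support */
#endif
	}

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
	if (p == MAP_FAILED)
		return errno;	/* e.g. ENOMEM when no hugepages are reserved */

	memset(p, 0xab, len);	/* fault the pages in before offlining */

	if (madvise(p, len, MADV_SOFT_OFFLINE) != 0)
		rc = errno;	/* the report sees EIO here with the series applied */

	munmap(p, len);
	return rc;
}

#ifdef DEMO_MAIN
int main(void)
{
	int rc = try_soft_offline(8388608, 1);

	printf("soft offline: %s\n", rc ? strerror(rc) : "ok");
	return rc != 0;
}
#endif
```

Built with -DDEMO_MAIN and run as root, this prints either "ok" or the
error string for the first call that failed, so an EIO from madvise()
shows up as "Input/output error", as in the quoted output.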
--
Oscar Salvador
SUSE L3