lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180801112039.GJ16767@dhcp22.suse.cz>
Date:   Wed, 1 Aug 2018 13:20:39 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Michael Ellerman <mpe@...erman.id.au>
Cc:     John Allen <jallen@...ux.ibm.com>, n-horiguchi@...jp.nec.com,
        linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
        kamezawa.hiroyu@...fujitsu.com, mgorman@...e.de
Subject: Re: Infinite looping observed in __offline_pages

On Wed 01-08-18 21:09:39, Michael Ellerman wrote:
> Michal Hocko <mhocko@...nel.org> writes:
> > On Wed 25-07-18 13:11:15, John Allen wrote:
> > [...]
> >> Does a failure in do_migrate_range indicate that the range is unmigratable
> >> and the loop in __offline_pages should terminate and goto failed_removal? Or
> >> should we allow a certain number of retrys before we
> >> give up on migrating the range?
> >
> > Unfortunatelly not. Migration code doesn't tell a difference between
> > ephemeral and permanent failures.
> 
> What's to stop an ephemeral failure happening repeatedly?

If there is a short term pin on the page that prevents the migration
then the holder of the pin should realease it and the next retry will
succeed the migration. If the page gets freed on the way then it will
not be reallocated because they are isolated already. I can only see
complete OOM to be the reason to fail allocation of the target place
as the migration failure and that is highly unlikely and sooner or later
trigger the oom killer and release some memory.

The biggest problem here is that we cannot tell ephemeral and long term
pins...
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ