lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YLjO2YU2G5fTVB3x@dhcp22.suse.cz>
Date:   Thu, 3 Jun 2021 14:45:13 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Oscar Salvador <osalvador@...e.de>
Cc:     David Hildenbrand <david@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Pavel Tatashin <pasha.tatashin@...een.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/3] mm,page_alloc: Use {get,put}_online_mems() to get
 stable zone's values

On Thu 03-06-21 10:38:24, Oscar Salvador wrote:
> On Wed, Jun 02, 2021 at 09:45:58PM +0200, Oscar Salvador wrote:
> > It was too nice and easy to be true I guess.
> > Yeah, I missed the point that blocking might be an issue here, and hotplug
> > operations can take really long, so not an option.
> > I must have switched my brain off back there, because now it is just too
> > obvious.
> > 
> > Then I guwmess that seqlock must stay and the only thing than can go is the
> > pgdat resize lock from the hotplug code.
> 
> So, I have been looking into this again.
> Of course, the approach taken here is outrageously wrong, but there are
> some other things that are a bit confusing.
> 
> As pointed out, bad_range() (the function that ends up calling
> page_outside_zone_boundaries) is called from different functions via VM_BUG_ON_*.
> page_outside_zone_boundaries() takes care of taking the seqlock to avoid
> reading stale values that might happen if we race with memhotplug
> operations.
> page_outside_zone_boundaries() calls zone_spans_pfn() to check that.
>  
> Now on the funny thing.
> 
> We do have several places happily calling zone_spans_pfn() without
> holding zone's seqlock, e.g:
> 
> set_pageblock_migratetype
>  set_pfnblock_flags_mask
>   zone_spans_pfn
> 
> move_freepages_block
>  zone_spans_pfn
> 
> alloc_contig_pages
>  zone_spans_last_pfn
>   zone_spans_pfn
> 
> Those places hold zone->lock, while move_pfn_range_to_zone() and
> remove_pfn_range_from_zone() hold zone->seqlock, so AFAICS, those places
> could read a stale value and proceed thinking the range is within the
> zone while it is not.
> 
> So I guess my question is, should we force those places to take the
> seqlock reader as we do in page_outside_zone_boundaries(), (or maybe
> just move the seqlock handling to zone_spans_pfn())?

I believe we need to define the purpose of the locking first. The
existing locking doesn't serve much purpose, does it? The state might
change right after the lock is released and the caller cannot really
rely on the result. So aside of the current implementation, I would
argue that any locking has to be be done on the caller layer.

But the primary question is whether anybody actually cares about
potential races in the first place.
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ