Date:	Mon, 29 Jun 2015 14:47:59 -0700
From:	Mike Kravetz <mike.kravetz@...cle.com>
To:	linux-mm@...ck.org, linux-kernel@...r.kernel.org
CC:	Dave Hansen <dave.hansen@...ux.intel.com>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	David Rientjes <rientjes@...gle.com>,
	Hugh Dickins <hughd@...gle.com>,
	Davidlohr Bueso <dave@...olabs.net>,
	Aneesh Kumar <aneesh.kumar@...ux.vnet.ibm.com>,
	Hillf Danton <hillf.zj@...baba-inc.com>,
	Christoph Hellwig <hch@...radead.org>
Subject: Re: [RFC v5 PATCH 1/9] mm/hugetlb: add region_del() to delete a specific
 range of entries

On 06/22/2015 05:38 PM, Mike Kravetz wrote:
> fallocate hole punch will want to remove a specific range of pages.
> The existing region_truncate() routine deletes all region/reserve
> map entries after a specified offset.  region_del() will provide
> this same functionality if the end of region is specified as -1.
> Hence, region_del() can replace region_truncate().
>
> Unlike region_truncate(), region_del() can return an error in the
> rare case where it can not allocate memory for a region descriptor.
> This ONLY happens in the case where an existing region must be split.
> Current callers passing -1 as end of range will never experience
> this error and do not need to deal with error handling.  Future
> callers of region_del() (such as fallocate hole punch) will need to
> handle this error.

Unfortunately, this new region_del() functionality required for hole
punch conflicts with existing region_chg()/region_add() assumptions.

region_chg/region_add is something like a two-step commit process for
adding new region entries.  region_chg is called first to determine
the changes required for the new entry.  If the new entry can be
represented by expanding an existing region, region_chg makes no
changes to the map.  If the new entry is not adjacent to an existing
region, region_chg creates a placeholder.  Later, when region_add is
called, the assumption is that a region (real or placeholder) can be
expanded to represent the new entry.  Since all required entries
already exist in the map, region_add cannot fail.
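
As a rough illustration of the two-step protocol described above, here
is a minimal userspace sketch in C.  It is not the kernel code: the
singly linked list, the find_slot() helper and the assert() in
region_add are assumptions made for the example.  The assert spells out
the assumption that a region (real or placeholder) is always available
to expand.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Userspace model only; names mirror the kernel but this is a sketch. */
struct region {
    long from, to;          /* half-open range [from, to) */
    struct region *next;    /* list is kept sorted by 'from' */
};

struct resv_map {
    struct region *head;
};

/* Hypothetical helper: link to the first region with f <= rg->to. */
static struct region **find_slot(struct resv_map *map, long f)
{
    struct region **pp = &map->head;

    while (*pp && (*pp)->to < f)
        pp = &(*pp)->next;
    return pp;
}

/* Step 1: count the pages [f, t) would add.  May allocate, so may fail. */
long region_chg(struct resv_map *map, long f, long t)
{
    struct region **pp = find_slot(map, f);
    struct region *rg = *pp;
    long chg;

    if (!rg || rg->from > t) {
        /* Nothing adjacent to expand: insert an empty placeholder. */
        struct region *nrg = malloc(sizeof(*nrg));

        if (!nrg)
            return -ENOMEM;
        nrg->from = nrg->to = f;
        nrg->next = rg;
        *pp = nrg;
        return t - f;
    }

    /* Expanding an existing region: the map itself is left untouched. */
    if (f > rg->from)
        f = rg->from;
    chg = t - f;
    for (; rg && rg->from <= t; rg = rg->next) {
        if (rg->to > t) {
            chg += rg->to - t;
            t = rg->to;
        }
        chg -= rg->to - rg->from;   /* already accounted for */
    }
    return chg;
}

/* Step 2: commit.  Only expands and merges, so it never allocates. */
void region_add(struct resv_map *map, long f, long t)
{
    struct region *nrg = *find_slot(map, f);
    struct region *rg;

    /* The assumption: region_chg left something to expand. */
    assert(nrg && nrg->from <= t);

    if (f > nrg->from)
        f = nrg->from;
    if (nrg->to > t)
        t = nrg->to;
    /* Absorb following regions that now overlap or touch. */
    while ((rg = nrg->next) && rg->from <= t) {
        if (rg->to > t)
            t = rg->to;
        nrg->next = rg->next;
        free(rg);
    }
    nrg->from = f;
    nrg->to = t;
}
```

In this model, every allocation happens in region_chg, which is why
only region_chg needs an ENOMEM path.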

It is possible for the new region_del to modify the map between the
region_chg and region_add calls.  It cannot modify the same map
entry being added by region_chg/region_add, as that is protected by
the fault mutex.  However, it can modify an entry adjacent to the
new entry, so that the entry is no longer adjacent to the new entry.
As a result, when region_add is called, it will not find a region
which can be expanded to represent the new entry.
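
The window described above can be modeled with a few lines of C.  This
is a sketch under assumptions: can_expand() and trim_tail() are
hypothetical helpers, and the hole punch is reduced to trimming the
tail of a single neighboring region.

```c
#include <assert.h>
#include <stdbool.h>

struct region {
    long from, to;      /* half-open range [from, to) */
};

/* True if rg overlaps or touches [f, t), so it could grow to cover it. */
static bool can_expand(const struct region *rg, long f, long t)
{
    return rg->from <= t && f <= rg->to;
}

/* Model of a hole punch that removes the tail of rg down to new_to. */
static void trim_tail(struct region *rg, long new_to)
{
    if (new_to < rg->to)
        rg->to = new_to < rg->from ? rg->from : new_to;
}
```

With a neighbor covering [0, 5) and a new entry [5, 6), can_expand()
is true when region_chg runs, so no placeholder is created.  If a hole
punch then trims the neighbor to [0, 3), can_expand() is false by the
time region_add runs.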

In this situation, region_add only needs to add a new region to
the map.  However, doing so would require allocating a new region
descriptor.  The allocation could fail, which would cause region_add
to fail.

I'm thinking about having a cache of region descriptors pre-allocated
to handle this (rare) situation.  The number of descriptors needed
in the cache would correspond to the number of page faults in
progress (between region_chg and region_add).  region_chg would make
sure there are sufficient descriptors and allocate one if needed.
Error handling for region_chg ENOMEM already exists.  A sufficient
number of entries would be pre-allocated such that in the normal
case no allocation would be necessary.
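
One way the cache idea above could look is sketched below as a
userspace model in C.  The struct resv_cache type, its field names and
the three helpers are all assumptions for illustration; nothing here is
taken from an actual patch.  The point is only that the allocation (and
its ENOMEM path) sits on the region_chg side, while the region_add side
takes a descriptor that is guaranteed to be there.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct region {
    long from, to;
    struct region *next;
};

/* Hypothetical bookkeeping for the proposed descriptor cache. */
struct resv_cache {
    struct region *free_list;   /* pre-allocated, unused descriptors */
    long cached;                /* number of descriptors on free_list */
    long in_progress;           /* region_chg calls awaiting region_add */
};

/* region_chg side: make sure one descriptor exists per add in progress. */
int cache_reserve(struct resv_cache *c)
{
    c->in_progress++;
    while (c->cached < c->in_progress) {
        struct region *rg = malloc(sizeof(*rg));

        if (!rg) {
            c->in_progress--;
            return -ENOMEM;     /* reported through region_chg */
        }
        rg->next = c->free_list;
        c->free_list = rg;
        c->cached++;
    }
    return 0;
}

/* region_add side: used only when no existing region can be expanded. */
struct region *cache_take(struct resv_cache *c)
{
    struct region *rg = c->free_list;

    assert(rg != NULL);         /* guaranteed by cache_reserve() */
    c->free_list = rg->next;
    c->cached--;
    rg->next = NULL;
    return rg;
}

/* End of region_add (or an aborted fault): the add is no longer pending. */
void cache_done(struct resv_cache *c)
{
    assert(c->in_progress > 0);
    c->in_progress--;
}
```

In the common case where region_add can expand an existing region, the
descriptor is never taken and stays in the cache for the next fault, so
later cache_reserve() calls usually find enough entries and allocate
nothing.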

Thoughts?
-- 
Mike Kravetz