lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200227084704.aolem5nktpricrzo@box>
Date:   Thu, 27 Feb 2020 11:47:04 +0300
From:   "Kirill A. Shutemov" <kirill@...temov.name>
To:     Hugh Dickins <hughd@...gle.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Yang Shi <yang.shi@...ux.alibaba.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        David Hildenbrand <david@...hat.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <willy@...radead.org>,
        Andrea Arcangeli <aarcange@...hat.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] huge tmpfs: try to split_huge_page() when punching hole

On Wed, Feb 26, 2020 at 08:06:33PM -0800, Hugh Dickins wrote:
> Yang Shi writes:
> 
> Currently, when truncating a shmem file, if the range is partly in a THP
> (start or end is in the middle of THP), the pages actually will just get
> cleared rather than being freed, unless the range covers the whole THP.
> Even though all the subpages are truncated (randomly or sequentially),
> the THP may still be kept in page cache.
> 
> This might be fine for some usecases which prefer preserving THP, but
> balloon inflation is handled in base page size.  So when using shmem THP
> as memory backend, QEMU inflation actually doesn't work as expected since
> it doesn't free memory.  But the inflation usecase really needs to get
> the memory freed.  (Anonymous THP will also not get freed right away,
> but will be freed eventually when all subpages are unmapped: whereas
> shmem THP still stays in page cache.)
> 
> Split THP right away when doing partial hole punch, and if split fails
> just clear the page so that read of the punched area will return zeroes.
> 
> Hugh Dickins adds:
> 
> Our earlier "team of pages" huge tmpfs implementation worked in the way
> that Yang Shi proposes; and we have been using this patch to continue to
> split the huge page when hole-punched or truncated, since converting over
> to the compound page implementation.  Although huge tmpfs gives out huge
> pages when available, if the user specifically asks to truncate or punch
> a hole (perhaps to free memory, perhaps to reduce the memcg charge), then
> the filesystem should do so as best it can, splitting the huge page.

I'm still uncomfortable with proposition to use truncate or punch a hole
operations to manage memory footprint. These operations are about managing
storage footprint, not memory. This happens to be the same for tmpfs.

I wounder if we should consider limiting the behaviour to the operation
that explicitly combines memory and storage managing: MADV_REMOVE. This
way we can avoid future misunderstandings with THP backed by a real
filesystem.

>  }
>  
>  /*
> + * Check whether a hole-punch or truncation needs to split a huge page,
> + * returning true if no split was required, or the split has been successful.
> + *
> + * Eviction (or truncation to 0 size) should never need to split a huge page;
> + * but in rare cases might do so, if shmem_undo_range() failed to trylock on
> + * head, and then succeeded to trylock on tail.
> + *
> + * A split can only succeed when there are no additional references on the
> + * huge page: so the split below relies upon find_get_entries() having stopped
> + * when it found a subpage of the huge page, without getting further references.
> + */
> +static bool shmem_punch_compound(struct page *page, pgoff_t start, pgoff_t end)
> +{
> +	if (!PageTransCompound(page))
> +		return true;
> +
> +	/* Just proceed to delete a huge page wholly within the range punched */
> +	if (PageHead(page) &&
> +	    page->index >= start && page->index + HPAGE_PMD_NR <= end)
> +		return true;
> +
> +	/* Try to split huge page, so we can truly punch the hole or truncate */
> +	return split_huge_page(page) >= 0;
> +}

I wanted to recommend taking into account khugepaged_max_ptes_none here,
but it will nullify usefulness of the feature for ballooning. Oh, well...

-- 
 Kirill A. Shutemov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ