lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YEo1gz6wuYl1Fuqt@cmpxchg.org>
Date:   Thu, 11 Mar 2021 10:21:39 -0500
From:   Johannes Weiner <hannes@...xchg.org>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Matthew Wilcox <willy@...radead.org>,
        Zhou Guanghui <zhouguanghui1@...wei.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        akpm@...ux-foundation.org, hughd@...gle.com,
        kirill.shutemov@...ux.intel.com, npiggin@...il.com, ziy@...dia.com,
        wangkefeng.wang@...wei.com, guohanjun@...wei.com,
        dingtianhong@...wei.com, chenweilong@...wei.com,
        rui.xiang@...wei.com
Subject: Re: [PATCH v2 2/2] mm/memcg: set memcg when split page

On Thu, Mar 11, 2021 at 09:37:02AM +0100, Michal Hocko wrote:
> Johannes, Hugh,
> 
> what do you think about this approach? If we want to stick with
> split_page approach then we need to update the missing place Matthew has
> pointed out.

I find the __free_pages() code quite tricky as well. But for that
reason I would actually prefer to initiate the splitting in there,
since that's the place where we actually split the page, rather than
spread the handling of this situation further out.

The race condition shouldn't be hot, so I don't think we need to be as
efficient about setting page->memcg_data only on the higher-order
buddies as in Willy's scratch patch. We can call split_page_memcg(),
which IMO should actually help document what's happening to the page.

I think that function could also benefit a bit more from step-by-step
documentation about what's going on. The kerneldoc is helpful, but I
don't think it does justice to how tricky this race condition is.

Something like this?

void __free_pages(struct page *page, unsigned int order)
{
	/*
	 * Drop the base reference from __alloc_pages and free. In
	 * case there is an outstanding speculative reference, from
	 * e.g. the page cache, it will put and free the page later.
	 */
	if (likely(put_page_testzero(page))) {
		free_the_page(page, order);
		return;
	}

	/*
	 * The speculative reference will put and free the page.
	 *
	 * However, if the speculation was into a higher-order page
	 * that isn't marked compound, the other side will know
	 * nothing about our buddy pages and only free the order-0
	 * page at the start of our chunk! We must split off and free
	 * the buddy pages here.
	 *
	 * The buddy pages aren't individually refcounted, so they
	 * can't have any pending speculative references themselves.
	 */
	if (!PageHead(page) && order > 0) {
		split_page_memcg(page, 1 << order);
		while (order-- > 0)
			free_the_page(page + (1 << order), order);
	}
}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ