Date:	Fri, 9 Sep 2011 08:28:55 +0200
From:	Johannes Weiner <jweiner@...hat.com>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Ying Han <yinghan@...gle.com>,
	Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
	Balbir Singh <bsingharora@...il.com>,
	Michal Hocko <mhocko@...e.cz>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [patch] mm: memcg: close race between charge and putback

On Fri, Sep 09, 2011 at 09:28:53AM +0900, KAMEZAWA Hiroyuki wrote:
> On Thu, 8 Sep 2011 11:53:49 +0200
> Johannes Weiner <jweiner@...hat.com> wrote:
> 
> > On Thu, Sep 08, 2011 at 06:42:21PM +0900, KAMEZAWA Hiroyuki wrote:
> > > On Thu, 8 Sep 2011 11:33:16 +0200
> > > Johannes Weiner <jweiner@...hat.com> wrote:
> > > 
> > > > On Thu, Sep 08, 2011 at 06:19:01PM +0900, KAMEZAWA Hiroyuki wrote:
> > > > > On Thu, 8 Sep 2011 10:54:04 +0200
> > > > > Johannes Weiner <jweiner@...hat.com> wrote:
> > > > > 
> > > > > > On Thu, Sep 08, 2011 at 05:30:42PM +0900, KAMEZAWA Hiroyuki wrote:
> > > > > > > On Thu,  8 Sep 2011 09:40:22 +0200
> > > > > > > Johannes Weiner <jweiner@...hat.com> wrote:
> > > > > > > 
> > > > > > > > There is a potential race between a thread charging a page and another
> > > > > > > > thread putting it back to the LRU list:
> > > > > > > > 
> > > > > > > > charge:                         putback:
> > > > > > > > SetPageCgroupUsed               SetPageLRU
> > > > > > > > PageLRU && add to memcg LRU     PageCgroupUsed && add to memcg LRU
> > > > > > > > 
> > > > > > > 
> > > > > > > I assumed that all pages are charged before added to LRU.
> > > > > > > (i.e. event happens in charge->lru_lock->putback order.)
> > > > > > > 
> > > > > > > > But hmm, this assumption may be bad for maintenance.
> > > > > > > Do you find a code which adds pages to LRU before charge ?
> > > > > > > 
> > > > > > > Hmm, if there are codes which recharge the page to other memcg,
> > > > > > > it will cause bug and my assumption may be harmful.
> > > > > > 
> > > > > > Swap slots are read optimistically into swapcache and put to the LRU,
> > > > > > then charged upon fault.  
> > > > > 
> > > > > Yes, then swap charge removes page from LRU before charge.
> > > > > IIUC, it needed to do so because page->mem_cgroup may be replaced.
> > > > 
> > > > But only from the memcg LRU.  It's still on the global per-zone LRU,
> > > > so reclaim could isolate/putback it during the charge.  And then
> > > > 
> > > > > > > > charge:                         putback:
> > > > > > > > SetPageCgroupUsed               SetPageLRU
> > > > > > > > PageLRU && add to memcg LRU     PageCgroupUsed && add to memcg LRU
> > > > 
> > > > applies.
> > > 
> > > Hmm, in this case, I thought memcg puts back the page to its LRU by itself
> > > under lru_lock after charge and the race was hidden.
> > 
> > But it locklessly checks PageLRU and bails if it's cleared and that is
> 
> I think PageLRU check is done under zone->lru_lock.

Yes, but only if a preliminary, lockless check observed PageLRU being
set:

static void mem_cgroup_lru_add_after_commit(struct page *page)
{
        unsigned long flags;
        struct zone *zone = page_zone(page);
        struct page_cgroup *pc = lookup_page_cgroup(page);
        /*
         * putback:                             charge:
         * SetPageLRU                           SetPageCgroupUsed
         * smp_mb                               smp_mb
         * PageCgroupUsed && add to memcg LRU   PageLRU && add to memcg LRU
         *
         * Ensure that one of the two sides adds the page to the memcg
         * LRU during a race.
         */
        smp_mb();
        /* taking care of that the page is added to LRU while we commit it */
        if (likely(!PageLRU(page)))
                return;
        spin_lock_irqsave(&zone->lru_lock, flags);
        /* link when the page is linked to LRU but page_cgroup isn't */
        if (PageLRU(page) && !PageCgroupAcctLRU(pc))
                mem_cgroup_add_lru_list(page, page_lru(page));
        spin_unlock_irqrestore(&zone->lru_lock, flags);
}

Without the barriers, the preliminary check may see !PageLRU while a
racing putback observed !PageCgroupUsed and nobody will add the page
to the memcg-LRU.
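
For illustration only, the same pattern can be modelled in userspace with
C11 atomics.  This is a rough sketch, not kernel code: page_lru and
page_cgroup_used are hypothetical stand-ins for the two page flags, and
atomic_thread_fence(memory_order_seq_cst) is assumed to play the role of
smp_mb().  Each side sets its own flag, issues the full barrier, then
checks the other side's flag; with both barriers in place, at least one
side must see the other's store.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two page flags; not kernel APIs. */
static atomic_bool page_lru;            /* models PageLRU */
static atomic_bool page_cgroup_used;    /* models PageCgroupUsed */

/* What each side observed; read by the caller only after pthread_join(). */
static bool putback_saw_used;
static bool charge_saw_lru;

static void *putback_side(void *arg)
{
        (void)arg;
        /* SetPageLRU */
        atomic_store_explicit(&page_lru, true, memory_order_relaxed);
        /* full barrier, standing in for smp_mb() */
        atomic_thread_fence(memory_order_seq_cst);
        /* PageCgroupUsed && add to memcg LRU */
        putback_saw_used = atomic_load_explicit(&page_cgroup_used,
                                                memory_order_relaxed);
        return NULL;
}

static void *charge_side(void *arg)
{
        (void)arg;
        /* SetPageCgroupUsed */
        atomic_store_explicit(&page_cgroup_used, true, memory_order_relaxed);
        /* full barrier, standing in for smp_mb() */
        atomic_thread_fence(memory_order_seq_cst);
        /* PageLRU && add to memcg LRU */
        charge_saw_lru = atomic_load_explicit(&page_lru,
                                              memory_order_relaxed);
        return NULL;
}

/*
 * Run the race 'trials' times and count the rounds in which neither
 * side would have added the page to the memcg LRU.  Returns -1 if a
 * thread could not be created.
 */
int count_lost_pages(int trials)
{
        int lost = 0;

        for (int i = 0; i < trials; i++) {
                pthread_t putback, charge;

                atomic_store(&page_lru, false);
                atomic_store(&page_cgroup_used, false);

                if (pthread_create(&putback, NULL, putback_side, NULL))
                        return -1;
                if (pthread_create(&charge, NULL, charge_side, NULL)) {
                        pthread_join(putback, NULL);
                        return -1;
                }
                pthread_join(putback, NULL);
                pthread_join(charge, NULL);

                if (!putback_saw_used && !charge_saw_lru)
                        lost++;
        }
        return lost;
}
```

With both fences present, count_lost_pages() should return 0 for any
number of trials.  Dropping the fences makes the "both sides miss"
outcome permitted by the memory model, though whether it actually shows
up in such a test depends on the hardware and on timing.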

> > the problem: it's not guaranteed that PageLRU is observed on the
> > charging CPU when the putback side bailed because of PageCgroupUsed.
> > 
> zone->lru_lock is no help ?

Removing the preliminary check is not an option.  This is the path for
regular file cache and the common case is !PageLRU.  We do not want to
acquire the lock in an estimated 99.999% of cases.
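
To make the cost argument concrete, here is a minimal userspace sketch of
the "lockless check first, recheck under the lock" shape.  Everything in
it is an assumption made for illustration: struct model_page, its fields
and the lock_acquisitions counter are hypothetical, and a pthread mutex
stands in for zone->lru_lock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a page; the field names are illustrative only. */
struct model_page {
        atomic_bool on_lru;     /* models PageLRU */
        bool on_memcg_lru;      /* models PageCgroupAcctLRU, under lru_lock */
};

static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;
static long lock_acquisitions;  /* how often the slow path was taken */

void lru_add_after_commit(struct model_page *page)
{
        /* full barrier, standing in for smp_mb() */
        atomic_thread_fence(memory_order_seq_cst);

        /* Fast path: the common case is "not on the LRU", no lock taken. */
        if (!atomic_load_explicit(&page->on_lru, memory_order_relaxed))
                return;

        pthread_mutex_lock(&lru_lock);
        lock_acquisitions++;
        /* Recheck under the lock; the lockless result may be stale. */
        if (atomic_load_explicit(&page->on_lru, memory_order_relaxed) &&
            !page->on_memcg_lru)
                page->on_memcg_lru = true;
        pthread_mutex_unlock(&lru_lock);
}
```

A page that is not on the LRU returns before touching the lock, so
lock_acquisitions only grows for the rare pages that were put back while
the charge was being committed.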