Date:	Fri, 18 Jan 2013 19:08:25 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Glauber Costa <glommer@...allels.com>
Cc:	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org, xfs@....sgi.com,
	Greg Thelen <gthelen@...gle.com>,
	Ying Han <yinghan@...gle.com>,
	Suleiman Souhlal <suleiman@...gle.com>
Subject: Re: [PATCH 09/19] list_lru: per-node list infrastructure

On Thu, Jan 17, 2013 at 04:51:03PM -0800, Glauber Costa wrote:
> On 01/17/2013 04:10 PM, Dave Chinner wrote:
> > and we end up with:
> > 
> > lru_add(struct lru_list *lru, struct lru_item *item)
> > {
> > 	node_id = min(object_to_nid(item), lru->numnodes);
> > 	
> > 	__lru_add(lru, node_id, &item->global_list);
> > 	if (memcg) {
> > 		memcg_lru = find_memcg_lru(lru->memcg_lists, memcg_id)
> > 		__lru_add_(memcg_lru, node_id, &item->memcg_list);
> > 	}
> > }
> 
> A follow up thought: If we have multiple memcgs, and global pressure
> kicks in (meaning none of them are particularly under pressure),
> shouldn't we try to maintain fairness among them and reclaim equal
> proportions from them all the same way we do with sb's these days, for
> instance?

I don't like the complexity. The global lists will be reclaimed in
LRU order, so it's going to be as fair as can be. If there's a memcg
that has older unused objects than the others, then from a global
perspective they should be reclaimed first because the memcg is not
using them...

> I would argue that if your memcg is small, the list of dentries is
> small: scanning it all for the nodes you want shouldn't hurt.

On the contrary - the memcg might be small, but what happens if
someone in it ran a find across all the filesystems on the system?
Then the LRU will be huge, and scanning it expensive...

We can't make static decisions about small and large, and we can't
trust heuristics to get it right, either. If we have a single list,
we don't/can't do node-aware reclaim efficiently and so shouldn't
even try.

> if the memcg is big, it will have per-node lists anyway.

But may have no need for them due to the workload. ;)

> Given that, do we really want to pay the price of two list_heads
> in the objects?

I'm just looking at ways of making the infrastructure sane. If the
cost is an extra 16 bytes per object on an LRU, then that's a small
price to pay for having robust memory reclaim infrastructure....

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
