linux-kernel - Re: [RFC][PATCH 4/9] create aggregate kvm_total_used_mmu

Date:	Wed, 16 Jun 2010 11:48:00 +0300
From:	Avi Kivity <avi@...hat.com>
To:	Dave Hansen <dave@...ux.vnet.ibm.com>
CC:	linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [RFC][PATCH 4/9] create aggregate kvm_total_used_mmu_pages value

On 06/15/2010 04:55 PM, Dave Hansen wrote:
> Note: this is the real meat of the patch set.  It can be applied up
> to this point, and everything will probably be improved, at least
> a bit.
>
> Of slab shrinkers, the VM code says:
>
>   * Note that 'shrink' will be passed nr_to_scan == 0 when the VM is
>   * querying the cache size, so a fastpath for that case is appropriate.
>
> and it *means* it.  Look at how it calls the shrinkers:
>
> 	nr_before = (*shrinker->shrink)(0, gfp_mask);
> 	shrink_ret = (*shrinker->shrink)(this_scan, gfp_mask);
>
> So, if you do anything stupid in your shrinker, the VM will doubly
> punish you.
>    

Ouch.

> The mmu_shrink() function takes the global kvm_lock, then acquires
> every VM's kvm->mmu_lock in sequence.  If we have 100 VMs, then
> we're going to take 101 locks.  We do it twice, so each call takes
> 202 locks.  If we're under memory pressure, we can have each cpu
> trying to do this.  It can get really hairy, and we've seen lock
> spinning in mmu_shrink() be the dominant entry in profiles.
>
> This is guaranteed to optimize at least half of those lock
> acquisitions away.  It removes the need to take any of the locks
> when simply trying to count objects.
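
To make the arithmetic above concrete, here is a rough userspace sketch of
the two schemes.  It is a toy model only: toy_lock() just counts
acquisitions, and every name in it (toy_vm, shrink_walk_all,
shrink_with_aggregate, locks_per_pass) is made up for illustration and is
not taken from mmu.c.  It assumes 100 VMs and one global lock, as in the
description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names come from the kernel. */
#define NR_VMS 100

struct toy_vm {
	unsigned int n_used_mmu_pages;
};

static struct toy_vm vms[NR_VMS];
static unsigned int total_used_mmu_pages;	/* aggregate, as in the patch */
static unsigned long locks_taken;		/* stands in for real locking */

static void toy_lock(void)
{
	locks_taken++;
}

static void toy_init(unsigned int pages_per_vm)
{
	size_t i;

	total_used_mmu_pages = 0;
	locks_taken = 0;
	for (i = 0; i < NR_VMS; i++) {
		vms[i].n_used_mmu_pages = pages_per_vm;
		total_used_mmu_pages += pages_per_vm;
	}
}

/* Unpatched scheme: every call walks all VMs, even a count-only query. */
static unsigned int shrink_walk_all(int nr_to_scan)
{
	unsigned int total = 0;
	size_t i;

	assert(nr_to_scan >= 0);
	toy_lock();				/* the global lock */
	for (i = 0; i < NR_VMS; i++) {
		toy_lock();			/* this VM's lock */
		if (nr_to_scan > 0 && vms[i].n_used_mmu_pages > 0) {
			vms[i].n_used_mmu_pages--;	/* "free" one page */
			total_used_mmu_pages--;
			nr_to_scan--;
		}
		total += vms[i].n_used_mmu_pages;
	}
	return total;
}

/* Patched scheme: nr_to_scan == 0 is answered from the aggregate. */
static unsigned int shrink_with_aggregate(int nr_to_scan)
{
	if (nr_to_scan == 0)
		return total_used_mmu_pages;	/* fast path, no locks */
	return shrink_walk_all(nr_to_scan);
}

/* Mimics the VM's two calls per pass; returns the locks taken. */
static unsigned long locks_per_pass(unsigned int (*shrink)(int), int this_scan)
{
	unsigned long before = locks_taken;

	(void)shrink(0);		/* the nr_before query */
	(void)shrink(this_scan);	/* the actual scan */
	return locks_taken - before;
}
```

In this model a pass through shrink_walk_all costs 202 lock acquisitions
and a pass through shrink_with_aggregate costs 101, which is the "at least
half" quoted above.  The model ignores contention entirely, so it says
nothing about how much time is actually saved.
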
>
>
> Signed-off-by: Dave Hansen<dave@...ux.vnet.ibm.com>
> ---
>
>   linux-2.6.git-dave/arch/x86/kvm/mmu.c |   33 +++++++++++++++++++++++----------
>   1 file changed, 23 insertions(+), 10 deletions(-)
>
> diff -puN arch/x86/kvm/mmu.c~make_global_used_value arch/x86/kvm/mmu.c
> --- linux-2.6.git/arch/x86/kvm/mmu.c~make_global_used_value	2010-06-09 15:14:30.000000000 -0700
> +++ linux-2.6.git-dave/arch/x86/kvm/mmu.c	2010-06-09 15:14:30.000000000 -0700
> @@ -891,6 +891,19 @@ static int is_empty_shadow_page(u64 *spt
>   }
>   #endif
>
> +/*
> + * This value is the sum of all of the kvm instances'
> + * kvm->arch.n_used_mmu_pages values.  We need a global,
> + * aggregate version in order to make the slab shrinker
> + * faster
> + */
> +static unsigned int kvm_total_used_mmu_pages;
>    

The variable needs to be at the top of the file.

> +static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, int nr)
> +{
> +	kvm->arch.n_used_mmu_pages += nr;
> +	kvm_total_used_mmu_pages += nr;
>    

Needs an atomic operation, since there's no global lock here.  To avoid 
bouncing this cacheline around, make the variable percpu and make 
readers take a sum across all cpus.  Side benefit is that you no longer 
need an atomic, just a local_t, which is considerably cheaper.
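
A minimal userspace sketch of the shape this could take, for illustration
only.  It is single-threaded and models neither local_t nor real per-cpu
storage; in the kernel you would use the per-cpu facilities (local_t as
suggested above, or the existing percpu_counter API) instead.  TOY_NR_CPUS,
the 64-byte cacheline size, and the toy_* names are all assumptions made up
for the example.

```c
#include <assert.h>

#define TOY_NR_CPUS	4	/* hypothetical fixed cpu count */
#define TOY_CACHELINE	64	/* assumed cacheline size in bytes */

/* One slot per cpu, padded so that slots do not share a cacheline. */
struct toy_percpu_slot {
	long count;
	char pad[TOY_CACHELINE - sizeof(long)];
};

static struct toy_percpu_slot toy_used_mmu_pages[TOY_NR_CPUS];

/* Writer: touches only the slot of the cpu it is running on. */
static void toy_mod_used_mmu_pages(int cpu, long nr)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	toy_used_mmu_pages[cpu].count += nr;
}

/*
 * Reader: sums across all cpus.  A single slot can be negative when a
 * page is accounted on one cpu and released on another; only the sum
 * is meaningful.
 */
static long toy_total_used_mmu_pages(void)
{
	long sum = 0;
	int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		sum += toy_used_mmu_pages[cpu].count;
	return sum;
}
```

The trade-off is that updates stay cheap and cpu-local while the reader
pays for a walk over all cpus, which fits a counter that is modified often
and read only when the shrinker asks for the cache size.
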

> +}
> +
>    


-- 
error compiling committee.c: too many arguments to function
