linux-kernel - Re: [PATCH 12/30] mm: memory reserve management

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1217240994.7813.53.camel@penberg-laptop>
Date:	Mon, 28 Jul 2008 13:29:54 +0300
From:	Pekka Enberg <penberg@...helsinki.fi>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	netdev@...r.kernel.org, trond.myklebust@....uio.no,
	Daniel Lezcano <dlezcano@...ibm.com>,
	Neil Brown <neilb@...e.de>, mpm@...enic.com,
	cl@...ux-foundation.org
Subject: Re: [PATCH 12/30] mm: memory reserve management

Hi Peter,

On Mon, 2008-07-28 at 12:17 +0200, Peter Zijlstra wrote:
> On Mon, 2008-07-28 at 13:06 +0300, Pekka Enberg wrote:
> > Hi Peter,
> > 
> > On Thu, 2008-07-24 at 16:00 +0200, Peter Zijlstra wrote:
> > > +/*
> > > + * alloc wrappers
> > > + */
> > > +
> > 
> > Hmm, I'm not sure I like the use of __kmalloc_track_caller() (even
> > though you do add the wrappers for SLUB). The functions really are SLAB
> > internals so I'd prefer to see kmalloc_reserve() moved to the
> > allocators.
> 
> See below..
> 
> > > +void *___kmalloc_reserve(size_t size, gfp_t flags, int node, void *ip,
> > > +			 struct mem_reserve *res, int *emerg)
> > > +{
> > 
> > This function could use some comments...
> 
> Yes, my latest does have those.. let me paste the relevant bit:
> 
> +void *___kmalloc_reserve(size_t size, gfp_t flags, int node, void *ip,
> +                        struct mem_reserve *res, int *emerg)
> +{
> +       void *obj;
> +       gfp_t gfp;
> +
> +       /*
> +        * Try a regular allocation, when that fails and we're not entitled
> +        * to the reserves, fail.
> +        */
> +       gfp = flags | __GFP_NOMEMALLOC | __GFP_NOWARN;
> +       obj = __kmalloc_node_track_caller(size, gfp, node, ip);
> +
> +       if (obj || !(gfp_to_alloc_flags(flags) & ALLOC_NO_WATERMARKS))
> +               goto out;
> +
> +       /*
> +        * If we were given a reserve to charge against, try that.
> +        */
> +       if (res && !mem_reserve_kmalloc_charge(res, size)) {
> +               /*
> +                * If we failed to charge and we're not allowed to wait for
> +                * it to succeed, bail.
> +                */
> +               if (!(flags & __GFP_WAIT))
> +                       goto out;
> +
> +               /*
> +                * Wait for a successfull charge against the reserve. All
> +                * uncharge operations against this reserve will wake us up.
> +                */
> +               wait_event(res->waitqueue,
> +                               mem_reserve_kmalloc_charge(res, size));
> +
> +               /*
> +                * After waiting for it, again try a regular allocation.
> +                * Pressure could have lifted during our sleep. If this
> +                * succeeds, uncharge the reserve.
> +                */
> +               obj = __kmalloc_node_track_caller(size, gfp, node, ip);
> +               if (obj) {
> +                       mem_reserve_kmalloc_charge(res, -size);
> +                       goto out;
> +               }
> +       }
> +
> +       /*
> +        * Regular allocation failed, and we've successfully charged our
> +        * requested usage against the reserve. Do the emergency allocation.
> +        */
> +       obj = __kmalloc_node_track_caller(size, flags, node, ip);
> +       WARN_ON(!obj);
> +       if (emerg)
> +               *emerg |= 1;
> +
> +out:
> +       return obj;
> +}

Heh, indeed, looks much better :-).

> 
> > > +	void *obj;
> > > +	gfp_t gfp;
> > > +
> > > +	gfp = flags | __GFP_NOMEMALLOC | __GFP_NOWARN;
> > > +	obj = __kmalloc_node_track_caller(size, gfp, node, ip);
> > > +
> > > +	if (obj || !(gfp_to_alloc_flags(flags) & ALLOC_NO_WATERMARKS))
> > > +		goto out;
> > > +
> > > +	if (res && !mem_reserve_kmalloc_charge(res, size)) {
> > > +		if (!(flags & __GFP_WAIT))
> > > +			goto out;
> > > +
> > > +		wait_event(res->waitqueue,
> > > +				mem_reserve_kmalloc_charge(res, size));
> > > +
> > > +		obj = __kmalloc_node_track_caller(size, gfp, node, ip);
> > > +		if (obj) {
> > > +			mem_reserve_kmalloc_charge(res, -size);
> > 
> > Why do we discharge here?
> 
> because a regular allocation succeeded.
> 
> > > +			goto out;
> > > +		}
> > 
> > If the allocation fails, we try again (but nothing has changed, right?).
> > Why?
> 
> Note the different allocation flags for the two allocations.

Uhm, yeah. I missed that.

> > > +	}
> > > +
> > > +	obj = __kmalloc_node_track_caller(size, flags, node, ip);
> > > +	WARN_ON(!obj);
> > 
> > Why don't we discharge from the reserve here if !obj?
> 
> Well, this allocation should never fail:
>   - we reserved memory
>   - we accounted/throttle its usage
> 
> Thus this allocation should always succeed.

But if it *does* fail, it doesn't help that we mess up the reservation
counts, no?

> > > +	if (emerg)
> > > +		*emerg |= 1;
> > > +
> > > +out:
> > > +	return obj;
> > > +}
> > > +
> > > +void __kfree_reserve(void *obj, struct mem_reserve *res, int emerg)
> > 
> > I don't see 'emerg' used anywhere.
> 
> Patch 19/30 has:
> 
> -       data = kmalloc_node_track_caller(size + sizeof(struct skb_shared_info),
> -                       gfp_mask, node);
> +       data = kmalloc_reserve(size + sizeof(struct skb_shared_info),
> +                       gfp_mask, node, &net_skb_reserve, &emergency);
>         if (!data)
>                 goto nodata;
> 
> @@ -205,6 +211,7 @@ struct sk_buff *__alloc_skb(unsigned int
>          * the tail pointer in struct sk_buff!
>          */
>         memset(skb, 0, offsetof(struct sk_buff, tail));
> +       skb->emergency = emergency;
>         skb->truesize = size + sizeof(struct sk_buff);
>         atomic_set(&skb->users, 1);
>         skb->head = data;
> 
> > > +{
> > > +	size_t size = ksize(obj);
> > > +
> > > +	kfree(obj);
> > 
> > We're trying to get rid of kfree() so I'd __kfree_reserve() could to
> > mm/sl?b.c. Matt, thoughts?
> 
> My issue with moving these helpers into mm/sl?b.c is that it would
> require replicating all this code 3 times. Even though the functionality
> is (or should) be invariant to the actual slab implementation.

Right, I guess we could just rename ksize() to something else then and
keep it internal to mm/.

> > > +	/*
> > > +	 * ksize gives the full allocated size vs the requested size we used to
> > > +	 * charge; however since we round up to the nearest power of two, this
> > > +	 * should all work nicely.
> > > +	 */
> > > +	mem_reserve_kmalloc_charge(res, -size);
> > > +}
> > > 
> > 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/