[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9a79ef1a-96e0-1fd7-97e8-ef854b08524d@redhat.com>
Date: Fri, 13 Dec 2019 11:02:46 -0500
From: Waiman Long <longman@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Will Deacon <will.deacon@....com>,
linux-kernel@...r.kernel.org, Bart Van Assche <bvanassche@....org>
Subject: Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
On 12/13/19 5:50 AM, Peter Zijlstra wrote:
> On Fri, Dec 13, 2019 at 11:25:25AM +0100, Peter Zijlstra wrote:
>
>> void push_block(struct chain_block **bp, struct chain_block *b)
>> {
>> b->next = *bp;
>> *bp = b;
>> }
>>
>> /* could contemplate ilog2() buckets */
>> int size2bucket(int size)
>> {
>> return size >= MAX_BUCKET ? 0 : size;
>> }
> If you make the allocation granularity 4 entries, then you can have
> push_block() store the chain_block * at the start of every free entry,
> which would enable merging adjecent blocks.
>
> That is, if I'm not mistaken these u16 chain entries encode the class
> index (per lock_chain_get_class()). And since the lock_classes array is
> MAX_LOCKDEP_KEYS, or 13 bits, big, we have the MSB of each entry spare.
>
> union {
> struct {
> u16 hlock[4];
> }
> u64 addr;
> } ponies;
>
> So if we make the rule that:
>
> !(idx % 4) && (((union ponies *)chain_hlock[idx])->addr & BIT_ULL(63))
>
> encodes a free block at said addr, then we should be able to detect if:
>
> chain_hlock[b->base + b->size]
>
> is also free and merge the blocks.
>
> (we just have to fix up the MSB of the address, not all arches have
> negative addresses for kernel space)
>
That is an interesting idea. It will eliminate the need of a separate
array to track the free chain_hlocks. However, if there are n chains
available, it will waste about 3n bytes of storage, on average.
I have a slightly different idea. I will enforce a minimum allocation
size of 2. For a free block, the first 2 hlocks for each allocation
block will store a 32-bit integer (hlock[0] << 16)|hlock[1]:
Bit 31: always 1
Bits 24-30: block size
Bits 0-23: index to the next free block.
In this way, the wasted space will be k bytes where k is the number of
1-entry chains. I don't think merging adjacent blocks will be that
useful at this point. We can always add this capability later on if it
is found to be useful.
Cheers,
Longman
Powered by blists - more mailing lists