Message-ID: <569CD817.7090309@suse.cz>
Date:	Mon, 18 Jan 2016 13:18:31 +0100
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Minchan Kim <minchan@...nel.org>
Cc:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
	Junil Lee <junil0814.lee@....com>, ngupta@...are.org,
	akpm@...ux-foundation.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] zsmalloc: fix migrate_zspage-zs_free race condition

On 01/18/2016 09:20 AM, Minchan Kim wrote:
> On Mon, Jan 18, 2016 at 08:54:07AM +0100, Vlastimil Babka wrote:
>> On 18.1.2016 8:39, Sergey Senozhatsky wrote:
>>> On (01/18/16 16:11), Minchan Kim wrote:
>>> [..]
>>>>> so, even if clear_bit_unlock/test_and_set_bit_lock do smp_mb or
>>>>> barrier(), there is no corresponding barrier from record_obj()->WRITE_ONCE().
>>>>> so I don't think WRITE_ONCE() will help the compiler, or am I missing
>>>>> something?
>>>>
>>>> We need two things
>>>> 1. compiler barrier.
>>>> 2. memory barrier.
>>>>
>>>> As compiler barrier, WRITE_ONCE works to prevent store tearing here
>>>> by compiler.
>>>> However, if we omit unpin_tag here, we lose memory barrier(e,g, smp_mb)
>>>> so another CPU could see stale data caused CPU memory reordering.
>>>
>>> oh... good find! lost release semantic of unpin_tag()...
>>
>> Ah, release semantic, good point indeed. OK then we need the v2 approach again,
>> with WRITE_ONCE() in record_obj(). Or some kind of record_obj_release() with
>> release semantic, which would be a bit more effective, but I guess migration is
>> not that critical path to be worth introducing it.
>
> WRITE_ONCE in record_obj would add more memory operations in obj_malloc

A simple WRITE_ONCE would just add a compiler barrier. What you suggest 
below does indeed add more operations, and those are actually needed only 
in migration. What I suggested is the v2 approach of adding the PIN bit 
to obj before calling record_obj(), *and* simply doing a WRITE_ONCE in 
record_obj() to make sure the PIN bit is already part of the value that is 
written to the handle, rather than being applied as a second, separate write.
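
To make that concrete, here is a minimal userspace sketch of the v2 idea, not the actual patch. WRITE_ONCE() is approximated by a volatile store, HANDLE_PIN_BIT is assumed to be bit 0, and migrate_record() is a hypothetical name for the migration-side caller. It also assumes a pointer fits in an unsigned long, as it does on Linux.

```c
/* Userspace stand-in for the kernel's WRITE_ONCE(): a single volatile store. */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define HANDLE_PIN_BIT 0	/* assumed bit position of the pin/lock bit */

/* Store obj into the handle with one write; no read of the old value. */
static void record_obj(unsigned long handle, unsigned long obj)
{
	WRITE_ONCE(*(unsigned long *)handle, obj);
}

/*
 * Hypothetical migration-side caller: the handle is pinned by the caller,
 * so the PIN bit is OR-ed into the value *before* record_obj() stores it.
 */
static void migrate_record(unsigned long handle, unsigned long new_obj)
{
	new_obj |= 1UL << HANDLE_PIN_BIT;
	record_obj(handle, new_obj);
}
```

The point of this shape is that record_obj() stays a plain store for the zs_malloc() path, and only the migration path pays for keeping the bit.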

> but I don't feel it's too heavy in this phase so,

I'm afraid it's dangerous for the usage of record_obj() in zs_malloc() 
where the handle is freshly allocated by alloc_handle(). Are we sure the 
bit is not set?

The code in alloc_handle() is:
         return (unsigned long)kmem_cache_alloc(pool->handle_cachep,
                 pool->flags & ~__GFP_HIGHMEM);

There's no explicit __GFP_ZERO, so the handles are not guaranteed to be 
allocated empty? And expecting all zpool users to include __GFP_ZERO in 
flags would be too subtle and error prone.
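
A small userspace sketch of the concern, under loud assumptions: alloc_handle_unzeroed() is a hypothetical stand-in for alloc_handle() without __GFP_ZERO, and stale slab contents are modelled by filling the slot with 0xff. HANDLE_PIN_BIT is assumed to be bit 0, and record_obj_keep_pin() mirrors the read-and-preserve variant proposed below.

```c
#include <stdlib.h>
#include <string.h>

#define HANDLE_PIN_BIT 0	/* assumed bit position of the pin/lock bit */

/* Variant that preserves whatever PIN bit is already stored in the handle. */
static void record_obj_keep_pin(unsigned long handle, unsigned long obj)
{
	unsigned long locked = *(unsigned long *)handle & (1UL << HANDLE_PIN_BIT);

	*(volatile unsigned long *)handle = obj | locked;
}

/*
 * Hypothetical stand-in for an allocation that is not zeroed: the slot
 * is filled with 0xff to model leftover data from a previous user.
 */
static unsigned long alloc_handle_unzeroed(void)
{
	unsigned long *slot = malloc(sizeof(*slot));

	if (!slot)
		return 0;
	memset(slot, 0xff, sizeof(*slot));
	return (unsigned long)slot;
}

/* Report whether the PIN bit is set in the handle's stored value. */
static int handle_is_pinned(unsigned long handle)
{
	return (*(unsigned long *)handle >> HANDLE_PIN_BIT) & 1;
}
```

With this model, a fresh handle passed through record_obj_keep_pin() ends up looking pinned although nobody ever called pin_tag() on it, which is the situation I'm worried about in zs_malloc().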

> How about this? Junil, Could you resend patch if others agree this?
> Thanks.
>
> +/*
> + * record_obj updates handle's value to free_obj and it shouldn't
> + * invalidate lock bit(ie, HANDLE_PIN_BIT) of handle, otherwise
> + * it breaks synchronization using pin_tag(e,g, zs_free) so let's
> + * keep the lock bit.
> + */
>   static void record_obj(unsigned long handle, unsigned long obj)
>   {
> -	*(unsigned long *)handle = obj;
> +	int locked = (*(unsigned long *)handle) & (1<<HANDLE_PIN_BIT);
> +	unsigned long val = obj | locked;
> +
> +	/*
> +	 * WRITE_ONCE could prevent store tearing like below
> +	 * *(unsigned long *)handle = free_obj
> +	 * *(unsigned long *)handle |= locked;
> +	 */
> +	WRITE_ONCE(*(unsigned long *)handle, val);
>   }
>
>
>
>>
>> Thanks,
>> Vlastimil
>>
>>>
>>> 	-ss
>>>
>>
