Message-ID: <50EAE015.1000702@linux.vnet.ibm.com>
Date:	Mon, 07 Jan 2013 08:47:49 -0600
From:	Seth Jennings <sjenning@...ux.vnet.ibm.com>
To:	Dan Magenheimer <dan.magenheimer@...cle.com>
CC:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Nitin Gupta <ngupta@...are.org>,
	Minchan Kim <minchan@...nel.org>,
	Konrad Wilk <konrad.wilk@...cle.com>,
	Robert Jennings <rcj@...ux.vnet.ibm.com>,
	Jenifer Hopper <jhopper@...ibm.com>,
	Mel Gorman <mgorman@...e.de>,
	Johannes Weiner <jweiner@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	Larry Woodman <lwoodman@...hat.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, devel@...verdev.osuosl.org,
	Dave Hansen <dave@...ux.vnet.ibm.com>
Subject: Re: [PATCH 7/8] zswap: add to mm/

On 01/04/2013 04:45 PM, Dan Magenheimer wrote:
>> From: Seth Jennings [mailto:sjenning@...ux.vnet.ibm.com]
>> Subject: Re: [PATCH 7/8] zswap: add to mm/
>>
>> On 01/03/2013 04:33 PM, Dan Magenheimer wrote:
>>>> From: Seth Jennings [mailto:sjenning@...ux.vnet.ibm.com]
>>>>
>>>> However, once the flushing code was introduced and could free an entry
>>>> from the zswap_fs_store() path, it became necessary to add a per-entry
>>>> refcount to make sure that the entry isn't freed while another code
>>>> path was operating on it.
>>>
>>> Hmmm... doesn't the refcount at least need to be an atomic_t?
>>
>> An entry's refcount is only ever changed under the tree lock, so
>> making them atomic_t would be redundantly atomic.
> 
> Maybe I'm missing something still but then I think you also
> need to evaluate and act on the refcount (not just read it) while
> your treelock is held.  I.e., in:
> 
>> +		/* page is already in the swap cache, ignore for now */
>> +		spin_lock(&tree->lock);
>> +		refcount = zswap_entry_put(entry);
>> +		spin_unlock(&tree->lock);
>> +
>> +		if (likely(refcount))
>> +			return 0;
>> +
>> +		/* if the refcount is zero, invalidate must have come in */
>> +		/* free */
>> +		zs_free(tree->pool, entry->handle);
>> +		zswap_entry_cache_free(entry);
>> +		atomic_dec(&zswap_stored_pages);
> 
> the entry's refcount may be changed by another processor
> immediately after the unlock, and then the "if (refcount)"
> is testing a stale value and you will get (I think) a memory leak.

It is true that the refcount could be stale by the time we do the
check. However, all functions that do a zswap_entry_put(), which
potentially drops the refcount to 0, check the refcount and free the
entry if they need to.  Every function whose zswap_entry_put() drops
the refcount to 0 also ensures, before releasing the lock, that no
other thread can gain a reference to the entry through either the tree
or the LRU list.  That way the cleanup can happen outside the lock
without the risk of someone gaining access to the entry being freed in
the meantime.
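
To make the invariant concrete, here is a minimal userspace sketch of
the pattern, not the zswap code itself.  It assumes the tree holds one
reference of its own, so the count can only reach 0 after the entry
has been unlinked under the lock.  The names (struct tree, tree_store,
tree_get, tree_invalidate, entry_put), the one-slot "tree" and the
pthread mutex standing in for the spinlock are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the zswap structures. */
struct entry {
	int refcount;		/* only ever changed with tree->lock held */
};

struct tree {
	pthread_mutex_t lock;
	struct entry *slot;	/* one-slot "tree" keeps the sketch short */
};

static void tree_init(struct tree *t)
{
	pthread_mutex_init(&t->lock, NULL);
	t->slot = NULL;
}

/* Insert a new entry; the tree itself holds the initial reference. */
static struct entry *tree_store(struct tree *t)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	e->refcount = 1;
	pthread_mutex_lock(&t->lock);
	if (t->slot) {			/* sketch only handles one entry */
		pthread_mutex_unlock(&t->lock);
		free(e);
		return NULL;
	}
	t->slot = e;
	pthread_mutex_unlock(&t->lock);
	return e;
}

/* Look up the entry and take a reference, all under the lock. */
static struct entry *tree_get(struct tree *t)
{
	struct entry *e;

	pthread_mutex_lock(&t->lock);
	e = t->slot;
	if (e)
		e->refcount++;
	pthread_mutex_unlock(&t->lock);
	return e;
}

/* Drop a reference; returns true if this call freed the entry. */
static bool entry_put(struct tree *t, struct entry *e)
{
	int refcount;

	pthread_mutex_lock(&t->lock);
	assert(e->refcount > 0);
	refcount = --e->refcount;
	pthread_mutex_unlock(&t->lock);

	/*
	 * The local value may be stale if it is nonzero, but a zero is
	 * final: the tree's reference is gone, so the entry was already
	 * unlinked and nobody can look it up again.
	 */
	if (refcount)
		return false;
	free(e);
	return true;
}

/* Unlink the entry and drop the tree's reference in one locked step. */
static bool tree_invalidate(struct tree *t)
{
	struct entry *e;
	int refcount;

	pthread_mutex_lock(&t->lock);
	e = t->slot;
	if (!e) {
		pthread_mutex_unlock(&t->lock);
		return false;
	}
	t->slot = NULL;			/* no new references after this */
	refcount = --e->refcount;
	pthread_mutex_unlock(&t->lock);

	if (refcount)
		return false;		/* the last holder will free it */
	free(e);
	return true;
}
```

In this sketch, whichever caller sees its own decrement return 0 is
the only one that frees the entry, so a stale nonzero value in another
thread leads to neither a leak nor a double free.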

<snip>
> A nit: Even I, steeped in tmem terminology, was confused by
> your use of "fs"... to nearly all readers it will
> be translated as "filesystem" which is mystifying.
> Just spell it out "frontswap", even if it causes a few
> lines to be wrapped.

Sounds good. I'll queue it up.

Thanks,
Seth
