Message-ID: <f391c30e66dc962826031b5ffa8ab44e.squirrel@webmail-b.css.fujitsu.com>
Date:	Sat, 30 May 2009 20:11:35 +0900 (JST)
From:	"KAMEZAWA Hiroyuki" <kamezawa.hiroyu@...fujitsu.com>
To:	"Andrew Morton" <akpm@...ux-foundation.org>
Cc:	"KAMEZAWA Hiroyuki" <kamezawa.hiroyu@...fujitsu.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	nishimura@....nes.nec.co.jp, balbir@...ux.vnet.ibm.com,
	hugh.dickins@...cali.co.uk, hannes@...xchg.org
Subject: Re: [PATCH 3/4] reuse unused swap entry if necessary

Andrew Morton wrote:
> On Thu, 28 May 2009 14:20:47 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
>
>> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
>>
>> Now, we can know a swap entry is just used as SwapCache via swap_map,
>> without looking up swap cache.
>>
>> Then, we have a chance to reuse swap-cache-only swap entries in
>> get_swap_pages().
>>
>> This patch tries to free swap-cache-only swap entries if swap is
>> not enough.
>> Note: We hit following path when swap_cluster code cannot find
>> a free cluster. Then, vm_swap_full() is not only condition to allow
>> the kernel to reclaim unused swap.
>>
>> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
>> ---
>>  mm/swapfile.c |   39 +++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 39 insertions(+)
>>
>> Index: new-trial-swapcount2/mm/swapfile.c
>> ===================================================================
>> --- new-trial-swapcount2.orig/mm/swapfile.c
>> +++ new-trial-swapcount2/mm/swapfile.c
>> @@ -73,6 +73,25 @@ static inline unsigned short make_swap_c
>>  	return ret;
>>  }
>>
>> +static int
>> +try_to_reuse_swap(struct swap_info_struct *si, unsigned long offset)
>> +{
>> +	int type = si - swap_info;
>> +	swp_entry_t entry = swp_entry(type, offset);
>> +	struct page *page;
>> +	int ret = 0;
>> +
>> +	page = find_get_page(&swapper_space, entry.val);
>> +	if (!page)
>> +		return 0;
>> +	if (trylock_page(page)) {
>> +		ret = try_to_free_swap(page);
>> +		unlock_page(page);
>> +	}
>> +	page_cache_release(page);
>> +	return ret;
>> +}
>
> This function could do with some comments explaining what it does, and
> why.  Also describing the semantics of its return value.
>
Ah, there are no comments ...

> afacit it's misnamed.  It doesn't 'reuse' anything.  It in fact tries
> to release a swap entry so that (presumably) its _caller_ can reuse the
> swap slot.
>
yes.

> The missing comment should also explain why this function is forced to
> use the nasty trylock_page().
>
> Why _is_ this function forced to use the nasty trylock_page()?
>
Because get_swap_page() is called from vmscan.c, and at that point the
caller already holds the page lock on a page. IIUC, nesting a blocking
lock_page() inside that is not safe, so trylock_page() is used instead.

I'll explain this in the next post.
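
As a rough illustration of the trylock reasoning above, here is a minimal
userspace sketch. It assumes a pthread mutex as a stand-in for the page
lock; struct fake_page and try_release_slot() are hypothetical names, not
kernel code. The point is only that the helper never blocks: if the lock
is already held, it gives up and reports that nothing was freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page and its lock; not the kernel's struct page. */
struct fake_page {
	pthread_mutex_t lock;
	bool in_swap_cache;
};

/*
 * Try to release the swap slot backing @page without ever blocking.
 * Returns 1 if the slot was released, 0 if the page lock was busy or
 * the page had no swap cache entry to release.
 */
static int try_release_slot(struct fake_page *page)
{
	int freed = 0;

	assert(page != NULL);

	/* The caller may already hold another lock, so only try, never wait. */
	if (pthread_mutex_trylock(&page->lock) != 0)
		return 0;

	if (page->in_swap_cache) {
		page->in_swap_cache = false;
		freed = 1;
	}
	pthread_mutex_unlock(&page->lock);
	return freed;
}
```

A return of 0 means "nothing was released", and the caller is expected to
move on to another slot rather than wait for the lock.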


>>  /*
>>   * We need this because the bdev->unplug_fn can sleep and we cannot
>>   * hold swap_lock while calling the unplug_fn. And swap_lock
>> @@ -294,6 +313,18 @@ checks:
>>  		goto no_page;
>>  	if (offset > si->highest_bit)
>>  		scan_base = offset = si->lowest_bit;
>> +
>> +	/* reuse swap entry of cache-only swap if not busy. */
>> +	if (vm_swap_full() && si->swap_map[offset] == SWAP_HAS_CACHE) {
>> +		int ret;
>> +		spin_unlock(&swap_lock);
>> +		ret = try_to_reuse_swap(si, offset);
>> +		spin_lock(&swap_lock);
>> +		if (ret)
>> +			goto checks; /* we released swap_lock. retry. */
>> +		goto scan; /* In some racy case */
>> +	}
>
> So..  what prevents an infinite (or long) busy loop here?  It appears
> that if try_to_reuse_swap() returned non-zero, it will have cleared
> si->swap_map[offset], so we don't rerun try_to_reuse_swap().  Yes?
>
yes.
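
To make the termination argument concrete, a small userspace simulation of
the retry logic may help. This is only a sketch under assumptions: the map
is a plain array, SLOT_HAS_CACHE is a placeholder flag value,
and try_release() and claim_slot() are hypothetical helpers. A successful
release clears the map entry, so the recheck of the same offset cannot
enter the release path again; a failed release moves on to the next slot.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_FREE      0x0000
#define SLOT_HAS_CACHE 0x8000	/* placeholder value for this sketch */

static int release_calls;	/* counts release attempts, for illustration */

/* Hypothetical release helper: clears the entry only when it succeeds. */
static int try_release(unsigned short *map, size_t offset, int succeed)
{
	release_calls++;
	if (!succeed)
		return 0;
	map[offset] = SLOT_FREE;
	return 1;
}

/*
 * Scan at most nslots entries starting at offset.
 * Returns the offset of the claimed slot, or -1 if none was found.
 */
static long claim_slot(unsigned short *map, size_t nslots, size_t offset,
		       int release_succeeds)
{
	size_t scanned = 0;

	assert(map != NULL && nslots > 0 && offset < nslots);

	while (scanned < nslots) {
		if (map[offset] == SLOT_HAS_CACHE) {
			/* Entry is now SLOT_FREE, so the recheck skips this branch. */
			if (try_release(map, offset, release_succeeds))
				continue;
			/* Busy slot: fall through and advance to the next one. */
		}
		if (map[offset] == SLOT_FREE) {
			map[offset] = 1;	/* claim it with a use count of 1 */
			return (long)offset;
		}
		offset = (offset + 1) % nslots;
		scanned++;
	}
	return -1;
}
```

In both outcomes each cache-only slot is offered to the release helper at
most once per scan, which is the property the question above is asking about.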

> `ret' is a poor choice of identifier.  It is usually used to hold the
> value which this function will be returning.  Ditto `retval'.  But that
> is not this variable's role in this case.  Perhaps a better name would
> be slot_was_freed or something.
>
Sure, I'll modify this patch to make it clearer.
Thank you for review!

-Kame


>>  	if (si->swap_map[offset])
>>  		goto scan;
>>
>> @@ -375,6 +406,10 @@ scan:
>>  			spin_lock(&swap_lock);
>>  			goto checks;
>>  		}
>> +		if (vm_swap_full() && si->swap_map[offset] == SWAP_HAS_CACHE) {
>> +			spin_lock(&swap_lock);
>> +			goto checks;
>> +		}
>>  		if (unlikely(--latency_ration < 0)) {
>>  			cond_resched();
>>  			latency_ration = LATENCY_LIMIT;
>> @@ -386,6 +421,10 @@ scan:
>>  			spin_lock(&swap_lock);
>>  			goto checks;
>>  		}
>> +		if (vm_swap_full() && si->swap_map[offset] == SWAP_HAS_CACHE) {
>> +			spin_lock(&swap_lock);
>> +			goto checks;
>> +		}
>>  		if (unlikely(--latency_ration < 0)) {
>>  			cond_resched();
>>  			latency_ration = LATENCY_LIMIT;
>

