Date:	Thu, 18 Jun 2009 00:41:49 +0200
From:	Johannes Weiner <hannes@...xchg.org>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Hugh Dickins <hugh.dickins@...cali.org.uk>,
	Andi Kleen <andi@...stfloor.org>,
	Wu Fengguang <fengguang.wu@...el.com>,
	Minchan Kim <minchan.kim@...il.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [patch v3] swap: virtual swap readahead

On Thu, Jun 11, 2009 at 02:31:22PM +0900, KAMEZAWA Hiroyuki wrote:
> On Tue, 9 Jun 2009 21:01:28 +0200
> Johannes Weiner <hannes@...xchg.org> wrote:
> > [resend with lists cc'd, sorry]
> > 
> > +static int swap_readahead_ptes(struct mm_struct *mm,
> > +			unsigned long addr, pmd_t *pmd,
> > +			swp_entry_t *entries,
> > +			unsigned long cluster)
> > +{
> > +	unsigned long window, min, max, limit;
> > +	spinlock_t *ptl;
> > +	pte_t *ptep;
> > +	int i, nr;
> > +
> > +	window = cluster << PAGE_SHIFT;
> > +	min = addr & ~(window - 1);
> > +	max = min + cluster;
> 
> Johannes, I wonder there is no reason to use "alignment".

I am wondering too.  I dug into the archives but the alignment
comes from a change older than what history.git documents, so I wasn't
able to find a written-down justification for this.

> I think we just need to read "nearby" pages. Then, this function's
> scan range should be
> 
> 	[addr - window/2, addr + window/2)
> or some.
> 
> And here, too
> > +	if (!entries)	/* XXX: shmem case */
> > +		return swapin_readahead_phys(entry, gfp_mask, vma, addr);
> > +	pmin = swp_offset(entry) & ~(cluster - 1);
> > +	pmax = pmin + cluster;
> 
> pmin = swp_offset(entry) - cluster/2.
> pmax = swp_offset(entry) + cluster/2.
> 
> I'm sorry if I miss a reason for using "alignment".

Perhaps someone else knows a good reason for it, but I think it could
even be harmful.

Chances are that several processes fault around the same slots
simultaneously.  By letting them all start at the same aligned offset
we maximize the race between them: they all allocate pages for the
same slots concurrently.

Placing the window unaligned reduces this overlap, so it sounds like
a good idea.
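
To make the overlap argument concrete, here is a rough userspace sketch,
not kernel code.  The struct slot_window type and the three helpers are
hypothetical names invented for this illustration.  It assumes cluster is
a power of two, and clamping the centered window at slot 0 is my own
assumption; neither the patch nor the suggestion above says how the lower
edge should be handled.

```c
#include <assert.h>

/* Hypothetical helper type: a half-open range [start, end) of swap slots. */
struct slot_window {
	unsigned long start;
	unsigned long end;
};

/* Aligned window, as in the patch: round the offset down to a cluster boundary. */
static struct slot_window aligned_window(unsigned long offset,
					 unsigned long cluster)
{
	struct slot_window w;

	/* The mask below only works for a power-of-two cluster size. */
	assert(cluster && (cluster & (cluster - 1)) == 0);
	w.start = offset & ~(cluster - 1);
	w.end = w.start + cluster;
	return w;
}

/*
 * Centered window, as suggested: [offset - cluster/2, offset + cluster/2).
 * Clamping the start at 0 is an assumption made for this sketch.
 */
static struct slot_window centered_window(unsigned long offset,
					  unsigned long cluster)
{
	struct slot_window w;
	unsigned long half = cluster / 2;

	w.start = offset > half ? offset - half : 0;
	w.end = offset + half;
	return w;
}

/* Number of slots that two windows have in common. */
static unsigned long window_overlap(struct slot_window a,
				    struct slot_window b)
{
	unsigned long lo = a.start > b.start ? a.start : b.start;
	unsigned long hi = a.end < b.end ? a.end : b.end;

	return hi > lo ? hi - lo : 0;
}
```

With a cluster of 8, faults at slots 9 and 14 both get the aligned
window [8, 16), so all 8 slots are contended.  The centered windows are
[5, 13) and [10, 18), which share only 3 slots.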

It would increase the amount of readahead done even more, though, and
Fengguang already measured degradation in IO latency with my patch, so
this probably needs more changes to work well.