linux-kernel - Re: [PATCH 2.6.16.29 1/1] MM: enhance Linux swap subsystem

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4df04b840701110011p153c6d14t8d5f0ca584af9516@mail.gmail.com>
Date:	Thu, 11 Jan 2007 16:11:16 +0800
From:	"yunfeng zhang" <zyf.zeroos@...il.com>
To:	"Rik van Riel" <riel@...hat.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2.6.16.29 1/1] MM: enhance Linux swap subsystem

2007/1/11, Rik van Riel <riel@...hat.com>:
>
> Have you actually measured this?
>
> If your measurements saw any performance gains, with what kind
> of workload did they happen, how big were they and how do you
> explain those performance gains?
>
> How do you balance scanning the private memory with taking
> pages off the per-zone page lists?
>
> How do you deal with systems where some zones are really
> large and other zones are really small, eg. a 32 bit system
> with one 880MB zone and one 15.1GB zone?
>

You didn't mention another problem, a VMA maybe residents on multiple zones. To
multiple address space, multiple memory inode architecture, we can introduce
a new core object -- section which has several features
1) Section is used as the atomic unit to contain the pages of a VMA residing in
  the memory inode of the section.
2) When page migration occurs among different memory inodes, new secion should
  be set up to trace the pages.
3) Section can be scanned by the SwapDaemon of its memory inode directely.
4) All sections of a VMA are excluded with each other not overlayed.
5) VMA is made up of sections totally, but its section objects scatter on memory
  inodes.
So to the architecture, we can deploy swap subsystem on an
architecture-independent layer by section and scan pages batchly.

> If the benefits come mostly from better IO clustering, would
> it not be safer/less invasive to add swapout clustering of the
> same style that the BSD kernels have?
>

You mean add a new swap file/partition? I scan every UserSpaces by
init_mm::mmlist which has already been used by swap subsystem, and I've patched
in mm/swapfile.c.

> For your reference, the BSD kernels do swapout clustering like this:
> 1) select a page off the end of the pageout list
> 2) then scan the page table the page is in, to find
>     nearby pages that are also eligable for pageout
> 3) page them all out with one disk I/O operation
>
> The same could also be done for files.
>
> With peterz's dirty tracking (and possible dirty limiting)
> code in the kernel, this can be done without the kind of
> deadlocks that would have plagued earlier kernels, when
> trying to do IO trickery from the pageout path...
>

I've noticed FreeBSD maybe has a similar design as I mentioned, I'm reading
their code to whether they have another two features -- UnmappedPTE and dftlb in
my Documentation/vm_pps.txt.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/