Message-ID: <YUn0ikP4Gip3Yc6L@t490s>
Date:   Tue, 21 Sep 2021 11:04:42 -0400
From:   Peter Xu <peterx@...hat.com>
To:     Tiberiu Georgescu <tiberiu.georgescu@...anix.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Jonathan Corbet <corbet@....net>,
        "david@...hat.com" <david@...hat.com>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Ivan Teterevkov <ivan.teterevkov@...anix.com>,
        Florian Schmidt <flosch@...anix.com>,
        "Carl Waldspurger [C]" <carl.waldspurger@...anix.com>,
        Jonathan Davies <jond@...anix.com>
Subject: Re: [PATCH v2 1/1] Documentation: update pagemap with shmem
 exceptions

Hi, Tiberiu,

On Tue, Sep 21, 2021 at 08:52:32AM +0000, Tiberiu Georgescu wrote:
> I tested it some more, and it still looks like the mincore() syscall considers pages
> in the swap cache as "in memory". This is how I tested:
> 
> 1. Create a cgroup with 1M limit_in_bytes, and allow swapping
> 2. mmap 1024 pages (both shared and private present the same behaviour)
> 3. write to all pages in order
> 4. compare mincore output with pagemap output
> 
> This is an example of a usual mincore output in this scenario, shortened for
> coherency (4x8 instead of 16x64):
> 00000000
> 00000000
> 00001110   <- this bugs me
> 01111111
> 
> The last 7 bits are definitely marking pages present in memory, but there are
> some other bits set a little earlier. When comparing this output with the pagemap,
> indeed, there are 7 consecutive pages present, and the rest of them are
> swapped, including those 3 which are marked as present by mincore.
> At this point, I can only assume the bits in between are on the swap cache.
> 
> If you have another explanation, please share it with me. In the meanwhile,
> I will rework the doc patch, and see if there is any other way to differentiate
> clearly between the three types of pages. If not, I guess we'll stick to
> mincore() and a best-effort 5th step.

IIUC it could be because the pages are still in the swap cache, so
mincore() will return 1 for them too.
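
To make the comparison concrete, here is a minimal Python sketch of how one
page's mincore() byte and its 64-bit pagemap entry could be cross-checked.
The bit positions (63 = present, 62 = swapped) are the ones documented in
Documentation/admin-guide/mm/pagemap.rst; the function names and the
"maybe still in swap cache" label are only illustrative, and that label is
the guess discussed here, not something pagemap itself reports.

```python
import mmap
import struct

PM_PRESENT = 1 << 63  # pagemap bit 63: page present in RAM
PM_SWAP = 1 << 62     # pagemap bit 62: page swapped


def decode_pagemap_entry(entry):
    """Return (present, swapped) for one 64-bit pagemap entry."""
    return bool(entry & PM_PRESENT), bool(entry & PM_SWAP)


def classify(mincore_byte, entry):
    """Combine one mincore() vector byte with the matching pagemap entry."""
    # Only the least significant bit of each mincore() byte is defined.
    resident = bool(mincore_byte & 1)
    present, swapped = decode_pagemap_entry(entry)
    if present:
        return "present"
    if swapped and resident:
        # Hypothetical label: the PTE is a swap entry, yet mincore() says 1.
        return "swapped, maybe still in swap cache"
    if swapped:
        return "swapped"
    if resident:
        return "resident, not mapped here"
    return "none"


def read_pagemap_entries(pid, vaddr, npages, page_size=mmap.PAGESIZE):
    """Read npages raw entries from /proc/<pid>/pagemap starting at vaddr."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // page_size) * 8)  # one 8-byte entry per virtual page
        data = f.read(npages * 8)
    count = len(data) // 8
    return list(struct.unpack(f"={count}Q", data[:count * 8]))
```

Pages that come out as "swapped, maybe still in swap cache" would be the
candidates for the extra 1s in the mincore() output above.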

What swap device are you using?  I'm wildly guessing you're not using frontswap
like zram.  If that's the case, would you try zram?  That should flush the page
synchronously iiuc, so all the "suspicious 1s" above should go away.

To do that, you may first need to turn off your current swap:

        # swapoff -a

Then to configure zram you need:

        # modprobe zram
        # echo 4G > /sys/block/zram0/disksize
        # mkswap --label zram0 /dev/zram0
        # swapon --priority 100 /dev/zram0

Quoting from here:

        https://wiki.archlinux.org/title/Improving_performance#zram_or_zswap

Then you can try running the same test program again.
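
Before re-running, it may be worth confirming that zram0 really is the only
active swap device. A minimal Python sketch, assuming the usual /proc/swaps
layout (one header line, then Filename, Type, Size, Used and Priority
columns); the helper names are made up for illustration:

```python
def active_swap_devices(swaps_text):
    """Parse the contents of /proc/swaps into (filename, priority) pairs."""
    devices = []
    for line in swaps_text.splitlines()[1:]:  # skip the column header line
        fields = line.split()
        if len(fields) >= 5:
            devices.append((fields[0], int(fields[4])))
    return devices


def zram_is_only_swap(swaps_text):
    """True if at least one swap device is active and all of them are zram."""
    devices = active_swap_devices(swaps_text)
    return bool(devices) and all(
        name.startswith("/dev/zram") for name, _ in devices
    )


# Usage on a live system (not run here):
#   with open("/proc/swaps") as f:
#       print(zram_is_only_swap(f.read()))
```

If this returns False after the swapon above, some other swap device is
still enabled and the test may keep going through the swap cache.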

Thanks,

-- 
Peter Xu
