lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6ab58270-c487-2a56-b522-ea5100edb13c@redhat.com>
Date:   Wed, 18 Aug 2021 20:13:47 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Tiberiu Georgescu <tiberiu.georgescu@...anix.com>
Cc:     Peter Xu <peterx@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Alistair Popple <apopple@...dia.com>,
        Ivan Teterevkov <ivan.teterevkov@...anix.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        Hugh Dickins <hughd@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Kirill A . Shutemov" <kirill@...temov.name>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        "Carl Waldspurger [C]" <carl.waldspurger@...anix.com>,
        Florian Schmidt <flosch@...anix.com>,
        Jonathan Davies <jond@...anix.com>
Subject: Re: [PATCH RFC 0/4] mm: Enable PM_SWAP for shmem with PTE_MARKER

>>
>>> I'm now wondering whether for Tiberiu's case mincore() can also be used.  It
>>> should just still be a bit slow because it'll look up the cache too, but it
>>> should work similarly like the original proposal.
> 
> I am afraid that the information returned by mincore is a little too vague to be of better help, compared to what the pagemap should provide in theory. I will have a look to see whether lseek on
> proc/map_files works as a "PM_SWAP" equivalent. However, the swap offset would still be missing.

Well, with mincore() you could at least decide "page is present" vs. 
"page is swapped or not existent". At least for making pageout decisions 
it shouldn't really matter, no? madvise(MADV_PAGEOUT) on a hole is a nop.

But I'm not 100% sure what exactly your use case is here and what you 
would really need, so you know best :)

>>
>> Very right, maybe we can just avoid tampering with pagemap on shmem completely (which sounds like an excellent idea to me) and document it as "On shared memory, we will never indicate SWAPPED if the pages have been swapped out. Further, PRESENT might be under-indicated: if a shared page is currently not mapped into the page table of a process.". I saw there was a related, proposed doc update, maybe we can finetune that.
>>
> We could take into consideration an alternative approach to retrieving the shared page info in user
> space, like storing it in sys/fs instead of per process. However, just leaving the pagemap functionality
> incomplete, and not providing an alternative to retrieve the missing information, does not seem right. Updating the docs with a "can't do" should be temporary, until an alternative or fix.
> 

As I stated before, making pagemap less broken is not a good idea IMHO. 
Either make it really correct or just leave it all broken -- and 
document that e.g., other interfaces (lseek) shall be used. It sounds 
like they exist and are good enough for CRUI.

And TBH, if other interfaces already exist and get the job done, I'm 
more than happy that we can avoid mixing more shmem stuff into pagemap 
and trying to compensate performance problems by introducing inconsistency.

If it has an fd and we can punch that into syscalls, we should much 
rather use that fd to lookup stuff then going via process page tables -- 
if possible of course (to be evaluated, because I haven't looked into 
the CRIU details and how they use lseek with anonymous shared memory).

> Also, I think you are talking about my own doc update patch[3]. If not, please share a link with your
> next reply.
> 
> [3] https://marc.info/?m=162878395426774

No, that's it.


-- 
Thanks,

David / dhildenb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ