lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 8 Apr 2024 12:24:10 +0200
From: David Hildenbrand <david@...hat.com>
To: Ryan Roberts <ryan.roberts@....com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Matthew Wilcox <willy@...radead.org>, Huang Ying <ying.huang@...el.com>,
 Gao Xiang <xiang@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
 Yang Shi <shy828301@...il.com>, Michal Hocko <mhocko@...e.com>,
 Kefeng Wang <wangkefeng.wang@...wei.com>, Barry Song <21cnbao@...il.com>,
 Chris Li <chrisl@...nel.org>, Lance Yang <ioworker0@...il.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 2/6] mm: swap: free_swap_and_cache_nr() as batched
 free_swap_and_cache()

On 08.04.24 12:07, Ryan Roberts wrote:
> On 08/04/2024 10:43, David Hildenbrand wrote:
>>
>>>>> +
>>>>> +/**
>>>>> + * swap_pte_batch - detect a PTE batch for a set of contiguous swap entries
>>>>> + * @start_ptep: Page table pointer for the first entry.
>>>>> + * @max_nr: The maximum number of table entries to consider.
>>>>> + * @entry: Swap entry recovered from the first table entry.
>>>>> + *
>>>>> + * Detect a batch of contiguous swap entries: consecutive (non-present) PTEs
>>>>> + * containing swap entries all with consecutive offsets and targeting the same
>>>>> + * swap type.
>>>>> + *
>>>>
>>>> Likely you should document that any swp pte bits are ignored? ()
>>>
>>> Sorry I don't understand this comment. I thought any non-none, non-present PTE
>>> was always considered to contain only a "swap entry" and a swap entry consists
>>> of a "type" and an "offset" only. (and its a special "non-swap" swap entry if
>>> type > SOME_CONSTANT) Are you saying there are additional fields in the PTE that
>>> are not part of the swap entry?
>>
>>
>> pte_swp_soft_dirty()
>> pte_swp_clear_exclusive()
>> pte_swp_uffd_wp()
>>
>> Are PTE bits used for swp PTE.
> 
> Ahh wow. mind blown. Looks like a massive hack... why not store them in the
> arch-independent swap entry, rather than have them squat independently in the PTE?

I think that was discussed at some point, but it not only requires quite 
some churn to change it (all that swp entry code is a mess), these bits 
are conceptually really per PTE and not something you would want to pass 
into actual swap handling code that couldn't care less about all of these.

I looked at this when I added SWP exclusive, but accidentally losing the 
SWP exclusive marker when converting back and forth made me go the PTE 
route instead.

Then, the available PTE bits are a bit scattered on some architectures, 
and converting entry<->PTE gets even uglier if we don't want to "lose" 
available bits.

Probably the whole "unsigned long swp_entry" stuff should be replaced by 
a proper struct where we could more easily add flags and have the arch 
code handle the conversion to the PTE just once. The arch-specific 
swp_entry stuff is another nasty thing IMHO.

> 
> OK, my implementation is buggy. I'll re-spin to fix this.
> 
> 
>>
>> There is also dirty/young for migration entries, but that's not of a concern
>> here, because we stop for non_swap_entry().
> 
> Looks like these are part of the offset field in the arch-independent swap entry
> - much cleaner ;-).

Note that it only applies to migration entries, and only when we have 
some spare bits due to PFN < offset.

-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ