lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 5 Jan 2023 20:00:23 -0800
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Yu Zhao <yuzhao@...gle.com>
Cc:     Alexander Viro <viro@...iv.linux.org.uk>,
        Andrea Righi <andrea.righi@...onical.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Michael Larabel <michael@...haellarabel.com>,
        linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...gle.com
Subject: Re: [PATCH mm-unstable v2 1/2] mm: add vma_has_recency()

On Fri, 30 Dec 2022 14:52:51 -0700 Yu Zhao <yuzhao@...gle.com> wrote:

> This patch adds vma_has_recency() to indicate whether a VMA may
> exhibit temporal locality that the LRU algorithm relies on.
> 
> This function returns false for VMAs marked by VM_SEQ_READ or
> VM_RAND_READ. While the former flag indicates linear access, i.e., a
> special case of spatial locality, both flags indicate a lack of
> temporal locality, i.e., the reuse of an area within a relatively
> small duration.
> 
> "Recency" is chosen over "locality" to avoid confusion between
> temporal and spatial localities.
> 
> Before this patch, the active/inactive LRU only ignored the accessed
> bit from VMAs marked by VM_SEQ_READ. After this patch, the
> active/inactive LRU and MGLRU share the same logic: they both ignore
> the accessed bit if vma_has_recency() returns false.
> 
> For the active/inactive LRU, the following fio test showed a [6, 8]%
> increase in IOPS when randomly accessing mapped files under memory
> pressure.
> 
>   kb=$(awk '/MemTotal/ { print $2 }' /proc/meminfo)
>   kb=$((kb - 8*1024*1024))
> 
>   modprobe brd rd_nr=1 rd_size=$kb
>   dd if=/dev/zero of=/dev/ram0 bs=1M
> 
>   mkfs.ext4 /dev/ram0
>   mount /dev/ram0 /mnt/
>   swapoff -a
> 
>   fio --name=test --directory=/mnt/ --ioengine=mmap --numjobs=8 \
>       --size=8G --rw=randrw --time_based --runtime=10m \
>       --group_reporting
> 
> The discussion that led to this patch is here [1]. Additional test
> results are available in that thread.
> 
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -595,4 +595,12 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
>  #endif
>  }
>  
> +static inline bool vma_has_recency(struct vm_area_struct *vma)
> +{
> +	if (vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))
> +		return false;

I guess it's fairly obvious why these hints imply "doesn't have
recency".  But still, some comments wouldn't hurt!

> +	return true;
> +}
>  #endif
> diff --git a/mm/memory.c b/mm/memory.c
> index 4000e9f017e0..ee72badad847 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1402,8 +1402,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
>  						force_flush = 1;
>  					}
>  				}
> -				if (pte_young(ptent) &&
> -				    likely(!(vma->vm_flags & VM_SEQ_READ)))
> +				if (pte_young(ptent) && likely(vma_has_recency(vma)))

So we're newly using VM_RAND_READ for the legacy LRU?  Deliberate?  If
so, what are the effects and why?

>  					mark_page_accessed(page);
>  			}
>  			rss[mm_counter(page)]--;
> @@ -5148,8 +5147,8 @@ static inline void mm_account_fault(struct pt_regs *regs,
>  #ifdef CONFIG_LRU_GEN
>  static void lru_gen_enter_fault(struct vm_area_struct *vma)
>  {
> -	/* the LRU algorithm doesn't apply to sequential or random reads */
> -	current->in_lru_fault = !(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ));
> +	/* the LRU algorithm only applies to accesses with recency */
> +	current->in_lru_fault = vma_has_recency(vma);
>  }
>  
>  static void lru_gen_exit_fault(void)
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 8a24b90d9531..9abffdd63a6a 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -823,25 +823,14 @@ static bool folio_referenced_one(struct folio *folio,
>  		}
>  
>  		if (pvmw.pte) {
> -			if (lru_gen_enabled() && pte_young(*pvmw.pte) &&
> -			    !(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))) {
> +			if (lru_gen_enabled() && pte_young(*pvmw.pte)) {
>  				lru_gen_look_around(&pvmw);
>  				referenced++;
>  			}

I'd expect a call to vma_has_recency() here, but I'll trust you ;)


>  			if (ptep_clear_flush_young_notify(vma, address,
> -						pvmw.pte)) {
> -				/*
> -				 * Don't treat a reference through
> -				 * a sequentially read mapping as such.
> -				 * If the folio has been used in another mapping,
> -				 * we will catch it; if this other mapping is
> -				 * already gone, the unmap path will have set
> -				 * the referenced flag or activated the folio.
> -				 */
> -				if (likely(!(vma->vm_flags & VM_SEQ_READ)))
> -					referenced++;
> -			}
> +						pvmw.pte))
> +				referenced++;
>  		} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
>  			if (pmdp_clear_flush_young_notify(vma, address,
>  						pvmw.pmd))
> ...
>

The posix_fadvise() manpage will need an update, please.  Not now, but
if/when these changes are heading into mainline.  "merged into
mm-stable" would be a good trigger for this activity.

The legacy LRU has had used-once drop-behind for a long time (Johannes
touched it last).  Have you noticed whether that's all working OK?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ