[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHbLzkrShKORwU-Zrf5_5c6_o_iVwoXRTT+7ABZALLj7p0w-Ug@mail.gmail.com>
Date: Thu, 5 Aug 2021 16:08:56 -0700
From: Yang Shi <shy828301@...il.com>
To: Hugh Dickins <hughd@...gle.com>
Cc: Matthew Wilcox <willy@...radead.org>,
"Huang, Ying" <ying.huang@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>,
Linux MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Miaohe Lin <linmiaohe@...wei.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Minchan Kim <minchan@...nel.org>
Subject: Re: [PATCH] mm,shmem: Fix a typo in shmem_swapin_page()
On Tue, Aug 3, 2021 at 10:34 PM Hugh Dickins <hughd@...gle.com> wrote:
>
> On Tue, 3 Aug 2021, Matthew Wilcox wrote:
> > On Tue, Aug 03, 2021 at 04:14:38PM +0800, Huang, Ying wrote:
> > > Matthew Wilcox <willy@...radead.org> writes:
> > > > But I REALLY REALLY REALLY want a reproducer. Right now, I have a hard
> > > > time believing this, or any of the other races can really happen.
> > >
> > > I think the race is only theoretical too. Firstly, swapoff is a rare
> > > operations in practice; secondly, the race window is really small.
> >
> > So do something to provoke it. Widen the window. Put an msleep(1000)
> > between *pagep = NULL and the call to get_swap_device(). That's assuming
> > that the swapon/swapoff loop that I proposed doesn't work. Did you
> > try it?
>
> I've been doing that swapon/swapoff loop for years, while running kernel
> builds on tmpfs going out to swap; for better or worse on baremetal not VM.
>
> You're right that few will ever need that level of reliability; but it
> has caught problems from time to time, and I do insist on fixing them.
>
> I'm not as insistent as you on wanting a reproducer; and we all take pride
> sometimes in fixing ever more inconceivable bugs. I'm not against that,
> but it's easy to end up with a fix more dangerous than what it claims to
> fix, rather like with random newbie cleanups.
>
> I've never seen the swapoff race claimed by Miaohe, and don't expect to;
> but he's probably right, given the current code. I just dislike adding
> unnecessary complexity, and siting it in the wrong place (mm/shmem.c).
>
> Yang, is it possible that 5.1 commit 8fd2e0b505d1 ("mm: swap: check if
> swap backing device is congested or not") was actually developed and
> measured on 4.1 or earlier, which still had blk_set_queue_congested()?
I forgot the exact version, but definitely not 4.1 or earlier. Maybe
4.19 or earlier. I'm not familiar with how block layer detect
congestion, if the logic was changed, hence the optimization doesn't
stand anymore nowadays, I'm totally fine to remove it.
>
> I cannot explain its usefulness nowadays, on congested HDD anyway:
> Matthew is right that NFS and a few others may still be setting
> congested flags, but they're not what that commit was proposed for.
>
> If it is still useful, then I contend (but Huang Ying will disagree)
> that the get_swap_device() and put_swap_device() should be around
> 8fd2e0b505d1's inode_read_congested() block in swap_cluster_readahead(),
> not encroaching into mm/shmem.c.
>
> But if that block is not useful, then it should simply be removed (later).
>
> Hugh
Powered by blists - more mailing lists