lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHbLzkrJ9YZYUS+T64L9vFzg77qVg2SZ4DBGC013kgGTRvpieA@mail.gmail.com>
Date:   Tue, 12 Oct 2021 12:17:33 -0700
From:   Yang Shi <shy828301@...il.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     HORIGUCHI NAOYA(堀口 直也) 
        <naoya.horiguchi@....com>, Hugh Dickins <hughd@...gle.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <willy@...radead.org>,
        Oscar Salvador <osalvador@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux MM <linux-mm@...ck.org>,
        Linux FS-devel Mailing List <linux-fsdevel@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [v3 PATCH 4/5] mm: shmem: don't truncate page if memory failure happens

On Mon, Oct 11, 2021 at 6:57 PM Peter Xu <peterx@...hat.com> wrote:
>
> On Thu, Sep 30, 2021 at 02:53:10PM -0700, Yang Shi wrote:
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 88742953532c..75c36b6a405a 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -2456,6 +2456,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
> >       struct inode *inode = mapping->host;
> >       struct shmem_inode_info *info = SHMEM_I(inode);
> >       pgoff_t index = pos >> PAGE_SHIFT;
> > +     int ret = 0;
> >
> >       /* i_rwsem is held by caller */
> >       if (unlikely(info->seals & (F_SEAL_GROW |
> > @@ -2466,7 +2467,17 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
> >                       return -EPERM;
> >       }
> >
> > -     return shmem_getpage(inode, index, pagep, SGP_WRITE);
> > +     ret = shmem_getpage(inode, index, pagep, SGP_WRITE);
> > +
> > +     if (*pagep) {
> > +             if (PageHWPoison(*pagep)) {
> > +                     unlock_page(*pagep);
> > +                     put_page(*pagep);
> > +                     ret = -EIO;
> > +             }
> > +     }
> > +
> > +     return ret;
> >  }
> >
> >  static int
> > @@ -2555,6 +2566,11 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
> >                       unlock_page(page);
> >               }
> >
> > +             if (page && PageHWPoison(page)) {
> > +                     error = -EIO;
> > +                     break;
> > +             }
> > +
> >               /*
> >                * We must evaluate after, since reads (unlike writes)
> >                * are called without i_rwsem protection against truncate
>
> [...]
>
> > @@ -4193,6 +4216,10 @@ struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
> >               page = ERR_PTR(error);
> >       else
> >               unlock_page(page);
> > +
> > +     if (PageHWPoison(page))
> > +             page = ERR_PTR(-EIO);
> > +
> >       return page;
> >  #else
> >       /*
> > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> > index 7a9008415534..b688d5327177 100644
> > --- a/mm/userfaultfd.c
> > +++ b/mm/userfaultfd.c
> > @@ -233,6 +233,11 @@ static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
> >               goto out;
> >       }
> >
> > +     if (PageHWPoison(page)) {
> > +             ret = -EIO;
> > +             goto out_release;
> > +     }
> > +
> >       ret = mfill_atomic_install_pte(dst_mm, dst_pmd, dst_vma, dst_addr,
> >                                      page, false, wp_copy);
> >       if (ret)
> > --
> > 2.26.2
> >
>
> These are shmem_getpage_gfp() call sites:
>
>   shmem_getpage[151]             return shmem_getpage_gfp(inode, index, pagep, sgp,
>   shmem_fault[2112]              err = shmem_getpage_gfp(inode, vmf->pgoff, &vmf->page, SGP_CACHE,
>   shmem_read_mapping_page_gfp[4188] error = shmem_getpage_gfp(inode, index, &page, SGP_CACHE,
>
> These are further shmem_getpage() call sites:
>
>   collapse_file[1735]            if (shmem_getpage(mapping->host, index, &page,
>   shmem_undo_range[965]          shmem_getpage(inode, start - 1, &page, SGP_READ);
>   shmem_undo_range[980]          shmem_getpage(inode, end, &page, SGP_READ);
>   shmem_write_begin[2467]        return shmem_getpage(inode, index, pagep, SGP_WRITE);
>   shmem_file_read_iter[2544]     error = shmem_getpage(inode, index, &page, sgp);
>   shmem_fallocate[2733]          error = shmem_getpage(inode, index, &page, SGP_FALLOC);
>   shmem_symlink[3079]            error = shmem_getpage(inode, 0, &page, SGP_WRITE);
>   shmem_get_link[3120]           error = shmem_getpage(inode, 0, &page, SGP_READ);
>   mcontinue_atomic_pte[235]      ret = shmem_getpage(inode, pgoff, &page, SGP_READ);
>
> Wondering whether this patch covered all of them.

No, it doesn't need. Not all places care about hwpoison page, for
example, truncate, hole punch, etc. Only the APIs which return the
data back to userspace or write back to disk need care about if the
data is corrupted or not since. This has been elaborated in the cover
letter.

>
> This also reminded me that whether we should simply fail shmem_getpage_gfp()
> directly, then all above callers will get a proper failure, rather than we do
> PageHWPoison() check everywhere?

Actually I did a prototype for this approach by returning
ERR_PTR(-EIO). But all the callers have to check this return value
even though the callers don't care about hwpoison page since all the
callers (not only shmem, but also all other filesystems) just check if
page is NULL but not check if it is an error pointer. This actually
incur more changes. It sounds not optimal IMHO. So I just treat
hwpoison as other flags, for example, Uptodate, and have callers check
it when necessary.

>
> --
> Peter Xu
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ