Message-ID: <CACT4Y+YtOXNPJSrmvs=O5UvFPjUQdnSHGhuE_kiLkxzJjH=DNQ@mail.gmail.com>
Date:   Tue, 2 May 2023 08:13:11 +0200
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        akpm@...ux-foundation.org, hughd@...gle.com,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        syzkaller-bugs@...glegroups.com,
        syzbot <syzbot+702361cf7e3d95758761@...kaller.appspotmail.com>
Subject: Re: [syzbot] [mm?] KCSAN: data-race in generic_fillattr / shmem_mknod (2)

On Mon, 1 May 2023 at 07:16, Tetsuo Handa
<penguin-kernel@...ove.sakura.ne.jp> wrote:
>
> On 2023/04/24 17:26, Dmitry Vyukov wrote:
> >> HEAD commit:    457391b03803 Linux 6.3
> >> git tree:       upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=13226cf0280000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=8c81c9a3d360ebcf
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=702361cf7e3d95758761
> >> compiler:       Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
> >
> > I think shmem_mknod() needs to use i_size_write() to update the size.
> > Writes to i_size are not assumed to be atomic throughout the kernel
> > code.
> >
>
> I don't think that using i_size_{read,write}() alone is sufficient,
> for I think that i_size_{read,write}() needs data_race() annotation.

Agreed. Or better, proper READ_ONCE()/WRITE_ONCE().
data_race() is just an annotation; it does not fix the actual data
race that is present there.
I see there are lots of uses of i_size_read() in complex scenarios
that involve comparisons of the size. All such racy uses are subject
to TOCTOU bugs at least.
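
To illustrate the TOCTOU concern, here is a minimal userspace sketch, not kernel code. The names fake_inode, size_read_once(), size_write_once() and both bytes_available_*() helpers are hypothetical, and C11 relaxed atomics are only an assumed stand-in for READ_ONCE()/WRITE_ONCE(). The point is that even tear-free reads do not help if the size is read twice; the check and the use have to share one snapshot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct inode; not the kernel type. */
struct fake_inode {
	_Atomic int64_t i_size;
};

/* Assumed analogue of READ_ONCE(): a single tear-free load, no ordering. */
static inline int64_t size_read_once(struct fake_inode *inode)
{
	return atomic_load_explicit(&inode->i_size, memory_order_relaxed);
}

/* Assumed analogue of WRITE_ONCE(): a single tear-free store, no ordering. */
static inline void size_write_once(struct fake_inode *inode, int64_t size)
{
	atomic_store_explicit(&inode->i_size, size, memory_order_relaxed);
}

/*
 * Racy pattern: the size is loaded twice. A concurrent writer can shrink
 * it between the check and the use, so the result may be negative.
 */
static inline int64_t bytes_available_racy(struct fake_inode *inode, int64_t pos)
{
	if (pos >= size_read_once(inode))
		return 0;
	return size_read_once(inode) - pos;
}

/*
 * Snapshot pattern: one load, and both the check and the use operate on
 * that local copy, so the result is never negative.
 */
static inline int64_t bytes_available_snapshot(struct fake_inode *inode, int64_t pos)
{
	int64_t size = size_read_once(inode);

	assert(pos >= 0);
	if (pos >= size)
		return 0;
	return size - pos;
}
```

The snapshot variant only makes the single decision self-consistent. The value can still be stale by the time the caller acts on it, which is why updates still need proper serialization.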


>  include/linux/fs.h |   13 +++++++++++--
>  mm/shmem.c         |   12 ++++++------
>  2 files changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 21a981680856..0d067bbe3ee9 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -860,6 +860,13 @@ void filemap_invalidate_unlock_two(struct address_space *mapping1,
>   * the read or for example on x86 they can be still implemented as a
>   * cmpxchg8b without the need of the lock prefix). For SMP compiles
>   * and 64bit archs it makes no difference if preempt is enabled or not.
> + *
> + * However, when KCSAN is enabled, CPU being capable of reading/updating
> + * naturally aligned 8 bytes of memory atomically is not sufficient for
> + * avoiding KCSAN warning, for KCSAN checks whether value has changed between
> + * before and after of a read operation. But since we don't want to introduce
> + * seqcount overhead only for suppressing KCSAN warning, tell KCSAN that data
> + * race on accessing i_size field is acceptable.
>   */
>  static inline loff_t i_size_read(const struct inode *inode)
>  {
> @@ -880,7 +887,8 @@ static inline loff_t i_size_read(const struct inode *inode)
>         preempt_enable();
>         return i_size;
>  #else
> -       return inode->i_size;
> +       /* See comment above. */
> +       return data_race(inode->i_size);
>  #endif
>  }
>
> @@ -902,7 +910,8 @@ static inline void i_size_write(struct inode *inode, loff_t i_size)
>         inode->i_size = i_size;
>         preempt_enable();
>  #else
> -       inode->i_size = i_size;
> +       /* See comment above. */
> +       data_race(inode->i_size = i_size);
>  #endif
>  }
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index e40a08c5c6d7..a2f20297fb59 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2951,7 +2951,7 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir,
>                         goto out_iput;
>
>                 error = 0;
> -               dir->i_size += BOGO_DIRENT_SIZE;
> +               i_size_write(dir, i_size_read(dir) + BOGO_DIRENT_SIZE);
>                 dir->i_ctime = dir->i_mtime = current_time(dir);
>                 inode_inc_iversion(dir);
>                 d_instantiate(dentry, inode);
> @@ -3027,7 +3027,7 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr
>                         goto out;
>         }
>
> -       dir->i_size += BOGO_DIRENT_SIZE;
> +       i_size_write(dir, i_size_read(dir) + BOGO_DIRENT_SIZE);
>         inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
>         inode_inc_iversion(dir);
>         inc_nlink(inode);
> @@ -3045,7 +3045,7 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
>         if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
>                 shmem_free_inode(inode->i_sb);
>
> -       dir->i_size -= BOGO_DIRENT_SIZE;
> +       i_size_write(dir, i_size_read(dir) - BOGO_DIRENT_SIZE);
>         inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
>         inode_inc_iversion(dir);
>         drop_nlink(inode);
> @@ -3132,8 +3132,8 @@ static int shmem_rename2(struct mnt_idmap *idmap,
>                 inc_nlink(new_dir);
>         }
>
> -       old_dir->i_size -= BOGO_DIRENT_SIZE;
> -       new_dir->i_size += BOGO_DIRENT_SIZE;
> +       i_size_write(old_dir, i_size_read(old_dir) - BOGO_DIRENT_SIZE);
> +       i_size_write(new_dir, i_size_read(new_dir) + BOGO_DIRENT_SIZE);
>         old_dir->i_ctime = old_dir->i_mtime =
>         new_dir->i_ctime = new_dir->i_mtime =
>         inode->i_ctime = current_time(old_dir);
> @@ -3189,7 +3189,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
>                 folio_unlock(folio);
>                 folio_put(folio);
>         }
> -       dir->i_size += BOGO_DIRENT_SIZE;
> +       i_size_write(dir, i_size_read(dir) + BOGO_DIRENT_SIZE);
>         dir->i_ctime = dir->i_mtime = current_time(dir);
>         inode_inc_iversion(dir);
>         d_instantiate(dentry, inode);
>
> Maybe we want i_size_add() ?
>
> Also, there was a similar report on updating i_{ctime,mtime} to current_time()
> which means that i_size is not the only field that is causing data race.
> https://syzkaller.appspot.com/bug?id=067d40ab9ab23a6fa0a8156857ed54e295062a29
>
> Hmm, where is the serialization that avoids concurrent
> shmem_mknod()/shmem_mknod() or shmem_mknod()/shmem_unlink() ?
> i_size_write() says "need locking around it (normally i_mutex)"...
>
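
For the i_size_add() idea quoted above, a rough userspace sketch of what such a helper could look like follows. No such helper is confirmed to exist in the kernel; fake_inode, fake_i_size_read(), fake_i_size_write(), fake_i_size_add() and EXAMPLE_DIRENT_SIZE are all hypothetical names, and C11 relaxed atomics are again only an assumed stand-in for READ_ONCE()/WRITE_ONCE(). Note that the helper is a plain load followed by a store, so it does not answer the serialization question: two concurrent callers can still lose an update unless the caller holds a lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative per-entry size for this example only. */
#define EXAMPLE_DIRENT_SIZE 20

/* Hypothetical userspace stand-in for struct inode; not the kernel type. */
struct fake_inode {
	_Atomic int64_t i_size;
};

/* Tear-free load, assumed analogue of a READ_ONCE()-based i_size_read(). */
static inline int64_t fake_i_size_read(struct fake_inode *inode)
{
	return atomic_load_explicit(&inode->i_size, memory_order_relaxed);
}

/* Tear-free store, assumed analogue of a WRITE_ONCE()-based i_size_write(). */
static inline void fake_i_size_write(struct fake_inode *inode, int64_t size)
{
	atomic_store_explicit(&inode->i_size, size, memory_order_relaxed);
}

/*
 * Hypothetical i_size_add(). This is a load followed by a store, NOT an
 * atomic read-modify-write: the caller must hold whatever lock serializes
 * size updates, otherwise concurrent adds can overwrite each other.
 */
static inline void fake_i_size_add(struct fake_inode *inode, int64_t delta)
{
	int64_t size = fake_i_size_read(inode);

	assert(size + delta >= 0);	/* a size should never go negative */
	fake_i_size_write(inode, size + delta);
}
```

With a helper of this shape, each "dir->i_size += BOGO_DIRENT_SIZE" site in the patch would become a single call, and lockless readers would at least see either the old or the new value rather than a torn one.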
