lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 25 Feb 2020 21:27:03 +0100
From:   Marco Elver <elver@...gle.com>
To:     Qian Cai <cai@....pw>
Cc:     "Darrick J. Wong" <darrick.wong@...cle.com>,
        linux-xfs@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
        kasan-dev <kasan-dev@...glegroups.com>
Subject: Re: [PATCH] xfs: fix data races in inode->i_*time

On Tue, 25 Feb 2020 at 21:09, Qian Cai <cai@....pw> wrote:
>
> inode->i_*time could be accessed concurrently. The plain reads in
> xfs_vn_getattr() is lockless which result in data races. To avoid bad
> compiler optimizations like load tearing, adding pairs of
> READ|WRITE_ONCE(). While at it, also take care of xfs_setattr_time()
> which presumably could run concurrently with xfs_vn_getattr() as well.
> The data races were reported by KCSAN,
>
>  write to 0xffff9275a1920ad8 of 16 bytes by task 47311 on cpu 46:
>   xfs_vn_update_time+0x1b0/0x400 [xfs]
>   xfs_vn_update_time at fs/xfs/xfs_iops.c:1122

So this one is doing concurrent writes and reads of the whole struct,
which is 16 bytes. This will always be split into multiple
loads/stores. Is it intentional?

Sadly, this is pretty much guaranteed to tear, even with the
READ/WRITE_ONCE.  The *ONCE will just make KCSAN not tell us about
this one, which is probably not what we want right now, unless we know
for sure the race is intentional.

Thanks,
-- Marco

>   update_time+0x57/0x80
>   file_update_time+0x143/0x1f0
>   __xfs_filemap_fault+0x1be/0x3d0 [xfs]
>   xfs_filemap_page_mkwrite+0x25/0x40 [xfs]
>   do_page_mkwrite+0xf7/0x250
>   do_fault+0x679/0x920
>   __handle_mm_fault+0xc9f/0xd40
>   handle_mm_fault+0xfc/0x2f0
>   do_page_fault+0x263/0x6f9
>   page_fault+0x34/0x40
>
>  4 locks held by doio/47311:
>   #0: ffff9275e7d70808 (&mm->mmap_sem#2){++++}, at: do_page_fault+0x143/0x6f9
>   #1: ffff9274864394d8 (sb_pagefaults){.+.+}, at: __xfs_filemap_fault+0x19b/0x3d0 [xfs]
>   #2: ffff9274864395b8 (sb_internal){.+.+}, at: xfs_trans_alloc+0x2af/0x3c0 [xfs]
>   #3: ffff9275a1920920 (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0x116/0x2c0 [xfs]
>  irq event stamp: 42649
>  hardirqs last  enabled at (42649): [<ffffffffb22dcdb3>] _raw_spin_unlock_irqrestore+0x53/0x60
>  hardirqs last disabled at (42648): [<ffffffffb22dcad1>] _raw_spin_lock_irqsave+0x21/0x60
>  softirqs last  enabled at (42306): [<ffffffffb260034c>] __do_softirq+0x34c/0x57c
>  softirqs last disabled at (42299): [<ffffffffb18c6762>] irq_exit+0xa2/0xc0
>
>  read to 0xffff9275a1920ad8 of 16 bytes by task 47312 on cpu 40:
>   xfs_vn_getattr+0x20c/0x6a0 [xfs]
>   xfs_vn_getattr at fs/xfs/xfs_iops.c:551
>   vfs_getattr_nosec+0x11a/0x170
>   vfs_statx_fd+0x54/0x90
>   __do_sys_newfstat+0x40/0x90
>   __x64_sys_newfstat+0x3a/0x50
>   do_syscall_64+0x91/0xb05
>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
>  no locks held by doio/47312.
>  irq event stamp: 43883
>  hardirqs last  enabled at (43883): [<ffffffffb1805119>] do_syscall_64+0x39/0xb05
>  hardirqs last disabled at (43882): [<ffffffffb1803ede>] trace_hardirqs_off_thunk+0x1a/0x1c
>  softirqs last  enabled at (43844): [<ffffffffb260034c>] __do_softirq+0x34c/0x57c
>  softirqs last disabled at (43141): [<ffffffffb18c6762>] irq_exit+0xa2/0xc0
>
> Signed-off-by: Qian Cai <cai@....pw>
> ---
>  fs/xfs/xfs_iops.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 81f2f93caec0..2d5ca13ee9da 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -547,9 +547,9 @@
>         stat->uid = inode->i_uid;
>         stat->gid = inode->i_gid;
>         stat->ino = ip->i_ino;
> -       stat->atime = inode->i_atime;
> -       stat->mtime = inode->i_mtime;
> -       stat->ctime = inode->i_ctime;
> +       stat->atime = READ_ONCE(inode->i_atime);
> +       stat->mtime = READ_ONCE(inode->i_mtime);
> +       stat->ctime = READ_ONCE(inode->i_ctime);
>         stat->blocks =
>                 XFS_FSB_TO_BB(mp, ip->i_d.di_nblocks + ip->i_delayed_blks);
>
> @@ -614,11 +614,11 @@
>         ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
>
>         if (iattr->ia_valid & ATTR_ATIME)
> -               inode->i_atime = iattr->ia_atime;
> +               WRITE_ONCE(inode->i_atime, iattr->ia_atime);
>         if (iattr->ia_valid & ATTR_CTIME)
> -               inode->i_ctime = iattr->ia_ctime;
> +               WRITE_ONCE(inode->i_ctime, iattr->ia_ctime);
>         if (iattr->ia_valid & ATTR_MTIME)
> -               inode->i_mtime = iattr->ia_mtime;
> +               WRITE_ONCE(inode->i_mtime, iattr->ia_mtime);
>  }
>
>  static int
> @@ -1117,11 +1117,11 @@
>
>         xfs_ilock(ip, XFS_ILOCK_EXCL);
>         if (flags & S_CTIME)
> -               inode->i_ctime = *now;
> +               WRITE_ONCE(inode->i_ctime, *now);
>         if (flags & S_MTIME)
> -               inode->i_mtime = *now;
> +               WRITE_ONCE(inode->i_mtime, *now);
>         if (flags & S_ATIME)
> -               inode->i_atime = *now;
> +               WRITE_ONCE(inode->i_atime, *now);
>
>         xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
>         xfs_trans_log_inode(tp, ip, log_flags);
> --
> 1.8.3.1
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ