lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHS8izPjJRf50yAtB0iZmVBi1LNKVHGmLb6ayx7U2+j8fzSgJA@mail.gmail.com>
Date:   Sat, 13 Nov 2021 06:48:37 -0800
From:   Mina Almasry <almasrymina@...gle.com>
To:     Muchun Song <songmuchun@...edance.com>
Cc:     Mike Kravetz <mike.kravetz@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shuah Khan <shuah@...nel.org>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Oscar Salvador <osalvador@...e.de>,
        Michal Hocko <mhocko@...e.com>,
        David Rientjes <rientjes@...gle.com>,
        Shakeel Butt <shakeelb@...gle.com>, Jue Wang <juew@...gle.com>,
        Yang Yao <ygyao@...gle.com>, Joanna Li <joannali@...gle.com>,
        Cannon Matthews <cannonmatthews@...gle.com>,
        Linux Memory Management List <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file

On Fri, Nov 12, 2021 at 6:45 PM Muchun Song <songmuchun@...edance.com> wrote:
>
> On Sat, Nov 13, 2021 at 7:36 AM Mike Kravetz <mike.kravetz@...cle.com> wrote:
> >
> > Subject:   Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file
> >
> > To:        Muchun Song <songmuchun@...edance.com>, Mina Almasry <almasrymina@...gle.com>
> >
> > Cc:        Andrew Morton <akpm@...ux-foundation.org>, Shuah Khan <shuah@...nel.org>, Miaohe Lin <linmiaohe@...wei.com>, Oscar Salvador <osalvador@...e.de>, Michal Hocko <mhocko@...e.com>, David Rientjes <rientjes@...gle.com>, Shakeel Butt <shakeelb@...gle.com>, Jue Wang <juew@...gle.com>, Yang Yao <ygyao@...gle.com>, Joanna Li <joannali@...gle.com>, Cannon Matthews <cannonmatthews@...gle.com>, Linux Memory Management List <linux-mm@...ck.org>, LKML <linux-kernel@...r.kernel.org>
> >
> > Bcc:
> >
> > -=-=-=-=-=-=-=-=-=# Don't remove this line #=-=-=-=-=-=-=-=-=-
> >
> > On 11/10/21 6:36 PM, Muchun Song wrote:
> >
> > > On Thu, Nov 11, 2021 at 9:50 AM Mina Almasry <almasrymina@...gle.com> wrote:
> >
> > >>
> >
> > >> +struct hugetlb_cgroup_per_node {
> >
> > >> +       /* hugetlb usage in pages over all hstates. */
> >
> > >> +       atomic_long_t usage[HUGE_MAX_HSTATE];
> >
> > >
> >
> > > Why do you use atomic? IIUC, 'usage' is always
> >
> > > increased/decreased under hugetlb_lock except
> >
> > > hugetlb_cgroup_read_numa_stat() which is always
> >
> > > reading it. So I think WRITE_ONCE/READ_ONCE
> >
> > > is enough.
> >
> >
> >
> > Thanks for continuing to work this, I was traveling and unable to
> >
> > comment.
>
> Have a good time.
>
> >
> >
> >
> > Unless I am missing something, I do not see a reason for WRITE_ONCE/READ_ONCE
>
> Because __hugetlb_cgroup_commit_charge and
> hugetlb_cgroup_read_numa_stat can run parallely,
> which meets the definition of data race. I believe
> KCSAN could report this race. I'm not strongly
> suggest using WRITE/READ_ONCE() here. But
> in theory it should be like this. Right?
>

My understanding is that the (only) potential problem here is
read_numa_stat() reading an intermediate garbage value while
commit_charge() is happening concurrently. This will only happen on
archs where the writes to an unsigned long aren't atomic. On archs
where writes to an unsigned long are atomic, there is no race, because
read_numa_stat() will only ever read the value before the concurrent
write or after the concurrent write, both of which are valid. To cater
to archs where the writes to unsigned long aren't atomic, we need to
use an atomic data type.

I'm not too familiar but my understanding from reading the
documentation is that WRITE_ONCE/READ_ONCE don't contribute anything
meaningful here:

/*
* Prevent the compiler from merging or refetching reads or writes. The
* compiler is also forbidden from reordering successive instances of
* READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
* particular ordering. One way to make the compiler aware of ordering is to
* put the two invocations of READ_ONCE or WRITE_ONCE in different C
* statements.
...

I can't see a reason why we care about the compiler merging or
refetching reads or writes here. As far as I can tell the problem is
atomicy of the write.

> Thanks.
>
> >
> > and would suggest going back to the way this code was in v5.
> >
> > --
> >
> > Mike Kravetz
> >

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ