linux-kernel - Re: [External] Re: [PATCH] mm/hugetlb: Fix a race between hugetlb sysctl handlers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAMZfGtXh+FP0O92ccj532Y=K4m2S==eSK8LjYEM+p6cttV0p8Q@mail.gmail.com>
Date:   Tue, 25 Aug 2020 11:01:17 +0800
From:   Muchun Song <songmuchun@...edance.com>
To:     Mike Kravetz <mike.kravetz@...cle.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, ak@...ux.intel.com,
        Linux Memory Management List <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [External] Re: [PATCH] mm/hugetlb: Fix a race between hugetlb
 sysctl handlers

On Tue, Aug 25, 2020 at 5:21 AM Mike Kravetz <mike.kravetz@...cle.com> wrote:
>
> On 8/24/20 1:59 PM, Andrew Morton wrote:
> > On Sat, 22 Aug 2020 17:53:28 +0800 Muchun Song <songmuchun@...edance.com> wrote:
> >
> >> There is a race between the assignment of `table->data` and write value
> >> to the pointer of `table->data` in the __do_proc_doulongvec_minmax().
> >
> > Where does __do_proc_doulongvec_minmax() write to table->data?
> >
> > I think you're saying that there is a race between the assignment of
> > ctl_table->table in hugetlb_sysctl_handler_common() and the assignment
> > of the same ctl_table->table in hugetlb_overcommit_handler()?
> >
> > Or not, maybe I'm being thick.  Can you please describe the race more
> > carefully and completely?
> >
>
> I too am looking at this now and do not completely understand the race.
> It could be that:
>
> hugetlb_sysctl_handler_common
> ...
>         table->data = &tmp;
>
> and, do_proc_doulongvec_minmax()
> ...
>         return __do_proc_doulongvec_minmax(table->data, table, write, ...
> with __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, ...
> ...
>         i = (unsigned long *) data;
>         ...
>                 *i = val;
>
> So, __do_proc_doulongvec_minmax can be dereferencing and writing to the pointer
> in one thread when hugetlb_sysctl_handler_common is setting it in another?

Yes, you are right.

>
> Another confusing part of the message is the stack trace which includes
> ...
>      ? set_max_huge_pages+0x3da/0x4f0
>      ? alloc_pool_huge_page+0x150/0x150
>
> which are 'downstream' from these routines.  I don't understand why these
> are in the trace.

I am also confused. But this issue can be reproduced easily by letting more
than one thread write to `/proc/sys/vm/nr_hugepages`. With this patch applied,
the issue can not be reproduced and disappears.

>
> If the race is with the pointer set and dereference/write, then this type of
> fix is OK.  However, if you really have two 'sysadmin type' global operations
> racing then one or both are not going to get what they expected.  Instead of

In our team, more than one developer shares one server which is a test server.
I guess that the panic happens when two people write
`/proc/sys/vm/nr_hugepages`.

> changing the code to 'handle the race', I think it might be acceptable to just
> put a big semaphore around it.

It is also a good idea to fix this issue. Thanks.

> --
> Mike Kravetz



-- 
Yours,
Muchun