Message-ID: <CAOUHufZ8xaVKZD7LNeo8AZv_xywvwef4P8CjdO+npijLHEUfWg@mail.gmail.com>
Date: Wed, 22 Jun 2022 13:13:39 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Qi Zheng <zhengqi.arch@...edance.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Andi Kleen <ak@...ux.intel.com>,
Aneesh Kumar <aneesh.kumar@...ux.ibm.com>,
Catalin Marinas <catalin.marinas@....com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Hillf Danton <hdanton@...a.com>, Jens Axboe <axboe@...nel.dk>,
Johannes Weiner <hannes@...xchg.org>,
Jonathan Corbet <corbet@....net>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>,
Mel Gorman <mgorman@...e.de>,
Michael Larabel <Michael@...haellarabel.com>,
Michal Hocko <mhocko@...nel.org>,
Mike Rapoport <rppt@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Tejun Heo <tj@...nel.org>, Vlastimil Babka <vbabka@...e.cz>,
Will Deacon <will@...nel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
Kernel Page Reclaim v2 <page-reclaim@...gle.com>,
Brian Geffon <bgeffon@...gle.com>,
Jan Alexander Steffens <heftig@...hlinux.org>,
Oleksandr Natalenko <oleksandr@...alenko.name>,
Steven Barrett <steven@...uorix.net>,
Suleiman Souhlal <suleiman@...gle.com>,
Daniel Byrne <djbyrne@....edu>,
Donald Carr <d@...os-reins.com>,
Holger Hoffstätte <holger@...lied-asynchrony.com>,
Konstantin Kharlamov <Hi-Angel@...dex.ru>,
Shuang Zhai <szhai2@...rochester.edu>,
Sofia Trinh <sofia.trinh@....works>,
Vaibhav Jain <vaibhav@...ux.ibm.com>
Subject: Re: [PATCH v12 12/14] mm: multi-gen LRU: debugfs interface
On Wed, Jun 22, 2022 at 3:16 AM Qi Zheng <zhengqi.arch@...edance.com> wrote:
> > +static ssize_t lru_gen_seq_write(struct file *file, const char __user *src,
> > +				 size_t len, loff_t *pos)
> > +{
> > +	void *buf;
> > +	char *cur, *next;
> > +	unsigned int flags;
> > +	struct blk_plug plug;
> > +	int err = -EINVAL;
> > +	struct scan_control sc = {
> > +		.may_writepage = true,
> > +		.may_unmap = true,
> > +		.may_swap = true,
> > +		.reclaim_idx = MAX_NR_ZONES - 1,
> > +		.gfp_mask = GFP_KERNEL,
> > +	};
> > +
> > +	buf = kvmalloc(len + 1, GFP_KERNEL);
> > +	if (!buf)
> > +		return -ENOMEM;
> > +
> > +	if (copy_from_user(buf, src, len)) {
> > +		kvfree(buf);
> > +		return -EFAULT;
> > +	}
> > +
> > +	if (!set_mm_walk(NULL)) {
>
> The current->reclaim_state will be dereferenced in set_mm_walk(), so
> calling set_mm_walk() before set_task_reclaim_state(current,
> &sc.reclaim_state) will cause panic:
>
> [ 1861.154916] BUG: kernel NULL pointer dereference, address: 0000000000000008
Thanks.
Apparently I shot myself in the foot with one of the nit fixes between v11 and v12.
> > +		kvfree(buf);
> > +		return -ENOMEM;
> > +	}
> > +
> > +	set_task_reclaim_state(current, &sc.reclaim_state);
> > +	flags = memalloc_noreclaim_save();
> > +	blk_start_plug(&plug);
> > +
> > +	next = buf;
> > +	next[len] = '\0';
> > +
> > +	while ((cur = strsep(&next, ",;\n"))) {
> > +		int n;
> > +		int end;
> > +		char cmd;
> > +		unsigned int memcg_id;
> > +		unsigned int nid;
> > +		unsigned long seq;
> > +		unsigned int swappiness = -1;
> > +		unsigned long opt = -1;
> > +
> > +		cur = skip_spaces(cur);
> > +		if (!*cur)
> > +			continue;
> > +
> > +		n = sscanf(cur, "%c %u %u %lu %n %u %n %lu %n", &cmd, &memcg_id, &nid,
> > +			   &seq, &end, &swappiness, &end, &opt, &end);
> > +		if (n < 4 || cur[end]) {
> > +			err = -EINVAL;
> > +			break;
> > +		}
> > +
> > +		err = run_cmd(cmd, memcg_id, nid, seq, &sc, swappiness, opt);
> > +		if (err)
> > +			break;
> > +	}
> > +
> > +	blk_finish_plug(&plug);
> > +	memalloc_noreclaim_restore(flags);
> > +	set_task_reclaim_state(current, NULL);
> > +
> > +	clear_mm_walk();
>
> Ditto, we can't call clear_mm_walk() after
> set_task_reclaim_state(current, NULL).
>
> Maybe it can be modified as follows:
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 2422edc786eb..552e6ae5243e 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -5569,12 +5569,12 @@ static ssize_t lru_gen_seq_write(struct file *file, const char __user *src,
>  		return -EFAULT;
>  	}
>
> +	set_task_reclaim_state(current, &sc.reclaim_state);
>  	if (!set_mm_walk(NULL)) {
>  		kvfree(buf);
>  		return -ENOMEM;
>  	}
>
> -	set_task_reclaim_state(current, &sc.reclaim_state);
We need a `goto` because otherwise we leave a dangling
`current->reclaim_state`. (I swear I had one.)
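
To make the ordering concrete, here is a rough userspace sketch of the `goto` shape I have in mind. It is not the actual fix: `struct task`, `current_task`, `fail_mm_walk`, `seq_write_sketch()` and the bodies of the helpers are hypothetical stand-ins, and `malloc()`/`free()` stand in for the kernel allocators. The only points it illustrates are that the reclaim state is installed before `set_mm_walk()` dereferences it, and that every path leaves through one label that clears it again.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel objects in the patch. */
struct reclaim_state { void *mm_walk; };
struct task { struct reclaim_state *reclaim_state; };

static struct task current_task;	/* stands in for `current` */
static int fail_mm_walk;		/* knob to force the -ENOMEM path */

static void set_task_reclaim_state(struct task *t, struct reclaim_state *rs)
{
	t->reclaim_state = rs;
}

/* Dereferences t->reclaim_state, so the state must be installed first. */
static void *set_mm_walk(struct task *t)
{
	if (fail_mm_walk)
		return NULL;
	t->reclaim_state->mm_walk = malloc(1);
	return t->reclaim_state->mm_walk;
}

/* Also dereferences t->reclaim_state, so it must run before the reset. */
static void clear_mm_walk(struct task *t)
{
	free(t->reclaim_state->mm_walk);
	t->reclaim_state->mm_walk = NULL;
}

static int seq_write_sketch(size_t len)
{
	struct reclaim_state rs = { NULL };
	int err = 0;
	char *buf;

	buf = malloc(len + 1);
	if (!buf)
		return -ENOMEM;

	set_task_reclaim_state(&current_task, &rs);
	if (!set_mm_walk(&current_task)) {
		err = -ENOMEM;
		goto done;	/* no early return: the state must be cleared */
	}

	/* ... command parsing and run_cmd() would go here ... */

	clear_mm_walk(&current_task);
done:
	set_task_reclaim_state(&current_task, NULL);
	free(buf);
	return err;
}
```

With `fail_mm_walk` set, `seq_write_sketch()` returns `-ENOMEM` and `current_task.reclaim_state` is still reset to NULL, which is exactly what the early `return` in the proposed diff would skip.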