Message-ID: <20200302192328.GB234476@google.com>
Date: Mon, 2 Mar 2020 11:23:28 -0800
From: Minchan Kim <minchan@...nel.org>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>, linux-api@...r.kernel.org,
oleksandr@...hat.com, Tim Murray <timmurray@...gle.com>,
Daniel Colascione <dancol@...gle.com>,
Sandeep Patil <sspatil@...gle.com>,
Sonny Rao <sonnyrao@...gle.com>,
Brian Geffon <bgeffon@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Johannes Weiner <hannes@...xchg.org>,
Shakeel Butt <shakeelb@...gle.com>,
John Dias <joaodias@...gle.com>,
Joel Fernandes <joel@...lfernandes.org>, sj38.park@...il.com,
alexander.h.duyck@...ux.intel.com, Jann Horn <jannh@...gle.com>,
Christian Brauner <christian@...uner.io>,
Kirill Tkhai <ktkhai@...tuozzo.com>
Subject: Re: [PATCH v6 5/7] mm: support both pid and pidfd for process_madvise
On Fri, Feb 28, 2020 at 02:41:07PM -0800, Suren Baghdasaryan wrote:
> On Tue, Feb 18, 2020 at 5:44 PM Minchan Kim <minchan@...nel.org> wrote:
> >
> > There is a demand[1] to support pid as well as pidfd for process_madvise
> > to avoid the extra syscall needed to get a pidfd when the user has control
> > of the target process (i.e., they can guarantee the process is not gone
> > and the pid is not reused, or it might be okay to give a hint to the wrong
> > process).
>
> nit: When would "give a hint to wrong process" be ok? I would just
> remove this part.
I meant non-destructive hints. That is already true of some other
hints: they are just best effort, so it is not critical if they
fail. If you mind, I will remove the phrase.
Thanks.
>
> >
> > This patch aims to support both options, like waitid(2). The
> > syscall is currently:
> >
> > int process_madvise(int which, pid_t pid, void *addr,
> > size_t length, int advise, unsigned long flag);
> >
> > @which is actually idtype_t for the userspace library; it currently
> > supports P_PID and P_PIDFD.
> >
> > [1] https://lore.kernel.org/linux-mm/9d849087-3359-c4ab-fbec-859e8186c509@virtuozzo.com/
> >
> > Cc: Christian Brauner <christian@...uner.io>
> > Suggested-by: Kirill Tkhai <ktkhai@...tuozzo.com>
> > Signed-off-by: Minchan Kim <minchan@...nel.org>
> > ---
> > include/linux/syscalls.h | 3 ++-
> > mm/madvise.c | 34 ++++++++++++++++++++++------------
> > 2 files changed, 24 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> > index e4cd2c2f8bb4..f5ada20e2943 100644
> > --- a/include/linux/syscalls.h
> > +++ b/include/linux/syscalls.h
> > @@ -876,7 +876,8 @@ asmlinkage long sys_munlockall(void);
> > asmlinkage long sys_mincore(unsigned long start, size_t len,
> > unsigned char __user * vec);
> > asmlinkage long sys_madvise(unsigned long start, size_t len, int behavior);
> > -asmlinkage long sys_process_madvise(int pidfd, unsigned long start,
> > +
> > +asmlinkage long sys_process_madvise(int which, pid_t pid, unsigned long start,
> > size_t len, int behavior, unsigned long flags);
> > asmlinkage long sys_remap_file_pages(unsigned long start, unsigned long size,
> > unsigned long prot, unsigned long pgoff,
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index def1507c2030..f6d9b9e66243 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -1182,11 +1182,10 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior)
> > return do_madvise(current, current->mm, start, len_in, behavior);
> > }
> >
> > -SYSCALL_DEFINE5(process_madvise, int, pidfd, unsigned long, start,
> > +SYSCALL_DEFINE6(process_madvise, int, which, pid_t, upid, unsigned long, start,
> > size_t, len_in, int, behavior, unsigned long, flags)
> > {
> > int ret;
> > - struct fd f;
> > struct pid *pid;
> > struct task_struct *task;
> > struct mm_struct *mm;
> > @@ -1197,20 +1196,31 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, unsigned long, start,
> > if (!process_madvise_behavior_valid(behavior))
> > return -EINVAL;
> >
> > - f = fdget(pidfd);
> > - if (!f.file)
> > - return -EBADF;
> > + switch (which) {
> > + case P_PID:
> > + if (upid <= 0)
> > + return -EINVAL;
> > +
> > + pid = find_get_pid(upid);
> > + if (!pid)
> > + return -ESRCH;
> > + break;
> > + case P_PIDFD:
> > + if (upid < 0)
> > + return -EINVAL;
> >
> > - pid = pidfd_pid(f.file);
> > - if (IS_ERR(pid)) {
> > - ret = PTR_ERR(pid);
> > - goto fdput;
> > + pid = pidfd_get_pid(upid);
> > + if (IS_ERR(pid))
> > + return PTR_ERR(pid);
> > + break;
> > + default:
> > + return -EINVAL;
> > }
> >
> > task = get_pid_task(pid, PIDTYPE_PID);
> > if (!task) {
> > ret = -ESRCH;
> > - goto fdput;
> > + goto put_pid;
> > }
> >
> > mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
> > @@ -1223,7 +1233,7 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, unsigned long, start,
> > mmput(mm);
> > release_task:
> > put_task_struct(task);
> > -fdput:
> > - fdput(f);
> > +put_pid:
> > + put_pid(pid);
> > return ret;
> > }
> > --
> > 2.25.0.265.gbab2e86ba0-goog
> >
>
> Reviewed-by: Suren Baghdasaryan <surenb@...gle.com>
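
For readers following the thread, the argument checks in the quoted switch statement can be sketched in userspace as below. This is only an illustration of the validation rules in the hunk (a pid must be strictly positive, a pidfd may be 0, any other selector is rejected), not kernel code and not the final syscall interface. All sketch_* names are hypothetical, and the numeric values chosen for the two selectors are assumed stand-ins for P_PID and P_PIDFD.

```c
/* Userspace sketch only; not kernel code. All sketch_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Assumed stand-ins for the idtype_t selectors P_PID and P_PIDFD. */
enum sketch_which {
    SKETCH_P_PID = 1,
    SKETCH_P_PIDFD = 3
};

/*
 * Mirror the argument checks from the quoted switch statement:
 *   - a pid must be strictly positive,
 *   - a pidfd is a file descriptor, so 0 is accepted and only
 *     negative values are rejected,
 *   - any other selector is rejected.
 * Returns 0 when the pair looks valid, -EINVAL otherwise.
 * The real code would go on to look up the struct pid; that part
 * is intentionally left out here.
 */
int sketch_check_target(int which, pid_t upid)
{
    switch (which) {
    case SKETCH_P_PID:
        return upid <= 0 ? -EINVAL : 0;
    case SKETCH_P_PIDFD:
        return upid < 0 ? -EINVAL : 0;
    default:
        return -EINVAL;
    }
}

/* Small self-check that documents the expected results. */
void sketch_self_check(void)
{
    assert(sketch_check_target(SKETCH_P_PID, 1234) == 0);
    assert(sketch_check_target(SKETCH_P_PID, 0) == -EINVAL);
    assert(sketch_check_target(SKETCH_P_PIDFD, 0) == 0);
    assert(sketch_check_target(SKETCH_P_PIDFD, -1) == -EINVAL);
    assert(sketch_check_target(0, 1234) == -EINVAL);
}
```

The asymmetry between the two cases is the point of the sketch: 0 is a valid file descriptor but never a valid pid, so the two selectors need different lower bounds.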