Message-ID: <CAAQViEsp0LjUcgR-at-ufdC7rnWARNBeqjqOSx6r3wJBcQkGiQ@mail.gmail.com>
Date: Sat, 1 Jun 2019 20:20:05 +0200
From: Albert Vaca Cintora <albertvaka@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: rdunlap@...radead.org, mingo@...nel.org, Jan Kara <jack@...e.cz>,
ebiederm@...ssion.com,
Nicolas Saenz Julienne <nsaenzjulienne@...e.de>,
linux-kernel@...r.kernel.org, corbet@....net,
linux-doc@...r.kernel.org, Matthias Brugger <mbrugger@...e.com>
Subject: Re: [PATCH v3 2/3] kernel/ucounts: expose count of inotify watches in use
On Sat, Jun 1, 2019 at 2:00 AM Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> On Fri, 31 May 2019 21:50:15 +0200 Albert Vaca Cintora <albertvaka@...il.com> wrote:
>
> > Adds a read-only 'current_inotify_watches' entry to the user sysctl table.
> > The handler for this entry is a custom function that ends up calling
> > proc_dointvec. That sysctl table already contains 'max_inotify_watches'
> > and is mounted under /proc/sys/user/.
> >
> > Inotify watches are a finite resource, in a similar way to available file
> > descriptors. The motivation for this patch is to be able to set up
> > monitoring and alerting before an application starts failing because
> > it runs out of inotify watches.
> >
> > ...
> >
> > --- a/kernel/ucount.c
> > +++ b/kernel/ucount.c
> > @@ -118,6 +118,26 @@ static void put_ucounts(struct ucounts *ucounts)
> >  	kfree(ucounts);
> >  }
> >
> > +#ifdef CONFIG_INOTIFY_USER
> > +int proc_read_inotify_watches(struct ctl_table *table, int write,
> > +		void __user *buffer, size_t *lenp, loff_t *ppos)
> > +{
> > +	struct ucounts *ucounts;
> > +	struct ctl_table fake_table;
>
> hmm.
>
> > +	int count = -1;
> > +
> > +	ucounts = get_ucounts(current_user_ns(), current_euid());
> > +	if (ucounts != NULL) {
> > +		count = atomic_read(&ucounts->ucount[UCOUNT_INOTIFY_WATCHES]);
> > +		put_ucounts(ucounts);
> > +	}
> > +
> > +	fake_table.data = &count;
> > +	fake_table.maxlen = sizeof(count);
> > +	return proc_dointvec(&fake_table, write, buffer, lenp, ppos);
>
> proc_dointvec
> ->do_proc_dointvec
> ->__do_proc_dointvec
> ->proc_first_pos_non_zero_ignore
> ->warn_sysctl_write
> ->pr_warn_once(..., table->procname)
>
> and I think ->procname is uninitialized.
>
> That's from a cursory check. Perhaps other uninitialized members of
> fake_table are accessed, dunno.
>
> we could do
>
> {
> 	struct ctl_table fake_table = {
> 		.data = &count,
> 		.maxlen = sizeof(count),
> 	};
>
> 	return proc_dointvec(&fake_table, write, buffer, lenp, ppos);
> }
>
> or whatever. That will cause the pr_warn_once to print "(null)" but
> that's OK I guess.
>
> Are there other places in the kernel which do this temp ctl_table
> trick? If so, what do they do? If not, what is special about this
> code?
>
>
I copied this 'fake_table' trick verbatim from proc_do_entropy in
drivers/char/random.c. The same pattern is used in a few other places
with slight variations.
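
From memory, the code there looks roughly like this (a paraphrased
sketch rather than a verbatim quote of the current tree, so details
such as the ENTROPY_SHIFT scaling may differ):

static int proc_do_entropy(struct ctl_table *table, int write,
			   void __user *buffer, size_t *lenp, loff_t *ppos)
{
	struct ctl_table fake_table;
	int entropy_count;

	/* snapshot the live value into a local int */
	entropy_count = *(int *) table->data >> ENTROPY_SHIFT;

	/* only .data and .maxlen are set; the remaining fields stay
	 * uninitialized there as well */
	fake_table.data = &entropy_count;
	fake_table.maxlen = sizeof(entropy_count);

	return proc_dointvec(&fake_table, write, buffer, lenp, ppos);
}
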
Note that, since we are creating a read-only proc file,
proc_first_pos_non_zero_ignore is not called from __do_proc_dointvec,
so the uninitialized ->procname is not accessed.
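That said, zero-initializing the struct as you suggest is cheap
insurance, and giving it a procname would let the pr_warn_once print
something useful if the write path were ever reached. A rough,
untested sketch of the handler with that folded in (illustrative only,
not necessarily what the next version of the patch will look like):

#ifdef CONFIG_INOTIFY_USER
int proc_read_inotify_watches(struct ctl_table *table, int write,
			      void __user *buffer, size_t *lenp, loff_t *ppos)
{
	struct ucounts *ucounts;
	int count = -1;
	/* designated initializer zeroes every field we don't set explicitly */
	struct ctl_table fake_table = {
		.procname = "current_inotify_watches",
		.data = &count,
		.maxlen = sizeof(count),
	};

	ucounts = get_ucounts(current_user_ns(), current_euid());
	if (ucounts != NULL) {
		count = atomic_read(&ucounts->ucount[UCOUNT_INOTIFY_WATCHES]);
		put_ucounts(ucounts);
	}

	return proc_dointvec(&fake_table, write, buffer, lenp, ppos);
}
#endif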
Albert