linux-kernel - Re: [PATCH v6 3/8] securtiy/brute: Detect a brute force attack

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210322183201.GA3401@ubuntu>
Date:   Mon, 22 Mar 2021 19:32:01 +0100
From:   John Wood <john.wood@....com>
To:     Kees Cook <keescook@...omium.org>
Cc:     John Wood <john.wood@....com>, Jann Horn <jannh@...gle.com>,
        Randy Dunlap <rdunlap@...radead.org>,
        Jonathan Corbet <corbet@....net>,
        James Morris <jmorris@...ei.org>,
        Shuah Khan <shuah@...nel.org>,
        "Serge E. Hallyn" <serge@...lyn.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Andi Kleen <ak@...ux.intel.com>,
        kernel test robot <oliver.sang@...el.com>,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-security-module@...r.kernel.org,
        linux-kselftest@...r.kernel.org,
        kernel-hardening@...ts.openwall.com
Subject: Re: [PATCH v6 3/8] securtiy/brute: Detect a brute force attack

Hi,

On Sun, Mar 21, 2021 at 11:45:59AM -0700, Kees Cook wrote:
> On Sun, Mar 21, 2021 at 04:01:18PM +0100, John Wood wrote:
> > On Wed, Mar 17, 2021 at 07:57:10PM -0700, Kees Cook wrote:
> > > On Sun, Mar 07, 2021 at 12:30:26PM +0100, John Wood wrote:
> > Sorry, but I try to understand how to use locking properly without luck.
> >
> > I have read (and tried to understand):
> >    tools/memory-model/Documentation/simple.txt
> >    tools/memory-model/Documentation/ordering.txt
> >    tools/memory-model/Documentation/recipes.txt
> >    Documentation/memory-barriers.txt
> >
> > And I don't find the responses that I need. I'm not saying they aren't
> > there but I don't see them. So my questions:
> >
> > If in the above function makes sense to use locking, and it is called from
> > the brute_task_fatal_signal hook, then, all the functions that are called
> > from this hook need locking (more than one process can access stats at the
> > same time).
> >
> > So, as you point, how it is possible and safe to read jiffies and faults
> > (and I think period even though you not mention it) using READ_ONCE() but
> > without holding brute_stats::lock? I'm very confused.
>
> There are, I think, 3 considerations:
>
> - is "stats", itself, a valid allocation in kernel memory? This is the
>   "lifetime" management of the structure: it will only stay allocated as
>   long as there is a task still alive that is attached to it. The use of
>   refcount_t on task creation/death should entirely solve this issue, so
>   that all the other places where you access "stats", the memory will be
>   valid. AFAICT, this one is fine: you're doing all the correct lifetime
>   management.
>
> - changing a task's stats pointer: this is related to lifetime
>   management, but it, I think, entirely solved by the existing
>   refcounting. (And isn't helped by holding stats->lock since this is
>   about stats itself being a valid pointer.) Again, I think this is all
>   correct already in your existing code (due to the implicit locking of
>   "current"). Perhaps I've missed something here, but I guess we'll see!

My only concern now is the following case:

One process crashes with a fatal signal. Then, its stats are updated. Then
we get the exec stats (the stats of the task that calls exec). At the same
time another CPU frees this same stats. Now, if the first process writes
to the exec stats we get a "use after free" bug.

If this scenario is possible, we would need to protect all the section
inside the task_fatal_signal hook that deals with the exec stats. I think
that here a global lock is necessary and also, protect the write of the
pointer to stats struct in the task_free hook.

Moreover, I can see another scenario:

The first CPU gets the exec stats when a task fails with a fatal signal.
The second CPU exec()ve after exec()ve over the same task from we get the
exec stats with the first CPU. This second CPU resets the stats at the same
time that the first CPU updates the same stats. I think we also need lock
here.

Am I right? Are these paths possible?

>
> - are the values in stats getting written by multiple writers, or read
>   during a write, etc?
>
> This last one is the core of what I think could be improved here:
>
> To keep the writes serialized, you (correctly) perform locking in the
> writers. This is fine.
>
> There is also locking in the readers, which I think is not needed.
> AFAICT, READ_ONCE() (with WRITE_ONCE() in the writers) is sufficient for
> the readers here.
>
> > IIUC (during the reading of the documentation) READ_ONCE and WRITE_ONCE only
> > guarantees that a variable loaded with WRITE_ONCE can be read safely with
> > READ_ONCE avoiding tearing, etc. So, I see these functions like a form of
> > guarantee atomicity in variables.
>
> Right -- from what I can see about how you're reading the statistics, I
> don't see a way to have the values get confused (assuming locked writes
> and READ/WRITE_ONCE()).
>
> > Another question. Is it also safe to use WRITE_ONCE without holding the lock?
> > Or this is only appliable to read operations?
>
> No -- you'll still want the writer locked since you update multiple fields
> in stats during a write, so you could miss increments, or interleave
> count vs jiffies writes, etc. But the WRITE_ONCE() makes sure that the
> READ_ONCE() readers will see a stable value (as I understand it), and
> in the order they were written.
>
> > Any light on this will help me to do the best job in the next patches. If
> > somebody can point me to the right direction it would be greatly appreciated.
> >
> > Is there any documentation for newbies regarding this theme? I'm stuck.
> > I have also read the documentation about spinlocks, semaphores, mutex, etc..
> > but nothing clears me the concept expose.
> >
> > Apologies if this question has been answered in the past. But the search in
> > the mailing list has not been lucky.
>
> It's a complex subject! Here are some other docs that might help:
>
> tools/memory-model/Documentation/explanation.txt
> Documentation/core-api/refcount-vs-atomic.rst
>
> or they may melt your brain further! :) I know mine is always mushy
> after reading them.
>
> > Thanks for your time and patience.
>
> You're welcome; and thank you for your work on this! I've wanted a robust
> brute force mitigation in the kernel for a long time. :)
>

Thank you very much for this great explanation and mentorship. Now this
subject is more clear to me. It's a pleasure to me to work on this.

Again, thanks for your help.
John Wood