Message-ID: <fff664e9-06c9-d2fb-738f-e8e591e09569@linux.com>
Date:   Thu, 9 Apr 2020 01:01:31 +0300
From:   Alexander Popov <alex.popov@...ux.com>
To:     Julia Lawall <Julia.Lawall@...6.fr>,
        Gilles Muller <Gilles.Muller@...6.fr>,
        Nicolas Palix <nicolas.palix@...g.fr>,
        Michal Marek <michal.lkml@...kovi.net>, cocci@...teme.lip6.fr,
        "kernel-hardening@...ts.openwall.com" 
        <kernel-hardening@...ts.openwall.com>,
        Jann Horn <jannh@...gle.com>,
        Kees Cook <keescook@...omium.org>,
        Hans Verkuil <hverkuil@...all.nl>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Linux Media Mailing List <linux-media@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Coccinelle rule for CVE-2019-18683

Hello!

Some time ago I fixed CVE-2019-18683 in the V4L2 subsystem of the Linux kernel.

I've created a Coccinelle rule that detects this bug pattern. Let me show it.


Bug pattern
===========

CVE-2019-18683 refers to three similar vulnerabilities caused by the same
incorrect approach to locking that is used in vivid_stop_generating_vid_cap(),
vivid_stop_generating_vid_out(), and sdr_cap_stop_streaming().

For the fixes, please see commit 6dcd5d7a7a29c1e4 ("media: vivid: Fix wrong
locking that causes race conditions on streaming stop").

These three functions are called during streaming stop with vivid_dev.mutex
locked. They all make the same mistake while stopping their kthreads, which
need to lock this mutex as well. See this example from
vivid_stop_generating_vid_cap():
    /* shutdown control thread */
    vivid_grab_controls(dev, false);
    mutex_unlock(&dev->mutex);
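    /* the mutex is now free: vb2_fop_read() can acquire it before the kthread does */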
    kthread_stop(dev->kthread_vid_cap);
    dev->kthread_vid_cap = NULL;
    mutex_lock(&dev->mutex);

But while this mutex is unlocked, a concurrent vb2_fop_read() can acquire it
instead of the kthread and manipulate the buffer queue. That causes a
use-after-free.
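
For reference, the fix keeps dev->mutex locked across kthread_stop() and
makes the kthreads poll the mutex with mutex_trylock(), so that they can
still reach kthread_should_stop() while the stopping side holds the lock.
A rough sketch of the fixed code (not a verbatim quote of the commit):

    /* stopping side: dev->mutex stays locked across kthread_stop() */
    vivid_grab_controls(dev, false);
    kthread_stop(dev->kthread_vid_cap);
    dev->kthread_vid_cap = NULL;

    /* kthread side: don't sleep on the mutex held by the stopping side */
    for (;;) {
            try_to_freeze();
            if (kthread_should_stop())
                    break;
            if (!mutex_trylock(&dev->mutex)) {
                    schedule_timeout_uninterruptible(1);
                    continue;
            }
            /* ... process the buffer queue under the mutex ... */
            mutex_unlock(&dev->mutex);
    }

This way there is no window in which vb2_fop_read() can grab the mutex
during streaming stop.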

So I created a Coccinelle rule that detects this
mutex_unlock+kthread_stop+mutex_lock sequence within one function.


Coccinelle rule
===============

virtual report
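
// Along some execution path ("exists"), match a mutex_unlock(E) followed
// by kthread_stop(...) followed by mutex_lock(E) on the same lock
// expression E within one function.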

@race exists@
expression E;
position stop_p;
position unlock_p;
position lock_p;
@@

mutex_unlock@unlock_p(E)
...
kthread_stop@stop_p(...)
...
mutex_lock@lock_p(E)

@script:python@
stop_p << race.stop_p;
unlock_p << race.unlock_p;
lock_p << race.lock_p;
E << race.E;
@@
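# Inherited positions are lists of matched locations; report the first
# occurrence of each point in the matched sequence.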

coccilib.report.print_report(unlock_p[0], 'mutex_unlock(' + E + ') here')
coccilib.report.print_report(stop_p[0], 'kthread_stop here')
coccilib.report.print_report(lock_p[0], 'mutex_lock(' + E + ') here\n')


Testing the rule
================

I reverted commit 6dcd5d7a7a29c1e4 and ran:
COCCI=./scripts/coccinelle/kthread_race.cocci make coccicheck MODE=report
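
For a quick spot check of a single file, spatch can also be invoked directly
(the -D option selects the 'report' virtual rule):
spatch -D report --sp-file ./scripts/coccinelle/kthread_race.cocci \
        drivers/media/platform/vivid/vivid-kthread-cap.c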

The coccicheck run reports:

./drivers/media/platform/vivid/vivid-kthread-out.c:347:1-13: mutex_unlock(& dev -> mutex) here
./drivers/media/platform/vivid/vivid-kthread-out.c:348:1-13: kthread_stop here
./drivers/media/platform/vivid/vivid-kthread-out.c:350:1-11: mutex_lock(& dev -> mutex) here

./drivers/media/platform/vivid/vivid-sdr-cap.c:306:1-13: mutex_unlock(& dev -> mutex) here
./drivers/media/platform/vivid/vivid-sdr-cap.c:307:1-13: kthread_stop here
./drivers/media/platform/vivid/vivid-sdr-cap.c:309:1-11: mutex_lock(& dev -> mutex) here

./drivers/media/platform/vivid/vivid-kthread-cap.c:1001:1-13: mutex_unlock(& dev -> mutex) here
./drivers/media/platform/vivid/vivid-kthread-cap.c:1002:1-13: kthread_stop here
./drivers/media/platform/vivid/vivid-kthread-cap.c:1004:1-11: mutex_lock(& dev -> mutex) here

No other cases are detected.

Do you have any ideas on how to improve this rule?
Would it make sense to have it upstream for regression testing?

Thanks in advance!
Alexander
