[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <582C30DF.60205@iogearbox.net>
Date: Wed, 16 Nov 2016 11:11:43 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: Greg KH <gregkh@...uxfoundation.org>,
Kees Cook <keescook@...omium.org>
CC: Peter Zijlstra <peterz@...radead.org>,
Will Deacon <will.deacon@....com>,
"Reshetova, Elena" <elena.reshetova@...el.com>,
Arnd Bergmann <arnd@...db.de>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
David Windsor <dave@...gbits.org>,
LKML <linux-kernel@...r.kernel.org>,
Alexei Starovoitov <alexei.starovoitov@...il.com>
Subject: Re: [RFC][PATCH 2/7] kref: Add kref_read()
On 11/16/2016 09:21 AM, Greg KH wrote:
> On Tue, Nov 15, 2016 at 12:53:35PM -0800, Kees Cook wrote:
>> On Tue, Nov 15, 2016 at 12:03 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>>> On Tue, Nov 15, 2016 at 08:33:22AM +0100, Greg KH wrote:
>>>> On Mon, Nov 14, 2016 at 06:39:48PM +0100, Peter Zijlstra wrote:
>>>
>>>>> --- a/drivers/block/drbd/drbd_req.c
>>>>> +++ b/drivers/block/drbd/drbd_req.c
>>>>> @@ -520,7 +520,7 @@ static void mod_rq_state(struct drbd_req
>>>>> /* Completion does it's own kref_put. If we are going to
>>>>> * kref_sub below, we need req to be still around then. */
>>>>> int at_least = k_put + !!c_put;
>>>>> - int refcount = atomic_read(&req->kref.refcount);
>>>>> + int refcount = kref_read(&req->kref);
>>>>> if (refcount < at_least)
>>>>> drbd_err(device,
>>>>> "mod_rq_state: Logic BUG: %x -> %x: refcount = %d, should be >= %d\n",
>>>>
>>>> As proof of "things you should never do", here is one such example.
>>>>
>>>> ugh.
>>>>
>>>>> --- a/drivers/block/virtio_blk.c
>>>>> +++ b/drivers/block/virtio_blk.c
>>>>> @@ -767,7 +767,7 @@ static void virtblk_remove(struct virtio
>>>>> /* Stop all the virtqueues. */
>>>>> vdev->config->reset(vdev);
>>>>>
>>>>> - refc = atomic_read(&disk_to_dev(vblk->disk)->kobj.kref.refcount);
>>>>> + refc = kref_read(&disk_to_dev(vblk->disk)->kobj.kref);
>>>>> put_disk(vblk->disk);
>>>>> vdev->config->del_vqs(vdev);
>>>>> kfree(vblk->vqs);
>>>>
>>>> And this too, ugh, that's a huge abuse and is probably totally wrong...
>>>>
>>>> thanks again for digging through this crap. I wonder if we need to name
>>>> the kref reference variable "do_not_touch_this_ever" or some such thing
>>>> to catch all of the people who try to be "too smart".
>>>
>>> There's unimaginable bong hits involved in this stuff, in the end I
>>> resorted to brute force and scripts to convert all this.
>>
>> What should we do about things like this (bpf_prog_put() and callbacks
>> from kernel/bpf/syscall.c):
Just reading up on this series. Your question refers to converting bpf
prog and map ref counts to Peter's refcount_t eventually, right?
>> static void bpf_prog_uncharge_memlock(struct bpf_prog *prog)
>> {
>> struct user_struct *user = prog->aux->user;
>>
>> atomic_long_sub(prog->pages, &user->locked_vm);
>
> Oh that's scary. Let's just make one reference count rely on another
> one and not check things...
Sorry, could you elaborate what you mean by 'check things', you mean for
wrap around? IIUC, back then accounting was roughly similar modeled after
perf event's one, and in this case accounts for pages used by progs and
maps during their life-time. Are you suggesting that this approach is
inherently broken?
>> free_uid(user);
>> }
>>
>> static void __bpf_prog_put_rcu(struct rcu_head *rcu)
>> {
>> struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
>>
>> free_used_maps(aux);
>> bpf_prog_uncharge_memlock(aux->prog);
>> bpf_prog_free(aux->prog);
>> }
>>
>> void bpf_prog_put(struct bpf_prog *prog)
>> {
>> if (atomic_dec_and_test(&prog->aux->refcnt))
>> call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu);
>> }
>>
>>
>> Not only do we want to protect prog->aux->refcnt, but I think we want
>> to protect user->locked_vm too ... I don't think it's sane for
>> user->locked_vm to be a stats_t ?
>
> I don't think this is sane code...
>
> greg k-h
Powered by blists - more mailing lists