netdev - Re: [PATCH net 1/2] net/sched: cls_u32: fix refcount leak in the error path of u32

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <3bbe208c56d4b6cf3526f4957b19f87d695d5d0a.camel@redhat.com>
Date:   Fri, 20 Dec 2019 14:21:58 +0100
From:   Davide Caratti <dcaratti@...hat.com>
To:     Vlad Buslov <vladbu@...lanox.com>
Cc:     Jamal Hadi Salim <jhs@...atatu.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Cong Wang <xiyou.wangcong@...il.com>,
        Jiri Pirko <jiri@...nulli.us>, Roman Mashak <mrv@...atatu.com>
Subject: Re: [PATCH net 1/2] net/sched: cls_u32: fix refcount leak in the
 error path of u32_change()

hi Jamal and Vlad,

thanks a lot for sharing your thoughts.

On Thu, 2019-12-19 at 17:01 +0000, Vlad Buslov wrote:
> > > IMO that would be a cleaner fix give walk() is used for other
> > > operations and this is a core cls issue.
> > > 

[...]

> > So I guess the requirement now is for unlocked classifier to have sane
> > implementation of ops->walk() that doesn't assume >1 filters and
> > correctly handles the case when insertion of first filter after
> > classifier instance is created fails.

I tried forcing an error in matchall, and didn't observe this problem:

[root@f31 ~]# perf record -e probe:mall_change__return -agR -- tc filter add dev dam0 parent root matchall skip_sw action drop
RTNETLINK answers: Operation not supported
We have an error talking to the kernel
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.225 MB perf.data (1 samples) ]
[root@f31 ~]# perf script
tc 115241 [002] 19665.372130: probe:mall_change__return: (ffffffffc115f930 <- ffffffffb98a7266) ret=0xffffffa1
        ffffffffb9066790 kretprobe_trampoline+0x0 (vmlinux)
            7fa64c16cb77 __libc_sendmsg+0x17 (/usr/lib64/libc-2.30.so)

[root@f31 ~]# tc filter show dev dam0 
[root@f31 ~]#

and similar test can be also carried for the other classifiers
(on unpatched kernel), so it should be easy to understand if there are
other filter that show the same problem. 

> BTW another approach would be to extend ops with new callback
> delete_empty(), require unlocked implementation to define it and move
> functionality of tcf_proto_check_delete() there. Such approach would
> remove the need for (ab)using ops->walk() for this since internally
> in classifier implementation there is always a way to correctly verify
> that classifier instance is empty. Don't know which approach is better
> in this case.

I like the idea of using walk() only when filters are dumped, and handling
the check for empty filters on deletion with a separate callback. That
would allow de-refcounting the filter in the error path unconditionally
for   all classifiers, like old kernel was doing, unless the classifier
implements its own "check empty" function.

how does it sound?
thanks!
-- 
davide