[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <cover.1550271080.git.dcaratti@redhat.com>
Date: Sat, 16 Feb 2019 00:06:26 +0100
From: Davide Caratti <dcaratti@...hat.com>
To: Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>
Cc: "David S. Miller" <davem@...emloft.net>,
Vlad Buslov <vladbu@...lanox.com>,
Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Subject: [PATCH RFC 0/5] net/sched: validate the control action with all the other parameters
currently, the kernel checks for bad values of the control action in
tcf_action_init_1(), after a successful call to the action's init()
function. This causes three bad behaviors:
1. the "half configuration"
if the action is overwritten, the new configuration data are
applied successfully
# tc action add action gact drop index 100
# tc action replace action gact goto chain 66 index 100 \
> cookie aabbccdd
Error: Failed to init TC action chain
We have an error talking to the kernel
# tc action show action gact
total acts 1
action order 0: gact action goto chain 66
random type none pass val 0
index 100 ref 1 bind 0
cookie aabbccdd
2. a "refcount leak that makes kmemleak complain"
when a valid 'goto chain' action is overwritten with another 'goto chain'
action, the kernel leaks two refcounts.
# tc chain add dev dd0 chain 42 ingress protocol ip flower \
> ip_proto tcp action drop
# tc chain add dev dd0 chain 43 ingress protocol ip flower \
> ip_proto udp action drop
# tc filter add dev dd0 ingress matchall \
> action goto chain 42 index 66
# tc action replace action gact goto chain 43 index 66
Error: Failed to init TC action chain.
We have an error talking to the kernel
# echo scan >/sys/kernel/debug/kmemleak
<...>
unreferenced object 0xffff93c0ee09f000 (size 1024):
comm "tc", pid 2565, jiffies 4295339808 (age 65.426s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 08 00 06 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<000000009b63f92d>] tc_ctl_chain+0x3d2/0x4c0
[<00000000683a8d72>] rtnetlink_rcv_msg+0x263/0x2d0
[<00000000ddd88f8e>] netlink_rcv_skb+0x4a/0x110
[<000000006126a348>] netlink_unicast+0x1a0/0x250
[<00000000b3340877>] netlink_sendmsg+0x2c1/0x3c0
[<00000000a25a2171>] sock_sendmsg+0x36/0x40
[<00000000f19ee1ec>] ___sys_sendmsg+0x280/0x2f0
[<00000000d0422042>] __sys_sendmsg+0x5e/0xa0
[<000000007a6c61f9>] do_syscall_64+0x5b/0x180
[<00000000ccd07542>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0000000013eaa334>] 0xffffffffffffffff
3. a "kernel crash in the traffic plane"
when an action is overwritten with an invalid 'goto chain' action,
packets hitting the new rule will trigger a NULL pointer dereference:
# tc qdisc add dev crash0 clsact
# tc filter add dev crash0 egress matchall action csum icmp index 6
# tc action replace action csum icmp goto chain 42 index 6 cookie c1a0c1a0
# ping 1.2.3.4 -Icrash0
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
#PF error: [normal kernel read fault]
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.0.0-rc4.splash #516
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:tcf_action_exec+0xb5/0x100
Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3
RSP: 0018:ffff99933da03be0 EFLAGS: 00010246
RAX: 000000002000002a RBX: ffff99933c463300 RCX: 0000000000003a00
RDX: 0000000000000000 RSI: ffff9992ebda2828 RDI: ffff9992ebda2818
RBP: ffff99933da03c80 R08: 0000000032250000 R09: 0000000000000003
R10: 000000000000003f R11: 0000000000000028 R12: ffff9992ed22c100
R13: ffff9992ed22c108 R14: 0000000000000001 R15: ffff9993339919c0
FS: 0000000000000000(0000) GS:ffff99933da00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000005a0e004 CR4: 00000000001606f0
Call Trace:
<IRQ>
tcf_classify+0x58/0x110
__dev_queue_xmit+0x407/0x890
? ip6_finish_output2+0x366/0x590
ip6_finish_output2+0x366/0x590
? ip6_output+0x68/0x110
ip6_output+0x68/0x110
? nf_hook.constprop.35+0x79/0xc0
mld_sendpack+0x16f/0x220
mld_ifc_timer_expire+0x195/0x2c0
? igmp6_timer_handler+0x70/0x70
call_timer_fn+0x2b/0x130
run_timer_softirq+0x3e8/0x440
? tick_sched_timer+0x37/0x70
__do_softirq+0xe3/0x2f5
irq_exit+0xf0/0x100
smp_apic_timer_interrupt+0x6c/0x130
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:native_safe_halt+0x2/0x10
Code: 74 ff ff ff 7f f3 c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8b eb c1 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90
RSP: 0018:ffffffff8be03e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: ffffffff8b417730 RBX: 0000000000000000 RCX: 7ffffb503267b27c
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff99933da1d240
RBP: 0000000000000000 R08: 000c515677542b91 R09: 0000000000000000
R10: ffffb312c0cffd98 R11: 0000000000000001 R12: 0000000000000000
all these three problems can be fixed if we validate the control action
in the init() function, in the same way as we are already doing for all
the other parameters.
- patch 1 is a temporary fix for problem 2), but it's reverted at the
end of the series
- we need to refcount and unrefcount chains within init(): patch 2
extends the init() prototype to correctly handle 'goto chain' control
actions.
- patch 3 to [n] fix all the actions, providing a proper error path in
case of bad control action, thus providing a fix for problems 1), 2)
and 3)
- patch [n+1] removes two if statements in tcf_action_init_1() causing
the above problems:
if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN)) {
....
}
and
if (!tcf_action_valid(a->tcfa_action)) {
....
}
as they are not not needed anymore, and for the same reason reverts
patch 1.
This RFC series fixes only csum, bpf and gact - so it stops with n=5.
In case there are not objections, I will prepare a series fixing all
the actions and send it targeting the 'net' tree (or also 'net-next',
this behavior is reproducible since the first introduction of 'goto
action').
Any feedback is appreciated, thank you in advance!
Davide Caratti (5):
net/sched: fix refcount leak when 'goto_chain' is used
net/sched: prepare TC actions to properly validate the control action
net/sched: act_bpf: validate the control action inside init()
net/sched: act_csum: validate the control action inside init()
net/sched: act_gact: validate the control action inside init()
include/net/act_api.h | 9 +++++-
net/sched/act_api.c | 63 ++++++++++++++++++++++++++++++++++++--
net/sched/act_bpf.c | 12 ++++++--
net/sched/act_connmark.c | 1 +
net/sched/act_csum.c | 22 ++++++++++---
net/sched/act_gact.c | 17 ++++++++--
net/sched/act_ife.c | 2 +-
net/sched/act_ipt.c | 11 ++++---
net/sched/act_mirred.c | 1 +
net/sched/act_nat.c | 3 +-
net/sched/act_pedit.c | 2 +-
net/sched/act_police.c | 1 +
net/sched/act_sample.c | 2 +-
net/sched/act_simple.c | 2 +-
net/sched/act_skbedit.c | 1 +
net/sched/act_skbmod.c | 1 +
net/sched/act_tunnel_key.c | 1 +
net/sched/act_vlan.c | 2 +-
18 files changed, 131 insertions(+), 22 deletions(-)
--
2.20.1
Powered by blists - more mailing lists