lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <863681d4-92cf-1575-a182-b07f22ec578e@gmail.com>
Date:   Fri, 25 May 2018 22:43:56 +0900
From:   Toshiaki Makita <toshiaki.makita1@...il.com>
To:     Jason Wang <jasowang@...hat.com>,
        Toshiaki Makita <makita.toshiaki@....ntt.co.jp>,
        "David S. Miller" <davem@...emloft.net>
Cc:     netdev@...r.kernel.org
Subject: Re: [PATCH net] tun: Fix NULL pointer dereference in XDP redirect

On 18/05/25 (金) 18:59, Jason Wang wrote:
> On 2018年05月25日 12:32, Toshiaki Makita wrote:
>> Calling XDP redirection requires preempt/bh disabled. Especially softirq
>> can call another XDP function and redirection functions, then percpu
>> value ri->map can be overwritten to NULL.
>>
>> This is a generic XDP case called from tun.
>>
>> [ 3535.736058] BUG: unable to handle kernel NULL pointer dereference 
>> at 0000000000000018
>> [ 3535.743974] PGD 0 P4D 0
>> [ 3535.746530] Oops: 0000 [#1] SMP PTI
>> [ 3535.750049] Modules linked in: vhost_net vhost tap tun bridge stp 
>> llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter 
>> sunrpc vfat fat ext4 mbcache jbd2 intel_rapl skx_edac nfit libnvdimm 
>> x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ipmi_ssif 
>> irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ses 
>> aesni_intel crypto_simd cryptd enclosure hpwdt hpilo glue_helper 
>> ipmi_si pcspkr wmi mei_me ioatdma mei ipmi_devintf shpchp dca 
>> ipmi_msghandler lpc_ich acpi_power_meter sch_fq_codel ip_tables xfs 
>> libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea 
>> sysfillrect sysimgblt fb_sys_fops ttm drm smartpqi i40e crc32c_intel 
>> scsi_transport_sas tg3 i2c_core ptp pps_core
>> [ 3535.813456] CPU: 5 PID: 1630 Comm: vhost-1614 Not tainted 
>> 4.17.0-rc4 #2
>> [ 3535.820127] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 
>> Gen10, BIOS U32 11/14/2017
>> [ 3535.828732] RIP: 0010:__xdp_map_lookup_elem+0x5/0x30
>> [ 3535.833740] RSP: 0018:ffffb4bc47bf7c58 EFLAGS: 00010246
>> [ 3535.839009] RAX: ffff9fdfcfea1c40 RBX: 0000000000000000 RCX: 
>> ffff9fdf27fe3100
>> [ 3535.846205] RDX: ffff9fdfca769200 RSI: 0000000000000000 RDI: 
>> 0000000000000000
>> [ 3535.853402] RBP: ffffb4bc491d9000 R08: 00000000000045ad R09: 
>> 0000000000000ec0
>> [ 3535.860597] R10: 0000000000000001 R11: ffff9fdf26c3ce4e R12: 
>> ffff9fdf9e72c000
>> [ 3535.867794] R13: 0000000000000000 R14: fffffffffffffff2 R15: 
>> ffff9fdfc82cdd00
>> [ 3535.874990] FS:  0000000000000000(0000) GS:ffff9fdfcfe80000(0000) 
>> knlGS:0000000000000000
>> [ 3535.883152] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 3535.888948] CR2: 0000000000000018 CR3: 0000000bde724004 CR4: 
>> 00000000007626e0
>> [ 3535.896145] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>> 0000000000000000
>> [ 3535.903342] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
>> 0000000000000400
>> [ 3535.910538] PKRU: 55555554
>> [ 3535.913267] Call Trace:
>> [ 3535.915736]  xdp_do_generic_redirect+0x7a/0x310
>> [ 3535.920310]  do_xdp_generic.part.117+0x285/0x370
>> [ 3535.924970]  tun_get_user+0x5b9/0x1260 [tun]
>> [ 3535.929279]  tun_sendmsg+0x52/0x70 [tun]
>> [ 3535.933237]  handle_tx+0x2ad/0x5f0 [vhost_net]
>> [ 3535.937721]  vhost_worker+0xa5/0x100 [vhost]
>> [ 3535.942030]  kthread+0xf5/0x130
>> [ 3535.945198]  ? vhost_dev_ioctl+0x3b0/0x3b0 [vhost]
>> [ 3535.950031]  ? kthread_bind+0x10/0x10
>> [ 3535.953727]  ret_from_fork+0x35/0x40
>> [ 3535.957334] Code: 0e 74 15 83 f8 10 75 05 e9 49 aa b3 ff f3 c3 0f 
>> 1f 80 00 00 00 00 f3 c3 e9 29 9d b3 ff 66 0f 1f 84 00 00 00 00 00 0f 
>> 1f 44 00 00 <8b> 47 18 83 f8 0e 74 0d 83 f8 10 75 05 e9 49 a9 b3 ff 31 
>> c0 c3
>> [ 3535.976387] RIP: __xdp_map_lookup_elem+0x5/0x30 RSP: ffffb4bc47bf7c58
>> [ 3535.982883] CR2: 0000000000000018
>> [ 3535.987096] ---[ end trace 383b299dd1430240 ]---
>> [ 3536.131325] Kernel panic - not syncing: Fatal exception
>> [ 3536.137484] Kernel Offset: 0x26a00000 from 0xffffffff81000000 
>> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>> [ 3536.281406] ---[ end Kernel panic - not syncing: Fatal exception ]---
>>
>> And a kernel with generic case fixed still panics in tun driver XDP
>> redirect, because it did not disable bh.
>>
>> [ 2055.128746] BUG: unable to handle kernel NULL pointer dereference 
>> at 0000000000000018
>> [ 2055.136662] PGD 0 P4D 0
>> [ 2055.139219] Oops: 0000 [#1] SMP PTI
>> [ 2055.142736] Modules linked in: vhost_net vhost tap tun bridge stp 
>> llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter 
>> sunrpc vfat fat ext4 mbcache jbd2 intel_rapl skx_edac nfit libnvdimm 
>> x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass 
>> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ses aesni_intel 
>> ipmi_ssif crypto_simd enclosure cryptd hpwdt glue_helper ioatdma hpilo 
>> wmi dca pcspkr ipmi_si acpi_power_meter ipmi_devintf shpchp mei_me 
>> ipmi_msghandler mei lpc_ich sch_fq_codel ip_tables xfs libcrc32c 
>> sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect 
>> sysimgblt fb_sys_fops ttm drm i40e smartpqi tg3 scsi_transport_sas 
>> crc32c_intel i2c_core ptp pps_core
>> [ 2055.206142] CPU: 6 PID: 1693 Comm: vhost-1683 Tainted: G        
>> W         4.17.0-rc5-fix-tun+ #1
>> [ 2055.215011] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 
>> Gen10, BIOS U32 11/14/2017
>> [ 2055.223617] RIP: 0010:__xdp_map_lookup_elem+0x5/0x30
>> [ 2055.228624] RSP: 0018:ffff998b07607cc0 EFLAGS: 00010246
>> [ 2055.233892] RAX: ffff8dbd8e235700 RBX: ffff8dbd8ff21c40 RCX: 
>> 0000000000000004
>> [ 2055.241089] RDX: ffff998b097a9000 RSI: 0000000000000000 RDI: 
>> 0000000000000000
>> [ 2055.248286] RBP: 0000000000000000 R08: 00000000000065a8 R09: 
>> 0000000000005d80
>> [ 2055.255483] R10: 0000000000000040 R11: ffff8dbcf0100000 R12: 
>> ffff998b097a9000
>> [ 2055.262681] R13: ffff8dbd8c98c000 R14: 0000000000000000 R15: 
>> ffff998b07607d78
>> [ 2055.269879] FS:  0000000000000000(0000) GS:ffff8dbd8ff00000(0000) 
>> knlGS:0000000000000000
>> [ 2055.278039] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 2055.283834] CR2: 0000000000000018 CR3: 0000000c0c8cc005 CR4: 
>> 00000000007626e0
>> [ 2055.291030] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>> 0000000000000000
>> [ 2055.298227] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
>> 0000000000000400
>> [ 2055.305424] PKRU: 55555554
>> [ 2055.308153] Call Trace:
>> [ 2055.310624]  xdp_do_redirect+0x7b/0x380
>> [ 2055.314499]  tun_get_user+0x10fe/0x12a0 [tun]
>> [ 2055.318895]  tun_sendmsg+0x52/0x70 [tun]
>> [ 2055.322852]  handle_tx+0x2ad/0x5f0 [vhost_net]
>> [ 2055.327337]  vhost_worker+0xa5/0x100 [vhost]
>> [ 2055.331646]  kthread+0xf5/0x130
>> [ 2055.334813]  ? vhost_dev_ioctl+0x3b0/0x3b0 [vhost]
>> [ 2055.339646]  ? kthread_bind+0x10/0x10
>> [ 2055.343343]  ret_from_fork+0x35/0x40
>> [ 2055.346950] Code: 0e 74 15 83 f8 10 75 05 e9 e9 aa b3 ff f3 c3 0f 
>> 1f 80 00 00 00 00 f3 c3 e9 c9 9d b3 ff 66 0f 1f 84 00 00 00 00 00 0f 
>> 1f 44 00 00 <8b> 47 18 83 f8 0e 74 0d 83 f8 10 75 05 e9 e9 a9 b3 ff 31 
>> c0 c3
>> [ 2055.366004] RIP: __xdp_map_lookup_elem+0x5/0x30 RSP: ffff998b07607cc0
>> [ 2055.372500] CR2: 0000000000000018
>> [ 2055.375856] ---[ end trace 2a2dcc5e9e174268 ]---
>> [ 2055.523626] Kernel panic - not syncing: Fatal exception
>> [ 2055.529796] Kernel Offset: 0x2e000000 from 0xffffffff81000000 
>> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>> [ 2055.677539] ---[ end Kernel panic - not syncing: Fatal exception ]---
>>
>> Fixes: 761876c857cb ("tap: XDP support")
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@....ntt.co.jp>
>> ---
>>   drivers/net/tun.c | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 45d8077..4fc7dbf 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1650,6 +1650,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>       else
>>           *skb_xdp = 0;
>> +    local_bh_disable();
>>       preempt_disable();
>>       rcu_read_lock();
>>       xdp_prog = rcu_dereference(tun->xdp_prog);
>> @@ -1676,6 +1677,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>                   goto err_redirect;
>>               rcu_read_unlock();
>>               preempt_enable();
>> +            local_bh_enable();
>>               return NULL;
>>           case XDP_TX:
>>               get_page(alloc_frag->page);
>> @@ -1685,6 +1687,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>               tun_xdp_flush(tun->dev);
>>               rcu_read_unlock();
>>               preempt_enable();
>> +            local_bh_enable();
>>               return NULL;
>>           case XDP_PASS:
>>               delta = orig_data - xdp.data;
>> @@ -1704,6 +1707,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>       if (!skb) {
>>           rcu_read_unlock();
>>           preempt_enable();
>> +        local_bh_enable();
>>           return ERR_PTR(-ENOMEM);
>>       }
>> @@ -1714,6 +1718,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>       rcu_read_unlock();
>>       preempt_enable();
>> +    local_bh_enable();
>>       return skb;
>> @@ -1722,6 +1727,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>   err_xdp:
>>       rcu_read_unlock();
>>       preempt_enable();
>> +    local_bh_enable();
>>       this_cpu_inc(tun->pcpu_stats->rx_dropped);
>>       return NULL;
>>   }
>> @@ -1917,16 +1923,22 @@ static ssize_t tun_get_user(struct tun_struct 
>> *tun, struct tun_file *tfile,
>>           struct bpf_prog *xdp_prog;
>>           int ret;
>> +        local_bh_disable();
>> +        preempt_disable();
>>           rcu_read_lock();
>>           xdp_prog = rcu_dereference(tun->xdp_prog);
>>           if (xdp_prog) {
>>               ret = do_xdp_generic(xdp_prog, skb);
>>               if (ret != XDP_PASS) {
>>                   rcu_read_unlock();
>> +                preempt_enable();
>> +                local_bh_enable();
>>                   return total_len;
>>               }
>>           }
>>           rcu_read_unlock();
>> +        preempt_enable();
>> +        local_bh_enable();
>>       }
>>       rcu_read_lock();
> 
> Good catch, thanks.
> 
> But I think we can just replace preempt_disable()/enable() with 
> local_bh_disable()/local_bh_enable() ?

I actually thought the same, but noticed this patch.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9ea4c380066fbe

It looks like they do not think local_bh_disable() implies 
preempt_disable(). But I'm not sure why..

Toshiaki Makita

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ