netdev - Re: [PATCH] net: packetmmap: fix only tx timestamp on request

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAN6QFNyHep+UGjM7XpA4akbtvZFNDarVcs3=zZPpYO7RMTJgHg@mail.gmail.com>
Date:   Wed, 12 May 2021 10:11:28 +1200
From:   Richard Sanger <rsanger@...d.net.nz>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [PATCH] net: packetmmap: fix only tx timestamp on request

I've had a chance to look into this further and have found where the
timestamp is added. Details are at the end of this message.

On Thu, May 6, 2021 at 1:23 PM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> On Wed, May 5, 2021 at 7:42 PM Richard Sanger <rsanger@...d.net.nz> wrote:
[...]
> >
> > I've just verified using printk() that after the call to skb_tx_timestamp(skb)
> > in veth_xmit() skb->tstamp == 0 as expected.
> >
> > However, when skb_tx_timestamp() is called within the packetmmap code path
> > skb->tstamp holds a valid time.
>
> Interesting. I had expected veth_xmit to trigger skb_orphan, which
> calls the destructor.
>
> But this is no longer true as of commit 9c4c325252c5 ("skbuff:
> preserve sock reference when scrubbing the skb.").
>
> As a result, I suppose the skb can enter the next namespace and be
> timestamped there if receive timestamps are enabled (this is not
> per-socket).
>
> One way to verify, if you can easily recompile a kernel, is to add a
> WARN_ON_ONCE(1) to tpacket_destruct_skb to see which path led up to
> queuing the completion notification.
>

Here's the output of putting a WARN_ON_ONCE(1) statement in
tpacket_destruct_skb, I don't believe it is related to the problem.

[   37.249629] RIP: 0010:tpacket_destruct_skb+0x24/0x60
[...]
[   37.249659] Call Trace:
[   37.249661]  <IRQ>
[   37.249666]  skb_release_head_state+0x44/0x90
[   37.249680]  skb_release_all+0x13/0x30
[   37.249684]  kfree_skb+0x2f/0xa0
[   37.249689]  llc_rcv+0x2e/0x360 [llc]
[   37.249698]  __netif_receive_skb_one_core+0x8f/0xa0
[   37.249707]  __netif_receive_skb+0x18/0x60
[   37.249710]  process_backlog+0xa9/0x160
[   37.249714]  __napi_poll+0x31/0x140
[   37.249717]  net_rx_action+0xde/0x210
[   37.249722]  __do_softirq+0xe0/0x29b
[   37.249737]  do_softirq+0x66/0x80
[   37.249747]  </IRQ>
[   37.249748]  __local_bh_enable_ip+0x50/0x60
[   37.249751]  __dev_queue_xmit+0x23a/0x6e0
[   37.249756]  dev_queue_xmit+0x10/0x20
[   37.249759]  packet_sendmsg+0x6b8/0x1c90
[   37.249763]  ? __drain_all_pages+0x150/0x1c0
[   37.249772]  sock_sendmsg+0x65/0x70
[   37.249778]  __sys_sendto+0x113/0x190
[   37.249783]  ? handle_mm_fault+0xda/0x2b0
[   37.249790]  ? exit_to_user_mode_prepare+0x3c/0x1e0
[   37.249800]  ? do_user_addr_fault+0x1d3/0x640
[   37.249805]  __x64_sys_sendto+0x29/0x30
[   37.249809]  do_syscall_64+0x40/0xb0
[   37.249816]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   37.249820] RIP: 0033:0x7f43950d27ea

[...]

> I think we need to understand exactly what goes on before we apply a
> patch. It might just be papering over the problem otherwise.

Okay, so the call path that adds the timestamp looks like this:

send() syscall triggers tpacket_snd() which calls the veth_xmit() hander.
In drivers/net/veth.c veth_xmit() calls veth_forward_skb() which then
calls netif_rx()/netif_rx_internal() in net/core/dev.c.
And finally, net_timestamp_check(netdev_tstamp_prequeue, skb) adds
the timestamp, netdev_tstamp_prequeue defaults to 1.

net_timestamp_check in its current form was added by 588f033075
("net: use jump_label for netstamp_needed ")
In the kernel since 3.3-rc1, so it looks like this issue has been present the
entire time. Pre-conditions are netstamp_needed_key and
netdev_tstamp_prequeue, so if either is false, timestamping won't happen
at this stage in the code.

Here's the call trace of where the timestamp is added

[  251.619538] Call Trace:
[  251.619550]  netif_rx+0x1b/0x60
[  251.619556]  veth_xmit+0x19d/0x230 [veth]
[  251.619563]  netdev_start_xmit+0x4a/0x8b
[  251.619566]  dev_hard_start_xmit.cold+0xc8/0x1d5
[  251.619569]  __dev_queue_xmit.cold+0xa3/0x12c
[  251.619572]  dev_queue_xmit+0x10/0x20
[  251.619575]  packet_sendmsg+0x6b8/0x1c90
[  251.619580]  ? __drain_all_pages+0x150/0x1c0
[  251.619588]  sock_sendmsg+0x65/0x70
[  251.619594]  __sys_sendto+0x113/0x190
[  251.619598]  ? handle_mm_fault+0xda/0x2b0
[  251.619604]  ? exit_to_user_mode_prepare+0x3c/0x1e0
[  251.619611]  ? do_user_addr_fault+0x1d3/0x640
[  251.619615]  __x64_sys_sendto+0x29/0x30
[  251.619618]  do_syscall_64+0x40/0xb0
[  251.619623]  entry_SYSCALL_64_after_hwframe+0x44/0xae

This appears to be reasonable, but I don't know what the expected behaviour
is. Should this timestamp still be cleared before returning the sent skb?