Message-ID: <CACeb84sLAPh7oEs2Y+VGFwbuaRkOnw0CvtM0H+Ps-iO597+5Jw@mail.gmail.com>
Date: Mon, 19 Nov 2018 16:29:54 +0530
From: "Anand H. Krishnan" <anandhkrishnan@...il.com>
To: willemdebruijn.kernel@...il.com
Cc: netdev@...r.kernel.org
Subject: Re: VETH & AF_PACKET problem
Hello,

I tried the 4.19.2 kernel without success. You were probably right
that skb_orphan is indeed called from somewhere in the receive path.
On an instrumented kernel, I see the following:
[ 324.709846] Call Trace:
[ 324.709847] <IRQ>
[ 324.709855] dump_stack+0x63/0x85
[ 324.709859] tpacket_destruct_skb+0x2d/0x160
[ 324.709865] ip_rcv_core.isra.17+0x1c5/0x2a0
[ 324.709867] ip_rcv+0x37/0xd0
[ 324.709870] __netif_receive_skb_one_core+0x57/0x80
[ 324.709872] __netif_receive_skb+0x18/0x60
[ 324.709873] process_backlog+0xa4/0x170
[ 324.709874] net_rx_action+0x140/0x3a0
[ 324.709879] ? lapic_next_deadline+0x26/0x30
[ 324.709882] __do_softirq+0xe4/0x2d4
[ 324.709886] do_softirq_own_stack+0x2a/0x40
[ 324.709886] </IRQ>
[ 324.709891] do_softirq.part.21+0x54/0x60
[ 324.709893] __local_bh_enable_ip+0x65/0x70
[ 324.709894] dev_direct_xmit+0x137/0x1d0
[ 324.709896] packet_direct_xmit+0x51/0xa0
[ 324.709897] packet_sendmsg+0x7be/0x1800
[ 324.709899] ? __switch_to_asm+0x34/0x70
[ 324.709905] sock_sendmsg+0x3e/0x50
[ 324.709907] __sys_sendto+0x13f/0x180
[ 324.709914] ? handle_mm_fault+0xe3/0x220
[ 324.709917] ? __do_page_fault+0x270/0x4d0
[ 324.709919] __x64_sys_sendto+0x28/0x30
[ 324.709924] do_syscall_64+0x5a/0x120
[ 324.709926] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Thanks,
Anand
On Wed, Nov 14, 2018 at 10:11 AM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> On Tue, Nov 13, 2018 at 8:29 PM Anand H. Krishnan
> <anandhkrishnan@...il.com> wrote:
> >
> > skb_scrub_packet calls skb_orphan and from there the destructor is called.
>
> Not since
>
> commit 9c4c325252c54b34d53b3d0ffd535182b744e03d
> skbuff: preserve sock reference when scrubbing the skb.
> v4.19-rc1~140^2~523^2
>
> But the general issue is valid: the tx_ring slot should not be
> released until all users of the pages are freed, not merely when the
> skb is orphaned (e.g., on skb_set_owner_r).
>
> I think that this can happen even on a transmit to a physical
> nic, if a clone is queued for reception on a packet socket. That
> clone does not clone the destructor, so if the reader is slow,
> the slot may be released from consume_skb on the original
> path.
>
> I have not verified this yet. But if correct, then the long term
> solution is to use refcounted uarg, similar to msg_zerocopy.