lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1359596906.29117.9.camel@cr0>
Date:	Thu, 31 Jan 2013 09:48:26 +0800
From:	Cong Wang <amwang@...hat.com>
To:	Jesse Gross <jesse@...ira.com>
Cc:	netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>
Subject: Re: [Patch net-next] openvswitch: adjust skb_gso_segment() for rx
 path

On Wed, 2013-01-30 at 13:54 -0800, Jesse Gross wrote:
> On Wed, Jan 30, 2013 at 12:38 AM, Cong Wang <amwang@...hat.com> wrote:
> > From: Cong Wang <amwang@...hat.com>
> >
> > skb_gso_segment() is almost always called in tx path,
> > except for openvswitch. It calls this function when
> > it receives the packet and tries to queue it to user-space.
> > In this special case, the ->ip_summed check inside
> > skb_gso_segment() is no longer true, as ->ip_summed value
> > has different meanings on rx path.
> 
> I don't think that this is really specific to Open vSwitch - it's
> possible that skb_gso_segment() could be called in the transmit path
> after bridging.  I also don't really think that it is true any more
> that the meaning of skb->ip_summed is different on receive vs.
> transmit paths (this was definitely not the case in the past).
> However, it's certainly possible to see different types of packets.

Take a look at the fat comment in include/linux/skbuff.h, unless it is
out-of-date.

> 
> > This patch adjusts skb_gso_segment() so that we can at least
> > avoid such warnings on checksum.
> 
> When do you see GSO packets with CHECKSUM_NONE on receive?

I see CHECKSUM_UNCESSARY set by ixgbe driver or CHECKSUM_PARTIAL set by
gro. According to comments in include/linux/skbuff.h, CHECKSUM_NONE is
set when device is not able to do checksum, it is not my case. The full
backtrace below is the one I got on RHEL6 kernel:

WARNING: at net/core/dev.c:1760 skb_gso_segment+0x1df/0x2b0() (Not
tainted) 
Hardware name: ProLiant DL580 G7 
802.1Q VLAN Support: caps=(0x190833, 0x0) len=1500 data_len=1448
ip_summed=1 
Modules linked in: bonding 8021q garp stp llc openvswitch sunrpc
cpufreq_ondemand freq_table pcc_cpufreq ipv6 power_meter qlcnic cxgb4
be2net ixgbe dca ptp pps_core mdio netxen_nic mlx4_core microcode
serio_raw iTCO_wdt iTCO_vendor_support hpilo hpwdt sg i7core_edac
edac_core shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif
pata_acpi ata_generic ata_piix hpsa radeon ttm drm_kms_helper drm
i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: llc] 
Pid: 0, comm: swapper Not tainted 2.6.32-356.el6.x86_64 #1 
Call Trace: 
 <IRQ>  [<ffffffff8106e2e7>] ? warn_slowpath_common+0x87/0xc0 
 [<ffffffff8106e3d6>] ? warn_slowpath_fmt+0x46/0x50 
 [<ffffffff81439084>] ? sock_def_readable+0x44/0x80 
 [<ffffffff8144871f>] ? skb_gso_segment+0x1df/0x2b0 
 [<ffffffffa043f92e>] ? queue_gso_packets+0x4e/0x1d0 [openvswitch] 
 [<ffffffffa0443558>] ? ovs_flow_extract+0x6a8/0xa80 [openvswitch] 
 [<ffffffffa043fb0d>] ? ovs_dp_upcall+0x5d/0xb0 [openvswitch] 
 [<ffffffffa043fc8e>] ? ovs_dp_process_received_packet+0x12e/0x140
[openvswitch] 
 [<ffffffffa0443a30>] ? ovs_vport_receive+0x30/0x40 [openvswitch] 
 [<ffffffffa0444893>] ? ovs_netdev_frame_hook+0x83/0xac [openvswitch] 
 [<ffffffff814482aa>] ? __netif_receive_skb+0x60a/0x750 
 [<ffffffff8144a528>] ? netif_receive_skb+0x58/0x60 
 [<ffffffff8144a630>] ? napi_skb_finish+0x50/0x70 
 [<ffffffff814e7024>] ? vlan_gro_receive+0x84/0xa0 
 [<ffffffffa030c08e>] ? ixgbe_poll+0x6ae/0x1280 [ixgbe] 
 [<ffffffff810e3d48>] ? handle_edge_irq+0x98/0x180 
 [<ffffffff8144ccf3>] ? net_rx_action+0x103/0x2f0 
 [<ffffffff81076fb1>] ? __do_softirq+0xc1/0x1e0 
 [<ffffffff810e1640>] ? handle_IRQ_event+0x60/0x170 
 [<ffffffff8100c1cc>] ? call_softirq+0x1c/0x30 
 [<ffffffff8100de05>] ? do_softirq+0x65/0xa0 
 [<ffffffff81076d95>] ? irq_exit+0x85/0x90 
 [<ffffffff81516c45>] ? do_IRQ+0x75/0xf0 
 [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 
 <EOI>  [<ffffffff810a871d>] ? tick_nohz_restart_sched_tick+0x16d/0x190 
 [<ffffffff810a870f>] ? tick_nohz_restart_sched_tick+0x15f/0x190 
 [<ffffffff81009f90>] ? cpu_idle+0x80/0x110 
 [<ffffffff81009ff9>] ? cpu_idle+0xe9/0x110 
 [<ffffffff814f2eda>] ? rest_init+0x7a/0x80 
 [<ffffffff81c27f7b>] ? start_kernel+0x424/0x430 
 [<ffffffff81c2733a>] ? x86_64_start_reservations+0x125/0x129 
 [<ffffffff81c27438>] ? x86_64_start_kernel+0xfa/0x109 
---[ end trace 84ef9bd9ae5d9360 ]--- 


> 
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index a87bc74..f6e7b3f 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -2347,7 +2356,7 @@ struct sk_buff *skb_gso_segment(struct sk_buff *skb,
> >         rcu_read_lock();
> >         list_for_each_entry_rcu(ptype, &offload_base, list) {
> >                 if (ptype->type == type && ptype->callbacks.gso_segment) {
> > -                       if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
> > +                       if (unlikely(skb_needs_check(skb))) {
> >                                 err = ptype->callbacks.gso_send_check(skb);
> >                                 segs = ERR_PTR(err);
> >                                 if (err || skb_gso_ok(skb, features))
> 
> Even if we don't warn we likely still need to fix the checksum.

By calling skb_checksum_complete()? My initial version patch had it, but
it didn't work.

> 
> > diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> > index d8c13a9..0b75964 100644
> > --- a/net/openvswitch/datapath.c
> > +++ b/net/openvswitch/datapath.c
> > @@ -302,7 +302,7 @@ static int queue_gso_packets(struct net *net, int dp_ifindex,
> >         int err;
> >
> >         segs = skb_gso_segment(skb, NETIF_F_SG | NETIF_F_HW_CSUM);
> > -       if (IS_ERR(segs))
> > +       if (IS_ERR_OR_NULL(segs))
> >                 return PTR_ERR(segs);
> 
> In what case would we expect that NULL is returned here?


BUG: unable to handle kernel NULL pointer dereference at
00000000000000b9 
IP: [<ffffffffa043f581>] queue_userspace_packet+0x21/0x380
[openvswitch] 
PGD 0  
Oops: 0000 [#1] SMP  
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:03.0/irq 
CPU 0  
Modules linked in: bonding 8021q garp stp llc openvswitch sunrpc
cpufreq_ondemand freq_table pcc_cpufreq ipv6 power_meter qlcnic cxgb4
be2net ixgbe dca ptp pps_core mdio netxen_nic mlx4_core microcode
serio_raw iTCO_wdt iTCO_vendor_support hpilo hpwdt sg i7core_edac
edac_core shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif
pata_acpi ata_generic ata_piix hpsa radeon ttm drm_kms_helper drm
i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: llc] 
 
Pid: 0, comm: swapper Tainted: G        W  ---------------
2.6.32-356.el6.x86_64 #1 HP ProLiant DL580 G7 
RIP: 0010:[<ffffffffa043f581>]  [<ffffffffa043f581>]
queue_userspace_packet+0x21/0x380 [openvswitch] 
RSP: 0018:ffff88002f6039d0  EFLAGS: 00010282 
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff880228c58422 
RDX: ffff88002f603b70 RSI: 0000000000000000 RDI: 0000000000000060 
RBP: ffff88002f603a40 R08: ffffffff81c07728 R09: 0000000000000040 
R10: 000000000000000f R11: 0000000000000008 R12: 0000000000000000 
R13: ffff88002f603b70 R14: 0000000000000060 R15: 0000000000000000 
FS:  0000000000000000(0000) GS:ffff88002f600000(0000)
knlGS:0000000000000000 
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b 
CR2: 00000000000000b9 CR3: 0000000001a85000 CR4: 00000000000007f0 
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 
Process swapper (pid: 0, threadinfo ffffffff81a00000, task
ffffffff81a8d020) 
Stack: 
 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
<d> 0000000000000000 0000000000000000 0000000200000000 0241c7121a501498 
<d> ffff880000031dd8 0000000000000000 0000000000000000 ffff88002f603b70 
Call Trace: 
 <IRQ>  
 [<ffffffffa043f96a>] queue_gso_packets+0x8a/0x1d0 [openvswitch] 
 [<ffffffffa0443558>] ? ovs_flow_extract+0x6a8/0xa80 [openvswitch] 
 [<ffffffffa043fb0d>] ovs_dp_upcall+0x5d/0xb0 [openvswitch] 
 [<ffffffffa043fc8e>] ovs_dp_process_received_packet+0x12e/0x140
[openvswitch] 
 [<ffffffffa0443a30>] ovs_vport_receive+0x30/0x40 [openvswitch] 
 [<ffffffffa0444893>] ovs_netdev_frame_hook+0x83/0xac [openvswitch] 
 [<ffffffff814482aa>] __netif_receive_skb+0x60a/0x750 
 [<ffffffff8144a528>] netif_receive_skb+0x58/0x60 
 [<ffffffff8144a630>] napi_skb_finish+0x50/0x70 
 [<ffffffff814e7024>] vlan_gro_receive+0x84/0xa0 
 [<ffffffffa030c08e>] ixgbe_poll+0x6ae/0x1280 [ixgbe] 
 [<ffffffff810e3d48>] ? handle_edge_irq+0x98/0x180 
 [<ffffffff8144ccf3>] net_rx_action+0x103/0x2f0 
 [<ffffffff81076fb1>] __do_softirq+0xc1/0x1e0 
 [<ffffffff810e1640>] ? handle_IRQ_event+0x60/0x170 
 [<ffffffff8100c1cc>] call_softirq+0x1c/0x30 
 [<ffffffff8100de05>] do_softirq+0x65/0xa0 
 [<ffffffff81076d95>] irq_exit+0x85/0x90 
 [<ffffffff81516c45>] do_IRQ+0x75/0xf0 
 [<ffffffff8100b9d3>] ret_from_intr+0x0/0x11 
 <EOI>  
 [<ffffffff810a871d>] ? tick_nohz_restart_sched_tick+0x16d/0x190 
 [<ffffffff810a870f>] ? tick_nohz_restart_sched_tick+0x15f/0x190 
 [<ffffffff81009f90>] ? cpu_idle+0x80/0x110 
 [<ffffffff81009ff9>] cpu_idle+0xe9/0x110 
 [<ffffffff814f2eda>] rest_init+0x7a/0x80 
 [<ffffffff81c27f7b>] start_kernel+0x424/0x430 
 [<ffffffff81c2733a>] x86_64_start_reservations+0x125/0x129 
 [<ffffffff81c27438>] x86_64_start_kernel+0xfa/0x109 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ