[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <0ba3274f12e24e519bc61f30f0b90444@huawei.com>
Date: Sat, 17 Apr 2021 12:33:51 +0000
From: "zhudi (J)" <zhudi21@...wei.com>
To: linyunsheng <linyunsheng@...wei.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"kuba@...nel.org" <kuba@...nel.org>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"Chenxiang (EulerOS)" <rose.chen@...wei.com>
Subject: Re: [PATCH] net: fix a data race when get vlan device
> On 2021/4/16 11:27, zhudi (J) wrote:
> >> dependencyOn 2021/4/15 11:35, zhudi wrote:
> >>> From: Di Zhu <zhudi21@...wei.com>
> >>>
> >>> We encountered a crash: in the packet receiving process, we got an
> >>> illegal VLAN device address, but the VLAN device address saved in
> >>> vmcore is correct. After checking the code, we found a possible data
> >>> competition:
> >>> CPU 0: CPU 1:
> >>> (RCU read lock) (RTNL lock)
> >>> vlan_do_receive() register_vlan_dev()
> >>> vlan_find_dev()
> >>>
> >>> ->__vlan_group_get_device() ->vlan_group_prealloc_vid()
> >>>
> >>> In vlan_group_prealloc_vid(), We need to make sure that kzalloc is
> >>> executed before assigning a value to vlan devices array, otherwise we
> >>
> >> As my understanding, there is a dependency between calling kzalloc() and
> >> assigning the address(returned from kzalloc()) to vg->vlan_devices_arrays,
> >> CPU and compiler can see the dependency, why can't it handling the
> >> dependency before adding the smp_wmb()?
> >>
> >> See CONTROL DEPENDENCIES section in Documentation/memory-
> >> barriers.txt:
> >>
> >> However, stores are not speculated. This means that ordering -is-
> provided
> >> for load-store control dependencies, as in the following example:
> >>
> >> q = READ_ONCE(a);
> >> if (q) {
> >> WRITE_ONCE(b, 1);
> >> }
> >>
> >
> > Maybe I didn't make it clear. This memory isolation is to ensure the order
> of
> > memset(object, 0, size) in kzalloc() operations and the subsequent array
> assignment statements.
> >
> > kzalloc()
> > ->memset(object, 0, size)
> >
> > smp_wmb()
> >
> > vg->vlan_devices_arrays[pidx][vidx] = array;
> >
> > Because __vlan_group_get_device() function depends on this order
>
> Thanks for clarify, it would be good to mention this in the
> commit log too.
OK, I'll change it. Thank you for your advice.
>
> Also, __vlan_group_get_device() is used in the data path, it would
> be to avoid the barrier op too. Maybe using rcu to avoid the barrier
> if the __vlan_group_get_device() is already protected by rcu_lock.
Using the netperf command for testing on x86, there is no difference in performance:
# netperf -H 112.113.0.12 -l 20 -t TCP_STREAM
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 112.113.0.12 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 20.00 9386.03
After patch:
# netperf -H 112.113.0.12 -l 20 -t TCP_STREAM
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 112.113.0.12 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 20.00 9386.41
The same is true for UDP stream test
>
> >
> >>
> >>
> >>> may get a wrong address from the hardware cache on another cpu.
> >>>
> >>> So fix it by adding memory barrier instruction to ensure the order of
> >>> memory operations.
> >>>
> >>> Signed-off-by: Di Zhu <zhudi21@...wei.com>
> >>> ---
> >>> net/8021q/vlan.c | 2 ++
> >>> net/8021q/vlan.h | 3 +++
> >>> 2 files changed, 5 insertions(+)
> >>>
> >>> diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c index
> >>> 8b644113715e..4f541e05cd3f 100644
> >>> --- a/net/8021q/vlan.c
> >>> +++ b/net/8021q/vlan.c
> >>> @@ -71,6 +71,8 @@ static int vlan_group_prealloc_vid(struct
> vlan_group
> >> *vg,
> >>> if (array == NULL)
> >>> return -ENOBUFS;
> >>>
> >>> + smp_wmb();
> >>> +
> >>> vg->vlan_devices_arrays[pidx][vidx] = array;
> >>> return 0;
> >>> }
> >>> diff --git a/net/8021q/vlan.h b/net/8021q/vlan.h index
> >>> 953405362795..7408fda084d3 100644
> >>> --- a/net/8021q/vlan.h
> >>> +++ b/net/8021q/vlan.h
> >>> @@ -57,6 +57,9 @@ static inline struct net_device
> >>> *__vlan_group_get_device(struct vlan_group *vg,
> >>>
> >>> array = vg->vlan_devices_arrays[pidx]
> >>> [vlan_id /
> >> VLAN_GROUP_ARRAY_PART_LEN];
> >>> +
> >>> + smp_rmb();
> >>> +
> >>> return array ? array[vlan_id % VLAN_GROUP_ARRAY_PART_LEN] :
> >> NULL; }
> >>>
> >>>
> >
Powered by blists - more mailing lists