Message-ID: <0E3E5A3D-E1C5-4C27-BEEB-432891F996F4@redhat.com>
Date: Wed, 07 Jun 2023 11:09:58 +0200
From: Eelco Chaudron <echaudro@...hat.com>
To: wangchuanlei <wangchuanlei@...pur.com>
Cc: aconole@...hat.com, dev@...nvswitch.org, netdev@...r.kernel.org,
simon.horman@...igine.com, wangpeihui@...pur.com, kuba@...nel.org,
pabeni@...hat.com, davem@...emloft.net
Subject: Re: [ovs-dev] [PATCH net v2] net: openvswitch: fix upcall counter
access before allocation
On 7 Jun 2023, at 3:05, wangchuanlei wrote:
> Thanks for fixing this; in a common environment, it's a
> low-probability event.
Well, on ARM, they could replicate it a couple of times, but I guess the system was under memory pressure and had a lot of cores.
>> Eelco Chaudron <echaudro@...hat.com> writes:
>
>> Currently, the per cpu upcall counters are allocated after the vport
>> is created and inserted into the system. This could lead to the
>> datapath accessing the counters before they are allocated resulting in
>> a kernel Oops.
>>
>> Here is an example:
>>
>> PID: 59693 TASK: ffff0005f4f51500 CPU: 0 COMMAND: "ovs-vswitchd"
>> #0 [ffff80000a39b5b0] __switch_to at ffffb70f0629f2f4
>> #1 [ffff80000a39b5d0] __schedule at ffffb70f0629f5cc
>> #2 [ffff80000a39b650] preempt_schedule_common at ffffb70f0629fa60
>> #3 [ffff80000a39b670] dynamic_might_resched at ffffb70f0629fb58
>> #4 [ffff80000a39b680] mutex_lock_killable at ffffb70f062a1388
>> #5 [ffff80000a39b6a0] pcpu_alloc at ffffb70f0594460c
>> #6 [ffff80000a39b750] __alloc_percpu_gfp at ffffb70f05944e68
>> #7 [ffff80000a39b760] ovs_vport_cmd_new at ffffb70ee6961b90 [openvswitch]
>> ...
>>
>> PID: 58682 TASK: ffff0005b2f0bf00 CPU: 0 COMMAND: "kworker/0:3"
>> #0 [ffff80000a5d2f40] machine_kexec at ffffb70f056a0758
>> #1 [ffff80000a5d2f70] __crash_kexec at ffffb70f057e2994
>> #2 [ffff80000a5d3100] crash_kexec at ffffb70f057e2ad8
>> #3 [ffff80000a5d3120] die at ffffb70f0628234c
>> #4 [ffff80000a5d31e0] die_kernel_fault at ffffb70f062828a8
>> #5 [ffff80000a5d3210] __do_kernel_fault at ffffb70f056a31f4
>> #6 [ffff80000a5d3240] do_bad_area at ffffb70f056a32a4
>> #7 [ffff80000a5d3260] do_translation_fault at ffffb70f062a9710
>> #8 [ffff80000a5d3270] do_mem_abort at ffffb70f056a2f74
>> #9 [ffff80000a5d32a0] el1_abort at ffffb70f06297dac
>> #10 [ffff80000a5d32d0] el1h_64_sync_handler at ffffb70f06299b24
>> #11 [ffff80000a5d3410] el1h_64_sync at ffffb70f056812dc
>> #12 [ffff80000a5d3430] ovs_dp_upcall at ffffb70ee6963c84 [openvswitch]
>> #13 [ffff80000a5d3470] ovs_dp_process_packet at ffffb70ee6963fdc [openvswitch]
>> #14 [ffff80000a5d34f0] ovs_vport_receive at ffffb70ee6972c78 [openvswitch]
>> #15 [ffff80000a5d36f0] netdev_port_receive at ffffb70ee6973948 [openvswitch]
>> #16 [ffff80000a5d3720] netdev_frame_hook at ffffb70ee6973a28 [openvswitch]
>> #17 [ffff80000a5d3730] __netif_receive_skb_core.constprop.0 at ffffb70f06079f90
>>
>> We moved the per cpu upcall counter allocation to the existing vport
>> alloc and free functions to solve this.
>>
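To illustrate the ordering described above, here is a minimal userspace sketch in C. It is not the patch and not the openvswitch code: struct port, struct upcall_stats, port_alloc(), port_publish(), port_count_upcall() and port_free() are hypothetical names, calloc() stands in for the per-CPU allocator, and the sketch is single-threaded, so it does not model RCU or any real concurrency. It only shows the idea that the counters are allocated inside the alloc function, so a port can never become visible to the packet path without them.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel structures. */
struct upcall_stats {
	unsigned long n_success;
	unsigned long n_fail;
};

struct port {
	struct upcall_stats *stats;
};

/* What the (modelled) packet path can see. */
static struct port *published;

/* Allocate the port and its counters together, before it is published. */
static struct port *port_alloc(void)
{
	struct port *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;

	p->stats = calloc(1, sizeof(*p->stats));
	if (!p->stats) {
		/* Undo the partial allocation; nothing was published yet. */
		free(p);
		return NULL;
	}
	return p;
}

/* Make the port visible; by construction its counters already exist. */
static void port_publish(struct port *p)
{
	published = p;
}

/* Packet-path side: dereferences the counters unconditionally. */
static void port_count_upcall(struct port *p, int ok)
{
	assert(p->stats != NULL); /* holds because port_alloc() set it */
	if (ok)
		p->stats->n_success++;
	else
		p->stats->n_fail++;
}

/* Free the counters in the same place the port itself is freed. */
static void port_free(struct port *p)
{
	if (!p)
		return;
	free(p->stats);
	free(p);
}
```

A caller would do port_alloc(), then port_publish(), and only then let port_count_upcall() run against the published pointer. Had the counters been allocated after port_publish(), the counting function could run in between and dereference a NULL pointer, which is the window the backtrace above shows.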
>> Fixes: 95637d91fefd ("net: openvswitch: release vport resources on failure")
>> Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall packets")
>> Signed-off-by: Eelco Chaudron <echaudro@...hat.com>
>> ---
>
> Acked-by: Aaron Conole <aconole@...hat.com>
Were you intentionally ACKing this on Aaron’s behalf, or was it just a cut/paste error? ;)
> _______________________________________________
> dev mailing list
> dev@...nvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev