netdev - Re: [PATCH net-next v6 10/11] net: sched: atomically check-allocate action

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <vbfeff6bjm0.fsf@reg-r-vrt-018-180.mtr.labs.mlnx>
Date:   Fri, 10 Aug 2018 13:29:11 +0300
From:   Vlad Buslov <vladbu@...lanox.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Yevgeny Kliteynik <kliteyn@...lanox.com>,
        Jiri Pirko <jiri@...lanox.com>
Subject: Re: [PATCH net-next v6 10/11] net: sched: atomically check-allocate action


On Thu 09 Aug 2018 at 23:43, Cong Wang <xiyou.wangcong@...il.com> wrote:
> On Wed, Aug 8, 2018 at 5:06 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>>
>>
>> On Wed 08 Aug 2018 at 01:20, Cong Wang <xiyou.wangcong@...il.com> wrote:
>> > On Thu, Jul 5, 2018 at 7:24 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>> >>
>> >> Implement function that atomically checks if action exists and either takes
>> >> reference to it, or allocates idr slot for action index to prevent
>> >> concurrent allocations of actions with same index. Use EBUSY error pointer
>> >> to indicate that idr slot is reserved.
>> >
>> > A dumb question:
>> >
>> > How could "concurrent allocations of actions with same index" happen
>> > as you already take idrinfo->lock for the whole
>> > tcf_idr_check_alloc()??
>>
>> I guess my changelog is not precise enough in this description.
>> Let look into sequence of events of initialization of new action:
>> 1) tcf_idr_check_alloc() is called by action init.
>> 2) idrinfo->lock is taken.
>> 3) Lookup in idr is performed to determine if action with specified
>> index already exists.
>> 4) EBUSY pointer is inserted to indicate that id is taken.
>> 5) idrinfo->lock is released.
>> 6) tcf_idr_check_alloc() returns to action init code.
>> 7) New action is allocated and initialized.
>> 8) tcf_idr_insert() is called.
>> 9) idrinfo->lock is taken.
>> 10) EBUSY pointer is substituted with pointer to new action.
>> 11) idrinfo->lock is released.
>> 12) tcf_idr_insert() returns.
>>
>> So in this case "concurrent allocations of actions with same index"
>> means not the allocation with same index during tcf_idr_check_alloc(),
>> but during the period when idrinfo->lock was released(6-8).
>
> Yes but it is unnecessary:
>
> a) When adding a new action, you can actually allocate and init it before
> touching idrinfo, therefore the check and insert can be done in one step
> instead of breaking down it into multiple steps, which means you can
> acquire idrinfo->lock once.
>
> b) When updating an existing action, it is slightly complicated.
> However, you can still allocate a new one first, then find the old one
> and copy it into the new one and finally replace it.
>
> In summary, we can do the following:
>
> 1. always allocate a new action
> 2. acquire idrinfo->lock
> 3a. if it is an add operation: allocate a new ID and insert the new action
> 3b. if it is a replace operation: find the old one with ID, copy it into the
> new one and replace it
> 4. release idrinfo->lock
> 5. If 3a or 3b fails, free the allocation. Otherwise succeed.
>
> I know, the locking scope is now per netns rather than per action,
> but this can be optimized for replacing, you can hold the old action
> and then release the idrinfo->lock, as idr_replace() later doesn't
> require idrinfo->lock AFAIK.
>
> Is there anything I miss here?

Approach you suggest is valid, but has its own trade-offs:

- As you noted, lock granularity becomes coarse-grained due to per-netns
scope.

- I am not sure it is possible to call idr_replace() without obtaining
idrinfo->lock in this particular case. Concurrent delete of action with
same id is possible and, according to idr_replace() description,
unlocked execution is not supported for such use-case:

/**
 * idr_replace() - replace pointer for given ID.
 * @idr: IDR handle.
 * @ptr: New pointer to associate with the ID.
 * @id: ID to change.
 *
 * Replace the pointer registered with an ID and return the old value.
 * This function can be called under the RCU read lock concurrently with
 * idr_alloc() and idr_remove() (as long as the ID being removed is not
 * the one being replaced!).
 *
 * Returns: the old value on success.  %-ENOENT indicates that @id was not
 * found.  %-EINVAL indicates that @ptr was not valid.
 */

- High rate or replace request will generate a lot of unnecessary memory
allocations and deallocations.

>
>
>>
>> >
>> > For me, it should be only one allocation could succeed, all others
>> > should fail.
>>
>> Correct! And this change is made specifically to enforce that rule.
>>
>> Otherwise, multiple processes could try to create new action with same
>> id at the same time, and all processes that executed 3, before any
>> process reached 10, will "succeed" by overwriting each others action in
>> idr. (and leak memory while doing so)
>
> I know but again it doesn't look necessary to achieve a same goal.
>
>
>>
>> >
>> > Maybe you are trying to prevent others treat it like existing one,
>> > but in that case you can just hold the idinfo->lock for all idr operations.
>> >
>> > And more importantly, upper layer is able to tell it is a creation or
>> > just replace, you don't have to check this in this complicated way.
>> >
>> > IOW, all of these complicated code should not exist.
>>
>> Original code was simpler and didn't involve temporary EBUSY pointer.
>> This change was made according to Jiri's request. He wanted to have
>> unified API to be used by all actions and suggested this approach
>> specifically.
>
> I will work on this, as this is aligned to my work to make
> it RCU-complete.