[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <09837ee8-9359-6de5-801d-db85e4e34dfa@codeaurora.org>
Date: Wed, 8 Apr 2020 16:40:15 +0530
From: Maulik Shah <mkshah@...eaurora.org>
To: Doug Anderson <dianders@...omium.org>
Cc: Andy Gross <agross@...nel.org>,
Bjorn Andersson <bjorn.andersson@...aro.org>,
Matthias Kaehlcke <mka@...omium.org>,
Rajendra Nayak <rnayak@...eaurora.org>,
Evan Green <evgreen@...omium.org>,
Lina Iyer <ilina@...eaurora.org>,
Stephen Boyd <swboyd@...omium.org>,
linux-arm-msm <linux-arm-msm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFT PATCH v2 05/10] drivers: qcom: rpmh-rsc: A lot of comments
Hi,
On 4/3/2020 1:48 AM, Doug Anderson wrote:
> Hi,
>
> On Wed, Apr 1, 2020 at 4:30 AM Maulik Shah <mkshah@...eaurora.org> wrote:
>>> + * Return: 0 if no problem, or -EAGAIN if the caller should try again in a bit.
>>> + * Caller should make sure to enable interrupts between tries.
>>> + * Only ever returns -EAGAIN for ACTIVE_ONLY transfers.
>> with [2] it will not return -EAGAIN, can you please remove this part.
> Sounds good. Overall I'll probably wait to spin until your series
> lands because trying to keep spinning this one in conjunction with
> that one is going to be a nightmare. Hopefully that means:
>
> a) Your series can land soon. I think it's in pretty good shape now.
> I just sent a bunch of reviews. Might need one more spin for nits and
> then we'll see if Bjorn thinks it's good to go.
>
> b) Once I spin this it can get a quicker review so other things don't
> pop up and change things.
>
> ...or, if you want, you can hijack my series and send the next version
> out yourself. I won't object to that but please give me a heads up if
> you're planning to do that so we don't duplicate work.
>
>
>>> @@ -246,12 +288,35 @@ static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
>>> ret = rpmh_rsc_invalidate(drv);
>>> if (ret)
>>> return ERR_PTR(ret);
>>> +
>>> + /*
>>> + * TODO:
>>> + * - Temporarily enable IRQs on the wake tcs.
>>> + * - Make sure nobody else race with us and re-write
>>> + * to this TCS; document how this works.
>> You can remove above comment, i already included change to enable IRQs
>> on wake TCS in my series.
> Right. The race comment still is somewhat interesting because the
> only way we're race free is that we know that the sleep/wake values
> are only programmed at a time when no more active transactions can be
> started. I'll document that assumption. If we ever change that
> assumption we'll have to think about this more since (at the moment)
> programming sleep/wake doesn't set the "tcs_in_use" bit.
>
>
>>> +/**
>>> + * check_for_req_inflight() - Look to see if conflicting cmds are in flight.
>>> + * @drv: The controller.
>>> + * @tcs: A pointer to the tcs_group used for ACTIVE_ONLY transfers.
>>> + * @msg: The message we want to send, which will contain several addr/data
>>> + * pairs to program (but few enough that they all fit in one TCS).
>>> + *
>>> + * Only for use for ACTIVE_ONLY tcs_group, since those are the only ones
>>> + * that might be actively sending.
>> This comment will need to modify/removed after we use this in place of
>> find_match().
>>
>> see below for more details.
> As per below I'm trying to understand the motivation for using
> check_for_req_inflight() when writing sleep/wake sets, so changing
> this is pending on that discussion.
>
>
>>> @@ -411,6 +533,20 @@ static int find_free_tcs(struct tcs_group *tcs)
>>> return -EBUSY;
>>> }
>>>
>>> +/**
>>> + * tcs_write() - Store messages into a TCS right now, or return -EBUSY.
>>> + * @drv: The controller.
>>> + * @msg: The data to be sent.
>>> + *
>>> + * Grabs a TCS for ACTIVE_ONLY transfers and writes the messages to it.
>>> + *
>>> + * If there are no free ACTIVE_ONLY TCSs or if a command for the same address
>>> + * is already transferring returns -EBUSY which means the client should retry
>>> + * shortly.
>>> + *
>>> + * Return: 0 on success, -EBUSY if client should retry, or an error.
>>> + * Client should have interrupts enabled for a bit before retrying.
>>> + */
>>> static int tcs_write(struct rsc_drv *drv, const struct tcs_request *msg)
>>> {
>>> struct tcs_group *tcs;
>>> @@ -418,16 +554,14 @@ static int tcs_write(struct rsc_drv *drv, const struct tcs_request *msg)
>>> unsigned long flags;
>>> int ret;
>>>
>>> + /* TODO: get_tcs_for_msg() can return -EAGAIN and nobody handles */
>> with [2] it will not return -EAGAIN, can you please remove this comment.
> OK.
>
>
>>> tcs = get_tcs_for_msg(drv, msg);
>>> if (IS_ERR(tcs))
>>> return PTR_ERR(tcs);
>>>
>>> spin_lock_irqsave(&tcs->lock, flags);
>>> +
>>> spin_lock(&drv->lock);
>>> - /*
>>> - * The h/w does not like if we send a request to the same address,
>>> - * when one is already in-flight or being processed.
>>> - */
>> please keep above comment as it is.
> OK. I guess I felt like now that check_for_req_inflight() was
> documented it was just getting in the way, but I'll keep it if you
> want.
>
>
>>> @@ -485,6 +635,63 @@ int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg)
>>> return ret;
>>> }
>>>
>>> +/**
>>> + * find_match() - Find if the cmd sequence is in the tcs_group
>>> + * @tcs: The tcs_group to search. Either sleep or wake.
>>> + * @cmd: The command sequence to search for; only addr is looked at.
>>> + * @len: The number of commands in the sequence.
>>> + *
>>> + * Searches through the given tcs_group to see if a given command sequence
>>> + * is in there.
>>> + *
>>> + * Two sequences are matches if they modify the same set of addresses in
>>> + * the same order. The value of the data is not considered when deciding if
>>> + * two things are matches.
>>> + *
>>> + * How this function works is best understood by example. For our example,
>>> + * we'll imagine our tcs group contains these (cmd, data) tuples:
>>> + * [(a, A), (b, B), (c, C), (d, D), (e, E), (f, F), (g, G), (h, H)]
>>> + * ...in other words it has an element where (addr=a, data=A), etc.
>>> + * ...we'll assume that there is one TCS in the group that can store 8 commands.
>>> + *
>>> + * - find_match([(a, X)]) => 0
>>> + * - find_match([(c, X), (d, X)]) => 2
>>> + * - find_match([(c, X), (d, X), (e, X)]) => 2
>>> + * - find_match([(z, X)]) => -ENODATA
>>> + * - find_match([(a, X), (y, X)]) => -EINVAL (and warning printed)
>>> + * - find_match([(g, X), (h, X), (i, X)]) => -EINVAL (and warning printed)
>>> + * - find_match([(y, X), (a, X)]) => -ENODATA
>>> + *
>>> + * NOTE: This function overall seems like it has questionable value.
>>> + * - It can be used to update a message in the TCS with new data, but I
>>> + * don't believe we actually do that--we always fully invalidate and
>>> + * re-write everything.
>> we are doing this from [1] onwards.
> OK.
>
>
>>> Specifically it would be too limiting to force
>>> + * someone not to change the set of addresses written to each time.
>>> + * - This function could be attempting to avoid writing different data to
>>> + * the same address twice in a tcs_group. If that's the goal, it doesn't
>>> + * do a great job since find_match([(y, X), (a, X)]) return -ENODATA in my
>>> + * above example.
>>> + * - If you originally wrote [(a, A), (b, B), (c, C)] and later tried to
>>> + * write [(a, A), (b, B)] it'd look like a match and we wouldn't consider
>>> + * it an error that the size got shorter.
>>> + * - If two clients wrote sequences that happened to be placed in slots next
>>> + * to each other then a later check could match a sequence that was the
>>> + * size of both together.
>>> + *
>>> + * TODO: in light of the above, prehaps we can just remove this function?
>> We still need to check there is no duplicate request in a TCS for SLEEP
>> and WAKE as well.
>>
>> find_match() checks if the request already exist for a resource then
>> update with new value, in a way preventing new request to go in
>>
>> when one already exists. I am ok to remove this function since after [1]
>> we always fully invalidate and then re-write and little point in
>>
>> finding a match now. However we need to use check_for_req_inflight()
>> from tcs_ctrl_write() with little modification to ignore tcs_is_free()
>>
>> check if is called from tcs_ctrlr_write().
> Sure, we could use find_match() to add this check. It definitely
> feels a lot better than the current solution which seems to miss a
> whole lot of corner cases.
>
> Before I do that, maybe you can help me understand the motivation,
> though? Where are you expecting to find the conflict? Certainly
> there can't be any conflict in the normal (non-batch) sleep/wake sets,
> right? We only have one entry in the RPMH cache per address so the
> non-batch entries can't conflict with themselves. There also can't be
> any previous async command still pending because we cache
> async/non-async alike.
correct, we have unique requests in non-batch caches.
>
> For batch requests I believe that there's exactly one caller using the
> batch API (otherwise rpmh_invalidate() would be a nightmare). That
> one caller is the interconnect code, right? It feels like we could
> assume that the one caller of the batch API won't write to the same
> address more than one time?
correct there is only one client interconnect using batch API.
>
> So I guess you're expecting that an rpmh_write() would write to the
> same address that someone wrote to with rpmh_write_batch() and it
> should override it?
On upstream interconnect i see they are using batch only requests till now.
I agree to remove find_match() completly, and we can see in future if
interconnect driver starts using non-batch APIs as well then we can
introduce a check to find duplicates.
> Does that actually happen?
On upstream interconnect i see they are using batch only requests till now.
I agree to remove find_match() completly, and we can see in future if
interconnect driver starts using non-batch APIs as well then we can
introduce a check to find duplicates.
> Isn't the batch API
> used just for interconnect stuff and nobody should be using
> rpmh_write() to mess with the interconnect stuff?
>
>
>> After this change on 9th change in your series, please move it before
>> current patch in series.
>>
>> please also keep dependency on [1] for 9th change.
> Sure. I was trying to do all the documentation in the earlier patches
> to provide motivation for and help understand the later patches, but I
> can change the order if need be. It didn't seem a big deal to add the
> comments and delete them, but I can understand it being weird.
>
>
>>> /**
>>> - * rpmh_rsc_write_ctrl_data: Write request to the controller
>>> - *
>>> - * @drv: the controller
>>> - * @msg: the data to be written to the controller
>>> + * rpmh_rsc_write_ctrl_data() - Write request to controller but don't trigger.
>>> + * @drv: The controller.
>>> + * @msg: The data to be written to the controller.
>>> *
>>> * There is no response returned for writing the request to the controller.
>> you can remove above line since responce is returned from this function.
> Right. I think this was trying to say that the request wasn't
> triggered and thus there was no response. I think it's clearer with
> my addition of "but don't trigger." to the comment.
>
>
>
> -Doug
Thanks,
Maulik
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Powered by blists - more mailing lists