Message-ID: <272650ba-2c44-9084-7829-b04023eba723@arm.com>
Date: Fri, 22 May 2020 14:09:24 +0100
From: Steven Price <steven.price@....com>
To: dinghao.liu@....edu.cn
Cc: kjlu@....edu, Rob Herring <robh@...nel.org>,
Tomeu Vizoso <tomeu.vizoso@...labora.com>,
Alyssa Rosenzweig <alyssa.rosenzweig@...labora.com>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/panfrost: fix runtime pm imbalance on error
On 21/05/2020 08:00, dinghao.liu@....edu.cn wrote:
> Hi Steve,
>
> There are two bailing-out points in panfrost_job_hw_submit(): one is
> the error path starting at pm_runtime_get_sync(), the other is the
> error path starting at the WARN_ON() in the if statement. The PM
> imbalance fixed by this patch is between these two paths. I think the
> caller of panfrost_job_hw_submit() cannot distinguish this imbalance
> from outside the function.
My point is the caller expects panfrost_job_hw_submit() to increase the
PM reference count. Since panfrost_job_hw_submit() cannot return an
error (it returns void), we cannot signal to the caller that the
reference hasn't been taken.
> panfrost_job_timedout() calls pm_runtime_put_noidle() for every job it
> finds, but all jobs are added to pfdev->jobs just before calling
> panfrost_job_hw_submit(). Therefore I think the imbalance still exists.
My point's exactly that - the "jobs are added to pfdev->jobs just before
calling panfrost_job_hw_submit()". Since we don't have a way for
panfrost_job_hw_submit() to fail it must unconditionally take any
references that will then be freed later on.
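To spell the invariant out with a quick sketch (just the shape of it, not a
quote of the actual tree):

        /*
         * In panfrost_job_hw_submit(): the job is already in pfdev->jobs[],
         * so panfrost_job_timedout() will unconditionally do a
         * pm_runtime_put_noidle() for it. The usage counter therefore has
         * to stay raised even when we bail out early.
         */
        ret = pm_runtime_get_sync(pfdev->dev); /* raises the counter even on error */
        if (ret < 0)
                return; /* no put here - the later put_noidle() balances it */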
> But I'm not very sure whether we should add a pm_runtime_put on the error
> path after pm_runtime_get_sync(), or remove the pm_runtime_put on the error
> path after WARN_ON().
The pm_runtime_put after the WARN_ON() is a bug. Sorry this is probably
what confused you - clearly the WARN_ON() situation is never meant to
happen in the first place, so hopefully this isn't actually possible.
Feel free to send a patch removing it! ;)
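For concreteness, the removal I have in mind is roughly this (untested, and
written from memory, so the exact pm_runtime_put variant in the tree may
differ):

         if (WARN_ON(job_read(pfdev, JS_COMMAND_NEXT(js)))) {
-                pm_runtime_put_sync_autosuspend(pfdev->dev);
                 return;
         }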
> As for the problem about panfrost_devfreq_record_busy(), this may be a
> new bug and requires an independent patch to fix it.
Indeed, I'll post a proper patch for that later - I just spotted it
while looking at the code.
Thanks,
Steve
> Regards,
> Dinghao
>
>
>> On 20/05/2020 12:05, Dinghao Liu wrote:
>>> pm_runtime_get_sync() increments the runtime PM usage counter even when
>>> the call returns an error code. Thus a pairing decrement is needed
>>> on the error handling path to keep the counter balanced.
>>>
>>> Signed-off-by: Dinghao Liu <dinghao.liu@....edu.cn>
>>
>> Actually I think we have the opposite problem. To be honest we don't
>> handle this situation very well. By the time panfrost_job_hw_submit() is
>> called the job has already been added to the pfdev->jobs array, so it's
>> considered submitted even if it never actually lands on the hardware. So
>> in the case of this function bailing out early we will then (eventually)
>> hit a timeout and trigger a GPU reset.
>>
>> panfrost_job_timedout() iterates through the pfdev->jobs array and calls
>> pm_runtime_put_noidle() for each job it finds. So there's no imbalance
>> here that I can see.
>>
>> Have you actually observed the situation where pm_runtime_get_sync()
>> returns a failure?
>>
>> HOWEVER, it appears that by bailing out early the call to
>> panfrost_devfreq_record_busy() is never made, which as far as I can see
>> means that there may be an extra call to panfrost_devfreq_record_idle()
>> when the jobs have timed out. Which could underflow the counter.
>>
>> But equally looking at panfrost_job_timedout(), we only call
>> panfrost_devfreq_record_idle() *once* even though multiple jobs might be
>> processed.
>>
>> There's a completely untested patch below which in theory should fix that...
>>
>> Steve
>>
>> ----8<---
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>> index 7914b1570841..f9519afca29d 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>> @@ -145,6 +145,8 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
>>          u64 jc_head = job->jc;
>>          int ret;
>>
>> +        panfrost_devfreq_record_busy(pfdev);
>> +
>>          ret = pm_runtime_get_sync(pfdev->dev);
>>          if (ret < 0)
>>                  return;
>> @@ -155,7 +157,6 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
>>          }
>>
>>          cfg = panfrost_mmu_as_get(pfdev, &job->file_priv->mmu);
>> -        panfrost_devfreq_record_busy(pfdev);
>>
>>          job_write(pfdev, JS_HEAD_NEXT_LO(js), jc_head & 0xFFFFFFFF);
>>          job_write(pfdev, JS_HEAD_NEXT_HI(js), jc_head >> 32);
>> @@ -410,12 +411,12 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>          for (i = 0; i < NUM_JOB_SLOTS; i++) {
>>                  if (pfdev->jobs[i]) {
>>                          pm_runtime_put_noidle(pfdev->dev);
>> +                        panfrost_devfreq_record_idle(pfdev);
>>                          pfdev->jobs[i] = NULL;
>>                  }
>>          }
>>          spin_unlock_irqrestore(&pfdev->js->job_lock, flags);
>>
>> -        panfrost_devfreq_record_idle(pfdev);
>>          panfrost_device_reset(pfdev);
>>
>>          for (i = 0; i < NUM_JOB_SLOTS; i++)