linux-kernel - Re: [PATCH 5/7] thermal: intel: hfi: Enable notification interrupt

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a69360a65739ba068ff21e266638c41aeb936870.camel@linux.intel.com>
Date:   Tue, 09 Nov 2021 04:54:08 -0800
From:   Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
Cc:     "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        linux-pm@...r.kernel.org, x86@...nel.org,
        linux-doc@...r.kernel.org, Len Brown <len.brown@...el.com>,
        Aubrey Li <aubrey.li@...ux.intel.com>,
        Amit Kucheria <amitk@...nel.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        "Ravi V. Shankar" <ravi.v.shankar@...el.com>,
        Ricardo Neri <ricardo.neri@...el.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/7] thermal: intel: hfi: Enable notification interrupt

On Tue, 2021-11-09 at 09:48 +0100, Peter Zijlstra wrote:
> On Mon, Nov 08, 2021 at 06:26:13PM -0800, Ricardo Neri wrote:
> > On Mon, Nov 08, 2021 at 10:07:40AM +0100, Peter Zijlstra wrote:
> > > On Fri, Nov 05, 2021 at 06:33:10PM -0700, Ricardo Neri wrote:
> 
> > > > +static void hfi_update_work_fn(struct work_struct *work)
> > > > +{
> > > > +       struct hfi_instance *hfi_instance;
> > > > +
> > > > +       hfi_instance = container_of(to_delayed_work(work),
> > > > struct hfi_instance,
> > > > +                                   update_work);
> > > > +       if (!hfi_instance)
> > > > +               return;
> > > > +
> > > > +       /* TODO: Consume update here. */
> > > 
> > >         // this here uses ->event_lock to serialize against the
> > >         // interrupt below changing the data...
> > 
> > Anyone reading the HFI table would need to take ->event_lock.
> 
> Right.. that implies ->event_lock can be taken while there is no
> interrupt active, which then necessitates the additional lock.
> 
Correct.
With the raw_spin_trylock() optimization, we will need additional lock.
So need another lock to protect hfi_instance->table_base.

> > > > +}
> > > > +
> > > > +void intel_hfi_process_event(__u64 pkg_therm_status_msr_val)
> > > > +{
> > > > +       struct hfi_instance *hfi_instance;
> > > > +       int cpu = smp_processor_id();
> > > > +       struct hfi_cpu_info *info;
> > > > +       unsigned long flags;
> > > > +       u64 timestamp;
> > > > +
> > > > +       if (!pkg_therm_status_msr_val)
> > > > +               return;
> > > > +
> > > > +       info = &per_cpu(hfi_cpu_info, cpu);
> > > > +       if (!info)
> > > > +               return;
> > > > 

[...]

> > > > +       memcpy(hfi_instance->table_base, hfi_instance-
> > > > >hw_table,
> > > > +              hfi_features.nr_table_pages << PAGE_SHIFT);
> 
> I think we actually need to release ->interrupt_lock here, *before*
> the
> WRMSR that ACKs the HFI update. Because I think the moment that WRMSR
> goes through we can get another interrupt, and that *must* not find
> ->interrupt_lock taken, otherwise it will not process the update
> etc..
> leading to lost interrupts.

Correct.
Once we use raw_spin_trylock() change suggested above, then we need to
release lock here.

Thanks,
Srinivas

> 

> > > > +       /*
> > > > +        * Let hardware and other CPUs know that we are done
> > > > reading the HFI
> > > > +        * table and it is free to update it again.
> > > > +        */
> > > > +       pkg_therm_status_msr_val &= THERM_STATUS_CLEAR_PKG_MASK
> > > > &
> > > > +                                  
> > > > ~PACKAGE_THERM_STATUS_HFI_UPDATED;
> > > > +       wrmsrl(MSR_IA32_PACKAGE_THERM_STATUS,
> > > > pkg_therm_status_msr_val);
> > > > +       schedule_delayed_work(&hfi_instance->update_work,
> > > > HFI_UPDATE_INTERVAL);
> > > > +
> > > > +unlock_spinlock:
> > > > +       raw_spin_unlock_irqrestore(&hfi_instance->event_lock,
> > > > flags);
> > > 
> > >         raw_spin_unlock(&hfi_instance->interrupt_lock);
> > 
> > ... and here we release both locks.
> 
> See above.