linux-kernel - Re: [syzbot] KASAN: out-of-bounds Read in i801

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <YKuC/f8abGyyMG5M@smile.fi.intel.com>
Date:   Mon, 24 May 2021 13:42:05 +0300
From:   Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To:     Jean Delvare <jdelvare@...e.de>
Cc:     syzbot <syzbot+b4d3fd1dfd53e90afd79@...kaller.appspotmail.com>,
        linux-i2c@...r.kernel.org, linux-kernel@...r.kernel.org,
        syzkaller-bugs@...glegroups.com,
        Jarkko Nikula <jarkko.nikula@...ux.intel.com>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        Benjamin Tissoires <benjamin.tissoires@...hat.com>
Subject: Re: [syzbot] KASAN: out-of-bounds Read in i801_isr

On Mon, May 24, 2021 at 11:16:52AM +0200, Jean Delvare wrote:
> On Fri, 21 May 2021 15:48:20 +0200, Jean Delvare wrote:
> > Looking at the ICH9 datasheet, I see the following description for the
> > KILL bit (which is what we try to use to reset the SMBus controller):
> > 
> >   "Kills the current host transaction taking place, sets the FAILED
> >    status bit, and asserts the interrupt (or SMI#)."
> > 
> > At the time the recovery code was written, i2c-i801 was a polling-only
> > driver, interrupts were not supported, so asserting the interrupt had
> > no effect. Now that the driver does support interrupts, this would call
> > i801_isr(), right?
> 
> I modified the i2c-i801 driver to simulate a timeout once in a while,
> and I can confirm that i801_isr() gets called when the KILL bit is set.
> 
> > So my theory is that our attempt to kill a timed-out byte-by-byte block
> > transaction triggers an interrupt, which calls in i801_isr() with the
> > SMBHSTSTS_BYTE_DONE bit set. This in turn causes i801_isr_byte_done()
> > to be called while we are absolutely not ready nor even supposed to
> > process the next data byte.
> 
> I can also report that I'm reproducing the bug reported by syzbot when
> this happens, by running decode-dimms with my modified i2c-i801 driver.
> I have 4 SPD EEPROMs instantiated for my DIMMs, so decode-dimms
> triggers a lot of calls to i2c_smbus_read_i2c_block_data().
> 
> > I guess we should clear SMBHSTSTS_BYTE_DONE before issuing a
> > SMBHSTCNT_KILL. Alternatively we could add a check at the beginning of
> > i801_isr() to bail out immediately if SMBHSTCNT_KILL is set. While
> > possibly more robust, this approach has the drawback of increasing the
> > processing time of all interrupts, even standard/legitimate ones. So
> > maybe just clearing SMBHSTSTS_BYTE_DONE is more reasonable. Something
> > like:
> > 
> > --- linux-5.11.orig/drivers/i2c/busses/i2c-i801.c
> > +++ linux-5.11/drivers/i2c/busses/i2c-i801.c
> > @@ -393,6 +393,8 @@ static int i801_check_post(struct i801_p
> >  		dev_err(&priv->pci_dev->dev, "Transaction timeout\n");
> >  		/* try to stop the current command */
> >  		dev_dbg(&priv->pci_dev->dev, "Terminating the current operation\n");
> > +		/* Clear BYTE_DONE so as to not confuse i801_isr() */
> > +		outb_p(SMBHSTSTS_BYTE_DONE, SMBHSTSTS(priv));
> >  		outb_p(inb_p(SMBHSTCNT(priv)) | SMBHSTCNT_KILL,
> >  		       SMBHSTCNT(priv));
> >  		usleep_range(1000, 2000);
> 
> I ended up with a different approach. Clearing BYTE_DONE would possibly
> avoid the problem immediately at hand, however the interrupt would
> still be triggered, even for other (not byte-by-byte) transactions,
> causing us to fiddle with the status register. I don't think it's
> needed, and it could have unintended side effects.
> 
> So instead I tried the alternative approach of checking the KILL bit in
> i801_isr() and returning immediately if it is set. While it does work,
> I noticed that i801_isr() is in fact called in a loop for the whole
> duration of the usleep_range(1000, 2000) before we clear the KILL bit.
> While better than an out-of-bounds memory access, an interrupt flood is
> still not ideal.
> 
> > I must say I wonder why SMBHSTCNT_KILL generates an interrupt in the
> > first place, I can't see who would need this.
> 
> Which made me think... The interrupt is being generated because we
> *ask* for it. It doesn't have to be generated if we aren't interested
> in it. So the fix would be very simple: don't bother preserving the
> other bits in HST_CNT, they will be set properly again for the next
> transaction anyway, only set the KILL bit to 1. In particular don't
> preserve the INTREN bit, so no interrupt will be generated.
> 
> @@ -395,11 +401,9 @@ static int i801_check_post(struct i801_p
>  		dev_err(&priv->pci_dev->dev, "Transaction timeout\n");
>  		/* try to stop the current command */
>  		dev_dbg(&priv->pci_dev->dev, "Terminating the current operation\n");
> -		outb_p(inb_p(SMBHSTCNT(priv)) | SMBHSTCNT_KILL,
> -		       SMBHSTCNT(priv));
> +		outb_p(SMBHSTCNT_KILL, SMBHSTCNT(priv));
>  		usleep_range(1000, 2000);
> -		outb_p(inb_p(SMBHSTCNT(priv)) & (~SMBHSTCNT_KILL),
> -		       SMBHSTCNT(priv));
> +		outb_p(0, SMBHSTCNT(priv));
>  
>  		/* Check if it worked */
>  		status = inb_p(SMBHSTSTS(priv));
> 
> This works for me. If there are no objections I'll post a proper patch.

No objection from me, thanks for conducting a research and developing the fix!
Feel free to add mine
Acked-by: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
to whatever solution you consider the best.

-- 
With Best Regards,
Andy Shevchenko