Message-ID: <e3761f14-9a1b-3c83-693b-7262bf004a7e@codeaurora.org>
Date: Mon, 20 Jan 2020 10:46:43 -0800
From: "Asutosh Das (asd)" <asutoshd@...eaurora.org>
To: Bart Van Assche <bvanassche@....org>,
Bjorn Andersson <bjorn.andersson@...aro.org>,
cang@...eaurora.org
Cc: Avri Altman <Avri.Altman@....com>,
"Winkler, Tomas" <tomas.winkler@...el.com>, nguyenb@...eaurora.org,
rnayak@...eaurora.org, linux-scsi@...r.kernel.org,
kernel-team@...roid.com, saravanak@...gle.com, salyzyn@...gle.com,
Alim Akhtar <alim.akhtar@...sung.com>,
Pedro Sousa <pedrom.sousa@...opsys.com>,
"James E.J. Bottomley" <jejb@...ux.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Evan Green <evgreen@...omium.org>,
Janek Kotas <jank@...ence.com>,
Stanley Chu <stanley.chu@...iatek.com>,
Bean Huo <beanhuo@...ron.com>,
Subhash Jadavani <subhashj@...eaurora.org>,
Arnd Bergmann <arnd@...db.de>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1 1/1] scsi: ufs: Add command logging infrastructure
On 1/16/2020 7:43 PM, Bart Van Assche wrote:
> On 2020-01-16 15:03, Asutosh Das (asd) wrote:
>> On 10/28/2019 10:46 PM, Bjorn Andersson wrote:
>>> On Mon 28 Oct 19:37 PDT 2019, cang@...eaurora.org wrote:
>>>
>>>> On 2019-10-23 18:33, Avri Altman wrote:
>>>>>>
>>>>>>> Add the necessary infrastructure to keep timestamp history of
>>>>>>> commands, events and other useful info for debugging complex issues.
>>>>>>> This helps in diagnosing events leading up to failure.
>>>>>>
>>>>>> Why not use tracepoints, for that?
>>>>> Ack on Tomas's comment.
>>>>> Is there any piece of information that you need that is not provided
>>>>> by the upiu tracer?
>>>>>
>>>>> Thanks,
>>>>> Avri
>>>>
>>>> In extreme cases, when the UFS device runs into a bad state, the system
>>>> may crash, leaving no chance to collect a trace. If no trace was collected
>>>> and the failure is hard to reproduce, a log of the commands that were sent
>>>> is very helpful for understanding what was going on before the failure.
>>>>
>>>
>>> This is a common problem shared among many/all subsystems, so it's better
>>> to rely on a generic solution, such as tracepoints dumped into
>>> pstore/ramoops.
>>>
>>> Regards,
>>> Bjorn
>>>
>>
>> Reviving this discussion.
>>
>> Another issue with using ftrace is that several subsystems use it.
>> Consider a situation in which we store a history of 64 commands,
>> responses and some other required info in ftrace.
>>
>> Say a command is stuck for seconds until the software times out. In this
>> case, the other ftrace events are still enabled and keep writing to the
>> ftrace buffer, overwriting some or most of the UFS command history; the
>> history is then more or less lost, and the issue has to be reproduced with
>> the other tracing disabled.
>>
>> So what we need is a client-specific logging mechanism that is as
>> lightweight as ftrace and can be parsed from ramdumps.
>>
>> I'm open to ideas but ftrace in its current form may not be suitable for
>> this.
>
> Hi Asutosh,
>
> Are you aware that it is already possible today to figure out which SCSI
> commands are pending? Are you familiar with the files under
> /sys/kernel/debug/block/? An example:
>
> $ (cd /sys/kernel/debug/block && grep -aH . */*/busy)
> nvme0n1/hctx7/busy:0000000006f07009 {.op=READ, .cmd_flags=META|PRIO,
> .rq_flags=DONTPREP|IO_STAT, .state=in_flight, .tag=275, .internal_tag=-1}
>
> Bart.
>
>
Hi Bart,
No, I was not aware of this. Thanks for letting me know.
However, it doesn't look like this stores the _history_ of commands that
were already executed.
To put this in perspective, below is the output of the current command
logging:
[ 461.142124354s] dme: dme_send: seq_no= 738272 lun=0 cmd_id=0x018 lba= 0 txfer_len= 0 tag= 0 doorbell=0x0000 outstanding=0x0000 idn= 0
[ 461.142384823s] dme: dme_cmpl_2: seq_no= 738273 lun=0 cmd_id=0x018 lba= 0 txfer_len= 0 tag= 0 doorbell=0x0000 outstanding=0x0000 idn= 0
[ 461.142453052s] scsi: scsi_send: seq_no= 738274 lun=0 cmd_id=0x028 lba= 1360888 txfer_len= 77824 tag= 0 doorbell=0x0001 outstanding=0x0001 idn= 0
[ 461.142473000s] scsi: scsi_send: seq_no= 738275 lun=0 cmd_id=0x028 lba= 1361056 txfer_len= 12288 tag= 1 doorbell=0x0003 outstanding=0x0003 idn= 0
[ 461.142482270s] scsi: scsi_send: seq_no= 738276 lun=0 cmd_id=0x028 lba= 1361088 txfer_len= 4096 tag= 2 doorbell=0x0007 outstanding=0x0007 idn= 0
[ 461.142491020s] scsi: scsi_send: seq_no= 738277 lun=0 cmd_id=0x028 lba= 1361120 txfer_len= 4096 tag= 3 doorbell=0x000F outstanding=0x000F idn= 0
[ 461.142538208s] scsi: scsi_send: seq_no= 738278 lun=0 cmd_id=0x028 lba= 1361144 txfer_len= 4096 tag= 4 doorbell=0x001F outstanding=0x001F idn= 0
[ 461.142780083s] scsi: scsi_send: seq_no= 738279 lun=0 cmd_id=0x028 lba= 1361184 txfer_len= 20480 tag= 5 doorbell=0x003F outstanding=0x003F idn= 0
[ 461.142922791s] scsi: scsi_cmpl: seq_no= 738280 lun=0 cmd_id=0x028 lba= 1360888 txfer_len= 77824 tag= 0 doorbell=0x003E outstanding=0x003F idn= 0
[ 461.142982323s] scsi: scsi_cmpl: seq_no= 738281 lun=0 cmd_id=0x028 lba= 1361056 txfer_len= 12288 tag= 1 doorbell=0x003C outstanding=0x003E idn= 0
[ 461.142996958s] scsi: scsi_cmpl: seq_no= 738282 lun=0 cmd_id=0x028 lba= 1361088 txfer_len= 4096 tag= 2 doorbell=0x0038 outstanding=0x003C idn= 0
[ 461.143059406s] scsi: scsi_cmpl: seq_no= 738283 lun=0 cmd_id=0x028 lba= 1361120 txfer_len= 4096 tag= 3 doorbell=0x0020 outstanding=0x0038 idn= 0
[ 461.143064666s] scsi: scsi_cmpl: seq_no= 738284 lun=0 cmd_id=0x028 lba= 1361144 txfer_len= 4096 tag= 4 doorbell=0x0020 outstanding=0x0038 idn= 0
[...]
This output was extracted from ramdumps after a crash.
Custom log entries show up as well, e.g. scaling start and end.
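For reference, the infrastructure behind this output is essentially a small
per-host ring buffer kept in normal kernel memory. A rough sketch of the idea
(illustrative only; the field and function names below are not the exact
patch) would be something like:

/*
 * Illustrative sketch only -- not the actual patch. A fixed-size,
 * per-host ring buffer of command events kept in normal kernel memory,
 * so it survives in a ramdump and can be decoded offline.
 */
#include <linux/ktime.h>
#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/types.h>

#define UFS_CMD_LOG_DEPTH	64	/* history of the last 64 events */

struct ufs_cmd_log_entry {
	u64	timestamp_ns;	/* ktime_get_ns() at the time of the event */
	u32	seq_no;		/* monotonically increasing sequence number */
	u8	lun;
	u8	cmd_id;		/* SCSI opcode or UIC command id */
	u8	tag;
	u64	lba;
	u32	transfer_len;
	u32	doorbell;	/* snapshot of the UTRL doorbell register */
	u32	outstanding;	/* snapshot of the outstanding request mask */
	char	event[16];	/* "scsi_send", "scsi_cmpl", "dme_send", ... */
};

struct ufs_cmd_log {
	struct ufs_cmd_log_entry	entries[UFS_CMD_LOG_DEPTH];
	unsigned int			pos;	/* next slot to overwrite */
	u32				seq_no;
	spinlock_t			lock;	/* spin_lock_init() at probe */
};

static void ufs_cmd_log_add(struct ufs_cmd_log *log, const char *event,
			    u8 lun, u8 cmd_id, u8 tag, u64 lba,
			    u32 transfer_len, u32 doorbell, u32 outstanding)
{
	struct ufs_cmd_log_entry *e;
	unsigned long flags;

	spin_lock_irqsave(&log->lock, flags);
	e = &log->entries[log->pos];
	e->timestamp_ns = ktime_get_ns();
	e->seq_no = ++log->seq_no;
	e->lun = lun;
	e->cmd_id = cmd_id;
	e->tag = tag;
	e->lba = lba;
	e->transfer_len = transfer_len;
	e->doorbell = doorbell;
	e->outstanding = outstanding;
	strscpy(e->event, event, sizeof(e->event));
	log->pos = (log->pos + 1) % UFS_CMD_LOG_DEPTH;
	spin_unlock_irqrestore(&log->lock, flags);
}

A buffer like this costs a few KB per host and one spinlocked store per
command/event, and since it lives in plain kernel memory it can be located in
a ramdump and decoded offline.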
In a nutshell, the high-level requirements for the logging are:
1. Lightweight
2. Extractable from ramdumps
3. Eliminates the need to reproduce an issue as much as possible
Ftrace ticks all the boxes except 3: there's a high chance of this data being
overwritten by other trace clients, and when that happens the issue has to be
reproduced with ftrace enabled only for ufs.
*Motivation for this logging*:
Several (read: all) UFS device vendors need the sequence of commands leading
up to a specific failure. They have asked for it every time, and they still
do. There are two ways of doing this - in the device or in the host.
1. In-device:
The device stores a ring buffer of the commands it received. Upon error, the
host sends a command to the device to read this ring buffer and dumps it out,
or it can be extracted from ramdumps. This would not be a universal solution:
it would be very vendor-specific, difficult to maintain, and might have to be
added to the JEDEC spec to standardize it.
2. In-host:
Ftrace was tried out and ran into the issues highlighted above. Hence the
need for a UFS (client)-specific implementation.
If there were a way to have client-specific ftrace logging, that would be the
ideal thing to do.
How about modifying ftrace to make it client-specific?
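For instance (completely untested, and only to illustrate the direction), ufs
could get its own ftrace instance through the trace_array_*() API that was
recently exported for in-kernel users, so that events from other subsystems
never overwrite its buffer; the ufs-side function names below are made up:

/*
 * Untested sketch. Relies on trace_array_get_by_name() and
 * trace_array_printk() from <linux/trace.h>; each instance has its own
 * ring buffer, so other subsystems' events cannot overwrite the ufs
 * history, and the buffer is still sitting in memory for a ramdump.
 */
#include <linux/kernel.h>
#include <linux/trace.h>

static struct trace_array *ufs_tr;

static int ufs_cmd_trace_init(void)
{
	/* Creates (or looks up) the "ufs" instance under tracefs */
	ufs_tr = trace_array_get_by_name("ufs");
	return ufs_tr ? 0 : -ENOMEM;
}

static void ufs_cmd_trace(const char *event, u8 lun, u8 cmd_id, u8 tag,
			  u64 lba, u32 doorbell, u32 outstanding)
{
	if (!ufs_tr)
		return;

	trace_array_printk(ufs_tr, _THIS_IP_,
			   "%s: lun=%u cmd_id=0x%02x lba=%llu tag=%u doorbell=0x%04x outstanding=0x%04x\n",
			   event, lun, cmd_id, lba, tag, doorbell, outstanding);
}

The instance should show up under the tracefs instances directory, and its
ring buffer would still be sitting in memory for a ramdump. Whether that is an
acceptable use of the instance API is a separate question.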
If you have suggestions for modifying the current design so that it can use
ftrace, or any better ideas or methods, please let me know.
-asd
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project