SCSI fixes on 20160625

Two straightforward fixes.  One is a concurrency issue only affecting
 SAS connected SATA drives, but which could hang the storage subsystem
 if it triggers (because the outstanding command count on error never
 goes back to zero) and the other is a NO_TAG fallout from the switch
 to hostwide tags which causes the system to crash on module insertion
 (we've checked carefully and only the 53c700 family of drivers is
 vulnerable to this issue).
 
 Signed-off-by: James Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJXbtgBAAoJEAVr7HOZEZN4/EQP/0diQXZawG4Ncy4dYmrzYDeR
 4R5OqkV5hJapzJD4De/DqTRxKVIsxVLkgM4wVT7KHv0QndX81ipdKsiWyUyXJ3nb
 5HBvJ/GhAUnVyVM1nMGqK81Sk68zJiLoJ32lnE099GP6eJKtRtWgfKvEHPimhDWc
 S42s6t6dhyirmXmVQ0Jyx1/c1y8RuL5dTCMXjkP1LdTc5UnZ7b3HeKIxE0YJ3xRs
 ZsHdEPDk7UR1UUXWIdqjx/P0elgDMExkOaAQUCeZV2aWiV8OgW+OSSzjSxDvlJLk
 IBeggRjdiqEBCyuU5QqfQjISaFR0POf5q41Bomg+Zq9XtkdEovRhSwdkP2/GY/TD
 wEozTZKkRh1RcYtCNOASOgvmpHXcGBWhMk6iqmsbl/LIL5UXdvB9EAzhHlAZ/kai
 Szs3w7IBye6CfISPoFPstNav8DFkse2lf6fVdEEdmRLF8wXGhAXugAdEw3fbuiJX
 9SMwTQqSEMZCQJm4rW2vMSncmatvKfce/SXLvrEEpWFyOM984Q0ZGUTMCCKDstNm
 e8LmnRlMWLa/mYoo9DFIfkgRNd1gH6j+gIPsj+Rtkhk4Q3YC9Rr6gERvuRjGM9XU
 EI611dRUA2YQtclG2DKHkkY1wdN15RyrGR50I3VLaRv0RzwY5HCV1JGEemAokgpU
 BfXrmW3kPfh7fUAHhLqF
 =5jjT
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two straightforward fixes.

  One is a concurrency issue only affecting SAS connected SATA drives,
  but which could hang the storage subsystem if it triggers (because the
  outstanding command count on error never goes back to zero) and the
  other is a NO_TAG fallout from the switch to hostwide tags which
  causes the system to crash on module insertion (we've checked
  carefully and only the 53c700 family of drivers is vulnerable to this
  issue)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  53c700: fix BUG on untagged commands
  scsi: fix race between simultaneous decrements of ->host_failed
This commit is contained in:
Linus Torvalds 2016-06-26 10:08:49 -07:00
commit 2ac9b9735b
4 changed files with 12 additions and 6 deletions

View File

@ -263,19 +263,23 @@ scmd->allowed.
3. scmd recovered
ACTION: scsi_eh_finish_cmd() is invoked to EH-finish scmd
- shost->host_failed--
- clear scmd->eh_eflags
- scsi_setup_cmd_retry()
- move from local eh_work_q to local eh_done_q
LOCKING: none
CONCURRENCY: at most one thread per separate eh_work_q to
keep queue manipulation lockless
4. EH completes
ACTION: scsi_eh_flush_done_q() retries scmds or notifies upper
layer of failure.
layer of failure. May be called concurrently but must have
a no more than one thread per separate eh_work_q to
manipulate the queue locklessly
- scmd is removed from eh_done_q and scmd->eh_entry is cleared
- if retry is necessary, scmd is requeued using
scsi_queue_insert()
- otherwise, scsi_finish_command() is invoked for scmd
- zero shost->host_failed
LOCKING: queue or finish function performs appropriate locking

View File

@ -606,7 +606,7 @@ void ata_scsi_error(struct Scsi_Host *host)
ata_scsi_port_error_handler(host, ap);
/* finish or retry handled scmd's and clean up */
WARN_ON(host->host_failed || !list_empty(&eh_work_q));
WARN_ON(!list_empty(&eh_work_q));
DPRINTK("EXIT\n");
}

View File

@ -1122,7 +1122,7 @@ process_script_interrupt(__u32 dsps, __u32 dsp, struct scsi_cmnd *SCp,
} else {
struct scsi_cmnd *SCp;
SCp = scsi_host_find_tag(SDp->host, SCSI_NO_TAG);
SCp = SDp->current_cmnd;
if(unlikely(SCp == NULL)) {
sdev_printk(KERN_ERR, SDp,
"no saved request for untagged cmd\n");
@ -1826,7 +1826,7 @@ NCR_700_queuecommand_lck(struct scsi_cmnd *SCp, void (*done)(struct scsi_cmnd *)
slot->tag, slot);
} else {
slot->tag = SCSI_NO_TAG;
/* must populate current_cmnd for scsi_host_find_tag to work */
/* save current command for reselection */
SCp->device->current_cmnd = SCp;
}
/* sanity check: some of the commands generated by the mid-layer

View File

@ -1128,7 +1128,6 @@ static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn)
*/
void scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q)
{
scmd->device->host->host_failed--;
scmd->eh_eflags = 0;
list_move_tail(&scmd->eh_entry, done_q);
}
@ -2227,6 +2226,9 @@ int scsi_error_handler(void *data)
else
scsi_unjam_host(shost);
/* All scmds have been handled */
shost->host_failed = 0;
/*
* Note - if the above fails completely, the action is to take
* individual devices offline and flush the queue of any