linux-kernel-test/drivers
Mika Westerberg 778e277cb8 mmc: core: prevent aggressive clock gating racing with ios updates
We have seen at least two different races when clock gating kicks in in a
middle of ios structure update.

First one happens when ios->clock is changed outside of aggressive clock
gating framework, for example via mmc_set_clock(). The race might happen
when we run following code:

mmc_set_ios():
	...
	if (ios->clock > 0)
		mmc_set_ungated(host);

Now if gating kicks in right after the condition check we end up setting
host->clk_gated to false even though we have just gated the clock. Next
time a request is started we try to ungate and restore the clock in
mmc_host_clk_hold(). However since we have host->clk_gated set to false the
original clock is not restored.

This eventually will cause the host controller to hang since its clock is
disabled while we are trying to issue a request. For example on Intel
Medfield platform we see:

[   13.818610] mmc2: Timeout waiting for hardware interrupt.
[   13.818698] sdhci: =========== REGISTER DUMP (mmc2)===========
[   13.818753] sdhci: Sys addr: 0x00000000 | Version:  0x00008901
[   13.818804] sdhci: Blk size: 0x00000000 | Blk cnt:  0x00000000
[   13.818853] sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[   13.818903] sdhci: Present:  0x1fff0000 | Host ctl: 0x00000001
[   13.818951] sdhci: Power:    0x0000000d | Blk gap:  0x00000000
[   13.819000] sdhci: Wake-up:  0x00000000 | Clock:    0x00000000
[   13.819049] sdhci: Timeout:  0x00000000 | Int stat: 0x00000000
[   13.819098] sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
[   13.819147] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
[   13.819196] sdhci: Caps:     0x6bee32b2 | Caps_1:   0x00000000
[   13.819245] sdhci: Cmd:      0x00000000 | Max curr: 0x00000000
[   13.819292] sdhci: Host ctl2: 0x00000000
[   13.819331] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[   13.819377] sdhci: ===========================================
[   13.919605] mmc2: Reset 0x2 never completed.

and it never recovers.

Second race might happen while running mmc_power_off():

static void mmc_power_off(struct mmc_host *host)
{
	host->ios.clock = 0;
	host->ios.vdd = 0;

[ clock gating kicks in here ]

	/*
	 * Reset ocr mask to be the highest possible voltage supported for
	 * this mmc host. This value will be used at next power up.
	 */
	host->ocr = 1 << (fls(host->ocr_avail) - 1);

	if (!mmc_host_is_spi(host)) {
		host->ios.bus_mode = MMC_BUSMODE_OPENDRAIN;
		host->ios.chip_select = MMC_CS_DONTCARE;
	}
	host->ios.power_mode = MMC_POWER_OFF;
	host->ios.bus_width = MMC_BUS_WIDTH_1;
	host->ios.timing = MMC_TIMING_LEGACY;
	mmc_set_ios(host);
}

If the clock gating worker kicks in while we are only partially updated the
ios structure the host controller gets incomplete ios and might not work as
supposed. Again on Intel Medfield platform we get:

[    4.185349] kernel BUG at drivers/mmc/host/sdhci.c:1155!
[    4.185422] invalid opcode: 0000 [#1] PREEMPT SMP
[    4.185509] Modules linked in:
[    4.185565]
[    4.185608] Pid: 4, comm: kworker/0:0 Not tainted 3.0.0+ #240 Intel Corporation Medfield/iCDKA
[    4.185742] EIP: 0060:[<c136364e>] EFLAGS: 00010083 CPU: 0
[    4.185827] EIP is at sdhci_set_power+0x3e/0xd0
[    4.185891] EAX: f5ff98e0 EBX: f5ff98e0 ECX: 00000000 EDX: 00000001
[    4.185970] ESI: f5ff977c EDI: f5ff9904 EBP: f644fe98 ESP: f644fe94
[    4.186049]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[    4.186125] Process kworker/0:0 (pid: 4, ti=f644e000 task=f644c0e0 task.ti=f644e000)
[    4.186219] Stack:
[    4.186257]  f5ff98e0 f644feb0 c1365173 00000282 f5ff9460 f5ff96e0 f5ff96e0 f644feec
[    4.186418]  c1355bd8 f644c0e0 c1499c3d f5ff96e0 f644fed4 00000006 f5ff96e0 00000286
[    4.186579]  f644fedc c107922b f644feec 00000286 f5ff9460 f5ff9700 f644ff10 c135839e
[    4.186739] Call Trace:
[    4.186802]  [<c1365173>] sdhci_set_ios+0x1c3/0x340
[    4.186883]  [<c1355bd8>] mmc_gate_clock+0x68/0x120
[    4.186963]  [<c1499c3d>] ? _raw_spin_unlock_irqrestore+0x4d/0x60
[    4.187052]  [<c107922b>] ? trace_hardirqs_on+0xb/0x10
[    4.187134]  [<c135839e>] mmc_host_clk_gate_delayed+0xbe/0x130
[    4.187219]  [<c105ec09>] ? process_one_work+0xf9/0x5b0
[    4.187300]  [<c135841d>] mmc_host_clk_gate_work+0xd/0x10
[    4.187379]  [<c105ec82>] process_one_work+0x172/0x5b0
[    4.187457]  [<c105ec09>] ? process_one_work+0xf9/0x5b0
[    4.187538]  [<c1358410>] ? mmc_host_clk_gate_delayed+0x130/0x130
[    4.187625]  [<c105f3c8>] worker_thread+0x118/0x330
[    4.187700]  [<c1496cee>] ? preempt_schedule+0x2e/0x50
[    4.187779]  [<c105f2b0>] ? rescuer_thread+0x1f0/0x1f0
[    4.187857]  [<c1062cf4>] kthread+0x74/0x80
[    4.187931]  [<c1062c80>] ? __init_kthread_worker+0x60/0x60
[    4.188015]  [<c149acfa>] kernel_thread_helper+0x6/0xd
[    4.188079] Code: 81 fa 00 00 04 00 0f 84 a7 00 00 00 7f 21 81 fa 80 00 00 00 0f 84 92 00 00 00 81 fa 00 00 0
[    4.188780] EIP: [<c136364e>] sdhci_set_power+0x3e/0xd0 SS:ESP 0068:f644fe94
[    4.188898] ---[ end trace a7b23eecc71777e4 ]---

This BUG() comes from the fact that ios.power_mode was still in previous
value (MMC_POWER_ON) and ios.vdd was set to zero.

We prevent these by inhibiting the clock gating while we update the ios
structure.

Both problems can be reproduced by simply running the device in a reboot
loop.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Chris Ball <cjb@laptop.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-08-31 16:25:49 -04:00
..
accessibility
acpi Merge branch 'battery' into release 2011-08-05 22:16:42 -04:00
amba
ata drivers/ata/sata_dwc_460ex.c: add missing kfree 2011-08-18 23:58:11 -04:00
atm
auxdisplay
base Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2011-08-17 13:15:25 -07:00
bcma
block Merge branch 'stable/for-jens' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus 2011-08-09 20:43:26 +02:00
bluetooth
cdrom drivers/cdrom/cdrom.c: relax check on dvd manufacturer value 2011-08-02 12:43:50 +02:00
char net: Compute protocol sequence numbers and fragment IDs using MD5. 2011-08-06 18:33:19 -07:00
clk
clocksource
connector proc_fork_connector: a lockless ->real_parent usage is not safe 2011-07-28 18:26:32 -07:00
cpufreq
cpuidle cpuidle: stop depending on pm_idle 2011-08-03 19:06:37 -04:00
crypto n2_crypto: Attach on Niagara-T3. 2011-07-28 01:30:07 -07:00
dca
dio
dma Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm 2011-08-10 17:37:17 -07:00
edac i7core_edac: fixed typo in error count calculation 2011-08-18 14:07:15 -07:00
eisa eisa/pci_eisa.c: fix BUG introduced by 005bdad7b8 2011-08-04 06:32:51 -10:00
firewire Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 2011-08-21 18:13:19 -07:00
firmware Merge branch 'pstore-efi' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 2011-08-02 21:18:39 -10:00
gpio Merge branch 'gpio/next' of git://git.secretlab.ca/git/linux-2.6 2011-08-01 06:13:48 -10:00
gpu Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-intel 2011-08-19 23:07:08 -07:00
hid
hwmon hwmon: (ibmaem) add missing kfree 2011-08-11 10:14:18 -07:00
hwspinlock
i2c
ide drivers/ide/cy82c693.c: Add missing pci_dev_put 2011-08-04 01:30:34 -07:00
idle
ieee802154
infiniband Merge branches 'ipoib' and 'iser' into for-next 2011-08-17 10:57:43 -07:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2011-08-03 22:00:09 -10:00
iommu
isdn Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2011-07-28 05:58:19 -07:00
leds
lguest
macintosh
mca
md dm table: set flush capability based on underlying devices 2011-08-02 12:32:08 +01:00
media Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 2011-07-30 00:08:53 -07:00
memstick
message Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 2011-07-30 08:36:02 -10:00
mfd mfd: Fix mismatch in twl4030 mutex lock-unlock 2011-07-31 23:28:27 +02:00
misc mmc: cb710: fix possible pci_dev leak in cb710_pci_configure() 2011-08-13 14:50:23 -04:00
mmc mmc: core: prevent aggressive clock gating racing with ios updates 2011-08-31 16:25:49 -04:00
mtd
net Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2011-08-11 23:09:46 -07:00
nfc
nubus
of Revert "dt: add of_alias_scan and of_alias_get_id" 2011-08-04 11:26:24 +01:00
oprofile
parisc
parport
pci pci: fix new kernel-doc warning in pci.c 2011-08-20 18:02:32 -07:00
pcmcia Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6 2011-07-31 06:23:08 -10:00
platform acer-wmi: support Lenovo ideapad S205 wifi switch 2011-08-05 15:21:52 -04:00
pnp
power Merge git://git.infradead.org/battery-2.6 2011-07-31 06:24:50 -10:00
pps
ps3
ptp
rapidio
regulator Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 2011-08-01 14:05:46 -10:00
rtc Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-08-17 10:28:33 -07:00
s390 [S390] qdio: Use kstrtoul_from_user 2011-08-03 16:44:21 +02:00
sbus
scsi Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 2011-07-30 08:36:02 -10:00
sfi
sh Merge branch 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-3.x 2011-08-01 06:10:16 -10:00
sn
spi spi/pl022: remove function cannot exit 2011-08-02 14:54:11 +01:00
ssb
staging gma500: kill MIPI interface types 2011-08-16 07:22:16 -07:00
target Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2011-08-03 15:12:09 -10:00
tc
telephony
thermal thermal: make THERMAL_HWMON implementation fully internal 2011-08-02 14:51:57 -04:00
tty sh: Fix boot crash related to SCI 2011-08-07 15:51:45 -07:00
uio
usb USB: Serial: Add PID(0xF7C0) to FTDI SIO driver for a zeitcontrol-device 2011-08-10 22:11:45 -07:00
uwb
vhost
video savagedb: Fix typo causing regression in savage4 series video chip detection 2011-08-06 12:02:40 -07:00
virt
virtio
vlynq
w1
watchdog watchdog: Cleanup WATCHDOG_CORE help text 2011-08-02 08:23:07 +00:00
xen xen: self-balloon needs module.h 2011-08-16 07:23:34 -07:00
zorro
Kconfig
Makefile