linux-kernel-test/arch/x86/kernel
Steven Rostedt 40d6a14662 Kick CPUS that might be sleeping in cpus_idle_wait
Sometimes cpu_idle_wait gets stuck because it might miss CPUS that are
already in idle, have no tasks waiting to run and have no interrupts going
to them.  This is common on bootup when switching cpu idle governors.

This patch gives those CPUS that don't check in an IPI kick.

 Background:
 -----------
I notice this while developing the mcount patches, that every once in a
while the system would hang. Looking deeper, the hang was always at boot
up when registering init_menu of the cpu_idle menu governor. Talking
with Thomas Gliexner, we discovered that one of the CPUS had no timer
events scheduled for it and it was in idle (running with NO_HZ). So the
CPU would not set the cpu_idle_state bit.

Hitting sysrq-t a few times would eventually route the interrupt to the
stuck CPU and the system would continue.

Note, I would have used the PDA isidle but that is set after the
cpu_idle_state bit is cleared, and would leave a window open where we
may miss being kicked.

hmm, looking closer at this, we still have a small race window between
clearing the cpu_idle_state and disabling interrupts (hence the RFC).

    CPU0:                          CPU 1:
  ---------                       ---------
 cpu_idle_wait():                 cpu_idle():
      |                           __cpu_cpu_var(is_idle) = 1;
      |                           if (__get_cpu_var(cpu_idle_state)) /* == 0 */
 per_cpu(cpu_idle_state, 1) = 1;         |
 if (per_cpu(is_idle, 1)) /* == 1 */     |
 smp_call_function(1)                    |
      |                             receives ipi and runs do_nothing.
 wait on map == empty               idle();
   /* waits forever */

So really we need interrupts off for most of this then. One might think
that we could simply clear the cpu_idle_state from do_nothing, but I'm
assuming that cpu_idle governors can be removed, and this might cause a
race that a governor might be used after the module was removed.

Venki said:

  I think your RFC patch is the right solution here.  As I see it, there is
  no race with your RFC patch.  As long as you call a dummy smp_call_function
  on all CPUs, we should be OK.  We can get rid of cpu_idle_state and the
  current wait forever logic altogether with dummy smp_call_function.  And so
  there wont be any wait forever scenario.

  The whole point of cpu_idle_wait() is to make all CPUs come out of idle
  loop atleast once.  The caller will use cpu_idle_wait something like this.

  // Want to change idle handler

  - Switch global idle handler to always present default_idle

  - call cpu_idle_wait so that all cpus come out of idle for an instant
    and stop using old idle pointer and start using default idle

  - Change the idle handler to a new handler

  - optional cpu_idle_wait if you want all cpus to start using the new
    handler immediately.

Maybe the below 1s patch is safe bet for .24.  But for .25, I would say we
just replace all complicated logic by simple dummy smp_call_function and
remove cpu_idle_state altogether.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-14 08:52:22 -08:00
..
acpi ACPI: suspend: old debugging hacks sneaked back 2007-12-06 16:03:06 -05:00
cpu x86: intel_cacheinfo.c: cpu cache info entry for Intel Tolapai 2007-12-21 01:27:19 +01:00
.gitignore .gitignore update for x86 arch 2007-10-17 21:19:04 +02:00
alternative.c x86: convert cpuinfo_x86 array to a per_cpu array 2007-10-19 20:35:04 +02:00
aperture_64.c x86 gart: rename symbols only used for the GART implementation 2007-10-30 00:22:22 +01:00
apic_32.c x86 apic_32.c section fix 2007-12-19 23:20:18 +01:00
apic_64.c x86: add lapic_shutdown for x86_64 2007-10-23 22:37:22 +02:00
apm_32.c PM: ACPI and APM must not be enabled at the same time 2008-01-11 12:26:47 -05:00
asm-offsets_32.c Boot with virtual == physical to get closer to native Linux. 2007-10-23 15:49:54 +10:00
asm-offsets_64.c x86: Fix boot protocol KEEP_SEGMENTS check. 2007-10-27 20:57:43 +02:00
asm-offsets.c i386: move kernel 2007-10-11 11:17:01 +02:00
audit_64.c x86_64: move kernel 2007-10-11 11:17:24 +02:00
bootflag.c i386: move kernel 2007-10-11 11:17:01 +02:00
bugs_64.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
cpuid.c x86: convert cpuinfo_x86 array to a per_cpu array 2007-10-19 20:35:04 +02:00
crash_dump_32.c kmap leak fix for x86_32 kdump 2007-10-19 11:53:33 -07:00
crash_dump_64.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
crash.c x86: disable hpet legacy replacement for kdump 2007-12-03 17:17:10 +01:00
doublefault_32.c i386: move kernel 2007-10-11 11:17:01 +02:00
e820_32.c kexec: add BSS to resource tree 2007-10-22 08:13:19 -07:00
e820_64.c kexec: add BSS to resource tree 2007-10-22 08:13:19 -07:00
early_printk.c [x86] remove uses of magic macros for boot_params access 2007-10-16 17:38:31 -07:00
early-quirks.c x86 gart: rename symbols only used for the GART implementation 2007-10-30 00:22:22 +01:00
efi_32.c kexec: add BSS to resource tree 2007-10-22 08:13:19 -07:00
efi_stub_32.S i386: move kernel 2007-10-11 11:17:01 +02:00
entry_32.S Merge branch 'xen-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen 2007-10-17 11:10:11 -07:00
entry_64.S x86: return correct error code from child_rip in x86_64 entry.S 2007-10-17 20:15:29 +02:00
genapic_64.c x86: convert cpu_to_apicid to be a per cpu variable 2007-10-19 20:35:03 +02:00
genapic_flat_64.c x86: convert cpu_to_apicid to be a per cpu variable 2007-10-19 20:35:03 +02:00
geode_32.c x86: Geode Multi-Function General Purpose Timers support 2007-10-12 23:04:06 +02:00
head64.c x86: use descriptor's functions instead of inline assembly 2007-10-19 20:35:03 +02:00
head_32.S fix lguest rmmod "bad pgd" 2008-01-01 11:30:35 -08:00
head_64.S x86_64: move kernel 2007-10-11 11:17:24 +02:00
hpet.c x86: disable hpet on shutdown 2007-12-03 17:17:10 +01:00
i386_ksyms_32.c x86: export the symbol empty_zero_page on the 32-bit x86 architecture 2007-11-26 20:42:19 +01:00
i387_32.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
i387_64.c x86: fix taking DNA during 64bit sigreturn 2007-11-12 11:09:33 -08:00
i8237.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
i8253.c spelling fixes: arch/i386/ 2007-10-20 01:13:56 +02:00
i8259_32.c i386: introduce "used_vectors" bitmap which can be used to reserve vectors. 2007-10-19 20:35:03 +02:00
i8259_64.c x86: more struct irqaction initializer cleanups 2007-10-17 20:16:07 +02:00
init_task.c x86: merge init_task_32/64.c 2007-10-19 20:35:02 +02:00
io_apic_32.c x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!" 2007-12-18 18:05:58 +01:00
io_apic_64.c x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!" 2007-12-18 18:05:58 +01:00
ioport_32.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
ioport_64.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
irq_32.c x86: also show non-zero IRQ counts for vectors that currently don't have a handler 2007-10-17 20:16:54 +02:00
irq_64.c x86: also show non-zero IRQ counts for vectors that currently don't have a handler 2007-10-17 20:16:54 +02:00
k8.c x86_64: move kernel 2007-10-11 11:17:24 +02:00
kprobes_32.c x86: jprobe bugfix 2007-12-18 18:05:58 +01:00
kprobes_64.c x86: kprobes bugfix 2007-12-18 18:05:58 +01:00
ldt_32.c x86: convert mm_context_t semaphore to a mutex 2007-10-17 20:17:05 +02:00
ldt_64.c x86: convert mm_context_t semaphore to a mutex 2007-10-17 20:17:00 +02:00
machine_kexec_32.c Use extended crashkernel command line on i386 2007-10-19 11:53:49 -07:00
machine_kexec_64.c x86: Dump filtering supports x86_64 sparsemem 2007-10-27 20:57:43 +02:00
Makefile x86: delete vsyscall files during make clean 2007-10-17 21:56:01 +02:00
Makefile_32 x86: do not use $(ARCH) when not needed 2007-11-12 21:02:20 +01:00
Makefile_64 x86: do not use $(ARCH) when not needed 2007-11-12 21:02:20 +01:00
mca_32.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
mfgpt_32.c x86: Geode MFGPT clock event device support 2007-10-12 23:04:06 +02:00
microcode.c x86: convert cpuinfo_x86 array to a per_cpu array 2007-10-19 20:35:04 +02:00
module_32.c i386: move kernel 2007-10-11 11:17:01 +02:00
module_64.c x86_64: move kernel 2007-10-11 11:17:24 +02:00
mpparse_32.c spelling fixes: arch/i386/ 2007-10-20 01:13:56 +02:00
mpparse_64.c x86: acpi use cpu_physical_id 2007-10-19 20:35:03 +02:00
msr.c x86: convert cpuinfo_x86 array to a per_cpu array 2007-10-19 20:35:04 +02:00
nmi_32.c x86: add the word 'WARNING' in check_nmi_watchdog() output 2007-12-04 17:19:07 +01:00
nmi_64.c x86: add the word 'WARNING' in check_nmi_watchdog() output 2007-12-04 17:19:07 +01:00
numaq_32.c i386: move kernel 2007-10-11 11:17:01 +02:00
paravirt_32.c x86/paravirt: revert exports to restore old behaviour 2007-11-29 09:24:55 -08:00
pci-calgary_64.c x86 gart: rename iommu.h to gart.h 2007-10-30 00:22:22 +01:00
pci-dma_32.c i386: Clean up duplicate includes in arch/i386/kernel/ 2007-10-17 20:15:51 +02:00
pci-dma_64.c x86: turn off iommu merge by default 2007-11-26 20:42:19 +01:00
pci-gart_64.c x86 gart: rename symbols only used for the GART implementation 2007-10-30 00:22:22 +01:00
pci-nommu_64.c x86 gart: rename iommu.h to gart.h 2007-10-30 00:22:22 +01:00
pci-swiotlb_64.c x86 gart: rename iommu.h to gart.h 2007-10-30 00:22:22 +01:00
pcspeaker.c i386: move kernel 2007-10-11 11:17:01 +02:00
pmtimer_64.c x86_64: move kernel 2007-10-11 11:17:24 +02:00
process_32.c Kick CPUS that might be sleeping in cpus_idle_wait 2008-01-14 08:52:22 -08:00
process_64.c Kick CPUS that might be sleeping in cpus_idle_wait 2008-01-14 08:52:22 -08:00
ptrace_32.c spelling fixes: arch/i386/ 2007-10-20 01:13:56 +02:00
ptrace_64.c x86: convert mm_context_t semaphore to a mutex 2007-10-17 20:17:00 +02:00
quirks.c x86: Add HPET force support for MCP55 (nForce 5) chipsets 2007-10-23 22:37:25 +02:00
reboot_32.c x86: disable hpet on shutdown 2007-12-03 17:17:10 +01:00
reboot_64.c x86: disable hpet on shutdown 2007-12-03 17:17:10 +01:00
reboot_fixups_32.c x86: reboot fixup for wrap2c board 2007-11-17 16:27:02 +01:00
relocate_kernel_32.S i386: move kernel 2007-10-11 11:17:01 +02:00
relocate_kernel_64.S x86_64: move kernel 2007-10-11 11:17:24 +02:00
scx200_32.c long vs. unsigned long - low-hanging fruits in drivers 2007-10-14 12:41:51 -07:00
setup64.c x86: use descriptor's functions instead of inline assembly 2007-10-19 20:35:03 +02:00
setup_32.c x86_32: disable_pse must be __cpuinitdata 2007-12-19 23:20:19 +01:00
setup_64.c x86: fixup cpu_info array conversion 2007-11-17 16:27:01 +01:00
sigframe_32.h i386: move kernel 2007-10-11 11:17:01 +02:00
signal_32.c spelling fixes: arch/i386/ 2007-10-20 01:13:56 +02:00
signal_64.c spelling fixes: arch/x86_64/ 2007-10-20 01:25:36 +02:00
smp_32.c x86: export smp_ops to allow modular build of KVM 2007-10-27 20:57:43 +02:00
smp_64.c x86: implement missing x86_64 function smp_call_function_mask() 2007-10-19 20:35:03 +02:00
smpboot_32.c x86 smpboot_32.c section fixes 2007-12-19 23:20:18 +01:00
smpboot_64.c x86: fix do_fork_idle section mismatch 2008-01-08 16:10:35 -08:00
smpcommon_32.c i386: move kernel 2007-10-11 11:17:01 +02:00
srat_32.c i386: move kernel 2007-10-11 11:17:01 +02:00
stacktrace.c x86: constify stacktrace_ops 2007-10-17 20:16:11 +02:00
summit_32.c spelling fixes: arch/i386/ 2007-10-20 01:13:56 +02:00
suspend_64.c revert "Hibernation: Use temporary page tables for kernel text mapping on x86_64" 2007-12-17 19:28:15 -08:00
suspend_asm_64.S x86: Save registers in saved_context during suspend and hibernation 2007-10-23 22:37:24 +02:00
sys_i386_32.c remove include/asm-*/ipc.h 2007-10-17 08:42:55 -07:00
sys_x86_64.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
syscall_64.c i386/x86_64: move headers to include/asm-x86 2007-10-11 11:20:03 +02:00
syscall_table_32.S i386: move kernel 2007-10-11 11:17:01 +02:00
sysenter_32.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
tce_64.c x86: Create clflush() inline, remove hardcoded wbinvd 2007-10-17 20:16:12 +02:00
time_32.c x86: Fix irq0 / local apic timer accounting 2007-10-12 23:04:06 +02:00
time_64.c x86: on x86_64, correct reading of PC RTC when update in progress in time_64.c 2007-11-17 16:27:01 +01:00
topology.c x86: arch_register_cpu() section fix 2007-12-04 17:19:07 +01:00
trampoline_32.S x86: misc. constifications 2007-10-17 20:16:08 +02:00
trampoline_64.S x86: misc. constifications 2007-10-17 20:16:08 +02:00
traps_32.c x86: fix die() to not be preemptible 2007-12-21 01:27:19 +01:00
traps_64.c lockdep: annotate do_debug() trap handler 2007-11-26 20:42:19 +01:00
tsc_32.c x86: fix more TSC clock source calibration errors 2007-10-23 22:37:22 +02:00
tsc_64.c x86: convert cpuinfo_x86 array to a per_cpu array 2007-10-19 20:35:04 +02:00
tsc_sync.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
verify_cpu_64.S x86_64: move kernel 2007-10-11 11:17:24 +02:00
vm86_32.c Delete filenames in comments. 2007-10-13 10:01:23 -07:00
vmi_32.c paravirt: clean up lazy mode handling 2007-10-16 11:51:29 -07:00
vmiclock_32.c i386: move kernel 2007-10-11 11:17:01 +02:00
vmlinux_32.lds.S i386: move kernel 2007-10-11 11:17:01 +02:00
vmlinux_64.lds.S x86_64: move kernel 2007-10-11 11:17:24 +02:00
vmlinux.lds.S i386: move kernel 2007-10-11 11:17:01 +02:00
vsmp_64.c x86_64: move kernel 2007-10-11 11:17:24 +02:00
vsyscall_32.lds.S i386: move kernel 2007-10-11 11:17:01 +02:00
vsyscall_32.S i386: move kernel 2007-10-11 11:17:01 +02:00
vsyscall_64.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial 2007-10-19 20:36:17 -07:00
vsyscall-int80_32.S i386: move kernel 2007-10-11 11:17:01 +02:00
vsyscall-note_32.S i386: move kernel 2007-10-11 11:17:01 +02:00
vsyscall-sigreturn_32.S i386: move kernel 2007-10-11 11:17:01 +02:00
vsyscall-sysenter_32.S i386: move kernel 2007-10-11 11:17:01 +02:00
x8664_ksyms_64.c x86_64: move kernel 2007-10-11 11:17:24 +02:00