Commit Graph

298874 Commits

Author SHA1 Message Date
Gary Hade
15afae6046 ACPI, APEI: Fix incorrect APEI register bit width check and usage
The current code incorrectly assumes that
(1) the APEI register bit width is always 8, 16, 32, or 64 and
(2) the APEI register bit width is always equal to the APEI
    register access width.

ERST serialization instructions entries such as:

[030h 0048   1]                       Action : 00 [Begin Write Operation]
[031h 0049   1]                  Instruction : 03 [Write Register Value]
[032h 0050   1]        Flags (decoded below) : 01
                      Preserve Register Bits : 1
[033h 0051   1]                     Reserved : 00

[034h 0052  12]              Register Region : [Generic Address Structure]
[034h 0052   1]                     Space ID : 00 [SystemMemory]
[035h 0053   1]                    Bit Width : 03
[036h 0054   1]                   Bit Offset : 00
[037h 0055   1]         Encoded Access Width : 03 [DWord Access:32]
[038h 0056   8]                      Address : 000000007F2D7038

[040h 0064   8]                        Value : 0000000000000001
[048h 0072   8]                         Mask : 0000000000000007

break this assumption by yielding:
  [Firmware Bug]: APEI: Invalid bit width in GAR [0x7f2d7038/3/0]

I have found no ACPI specification requirements corresponding
with the above assumptions.  There is even a good example in
the Serialization Instruction Entries section (ACPI 4.0 section
17.4,1.2, ACPI 4.0a section 2.5.1.2, ACPI 5.0 section 18.5.1.2)
that mentions a serialization instruction with a bit range of
[6:2] which is 5 bits wide, _not_ 8, 16, 32, or 64 bits wide.

Compile and boot tested with 3.3.0-rc7 on a IBM HX5.

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:30:19 -04:00
Chen Gong
6ef19ab7fa Update documentation for parameter *notrigger* in einj.txt
Add description of parameter notrigger in the einj.txt.
One can utilize this new parameter to do some SRAR injection
test. Pay attention, the operation is highly depended on the
BIOS implementation. If no proper BIOS supports it, even if
enabling this parameter, expected result will not happen.

v2:
  Update the documentation suggested by Tony

Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:30:19 -04:00
Chen Gong
ee49089dc7 ACPI, APEI, EINJ, new parameter to control trigger action
Some APEI firmware implementation will access injected address
specified in param1 to trigger the error when injecting memory
error, which means if one SRAR error is injected, the crash
always happens because it is executed in kernel context. This
new parameter can disable trigger action and control is taken
over by the user. In this way, an SRAR error can happen in user
context instead of crashing the system. This function is highly
depended on BIOS implementation so please ensure you know the
BIOS trigger procedure before you enable this switch.

v2:
  notrigger should be created together with param1/param2

Tested-by: Tony Luck <tony.luck@lintel.com>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:30:18 -04:00
Chen Gong
185210cc75 ACPI, APEI, EINJ, limit the range of einj_param
On the platforms with ACPI4.x support, parameter extension
is not always doable, which means only parameter extension
is enabled, einj_param can take effect.

v2->v1: stopping early in einj_get_parameter_address for einj_param

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:30:18 -04:00
Jiang Liu
7ed28f2ed4 ACPI, APEI, Fix ERST header length check
This fixes a trivial copy & paste error in ERST header length check.
It's just for future safety because sizeof(struct acpi_table_einj)
equals to sizeof(struct acpi_table_erst) with current ACPI5.0
specification. It applies to v3.3-rc6.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:30:17 -04:00
Boris Ostrovsky
02401c06b7 cpuidle: power_usage should be declared signed integer
power_usage is always assigned a negative value and should be declared
a signed integer

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:23:30 -04:00
Boris Ostrovsky
1a022e3f1b idle, x86: Allow off-lined CPU to enter deeper C states
Currently when a CPU is off-lined it enters either MWAIT-based idle or,
if MWAIT is not desired or supported, HLT-based idle (which places the
processor in C1 state). This patch allows processors without MWAIT
support to stay in states deeper than C1.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:23:01 -04:00
Linus Torvalds
f52b69f86e Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
Pull SuperH updates from Paul Mundt.

* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh: (25 commits)
  sh: Support I/O space swapping where needed.
  sh: use set_current_blocked() and block_sigmask()
  sh: no need to reset handler if SA_ONESHOT
  sh: intc: Fix up section mismatch for intc_ack_data
  sh: select ARCH_DISCARD_MEMBLOCK.
  sh: Consolidate duplicate _32/_64 unistd definitions.
  sh: ecovec: switch SDHI controllers to card polling
  sh: Avoid exporting unimplemented syscalls.
  sh: add platform_device for RSPI in setup-sh7757
  SH: pci-sh7780: enable big-endian operation.
  serial: sh-sci: fix a race of DMA submit_tx on transfer
  sh: dma: Collect up CHCR of SH7763, SH7764, SH7780 and SH7785
  sh: dma: Collect up CHCR of SH7723 and SH7730
  sh/next: Fix build fail by asm/system.h in asm/bitops.h
  arch/sh/drivers/dma/{dma-g2,dmabrg}.c: ensure arguments to request_irq and free_irq are compatible
  sh: cpufreq: Wire up scaling_available_freqs support.
  sh: cpufreq: notify about rate rounding fallback.
  sh: cpufreq: Support CPU clock frequency table.
  sh: cpufreq: struct device lookup from CPU topology.
  sh: cpufreq: percpu struct clk accounting.
  ...
2012-03-30 00:09:17 -07:00
Toshi Kani
9f324bda97 ACPI: Add CPU hotplug support for processor device objects
acpi_processor_install_hotplug_notify() registers processor objects to
receive ACPI CPU hotplug event notifications. This patch additionally
registers processor device objects (ACPI0007) to receive the notifications
as well.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:06:51 -04:00
Len Brown
f6365201d8 x86: Remove the ancient and deprecated disable_hlt() and enable_hlt() facility
The X86_32-only disable_hlt/enable_hlt mechanism was used by the
32-bit floppy driver. Its effect was to replace the use of the
HLT instruction inside default_idle() with cpu_relax() - essentially
it turned off the use of HLT.

This workaround was commented in the code as:

 "disable hlt during certain critical i/o operations"

 "This halt magic was a workaround for ancient floppy DMA
  wreckage. It should be safe to remove."

H. Peter Anvin additionally adds:

 "To the best of my knowledge, no-hlt only existed because of
  flaky power distributions on 386/486 systems which were sold to
  run DOS.  Since DOS did no power management of any kind,
  including HLT, the power draw was fairly uniform; when exposed
  to the much hhigher noise levels you got when Linux used HLT
  caused some of these systems to fail.

  They were by far in the minority even back then."

Alan Cox further says:

 "Also for the Cyrix 5510 which tended to go castors up if a HLT
  occurred during a DMA cycle and on a few other boxes HLT during
  DMA tended to go astray.

  Do we care ? I doubt it. The 5510 was pretty obscure, the 5520
  fixed it, the 5530 is probably the oldest still in any kind of
  use."

So, let's finally drop this.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Stephen Hemminger <shemminger@vyatta.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/n/tip-3rhk9bzf0x9rljkv488tloib@git.kernel.org
[ If anyone cares then alternative instruction patching could be
  used to replace HLT with a one-byte NOP instruction. Much simpler. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-30 08:50:27 +02:00
Ingo Molnar
186e54cbe1 Merge branch 'linus' into x86/urgent
Merge reason: Needed for include file dependencies.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-30 08:50:06 +02:00
Bjorn Helgaas
c6436f5a39 ACPI / PM: print physical addresses consistently with other parts of kernel
Print physical address info in a style consistent with the %pR style used
elsewhere in the kernel.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 02:46:57 -04:00
Matthew Garrett
9bcb811896 ACPI: Evaluate thermal trip points before reading temperature
An HP laptop (Pavilion G4-1016tx) has the following code in _TMP:

       Store (\_SB.PCI0.LPCB.EC0.RTMP, Local0)
       If (LGreaterEqual (Local0, S4TP))
       {
           Store (One, HTS4)
       }

S4TP is initialised at 0 and not programmed further until either _HOT or
_CRT is called. If we evaluate _TMP before the trip points then HTS4 will
always be set, causing the firmware to generate a message on boot
complaining that the system shut down because of overheating. The simplest
solution is just to reverse the checking of trip points and _TMP in thermal
init.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 02:38:31 -04:00
Lin Ming
b24e509885 ACPI, PCI: Move acpi_dev_run_wake() to ACPI core
acpi_dev_run_wake() is a generic function which can be used by
other subsystem too. Rename it to acpi_pm_device_run_wake, to be
consistent with acpi_pm_device_sleep_wake.

Then move it to ACPI core.

Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 02:21:18 -04:00
Linus Torvalds
2f7fa1be66 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull 2nd round of input updates from Dmitry Torokhov:
 - update to Wacom driver to support wireless devices
 - update to Sentelci touchpad driver to support newer hardware
 - update to gpio-keys driver to support "interrupt-only" keys
 - fixups to earlier commits

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: wacom - check for allocation failure in probe()
  Input: tegra-kbc - allocate pdata before using it
  Input: amijoy - add missing platform check
  Input: wacom - wireless battery status
  Input: wacom - create inputs when wireless connect
  Input: wacom - wireless monitor framework
  Input: wacom - isolate input registration
  Input: sentelic - improve packet debugging information
  Input: sentelic - minor code cleanup
  Input: sentelic - enabling absolute coordinates output for newer hardware
  Input: sentelic - refactor code for upcoming new hardware support
  Input: gpio_keys - add support for interrupt only keys
  Input: gpio_keys - consolidate key destructor code
  Input: revert "gpio_keys - switch to using threaded IRQs"
  Input: gpio_keys - constify platform data
  Input: spear-keyboard - remove kbd_set_plat_data()
2012-03-29 23:17:44 -07:00
Daniel Lezcano
e07510585a cpuidle: remove unused 'governor_data' field
As far as I can see, this field is never used in the code.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:55:40 -04:00
Daniel Lezcano
db70b04407 cpuidle: remove useless array definition in cpuidle_structure
All the modules name are ro-data, it is never copied to the array.

eg.

static struct cpuidle_driver intel_idle_driver = {
        .name = "intel_idle",
        .owner = THIS_MODULE,
};

It safe to assign the pointer of this ro-data to a const char *.
By this way we save 12 bytes.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Tested-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:55:22 -04:00
Daniel Lezcano
fc850f39ea cpuidle: use the driver's state_count as default
If the state_count is not initialized for the device use
the driver's state count as the default. That will prevent
to add it manually in the cpuidle driver initialization
routine and will save us from duplicate line of code.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:55:04 -04:00
ShuoX Liu
3a53396b03 cpuidle: add a sysfs entry to disable specific C state for debug purpose.
Some C states of new CPU might be not good.  One reason is BIOS might
configure them incorrectly.  To help developers root cause it quickly, the
patch adds a new sysfs entry, so developers could disable specific C state
manually.

In addition, C state might have much impact on performance tuning, as it
takes much time to enter/exit C states, which might delay interrupt
processing.  With the new debug option, developers could check if a deep C
state could impact performance and how much impact it could cause.

Also add this option in Documentation/cpuidle/sysfs.txt.

[akpm@linux-foundation.org: check kstrtol return value]
Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
Reviewed-by: Yanmin Zhang <yanmin_zhang@intel.com>
Reviewed-and-Tested-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:52:58 -04:00
Lin Ming
0090def6c3 ACPI: Add interface to register/unregister device to/from power resources
Devices may share same list of power resources in _PR0, for example

Device(Dev0)
{
	Name (_PR0, Package (0x01)
	{
		P0PR,
		P1PR
	})
}

Device(Dev1)
{
	Name (_PR0, Package (0x01)
	{
		P0PR,
		P1PR
	}
}

Assume Dev0 and Dev1 were runtime suspended.
Then Dev0 is resumed first and it goes into D0 state.
But Dev1 is left in D0_Uninitialised state.

This is wrong. In this case, Dev1 must be resumed too.

In order to hand this case, each power resource maintains a list of
devices which relies on it.

When power resource is ON, it will check if the devices on its list
can be resumed. The device can only be resumed when all the power
resouces of its _PR0 are ON.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:47:20 -04:00
Zhang Rui
3ebc81b893 ACPI: Introduce ACPI D3_COLD state support
If a device has _PR3, it means the device supports D3_COLD.
Add the ability to validate and enter D3_COLD state in ACPI.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:47:00 -04:00
Bob Moore
5aa3c16c6b ACPICA: Update to version 20120320
Version 20120320.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:45:14 -04:00
Bob Moore
6a99b1c94d ACPICA: Object repair code: Support to add Package wrappers
Repair a common problem with objects that are defined to return
a variable-length Package of sub-objects. If there is only one
sub-object, some BIOS code mistakenly simply declares the single
object instead of a Package with one sub-object.  This function
attempts to repair this error by wrapping a Package object around
the original object, creating the correct and expected Package
with one sub-object.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:45:13 -04:00
Dan Carpenter
f182394033 Input: wacom - check for allocation failure in probe()
We accidentally removed the check for NULL in 3aac0ef10b "Input: wacom -
isolate input registration".

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Bagwell <chris@cnpbagwell.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2012-03-29 22:42:19 -07:00
Andi Kleen
6fe0d06282 ACPI: Make ACPI interrupt threaded
Some ACPI interrupt actions may need to wait, and it's easiest to
have a thread context for this. So turn the ACPI interrupt
into a threaded interrupt.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:41:45 -04:00
Andi Kleen
d6795fe32d ACPI: ec: Do request_region outside WARN()
WARN() is not supposed to have side effects, so move the request_regions
outside.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:41:35 -04:00
David S. Miller
8befc9f23c sparc: Fix even more fallout from system.h split.
jump_label.c needs asm/cacheflush.h to get flushi().
kgdb_64.c needs asm/cacheflush.h to get flushw_all().

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-29 22:40:52 -07:00
Stephen Rothwell
7f55ba9c4c sparc: fix fallout from system.h split
Fixes this build error:

kernel/signal.c: In function 'ptrace_stop':
kernel/signal.c:1860:3: error: implicit declaration of function 'synchronize_user_stack' [-Werror=implicit-function-declaration]

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-29 22:39:57 -07:00
Linus Torvalds
1338631433 Merge branch 'for-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull urgent cgroup fix from Tejun Heo:
 "Commit 61d1d219c4 ('cgroup: remove extra calls to
  find_existing_css_set') which was part of the rc1 cgroup pull request
  made writes to the cgroup "tasks" file return an uninitialized retval
  on success which can cause boot failures with systemd.

  The change stayed in linux-next for quite some time but gcc
  interestingly failed to emit warning about using uninitialized
  variable and the problem seems to materialize only for certain build
  combinations (probably depends on register allocation).

  It's just missing local variable initialization and the fix is trivial
  & safe.  As the problem is critical when it materializes, I'm
  fast-tracking it.  Also included is Li's email address change in
  MAINTAINERS."

* 'for-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: cgroup_attach_task() could return -errno after success
  cgroup: update MAINTAINERS entry
2012-03-29 22:33:10 -07:00
Tejun Heo
8f121918f2 cgroup: cgroup_attach_task() could return -errno after success
61d1d219c4 "cgroup: remove extra calls to find_existing_css_set" made
cgroup_task_migrate() return void.  An unfortunate side effect was
that cgroup_attach_task() was depending on that function's return
value to clear its @retval on the success path.  On cgroup mounts
without any subsystem with ->can_attach() callback,
cgroup_attach_task() ended up returning @retval without initializing
it on success.

For some reason, gcc failed to warn about it and it didn't cause
cgroup_attach_task() to return non-zero value in many cases, probably
due to difference in register allocation.  When the problem
materializes, systemd fails to populate /systemd cgroup mount and
fails to boot.

Fix it by initializing @retval to zero on declaration.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <alpine.LNX.2.00.1203282354440.25526@pobox.suse.cz>
Reviewed-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2012-03-29 22:03:33 -07:00
Linus Torvalds
4bde23f875 Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull arm-soc fixes from Olof Johansson:
 "This is a first pass of some of the merge window fallout for ARM
  platforms.

  Nothing controversial:
   - A system.h fallout fix for OMAP
   - PXA fixes for breakage caused by the regulator struct changes
   - GPIO fixes for OMAP to properly deal with dynamic IRQ allocation
   - A mismerge in our arm-soc tree of an lpc32xx change for networking
   - A fix for USB setup on tegra
   - An undo of __init annotation of display mux setup on OMAP that's
     needed at runtime"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: pxa: fix build issue on stargate2
  ARM: pxa: fix build issue on cm-x300
  ARM: pxa: fix build failure for regulator consumer in em-x270.c
  ARM: LPC32xx: clock.c: Fix lpc-eth clock reference
  ARM: OMAP: pm: fix compilation break
  ARM: OMAP: Remove OMAP_GPIO_IRQ macro definition
  drivers: input: Fix OMAP_GPIO_IRQ with gpio_to_irq() in ams_delta_serio_exit()
  ARM: OMAP: boards: Fix OMAP_GPIO_IRQ usage with gpio_to_irq()
  ARM: pxa: fix regulator related build fail in magician_defconfig
  ARM: tegra: Fix device tree AUXDATA for USB/EHCI
  ARM: OMAP2+: Remove __init from DSI mux functions
2012-03-29 21:30:28 -07:00
Olof Johansson
f00e9b1186 Merge branch 'fixes' of git://github.com/hzhuang1/linux into fixes
* 'fixes' of git://github.com/hzhuang1/linux:
  ARM: pxa: fix build issue on stargate2
  ARM: pxa: fix build issue on cm-x300
  ARM: pxa: fix build failure for regulator consumer in em-x270.c
  ARM: pxa: fix regulator related build fail in magician_defconfig
2012-03-29 20:36:18 -07:00
Len Brown
15aaa34654 tools turbostat: harden against cpu online/offline
Sometimes users have turbostat running in interval mode
when they take processors offline/online.

Previously, turbostat would survive, but not gracefully.

Tighten up the error checking so turbostat notices
changesn sooner, and print just 1 line on change:

turbostat: re-initialized with num_cpus %d

Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-29 22:27:19 -04:00
Len Brown
88c3281f7b tools turbostat: reduce measurement overhead due to IPIs
turbostat uses /dev/cpu/*/msr interface to read MSRs.
For modern systems, it reads 10 MSR/CPU.  This can
be observed as 10 "Function Call Interrupts"
per CPU per sample added to /proc/interrupts.

This overhead is measurable on large idle systems,
and as Yoquan Song pointed out, it can even trick
cpuidle into thinking the system is busy.

Here turbostat re-schedules itself in-turn to each
CPU so that its MSR reads will always be local.
This replaces the 10 "Function Call Interrupts"
with a single "Rescheduling interrupt" per sample
per CPU.

On an idle 32-CPU system, this shifts some residency from
the shallow c1 state to the deeper c7 state:

 # ./turbostat.old -s
   %c0  GHz  TSC    %c1    %c3    %c6    %c7   %pc2   %pc3   %pc6   %pc7
  0.27 1.29 2.29   0.95   0.02   0.00  98.77  20.23   0.00  77.41   0.00
  0.25 1.24 2.29   0.98   0.02   0.00  98.75  20.34   0.03  77.74   0.00
  0.27 1.22 2.29   0.54   0.00   0.00  99.18  20.64   0.00  77.70   0.00
  0.26 1.22 2.29   1.22   0.00   0.00  98.52  20.22   0.00  77.74   0.00
  0.26 1.38 2.29   0.78   0.02   0.00  98.95  20.51   0.05  77.56   0.00
^C
 i# ./turbostat.new -s
   %c0  GHz  TSC    %c1    %c3    %c6    %c7   %pc2   %pc3   %pc6   %pc7
  0.27 1.20 2.29   0.24   0.01   0.00  99.49  20.58   0.00  78.20   0.00
  0.27 1.22 2.29   0.25   0.00   0.00  99.48  20.79   0.00  77.85   0.00
  0.27 1.20 2.29   0.25   0.02   0.00  99.46  20.71   0.03  77.89   0.00
  0.28 1.26 2.29   0.25   0.01   0.00  99.46  20.89   0.02  77.67   0.00
  0.27 1.20 2.29   0.24   0.01   0.00  99.48  20.65   0.00  78.04   0.00

cc: Youquan Song <youquan.song@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-29 22:04:58 -04:00
Linus Torvalds
e152c38aba Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull devicetree documentation update from Grant Likely.

* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6:
  dt: Linux DT usage model documentation
  mtd: Move fdt partition documentation to a seperate file
2012-03-29 18:57:40 -07:00
Linus Torvalds
eb05df9e7e Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Peter Anvin:
 "The biggest textual change is the cleanup to use symbolic constants
  for x86 trap values.

  The only *functional* change and the reason for the x86/x32 dependency
  is the move of is_ia32_task() into <asm/thread_info.h> so that it can
  be used in other code that needs to understand if a system call comes
  from the compat entry point (and therefore uses i386 system call
  numbers) or not.  One intended user for that is the BPF system call
  filter.  Moving it out of <asm/compat.h> means we can define it
  unconditionally, returning always true on i386."

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Move is_ia32_task to asm/thread_info.h from asm/compat.h
  x86: Rename trap_no to trap_nr in thread_struct
  x86: Use enum instead of literals for trap values
2012-03-29 18:21:35 -07:00
Grant Likely
31134efc68 dt: Linux DT usage model documentation
v2: 2nd draft
 - Editorial cleanups (Randy Dunlap and Stephen Warren)
 - Added missing Microblaze reference (Stephen Neuendorffer)
 - Make example of platform_device creation clearer (Shawn Guo)
 - Expand on PowerPC history and mention i2c mess (David Gibson)
 - convert to plain text (remove bits of html formating)

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2012-03-29 19:13:50 -06:00
Linus Torvalds
a591afc01d Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x32 support for x86-64 from Ingo Molnar:
 "This tree introduces the X32 binary format and execution mode for x86:
  32-bit data space binaries using 64-bit instructions and 64-bit kernel
  syscalls.

  This allows applications whose working set fits into a 32 bits address
  space to make use of 64-bit instructions while using a 32-bit address
  space with shorter pointers, more compressed data structures, etc."

Fix up trivial context conflicts in arch/x86/{Kconfig,vdso/vma.c}

* 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
  x32: Fix alignment fail in struct compat_siginfo
  x32: Fix stupid ia32/x32 inversion in the siginfo format
  x32: Add ptrace for x32
  x32: Switch to a 64-bit clock_t
  x32: Provide separate is_ia32_task() and is_x32_task() predicates
  x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls
  x86/x32: Fix the binutils auto-detect
  x32: Warn and disable rather than error if binutils too old
  x32: Only clear TIF_X32 flag once
  x32: Make sure TS_COMPAT is cleared for x32 tasks
  fs: Remove missed ->fds_bits from cessation use of fd_set structs internally
  fs: Fix close_on_exec pointer in alloc_fdtable
  x32: Drop non-__vdso weak symbols from the x32 VDSO
  x32: Fix coding style violations in the x32 VDSO code
  x32: Add x32 VDSO support
  x32: Allow x32 to be configured
  x32: If configured, add x32 system calls to system call tables
  x32: Handle process creation
  x32: Signal-related system calls
  x86: Add #ifdef CONFIG_COMPAT to <asm/sys_ia32.h>
  ...
2012-03-29 18:12:23 -07:00
Haojian Zhuang
5616131daf ARM: pxa: fix build issue on stargate2
arch/arm/mach-pxa/stargate2.c:155:3: error: unknown field ‘dev’
specified in initializer
arch/arm/mach-pxa/stargate2.c:155:3: warning: initialization from
incompatible pointer type [enabled by default]
arch/arm/mach-pxa/stargate2.c:155:3: warning: (near initialization for
‘stargate2_sensor_3_con[0].dev_name’) [enabled by default]
make[1]: *** [arch/arm/mach-pxa/stargate2.o] Error 1
make: *** [arch/arm/mach-pxa] Error 2

It's caused by 'dev' field removed from struct
regulator_consumer_supply.

Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com>
2012-03-30 09:03:20 +08:00
Haojian Zhuang
e9478587c4 ARM: pxa: fix build issue on cm-x300
arch/arm/mach-pxa/cm-x300.c:716:3: error: unknown field ‘dev’ specified
in initializer
make[1]: *** [arch/arm/mach-pxa/cm-x300.o] Error 1
make: *** [arch/arm/mach-pxa] Error 2

It's caused by 'dev' field removed from struct
regulator_consumer_supply.

Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com>
2012-03-30 09:03:06 +08:00
Linus Torvalds
820d41cf0c Merge tag 'cleanup2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull "ARM: cleanups of io includes" from Olof Johansson:
 "Rob Herring has done a sweeping change cleaning up all of the
  mach/io.h includes, moving some of the oft-repeated macros to a common
  location and removing a bunch of boiler plate.  This is another step
  closer to a common zImage for multiple platforms."

Fix up various fairly trivial conflicts (<mach/io.h> removal vs changes
around it, tegra localtimer.o is *still* gone, yadda-yadda).

* tag 'cleanup2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (29 commits)
  ARM: tegra: Include assembler.h in sleep.S to fix build break
  ARM: pxa: use common IOMEM definition
  ARM: dma-mapping: convert ARCH_HAS_DMA_SET_COHERENT_MASK to kconfig symbol
  ARM: __io abuse cleanup
  ARM: create a common IOMEM definition
  ARM: iop13xx: fix missing declaration of iop13xx_init_early
  ARM: fix ioremap/iounmap for !CONFIG_MMU
  ARM: kill off __mem_pci
  ARM: remove bunch of now unused mach/io.h files
  ARM: make mach/io.h include optional
  ARM: clps711x: remove unneeded include of mach/io.h
  ARM: dove: add explicit include of dove.h to addr-map.c
  ARM: at91: add explicit include of hardware.h to uncompressor
  ARM: ep93xx: clean-up mach/io.h
  ARM: tegra: clean-up mach/io.h
  ARM: orion5x: clean-up mach/io.h
  ARM: davinci: remove unneeded mach/io.h include
  [media] davinci: remove includes of mach/io.h
  ARM: OMAP: Remove remaining includes for mach/io.h
  ARM: msm: clean-up mach/io.h
  ...
2012-03-29 18:02:10 -07:00
Paul Gortmaker
804d231230 ARM: pxa: fix build failure for regulator consumer in em-x270.c
Commit 737f360d5b

"regulator: Remove support for supplies specified by struct device"

caused this file to break, since it was still relying on the
device field to be present.  Map it onto dev_name appropriately

Since there are two consumers with the name "reg-userspace-consumer",
we have to supply the ID as a suffix in the REGULATOR_CONSUMER calls.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
CC: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com>
2012-03-30 08:41:18 +08:00
Linus Torvalds
6268b325c3 Revert "ext4: don't release page refs in ext4_end_bio()"
This reverts commit b43d17f319.

Dave Jones reports that it causes lockups on his laptop, and his debug
output showed a lot of processes hung waiting for page_writeback (or
more commonly - processes hung waiting for a lock that was held during
that writeback wait).

The page_writeback hint made Ted suggest that Dave look at this commit,
and Dave verified that reverting it makes his problems go away.

Ted says:
 "That commit fixes a race which is seen when you write into fallocated
  (and hence uninitialized) disk blocks under *very* heavy memory
  pressure.  Furthermore, although theoretically it could trigger under
  normal direct I/O writes, it only seems to trigger if you are issuing
  a huge number of AIO writes, such that a just-written page can get
  evicted from memory, and then read back into memory, before the
  workqueue has a chance to update the extent tree.

  This race has been around for a little over a year, and no one noticed
  until two months ago; it only happens under fairly exotic conditions,
  and in fact even after trying very hard to create a simple repro under
  lab conditions, we could only reproduce the problem and confirm the
  fix on production servers running MySQL on very fast PCIe-attached
  flash devices.

  Given that Dave was able to hit this problem pretty quickly, if we
  confirm that this commit is at fault, the only reasonable thing to do
  is to revert it IMO."

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-29 17:00:56 -07:00
Linus Torvalds
12679a2d7e Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull more ARM updates from Russell King.

This got a fair number of conflicts with the <asm/system.h> split, but
also with some other sparse-irq and header file include cleanups.  They
all looked pretty trivial, though.

* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (59 commits)
  ARM: fix Kconfig warning for HAVE_BPF_JIT
  ARM: 7361/1: provide XIP_VIRT_ADDR for no-MMU builds
  ARM: 7349/1: integrator: convert to sparse irqs
  ARM: 7259/3: net: JIT compiler for packet filters
  ARM: 7334/1: add jump label support
  ARM: 7333/2: jump label: detect %c support for ARM
  ARM: 7338/1: add support for early console output via semihosting
  ARM: use set_current_blocked() and block_sigmask()
  ARM: exec: remove redundant set_fs(USER_DS)
  ARM: 7332/1: extract out code patch function from kprobes
  ARM: 7331/1: extract out insn generation code from ftrace
  ARM: 7330/1: ftrace: use canonical Thumb-2 wide instruction format
  ARM: 7351/1: ftrace: remove useless memory checks
  ARM: 7316/1: kexec: EOI active and mask all interrupts in kexec crash path
  ARM: Versatile Express: add NO_IOPORT
  ARM: get rid of asm/irq.h in asm/prom.h
  ARM: 7319/1: Print debug info for SIGBUS in user faults
  ARM: 7318/1: gic: refactor irq_start assignment
  ARM: 7317/1: irq: avoid NULL check in for_each_irq_desc loop
  ARM: 7315/1: perf: add support for the Cortex-A7 PMU
  ...
2012-03-29 16:53:48 -07:00
Linus Torvalds
1c03658877 Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/cpupowerutils
Pull cpupower updates from Dominik Brodowski.

* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/cpupowerutils:
  cpupower tools: add install target to the debug tools' makefiles
  cpupower tools: allow to build debug tools in a separate directory too
  cpupower: Fix broken mask values
  cpupower tool: allow to build in a separate directory
  cpupower tool: makefile: simplify the recipe used to generate cpupower.pot target
  cpupower tool: remove use of undefined variables from the clean target of the top makefile
  cpupower: Fix linking with --as-needed
  cpupower: Remove unneeded code and by that fix a memleak
  cpupower: Fix number of idle states
  cpupower: Unify cpupower-frequency-* manpages
  cpupower: Add cpupower-idle-info manpage
  cpupower: AMD fam14h/Ontario monitor can also be used by fam12h cpus
  cpupower: Better interface for accessing AMD pci registers
2012-03-29 16:03:12 -07:00
Linus Torvalds
a6f707b601 Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia
Pull a few PCMCIA updates from Dominik Brodowski.

Fix up trivial conflict (modified code in question had been removed) in
drivers/pcmcia/soc_common.c.

* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia:
  pcmcia at91_cf: fix raw gpio number usage
  ARM: pxa: fix error handling in pxa2xx_drv_pcmcia_probe
  pcmcia: Convert to DEFINE_PCI_DEVICE_TABLE
  pcmcia: convert drivers/pcmcia/* to use module_platform_driver()
  pcmcia: irq: Remove IRQF_DISABLED
2012-03-29 16:00:48 -07:00
Jason Wessel
3751d3e85c x86,kgdb: Fix DEBUG_RODATA limitation using text_poke()
There has long been a limitation using software breakpoints with a
kernel compiled with CONFIG_DEBUG_RODATA going back to 2.6.26. For
this particular patch, it will apply cleanly and has been tested all
the way back to 2.6.36.

The kprobes code uses the text_poke() function which accommodates
writing a breakpoint into a read-only page.  The x86 kgdb code can
solve the problem similarly by overriding the default breakpoint
set/remove routines and using text_poke() directly.

The x86 kgdb code will first attempt to use the traditional
probe_kernel_write(), and next try using a the text_poke() function.
The break point install method is tracked such that the correct break
point removal routine will get called later on.

Cc: x86@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: stable@vger.kernel.org # >= 2.6.36
Inspried-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-29 17:41:25 -05:00
Jason Wessel
98b54aa1a2 kgdb,debug_core: pass the breakpoint struct instead of address and memory
There is extra state information that needs to be exposed in the
kgdb_bpt structure for tracking how a breakpoint was installed.  The
debug_core only uses the the probe_kernel_write() to install
breakpoints, but this is not enough for all the archs.  Some arch such
as x86 need to use text_poke() in order to install a breakpoint into a
read only page.

Passing the kgdb_bpt structure to kgdb_arch_set_breakpoint() and
kgdb_arch_remove_breakpoint() allows other archs to set the type
variable which indicates how the breakpoint was installed.

Cc: stable@vger.kernel.org # >= 2.6.36
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-29 17:41:25 -05:00
Jason Wessel
23bbd8e346 kgdbts: (2 of 2) fix single step awareness to work correctly with SMP
The do_fork and sys_open tests have never worked properly on anything
other than a UP configuration with the kgdb test suite.  This is
because the test suite did not fully implement the behavior of a real
debugger.  A real debugger tracks the state of what thread it asked to
single step and can correctly continue other threads of execution or
conditionally stop while waiting for the original thread single step
request to return.

Below is a simple method to cause a fatal kernel oops with the kgdb
test suite on a 2 processor ARM system:

while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
echo V1I1F100 > /sys/module/kgdbts/parameters/kgdbts

Very soon after starting the test the kernel will start warning with
messages like:

kgdbts: BP mismatch c002487c expected c0024878
------------[ cut here ]------------
WARNING: at drivers/misc/kgdbts.c:317 check_and_rewind_pc+0x9c/0xc4()
[<c01f6520>] (check_and_rewind_pc+0x9c/0xc4)
[<c01f595c>] (validate_simple_test+0x3c/0xc4)
[<c01f60d4>] (run_simple_test+0x1e8/0x274)

The kernel will eventually recovers, but the test suite has completely
failed to test anything useful.

This patch implements behavior similar to a real debugger that does
not rely on hardware single stepping by using only software planted
breakpoints.

In order to mimic a real debugger, the kgdb test suite now tracks the
most recent thread that was continued (cont_thread_id), with the
intent to single step just this thread.  When the response to the
single step request stops in a different thread that hit the original
break point that thread will now get continued, while the debugger
waits for the thread with the single step pending.  Here is a high
level description of the sequence of events.

   cont_instead_of_sstep = 0;

1) set breakpoint at do_fork
2) continue
3)   Save the thread id where we stop to cont_thread_id
4) Remove breakpoint at do_fork
5) Reset the PC if needed depending on kernel exception type
6) soft single step
7)   Check where we stopped
       if current thread != cont_thread_id {
           if (here for more than 2 times for the same thead) {
              ### must be a really busy system, start test again ###
	      goto step 1
           }
           goto step 5
       } else {
           cont_instead_of_sstep = 0;
       }
8) clean up and run test again if needed
9) Clear out any threads that were waiting on a break point at the
   point in time the test is ended with get_cont_catch().  This
   happens sometimes because breakpoints are used in place of single
   stepping and some threads could have been in the debugger exception
   handling queue because breakpoints were hit concurrently on
   different CPUs.  This also means we wait at least one second before
   unplumbing the debugger connection at the very end, so as respond
   to any debug threads waiting to be serviced.

Cc: stable@vger.kernel.org # >= 3.0
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-29 17:41:24 -05:00
Jason Wessel
486c5987a0 kgdbts: (1 of 2) fix single step awareness to work correctly with SMP
The do_fork and sys_open tests have never worked properly on anything
other than a UP configuration with the kgdb test suite.  This is
because the test suite did not fully implement the behavior of a real
debugger.  A real debugger tracks the state of what thread it asked to
single step and can correctly continue other threads of execution or
conditionally stop while waiting for the original thread single step
request to return.

Below is a simple method to cause a fatal kernel oops with the kgdb
test suite on a 4 processor x86 system:

while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
echo V1I1F1000 > /sys/module/kgdbts/parameters/kgdbts

Very soon after starting the test the kernel will oops with a message like:

kgdbts: BP mismatch 3b7da66480 expected ffffffff8106a590
WARNING: at drivers/misc/kgdbts.c:303 check_and_rewind_pc+0xe0/0x100()
Call Trace:
 [<ffffffff812994a0>] check_and_rewind_pc+0xe0/0x100
 [<ffffffff81298945>] validate_simple_test+0x25/0xc0
 [<ffffffff81298f77>] run_simple_test+0x107/0x2c0
 [<ffffffff81298a18>] kgdbts_put_char+0x18/0x20

The warn will turn to a hard kernel crash shortly after that because
the pc will not get properly rewound to the right value after hitting
a breakpoint leading to a hard lockup.

This change is broken up into 2 pieces because archs that have hw
single stepping (2.6.26 and up) need different changes than archs that
do not have hw single stepping (3.0 and up).  This change implements
the correct behavior for an arch that supports hw single stepping.

A minor defect was fixed where sys_open should be do_sys_open
for the sys_open break point test.  This solves the problem of running
a 64 bit with a 32 bit user space.  The sys_open() never gets called
when using the 32 bit file system for the kgdb testsuite because the
32 bit binaries invoke the compat_sys_open() call leading to the test
never completing.

In order to mimic a real debugger, the kgdb test suite now tracks the
most recent thread that was continued (cont_thread_id), with the
intent to single step just this thread.  When the response to the
single step request stops in a different thread that hit the original
break point that thread will now get continued, while the debugger
waits for the thread with the single step pending.  Here is a high
level description of the sequence of events.

   cont_instead_of_sstep = 0;

1) set breakpoint at do_fork
2) continue
3)   Save the thread id where we stop to cont_thread_id
4) Remove breakpoint at do_fork
5) Reset the PC if needed depending on kernel exception type
6) if (cont_instead_of_sstep) { continue } else { single step }
7)   Check where we stopped
       if current thread != cont_thread_id {
           cont_instead_of_sstep = 1;
           goto step 5
       } else {
           cont_instead_of_sstep = 0;
       }
8) clean up and run test again if needed

Cc: stable@vger.kernel.org # >= 2.6.26
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-29 17:41:24 -05:00