Use for_each_leaf_cfs_rq() instead of list_for_each_entry_rcu(), this
achieves that load_balance_fair() only iterates those task_groups that
actually have tasks on busiest, and that we iterate bottom-up, trying to
move light groups before the heavier ones.
No idea if it will actually work out to be beneficial in practice, does
anybody have a cgroup workload that might show a difference one way or
the other?
[ Also move update_h_load to sched_fair.c, loosing #ifdef-ery ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/1310557009.2586.28.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In dequeue_task_fair() we bail on dequeue when we encounter a parenting entity
with additional weight. However, we perform a double shares update on this
entity as we continue the shares update traversal from this point, despite
dequeue_entity() having already updated its queuing cfs_rq.
Avoid this by starting from the parent when we resume.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110707053059.797714697@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While looking at check_preempt_wakeup() I realized that we are
potentially updating the wrong entity in the fair-group scheduling
case. In this case the current task's cfs_rq may not be the same as
the one used for the comparison between the waking task and the
existing task's vruntime.
This potentially results in us using a stale vruntime in the
pre-emption decision, providing a small false preference for the
previous task. The effects of this are bounded since we always
perform a hierarchal update on the tick.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/CAPM31R+2Ke2urUZKao5W92_LupdR4AYEv-EZWiJ3tG=tEes2cw@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Simple test-case,
int main(void)
{
int pid, status;
pid = fork();
if (!pid) {
pause();
assert(0);
return 0x23;
}
assert(ptrace(PTRACE_ATTACH, pid, 0,0) == 0);
assert(wait(&status) == pid);
assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP);
kill(pid, SIGCONT); // <--- also clears STOP_DEQUEUD
assert(ptrace(PTRACE_CONT, pid, 0,0) == 0);
assert(wait(&status) == pid);
assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGCONT);
assert(ptrace(PTRACE_CONT, pid, 0, SIGSTOP) == 0);
assert(wait(&status) == pid);
assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP);
kill(pid, SIGKILL);
return 0;
}
Without the patch it hangs. After the patch SIGSTOP "injected" by the
tracer is not ignored and stops the tracee.
Note also that if this test-case uses, say, SIGWINCH instead of SIGCONT,
everything works without the patch. This can't be right, and this is
confusing.
The problem is that SIGSTOP (or any other sig_kernel_stop() signal) has
no effect without JOBCTL_STOP_DEQUEUED. This means it is simply ignored
after PTRACE_CONT unless JOBCTL_STOP_DEQUEUED was set "by accident", say
it wasn't cleared after initial SIGSTOP sent by PTRACE_ATTACH.
At first glance we could change ptrace_signal() to add STOP_DEQUEUED
after return from ptrace_stop(), but this is not right in case when the
tracer does not change the reported SIGSTOP and SIGCONT comes in between.
This is even more wrong with PT_SEIZED, SIGCONT adds JOBCTL_TRAP_NOTIFY
which will be "lost" during the TRAP_STOP | TRAP_NOTIFY report.
So lets add STOP_DEQUEUED _before_ we report the signal. It has no effect
unless sig_kernel_stop() == T after the tracer resumes us, and in the
latter case the pending STOP_DEQUEUED means no SIGCONT in between, we
should stop.
Note also that if SIGCONT was sent, PT_SEIZED tracee will correctly
report PTRACE_EVENT_STOP/SIGTRAP and thus the tracer can notice the fact
SIGSTOP was cancelled.
Also, move the current->ptrace check from ptrace_signal() to its caller,
get_signal_to_deliver(), this looks more natural.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
At http://www.mail-archive.com/linux-mmc@vger.kernel.org/msg08371.html
(thread: "mmc: sdio: reset card during power_restore") we found and
fixed a bug where mmc's runtime power management functions were not being
called. We have now also made improvements to the SDIO powerup routine
which could possibly mask this kind of issue in future.
Add debug messages to the runtime PM hooks so that it is easy to verify
if and when runtime PM is happening.
Signed-off-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
In the case where a driver returns -ENOSYS from its suspend handler
to indicate that the device should be powered down over suspend, the
remove routine of the driver was not being called, leading to lots of
confusion during resume.
The problem is that runtime PM is disabled during this process,
and when we reach mmc_sdio_remove, calling the runtime PM functions here
(validly) return errors, and this was causing us to skip the remove
function.
Fix this by ignoring the error value of pm_runtime_get_sync(), which
can return valid errors. This also matches the behaviour of
pci_device_remove().
Signed-off-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Fix clock rate setting in the mxs-mmc driver. Previously, if div2 was 0
then the value for TIMING_CLOCK_RATE would have been 255 instead of 0.
The limits for div1 (TIMING_CLOCK_DIVIDE) and div2 (TIMING_CLOCK_RATE+1)
were also not correctly defined.
Can easily be reproduced on mx23evk: default clock for high speed sdio
cards is 50 MHz. With a SSP_CLK of 28.8 MHz default), this resulted in
an actual clock rate of about 56 kHz. Tested on mx23evk.
Signed-off-by: Koen Beel <koen.beel@barco.com>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
Currently the tmio-mmc driver contains a recursive runtime PM method
invocation, which leads to a deadlock on a mutex. Avoid it by taking
care not to request DMA too early.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
A recent commit "mmc: tmio: Share register access functions" has swapped
arguments of a macro and broken DMA with TMIO MMC. This patch fixes the
arguments back.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch uses runtime PM to allow the system to power down the MMC
controller, when the MMC closk is switched off.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch uses runtime PM to allow the system to power down the MMC
controller, when the MMC closk is switched off.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
Calling mmc_request_done() under a spinlock with interrupts disabled
leads to a recursive spin-lock on request retry path and to
scheduling in atomic context. This patch fixes both these problems
by moving mmc_request_done() to the scheduler workqueue.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
Ricoh 1180:e823 does not recognize certain types of SD/MMC cards,
as reported at http://launchpad.net/bugs/773524. Lowering the SD
base clock frequency from 200Mhz to 50Mhz fixes this issue. This
solution was suggest by Koji Matsumuro, Ricoh Company, Ltd.
This change has no negative performance effect on standard SD
cards, though it's quite possible that there will be one on
UHS-1 cards.
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Tested-by: Daniel Manrique <daniel.manrique@canonical.com>
Cc: Koji Matsumuro <matsumur@nts.ricoh.co.jp>
Cc: <stable@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
In the case of an I/O error, the DMA will have been cleaned up in
the MMC interrupt and the request structure pointer will be null.
In that case, it is essential to check if the DMA is over before
dereferencing host->mrq->data.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
There are a few places with the same functionality. This patch creates
two functions omap_hsmmc_set_bus_width() and omap_hsmmc_set_bus_mode()
to do the job.
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
There are two pieces of code which are similar, but not the same.
Each of them contains a bug.
The SYSCTL register should be read before writing to it in
omap_hsmmc_context_restore() to retain the state of the reserved bits.
Before setting the clock divisor and DTO bits the value from the SYSCTL
register should be masked properly. We were lucky to have no problems
with DTO bits. So, make sure we have clear DTO bits properly in
omap_hsmmc_set_ios().
Additionally get rid of msleep(1). The actual time is rarely higher
than 30us on OMAP 3630.
The resulting pieces of code are refactored into the
omap_hsmmc_set_clock() function.
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
There is similar code in two functions which enable the clock. Refactor
this code to omap_hsmmc_start_clock(). Re-use omap_hsmmc_stop_clock() in
omap_hsmmc_context_restore() as well.
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
There are two places where the same calculations are done.
Let's split them into a separate function.
In addition, simplify by using the DIV_ROUND_UP kernel macro.
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
CERR and BADA were in the wrong place and there are only
32 not 35.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Reviewed-by: Venkatraman S <svenkatr@ti.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
We already check for ongoing async transfers when handling discard
requests, but not in mmc_blk_issue_flush(). This patch fixes that
omission.
Tested with an SDHCI controller and eMMC4.41.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Per Forlin <per.forlin@linaro.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Documentation about the background and the design of mmc non-blocking.
Host driver guidelines to minimize request preparation overhead.
Signed-off-by: Per Forlin <per.forlin@linaro.org>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Chris Ball <cjb@laptop.org>
The following symbols are not referenced outside this file so
there's no need for it to be in the global name space.
pcmidi_sustained_note_release
init_sustain_timers
stop_sustain_timers
pcmidi_handle_report
pcmidi_setup_extra_keys
pcmidi_snd_initialise
pcmidi_snd_terminate
Make them static.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch fixes non-working indep-HP control on VT1708* codecs.
The problems are that via_independent_hp_put() wasn't fixed to follow
the recent change of three HP paths, and hp_indep_path didn't contain
the amp nids of mixer elements.
Together with the fixes, a few code clean-ups are done.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Builds for 32-bit perf binaries on a 64-bit host currently fail
with this error:
[...]
bench/../../../arch/x86/lib/memcpy_64.S: Assembler messages:
bench/../../../arch/x86/lib/memcpy_64.S:29: Error: bad register name `%rdi'
bench/../../../arch/x86/lib/memcpy_64.S:34: Error: invalid instruction suffix for `movs'
bench/../../../arch/x86/lib/memcpy_64.S:50: Error: bad register name `%rdi'
bench/../../../arch/x86/lib/memcpy_64.S:61: Error: bad register name `%rdi'
...
The problem is the detection of the host arch without considering passed in
flags. This change fixes 32-bit builds via:
make EXTRA_CFLAGS=-m32
and 64-bit builds still reference the memcpy_64.S.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310420304-21452-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some gcc versions warn about prototypes without "inline" when the declaration
includes the "inline" keyword. The fix generates a false error message
"marked inline, but without a definition" with sparse below 0.4.2.
Signed-off-by: Chris Friesen <chris.friesen@genband.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
If overlapping networks with different interfaces was added to
the set, the type did not handle it properly. Example
ipset create test hash:net,iface
ipset add test 192.168.0.0/16,eth0
ipset add test 192.168.0.0/24,eth1
Now, if a packet was sent from 192.168.0.0/24,eth0, the type returned
a match.
In the patch the algorithm is fixed in order to correctly handle
overlapping networks.
Limitation: the same network cannot be stored with more than 64 different
interfaces in a single set.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
In kexec jump support, jump back address passed to the kexeced
kernel via function calling ABI, that is, the function call
return address is the jump back entry.
Furthermore, jump back entry == 0 should be used to signal that
the jump back or preserve context is not enabled in the original
kernel.
But in the current implementation the stack position used for
function call return address is not cleared context
preservation is disabled. The patch fixes this bug.
Reported-and-tested-by: Yin Kangkai <kangkai.yin@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310607277-25029-1-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We need to carve up the configuration between:
- MID general
- Moorestown specific
- Medfield specific
- Future devices
As a base point create an INTEL_MID configuration property. We
make the existing MRST configuration a sub-option. This means
that the rest of the kernel config can still use X86_MRST checks
without anything going backwards.
After this is merged future patches will tidy up which devices
are MID and which are X86_MRST, as well as add options for
Medfield.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Link: http://lkml.kernel.org/r/20110712164859.7642.84136.stgit@bob.linux.org.uk
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The perf_event_attr struct has two __u32's at the top and
they need to be swapped individually.
With this change I was able to analyze a perf.data collected in a
32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit
binaries for the Intel analysis side; both read the PPC perf.data
file correctly.
-v2:
- changed the existing perf_event__attr_swap() to swap only elements
of perf_event_attr and exported it for use in swapping the
attributes in the file header
- updated swap_ops used for processing events
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310754849-12474-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add "node" as a simple alias for NODE cache events.
The addition of NODE cache events broke the parse_alias
function, so any mismatched event caused the segfault, like:
# ./perf stat -e krava ls
The hw_cache/hw_cache_op/hw_cache_result arrays needs to follow
PERF_COUNT_HW_CACHE_*MAX enums. Adding those MAXs to be size
of those arrays, so possible ommision in future wil not lead to
segfault.
Adding read/write/prefetch as allowed operations for node cache
event.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/20110713205818.GB7827@jolsa.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The meth for calculating the # of outstanding buffers gives
incorrect results when vq->upend_idx wraps around zero.
Fix that.
Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The non-debug variant of mutex_destroy is a no-op, currently
implemented as a macro which does nothing. This approach fails
to check the type of the parameter, so an error would only show
when debugging gets enabled. Using an inline function instead,
offers type checking for earlier bug catching.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110716174200.41002352@endymion.delvare
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On backend change, we flushed out outstanding skbs
but forgot to update the used ring, so that
done entries were left in the ubuf_info ring.
As a result we lose heads or complete incorrect ones,
crashing the guest or leaking memory.
Fix by updating the used ring.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The EFI specification requires that callers of the time related
runtime functions serialize with other CMOS accesses in the
kernel, as the EFI time functions may choose to also use the
legacy CMOS RTC.
Besides fixing a latent bug, this is a prerequisite to safely
enable the rtc-efi driver for x86, which ought to be preferred
over rtc-cmos on all EFI platforms.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: <mjg@redhat.com>
Link: http://lkml.kernel.org/r/4E257E33020000780004E319@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Matthew Garrett <mjg@redhat.com>