This reverts commit cded932b75.
Arjan bisected down a boot-time hang to this, saying:
".. it prevents the kernel to finish booting on my (Penryn based)
laptop. The boot stops right after freeing the init memory."
and while it's not clear exactly what triggers it, at this stage we're
better off just reverting it while Ingo tries to figure out what went
wrong.
Requested-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Nish Aravamudan <nish.aravamudan@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
revert commit cded932b75,
"x86: fix pmd_bad and pud_bad to support huge pages", it causes
a bootup hang, as reported and bisected by Arjan van de Ven.
Bisected-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Revert:
commit 8be8f54bae
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Sat Feb 23 20:43:21 2008 +0100
x86: CPA: avoid split of alias mappings
because it clearly mishandles the case when __change_page_attr(), called
from __change_page_attr_set_clr(), changes cpa->processed to 1 and
cpa_process_alias(cpa) is executed right after that.
This crashes my x86-64 test box early in the boot process
(ref. http://bugzilla.kernel.org/show_bug.cgi?id=10140#c4).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Injecting an GP when accessing this MSR lets Windows crash when running some
stress test tools in KVM. So this patch emulates access to this MSR.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
One of the use cases for the supported cpuid list is to create a "greatest
common denominator" of cpu capabilities in a server farm. As such, it is
useful to be able to get the list without creating a virtual machine first.
Since the code does not depend on the vm in any way, all that is needed is
to move it to the device ioctl handler. The capability identifier is also
changed so that binaries made against -rc1 will fail gracefully.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Whilst working on getting a VM to initialize in to IA32e mode I found
this issue. set_cr0 relies on comparing the old cr0 to the new one to
work correctly. Move the assignment below so the compare can work.
Signed-off-by: Paul Knowles <paul@transitive.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Explicitly enable the NM intercept in svm_set_cr0 if we enable TS in the guest
copy of CR0 for lazy FPU switching. This fixes guest SMP with Linux under SVM.
Without that patch Linux deadlocks or panics right after trying to boot the
other CPUs.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
If the guest writes to cr0 and leaves the TS flag at 0 while vcpu->fpu_active
is also 0, the TS flag in the guest's cr0 gets lost. This leads to corrupt FPU
state an causes Windows Vista 64bit to crash very soon after boot. This patch
fixes this bug.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
The only tricky part is we need to adjust the PTE insertion loop to
cater for holes in the page table. The PTEs for each segment start on
a 4K boundary, so with 16M pages we have 16 PTEs per segment and then
a gap to the next 4K page boundary.
It might be possible to allocate the PTEs for each segment separately,
saving the memory currently filling the gaps. However we'd need to
check that's OK with the hardware, and that it actually saves memory.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Make some preliminary changes to cell_iommu_alloc_ptab() to allow it to
take the page size as a parameter rather than assuming IOMMU_PAGE_SIZE.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
We use n_pte_pages to calculate the stride through the page tables, but
we also use it to set the NPPT value in the segment table entry. That is
defined as the number of 4K pages per segment, so we should calculate
it as such regardless of the IOMMU page size.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Currently the cell IOMMU code allocates the entire IOMMU page table in a
contiguous chunk. This is nice and tidy, but for machines with larger
amounts of RAM the page table allocation can fail due to it simply being
too large.
So split the segment table and page table setup routine, and arrange to
have the dynamic and fixed page tables allocated separately.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
There's no need to allocate the pad page unless we're going to actually
use it - so move the allocation to where we know we're going to use it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The cell IOMMU code no longer needs to save the pte_offset variable
separately, it is incorporated into tbl->it_offset.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The cell IOMMU tce build and free routines use pte_offset to convert
the index passed from the generic IOMMU code into a page table offset.
This takes into account the SPIDER_DMA_OFFSET which sets the top bit
of every DMA address.
However it doesn't cater for the IOMMU window starting at a non-zero
address, as the base of the window is not incorporated into pte_offset
at all.
As it turns out tbl->it_offset already contains the value we need, it
takes into account the base of the window and also pte_offset. So use
it instead!
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
It's called the fixed mapping, not the static mapping.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This moves the private DABRX definitions for celleb from beat.h to
reg.h to make them usable for all.
Signed-off-by: Jens Osterkamp <jens@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This patch enables OProfile callgraph support for the Cell processor. The
original code was just calling a function to add the PC value, now it will
call a function that first checks the callgraph depth. Callgraph is already
enabled on the other Power platforms.
Signed-off-by: Bob Nelson <rrnelson@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: fix crash in automatic module unloading
firewire: potentially invalid pointers used in fw_card_bm_work
firewire: fw-sbp2: better fix for NULL pointer dereference in scsi_remove_device
The bus management workqueue job was in danger to dereference NULL
pointers. Also, after having temporarily lifted card->lock, a few node
pointers and a device pointer may have become invalid.
Add NULL pointer checks and get the necessary references. Also, move
card->local_node out of fw_card_bm_work's sight during shutdown of the
card.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Patch "firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device"
had the unintended effect that firewire-sbp2 could not be unloaded
anymore until all SBP-2 devices were unplugged.
We now fix the NULL pointer bug by reacquiring a reference to the sdev
instead of holding a reference to the sdev (and to the module) all the
time.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Jarod Wilson <jwilson@redhat.com>
Hi,
While we are looking at the printk issue, I see that its printk'ing the EOE
(end of event) records which is really not something that we need in syslog.
Its really intended for the realtime audit event stream handled by the audit
daemon. So, lets avoid printk'ing that record type.
Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
On the latest kernels if one was to load about 15 rules, set the failure
state to panic, and then run service auditd stop the kernel will panic.
This is because auditd stops, then the script deletes all of the rules.
These deletions are sent as audit messages out of the printk kernel
interface which is already known to be lossy. These will overun the
default kernel rate limiting (10 really fast messages) and will call
audit_panic(). The same effect can happen if a slew of avc's come
through while auditd is stopped.
This can be fixed a number of ways but this patch fixes the problem by
just not panicing if auditd is not running. We know printk is lossy and
if the user chooses to set the failure mode to panic and tries to use
printk we can't make any promises no matter how hard we try, so why try?
At least in this way we continue to get lost message accounting and will
eventually know that things went bad.
The other change is to add a new call to audit_log_lost() if auditd
disappears. We already pulled the skb off the queue and couldn't send
it so that message is lost. At least this way we will account for the
last message and panic if the machine is configured to panic. This code
path should only be run if auditd dies for unforeseen reasons. If
auditd closes correctly audit_pid will get set to 0 and we won't walk
this code path.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Fix the following compiler warning by using "%zu" as defined in C99.
CC kernel/auditsc.o
kernel/auditsc.c: In function 'audit_log_single_execve_arg':
kernel/auditsc.c:1074: warning: format '%ld' expects type 'long int', but
argument 4 has type 'size_t'
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] wrap kmap_atomic(KM_IRQ0) with local_irq_save/restore()
sata_svw: Add support for HT1100 SATA controller
Interrupts must be disabled if using kmap_atomic(KM_IRQ0), but that was
not the case in a few code paths coming directly from ATA driver
interrupt handlers (which use spin_lock rather than spin_lock_irqsave).
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This is unnecessary since it is already protected by
spin_lock_irq{save, restore} in clock.c.
Signed-off-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This typo causes the incorrect calculation of the IRQ numbers
in the ICIP2 registers.
Signed-off-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
"cat /dev/mem" may cause kernel Oops for boards with PHYS_OFFSET != 0
because character device is mapped to addresses starting from zero
and there is no protection against such situation.
Patch just add this.
Signed-off-by: Alexandre Rusev <arusev@ru.mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Convert debug-only (and removed) MODULE_PARM() to module_param().
Compiles cleanly (with DEBUG=1).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch sets KEXEC_CONTROL_MEMORY_LIMIT to (-1)UL. As the value is
compared with physical addresses TASK_SIZE makes no sense. Machines
where the RAM addresses start above TASK_SIZE kexecs eats all memory
and crashes the kernel without this patch.
Signed-off-by: Thomas Kunze <thommycheck@gmx.de>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Eric Sandeen tracked an XFS on ARM corruption bug down to a function
under fs/xfs/ involving some get_unaligned() calls on u64 pointers.
As it turns out, calling ARM's get_unaligned() on a u64 pointer
pointing to the following byte sequence:
80 81 82 83 84 85 86 87
would return ffffffff83828180 (LE mode.) This turns out to be
because of implicit u8 -> int promotion in ARM's implementation of
various helpers for get_unaligned(), causing them to accidentally
return signed instead of unsigned values, which in turn caused the
subsequent casts to unsigned long long in __get_unaligned_8_[bl]e()
to sign-extend the lower words.
Fix by casting the return values of __get_unaligned_[24]_[bl]e()
to unsigned int.
Cc: Eric Sandeen <sandeen@sandeen.net>
Cc: Rabeeh Khoury <rabeeh@marvell.com>
Cc: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On Wed, Feb 20, 2008 at 11:50:33AM +0100, Guennadi Liakhovetski wrote:
> arch/arm/kernel/atags.c uses for some reason the
> KEXEC_BOOT_PARAMS_SIZE macro, which is only defined if CONFIG_KEXEC
> is set. So, either this macro should be defined always, or another
> macro should be used, or ATAGS_PROC should depend on KEXEC.
As the procfs export of ATAGS is not meant as a stable, general purpose
ABI it shouldn't be an independent, general configuration option.
This patch make ATAGS_PROC depend on KEXEC
Signed-off-by: Uli Luckas <u.luckas@road.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Set cap.max_inline_data to the actual max inline data that the adapter
support, so that userspace apps see the right value returned.
Signed-off-by: Jon Mason <jon@opengridcomputing.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit a3cd7d90 ("IB/fmr_pool: ib_fmr_pool_flush() should flush all
dirty FMRs") caused a regression for iSER and was reverted in
e5507736.
This change attempts to redo the original patch so that all used FMR
entries are flushed when ib_flush_fmr_pool() is called without
affecting the normal FMR pool cleaning thread. Simply move used
entries from the clean list onto the dirty list in ib_flush_fmr_pool()
before letting the cleanup thread do its job.
Signed-off-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This reverts commit a3cd7d9070.
The original commit breaks iSER reliably, making it complain:
iser: iser_reg_page_vec:ib_fmr_pool_map_phys failed: -11
The FMR cleanup thread runs ib_fmr_batch_release() as dirty entries
build up. This commit causes clean but used FMR entries also to be
purged. During that process, another thread can see that there are no
free FMRs and fail, even though there should always have been enough
available.
Signed-off-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When a CM MAD is received, it is queued to a CM workqueue for
processing. The queued work item references the port and device on
which the MAD was received. If that device is removed from the system
before the work item can execute, the work item will reference freed
memory.
To fix this, flush the workqueue after unregistering to receive MAD,
and before the device is be freed.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
I'll be leaving QLogic soon for another job and Ralph has graciously
offered to take over the IPath driver maintainership.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Current tun/tap driver sets also net device's hw address when asked to
change character device's hw address. This is a good idea, but it
misses RTLN-locking, resulting following error message in 2.6.25-rc3's
inetdev_event() function:
RTNL: assertion failed at net/ipv4/devinet.c (1050)
Attached patch fixes this problem.
Signed-off-by: Kim B. Heino <Kim.Heino@bluegiga.com>
Signed-off-by: David S. Miller <davem@davemloft.net>