9383 Commits

Author SHA1 Message Date
Theodore Ts'o
577c4eb09d [PATCH] inode-diet: Move i_cdev into a union
Move the i_cdev pointer in struct inode into a union.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Theodore Ts'o
eaf796e7ef [PATCH] inode-diet: Move i_bdev into a union
Move the i_bdev pointer in struct inode into a union.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Theodore Ts'o
4c1541680f [PATCH] inode-diet: Move i_pipe into a union
Move the i_pipe pointer into a union that will be shared with i_bdev and
i_cdev.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Theodore Ts'o
8e18e2941c [PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_private
The following patches reduce the size of the VFS inode structure by 28 bytes
on a UP x86.  (It would be more on an x86_64 system).  This is a 10% reduction
in the inode size on a UP kernel that is configured in a production mode
(i.e., with no spinlock or other debugging functions enabled; if you want to
save memory taken up by in-core inodes, the first thing you should do is
disable the debugging options; they are responsible for a huge amount of bloat
in the VFS inode structure).

This patch:

The filesystem or device-specific pointer in the inode is inside a union,
which is pretty pointless given that all 30+ users of this field have been
using the void pointer.  Get rid of the union and rename it to i_private, with
a comment to explain who is allowed to use the void pointer.  This is just a
cleanup, but it allows us to reuse the union 'u' for something where
the union will actually be used.
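
As a rough illustration of why sharing one slot between mutually exclusive pointers saves space, the sketch below compares two heavily simplified layouts. The struct names inode_before and inode_after and the helper bytes_saved() are made up for this example and are not the kernel's definitions; only the member names i_pipe, i_bdev, i_cdev and i_private come from the patches in this series.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-ins; only pointers to these types are stored. */
struct pipe_inode_info;
struct block_device;
struct cdev;

/* Hypothetical "before" layout: one pointer slot per object type. */
struct inode_before {
    unsigned long i_ino;
    struct pipe_inode_info *i_pipe;
    struct block_device *i_bdev;
    struct cdev *i_cdev;
    void *i_private;
};

/* Hypothetical "after" layout: the three pointers share one slot,
 * on the assumption that an inode uses at most one of them. */
struct inode_after {
    unsigned long i_ino;
    union {
        struct pipe_inode_info *i_pipe;
        struct block_device *i_bdev;
        struct cdev *i_cdev;
    };
    void *i_private;
};

/* Bytes saved by the union in this toy layout. */
static size_t bytes_saved(void)
{
    /* Sanity check: the union members alias the same storage. */
    assert(offsetof(struct inode_after, i_pipe) ==
           offsetof(struct inode_after, i_cdev));
    return sizeof(struct inode_before) - sizeof(struct inode_after);
}
```

In this toy layout the union saves two pointers' worth of space. The 28-byte figure quoted above is for the whole series on a UP x86 build, not for this step alone.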

[judith@osdl.org: powerpc build fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Judith Lebzelter <judith@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Vivek Goyal
7e96287ddc [PATCH] kdump: introduce "reset_devices" command line option
Resetting the devices during driver initialization can be a costly
operation in terms of time (especially for scsi devices).  This option can
be used by drivers to know that the user explicitly wants the devices to be
reset during initialization.

This option can be useful while the kernel is booting in an unreliable
environment, for example during a kdump boot, where devices are in an
unknown random state and BIOS execution has been skipped.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Jeff Dike
3c91735099 [PATCH] uml: thread creation tidying
fork on UML has always been somewhat subtle.  The underlying cause has been the
need to initialize a stack for the new process.  The only portable way to
initialize a new stack is to set it as the alternate signal stack and take a
signal.  The signal handler does whatever initialization is needed and jumps
back to the original stack, where the fork processing is finished.  The basic
context switching mechanism is a jmp_buf for each process.  You switch to a
new process by longjmping to its jmp_buf.
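
The jmp_buf-based switch described above can be sketched with the standard C setjmp()/longjmp() pair. This is only an illustration of the idea, not UML's code: the names task_context, context_saved, switch_to_saved_task() and demo_switch() are hypothetical, and a real context switch also needs a separate stack per task, which this sketch does not set up.

```c
#include <assert.h>
#include <setjmp.h>

/* Saved context of the (hypothetical) task we want to switch to. */
static jmp_buf task_context;
static int context_saved;

/* "Switch" to the saved task: control resumes inside setjmp(). */
static void switch_to_saved_task(void)
{
    /* Jumping to a context that was never saved is undefined. */
    assert(context_saved);
    longjmp(task_context, 1);
}

/* Returns 1 if execution came back through longjmp(), 0 otherwise. */
static int demo_switch(void)
{
    if (setjmp(task_context) == 0) {
        /* First pass: the context is saved, now jump back into it. */
        context_saved = 1;
        switch_to_saved_task();
        return 0; /* not reached */
    }
    /* Second pass: we arrived here via longjmp(). */
    return 1;
}
```

Calling demo_switch() saves a context and immediately resumes it, so the function returns through the second branch.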

Now that UML has its own implementation of setjmp and longjmp, and I can poke
around inside a jmp_buf without fear that libc will change the structure, a
much simpler mechanism is possible.  The jmpbuf can simply be initialized by
hand.

This eliminates -
	the need to set up and remove the alternate signal stack
	sending and handling a signal
	the signal blocking needed around the stack switching, since
		there is no stack switching
	setting up the jmp_buf needed to jump back to the original
		stack after the new one is set up

In addition, since jmp_buf is now defined by UML, and not by libc, it can be
embedded in the thread struct.  This makes it unnecessary to have it exist on
the stack, where it used to be.  It also simplifies interfaces, since the
switch jmp_buf used to be a void * inside the thread struct, and functions
which took it as an argument needed to define a jmp_buf variable and assign it
from the void *.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:16 -07:00
Jeff Dike
a8b4fc4d7c [PATCH] uml: fix missing x86_64 register definitions
The UML/x86_64 headers were missing ptrace support for some segment registers.
The underlying problem was that the x86_64 kernel uses user_regs_struct
rather than the ptrace register definitions in ptrace.  This patch switches
UML/x86_64 to using user_regs_struct for its definitions of the host's
registers.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:16 -07:00
Hirokazu Takata
85f651794c [PATCH] m32r: revise __raw_read_trylock()
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:15 -07:00
Hirokazu Takata
a27f311332 [PATCH] m32r: Fix "value computed not used" warnings
Fix to remove annoying gcc-4.1 warnings "value computed not used" for m32r;
Modify set_mb to cast to void for SMP.

Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:15 -07:00
David Howells
f269fdd182 [PATCH] NOMMU: move the fallback arch_vma_name() to a sensible place
Move the fallback arch_vma_name() to a sensible place (kernel/signal.c).

Currently it's in fs/proc/task_mmu.c, a file that is dependent on both
CONFIG_PROC_FS and CONFIG_MMU being enabled, but it is called
unconditionally from kernel/signal.c.

[akpm@osdl.org: build fix]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:15 -07:00
David Howells
dbf8685c8e [PATCH] NOMMU: Implement /proc/pid/maps for NOMMU
Implement /proc/pid/maps for NOMMU by reading the vm_area_list attached to
current->mm->context.vmlist.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:14 -07:00
David Howells
5da6185bca [PATCH] NOMMU: Set BDI capabilities for /dev/mem and /dev/kmem
Set the backing device info capabilities for /dev/mem and /dev/kmem to
permit direct sharing under no-MMU conditions and full mapping capabilities
under MMU conditions.  Make the BDI used by these available to all directly
mappable character devices.

Also comment the capabilities for /dev/zero.

[akpm@osdl.org: ifdef reductions]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:14 -07:00
Rolf Eike Beer
d24afc57d5 [PATCH] Mark __remove_vm_area() static
The function is exported but not used from anywhere else.  It's also marked as
"not for driver use" so no one out there should really care.

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:13 -07:00
Jes Sorensen
f4b81804a2 [PATCH] do_no_pfn()
Implement do_no_pfn() for handling mapping of memory without a struct page
backing it.  This avoids creating fake page table entries for regions which
are not backed by real memory.

This feature is used by the MSPEC driver and other users, where it is
highly undesirable to have a struct page sitting behind the page (for
instance if the page is accessed in cached mode via the struct page in
parallel to the driver accessing it uncached, which can result in data
corruption on some architectures, such as ia64).

This version uses specific NOPFN_{SIGBUS,OOM} return values, rather than
expecting all negative pfn values to be errors.  It also bugs on cow
mappings as this would not work with the VM.
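
The difference between dedicated sentinel values and a blanket "negative means error" test can be sketched as follows. The numeric values given to NOPFN_SIGBUS and NOPFN_OOM here are assumptions picked for the example, and the enum fault_result and the function handle_nopfn() are hypothetical names, not the interface added by this patch.

```c
#include <assert.h>

/* Sentinel values: assumed for this sketch, not taken from the patch. */
#define NOPFN_SIGBUS ((unsigned long)-1)
#define NOPFN_OOM    ((unsigned long)-2)

enum fault_result { FAULT_MAPPED, FAULT_SIGBUS, FAULT_OOM };

/* Classify the pfn handed back by a (hypothetical) nopfn handler. */
static enum fault_result handle_nopfn(unsigned long pfn, int is_cow_mapping)
{
    /* The patch bugs on COW mappings; assert() stands in for that. */
    assert(!is_cow_mapping);

    if (pfn == NOPFN_SIGBUS)
        return FAULT_SIGBUS;
    if (pfn == NOPFN_OOM)
        return FAULT_OOM;
    /* Every other value is treated as a real page frame number. */
    return FAULT_MAPPED;
}
```

Only the two named values are treated as errors, so a large but valid pfn is not mistaken for a failure.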

[akpm@osdl.org: micro-optimise]
Signed-off-by: Jes Sorensen <jes@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:13 -07:00
Christoph Lameter
d5f541ed6e [PATCH] Add node to zone for the NUMA case
Add the node in order to optimize zone_to_nid.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:13 -07:00
Christoph Lameter
77f700dab4 [PATCH] Disable GFP_THISNODE in the non-NUMA case
GFP_THISNODE must be set to 0 in the non-NUMA case, otherwise we disable retry
and warnings for failing allocations in the SMP and UP cases.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:12 -07:00
Christoph Lameter
08e0f6a970 [PATCH] Add NUMA_BUILD definition in kernel.h to avoid #ifdef CONFIG_NUMA
The NUMA_BUILD constant is always available and will be set to 1 on
NUMA builds.  That way checks valid only under CONFIG_NUMA can easily be done
without #ifdef CONFIG_NUMA.

For example:

if (NUMA_BUILD && <numa_condition>) {
...
}
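
A self-contained sketch of this pattern, assuming the constant is derived from CONFIG_NUMA with a single #ifdef in one header. The helper choose_node() and its parameters are hypothetical and exist only to give the condition something to guard.

```c
#include <assert.h>

/* One #ifdef in one place; every other user tests the constant. */
#ifdef CONFIG_NUMA
#define NUMA_BUILD 1
#else
#define NUMA_BUILD 0
#endif

/* Hypothetical helper: pick the node to allocate from. */
static int choose_node(int preferred_node, int nr_nodes)
{
    assert(nr_nodes >= 1);

    /* On non-NUMA builds this condition is a compile-time false, so the
     * compiler can drop the branch, yet the code is still type-checked. */
    if (NUMA_BUILD && preferred_node >= 0 && preferred_node < nr_nodes)
        return preferred_node;

    return 0; /* single-node fallback */
}
```

Unlike code hidden behind #ifdef, the guarded branch is always compiled, so it cannot silently rot on configurations that never enable CONFIG_NUMA.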

[akpm: not a thing we'd normally do, but CONFIG_NUMA is special: it is
 causing ifdef explosion in core kernel, so let's see if this is a comfortable
 way in which to control that]

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:12 -07:00
Heiko Carstens
5b99cd0eff [PATCH] own header file for struct page
This moves the definition of struct page from mm.h to its own header file
page-struct.h.  This is a prereq to fix SetPageUptodate which is broken on
s390:

#define SetPageUptodate(_page)                                       \
       do {                                                          \
               struct page *__page = (_page);                        \
               if (!test_and_set_bit(PG_uptodate, &__page->flags))   \
                       page_test_and_clear_dirty(_page);             \
       } while (0)

_page gets used twice in this macro which can cause subtle bugs.  Using
__page for the page_test_and_clear_dirty call doesn't work since it causes
yet another problem with the page_test_and_clear_dirty macro as well.

In order to avoid all these problems caused by macros it seems to be a good
idea to get rid of them and convert them to static inline functions.
Because of header file include order it's necessary to have a separate
header file for the struct page definition.
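
The double-evaluation hazard that motivates the conversion can be shown in isolation. The sketch below is generic and unrelated to the s390 code: DOUBLE_MACRO, double_inline(), next_value() and the two counting helpers are hypothetical names for this example.

```c
/* Counts how many times the argument expression was evaluated. */
static int evaluations;

static int next_value(void)
{
    evaluations++;
    return 21;
}

/* Macro version: the argument text is pasted in twice. */
#define DOUBLE_MACRO(x) ((x) + (x))

/* Inline-function version: the argument is evaluated once, then copied. */
static inline int double_inline(int x)
{
    return x + x;
}

/* How often does the argument run when passed to the macro? */
static int evaluations_with_macro(void)
{
    evaluations = 0;
    (void)DOUBLE_MACRO(next_value());
    return evaluations;
}

/* How often does the argument run when passed to the function? */
static int evaluations_with_inline(void)
{
    evaluations = 0;
    (void)double_inline(next_value());
    return evaluations;
}
```

The macro runs the side effect twice and the inline function runs it once, which is the kind of subtle bug the text above refers to.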

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:12 -07:00
Andrew Morton
e129b5c23c [PATCH] vm: add per-zone writeout counter
The VM is supposed to minimise the number of pages which get written off the
LRU (for IO scheduling efficiency, and for high reclaim-success rates).  But
we don't actually have a clear way of showing how true this is.

So add `nr_vmscan_write' to /proc/vmstat and /proc/zoneinfo - the number of
pages which have been written by the vm scanner in this zone and globally.

Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:12 -07:00
Mel Gorman
fb01439c5b [PATCH] Allow an arch to expand node boundaries
Arch-independent zone-sizing determines the size of a node
(pgdat->node_spanned_pages) based on the physical memory that was
registered by the architecture.  However, when
CONFIG_MEMORY_HOTPLUG_RESERVE is set, the architecture expects that the
spanned_pages will be much larger and that a mem_map will be allocated
for later use on memory hot-add.

This patch allows an architecture that sets CONFIG_MEMORY_HOTPLUG_RESERVE
to call push_node_boundaries() which will set the node beginning and end to
at *least* the requested boundary.
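
One way to read "at least the requested boundary" is that a node's span may only grow, never shrink. The sketch below illustrates that reading. It is an assumption about the semantics rather than the patch's implementation, and struct node_span and grow_node_span() are hypothetical names.

```c
#include <assert.h>

/* Hypothetical record of a node's page frame span: [start_pfn, end_pfn). */
struct node_span {
    unsigned long start_pfn;
    unsigned long end_pfn;
};

/* Widen the span so it covers at least [start_pfn, end_pfn). */
static void grow_node_span(struct node_span *span,
                           unsigned long start_pfn, unsigned long end_pfn)
{
    assert(span != 0);
    assert(start_pfn <= end_pfn);

    if (start_pfn < span->start_pfn)
        span->start_pfn = start_pfn; /* push the beginning down */
    if (end_pfn > span->end_pfn)
        span->end_pfn = end_pfn;     /* push the end up */
}
```

A request that lies inside the current span leaves it unchanged; a request that extends past either edge moves only that edge.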

Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Keith Mannthey" <kmannth@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:12 -07:00
Mel Gorman
0e0b864e06 [PATCH] Account for memmap and optionally the kernel image as holes
The x86_64 code accounted for memmap and some portions of the DMA zone as
holes.  This was because those areas would never be reclaimed and accounting
for them as memory affects min watermarks.  This patch will account for the
memmap as a memory hole.  Architectures may optionally use set_dma_reserve()
if they wish to account for a portion of memory in ZONE_DMA as a hole.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Keith Mannthey" <kmannth@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:11 -07:00
Mel Gorman
05e0caad3b [PATCH] Have ia64 use add_active_range() and free_area_init_nodes
Size zones and holes in an architecture independent manner for ia64.

[bob.picco@hp.com: fix ia64 FLATMEM+VIRTUAL_MEM_MAP]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Keith Mannthey" <kmannth@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:11 -07:00
Mel Gorman
5cb248abf5 [PATCH] Have x86_64 use add_active_range() and free_area_init_nodes
Size zones and holes in an architecture independent manner for x86_64.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Keith Mannthey" <kmannth@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:11 -07:00
Mel Gorman
c713216dee [PATCH] Introduce mechanism for registering active regions of memory
At a basic level, architectures define structures to record where active
ranges of page frames are located.  Once located, the code to calculate zone
sizes and holes in each architecture is very similar.  Some of this zone and
hole sizing code is difficult to read for no good reason.  This set of patches
eliminates the similar-looking architecture-specific code.

The patches introduce a mechanism where architectures register where the
active ranges of page frames are with add_active_range().  When all areas have
been discovered, free_area_init_nodes() is called to initialise the pgdat and
zones.  The zone sizes and holes are then calculated in an architecture
independent manner.

Patch 1 introduces the mechanism for registering and initialising PFN ranges
Patch 2 changes ppc to use the mechanism - 139 arch-specific LOC removed
Patch 3 changes x86 to use the mechanism - 136 arch-specific LOC removed
Patch 4 changes x86_64 to use the mechanism - 74 arch-specific LOC removed
Patch 5 changes ia64 to use the mechanism - 52 arch-specific LOC removed
Patch 6 accounts for mem_map as a memory hole as the pages are not reclaimable.
	It adjusts the watermarks slightly

Tony Luck has successfully tested for ia64 on Itanium with tiger_defconfig,
gensparse_defconfig and defconfig.  Bob Picco has also tested and debugged on
IA64.  Jack Steiner successfully boot tested on a mammoth SGI IA64-based
machine.  These were on patches against 2.6.17-rc1 and release 3 of these
patches but there have been no ia64-changes since release 3.

There are differences in the zone sizes for x86_64 as the arch-specific code
for x86_64 accounts the kernel image and the starting mem_maps as memory holes
but the architecture-independent code accounts the memory as present.

The big benefit of this set of patches is a sizable reduction of
architecture-specific code, some of which is very hairy.  There should be a
greater reduction when other architectures use the same mechanisms for zone
and hole sizing but I lack the hardware to test on.

Additional credit;
	Dave Hansen for the initial suggestion and comments on early patches
	Andy Whitcroft for reviewing early versions and catching numerous
		errors
	Tony Luck for testing and debugging on IA64
	Bob Picco for fixing bugs related to pfn registration, reviewing a
		number of patch revisions, providing a number of suggestions
		on future direction and testing heavily
	Jack Steiner and Robin Holt for testing on IA64 and clarifying
		issues related to memory holes
	Yasunori for testing on IA64
	Andi Kleen for reviewing and feeding back about x86_64
	Christian Kujau for providing valuable information related to ACPI
		problems on x86_64 and testing potential fixes

This patch:

Define the structure to represent an active range of page frames within a node
in an architecture independent manner.  Architectures are expected to register
active ranges of PFNs using add_active_range(nid, start_pfn, end_pfn) and call
free_area_init_nodes() passing the PFNs of the end of each zone.
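
The register-then-size flow can be modelled in a few lines. This is a toy model under stated assumptions, not the kernel code: registered ranges are assumed not to overlap, zone boundaries are passed in directly, and every name prefixed with toy_ (as well as struct active_region and MAX_ACTIVE_REGIONS) is hypothetical.

```c
#include <assert.h>

#define MAX_ACTIVE_REGIONS 32

/* One registered range of page frames on a node: [start_pfn, end_pfn). */
struct active_region {
    int nid;
    unsigned long start_pfn;
    unsigned long end_pfn;
};

static struct active_region regions[MAX_ACTIVE_REGIONS];
static int nr_regions;

/* Architecture side: record where memory actually is. */
static void toy_add_active_range(int nid, unsigned long start_pfn,
                                 unsigned long end_pfn)
{
    assert(start_pfn < end_pfn);
    assert(nr_regions < MAX_ACTIVE_REGIONS);

    regions[nr_regions].nid = nid;
    regions[nr_regions].start_pfn = start_pfn;
    regions[nr_regions].end_pfn = end_pfn;
    nr_regions++;
}

/* Generic side: pages of node nid present inside [zone_start, zone_end). */
static unsigned long toy_present_pages(int nid, unsigned long zone_start,
                                       unsigned long zone_end)
{
    unsigned long present = 0;
    int i;

    for (i = 0; i < nr_regions; i++) {
        unsigned long s = regions[i].start_pfn;
        unsigned long e = regions[i].end_pfn;

        if (regions[i].nid != nid)
            continue;
        /* Clip the region to the zone before counting it. */
        if (s < zone_start)
            s = zone_start;
        if (e > zone_end)
            e = zone_end;
        if (s < e)
            present += e - s;
    }
    return present;
}

/* Generic side: whatever the zone spans but no region covers is a hole. */
static unsigned long toy_hole_pages(int nid, unsigned long zone_start,
                                    unsigned long zone_end)
{
    return (zone_end - zone_start) -
           toy_present_pages(nid, zone_start, zone_end);
}
```

The architecture only reports ranges; the sizing arithmetic lives in one shared place, which is the source of the per-architecture line-count reductions listed above.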

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Keith Mannthey" <kmannth@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:11 -07:00
Andrew Morton
2bd0cfbde2 [PATCH] fix x86_64-mm-spinlock-cleanup
We need processor.h for cpu_relax().

Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:11 -07:00
Alexey Dobriyan
133d205a18 [PATCH] Make kmem_cache_destroy() return void
un-, de-, -free, -destroy, -exit, etc. functions should in general return
void.

Also, there is very little that, say, filesystem driver code can do upon a
failed kmem_cache_destroy().  If it is decided to BUG in this case, the BUG
should be put in generic code instead.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:11 -07:00
Dave Kleikamp
a4e4de36dc [PATCH] ext3: Fix sparse warnings
Fixing up some endian-ness warnings in preparation to clone ext4 from ext3.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Dave Kleikamp
e9ad5620bf [PATCH] ext3: More whitespace cleanups
More white space cleanups in preparation of cloning ext4 from ext3.
Removing spaces that precede a tab.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Eric Sandeen
37ed322290 [PATCH] JBD: 16T fixes
These are a few places I've found in jbd that look like they may not be
16T-safe, or consistent with the use of unsigned longs for block
containers.  Problems here would be somewhat hard to hit, would require
journal blocks past the 8T boundary, which would not be terribly common.
Still, should fix.

(some of these have come from the ext4 work on jbd as well).

I think there's one more possibility that the wrap() function may not be
safe IF your last block in the journal butts right up against the 2^32 block
boundary, but that seems like a VERY remote possibility, and I'm not
worrying about it at this point.
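
The 8T and 16T figures follow from simple arithmetic if one assumes 4 KiB blocks (the block size is not stated above and is an assumption of this sketch). The helper names addressable_bytes() and fits_in_signed_32() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes addressable with block numbers that are `bits` bits wide. */
static uint64_t addressable_bytes(unsigned int bits, uint64_t block_size)
{
    assert(bits < 64);
    return ((uint64_t)1 << bits) * block_size;
}

/* Does a block number survive being stored in a signed 32-bit int? */
static int fits_in_signed_32(uint64_t block)
{
    return block <= (uint64_t)INT32_MAX;
}
```

With 4096-byte blocks, 31 usable bits give 8 TiB and 32 bits give 16 TiB, so a block number kept in a signed 32-bit variable stops being representable exactly at the 8T boundary.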

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Mingming Cao
ae6ddcc5f2 [PATCH] ext3 and jbd cleanup: remove whitespace
Remove whitespace from ext3 and jbd, before we clone ext4.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
KAMEZAWA Hiroyuki
bbf2bef9f5 [PATCH] fix "cpu to node relationship fixup: map cpu to node"
Fix build error introduced by 3212fe1594

Non-NUMA case should be handled.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:08 -07:00
Linus Torvalds
a5b08073a0 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6: (30 commits)
  i2c: Drop unimplemented slave functions
  i2c: Constify i2c_algorithm declarations, part 2
  i2c: Constify i2c_algorithm declarations, part 1
  i2c: Let drivers constify i2c_algorithm data
  i2c-isa: Restore driver owner
  i2c-viapro: Add support for the VT8237A and VT8251
  i2c: Warn on i2c client creation failure
  i2c-core: Drop useless bitmaskings
  i2c-algo-pcf: Discard the mdelay data struct member
  i2c-algo-bit: Cleanups
  i2c-isa: Fail adding driver on attach_adapter error
  i2c: __must_check fixes (chip drivers)
  i2c-dev: attach/detach_adapter cleanups
  i2c-stub: Chip address as a module parameter
  i2c: Plan i2c-isa for removal
  i2c: New bus driver for TI OMAP boards
  i2c-algo-bit: Discard the mdelay data struct member
  i2c-matroxfb: Struct init conversion
  i2c: Fix copy-n-paste in subsystem Kconfig
  i2c-au1550: Add I2C support for Au1200
  ...
2006-09-27 08:09:48 -07:00
Linus Torvalds
ff0972c26b Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (28 commits)
  pciehp - fix wrong return value
  IA64: PCI: dont disable irq which is not enabled
  acpiphp: add support for ioapic hot-remove
  PCI: assign ioapic resource at hotplug
  acpiphp: disable bridges
  acpiphp: stop bus device before acpi_bus_trim
  PCI: add pci_stop_bus_device
  acpiphp: do not initialize existing ioapics
  acpiphp: initialize ioapics before starting devices
  acpiphp: set hpp values before starting devices
  PCI Hotplug: cleanup pcihp skeleton code.
  PCI: Restore PCI Express capability registers after PM event
  PCI: drivers/pci/hotplug/acpiphp_glue.c: make a function static
  PCI: Multiprobe sanitizer
  PCI: fix __must_check warnings
  PCI Hotplug: fix __must_check warnings
  SHPCHP: fix __must_check warnings
  PCI-Express AER implemetation: pcie_portdrv error handler
  PCI-Express AER implemetation: AER core and aerdriver
  PCI-Express AER implemetation: export pcie_port_bus_type
  ...
2006-09-27 08:09:15 -07:00
Ralf Baechle
36396f3c36 [MIPS] s/__ASSEMBLER__/__ASSEMBLY__/ for clarity sake.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:57 +01:00
Ralf Baechle
e584ade1a6 [MIPS] Have headers_install install <asm/cachectl.h> and <asm/sysmips.h>.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:56 +01:00
Richard Sandiford
ddb1199c4c [MIPS] fstatat syscall names
MIPS is the only port to call its fstatat()-related syscalls
"__NR_fstatat".  Now I can see why that might be seen as every
other port being wrong, but I think for o32, it is at best confusing.
__NR_fstat provides a plain (32-bit) stat while __NR_fstatat provides a
64-bit stat.  Changing the name to __NR_fstatat64 would make things more
explicit, match x86, and make the glibc port slightly easier.

The current name is more appropriate for n32 and n64, but it would be
appropriate for other 64-bit targets too, and those targets have chosen
to call it __NR_newfstatat instead.  Using the same name for MIPS would
again be more consistent and make the glibc port slightly easier.

I'm not wedded to this idea if the current names are preferred,
but FWIW...

Signed-off-by: Richard Sandiford <richard@codesourcery.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:56 +01:00
Ralf Baechle
d48f1de2d8 [MIPS] Remove EV96100 as previously announced.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:55 +01:00
Ralf Baechle
d7d86aa88a [MIPS] Cleanup hazard handling.
Mostly based on patch by Chris Dearman and cleanups from Yoichi.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:53 +01:00
Peter Watkins
9dbd7b9142 [MIPS] Fix USER_PTRS_PER_PGD for 64K page size.
The code in pgtable-64.h assumes TASK_SIZE is always bigger than a first
level PGDIR_SIZE. This is not the case for 64K pages, where task size is
40 bits (1TB) and a pgd entry can map 42 bits. This leads to
USER_PTRS_PER_PGD being zero for 64K pages.
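
The arithmetic behind the zero can be checked directly. In the sketch below, ptrs_naive() reproduces the integer division described above, and ptrs_clamped() shows one possible remedy (never returning less than one entry). Both function names are hypothetical, and the clamp is an assumption for illustration, not necessarily what this patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Task size divided by the span of one pgd entry, both as bit widths. */
static uint64_t ptrs_naive(unsigned int task_bits, unsigned int pgdir_bits)
{
    assert(task_bits < 64 && pgdir_bits < 64);
    return (UINT64_C(1) << task_bits) / (UINT64_C(1) << pgdir_bits);
}

/* Same division, but a user address space always needs one entry. */
static uint64_t ptrs_clamped(unsigned int task_bits, unsigned int pgdir_bits)
{
    uint64_t n = ptrs_naive(task_bits, pgdir_bits);

    return n ? n : 1;
}
```

With a 40-bit task size and a pgd entry covering 42 bits, the naive division yields 0 while the clamped version yields 1.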

Signed-off-by: Peter Watkins <treestem@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:52 +01:00
thomas@koeller.dyndns.org
0c68a9b6a7 [MIPS] Move excite_fpga.h to include/asm-mips/mach-excite
excite_fpga.h, like all platform headers, really belongs in the
platform header directory.

Signed-off-by: Thomas Koeller <thomas.koeller@baslerweb.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:50 +01:00
Ralf Baechle
6b8aab0930 [MIPS] Reformat missformated SMTC bits.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:49 +01:00
Atsushi Nemoto
3c70f12bfa [MIPS] Qemu does not have D-cache aliases
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:49 +01:00
Yoichi Yuasa
bdb37c8d63 [MIPS] Remove F_SETSIG and F_GETSIG in favor of the asm-generic definitions.
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:48 +01:00
Ralf Baechle
633fd568c1 [MIPS] Move definition of IRIX compat constant into IRIX compat code.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:45 +01:00
Yoichi Yuasa
6b3e5f44b5 [MIPS] Use common definitions from asm-generic/signal.h
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:44 +01:00
Maciej W. Rozycki
fc095a9021 [MIPS] Atlas: update interrupt handling
The following change updates the Atlas interrupt handling to match that
of Malta.  Tested with a 5Kc and a 34Kf successfully.

Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:42 +01:00
Maciej W. Rozycki
3ee24e1b1e [MIPS] Atlas: Fix building the RTC driver
Atlas maps its RTC chip in the host mmio space rather than using the
"traditional" location in the PCI/ISA port space.  A change to the generic
RTC header now requires ARCH_RTC_LOCATION to be defined.

Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:41 +01:00
Atsushi Nemoto
7fdeb04814 [MIPS] Wire up set_robust_list(2) and get_robust_list(2)
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:40 +01:00
Atsushi Nemoto
8f9a2b3246 [MIPS] Fix errors detected by "make headers_check"
* export asm/sgidefs.h
* include asm/isadep.h only if in kernel
* do not export contents of asm/timex.h and asm/user.h

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:40 +01:00
Ralf Baechle
d34555fb20 [MIPS] Do not lose upper 32-bit on MIPS32 with 64-bit addresses in __pte().
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:39 +01:00
Ralf Baechle
65316fd13a [MIPS] Replace generic__raw_read_trylock usage
generic__raw_read_trylock() is a defective generic function that actually
does a __raw_read_lock ...

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:39 +01:00
Ralf Baechle
585fa72493 [MIPS] Retire flush_icache_page from mm use.
On the 34K the redundant cache operations were causing excessive stalls
resulting in realtime code running on the second VPE missing its deadline.
For all other platforms this patch is just a significant performance
improvement, as illustrated by the benchmark numbers below.

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
25Kf      2.6.18-rc4     533 0.49 1.16 7.57 33.4 30.5 1.34 12.4 5497 17.K 54.K
25Kf      2.6.18-rc4-p   533 0.49 1.16 6.68 23.0 30.7 1.36 8.55 5030 16.K 48.K
4Kc       2.6.18-rc4      80 4.21 15.0 131. 289. 261. 16.5 258. 18.K 70.K 227K
4Kc       2.6.18-rc4-p    80 4.34 13.1 128. 285. 262. 18.2 258. 12.K 52.K 176K
34Kc      2.6.18-rc4      40 5.01 14.0 61.6 90.0 477. 17.9 94.7 29.K 108K 342K
34Kc      2.6.18-rc4-p    40 4.98 13.9 61.2 89.7 475. 17.6 93.7 8758 44.K 158K
BCM1480   2.6.18-rc4     700 0.28 0.60 3.68 5.92 16.0 0.78 5.08 931. 3163 15.K
BCM1480   2.6.18-rc4-p   700 0.28 0.61 3.65 5.85 16.0 0.79 5.20 395. 1464 8385
TX49-16K  2.6.18-rc3     197 0.73 2.41 19.0 37.8 82.9 2.94 17.5 4438 14.K 56.K
TX49-16K  2.6.18-rc3-p   197 0.73 2.40 19.9 36.3 82.9 2.94 23.4 2577 9103 38.K
TX49-32K  2.6.18-rc3     396 0.36 1.19 6.80 11.8 41.0 1.46 8.17 2738 8465 32.K
TX49-32K  2.6.18-rc3-p   396 0.36 1.19 6.82 10.2 41.0 1.46 8.18 1330 4638 18.K
    
Original patch by me with enhancements by Atsushi Nemoto.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
2006-09-27 13:37:34 +01:00
Ralf Baechle
b4b30a5a0a [MIPS] Cleanup leftovers of ARCH_HAS_IRQ_PER_CPU
CONFIG_IRQ_PER_CPU now controls the IRQ_PER_CPU stuff.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-09-27 13:37:29 +01:00
Paul Mundt
f3c2575818 sh: Calculate shm alignment at runtime.
Set the SHM alignment at runtime, based off of probed cache desc.
Optimize get_unmapped_area() to only colour align shared mappings.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:36:17 +09:00
Paul Mundt
87b0ef91b6 sh: dma-mapping compile fixes.
Silly bug, make it build again..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:34:41 +09:00
Paul Mundt
19f9a34f87 sh: Initial vsyscall page support.
This implements initial support for the vsyscall page on SH.
At the moment we leave it configurable due to having nommu
to support from the same code base. We hook it up for the
signal trampoline return at present, with more to be added
later, once uClibc catches up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:33:49 +09:00
Paul Mundt
8c12b5dc13 sh: Clean up PAGE_SIZE definition for assembly use.
We want to be able to use PAGE_SIZE all over the place,
this is the same approach adopted by other architectures..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:31:06 +09:00
Paul Mundt
72c35543f8 sh: Support for L2 cache on newer SH-4A CPUs.
This implements preliminary support for the L2 caches found
on newer SH-4A CPUs.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:27:43 +09:00
Paul Mundt
9d549a7d8e sh: Update kexec support for API changes.
This was falling a bit behind..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:26:05 +09:00
Paul Mundt
05ae915851 sh: Optimized readsl()/writesl() support.
Implement optimized copies of readsl()/writesl().

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:25:24 +09:00
Paul Mundt
2220d16493 sh: Report movli.l/movco.l capabilities.
Add llsc to cpu_flags[] and comment cpu-features.h.

Signed-off-by: Jamie Lenehan <nynaeve@twibble.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:24:28 +09:00
Paul Mundt
315bb96824 sh: CPU flags in AT_HWCAP in ELF auxvt.
Encode processor flags in AT_HWCAP in the ELF auxiliary vector.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:22:53 +09:00
Paul Mundt
a6a3113989 sh: Add support for 4K stacks.
This enables support for 4K stacks on SH.

Currently this depends on DEBUG_KERNEL, but likely all boards
will switch to this as the default in the future.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:22:14 +09:00
Paul Mundt
d153ea88dc sh: stack debugging support.
This adds a DEBUG_STACK_USAGE and DEBUG_STACKOVERFLOW for SH.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:20:16 +09:00
Paul Mundt
2c7834a6f1 sh: machvec rework.
Some more machvec overhauling and setup code cleanup. Kill off
get_system_type() and platform_setup(), we can do these both
through the machvec. While we're at it, kill off more useless
mach.c's and drop some legacy cruft from setup.c.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:17:31 +09:00
Russell King
456335e207 [ARM] Separate page table manipulation code from bootmem initialisation
nommu does not require the page table manipulation code in the
bootmem initialisation paths.  Move this into separate inline
functions.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-09-27 10:10:58 +01:00
Paul Mundt
bc8fb5d047 sh: Solution Engine SH7343 board support.
This adds support for the SE7343 board.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:09:34 +09:00
Paul Mundt
5a4053b232 sh: Kill off dead boards.
None of these have been maintained in years, and no one seems to
be interested in doing so, so just get rid of them.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 18:00:19 +09:00
Paul Mundt
781125ca58 sh: New atomic ops for SH-4A movli.l/movco.l
SH-4A implements LL/SC instructions, so we implement a simple
set of atomic operations using these.
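
The LL/SC pattern behind such atomic operations can be sketched in portable C11. This is only an illustration of the retry loop: atomic_compare_exchange_weak stands in for the movli.l/movco.l pair, and the function name sketch_atomic_add_return is hypothetical rather than part of this patch:

```c
#include <stdatomic.h>

/*
 * Add i to *v and return the new value. The weak compare-exchange may
 * fail spuriously, much like a store-conditional that lost its
 * reservation, so the update is retried until it succeeds.
 */
static int sketch_atomic_add_return(atomic_int *v, int i)
{
    int old = atomic_load(v);

    while (!atomic_compare_exchange_weak(v, &old, old + i))
        ; /* on failure, old now holds the current value; retry */

    return old + i;
}
```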

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:52:19 +09:00
Paul Mundt
91550f715b sh: Kill off the rest of the legacy rtc mess.
With the new RTC class driver, we can get rid of most of the
old left over cruft.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:45:01 +09:00
Takashi YOSHII
51e22e7a05 sh: SHMIN board support.
This adds support for the SHMIN SH7706 board.

Signed-off-by: Takashi YOSHII <takasi-y@ops.dti.ne.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:41:31 +09:00
Paul Mundt
e5723e0eeb sh: Add support for SH7706/SH7710/SH7343 CPUs.
This adds support for the aforementioned CPU subtypes, and cleans
up some build issues encountered as a result.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:38:11 +09:00
George G. Davis
4052ebb7a2 [ARM] 3859/1: Fix devicemaps_init() XIP_KERNEL odd 1MiB XIP_PHYS_ADDR translation error
The ARM XIP_KERNEL map created in devicemaps_init() is wrong.
The map.pfn is rounded down to an even 1MiB section boundary
which results in va/pa translation errors when XIP_PHYS_ADDR
starts on an odd 1MiB boundary and this causes the kernel to
hang.  This patch fixes ARM XIP_KERNEL translation errors for
the odd 1MiB XIP_PHYS_ADDR boundary case.

Signed-off-by: George G. Davis <gdavis@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-09-27 09:35:05 +01:00
Paul Mundt
ecd9561687 serial: Add SERIAL_SH_SCI_NR_UARTS for sh-sci.
sh-sci needs to be able to define its number of ports to
support, we do this with a config option, like most other
ports do.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:32:30 +09:00
Paul Mundt
9f23e7e94f sh: pselect6 and ppoll, along with signal trampoline rework.
This implements support for ppoll() and pselect6()..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:27:00 +09:00
Yoshinori Sato
a2d1a5fae6 sh: __addr_ok() and other misc nommu fixups.
A few more outstanding nommu fixups..

Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:25:07 +09:00
Yoshinori Sato
e96636ccfa sh: Various nommu fixes.
This fixes up some of the various outstanding nommu bugs on
SH.

Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:21:02 +09:00
Paul Mundt
e7f93a355c sh: Make PAGE_OFFSET configurable.
nommu needs to be able to shift PAGE_OFFSET, so we switch it to a
non-user-visible CONFIG_PAGE_OFFSET and use that in the few places
where it matters.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:19:13 +09:00
Paul Mundt
adf1890b0c sh: Move voyagergx_reg.h to a more sensible place.
Other boards require this as well, so move it out of the
rts7751r2d directory.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:17:27 +09:00
Takashi YOSHII
4b565680d1 sh: math-emu support
This implements initial math-emu support, aimed primarily at SH-3.

Signed-off-by: Takashi YOSHII <takasi-y@ops.dti.ne.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:15:32 +09:00
Paul Mundt
af514ca7d2 sh: Rename rtc_get/set_time() to avoid RTC_CLASS conflict.
We have a clash with RTC_CLASS over these names, so we
change them..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:11:32 +09:00
Paul Mundt
2991be7252 sh: Fixup __strnlen_user() behaviour.
Drop TIF_USERSPACE and add addr_limit to the thread_info struct.
Subsequently, use that for address checking in strnlen_user() to
ward off bogus -EFAULTs.

Make __strnlen_user() return 0 on exception, rather than -EFAULT.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:07:07 +09:00
Paul Mundt
0f08f33808 sh: More cosmetic cleanups and trivial fixes.
Nothing exciting here, just trivial fixes..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 17:03:56 +09:00
Paul Mundt
959f85f8a3 sh: Consolidated SH7751/SH7780 PCI support.
This cleans up quite a lot of the PCI mess that we
currently have, and attempts to consolidate the
duplication in the SH7780 and SH7751 PCI controllers.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 16:43:28 +09:00
Paul Mundt
56e8d7b578 sh: kgdb stub cleanups.
Some kgdb cleanup. Move hexchars/highhex/lowhex to the header, so it can
be reused by sh-sci. Also drop silly ctrl_inl/outl() overloading being
done by the kgdb stub.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 16:24:55 +09:00
Andriy Skulysh
3aa770e797 sh: APM/PM support.
This adds some simple PM stubs and the basic APM interfaces,
primarily for use by hp6xx, where the existing userland
expects it.

Signed-off-by: Andriy Skulysh <askulysh@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 16:20:22 +09:00
Paul Mundt
ef48e8e349 sh: Free up some and document PTEL flags.
Drop _PAGE_SHARED/_PAGE_U0_SHARED and document Linux PTE encodings in
the PTEL value. Preserve the swap cache entry encoding semantics for
now, though it will need rework to free up _PAGE_WT from _PAGE_FILE.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 16:17:17 +09:00
Paul Mundt
00b3aa3fc9 sh: xchg()/__xchg() always_inline fixes for gcc4.
Make __xchg() a macro, so that gcc 4.0 doesn't blow up thanks to
always_inline..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 16:05:56 +09:00
Paul Mundt
f151749440 sh: Cleanup and document register bank usage.
Initial register bank cleanup. Make SR.RB configurable, and add some
preliminary documentation on register bank usage within the kernel.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 16:01:12 +09:00
Paul Mundt
5283ecb5cc sh: Add support for R7780RP and R7780MP boards.
This adds support for the Renesas SH7780 development boards,
R7780RP and R7780MP.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 15:59:17 +09:00
Paul Mundt
d7c30c682a sh: Store Queue API rework.
Rewrite the store queue API for a per-cpu interface in the driver
model. The old miscdevice is dropped, due to TASK_SIZE limitations,
and no one was using it anyways.

Carve up and allocate store queue space with a bitmap, back sq
mapping objects with a slab cache, and let userspace worry about
its own prefetching.
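
As a rough illustration of carving up a fixed region with a bitmap, the following user-space sketch does a first-fit search for a run of free units. The unit count and all sq_sketch_* names are assumptions made for the example and do not reflect the driver's actual interface:

```c
#include <limits.h>

#define SQ_SKETCH_UNITS 64 /* hypothetical number of allocation units */
#define SQ_SKETCH_BPW   (sizeof(unsigned long) * CHAR_BIT)
#define SQ_SKETCH_WORDS ((SQ_SKETCH_UNITS + SQ_SKETCH_BPW - 1) / SQ_SKETCH_BPW)

static unsigned long sq_sketch_bitmap[SQ_SKETCH_WORDS];

static int sq_sketch_test(unsigned int bit)
{
    return (sq_sketch_bitmap[bit / SQ_SKETCH_BPW] >> (bit % SQ_SKETCH_BPW)) & 1UL;
}

static void sq_sketch_assign(unsigned int bit, int used)
{
    unsigned long mask = 1UL << (bit % SQ_SKETCH_BPW);

    if (used)
        sq_sketch_bitmap[bit / SQ_SKETCH_BPW] |= mask;
    else
        sq_sketch_bitmap[bit / SQ_SKETCH_BPW] &= ~mask;
}

/* First fit: return the first unit of a free run of n units, or -1. */
static int sq_sketch_alloc(unsigned int n)
{
    unsigned int i, j, run = 0;

    if (n == 0 || n > SQ_SKETCH_UNITS)
        return -1;

    for (i = 0; i < SQ_SKETCH_UNITS; i++) {
        run = sq_sketch_test(i) ? 0 : run + 1;
        if (run == n) {
            for (j = i + 1 - n; j <= i; j++)
                sq_sketch_assign(j, 1); /* mark the run as used */
            return (int)(i + 1 - n);
        }
    }
    return -1;
}

/* Release n units starting at start. */
static void sq_sketch_free(unsigned int start, unsigned int n)
{
    unsigned int i;

    for (i = start; i < start + n && i < SQ_SKETCH_UNITS; i++)
        sq_sketch_assign(i, 0);
}
```

A kernel implementation would additionally need locking around the bitmap; that is left out here to keep the sketch short.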

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 15:49:57 +09:00
Paul Mundt
373e68b547 sh: Board updates for I/O routine rework.
This updates the various boards for some of the recent I/O routine
updates.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 15:41:24 +09:00
Paul Mundt
c470662854 sh: Fixup SHMLBA definition for SH7705.
We need this set to something sensible anywhere where we have
an aliasing dcache..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 15:29:18 +09:00
Paul Mundt
d7cdc9e8ac sh: ioremap() overhaul.
ioremap() overhaul. Add support for transparent PMB mapping, get rid of
p3_ioremap(), etc. Also drop ioremap() and iounmap() routines from the
machvec, as everyone can use the generic ioremap() API instead. For PCI
memory apertures and other special cases, use the pci_iomap() API, as
boards are already required to get the mapping right there.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 15:16:42 +09:00
Paul Mundt
26ff6c11ef sh: page table alloc cleanups and page fault optimizations.
Cleanup of page table allocators, using generic folded PMD and PUD
helpers. TLB flushing operations are moved to a more sensible spot.

The page fault handler is also optimized slightly, we no longer waste
cycles on IRQ disabling for flushing of the page from the ITLB, since
we're already under CLI protection by the initial exception handler.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 15:13:36 +09:00
Paul Mundt
0c7b1df69c sh: SH-4A Privileged Space Mapping Buffer (PMB) support.
Add support for 32-bit physical addressing through the SH-4A
Privileged Space Mapping Buffer (PMB).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 15:08:07 +09:00
Jamie Lenehan
a09749dd86 sh: Titan board support.
Add support for the titan board.

Signed-off-by: Jamie Lenehan <lenehan@twibble.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 15:05:39 +09:00
Paul Mundt
298476220d sh: Add control register barriers.
Currently when making changes to control registers, we
typically need some time for changes to take effect (8
nops, generally).  However, for sh4a we simply need to
do an icbi..

This is a simple patch for implementing a general purpose
ctrl_barrier() which functions as a control register write
barrier. There's some additional documentation in the patch
itself, but it's pretty self explanatory.

There were also some places where we were not doing the
barrier, which didn't seem to have any adverse effects on
legacy parts, but certainly did on sh4a. It's safer to have
the barrier in place for legacy parts as well in these cases,
though this does make flush_tlb_all() more expensive (by an
order of 8 nops).  We can ifdef around the flush_tlb_all()
case for now if it's clear that all legacy parts won't have
a problem with this.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:57:44 +09:00
kogiidena
94c0fa520c sh: landisk board support.
This adds support for the I-O DATA Landisk.

Signed-off-by: kogiidena <kogiidena@eggplant.ddo.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:53:35 +09:00
Paul Mundt
634bf4f69b sh: Fix libata build.
Drop virt_to_bus() from sg_dma_address() so libata builds.
While we're at it, move sg_dma_address() and sg_dma_len()
from pci.h to scatterlist.h.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:48:09 +09:00
Paul Mundt
8b395265f8 sh: Fix fatal oops in copy_user_page() on sh4a (SH7780).
We had a pretty interesting oops happening, where copy_user_page()
was down()'ing p3map_sem[] with a bogus offset (particularly, an
offset that hadn't been initialized with sema_init(), due to the
mismatch between cpu_data->dcache.n_aliases and what was assumed
based off of the old CACHE_ALIAS value).

Luckily, spinlock debugging caught this for us, and so we drop
the old hardcoded CACHE_ALIAS for sh4 completely and rely on the
run-time probed cpu_data->dcache.alias_mask. This in turn gets
the p3map_sem[] index right, and everything works again.

While we're at it, also convert to 4-level page tables..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:38:02 +09:00
Paul Mundt
75c92acdd5 sh: Wire up new syscalls.
The syscall table has lagged behind a bit, wire up the new ones..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:36:44 +09:00
Alexey Dobriyan
ef9a1d4c0c sh: remove cpu_online() definition from <asm/smp.h>
It's defined in <linux/cpumask.h> and the log is horribly flooded by
"redefined" messages.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:32:57 +09:00
Paul Mundt
5b19c9081f sh: Support for SH7770/SH7780 CPU subtypes.
Merge support for SH7770 and SH7780 SH-4A subtypes.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:31:40 +09:00
Paul Mundt
a80fd21e52 sh: earlyprintk= support and cleanups.
Allow multiple early printk consoles via earlyprintk=.

With this change earlyprintk is no longer enabled by default;
it must be specified on the kernel command line, optionally
with ",keep" to prevent unregistration by tty_io.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:26:53 +09:00
Paul Mundt
e86d6b66f5 sh: prefetch()/prefetchw() support.
SH-2/3/4 are able to prefetch, add support for it..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:20:54 +09:00
Richard Curnow
b638d0b921 sh: Optimized cache handling for SH-4/SH-4A caches.
This reworks some of the SH-4 cache handling code to more easily
accommodate newer-style caches (particularly for the > direct-mapped
case), as well as optimizing some of the old code.

Signed-off-by: Richard Curnow <richard.curnow@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:09:26 +09:00
Paul Mundt
fdfc74f9fc sh: Support for SH-4A memory barriers.
SH-4A supports 'synco' as a barrier, sprinkle it around
the cache ops as necessary..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 14:05:52 +09:00
Paul Mundt
e8fb67f8e0 sh: HS7751RVoIP board updates.
Various cleanups for HS7751RVoIP. Mostly just getting
rid of the old mach.c and splitting codec configuration
in to its own Kconfig.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 13:56:28 +09:00
Paul Mundt
6d75e650f1 sh: Move hd64461.h to a more sensible location.
With the I/O rework for hd64461 we're down to a single header,
so move it by itself and get rid of the directory.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 13:42:57 +09:00
Paul Mundt
d95fb13c96 sh: Fixup TMU_TOCR definition for SH7300.
SH7300 has a different TMU_TOCR, make the TMU code work again.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 13:30:08 +09:00
Paul Mundt
3f787fe2e0 sh: hugetlb updates.
For some of the larger sizes we permitted spanning pages
across several PTEs, but this turned out to not be generally
useful. This reverts the sh hugetlbpage interface to something
more sensible using huge pages at single PTE granularity.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 13:11:57 +09:00
Paul Mundt
e4c2cfee5d sh: Various cosmetic cleanups.
We had quite a bit of whitespace damage, clean most of it up..

Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Arthur Othieno <a.othieno@bluewin.ch>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 12:31:01 +09:00
Tom Rini
e4e3b5ccd7 sh: Add a simple cmpxchg().
We didn't have one of these before, a simple implementation
borrowed from MIPS as well as the __HAVE_ARCH_CMPXCHG bits.
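
For readers unfamiliar with the primitive, the semantics of a compare-and-exchange can be sketched as below. This is a plain user-space illustration, not the code from this patch; the name sketch_cmpxchg is hypothetical, and the comments marking where a uniprocessor kernel would mask interrupts are an assumption about how a "simple" version stays atomic:

```c
/*
 * Store new_val into *p only if *p currently equals old.
 * Always returns the value that was found in *p, so the caller can
 * tell whether the exchange happened (return value == old).
 */
static unsigned long sketch_cmpxchg(volatile unsigned long *p,
                                    unsigned long old,
                                    unsigned long new_val)
{
    unsigned long prev;

    /* assumption: a uniprocessor kernel would disable interrupts here */
    prev = *p;
    if (prev == old)
        *p = new_val;
    /* ...and restore the saved interrupt state here */

    return prev;
}
```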

Signed-off-by: Tom Rini <trini@kernel.crashing.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 11:28:20 +09:00
Paul Mundt
0c91c1a701 sh: Move smc37c93x.h for SystemH board use.
SystemH needs this header as well, not just 770x SE.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2006-09-27 11:16:20 +09:00
Satoru Takeuchi
24f8aa9b46 PCI: add pci_stop_bus_device
This patch adds pci_stop_bus_device() which stops a PCI device (detach
the driver, remove from the global list and so on) and any children.
This is needed for ACPI based PCI-to-PCI bridge hot-remove, and it will
also be needed for ACPI based PCI root bridge hot-remove.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 17:43:54 -07:00
Alan Cox
50b0075520 PCI: Multiprobe sanitizer
There are numerous drivers that can use multithreaded probing, but having
some kind of global flag as the way to control this makes migration to
threaded probing hard, since it enables it everywhere and is almost
as likely to cause serious pain as holding a clog dance in a minefield.

If we have a pci_driver multithread_probe flag to inherit you can turn
it on for one driver at a time.

From playing so far however I think we need a different model at the
device layer which serializes until the called probe function says "ok
you can start another one now". That would need some kind of flag and
semaphore plus a helper function.

Anyway, in the absence of that, this is a starting point to usefully play
with this stuff.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 17:43:53 -07:00
Greg Kroah-Hartman
b19441af18 PCI: fix __must_check warnings
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 17:43:53 -07:00
Zhang, Yanmin
6c2b374d74 PCI-Express AER implementation: AER core and aerdriver
Patch 3 implements the core part of PCI-Express AER and aerdrv
port service driver.

When a root port service device is probed, the aerdrv will call
request_irq to register an irq handler for the AER error interrupt.

When a device sends a PCI-Express error message to the root port,
the root port will trigger an interrupt, by either MSI or IO-APIC,
and the kernel then runs the irq handler. The handler collects the root
error status register and schedules a work. The work will call
the core part to process the error based on its type
(Correctable/non-fatal/fatal).

As for Correctable errors, the patch chooses to just clear the correctable
error status register of the device.

As for the non-fatal error, the patch follows generic PCI error handler
rules to call the error callback functions of the endpoint's driver. If
the device is a bridge, the patch chooses to broadcast the error to
downstream devices.

As for the fatal error, the patch resets the pci-express link and
follows generic PCI error handler rules to call the error callback
functions of the endpoint's driver. If the device is a bridge, the patch
chooses to broadcast the error to downstream devices.
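
The three cases above can be summarised as a small decision function. This is one reading of the description rather than the patch's code: every sketch_* identifier is hypothetical, and treating "bridge" and "endpoint" as mutually exclusive paths is an assumption:

```c
enum sketch_aer_severity {
    SKETCH_AER_CORRECTABLE,
    SKETCH_AER_NONFATAL,
    SKETCH_AER_FATAL
};

#define SKETCH_CLEAR_STATUS   0x1u /* clear correctable error status    */
#define SKETCH_RESET_LINK     0x2u /* reset the PCI-Express link        */
#define SKETCH_CALL_DRIVER    0x4u /* call endpoint driver callbacks    */
#define SKETCH_BROADCAST_DOWN 0x8u /* broadcast to downstream devices   */

/* Return the set of recovery actions for an error of the given type. */
static unsigned int sketch_aer_actions(enum sketch_aer_severity sev,
                                       int is_bridge)
{
    unsigned int notify = is_bridge ? SKETCH_BROADCAST_DOWN
                                    : SKETCH_CALL_DRIVER;

    switch (sev) {
    case SKETCH_AER_CORRECTABLE:
        return SKETCH_CLEAR_STATUS;
    case SKETCH_AER_NONFATAL:
        return notify;
    case SKETCH_AER_FATAL:
        return SKETCH_RESET_LINK | notify;
    }
    return 0;
}
```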

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 17:43:53 -07:00
Brice Goglin
6397c75cbc MSI: Blacklist PCI-E chipsets depending on Hypertransport MSI capability
Introduce msi_ht_cap_enabled() to check the MSI capability in the
Hypertransport configuration space.
It is used in a generic quirk quirk_msi_ht_cap() to check whether
MSI is enabled on hypertransport chipset, and a nVidia specific quirk
quirk_nvidia_ck804_msi_ht_cap() where two HT MSI mappings have to
be checked.
Both quirks set the PCI_BUS_FLAGS_NO_MSI bus flag when MSI is disabled.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 17:43:52 -07:00
Brice Goglin
46ff34633e MSI: Rename PCI_CAP_ID_HT_IRQCONF into PCI_CAP_ID_HT
0x08 is the HT capability, while PCI_CAP_ID_HT_IRQCONF would be
the subtype 0x80 that mpic_scan_ht_pic() uses.
Rename PCI_CAP_ID_HT_IRQCONF into PCI_CAP_ID_HT.

And by the way, use it in the ipath driver instead of defining its
own HT_CAPABILITY_ID.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 17:43:52 -07:00
Jean Delvare
6d3aae9d74 i2c: Drop unimplemented slave functions
Drop the function declarations for slave mode support of i2c adapters.
This was never implemented, and by the time it is I bet we will want
something different anyway.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 15:38:52 -07:00
David Brownell
af71ff690b i2c: Let drivers constify i2c_algorithm data
Let drivers constify I2C algorithm method operations tables,
moving them from ".data" to ".rodata".

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 15:38:52 -07:00
Adrian Bunk
9b4ccb86b4 i2c-algo-pcf: Discard the mdelay data struct member
Just as i2c-algo-bit, i2c-algo-pcf has an unused mdelay struct member,
which we can get rid of to spare some code and memory.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 15:38:52 -07:00
Jean Delvare
a0d9c63d36 i2c-algo-bit: Discard the mdelay data struct member
The i2c_algo_bit_data structure has an mdelay member, which is not
used by the algorithm code (the code has always been ifdef'd out.)
Let's discard it to save some code and memory.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Mauro Carvalho Chehab <mchehab@brturbo.com.br>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 15:38:51 -07:00
Jean Delvare
51c3711704 i2c-algo-sibyte: Merge into i2c-sibyte
Merge i2c-algo-sibyte into i2c-sibyte, as this is a complete,
hardware-dependent SMBus implementation and not a reusable algorithm.

Perform some basic coding style cleanups while we're here (mainly
space-based indentation replaced by tabulations.)

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-26 15:38:50 -07:00
Russ Anderson
b29e7132b5 [IA64] PAL calls need physical mode, stacked
PAL_CACHE_READ and PAL_CACHE_WRITE need to be called in physical
mode with stacked registers.

Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-09-26 15:21:11 -07:00
Krzysztof Halasa
eb2a2fd91f [PATCH] Modularize generic HDLC
This patch enables building of individual WAN protocol support
routines (parts of generic HDLC) as separate modules.
All protocol-private definitions are moved from hdlc.h file
to protocol drivers. User-space interface and interface
between generic HDLC and underlying low-level HDLC drivers
are unchanged.

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-09-26 17:40:24 -04:00
Zou Nan hai
f5a3f3dc18 [IA64] Make gp value point to Region 5 in mca handler
The MCA dispatch code takes the physical address of GP passed from SAL, then
calls DATA_PA_TO_VA twice on GP before calling into C code.  The first time is
in ia64_set_kernel_register, the second time is in VIRTUAL_MODE_ENTER.
The gp is changed to a virtual address in region 7 because DATA_PA_TO_VA
is implemented by a dep instruction.

However, when notify blocks were called from MCA handler code, because
notify blocks are supported by callback function pointers, the gp
value was switched to region 5 again.

The patch sets the gp register to the kernel gp of region 5 at entry of MCA
dispatch.

Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-09-26 14:13:03 -07:00
Tony Luck
5c55cd63a7 Revert "[IA64] Unwire set/get_robust_list"
This reverts commit 2636255488.

Jakub Jelinek provided the missing futex_atomic_cmpxchg_inatomic()
function, so now it should be safe to re-enable these syscalls.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-09-26 14:04:42 -07:00
Jakub Jelinek
a192dc1600 [IA64] Implement futex primitives
Implement futex_atomic_op_inuser() and futex_atomic_cmpxchg_inatomic()
on IA64 in order to fully support all futex functionality.

Signed-off-by: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-09-26 14:00:56 -07:00
Linus Torvalds
b278240839 Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits)
  [PATCH] Don't set calgary iommu as default y
  [PATCH] i386/x86-64: New Intel feature flags
  [PATCH] x86: Add a cumulative thermal throttle event counter.
  [PATCH] i386: Make the jiffies compares use the 64bit safe macros.
  [PATCH] x86: Refactor thermal throttle processing
  [PATCH] Add 64bit jiffies compares (for use with get_jiffies_64)
  [PATCH] Fix unwinder warning in traps.c
  [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1
  [PATCH] x86: Move direct PCI scanning functions out of line
  [PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI
  [PATCH] Don't leak NT bit into next task
  [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder
  [PATCH] Fix some broken white space in ia32_signal.c
  [PATCH] Initialize argument registers for 32bit signal handlers.
  [PATCH] Remove all traces of signal number conversion
  [PATCH] Don't synchronize time reading on single core AMD systems
  [PATCH] Remove outdated comment in x86-64 mmconfig code
  [PATCH] Use string instructions for Core2 copy/clear
  [PATCH] x86: - restore i8259A eoi status on resume
  [PATCH] i386: Split multi-line printk in oops output.
  ...
2006-09-26 13:07:55 -07:00
Keshavamurthy Anil S
35589a8fa8 [IA64] Move perfmon tables from thread_struct to pfm_context
This patch renders thread_struct->pmcs[] and thread_struct->pmds[]
OBSOLETE. The actual table is moved to pfm_context structure which
saves space in thread_struct (in turn saving space in task_struct
which frees up more space for kernel stacks).

Signed-off-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-09-26 12:03:13 -07:00
Linus Torvalds
dd77a4ee0f Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (47 commits)
  Driver core: Don't call put methods while holding a spinlock
  Driver core: Remove unneeded routines from driver core
  Driver core: Fix potential deadlock in driver core
  PCI: enable driver multi-threaded probe
  Driver Core: add ability for drivers to do a threaded probe
  sysfs: add proper sysfs_init() prototype
  drivers/base: check errors
  drivers/base: Platform notify needs to occur before drivers attach to the device
  v4l-dev2: handle __must_check
  add CONFIG_ENABLE_MUST_CHECK
  add __must_check to device management code
  Driver core: fixed add_bind_files() definition
  Driver core: fix comments in drivers/base/power/resume.c
  sysfs_remove_bin_file: no return value, dump_stack on error
  kobject: must_check fixes
  Driver core: add ability for devices to create and remove bin files
  Class: add support for class interfaces for devices
  Driver core: create devices/virtual/ tree
  Driver core: add device_rename function
  Driver core: add ability for classes to handle devices properly
  ...
2006-09-26 11:49:46 -07:00
Stephane Eranian
dd562c0541 [IA64] Add interface so modules can discover whether multithreading is on.
Add is_multithreading_enabled() to check whether multi-threading
is enabled independently of which cpu is currently online

Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-09-26 11:39:38 -07:00
bibo mao
214ddde2f9 [IA64] kprobe opcode 16 bytes alignment on IA64
On IA64 an instruction opcode must be 16-byte aligned. The kprobe structure
has one element to save the original instruction, and the way the saved
opcode is currently stored in the kprobe structure cannot assure
16-byte alignment. This patch dynamically allocates the kprobe instruction
opcode to assure 16-byte alignment.

Signed-off-by: bibo mao <bibo.mao@intel.com>
Acked-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-09-26 11:20:37 -07:00
Jeff Garzik
c226951b93 Merge branch 'master' into upstream
2006-09-26 13:13:19 -04:00
Tony Luck
a4b47ab946 Pull esi-support into release branch
2006-09-26 09:47:30 -07:00
Tony Luck
ae3e021862 Pull model-name into release branch
2006-09-26 09:47:04 -07:00
Jeff Dike
e8df8c3304 [PATCH] Make UML use ptrace-abi.h
Include the host architecture's ptrace-abi.h instead of ptrace.h.

There was some cpp mangling of names around the ptrace.h include to avoid
symbol clashes between UML and the host architecture.  Most of these can go
away.  The exception is struct pt_regs, which is convenient to have in
userspace, but must be renamed in order that UML can define its own.

ptrace-x86_64.h needed to have some now-obsolete cpp cruft and a declaration
removed.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:49:10 -07:00
Jeff Dike
70e0eb8ef1 [PATCH] Split i386 and x86_64 ptrace.h
The use of SEGMENT_RPL_MASK in the i386 ptrace.h introduced by
x86-allow-a-kernel-to-not-be-in-ring-0.patch broke the UML build, as UML
includes the underlying architecture's ptrace.h, but has no easy access to the
x86 segment definitions.

Rather than kludging around this, as in the past, this patch splits the
userspace-usable parts, which are the bits that UML needs, of ptrace.h into
ptrace-abi.h, which is included back into ptrace.h.  Thus, there is no net
effect on i386.

As a side-effect, this creates a ptrace header which is close to being usable
in /usr/include.

x86_64 is also treated in this way for consistency.  There was some trailing
whitespace there, which is cleaned up.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:49:10 -07:00
Jeff Dike
75e29b18d9 [PATCH] uml: stack usage reduction
The KSTK_* macros used an inordinate amount of stack.  In order to overcome
an impedance mismatch between their interface, which just returns a single
register value, and the interface of get_thread_regs, which took a full
pt_regs, the implementation created an on-stack pt_regs, filled it in, and
returned one field.  do_task_stat calls KSTK_* twice, resulting in two
local pt_regs, blowing out the stack.

This patch changes the interface (and name) of get_thread_regs to just
return a single register from a jmp_buf.

The include of "archsetjmp.h" in registers.h to get the definition of
jmp_buf exposed a bogus include of <setjmp.h> in start_up.c.  <setjmp.h>
shouldn't be used anywhere any more since UML uses the klibc
setjmp/longjmp.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:49:09 -07:00
Rafael J. Wysocki
c5c6ba4e08 [PATCH] PM: Add pm_trace switch
Add the pm_trace attribute in /sys/power which has to be explicitly set to
one to really enable the "PM tracing" code compiled in when CONFIG_PM_TRACE
is set (which modifies the machine's CMOS clock in unpredictable ways).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:49:04 -07:00
Rafael J. Wysocki
c8eb8b4025 [PATCH] PM: make it possible to disable console suspending
Change suspend_console() so that it waits for all consoles to flush the
remaining messages and make it possible to switch the console suspending off
with the help of a Kconfig option.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Stefan Seyfried <seife@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:49:03 -07:00
Rafael J. Wysocki
940864ddab [PATCH] swsusp: Use memory bitmaps during resume
Make swsusp use memory bitmaps to store its internal information during the
resume phase of the suspend-resume cycle.

If the pfns of saveable pages are saved during the suspend phase instead of
the kernel virtual addresses of these pages, we can use them during the resume
phase directly to set the corresponding bits in a memory bitmap.  Then, this
bitmap is used to mark the page frames corresponding to the pages that were
saveable before the suspend (aka "unsafe" page frames).
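
A minimal sketch of the bookkeeping described above, assuming a flat bitmap indexed by page frame number. The names pfn_bitmap, mark_unsafe and MAX_PFN are hypothetical and are not taken from the swsusp code, which uses its own memory bitmap structures.

```c
#include <limits.h>

#define MAX_PFN        1024UL	/* hypothetical upper bound on pfns */
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS   ((MAX_PFN + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per page frame; a set bit means "unsafe" (was saveable). */
struct pfn_bitmap {
	unsigned long words[BITMAP_WORDS];
};

static void pfn_bitmap_set(struct pfn_bitmap *bm, unsigned long pfn)
{
	bm->words[pfn / BITS_PER_WORD] |= 1UL << (pfn % BITS_PER_WORD);
}

static int pfn_bitmap_test(const struct pfn_bitmap *bm, unsigned long pfn)
{
	return (bm->words[pfn / BITS_PER_WORD] >> (pfn % BITS_PER_WORD)) & 1UL;
}

/* Mark every pfn recorded at suspend time; out-of-range pfns are skipped. */
static void mark_unsafe(struct pfn_bitmap *bm,
			const unsigned long *saved_pfns, unsigned long count)
{
	unsigned long i;

	for (i = 0; i < count; i++)
		if (saved_pfns[i] < MAX_PFN)
			pfn_bitmap_set(bm, saved_pfns[i]);
}
```

Storing pfns rather than kernel virtual addresses is what lets the resume side set bits directly, with no address translation step.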

Next, we allocate as many page frames as needed to store the entire suspend
image and make sure that there will be some extra free "safe" page frames for
the list of PBEs constructed later.  Subsequently, the image is loaded and, if
possible, the data loaded from it are written into their "original" page
frames (i.e. the ones they had occupied before the suspend).

The image data that cannot be written into their "original" page frames are
loaded into "safe" page frames and their "original" kernel virtual addresses,
as well as the addresses of the "safe" pages containing their copies, are
stored in a list of PBEs.  Finally, the list of PBEs is used to copy the
remaining image data into their "original" page frames (this is done
atomically, by the architecture-dependent parts of swsusp).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:49:02 -07:00
Rafael J. Wysocki
dcbb5a54f6 [PATCH] swsusp: clean up suspend header
Remove some things that are no longer used or defined elsewhere from suspend.h
and make the inline version of software_suspend() return the right error code.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:49:00 -07:00
Rafael J. Wysocki
e3920fb42c [PATCH] Disable CPU hotplug during suspend
The current suspend code has to be run on one CPU, so we use the CPU
hotplug to take the non-boot CPUs offline on SMP machines.  However, we
should also make sure that these CPUs will not be enabled by someone else
after we have disabled them.

The functions disable_nonboot_cpus() and enable_nonboot_cpus() are moved to
kernel/cpu.c, because they now refer to some stuff in there that should
better be static.  Also it's better if disable_nonboot_cpus() returns an
error instead of panicking if something goes wrong, and
enable_nonboot_cpus() has no reason to panic(), because the CPUs may have
been enabled by the userland before it tries to take them online.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:59 -07:00
Rafael J. Wysocki
e8eff5ac29 [PATCH] Make swsusp avoid memory holes and reserved memory regions on x86_64
On x86_64 machines with more than 2 GB of RAM there are large memory gaps
(with no corresponding kernel virtual addresses) and reserved memory
regions between areas of usable physical RAM.  Moreover, if CONFIG_FLATMEM
is set, they appear within the normal zone.  swsusp should not try to save
them, so the corresponding page structs have to be marked as 'nosave'.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:58 -07:00
Andrew Morton
546e0d2719 [PATCH] swsusp: read speedup
Implement async reads for swsusp resuming.

Crufty old PIII testbox:
	15.7 MB/s -> 20.3 MB/s

Sony Vaio:
	14.6 MB/s -> 33.3 MB/s

I didn't implement the post-resume bio_set_pages_dirty().  I don't really
understand why resume needs to run set_page_dirty() against these pages.

It might be a worry that this code modifies PG_Uptodate, PG_Error and
PG_Locked against the image pages.  Can this possibly affect the resumed-into
kernel?  Hopefully not, if we're atomically restoring its mem_map?

Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Jens Axboe <axboe@suse.de>
Cc: Laurent Riffard <laurent.riffard@free.fr>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:58 -07:00
Andrew Morton
ab95416035 [PATCH] swsusp: write speedup
Switch the swsusp writeout code from 4k-at-a-time to 4MB-at-a-time.

Crufty old PIII testbox:
	12.9 MB/s -> 20.9 MB/s

Sony Vaio:
	14.7 MB/s -> 26.5 MB/s

The implementation is crude.  A better one would use larger BIOs, but wouldn't
gain any performance.

The memcpys will be mostly pipelined with the IO and basically come for free.

The ENOMEM path has not been tested.  It should be.

Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:58 -07:00
Steven Whitehouse
930631edd4 [PATCH] add DIV_ROUND_UP()
Add the DIV_ROUND_UP() helper macro: divide `n' by `d', rounding up.
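
For illustration, the usual form of such a macro is sketched below. pages_needed() is a hypothetical helper added only to show a typical use, and the 4096-byte page size is an assumption.

```c
/* Divide n by d, rounding the quotient up (intended for positive integers). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: number of 4096-byte pages needed to hold `bytes`. */
static unsigned long pages_needed(unsigned long bytes)
{
	return DIV_ROUND_UP(bytes, 4096UL);
}
```

Note that both arguments are evaluated more than once, so they should be free of side effects.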

Stolen from the gfs2 tree(!) because the swsusp patches need it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:58 -07:00
Andrew Morton
a3bc0dbc81 [PATCH] smp_call_function_single() cleanup
If we're going to implement smp_call_function_single() on three architectures
with the same prototype then it should have a declaration in a
non-arch-specific header file.

Move it into <linux/smp.h>.

Cc: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:56 -07:00
Rusty Russell
2965a0e6da [PATCH] x86: trivial move of ptep_set_access_flags
Move ptep_set_access_flags to be closer to the other ptep accessors, and make
the indentation standard.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:56 -07:00
Rusty Russell
6049742dbc [PATCH] x86: trivial move of __HAVE macros in i386 pagetable headers
Move the __HAVE_ARCH_PTEP defines to accompany the function definitions.
Anything else is just a complete nightmare to track through the 2/3-level
paging code, and this caused duplicate definitions to be needed (pte_same),
which could have easily been taken care of with the asm-generic pgtable
functions.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:56 -07:00
Rusty Russell
673eae8230 [PATCH] x86: trivial pgtable.h __ASSEMBLY__ move
Parsing generic pgtable.h in assembler is simply crazy.  None of this file is
needed in assembler code, and C inline functions and structures routinely break
one or more different compiles.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:56 -07:00
Ian Campbell
5091e74684 [PATCH] Translate asm version of ELFNOTE macro into preprocessor macro
I've come across some problems with the assembly version of the ELFNOTE
macro currently in -mm. (in
x86-put-note-sections-into-a-pt_note-segment-in-vmlinux.patch)

The first is that older gas does not support :varargs in .macro
definitions (in my testing 2.17 does while 2.15 does not, I don't know
when it became supported). The Changes file says binutils >= 2.12 so I
think we need to avoid using it. There are no other uses in mainline or
-mm. Old gas appears to just ignore it so you get "too many arguments"
type errors.

Secondly it seems that passing strings as arguments to assembler macros
is broken without varargs. It looks like they get unquoted or each
character is treated as a separate argument or something and this causes
all manner of grief. I think this is because of the use of -traditional
when compiling assembly files.

Therefore I have translated the assembler macro into a pre-processor
macro.

I added the desctype as a separate argument instead of including it with
the descdata as the previous version did since -traditional means the
ELFNOTE definition after the #else needs to have the same number of
arguments (I think so anyway, the -traditional CPP semantics are pretty
fscking strange!).

With this patch I am able to define elfnotes in assembly like this with
both old and new assemblers.

	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz, "linux")
	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION,  .asciz, "2.6")
	ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,    .asciz, "xen-3.0")
	ELFNOTE(Xen, XEN_ELFNOTE_VIRT_BASE,      .long,  __PAGE_OFFSET)

Which seems reasonable enough.

Signed-off-by: Ian Campbell <ian.campbell@xensource.com>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:56 -07:00
Jeremy Fitzhardinge
9c9b8b3882 [PATCH] x86: put .note.* sections into a PT_NOTE segment in vmlinux
This patch will pack any .note.* section into a PT_NOTE segment in the output
file.

To do this, we tell ld that we need a PT_NOTE segment.  This requires us to
start explicitly mapping sections to segments, so we also need to explicitly
create PT_LOAD segments for text and data, and map the sections to them
appropriately.  Fortunately, each section will default to its previous
section's segment, so it doesn't take many changes to vmlinux.lds.S.

This only changes i386 for now, but I presume the corresponding changes for
other architectures will be as simple.

This change also adds <linux/elfnote.h>, which defines C and Assembler macros
for actually creating ELF notes.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:55 -07:00
Jeremy Fitzhardinge
052e79941a [PATCH] x86: make __FIXADDR_TOP variable to allow it to make space for a hypervisor
Make __FIXADDR_TOP a variable, so that it can be set to not get in the way of
address space a hypervisor may want to reserve.

Original patch by Gerd Hoffmann <kraxel@suse.de>

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:55 -07:00
Rusty Russell
9f093394d7 [PATCH] x86: roll all the cpuid asm into one __cpuid call
It's a little neater, and also means only one place to patch for
paravirtualization.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:55 -07:00
Chris Wright
027a8c7e60 [PATCH] x86: implement always-locked bit ops, for memory shared with an SMP hypervisor
Add "always lock'd" implementations of set_bit, clear_bit and change_bit and
the corresponding test_and_ functions.  Also add "always lock'd"
implementation of cmpxchg.  These give guaranteed strong synchronisation and
are required for non-SMP kernels running on an SMP hypervisor.
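
A portable sketch of what "always locked" means, using C11 atomics as a stand-in for the x86 implementation. The sync_ function names and the use of <stdatomic.h> are assumptions made for illustration; the patch itself works at the instruction level.

```c
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Each operation is atomic regardless of how the kernel was configured. */
static void sync_set_bit(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_or(&addr[nr / BITS_PER_LONG], 1UL << (nr % BITS_PER_LONG));
}

static void sync_clear_bit(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_and(&addr[nr / BITS_PER_LONG], ~(1UL << (nr % BITS_PER_LONG)));
}

static void sync_change_bit(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_xor(&addr[nr / BITS_PER_LONG], 1UL << (nr % BITS_PER_LONG));
}

/* Returns the previous value of the bit (0 or 1). */
static int sync_test_and_set_bit(unsigned int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);
	unsigned long old = atomic_fetch_or(&addr[nr / BITS_PER_LONG], mask);

	return (old & mask) != 0;
}
```

The point of a separate family of operations is that a uniprocessor guest kernel may otherwise use unlocked variants, which are not safe on memory that other physical CPUs can touch.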

Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:55 -07:00
Rolf Eike Beer
3a750363e6 [PATCH] Use BUG_ON(foo) instead of "if (foo) BUG()" in include/asm-i386/dma-mapping.h
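A userspace sketch of the transformation named in the subject line. BUG() is replaced here by a counter so the example can run outside the kernel; the DMA_NONE value and the check_direction_* function names are hypothetical and only illustrate the kind of check involved.

```c
/* Stand-in for the kernel's BUG(): count hits instead of halting. */
static int bug_hits;
#define BUG()         (bug_hits++)
#define BUG_ON(cond)  do { if (cond) BUG(); } while (0)

enum { DMA_NONE = 3 };	/* hypothetical value for illustration */

/* Before: open-coded test followed by BUG(). */
static void check_direction_old(int direction)
{
	if (direction == DMA_NONE)
		BUG();
}

/* After: the same check expressed as a single BUG_ON(). */
static void check_direction_new(int direction)
{
	BUG_ON(direction == DMA_NONE);
}
```

Both forms behave identically; the second is shorter and states the invariant on one line.
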
Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:55 -07:00
Haavard Skinnemoen
bc157b7596 [PATCH] AVR32 MTD: Static Memory Controller driver
This patchset adds the necessary drivers and infrastructure to access the
external flash on the ATSTK1000 board through the MTD subsystem.  With this
stuff in place, it will be possible to use a jffs2 filesystem stored in the
external flash as a root filesystem.  It might also be possible to update the
boot loader if you drop the write protection of partition 0.

As suggested by David Woodhouse, I reworked the patches to use the physmap
driver instead of introducing a separate mapping driver for the ATSTK1000.
I've also cleaned up the hsmc header by removing useless comments and
converting spaces to tabs (my headerfile generator needs some work.)

Unfortunately, I couldn't unlock the flash in fixup_use_atmel_lock because the
erase regions hadn't been set up yet, so I had to do it from cfi_amdstd_setup
instead.

This patch:

This adds a simple API for configuring the static memory controller along with
an implementation for the Atmel HSMC.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:54 -07:00
Haavard Skinnemoen
5f97f7f940 [PATCH] avr32 architecture
This adds support for the Atmel AVR32 architecture as well as the AT32AP7000
CPU and the AT32STK1000 development board.

AVR32 is a new high-performance 32-bit RISC microprocessor core, designed for
cost-sensitive embedded applications, with particular emphasis on low power
consumption and high code density.  The AVR32 architecture is not binary
compatible with earlier 8-bit AVR architectures.

The AVR32 architecture, including the instruction set, is described by the
AVR32 Architecture Manual, available from

http://www.atmel.com/dyn/resources/prod_documents/doc32000.pdf

The Atmel AT32AP7000 is the first CPU implementing the AVR32 architecture.  It
features a 7-stage pipeline, 16KB instruction and data caches and a full
Memory Management Unit.  It also comes with a large set of integrated
peripherals, many of which are shared with the AT91 ARM-based controllers from
Atmel.

Full data sheet is available from

http://www.atmel.com/dyn/resources/prod_documents/doc32003.pdf

while the CPU core implementation including caches and MMU is documented by
the AVR32 AP Technical Reference, available from

http://www.atmel.com/dyn/resources/prod_documents/doc32001.pdf

Information about the AT32STK1000 development board can be found at

http://www.atmel.com/dyn/products/tools_card.asp?tool_id=3918

including a BSP CD image with an earlier version of this patch, development
tools (binaries and source/patches) and a root filesystem image suitable for
booting from SD card.

Alternatively, there's a preliminary "getting started" guide available at
http://avr32linux.org/twiki/bin/view/Main/GettingStarted which provides links
to the sources and patches you will need in order to set up a cross-compiling
environment for avr32-linux.

This patch, as well as the other patches included with the BSP and the
toolchain patches, is actively supported by Atmel Corporation.

[dmccr@us.ibm.com: Fix more pxx_page macro locations]
[bunk@stusta.de: fix `make defconfig']
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Dave McCracken <dmccr@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:54 -07:00
Ralf Baechle
53e62d3aaa [PATCH] Alchemy: Delete unused pt_regs * argument from au1xxx_dbdma_chan_alloc
The third argument of au1xxx_dbdma_chan_alloc's callback function is not
used anywhere.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:54 -07:00
David Howells
cf134483b2 [PATCH] FRV: Optimise ffs()
Optimise ffs(x) by using fls(x & -x) which we optimise to use the SCAN
instruction.
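
The identity that makes this work is ffs(x) = fls(x & -x): the expression x & -x keeps only the lowest set bit, so the last set bit of the result is the first set bit of x. A portable sketch follows; the loop-based fls_sketch() is a stand-in for the SCAN-based version and both function names are hypothetical.

```c
/* Find last (most significant) set bit, counting from 1; returns 0 for x == 0. */
static int fls_sketch(unsigned int x)
{
	int r = 0;

	while (x) {
		x >>= 1;
		r++;
	}
	return r;
}

/* Find first (least significant) set bit, counting from 1; returns 0 for x == 0. */
static int ffs_sketch(unsigned int x)
{
	/* x & -x isolates the lowest set bit of x. */
	return fls_sketch(x & -x);
}
```

With this convention a word with only bit 31 set yields 32 from both functions.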

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:54 -07:00
David Howells
a8ad27d03f [PATCH] FRV: Implement fls64()
Implement fls64() for FRV without recourse to conditional jumps.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:54 -07:00
David Howells
92fc707208 [PATCH] FRV: Fix fls() to handle bit 31 being set correctly
Fix FRV fls() to handle bit 31 being set correctly (it should return 32 not 0).

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:53 -07:00
David Howells
af8c65b57a [PATCH] FRV: permit __do_IRQ() to be dispensed with
Permit __do_IRQ() to be dispensed with based on a configuration option.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:53 -07:00
David Howells
1bcbba3060 [PATCH] FRV: Use the generic IRQ stuff
Make the FRV arch use the generic IRQ code rather than having its own
routines for doing so.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:53 -07:00
Stephen Smalley
9a2f44f01a [PATCH] selinux: replace ctxid with sid in selinux_audit_rule_match interface
Replace ctxid with sid in selinux_audit_rule_match interface for
consistency with other interfaces.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:52 -07:00
Stephen Smalley
1a70cd40cb [PATCH] selinux: rename selinux_ctxid_to_string
Rename selinux_ctxid_to_string to selinux_sid_to_string to be
consistent with other interfaces.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:52 -07:00
Stephen Smalley
62bac0185a [PATCH] selinux: eliminate selinux_task_ctxid
Eliminate selinux_task_ctxid since it duplicates selinux_task_get_sid.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:52 -07:00
Christoph Lameter
89fa30242f [PATCH] NUMA: Add zone_to_nid function
There are many places where we need to determine the node of a zone.
Currently we use a difficult to read sequence of pointer dereferencing.
Put that into an inline function and use throughout VM.  Maybe we can find
a way to optimize the lookup in the future.
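
A minimal sketch of such a helper. The two structures are cut down to the fields needed here, and the field names zone_pgdat and node_id are assumptions about the layout rather than something stated in this message.

```c
/* Reduced stand-ins for the kernel structures. */
struct pglist_data {
	int node_id;
};

struct zone {
	struct pglist_data *zone_pgdat;
};

/* Hide the pointer chase behind one readable name. */
static inline int zone_to_nid(const struct zone *zone)
{
	return zone->zone_pgdat->node_id;
}
```

Keeping the lookup in one place also means a future optimisation only has to change this function.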

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:52 -07:00
Christoph Lameter
0ff38490c8 [PATCH] zone_reclaim: dynamic slab reclaim
Currently one can enable slab reclaim by setting an explicit option in
/proc/sys/vm/zone_reclaim_mode.  Slab reclaim is then used as a final
option if the freeing of unmapped file backed pages is not enough to free
enough pages to allow a local allocation.

However, that means that the slab can grow excessively and that most memory
of a node may be used by slabs.  We have had a case where a machine with
46GB of memory was using 40-42GB for slab.  Zone reclaim was effective in
dealing with pagecache pages.  However, slab reclaim was only done during
global reclaim (which is a bit rare on NUMA systems).

This patch implements slab reclaim during zone reclaim.  Zone reclaim
occurs if there is a danger of an off node allocation.  At that point we

1. Shrink the per node page cache if the number of pagecache
   pages is more than min_unmapped_ratio percent of pages in a zone.

2. Shrink the slab cache if the number of the node's reclaimable slab pages
   (patch depends on earlier one that implements that counter)
   is more than min_slab_ratio (a new /proc/sys/vm tunable).

The shrinking of the slab cache is a bit problematic since it is not node
specific.  So we simply calculate what point in the slab we want to reach
(current per node slab use minus the number of pages that need to be
allocated) and then repeatedly run the global reclaim until that is
unsuccessful or we have reached the limit.  I hope we will have zone based
slab reclaim at some point which will make that easier.
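
The loop described above can be sketched as follows. struct node_state and shrink_slab_pass() are hypothetical stand-ins: the real shrinker is global and its effect on a single node is not predictable, which is why the loop needs both exit conditions.

```c
/* Hypothetical model of one node's reclaimable slab usage. */
struct node_state {
	unsigned long slab_pages;	/* current per-node slab use */
	unsigned long per_pass;		/* pages one shrink pass can free */
};

/* Stand-in for one pass of the global shrinker; returns pages freed. */
static unsigned long shrink_slab_pass(struct node_state *n)
{
	unsigned long freed = n->slab_pages < n->per_pass ?
			      n->slab_pages : n->per_pass;

	n->slab_pages -= freed;
	return freed;
}

/* Shrink until the target is reached or a pass frees nothing. */
static unsigned long reclaim_slab(struct node_state *n, unsigned long nr_pages)
{
	unsigned long start = n->slab_pages;
	unsigned long limit = start > nr_pages ? start - nr_pages : 0;

	while (n->slab_pages > limit && shrink_slab_pass(n) > 0)
		;
	return start - n->slab_pages;
}
```

For example, a node with 100 slab pages that needs 20 pages and frees 8 per pass stops after three passes, at 76 pages.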

The default for the min_slab_ratio is 5%

Also remove the slab option from /proc/sys/vm/zone_reclaim_mode.

[akpm@osdl.org: cleanups]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:51 -07:00
Christoph Lameter
972d1a7b14 [PATCH] ZVC: Support NR_SLAB_RECLAIMABLE / NR_SLAB_UNRECLAIMABLE
Remove the atomic counter for slab_reclaim_pages and replace the counter
and NR_SLAB with two ZVC counters that account for unreclaimable and
reclaimable slab pages: NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE.

Change the check in vmscan.c to refer to NR_SLAB_RECLAIMABLE.  The
intent seems to be to check for slab pages that could be freed.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:51 -07:00
Christoph Lameter
8417bba4b1 [PATCH] Replace min_unmapped_ratio by min_unmapped_pages in struct zone
*_pages is a better description of the role of the variable.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:51 -07:00
Dave McCracken
46a82b2d55 [PATCH] Standardize pxx_page macros
One of the changes necessary for shared page tables is to standardize the
pxx_page macros.  pte_page and pmd_page have always returned the struct
page associated with their entry, while pte_page_kernel and pmd_page_kernel
have returned the kernel virtual address.  pud_page and pgd_page, on the
other hand, return the kernel virtual address.

Shared page tables need pud_page and pgd_page to return the actual page
structures.  There are very few actual users of these functions, so it is
simple to standardize their usage.

Since this is basic cleanup, I am submitting these changes as a standalone
patch.  Per Hugh Dickins' comments about it, I am also changing the
pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning.

Signed-off-by: Dave McCracken <dmccr@us.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:51 -07:00
Christoph Lameter
980128f223 [PATCH] Define easier to handle GFP_THISNODE
In many places we will need to use the same combination of flags.  Specify
a single GFP_THISNODE definition for ease of use in gfp.h.
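
A sketch of what such a combined definition might look like. The message does not list the flags being combined, so the choice of __GFP_NOWARN and __GFP_NORETRY is an assumption, and the bit values are placeholders rather than the kernel's real values.

```c
/* Placeholder bit values, for illustration only. */
#define __GFP_NOWARN    0x1u	/* do not log allocation failures */
#define __GFP_NORETRY   0x2u	/* fail instead of retrying */
#define __GFP_THISNODE  0x4u	/* no fallback to other nodes */

/* One name for the combination callers would otherwise repeat. */
#define GFP_THISNODE  (__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)
```

A caller then ORs GFP_THISNODE into its mask instead of spelling out the individual bits each time.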

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:50 -07:00
Christoph Lameter
9b819d204c [PATCH] Add __GFP_THISNODE to avoid fallback to other nodes and ignore cpuset/memory policy restrictions
Add a new gfp flag __GFP_THISNODE to avoid fallback to other nodes.  This
flag is essential if a kernel component requires memory to be located on a
certain node.  It will be needed for alloc_pages_node() to force allocation
on the indicated node and for alloc_pages() to force allocation on the
current node.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:50 -07:00
Christoph Hellwig
dbe5e69d2d [PATCH] slab: optimize kmalloc_node the same way as kmalloc
[akpm@osdl.org: export fix]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:49 -07:00
Nick Piggin
da6052f7b3 [PATCH] update some mm/ comments
Let's try to keep mm/ comments more useful and up to date. This is a start.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:49 -07:00
Heiko Carstens
dfd54cbcc0 [PATCH] bootmem: use MAX_DMA_ADDRESS instead of LOW32LIMIT
Introduce ARCH_LOW_ADDRESS_LIMIT which can be set per architecture to
override the 4GB default limit used by the bootmem allocator within
__alloc_bootmem_low() and __alloc_bootmem_low_node().  E.g.  s390 needs a
2GB limit instead of 4GB.
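
The override pattern can be sketched like this. The constants and the below_low_limit() helper are illustrative assumptions; only the macro name comes from the description above.

```c
/*
 * An architecture header would define the limit before this point,
 * e.g. for a 2GB limit:
 *
 *	#define ARCH_LOW_ADDRESS_LIMIT 0x7fffffffUL
 */
#ifndef ARCH_LOW_ADDRESS_LIMIT
#define ARCH_LOW_ADDRESS_LIMIT 0xffffffffUL	/* default: 4GB */
#endif

/* Hypothetical check: may a "low" allocation use this address? */
static int below_low_limit(unsigned long long addr)
{
	return addr <= ARCH_LOW_ADDRESS_LIMIT;
}
```

Architectures that are happy with the default need no change at all, since the #ifndef supplies it.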

Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:49 -07:00
Nick Piggin
db37648cd6 [PATCH] mm: non syncing lock_page()
lock_page needs the caller to have a reference on the page->mapping inode
due to sync_page, ergo set_page_dirty_lock is obviously buggy according to
its comments.

Solve it by introducing a new lock_page_nosync which does not do a sync_page.

akpm: unpleasant solution to an unpleasant problem.  If it goes wrong it could
cause great slowdowns while the lock_page() caller waits for kblockd to
perform the unplug.  And if a filesystem has special sync_page() requirements
(none presently do), permanent hangs are possible.

otoh, set_page_dirty_lock() is usually (always?) called against userspace
pages.  They are always up-to-date, so there shouldn't be any pending read I/O
against these pages.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:48 -07:00
Martin Peschke
7ff6f08295 [PATCH] CPU hotplug compatible alloc_percpu()
This patch splits alloc_percpu() up into two phases.  Likewise for
free_percpu().  This allows clients to limit initial allocations to online
cpu's, and to populate or depopulate per-cpu data at run time as needed:

  struct my_struct *obj, *ptr;

  /* initial allocation for online cpu's */
  obj = percpu_alloc(sizeof(struct my_struct), GFP_KERNEL);

  ...

  /* populate per-cpu data for cpu coming online */
  ptr = percpu_populate(obj, sizeof(struct my_struct), GFP_KERNEL, cpu);

  ...

  /* access per-cpu object */
  ptr = percpu_ptr(obj, smp_processor_id());

  ...

  /* depopulate per-cpu data for cpu going offline */
  percpu_depopulate(obj, cpu);

  ...

  /* final removal */
  percpu_free(obj);

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:47 -07:00
Martin Schwidefsky
8bc719d3ca [PATCH] out of memory notifier
Add a notifier chain to the out of memory killer.  If one of the registered
callbacks could release some memory, do not kill the process but return and
retry the allocation that forced the oom killer to run.
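
The control flow can be sketched in plain C as below. This is a simplified model with a fixed-size callback array; the kernel uses its notifier chain infrastructure instead, and every name here (oom_chain, register_oom_callback, out_of_memory_sketch, balloon_oom_callback) is hypothetical.

```c
#include <stddef.h>

#define MAX_OOM_CALLBACKS 8	/* hypothetical limit for the sketch */

/* A callback adds the number of pages it released to *freed. */
typedef void (*oom_callback)(unsigned long *freed);

static oom_callback oom_chain[MAX_OOM_CALLBACKS];
static size_t oom_chain_len;

static int register_oom_callback(oom_callback cb)
{
	if (oom_chain_len == MAX_OOM_CALLBACKS)
		return -1;
	oom_chain[oom_chain_len++] = cb;
	return 0;
}

/* Returns 1 if a process must be killed, 0 if the allocation should be retried. */
static int out_of_memory_sketch(void)
{
	unsigned long freed = 0;
	size_t i;

	for (i = 0; i < oom_chain_len; i++)
		oom_chain[i](&freed);
	return freed == 0;
}

/* Example callback: a balloon driver that deflates completely. */
static unsigned long balloon_pages = 16;

static void balloon_oom_callback(unsigned long *freed)
{
	*freed += balloon_pages;
	balloon_pages = 0;
}
```

Once the balloon is empty the callback reports nothing freed, so the next out-of-memory event falls through to the normal kill path.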

The purpose of the notifier is to add a safety net in the presence of
memory ballooners.  If the resource manager inflated the balloon to a size
where memory allocations can not be satisfied anymore, it is better to
deflate the balloon a bit instead of killing processes.

The implementation for the s390 ballooner is included.

[akpm@osdl.org: cleanups]
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:47 -07:00
Christoph Lameter
19655d3487 [PATCH] linearly index zone->node_zonelists[]
I wonder why we need this bitmask indexing into zone->node_zonelists[]?

We always start with the highest zone and then include all lower zones
if we build zonelists.

Are there really cases where we need allocation from ZONE_DMA or
ZONE_HIGHMEM but not ZONE_NORMAL? It seems that the current implementation
of highest_zone() makes that already impossible.

If we go linear on the index then gfp_zone() == highest_zone() and a lot
of definitions fall by the wayside.

We can now revert back to the use of gfp_zone() in mempolicy.c ;-)

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:47 -07:00
Christoph Lameter
2f6726e54a [PATCH] Apply type enum zone_type
After we have done this we can now do some typing cleanup.

The memory policy layer keeps a policy_zone that specifies
the zone that gets memory policies applied. This variable
can now be of type enum zone_type.

The check_highest_zone function and the build_zonelists function must
then also take an enum zone_type parameter.

Plus there are a number of loops over zones that also should use
zone_type.

We run into some troubles at some points with functions that need a
zone_type variable to become -1. Fix that up.

[pj@sgi.com: fix set_mempolicy() crash]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:47 -07:00
Christoph Lameter
4e4785bcf0 [PATCH] mempolicies: fix policy_zone check
There is a check in zonelist_policy() that compares pieces of the bitmap
obtained from a gfp mask via GFP_ZONETYPES with a zone number.

The bitmap is an ORed mask of __GFP_DMA, __GFP_DMA32 and __GFP_HIGHMEM.
The policy_zone is a zone number with the possible values of ZONE_DMA,
ZONE_DMA32, ZONE_HIGHMEM and ZONE_NORMAL. These are two different domains
of values.

For some reason this seemed to work before the zone reduction patchset (it
definitely works on SGI boxes since we just have one zone and the check
cannot fail).

With the zone reduction patchset this check definitely fails on systems
with two zones if the system actually has memory in both zones.

This is because ZONE_NORMAL is selected using no __GFP flag at
all and thus gfp_zone(gfpmask) == 0. ZONE_DMA is selected when __GFP_DMA
is set. __GFP_DMA is 0x01.  So gfp_zone(gfpmask) == 1.

policy_zone is set to ZONE_NORMAL (==1) if ZONE_NORMAL and ZONE_DMA are
populated.

For ZONE_NORMAL gfp_zone(<no __GFP_DMA>) yields 0 which is <
policy_zone(ZONE_NORMAL) and so policy is not applied to regular memory
allocations!

Instead gfp_zone(__GFP_DMA) == 1 which results in policy being applied
to DMA allocations!

What we really want in that place is to establish the highest allowable
zone for a given gfp_mask. If the highest zone is higher or equal to the
policy_zone then memory policies need to be applied. We have such
a highest_zone() function in page_alloc.c.

So move the highest_zone() function from mm/page_alloc.c into
include/linux/gfp.h.  On the way we simplify the function and use the new
zone_type that was also introduced with the zone reduction patchset plus we
also specify the right type for the gfp flags parameter.
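
The mismatch between the two value domains can be reproduced with a small two-zone model in user-space C. This is a sketch under stated assumptions, not the kernel code: the MODEL_ macros and the model_ and policy_applies_ functions are hypothetical names, and the numeric values are the ones quoted in this message (__GFP_DMA == 0x01, ZONE_NORMAL == 1).

```c
#include <assert.h>

/* Simplified two-zone model; numbering follows the commit message. */
enum zone_type { ZONE_DMA = 0, ZONE_NORMAL = 1 };

#define MODEL_GFP_DMA      0x01u  /* stands in for __GFP_DMA */
#define MODEL_GFP_ZONEMASK 0x01u  /* only one zone bit in this model */

/* Old-style lookup: returns raw zone bits of the mask, not a zone number. */
static unsigned int model_gfp_zone(unsigned int gfp_flags)
{
	return gfp_flags & MODEL_GFP_ZONEMASK;
}

/* Highest zone an allocation with these flags is allowed to use. */
static enum zone_type model_highest_zone(unsigned int gfp_flags)
{
	if (gfp_flags & MODEL_GFP_DMA)
		return ZONE_DMA;
	return ZONE_NORMAL;
}

/* Broken check: compares flag bits against a zone number. */
static int policy_applies_old(unsigned int gfp_flags,
			      enum zone_type policy_zone)
{
	return model_gfp_zone(gfp_flags) >= (unsigned int)policy_zone;
}

/* Fixed check: compares zone number against zone number. */
static int policy_applies_new(unsigned int gfp_flags,
			      enum zone_type policy_zone)
{
	assert(policy_zone == ZONE_DMA || policy_zone == ZONE_NORMAL);
	return model_highest_zone(gfp_flags) >= policy_zone;
}
```

With policy_zone set to ZONE_NORMAL, the old check skips policy for a plain allocation (flags 0) and applies it to a DMA allocation, which is the inversion described above. The new check gives the opposite, intended result for both cases.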

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:47 -07:00
Christoph Lameter
27bf71c2a7 [PATCH] reduce MAX_NR_ZONES: remove display of counters for unconfigured zones
eventcounters: Do not display counters for zones that are not available on an
arch

Do not define or display counters for the DMA32 and the HIGHMEM zone if such
zones were not configured.

[akpm@osdl.org: s390 fix]
[heiko.carstens@de.ibm.com: s390 fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:47 -07:00
Christoph Lameter
e53ef38d05 [PATCH] reduce MAX_NR_ZONES: make ZONE_HIGHMEM optional
Make ZONE_HIGHMEM optional

- ifdef out code and definitions related to CONFIG_HIGHMEM

- __GFP_HIGHMEM falls back to normal allocations if there is no
  ZONE_HIGHMEM

- GFP_ZONEMASK becomes 0x01 if there is no DMA32 and no HIGHMEM
  zone.
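
How the zone mask could shrink with the configuration can be sketched with the preprocessor. This is an illustrative model only: MODEL_GFP_ZONEMASK and model_zone_bits are hypothetical names, the 0x01 and 0x03 values come from this series' commit messages, and the 0x07 value for a DMA32 configuration is an assumption (one extra flag bit), not something stated here.

```c
#include <assert.h>

/* Pick the narrowest mask that still covers every configured zone flag. */
#if defined(CONFIG_ZONE_DMA32)
#define MODEL_GFP_ZONEMASK 0x07u  /* assumed: DMA, HIGHMEM and DMA32 bits */
#elif defined(CONFIG_HIGHMEM)
#define MODEL_GFP_ZONEMASK 0x03u  /* DMA and HIGHMEM bits */
#else
#define MODEL_GFP_ZONEMASK 0x01u  /* DMA bit only */
#endif

/* Compile-time sanity check: the model never uses more than three bits. */
_Static_assert(MODEL_GFP_ZONEMASK <= 0x07u, "zone mask wider than expected");

/* Extract the zone-selecting bits from a set of allocation flags. */
static unsigned int model_zone_bits(unsigned int gfp_flags)
{
	return gfp_flags & MODEL_GFP_ZONEMASK;
}
```

Built without either config symbol defined, the mask is 0x01, so any higher flag bits passed to model_zone_bits() are ignored when selecting a zone.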

[jdike@addtoit.com: build fix]
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:46 -07:00
Christoph Lameter
fb0e7942bd [PATCH] reduce MAX_NR_ZONES: make ZONE_DMA32 optional
Make ZONE_DMA32 optional

- Add #ifdefs around ZONE_DMA32 specific code and definitions.

- Add CONFIG_ZONE_DMA32 config option and use that for x86_64
  that alone needs this zone.

- Remove the use of CONFIG_DMA_IS_DMA32 and CONFIG_DMA_IS_NORMAL
  for ia64 and fix up the way per node ZVCs are calculated.

- Fall back to prior GFP_ZONEMASK of 0x03 if there is no
  DMA32 zone.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:46 -07:00
Christoph Lameter
2f1b624868 [PATCH] reduce MAX_NR_ZONES: use enum to define zones, reformat and comment
Use enum for zones and reformat zones dependent information

Add comments explaining the use of zones and add a zones_t type for zone
numbers.

Line up information that will be #ifdefd by the following patches.

[akpm@osdl.org: comment cleanups]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:46 -07:00
Christoph Lameter
c1f60a5a41 [PATCH] reduce MAX_NR_ZONES: move HIGHMEM counters into highmem.c/.h
Move totalhigh_pages and nr_free_highpages() into highmem.c/.h

Move the totalhigh_pages definition into highmem.c/.h.  Move the
nr_free_highpages function into highmem.c

[yoichi_yuasa@tripeaks.co.jp: build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:46 -07:00
Franck Bui-Huu
f71bf0cac7 [PATCH] bootmem: miscellaneous coding style fixes
Fix various coding style issues, especially useless spaces.  For
example, '*' goes next to the function name.

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:45 -07:00
Franck Bui-Huu
e786e86a54 [PATCH] bootmem: remove useless headers inclusions
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:45 -07:00
Franck Bui-Huu
bb0923a668 [PATCH] bootmem: limit to 80 columns width
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:45 -07:00
Franck Bui-Huu
71fb2e8f87 [PATCH] bootmem: remove useless parentheses in bootmem header file
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:45 -07:00
Franck Bui-Huu
2d1a07d487 [PATCH] bootmem: remove useless __init in header file
__init in headers is pretty useless because the compiler doesn't check it, and
they get out of sync relatively frequently.  So if you see an __init in a
header file, it's quite unreliable and you need to check the definition
anyway.

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:45 -07:00
keith mannthey
9102330005 [PATCH] convert i386 NUMA KVA space to bootmem
Address a long standing issue of booting with an initrd on an i386 numa
system.  Currently (and always) the numa kva area is mapped into low memory
by finding the end of low memory and moving that mark down (thus creating
space for the kva).  The issue with this is that Grub loads initrds into
this similar space so when the kernel checks the initrd it finds it outside
max_low_pfn and disables it (it thinks the initrd is not mapped into usable
memory) thus initrd enabled kernels can't boot i386 numa :(

My solution to the problem just converts the numa kva area to use the
bootmem allocator to save its area (instead of moving the end of low
memory).  Using bootmem allows the kva area to be mapped into more diverse
addresses (not just the end of low memory) and enables the kva area to be
mapped below the initrd if present.

I have tested this patch on numaq(no initrd) and summit(initrd) i386 numa
based systems.

[akpm@osdl.org: cleanups]
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:45 -07:00
Adrian Bunk
b221385bc4 [PATCH] mm/: make functions static
This patch makes the following needlessly global functions static:
 - slab.c: kmem_find_general_cachep()
 - swap.c: __page_cache_release()
 - vmalloc.c: __vmalloc_node()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:45 -07:00