Use offsetof() instead of home-brewed version.
Based upon initial patch by Steven Whitehouse and suggestions
by Ben Hutchings.
Signed-off-by: David S. Miller <davem@davemloft.net>
Problem observed:
In IPv6, in the presence of multiple routers candidates to
default gateway in one segment, each sending a different
value of preference, the Linux hosts connected to the
segment weren't selecting the right one in all the
combinations possible of LOW/MEDIUM/HIGH preference.
This patch changes two files:
include/linux/icmpv6.h
Get the "router_pref" bitfield in the right place
(as RFC4191 says), named the bit left with this fix as
"home_agent" (RFC3775 say that's his function)
net/ipv6/ndisc.c
Corrects the binary logic behind the updating of the
router preference in the flags of the routing table
Result:
With this two fixes applied, the default route used by
the system was to consistent with the rules mentioned
in RFC4191 in case of changes in the value of preference
in router advertisements
Signed-off-by: Pedro Ribeiro <pribeiro@net.ipl.pt>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported by Jay Fenlason: ioctl() did not return as intended
- the size of data read into ioctl_send_request,
- the number of datagrams enqueued by ioctl_queue_iso.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
queuecommand() looked at the remote and local node IDs before it read
the bus generation. The corresponding race with sbp2_reconnect updating
these data was probably impossible to happen though because the current
code blocks the SCSI layer during reconnection. However, better safe
than sorry, especially if someone later improves the code to not block
the SCSI layer.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
1. We don't need to round the SBP-2 segment size limit down to a
multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
ensure quadlet alignment (0xffff -> 0xfffc).
2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
and the block IO layer about the restriction. This way we can
remove the size checks and segment splitting in the queuecommand
path.
This assumes that no other code in the firewire stack uses
dma_map_sg() with conflicting requirements. It furthermore assumes
that the controller device's platform actually allows us to set the
segment size to our liking. Assert the latter with a BUG_ON().
3. Also use blk_queue_max_segment_size() to tell the block IO layer
about it. It cannot know it because our scsi_add_host() does not
point to the FireWire controller's device.
Thanks to Grant Grundler and FUJITA Tomonori for advice.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Share code between fw_send_request + wait_for_completion callers.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Addendum:
Removes an unnecessary struct and an ununsed retry loop.
Calls it fw_run_transaction() instead of fw_send_request_sync().
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Kristian Høgsberg <krh@redhat.com>
There are situations when nodes vanish from the bus and come back in
quickly thereafter:
- When certain bus-powered hubs are plugged in,
- when certain disk enclosures are switched from self-power to bus
power or vice versa and break the daisy chain during the transition,
- when the user plugs a cable out and quickly plugs it back in, e.g.
to reorder a daisy chain (works on Mac OS X if done quickly enough),
- when certain hubs temporarily malfunction during high bus traffic.
The ieee1394 driver's nodemgr already contained a function to set
vanished nodes aside into "limbo"; i.e. they wouldn't actually be
deleted right away. (In fact, only unloading the driver or writing into
an obscure sysfs attribute would delete them eventually.) If nodes
reappeared later, they would be resurrected out of limbo.
Moving nodes into and out of limbo was accompanied with calling the
.suspend() and .resume() driver methods of the drivers which were bound
to a respective node's unit directories. Not only is this somewhat
strange due to the intended use of these driver methods for power
management, also the sbp2 driver in particular does not implement
.suspend() and .resume(). Hence sbp2 would be disconnected from devices
in situations as listed above.
We now:
- leave drivers bound when nodes go into limbo,
- call the drivers' .update() when nodes come out of limbo,
- automatically delete in-limbo nodes 3 seconds after the last
bus reset and bus rescan.
- Because of the automatic removal, the now obsolete bus attribute
/sys/bus/ieee1394/destroy_node is removed.
This especially lets sbp2 survive brief disconnections. You can for
example yank a disk's cable and plug it back in while reading the
respective disk with dd, but dd will happily continue as if nothing
happened.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Remove useless pointer type casts.
Remove unnecessary hi->host indirection where only host is used.
Remove an unnecessary WARN_ON.
Change a few names.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
init->channel and v.buffer are unsigned and tests for < 0 therefore
always false. gcc knows this and eliminates the code, but anyway...
Reported by Roel Kluin.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Application programs should use a libraw1394 handle only in a single
thread. The raw1394 driver was apparently relying on this, because it
did nothing to protect its fi->state variable from corruption due to
concurrent accesses.
We now serialize the fi->state accesses. This affects the write() path.
We re-use the state_mutex which was introduced to protect fi->iso_state
accesses in the ioctl() path. These paths and accesses are independent
of each other, hence separate mutexes could be used. But I don't see
much benefit in that.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Refactor the ioctl dispatcher in order to move a fraction of it out of
the section which is serialized by fi->state_mutex. This is not so much
about performance but more about self-documentation: The mutex_lock()/
mutex_unlock() calls are now closer to the data accesses which the mutex
protects, i.e. to the iso_state switch.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This removes the last usage of the Big Kernel Lock from the ieee1394
stack, i.e. from raw1394's (unlocked_)ioctl and compat_ioctl.
The ioctl()s don't need to take the BKL, but they need to be serialized
per struct file *. In particular, accesses to ->iso_state need to be
serial. We simply use a blocking mutex for this purpose because
libraw1394 does not use O_NONBLOCK. In practice, there is no lock
contention anyway because most if not all libraw1394 clients use a
libraw1394 handle only in a single thread.
mmap() also accesses ->iso_state. Until now this was unprotected
against concurrent changes by ioctls. Fix this bug while we are at it.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
1. We don't need to round the SBP-2 segment size limit down to a
multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
ensure quadlet alignment (0xffff -> 0xfffc).
2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
and the block IO layer about the restriction. This way we can
remove the size checks and segment splitting in the queuecommand
path.
This assumes that no other code in the ieee1394 stack uses
dma_map_sg() with conflicting requirements. It furthermore assumes
that the controller device's platform actually allows us to set the
segment size to our liking. Assert the latter with a BUG_ON().
3. Also use blk_queue_max_segment_size() to tell the block IO layer
about it. It cannot know it because our scsi_add_host() does not
point to the FireWire controller's device.
We can also uniformly use dma_map_sg() for the single segment case just
like for the multi segment case, to further simplify the code.
Also clean up how the page table is converted to big endian.
Thanks to Grant Grundler and FUJITA Tomonori for advice.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Two dma_sync_single_for_cpu() were called in the wrong place.
Luckily they were merely for DMA_TO_DEVICE, hence nobody noticed.
Also reorder the matching dma_sync_single_for_device() a little bit
so that they reside in the same functions as their counterparts.
This also avoids syncing the s/g table for requests which don't use it.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: Kill unused <asm/debug.h> inclusions
MIPS: IP32: Add platform device for CMOS RTC; remove dead code
RTC: M48T35: new RTC driver
MIPS: IP27: Switch over to RTC class driver
MIPS: DS1286: New RTC driver
MIPS: IP22/28: Switch over to RTC class driver
MIPS: PCI: Scan busses when they are registered
MIPS: WGT634U: Add reset button support
MIPS: BCM47xx: Use the new SSB GPIO API
MIPS: BCM47xx: Remove references to BCM947XX
MIPS: WGT634U: Add machine detection message
MIPS: Align .data.cacheline_aligned based on CONFIG_MIPS_L1_CACHE_SHIFT
MIPS: show_cpuinfo prints the type of the calling CPU
MIPS: Fix wrong branch target in new spin_lock code.
MIPS: Have a heart for a lonely, lost header file ...
The metronome driver produces warnings when built on x86-64 as it assumes that
size_t is an int. Use %Zd instead.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we skip unrecognized options in xfs_fs_remount we should just break
out of the switch and not return because otherwise we may skip clearing
the xfs-internal read-only flag. This will only show up on some
operations like touch because most read-only checks are done by the VFS
which thinks this filesystem is r/w. Eventually we should replace the
XFS read-only flag with a helper that always checks the VFS flag to make
sure they can never get out of sync.
Bug reported and fix verified by Marcel Beister on #xfs.
Bug fix verified by updated xfstests/189.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Timothy Shimmin <tes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Chua reported that this patch:
> -#define PTE_IDENT_ATTR 0x007 /* PRESENT+RW+USER */
> -#define PDE_IDENT_ATTR 0x067 /* PRESENT+RW+USER+DIRTY+ACCESSED */
> +#define PTE_IDENT_ATTR 0x003 /* PRESENT+RW */
> +#define PDE_IDENT_ATTR 0x063 /* PRESENT+RW+DIRTY+ACCESSED */
broke kernels with CONFIG_COMPAT_VDSO set with this init segfault:
init[1]: segfault at ffffe01c up b7f0dc28 sp bfc26628 error 5 in ld-2.7.90.so[b7f0b000+1c000]
Include USER bit in the PDE_IDENT_ATTR only, as the protection bits
are combined from the PDE and PTE entries. This will allow the high
mapped VDSO page in the case of CONFIG_COMPAT_VDSO to be user
readable.
Reported-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (158 commits)
powerpc: Fix CHRP PCI config access for indirect_pci
powerpc/chrp: Fix detection of Python PCI host bridge on IBM CHRPs
powerpc: Fix 32-bit SMP boot on CHRP
powerpc: Fix link errors on 32-bit machines using legacy DMA
powerpc/pci: Improve detection of unassigned bridge resources
hvc_console: Fix free_irq in spinlocked section
powerpc: Get USE_STRICT_MM_TYPECHECKS working again
powerpc: Reflect the used arguments in machine_init() prototype
powerpc: Fix DMA offset for non-coherent DMA
powerpc: fix fsl_upm nand driver modular build
powerpc/83xx: add NAND support for the MPC8360E-RDK boards
powerpc: FPGA support for GE Fanuc SBC610
i2c: MPC8349E-mITX Power Management and GPIO expander driver
powerpc: reserve two DMA channels for audio in MPC8610 HPCD device tree
powerpc: document the "fsl,ssi-dma-channel" compatible property
powerpc: disable CHRP and PMAC support in various defconfigs
OF: add fsl,mcu-mpc8349emitx to the exception list
powerpc/83xx: add DS1374 RTC support for the MPC837xE-MDS boards
powerpc: remove support for bootmem-allocated memory for the DIU driver
powerpc: remove non-dependent load fsl_booke PTE_64BIT
...
With intel iommu hardware, we can assign devices to kvm/ia64 guests.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Using vt-d, kvm guests can be assigned physcial devices, so
this patch introduce a new mmio type (directed mmio)
to handle its mmio access.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Don't try to do put_page once the entries are mmio.
Set the tag to indicate the mmio space for vmm setting
TLB's memory attribute.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Moving irq ack notification logic as common, and make
it shared with ia64 side.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Add a kvm_ prefix to avoid polluting kernel's name space.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
To share with other archs, this patch moves device assignment
logic to common parts.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
KVM-x86 dumps a lot of debug messages that have no meaning for normal
operation:
- INIT de-assertion is ignored
- SIPIs are sent and received
- APIC writes are unaligned or < 4 byte long
(Windows Server 2003 triggers this on SMP)
Degrade them to true debug messages, keeping the host kernel log clean
for real problems.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Assigned device could DMA to mmio pages, so also need to map mmio pages
into VT-d page table.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The PIC code makes little effort to avoid kvm_vcpu_kick(), resulting in
unnecessary guest exits in some conditions.
For example, if the timer interrupt is routed through the IOAPIC, IRR
for IRQ 0 will get set but not cleared, since the APIC is handling the
acks.
This means that everytime an interrupt < 16 is triggered, the priority
logic will find IRQ0 pending and send an IPI to vcpu0 (in case IRQ0 is
not masked, which is Linux's case).
Introduce a new variable isr_ack to represent the IRQ's for which the
guest has been signalled / cleared the ISR. Use it to avoid more than
one IPI per trigger-ack cycle, in addition to the avoidance when ISR is
set in get_priority().
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Cache the unsynced children information in a per-page bitmap.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Allow guest pagetables to go out of sync. Instead of emulating write
accesses to guest pagetables, or unshadowing them, we un-write-protect
the page table and allow the guest to modify it at will. We rely on
invlpg executions to synchronize individual ptes, and will synchronize
the entire pagetable on tlb flushes.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Need to convert shadow_notrap_nonpresent -> shadow_trap_nonpresent when
unsyncing pages.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
kvm_mmu_zap_page will soon zap the unsynced children of a page. Restart
list walk in such case.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Introduce a function to walk all parents of a given page, invoking a handler.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
With pages out of sync invlpg needs to be trapped. For now simply nuke
the entry.
Untested on AMD.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Examine guest pagetable and bring the shadow back in sync. Caller is responsible
for local TLB flush before re-entering guest mode.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
There is not much point in write protecting large mappings. This
can only happen when a page is shadowed during the window between
is_largepage_backed and mmu_lock acquision. Zap the entry instead, so
the next pagefault will find a shadowed page via is_largepage_backed and
fallback to 4k translations.
Simplifies out of sync shadow.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>