Commit Graph

4042 Commits

Author SHA1 Message Date
aaf68cfbf2 [PATCH] knfsd: fix a race in closing NFSd connections
If you lose this race, it can iput a socket inode twice and you get a BUG
in fs/inode.c

When I added the option for user-space to close a socket, I added some
cruft to svc_delete_socket so that I could call that function when closing
a socket per user-space request.

This was the wrong thing to do.  I should have just set SK_CLOSE and let
normal mechanisms do the work.

Not only wrong, but buggy.  The locking is all wrong and it openned up a
race where-by a socket could be closed twice.

So this patch:
  Introduces svc_close_socket which sets SK_CLOSE then either leave
  the close up to a thread, or calls svc_delete_socket if it can
  get SK_BUSY.

  Adds a bias to sk_busy which is removed when SK_DEAD is set,
  This avoid races around shutting down the socket.

  Changes several 'spin_lock' to 'spin_lock_bh' where the _bh
  was missing.

Bugzilla-url: http://bugzilla.kernel.org/show_bug.cgi?id=7916

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-09 09:25:47 -08:00
55e747445b [PATCH] hidp __user annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-09 09:14:06 -08:00
1539b98b56 [IPX]: Fix NULL pointer dereference on ipx unload
Fixes a null pointer dereference when unloading the ipx module.

On initialization of the ipx module, registering certain packet
types can fail. When this happens, unloading the module later
dereferences NULL pointers.  This patch fixes that. Please apply.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 16:02:21 -08:00
9783e1df7a Merge branch 'HEAD' of master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6
Conflicts:

	crypto/Kconfig
2007-02-08 15:25:18 -08:00
4387ff75f2 [NET]: Fix net/socket.c warnings.
GCC (correctly) says:

net/socket.c: In function ‘sys_sendto’:
net/socket.c:1510: warning: ‘err’ may be used uninitialized in this function
net/socket.c: In function ‘sys_recvfrom’:
net/socket.c:1571: warning: ‘err’ may be used uninitialized in this function

sock_from_file() either returns filp->private_data or it
sets *err and returns NULL.

Callers return "err" on NULL, but filp->private_data could
be NULL.

Some minor rearrangements of error handling in sys_sendto
and sys_recvfrom solves the issue.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 15:06:08 -08:00
23bb80d215 [NET]: cleanup sock_from_file()
I believe dead code from sock_from_file() can be cleaned up.

All sockets are now built using sock_attach_fd(), that puts the 'sock' pointer 
into file->private_data and &socket_file_ops into file->f_op

I could not find a place where file->private_data could be set to NULL, 
keeping opened the file.

So to get 'sock' from a 'file' pointer, either :

- This is a socket file (f_op == &socket_file_ops), and we can directly get 
'sock' from private_data.
- This is not a socket, we return -ENOTSOCK and dont even try to find a socket 
via dentry/inode :)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 14:59:57 -08:00
dbca9b2750 [NET]: change layout of ehash table
ehash table layout is currently this one :

First half of this table is used by sockets not in TIME_WAIT state
Second half of it is used by sockets in TIME_WAIT state.

This is non optimal because of for a given hash or socket, the two chain heads 
are located in separate cache lines.
Moreover the locks of the second half are never used.

If instead of this halving, we use two list heads in inet_ehash_bucket instead 
of only one, we probably can avoid one cache miss, and reduce ram usage, 
particularly if sizeof(rwlock_t) is big (various CONFIG_DEBUG_SPINLOCK, 
CONFIG_DEBUG_LOCK_ALLOC settings). So we still halves the table but we keep 
together related chains to speedup lookups and socket state change.

In this patch I did not try to align struct inet_ehash_bucket, but a future 
patch could try to make this structure have a convenient size (a power of two 
or a multiple of L1_CACHE_SIZE).
I guess rwlock will just vanish as soon as RCU is plugged into ehash :) , so 
maybe we dont need to scratch our heads to align the bucket...

Note : In case struct inet_ehash_bucket is not a power of two, we could 
probably change alloc_large_system_hash() (in case it use __get_free_pages()) 
to free the unused space. It currently allocates a big zone, but the last 
quarter of it could be freed. Again, this should be a temporary 'problem'.

Patch tested on ipv4 tcp only, but should be OK for IPV6 and DCCP.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 14:16:46 -08:00
eac3731bd0 [S390]: Add AF_IUCV socket support
From: Jennifer Hunt <jenhunt@us.ibm.com>

This patch adds AF_IUCV socket support.

Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:51:54 -08:00
2356f4cb19 [S390]: Rewrite of the IUCV base code, part 2
Add rewritten IUCV base code to net/iucv.

Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:37:42 -08:00
c9c2e9dcb8 [X.25]: Adds /proc/net/x25/forward to view active forwarded calls.
View the active forwarded calls
cat /proc/net/x25/forward

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:35:18 -08:00
39e21c0d34 [X.25]: Adds /proc/sys/net/x25/x25_forward to control forwarding.
echo "1" > /proc/sys/net/x25/x25_forward 
To turn on x25_forwarding, defaults to off
Requires the previous patch.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:34:36 -08:00
95a9dc4390 [X.25]: Add call forwarding
Adds call forwarding to X.25, allowing it to operate like an X.25 router.
Useful if one needs to manipulate X.25 traffic with tools like tc.
This is an update/cleanup based off a patch submitted by Daniel Ferenci a few years ago.

Thanks Alan for the feedback.
Added the null check to the clones.
Moved the skb_clone's into the forwarding functions.

Worked ok with Cisco XoT, linux X.25 back to back, and some old NTUs/PADs.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:34:02 -08:00
e610e679dd [XFRM]: xfrm_migrate() needs exporting to modules.
Needed by xfrm_user and af_key.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:29:15 -08:00
f6ed0ec0ee [PFKEYV2]: CONFIG_NET_KEY_MIGRATE option
Add CONFIG_NET_KEY_MIGRATE option which makes it possible for user
application to send or receive MIGRATE message to/from PF_KEY socket.

Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:15:05 -08:00
08de61beab [PFKEYV2]: Extension for dynamic update of endpoint address(es)
Extend PF_KEYv2 framework so that user application can take advantage
of MIGRATE feature via PF_KEYv2 interface. User application can either
send or receive an MIGRATE message to/from PF_KEY socket.

Detail information can be found in the internet-draft
<draft-sugimoto-mip6-pfkey-migrate>.

Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:14:33 -08:00
d0473655c8 [XFRM]: CONFIG_XFRM_MIGRATE option
Add CONFIG_XFRM_MIGRATE option which makes it possible for for user
application to send or receive MIGRATE message to/from netlink socket.

Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:13:07 -08:00
5c79de6e79 [XFRM]: User interface for handling XFRM_MSG_MIGRATE
Add user interface for handling XFRM_MSG_MIGRATE. The message is issued
by user application. When kernel receives the message, procedure of
updating XFRM databases will take place.

Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:12:32 -08:00
80c9abaabf [XFRM]: Extension for dynamic update of endpoint address(es)
Extend the XFRM framework so that endpoint address(es) in the XFRM
databases could be dynamically updated according to a request (MIGRATE
message) from user application. Target XFRM policy is first identified
by the selector in the MIGRATE message. Next, the endpoint addresses
of the matching templates and XFRM states are updated according to
the MIGRATE message.

Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:11:42 -08:00
9934e81c8c [NETFILTER]: ip6_tables: remove redundant structure definitions
Move ip6t_standard/ip6t_error_target/ip6t_error definitions to ip6_tables.h
instead of defining them in each table individually.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:23 -08:00
a0ca215a73 [NETFILTER]: ip6_tables: support MH match
This introduces match for Mobility Header (MH) described by Mobile IPv6
specification (RFC3775). User can specify the MH type or its range to be
matched.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: Yasuyuki Kozakai <kozakai@linux-ipv6.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:21 -08:00
e60a13e030 [NETFILTER]: {ip,ip6}_tables: use struct xt_table instead of redefined structure names
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:20 -08:00
6709dbbb19 [NETFILTER]: {ip,ip6}_tables: remove x_tables wrapper functions
Use the x_tables functions directly to make it better visible which
parts are shared between ip_tables and ip6_tables.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:19 -08:00
e1fd0586b0 [NETFILTER]: x_tables: fix return values for LOG/ULOG
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:18 -08:00
41f4689a7c [NETFILTER]: NAT: optional source port randomization support
This patch adds support to NAT to randomize source ports.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:17 -08:00
cdd289a2f8 [NETFILTER]: add IPv6-capable TCPMSS target
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:16 -08:00
a8d0f9526f [NET]: Add UDPLITE support in a few missing spots
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:14 -08:00
5eb87f456e [NETFILTER]: bridge-netfilter: use nf_register_hooks/nf_unregister_hooks
Additionally mark the init function __init.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:13 -08:00
efbc597634 [NETFILTER]: nf_nat: remove broken HOOKNAME macro
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:12 -08:00
2822b0d926 [NETFILTER]: Remove useless comparisons before assignments
Remove unnecessary if() constructs before assignment.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:11 -08:00
a09113c2c8 [NETFILTER]: tcp conntrack: do liberal tracking for picked up connections
Do liberal tracking (only RSTs need to be in-window) for connections picked
up without seeing a SYN to deal with window scaling. Also change logging
of invalid packets not to log packets accepted by liberal tracking to avoid
spamming the logs.

Based on suggestion from James Ralston <ralston@pobox.com>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:10 -08:00
6fecd19851 [NETFILTER]: Add SANE connection tracking helper
This is nf_conntrack_sane, a netfilter connection tracking helper module
for the SANE protocol used by the 'saned' daemon to make scanners available
via network. The SANE protocol uses separate control & data connections,
similar to passive FTP. The helper module is needed to recognize the data
connection as RELATED to the control one.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:09 -08:00
719647e213 [IRLAN]: handle out of memory errors
This patch checks return values:

- irlmp_register_client()
- irlmp_register_service()
- irlan_open()

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:08 -08:00
bb5aa42734 [IRDA]: handle out of memory errors
This patch checks return value of memory allocation functions
for irda subsystem and fixes memory leaks in error cases.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:07 -08:00
22f8cde5bc [NET]: unregister_netdevice as void
There was no real useful information from the unregister_netdevice() return
code, the only error occurred in a situation that was a driver bug. So
change it to a void function.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:06 -08:00
f48d5ff1e4 [IPV6] RAW: Add checksum default defines for MH.
Add checksum default defines for mobility header(MH) which
goes through raw socket. As the result kernel's behavior is
to handle MH checksum as default.

This patch also removes verifying inbound MH checksum at
mip6_mh_filter() since it did not consider user specified
checksum offset and was redundant check with raw socket code.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:05 -08:00
cc63f70b8b [IPV4/IPV6] multicast: Check add_grhead() return value
add_grhead() allocates memory with GFP_ATOMIC and in at least two places skb
from it passed to skb_put() without checking.

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:04 -08:00
f2f2102d1a [XFRM]: Fix missed error setting in xfrm4_policy.c
When we can't find the afinfo we should return EAFNOSUPPORT.
GCC warned about the uninitialized 'err' for this path as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:03 -08:00
4337226228 [IPSEC]: IPv4 over IPv6 IPsec tunnel
This is the patch to support IPv4 over IPv6 IPsec.

Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:02 -08:00
c82f963efe [IPSEC]: IPv6 over IPv4 IPsec tunnel
This is the patch to support IPv6 over IPv4 IPsec

Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:01 -08:00
cdca72652a [IPSEC]: exporting xfrm_state_afinfo
This patch exports xfrm_state_afinfo.

Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:00 -08:00
0f08461ebf [DCCP]: Warning fixes.
net/dccp/ccids/ccid3.c: In function `ccid3_hc_rx_packet_recv':
net/dccp/ccids/ccid3.c:1007: warning: long int format, different type arg (arg 3)
net/dccp/ccids/ccid3.c:1007: warning: long int format, different type arg (arg 4)

opaque types must be suitably cast for printing.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:56 -08:00
97353cb4c0 [NET] net/wanrouter/wanmain.c: cleanups
This patch contains the following cleanups:
- make the following needlessly global functions static:
  - lock_adapter_irq()
  - unlock_adapter_irq()
- #if 0 the following unused global functions:
  - wanrouter_encapsulate()
  - wanrouter_type_trans()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:54 -08:00
84ff602efb [ATM]: Fix for crash in adummy_init()
This was reported by Ingo Molnar here,

http://lkml.org/lkml/2006/12/18/119

The problem is that adummy_init() depends on atm_init() , but adummy_init()
is called first.

So I put atm_init() into subsys_initcall which seems appropriate, and it
will still get module_init() if it becomes a module.

Interesting to note that you could crash your system here if you just load
the modules in the wrong order.

Signed-off-by: Daniel Walker <dwalker@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:53 -08:00
f5a6e01c09 [NET]: user of the jiffies rounding code: Networking
This patch introduces users of the round_jiffies() function in the
networking code.

These timers all were of the "about once a second" or "about once
every X seconds" variety and several showed up in the "what wakes the
cpu up" profiles that the tickless patches provide.  Some timers are
highly dynamic based on network load; but even on low activity systems
they still show up so the rounding is done only in cases of low
activity, allowing higher frequency timers in the high activity case.

The various hardware watchdogs are an obvious case; they run every 2
seconds but aren't otherwise specific of exactly when they need to
run.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:52 -08:00
104439a887 [TCP]: Don't apply FIN exception to full TSO segments.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:51 -08:00
8a3c3a9727 [TCP]: Check num sacks in SACK fast path
We clear the unused parts of the SACK cache, This prevents us from mistakenly
taking the cache data if the old data in the SACK cache is the same as the data
in the SACK block. This assumes that we never receive an empty SACK block with
start and end both at zero.

Signed-off-by: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:50 -08:00
6f74651ae6 [TCP]: Seperate DSACK from SACK fast path
Move DSACK code outside the SACK fast-path checking code. If the DSACK
determined that the information was too old we stayed with a partial cache
copied. Most likely this matters very little since the next packet will not be
DSACK and we will find it in the cache. but it's still not good form and there
is little reason to couple the two checks.

Since the SACK receive cache doesn't need the data to be in host order we also
remove the ntohl in the checking loop.

Signed-off-by: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:49 -08:00
fda03fbb56 [TCP]: Advance fast path pointer for first block only
Only advance the SACK fast-path pointer for the first block, the
fast-path assumes that only the first block advances next time so we
should not move the cached skb for the next sack blocks.

Signed-off-by: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:48 -08:00
ffbc61117d [PACKET]: Fix skb->cb clobbering between aux and sockaddr
Both aux data and sockaddr tries to use the same buffer which
obviously doesn't work.  We just happen to have 4 bytes free in
the skb->cb if you take away the maximum length of sockaddr_ll.
That's just enough to store the one piece of info from aux data
that we can't generate at recvmsg(2) time.

This is what the following patch does.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:47 -08:00
8dc4194474 [PACKET]: Add optional checksum computation for recvmsg
This patch is needed to make ISC's DHCP server (and probably other
DHCP servers/clients using AF_PACKET) to be able to serve another
client on the same Xen host.

The problem is that packets between different domains on the same
Xen host only have partial checksums.  Unfortunately this piece of
information is not passed along in AF_PACKET unless you're using
the mmap interface.  Since dhcpd doesn't support packet-mmap, UDP
packets from the same host come out with apparently bogus checksums.

This patch adds a mechanism for AF_PACKET recvmsg(2) to return the
status along with the packet.  It does so by adding a new cmsg that
contains this information along with some other relevant data such
as the original packet length.

I didn't include the time stamp information since there is already
a cmsg for that.

This patch also changes the mmap code to set the CSUMNOTREADY flag
on all packets instead of just outoing packets on cooked sockets.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:46 -08:00