linux-kernel-test/net/ipv4
Christoph Paasch 5924f17a8a tcp: Fix divide by zero when pushing during tcp-repair
When in repair-mode and TCP_RECV_QUEUE is set, we end up calling
tcp_push with mss_now being 0. If data is in the send-queue and
tcp_set_skb_tso_segs gets called, we crash because it will divide by
mss_now:

[  347.151939] divide error: 0000 [#1] SMP
[  347.152907] Modules linked in:
[  347.152907] CPU: 1 PID: 1123 Comm: packetdrill Not tainted 3.16.0-rc2 #4
[  347.152907] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  347.152907] task: f5b88540 ti: f3c82000 task.ti: f3c82000
[  347.152907] EIP: 0060:[<c1601359>] EFLAGS: 00210246 CPU: 1
[  347.152907] EIP is at tcp_set_skb_tso_segs+0x49/0xa0
[  347.152907] EAX: 00000b67 EBX: f5acd080 ECX: 00000000 EDX: 00000000
[  347.152907] ESI: f5a28f40 EDI: f3c88f00 EBP: f3c83d10 ESP: f3c83d00
[  347.152907]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  347.152907] CR0: 80050033 CR2: 083158b0 CR3: 35146000 CR4: 000006b0
[  347.152907] Stack:
[  347.152907]  c167f9d9 f5acd080 000005b4 00000002 f3c83d20 c16013e6 f3c88f00 f5acd080
[  347.152907]  f3c83da0 c1603b5a f3c83d38 c10a0188 00000000 00000000 f3c83d84 c10acc85
[  347.152907]  c1ad5ec0 00000000 00000000 c1ad679c 010003e0 00000000 00000000 f3c88fc8
[  347.152907] Call Trace:
[  347.152907]  [<c167f9d9>] ? apic_timer_interrupt+0x2d/0x34
[  347.152907]  [<c16013e6>] tcp_init_tso_segs+0x36/0x50
[  347.152907]  [<c1603b5a>] tcp_write_xmit+0x7a/0xbf0
[  347.152907]  [<c10a0188>] ? up+0x28/0x40
[  347.152907]  [<c10acc85>] ? console_unlock+0x295/0x480
[  347.152907]  [<c10ad24f>] ? vprintk_emit+0x1ef/0x4b0
[  347.152907]  [<c1605716>] __tcp_push_pending_frames+0x36/0xd0
[  347.152907]  [<c15f4860>] tcp_push+0xf0/0x120
[  347.152907]  [<c15f7641>] tcp_sendmsg+0xf1/0xbf0
[  347.152907]  [<c116d920>] ? kmem_cache_free+0xf0/0x120
[  347.152907]  [<c106a682>] ? __sigqueue_free+0x32/0x40
[  347.152907]  [<c106a682>] ? __sigqueue_free+0x32/0x40
[  347.152907]  [<c114f0f0>] ? do_wp_page+0x3e0/0x850
[  347.152907]  [<c161c36a>] inet_sendmsg+0x4a/0xb0
[  347.152907]  [<c1150269>] ? handle_mm_fault+0x709/0xfb0
[  347.152907]  [<c15a006b>] sock_aio_write+0xbb/0xd0
[  347.152907]  [<c1180b79>] do_sync_write+0x69/0xa0
[  347.152907]  [<c1181023>] vfs_write+0x123/0x160
[  347.152907]  [<c1181d55>] SyS_write+0x55/0xb0
[  347.152907]  [<c167f0d8>] sysenter_do_call+0x12/0x28

This can easily be reproduced with the following packetdrill-script (the
"magic" with netem, sk_pacing and limit_output_bytes is done to prevent
the kernel from pushing all segments, because hitting the limit without
doing this is not so easy with packetdrill):

0   socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0  setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

+0  bind(3, ..., ...) = 0
+0  listen(3, 1) = 0

+0  < S 0:0(0) win 32792 <mss 1460>
+0  > S. 0:0(0) ack 1 <mss 1460>
+0.1  < . 1:1(0) ack 1 win 65000

+0  accept(3, ..., ...) = 4

// This forces that not all segments of the snd-queue will be pushed
+0 `tc qdisc add dev tun0 root netem delay 10ms`
+0 `sysctl -w net.ipv4.tcp_limit_output_bytes=2`
+0 setsockopt(4, SOL_SOCKET, 47, [2], 4) = 0

+0 write(4,...,10000) = 10000
+0 write(4,...,10000) = 10000

// Set tcp-repair stuff, particularly TCP_RECV_QUEUE
+0 setsockopt(4, SOL_TCP, 19, [1], 4) = 0
+0 setsockopt(4, SOL_TCP, 20, [1], 4) = 0

// This now will make the write push the remaining segments
+0 setsockopt(4, SOL_SOCKET, 47, [20000], 4) = 0
+0 `sysctl -w net.ipv4.tcp_limit_output_bytes=130000`

// Now we will crash
+0 write(4,...,1000) = 1000

This happens since ec34232575 (tcp: fix retransmission in repair
mode). Prior to that, the call to tcp_push was prevented by a check for
tp->repair.

The patch fixes it, by adding the new goto-label out_nopush. When exiting
tcp_sendmsg and a push is not required, which is the case for tp->repair,
we go to this label.

When repairing and calling send() with TCP_RECV_QUEUE, the data is
actually put in the receive-queue. So, no push is required because no
data has been added to the send-queue.

Cc: Andrew Vagin <avagin@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Fixes: ec34232575 (tcp: fix retransmission in repair mode)
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Acked-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02 18:21:03 -07:00
..
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2014-05-30 17:54:47 -07:00
af_inet.c gre: Call gso_make_checksum 2014-06-04 22:46:38 -07:00
ah4.c ah4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
arp.c ipv4: arp: update neighbour address when a gratuitous arp is received and arp_accept is set 2014-01-02 00:08:38 -05:00
cipso_ipv4.c ipv4: ERROR: code indent should use tabs where possible 2013-12-26 13:43:21 -05:00
datagram.c ipv4: fix a race in ip4_datagram_release_cb() 2014-06-11 15:39:18 -07:00
devinet.c ipv4: minor spelling fix 2014-05-18 21:10:29 -04:00
esp4.c esp4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
fib_frontend.c ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif 2014-04-16 15:05:11 -04:00
fib_lookup.h ipv4: make fib_detect_death static 2013-12-28 17:01:46 -05:00
fib_rules.c inet: fix NULL pointer Oops in fib(6)_rule_suppress 2013-12-10 17:54:23 -05:00
fib_semantics.c ipv4: fib_semantics: increment fib_info_cnt after fib_info allocation 2014-05-07 17:14:32 -04:00
fib_trie.c net: Revert "fib_trie: use seq_file_net rather than seq->private" 2014-06-04 15:11:41 -07:00
gre_demux.c gre: Call gso_make_checksum 2014-06-04 22:46:38 -07:00
gre_offload.c net: Save software checksum complete 2014-06-11 15:46:13 -07:00
icmp.c net: add a sysctl to reflect the fwmark on replies 2014-05-13 18:35:08 -04:00
igmp.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
inet_connection_sock.c ipv4: make ip_local_reserved_ports per netns 2014-05-14 15:31:45 -04:00
inet_diag.c inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets 2014-01-13 22:35:46 -08:00
inet_fragment.c inet: frag: make sure forced eviction removes all frags 2014-03-06 15:28:45 -05:00
inet_hashtables.c net: Use a more standard macro for INET_ADDR_COOKIE 2014-05-14 16:07:23 -04:00
inet_lro.c lro: remove dead code 2013-12-29 16:34:25 -05:00
inet_timewait_sock.c tcp/dccp: remove twchain 2013-10-08 23:19:24 -04:00
inetpeer.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-06-12 14:27:40 -07:00
ip_forward.c net: rename local_df to ignore_df 2014-05-12 14:03:41 -04:00
ip_fragment.c ipv4: fix "conntrack zones" support for defrag user check in ip_expire 2014-05-05 16:02:59 +02:00
ip_gre.c gre: allow changing mac address when device is up 2014-06-10 22:46:42 -07:00
ip_input.c net: Fix memory leak if TPROXY used with TCP early demux 2014-01-27 16:22:11 -08:00
ip_options.c ipv4: Use predefined value for readability 2014-04-28 13:28:43 -04:00
ip_output.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
ip_sockglue.c ipv4: yet another new IP_MTU_DISCOVER option IP_PMTUDISC_OMIT 2014-02-26 15:51:00 -05:00
ip_tunnel_core.c net: Support for multiple checksums with gso 2014-06-04 22:46:38 -07:00
ip_tunnel.c ipv4: fix dst race in sk_dst_get() 2014-06-25 17:41:44 -07:00
ip_vti.c ip_vti: Fix 'ip tunnel add' with 'key' parameters 2014-06-11 00:30:52 -07:00
ipcomp.c ipcomp4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
ipconfig.c ipv4: ipconfig.c: add parentheses in an if statement 2014-02-14 00:14:23 -05:00
ipip.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-06-11 16:02:55 -07:00
ipmr.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
Kconfig net: neighbour: Remove CONFIG_ARPD 2013-09-03 21:41:43 -04:00
Makefile xfrm4: Add IPsec protocol multiplexer 2014-02-25 07:04:16 +01:00
netfilter.c netfilter: remove double colon 2014-02-19 11:41:25 +01:00
ping.c ping: move ping_group_range out of CONFIG_SYSCTL 2014-05-08 22:50:47 -04:00
proc.c net: clean up snmp stats code 2014-05-07 16:06:05 -04:00
protocol.c net: remove outdated comment for ipv4 and ipv6 protocol handler 2013-11-28 18:47:51 -05:00
raw.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
route.c ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix 2014-06-30 23:40:58 -07:00
syncookies.c net: support marking accepting TCP sockets 2014-05-13 18:35:09 -04:00
sysctl_net_ipv4.c ipv4: make ip_local_reserved_ports per netns 2014-05-14 15:31:45 -04:00
tcp_bic.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_cong.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_cubic.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-05-12 13:19:14 -04:00
tcp_diag.c
tcp_fastopen.c tcp: remove unnecessary tcp_sk assignment. 2014-06-16 21:35:00 -07:00
tcp_highspeed.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_htcp.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_hybla.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_illinois.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_input.c tcp: fix tcp_match_skb_to_sack() for unaligned SACK at end of an skb 2014-06-19 20:50:49 -07:00
tcp_ipv4.c net: support marking accepting TCP sockets 2014-05-13 18:35:09 -04:00
tcp_lp.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_memcontrol.c cgroup: replace cftype->trigger() with cftype->write() 2014-05-13 12:16:21 -04:00
tcp_metrics.c net: use the new API kvfree() 2014-06-05 00:49:51 -07:00
tcp_minisocks.c tcp: use tcp_v4_send_synack on first SYN-ACK 2014-05-13 17:53:02 -04:00
tcp_offload.c gre: Call gso_make_checksum 2014-06-04 22:46:38 -07:00
tcp_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-06-12 14:27:40 -07:00
tcp_probe.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tcp_scalable.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_timer.c tcp: snmp stats for Fast Open, SYN rtx, and data pkts 2014-03-03 15:58:03 -05:00
tcp_vegas.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_vegas.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
tcp_veno.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_westwood.c tcp: remove unused min_cwnd member of tcp_congestion_ops 2014-02-13 18:22:34 -05:00
tcp_yeah.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp.c tcp: Fix divide by zero when pushing during tcp-repair 2014-07-02 18:21:03 -07:00
tunnel4.c
udp_diag.c
udp_impl.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
udp_offload.c net: Add skb_gro_postpull_rcsum to udp and vxlan 2014-06-11 15:46:13 -07:00
udp.c udp: Add MIB counters for rcvbuferrors 2014-06-27 00:20:55 -07:00
udplite.c net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
xfrm4_input.c xfrm4: Add IPsec protocol multiplexer 2014-02-25 07:04:16 +01:00
xfrm4_mode_beet.c ipv4: ERROR: code indent should use tabs where possible 2013-12-26 13:43:21 -05:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
xfrm4_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-05-24 00:32:30 -04:00
xfrm4_policy.c xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly 2014-03-14 07:28:07 +01:00
xfrm4_protocol.c xfrm4: Properly handle unsupported protocols 2014-04-29 08:41:12 +02:00
xfrm4_state.c inet: make no_pmtu_disc per namespace and kill ipv4_config 2013-12-18 16:58:20 -05:00
xfrm4_tunnel.c sit: add IPv4 over IPv4 support 2013-05-31 17:19:05 -07:00