1 <?xml version="1.0" encoding="ISO-8859-1"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><!--
4 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
5 This file is generated from xml source: DO NOT EDIT
6 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
8 <title>Connections in the FIN_WAIT_2 state and Apache - Apache HTTP Server</title>
9 <link href="../style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />
10 <link href="../style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />
11 <link href="../style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" />
12 <link href="../images/favicon.ico" rel="shortcut icon" /></head>
13 <body id="manual-page"><div id="page-header">
14 <p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="../faq/">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p>
15 <p class="apache">Apache HTTP Server Version 2.0</p>
16 <img alt="" src="../images/feather.gif" /></div>
17 <div class="up"><a href="./"><img title="<-" alt="<-" src="../images/left.gif" /></a></div>
19 <a href="http://www.apache.org/">Apache</a> > <a href="http://httpd.apache.org/">HTTP Server</a> > <a href="http://httpd.apache.org/docs/">Documentation</a> > <a href="../">Version 2.0</a> > <a href="./">Miscellaneous Documentation</a></div><div id="page-content"><div id="preamble"><h1>Connections in the FIN_WAIT_2 state and Apache</h1>
21 <p><span>Available Languages: </span><a href="../en/misc/fin_wait_2.html" title="English"> en </a></p>
25 <div class="warning"><h3>Warning:</h3>
26 <p>This document has not been fully updated
27 to take into account changes made in the 2.0 version of the
28 Apache HTTP Server. Some of the information may still be
29 relevant, but please use it with care.</p>
32 <p>Starting with the Apache 1.2 betas, people are reporting
33 many more connections in the FIN_WAIT_2 state (as reported
34 by <code>netstat</code>) than they saw using older
35 versions. When the server closes a TCP connection, it sends
36 a packet with the FIN bit set to the client, which then
37 responds with a packet with the ACK bit set. The client
38 then sends a packet with the FIN bit set to the server,
39 which responds with an ACK and the connection is closed.
40 The state that the connection is in during the period
41 between when the server gets the ACK from the client and
42 the server gets the FIN from the client is known as
43 FIN_WAIT_2. See the <a href="ftp://ds.internic.net/rfc/rfc793.txt">TCP RFC</a> for
44 the technical details of the state transitions.</p>
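<p>The close sequence described above can be sketched as a small
state table. This is a simplified illustration of the active-close
path in RFC 793, not code from Apache: the names
<code>TRANSITIONS</code> and <code>walk</code> are made up for the
example, and simultaneous closes are left out. It also shows the
TIME_WAIT state that the closing side passes through after it
acknowledges the client's FIN.</p>

```python
# Simplified active-close path from RFC 793, seen from the side that
# closes first (the server in the paragraph above). Event names are
# illustrative labels, not protocol constants.
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",  # the final ACK is sent here
    ("TIME_WAIT", "timer expires"): "CLOSED",
}


def walk(events, state="ESTABLISHED"):
    """Apply events in order; stop at the first event that does not apply."""
    for event in events:
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            break
        state = next_state
    return state
```

<p>With this table, <code>walk(["send FIN", "recv ACK"])</code> ends in
FIN_WAIT_2, and the connection stays there until a
<code>"recv FIN"</code> event arrives from the client.</p>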
46 <p>The FIN_WAIT_2 state is somewhat unusual in that there
47 is no timeout defined in the standard for it. This means
48 that on many operating systems, a connection in the
49 FIN_WAIT_2 state will stay around until the system is
50 rebooted. If the system does not have a timeout and too
51 many FIN_WAIT_2 connections build up, it can fill up the
52 space allocated for storing information about the
53 connections and crash the kernel. The connections in
54 FIN_WAIT_2 do not tie up an httpd process.</p>
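<p>To see how many connections are in this state, the output of
<code>netstat -an</code> can be tallied. The following is a minimal
sketch, assuming that the TCP state is the last column of each
<code>tcp</code> line, which is common but not guaranteed on every
system; the function name <code>count_tcp_states</code> is
hypothetical. Some <code>netstat</code> versions print the state as
<code>FIN_WAIT2</code>, so both spellings are folded together.</p>

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in captured `netstat -an` text.

    Assumes the state is the last whitespace-separated field on each
    line whose first field starts with "tcp".
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if not fields or not fields[0].lower().startswith("tcp"):
            continue  # skip headers and non-TCP lines
        state = fields[-1].upper()
        if state == "FIN_WAIT2":  # spelling used by some netstat versions
            state = "FIN_WAIT_2"
        counts[state] += 1
    return counts
```

<p>Calling <code>count_tcp_states(text)["FIN_WAIT_2"]</code> on the
captured output gives the number of connections currently waiting
for the client's FIN.</p>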
57 <div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#why">Why Does It Happen?</a></li>
58 <li><img alt="" src="../images/down.gif" /> <a href="#what">What Can I Do About it?</a></li>
59 <li><img alt="" src="../images/down.gif" /> <a href="#appendix">Appendix</a></li>
61 <div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
63 <h2><a name="why" id="why">Why Does It Happen?</a></h2>
<p>There are numerous reasons for it happening, some of which
may not yet be fully understood. What is known follows.</p>
68 <h3><a name="buggy" id="buggy">Buggy Clients and Persistent
71 <p>Several clients have a bug which pops up when dealing with
72 persistent connections (aka
73 keepalives). When the connection is idle and the server
74 closes the connection (based on the <code class="directive"><a href="../mod/core.html#keepalivetimeout">KeepAliveTimeout</a></code>),
the client never sends
back a FIN and ACK to the server. This means that the
77 connection stays in the FIN_WAIT_2 state until one of the
78 following happens:</p>
81 <li>The client opens a new connection to the same or a
82 different site, which causes it to fully close the older
83 connection on that socket.</li>
<li>The user exits the client, which on some (most?)
clients causes the OS to fully shut down the
connection.</li>
89 <li>The FIN_WAIT_2 times out, on servers that have a
90 timeout for this state.</li>
93 <p>If you are lucky, this means that the buggy client will
94 fully close the connection and release the resources on
95 your server. However, there are some cases where the socket
96 is never fully closed, such as a dialup client
97 disconnecting from their provider before closing the
98 client. In addition, a client might sit idle for days
99 without making another connection, and thus may hold its
100 end of the socket open for days even though it has no
101 further use for it. <strong>This is a bug in the browser or
102 in its operating system's TCP implementation.</strong></p>
104 <p>The clients on which this problem has been verified to
108 <li>Mozilla/3.01 (X11; I; FreeBSD 2.1.5-RELEASE
111 <li>Mozilla/2.02 (X11; I; FreeBSD 2.1.5-RELEASE
114 <li>Mozilla/3.01Gold (X11; I; SunOS 5.5 sun4m)</li>
116 <li>MSIE 3.01 on the Macintosh</li>
118 <li>MSIE 3.01 on Windows 95</li>
121 <p>This does not appear to be a problem on:</p>
124 <li>Mozilla/3.01 (Win95; I)</li>
127 <p>It is expected that many other clients have the same
128 problem. What a client <strong>should do</strong> is
129 periodically check its open socket(s) to see if they have
130 been closed by the server, and close their side of the
131 connection if the server has closed. This check need only
occur once every few seconds, and may even be detected by an
OS signal on some systems (<em>e.g.</em>, Win95 and NT
134 clients have this capability, but they seem to be ignoring
137 <p>Apache <strong>cannot</strong> avoid these FIN_WAIT_2
138 states unless it disables persistent connections for the
139 buggy clients, just like we recommend doing for Navigator
140 2.x clients due to other bugs. However, non-persistent
141 connections increase the total number of connections needed
142 per client and slow retrieval of an image-laden web page.
143 Since non-persistent connections have their own resource
144 consumptions and a short waiting period after each closure,
145 a busy server may need persistence in order to best serve
148 <p>As far as we know, the client-caused FIN_WAIT_2 problem
149 is present for all servers that support persistent
150 connections, including Apache 1.1.x and 1.2.</p>
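<p>The periodic check that a well-behaved client should perform on
its idle sockets can be sketched as follows. This is only an
illustration of the idea, not code taken from any browser; the
function name <code>peer_has_closed</code> is hypothetical. It polls
the socket without blocking and peeks at the input, so that any data
still waiting to be read is left in place.</p>

```python
import select
import socket


def peer_has_closed(sock):
    """Return True if the peer has closed and no unread data remains."""
    readable, _, _ = select.select([sock], [], [], 0)  # poll, do not block
    if not readable:
        return False  # idle but still open
    try:
        data = sock.recv(1, socket.MSG_PEEK)  # look without consuming
    except ConnectionResetError:
        return True
    return data == b""  # empty read means the peer sent its FIN
```

<p>A client that runs this check every few seconds on each idle
persistent connection, and calls <code>close()</code> when it returns
true, sends the FIN that lets the server leave FIN_WAIT_2.</p>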
154 <h3><a name="code" id="code">A necessary bit of code
155 introduced in 1.2</a></h3>
157 <p>While the above bug is a problem, it is not the whole
158 problem. Some users have observed no FIN_WAIT_2 problems
159 with Apache 1.1.x, but with 1.2b enough connections build
160 up in the FIN_WAIT_2 state to crash their server. The most
161 likely source for additional FIN_WAIT_2 states is a
162 function called <code>lingering_close()</code> which was
163 added between 1.1 and 1.2. This function is necessary for
164 the proper handling of persistent connections and any
165 request which includes content in the message body
166 (<em>e.g.</em>, PUTs and POSTs). What it does is read any
167 data sent by the client for a certain time after the server
168 closes the connection. The exact reasons for doing this are
169 somewhat complicated, but involve what happens if the
170 client is making a request at the same time the server
171 sends a response and closes the connection. Without
172 lingering, the client might be forced to reset its TCP
173 input buffer before it has a chance to read the server's
174 response, and thus understand why the connection has
175 closed. See the <a href="#appendix">appendix</a> for more
<p>The code in <code>lingering_close()</code> appears to
cause problems for a number of reasons, including the
change in traffic patterns that it causes. The code has
181 been thoroughly reviewed and we are not aware of any bugs
182 in it. It is possible that there is some problem in the BSD
183 TCP stack, aside from the lack of a timeout for the
184 FIN_WAIT_2 state, exposed by the
185 <code>lingering_close</code> code that causes the observed
189 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
190 <div class="section">
191 <h2><a name="what" id="what">What Can I Do About it?</a></h2>
193 <p>There are several possible workarounds to the problem, some
194 of which work better than others.</p>
196 <h3><a name="add_timeout" id="add_timeout">Add a timeout for FIN_WAIT_2</a></h3>
198 <p>The obvious workaround is to simply have a timeout for the
199 FIN_WAIT_2 state. This is not specified by the RFC, and
200 could be claimed to be a violation of the RFC, but it is
201 widely recognized as being necessary. The following systems
202 are known to have a timeout:</p>
205 <li><a href="http://www.freebsd.org/">FreeBSD</a>
206 versions starting at 2.0 or possibly earlier.</li>
208 <li><a href="http://www.netbsd.org/">NetBSD</a> version
211 <li><a href="http://www.openbsd.org/">OpenBSD</a> all
214 <li><a href="http://www.bsdi.com/">BSD/OS</a> 2.1, with
215 the <a href="ftp://ftp.bsdi.com/bsdi/patches/patches-2.1/K210-027">
216 K210-027</a> patch installed.</li>
218 <li><a href="http://www.sun.com/">Solaris</a> as of
219 around version 2.2. The timeout can be tuned by using
220 <code>ndd</code> to modify
221 <code>tcp_fin_wait_2_flush_interval</code>, but the
222 default should be appropriate for most servers and
223 improper tuning can have negative impacts.</li>
225 <li><a href="http://www.linux.org/">Linux</a> 2.0.x and
228 <li><a href="http://www.hp.com/">HP-UX</a> 10.x defaults
229 to terminating connections in the FIN_WAIT_2 state after
230 the normal keepalive timeouts. This does not refer to the
231 persistent connection or HTTP keepalive timeouts, but the
232 <code>SO_LINGER</code> socket option which is enabled by
233 Apache. This parameter can be adjusted by using
234 <code>nettune</code> to modify parameters such as
235 <code>tcp_keepstart</code> and <code>tcp_keepstop</code>.
236 In later revisions, there is an explicit timer for
237 connections in FIN_WAIT_2 that can be modified; contact
238 HP support for details.</li>
240 <li><a href="http://www.sgi.com/">SGI IRIX</a> can be
241 patched to support a timeout. For IRIX 5.3, 6.2, and 6.3,
242 use patches 1654, 1703 and 1778 respectively. If you have
243 trouble locating these patches, please contact your SGI
244 support channel for help.</li>
246 <li><a href="http://www.ncr.com/">NCR's MP RAS Unix</a>
247 2.xx and 3.xx both have FIN_WAIT_2 timeouts. In 2.xx it
248 is non-tunable at 600 seconds, while in 3.xx it defaults
249 to 600 seconds and is calculated based on the tunable
250 "max keep alive probes" (default of 8) multiplied by the
251 "keep alive interval" (default 75 seconds).</li>
253 <li><a href="http://www.sequent.com">Sequent's ptx/TCP/IP
254 for DYNIX/ptx</a> has had a FIN_WAIT_2 timeout since
255 around release 4.1 in mid-1994.</li>
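<p>The MP RAS 3.xx figure quoted above follows directly from the two
tunables. As a quick check of the arithmetic (the function name is
hypothetical and only the default values come from the text):</p>

```python
def keepalive_based_timeout(max_probes=8, probe_interval_seconds=75):
    """FIN_WAIT_2 timeout derived from the keep-alive tunables, in seconds."""
    return max_probes * probe_interval_seconds
```

<p>With the defaults, 8 probes at 75 seconds each give the 600 second
timeout; lowering either tunable shortens it proportionally.</p>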
258 <p>The following systems are known to not have a
<li><a href="http://www.sun.com/">SunOS 4.x</a> does not
and almost certainly never will have one because it is at
the very end of its development cycle for Sun. If you
have kernel source, it should be easy to patch.</li>
268 <p>There is a <a href="http://www.apache.org/dist/httpd/contrib/patches/1.2/fin_wait_2.patch">
269 patch available</a> for adding a timeout to the FIN_WAIT_2
270 state; it was originally intended for BSD/OS, but should be
271 adaptable to most systems using BSD networking code. You
272 need kernel source code to be able to use it.</p>
276 <h3><a name="no_lingering" id="no_lingering">Compile without using
277 <code>lingering_close()</code></a></h3>
279 <p>It is possible to compile Apache 1.2 without using the
280 <code>lingering_close()</code> function. This will result
281 in that section of code being similar to that which was in
282 1.1. If you do this, be aware that it can cause problems
283 with PUTs, POSTs and persistent connections, especially if
284 the client uses pipelining. That said, it is no worse than
285 on 1.1, and we understand that keeping your server running
286 is quite important.</p>
288 <p>To compile without the <code>lingering_close()</code>
289 function, add <code>-DNO_LINGCLOSE</code> to the end of the
290 <code>EXTRA_CFLAGS</code> line in your
291 <code>Configuration</code> file, rerun
292 <code class="program"><a href="../programs/Configure.html">Configure</a></code> and rebuild the server.</p>
296 <h3><a name="so_linger" id="so_linger">Use <code>SO_LINGER</code> as
297 an alternative to <code>lingering_close()</code></a></h3>
299 <p>On most systems, there is an option called
300 <code>SO_LINGER</code> that can be set with
301 <code>setsockopt(2)</code>. It does something very similar
302 to <code>lingering_close()</code>, except that it is broken
303 on many systems so that it causes far more problems than
304 <code>lingering_close</code>. On some systems, it could
305 possibly work better so it may be worth a try if you have
306 no other alternatives.</p>
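<p>For readers unfamiliar with the option itself, this is roughly what
setting <code>SO_LINGER</code> looks like at the socket level. It is a
sketch in Python rather than Apache's C code, and it assumes a
Unix-like system where <code>struct linger</code> is two C
<code>int</code> fields (on/off flag and linger time); the helper
names <code>set_linger</code> and <code>get_linger</code> are made up
for the example.</p>

```python
import socket
import struct

# Assumed layout of struct linger on Unix-like systems: two C ints,
# l_onoff followed by l_linger. Other platforms may differ.
LINGER_FORMAT = "ii"


def set_linger(sock, seconds):
    """Enable SO_LINGER so close() may wait up to `seconds` for unsent data."""
    value = struct.pack(LINGER_FORMAT, 1, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)


def get_linger(sock):
    """Return the (l_onoff, l_linger) pair currently set on the socket."""
    size = struct.calcsize(LINGER_FORMAT)
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, size)
    return struct.unpack(LINGER_FORMAT, raw)
```

<p>How the kernel behaves once the option is set is exactly the part
that varies between systems, which is why the results need to be
tested on your own platform.</p>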
308 <p>To try it, add <code>-DUSE_SO_LINGER
309 -DNO_LINGCLOSE</code> to the end of the
310 <code>EXTRA_CFLAGS</code> line in your
311 <code>Configuration</code> file, rerun
312 <code class="program"><a href="../programs/Configure.html">Configure</a></code> and rebuild the server.</p>
314 <div class="note"><h3>NOTE</h3>Attempting to use
315 <code>SO_LINGER</code> and <code>lingering_close()</code>
316 at the same time is very likely to do very bad things, so
321 <h3><a name="increase_mem" id="increase_mem">Increase the amount of memory
322 used for storing connection state</a></h3>
325 <dt>BSD based networking code:</dt>
328 BSD stores network data, such as connection states, in
329 something called an mbuf. When you get so many
330 connections that the kernel does not have enough mbufs
331 to put them all in, your kernel will likely crash. You
332 can reduce the effects of the problem by increasing the
333 number of mbufs that are available; this will not
334 prevent the problem, it will just make the server go
335 longer before crashing.
337 <p>The exact way to increase them may depend on your
338 OS; look for some reference to the number of "mbufs" or
339 "mbuf clusters". On many systems, this can be done by
adding the line <code>NMBCLUSTERS="n"</code> (where
<code>n</code> is the number of mbuf clusters you want)
342 to your kernel config file and rebuilding your
349 <h3><a name="disable" id="disable">Disable KeepAlive</a></h3>
351 <p>If you are unable to do any of the above then you
352 should, as a last resort, disable KeepAlive. Edit your
353 httpd.conf and change "KeepAlive On" to "KeepAlive
357 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
358 <div class="section">
359 <h2><a name="appendix" id="appendix">Appendix</a></h2>
361 <p>Below is a message from Roy Fielding, one of the authors
364 <h3><a name="message" id="message">Why the lingering close
365 functionality is necessary with HTTP</a></h3>
<p>The need for a server to linger on a socket after a close
is noted a couple of times in the HTTP specs, but not
369 explained. This explanation is based on discussions between
370 myself, Henrik Frystyk, Robert S. Thau, Dave Raggett, and
371 John C. Mallery in the hallways of MIT while I was at W3C.</p>
373 <p>If a server closes the input side of the connection
374 while the client is sending data (or is planning to send
375 data), then the server's TCP stack will signal an RST
376 (reset) back to the client. Upon receipt of the RST, the
377 client will flush its own incoming TCP buffer back to the
378 un-ACKed packet indicated by the RST packet argument. If
379 the server has sent a message, usually an error response,
380 to the client just before the close, and the client
381 receives the RST packet before its application code has
382 read the error message from its incoming TCP buffer and
383 before the server has received the ACK sent by the client
384 upon receipt of that buffer, then the RST will flush the
385 error message before the client application has a chance to
386 see it. The result is that the client is left thinking that
387 the connection failed for no apparent reason.</p>
389 <p>There are two conditions under which this is likely to
393 <li>sending POST or PUT data without proper
396 <li>sending multiple requests before each response
397 (pipelining) and one of the middle requests resulting in
398 an error or other break-the-connection result.</li>
401 <p>The solution in all cases is to send the response, close
402 only the write half of the connection (what shutdown is
403 supposed to do), and continue reading on the socket until
404 it is either closed by the client (signifying it has
405 finally read the response) or a timeout occurs. That is
406 what the kernel is supposed to do if SO_LINGER is set.
407 Unfortunately, SO_LINGER has no effect on some systems; on
408 some other systems, it does not have its own timeout and
thus the TCP memory segments just pile up until the next
410 reboot (planned or not).</p>
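<p>The approach described above (send the response, close only the
write half, keep reading until the client closes or a timeout
occurs) can be sketched in a few lines. This is an illustration of
the technique in Python, not Apache's <code>lingering_close()</code>
source; the parameter names and the default timeout values are
assumptions chosen for the example.</p>

```python
import select
import socket
import time


def lingering_close(sock, idle_timeout=2.0, max_linger=30.0):
    """Half-close, then discard input until EOF or a timeout.

    Returns the number of bytes that were read and thrown away.
    """
    discarded = 0
    try:
        sock.shutdown(socket.SHUT_WR)  # send our FIN, keep the read side open
    except OSError:
        sock.close()
        return discarded
    deadline = time.monotonic() + max_linger
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # overall limit reached
        wait = min(idle_timeout, remaining)
        readable, _, _ = select.select([sock], [], [], wait)
        if not readable:
            break  # client went quiet
        try:
            chunk = sock.recv(4096)
        except OSError:
            break
        if not chunk:
            break  # client closed its side
        discarded += len(chunk)
    sock.close()
    return discarded
```

<p>Because the server keeps reading instead of closing the input side
outright, data that the client is still sending is consumed rather
than answered with an RST, so the response already sent is not
flushed from the client's buffer.</p>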
<p>Please note that simply removing the linger code will
not solve the problem -- it only moves it to a different
one that is much harder to detect.</p>
417 <div class="bottomlang">
419 </div><div id="footer">
420 <p class="apache">Copyright 2009 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
</div>