Class:  Mongrel::HttpServer
In:     lib/mongrel.rb
Parent: Object
This is the main driver of Mongrel, while the Mongrel::HttpParser and Mongrel::URIClassifier make up the majority of how the server functions. It's a very simple class: a thread accepts connections, and a simple HttpServer.process_client function does the heavy lifting with the IO and Ruby.
You use it by doing the following:
    server = HttpServer.new("0.0.0.0", 3000)
    server.register("/stuff", MyNiftyHandler.new)
    server.run.join
The last line can be just server.run if you don't want to join the thread used. If you don't, though, Ruby will mysteriously just exit on you.

Ruby's thread implementation is "interesting" to say the least. Experiments with many different types of IO processing simply cannot make a dent in it. Future releases of Mongrel will find other creative ways to make threads faster, but don't hold your breath until Ruby 1.9 is actually finally useful.
Attributes (all read-only):

    acceptor        [R]
    classifier      [R]
    host            [R]
    num_processors  [R]
    port            [R]
    throttle        [R]
    timeout         [R]
    workers         [R]
Creates a working server on host:port (strange things happen if port isn't a Number). Use HttpServer::run to start the server and HttpServer.acceptor.join to join the thread that's processing incoming requests on the socket.

The optional num_processors argument is the maximum number of concurrent processors to accept; anything over this limit is closed immediately to maintain server processing performance. This may seem harsh, but it is the most efficient way to deal with overload. Other schemes involve still parsing the client's request, which defeats the point of an overload handling system.

The throttle parameter is a sleep timeout (in hundredths of a second) placed between socket.accept calls in order to give the server a cheap throttle time. It defaults to 0; when it is 0, the sleep is skipped entirely.
    # File lib/mongrel.rb, line 90
    def initialize(host, port, num_processors=950, throttle=0, timeout=60)

      tries = 0
      @socket = TCPServer.new(host, port)

      @classifier = URIClassifier.new
      @host = host
      @port = port
      @workers = ThreadGroup.new
      @throttle = throttle / 100.0
      @num_processors = num_processors
      @timeout = timeout
    end
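The throttle / 100.0 conversion is worth spelling out: callers supply hundredths of a second, but Kernel#sleep takes seconds. A minimal sketch (throttle_in_seconds is a hypothetical helper name, not part of Mongrel's API):

```ruby
# throttle is given in hundredths of a second; sleep wants seconds,
# hence the / 100.0 in the constructor. A throttle of 10 means the
# accept loop pauses a tenth of a second between connections.
def throttle_in_seconds(hundredths)
  hundredths / 100.0
end

throttle_in_seconds(10)   # => 0.1
throttle_in_seconds(0)    # => 0.0 (and Mongrel skips the sleep entirely)
```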
    # File lib/mongrel.rb, line 240
    def configure_socket_options
      case RUBY_PLATFORM
      when /linux/
        # 9 is currently TCP_DEFER_ACCEPT
        $tcp_defer_accept_opts = [Socket::SOL_TCP, 9, 1]
        $tcp_cork_opts = [Socket::SOL_TCP, 3, 1]
      when /freebsd(([1-4]\..{1,2})|5\.[0-4])/
        # Do nothing, just closing a bug when freebsd <= 5.4
      when /freebsd/
        # Use the HTTP accept filter if available.
        # The struct made by pack() is defined in /usr/include/sys/socket.h as accept_filter_arg
        unless `/sbin/sysctl -nq net.inet.accf.http`.empty?
          $tcp_defer_accept_opts = [Socket::SOL_SOCKET, Socket::SO_ACCEPTFILTER, ['httpready', nil].pack('a16a240')]
        end
      end
    end
Performs a wait on all the currently running threads and kills any that take too long. It waits up to @timeout seconds, which can be set in .initialize or via mongrel_rails. The @throttle setting extends this waiting period by that much longer.
    # File lib/mongrel.rb, line 233
    def graceful_shutdown
      while reap_dead_workers("shutdown") > 0
        STDERR.puts "Waiting for #{@workers.list.length} requests to finish, could take #{@timeout + @throttle} seconds."
        sleep @timeout / 10
      end
    end
Does the majority of the IO processing. It has been written in Ruby using about 7 different IO processing strategies, and no matter how it's done the performance just does not improve. It is currently carefully constructed to make sure that it gets the best possible performance, but anyone who thinks they can make it faster is more than welcome to take a crack at it.
    # File lib/mongrel.rb, line 109
    def process_client(client)
      begin
        parser = HttpParser.new
        params = HttpParams.new
        request = nil
        data = client.readpartial(Const::CHUNK_SIZE)
        nparsed = 0

        # Assumption: nparsed will always be less since data will get filled with more
        # after each parsing. If it doesn't get more then there was a problem
        # with the read operation on the client socket. Effect is to stop processing when the
        # socket can't fill the buffer for further parsing.
        while nparsed < data.length
          nparsed = parser.execute(params, data, nparsed)

          if parser.finished?
            if not params[Const::REQUEST_PATH]
              # it might be a dumbass full host request header
              uri = URI.parse(params[Const::REQUEST_URI])
              params[Const::REQUEST_PATH] = uri.path
            end

            raise "No REQUEST PATH" if not params[Const::REQUEST_PATH]

            script_name, path_info, handlers = @classifier.resolve(params[Const::REQUEST_PATH])

            if handlers
              params[Const::PATH_INFO] = path_info
              params[Const::SCRIPT_NAME] = script_name

              # From http://www.ietf.org/rfc/rfc3875 :
              # "Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST
              # meta-variables (see sections 4.1.8 and 4.1.9) may not identify the
              # ultimate source of the request. They identify the client for the
              # immediate request to the server; that client may be a proxy, gateway,
              # or other intermediary acting on behalf of the actual source client."
              params[Const::REMOTE_ADDR] = client.peeraddr.last

              # select handlers that want more detailed request notification
              notifiers = handlers.select { |h| h.request_notify }
              request = HttpRequest.new(params, client, notifiers)

              # in the case of large file uploads the user could close the socket, so skip those requests
              break if request.body == nil # nil signals from HttpRequest::initialize that the request was aborted

              # request is good so far, continue processing the response
              response = HttpResponse.new(client)

              # Process each handler in registered order until we run out or one finalizes the response.
              handlers.each do |handler|
                handler.process(request, response)
                break if response.done or client.closed?
              end

              # And finally, if nobody closed the response off, we finalize it.
              unless response.done or client.closed?
                response.finished
              end
            else
              # Didn't find it, return a stock 404 response.
              client.write(Const::ERROR_404_RESPONSE)
            end

            break # done
          else
            # Parser is not done, queue up more data to read and continue parsing
            chunk = client.readpartial(Const::CHUNK_SIZE)
            break if !chunk or chunk.length == 0 # read failed, stop processing

            data << chunk
            if data.length >= Const::MAX_HEADER
              raise HttpParserError.new("HEADER is longer than allowed, aborting client early.")
            end
          end
        end
      rescue EOFError, Errno::ECONNRESET, Errno::EPIPE, Errno::EINVAL, Errno::EBADF
        client.close rescue nil
      rescue HttpParserError => e
        STDERR.puts "#{Time.now}: HTTP parse error, malformed request (#{params[Const::HTTP_X_FORWARDED_FOR] || client.peeraddr.last}): #{e.inspect}"
        STDERR.puts "#{Time.now}: REQUEST DATA: #{data.inspect}\n---\nPARAMS: #{params.inspect}\n---\n"
      rescue Errno::EMFILE
        reap_dead_workers('too many files')
      rescue Object => e
        STDERR.puts "#{Time.now}: Read error: #{e.inspect}"
        STDERR.puts e.backtrace.join("\n")
      ensure
        begin
          client.close
        rescue IOError
          # Already closed
        rescue Object => e
          STDERR.puts "#{Time.now}: Client error: #{e.inspect}"
          STDERR.puts e.backtrace.join("\n")
        end
        request.body.delete if request and request.body.class == Tempfile
      end
    end
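The handler dispatch in the middle of process_client can be sketched in isolation: handlers run in registration order until one marks the response done. The Response struct and handler names below are hypothetical stand-ins for Mongrel's HttpResponse and HttpHandler objects, not the real API:

```ruby
# Minimal sketch of the handler chain: each handler runs in registration
# order, and processing stops as soon as one marks the response done.
Response = Struct.new(:body, :done)

logger  = ->(req, res) { res.body << "logged:#{req} " }                 # passes through
app     = ->(req, res) { res.body << "served:#{req}"; res.done = true } # finalizes
ignored = ->(req, res) { res.body << " never-runs" }                    # skipped once done

res = Response.new(+"", false)
[logger, app, ignored].each do |handler|
  handler.call("/stuff", res)
  break if res.done
end

res.body  # => "logged:/stuff served:/stuff"
```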
Used internally to kill off any worker threads that have taken too long to complete processing. Only called if there are too many processors currently servicing. It returns the count of workers still active after the reap is done. It only runs if there are workers to reap.
    # File lib/mongrel.rb, line 211
    def reap_dead_workers(reason='unknown')
      if @workers.list.length > 0
        STDERR.puts "#{Time.now}: Reaping #{@workers.list.length} threads for slow workers because of '#{reason}'"
        error_msg = "Mongrel timed out this thread: #{reason}"
        mark = Time.now
        @workers.list.each do |worker|
          worker[:started_on] = Time.now if not worker[:started_on]

          if mark - worker[:started_on] > @timeout + @throttle
            STDERR.puts "Thread #{worker.inspect} is too old, killing."
            worker.raise(TimeoutError.new(error_msg))
          end
        end
      end

      return @workers.list.length
    end
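The reaping mechanism relies on Thread#raise to interrupt a stuck worker. A self-contained sketch, using a plain ThreadGroup, a RuntimeError in place of Mongrel's TimeoutError, and arbitrary demo timings:

```ruby
# Workers record a :started_on timestamp; any worker older than the
# allowed window gets an exception raised into it, interrupting whatever
# blocking call it is stuck in.
workers = ThreadGroup.new

stuck = Thread.new do
  begin
    sleep  # never finishes on its own, like a hung request
  rescue RuntimeError => e
    e.message
  end
end
stuck[:started_on] = Time.now - 10  # pretend it started 10 seconds ago
workers.add(stuck)

Thread.pass until stuck.status == "sleep"  # wait until it is actually blocked

timeout = 5
workers.list.each do |worker|
  if Time.now - worker[:started_on] > timeout
    worker.raise(RuntimeError.new("Mongrel timed out this thread: demo"))
  end
end

stuck.value  # => "Mongrel timed out this thread: demo"
```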
Registers a handler with the internal URIClassifier. When the URI is found in the prefix of a request, your handler's HttpHandler::process method is called. See Mongrel::URIClassifier#register for more information.

If you set in_front=true, the passed-in handler is put at the front of the list for that particular URI. Otherwise it's placed at the end of the list.
    # File lib/mongrel.rb, line 320
    def register(uri, handler, in_front=false)
      begin
        @classifier.register(uri, [handler])
      rescue URIClassifier::RegistrationError
        handlers = @classifier.resolve(uri)[2]
        method_name = in_front ? 'unshift' : 'push'
        handlers.send(method_name, handler)
      end
      handler.listener = self
    end
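The unshift/push choice can be seen in isolation with a plain Array standing in for the handler list (the handler names here are made up for illustration):

```ruby
# in_front decides whether a newly registered handler goes to the front
# (unshift) or the back (push) of the existing handler list for a URI.
handlers = ["auth_handler", "app_handler"]

in_front = true
handlers.send(in_front ? 'unshift' : 'push', "logging_handler")

handlers  # => ["logging_handler", "auth_handler", "app_handler"]
```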
Runs the server. It returns the thread used so you can "join" it. You can also access the HttpServer::acceptor attribute to get the thread later.
    # File lib/mongrel.rb, line 259
    def run
      BasicSocket.do_not_reverse_lookup = true

      configure_socket_options

      if defined?($tcp_defer_accept_opts) and $tcp_defer_accept_opts
        @socket.setsockopt(*$tcp_defer_accept_opts) rescue nil
      end

      @acceptor = Thread.new do
        begin
          while true
            begin
              client = @socket.accept

              if defined?($tcp_cork_opts) and $tcp_cork_opts
                client.setsockopt(*$tcp_cork_opts) rescue nil
              end

              worker_list = @workers.list

              if worker_list.length >= @num_processors
                STDERR.puts "Server overloaded with #{worker_list.length} processors (#@num_processors max). Dropping connection."
                client.close rescue nil
                reap_dead_workers("max processors")
              else
                thread = Thread.new(client) { |c| process_client(c) }
                thread[:started_on] = Time.now
                @workers.add(thread)

                sleep @throttle if @throttle > 0
              end
            rescue StopServer
              break
            rescue Errno::EMFILE
              reap_dead_workers("too many open files")
              sleep 0.5
            rescue Errno::ECONNABORTED
              # client closed the socket even before accept
              client.close rescue nil
            rescue Object => e
              STDERR.puts "#{Time.now}: Unhandled listen loop exception #{e.inspect}."
              STDERR.puts e.backtrace.join("\n")
            end
          end
          graceful_shutdown
        ensure
          @socket.close
          # STDERR.puts "#{Time.now}: Closed socket."
        end
      end

      return @acceptor
    end
Stops the acceptor thread and then causes the worker threads to finish off the request queue before finally exiting.
    # File lib/mongrel.rb, line 340
    def stop(synchronous=false)
      @acceptor.raise(StopServer.new)

      if synchronous
        sleep(0.5) while @acceptor.alive?
      end
    end
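The mechanism here is raising an exception into the acceptor thread, which breaks it out of its accept loop. A self-contained sketch, where StopServer is a local stand-in for Mongrel's exception class and sleep stands in for the blocking @socket.accept:

```ruby
# Raising into a blocked thread interrupts its blocking call; the rescue
# inside the thread turns that into an orderly exit from the loop.
class StopServer < StandardError; end

acceptor = Thread.new do
  begin
    loop { sleep }   # stands in for the blocking @socket.accept loop
  rescue StopServer
    :graceful_shutdown
  end
end

Thread.pass until acceptor.status == "sleep"  # wait until it is blocked
acceptor.raise(StopServer.new)
acceptor.value  # => :graceful_shutdown (value joins the thread first)
```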
Removes any handlers registered at the given URI. See Mongrel::URIClassifier#unregister for more information. Remember this removes them all, so the entire processing chain for that URI goes away.
    # File lib/mongrel.rb, line 334
    def unregister(uri)
      @classifier.unregister(uri)
    end