Date: Mon, 20 Aug 2007 19:04:20 GMT
From: Matus Harvan <mharvan@FreeBSD.org>
To: Perforce Change Reviews <perforce@FreeBSD.org>
Subject: PERFORCE change 125446 for review
Message-ID: <200708201904.l7KJ4KTp080579@repoman.freebsd.org>
http://perforce.freebsd.org/chv.cgi?CH=125446

Change 125446 by mharvan@mharvan_bike-planet on 2007/08/20 19:04:05

	Updated design.txt

Affected files ...

.. //depot/projects/soc2007/mharvan-mtund/mtund.doc/design.txt#4 edit

Differences ...

==== //depot/projects/soc2007/mharvan-mtund/mtund.doc/design.txt#4 (text+ko) ====

@@ -1,175 +1,266 @@
-This document describes the intended design and main implementation
-details of mtund. This plan is work in progress and changes are
-expected to reflect the outcomes of (mostly email) discussions with my
-mentors.
+	Magic Tunnel Daemon Design and Implementation Details
+
+IP can easily be tunneled over a plethora of network protocols at
+various layers, such as IP, ICMP, UDP, TCP, DNS, HTTP, SSH and many
+others. While a direct connection may not always be possible due to a
+firewall, the IP packets could be encapsulated as payload in other
+protocols, which would get through. However, each such encapsulation
+requires the setup of a different program, and the user has to manually
+probe different encapsulations to find out which of them works in a
+given environment.
+
+The Magic Tunnel Daemon (mtund) uses plugins for different
+encapsulations. It automagically selects a working encapsulation in
+each environment and can fail over to another one if the environment
+changes. This document describes the design and main implementation
+details of mtund. After an overview of the daemon, various details of
+the implementation are discussed. Afterwards, the implemented plugins
+are described. The document concludes with things remaining to be done
+in the future.
+
+
+MTUND - GENERAL OVERVIEW
+The daemon and plugins are written in plain C. The daemon can operate
+in two modes, as a client or as a server. A server accepts connections
+from multiple clients. It uses a tun(4) interface, one per client, as
+the tunnel endpoint. Using private IP addresses and NAT on the tun(4)
+interfaces seems to be a viable solution.
+Traffic from the client can
+be encapsulated in various network protocols. Each such encapsulation
+is implemented by a plugin. Plugins are loaded at run-time with
+dlopen(3). For multiplexing between the tun interfaces and the sockets
+used by the plugins, libevent is used. Plugins register their
+events directly by calling event_add(3), and event_dispatch(3) is
+called from main() in the daemon. Libevent can then also call handler
+functions from the plugins.
+
+Each plugin has to implement a set of functions, defined in
+plugin.h. A plugin is represented in the daemon by struct plugin. This
+struct contains pointers to the plugin functions, allowing the daemon
+to interact with the plugin. The functions available to the daemon are
+described in more detail in plugin.h. In client mode, the plugin tries
+to connect to the server after calling plugin_initialize(). The daemon
+can then use plugin_send() to send data through the tunnel set up by
+the plugin. For the other direction of data flow, the plugin registers
+events for watching its sockets for incoming traffic. Upon receiving
+data over the encapsulation, the plugin decapsulates it and passes it
+to the daemon by calling the process_data_from_plugin() function. The
+plugin also provides a function to deinitialize itself,
+plugin_deinitialize(). In server mode, the plugin_initialize()
+function sets up a socket listening for incoming
+connections. Otherwise, the functionality is similar to the client
+mode.
+
+In server mode, the server initializes all plugins and waits for
+incoming connections. The client initializes plugins serially, i.e.,
+it tries the next one if the current one fails. After initialization,
+it starts sending probes (pings) to the server. If the server receives
+them, it generates a reply. Upon reception of a ping reply, the client
+requests a client ID from the server. If granted, the server and the
+client configure tun interfaces and start exchanging traffic using the
+plugin.
+
+PROBING AND KEEP-ALIVE PINGS
+The ping probes used for initial probing are also used as regular
+keep-alive traffic and to check that unreliable connections have not
+failed. If a certain number of successive ping requests is left
+without a reply (PING_FAIL_PLUGIN), the encapsulation is considered
+malfunctioning and the client tries to find another working
+encapsulation. The server sends such probes to the client as
+well. If a certain number of pings is left unanswered
+(PING_FAIL_COMPLETE), then it is assumed that the client has
+disconnected, and its tun interface on the server is deinitialized. If
+a working encapsulation is found, the client should send data or a
+special message with DISPATCH_SELECT_PLUGIN to notify the server of
+the plugin failover, i.e., to make sure the server uses the new plugin
+for talking to the client.
+
+MULTI-USER SUPPORT
+The server supports concurrent sessions from multiple clients. In
+order to tunnel traffic for a client, the client first has to
+associate with the server. This is done by requesting a client ID. If
+the server already has too many clients, it simply does not answer
+the request for an ID. If a client does not answer PING_FAIL_COMPLETE
+pings, it is considered disconnected and its client ID is reclaimed by
+the server. The client ID is prepended before the payload in traffic
+from the client to the server so that the server can determine from
+which client the traffic is coming. Although for TCP the file
+descriptor could be used, for DNS or ICMP the client ID is needed. For
+traffic from the server to the client no client ID is prepended, as
+each client talks to only one server at a time.
+
+The plugins also have to support multiple connections, as several
+users can be using the same encapsulation to communicate with the
+server. A plugin keeps track of its connections on its own. The
+interface to the daemon uses the client ID.
+The previous design passed the
+client ID and a connection flag to the daemon as arguments to
+the process_data_from_plugin() function. However, this was abandoned
+in favour of a separate function, plugin_conn_map(), implemented
+by the plugin. This function is called from process_data_from_plugin()
+before calling plugin_send(). The client ID is passed to this
+function, and it maps the last incoming connection to the client
+ID. Note that there is no race condition, as the plugin_conn_map()
+function is called before process_data_from_plugin() returns and hence
+before plugin_receive() returns. Hence, no other event from libevent
+can preempt the call. In addition, a flag is passed indicating whether
+* the payload was garbage and hence the connection should be
+  discarded,
+* the payload was a ping and the connection is temporary, or
+* the payload was data from an associated client and hence the
+  connection is permanent.
+
+A temporary connection can be used by the daemon until
+process_data_from_plugin() returns. It is used for replying to
+pings. For the TCP plugin, the socket is closed after a timeout, while
+for non-connection-oriented encapsulations the connection metadata can
+be removed earlier. The metadata of a permanent connection should be
+kept around until plugin_conn_close() is called by the daemon. A
+permanent connection is used by associated clients.
+
+The functions use the client ID to identify the connection, and the
+mapping from the client ID to the particular connection is the
+responsibility of the plugin. Note that it is not desirable for the
+plugins to inspect the payload, as this is done centrally by the
+daemon. The scheme described above has been designed to this end.
+
+DISPATCH
+To distinguish between tunneled traffic, pings and other types of
+control traffic, a dispatch value indicating the type of payload is
+prepended before the payload. It comes after the client ID for traffic
+originated by the client.
+The various dispatch values are defined in
+mtund.h.

-TODO:
-o man page
-o port skeleton

+If a plugin wishes to exchange traffic directly with another plugin,
+the payload still has to pass via the daemon so that the client ID
+gets prepended. In this way the plugins know to which
+client/connection the traffic relates. This is useful for plugins
+probing which types of traffic pass through and for polling plugins to
+send empty requests.

-TODO this document
-o plugin_send() return values - prevent filling the stack
-o fragmenation, fragment reassembly, framing

+FRAGMENTATION, FRAGMENT REASSEMBLY
+Some plugins may offer only a lower MTU. Therefore, fragmentation and
+fragment reassembly have been implemented in the daemon. They are used
+if a packet larger than the indicated plugin MTU should be sent. For
+sending, a fragment header, struct frag_hdr defined in mtund.c, is
+prepended before each fragment. As the plugin indicates how much data
+was consumed, it is also possible to support plugins where the MTU
+varies from packet to packet. A similar problem is solved for TCP,
+where framing is needed to extract complete messages from the byte
+stream.

-TUN(4) INTERFACE (OR SOMETHING ELSE?)
-My original idea, as described in the proposal is to use the tun(4)
-interface. It gives a virtual network interface (point-to-point).
-Whenever a packet comes to it, it is passed into the userspace and one
-can write into it packets from the userspace to produce outgoing
-packets. Packets start with the IP header. Hence, using it is rather
-painless. On the other hand, it is a proper network interface so that
-one can assign IP addresses to it and add entries into the routing table
-for that interface. An additional benefit is that it is implemented on
-several different OSes, allowing for easy portability. Also, I already
-know how to use it and hence getting started with it is easy for me.
-For using the tun interface, my idea was to set up the routing table in
-a way that all traffic would be routed via this interface except for
-traffic to the tunnel endpoint, i.e. don't try to send the encapsulated
-traffic via the tunnel. The tunnel endpoint could then do NAT or
-routing, depending on how many public IP addresses it would have
-available. This approach would basically be at the IP layer, so one
-could not easily say that some ports should go via the tunnel while
-others could not. Or maybe this would be doable with one of the
-firewalls available on FreeBSD?
+The reassembly then happens on the receiving end. If not all fragments
+are received within a given time, the reassembly buffer is
+discarded. After reassembling the complete packet, it is passed to the
+tun interface.

-One problem with IPv6 and tun(4) is that by default it tags packets as
-being IPv4. However, at the moment IPv4 is the main goal and IPv6
-support is left for later. To have IPv6 working over it, one has to
-disable the the IFF_NO_PI flag and prefix each packet with a 4-byte
-struct specifying the type of traffic (ETH_P_IPV6). These are the flag
-constants on linux, as I am currently writing code on linux. On
-FreeBSD, the flag seems to be TUNSLMODE. Another approach would be to
-change the kernel code to look at the first byte of the packet (the
-version field for both IPv4 and IPv6) and determine from that if it's
-IPv4 or IPv6. On FreeBSD 6.1, the hardcoded value is set on line 865
-in file /sys/net/if_tun.c.
+Note that the fragmentation and fragment reassembly happen in the
+daemon rather than in the plugins, and hence any plugin can
+automatically use them.

-Max has mentioned netgraph as an alternative for tun. However, it has
-been decided that tun(4) would be used rather than netgraph.
+DIRECT/POLLING PLUGINS
+In general, there are two types of plugins: direct plugins and polling
+plugins.
-MULTIPLEXING BETWEEN DIFFERENT FILE DESCRIPTORS
-The easiest way that came to my mind was using select(2). The plugins
-of course have to register their file descriptors to be watched. The
-code will be rewritten to use libevent. This will allow to easily use
-timeouts for checking the connectivity of plugins or do other
-timeout-related things.
+For direct plugins, both the client and the server can send a packet
+whenever they wish. An example would be the TCP plugin using a TCP
+socket or a UDP plugin using a UDP socket.

-Another alternative would be to fork a process for each plugin or to use
-threads. It may give more flexibility to the plugins, but at the moment
-I do not see a useful advantage in this approach. The current design and
-the functions from the main daemon avaialable to the plugins pretty much
-dictate the plugin design and to some extent capabilities.
+The polling plugins use a request-reply scheme, such as ICMP echo
+request/reply or DNS query/answer. While the client can initiate a
+request at any time to send data, the server can only tunnel data in
+responses. These responses can only originate in response to requests,
+and hence the server cannot send packets whenever it wants. To tackle
+this problem, the plugin queues one packet at a time and sends it when
+a response is generated. Actually, there are two one-packet
+queues. One is for normal data, such as packets from the tun device
+or ping requests. The other, called urgent, is for replies generated
+in response to received traffic. These are ping replies and client ID
+offers, and they have priority over normal data. To distinguish
+between the different types of plugins, the return value of the
+plugin_is_ready_to_send() function indicates whether the data would be
+sent immediately, queued, or whether the queue is full.

-CHECKING UNRELIABLE CONNECTIONS
-One thing that is missing in the code I wrote for the application is
-checking whether a udp connection (still) works.
-For tcp, one can use
-the return value of the write(2) call. For unreliable protocols like
-udp, icmp or ip, the read call does not indicate that something went
-wrong. In order to detect this problem I was thinking of exchanging some
-regular keep-alive traffic, i.e. regularly sending an echo request on
-the application level and expecting an echo reply within a time
-interval. If that fails N times, the connection is declared
-malfunctioning.
+If a polling plugin is used, the monitoring of the tun device by
+libevent is disabled. When a response should be sent, the plugin uses
+the function report_plugin() with the REPORT_READY_TO_SEND flag to
+indicate that it can send a packet. The daemon then checks whether any
+fragments are pending. If none are, a read on the tun interface is
+attempted. Note that the queue is still needed to originate ping
+requests on the server, as the daemon does not queue them, but expects
+the plugin to do so. Using the "urgent" queue for replies is just a
+technical issue to simplify the plugins.

-For the implementation, things should get easier with the use of
-libevent. This would easily allow signalling timeouts. One issue is
-that I might receive the timer signal at an inappropriate time and
-would have to properly protect shared variables. This has to be
-checked. In general, synchronization and multi-threaded safety should
-be considered.
+Upon receiving a response, the plugin on the client immediately
+generates a new request. If no data is available, it sends an empty
+request. The reason is that the server could have more data queued and
+is waiting for another response to send it. In addition, the client
+sends regular requests, possibly empty, to decrease the latency in
+case traffic becomes available on the tun device on the server, but
+the client has no data to send at the moment.
+
+The report_plugin() function is also used to indicate various errors
+and failures of a plugin.

-For this part, having plugins written as threads/processes might give
-them more flexibility. However, I do not think that such flexibility is
-needed. At least not at the moment.

-REDIRECTING ONLY CERTAIN PORTS
-We might want to allow for redirecting only certain ports via the
-tunnel while leaving others to go directly. One way to achieve this
-would be to use pf anchors. Examples can be found in usr.sbin/authpf
-in base or ftp/ftp-proxy in ports. Currently, this is consider an
-optional feature and hence left for later,
+PLUGINS

-MULTI USER SUPPORT ON THE SERVER
-The server shall support multiple users concurrently. This is an
-important feature and the design has to be changed appropriately to
-accomodate for it.
+UDP PLUGIN
+The UDP plugin is a direct plugin using a UDP socket for the
+encapsulation.

-At the moment, it is unclear how the multi user support could be
-achieved, but some session management, possibly with some
-atuhentication/handshake out-of-band. Some encapsulations such as UDP
-and TCP offer port and addresses for identifiyng the sessions/clients
-while for others (ICMP) the session may have to be signalled
-in-band. In particular, with the former a separate file descriptor
-represents each client, making things easier.
+UDP CATCHALL PLUGIN
+The UDP CATCHALL plugin uses a raw IP socket to receive unclaimed UDP
+traffic, i.e., it listens on all unused ports. A kernel patch is
+provided to allow this.

-Maybe a separate tun(4) interface could be created for each user on
-the server and the burden of multiplexing between them and assigning
-traffic to the right tun interface (and hence mtund instance) would be
-the responsibility of the kernel by using entries in the routing table
-or tracking connections when natting.
+TCP PLUGIN
+The TCP plugin is a direct plugin using a TCP socket for the
+encapsulation.
+In addition, a patch for the kernel is provided to
+allow a TCP socket to listen on all unused ports.

-AUTODETECTION
-I'm not sure what/how could/should be autodetected. I guess some proxy
-information could be taken from environment variables,
-firefox/opera/konqueror configuration. Default gateway and other routing
-informaiton could be taken from the system's routing tables. Maybe some
-information about socks proxies could be taken from Dante's SOCKS config
-file. Possible name servers could be taken from /etc/resolv.conf to use
-for dns tunelling. Maybe the default gateway should be probed for
-offering dns, http proxying,...
+ICMP PLUGIN
+The ICMP plugin is a polling plugin using ICMP echo request/response
+exchanges.

-QUEUING
-The current implementation with blocking I/O does queuing of one
-packet. If this approach turns out to be problematic, different
-queuing strategies would have to be investigated.
+DNS PLUGIN
+The DNS plugin is a polling plugin using DNS queries/answers. For the
+DNS encoding/decoding, code from the iodine project is used.

-FRAGMENTATION
-What if the encapsulation provides a smaler MTU? This might be the
-case for DNS tunnelling. We should then probably fragment packets and
-reassemble fragments on the other end. For this, in-band signalling
-might be needed.
+THINGS LEFT TO DO:

-In additiona, for TCP we have to do STREAM <-> MSG dissection, for MSG
-based protocols we have to figure out MSS (probably without the help
-of ICMP) and fragment accordingly.
+HTTP PLUGIN
+Reading httptunnel sources is a good starting point.

-Max thinks the absolute minimum we should provide is a MTU of 1300
-(1280 + 20) which will allow to run a gif tunnel over the mtund tunnel
-without the need for IPv4 fragmentation for the gif tunnel.
+CONFIG FILE
+Currently, the config options are specified with #defines. A parser
+for the config needs to be written. lex/yacc is a good candidate
+here.
+The plugin-specific parts of the config file may be parsed by
+the plugins. This would allow the daemon to remain independent of the
+plugins.

 CRYPTO
 The easiest way to secure the tunnel would be to put IPSec on the tun
-interface. Other options would likely not be investigated, but
-nevertheless are descibed in this document.
+interface. However, this would not secure the control traffic. Putting
+a symmetric key onto both the client and the server, the traffic
+could be encrypted with blowfish, aes or another symmetric cipher.

-Offering basic encryption support should be easy. Putting a symmetric
-key onto both, the client and the server, encapsulated payload could
-be encrypted with blowfish, aes or another symmetric cipher.
+REDIRECTING ONLY CERTAIN PORTS
+We might want to allow for redirecting only certain ports via the
+tunnel while leaving others to go directly. One way to achieve this
+would be to use pf anchors. Examples can be found in usr.sbin/authpf
+in base or ftp/ftp-proxy in ports. Currently, this is considered an
+optional feature and hence left for later.

-Adding authentication might be harder and I'm not sure it's a high
-priority.
+MAN PAGE
+Depends on the config file parsing.

-CONFIG FILE
-What should be configurable? What should the config file then look
-like? The config file would be plain text file. It's format and
-contents will be determined later.
+PORT SKELETON
+Depends on the config file parsing.

-PROJECT SCHEDULE
-I have put a rough estimate of a schedule in the proposal:
-* core tunnel daemon, config file parsing, plugin interface - 2 weeks
-* probing/checking strategy for unreliable protocols (UDP, ICMP)
-  - 1 week
-* TCP, UDP plugins - 1 week
-* ICMP plugin - 1 week
-* HTTP plugin - 1 week
-* DNS plugin - 1 week
-* SSH plugin - 1 week
-* man pages, ports Makefile - 1 week
+MTU PROBING
+The ping mechanism in the plugin can be used to probe which maximal
+MTU passes the firewall.
-However, if things go well, most items in the schedule should be done
-faster. But at the moment, it seems hard to me to predict the amount of
-times the various items would take.
+
+ICMP PLUGIN PROBING
+The ICMP plugin should probe whether the ICMP request/response
+exchanges are needed or whether the firewall would pass through more
+responses per request,...