Date: Sun, 12 Jan 2003 15:39:13 -0800
From: Terry Lambert <tlambert2@mindspring.com>
To: Brad Knowles <brad.knowles@skynet.be>
Cc: Doug Barton <DougB@FreeBSD.org>, freebsd-chat@FreeBSD.org
Subject: Re: Need advice on PHP and MySQL books
Message-ID: <3E21FCA1.C4C63219@mindspring.com>
References: <20030110234309.R12065@2-234-22-23.pyvrag.nggov.pbz> <3E1FF12B.5390D978@mindspring.com> <20030111144619.X22424@2-234-22-23.pyvrag.nggov.pbz> <3E20D1D6.E9FC60B1@mindspring.com> <a05200f23ba474c2d8565@[12.27.220.113]>

Brad Knowles wrote:
> However, in terms of the work of this sort that is publicly
> available, the most advanced project I know of is bind-dlz (see
> <http://www.nlnet.nl/project/bind-dlz/>).

I was unaware of this project; it looks like there is a single
developer (common for SourceForge projects, unfortunately), but
the last patch was in December, so I guess it's still somewhat
active.

> > The next generation would have been able to update in realtime,
> > without needing to kick servers in the head, or to explicitly
> > derive the data.
>
> Really?  I'd love to hear more about that.

Basically, it would have used the DNSUPDAT mechanism to push data
into the standard bind.  Following a discussion on namedroppers
(the DNSEXT mailing list), I modified the bind code to permit
creation of zones in a primary, via DNSUPDAT.

The specific reason for disallowing creation of zones via DNSUPDAT
was lack of a security association.  But there is really not a
model issue -- it's perfectly reasonable to disable the check that
prevents this in the source code, if the zone is being created as
a primary.

Creating zones in a secondary is much harder... doing that would
require a security association.  Also, the most common method of
operation for this purpose is "stealth primary", so that the
secondaries (which appear to the world as primaries) are the only
machines banging on your main DNS server.  This is desirable for
(hopefully) obvious reasons.

> > It would be very easy to do this with an LDAP directory, I think;
> > much harder to do with a relational database, unless you establish
> > record references internal to the database which were, themselves,
> > hierarchical.
>
> BIND 9 already provides hooks for back-end databases, and there
> are already defined profiles for working with PostgreSQL, and a
> Berkeley DB implementation should be nearly trivial.
> The problem is that any alternative back-end database you could
> choose would almost certainly be much, much slower than the built-in
> in-memory red/black tree that is already used within BIND.

Precisely.  This is a result of the "impedance mismatch" that I
noted, which comes from trying to map a hierarchical relationship
into a relational database, which assumes flat relations.  This is
why a hierarchical database (e.g. LDAP), or even a file system,
works better as a data source.  It's also why DNSUPDAT is preferred.

The primary harm these days is actually "cached data"; I've been
considering for several years now whether or not to write a
canonical "Cached Data Considered Harmful".  I think I can make a
good case.

In any case, with explicit update, the caching problem is solved,
as is the non-native performance problem.  The same would be true
for DNS served out of an already hierarchical data store, like
LDAP, because it has the same issues to resolve that the red/black
tree solves in DNS.  Rather than solving this twice (and ending up
with a very fast LDAP server -- but a difficult task 8-)), my take
was to update the DNS data in situ.

The biggest issue was the creation of caching files, so that the
DNS data remained cached to disk in a way that allowed a recovery
after a server reboot (again, zone creation is the problem, since a
zone cache file, once it exists, will be maintained properly with
no additional work).  So really, this was about two modifications
to bind.

> The only question should be whether or not the replacement
> back-end database gives you enough other advantages to make up for
> the loss in performance.

IMO, "no".  The point of having high performance is to have a
stomach that's bigger than your eyes, so to speak: you want to be
able to handle as much load as it's possible to shove down all the
pipes to your server, and then add overage, so if you are DDoS'ed,
the limiting factor is your pipe, not your server.
Without this, you would be required to extensively modify your
server software; very little DNS software, no matter who writes it,
has the ability to shed load very well.

The (relatively) recent attack on the root servers had no effect
only because it was not sustained until more than 50% of the TTL
had passed, at which point the overall degradation would have been
exponential; as it was, less than 1% of the Internet even knew
there was an attack in progress.

If you added a really long latency, effectively increasing the pool
retention time for requests, then you would decrease the total
number of requests you can handle in a given period of time before
the servers become vulnerable by becoming more limiting than the
pipe size.  At that point, you've introduced a serious
vulnerability to the system that hasn't been addressed, and can't
be addressed merely by throwing more resources at it.

> > IMO, you are much better off just sending notifications of changes
> > to the DNS server (via DNSUPDAT), and modifying the DNS server to
> > permit zone creation in the primary (minimally).
>
> Thinking about this a bit more, if you had multiple primaries
> pulling data off the same PostgreSQL dictionary, and/or you had your
> also-notify setting configured to hit the other "primary" servers,
> then you wouldn't have to worry about the synchronization between the
> "primary" and the "secondaries", because all machines would be
> primary.

This is really wrong.  The problem with this model is that the
communications channel between the primaries is vulnerable.

As I stated on namedroppers years ago, when DJB first argued this
approach, and that inter-server synchronization was "left as an
exercise for the student" (his suggestion was rsync or some other
protocol for synchronization, over SSH, etc.), the problem is that
DNS data should always flow over the DNS protocol.
There are a lot of reasons for this, but the top ones, to my mind,
are standardization of installations, and the inability to override
corporate policy and poke holes in firewalls.

The main secondary reason is, I think, that you don't want to open
up a communications channel from an insecure server (the DNS server
is public-facing) to a secure zone server (the database, whatever
it is, is vulnerable).

Basically, this argues that data should be pushed from the secure
area to the insecure area, rather than pulled by the insecure area
from the secure one... unless that pull occurs over an already
established secure tunnel.

Thus my recommendation would be to replicate a DNS server's
contents in the customer-facing server in the insecure area from
another, non-customer-facing, server in the secure area.

Basically, this boils down to an implementation of:

    database server      primary DNS      secondary DNS
            ------------>          <-----------           contact
            ------------>          ----------->           data flow
                      |                  |                barriers
        green               yellow              red       "IBM-ese"

Most often, your database is actually your customer database.
Effectively, this means that the database contains data which has
been normalized for business processes, rather than for efficiency
of serving DNS requests, even if you've magically addressed all the
security concerns.

One of the biggest problems IBM has, actually, from an insider's
perspective, is that it only ever expends effort on systems that
are customer-facing and/or customer-visible.  Everything else is
held together with spit and baling wire.  It doesn't matter how
hard it is for an IBM employee to do their job, so long as it's
easy for the customer.
This actually results in the evolution of systems with an extremely
high marginal cost on a per incident basis, for customer
interactions, with the products IBM offers being limited to either
front-loaded costs, or ongoing consulting costs, at margins on the
order of 40%; they could actually cut out 20% of overhead
(increasing profits by 20%) by working *on* their business.

In any case, what this came down to in our case was a lot of manual
effort on the part of an employee, when it came to account setup
and teardown.  Effectively, the systems ran independently, like you
were suggesting they might, such that the data could be normalized
for how it's going to be used, instead of for a business process.

This will not be the common case, out in the world, where you're
not able to charge another 20% for the name "IBM", or you can
charge for it, but you want that 20% to go into profits, instead of
into operations costs.

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message
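
As a rough illustration of the push model argued for above (sending a
change to the primary over the DNS protocol itself instead of having the
server pull from a database), the sketch below builds the wire format of
a single dynamic update message. It assumes that "DNSUPDAT" refers to
DNS dynamic update as specified in RFC 2136; the message does not show
the modifications to bind described in the message. The function names
`encode_name` and `build_update_add_a`, and the zone, host and address
values, are hypothetical and chosen only for the example. The message is
neither signed nor sent; a real deployment would still need the security
association discussed above (TSIG, for instance).

```python
import ipaddress
import struct

OPCODE_UPDATE = 5  # UPDATE opcode from RFC 2136
TYPE_A = 1
TYPE_SOA = 6
CLASS_IN = 1


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) < 64:
            raise ValueError(f"bad label: {label!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_update_add_a(zone: str, host: str, ipv4: str,
                       ttl: int = 300, msg_id: int = 0x1234) -> bytes:
    """Build an unsigned UPDATE message that adds one A record (hypothetical helper)."""
    # Header: ID, flags (opcode sits in bits 11-14), then the four counts:
    # ZOCOUNT=1 (zone), PRCOUNT=0 (prerequisites), UPCOUNT=1 (updates), ADCOUNT=0.
    header = struct.pack("!HHHHHH", msg_id, OPCODE_UPDATE << 11, 1, 0, 1, 0)
    # Zone section: the zone being updated; its type is always SOA.
    zone_section = encode_name(zone) + struct.pack("!HH", TYPE_SOA, CLASS_IN)
    # Update section: a resource record to add, using the zone's class.
    rdata = ipaddress.IPv4Address(ipv4).packed
    update = (encode_name(host)
              + struct.pack("!HHIH", TYPE_A, CLASS_IN, ttl, len(rdata))
              + rdata)
    return header + zone_section + update


if __name__ == "__main__":
    msg = build_update_add_a("example.com", "www.example.com", "192.0.2.1")
    print(len(msg), msg.hex())
```

The provisioning side (the database in the secure area) would hand a
message like this to the primary, so the only channel that crosses the
barrier is ordinary DNS traffic flowing from the secure side outward.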