Wednesday, November 29, 2006

Web infrastructure 2.0

The definitive guide for robust websites and email

Since I work with web infrastructure, people often ask me for help when they run into trouble with their websites: problems with domain name transfers, email, website downtime and so on.

One of the main reasons for all these issues is that they entrust every component to ONE company. So if the webhosting company has downtime, they are also unable to get their email.

Most webhosting companies offer an all-inclusive package: domain name, email, lots of storage and traffic, and all kinds of features for dynamic content. This is a nice idea, but if you want a robust solution, my advice is: split things up! Use services that focus on ONE feature. They are best at that, and since it is their core product, they will do everything to keep it up and constantly improve it. And you stay free to change the provider for each component if you're no longer satisfied.


1. Use a DNS provider for your domain name

I really don't get it. I can't count how many times I have read about problems that come from people having their domain name with the same company that hosts their website, let alone the trouble and the unpredictable downtime window you face when you want to transfer a domain from one provider to another. I have tried many webhosting companies over the years. When the plan included a domain name, I chose one I don't care about (I call it a "hosting domain"). But I don't allow hosting companies to handle my actual domain name. There are several DNS providers on the market where you can register your domain name and keep full control over the DNS records, and then point the domain name to the webhost of your choice. More about that later.


2. Keep email separate from web servers

I used to smile at people who owned their own domain names but still used freemail (Yahoo, Gmail etc.) as their contact address. Now that I know how troublesome setting up email can be, I understand them better. I've seen a lot of systems where email is handled on the same server instance as the websites (bad idea), and some where incoming emails were written to MySQL databases (I passionately hate MySQL from an admin perspective) and people would eventually read them through crappy interfaces. Either scenario means you can't get email when your webhost is down.

Some DNS providers also offer email forwarding, so you could have all incoming email sent to Yahoo! Mail or Gmail, and profit from their spam / filter rules / folders / tags etc. Both providers allow you to choose an alternate sender address, so when you compose mail it will appear to come from your domain. Another option is pointing the MX records of your domain to a box that JUST handles email.

Google offers a hosted solution which works just like that, and it is free.


3. Buy a rock solid webhosting package or dedicated server (or more than one)

Choosing a good host is hard; choosing the perfect host is probably impossible. The quality of service changes constantly. I sometimes compare it to restaurants: an article in the paper praises the excellent cuisine of a small cafe, and quite soon the place is very different from how it was before, with more tables, delays, waitresses who can't handle the workload, and a manager who suddenly wants to try new concepts with the food they offer.

For me, good webhosting has an uptime over 99% and comes with shell access (which also gives you secure copy, so forget FTP) and a dedicated IP address, or at least a decent system for hosting domains that are registered somewhere else. If you have the money for a dedicated server, an IP address and virtual hosting are already included by design.

Since the domain is registered somewhere else (see 1.), you have full flexibility and can simply set an A record or a CNAME in the DNS to point to the webhosting. If you have a second webhosting package with another company, you can point your domain to that one when the first is down. The change can be made within five minutes and you don't need help from anyone.
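To make the A record / CNAME idea a bit more concrete, here is a small sketch that renders such records in BIND-style zone file syntax. The names example.com and host-a.example.net, the address 192.0.2.10 (a documentation address) and the TTL of 300 seconds are all assumptions for illustration; your DNS provider's control panel will have its own way of entering the same values.

```python
def zone_record(name, rtype, value, ttl=300):
    """Render one DNS record as a BIND-style zone file line."""
    return f"{name}\t{ttl}\tIN\t{rtype}\t{value}"


def records_for_site(domain, web_ip, alias_target=None):
    """Return the records that point a domain at a web host.

    domain, web_ip and alias_target are placeholders for illustration.
    """
    # The bare domain gets an A record with the webhost's IP address.
    lines = [zone_record(f"{domain}.", "A", web_ip)]
    if alias_target:
        # CNAME targets are written as fully qualified names (trailing dot).
        lines.append(zone_record(f"www.{domain}.", "CNAME", f"{alias_target}."))
    return lines


if __name__ == "__main__":
    for line in records_for_site("example.com", "192.0.2.10", "host-a.example.net"):
        print(line)
```

Resolvers cache a record for as long as its TTL says, so a short TTL like the one assumed here is what lets a switch to the second host take effect quickly.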

You're also free to point www1.yourdomain to host A and www2.yourdomain to host B and write a tiny loadbalancer at host C that knows which hosts are up and shifts traffic to them accordingly. This solution also allows you to plug in additional servers (e.g. the Amazon EC2 grid) when you expect an increase of visitors for a roadshow or TV commercial.
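A minimal sketch of such a tiny loadbalancer could look like the following. It assumes a plain TCP connect on port 80 is a good enough health check and that host C answers each visitor with a redirect to the chosen backend; the class name TinyBalancer and the hostnames are made up for the example.

```python
import itertools
import socket


def is_up(host, port=80, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class TinyBalancer:
    """Round-robin over the backends that currently pass the health check."""

    def __init__(self, backends, check=is_up):
        self.backends = list(backends)
        self.check = check
        self._counter = itertools.count()

    def pick(self):
        """Return the next healthy backend, or None if all are down."""
        healthy = [b for b in self.backends if self.check(b)]
        if not healthy:
            return None
        return healthy[next(self._counter) % len(healthy)]


if __name__ == "__main__":
    # Stub check for the demo: pretend www1 is down so no network is needed.
    balancer = TinyBalancer(
        ["www1.example.com", "www2.example.com"],
        check=lambda host: host != "www1.example.com",
    )
    print(balancer.pick() or "all backends down")
```

Adding extra servers for a traffic peak then only means appending their hostnames to the backend list. Checking every backend on every request is slow; a real setup would probably cache the health results for a few seconds.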

And when you're not happy anymore with the webhosting company (which happens often, unfortunately), you simply set up things at the new location, then change your DNS at a date and time of your choice and you're done.


4. Host static files somewhere else

Again, split things up! Host static files like images, videos, CSS and Flash somewhere else and let your webserver focus on delivering your website. I use Amazon S3 for that, and I added a CNAME to my DNS to get a hostname like media.mydomain.com. You could also use flickr (read the TOS first) or imageshack for hosting images if you like.
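On the website side, this mostly means that links to static files have to point at the media hostname. A small helper like the following sketch can do that when pages are generated. The function name asset_url and the list of extensions are assumptions for illustration; media.mydomain.com is the CNAME mentioned above.

```python
from urllib.parse import quote

MEDIA_HOST = "media.mydomain.com"  # CNAME pointing at the storage provider
STATIC_EXTENSIONS = (".png", ".jpg", ".gif", ".css", ".js", ".swf", ".flv")


def asset_url(path, media_host=MEDIA_HOST):
    """Return an absolute URL on the media host for static files,
    and leave every other path on the main web server."""
    if path.lower().endswith(STATIC_EXTENSIONS):
        # quote() escapes spaces and other unsafe characters but keeps "/".
        return f"http://{media_host}/{quote(path.lstrip('/'))}"
    return path


if __name__ == "__main__":
    print(asset_url("/img/logo.png"))
    print(asset_url("/blog/index.php"))
```

Because the media hostname lives in one place, moving the static files to another provider later only needs a DNS change and, at most, a change to this one constant.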


5. Cache as much dynamic content as you can

I think lots of problems result from the fact that many websites are run by people who have only a vague idea about webservers and DNS, and who think everything is about PHP and MySQL, with some fancy CSS around it. If we set aside highly personalized websites (web mail, content management systems etc.), I generally don't like the idea that a system makes requests to (MySQL) databases because a user just CLICKED somewhere.

  • Cache your stuff as much as you can, and if you have to use one of those fancy blog / CMS packages, enable caching
  • Use web stress tools to find out how many users can surf your website at the same time, and check whether you are happy with the result
  • In large scenarios, have a middleware server that produces static HTML files and pushes them to the www box
  • Use content distribution networks like Amazon S3 or Akamai to host your static pages and avoid server downtime (because you are on the front page of digg)
  • Databases, especially MySQL, are not something super cool that a website can't live without. You can use a webservice (like Amazon S3, again) to store and get your data with REST, SOAP or JSON requests. Depending on your scenario, even the Unix filesystem (serialized data) can be a good choice to speed things up and avoid another point of failure
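As one possible illustration of the filesystem idea from the last point, here is a sketch of a tiny file-based cache. It assumes the cached values can be serialized as JSON and that a fixed maximum age in seconds is acceptable; the class name FileCache and the method get_or_build are invented for the example.

```python
import hashlib
import json
import os
import tempfile
import time


class FileCache:
    """Cache JSON-serializable values as files, with a maximum age in seconds."""

    def __init__(self, directory, max_age=300):
        self.directory = directory
        self.max_age = max_age
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Hash the key so any string maps to a safe file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest + ".json")

    def get_or_build(self, key, build):
        """Return the cached value for key, calling build() only when the
        cached file is missing, unreadable or older than max_age."""
        path = self._path(key)
        try:
            if time.time() - os.path.getmtime(path) < self.max_age:
                with open(path, "r", encoding="utf-8") as handle:
                    return json.load(handle)
        except (OSError, ValueError):
            pass  # Missing or unreadable cache file: rebuild below.
        value = build()
        # Write to a temporary file first, then rename, so readers never
        # see a half-written cache entry.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(value, handle)
        os.replace(tmp_path, path)
        return value
```

A page handler would call something like cache.get_or_build("frontpage", load_frontpage_from_database), where load_frontpage_from_database stands in for whatever expensive query the site runs today. The database is then hit at most once per max_age, no matter how many users click.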

Well, that's it! Sure, there is much more to it, and I didn't walk you through things step by step. I just wanted to give you some hints. One last thing: backups. I keep noticing how often this issue is ignored and how many websites never go online again after a downtime because there were no (useful) backups.
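For the backup part, even something as small as the following sketch is better than nothing. It assumes the whole site lives in one directory and that a timestamped .tar.gz archive is an acceptable format; the function names and the "site-" file prefix are made up for the example, and database dumps would need their own step.

```python
import os
import tarfile
import time


def backup_directory(source_dir, backup_dir):
    """Write source_dir into a timestamped .tar.gz under backup_dir
    and return the path of the new archive."""
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive_path = os.path.join(backup_dir, f"site-{stamp}.tar.gz")
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps paths inside the archive relative to the site root.
        root_name = os.path.basename(os.path.normpath(source_dir))
        archive.add(source_dir, arcname=root_name)
    return archive_path


def verify_backup(archive_path):
    """Return the member names; an unreadable archive raises an error."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

The second function is there because of the word "useful": a backup that has never been opened again is only a hope. Copying the archive to a different provider than the webhost fits the split-things-up idea of this whole post.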
