Fundamentals of Massive Linux Scaling -- reallylinux.com
by Mark Rais, senior editor ReallyLinux.com
This introductory article covers four of the top
things I learned about major enterprise Linux scaling. There are many other essential ingredients, and I cover some of them in my Scaling Linux Servers article. For the time being, however, I
think these four convey the first rung, or foundational aspects, of a
robust scaled server complex using Linux. They also address issues I
see fairly consistently in some Linux enterprises.
My hope is that these
fundamental principles will help some administrators overcome obstacles with their massive Linux scaling projects.
CACHING IS NOT THE SOLUTION
Notice that it can be part of the
solution. I propose, however, that caching cannot be the crux of
the solution. I'm referring to multi-tiered infrastructure.
If you have a single-tier server complex and you place caching boxes
into the fray of server requests, then perhaps it will work, for
simple socket requests for instance.
But when you start to see latency
issues across the network routers, the database I/O, the in-bound
pipes, an emphasis on caching is just too one-sided in my book.
Instead, I propose that where some
architects literally put all the eggs in the caching basket, others
are wise enough to scale where known latencies already exist, rather
than try to offset these with caching schemes that often fail.
How do I know this?
When my team tried
to use caching as a foundational component of scaling for the
"AOL/CBS Big Brother" website, it just failed at first. We saw
over half a million in-bound requests per second come through the pipes and the
whole thing melted. Melted so badly that our miscalculation began
choking out routers way down the pipe. In reality we had failed to
understand two very important aspects regarding caching - besides learning that
a prime time TV show can really juice up web hits.
First, caching for content driven
websites is often problematic since certain content pieces are unique
and dynamic to the last minute. So where we should normally have had
the final content delivered and cached, some of the editorial staff
were still working on bits and pieces of content. Some of the
content was essential and caching was of absolutely no use. On more
news-oriented sites this is especially an issue. Obviously this is
more a human factor than a hardware factor, but nevertheless
Second, caching works when properly
configured. However, configuring caching for major scaling endeavors
also introduces complexities you ordinarily do not notice. Your
caching schema for an existing infrastructure may simply not address
new peak load characteristics.
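To put rough numbers on this, here is a minimal Python sketch of how much traffic still reaches the origin servers for a given cache hit ratio. The function name and the hit ratios are illustrative assumptions; only the half-million requests per second figure comes from this article.

```python
def origin_requests_per_second(incoming_rps: float, hit_ratio: float) -> float:
    """Return the request rate that misses the cache and reaches the origin."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return incoming_rps * (1.0 - hit_ratio)


# Hypothetical hit ratios; 500,000 requests per second is the peak cited in this article.
for ratio in (0.99, 0.90, 0.50):
    print(ratio, origin_requests_per_second(500_000, ratio))
```

Under these assumptions, a drop from a 99% to a 90% hit ratio multiplies the load on the origin servers tenfold, which is why a caching scheme tuned for ordinary traffic may not hold at a new peak.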
In our case, the caching failed
because major schema changes were needed and ultimately proved to be
untestable without true peak load. Even the best testing tools cannot
reproduce true peak load, and thereby fail to uncover
certain key nuances in the schema. The result in our case was that
the caching systems began passing the ball to one another at such
a high rate that they actually took each other down. The pressures
to introduce a new scheme along with many new components in a
ridiculously short period of time contributed to the failure.
I have found that effective mass scale caching requires time and thorough
testing. Two things that are sometimes hard to come by during a
KEEP DNS ROUND ROBIN FOR THE BIRDS
Well, we can all probably agree on one
thing. Using the DNS server as a primary mechanism for load
balancing sucks. Perhaps my conclusion isn't as eloquent as some
may prefer, but it's based on some hard facts.
First, it scares me how many people I've met who
use simple DNS round robin configurations to deal with peak load
balancing and simple scaling. If you're getting
1,000 hits per second, maybe this will work - for the time being.
But when you move to serious peak loads, DNS round robin falls flat
on its face because it cannot address the failure points.
Second, based on the first point, DNS
can't even identify when the server fails on load. Instead, it
keeps passing the requests. And, unless you've written some fail-over
code to pull a downed server off the DNS, it just keeps failing.
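The fail-over code mentioned here could look something like the following minimal Python sketch. It assumes that a plain TCP connect on port 80 is an adequate health check, and the names tcp_probe and healthy_servers are hypothetical. Publishing the resulting list to the DNS zone is site-specific and is not shown.

```python
import socket
from typing import Callable, Iterable, List


def tcp_probe(host: str, port: int = 80, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def healthy_servers(servers: Iterable[str],
                    probe: Callable[[str], bool] = tcp_probe) -> List[str]:
    """Keep only the servers that pass the probe, preserving their order."""
    return [server for server in servers if probe(server)]
```

Even with a script like this running on a schedule, resolvers cache records until their TTL expires, so a downed server can keep receiving requests for a while after it is pulled.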
I guess the one positive for using DNS is that
if the server chokes on a request the DNS just returns a fail rather
than pass the request on or stack it in an exponentially increasing
The main point is that using DNS round
robin isn't a great solution for you whether you hit peak or not.
It simply cannot address re-balancing or cutoffs without manual
If you have the money, then solutions
with expensive switches, like the ones Foundry offers, make a whole
lot more sense. I just prefer not to see people dealing with peak
load failures on Friday nights because they have to manually adjust
Someone asked if this includes general
round robin approaches. It all depends on the situation. We once
used a "stupid" round robin that was based on simple code that
alternated which servers would get requests by time. Every few
seconds requests were processed to alternating servers. It actually
worked because we were passing very little data per request. We used
this to do a simple and rather inexpensive load balance for shopping
requests during peak holiday shopping times. The infrastructure
stayed relatively the same with only the addition of the front
caching server managing requests.
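A minimal Python sketch of this kind of time-sliced alternation follows. The function name pick_server, the five-second slice, and the server names in the usage lines are assumptions for illustration, not the code we actually ran.

```python
import time
from typing import Optional, Sequence


def pick_server(servers: Sequence[str], slice_seconds: float = 5.0,
                now: Optional[float] = None) -> str:
    """Pick the server that owns the current time slice."""
    if not servers:
        raise ValueError("servers must not be empty")
    if now is None:
        now = time.time()
    # Every slice_seconds the index advances by one, so servers take turns.
    slice_index = int(now // slice_seconds)
    return servers[slice_index % len(servers)]


# Hypothetical server names.
print(pick_server(["shop-a", "shop-b"]))
```

Note that this scheme knows nothing about how busy each server is. It only works when every request costs roughly the same, which was the case for us.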
But let me say that this scheme fails
when a request with high latency, such as retrieving large images, goes
to an already loaded server and you see requests slow down. Then
every other system on the round robin takes the load, right? WRONG.
The next server in line receives the offloaded requests and it too
goes down. Yet again you can see a nice domino effect.
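The domino effect can be illustrated with a toy simulation in Python. This is a sketch under simplifying assumptions: every server has the same hypothetical capacity, an overloaded server fails outright, and its whole load is handed either to the next live server in line or spread evenly over all survivors. The function name and the numbers are invented for illustration.

```python
from typing import List


def simulate_cascade(loads: List[float], capacity: float,
                     spread_evenly: bool = False) -> List[bool]:
    """Return which servers are still alive once the failures settle."""
    loads = list(loads)
    alive = [True] * len(loads)
    changed = True
    while changed:
        changed = False
        for i in range(len(loads)):
            if not alive[i] or loads[i] <= capacity:
                continue
            alive[i] = False
            changed = True
            shed, loads[i] = loads[i], 0.0
            # Live servers in round-robin order, starting after the failed one.
            ring = [(i + step) % len(loads) for step in range(1, len(loads))]
            survivors = [j for j in ring if alive[j]]
            if not survivors:
                continue  # nobody left to take the load
            targets = survivors if spread_evenly else survivors[:1]
            for j in targets:
                loads[j] += shed / len(targets)
    return alive


print(simulate_cascade([120, 40, 40, 40], capacity=100))
print(simulate_cascade([120, 40, 40, 40], capacity=100, spread_evenly=True))
```

With next-in-line handoff, the load of the first failed server always pushes its neighbor over capacity as well, so one overload takes down the whole ring. When the same load is spread evenly, the other three servers in this example survive.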
In general, round robins work well in
smaller scale environments, but not when dealing with major peak
loads - emphasis on peak. I'd just as soon keep my distance unless
it's a mid-sized environment.
NFS AND BIG DISK DRAMA
For one reason or another, NFS mounts
are everywhere in the Linux world. There's absolutely nothing
wrong with NFS in most situations. But add up those mount points,
spread them across a server farm, add cross site connectivity or
other complexity and you get into a bit of a muddle.
The biggest issue I've personally
seen on mass scale NFS mount infrastructure is that we either die
because the mount points start failing or because the disk I/O chokes
out when implemented on striped disks.
If you use NFS mount points, there's
nothing bad about it.
If you use RAID 5 striping, there's
nothing bad about it.
If you use large disks, which most
people do to save costs on their storage solutions, there's nothing
bad about it.
But combine large disks, complex
striping, and lots of NFS mount points and you reach that wonderful
realm of death by positional latency. It is one of
the biggest pains to uncover and then resolve during a scaling
Instead, I've consistently shared
with people that if you're looking to do some serious scaling with
NFS mounts, please stay away from those huge cheap drives. Using
several smaller GB drives costs more but reduces the chances that
disk latency combined with striping algorithms will become a culprit
for NFS mount point failures. This may be old news to some,
but there are still many shops mixing these three ingredients in
their daily Linux scaling work.
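A rough way to see why drive count matters is the following Python sketch. It assumes that every drive, large or small, sustains about the same number of random I/O operations per second, because seek and rotational latency are mechanical. The function name and all figures are hypothetical, and RAID parity overhead is ignored.

```python
import math


def aggregate_random_iops(capacity_gb: float, drive_size_gb: float,
                          iops_per_drive: float) -> float:
    """Estimate total random IOPS of the drives needed to reach capacity_gb."""
    drives = math.ceil(capacity_gb / drive_size_gb)
    return drives * iops_per_drive


# Same hypothetical 2000 GB of storage built two ways, 100 IOPS per drive.
print(aggregate_random_iops(2000, 500, 100))  # a few large drives
print(aggregate_random_iops(2000, 100, 100))  # many smaller drives
```

Under these assumptions the array of smaller drives has five times as many spindles to absorb the random requests coming in from the NFS mount points.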
I love MySQL. I love seeing PHP and
other dynamic web design. They provide a great tool set and make
most websites valuable. But there is no question that some great
websites have choked because the architecture placed everything into
dynamic page loads.
I still believe, if you're serving
serious web hits, that reducing the dynamic requests and increasing
the static HTML content is a sure way to beat the peak load scaling
problem. I know this from experience because the solution that the outstanding
Operations staff used to bring the Big Brother website back to life was
to flatten the highly dynamic site into basic HTML, rdist the pages
to about 220 web servers and run those servers at 105% peak load.
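The flattening step can be sketched in Python as follows. This is a minimal illustration, not what the Operations staff actually ran: the function flatten_site, the render callback, and the page data are all hypothetical, and the distribution step is left out.

```python
from pathlib import Path
from typing import Callable, Dict, List


def flatten_site(pages: Dict[str, dict], render: Callable[[dict], str],
                 out_dir: Path) -> List[Path]:
    """Render each page once and write it to out_dir as a static .html file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, data in pages.items():
        target = out_dir / f"{name}.html"
        # The expensive dynamic work happens here, once, instead of per request.
        target.write_text(render(data), encoding="utf-8")
        written.append(target)
    return written
```

The web servers then serve the files in the output directory directly, so no database or cache lookup is needed at request time.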
There's nothing wrong with designing
webpages and implementing dynamic websites. But on spike loads the
key hit pages need far fewer dynamic elements that must be
called out of caches, out of databases, out of other server file
stores, and far more HTML. Yes, good old HTML. The good major
websites use this nice balanced approach. Unfortunately, with
today's emphasis not only on dynamic content but also dynamic ads,
the slowdown is perceivable even to web surfers who are not technically savvy.
One key ingredient is to isolate non-essential elements of a web page
and use static HTML for them. Another is to ensure that the page design itself contributes to
load balancing during peaks, for instance by offering navigation on the site that does
not channel all readers into single highly dynamic pages. Instead, the navigation can
create parallel paths, reducing overall requests on single pages while still giving users the content they require.
Finally, if you must keep certain elements dynamic, then ensure that you have normalized your database. Simplifying your DB, cleaning up tables and keys that are redundant or bottlenecks, and avoiding BLOBs (binary large objects) like the plague will further contribute to peak performance.
Finding the right balance between
dynamic and static is not easy and often leads to feuds between the
operations department trying to scale for peak loads and the design
department looking to create even more compelling content. But I'm
of the opinion that finding such a balance is essential for good scaling.
Ultimately, the decisions for scaling
will fall on the shoulders of the operations personnel who are likely
to be called in at the last minute, during a crisis, and under
pressure. By keeping these four foundational aspects in mind during
the design phase, you can make even the most severe load spikes
Mark Rais serves as senior editor for reallylinux.com, and has written a number of books and articles related to enterprise Linux and UNIX implementation. He also served as a senior technology manager for America Online, Inc. and Netscape, as well as task force leader for several International non-profit organizations. Today, Rais promotes Linux use worldwide as an integration consultant.