In my mind there are two scaling patterns that are used to scale a typical web application. One handles the computation requirements, the other handles the storage requirements. Another way to think about this is stateful vs stateless scaling.
If you don’t need to handle any state (storage beyond each web request) in your web application, you can use the stateless scaling approach. The stateless scaling approach is pretty simple. You get what is called a load balancer and put a bunch of servers behind it. A good load balancer can handle hundreds if not thousands of servers, so you should be good for quite a lot of traffic before you’d need a different strategy such as DNS round robin or multi-homed IPs. Of course, the load balancer here is a single point of failure, so if you are worried about downtime if the load balancer ever fails, you should look into some other high availability solutions. You can keep adding (and removing servers) from the load balancer as traffic goes up and down. A good way to do this is with Amazon EC2’s auto-scaling feature.
If you need to store state in your web application (which is usually the case) you need a different strategy for scaling out the storage. A good strategy here is what is know as partitioning or sharding. The idea is to split up the data onto different servers in some way. What you need is some form of a distributed hash table. The data is typically split based on the primary key, in other words, the identifier that is most often used to access the data. Once you get a large enough set of data, you’ll need a way to split up the data such that when you add or remove a server, you don’t have to shuffle all the data around. For this, I would suggest using a concept known as consistent hashing. If you are just storing files, I’d recommend going with Amazon’s S3 which does this sharding for you, basically infinitely. If you need faster access to a bunch of smaller pieces of data look at MySQL or try one of the many NoSQL database systems out there, some of which have built in sharding.