Archive for the ‘General’ Category

How to Scale a Web Application

Wednesday, April 28th, 2010

In my mind there are two scaling patterns that are used to scale a typical web application. One handles the computation requirements, the other handles the storage requirements. Another way to think about this is stateful vs stateless scaling.

If you don’t need to handle any state (storage beyond each web request) in your web application, you can use the stateless scaling approach, which is pretty simple. You get what is called a load balancer and put a bunch of servers behind it. A good load balancer can handle hundreds if not thousands of servers, so you should be good for quite a lot of traffic before you’d need a different strategy such as DNS round robin or multi-homed IPs. Of course, the load balancer here is a single point of failure, so if you are worried about downtime when the load balancer fails, you should look into some other high availability solutions. You can keep adding servers to (and removing them from) the load balancer as traffic goes up and down. A good way to do this is with Amazon EC2’s auto-scaling feature. (Disclosure: I own some Amazon stock.)
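To make the idea concrete, here is a minimal sketch of what a load balancer does when it spreads stateless requests across a pool of servers in round-robin order. The class name `RoundRobinPool` and the server names are made up for illustration; a real load balancer is dedicated software or hardware that also does health checks and much more.

```python
class RoundRobinPool:
    """Toy model of a load balancer's server pool (illustrative only)."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0  # counter used to rotate through the pool

    def add(self, server):
        # Scale out: new servers start receiving traffic on later picks.
        self._servers.append(server)

    def remove(self, server):
        # Scale in: the server simply stops being picked.
        self._servers.remove(server)

    def pick(self):
        # Any server will do, because no request depends on earlier ones.
        if not self._servers:
            raise LookupError("no servers behind the load balancer")
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server


pool = RoundRobinPool(["web1", "web2"])
first_picks = [pool.pick() for _ in range(4)]

pool.add("web3")  # traffic went up, so add a server
later_picks = [pool.pick() for _ in range(3)]
```

Because the servers hold no state, adding or removing one needs no data migration, which is exactly what makes this half of the problem the easy one.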

If you need to store state in your web application (which is usually the case), you need a different strategy for scaling out the storage. A good strategy here is what is known as partitioning or sharding. The idea is to split up the data onto different servers in some way. What you need is some form of a distributed hash table. The data is typically split based on the primary key, in other words, the identifier that is most often used to access the data. Once you get a large enough set of data, you’ll need a way to split up the data such that when you add or remove a server, you don’t have to shuffle all the data around. For this, I would suggest using a concept known as consistent hashing. If you are just storing files, I’d recommend going with Amazon’s S3, which does this sharding for you, basically infinitely. If you need faster access to a bunch of smaller pieces of data, look at MySQL or try one of the many NoSQL database systems out there, some of which have built-in sharding.
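Here is a rough sketch of consistent hashing, to show why adding a server doesn’t shuffle everything. The class name `ConsistentHashRing`, the choice of MD5, the 100 virtual points per server, and the `db1`…`db4` server names are all my own assumptions for illustration, not how any particular database does it.

```python
import bisect
import hashlib


def _hash(key):
    # Map any string to a large integer position on the ring.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, servers=(), replicas=100):
        self.replicas = replicas  # virtual points per server, for balance
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> server name
        for server in servers:
            self.add_server(server)

    def add_server(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            if point in self._owner:
                continue  # skip the (extremely unlikely) position clash
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove_server(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def server_for(self, key):
        if not self._points:
            raise LookupError("no servers in the ring")
        # Walk clockwise to the first point after the key, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = ConsistentHashRing(["db1", "db2", "db3"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.server_for(k) for k in keys}

ring.add_server("db4")
moved = [k for k in keys if ring.server_for(k) != before[k]]
```

After `db4` joins, the only keys in `moved` are the ones that now belong to `db4`; everything else stays where it was. With a plain `hash(key) % number_of_servers` scheme, most keys would change servers instead.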

Why I Switched From SVN to Git

Thursday, January 7th, 2010

A HUGE benefit of git (or any other distributed source code control system) is that the entire repository is stored in each developer’s environment. This means that you automatically have as many backups of the source code as you have developers. If you use a hosted service such as github, this means that even if github loses ALL of your data, you still have all your source code (and revision history) on your own machines.

In a software startup, your source code is like your crown jewels. Losing your source code can be disastrous.

This is primarily why I switched from svn to git, and why you should too.

Umich CAEN Wireless on Snow Leopard

Wednesday, November 11th, 2009

I was just told how to get onto the University of Michigan CAEN Wireless with the VPN client built into Mac OS X Snow Leopard, so I thought I would share.

If there’s anyone actually subscribed to my blog that doesn’t care about this… sorry, I just use it as a dumping ground for information.

Since some of this info is protected, I’ll just refer you to the protected URL that has the info:
https://www.itcom.itd.umich.edu/vpn/software/UM-on-campus-wireless.pcf

Go to System Preferences, Network.
Click the plus sign in the lower left to add a new connection.
Interface: VPN
VPN Type: Cisco IPSec (if this doesn’t show up, try downloading the Cisco VPN client from here)
Server address: <host in the pcf file>
Account name: <your uniqname>
Password: <your regular umich password>
Click “Authentication Settings…”
Shared secret: <grouppwd in the pcf file>
Group Name: <groupname in the pcf file>
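
If you’d rather not dig through the pcf file by hand, here is a small sketch that pulls out the three values used above. It assumes the file is INI-style with a `[main]` section and keys named `Host`, `GroupName` and `GroupPwd`; the sample text and its values are placeholders I made up, and the real file may differ (for example, the secret may only be stored in an encrypted form, which this does not handle).

```python
import configparser

# Placeholder profile for illustration only; not the real UM settings.
SAMPLE_PCF = """\
[main]
Description=Example profile with placeholder values
Host=vpn.example.edu
GroupName=example-group
GroupPwd=example-shared-secret
"""


def vpn_settings(pcf_text):
    """Return the fields to type into the Snow Leopard VPN dialog."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(pcf_text)
    main = parser["main"]  # option names are matched case-insensitively
    return {
        "Server address": main.get("Host"),
        "Group Name": main.get("GroupName"),
        "Shared secret": main.get("GroupPwd"),
    }


settings = vpn_settings(SAMPLE_PCF)
for field, value in settings.items():
    print(f"{field}: {value}")
```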

My Thoughts on Startup Weekend Redmond

Wednesday, September 2nd, 2009

So, Startup Weekend Redmond happened last weekend. It was hosted by BizSpark on Microsoft’s campus, and heavily branded that way. 14 out of the 15 startups were built using Microsoft technologies [edit]likely because of the $5,000 prize from BizSpark[/edit]. Guess who won the popular vote! The only team that DIDN’T build using Microsoft. They built an iPhone app, a Palm Pre app, and I believe a web app using something other than ASP or Azure (correct me if I’m wrong). Apparently that team was not eligible for the prize money from BizSpark because of that, and the prize was given instead to the #2 team. More info can be found in the TechFlash report about Startup Weekend Redmond.

Microsoft/BizSpark got a lot of bad press as a result. Clint Nelson, one of the guys behind the national Startup Weekend organization, posted a blog entry called Sticking Up for the Big Guy. You might want to read that since what follows is basically my response to that article.

Startup Weekend is a great concept. It’s a great community building event where people in the same city interested in the same thing (namely building a startup) get together for a weekend and work together. You get to meet new people, and get to know people you’ve already met better. But the fact of the matter is that most teams formed at Startup Weekend don’t continue working together on the startup after the weekend is over. So, saying that “we launched 15 startups that otherwise would not exist” is kind of misleading. It’s not about the startups that are launched that weekend. It’s about the connections made between the people. Hopefully those people will continue the conversation and partner to form their own startups later.

It’s great that Microsoft wants to support the startup community via BizSpark, but I feel that Microsoft is being disingenuous by only giving an award to a startup that uses Microsoft’s technology at Startup Weekend.

If they want to have their own BizSpark Weekend or whatever, that’s fine. They can run it themselves. They have enough money, they have enough people, they have a big enough marketing budget. Microsoft doesn’t need Startup Weekend to run a similar event of their own that is restricted to building on the Microsoft stack.

“Bizspark is absolutely being crucified for giving us the community exactly what we asked for.” Really? You asked them to disqualify anyone not using Microsoft technology?

In the future, please keep prize money out of Startup Weekend. kthxbye

Twilk, a Twitter Background Generator

Saturday, July 4th, 2009

Twilk is a web service I’ve been working on for a couple of months now. I launched it a few weeks ago at a Twitter conference. It automatically creates a Twitter background made up of the profile photos of the people you follow on Twitter. If you are on Twitter, you have to check it out! I’d love to hear your feedback on the service so that I can continue to iterate and make it better. So, leave a comment, or use the feedback form after using the service. If you’d like an example, check out my Twitter page. The background has a bunch of my friends’ photos on it.

List of Sites Affected by Fisher Plaza Data Center Fire

Friday, July 3rd, 2009

I’m keeping a list of the sites that were/are seemingly affected by the Fisher Plaza data center fire last night sorted by Alexa Traffic Rank. Comment if you have more information. Sites marked with a * appear to be back up. Follow me (@mulka) on Twitter to get notified when I update this list.

http://bing.com/travel 57 *
http://allrecipes.com 871 *
http://bigfishgames.com 1,822 *
http://geocaching.com 4,233 *
http://authorize.net 5,345 *
http://komonews.com 13,306 *
http://dotster.com 27,895 *
http://waymarking.com 38,446 *
http://kcls.org 41,085 *
http://marshillchurch.org 63,317 *
http://ideascale.com 85,951 *
http://adhost.com 180,491 *
http://onlinemetals.com 180,846 *
http://tomsofmaine.com 247,800 *
http://pacsci.org 285,570 *
http://pccnaturalmarkets.com 300,451 *
http://avayausers.com 413,083 *
http://bartelldrugs.com 471,734 *
http://pspinc.com 556,777 *
http://ovaleye.com 3,392,698 *
http://newsdata.com 3,456,473 *
http://tradetech.net 7,113,570 *

Even more in the comments including:

http://www.portentinteractive.com
http://www.momagenda.com
http://www.princesslodges.com
http://www.goiam.org
http://www.ringor.com
http://www.nettica.com/
http://www.motherjones.com
http://www.questionpro.com
http://micropoll.com
http://www.square1books.com

My Strategy for the FlightControl Game for iPhone

Friday, May 8th, 2009

I want to blog more, and this is something that wouldn’t fit in a tweet, so here it goes. These are my tips and tricks for playing the FlightControl game for iPhone. I recently landed over 500 planes in a single game. That puts me in the 99th percentile of players according to their statistics. So… I know what I’m talking about.

I’m going to assume you know the basics of the game. It’s pretty easy. If you don’t, you should just buy the game and give it a try. It is dead simple to learn, a lot of fun, and addicting.

Try to set up flight paths which won’t collide with anything that already has a flight path as soon as possible. If you aren’t certain whether a collision will happen or not, just keep the plane out of trouble temporarily and make a flight plan later.

Don’t think too long on any one thing. Make a decision quickly, and move on. Always keep a watch on the big picture, especially the edges where new planes come in. Don’t focus on one plane, or group of planes for too long because you won’t see the new planes coming in, and they will crash because you haven’t given them a collision-free flight plan.

After playing for a while, you will realize that the slower, non-pink aircraft usually won’t have very much flexibility in their flight plans. Because they are slow, they stay on the screen longer, taking up valuable airspace. So, you usually want to put the slower planes on a straight path to their landing locations, and make the faster ones take less direct routes if they need to go around the slower ones.

I find that the pink airplanes are the main ones I have trouble with. The key is to get them to land as soon as possible. What that means is that you should get the planes to be as close together as possible.

Remember what I said about getting a collision-free flight plan in place for each aircraft as soon as possible? Well, if you do that, it can be difficult to pack them in tightly. So, what you do is adjust. Start from the airplane closest to the runway and adjust its path so that it goes straight in. Take the next closest airplane and make its flight plan shorter, but not so short as to bump into anything.

Since the larger pink planes move faster than the smaller ones, sometimes you can get into trouble where a larger plane and a smaller plane share the same flight path, and the larger catches up with the smaller and crashes. You have to take this into account. It can be helpful to group planes. So, land all the large pink planes one after another, then land the smaller ones. There’s a contradiction here that you have to keep balanced. On the one hand, you want the slower planes to have straighter paths to get them off the screen faster. On the other hand, you want the faster planes to go in before the slower planes since you might have the faster planes run into the slower planes if you did things the other way around. Which strategy you take at which time depends on the positions of the planes and how many planes are currently on the screen.

There you go. Those are my tips and tricks for the FlightControl app for iPhone. Hopefully you found these tips useful in your air traffic controlling.

Welcome to the Era of Cloud Computing

Tuesday, November 18th, 2008

I declare that we have officially entered the era of cloud computing. Instead of programming directly to operating systems that run on a single machine, we write code that runs on any number of machines. This is a huge shift. Even the web era was usually about a single machine. You had one web server, unless you were a large website, in which case you would have more than one web server behind a load balancer. Even if you had a handful of web servers, you probably still only had one database… as big a machine as you needed to fit your data. Only a few companies needed more than one database. Most companies didn’t necessarily need more than one database, but if they suddenly did, there was a lot of work to be done.

Now, with Google App Engine, and (I think) Microsoft’s Azure, you write code, but then don’t know (or care) how many physical machines it actually runs on. This is great because no longer do application developers have to worry about how many machines there are. No longer do developers have to worry about how to scale. Even with Amazon EC2, you have to care about how many machines you have. If you don’t have enough, either you or your software has to detect that and get more machines. You pay per machine per hour, whether or not you are actually using the compute cycles. At least with Google App Engine, the level of granularity is dropped down to the number of compute cycles you use, instead of the number of machines you have.

This is a pretty exciting time. There is a lot of ground to cover still in the area of cloud computing. I think there will be a lot of innovation in the coming years in this area. My next post will talk about the differences between the different clouds. As of now, Amazon has a cloud, Google has a cloud, and Microsoft has a cloud. There are also other less well known companies offering cloud-like services, such as Slicehost (which was recently bought by Rackspace), Joyent, GoGrid, and Media Temple. There are even some companies popping up that will offer telephony services in the cloud, such as Twilio.

UPDATE: The conversation continues in the comments. Come join us!

What’s Great About On-Demand Internet TV

Monday, July 7th, 2008

You know what’s great about on-demand internet TV? You can link to it!

I was reading a Twitter blog post, which linked to a TechCrunch post, which linked to an episode of The Daily Show which referenced Twitter. Links… it’s all about links to more information.

Twitter!

Monday, May 26th, 2008