TechCrunch Down. I’m Pissed.
by Mike on February 12, 2007

The single hardest thing about running TechCrunch is simply keeping the site live. Some weeks, more hours are spent by various people trying to keep the site up and running than are spent actually writing. There are many culprits. First, we have a lot of third party widgets, ads and analytics apps running on the site. They are often the cause for slow load times. FM Publishing, our advertising network, often slows down the site and then other things pile on to crush it.

Today we had three problems. FM is updating their software and caused massive . We switched to the new version of wordpress which is clearly not bug free. And on top of that we have a number of plugins that are acting weird on the new wordpress software. One of them took us down earlier tonight.

Another culprit is MyBlogLog, which we’ve had to strip off the site a number of times because of slowdowns.

Thank God for Media Temple, who work with us to keep the site live. Right now, TC is completely down, and Chris Lea at MT is being dragged out of bed to try to fix it for the tenth time today.

At some point I’m going to give up and move on to partners and software that can address our needs. Somebody who’ll be up with me at 4:17 AM when my site goes down because of them.

Comments

  • Considering the traffic TechCrunch gets, you really should have a 24/7 in-site hosting provider. I’m actually surprised to hear you don’t.

    As for the widgets, well, that’s a problem. Suddenly, the speed and response of a site depends on many different sites. There’s an idea for a project: to unify widget functionality and dependency from one (and reliable) source.

    Hope it comes back up soon. I know the feeling very well.

  • Hi, we make rocking stable software! drop me a line and i’ll make you guaranteed blogging tools, ok?

  • Most of the time I incorporate third-party content I end up dumping it because it slows page loads.

    As big as TechCrunch is, you should be able to demand — and get — reliable fallback behavior from these companies for the times when their services fail under heavy load or other problems.

    Lacking that, there’s an opportunity for services that can offer a widget control panel that makes it easy to turn them on and off.
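
A minimal sketch of the fallback and on/off ideas raised above, assuming the third-party content is fetched server-side. Every name here (`ENABLED_WIDGETS`, `render_widget`, the example fetchers) is hypothetical and not taken from any real widget provider's API; it only illustrates capping how long a page waits on someone else's service.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Hypothetical on/off switches: the "widget control panel" in miniature.
ENABLED_WIDGETS = {"recent_readers": True, "ad_sidebar": True}

# Shared pool so a slow fetch never blocks the code that renders the page.
_pool = ThreadPoolExecutor(max_workers=8)


def render_widget(name, fetch, timeout=0.5, fallback=""):
    """Return the widget's HTML, or `fallback` if it is disabled, slow, or failing."""
    if not ENABLED_WIDGETS.get(name, False):
        return fallback
    future = _pool.submit(fetch)
    try:
        # Wait at most `timeout` seconds for the third party to answer.
        return future.result(timeout=timeout)
    except FutureTimeout:
        return fallback  # too slow: serve the page without it
    except Exception:
        return fallback  # provider error: same treatment


# Stand-ins for real network calls (assumptions for the example only).
def fast_fetch():
    return "<div>ok</div>"


def slow_fetch():
    time.sleep(0.3)
    return "<div>late</div>"


def broken_fetch():
    raise RuntimeError("provider is down")
```

With these stand-ins, `render_widget("recent_readers", slow_fetch, timeout=0.05)` gives up and returns the fallback instead of holding up the page. Note that the abandoned fetch still runs to completion in the background, so a real setup would also want a timeout on the network call itself.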

  • Mike,

    There are a couple of suggestions that I can make (if you’d like to hear them ;-) ):

    1) Perhaps you should give up some of the freedom you have and simply run from one of the top blogging engines (I’m sure they will all love to get you). Perhaps Wordpress.com or something.

    2) Regarding load time and stuff, you might want to try a trick and load the whole half side of your blog (the one with the ads and all) inside an iframe. It might suck but it might make things appear to go faster to the users since the browser will first render what’s really on the page and then try to render what’s inside the iframe.

    3) You’ll need to start thinking about having a more robust configuration with more than one server or with multiple slave MySQLs. But that’s just a guess since I haven’t seen your traffic logs.

  • If it’s just slowing down the load time, why not just get rid of all the extraneous baubles and trinkets all over the site?

    Personally I don’t notice, because I only ever read the site via RSS, but if it’s hurting performance (and not really adding much to the site) why have all that stuff at all?

  • re: mybloglog

    There’s a little known trick where you can switch to an image only / no javascript version of mybloglog. I find this isn’t as bad on load times as the javascript version (since it’s just an image not loading).

    more info here: http://engtech.wordpress.com/2007/01/15/mybloglog-widget-for-wordpresscom-blogs-one-of-the-best-web-widgets-available/

  • Well TC is back up now, so it wasn’t out for too long thankfully. Do you plan on dropping any of the third party widgets permanently from TC?

  • Oh, another option is to try out wp-cache (http://ocaoimh.ie/2006/12/31/wp-cache-2020-released/) though I’m quite sure you are already running it.

  • Hi Mike, Did we contribute to the issue this time? We’re getting better and better supported by the Y! infrastructure, and I think we’ve stopped causing people issues.

  • When I first saw the site down, I thought of the Barack post and wondered if people were doing something in response. Glad to know that wasn’t the issue. I am actually a bit shocked that you would “call out” your advertising partner. All ads slow stuff down. Way it works. You don’t even need a big blog to do that.

    And with the income brought in on this site, can’t you spend $2k or something on a hosting/blogging consultant who can help you get what you need?

    Anyway, good luck - hope your new providers work better for you.

  • Hi Mike. Your point about your needing to “move on to partners and software that can address our needs” reminded me of when we over at threadless were having growing pains and realized that we needed to move from our much loved but sadly too small colo to a much larger organization that would have the support infrastructure we needed. We ended up using rackspace (which is really expensive - but when you call at 3am and a dude answers the phone and helps you - you quickly realize that you get what you pay for). We haven’t looked back since. It has been a great experience.

  • You get what you pay for.

  • Oh the pain….I feel it but certainly not to your extent since all of my Web assets *combined* don’t equal your pageviews or probably what is one downside to a blogging platform that is so dependent on database pulls!

    I’m a 100% widget/gadget/openAPI/mashup fan but there is LITTLE UNDERSTANDING OF LATENCY in the marketplace and especially amongst startups. In fact, I wrote about this “dirty little secret” of Web 2.0 in October of *2005* at the Web 2.0 conference (http://www.iconnectdots.com/ctd/2005/10/web_20_conferen.html). It was sort of funny to me too that Marc Canter — the poster child for interactivity — at the time had one of the SLOWEST blogs to parse of anyone I followed since he had a crapload of 3rd party pieces that the page had to wait on in order to load.

    Unfortunately, too many startups I interact with are amazingly uninformed about internet architecture, latency, and how fetching data from 3rd party APIs is a crapshoot because of small lags in response.

    Mike, you could be a guy (or someone you trust) that could perform some sort of benchmarking as you rate or launch startups or otherwise analyze their usability.

    Just for shits-n-giggles, I created a sandbox blog and populated it with 10 widgets and nothing else. I then tried different browsers and platforms I own and measured the latency until each widget was populated and the page fully loaded. Now I know this is *totally* anecdotal and about as scientific as believing the world is 4,000 years old, but it took between 42 and 61 seconds to load them in total. I could only imagine if my *main* blog had 10 widgets in the sidebars along with all of my other content…it would probably be pushing two minutes to load a page.

  • Mike,

    Would certainly like to chat with you about this. You can have too many providers of critical services (a Web 2.0 phenomenon for sure) or one platform that does many things and factors in scalability matters (my view of the world).

    Just pop me mail if you want to discuss in more detail etc.

    This event is a pretty important Web 2.0 case study (IMHO).

    Kingsley

  • I use a managed services provider, Rackspace. They have a complete team of technicians at WORK 24/7. That doesn’t mean some clown either, but a specialist. An OS tech, a backup tech, a firewall tech, etc that is assigned to your account. 37Signals uses Rackspace too. They are more expensive at first blush. But they have 1 hour guaranteed hardware replacement and a human being answers the phone in 30 seconds 24/7. I love em.

  • Can you explain what the hell you were thinking by attempting a major upgrade without testing first? That’s silly, Mike. You don’t even need a consultant like me to tell you that. Pride goes before the fall.

  • Mike,

    For the amount of traffic that you get, and the amount of revenue the site must generate, it’s time to invest a little bit in your infrastructure to make sure you’re online 24/7.

    Get a collection of midrange boxes, set them up with your choice of Java app server (Resin, JBoss, Weblogic Express all come to mind), cluster them, and stick them behind a BigIP load balancer, and you’ll be sleeping soundly at night. This reliance on disk-IO-heavy PHP and subpar blogging software is for the birds. Throw Magnolia onto a real app server, go heavy on the in-memory-cache (which you can do, since most of your pages are anonymous) and you’ll run out of bandwidth long before you kill your app servers.

  • I haven’t been able to write on my Word Press blog in months - it just won’t work. I think they’re a great interface but I’ve had nothing but problems with their technology and they don’t have much customer support it seems.

    I noticed TC loads sooo slow now on my PC, sometimes I skip coming to read posts because I don’t have time to wait for it to load - I think it’s good you’re working out the growing pains :)

  • re: mybloglog

    We, at Megite, saw the slow down from mybloglog widget too. I think Mybloglog has to take action to guarantee the performance, or people will start to give up on them.

  • I have had nothing but problems with media temple. I was going to make a bad joke about how I have more traffic than you but I won’t.

    Bottom line: Media temple is not a hosting company I would recommend.

    I am going to check out ev1, the planet or dreamhost.

  • It’s not MediaTemple, it’s all about wordpress/plugins/widgets/etc. I agree with what Mark above says in that with a good java middleware cluster you can sleep at night, but there is a lot more that can be done with php/wp/etc. before such a big move is made

  • It sounds like there is a lack of accountability in multiple parts of your equation.

  • Your site is designed by “WeBreakStuff.” How fitting.

  • Try putting all the ads inside of an iframe, so the main site loads first and then things like ads and widgets load second.

  • If it’s not MT, it’s a miracle… The MT myth bubble is bursting at the seams… the last host I ever needed if only I’d killed myself.

  • What I’m wondering is: why did you not test it on the new platform before upgrading?

    I did that and realized there was a major flaw in one of the plugins I was using so I had someone update it for 2.1 and now it works great.

    Isn’t that just a generally good practice to test things before upgrading live?

  • Now that you’re on WP 2.1, you should activate MySQL query caching. The posts queries (like the rather expensive front page query) are static in 2.1, so they can be cached. It also may be worth looking into an in-memory object cache backend (like APC or Memcached).

    Another thing you can do is identify pages where the cookie shouldn’t affect the page’s display. For instance, the front page. You can then alter WP-Cache so that it only uses the URL when generating that page’s caching hash. That’ll increase the number of hits that can be served from the HTML output cache (which is much faster than a dynamic load).
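
The URL-only hashing idea above can be sketched roughly as follows. This is not WP-Cache's actual code (that plugin is written in PHP); `COOKIE_INDEPENDENT_PATHS` and `cache_key` are hypothetical names used only to show how leaving cookies out of the key lets every visitor share one cached copy of a page.

```python
import hashlib

# Hypothetical list of paths whose HTML does not depend on the visitor's cookies.
COOKIE_INDEPENDENT_PATHS = {"/"}


def cache_key(path, cookies):
    """Build the key used to look up a cached HTML page.

    For cookie-independent paths only the URL path is hashed, so all
    visitors hit the same cache entry. Elsewhere the cookies are part of
    the key, so each distinct cookie set gets its own entry.
    """
    parts = [path]
    if path not in COOKIE_INDEPENDENT_PATHS:
        # Sort so the same cookies in a different order give the same key.
        parts.extend(f"{name}={value}" for name, value in sorted(cookies.items()))
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

On the front page, `cache_key("/", {"session": "abc"})` and `cache_key("/", {"session": "xyz"})` come out identical, which is what raises the hit rate; on any other path the two keys differ.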

  • 1. Static rendering is your friend.

    2. Lose the widgets.

  • We had to switch from Wordpress to Drupal. We have not looked back.

  • You must be very lucky with Media Temple. I am having a lot of difficulties with (mt) and I signed up with them less than a month ago.

  • Waaah! My finicky Ferrari that lays the golden eggs, is in the shop.

    Um, it’s a little hard to work up sympathy for something that makes a butt load of cash, but can’t afford test/staging systems and/or a freakin rollback plan or a backup colo, etc. Most would label this bad systems management, design and processes. But the funny thing is, that it just generates more attention and hits for your sites. Kudos on making meta-lemonade.

  • When the Know More Media network slows down, we affectionately call it “Slow More Media.” Not fun, but it happens sometimes with all the code flying around and doohickeys needing to be loaded just right. I wonder if slowdowns and this sort of issue will still exist in say, 2017. Best wishes with that, Mike!

  • Wow…for such a big site…you’d think that there would be some level of professionalism and care taken with the site structure…but I don’t see it…

    What a joke…I’m sorry to say!

  • Mike, if you’d like I’d be happy to chat about doing some infrastructure sharing. Over the last 4 months we’ve solved almost every issue (the exception being widgets) that you’re running into with TC’s growth.

    It’s never easy, and it always takes more time and money than you think it will. We’re doing this with a number of larger sites right now and everyone’s happy about it. It lets you just worry about blogging instead of all the other crap.

  • Hello all. I’m Chris Lea from (mt) Media Temple. There’s been some conjecture here about what the problems were with techcrunch.com and what we did to fix it, so I’d like to set the record straight. We’re all fans of transparency. :)

    I’ll start off by clarifying things that were NOT the problem.

    1) The problem wasn’t that we don’t have staff on hand to look at things. We have 24/7/365 support. This account is flagged such that if there is a service down issue, either myself or Daniel Greene (another of our engineers) is to be located no matter what time of day it is. This is because the two of us are very familiar with the server set up. The amount of time it takes me to wake up at 4am, walk over to my computer, and shell in is less time than it would take for an engineer who is not familiar with the site to look around and see what is going on, so that’s why I got the phone call.

    2) The problem was not hardware related. The site has more than enough hardware for the traffic it is currently getting.

    3) The problem was not Wordpress. There were a few issues that came up because of a couple of Wordpress plugins that are really badly written and were beating up the database, but they were identified and disabled. Wordpress itself did not cause any issues.

    So, now moving on to what the issues actually were. Essentially, what was happening is that in circumstances where you had the combination of a traffic burst (say from Digg) and any slowdown in any part of the system (say from widgets) there would be a snowball effect with the apache processes. In the forking model that apache is running under, when a widget is slow to load, an apache child stays open waiting for the request to complete. During that time, the child is unable to accept any new requests, so when the next request came in apache would fork another child to handle it. Under a traffic burst situation, this would rather quickly run apache up to the maximum number of allowed children. At that point, the next connection would appear to just hang and the site would appear non-responsive or very slow. Increasing the number of children apache was allowed to spawn to a number high enough to handle all the requests ate up all the RAM on the box so that wasn’t a viable solution.

    The thing that did fix the situation was switching techcrunch.com to being served by lighttpd. I had been thinking about doing that for a while, but I hadn’t run it before on a site with these operating parameters. And, I didn’t understand its blocking model as well as I do now, so I wasn’t sure that switching to lighttpd would have fixed things.

    Fortunately, the great Matt Mullenweg shot an email over saying “if you’re running apache and mod_php, you’re likely to see behavior like [exactly what we were seeing] … you should run something like LiteSpeed or lighttpd.” Now, since Matt is one of the smartest guys I’ve ever talked to in this industry, I was more than happy to believe him and we switched over to lighttpd that evening.

    That solved the issue because lighttpd has an event driven architecture and therefore doesn’t block on things the same way apache does. We have not had any material issues with the setup since then.

    So, I hope this clears things up for those parties that are interested. If you’d like to know more about anything in particular, I can be reached at chl@mediatemple.net.
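
To make the blocking versus event-driven difference described above concrete, here is a rough Python sketch. It is not a model of Apache or lighttpd internals: the burst size, child limit and wait time are made-up numbers, and `asyncio.sleep` merely stands in for a request that is waiting on something slow. It only shows why a capped pool of blocking workers serves a burst in waves while a single event loop can hold all the waiting requests at once.

```python
import asyncio
import math
import time


def prefork_drain_time(requests, max_children, seconds_per_request):
    """Time for a capped blocking pool to drain a burst that arrives all at once.

    Each child handles one request at a time, so the burst is served in
    waves of at most `max_children` requests.
    """
    waves = math.ceil(requests / max_children)
    return waves * seconds_per_request


async def handle_request(seconds_per_request):
    # Stand-in for waiting on something slow. While this request waits,
    # the event loop is free to start and resume other requests.
    await asyncio.sleep(seconds_per_request)
    return "ok"


async def event_loop_drain_time(requests, seconds_per_request):
    """Measure how long one thread takes to serve the same burst with an event loop."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(handle_request(seconds_per_request) for _ in range(requests))
    )
    assert all(result == "ok" for result in results)
    return time.perf_counter() - start


if __name__ == "__main__":
    burst, children, wait = 200, 20, 0.2  # made-up numbers for illustration
    print("blocking pool:", prefork_drain_time(burst, children, wait), "s")
    elapsed = asyncio.run(event_loop_drain_time(burst, wait))
    print("event loop:   ", round(elapsed, 2), "s")
```

With these numbers the blocking pool needs ten waves, while the event loop finishes in roughly the time of a single wait. Raising `children` shortens the first figure, but each extra child costs memory, which is the RAM limit mentioned above.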

  • today it’s down too:

    “The page you are looking for is temporarily unavailable.
    Please try again later.”

  • Hahaha, yeah I came here after today’s downtime…

    Funny… This is a two year old thread almost to the day!
