Conditional-tier rendering: the battle of Server + innerHTML vs. JS MVC + JSON
The cool new debate these days seems to be about whether you should build web apps in the new-age rich client MVC style of Backbone/Ember/Enyo/Spine, or if you should just keep it simple with good ole
innerHTML and have the server send rendered views back to you (as characterised by dhh).
I think that it behooves us to take apart some of the factors involved in the core issue of delivering a Web experience to our users. Although so many love to find black and white, I think that once again you find the somewhat unsatisfying grey.
I touched upon conditional tier rendering before when talking about Twitter and how they made some big changes in their latest #NewNewNewTwitter release. They slayed some hashes in URLs (more to come!) and got rid of the deadly spinner by sending back a rendered Tweet.
Twitter is a perfect example of why you would want to do this. The leaf node is just showing you 140 characters to read! Why would you not render that as fast as possible for your users to read? Note, that this is vastly different from a very rich app-like experience.
Let’s take a look at some of the vectors of the discussion. First up…
How do you like to develop Web applications?
We have seen this play out time and time again. Take something like GWT. Although I would personally never want to build anything in it, that is mainly because:
- I don’t enjoy writing Java code (it just isn’t fun, and I prefer less boilerplate)
- I don’t want the abstraction, I have spent years learning HTML/JS/CSS and don’t want to have to punch through
- I found that good libraries handle the browser abstractions
I totally understand why others like GWT, and how projects would use it…. especially based on the team. If you were an Enterprise Java shop that was holding its nose to do a Web app…. go for it! Pretend you are on the JVM all the way! GWT was also an amazing piece of work. I have the utmost respect for the team that created it. You can create a great application using that approach.
So, if you are someone (or are a team of someones) who much prefers a simple
innerHTML contract where you can focus on creating views on the server, go for it! (just make sure that you understand the tradeoffs below)
The simplicity, especially for a pretty simple app, is very handy indeed. Users are used to using Web apps in this manner, and apps have been created this way for a long time.
Let’s talk about performance
Performance obviously matters. At a high level, the client JS guy will say “you don’t want to do page reloads for no reason! Web 1.0 is for losers!”. The server guy will retort with “I don’t want no stinking inline spinners all the time, and you have to wait forever for the first launch experience!”
The “first launch” experience is something that you should consider. We saw the impact that Twitter has had by showing you the Tweet vs. the loading indicator. One page comes down with everything vs. getting the app and then going back to ask a JSON service for the content to render. I see this same issue in other experiences too. When I tap on a link from a LinkedIn email, it feels like I get redirected a bunch of times before I get to
touch.* and a big spinner is in my face. All this when I just want to accept a darn request!
Make sure to think about the first launch experience, and since this is the Web it can come via a deep link as much as the home screen. There are a couple of approaches to take:
Send back the content needed to render the view
It is important to separate the format of the content. Your only choice isn’t to send back fully rendered HTML. Another choice is to piggyback the data needed to render the view along with the app itself. You see this a lot on sites from Google such as Google+. It can fit in nicely because you get to develop your client JS oriented app, but for performance you get everything that is needed to render in one go (so you don’t have to go back to a JSON service). This is a great compromise. In modern browsers, how much of an issue is having the JS render the layout? I would argue it is trivial (especially compared to the time it takes to get the data). Of course, if you are using chunked encoding, the browser first grabs the app (the server can send this down immediately) and then the data arrives in another chunk. Lovely jubbly!
Depending on how long it takes to get the data, you could argue that this is better for performance than having the response stall halfway through server-side generation while the server fetches the data, only then sending the final chunks.
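Here is a minimal sketch of that piggybacking approach. The names (`__BOOTSTRAP__`, `buildPage`, `firstRender`) are illustrative, not from any real framework; a production version would also need to escape `</script>` sequences in the serialized JSON.

```javascript
// Server side: embed the data needed for the first view directly in the page,
// so the client never has to make a separate JSON request to render it.
function buildPage(appScriptUrl, bootstrapData) {
  return [
    '<html><body>',
    '<div id="app"></div>',
    // The serialized view data rides along with the document itself.
    '<script>window.__BOOTSTRAP__ = ' + JSON.stringify(bootstrapData) + ';</script>',
    '<script src="' + appScriptUrl + '"></script>',
    '</body></html>'
  ].join('\n');
}

// Client side: render immediately from the embedded data when it is present,
// and only fall back to hitting a JSON service when it is not.
function firstRender(global, renderView, fetchJson) {
  if (global.__BOOTSTRAP__) {
    return renderView(global.__BOOTSTRAP__); // no network round trip
  }
  return fetchJson().then(renderView);       // slow path: ask the service
}
```

The same client rendering code runs in both cases; only the source of the data changes, which is what makes this a compromise rather than a rewrite.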
Send back a fully rendered view
This is a very simple pattern. On first launch you render the entire UI, and then for later actions you get pieces that you innerHTML into place.
Your app may fit nicely to a model where you can wipe a particular div area as you go through actions too. This theory can also break down. What if the view you send back also needs to change an area elsewhere? Do you have the HTML also carry JS that can tweak that area? That gets ugly. Or do you have that other section work separately, polling and updating on its own? That could be wrongheaded too, depending on how coupled the areas are.
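A quick sketch of that pattern and where it starts to strain, using plain objects in place of real DOM nodes (the function names and response shape are made up for illustration):

```javascript
// The happy path: wipe a region and drop the server-rendered view in.
function swapView(container, renderedHtml) {
  container.innerHTML = renderedHtml;
}

// The hack: the server piggybacks patch instructions for *other* regions
// alongside the main view, and the client has to know how to apply them.
// Coupling between unrelated areas creeps in fast.
function applyResponse(regions, response) {
  swapView(regions[response.target], response.html);
  (response.patches || []).forEach(function (p) {
    regions[p.target].innerHTML = p.html; // side effects on other areas
  });
}
```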
The MVC approach allows you to nicely separate all of this logic in a way that makes sense, and you don’t have to get into these hacks. Each view has a nice grouping of what it needs to be rendered. You do have to take the time to set up the coupling of the various functionality. This is exactly why we created Lumbar. From a given view we can define the dependencies, so we load in what is needed, but that doesn’t have to mean loading the entire application, which could contain a ton of functionality that the user will never access. If the user does access something that isn’t loaded, we load that in quickly before you move on. You can spend some real time defining what should get loaded ASAP, and what could get loaded lazily (and preload if the user isn’t doing anything etc).
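To make the dependency idea concrete, here is an illustrative loader (this is not Lumbar’s actual API, just the shape of the idea): each view declares its module dependencies, and modules are fetched only the first time a view that needs them is entered.

```javascript
// modules maps a view name to the modules it needs; loadFn actually fetches
// a module (script tag injection, XHR, etc.) — here it is injected so the
// loading strategy stays pluggable.
function createLoader(modules, loadFn) {
  const loaded = {};
  return {
    // Ensure everything a view needs is present, loading only what is missing.
    ensure(viewName) {
      const deps = modules[viewName] || [];
      const missing = deps.filter(function (d) { return !loaded[d]; });
      missing.forEach(function (d) {
        loadFn(d);        // fetch on demand
        loaded[d] = true; // never fetch twice
      });
      return missing;     // report what had to be pulled in
    }
  };
}
```

A preloader could simply call `ensure` for likely-next views while the user is idle.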
Perceived performance of views
We used to blank out a page and reload the entire canvas on each action. Then we nuke an area as we
innerHTML something new in its space. This is often not ideal. Let me share an example. In the Walmart application, you can do a search for a product. The list of products that comes back has enough data to “preview” each product. This allows us to be smart and take advantage of the fact that we have already downloaded all of the views (templates) needed for the given flow. When you tap a product from the list of results, the client can render a product details view and place some of the data inline, all without even having to touch the network. Then we go out, get the extra info, and fill it in. This is a common pattern when you try to make the user feel like the app is snappy. Instead of having a view kick off an action, then show an entire view when the data is back (forcing the user to just stare at the darn spinner), you can get the new view loaded in place, ideally with some good data for the user to soak in as the rest fills in. Much better. Much more control over the experience. Much harder to do by scraping some data out of the DOM that came back.
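The render-now, fill-in-later flow above can be sketched in a few lines. `renderDetails` and `fetchFull` are stand-ins for your real view render and network call:

```javascript
// preview: the slice of data the search-results list already holds.
// renderDetails: renders the product details view (re-renderable).
// fetchFull: goes to the network for the rest, then invokes its callback.
function showProduct(preview, renderDetails, fetchFull) {
  // 1. Render instantly from cached list data — no network touch,
  //    the user sees the product view right away.
  renderDetails(preview);
  // 2. Then fetch the remaining fields and re-render with everything.
  fetchFull(preview.id, function (full) {
    renderDetails(Object.assign({}, preview, full));
  });
}
```

The view appears twice: once partial and immediately, once complete when the network catches up, which is exactly the “soak in good data while the rest fills in” experience.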
LocalStorage and caching
You can use local storage and local database access to aggressively cache the application JS, CSS, and templates, and even content, with various TTLs. You can even get fancy: when the app upgrades, deltas are sent down and applied.
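A minimal sketch of TTL caching over a localStorage-shaped store (anything with `getItem`/`setItem`). In the browser you would pass `window.localStorage`; the clock is injected here so expiry is testable:

```javascript
// Wrap a localStorage-like store with per-entry TTLs. Stale entries are
// treated as cache misses, so templates/content get re-fetched naturally.
function ttlCache(store, now) {
  now = now || Date.now;
  return {
    set(key, value, ttlMs) {
      store.setItem(key, JSON.stringify({ value: value, expires: now() + ttlMs }));
    },
    get(key) {
      const raw = store.getItem(key);
      if (!raw) return null;
      const entry = JSON.parse(raw);
      if (now() > entry.expires) return null; // expired: miss
      return entry.value;
    }
  };
}
```

Templates might get a long TTL and content a short one; the same wrapper serves both.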
What about Googlebot?
With conditional-tier rendering, one of the benefits often mentioned is that you can choose to render the DOM on the server and send it down. Above I mentioned that, if the client is decent, it can actually be easier to still send down the same old JS application with enough data to render the particular view.
This doesn’t work if you care about Googlebot (even if it is getting better with JS these days!). In this case, doing the rendering to HTML makes more sense. The ideal conditional-tier rendering platform would allow you to flip a bit to send it all down.
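The “flip a bit” idea can be sketched as a single decision point in the server: the same route serves fully rendered HTML to crawlers (or when forced), and the JS app plus bootstrap data to capable clients. The `isCrawler` check here is a deliberately naive illustration, not a real bot-detection strategy:

```javascript
// Naive crawler sniffing — real detection is harder; this is the shape only.
function isCrawler(userAgent) {
  return /googlebot|bingbot/i.test(userAgent || '');
}

// One route, two render paths. renderHtml and renderApp are the two tiers;
// req.forceHtml is the "flip a bit" override.
function respond(req, data, renderHtml, renderApp) {
  const wantsHtml = req.forceHtml || isCrawler(req.userAgent);
  return wantsHtml ? renderHtml(data) : renderApp(data);
}
```

The same `data` feeds both paths, which is what keeps the two tiers from drifting apart.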
What about mobile?
You may also want to choose to send back a full HTML view to particular mobile clients. You also may want a different profile based on the different world that mobile devices can live in (cell tower latency, device capability constraints, etc).
Also, how many experiences are you developing? If you have native clients, that may change your emphasis. All of a sudden having the Web application be “another client to the JSON services” doesn’t actually seem that bad. Instead of forcing another tier, you can get benefits from a client-oriented approach. With a smart framework that can render the client code on the server, you can make the right optimizations without having to write a bunch of server code just for Web.
It seems like a lot of people think of analytics at the end of a project. I think it is better to make “what do we want to measure, where, and how” a first-class conversation, else you get into trouble. As a result, you need to make sure that setting up analytics isn’t painful in different ways depending on how your app is rendered at a given time.
There is no “right way” to build a responsive Web application. There are plenty of tradeoffs in various approaches, and as long as you are thinking through them to make sure that you deliver the best experience possible…. good on ya.
For any application that goes beyond something simple, I think we are still finding the best way to get conditional-tier rendering to give us the knobs and dials to tweak for developer productivity AND performance. There are a variety of ways to get an optimized first use experience and then bootstrap a non-reload experience after that. Next up I need to show the code for some of this.