Fairytale about performance in web application

August 13, 2018


Once upon a time, in a very far-off land, there lived front end developers who had been so fortunate to develop web UI on their powerful laptops all the time that they became completely careless about the performance implications of their code. It happened one day that one man, whose name nobody remembers any more, tried to run the application on an average-priced mobile phone. What happened after that is truly astonishing… All we know is that the King of that land was quite upset and couldn’t sleep for decades. Well, you will be upset as well when you see JavaScript execution blocking the browser’s rendering engine for 30 sec!

It was a regular sunny day; summer had just begun and it was already quite hot. There was something different about that day, and I knew that something very special was about to happen...

An old friend of mine contacted me with the following words:

“Hey man, we have this web application, ‘start up’ and all that kind of thing. After a year of active development, right before release, it turned out that performance on mobile kind of sucks. The app is about searching for trips, basically, and when we do a search, the list of results behaves poorly. Could you have a look and suggest how to improve that?”

Bad performance on mobile for a list of results. How big is the list? Well, I’ve seen that many times: animations that are not smooth, scrolling that is sometimes not as responsive as expected, right? Or maybe you just have too much re-rendering, and tweaking shouldComponentUpdate will solve it all at once? Yeah, sure, I am glad to help, sounds like it’s gonna be fun 😺

The next day I got all the credentials and permissions for the source code: git clone, yarn & yarn start and I am ready to have fun! So, what have we got here? I ran the application once on a new MacBook Pro: well, some animations twitch before the main UI elements and the list render, but nothing outstanding. Well, let me do a 6x CPU slowdown via Chrome dev-tools. Okay, there is something.. erm.. why are half of the visual components not rendering at all now? Command+R. Again. Wait a bit. Here we go.. 26 sec and I have a fully rendered UI ready to use. Not fun 🙀

Let me highlight this again: the web page freezes for 26 sec and the UI is completely unresponsive. A React-based web application with something like 20 items in the search result. In Chrome, not even IE8 back in the day. What have you done there, guys?

Where should I start? Well, let’s start from package.json so I can check quickly what is in use, main libraries and tools. Alright, what we’ve got here:

  • react
  • react-redux
  • reselect, redux-saga, router

Well, quite a standard stack these days, right? Wait, what else is there?

Wait a sec, why do you need both redux and mobx? Well, maybe it’s just legacy. A bad sign already, but, moving on.

A quick look through some folders and the entry point file. Let’s put a few breakpoints here and there, a few console.logs, and get more knowledge about the code base and flow. Learn the code style. Learn the main features and use cases. Where is the data coming from? How is it parsed and applied to the UI? What are the main UI components? Use React dev tools and inspect HTML elements to understand how the layout is built.

15 minutes later.. you know the code base!

Time to run Chrome profiler to measure performance.

Scripting is quite heavy. No surprise though.

What can we get from it? Well, if you have ever run the Chrome profiler on a messy, unknown code base, it can scare you at first. It is one of those scenarios where there is so much information on screen that you literally don’t know where to start. First of all, you can see the red marks on the timeline, which means there are real performance issues that even the profiler can detect without your analysis.

What I usually do as a quick first step is check the ‘hot functions’ in the Bottom-Up list. Because if there is a function which does a gazillion iterations calculating the number PI to 15 digits after the decimal point, you will spot it right there at the top of the list (hah, if only it were this easy).

Anyway, in the profiler results I noticed mobx, mobx-state-tree and one file from the project code: journeys.js (as I mentioned, the project is in the tourism/travelling domain, so you will see some related names). My first thought was ‘What is mobx-state-tree?’ since I hadn’t worked with it before. I did some googling, checked the GitHub page and checked how it’s used in the code. Well, it’s a library which provides you with tooling to wrap your data into models and combine them into one big model (a tree). Then you just work with the store object, calling the models’ methods to access or mutate its data.

Sample from the repo README

That was the moment when the first bell rang. Since it is integrated with mobx and works based on observables, I got a hunch that with a big number of models it might be a problem. At that moment I had no proof, and I knew the number of models wasn’t that big, so I just moved on.

If you remember, the next file in the profiler was journeys.js. Well, I liked that. I mean, I have this good feeling of confidence when I can actually ‘touch’ the code, i.e. modify it right there and see immediate feedback; it just feels like a lot of control at your fingertips. I opened the file and found the line the profiler pointed to. Here is what I saw:

Server-sent events?

Here is how it works: when the user presses ‘search’, a session (based on search parameters, etc.) is created. Then a server-sent events subscription is created, and the server pushes search results to you via events.

If you think about it, it’s a pretty wise approach. Imagine a regular REST call where it takes quite some time to process the result and send it back to the client. The user, same as the entire application, just waits for the response and is blocked from doing other stuff (since the data is not there yet). But what if you could start responding to the client right away, without waiting for the entire task to finish? Yep, here we go, event-based communication solves that.

When a message event is received from the server, the JSON data is parsed into an object and passed on to a callback, which is our entry point for processing search results.
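That flow could be sketched roughly like this. It is not the project's code: the function name `subscribeToSearchResults`, the `onResults` callback and the URL in the usage comment are made up for illustration. Only `EventSource`, its `message` event and `JSON.parse` are standard browser APIs.

```javascript
// Minimal sketch: subscribe to server-sent search results.
// `source` is an EventSource (or anything exposing the same listener API);
// `onResults` is a hypothetical callback, the entry point for processing results.
function subscribeToSearchResults(source, onResults) {
  const handleMessage = (event) => {
    // Each SSE message carries its payload as a string in `event.data`.
    const payload = JSON.parse(event.data);
    onResults(payload);
  };
  source.addEventListener('message', handleMessage);

  // Return an unsubscribe function so the caller can stop listening.
  return () => {
    source.removeEventListener('message', handleMessage);
    if (typeof source.close === 'function') source.close();
  };
}

// Browser usage (the URL is invented for the example):
// const stop = subscribeToSearchResults(
//   new EventSource('/search/SESSION_ID/events'),
//   (results) => console.log(results)
// );
```

Passing the event source in as an argument, instead of creating it inside, is also what makes it easy to swap the real server for mocked data later.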

Well, now I feel confident, I know exactly how data is coming, I can start mocking it, playing around with it and doing all other illegal fancy stuff 👌

No sooner said than done. I ran a search to get a huge response, copied the JSON, saved it, and used it to mock the communication with the server.

I had static data at that point, which is a very important first step, because none of the measurements I was planning to do would be compromised or polluted by changing data.

“It must be about the amount of data,” I told myself. I mocked 100 items in the search result (instead of 400 before) and, guess what? Hell no, it still sucked. Not 26 sec, but still far from acceptable. WhaT a Fun?

It was almost evening by then. I definitely didn’t want to go to sleep with a half-opened problem, because my brain will not allow me to sleep if I don’t give it at least some reward. But I had nothing. Yeah, I had got familiar with the codebase, had mocked data and several hints from the profiler, but that was far from satisfying. So I started playing around, commenting out code here and there, testing again and again and again, hoping for a quick win. I was trying to localise the part of the application which affected performance. Yes, I knew it was right after the data was received by the journeys service (obviously), but then the trace just disappeared…

“It can’t be bad everywhere, right? It’s probably one shitty component somewhere which just sends the entire application to hell?”

At one point I just removed entire rendering, meaning no react-renders at all, and performance was still same, i.e. awful.

Once again: I removed the entire rendering and did not see any improvement. How is that possible? Exactly! The bottleneck was not in rendering, but in data processing. Right in the middle, between the point where data is received from the server and the point where it is passed to UI components. Wow, that’s mind-blowing. I mean, who expected that? All performance issues in the XXI century were supposed to be rendering related, not plain JavaScript.

“Well, that’s even better, it will be simpler to fix in Vanilla JS, right?” a naive voice in my head wondered.

A new day. What was in between the API call and the visual components? mobx-state-tree model creation. I put start and end window.performance.now() logs right after the data was passed to the model tree and after the models were created:
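The measurement itself is nothing fancy. A minimal sketch, with a stand-in workload where the real code would hand the parsed data to the models (the `measure` helper and its `label` are my own names, not something from the project):

```javascript
// Minimal sketch: time a synchronous piece of work with performance.now().
// `label` and `work` are placeholders; in the story `work` would be the
// call that passes the parsed data into the mobx-state-tree models.
function measure(label, work) {
  const start = performance.now();
  const result = work();
  const durationMs = performance.now() - start;
  console.log(`${label}: ${durationMs.toFixed(1)} ms`);
  return { result, durationMs };
}

// Example with a stand-in workload instead of real model creation.
const { result } = measure('create models', () => {
  const models = [];
  for (let i = 0; i < 800; i++) models.push({ id: i });
  return models;
});
console.log(result.length); // 800
```

This only works because model creation is synchronous: everything between the two `performance.now()` calls runs on the main thread without interruption.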

24 seconds for around 800 models

I tried with about 3 times fewer models:

Obviously, it depends linearly on the number of models..

Let me justify the numbers. We are not talking about hundreds of thousands here. The app has the following models:

  • journeys
  • routes
  • segments
  • …and few more which do not affect big picture

They are combined in a tree-like way: the ‘journeys’ model contains ‘routes’, the ‘routes’ model contains ‘segments’, etc. So, let’s say it creates 50 journeys, 100 routes for them + 75 segments. That’s about 200 models, and it takes 5 sec to create them.

“It’s time to get to know you a bit better, my little friend,” I thought to myself, and went to check out the Git repo of mobx-state-tree.

I started by checking the issues filed for the repository. It’s good practice to do that before choosing a module as the base for your application code. Well, too late here.

I searched for ‘perf’ in the issues and got a few hits. A quick read through them. “It’s not only in my head, other people are complaining about the same thing.”

But it seems the maintainers are going to fix that. In fact, they have already fixed it, but the fix will be released in the next major version 🙌 (mst v3 had not been released yet at the time of the events described, but it has been released now)

But (again) we obviously can’t just wait until it’s released (and hope it has fixed performance): the guys need to release their app as well. Yesterday.

But (I know) what else can I do? (considering time/work/money concerns)

So, you have this ‘not pretty’ thing in your codebase that you have to live with. You can’t avoid facing it. You can’t run away from it, it’s just too big. You have these fighting thoughts that you obviously should just cut that code out and maybe re-write that functionality, and, maybe, even make it work without bugs. And, maybe, even make it in time…

Welcome to the real world of software engineering. Tough choices. No one told you that before you did your first image carousel with jQuery, right? 😹

Anyway. No time to re-write, let’s see other options. So, once again:

Problem: model creation blocks the JS thread, and the user cannot see any updates on screen for 25 sec.

1st cheap solution: repeatedly unblock the main thread so that UI update tasks can be picked up by the event loop (check out the video here if it’s a new concept for you)

How: chunks. Simple as that.

The chunking approach is a classic for handling big amounts of data, long lists in the UI, etc. You have definitely used it many times in your applications: lazy loading, pagination, all the same idea.

If you have a heavy task, you can always split it into portions (chunks) and do it chunk by chunk, taking a pause in between if needed.

Applied to our situation, I was thinking of splitting the data right after it was received into chunks of 20 journeys and then sending it on chunk by chunk, pausing in between for a while to allow the UI to update. Easy peasy.

Simplified, just to get the idea.
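In case the idea is easier to read as code, here is a minimal sketch of chunking with pauses. It is not the project's implementation: `splitIntoChunks`, `pause`, `processInChunks` and the `handleChunk` callback are names I made up, and the pause is a plain `setTimeout`.

```javascript
// Minimal sketch of the chunking idea (not the project's actual code).
// Split an array into consecutive slices of at most `size` items.
function splitIntoChunks(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Resolve after `ms` milliseconds; awaiting it hands control back to the
// event loop so the browser can render and handle input in the meantime.
const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Process the data chunk by chunk, pausing after each chunk.
// `handleChunk` is a hypothetical callback, e.g. "create models for these journeys".
async function processInChunks(items, size, handleChunk, pauseMs = 0) {
  for (const chunk of splitIntoChunks(items, size)) {
    handleChunk(chunk);
    await pause(pauseMs);
  }
}
```

Each `handleChunk` call still blocks the thread while it runs, so the chunk size decides how long the UI stays frozen at a time.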

I had one issue so far. As I mentioned, the tree has the following models: journeys, routes, segments. Linked by IDs. They came from the server normalised, so I needed to connect them manually to make sure a chunk of 20 journeys had exactly enough data to work, no more, no less. The trick here is that I needed to do it as efficiently as possible, since I had no luxury of writing non-performant code 😹

Task: create a complete data structure from normalised arrays of models, i.e. search long lists of objects by ID and combine the objects together.

Solution: well, definitely, when it comes to data structures and algorithmic complexity there are many ways of doing that. For lookups here, the ideal and simplest way is to convert the array into an object with the ID as key. You get O(1) lookups in 5 mins!
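A minimal sketch of that lookup trick, assuming each journey lists its routes in a `routeIds` array and each route lists its segments in `segmentIds`. Those field names, and both function names, are my guesses for illustration; the real payload may look different.

```javascript
// Minimal sketch: index normalised arrays by ID, then join them.
// The field names (`id`, `routeIds`, `segmentIds`) are assumptions.
function indexById(list) {
  const byId = {};
  for (const item of list) byId[item.id] = item;
  return byId;
}

// Build complete journey objects for one chunk of journeys.
function denormaliseJourneys(journeys, routes, segments) {
  const routesById = indexById(routes);     // built once: O(n)
  const segmentsById = indexById(segments); // built once: O(n)
  return journeys.map((journey) => ({
    ...journey,
    routes: journey.routeIds.map((routeId) => {
      const route = routesById[routeId];    // keyed lookup instead of Array.find
      return {
        ...route,
        segments: route.segmentIds.map((segmentId) => segmentsById[segmentId]),
      };
    }),
  }));
}
```

The alternative, calling `Array.prototype.find` for every ID, scans the whole list on each lookup, which is exactly the kind of hidden loop-in-a-loop I could not afford here.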

After all, I felt like it should be working better now. How much better? I need a way to measure that!

Here is the thing. You have probably heard many times that someone ‘feels like’ the app is slow, without even knowing your profiling numbers, and it doesn’t matter what your numbers say: the user is not happy. Having unhappy users is the last thing to dream about. The question is how you can measure a user’s feeling. Apparently, this task is not that hard either.

The user feels that something works slower than expected because, obviously, we all have a precise feeling for real time. If your app’s behaviour deviates from real time, guess what, the user will notice. Knowing that, we now know how to measure the user’s feeling: we should measure how our code runs compared with real time (I know, stupidly simple for a theory).
Stupid simple.

Run setTimeout (computer time) and measure how it differs from real time. If it’s quite different, it means some of your code execution took longer than allowed. You can find a gist with the complete code here.
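For a feel of how little code this takes, here is a minimal sketch of such a monitor. To be clear, this is my own reconstruction of the idea, not the gist itself; `startLagMonitor` and `onTick` are invented names.

```javascript
// Minimal sketch: detect main-thread blocking by checking how late a timer fires.
// `onTick` receives the delay (in ms) between when the timer was supposed
// to fire and when it actually fired.
function startLagMonitor(intervalMs, onTick) {
  let expected = performance.now() + intervalMs;

  let timer = setTimeout(function tick() {
    const now = performance.now();
    // If other code hogged the thread, the timer fires late; that is the lag.
    onTick(Math.max(0, now - expected));
    expected = now + intervalMs;
    timer = setTimeout(tick, intervalMs);
  }, intervalMs);

  // Return a function that stops the monitor.
  return () => clearTimeout(timer);
}

// Usage: log every tick that was delayed by more than 100 ms.
// const stopMonitor = startLagMonitor(1000, (lag) => {
//   if (lag > 100) console.log(`main thread was blocked for ~${Math.round(lag)} ms`);
// });
```

Timers are never perfectly punctual, so a few milliseconds of lag is normal; it is the hundreds of milliseconds that the user notices.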

Time to test it:

Well, definitely less than 26 sec..

So, here is the deal. The app has a progress bar with a timer (from 1–30 sec) at the top of the page, which progresses when the user runs a search. The timer should update each second, so the user sees how much time it takes.

Before: the progress bar hangs in the ‘1 sec’ state for 30 seconds after the user presses ‘Search’. Then it jumps to the end (if, of course, the user waited for 30 seconds with a blank screen and didn’t close the tab…)

After: the progress bar updates each second (so the user can see all the numbers from 1 sec to 30 sec), which is a significant improvement. But, as you can see in the measurement above, it’s not smooth, since it creates 20 models for each chunk and that takes a sec per chunk on average.

If there are any animations on screen, it’s quite noticeable that something is wrong: 1.18 sec, 1.89 sec, delays… Apparently it’s not enough. A much better user experience, of course, but no, still not acceptable.

“We can do better than that.” — I was challenging myself.

What do we know so far? Model creation is a heavy task. Even for smaller amounts, in chunks, it still affects UI re-rendering. But… do we really need them all, all the time? That’s a good question, which brings us to the second solution.

2nd solution: postpone heavy code execution (AKA ‘model creation’ here) until it’s impossible to wait any longer.

That was it 💡

There are 20 journeys in the list on screen. On scroll we add more, in steps of 20 as well, i.e. the user scrolls down and gets more journeys. That means that from a cold start we do not need more than 20 models. So.. why do we even bother about the other 700 models?

A few more minutes to create caching services and a communication bridge between the scroll event and model initialisation. That’s it. Now I didn’t pass data from the server straight to mobx-state-tree; instead, I cached the data in a plain JavaScript object and used it later to create models when the UI required them.
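Roughly, the idea looks like this. It is a minimal sketch under my own naming: `createLazyModelStore`, `addRawItems` and `loadNextPage` are invented, and `createModel` stands in for the expensive mobx-state-tree model creation.

```javascript
// Minimal sketch of "cache raw data, create models on demand".
// `createModel` stands in for the expensive model creation; it and
// `pageSize` are assumptions for illustration.
function createLazyModelStore(createModel, pageSize = 20) {
  let rawItems = [];   // plain data straight from the server, cheap to keep
  const models = [];   // models created so far, in list order

  return {
    // Called when data arrives: just remember it, do no heavy work.
    addRawItems(items) {
      rawItems = rawItems.concat(items);
    },
    // Called on first render and whenever the user scrolls to the end:
    // create models only for the next page. Returns how many were created.
    loadNextPage() {
      const next = rawItems.slice(models.length, models.length + pageSize);
      for (const raw of next) models.push(createModel(raw));
      return next.length;
    },
    getModels() {
      return models;
    },
  };
}
```

The scroll handler then only needs to call `loadNextPage()`, so the cost of model creation is paid 20 models at a time, and only if the user actually scrolls that far.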

Well, definitely less than 26 sec, about 28 times less…

Before: on pressing the ‘Search’ button, the progress bar updates each second (thanks to chunks). But the creation of 20 models for each chunk still takes a sec on average, so the progress bar hangs for that time, which is hard not to notice.

After: the progress bar hangs for 1 sec after pressing the ‘Search’ button, but then updates each second perfectly smoothly without any interruption. On scroll, though, it will hang for 1 sec again (since another chunk of models gets created), but this is a small evil we need to accept.

I thought about that again:

Originally, the user had to wait 26 sec with a blank screen until the app was usable. After a few very cheap fixes, the app is ready to use in less than 1 sec.

That’s it?

Alright, that was the elephant in the room. It was too big to get rid of, but with a few decorations I was able to make it way less noticeable. But what about the rest of the code then?

There was quite a lot of room for improvement in many other places:

  • there is momentjs, which is slow and big and can be replaced with dayjs to save you seconds
  • tracking logic is too heavy, data for tracking should not be composed in O(n²)
  • display values are composed each time, again, via a search in a list; they should be calculated once up front and cached somewhere
  • and much much more… but it’s already another fairytale

Remember the times when you did an animation with jQuery and it was lagging quite badly in Chrome, and you could still do the same thing with ActionScript 3.0 (Flash) and it was way better and faster? Or when you needed some big editable table to work in IE8 and, cutting here and there and using createDocumentFragment to reduce re-flows, you were finally able to prevent the browser from crashing? Time flies, and it seems like just yesterday Microsoft launched the Edge browser and no one gives a damn about IE8 anymore… Amen. Yeah, what a relief to finally drop support for a 10-year-old browser.

So, what is the new thing giving front end developers a headache? Mobile. In fact, it’s far from new; to be fair, the transformation happened very fast, and the shift of users from desktop to mobile happened like 3 or 5 years ago, right? Mobile browsers are our target audience now, so… please, stop testing everything on your new MacBook Pro!

Performance is not something you can simply ‘apply’ at the end; sometimes it’s impossible or very expensive to fix the consequences of bad code, application architecture or chosen libraries.

Take an average Android phone for $199, test your application on it, and you might have a story for your own fairytale…

