CGTRP

These are some notes for my own personal benefit. The title is necessarily cryptic, mainly to protect the author, whose work I had purchased.

The most important rule: most of the slowness of your site will come from a few sources. Accordingly, it makes no sense to optimize haphazardly, without measuring where the bulk of the slowness comes from. Solving a few well-placed problems will yield extremely desirable results. In other words, focus on the 20% of your code which will deliver 80% of the performance benefits.

A couple of key lessons:

  • (i) I will not optimize anything until my application metrics tell me to do so.

  • (ii) I must profile to see where the choke points are, and attack those choke points.

  • (iii) Programmers need to justify optimisations in terms of cost. Generally speaking, the faster the site, the more customers you can convert - especially on mobile devices, and for users in Asia, where the internet tends to be a little slower.

Performance Culture:

  • Quantify performance in terms of dollars, not seconds.
  • Set a front-end load time budget. Use webpagetest.org (the “document complete”, “fully loaded”, and “start render” metrics).

(Document complete: 3.6 seconds (too slow), First byte: 1.365 seconds (too slow))

  • Set a MART and/or M95RT: look at the average, and at what is going on in the tail (when the times are really slow). Set a budget for these, and the cost of exceeding the budget.
  • Set a page weight budget. Your page weight cannot exceed <projected user bandwidth in megabytes/second> × <load time budget in seconds>. In my case, I want the page to be able to be loaded on a mobile phone, so I am assuming 0.5 MB/second (the US mobile average).
  • Quantify integration costs: put a dollar value / traffic value on each second of load time.
  • Add automated performance and page weight tests to your CI. This is the real MVP.
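The page weight arithmetic above can be sketched as a tiny CI-style check. The 0.5 MB/s figure is from the notes; the 3-second budget and the `within_budget?` helper are my own assumptions, not from the book:

```ruby
# Page weight budget = projected bandwidth (MB/s) x load time budget (s).
# 0.5 MB/s is the US mobile average; the 3 s budget is an assumed figure.
BANDWIDTH_MB_PER_S    = 0.5
LOAD_TIME_BUDGET_S    = 3.0
PAGE_WEIGHT_BUDGET_MB = BANDWIDTH_MB_PER_S * LOAD_TIME_BUDGET_S # 1.5 MB

# Hypothetical CI assertion: fail the build if the measured page is too heavy.
def within_budget?(page_weight_mb)
  page_weight_mb <= PAGE_WEIGHT_BUDGET_MB
end
```

Wire `within_budget?` up to whatever measures your built page's total transfer size, and the build fails as soon as someone ships an oversized asset.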

Follow the Checklist for your app

  • This is outlined in the book without a need for replication here.

Profiling with Ruby-Prof and Stackprof

  • Using New Relic, Scout, etc. in production environments actually slows your application down. Secondly, they can never compare code before vs. after a change. Be aware of this.

The general process is the scientific method: (1) observe, (2) form a hypothesis, (3) develop testable predictions, (4) gather the data.

When the profiler lies

(1) CPU - Clock Counter

e.g. if your profiler measures CPU cycles and there are no CPU cycles running, it misses the time entirely: with sleep(4) the profiler says 0 seconds, but you know it must take at least 4 seconds.

  • Most CPU time measurements are system wide - background work in the Operating System will affect CPU times.

  • Use CPU time if you want to profile without I/O.
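The sleep(4) pitfall above is easy to reproduce with Ruby's stdlib Benchmark (a shorter sleep here to keep it quick):

```ruby
require 'benchmark'

# CPU time vs. wall time: sleeping burns no CPU cycles,
# so a CPU-time profiler would report ~0 for this block.
t = Benchmark.measure { sleep 0.5 }

puts format('cpu:  %.3fs', t.utime + t.stime) # roughly 0
puts format('wall: %.3fs', t.real)            # roughly 0.5
```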

(2) Wall Time

  • Think of a stop watch on God’s hand that doesn’t lie (well, sort of).

Wall time can be affected by:

  • Other processes - i.e. a resource utilisation issue. e.g. heavy lifting in the background (such as disk write operations) may affect your times.
  • Network or I/O conditions. e.g. times influenced by Redis - they might look really bad, but remember, these times can vary a lot, and they can obscure problematic code which may exist in your Ruby.

Warning: do not use wall time when lots of I/O is involved, e.g. accessing a network.
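A minimal way to take wall time yourself is the monotonic clock; the sleep here is just a stand-in for a slow Redis or network call:

```ruby
# Wall time keeps ticking through any waiting: I/O, network, sleep.
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
sleep 0.2 # stand-in for a slow Redis or network round trip
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

puts format('wall time: %.3fs', elapsed) # dominated by the wait
```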

(3) Process Time

Measures time used by a process between two moments. It will be unaffected by other processes on your system. But it doesn’t include time spent in child processes. If you use fork or spawn, this method of measurement will not be ideal for you.
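The fork caveat can be seen with CLOCK_PROCESS_CPUTIME_ID, which counts only the current process's CPU time (a sketch; fork is Unix-only and clock resolution varies by platform):

```ruby
# Process CPU time excludes children: the parent's counter barely moves
# while the forked child does all the waiting.
cpu_before  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
wall_before = Process.clock_gettime(Process::CLOCK_MONOTONIC)

pid = fork do
  sleep 0.3 # child-side work; invisible to the parent's process time
end
Process.wait(pid)

cpu_delta  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID) - cpu_before
wall_delta = Process.clock_gettime(Process::CLOCK_MONOTONIC) - wall_before

puts format('parent CPU: %.3fs, wall: %.3fs', cpu_delta, wall_delta)
```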

Tracing

  • ruby-prof is a tracer - it’s extremely accurate, but its overhead is too high for it to be used in production.

Sampling

  • Stop your code periodically, and record the call stack. The frames you see most often are where you are stalling.
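The idea behind samplers like stackprof can be sketched with a toy: a background thread that periodically snapshots the main thread's stack (illustrative only, not a real profiler):

```ruby
# Toy sampling profiler: snapshot the main thread's top stack frame
# every 5 ms; the frames sampled most often are the hot spots.
samples = Hash.new(0)
main = Thread.current

sampler = Thread.new do
  loop do
    frame = main.backtrace&.first
    samples[frame] += 1 if frame
    sleep 0.005
  end
end

def busy_work
  100_000.times { |i| Math.sqrt(i) }
end

100.times { busy_work } # the workload we are "profiling"
sampler.kill
sampler.join

samples.sort_by { |_, n| -n }.first(3).each do |frame, n|
  puts "#{n} samples  #{frame}"
end
```

Real samplers use signal-based interrupts rather than a sleeping thread, which is what keeps their overhead low enough for production.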
Written on December 29, 2020