Performance Awareness and Optimization

Explore several guiding principles that help drive performance, and learn to apply them to your software development context.

Having worked for over a decade on various client projects and products, I’ve had several opportunities to work on different aspects of web application performance. People often expect my team and me to “implement performance,” but that’s not how this works, of course. There’s no list of ten concrete things everyone needs to do, always, in order to make something perform better. What I’ve found works really well instead is to adopt a performance-oriented mindset and follow several guiding principles and high-level concepts, which then translate into different concrete actions in a given context.

I have always been interested in software and system performance, and so I began sharing things I’ve picked up here and there on different occasions. A couple of weeks ago, I gave a talk about this topic at WordCamp Netherlands. (If you are interested, you can find the slides online.)

In this post, I will share the gist of that talk. While there are code examples written in PHP or JavaScript, they are for illustrative purposes only. The concepts translate to almost any context and language, and you don’t need to understand or even look at the code to follow along.

What Is Performance Anyway?

Understanding what “performance” means is crucial before diving into best practices and related considerations. For many people, “performance” in the context of web development translates into front-end performance, Google Lighthouse, PageSpeed Insights, or Core Web Vitals. While that is not wrong per se, it is either too broad or too narrow a view.

In general, looking at performance means asking the following question:

“How well is a task being executed?”

The important bit here is the definition of “well,” which means setting the (main) criteria and expectations. The performance of a system can be based on time, for example, the processing time (i.e., on the server/computer side) or the response time (i.e., from the user/consumer side). It could also be about the resources used or the system’s availability. Or even personal preference: “Did you like this band’s or artist’s performance?” In web development, performance is often a mix of time and resource constraints.

Guiding Principles

Consider the following guiding principles. By understanding the high-level concept and applying it to a specific context, you will build a performance mindset.

  • Don’t do work too early.
  • Don’t do work over and over again.
  • Don’t do unnecessarily complex work.
  • Clean up after yourself.
  • Don’t optimize prematurely.
  • Do measure performance.
  • Use the right tool for the job.

As you can see, these items aren’t special in any way. In fact, most of them are nothing more than common knowledge and common sense.

Now, let’s look at some real-life examples of these, shall we?

Failing Early

Spending time or resources on something no one needs is senseless. When writing a function to process some data or perform some other action, it may be useful to check relevant preconditions and stop if something is not right. This raises the question, “Should I actually do anything?”, which covers both the current context (e.g., front-end vs. admin side, REST API request vs. regular front-end request, or logged-in user vs. anonymous visitor) and the data or processing logic.

Check the Context

Imagine a PHP function in a WordPress context that should only run for published posts in a specific custom post type. At a minimum, we are looking at two conditions: checking the post type and checking the post status. Does it matter in what order we do this? For this particular example, it may or may not; that purely depends on the project context. Are there many more published posts than non-published posts? Are there just a handful of posts in that special custom post type? How often do this function, and thus these checks, execute?

function myFunction( WP_Post $post ) {

    if ( $post->post_type !== 'special' ) {
        return /* data */;
    }

    if ( $post->post_status !== 'publish' ) {
        return /* data */;
    }

    // Perform actual task...
}

In general, this is an opportunity to start building your performance-oriented mindset. While it may not make a massive difference if you check the post status or the post type first, you have to decide on one. So, you may at least acknowledge that there are two options and think about the “better” one, given the information available.

If we were looking at a slightly different function, one that should run on published posts only, but only if another condition based on an external API holds true, this suddenly becomes a different situation. It should be obvious that checking the (scalar and immediately available) post type of the given post object first is the right thing to do, and only then sending the request to the remote server and waiting for the data, which we then process further.

function myFunction( WP_Post $post ) {

    if ( $post->post_type !== 'special' ) {
        return /* data */;
    }

    if ( ! expensiveCheck() ) {
        return /* data */;
    }

    // Perform actual task...
}

Check the Data

Similarly, let’s assume we are looking at code that applies various types of processing logic to data derived from the given input values. If there is no data to be found or no processors registered, all this function should do is check for data and processors (or maybe just one of these things) and then stop.

function processData( int $postId ) {

    // Get $data from somewhere...
    if ( ! $data ) {
        return;
    }

    // Get the registered $processors from somewhere...
    if ( ! $processors ) {
        return;
    }

    // Apply each processor to the data...
}

Caching

Having answered the “Should I do anything at all?” question from the previous section, the next important one to consider is “Should I do this thing again?” This means thinking about the time after which an action should be performed again, or about what state or condition change would have to happen for this action to be executed another time. So, we are talking about caching, among other things.

You may have heard people say caching is not the right way to address a badly performing system. And I agree with that, in general. But not in all contexts. There are different reasons for caching, and there are different ways and levels of caching. There are different pieces of data or logic one could cache. There is expiration and invalidation. There are different scopes in which to cache and so much more…

It’s true that, in some situations, caching will not fix the root cause but “only” address one or more symptoms. So caching isn’t a silver bullet; it should never be the only thing one does regarding performance. But not caching at all may be an even bigger problem.

Levels of Caching

Different codebases present different levels, places, and scopes where you can apply caching. The one most people think of when they talk about caching is a proper long-lived cache, for example, a dedicated object or page cache, a browser cookie, localStorage, or a dedicated table in your existing database. These are the ones where you definitely want to have correct expiration and invalidation logic in place.

One level “down” would be a semi-persistent caching layer, like the current PHP or MySQL (or other CLI) session. Moving on from this one, we would look at the current request or the lifecycle of a single interaction. In PHP, using a global variable would be one form of request-based caching.

The next level would be a local cache, which uses just a part of the current request/interaction, for example, caching data in a static variable in some function or a field/property of some class instance. The last level is a hyper-local cache, meaning the context of a single non-static variable in a longer function. Considering JavaScript, this could even be a variable (const or let) in a sub-scope.
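To make these levels more tangible, here is a minimal JavaScript sketch, with hypothetical getPostData and renderSummary helpers and a hypothetical fetchPostData data source, contrasting a request-level cache with a hyper-local one:

// Request-level cache: a module-scoped Map that lives for the
// lifetime of the current page or request.
const postDataCache = new Map();

function getPostData( postId ) {
    if ( ! postDataCache.has( postId ) ) {
        // Hypothetical (expensive) way of retrieving the data.
        postDataCache.set( postId, fetchPostData( postId ) );
    }

    return postDataCache.get( postId );
}

function renderSummary( post ) {
    // Hyper-local cache: resolve the deeply nested value once, reuse it below.
    const title = post.data.attributes.title;

    return title + ' (' + title.length + ' characters)';
}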

Cache Expensive Operations

The things you may want to consider for caching are expensive operations, where “expensive” could refer to time, resources, or both. If you are performing a local operation that takes long to complete, if you talk to a remote API, or if that API is actually performing an action that takes a while, these are great candidates for caching.

Depending on the concrete caching implementation, there are different things for you to consider, for example, expiration, invalidation, or refreshing.
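As a minimal sketch, assuming a hypothetical (and expensive) fetchExchangeRates API call, a time-based in-memory cache with expiration could look like this:

const TTL = 5 * 60 * 1000; // Cache results for five minutes.

let cachedRates = null;
let cachedAt = 0;

async function getExchangeRates() {
    const now = Date.now();

    // Serve the cached result for as long as it has not expired.
    if ( cachedRates && now - cachedAt < TTL ) {
        return cachedRates;
    }

    // Expensive remote call; refresh the cache afterwards.
    cachedRates = await fetchExchangeRates();
    cachedAt = now;

    return cachedRates;
}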

Cache Repetitive Operations

Even if something does not take a long time or a larger amount of resources, it may be sensible to think about caching if that piece of logic gets executed repeatedly. For a given block of code (which could be as short as a single line) that results in a single value stored in a variable or printed to the screen, computing that value repeatedly may not be ideal. Instead, one could store it and then reuse it later for any use case that needs it. This could be interacting with I/O or the network, or performing a complex algorithm, but it can also be as simple as accessing a piece of data in an array or object that is nested three levels deep.

// Cache a function call result.
$foo = getFoo();

// Cache a nested array/object lookup.
$fooData = $item->fooBar['data']->foo;

Please note, however, that sometimes you want to “ask” for the result again and not reuse a previous one. This could be because the result has changed, and you want to use the most recent value instead of the previous one. However, there are also cases where there is a new value now, but you still want or need to use the previous one.

Memoization

A special form of caching the output value for a given set of input values is called memoization. A memoized function or value will serve the previously stored return value of a (pure) function for the exact set of arguments given.

const categoryOptions = useMemo( () => {
    return [
        {
            label: '',
            value: 0,
        },
        ...categories.map( ( { id, name } ) => ( {
            label: unescape( name ),
            value: Number( id ),
        } ) ),
    ];
}, [ categories ] );

While memoization is a general concept that applies to a lot of programming languages and contexts, for WordPress, it may have been made more widely known via JavaScript, for example, in the form of the React.memo or React.useMemo functions, or the memoize helper in Lodash.
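To illustrate the general concept, here is a minimal, deliberately naive memoize sketch, assuming a single JSON-serializable argument (real implementations handle multiple arguments and cache eviction):

function memoize( fn ) {
    const cache = new Map();

    return ( arg ) => {
        const key = JSON.stringify( arg );

        // Compute the value once per distinct argument, then reuse it.
        if ( ! cache.has( key ) ) {
            cache.set( key, fn( arg ) );
        }

        return cache.get( key );
    };
}

// Usage: the second call with 42 returns the cached result.
const memoizedSquare = memoize( ( x ) => x * x );
memoizedSquare( 42 );
memoizedSquare( 42 );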

Function Execution Control

Many implementations of trigger-based actions are naive and don’t perform well. Examples of this would be reacting to a user/visitor moving their cursor across the screen or typing into a form field. Often, it does not make much sense to perform an action upon every new trigger. Instead, you may react to only a select number of triggers, or enforce a certain threshold or cooldown time between two actions.

Throttling

Performing an action at a reasonable interval only, instead of reacting to every user action right away, is often referred to as throttling. Instead of hooking the actual code up to the user trigger directly, one would use a dedicated function that wraps around the code and internally keeps track of the last time the code executed. Subsequent requests to execute the code again will be discarded for as long as a predefined threshold has not been reached. Attempting to execute the code after the threshold will work fine, and the internal timestamp will be updated.

// Execute callback for every single position change.
window.addEventListener( 'dragover', onDragOver );

// Execute callback every 200ms, at maximum.
const throttledHandler = throttle( onDragOver, 200 );
window.addEventListener( 'dragover', throttledHandler );

A throttled function will execute right away for the very first call. After that, it will execute at most once per interval, possibly not at all, but never more than once. A well-known JavaScript implementation is the throttle function in Lodash.
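As a minimal sketch of what such a wrapper does internally (a battle-tested implementation like Lodash’s handles many more edge cases, such as trailing calls):

function throttle( fn, threshold ) {
    let lastExecuted = 0;

    return ( ...args ) => {
        const now = Date.now();

        // Discard the call if the threshold has not been reached yet.
        if ( now - lastExecuted < threshold ) {
            return;
        }

        lastExecuted = now;
        fn( ...args );
    };
}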

Debouncing

The companion to throttling is debouncing. Here, we are wrapping our actual code in another custom function, which “waits” for more requests to come in. Once a predefined interval has passed since the most recent request, the code executes. A prominent use case for debouncing is reacting to user input, for example, by manipulating existing data or by querying a remote API and rendering any results received. There is no point in requesting data for “p”, then “pr”, and then “pro”, until the user finishes writing “programming,” with possible typos and corrections.

// Perform search request for every single user input.
input.addEventListener( 'change', searchPosts );

// Perform search once the user stopped typing for 300ms.
const debouncedSearch = debounce( searchPosts, 300 );
input.addEventListener( 'change', debouncedSearch );

A debounced function will execute eventually, provided there are no new calls incoming. The very last call to a debounced function will always execute. Again, Lodash includes a widely used debounce helper function.
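Again, as a rough sketch of the internals, ignoring the edge cases a real implementation such as Lodash’s covers:

function debounce( fn, delay ) {
    let timeoutId = null;

    return ( ...args ) => {
        // Every new call resets the timer, so the wrapped code only
        // runs once no new call has come in for `delay` milliseconds.
        clearTimeout( timeoutId );
        timeoutId = setTimeout( () => fn( ...args ), delay );
    };
}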

Cleaning Up (Memory Leaks)

In a dynamic and interactive context, a lot of different things can and do change, although they may take time to kick off or complete. That’s why a large share of performance considerations revolves around things that were set up in a specific context, which has since changed. It’s important to keep track of and clean up those things. Doing so addresses memory leaks (i.e., binding or wasting resources without use), bad user experience (e.g., reacting to data brought back from old requests that are no longer required or desired), and even error states caused by unforeseen program flows.

Remove Event Listeners

A widely known type of memory leak is based on JavaScript event handlers or listeners. Typically, an event listener involves a component that wants a block of code to execute whenever a specific event happens on a specific element on the page. Now, if the component, the code, or the element itself has been removed, there is no longer a need for that event listener to exist.

One good example that illustrates the manual removal of event listeners can be found in a React codebase in a useEffect hook. The hook callback adds the event listener, while the cleanup function it returns removes the listener again when the component is “unmounted.”

useEffect( () => {
    window.addEventListener( 'dragover', onDragOver );

    return () => {
        window.removeEventListener( 'dragover', onDragOver );
    };
}, [ onDragOver ] );

Remove Time-Based Functions

Similar to the event-based function above, it is important to clean up any scheduled or time-based callbacks operating on data or components that no longer exist. In a browser or JavaScript context, the most prominent ones would be callbacks executing after a user-defined timeout or within a specific interval, using the setTimeout and setInterval API functions.

Removing those callbacks can easily be done via the respective clearTimeout or clearInterval function. In a React context, this could look very much like the above example with the useEffect hook, where the set* function would be called in the hook callback, whereas the clear* call would be wrapped in the cleanup function.
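As a minimal sketch, assuming a hypothetical pollForUpdates callback:

useEffect( () => {
    // Poll every five seconds while the component is mounted.
    const intervalId = setInterval( pollForUpdates, 5000 );

    return () => {
        clearInterval( intervalId );
    };
}, [ pollForUpdates ] );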

Abort Pending Requests

Imagine a text input used for searching posts on a remote site. We already learned that we might want to use a debounced callback for reacting to the user typing and then performing the search. That way, we don’t send a request for every single new or removed character. But what if the user is still typing, just somewhat slowly? Do we need to wait for and process data from a previous request with an incomplete search string, given that the user has typed again by now? No, we usually don’t. So, we should abort the request and not process any of its data.

One (modern) way to do that is to use an AbortController. Thanks to this Web API, you can abort one or more in-progress requests, and almost all libraries or APIs for fetching data support this (concept).

useEffect( () => {
    const abortController = new AbortController();

    apiFetch( {
        path: '/some/path',
        signal: abortController.signal,
    } )
        .then( /* Process data... */ )
        .catch( () => {
            if ( abortController.signal.aborted ) {
                return;
            }

            // Handle other errors...
        } );

    return () => {
        abortController.abort();
    };
}, [ /* Dependencies required for querying and/or processing the data */ ] );

The above example shows how to send a request and abort it in case the effect is cleaned up, for example, because the component itself unmounts or a dependency has changed. In this case, the abort controller is instantiated right inside the callback and doesn’t leave its scope.

That said, it is also possible to manually abort the above request, for example, if the user clicks a special “Abort” button that is designed to cancel any or all pending requests on the page. In this case, you would have to instantiate the abort controller outside the hook and pass it (or at least its signal and abort method) in as dependencies.
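A rough sketch of that variant, with a hypothetical onAbortClick handler wired up to such a button, might look like this:

const abortController = useMemo( () => new AbortController(), [] );

useEffect( () => {
    apiFetch( {
        path: '/some/path',
        signal: abortController.signal,
    } )
        .then( /* Process data... */ );
}, [ abortController ] );

// Passed to a dedicated "Abort" button somewhere on the page.
const onAbortClick = () => abortController.abort();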

Cancel Scheduled Callbacks

Just like other event-based, time-based, or remote request callbacks, it may also happen that a debounced callback needs to be canceled. This is the case when using a somewhat longer interval and observing a context change after the user-initiated trigger, but before the wrapped code actually executes. In that situation, we don’t want the debounced callback to execute the code anymore.

const debouncedFetchData = useMemo( () => {
    return debounce( fetchData, 300 );
}, [ fetchData ] );

useEffect( () => {
    return () => {
        debouncedFetchData.cancel();
    };
}, [ debouncedFetchData ] );

If you are using Lodash, the functions returned by both debounce and throttle come with a cancel method to cancel any delayed function invocation.

Measuring Performance

Now, with the above guiding principles and examples in mind, we are on the right path to consider performance right from the start and adopt a performance-oriented mindset. But still, there will be a lot of situations and implementations where you can improve the overall performance of whatever system you are building. However, in order to improve something, you first have to identify what to improve. The most straightforward way to do this is to measure the performance of select workflows, processes, and routines of your application.

Profiling PHP, MySQL, and WordPress

There are a lot of relevant functionalities and services out there, maybe even built into the programming languages and tools you use on a daily basis; you just have to learn to use them. For example, if you want to understand your PHP application better on a lower level, you may want to use the Profiler built into Xdebug (if possible in your setup). Alternatives and companions include XHProf, PHPBench, and Tracy.

In a WordPress context, one of the most important tools to understand what’s going on in your website is the Query Monitor plugin, together with its add-ons, as well as the Debug Bar add-ons. These let you inspect the current request, database queries, loaded asset files, HTTP requests, and a lot more.

If you need more information about what is happening as part of one or more database queries, you may be surprised to hear that MySQL (and other DBs) come with built-in profiling capabilities. All you have to do is turn on profiling, execute queries, and look at the overall performance profile, as well as individual queries.
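As a quick sketch of the classic approach (note that SHOW PROFILES is deprecated in recent MySQL versions in favor of the Performance Schema, but still works):

-- Enable profiling for the current session.
SET profiling = 1;

-- Execute the queries you want to inspect...
SELECT * FROM wp_posts WHERE post_status = 'publish';

-- List the recorded profiles, then inspect one in detail.
SHOW PROFILES;
SHOW PROFILE FOR QUERY 1;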

Infrastructure Performance

Depending on your host and general infrastructure setup, you may be able to make use of powerful integrations and services. For example, on AWS, I would highly recommend looking into setting up AWS X-Ray. Other cloud solutions include Azure Monitor and Google Cloud’s operations suite, as well as platform- and infrastructure-agnostic services such as New Relic.

Browser Performance Insights

Over the last few years, browsers have become more powerful and useful when it comes to understanding what is happening on a website, including in the background. All modern browsers come with a flexible and configurable network view, breaking down the various (sub)requests with regard to timings, file sizes, and initiators. There are dedicated performance insights, as well as highly flexible and powerful browser extensions, for example, for things like React or Redux.

Conclusion

Writing the initial code of a project, as well as measuring and improving its performance, requires a good understanding of all the pieces involved: the programming language, the architecture, the implementation details, the programming and algorithmic principles applied, the infrastructure, the third-party libraries, and any environment-specific APIs in use. There’s no getting around knowing your tools (and environment).

Are you looking to start a new project with an eye on performance from day one? Maybe you have a live site that is performing far below your visitors’, customers’, or your own expectations. We’re more than happy to become your technical partner of choice, ensuring that performance is never an afterthought.