This repository has been archived by the owner on Jul 1, 2024. It is now read-only.

Issue regarding advice for loop optimisation in 2009-03-06-javascript-best-practices.md #636

Open
UberKluger opened this issue Aug 8, 2021 · 0 comments

Comments

@UberKluger

This has turned into a bit of a long read (TL;DR). The basic motivation for this issue is that this article has been mirrored on w3.org. As such, the content has been elevated from "personal preferences and suggestions" to "de facto recommended practice", and therefore the reasoning behind any suggestion must meet a higher standard. I am not imposing my standards upon the content; I am simply raising some points for discussion. If there are effective rebuttals of the points I make then so much the better. It proves that the content is of high quality (and that I need to improve my knowledge).

The idea that a substantial saving can be achieved by replacing a simple member access with a local variable is questionable.
All accesses within JavaScript (ECMA-262) are effectively member accesses. Global variables are just members of the global object. Function-local variables are just members of the function's current execution context object (in the specification's terms, bindings in an environment record). True, this also applies to the object variable that has to be resolved before its member can be retrieved, so a member access costs one extra lookup, but the distinction is small and ignores what an optimising compiler could achieve. For loops, it would be reasonable to think that a clever compiler would detect that a variable is not altered within the loop, so its value does not change and can be treated as a constant (no repeated retrieval required, just cache the first value). For object member references where the object variable itself is not changed (it always refers to the same object instance), this would mean that a member reference could be obtained once, cached, and the direct reference to the member used thereafter (possibly only if the member has [[Configurable]]:false, like Array.length). (Note: the standard specifies "abstract" operations. Hosts are free to "implement" these operations any way they see fit provided that the end effect is the same.)

Thus, for a reasonably optimising compiler, using a local variable can provide a slight improvement over a simple member access (possible const vs direct reference) but this is probably small when compared with the work done inside all but the most trivial of loops. (Inside includes the work done by any called functions.)

And then there is the problem of temporary variable bloat. What if there are several loops on different arrays? A new length variable for each? If you reuse the same one, how do you name it? Even if it is the same array, what if the length has been altered between the loops? Was the length variable updated? Redundancy is the bane of database maintainers (trying to keep all the independent copies of the same data consistent). So it is when having temporary copies of (potentially varying) object members. Just more things to keep track of to avoid insidious bugs creeping in. Array.length is always the correct value, wherever or whenever it is used.
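The stale-copy hazard described above can be sketched as follows. This is only an illustration: the names `items` and `len` are hypothetical, and the example assumes the array is changed between two loops that share one cached length variable.

```javascript
// Hypothetical data; `len` is the temporary "boundary" variable under discussion.
const items = ["a", "b", "c"];
var len = items.length; // cached copy of the length (3 at this point)

// The array grows between the caching and the loops that follow...
items.push("d", "e");

// ...but the cached copy is reused without being refreshed,
// so the new elements are silently skipped.
const stalePass = [];
for (var i = 0; i < len; i++) {
  stalePass.push(items[i]);
}

// Reading items.length directly cannot go stale.
const freshPass = [];
for (var j = 0; j < items.length; j++) {
  freshPass.push(items[j]);
}
```

Here the loop over the cached value visits only the three original elements, while the loop that reads `items.length` visits all five.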

I suspect that such advice stems from the good ol' (bad ol'?) days before optimising compilers, when source code was translated into machine code templates that simply implemented each source instruction without reference to anything around it. These days, attempting to optimise at the source level is often pointless, since the compiler was probably going to do it anyway, and sometimes even counter-productive, as it might circumvent an even better optimisation that the compiler would have done. After all, the compiler (writer) knows exactly what machine code (or intermediate language) is going to be produced and how it can be adjusted for best performance. For the case in point, accessing the length of an array as a loop iteration criterion is EXTREMELY common, so it might be expected that a good compiler would be set up to make it as fast as possible in the actually executed code, possibly by having a special internal representation of an Array that is substantially different from "normal" objects.

It is also possible that the advice "Don't repeatedly access the length" is simply a corruption of advice regarding objects without a known (i.e. stored) length: "Don't repeatedly calculate the length". The prime example is a novice implementing strchr(s, c):

int strchr(s, c) /* Original K&R to emphasize how old this advice is */
char *s;
char c;
{
  int  i;

  for (i = 0; i < strlen(s); i++)
    if (s[i] == c)
      return i;
  return -1;
}

Re-evaluating the string length on every iteration is horribly inefficient and it is unlikely that anything less than a super-genius compiler would be able to optimise this O(n squared) down to O(n). In this case, changing it to

  int  i, slen;

  slen = strlen(s);
  for (i = 0; i < slen; i++)

is VERY good advice.
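The same contrast can be sketched in JavaScript, the language of the article under discussion. This is an illustration under stated assumptions: `terminatedLength` is a hypothetical helper that imitates strlen by walking to a 0 terminator, and the `steps` counter exists only to make the repeated work visible.

```javascript
// Counts how many elements the length calculation has walked over in total.
let steps = 0;

// Hypothetical strlen imitation: the length is not stored, it is computed
// by scanning for a 0 terminator every time it is asked for.
function terminatedLength(chars) {
  let n = 0;
  while (chars[n] !== 0) {
    n++;
    steps++;
  }
  return n;
}

// Recalculates the length in the loop condition on every iteration.
function indexOfRecalculating(chars, c) {
  for (let i = 0; i < terminatedLength(chars); i++) {
    if (chars[i] === c) return i;
  }
  return -1;
}

// Calculates the length once, before the loop.
function indexOfHoisted(chars, c) {
  const len = terminatedLength(chars);
  for (let i = 0; i < len; i++) {
    if (chars[i] === c) return i;
  }
  return -1;
}

const text = [104, 101, 108, 108, 111, 0]; // "hello" plus a terminator
const missing = 122; // 'z', not present, so the whole string is scanned

steps = 0;
const recalculatingResult = indexOfRecalculating(text, missing);
const recalculatingSteps = steps;

steps = 0;
const hoistedResult = indexOfHoisted(text, missing);
const hoistedSteps = steps;
```

For this five-character input both functions return -1, but the recalculating version walks the string six times (once per evaluation of the loop condition) while the hoisted version walks it once. None of this applies to a JavaScript array's own `length`, which is stored rather than calculated.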

There was almost some more useful advice right at the bottom. When it said to avoid computationally heavy code like regular expressions, it didn't say how. You can't really avoid executing the regular expression in the loop if that's what the loop is for, but you can avoid wasted effort by using a variable containing an already created regular expression object instead of a regular expression literal, which creates an identical new RE object on every iteration. Just remember to ensure that the RE object state (lastIndex, which matters when the g or y flag is set) is reset between independent uses. This is the one example where using a (temporary?) variable provides a possibly substantial benefit.
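A minimal sketch of the regular expression point, assuming an ES5 or later engine. The pattern, the sample data, and all function names here are hypothetical and chosen only for illustration.

```javascript
// Hypothetical input data.
const lines = ["id=12", "name=x", "id=7"];

// Literal inside the loop: the literal is evaluated, and a RegExp object
// created, on every iteration.
function countWithLiteral(input) {
  let count = 0;
  for (let i = 0; i < input.length; i++) {
    if (/^id=\d+$/.test(input[i])) count++;
  }
  return count;
}

// Hoisted: one RegExp object is created up front and reused.
const idPattern = /^id=\d+$/;
function countWithHoisted(input) {
  let count = 0;
  for (let i = 0; i < input.length; i++) {
    if (idPattern.test(input[i])) count++;
  }
  return count;
}

// Each evaluation of a literal yields a distinct object.
function makePattern() {
  return /^id=\d+$/;
}
const literalsAreDistinct = makePattern() !== makePattern();

// With the g flag, lastIndex carries state from one call to the next.
const carried = /id=/g;
const withoutReset = [carried.test("id=1"), carried.test("id=2")];

// Resetting lastIndex between independent uses avoids the surprise.
const reused = /id=/g;
const firstUse = reused.test("id=1");
reused.lastIndex = 0;
const secondUse = reused.test("id=2");
const withReset = [firstUse, secondUse];
```

Both counting functions give the same answer; the difference is only in how many RegExp objects are created along the way. The second call in `withoutReset` fails because the search resumes from where the previous match ended, which is why a shared object with the g (or y) flag needs its `lastIndex` reset.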

As for moving the boundary variable inside the loop initialisation, hopefully this has NO effect, as the two versions are semantically identical, at least when using var in JavaScript (it would be a different matter with let, which scopes the variable to the loop). They are probably even the same size, except for some formatting white-space and a new-line. Any measurable difference would indicate something weird happening with the compiler. The only real effect is to slightly clutter the loop initialisation, although it does bring the boundary variable assignment within the "loop" structure (again, this would only be a real scoping change with let). However, this topic is loop optimisation, not source readability.
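The scoping remark can be checked with a small sketch. The function names are hypothetical; each function sums an array and then reports whether the boundary variable `len` is still visible after the loop.

```javascript
// Boundary variable declared with var before the loop.
function sumVarOutside(arr) {
  var total = 0;
  var len = arr.length;
  for (var i = 0; i < len; i++) {
    total += arr[i];
  }
  return { total: total, lenAfterLoop: typeof len };
}

// Boundary variable declared with var inside the loop initialisation.
// var is function-scoped, so `len` is still visible after the loop.
function sumVarInside(arr) {
  var total = 0;
  for (var i = 0, len = arr.length; i < len; i++) {
    total += arr[i];
  }
  return { total: total, lenAfterLoop: typeof len };
}

// Same shape with let: `len` now exists only within the loop.
function sumLetInside(arr) {
  let total = 0;
  for (let i = 0, len = arr.length; i < len; i++) {
    total += arr[i];
  }
  return { total: total, lenAfterLoop: typeof len };
}

const sample = [1, 2, 3];
const varOutside = sumVarOutside(sample);
const varInside = sumVarInside(sample);
const letInside = sumLetInside(sample);
```

All three compute the same total. With var, `len` remains a number after the loop in both placements, which is why the two versions in the article are equivalent; only the let version confines it to the loop.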

In summary, apart from avoiding the most egregiously inefficient examples (such as strchr and RE literals above), which basically do the same work (with the same result) over and over again (q.v. Keep DOM access to a minimum), simple optimisation tricks probably don't provide enough benefit to compensate for the increase in source size and the reduction in clarity about what the program is doing (all of these differences are admittedly fairly small). Trying to squeeze a few more fractions of a second out of a response time (particularly for an interactive program) should be left to those involved with speed competitions. If your response time is woeful, then the tip about loop boundary variables probably won't help.
