Beware of loops and tasks

Loops and anonymous methods don't go hand in hand. Any variable that changes for each iteration will cause problems if it is used within an anonymous method. I explained the mechanism behind variable capturing in the post Mysterious Case Of Wrong Value, along with the problem it causes with loops and its solution.

Commonly, anonymous methods are used as callbacks or event handlers, where there are no loops in the context where variable capture happens. While there is a possibility to have an unexpected—wrong—value in such scenarios, it is rather easy to debug them and understand the order in which some code is called, and why the captured value is wrong in a given place.

With the rise of multi-threading usage, and especially the Parallel Programming Library and tasks, writing anonymous methods inside a loop becomes more common, and it is much easier to write code where a captured variable will have the wrong value. The biggest problem with such code is that it will be harder to debug and understand the cause if you are not already familiar with it.

The mechanism behind variable capture with and without loops is the same, but the loop itself is what makes the difference, as explained in the linked post.

The typical code pattern that causes the issue is running multiple tasks to perform some work:

procedure ProcessItems(Items: array of TItem); var i: Integer; begin for i := 0 to High(Items) do begin TTask.Run( procedure begin Items[i].Process; end); end; end;

The problem with the above code is that it uses and therefore captures the loop index within the anonymous method. When a particular task runs, that index value will be the current value at the moment when the anonymous method runs, not the value held at the time the task itself is created by the call to TTask.Run. So in reality when you need to process 10 items, you will maybe process items at indexes 5 and 9, not items on all indexes from 0 to 9.

Slightly modified code that uses a temporary Item variable will not resolve the problem:

procedure ProcessItems(Items: array of TItem); var i: Integer; Item: TItem; begin for i := 0 to High(Items) do begin Item := Items[i]; TTask.Run( procedure begin Item.Process; end); end; end;

Now the code does not capture the loop index variable that obviously changes with each iteration. But, the captured Item variable also changes with each iteration. Because there is only a single Item variable (location) captured, the actual content of Item when the task runs will again be the current value stored in Item variable at the time when task code is executed, not the one when the task itself was created. So instead of processing 10 different items, that code will process only a few of them.

The same would happen if you used any other kind of loop. For instance:

procedure ProcessItems(Items: array of TItem); var i: Integer; Item: TItem; begin for Item in Items do begin TTask.Run( procedure begin Item.Process; end); end; end;

Because the problem is the same as the one I wrote about in the previous post, the solution will also be the same. Adding a separate method that will contain the anonymous method declaration, and passing all captured variables that change during the iteration as parameters to that method (you will also need to pass other captured variables, if they are otherwise inaccessible from the extracted code):

procedure ProcessItem(Item: TItem); begin TTask.Run( procedure begin Item.Process; end); end; procedure ProcessItems(Items: array of TItem); var i: Integer; begin for i := 0 to High(Items) do begin ProcessItem(Items[i]); end; end;

Fortunately, this kind of problem can be rather easily spotted in code. The only rule you need to remember is: Don't write anonymous methods directly within loops.

If you have a loop containing an anonymous method, you must extract that anonymous method and put it in separate procedure or method.

Comments

  1. Thanks for this insight, Dalija. Always fun to read your postings.

    ReplyDelete
  2. Dalija you explain stuff real good. Thank you for sharing.

    ReplyDelete
  3. Wouldn't inline declared variables (in the for loop) fix the issue?

    ReplyDelete
    Replies
    1. No, inline variables would not fix the problem. Inline variable inside the loop is still just single variable - memory location, not multiple ones.

      Delete

Post a Comment

Popular posts from this blog

Coming in Delphi 12: Disabled Floating-Point Exceptions

Catch Me If You Can - Part II