Your browser may have trouble rendering this page. See supported browsers for more information.

This page shows the source for this entry, with WebCore formatting language tags and attributes highlighted.

Title

Optimizing data access for high-latency networks II

Description

<img attachment="sample_calendar.png" align="left">In the <a href="{app}view_article.php?id=3003">previous article</a>, we discussed a performance problem in the calendar of Encodo's time-tracking product, Punchclock. Instead of guessing at the problem, we profiled the application using the database-statistics window available to all Quino applications.<fn> We quickly discovered that most of the slowdown stems from the relatively innocuous line of code shown below. <clear><code> var people = Session.GetList<person>(). Where(p => p.TimeEntries.Any()). ToList(); </code> <h>First things first: what does the code do?</h> Before doing anything else, we should establish what the code does. Logically, it retrieves a list of people in the database who have recorded at least one time entry. The first question we should ask at this point is: does the application even need to do this? The answer in this case is 'yes'. The calendar includes a drop-down control that lets the user switch between the calendars for different users. This query returns the people to show in this drop-down control. With the intent and usefulness of the code established, let's dissect <i>how</i> it is accomplishing the task. <ol> The <c>Session.GetList<person>()</c> portion retrieves a list of all people from the database The <c>Where()</c> method is applied locally for each object in the list<fn> For a given person, the list of TimeEntries is accessed This access triggers a lazy load of the list The <c>Any()</c> method is applied to the full list of time entries The <c>ToList()</c> method creates a list of all people who match the condition </ol> Though the line of code looks innocuous enough, it causes a huge number of objects to be retrieved, materialized and retained in memory---simply in order to check whether there is at least one object. This is a real-world example of a performance problem that can happen to any developer. Instead of blaming the developer who wrote this line of code, its more important to stay vigilant to performance problems and to have tools available to quickly and easily find them. <h>Stop creating all of the objects</h> The first solution I came up with<fn> was to stop creating objects that I didn't need. A good way of doing this and one that was covered in <a href="{app}view_article.php?id=2951">Quino: partially-mapped queries</a> is to use <i>cursors</i> instead of <i>lists</i>. Instead of using the generated list <c>TimeEntries</c>, the following code retrieves a cursor on that list's query and materializes at most one object for the sub-query. <code> var people = Session.GetList<person>().Select(p => { using (var cursor = Session.CreateCursor<timeentry>(p.TimeEntries.Query))<fn> { return cursor.Any(); } }).ToList(); </code> A check of the database statistics shows improvement, as shown below. <img src="{att_link}time_entry_queries_with_cursors.png" href="{att_link}time_entry_queries_with_cursors.png" align="none" caption="Time-entry queries with cursors" scale="75%"> Just by using cursors, we've managed to reduce the execution time for each query by about 75%.<fn> Since all we're interested in finding out is whether there is at least one time entry for a person, we could also ask the database to count objects rather than to return them. That should be even faster. The following code is very similar to the example above but, instead of getting a cursor based on the <c>TimeEntries</c> query, it gets the count. <code> var people = Session.GetList<person>(). Where(p => Session.GetCount(p.TimeEntries.Query) > 0). ToList(); </code> How did we do? A check of the database statistics shows even more improvement, as shown below. <img src="{att_link}time_entry_queries_with_counts.png" href="{att_link}time_entry_queries_with_counts.png" align="none" caption="Time-entry queries with COUNTs instead of SELECTs" scale="75%"> We're now down to a few dozen milliseconds for all of our queries, so we're done, right? A 95% reduction in query-execution time should be enough. Unfortunately, we're still executing just as many queries as before, even though we're taking far less time to execute them. This is better, but still not optimal. In high-latency situations, the user is still likely to experience a significant delay when opening the calendar since each query's execution time is increased by the latency of the connection. In a local network, the latency is negligible; on a WAN, we still have a problem. In the next article, we’ll see if we can't reduce the number of queries being executed. <hr> <ft>This series of articles shows the statistics window as it appears in Winforms applications. The data-provider statistics are also available in Quino web applications as a Glimpse plug-in.</ft> <ft>It is important for users of the Microsoft Entity Framework (EF) to point out that Quino does not have a Linq-to-Sql mapper. That means that any Linq expressions like <c>Where()</c> are evaluated locally instead of being mapped to the database. There are various reasons for this but the main one is that we ended up preferring a strict boundary between the mappable query API and the local evaluation API. Anything formulated with the query API is guaranteed to be executed by the data provider (even if it must be evaluated locally) and anything formulated with Linq is naturally evaluated locally. In this way, the code is clear in what is sent to the server and what is evaluated locally. Quino only very, very rarely issues an "unmappable query" exception, unlike EF, which occasionally requires contortions until you've figured out which C# formulation of a particular expression can be mapped by EF.</ft> <ft>Well, the first answer I'm going to pretend I came up with. I actually thought of another answer first, but then quickly discovered that Quino wasn't mapping that little-used feature correctly. I added an issue to tackle that problem at a later date and started looking for workarounds. That fix will be covered in the next article in this series.</ft> <ft>Please note that cursors are disposable and that the calling application is responsible for cleanup. Failure to dispose of a cursor that has been at least partially iterated will result in an open connection in the underlying database providers associated with the query and will eventually lead to connection-pool exhaustion on those databases.</ft> <ft>Please ignore the fact that we also dropped 13 person queries. This was not due to any fix that we made but rather that I executed the test slightly differently...and was too lazy to make a new screenshot. The 13 queries are still being executed and we'll tackle those in the last article in this series.</ft>