
Title

Improving performance in GenericObject

Description

<n>This article was originally published on the <a href="http://encodo.com/en/blogs.php?entry_id=216">Encodo blogs</a> and cross-published here.</n>

<hr>

Quino is Encodo's metadata framework, written in C#/.NET 4.0. Since its inception four years ago, we've used it in several products and the code base has been updated continuously. However, it was only in a recent product that one of the central features of the framework came under scrutiny for performance issues. It turned out that reading from and writing to Quino data objects was a bit slower than we needed it to be.

<h>How Data Objects are Implemented</h>

A typical ORM (like Hibernate or Microsoft's Entity Framework) uses C# classes as the base entities in the model, decorating those classes with attributes to add information to the model. The ORM then uses this information to communicate with the database, reading and writing values through reflection. Creating objects and getting and setting values---including default values---is all done through direct calls to property getters and setters.

Quino took a different approach, putting the model at the center of the framework and defining an in-memory structure for the model that is accessible through a regular API rather than reflection. The actual C# classes used by business logic are then generated from this model---instead of the other way around.

This decoupling of metadata from the classes has a lot of advantages, not the least of which is that Quino provides generalized access to any of these business objects. Components that work with Quino data do not need to be aware of the actual classes: instead, those components use the metadata and an API to read and write values. Since the interface is generalized, these values are read and written by Quino code rather than through direct getters and setters. As you would expect, there is a base class from which all Quino data objects inherit that provides the support for this interface, called <c>GenericObject</c>.
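To make the idea of generalized access more concrete, here is a minimal sketch of how a generated class can sit on top of a metadata-driven base class. It is an illustration only, not Quino's actual API: <c>MetaClassSketch</c>, <c>GenericObjectSketch</c>, <c>PersonSketch</c> and <c>ValueDumper</c> are hypothetical names invented for this example, and the metadata is reduced to a list of property names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical metadata: a model class is reduced to a name and its property names.
public sealed class MetaClassSketch
{
    private readonly List<string> _propertyNames;

    public MetaClassSketch(string name, params string[] propertyNames)
    {
        Name = name;
        _propertyNames = new List<string>(propertyNames);
    }

    public string Name { get; private set; }

    public IList<string> PropertyNames
    {
        get { return _propertyNames.AsReadOnly(); }
    }
}

// Hypothetical stand-in for a generalized base class: values are read and
// written by property name instead of through compiled getters and setters.
public class GenericObjectSketch
{
    private readonly Dictionary<string, object> _values = new Dictionary<string, object>();

    public GenericObjectSketch(MetaClassSketch metaClass)
    {
        MetaClass = metaClass;
    }

    public MetaClassSketch MetaClass { get; private set; }

    public object GetValue(string propertyName)
    {
        object value;
        return _values.TryGetValue(propertyName, out value) ? value : null;
    }

    public void SetValue(string propertyName, object value)
    {
        _values[propertyName] = value;
    }
}

// What a class generated from the model might look like (assumption):
// the typed properties only delegate to the generalized API.
public sealed class PersonSketch : GenericObjectSketch
{
    public static readonly MetaClassSketch Meta = new MetaClassSketch("Person", "Name", "Age");

    public PersonSketch() : base(Meta) { }

    public string Name
    {
        get { return (string)GetValue("Name"); }
        set { SetValue("Name", value); }
    }

    public int Age
    {
        get { return (int)(GetValue("Age") ?? 0); }
        set { SetValue("Age", value); }
    }
}

// A generic component: it knows only the metadata, never the concrete class.
public static class ValueDumper
{
    public static string Describe(GenericObjectSketch obj)
    {
        var parts = new List<string>();
        foreach (var name in obj.MetaClass.PropertyNames)
        {
            parts.Add(name + "=" + obj.GetValue(name));
        }
        return obj.MetaClass.Name + "(" + string.Join(", ", parts) + ")";
    }
}
```

Business logic would use the typed properties of <c>PersonSketch</c>, while a grid or serializer could call <c>ValueDumper.Describe</c> on any object without knowing its class. Every read and write passes through the base class, which is why its cost matters so much.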
It was in this central class that we had to go to work with a profiler to squeeze out some more speed.

<h>Improving Performance</h>

The actual use case for our data objects didn't even use our ORM, as such. Instead, we were generating the objects from a data stream with 0 to n columns defined (a perfect situation to use an object that supports a flexible interface). Once those objects were created, they were handed off to the user interface, which applied them to a grid, replacing rows or updating values as required.

So, we needed to improve things on several fronts:

<ul>
We needed to improve speed when creating objects because data was arriving at a serious clip.
We needed to improve speed when applying values because there were often several grids open at once, and they all needed to be updated as quickly as possible.<fn>
We also needed to decrease the memory footprint because when the data flow was heavy, there were a lot of objects in memory and the application was reaching the limit of its address space.<fn>
</ul>

As mentioned above, the data object we had worked fine. It was fast enough and slim enough that we never noticed any performance or memory issues in more classical client applications. It was only when using the data object in a very high-demand, high-performance product that the issue arose. That's actually the way we prefer working: get the code running correctly first, then make it faster if needed.

And how do you make it faster and slimmer without breaking everything else you've already written? You run each subsequent version against your unit, regression and integration tests to verify it, that's how. Quino has several thousand automated tests that we ran each step of the way to make sure that our performance improvements didn't break behavior.

<h>Charts and Methodology</h>

The charts below indicate a relative improvement in speed and memory usage. The numbers are not meant to be compared in absolute terms to any other numbers.
In fact, the application being tested was a simple console application we wrote that created a bunch of objects with a bunch of random data. Naturally we built the test to adequately approximate the behavior of the real-world application that was experiencing problems. This test application emitted the numbers you see below.

We used the <a href="http://www.yourkit.com/dotnet/features/index.jsp">YourKit Profiler for .NET</a> to find code points that still needed improvement and iterated until we were happy with the result. We are very happy with YourKit as a profiler. It's fast and works well for sampling and tracing as well as detecting memory leaks and tracking memory usage. To test performance, we would execute part of the tests below with tracing enabled (no recompilation necessary), show "Hot Spots" and fix those.

The tests focused on creating a certain number of objects with a certain number of columns (with total data fields = #objects * #columns), corresponding to the first two columns in the table. The other columns are v0 (the baseline) and v1--v3, which are various versions we made as we tried to hone performance. The final three columns show the speed of v1--v3 vs. v0.

<img attachment="createobjects.png" align="center" class="frame" caption="Time needed to create objects">

<img attachment="setvalues.png" align="center" class="frame" caption="Time needed to set values">

Finally, not only did we make creating objects over 3 times faster and changing values more than twice as fast, but we also decreased the memory footprint of each object to just over 1/3 of the original size.

<img attachment="memoryusage.png" align="center" class="frame" caption="Memory usage">

These improvements didn't come by magic: the major change we made was to move from using a dictionary as an internal representation to using arrays and direct indexing.
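The difference between the two representations can be sketched as follows. This is a simplified illustration under stated assumptions, not the code that shipped in Quino: <c>SlotLayout</c>, <c>DictionaryBackedObject</c> and <c>ArrayBackedObject</c> are hypothetical names. The sketch assumes the mapping from property name to array index is built once per class and shared, so that each object only has to carry a plain array.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-class layout: maps each property name to an array slot.
// It is built once and shared by all objects of the same class.
public sealed class SlotLayout
{
    private readonly Dictionary<string, int> _slots = new Dictionary<string, int>();

    public SlotLayout(params string[] propertyNames)
    {
        for (var i = 0; i < propertyNames.Length; i++)
        {
            _slots.Add(propertyNames[i], i);
        }
    }

    public int Count
    {
        get { return _slots.Count; }
    }

    public int IndexOf(string propertyName)
    {
        int index;
        if (!_slots.TryGetValue(propertyName, out index))
        {
            throw new ArgumentException("Unknown property: " + propertyName, "propertyName");
        }
        return index;
    }
}

// "Before" (sketch): every object owns a dictionary, with its buckets and
// entries, and every access hashes the property name.
public sealed class DictionaryBackedObject
{
    private readonly Dictionary<string, object> _values = new Dictionary<string, object>();

    public object GetValue(string name)
    {
        object value;
        _values.TryGetValue(name, out value);
        return value;
    }

    public void SetValue(string name, object value)
    {
        _values[name] = value;
    }
}

// "After" (sketch): every object owns a single array; callers that already
// know the slot index skip the name lookup entirely.
public sealed class ArrayBackedObject
{
    private readonly SlotLayout _layout;
    private readonly object[] _values;

    public ArrayBackedObject(SlotLayout layout)
    {
        _layout = layout;
        _values = new object[layout.Count];
    }

    public object GetValue(int index)
    {
        CheckIndex(index);
        return _values[index];
    }

    public void SetValue(int index, object value)
    {
        CheckIndex(index);
        _values[index] = value;
    }

    // Name-based access stays available through the shared layout.
    public object GetValue(string name)
    {
        return _values[_layout.IndexOf(name)];
    }

    public void SetValue(string name, object value)
    {
        _values[_layout.IndexOf(name)] = value;
    }

    // The price of direct indexing: bounds have to be guarded explicitly.
    private void CheckIndex(int index)
    {
        if (index < 0 || index >= _values.Length)
        {
            throw new ArgumentOutOfRangeException("index", "No property at slot " + index + ".");
        }
    }
}
```

In a design like this, a grid column could resolve its slot index once and then call <c>GetValue(int)</c> for every row, while code that only knows property names keeps working through the name-based overloads.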
The dictionary is the more natural choice as the generalized API maps property and relation names to values, but it uses more space and is slower than an array. It is, however, much easier to use if you don't have to worry about extreme performance situations. Using an array gives us the speed we need, but it also requires that we be much more careful about index-out-of-bounds situations. That's where our rich suite of tests came to the rescue and let us have our cake and eat it too.

These improvements are available to any application using Quino 1.6.2.0 and higher.

<hr>

<ft>In a subsequent version of this product, we would move each grid/window into its own UI thread in order to parallelize the work and use all 8 cores on the target machine to make updates even faster.</ft>

<ft>Because of the parallelization mentioned in the footnote above, the subsequent version was still reaching the limit of the 32-bit address space, even with the decreased memory footprint per object. So we compiled as 64-bit to remove that limitation as well.</ft>