Published by marco on 18. Mar 2024 10:56:49 (GMT-5)
Updated by marco on 18. Mar 2024 11:18:01 (GMT-5)
The article Continuous Integration by Martin Fowler makes many interesting points. It is a compendium of know-how about CI by one of the industry heavyweights, who’s been using it for a long time.
While I found a lot of what he had to say interesting, I did wonder how applicable CI is for the kinds of teams that I know and work with. He makes several statements toward that end that pretty severely limit the applicability of what he calls “true CI” for many, if not most, teams.
I think he should have started his article with a very clear delineation for which kinds of organizations this kind of process is appropriate or efficient. In leaving it out, he seems to suggest that it’s the best for everyone, but at the end of the article, he lists what are, for me, quite severe restrictions. For example,
I don’t get the impression that Fowler is discussing a dream scenario toward which one works, but rather what he considers to be the absolute minimum process that anyone should be utterly embarrassed about themselves for not already having. I didn’t see a single sentence in this 40-page, at-times repetitive document about how to actually get there from here—or whether that’s really appropriate for many projects that people who read Martin Fowler might be working on.
I wonder about the wisdom of prioritizing integration seemingly above all else.
Below are citations from the long paper, with my comments interleaved.
“This contrast isn’t the result of an expensive and complex tool. The essence of it lies in the simple practice of everyone on the team integrating frequently, at least daily, against a controlled source code repository. This practice is called “Continuous Integration” (or it’s called “Trunk-Based Development”).”
He says this a lot, but I never hear about the costs. Is there no amount of time lost on integrations that is too high a price? Is there no task that he doesn’t break down into a million pieces in order to accommodate this style of work? Is there no efficiency lost by making each task into 1-hour chunks of coding that the entire team then integrates? Is that what we’re doing now?
“This will consist of both altering the product code, and also adding or changing some of the automated tests. During that time I run the automated build and tests frequently. After an hour or so I have the moon logic incorporated and tests updated.”
I’m quite fed up with reading this kind of optimistic bulls%!t. What kind of programmers are these who can accomplish major work in one hour? Or are the tasks that Fowler can conceive of all so simple that they can be accomplished in an hour? I’m very suspicious about these kinds of statements. It reminds me of game developers in the 90s talking about how they’d “written the whole engine in a weekend”, but then the game still took five more years to deliver.
“Some people do keep the build products in source control, but I consider that to be a smell − an indication of a deeper problem, usually an inability to reliably recreate builds. It can be useful to cache build products, but they should always be treated as disposable, and it’s usually good to then ensure they are removed promptly so that people don’t rely on them when they shouldn’t.”
Sure. But—priorities. Your product is not the pipeline. It’s your product. You can’t make everything a slave to the process. Remember to fix that which you can fix quickly, but to focus on your own priorities. Don’t polish a build so that Martin Fowler is happy, if it’s going to make your customers wait a lot longer for their release.
“The tests act as an automated check of the health of the code base, and while tests are the key element of such an automated verification of the code, many programming environments provide additional verification tools. Linters can detect poor programming practices, and ensure code follows a team’s preferred formatting style, vulnerability scanners can find security weaknesses. Teams should evaluate these tools to include them in the verification process.”
“Everyone Pushes Commits To the Mainline Every Day
“No code sits unintegrated for more than a couple of hours.”
This feels completely divorced from reality, but maybe I just “don’t get it.”
“If everyone pushes to the mainline frequently, developers quickly find out if there’s a conflict between two developers. The key to fixing problems quickly is finding them quickly. With developers committing every few hours a conflict can be detected within a few hours of it occurring, at that point not much has happened and it’s easy to resolve. Conflicts that stay undetected for weeks can be very hard to resolve.”
I agree with the last sentence, but at what cost? It feels like you’re going to spend so much time committing and integrating. How is finding out if you have conflicts the highest-priority task your team has?
“Full mainline integration requires that developers push their work back into the mainline. If they don’t do that, then other team members can’t see their work and check for any conflicts.”
Who finishes anything non-trivial in an hour? I can’t escape the feeling that one-hour chunks is almost too granular, that this size was chosen because it aids integration. While that’s a noble goal, I wonder how appropriate it is for many tasks, and to what degree the shape of the process affects the size of the solution set.
“Since there’s only a few hours of changes between commits, there’s only so many places where the problem could be hiding. Furthermore since not much has changed we can use Diff Debugging to help us find the bug.”
But don’t you waste time hunting bugs that would have gone away by themselves if the process weren’t so frenetic? If you rebase everything, then you’ll still encounter every integration conflict. If you merge, though, you can skip many of those interim integrations because subsequent changes might have obviated prior ones that might have caused conflicts.
Instead of testing the occasional version, you end up testing absolutely everything you do as if it were a release candidate. I’m not convinced that there’s no downside to that. I feel like it’s a waste of time if applied so mindlessly.
“Often people initially feel they can’t do something meaningful in just a few hours, but we’ve found that mentoring and practice helps us learn.”
I don’t know who you’re working with, but I wonder how useful that is. How useful is it to tailor your entire process to ruthlessly chopping up your work into tiny segments? What if that’s not how some people work? What if they can’t learn? Fire ‘em?
“Continuous Integration can only work if the mainline is kept in a healthy state. Should the integration build fail, then it needs to be fixed right away. As Kent Beck puts it: “nobody has a higher priority task than fixing the build”.”
Your goal ends up being to run the process, rather than to build the product. This sounds more and more like a cult.
“If the secondary build detects a bug, that’s a sign that the commit build could do with another test. As much as possible we want to ensure that any later-stage failure leads to new tests in the commit build that would have caught the bug, so the bug stays fixed in the commit build.”
“A team should thus automatically check for new versions of dependencies and integrate them into the build, essentially as if they were another team member. This should be done frequently, usually at least daily, depending on the rate of change of the dependencies.”
This seems like another thing that becomes a higher priority than building the product itself. A daily dependency check seems like overkill, but it’s automated, so who cares? He’s just running builds all the time, like we don’t have a climate crisis.
“if we rename a database field, we first create a new field with the new name, then write to both old and new fields, then copy data from the existing old fields, then read from the new field, and only then remove the old field. We can reverse any of these steps, which would not be possible if we made such a change all at once. Teams using Continuous Integration often look to break up changes in this way, keeping changes small and easy to undo.”
“Virtual environments make it much easier than it was in the past to do this. We run production software in containers, and reliably build exactly the same containers for testing, even in a developer’s workspace. It’s worth the effort and cost to do this, the price is usually small compared to hunting down a single bug that crawled out of the hole created by environment mismatches.”
I agree with this part, without qualification. At least as a goal.
“Being able to automatically revert also reduces a lot of the tension of deployment, encouraging people to deploy more frequently and thus get new features out to users quickly. Blue Green Deployment allows us to both make new versions live quickly, and to roll back equally quickly if needed, by shifting traffic between deployed versions.”
What about data schemas? What about if you don’t have a product that deploys on a web server or app store? I understand that there are solutions to this, but I wonder how great a fit they are to many teams? If your team is accustomed to SQL programming—or if you already have a suite of products that use SQL databases—then how worthwhile to your business is it to prioritize moving away from SQL to a local DB like SQLite, a NoSQL document store like RavenDB, or even to a completely different back-end like Rama?
“Continuous Integration effectively eliminates delivery risk. The integrations are so small that they usually proceed without comment. An awkward integration would be one that takes more than a few minutes to resolve.”
It sounds very much like it prioritizes eliminating delivery risk over all else. It is only applicable to products built in this way from the beginning.
“Having to put work on a new feature aside to debug a problem found in an integration test [or] feature finished two weeks ago saps productivity.”
So does constantly integrating, though! It can be noise. It’s like the noise of micro-reviewing AI responses. You have to figure out the sweet spot for your team and iterate toward that goal, always ensuring that your team can deliver even if the dream process is not already in place. Make a diagram of all the facets and discuss a plan for your project. Pragmatic. Realistic.
“They found that elite teams deployed to production more rapidly, more frequently, and had a dramatically lower incidence of failure when they made these changes. The research also finds that teams have higher levels of performance when they have three or fewer active branches in the application’s code repository, merge branches to mainline at least once a day, and don’t have code freezes or integration phases.”
What if you don’t have an elite team?
“A two week refactoring session may greatly improve the code, but result in long merges because everyone else has been spending the last two weeks working with the old structure. This raises the costs of refactoring to prohibitive levels. Frequent integration solves this dilemma by ensuring that both those doing the refactoring and everyone else are regularly synchronizing their work.”
Some refactoring can’t just be done in mini bites like that. Sometimes, you work on a POC that takes more time to verify. Now what? Throw it away and build it from scratch in bite-sized pieces? Or integrate a long-lived branch, which is verboten?
I’m working on a sweeping change to the way solutions are configured. It involves changing packages and versions in four different solutions. Should I have merged to master everywhere and involved the whole team in my project? That sounds stupid. Sure, it takes longer to verify and integrate in one big chunk, but it has the advantage that it didn’t make upgrading the solution format the number-one priority for all developers for a sprint or two.
“[…] teams that spend a lot of effort keeping their code base healthy deliver features faster and cheaper. Time invested in writing tests and refactoring delivers impressive returns in delivery speed, and Continuous Integration is a core part of making that work in a team setting.”
For non-legacy projects. Continuous delivery can only really work for web-based products or apps. A lot of other products have to be deployed to processes that aren’t as easy to update five times a day.
“Continuous Integration is more suited for team working full-time on a product, as is usually the case with commercial software. But there is much middle ground between the classical open-source and the full-time model. We need to use our judgment about what integration policy to use that fits the commitment of the team.”
That is the first time that he’s conceded that maybe there are use cases to which this whole article doesn’t apply very well.
“If a team attempts Continuous Integration without a strong test suite, they will run into all sorts of trouble because they don’t have a mechanism for screening out bugs. If they don’t automate, integration will take too long, interfering with the flow of development.”
No kidding. You need some serious test coverage to continuously integrate and deploy. I also wonder about the size of the product for which you can legitimately do this. Can you imagine if your test suite takes ten minutes to run and you integrate three or four times per day? Can you imagine how much time you’re not developing software because you’re integrating someone else’s code? I understand that this happens eventually, but I wonder about the wisdom of prioritizing integration seemingly above all else.
“Continuous Integration is about integrating code to the mainline in the development team’s environment, and Continuous Delivery is the rest of the deployment pipeline heading to a production release.”
This is a good definition and I wonder that he rewrote this whole essay and didn’t put this right at the top.
“Continuous Integration ensures everyone integrates their code at least daily to the mainline in version control. Continuous Delivery then carries out any steps required to ensure that the product is releasable to product[ion] whenever anyone wishes. Continuous Deployment means the product is automatically released to production whenever it passes all the automated tests in the deployment pipeline.”
Also excellent definitions that make the distinction clear. Continuous Delivery is the one that many teams could strive for, even if they will never be able to do Continuous Deployment. The question is: at what cost?
“Those who do Continuous Integration deal with this by reframing how code review fits into their workflow.”
Well, that’s an interesting statement. Integration trumps review? Get your code in there and review later? Trust in your tests? Are you kidding me? You should review design, as well as implementation. If everyone’s coding and committing and pushing in hours, when do they review? Is the idea to have people communicate with each other only when they’ve already built something?
Published by marco on 11. Feb 2024 22:33:58 (GMT-5)
In the article The web just gets better with Interop 2024 (WebKit Blog), Jen Simmons writes,
“The Interop project aims to improve interoperability by encouraging browser engine teams to look deeper into specific focus areas. Now, for a third year, Apple, Bocoup, Google, Igalia, Microsoft, and Mozilla pooled our collective expertise and selected a specific subset of automated tests for 2024.
“Some of the technologies chosen have been around for a long time. Other areas are brand new. By selecting some of the highest priority features that developers have avoided for years because of their bugs, we can get them to a place where they can finally be relied on.”
When we complain about features that remain unimplemented in browsers, we also have to acknowledge that there’s only so much you can do with a given team. There are problems that are technically easier to solve than others. When we complain, we’re actually more concerned about the prioritization of issues. We want to be able to influence what gets fixed when, rather than just having to passively hope that the manufacturer eventually gets around to it.
That’s where the Web Platform Tests come in. The Interop 2024 project follows on iterations from 2023, 2022, and 2021, when it all started.
Last year was a banner year. For CSS, “Subgrid, Container Queries, :has(), Motion Path, CSS Math Functions, inert and @property are now supported in every modern browser.” For JavaScript, we got “Improved Web APIs include Offscreen Canvas, Modules in Web Workers, Import Maps, Import Assertions, and JavaScript Modules” across all modern browsers.
These are all super-important features. E.g., Import Assertions for JSON imports, and Modules in Web Workers, which allow modern and modular programming, making it much easier to offload work, as one would with code running directly on modern operating systems.
What’s on the schedule for 2024?
- @property will similarly be more polished, as the percentage support is still quite low in many browsers.
- Fixes for how sub-grids or display: contents affect element order—as this means that we will get sites that are automatically accessible, as long as we build our sites logically.
- IndexedDB will make it easier to write powerful local-first applications (even though something like Automerge might be a better fit for apps offering concurrent or collaborative editing).
- popover with anchors is long overdue, as making usable tooltips and popups is an area fraught with custom code and half-baked solutions. It’s nice to see this become an area where you’ll no longer need custom JavaScript.
- @starting-style will fill a gap in CSS that finally allows sites to indicate how an element will transition from or to display: none.

See the original article for much more detail.
Published by marco on 8. Jan 2024 09:50:50 (GMT-5)
Updated by marco on 9. Jan 2024 11:04:29 (GMT-5)
I published a very similar version of the following article in the DevOps Wiki at Uster Technologies AG. Since nearly all of that post is general knowledge that I would have been happy to find before I started my investigations, I’m sharing it here.
When we think about navigating or debugging our code, we usually focus on the code we’ve written ourselves—local sources in our file system. IDEs have classically focused on being able to debug and navigate this code.
More and more, though, we’re also interested in navigating and debugging our versioned and compiled dependencies:
Most of these are available as source code. We would ideally like to be able to navigate and debug that code just as easily as we can our own.
The following sections define file types and terminology, and then explain how these concepts apply to debugging and navigation for external sources. You can also just jump to the sections on producing or consuming packages (especially as relates to authentication for private sources).
The following diagram provides an overview of the process of obtaining external packages, along with their symbols and source files. It looks quite complicated, but accommodates the flexibility required by various stakeholders.
There are several types of files associated with debugging and navigation:
- DLL: the compiled assembly itself
- PDB: the symbol file that maps the compiled code back to source files and line numbers
- XML: the compiler-generated documentation file for the assembly’s API
- *.cs: the original C# source files
It’s reasonable to ask why this process is so complex.
Why doesn’t the nupkg just include the PDB and the *.cs files?

The system was designed for use cases where most sources were closed. That has changed, but the system still reflects the original design choices. The PDB files can also add about 30% to the size of the package. The original use cases preferred to avoid using 30% more space for package downloads that didn’t need the debugging information.
Again, historically, the use cases were for providing improved stack traces with symbols, but not to provide access to closed sources. Even if the sources are partially open, access may be restricted to only some users of the packages or symbols. Having the IDE request the sources separately allows an additional authorization phase.
The defaults still reflect the original use cases, which actually represent fewer and fewer packages as time goes on.
These answers aren’t particularly satisfying if your use case happens to be “make a package that has symbols for excellent stack traces and sources for excellent debugging”. At least we now have IDEs that know how to work with this system and there is a lot of automation for producing packages with the desired symbol and source-code support.
A developer debugs source code by interrupting execution of a program—either manually or by setting breakpoints—and then stepping through the instructions, examining the contents of symbols (variables) to investigate the runtime behavior and operation of the system.
The debugger uses the PDB to allow source-level debugging, i.e. debugging in the original source code. While debugging in “lower” formats is possible, it’s not nearly as reliable as being able to step through the code in the original source code, using the original symbols.
How does the debugger obtain the PDB for a given DLL?
DLLs and PDBs have unique identifiers that make it possible to request and download the correct file.

Once the debugger has the PDB, it has everything it needs—except the source code.
If the PDB was generated locally, then it most likely references the source files that are still in the same locations in the file system as when it was built. In that case, the debugger easily finds the source files because they’re just at the paths that are directly referenced by the PDB.

If the PDB was not generated locally or the source-code paths do not match, then there are other tricks to find the source files. Visual Studio allows you to set “Directories containing source code” for the “Debug Source Files”.
If the sources aren’t available locally, e.g., for a NuGet package, then there is a system called SourceLink that is extremely well-supported in the .NET world that makes it possible to easily download the source files that generated a DLL and that are referenced by its PDB.
Things to be aware of:
If the package does not support SourceLink, but the sources are available, then you can download the sources locally and use the solution-level mapping above to tell the debugger where the source files are. You can also just point the debugger to the top-level folder when it asks for the file’s location, in which case the debugger makes the entry for you.
A developer navigates by requesting the source code for a symbol. For example, if the declared type of a variable in an open source file is the class Setting, then the developer can ask the IDE to show the source of Setting by Ctrl + clicking, by pressing F12 in Visual Studio, or by pressing Ctrl + B in Rider.
As with debugging, navigating local sources is straightforward, since the sources are in the local file system. For symbols in NuGet packages, the IDE has to be clever enough to download, cache, and use the sources.
Visual Studio on its own does not support navigating to external sources via SourceLink. Instead, it always decompiles external sources, as shown in the example below.
If you have ReSharper installed, then the default setting is to try as hard as possible to avoid showing a decompiled version.
You can also add “Folder Substitutions” in the “Advanced Symbol options…” for navigating to “External Sources”. The option does not seem to be available in Rider.
SourceLink is a system that provides source files for external sources like NuGet packages for debugging or navigation. In order for this to work, the package must provide access to its sources and the client must be properly configured for debugging.
See below for troubleshooting information, especially as relates to authentication for packages and source code pulled from authenticated locations.
A decompiled version of the source code is a reconstruction of the original source from the instructions and information in the DLL and PDB. When sources cannot be located for a given symbol, Visual Studio, ReSharper, and Rider will produce a decompiled version as a fallback.
This is often good enough to be able to read the code reasonably well, but it leaves certain common constructs in their “lowered” format. E.g., calls to extension methods appear as static-method calls rather than as targeted on the first parameter.
This can make debugging difficult, as the instructions don’t match the mapping. Rider has support for patching the PDB on-the-fly to allow more comfortable debugging of decompiled sources. This is, however, a fallback solution for external packages over which you have no control. It’s best to configure your packages to publish with symbols and sources available to IDEs that support them, as shown in the next section.
The documentation to Enable debugging and diagnostics with Source Link is thorough and tells you all you need to know about all of the options.
If you’re working with Azure DevOps Services, you should include the following package reference:
<ItemGroup>
<PackageReference Include="Microsoft.SourceLink.AzureRepos.Git" Version="8.0.0" PrivateAssets="All"/>
</ItemGroup>
With this, you’re all set. The package is published to the Azure Artifacts, with a corresponding snupkg available on the Azure symbol server and sources available via the repository URL (subject to authorization; see below for troubleshooting).
You can set a few optional properties, detailed below. Most projects won’t need to set these, but they are included to spare you the research if you see them in code examples, either in your institution’s code or online. As noted, the only line you need is the package reference shown above.
- IncludeSymbols: publishes symbols, either embedded in the assembly (if DebugType is set to embedded) or in a separate symbol package (if SymbolPackageFormat is set to snupkg). This is implied when the NuGet package Microsoft.SourceLink.AzureRepos.Git is included, as shown below.
- SymbolPackageFormat: defaults to snupkg when the NuGet package Microsoft.SourceLink.AzureRepos.Git is included, as shown below.
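For illustration, here’s a minimal sketch of what setting these optional properties could look like in a project file. The values shown are just the ones discussed above; as noted, most projects only need the package reference.

<PropertyGroup>
  <!-- Produce a separate symbol package (*.snupkg) alongside the NuGet package. -->
  <IncludeSymbols>true</IncludeSymbols>
  <SymbolPackageFormat>snupkg</SymbolPackageFormat>
  <!-- Alternative: embed the PDB directly in the assembly instead of shipping a symbol package. -->
  <!-- <DebugType>embedded</DebugType> -->
</PropertyGroup>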
See the SourceLink documentation for more details. Among other details, they also note that projects that target .NET 8 no longer need to include this support explicitly because Azure Repos are supported by default, as detailed in the readme for the SourceLink project.
“If your project uses .NET SDK 8+ and is hosted by the above providers (GitHub, Azure Repos, GitLab, BitBucket) it does not need to reference any Source Link packages or set any build properties.”
You can also include the packaging conditionally in the Directory.Build.Targets, as shown below.
<ItemGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
<PackageReference Include="Microsoft.SourceLink.AzureRepos.Git" Version="8.0.0" PrivateAssets="All"/>
</ItemGroup>
See the appendix for Directory.Build.Props and Directory.Build.Targets for more information about which variables and directives are respected in which file.
If a package has SourceLink enabled and you have access to the online repository from which it was built, then to seamlessly debug into that source code, ensure the following:
As noted above, Visual Studio doesn’t support navigating via SourceLink. To browse external sources with JetBrains tools, ensure the following:
Once you’re sure that the package supports SourceLink, then you should also make sure that the Just My Code setting is disabled.
When Just My Code is enabled, the debugger skips over any code that doesn’t correspond to source code in one of the local projects.
- Does the package include symbols (a .pdb file next to the .dll file)?
- If the PDB is not included with the package, is it available on a Symbol Server?
- Is the PDB being copied to the output folder next to the DLL?

If it’s available in the package, but is not being copied to the output folder, then if you’re using .NET 7.0 SDK or higher, you can use the build property named CopyDebugSymbolFilesFromPackages.
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
<CopyDebugSymbolFilesFromPackages>true</CopyDebugSymbolFilesFromPackages>
</PropertyGroup>
Verify that the symbols for the module you’re trying to debug have been loaded. If they aren’t loaded, you can try to load symbols while debugging. For more details and a screenshot, see Just My Code debugging.
If you’re trying to navigate in code, but ReSharper or Rider keeps decompiling instead of getting the sources from SourceLink, then check your External Sources settings in ReSharper or Rider. Verify that the tool is configured to check for external sources before it tries decompiling.
If the IDE is having trouble authenticating, then you will usually see a decompiled version instead. Sometimes the code is so close to the original that it’s hard to tell; scroll to the top to see if it includes the “decompiled by JetBrains…” header.
Once the IDE has decompiled a source file, it will continue to use this cached copy until you close the tab, or sometimes you have to close and re-open the project. If you’re troubleshooting your way through this setup, then you can temporarily disable decompilation as a fallback, which avoids producing the unwanted source-code variant in the first place.
Visual Studio uses the authentication associated with the logged-in user that you use to enable the IDE. This can be in a weird state if you’ve recently changed your password or your authentication token is stale or in a non-refreshable state. Try logging out and back in.
JetBrains tools (Rider, ReSharper, DotPeek, etc.), on the other hand, need to be given a token.
If the tool shows a notification indicating that authentication has failed, then do the following:
- Click Configure on the notification to show a dialog
- Enter your user name (e.g., john.doe@example.com) and your credentials (an Azure PAT; see below)
- Press the Test button to verify that it works (you should see OK 200)
- Press Ok to save the credentials

However, there is a bug whereby JetBrains tools fail to show a notification or offer a way to enter credentials. [1] That’s going to look something like this:
It claims that it can download the source, but it never completes. You have to cancel the dialog. If you then look at the ReSharper Output, then you’ll see something like this:
The relevant text is at the end of the third line, which indicates that the request for the source file returned a “Non-OK HTTP status code”.
PdbNavigator: Searching for 'Example.Core.AppConfig.AppConfigKeyAttribute' type sources in C:\Users\john.doe\.nuget\packages\example.core.appconfig\4.1.0\lib\netstandard2.0\Example.Core.AppConfig.pdb
PdbNavigator: File names (1) are inferred for type Example.Core.AppConfig.AppConfigKeyAttribute
PdbNavigator: Downloader: https://dev.azure.com/example/example.Core/_apis/git/repositories/Example.Core.LabInstruments/items?api-version=1.0&versionType=commit&version=8b34c2aa672facd47e835c27152f695fa796a408&path=/Example.Core/DotNetStandard/Example.Core.AppConfig/AppConfigKeyAttribute.cs -> Non-OK HTTP status code
The most reliable way to fix this is to create the credentials in the Credential Manager. Be aware that you will need to create an Azure PAT (personal access token).
Under Windows Credentials, look for an entry named JetBrains SourceLink https://dev.azure.com/exampleOrganization.
If you don’t have this entry, then that’s the problem. If you have it, but you still can’t get the sources, then edit the entry to have valid credentials.
To create or edit the record, do the following from the Credentials Manager:
- Internet or network address: JetBrains SourceLink https://dev.azure.com/exampleOrganization
- User name: john.doe@example.com
- Password: an Azure PAT (personal access token)
As you can see above, although publishing a package is relatively straightforward, there are quite a few stumbling blocks on the way to consuming the package for navigation and debugging. Once you have everything set up and working, it’s great, but … there is still one other drawback.
You can’t edit the code for packages.
This is not optimal. Optimally, we’d like to quickly verify that a change to upstream code would address an issue in downstream code without having to generate new packages. It would be great to just edit the upstream code as if it were part of your downstream solution until you’re sure that the change would address your downstream issue. At that point, you can copy the changes back to the upstream solution (where the dependency is produced), add tests, and produce a new version, being pretty certain that the change is effective.
The shortest possible developer-feedback loop with code in external packages is to rebuild the dependency locally and use its DLL (and PDB) in place of the packaged one.

If your package has dependencies or your change in the external package’s solution touches multiple packages, then you can do the following:
If it gets too complicated to do locally, then you can always commit, push, and have the CI generate new versions of your packages (hopefully with a prerelease version, e.g., 3.2.4-preview2).
The solutions outlined above have a reasonable turnaround time, but sometimes you want to pretend that the external packages are just internal projects instead. This basically entails adding the upstream projects to your solution and referencing them directly instead of through the packages.
At that point, you can edit, debug, and navigate the code as if it were your own.
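As a sketch, assuming a hypothetical package named Example.Core.AppConfig and a local checkout of its repository (the path below is made up), the swap in the consuming project might look something like this:

<ItemGroup>
  <!-- The original package reference, commented out while working on the upstream code. -->
  <!-- <PackageReference Include="Example.Core.AppConfig" Version="4.1.0" /> -->
  <!-- Reference the upstream project directly instead. -->
  <ProjectReference Include="..\Example.Core\Example.Core.AppConfig\Example.Core.AppConfig.csproj" />
</ItemGroup>

Switch back to the PackageReference once the new package version has been published.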
See the “Project Munging with Tools & PowerShell” section of How to Debug NuGet Packages with Symbols and Source Link Painlessly for a PowerShell script that can help you automate part of this.
MSBuild supports including common configuration in project files. While earlier versions required all configuration to be included explicitly, modern versions include configuration files with special names automatically, greatly simplifying common configuration and reducing clutter in project files.
If the file is named Directory.Build.Props or Directory.Build.Targets, it is picked up automatically and included for all projects in that folder or any subfolder. If you use a different name, then you have to explicitly reference that file from a project or from another *.props or *.targets file. If you choose your own name, you don’t have to use the Build.Properties or Build.Targets convention, but it’s strongly recommended, to avoid confusion.
You can use a Directory.Build.Properties file to include settings for all projects in a folder or set of subfolders.
For example, the following package reference can and should be included in Directory.Build.Props:
<PackageReference Include="Microsoft.SourceLink.AzureRepos.Git" Version="8.0.0" PrivateAssets="All"/>
If you want to include settings conditionally based on build configuration (e.g., Configuration or Platform), then you’ll have to use the Directory.Build.Targets file, which has access to those variables.
<ItemGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
<PackageReference Include="Microsoft.SourceLink.AzureRepos.Git" Version="8.0.0" PrivateAssets="All"/>
</ItemGroup>
Place the Directory.Build.Props file at the root of the solution.
Published by marco on 30. Dec 2023 22:46:09 (GMT-5)
The article Exploring Generative AI by Birgitta Böckeler (MartinFowler.com) is chock-full of helpful tips from eight newsletters totaling 25 pages that she wrote throughout 2023. I include some of my own thoughts, but most of this article consists of citations.
A lot of my analysis and notes boils down to: you need to know what you’re doing to use these tools. They can help you build things that you don’t understand, but it’s not for medium- or long-term solutions. I’ve written a lot more about the need for expertise in How important is human expertise?
“The following are the dimensions of my current mental model of tools that use LLMs (Large Language Models) to support with coding.
“Assisted tasks”
“These are the types of tasks I see most commonly tackled when it comes to coding assistance, although there is a lot more if I would expand the scope to other tasks in the software delivery lifecycle.”
- Finding information faster, and in context
- Generating code
- “Reasoning” about code (Explaining code, or problems in the code)
- Transforming code into something else (e.g. documentation text or diagram)
“In this particular case of a very common and small function like median, I would even consider using generated code for both the tests and the function. The tests were quite readable and it was easy for me to reason about their coverage, plus they would have helped me remember that I need to look at both even and uneven lengths of input. However, for other more complex functions with more custom code I would consider writing the tests myself, as a means of quality control. Especially with larger functions, I would want to think through my test cases in a structured way from scratch, instead of getting partial scenarios from a tool, and then having to fill in the missing ones.”
“The tool itself might have the answer to what’s wrong or could be improved in the generated code − is that a path to make it better in the future, or are we doomed to have circular conversation with our AI tools?”
“[…] generating tests could give me ideas for test scenarios I missed, even if I discard the code afterwards. And depending on the complexity of the function, I might consider using generated tests as well, if it’s easy to reason about the scenarios.”
“For the purposes of this memo, I’m defining “useful” as “the generated suggestions are helping me solve problems faster and at comparable quality than without the tool”. That includes not only the writing of the code, but also the review and tweaking of the generated suggestions, and dealing with rework later, should there be quality issues.”
- […]
- Boilerplate: Create boilerplate setups like an ExpressJS server, or a React component, or a database connection and query execution.
- Repetitive patterns: It helps speed up typing of things that have very common and repetitive patterns, like creating a new constructor or a data structure, or a repetition of a test setup in a test suite. I traditionally use a lot of copy and paste for these things, and Copilot can speed that up.
Interesting. I’ve just always used the existing templates or made my own expansion templates. At least then it makes exactly what I want—and even leaves the cursor in the right position afterwards.
Another thought I had is that the kind of programmer that this helps doesn’t use any generalization for common patterns. Otherwise, the suggestions wouldn’t be useful because they can’t possibly take advantage of those highly specialized patterns. Or maybe they can, if they’re included in the context. It seems unlikely, if only because the sample size is too small to be able to influence the algorithm sufficiently. But maybe enough weight can be given to the immediate context to make that work somehow.
At that point, though, you’re just spending all of your time coaxing your LLM copilot into building the code that you already knew you wanted. This practice seems like it would end up discouraging generalization and abstraction—unless it can grok your API (as I’ve noted above).
This is an age-old problem that is maybe solved, once and for all. The problem is that when you generalize a solution, it becomes much easier, more efficient, and more economical to maintain, but it can end up being more difficult to understand. If the API is well-made and addresses a problem domain with a complexity that the programmer is actually capable of understanding, then the higher-level API may be easier to use, and perhaps even maintain.
However, a non-generalized solution is sometimes easier for a novice or less-experienced programmer to understand and extend. It’s questionable whether you’d want your code being extended and maintained by someone who barely—or doesn’t—understand it, but that situation is sometimes thrust on teams and managers.
“This autocomplete-on-steroids effect can be less useful though for developers who are already very good at using IDE features, shortcuts, and things like multiple cursor mode. And beware that when coding assistants reduce the pain of repetitive code, we might be less motivated to refactor.”
“You can use a coding assistant to explore some ideas when you are getting started with more complex problems, even if you discard the suggestion afterwards.”
“The larger the suggestion, the more time you will have to spend to understand it, and the more likely it is that you will have to change it to fit your context. Larger snippets also tempt us to go in larger steps, which increases the risk of missing test coverage, or introducing things that are unnecessary.”
On the other hand,
“[…] when you do not have a plan yet because you are less experienced, or the problem is more complex, then a larger snippet might help you get started with that plan.”
This is not unlike using StackOverflow or any other resource. There’s no getting around knowing what you’re doing, at least a little bit. You can’t bootstrap without even a bootstrap.
“Experience still matters. The more experienced the developer, the more likely they are to be able to judge the quality of the suggestions, and to be able to use them effectively. As GitHub themselves put it: “It’s good at stuff you forgot.” This study even found that “in some cases, tasks took junior developers 7 to 10 percent longer with the tools than without them.””
“Using coding assistance tools effectively is a skill that is not simply learned from a training course or a blog post. It’s important to use them for a period of time, experiment in and outside of the safe waters, and build up a feeling for when this tooling is useful for you, and when to just move on and do it yourself.”
This is just like any other tool. There is no shortcut to being good at something complex. The only tasks for which there are shortcuts are the non-complex ones. In that case, you should be asking yourself why your solutions involve so much repetitive programming.
“We have found that having the right files open in the editor to enhance the prompt is quite a big factor in improving the usefulness of suggestions. However, the tools cannot distinguish good code from bad code. They will inject anything into the context that seems relevant. (According to this reverse engineering effort, GitHub Copilot will look for open files with the same programming language, and use some heuristic to find similar snippets to add to the prompt.) As a result, the coding assistant can become that developer on the team who keeps copying code from the bad examples in the codebase.”
That will be so much fun, especially if you can get an echo chamber of lower-skilled programmers approving each other’s pull requests. 😉
“We also found that after refactoring an interface, or introducing new patterns into the codebase, the assistant can get stuck in the old ways. For example, the team might want to introduce a new pattern like “start using the Factory pattern for dependency injection”, but the tool keeps suggesting the current way of dependency injection because that is still prevalent all over the codebase and in the open files. We call this a poisoned context, and we don’t really have a good way to mitigate this yet.”
“Using a coding assistant means having to do small code reviews over and over again. Usually when we code, our flow is much more about actively writing code, and implementing the solution plan in our head. This is now sprinkled with reading and reviewing code, which is cognitively different, and also something most of us enjoy less than actively producing code. This can lead to review fatigue, and a feeling that the flow is more disrupted than enhanced by the assistant.”
“Automation Bias is our tendency “to favor suggestions from automated systems and to ignore contradictory information made without automation, even if it is correct.” Once we have had good experience and success with GenAI assistants, we might start trusting them too much.”
“[…] once we have that multi-line code suggestion from the tool, it can feel more rational to spend 20 minutes on making that suggestion work than to spend 5 minutes on writing the code ourselves once we see the suggestion is not quite right.”
“Once we have seen a code suggestion, it’s hard to unsee it, and we have a harder time thinking about other solutions. That is because of the Anchoring Effect, which happens when “an individual’s decisions are influenced by a particular reference point or ‘anchor’”. so while coding assistants’ suggestions can be great for brainstorming when we don’t know how to solve something yet, awareness of the Anchoring Effect is important when the brainstorm is not fruitful, and we need to reset our brain for a fresh start.”
“The framing of coding assistants as pair programmers is a disservice to the practice, and reinforces the widespread simplified understanding and misconception of what the benefits of pairing are.”
“Pair programming however is also about the type of knowledge sharing that creates collective code ownership, and a shared knowledge of the history of the codebase. It’s about sharing the tacit knowledge that is not written down anywhere, and therefore also not available to a Large Language Model. Pairing is also about improving team flow, avoiding waste, and making Continuous Integration easier. It helps us practice collaboration skills like communication, empathy, and giving and receiving feedback. And it provides precious opportunities to bond with one another in remote-first teams.”
“LLMs rarely provide the exact functionality we need after a single prompt. So iterative development is not going away yet. Also, LLMs appear to “elicit reasoning” (see linked study) when they solve problems incrementally via chain-of-thought prompting. LLM-based AI coding assistants perform best when they divide-and-conquer problems, and TDD is how we do that for software development.”
“Some examples of starting context that have worked for us:”
- ASCII art mockup
- Acceptance Criteria
- Guiding Assumptions such as:
- “No GUI needed”
- “Use Object Oriented Programming” (vs. Functional Programming)
“For example, if we are working on backend code, and Copilot is code-completing our test example name to be, “given the user… clicks the buy button ” , this tells us that we should update the top-of-file context to specify, “assume no GUI” or, “this test suite interfaces with the API endpoints of a Python Flask app”.”
“Copilot often fails to take “baby steps”. For example, when adding a new method, the “baby step” means returning a hard-coded value that passes the test. To date, we haven’t been able to coax Copilot to take this approach.”
Knowing a bit about how LLMs work, there’s no way you really could train it to do TDD, because it’s an iterative process. It doesn’t know what TDD is, nor does the way it’s built have any mechanism for learning how to do it. Nor does it know what coding is, for that matter. It’s just a really, really good guesser. Everything it does is hallucination. It’s just that some of it is useful.
“As a workaround, we “backfill” the missing tests. While this diverges from the standard TDD flow, we have yet to see any serious issues with our workaround.”
Changing how you program because of the tool is something you should do deliberately. This is a slippery slope.
“For implementation code that needs updating, the most effective way to involve Copilot is to delete the implementation and have it regenerate the code from scratch. If this fails, deleting the method contents and writing out the step-by-step approach using code comments may help. Failing that, the best way forward may be to simply turn off Copilot momentarily and code out the solution manually.”
Jaysus. That’s pretty grim.
“The common saying, “garbage in, garbage out” applies to both Data Engineering as well as Generative AI and LLMs. Stated differently: higher quality inputs allow for the capability of LLMs to be better leveraged. In our case, TDD maintains a high level of code quality. This high quality input leads to better Copilot performance than is otherwise possible.”
“Model-Driven Development (MDD). We would come up with a modeling language to represent our domain or application, and then describe our requirements with that language, either graphically or textually (customized UML, or DSLs). Then we would build code generators to translate those models into code, and leave designated areas in the code that would be implemented and customized by developers.”
“That unreliability creates two main risks: It can affect the quality of my code negatively, and it can waste my time. Given these risks, quickly and effectively assessing my confidence in the coding assistant’s input is crucial.”
“Can my IDE help me with the feedback loop? Do I have syntax highlighting, compiler or transpiler integration, linting plugins? Do I have a test, or a quick way to run the suggested code manually?”
“I have noticed that in CSS, GitHub Copilot suggests flexbox layout to me a lot. Choosing a layouting approach is a big decision though, so I would want to consult with a frontend expert and other members of my team before I use this.”
That’s because you care about architecture. Review was always important, but more so when code is being written by something you never hired.
“How long-lived will this code be? If I’m working on a prototype, or a throwaway piece of code, I’m more likely to use the AI input without much questioning than if I’m working on a production system.”
“[…] it’s also good to know if the AI tool at hand has access to more information than just the training data. If I’m using a chat, I want to be aware if it has the ability to take online searches into account, or if it is limited to the training data.”
“To mitigate the risk of wasting my time, one approach I take is to give it a kind of ultimatum. If the suggestion doesn’t bring me value with little additional effort, I move on. If an input is not helping me quick enough, I always assume the worst about the assistant, rather than giving it the benefit of the doubt and spending 20 more minutes on making it work.”
“GitHub Copilot is not a traditional code generator that gives you 100% what you need. But in 40-60% of situations, it can get you 40-80% of the way there, which is still useful. When you adjust these expectations, and give yourself some time to understand the behaviours and quirks of the eager donkey, you’ll get more out of AI coding assistants.”
Published by marco on 15. Dec 2023 13:15:17 (GMT-5)
The latest video by Nick Chapsas has a more-than-usually clickbait-y headline. The “big” problem that NativeAOT has, is that it’s 4% slower during runtime than the JIT-compiled version.
That doesn’t seem like such a big problem to me, when the point of AOT is to improve cold-start times for applications launched on-demand. For that use-case, AOT shines. It’s over 4x faster on startup than the JIT-compiled version. It’s incredibly impressive that JIT-compilation takes less than 1/10 of a second, but it’s still 4x slower than AOT.
So, you get the app started 4x faster, but it then performs 4% more slowly than the non-AOT version. It really depends on the use-case, but it’s great for the common one of starting a server to answer a function call—think Azure Functions or AWS Lambdas—and then shutting down again, possibly immediately.
Damian P Edwards (Principal Architect at Microsoft) commented on the post,
“[There are a] few things that cause the slightly lower performance in native AOT apps right now. First (in apps using the web SDK) is the new DATAS Server GC mode. This new GC mode uses far less memory than traditional ServerGC by dynamically adapting memory use based on the app’s demands, but in this 1st generation it impacts the performance slightly. The goal is to remove the performance impact and enable DATAS for all Server GC apps in the future.
“Second is CoreCLR in .NET 8 has Dynamic PGO enabled by default, which allows the JIT to recompile hot methods with more aggressive optimizations based on what it observes while the app is running. Native AOT has static PGO with a default profile applied and by definition can never have Dynamic PGO.
“Thirdly, JIT can detect hardware capabilities (e.g. CPU intrinsics) at runtime and target those in the code it generates. Native AOT however defaults to a highly compatible target instruction set which won’t have those optimizations but you can specify them at compile time based on the hardware you know you’re going to run on.
“Running the tests in [the] video with DATAS disabled and native AOT configured for the target CPU could improve the results slightly.”
To summarize:
An AOT-compiled app cannot benefit from dynamic PGO. It benefits from static PGO, but cannot recompile itself on-the-fly because it doesn’t have a JIT compiler to do so.
The JIT-compiled app can dynamically recompile what it observes as performance hotspots with more highly optimized code. I wrote a bit about how Safari does something similar for JavaScript in Optimizing compilation and execution for dynamic languages—although for JavaScript, dynamic recompilation is sometimes necessary for backing out of an incorrect assumption about what type a variable is going to have.
As well, a JIT-compiled app can take actual hardware capabilities into account, while an AOT-compiled app necessarily targets a static hardware profile.
The generic hardware profile is going to be extremely conservative about capabilities because if it assumes a capability that doesn’t exist, the app simply won’t run. Choosing a hardware profile for AOT that matches the target hardware would boost performance.
I guess that was more of a rephrasing, rather than a summary.
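As a hedged sketch of that last point: NativeAOT lets you opt into a specific instruction set at publish time. The property and values below follow the NativeAOT optimization docs; the exact sets you can specify depend on your target hardware.

<PropertyGroup>
  <PublishAot>true</PublishAot>
  <!-- Assumption: the target machines support these extensions; adjust to your hardware. -->
  <IlcInstructionSet>avx2,bmi2,fma,pclmul,popcnt,aes</IlcInstructionSet>
</PropertyGroup>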
Anyway, another commenter asked,
“[…] would it be possible in the future for a JIT application with Dynamic PGO that has run for a while and has made all kinds of optimizations to then create a “profile” of sorts that could be used by the Native AOT compiler to build an application that is both fast in startup time and highly optimized for a given workload?”
Yes. That should be possible. It’s unclear what sort of extra performance boost this would give, especially if you’d already fine-tuned the target hardware profile—which is the first thing you should do. I could imagine adding this sort of profiling as a compilation step, though. You always have to be careful, though, whenever you’re running something in production that is different than what you’ve tested. We put a lot of faith in the JIT and dynamic PGO, don’t we?
I wanted to also note that, at the end of the video, Chapsas showed Microsoft’s numbers, which confirm the performance drop, but also show an over 50% reduction in working set! Dude! How do you not mention that!? The app uses less than half of the memory and runs almost as fast? Yes, please! That’s a huge win for people paying for cloud-based services.
For once, I’m somewhat surprised to see how naive Nick’s take is—that a 4% drop in performance is at all significant, especially when the “slow” version is still processing 50,000 requests per second in a performance-constrained environment. He did mention a trade-off, but was very excited to tell people that AOT is slower during runtime.
There are always trade-offs and you should be very aware of the actual non-functional requirements for your application before you decide whether to use a technology or not. For 99.9% of applications, the 4% drop in performance vis-à-vis a JIT-compiled version won’t be the deciding factor. When it’s accompanied by a working set that’s only ½ the size, then it becomes an even more attractive target.
Published by marco on 15. Dec 2023 11:52:23 (GMT-5)
Updated by marco on 15. Dec 2023 12:23:31 (GMT-5)
A build started failing after a commit. However, the errors had nothing to do with the changes in the commit. A little investigation revealed that the cloud agent had started using a newer version of the build tool that included an expanded set of default warnings. These warnings started appearing first on CI because developers hadn’t had the chance to update their tools yet.
The “warnings as errors” setting turned what would have been a build with a few extra warnings into a failing build that prevented a developer from being able to apply completely unrelated changes. The setting allowed new, unrelated, and irrelevant warnings to push their way to the top of the priority queue.
👉 tl;dr: I don't think we should use the "warnings as errors" setting anymore. You can get the same benefit—and even more—by using newer, finer-grained configuration options.
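For orientation: in the .NET world, the setting in question is an MSBuild property, and even at that level there are finer-grained knobs than the global switch. A sketch; the values here are illustrative, not a recommendation:

<!-- The blunt instrument: every warning fails the build -->
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
<!-- Finer-grained: escalate only the warnings you actually care about -->
<WarningsAsErrors>nullable</WarningsAsErrors>

We'll get to even better options, like per-rule severities in an EditorConfig, further below.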
This section wasn't included in my original draft of this essay. It only occurred to me in the shower that this is the real reason why I wrote a ten-page essay to answer a teammate's question in a PR review.
In hindsight, it’s obvious: to answer whether we should re-enable the “warnings as errors” setting, we should first think about what doing so would accomplish. What need does it fulfill?
The rest of this essay meanders drunkenly along a path toward what I hope is a reasonable answer.
I understand the sentiment. You’re in a team that never, or rarely, looks at warnings. You’ve given up on teaching them how to look at warnings and keep them fixed. Fine. You just make every warning an error and now they absolutely have to fix everything. Problem solved.
Except it isn’t, is it? Not really.
What you’ve now done is ensured that your team will be constantly fixing errors that aren’t really errors at times when they wouldn’t want or need to be doing so.
Don’t make me waste time pretty-printing code that I’m still writing! How annoying is it when you can’t run a test because your comment has an extra line below it? Are you kidding me? [1]
If your team does care about warnings, then, … why do you need to make them errors?
Before handcuffing developers with a setting, think about whether there isn’t a trust problem first. Are you addressing a symptom rather than the cause?
While it’s possible that applying handcuffs is the best possible solution in your case, consider that there are other solutions along a spectrum that goes from “enforcing discipline” to “relying on individual discipline”.
Any feature that’s enforced at all times will end up hampering efficiency and flexibility in some cases, while any feature that’s left up to developers is liable to not be applied consistently.
The job of the person setting up code-style configuration is to thread that needle, tailoring the configuration for the team and solution at hand.
If you have a lot of solutions and teams, then you also get to consider the maintenance overhead of having too many custom configurations. In that case, you might want to make a few standard bundles that group teams and solutions, like “legacy”, “modern”, “junior team”, etc.
You don’t have to name them like that, but the name should give you an idea of how loose or restrictive the settings would be.
I don’t have time for all of that. Let’s just run them on the CI. Warnings as errors in the cloud FTW!
Now you’re allowing team members to push all the way up to the server before they realize that they have errors. Granted, they’re actually warnings, but you can’t merge to master until you fix them, so, yeah, they’re errors. This isn’t less annoying.
But, but, but, what if they're, like, real warnings? Like "possible NullReferenceException" or something like that? That's a good point, sure.
But, in most cases, it’s something more like “extra line found at end of file”, “space missing after parenthesis”, “method can be made private”, “class should be internal”, etc.
There are better—more automated—ways of addressing some of those, which we’ll discuss below.
Also, what if some warnings start appearing in your CI because of a tooling change? That can never happen, though, right? Because you’ve locked down all of your tool versions so that it can never happen? No? You didn’t do that? You’re using “latest”? Why?
The people building the tools are pretty clever, so we want to know what new things they have to tell us about our code.
Oh, right. Because it makes sense. If you lock down your tool versions, you run the very real risk of never finding out whether your build still works with more-modern tools. Years can go by without your changing anything in the build, leaving you stuck with those settings and old tools … until they're obsolete or no longer available on your build server.
It’s better to use “latest” and have an occasional spike of warnings than to just never know where you stand with newer toolchains. Locking down tool versions leads to things like DevOps having to set up on-site build agents with Visual Studio 2010 on them for certain projects.
OK, so we want to use latest tools, but that means that we might also get new warnings. These are a good thing! The people building the tools are pretty clever, so we want to know what new things they have to tell us about our code.
What we don’t want is for those new things to break builds that used to be running just fine.
This usually shows up when someone pushes new commits, runs the CI, and sees that they’re getting errors that they didn’t see locally. WTH? “My code didn’t cause those errors?”
The drawback is that this is (A) annoying and (B) the new errors are very possibly a distraction at this point in time. The person's bug fix may be important, but the new warnings have now bumped themselves to the top of the priority queue!
And what if the person whose build has failed isn’t well-qualified to address these new warnings? Well, then they get to bump the new warnings to the top of someone else’s priority queue! Probably a more senior developer. Fun for all!
What’s the solution then? Well, if you realize that the new warnings appeared because of a tool change, then I suppose you should try to pin the tool version on the CI, with all of the drawbacks outlined above.
That’s assuming that the person to whom this happens is (A) capable of figuring this out and (B) knows how to pin the tool version. And (C) we don’t really like that solution, for the reasons outlined above.
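For completeness, if you do decide to pin: in the .NET world, that's a global.json next to the solution. A minimal sketch; the version number here is hypothetical:

{
  "sdk": {
    "version": "8.0.100",
    "rollForward": "latestPatch"
  }
}

With rollForward set to latestPatch, the agent only picks up servicing releases of that SDK rather than silently jumping to a new feature band with new analyzers and new warnings.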
What about if we think again about what we’re trying to accomplish with “warnings as errors”?
Thinking…🤔🤔🤔…
Each solution should be able to decide what is an error and what is a warning and what is a suggestion. You can’t make “possible null-reference exception” an error in some legacy solutions without completely killing forward progress.
We want warnings to indicate potential problems, but be careful about forcing a solution to address all of them immediately. It’s more realistic to create tasks to slowly eliminate warnings, only switching a setting to an error later, to prevent future transgressions.
If the developer is focused on something, they shouldn’t be forced to switch modes and prioritize formatting. Use gentle, visible hints, unless it’s really, really relevant to what they’re working on.
For example, a possible NullReferenceException is something to be avoided, but is it really an error in all code? It's definitely a warning, but if the developer knows that it doesn't matter right now, then they should be able to ignore it, no?
I mean, they haven't even committed it yet (as far as you know 😉). Maybe they have a breakpoint to see how the heck that variable could be null in the first place and they were just going to bounce the EIP past the crash anyway. YOLO.
Anyway, we want to be really careful about how pushy we are with the IDE configuration. We want to strike a balance between missing actual problems and decreasing efficiency. We don’t want the developer above to have to write a suppression—or, even worse, do some other, ad-hoc short-circuit of inspections—in order to keep working.
Something should fail only on CI as a last resort. That is, a developer must have tools that make it relatively easy to pass CI. This includes being able to see all warnings in the solution, knowing whether those warnings would fail the CI, and having an easy way to apply formatting to all files, if incorrect formatting would fail the build.
We want to avoid a process that leads to half of our commits being called “fix formatting” and “remove warnings”. So, we should consider things like having the IDE auto-reformat files on save.
Inspections should be applied and made visible as quickly as possible, to give the developer the opportunity to produce conforming code from the get-go. The path of least resistance should result in committing code that will also pass CI.
We don’t want to encourage “noisy” commits that “fix up” formatting or other inspection violations. We would rather have a high signal-to-noise ratio in our commits. We want compact, descriptive commits—so we don’t want bug-fix commits to include formatting changes to other parts of the file, if we can avoid it.
Looking at these requirements, we have to conclude that the “warnings as errors” configuration option is an absolute cudgel that we had to use in the old days because we didn’t have fine-tuned control of the inspection-configuration.
Can we do better today, with modern tools?
Absolutely, we can! Most modern IDEs support .editorconfig, which allows fine-tuned configuration of both code style and formatting, especially for languages like C# and TypeScript/JavaScript. The wide variety of JetBrains IntelliJ-based tools use it as well, e.g., PyCharm, WebStorm, and PhpStorm. Visual Studio understands it. Visual Studio Code understands it.
Of course, the devil is in the details, and the degree to which code-inspection configuration carries over from one IDE to another depends very much on the level of standardization for that language and environment. The .NET/C# world has a high degree of standardization, which is very helpful.
EditorConfig allows you to control almost anything you can think of about your code style or formatting. These are called inspections, each of which you can configure with an inspection-specific value and a severity to assign when the inspection is triggered.
For example:
dotnet_style_require_accessibility_modifiers = for_non_interface_members:silent
dotnet_style_prefer_auto_properties = true:silent
The two inspections above should be relatively obvious. In both cases, the preferred setting is configured, but the severity is “silent”, so the IDE doesn’t complain about it.
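If you do want the IDE (and the build) to complain about something, you can escalate individual rules instead of flipping a global switch. A sketch; the rule IDs and severities here are illustrative, not a recommendation:

# An analyzer rule you genuinely care about becomes an error
dotnet_diagnostic.CA2000.severity = error
# Stylistic preferences stay as suggestions
csharp_style_var_for_built_in_types = true:suggestion
# Pure formatting rules are applied when the IDE reformats
csharp_new_line_before_open_brace = all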
What’s the point of configuring a preference and then not showing it to the developer?
Ah, because the developer is not the only one modifying the code.
Excuse me?
Don’t forget that the IDE will auto-format the code when requested. The IDE also writes code when it refactors anything. It needs to know how to format the code that it’s inserting or modifying.
The IDE uses the configuration in the EditorConfig to determine how to format the code. Your tools guy can configure the EditorConfig to conform to the style that the solution / team wants to use. When the code is auto-formatted or refactored, everything should end up looking just the way they wanted it.
If you have a “silent” severity, that means it’s something that you don’t want the team wasting time with during development. However, if no-one ever auto-formats the code, then those inspections will never be applied.
You should consider the process by which your solution will be made to conform with silent inspections in the EditorConfig.
If the inspection severity is suggestion or higher [2], then the developer sees an indicator in the code when the file is open.
Suggestions, warnings, and errors are shown in the build output, as well. Of course, the developer can disable showing warnings and messages (where suggestions appear) in the error-list pane, but you can’t control everything—and you shouldn’t try.
Give your developers the tools and configuration to be efficient and produce good code, but try not to be too pushy about when they do it.
If the inspection severity is silent or none, then the inspection setting is only used by auto-formatting and refactoring tools.
In this case, you'll have to consider when your code will actually be formatted. Do your developers occasionally auto-format files? Do they auto-format on save? Is there a step in the CI that auto-formats everything before compilation? If so, does it commit those changes? Or does the CI reject the push for formatting warnings?
If you have silent inspections, be honest about when they’re going to be applied. If you don’t have a plan, then they will be applied seemingly randomly when someone inadvertently triggers the hotkey for auto-formatting a file [3], which may lead to unpleasant surprises and/or messy commits.
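One concrete way to make that plan explicit, assuming a .NET solution (adapt to your own toolchain), is dotnet format: developers run it locally to apply the EditorConfig rules, and the CI runs it in verification mode so that it reports non-conforming files instead of silently rewriting them.

# Locally: apply formatting and style fixes from the EditorConfig
dotnet format
# On CI: fail if anything would change, without modifying the sources
dotnet format --verify-no-changes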
Let’s clear up the distinction between these two main groups of inspections.
Using var instead of an explicit type can, in very rare cases, lead to code that no longer compiles. By now, many IDE tools are generally clever enough to avoid even suggesting such a change, but it can still happen.
So, we've examined inspections in detail and talked a lot about setting severity to optimize the developer feedback loop, i.e., we don't want to mess with a developer's priority queue unless absolutely necessary.
But aren’t there some things that we might allow a developer to do locally but not allow to pass CI? That’s where the “warnings as errors” setting ensured that the CI never passed, even if the developer forgot to check something locally. For example, it’s important to have consistent formatting before attempting a merge.
There are other ways to encourage and support proper coding practices, though.
Pre-commit hooks can run locally, running global formatting on the code base before a developer can commit. This is kind of touchy, as sometimes developers are just committing a WIP to avoid losing their changes. It would be annoying if you had to clean up your formatting just to commit those.
You could include auto-formatting in the commit hook, but it’s probably better to set up auto-formatting in the IDE.
Instead of a local pre-commit hook, you can configure a pre-receive hook on the server. This hook could cause a push to be rejected if its head commit doesn't conform to certain conditions.
But…isn't that what the CI is for? Well, kind of, but the CI runs only after the commits have landed on the server. It's preferable to have the developer fix commits locally before being able to push, again, to avoid "fix formatting" and "cleanup warnings" commits.
You could choose which branch patterns to run these on.
My recommendation is to lean as heavily as possible on IDE configuration before getting lost in the weeds with commit hooks.
As soon as we start talking about “fixes” for warnings or formatting, we’re talking about “noisy” commits. If we enforce inspections more strictly on CI than we do locally, then there will be more “fixup” commits.
OK, so what do we do about them?
Squash ‘em!
Right? Right?
🫠
Kind of. Look, the PR machinery allows you to merge, rebase, squash-merge, or squash-rebase. That’s OK, but it’s not great. A lot of times, you’ll have four commits that are descriptive and semantically relevant, describing changes that were made, as well as a few commits that address problems that either came up in CI or as part of the review. Don’t you think you should squash those into the four commits and make a clean history instead of just squashing the whole lump into one big hairball?
Or do you think that each PR should have only one commit, equating a branch with a commit (as e.g. plugins like Graphite positively encourage)? I recently wrote PRs suck. Stop trying to fix them., an article that also touches on the workflow outlined below.
You see how tool configuration affects everything? You have to think about how your team builds PRs, how they review PRs, how they repair PRs after review—or whether they even use PRs.
I would encourage a more real-time review culture, where possible.
What’s the problem? Don’t you trust your team members to decide what to do with their own highly ephemeral feature branches?
Allowing force-push encourages team members to care about what the commit history looks like. It gives them a tool that allows them to revise their commit history until it tells a coherent story. See Rebase Considered Essential for a longer discussion on rewriting commit history.
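Git already supports this workflow directly; a sketch, with a hypothetical commit hash:

# Mark a review-feedback commit as belonging to an earlier commit
git commit --fixup=abc1234
# Fold all fixup! commits into their targets before merging
git rebase -i --autosquash main
# Publish the rewritten branch; refuses to push if the remote has moved since you last fetched
git push --force-with-lease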
Phew! So, what have we learned?
If that all sounds like a lot, well—it is. Building clean, maintainable code is a complex undertaking. There are a lot of tools that can help, but you have to put some time into thinking how you want to use them, and then into configuring them so they help you instead of getting in your way.
It’s a delicate balancing act: to give developers the best chance of (A) producing conforming code in the first place and (B) avoiding “noisy” commits, while (C) not hitting them with priority interrupts irrelevant to what they’re working on. There will be tradeoffs.
Once you’ve set up a couple of solutions, you can just copy/paste the configuration to others as a starting point. Remember, though, that solutions are usually pretty unique. Only consider generalizing or packaging a configuration if you’ve considered that,
For these reasons, each solution having its own copy of the configuration is probably better. They can just copy/paste—the horror!—improvements where appropriate. If you’re worrying about configurations drifting out-of-sync, schedule a work item every few sprints that evaluates and possibly re-syncs configurations.
There are always trade-offs. Improving code-quality is an incremental process. So is configuring the tools that support that process. It gets easier with practice. Good luck!
There is a bit of a mismatch between using EditorConfig and the JetBrains-native configuration: JetBrains tools support an additional severity level called "Hint", which is generally shown as a green squiggly line rather than the blue one used for warnings. However, if you set the severity to "hint", Visual Studio interprets it as a warning, showing it as such in both the IDE and in the build output.
On top of that, JetBrains seems to think that the silent option is called none, although it seems to understand silent well enough.
Published by marco on 15. Dec 2023 11:37:03 (GMT-5)
Updated by marco on 15. Dec 2023 11:58:51 (GMT-5)
I read through the article Your GitHub pull request workflow is slowing everyone down (Graphite.Dev) with great interest because I, too, am not thrilled about how PRs work. While I agree with the problems Graphite sees with PRs, I think they miss other problems—and I don't like their solution very much.
“The single most important bottleneck is PR size − large PRs can make code reviews frustrating and ineffective. The average PR on GitHub has 900+ lines of code changes. For speed and quality, PRs should be maintained under 200 lines—with 50 lines being ideal. To put this in perspective, where giant 500+ line PRs take around 9 days to get merged on average, tiny PRs under 100 lines can make it from creation to landing within hours.”
Holy shit! The average is 900 lines? That’s already using the system completely incorrectly. That’s so wild. It absolutely confirms my theory that PRs are a terrible way of committing code. I already thought they were terrible just because of the limited UI and lack of introspection of what the code you’re reviewing actually does.
PRs don’t encourage starting and running the change to verify that it actually works as advertised. You’re not using any of the tools that you use to develop code to review it. How silly is that? If you load changes into an IDE, you can see how many warnings there are, see if the layout shifts when you format the document, etc. Why would you want to review in a completely different environment? As Robin Williams once eloquently put it, It’s like masturbating with an oven mitt. (YouTube).
Not only that, but people probably aren’t looking at individual commits, so they’re just reviewing 900+ lines at once. The fewer people there are looking at individual commits, the fewer people there will be who make good, individual commits. This is a shame because it would counteract the awfulness of reviewing code in the PR web-UI, at least a little bit.
There are far better and more efficient ways of reviewing code than with PR web UIs. Reviewing through a PR web UI should be a fallback that you only use when nothing else is possible.
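With git, you don't even need the web UI to get the changes in front of your real tools. On GitHub, for example, a PR's commits can be fetched into your local clone; the PR number and branch name below are placeholders:

# Fetch PR #123 into a local branch and switch to it
git fetch origin pull/123/head:review/pr-123
git switch review/pr-123

From there, you can build the code, run it, run the tests, and step through the individual commits in your IDE.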
If you're in the same time zone and working on the same schedule as the rest of your team, there is absolutely no reason why you should be using the PR web UI instead of real-time reviews of local commits.
What the current PR machinery does is fool remote, async teams into thinking that they’re reviewing code efficiently. A face-to-face, real-time review will be much more efficient and yield much higher-quality code.
I honestly can’t believe the high pain threshold that some developers have.
If the developer hasn’t pushed yet, then:
If the developer has pushed and is not available for real-time review, then:
Apply your own commits instead of review notes wherever possible.
Yes, you can do this! Why not? You’re both on the same team. It’s a shared code base, not someone’s personal zen garden. Instead of explaining what you would want changed, just make your suggestion in the form of a commit. It’s often more efficient than writing prose.
You can thank me later.
“Problems can easily get hidden between the diffs, and reviewers often make assumptions instead of testing to avoid feeling overwhelmed. One particularly interesting finding is that as the size of a PR increases (by number of files changed), the amount of time reviewers spend on each file decreases significantly (for PRs with 8 or more files changed).”
Obviously! But it’s good to measure—this was my intuition. PRs don’t encourage local testing or verification in an environment similar to that which the original developer used.
“By default, every PR is restricted to only 1 commit of <200 lines, keeping changes tightly scoped. This forces developers to consciously limit work to related changes—the registration endpoint PR can’t sneak in unrelated styling tweaks.”
Yikes! I don’t like the sound of that. So you make multiple PRs rather than one PR with multiple smaller commits? Why don’t you just review commits rather than one giant blob? Do you really need to corral each commit into its own branch and PR to force yourselves to actually make useful commits?
Yeeess? 🧐
“Stacking centers around breaking down big feature work into chains of smaller pull requests. Each PR is typically limited to 1 commit focused on an isolated change. This restriction guides developers to consciously make only a single change, squashing and rebasing along the way, instead of cluttering the PR with random unnecessary commits like “typo fixes”.”
This is yet another technique invented to accommodate teams that don’t trust each other, or that contain people who, if they can’t be trained to do better—or don’t understand what better is—probably shouldn’t be programming yet. Instead of teaching team members how to use their tools, they impose an arbitrary rule. What a kindergarten.
“Unlike Git workflows, where it is easy to neglect staying updated, Graphite centers your workflow around continually integrating with the current mainline state.”
Yikes! I don't love the sound of that, either. Doesn't that force you to spend more time on integration that you might otherwise have spent working? I understand that you don't want long-lived branches, but now you're just shooting to the other extreme, forcing integration on every pull.
It’s not bad as long as the integrations are automatic, but might not be appropriate for developers who aren’t great at resolving merge conflicts. Even if they know how to deal with them well, might they not waste time resolving conflicts integrating a version of their code that wasn’t at all ready to be integrated?
I understand that this feature follows from the logic of “if you integrate more often, then integration is easier,” but, again, you’re taking agency out of developers’ hands, implicitly not trusting your team members. I don’t like it.
If you have several stacked commits, I wonder how much shuffling there is in the working tree (causing unwanted IDE reloads) during the integration cascade. Are they somehow integrating without touching the working tree? I don’t know that that’s possible.
Go ahead and work on the main branch if you want—I do it all the time—but this should be more of a choice than it sounds like it is.
“This command will add your changes and create a new branch in one motion. You can then continue iterating by creating and stacking additional branches:”
Ah, I see now. They’ve reinvented Mercurial’s patch queues. Everything old is new again.
A really bright and good friend of mine added an extension to Mercurial's mq decades ago that sounds like it works the same. I remember discussing the technique with him as he was developing it.
I’m a bit worried about two things:
“By cleaning up your PR commit history, you ensure a clear and concise main branch history that makes it easy to see exactly what’s changed over time.”
By enforcing one commit per branch, you dumb everything down.
It does seem that, instead of acknowledging that PR supremacy is stupid, Graphite doubles down, strips branches of most of their functionality by equating them to commits, and uses multiple PRs to force people to review by commit. It seems like a waste.
But, hey, maybe I need to actually try it. I might be missing something.
Still, instead of adding another tool, I think you should use git better.
Published by marco on 30. Nov 2023 21:23:21 (GMT-5)
Updated by marco on 30. Nov 2023 21:43:00 (GMT-5)
The article Some notes on Local-First Development by Kyle Matthews (Bricolage) describes a very good trend in app development, but focuses a bit too much on what he calls DX, or developer experience.
"I see "local-first" as shifting reads and writes to an embedded database in each client via "sync engines" that facilitate data exchange between clients and servers. […] The benefits are multiple:"
- Simplified state management for developers.
- Built-in support for real-time sync, offline usage, and multiplayer collaborative features.
- Faster (60 FPS) CRUD
- More robust applications for end-users.
I don’t want to read too much into it, but he did mention end-users only in the last bullet point.
I think the author is focusing too much on the tech and too little on the value. DX is great and all, but it's about the UX, no? Every app would benefit from realtime updates if they're cheap and easy to build. Almost every app is multiplayer, if you think about it a bit.
“For almost any real-time use case, I’d choose replicated data structures over raw web sockets as they give you a much simpler DX and robust guarantees that clients will get updates.”
No, my friend. You’ve come to the right conclusion for the wrong reason.
If the tech is solid, if it doesn’t negatively influence debuggability or traceability, if it’s predictable, if operations can be correlated, if you don’t end up limiting your functionality to fit the framework—then go for it.
What I mean is that it’s important that the thought process that leads to the correct conclusion serves all stakeholders. If you’re only doing things because they’re better for developers, then, eventually, you’re going to be deciding against the users.
Be aware of the trade-offs, and be sure all of the stakeholders can live with them. What does good DX translate to for other stakeholders? Easier maintenance? Less complexity? Easier onboarding? The DX is really mostly secondary unless you’re making a framework, in which case it might matter. No-one cares about DX for real-world products. I love good DX, but I’m a developer! As a developer with a lot of experience, I’m forced to admit that it’s not at all a primary goal. Having good DX might lead to other desirable things, but that doesn’t make it directly desirable. Don’t forget that.
Published by marco on 8. Nov 2023 21:50:04 (GMT-5)
This is a brilliant interview, in that Oren Eini just talks for about 40 minutes, answering pretty much just one or two questions.
At one point (I forget where), he says,
“I don’t like unit tests.”
Agreed. I like, no, love automated tests. They're indispensable. But I think unit tests are only useful when you want to focus on a failing integration test. David rightly points out that they're really good for pinpointing where a problem actually happens, but Eini says that they also "hinder change" because, by their nature, they lock down a lot of the design and implementation. This is absolutely true.
Just to be clear: I think of anything that’s not a unit test as an integration test. I generally like “smaller” integration tests.
It’s probably better to just be agile about it and write them when the situation requires it, i.e., when the cause behind a failing integration test is proving difficult to pin down—or when you’ve determined the cause and you want a direct proof that you’ve fixed the underlying problem.
It requires discipline to realize when you need to write more unit tests in order to help pinpoint which component involved in a failing integration test is causing the problem. If you preemptively write all of the unit tests, you're wasting time that could be better spent elsewhere.
I have had no small amount of success with a large test suite that was mostly integration tests. It ran relatively quickly (10 minutes for 10,000 tests on a reasonably specced developer desktop) and helped me survive three major refactorings.
Published by marco on 24. Oct 2023 22:39:45 (GMT-5)
The following video is a talk by Robert Martin ("Uncle Bob"), one of the graybeards worth listening to. This video from 2011 is wide-ranging and contains a lot of brilliant advice. It's stuff that we've known for a long time now, but every generation of programmers needs to re-learn these things about every 5-10 years. You usually can't stop people from just reinventing the wheel because who wants to watch videos of or read blog posts written by old dudes, amirite?
At 10:00, he talks about how the top-level architecture of most applications reflects the framework used to implement the web-delivery mechanism rather than the purpose of the application itself. In his example, he shows how a Ruby-on-Rails application is immediately recognizable as such, but that you have literally no idea what the application does.
He urges us to consider what this implies about our priorities as architects and developers. It means that we are much more concerned with the technology than with the functionality. This is not good.
He contrasts it with a high-level, 2-D blueprint of the first floor of a church, where the intent is obvious: it's a church (he says). Of course, inferring that it's a church involves applying the appearance of the diagram to a given context—e.g., a very western one—but the point is clear: the standard, top-level view of the design of a church screams out that it's a church. It says nothing about how the church is to be built—or has been built—it says what it is.
“Architecture is about intent.”
Just to be clear: this presentation is from 12 years ago, and we’re still confronted with the same concepts—still confronted with the same failure to remember these precepts. Our frameworks still push themselves to the fore.
This is, in a way, the problem with LLM-generated code: we are already terrible at expressing the intent of our software in a way that makes it maintainable and qualitative. We are already mostly terrible at designing and building things in a way that satisfies the nearly-always-implicit non-functional requirements, like maintainability, usability, performance, etc.
And now we're taking another piece of software, whose workings we can't yet fathom, but which we know we've built by feeding it all of these terrible versions of software, and asking it to write software for us. All of the theory that we've developed about how to build software will not be respected, except by luck, if the neural net feels like that's a high-probability next token.
On the one hand, I have to admit that this doesn’t sound much different from how software is built today, except that the human builders are potentially capable of following rules, whereas the software-based builders are less trainable. Again, though, we have decades of experience showing that, while people are ostensibly trainable, they are not necessarily practically trainable, at least in the general case for the general type of person who takes part in this field of endeavor we call programming.
Which leaves us with the question: have we achieved the maximum potential in software development? We already knew everything we needed to know about how to do it decades ago. What is missing is the will to do it that way. It’s definitely possible to train people to do it that way. The hangup is, as always, the cost, specifically, the cost-benefit ratio. The perceived benefit of better software is usually far less than the perceived (initial) cost.
And we always perceive only the initial cost because we are super-bad at long-term thinking about complex problems like building software.
At 34:00, Uncle Bob says
"There's gotta be some better way to do this. […] This is just 3270 programming poisoned with all sorts of crud. How many languages do you have to know to write a web application? Well, there's some programming language, but that's incidental! You've gotta know HTML and CSS and JS and Zazzle and Dazzle and … and, you know, the guy over here's going: 'let's build communities by leveling people up. Leveling them up! I mean, what we're going to do is hand them a … OK, now, hold this hammer. Ok? Good. You got that hammer? Now, here's another one. Hold that hammer too. Now I've got a big barrel you've got to hold on your head.' We are not helping our cause with this truly terrible mechanism that we have adopted."
At 41:00, he says
“The database is a detail.”
This reminds me of The UI is an afterthought, a detail, an article I wrote recently [1] about a 7-year-old video I watched that expressed the same sentiments about external systems that Martin is expressing in his 12-year-old video.
“That’s what architecture is: find some place to draw a line and then make sure every dependency that crosses that line goes in the same direction.”
At 55:45, he says,
“There’s an interesting case of the database—the thing that’s so incredibly important—and yet, we took that decision and we just deferred it off the end of the world and then, when somebody needed it, we shimmed it in in a day. Because our architecture had done something right. What is the hallmark of a really good architecture? A good architecture allows major decisions to be deferred.”
“A good architecture maximizes the number of decisions not made.”
At 1:00:50, he says,
“How do you keep the beast under control? You need a suite of tests you trust with your life. You must never look at that suite of tests and think ‘you know? I don’t think I really tested everything?’ As soon as you think that, you’ve lost it. Because now you’re afraid of your code. The reason we write our tests first is so that we know, that every single line of code we wrote was because of a failing test that we wrote. So that we know that every single decision that we made is tested. So that then, we can pull up that code on our screen and say ‘Oh my God, that looks like a mess’—and clean it!…without any fear.”
Great talk. Add it to the pile of things that we know—or should know—better, but don’t.
Published by marco on 11. Oct 2023 21:15:00 (GMT-5)
Updated by marco on 6. Mar 2024 07:35:06 (GMT-5)
As I was reading the absolute train wreck of a unit test in Testing with a Lisp (Daily WTF), the song “What the fuck is going on?” popped into my head, like it always does when I see that a programmer not only didn’t understand the assignment, not only doesn’t know how to program, but also doesn’t know that they don’t know how to program.
They are living their best life because they don’t think that “knowing how to program” is required in order to be a programmer. Neither does their boss or team, I guess.
That’s when the music starts to play in my head, and I think of little blind Dillon playing football because a very non-PC friend [1] sent me that video so many years ago.
Am I going to link the video? Of course I am. Because I’m a terrible person. [2]
And this is the test from the article above.
test("Returned objects arguments immutable (a b)", function() {
    var result = lispParser("(a b)");
    expect(3);
    ok(typeof(result) === 'object', "result is an object");
    var children = result.arguments;
    var newValue = 2;
    var firstChild = children[0];
    if (children[0] == newValue) {
        firstChild = ++newValue;
    }
    notEqual(result.arguments[0], newValue, "Underlying array was immutable");
    equal(result.arguments[0], firstChild, "Underlying array was immutable");
});
Nothing about that test makes any sense. It will always pass. It is, in its own way, a work of art. It is the JavaScript equivalent of Chomsky’s Colorless green ideas sleep furiously (Wikipedia), an example of a sentence that is “grammatically well-formed, but semantically nonsensical”.
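For contrast, here's one plausible reading of what the test's author might have been aiming for: mutate the returned structure and verify that nothing observable changes. This is only a sketch; I'm guessing at lispParser's contract from the original test:

test("Returned object's arguments are immutable (a b)", function() {
    var result = lispParser("(a b)");
    expect(2);
    var original = result.arguments[0];
    // Attempt to mutate the returned structure; a frozen array either ignores or rejects the write.
    try {
        result.arguments[0] = "mutated";
    } catch (e) {
        // Ignore: rejection of the write is exactly what we're hoping for.
    }
    // Verify that neither the returned object nor a fresh parse was affected.
    equal(result.arguments[0], original, "returned arguments cannot be modified");
    equal(lispParser("(a b)").arguments[0], original, "parser state is unaffected");
});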
Honestly, this looks worse than anything I’ve seen my students try to write. They usually have enough shame that they don’t bother filling in an answer if they really have no idea what’s going on.
I’m also wondering, of course, whether this is the work of an AI—or the bastard child of a poseur-programmer and an AI. The future is bright.
Published by marco on 5. Oct 2023 13:48:10 (GMT-5)
Updated by marco on 10. Oct 2023 06:22:28 (GMT-5)
I was recently asked something like the following question, which I am citing with a few minor edits.
We would like to do a course about SW development with Python, preferably an online course, so that we can start at our own pace.
We don’t want a Python course, but would instead like a course more about SW development. It would be great if it were in Python because we are comfortable with it.
The interesting topics would be:
- object-oriented programming
- functional programming
- design patterns
- good coding practices
As well as other important topics such as:
- Testing
- Documenting
- Version control
- Working in a team with version control
The course doesn’t have to contain all these topics. It can be also several courses or it can be toy-projects from somewhere.
I have very little familiarity with courses as I’ve usually been tasked with figuring out how to do things before others have gotten to it. Of late, I’ve been teaching courses, not taking them.
So, how did I learn what I know about software development? When I started writing software, there was nothing available online, outside of a bunch of GeoCities pages (one of which was mine). MSDN was on CDs or local help files.
I read some books, OOSC and OOSC2, as well as the Gang of Four’s Design Patterns. I can’t remember what else, but that’s partly how I leveled up my skills. I had the great fortune of being able to build and work on large frameworks, from which I drew many lessons. I worked with very good people, who challenged me and taught me a lot.
Nowadays, I use DuckDuckGo as my online reference. I have developed a relatively advanced skill at searching for what I’m looking for. I very often get it within minutes. I almost never use videos.
A primary skill in software development is to be able to imagine what you should be looking for. That is, you don't have to know how to do everything without looking it up, but you do have to imagine that it might exist.
For example, I don’t know how to write automated tests in Python, but I know that it should be possible. I know that I should figure that out very early in my experiments with Python. I know what to expect from an automated-testing environment. I know which settings to look for and expect.
That kind of knowledge transfers from one language or development environment to another. I know that code-completion makes me faster, and I know that I would like to avoid runtime errors—how can I best use Python to achieve those ends?
I took a quick look around for online courses, but was not immediately convinced that I am equipped to be able to distinguish between scams and actually worthwhile courses. Does the course even mention general software-development principles? How much time is allocated to that?
The Complete Software Engineering Course with Python (Udemy) looks as follows:
What about general programming?
Just over nine minutes? And you can’t even be bothered to describe it in something approaching well-written English? No, thanks.
The course Learning To Program − Part 2: Abstractions (PluralSight) looks a bit more professional, but it still has some quirks (especially for $29 per month).
There is an assessment that you can take, but you have to sign up first.
Maybe PluralSight is able to tell you which courses you need, but I doubt it will err on the “you need fewer courses” side.
I’ve recently heard from a source I’ve been watching for a while that this course is quite good for C# developers: From Zero to Hero: Test-Driven Development in C# by Guilherme Ferreira. The person recommending it releases quite interesting/advanced videos on YouTube and has his own range of courses at DomeTrain.
How would I teach basic software-development principles? I would probably start with very abstract principles that try to answer the classic questions for “use cases”:
A question people tend to start with is: which programming language should I use?
That’s the wrong question.
The applicability of programming languages to various fields differs widely, but most languages have a large overlap in functionality. Where they differ is in the degree of runtime or library support for specific tasks.
For example, Python famously has a lot of libraries for number-crunching and data-analysis (although I feel that this advantage is grossly exaggerated) whereas it’s terrible for writing Windows GUI applications. C#/.NET has excellent web and desktop technology support. The Python runtime is notoriously slow (with essential libraries written in C++) whereas .NET is known as a very performant cross-platform runtime.
Do you see how quickly the conversation turns from “what can the language do?” to “what can the standard runtime/libraries/environment do?” That’s because you can do most tasks with most languages.
Instead, we want to think about this at a higher level. We want to,
Programming languages exist on several spectra. One of these is “the degree of developer discipline required to use the language effectively and safely.”
What does that mean? For example, Python and JavaScript have a dynamic type system. There are mechanisms, practices, and IDE support that you can use to set up guardrails missing from the language, but they are optional, and idiomatically written code in both of these languages tends not to use any of them. It's the wild west, for the most part, with a lot of assumptions that nothing will ever go wrong.
More strict languages force you to consider all possibilities before your program even compiles or runs. For example, Haskell and Rust are famously picky. If you have a function that returns a value under certain conditions, those languages will make you explicitly indicate what to return when those conditions don't hold. Forgiving languages will just use some default value, usually null or undefined.
This is called “happy path” programming because you only write the code for the hoped-for path through your use case. For example, the user selects a valid file with the expected data format with an acceptable length with no validation or processing errors, generating a data file to which the initiating user has access.
Writing programs in this fashion is a dangerous thing to do with a strict language, and it’s even worse to do in a lax language.
Even the simplest software has many, many branches. The less your language or compiler or IDE reminds you of them, the more you have to fill that gap with developer discipline.
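To illustrate, here's a minimal Swift sketch with hypothetical names: the first version only handles the happy path and simply crashes if the file is missing or a line isn't a number; the second is forced to say what happens when things go wrong.

import Foundation

// Happy path: every step is assumed to succeed; a missing file or a
// malformed line crashes at runtime.
func totalHappyPath(path: String) -> Int
{
    let text = try! String(contentsOfFile: path, encoding: .utf8)

    return text.split(separator: "\n").map { Int($0)! }.reduce(0, +)
}

// Explicit version: the signature admits that this can fail, and the
// caller has to decide what to do about it.
func total(path: String) -> Int?
{
    guard let text = try? String(contentsOfFile: path, encoding: .utf8) else { return nil }

    var sum = 0

    for line in text.split(separator: "\n")
    {
        guard let value = Int(line) else { return nil }

        sum += value
    }

    return sum
}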
To get more concrete, some good questions to consider are:
If these don’t make any sense to you, don’t worry. But they are questions that are important when you’re choosing a tool for building software.
The whole point of a programming language is to express intent. You indicate what you intend to happen when a given event occurs.
A programmer expresses an intent by writing that "when this thing happens, I intend for this other thing to happen."
For example, a developer might express the intent "when the user taps Save, persist the document" without saying anything about how persistence actually works.
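A minimal Swift sketch of that intent; all of the names here are invented:

protocol DocumentStore
{
    func save(_ document: Document)
}

struct Document
{
    let text: String
}

class DocumentScreen
{
    let store: DocumentStore
    let document: Document

    init(store: DocumentStore, document: Document)
    {
        self.store = store
        self.document = document
    }

    // "When the save button is tapped, I intend for the document to be persisted."
    // Nothing here says how or where the document is stored.
    func saveButtonTapped()
    {
        store.save(document)
    }
}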
How do we choose a programming language? You’re not just choosing a programming language, you’re also implicitly deciding which subset of language features to use. This is predicated, of course, on knowing about these features. It’s best to inform yourself about what your language/libraries/runtime (let’s call it a software-development tool) can do for you—or find someone who is well-informed to help.
For each feature, you should ask yourself: how useful is it? Does it help me achieve my task?
Let’s take a look at high-level features of a software-development tool that may be important.
For code designed to be reusable (libraries, frameworks), you can also consider:
Which of the features above matters more depends on what you’re building. A one-off script doesn’t need to satisfy many of these features. A full-blown application that needs to be maintained for 10-20 years by different teams has to be much, much more careful.
This isn’t the first time I’ve written about these ideas, so I’ve included links to other, similar articles below.
These articles discuss the topic of software-development on a similar level to the discussion above.
The articles below are more recent, are more-or-less on the same level, but are also more targeted.
These white papers were written from 2006 to 2019 when I was still employed at Encodo Systems AG. They expand on recommended practices of specific facets of software development. They are presented in reverse-chronological order, but can be read in any order.
This is a YouTube playlist I’ve maintained for years that I continuously update whenever I watch a video that I think would be interesting for other developers. It’s only technology videos, but it’s pretty eclectic (i.e., it’s language- and technology-agnostic).
Developer suggestions (YouTube)
Pace yourself. You can’t have everything all at once. Programming takes wisdom. Wisdom takes time. It takes practice. It comes, or it doesn’t. It takes different forms.
As Rainer Maria Rilke wrote in 1903 [2],
"Forschen Sie jetzt nicht nach den Antworten, die Ihnen nicht gegeben werden können, weil Sie sie nicht leben könnten. Und es handelt sich darum, alles zu leben. Leben Sie jetzt die Fragen. Vielleicht leben Sie dann allmählich, ohne es zu merken, eines fernen Tages in die Antwort hinein."
("Do not search now for the answers, which cannot be given to you because you would not be able to live them. The point is to live everything. Live the questions now. Perhaps, then, gradually and without noticing it, you will one distant day live your way into the answer.")
Good luck.
Published by marco on 4. Oct 2023 21:54:06 (GMT-5)
Note: I found this old draft containing my response to a colleague.
I 100% agree with you, in general. I absolutely want to know immediately when an assumption I’ve made does not hold.
But…😁
The degree to which I'm willing to crash depends on whose consistency I'm basing my assumptions on. When I call a method in my code from another method in my code, I'm absolutely going to assert that an argument is not null. I can control that. My IDE will tell me when I might be passing null. That is definitely a programming error.
When I’m getting external input (e.g. from the Windows registry), I’m a bit more cautious because I’m less sure about how solid my assumption is. I know what the documentation says but a lifetime of programming has taught me that some things (like the Windows registry) are going to work exactly as expected on my (modern) developer machine, but are going to fail mysteriously on a (perhaps less modern) machine in (for me) completely unpredictable ways.
Therefore, I'm a bit careful about what I'm willing to pay to find errors. The primary purpose of a program is to bring value to the customer/user. I want to improve my program for more situations, but how am I going to find out in which situations it doesn't work?
I can test, of course, but some things will only ever happen in the field. If it happens in the field, then I’m using the customer’s/user’s time to help me fix my program (they benefit, of course, but not for free). Can I soften the blow to the user of having to help me improve the program without sacrificing consistency or accuracy?
Sometimes, the answer is a resounding no. The program absolutely cannot continue if, e.g., the reference to the data it needs to work on is null. That's a no-go. There's no rescuing the program from that or completing any other useful work.
In the case of this tool, if it crashes, the user no longer gets a report. Would they have been able to get some of the report if it hadn’t crashed? In this case, yes. All of the other checks could be run. The checks that crashed would show as “failed” with the exception message. That seems to me to be better than skipping all subsequent checks when one crashes.
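A minimal sketch of what I mean, in Swift with invented names: each check gets its own do/catch, so one crash turns into one "failed" line in the report instead of no report at all.

struct CheckResult
{
    let name: String
    let outcome: String
}

func runChecks(_ checks: [(name: String, run: () throws -> Void)]) -> [CheckResult]
{
    var results: [CheckResult] = []

    for check in checks
    {
        do
        {
            try check.run()
            results.append(CheckResult(name: check.name, outcome: "passed"))
        }
        catch
        {
            // The failing check is reported with its error; the remaining checks still run.
            results.append(CheckResult(name: check.name, outcome: "failed: \(error)"))
        }
    }

    return results
}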
I can even continue to hope that the user then reports the mysterious error message they got for one of the reports! Die Hoffnung stirbt zuletzt! (Hope dies last!)
I’m delighted to discuss programming and error-handling philosophy in person next week!
Published by marco on 4. Oct 2023 21:36:27 (GMT-5)
This article is a copy of the white papers and process description that I wrote for Encodo Systems while I still worked there. I’ve preserved a copy of it here and in the linked articles.
Through our many years of experience building software, we’ve accumulated methodologies and principles that lead to quality software.
Listed below are our methodologies.
Published by marco on 4. Oct 2023 21:36:20 (GMT-5)
Encodo keeps the SOLID principles in mind when designing software.
We implement the Inversion of Control [I] pattern with the dependency-injection pattern (D) to allow for a large amount of flexibility in how an application is composed. We’ve applied this principle throughout the Quino framework and use it in our products as well.
What does this mean? It means that the product or framework doesn’t make any decisions about which exact components to use. Instead, it indicates the API Surface (interface) that it expects in the form of injected components. That is, the responsibility for deciding which component to use lies not with the lowest level of the software stack, but with the highest level.
This inversion means that the application entry point configures the object graph (i.e. which objects will be used). That makes it much easier to isolate and test individual components, especially where those components would depend on native- or web-only functionality in production.
See the How do I DI? presentation from February 2018 for more information.
An application is a graph of components, each with one responsibility (S) and zero or more dependencies, injected via the constructor. Components are composed with other components to build higher-level functionality (O). They are also unaware of the other components’ implementations and can be replaced with other implementations (L).
Components make software flexible:
Components have a very clear purpose (S) indicated through an interface. In most cases, we use an actual “interface” language construct to clearly define the API surface and to not limit a product in its implementation (e.g. with an abstract base class).
Most components have a single method, amounting to a functional interface and allowing composition with lambdas. While TypeScript has this feature (as does Java), C# does not. We end up defining a lot of single-method classes that implement a single interface. It’s more code than we’d like, but it’s purely structural syntax and doesn’t introduce additional complexity.
See the Interfaces, base classes and virtual methods in the Quino conceptual documentation for more information on, and examples of, the patterns that we use.
Although it’s possible for applications to manually create an object graph (the composition root), we prefer to use an IOC Container.
The container provides two services:
The container introduces the following restriction:
The lifetime of an application is as follows:
See the Quino Application Configuration for more information about application lifecycle. The blog article Starting up an application, in detail is a bit older, but provides more detail on how Quino integrates the IOC into the startup.
In the long example below, we will first look at how composition even without a container is very powerful. Then we’ll look at how a container can improve on that.
Although we generally use C# or TypeScript in our work, these examples were originally written to introduce Swift developers to an iOS framework that we wrote.
Let’s take a look at an example of an application that looks OK at first, but turns out not to be very flexible.
Note: The example is small, so some of the steps will feel like over-engineering. It’s a good point, but the principles shown here apply just as well for larger systems.
The following example defines a simulator that can move a robot along a route, defined by movements. The robot starts at a given location and can travel at a fixed speed.
enum Direction
{
    case north
    case south
    case east
    case west
}

struct Movement
{
    let direction: Direction
    let distance: Int
}

struct Point
{
    var x: Int
    var y: Int
}

class FastRobot
{
    var speed = 2
    var location: Point = Point(x: 0, y: 0)
    let movements: [Movement] = [Movement(direction: .north, distance: 1)]

    func move()
    {
        for movement in movements
        {
            let distance = speed * movement.distance

            switch (movement.direction)
            {
            case .north:
                location.y += distance
            case .south:
                location.y -= distance
            case .east:
                location.x += distance
            case .west:
                location.x -= distance
            }
        }
    }
}

class Simulator
{
    func run()
    {
        FastRobot().move()
    }
}
As mentioned above, this implementation looks well-written, but what if we wanted to verify that the robot ended up at the right location? Let’s try that below.
Simulator().run()
// Now what?
It turns out that we can’t test anything in this application. We can fix this by applying the patterns outlined in the first section.
First, let’s tackle the Simulator interface:
class Simulator
{
    func run(robot: FastRobot)
    {
        robot.move()
    }
}
let robot = FastRobot()
Simulator().run(robot: robot)
XCTAssertEqual(robot.location.x, 0)
XCTAssertEqual(robot.location.y, 2)
Now we can test that the robot is working as expected.
The robot is still quite hard-coded, as is the simulator's relationship to the robot. The robot must be a FastRobot and it can only move along a fixed route.
We’ll first decouple the Simulator from a direct dependence on the FastRobot.
protocol IRobot
{
    func move()
}

class FastRobot : IRobot
{
    // As above
}

class Simulator
{
    func run(robot: IRobot)
    {
        robot.move()
    }
}
Now the simulator only knows about the protocol IRobot, which has a very small surface area. It's still too small to be very useful.
Instead of hard-coding everything, we can compose the robot out of parts. Examining the algorithm, we see three parts that could be externalized: the speed, the starting location, and the route (the list of movements).
Let's first externalize all of the hard-coded values out of the FastRobot into a generic Robot class.
class Robot : IRobot
{
    let speed: Int
    var location: Point
    let movements: [Movement]

    init(speed: Int, location: Point, movements: [Movement])
    {
        self.speed = speed
        self.location = location
        self.movements = movements
    }

    func move()
    {
        for movement in movements
        {
            let distance = speed * movement.distance

            switch (movement.direction)
            {
            case .north:
                location.y += distance
            case .south:
                location.y -= distance
            case .east:
                location.x += distance
            case .west:
                location.x -= distance
            }
        }
    }
}
Now we can create a Robot, injecting all of the initial conditions.
let origin = Point(x: 0, y: 0)
let route = [Movement(direction: .north, distance: 1)]
let robot = Robot(speed: 2, location: origin, movements: route)
Simulator().run(robot: robot)
XCTAssertEqual(robot.location.x, 0)
XCTAssertEqual(robot.location.y, 2)
The same assertions hold as before, but the Robot class is much more generalized. We can now test the robot's movement algorithm with various combinations of origin, speed and route.
At this point, we’ve made the robot and simulator composable and testable. Now we want to have a look at how we can separate the configuration from the usage.
We’re not nearly done, though. What does this all have to do with a service provider? That’s where the inversion part comes in.
In the very first example, the Simulator was responsible for creating the robot. This made it impossible to test whether the robot did what it was supposed to do.
So we passed the robot in as a parameter to run(), making the caller responsible for creating the robot instead of the Simulator.
This is fine, as long as the caller is the top-level part of the program, responsible for composing the objects that will be used. However, what if the direct caller doesn’t know how to do that? Or, put another way, what if the caller should not be doing that?
What if the caller is a button handler in a UI? Would we want the button handler—or the UI that contains it—to be responsible for constructing the robot or its initial conditions?
This is where the container comes in: we want to register all of the types and instances that we want to use in one place. This configuration can be retrieved at any later point without knowing any more than the interface that’s required.
This takes us full circle to the original code, except, instead of creating the Simulator directly, we want to get it from a container, called a provider in the following examples.
let simulator = provider.resolve(ISimulator.self)
simulator.run()
let robot = provider.resolve(IRobot.self)
XCTAssertEqual(robot.location.x, 0)
XCTAssertEqual(robot.location.y, 2)
Note: For reasons of simplicity, we assume that all objects in the container are singletons.
Let’s take the configurable code above and translate it to a container. Here the registrar is the configurable part and the provider is the part that can be used to retrieve objects based on that configuration. The registrar is sometimes called the composition root.
Note: We use the syntax of our own Swift IOC framework, but the examples are hopefully clear enough in their intent.
In the example below, we register singletons for each of the objects we want the container to be able to create: Point, Int, [Movement], IRobot and Simulator.
let registrar = ServiceRegistrar()
.registerSingle(Int.self) { _ in 2 }
.registerSingle(Point.self) { _ in Point(x: 0, y: 0) }
.registerSingle([Movement].self) { _ in [Movement(direction: .north, distance: 1)] }
.registerSingle(IRobot.self) { p in Robot(
speed: p.resolve(Int.self),
location: p.resolve(Point.self),
movements: p.resolve([Movement].self)
)}
.registerSingle(Simulator.self) { p in Simulator(p.resolve(IRobot.self)) }
This is a decent start, but many of the registrations above have no semantic meaning, like Int, Point and [Movement]. For these, it’s better to use higher-level abstractions.
We need to define three abstractions—called IOrigin, IRoute and IEngine—with implementations. The IRobot interface also needs to be redesigned to use them.
protocol IRoute
{
var movements: [Movement] { get }
}
protocol IOrigin
{
var point: Point { get }
}
protocol IEngine
{
var speed: Int { get }
}
protocol ISimulator
{
func run()
}
class Simulator : ISimulator
{
var robot: IRobot
init (_ robot: IRobot)
{
self.robot = robot
}
func run()
{
robot.move()
}
}
struct StandardRoute : IRoute
{
var movements: [Movement] = [Movement(direction: .north, distance: 1)]
}
struct StandardOrigin: IOrigin
{
var point: Point = Point(x: 0, y: 0)
}
struct FastEngine : IEngine
{
var speed: Int = 2
}
class Robot : IRobot
{
var location: Point
let engine: IEngine
let route: IRoute
init(_ engine: IEngine, _ origin: IOrigin, _ route: IRoute)
{
self.engine = engine
self.route = route
location = origin.point
}
func move()
{
for movement in route.movements
{
let distance = engine.speed * movement.distance
switch (movement.direction)
{
case .north:
location.y += distance
case .south:
location.y -= distance
case .east:
location.x += distance
case .west:
location.x -= distance
}
}
}
}
We’ve created concrete objects for our standard parameters. An added bonus of the improved semantics is that we can rewrite the init for Robot so that it no longer expects argument labels—because the parameters are now clear without further explanation.
Now we can take another crack at the configuration using these new types. This time, we’ll define an extension of the IServiceRegistrar that we can use again below.
extension IServiceRegistrar
{
func useSimulator() -> IServiceRegistrar
{
return self
.registerSingle(IEngine.self) { _ in FastEngine() }
.registerSingle(IOrigin.self) { _ in StandardOrigin() }
.registerSingle(IRoute.self) { _ in StandardRoute() }
.registerSingle(IRobot.self) { p in Robot(
p.resolve(IEngine.self),
p.resolve(IOrigin.self),
p.resolve(IRoute.self)
)}
.registerSingle(ISimulator.self) { p in Simulator(p.resolve(IRobot.self)) }
}
}
We’ve now configured a system that knows how to create our simulator along with all of its dependencies. You can see that if the ISimulator type is resolved from the container, it will create the Simulator, which requires an IRobot, which in turn requires an IEngine, an IOrigin and an IRoute.
An application can now change the speed of the robot without knowing anything else about the simulator, simply by changing the IEngine that’s used.
class SlowEngine : IEngine
{
var speed: Int = 1
}
let provider = ServiceRegistrar()
.useSimulator()
.registerSingle(IEngine.self) { _ in SlowEngine() }
.commit()
As well, any location in the application can use either the IRobot or the ISimulator without having to know anything about how either of the concrete objects is constructed. The simulator might be much more complicated than the very simple one defined above. The robot might do much more when asked to move.
What if we wanted to let the robot decide how fast it is, depending on what kind of robot it is? Or what if we want to separate the speed from being fixed in the IEngine?
What we need is a way to create transient objects that require parameters that are not available in the provider. These are types like Int, String, etc., as we had in Step Six above.
The example below shows a very simple usage of the factory pattern. Instead of having a single IEngine for the whole application, we want to provide settings that the robot uses to get its engine.
The code below sketches the new types and shows how the robot would use them.
protocol IEngineFactory
{
func createEngine(speed: Int) -> IEngine
}
protocol IRobotSettings
{
var speed: Int { get set }
}
class Robot : IRobot
{
init(_ engineFactory: IEngineFactory, _ settings: IRobotSettings, _ origin: IOrigin, _ route: IRoute)
{
self.engine = engineFactory.createEngine(speed: settings.speed)
// …
}
}
You’ll note that we didn’t declare any new properties. The robot still just has an engine, but asks the factory to create it based on a speed, rather than having the provider inject its singleton.
The robot’s speed can now be configured without replacing the entire implementation.
let settings = provider.resolve(IRobotSettings.self)
settings.speed = 10
let simulator = provider.resolve(ISimulator.self)
let robot = provider.resolve(IRobot.self)
simulator.run()
XCTAssertEqual(robot.location.x, 0)
XCTAssertEqual(robot.location.y, 10)
Published by marco on 4. Oct 2023 21:36:13 (GMT-5)
These are the two core principles that guide how we write code: keep the code as simple as possible, and build only what you actually need.
This first principle is a constant reminder to ourselves to avoid the seductive call of cleverness. Most code does not need to be clever. Very occasionally, it is necessary to implement something with real flair that requires explanation.
The best code, though, requires no explanation. The best code gets its job done in a very boring way, using the same patterns to achieve different ends. The best code is instantly recognizable to those who know the patterns. The best code doesn’t raise any questions. The best code doesn’t need comments. The best code is obvious and, yet, does amazing things—like fulfill requirements in a stable, predictable, testable, customizable and high-performance manner.
It’s kind of obvious: The lower the complexity, the easier it is to reason about systems. The easier it is to reason about a system, the easier it is to prove that either certain things can’t happen or will always happen. It should be obvious where to add a customization—because there’s only one place that it could logically go. It should be obvious where a bug lies—because there’s only one place it could have originated.
The best code is readable and understandable not only by the original programmer, but also by another programmer—even if that’s the original programmer, six months later.
We’d be lying if we said that we never write code that we don’t need, but we keep this principle in mind whenever we build code. There’s a bit more wiggle room when building frameworks vs. products. It’s easier to determine whether a feature is appropriate for a product than to do the same for a framework. Who knows how a framework might be used?
Encodo does have a framework named Quino. The point of a framework is to support the development of products that use it. It’s not easy to predict what those products might need, even when you’re focused only on features that your framework is supposed to provide. However, a framework or library has a purpose and it shouldn’t stray from it.
Just as an example: Does Quino provide a remote data driver? Yes, because products have used it and the feature fits into the strategy of metadata-supported data. Is there an XML transport protocol? No, because no-one needed it. Do we support any kind of object? Not out of the box, we don’t. You can register your own converters, but it’s not a generalized protocol.
At the very least, we stay away from throwing in everything but the kitchen sink—just in case a product that uses Quino might need it. Be prepared for anything, but build only what you need.
We apply the following principles to avoid unneeded complexity.
From the article Why OO Sucks by Joe Armstrong (inventor of Erlang).
“State is the root of all evil. In particular, functions with side effects should be avoided.”
The sentiment in the title is a bit strong, but it’s not unfair. OO programming mixes data with operations, leading to more complexity than required by the task.
Most applications need some state. That state should be isolated from most components. State should be stored in dumb objects and passed around.
A component without state is purely functional, drastically simplifying the things that could possibly happen to it. Its output is completely determined by its inputs. It does not introduce any threading issues beyond those inherent in its input.
A component avoids a whole class of issues if it cannot make changes to the data that flows through it. As with state, restrict mutability to only certain components.
For example, transient objects like DTOs or ORM objects are mutable because that makes the program logic much more understandable.
Another example is stateless singletons with configuration settings. Instead of using a single component with mutable properties, define the configuration in a settings component. This has several advantages: the service itself stays stateless, the configuration is explicit and lives in one place, and both are easier to test.
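As a rough sketch of that idea (the names and shapes here are invented for illustration, not taken from any product), a stateless service that receives its configuration as a dumb settings object might look like this in TypeScript:

```typescript
// A dumb, immutable settings object: data only, no behavior.
interface ReportSettings {
  readonly pageSize: number;
  readonly includeArchived: boolean;
}

// The service holds no mutable state of its own; everything it needs flows in
// through its constructor and method parameters, so it can safely be a singleton.
class ReportGenerator {
  constructor(private readonly settings: ReportSettings) {}

  generate(rows: readonly string[]): string[] {
    const source = this.settings.includeArchived
      ? rows
      : rows.filter(row => !row.startsWith("archived:"));
    return source.slice(0, this.settings.pageSize);
  }
}

// The composition root configures the service in exactly one place.
const generator = new ReportGenerator({ pageSize: 50, includeArchived: false });
console.log(generator.generate(["a", "archived:b", "c"]));
```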
If references are guaranteed to be non-null, whole swaths of checking code fall away and make the component much simpler. As with immutability, there are far fewer possibilities of what can happen to non-nullable code.
TypeScript supports a null-checking mode. C# supports one as well, starting with C# 8. For older versions of C#, use the JetBrains Annotations along with ReSharper to enable real-time/compile-time null-checking.
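For example, with TypeScript’s strictNullChecks enabled, only the boundary that can actually produce a missing value has to check; everything behind it can assume non-null references (a minimal sketch):

```typescript
interface User {
  name: string;
}

// The parameter is non-nullable, so no defensive checks are needed here.
function greet(user: User): string {
  return `Hello, ${user.name}`;
}

// Only the lookup that can fail returns a nullable type, and the compiler
// forces the caller to handle it before passing the value on.
function greetById(findUser: (id: number) => User | null, id: number): string {
  const user = findUser(id);
  return user === null ? "Unknown user" : greet(user);
}
```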
A method should either change state or it should return data. This is the idea behind CQS (the Command-Query-Separation principle). That said, we employ a weaker version where only visible state really counts.
Techniques like lazy-initialization and caching retrieved data are generally OK. Technically, those behaviors have non-visible state in the sense that they affect performance, but are still OK if used carefully.
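A small sketch of that weaker version (the names are invented for illustration): the method below is a query as far as its callers are concerned, even though it lazily caches its result internally (non-visible state):

```typescript
class ExchangeRates {
  private cache: Map<string, number> | null = null;

  constructor(private readonly load: () => Map<string, number>) {}

  // A query: callers cannot observe the lazy initialization, so the method
  // still behaves as if it were side-effect-free.
  rateFor(currency: string): number | undefined {
    if (this.cache === null) {
      this.cache = this.load();
    }
    return this.cache.get(currency);
  }
}

// Usage: the expensive load runs at most once, but the caller never notices.
const rates = new ExchangeRates(() => new Map([["CHF", 1.0], ["EUR", 0.96]]));
console.log(rates.rateFor("EUR"));
```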
We use C# and TypeScript—wonderful OO languages with strong functional support—but we’re using less and less of what OO has to offer.
Virtual methods are a code smell. Instead, use smaller, testable components with a single purpose. If it’s easier to test, it’s easier to replace where necessary. Smaller components are more focused and easier to replace without duplicating code.
If logic is separated from data, and services are injected or passed as parameters, then there is less and less need for base classes with many helper functions or virtual/protected methods.
If state just flows through a component, then that component can be a singleton, avoiding needless allocation.
It’s a lot easier to reason about an application that comprises a graph of singletons with transient data flowing through it.
Inject factories to create transient services (e.g. a remote-method caller that captures state).
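A hypothetical sketch of that last point (the names are ours, invented for illustration): the long-lived service depends only on a factory, and the short-lived object it creates is the one that captures per-call state:

```typescript
// The transient object captures per-call state (here, a request id).
interface RemoteCall {
  send(payload: string): Promise<string>;
}

type RemoteCallFactory = (requestId: string) => RemoteCall;

// The service itself stays a stateless singleton; it only holds the factory.
class SyncService {
  constructor(private readonly createCall: RemoteCallFactory) {}

  push(requestId: string, payload: string): Promise<string> {
    const call = this.createCall(requestId); // transient, created per operation
    return call.send(payload);
  }
}
```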
As you can see, we put a lot of thought and care into our development practices and patterns. We try really hard to work in a way that ends up with quality software: stable, maintainable, extensible, testable and, most importantly, does what it’s supposed to.
For more information about specific development patterns, please see the architecture section of the Quino conceptual documentation. There are sections on interfaces, base classes and virtual methods, providers, tools & toolkits, task-specific interfaces and much more.
Published by marco on 4. Oct 2023 21:36:07 (GMT-5)
Tests are code. Writing tests is not a “step”—it is part of writing the code itself. The component is nothing without its tests.
It should be easy to verify any requirement with a test. The tests should tell the story of the requirements.
A developer can test any component in isolation (unit testing) or can test the component in the constellation in which it normally exists (integration testing).
Just so we’ve said it: tests are not a place to use a different coding style or different coding practices than in “regular” code. Choose your frameworks wisely. It should be easy to write powerful, elegant and easily understood tests. Build your own support code and libraries where needed. Apply the same coding principles as you would with the code being tested. You have to maintain testing code just like any other code.
We discuss below that we prefer integration tests to unit tests—that only works if you provide a way to write high-performance integrated tests without repeating a lot of code.
Unit tests are very easy to write for properly written components. With a proper infrastructure, such tests can just as easily be executed in an integrated environment. In such cases, there is generally no need to invest time (and incur maintenance debt) writing two sets of tests.
Automated tests will sometimes replace components and dependencies with fake or mocked objects, in order to isolate and test only a component’s logic without incurring the costs of configuring and setting up unrelated components.
If integration testing is too complicated or too slow, then a web of unit tests may suffice. In most cases, though, this doesn’t apply and we avoid mocking entirely and test components directly in common, integrated settings.
For example, if a component is commonly used as part of a database-based application, then it is more effective to test that component in such a scenario, rather than expending effort in isolating the component in order to have a “true” unit test.
With only unit tests, there is a danger that the component works, but only as tested, not as actually used.
Often, these problems arise in component configuration. A unit test will pass in carefully prepared (and sometimes faked) dependencies and run all-green.
However, an integration test will check that the configuration code also works. That is, that the component is configured correctly for products that use it and not just in the tests that verify its behavior.
Mocks and fakes must be used judiciously, otherwise you end up either testing only the mock or you end up hiding certain classes of problems, as discussed in more detail below.
Imagine a UI list that validates and saves entries when the focus changes. This list might work just fine in a test, where notifications and side-effects as a result of saving are disabled with mocks.
This is no longer the real-world situation, though. What happens if one of the notifications would have led to a reload of the list or a state change in one or more objects? What if the list only saves an object if it is marked as “changed”, but a spurious event resets that status in the integrated environment? This kind of interaction—this kind of bug—represents exactly the kind of thing we would miss when testing the list in too isolated a manner.
Because we’ve mocked away too much—because we focused too tightly on a unit test of the list—we’ve missed a bug that will come up in production instead.
While we don’t practice strict TDD at Encodo, we do write tests from the very beginning.
It’s really the only way to test the code that you’re writing, isn’t it? What are you going to do instead? Fire up the web server each time you want to throw data at a controller? Use a browser or Postman to fire those requests? Or are you starting a desktop UI and clicking around and typing? Or did you hack together a little console application in order to debug code?
Stop doing all of those things. Use a testing environment instead, so your product acquires a growing stable of automated, repeatable regression tests. It will become second nature to write tests to verify requirements about the components you write.
As we said above: the tests are part of the component.
A point made above is that unit tests are useful but they’re often not complete. Unit tests can fool you with excellent syntactic coverage but sub-standard functional coverage. We have many tools to measure the former, but only experience to measure the latter.
Sure, you’ve covered all of the lines, but did you actually choose a representative set of inputs? Are you making the right assertions? Did you actually test the requirements?
One technique that we use a lot is expectation files (called snapshots in some frameworks). Instead of writing several (sometimes, dozens of) assertions, we format output to text and then compare it against the text produced by the previous, presumably correct test run.
The idea is to detect when something has changed. We use this in Quino to verify log output during certain operations, or to verify queries or generated SQL or model structure or lists of data. Expectation files increase the depth and robustness of tests while at the same time making it extremely efficient to write and maintain such tests.
An expectation (or snapshot) is updated automatically when it changes and shows up as a difference in source control. If the change is expected, the developer commits it.
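Quino’s expectation files are our own mechanism, but the idea maps directly onto snapshot testing in frameworks like Jest. For example (generateSql here is a hypothetical function under test, not part of any library):

```typescript
import { generateSql } from "./queryMapper"; // hypothetical module under test

test("maps a filtered user query to SQL", () => {
  const sql = generateSql({ table: "users", where: { active: true } });

  // The first run records the snapshot; later runs compare against it.
  // An intentional change updates the snapshot and shows up as a reviewable
  // difference in source control, just like an expectation file.
  expect(sql).toMatchSnapshot();
});
```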
It takes a lot of experience to write just the right number and kind of tests. You don’t want to write too many tests: it’s code you have to maintain, after all. Also, it can be confusing when the same problem crops up in multiple places in different fixtures.
Some components should have unit tests as well as integration tests. For other components, unit tests are redundant because the integration tests cover everything already. Experience guides you in deciding what to write first, what to keep, and what to throw away.
It is possible to have too many tests. If you’re not aware of which layer your code resides in, you might end up running the same code in multiple scenarios, even though that component behaves the same regardless.
For example, if you’re testing how expressions are mapped to a database, then that test should definitely run against every supported database. If you’re testing how a high-level query composes those expressions before they get to the mapper, then you only really need to run it against one database in integration.
No-one wants to admit to releasing untested software. And no-one really wants to do manual testing. Automating tests reduces turnaround time for changes and enhancements. It also increases confidence for quick turnarounds when going to manual testing or production.
Unit tests are good, but prefer coverage in integration tests so that you have the best guarantee that your tests are running your code in a way that emulates the production environment as closely as possible.
Published by marco on 4. Oct 2023 21:36:01 (GMT-5)
Good documentation is part of every piece of quality software. What do we mean by “good”, though? Or “documentation”, for that matter? Quality software should be self-explanatory, but don’t be fooled into thinking that you don’t need to write documentation.
Documentation has an audience. Before writing anything, consider who you’re writing it for. What are the possible audiences?
Evaluators are interested in what your software does, how it interacts with other software, its performance characteristics, system requirements, the product roadmap, open issues and so on. If you don’t document your software sufficiently, an evaluator won’t purchase it in the first place.
By “purchase”, we mean that an evaluator will decide to use your software. This applies not only to commercial projects, but also to open-source freeware or even internal company software, be it a potentially time-saving Excel spreadsheet, a set of common UI or server components or an enterprise-wide multi-tier application.
Installers are interested in the basic installation options/paths and how to get from purchase/download to running. Here you need to find a balance between getting them up and running quickly, but also informing them that there is more to your product than just the standard rollout. They need to know that they can get set up efficiently but also that they’re not locked in to a single way of doing things (unless that’s what you’re selling).
Customizers are advanced installers: they want to know how to tweak or customize an installation to meet their special needs. These are often the same people as installers, but with requirements that go beyond the standard rollout.
New users are going to use installed/customized software. They want to not only know what your software does, but how they can use it for these standard tasks. They are interested in underlying concepts in both the application domain and the user experience. They need both introductory and high-level documentation, with meticulous, step-by-step instructions. These users are likely to navigate documentation in a progressive manner, reading from beginning to end.
Everyday/experienced users aren’t generally interested in introductory documentation. They are interested in how to become more efficient with your software. They will jump around in the documentation, using a search function to find what they need.
Extenders are users—usually developers—who will be using your software as a building block, integrating it with other software or extending it to meet their needs. These users are interested in command-line options as well as descriptions of available APIs. If the API surface is larger, then functionality should be grouped and examples included to demonstrate how to use the various calls in common workflows.
Last but not least, you have to document for developers. That means writing your code and documenting it in a way that is understandable not only to you but also to other members of your team. Future members of your team will also need to get up to speed. As is often the case, you yourself will be one of those future developers, when you come back to a project or product after a longer absence. Your future you will definitely thank you for leaving well-documented clues.
Wow! That seems like quite a lot of documentation to write. It is. As with anything else, you’ll have to prioritize. We can make a list of the various documentation types we have at our disposal and identify the actors that would use them.
As you can see, we consider anything that helps actors to understand the software to be documentation. That means that writing useful error and logging messages is also an important way of documenting the product. Similarly, a clearly defined roadmap with stories/bugs/todos provides context for evaluators and developers. All of these forms of documenting a product can save everyone a lot of time, money and confusion by offering context-sensitive documentation right where it’s needed.
This extends to everything in your software or product: the best documentation is a good design. If the UX is more intuitive or command-line help is clear or the APIs are consistent and well-organized, that can go a very long way already. There is less need for extensive tutorials explaining each and every task when the product documents itself.
For example, if you name an API getUsers() and an input variable includeAdministratorUsers, then you don’t need to write much more than “Gets a list of users, optionally including administrators.”
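A hypothetical TypeScript version of that API (the User shape and data are invented for illustration) is close to self-documenting:

```typescript
interface User {
  name: string;
  isAdministrator: boolean;
}

const allUsers: User[] = [
  { name: "alice", isAdministrator: true },
  { name: "bob", isAdministrator: false },
];

/** Gets a list of users, optionally including administrators. */
function getUsers(includeAdministratorUsers: boolean = false): User[] {
  return allUsers.filter(user => includeAdministratorUsers || !user.isAdministrator);
}
```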
For those reasons and many others, we recommend getting started early with documentation. If you look at the list above, that’s kind of obvious advice.
Most importantly, the simple act of trying to describe what you are making will lead to a better product. You’ll often find that, as you document, you’ll notice things that could be done better or more intuitively or more consistently or more easily. If you find it relatively quick and easy to write documentation, then there’s a good chance that you’ve managed to build quality software.
Published by marco on 4. Oct 2023 21:35:54 (GMT-5)
An important part of the software process is the final step: delivery.
If you can’t get your software into your customer’s hands, then what’s the point of writing it at all?
There are several at-times cross-cutting goals. In descending order of importance, they are:
There are several aspects to continuous integration and delivery:
As expected, working in an organized manner with increased automation has clear benefits.
There are obviously limitations as well. The most immediate one is infrastructure investment: you have to set up build servers or purchase them in the cloud. You also have to make your process work with automated builds and possibly retrain personnel to work with it.
You have to plan your project and you have to have patience on the part of all stakeholders. You have to train everyone on the team to not even consider releasing a version of the software from a developer PC.
Setup and maintenance of build agents takes time and effort, especially over longer periods of time. Operating systems are upgraded, core components changed, build systems upgraded. All of these things will cause the build to fail on a given agent, even though nothing is actually wrong with the product. Here again, though, the agent will act as a canary in the coal mine for your development team. More often than not, the build-server failure will alert the team to avoid a feature that would have otherwise cost them time to integrate before it’s ready.
The type of deployment depends on the product.
For desktop software, you need to build an installer or a compressed archive that users can execute and install. Mobile or UWP applications must be built and then delivered to app stores for installation. Web servers and sites can be deployed directly to in-house servers or into the cloud (e.g. AWS or Azure).
These deployment types are for the end users, but there are many more releases than that. Developers need to test their changes locally. Testers need to get these versions in order to provide feedback in a timely manner. We think of all of these releases as part of the build infrastructure, not just the continuous-integration server delivering an end-product.
At Encodo, we have experience with various systems for various types of software. We started off using Jenkins but moved to JetBrains TeamCity several years ago. Web projects have their own packaging and testing mechanisms (e.g. WebPack, Mocha) that integrate into almost any build infrastructure. We’ve also used Fastlane combined with Test Flight for mobile deployment. Our main expertise lies with configuration of .NET deployments paired with TeamCity.
Published by marco on 4. Oct 2023 21:35:48 (GMT-5)
Design by Contract is a software engineering practice in which software requirements and promises − the “Contract” − are explicitly written into the code. The code is, at the same time, better documented, more reliable and easier to test against. Encodo uses this technique to ensure software quality.
A software contract is composed of several components: preconditions, postconditions and invariants. Preconditions are what a component requires of a client, whereas postconditions are what a component guarantees to a client. In object-oriented programming, these contracts are attached to method calls in a class. Invariants are a list of conditions that must always be true for software. An invariant is typically attached directly to a class; the runtime checks the class invariant when entering and exiting a method call.
Popular programming languages, like Java, C#, Delphi Pascal and others, lack the language constructs needed to express these contracts. However, these languages contain assertion constructs, which allow one to roughly describe the contracts. The section on emulating contracts in other languages shows the most common technique.
Eiffel is a language whose inventor, Bertrand Meyer, pioneered Design by Contract. It includes rich support for expressing contracts, is similar to Pascal in syntax and will be used for the examples below. The FAQ offers more information on why we chose Eiffel for our examples.
The best way to show how the use of contracts affects software is with an example. Imagine a database connection class with a method Open. This opens a connection to the database, allocating resources for it and failing if the request is refused.
Open is
do
-- Execute code to open the connection here
end
Any procedural programming language is capable of formulating the code above. However, what happens if Open is called twice in a row on the same connection? One way to handle this is to simply ignore subsequent calls to Open.
Open is
do
if not IsOpen then
-- Execute code to open the connection here
end
end
This is not optimal, for several reasons; most obviously, clients that erroneously call Open twice will never know they are doing so.
Another way to respond is to accept that this might happen, but to make it non-silent, logging the occurrence to some sort of logging mechanism.
Open is
do
if not IsOpen then
-- Execute code to open the connection here
else
-- Log a warning
end
end
This is slightly better and an entirely appropriate solution in some cases. However, the connection is quite a low-level component; it should not be responsible for deciding what to do about repeated calls to Open. We can use a contract to push the responsibility onto the client.
Open is
require
not IsOpen
do
-- Execute code to open the connection here
end
The require clause contains optionally named boolean expressions. If one evaluates to false, a precondition violation is signaled. The violator can immediately be pinpointed and repaired to conform to the contract (by adding a check for IsOpen before calling Open). What are the benefits?
The contract for this routine is not complete, as it has only published its requirements, but said nothing about guarantees. Given the name of the function, we would expect it to have the following postcondition:
Open is
require
not IsOpen
do
-- Execute code to open the connection here
ensure
IsOpen
end
The function is now completely defined, having explicitly detailed its requirements and guarantees. The postcondition often looks quite superfluous: the code for opening the connection is right above it, isn’t it?
Not necessarily.
If the function is deferred (abstract in Java and Pascal, virtual in C-style languages), the implementation is in a descendent. The pre- and postconditions apply to the redefinitions as well. This allows a base class to very precisely define its interface with other classes without making any decisions about implementation.
Open is
require
not IsOpen
deferred
ensure
IsOpen
end
The precondition can only be expanded in a descendent, whereas the postcondition can only be further constrained. That is, a descendent cannot define the precondition to be not IsOpen and DatabaseExists. A client with a reference to the ancestor class sees only the ancestor precondition and cannot be forced to conform to a contract defined in a descendent.
Likewise, the postcondition cannot be redefined to be IsOpen or ActionFailed. The original interface has already decided that if the database cannot be opened, the implementation must raise an exception. A client with a reference to the ancestor class does not have access to the ActionFailed feature and cannot accept this as a valid postcondition.
The descendent adjusts the precondition in a function like this:
Open is
require else
AutoCloseIfOpened
do
-- Execute code to open the connection here
ensure then
not CompactOnOpen or DatabaseIsCompacted
end
This descendent has expanded the precondition to allow a caller to call Open repeatedly only if IsOpen is false (inherited precondition) or if the AutoCloseIfOpened option has been set. Likewise, it has further constrained the postcondition to promise that, in addition to IsOpen being true (inherited postcondition), the database will be compacted if the CompactOnOpen option is set.
So, that’s Eiffel. How can other languages express contracts without the proper language constructs? As mentioned above, almost all modern languages include an assert function, which accepts a boolean expression and raises an exception if it is false. This function can emulate pre- and postconditions, but class invariants are largely impractical in languages without some form of pre-processor (a search for Design by Contract in C++ turns up several such libraries). Here’s Listing 5 written in Delphi Pascal:
procedure Open;
begin
Assert( not IsOpen );
// Execute code to open the connection here
Assert( IsOpen );
end {Open};
Note how the contract is expressed in the implementation body; this makes contract inheritance difficult. The following pattern illustrates a single level of contract inheritance (which prevents descendants from removing contracts by not calling inherited methods):
procedure Open; // Not overridable
begin
Assert( not IsOpen );
DoOpen;
Assert( IsOpen );
end;
procedure DoOpen; virtual; abstract;
Under this pattern, descendants are required to implement DoOpen and cannot alter Open (Delphi methods are static by default − equivalent to final in Java, sealed in C# or frozen in Eiffel). There are naturally drawbacks to this approach, especially when compared to the rich contract syntax available in Eiffel, but the technique is sufficient for many of the desired contracts.
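The same emulation works in the languages we use day-to-day. A minimal TypeScript sketch, using a hand-rolled assert helper rather than any particular contracts library:

```typescript
function assert(condition: boolean, message: string): asserts condition {
  if (!condition) {
    throw new Error(`Contract violation: ${message}`);
  }
}

class Connection {
  private opened = false;

  get isOpen(): boolean {
    return this.opened;
  }

  open(): void {
    assert(!this.isOpen, "connection must not already be open"); // precondition
    // Execute code to open the connection here
    this.opened = true;
    assert(this.isOpen, "connection is open on success"); // postcondition
  }
}
```

As in the Delphi version, the contract lives in the implementation body, so contract inheritance still requires a non-overridable wrapper like the DoOpen pattern shown above.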
See the further reading below to learn about using old in postconditions and about expressing class invariants.
“Why is there no try .. finally to ensure that the postcondition is checked in Listing 8?”
A postcondition is only guaranteed when the function exits successfully. In the example, it is perfectly legitimate for Open to fail because of an external connection problem. The precondition only guarantees that the connection is not open, not that it can be opened. Such guarantees are useless because they involve performing the action in order to check that the action can be performed.
The function should raise an exception if it cannot open the connection, avoiding evaluation of the postcondition and resulting in an acceptable error condition. An implementation that fails silently will cause a postcondition violation, which is an unacceptable error condition.
Using a try .. finally construct to force evaluation of the postcondition under all circumstances would result in both the desired error (connection could not be opened) and a postcondition violation, which is not correct.
“What if there is an exit or return statement in Listing 8?”
Question 1 proposed using a try .. finally construct to ensure that the postcondition was always executed. As you can see from the answer, this has undesirable side effects. The simple answer is not to use instructions that break the normal instruction flow (e.g. exit or break). The usefulness of such constructs is debatable and the drawbacks are high (especially, as shown above, when the instruction avoids checking contracts).
This exposes the weakness of languages without explicit contract constructs — it requires discipline to avoid bad practices. Relying purely on discipline invites error. However, it is better than nothing at all.
Published by marco on 4. Oct 2023 21:35:39 (GMT-5)
A healthy and active review culture is essential for any team interested in building quality software. At Encodo, we’ve been doing reviews for a long time. They’ve become an essential part of everything we do:
What we mean by review is not a formal process at all. It is simply that you prepare work you’ve done for an informal presentation to a team member. Explaining what you’ve done in a review is often a good way of collecting your thoughts—you should be able to explain what you’ve done. Getting a review from a colleague is an efficient and productive way of making sure you can do that.
While there are many reasons to do reviews, we’ve also learned that reviews can’t do everything.
It’s important to get reviews often enough to avoid wasting time and effort but not so often that your work or the reviewer’s work grinds to a halt. It’s all about balance.
A good rule of thumb is about one review per task. If your task is longer than a day, then think about how to break up that work into phases in order to get a review of earlier phases.
That way, you’re more likely to catch issues before building on top of mistakes.
Encodo prefers live, face-to-face reviews.
This is the most efficient manner of reviewing as neither party has to prepare anything other than the work to be reviewed. Issues that come up can often be handled immediately—and such issues are far more likely to be mentioned and fixed. While in-person reviews are superior, video-chat/shared-desktop reviews work quite well, too.
If that’s not possible, then we have also used tool-based, asynchronous reviews, such as pull requests with review software. However, we find these to be not only less efficient but also less likely to find as many issues.
With a live code review, it’s relatively easy to ask the submitter to reorder, split or squash commits. It’s also easier to point out and quickly fix stylistic issues (like naming or interface usage, etc.). Because the turnaround time is much faster, a reviewer is far more likely to point out smaller fixes that would improve code quality, maintainability and so on.
However, in an asynchronous review, a reviewer must decide what is most important. Is it worth rejecting the whole pull request if it’s 95% correct with a few details? Do you reject it and ask the submitter to fix up spacing or formatting or missing documentation? Do you really write down every last little thing you would have said? Do you reject it and hope that the submitter understands all of your notes? Or do you accept it and just fix those things up yourself? How many iterations do you go through?
We prefer synchronous, face-to-face reviews because they’re much more efficient. Misunderstandings can be cleared up quickly, iterating until the submitter and reviewer find a consensus.
We encourage reviews everywhere because we know how to make them faster.
Both the reviewer and the submitter need to practice. A reviewer should practice diplomacy and formulate critique in a way that it will be accepted. A submitter must keep an open mind and prepare good arguments or justification for the code. Both sides should stay positive. A review shouldn’t be a competition: it’s about producing high-quality code together, as a team.
Encodo has done presentations on reviews, in both English and German.
Published by marco on 4. Oct 2023 21:35:31 (GMT-5)
This article is part of an archive of Encodo White Papers.
What is the best approach when designing a new application, be it a small tool or an end-user application?
Many developers jump straight into a prototype, in order to get a feel for how the application will work. While prototypes are good for demonstrations, they are dangerous: in projects with tight time or budget constraints, the temptation to simply “build out” the prototype becomes irresistible. This leads to applications with nice user interfaces (hereafter called UI), but inflexible and difficult-to-follow implementations.
A better first step is to list the requirements and assign them to possible components. This doesn’t have to be a long or complete evaluation of the requirements; a few minutes is enough to come up with enough ideas to get started coding. These non-UI components are a natural fit for testing environments and are more likely to define a clean, sensible API (Application Programming Interface). Once the core logic has been built and tested, a prototype can easily be built on top of it.
To summarize, the component-based approach is important for the following reasons:
A good UI library is a wonderful thing, allowing clean-looking, well-integrated applications to be built in a very short time. However, the allure of this style of programming is dangerous, as it quickly leads to applications without a clearly defined API, which leads to extensibility and maintenance issues.
These systems entice programmers into working “backwards”, building their application logic around events generated by the UI. The first generation of RAD environments were notorious for mixing UI and business code. The latest generations make use of libraries with “code-behind” built right in, automatically supporting core/UI separation in both web and classic UI applications.
This separation of core logic from UI events is commonly called the MVC or Model-View-Controller pattern.
MVC is the official name for the technique described above, in which functionality is contained in a model (M), which communicates state changes to a view (V) through some form of update mechanism. The controller (C) represents user input and applies changes to the model.
In many UI libraries, the view and controller layers are merged, making it much easier to apply the pattern to smaller projects. View components are typically bound to model components using the Observer pattern: the view “listens” for changes in the model and reacts accordingly.
Consider a tool which processes text files and generates output of some kind (perhaps PDF or CSV). The actual task doesn’t matter − this is the kind of tool that is often written in a seat-of-the-pants fashion, with the excuse that it is “faster” to get it done this way. Let’s take a component-based approach and see what we get.
What are the components of the system?
This list took only a few minutes to write and could have been written by anyone familiar with the project. The list contains only domain knowledge — there is no implementation-specific data. Having written down the requirements, we see that there is a need for an internal data representation, which will be used by the importers, exporters and actions. This is a facet of the design that might have gone unnoticed during prototyping, but would have been expressed implicitly nonetheless.
The list of features above is not an “over-design”, but rather an explicit expression of the specifications. While an implementation can avoid using importer, exporter and action components, these concepts are part of the design nonetheless: an implementation without them is simply more difficult to describe, understand and extend.
With a little bit of thought, we have designed a system that will scale to multiple import and export formats and even support multiple transformations. Writing the application in this way may involve marginally more initial work, but will result in a far more testable, extensible and reusable framework, decreasing maintenance and support time.
Another popular argument is the perceived reduction in programming efficiency. Applications or tools of the “throwaway” kind will take longer to develop when using a clean programming model. Whereas that may be true in the very short term, the majority of an application’s life span is spent in support and maintenance, which takes more time and energy if the application is poorly designed.
Though a throwaway prototype may be available marginally quicker, it will be of poorer quality. In addition, subsequent applications cannot benefit from its code. The biggest loss comes in the form of functionality, improvements or bug fixes which are never even attempted because the code is not in a maintainable or testable state.
Realization of this design at the core level is not so difficult. Even though the application initially only has one importer and one exporter, it doesn’t take much more to define an API that supports multiple plugins. Writing the tests for these components is likewise trivial. The opposite is true in the UI: building an interface to manage and configure all of the functionality that was easily written into the model is prohibitive.
There is no reason, however, that the UI has to express all of the details of the underlying model; the application, as specified, need only expose enough functionality in the UI to be able to import and export. The UI stays remarkably simple, but can be easily and quickly extended to offer more features, if desired. Since the model has automatic tests, it can be assumed to be stable and it is easier to accurately estimate the time required to build the new GUI elements.
The standard, quick-prototyping approach would have started coding a main form with some input fields, building the transformation code directly into the form itself. Options and preferences would have likely been encapsulated with a few controls on the main form, which, in turn, would have been responsible for loading and storing them.
The design sketched above would be expressed implicitly and partially, at best. An application written without these concepts in mind will not be worth refactoring. If the code is re-used at all, it is typically copied to a new project and modified there, resulting in multiple copies of nearly the same code. Fixes and enhancements to one will not necessarily appear in the other.
A prototype that is considered “throwaway”, but grows into an application, does not benefit from any of the following:
It’s obvious from the design above that it can be extended to support multiple importers, exporters and actions. The initial application was assumed to be a GUI which did not expose all of the functionality available in the model. The GUI can be made more powerful, exposing more of the underlying functionality. The extensibility of the design is clear. What about reuse?
The examples below are in Delphi Pascal.
In a traditional prototype, command-line support is bolted on to the same application, because the required code is buried in UI structures. Such a command line application will involve something like:
if command = 'C' then begin
{ Create the main form first, so it is
treated as the main form by the system, then
hide both forms so they don't appear in front
of the command line.
}
form:= MainForm.Create;
form.Visible:= False;
prefsForm:= PrefsForm.Create;
prefsForm.Visible:= False;
prefsForm.LoadOptions;
form.EdtFileToUse.Text:= parameterFromCommandLine;
form.BtnConvertClick( nil );
form.Close; // Close main form to quit application
end;
Using the elements of the model from the component-based design, we could build a separate application, whose main loop is logical and readable:
if command = 'C' then begin
options:= ToolOptions.Create;
options.Load;
try
try
converter:= FileConverter.Create( options );
converter.Convert( parameterFromCommandLine );
finally
FreeAndNil( converter );
end;
finally
FreeAndNil( options );
end;
end;
The second version addresses the requirements in a much clearer, more maintainable fashion. On top of that, the implementation in the GUI application would have a similar pattern. The code above could go into an event handler, passing text from an input control instead of an argument from the command line. The following code assumes that the converter and options from the command line example above are globally available:
procedure MainForm.BtnConvertClick( Sender: TObject );
begin
Converter.Convert( EdtFileToUse.Text );
end;
With a small amount of time invested at the beginning, one can define any application in terms of UI-independent components. An application that was designed in this way lends itself to ready reuse. Applications that use these components need only be concerned with delivering input to a clearly defined API. Fixes and updates to the core components will be reflected in all applications.
Published by marco on 4. Oct 2023 21:33:33 (GMT-5)
Most people in the software industry have heard of test-driven development — it has become a buzzword with several possible meanings.
One of the more negative associations is the notion of unit testing. Unit testing traditionally involves writing a test for each and every routine in a unit or class, to ensure that it does what it claims. This practice has, of late, declined in popularity — mostly because of the sheer mindlessness of maintaining complete coverage of an ever-growing API.
Another form of testing is to write tests for components of a system, ensuring functionality on a higher level than that of the routine. Tests of this kind tend to encapsulate use cases, which are far more closely related to the way in which clients (actual users or other software) make use of an API. Naturally, use cases for extremely low-level components will end up testing individual routines, just as unit testing does.
Writing the component tests is not tedious and, in fact, helps tremendously in determining whether a piece of software is complete or not. They can be viewed as software implementations of the requirements documents or specifications. Proper application of Component-based Design makes it quite simple to build tests for the majority of an application’s functionality.
A far better tool for ensuring consistency at the lowest level, where unit testing traditionally comes into play, is Design by Contract. This practice involves including verification mechanisms directly in the software, so that violations of software contracts can be pinpointed and quickly repaired.
The most important element of any testing strategy is to stick with it. When a defect is found, the first step is to create a test to replicate the problem. The next is to fix the error so the problem no longer occurs, but all the other tests still work. Finally, any missing contracts that may have helped pinpoint the problem sooner should be added.
Once the test suite runs through without problems, the software is ready for release testing.
Automated testing is a fantastic way of guaranteeing baseline software quality, but it is not the last step before releasing a product. For server software or software with a command-line interface only, the test suite can provide an extremely high-level of coverage (approaching 100%). Software which interacts with humans, however, requires a manual testing regimen to verify that the software functions as desired for all forms of input. Whichever parts of the testing chain cannot be automated (UI testing is notoriously difficult) should be documented in detail to ensure reproducibility between releases.
Published by marco on 7. Sep 2023 11:02:51 (GMT-5)
“Can we make our UI dumb enough to make our app usable without it?”
The video demonstrates navigating through a simple e-commerce site. Then, the presenter (Michel Weststrate, the author of MobX) shows how the app can be driven from the console by calling the APIs directly—upon which the URL and UI all update automatically. That is, the logic is not in the UI.
He then demonstrates that he can drive the web site without a UI by deleting the rendering to React DOM entirely. He can still manipulate the console API to perform the same operations because the logic is all defined completely independent of the UI. Of course, this is the same command-line interface that can be used in the automated tests, which means that the entire product can be tested without a UI at all.
I’m becoming increasingly convinced that neither React nor Angular is the way to go. Both React and Angular mix logic into the UI, putting the UI front and center. This is wrong. Additionally, Angular suffers from a complete inability to speed up the development lifecycle because it’s so strongly tied to WebPack.
I’ve used Redux before and the boilerplate becomes prodigious. I’ve used the React reducers as well, and it’s a bit better, but still doesn’t feel very natural. I’ve used MobX but long before its current incarnation where it really seems to “just work” as a store of state and reactive programming logic.
The when construct (see 16:37 in the video), which takes a predicate and an action, is a very neat concept that allows you to define exactly how your application reacts to state changes without burying it all in the components.
“If the view is to be purely derived from the state, then routing should affect state, not the derived component tree.”
Therefore, a url-change is an action like any other, modifying the state and letting MobX handle notifying all interested parties. Once you’ve gotten that far, you don’t even need a UI-specific routing library because you can just configure any router to direct URLs to the store API—which will automatically update the UI. The UI (e.g., React) doesn’t have to have anything to do with routing. A route change triggers an action, which changes the state. The UI reacts. The UI does not do anything with the route—it just triggers actions.
A reactive non-UI component ensures that the route stays in sync with the state by reacting to changes in the state. In most cases, you can just create a value that calculates what the URL should be, based on the state. This could get complicated, of course, but it’s also completely separate from the rest of the application logic and can be thoroughly tested. We can also use the when construct outlined above to simply listen for changes to the calculated URL and update the browser’s location and history. This way, the management of the history and URL is not entwined with the rest of the application logic. It’s just reacting to state changes, like everything else.
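As a rough sketch of that idea in TypeScript (the store shape and route format are invented for illustration; only the MobX calls themselves are real), routing and URL synchronization stay entirely outside the UI:

```typescript
import { makeAutoObservable, reaction } from "mobx";

class AppStore {
  selectedProductId: string | null = null;

  constructor() {
    makeAutoObservable(this);
  }

  // Actions change state; the UI and the URL both just react.
  showProduct(id: string): void {
    this.selectedProductId = id;
  }

  // The URL is derived from state; nothing in the UI manages it.
  get url(): string {
    return this.selectedProductId === null
      ? "/"
      : `/products/${this.selectedProductId}`;
  }
}

const store = new AppStore();

// A reactive, non-UI component keeps the browser location in sync with the state.
// (reaction re-runs on every change; MobX's one-shot `when` fires only once.)
reaction(
  () => store.url,
  url => window.history.pushState(null, "", url)
);

// A route change is just another action: the router calls into the store API,
// and the UI updates because it observes the store, not the route.
function onRouteChange(path: string): void {
  const match = path.match(/^\/products\/(.+)$/);
  if (match) {
    store.showProduct(match[1]);
  }
}
```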
Working like this results in automated tests that work naturally and look very much like Playwright tests—but completely without UI and using semantically meaningful constructs. The UI is an afterthought (as Michel himself wrote in 2019). Playwright is nice, but it’s a last resort when you’ve already botched the job of writing your code in a more testable manner. It’s a nice check that the UI is properly wired to the logic of the application, but should not be used to verify application behavior—simply to verify UI behavior.
This all goes very much in the direction of The Humble Dialog Box by Michael Feathers in 2002, which shows that we’ve known how to build software correctly for over 20 years—and we keep getting distracted by “the new shiny”, thinking that we can somehow start with the UI and still get maintainable software.
Published by marco on 27. Aug 2023 03:32:42 (GMT-5)
The article Works on most machines by Mark Seemann (Ploeh Blog) argues provocatively that containers are a fallback for poorly written software.
“When you have general-purpose software, though, do you really need containers?”
Well, yes. The point isn’t that you need a container to paper over software that isn’t sufficiently generic: it’s to avoid fixing incompatibilities that have nothing to do with your target deployment systems.
I think the author is thinking too much of highly general-purpose software whereas the majority of software doesn’t need to run everywhere and anywhere.
If it’s built for the cloud, it’s going to run in a container anyway. If it’s built for a specific device, it’s going to run on that device.
In that case, why not just run that software on the developer's machine in the same environment? That way, you can avoid wasting a ton of time fixing problems that are related to how it runs in development rather than in production.
“Ultimately, you may need to query the environment about various things, but in functional programming, querying the environment is impure, so you push it to the boundary of the system. Functional programming encourages you to explicitly consider and separate impure actions from pure functions. This implies that the environment-specific code is small, cohesive, and easy to review.”
It implies it, but it in no way guarantees it. The author is also forgetting about the quality of the developer that is likely to be building the solution.
In this post, he assumes that the developer uses enough tests to thoroughly test the system—even to the point where he is able to determine where a solution isn’t sufficiently generalized yet. He assumes that the developer uses methodology like functional programming to separate pure from impure code, and that the developer is good enough to do all of this in a way that is both efficient and leads to a finished product.
This is not at all a guarantee—or even a likelihood—in the real world.
In the real world, developers are not reaching for the stars—even if they had the capabilities, which many do not, they’re often not given the time to do things correctly—they are just trying to get it done.
If they can “cheat” by restricting the world of possible environments—rather than accommodating their software to environments it will never encounter in production—then why not?
It's actually an engineering problem. If you're going to make something that has to work well underwater, the only reason it needs to work out of water is that it makes it easier to work on, not because you think it's worth the time making it function properly in air. If you could make it just as easy to work on underwater as it is in air, then you would just do that instead. Wouldn't you? Why waste your time and your company's when there's a lot of other, more important work to do?
Published by marco on 30. May 2023 22:04:15 (GMT-5)
Updated by marco on 11. Sep 2023 13:14:50 (GMT-5)
I watched a great video about image-manipulation using an AWS lambda function.
I was curious about the imaging library he was using and searched for ImageProcessingContext (because I saw it in his code). That led me to ImageSharp, after which I searched for comparisons to the cross-platform library used in Maui (MSDN).
That led me to the issue SkiaSharp vs ImageSharp (GitHub), which noted that,
“Note that JimBobSquarePants, the creator of ImageSharp, contributed some interesting discussion in #47.”
I read/waded through that whole issue thread and commented the following:
For future readers: The discussion itself is not very interesting, but the conclusion is. The title of the issue is Basic premise of the library is based upon a fallacy and harms existing projects. (GitHub) (referring to Maui.Graphics), which doesn’t feel super-constructive (and wasn’t). There are long screeds about how harmful MS is for everything OSS. The final comment is worth reading, as it explains that it turns out that the harshness of the issue title was completely unwarranted (as admitted by the original poster). Good conclusion; typically unproductive Internet discussion.
There is no conflict. Skia’s support for images is weaker than ImageSharp’s but it allows using GPU rendering on supported platforms whereas ImageSharp is for in-memory data (CPU-bound).
In the referenced issue itself, I commented,
“That’s wonderful. While I’m happy to learn that the issue was resolved, is there any way that we can pin this comment to the top so that future readers don’t have to wade through the 80% catfight in the middle?
“I was linked to this issue while researching Skia vs. ImageSharp and found the initial question and a couple of responses interesting, then waded through 80% chest-thumping, then finally got to this comment that essentially says “hey, we actually talked to each other and it turns out it was a tempest in a teapot”, which is what I was hoping to learn.”
I just got a response today:
“No way to pin comments, but I added a link to that comment from the initial issue description.”
Nice! 👌❤️🔥
Published by marco on 28. Mar 2023 22:15:01 (GMT-5)
The intended audience of this document is people interested in knowing which commands to execute to update submodules. The initial analysis section is intended for people interested in knowing how the commands work and what their strengths/weaknesses are.
The inspiration for this documentation was that I was wondering whether submodules were always cloned with detached heads and if there were some way to avoid that. The short answers to these questions are, respectively, “yes” and “no”.
Skip to the examples below to just see the commands and their effects.
At the end of the document are links to pages referenced to produce this documentation.
In the discussion below, the term superproject refers to the root repository that contains submodule references. It comes from the git documentation, where they make the distinction because submodules can be nested. Suppose we have multiple levels of nesting, as shown below.
📁 A
  📁 B
    📁 C

- A is the root repository of both B and C
- A is the superproject of B
- B is the superproject of C
Submodules are stored inside another repository.
For a simple setup, we would see the following:

📁 A
  📁 .git
    📁 modules
      📁 B
        📄 config (worktree = ../../../B)
  📁 B
    📄 .git (points to ../.git/modules/B)
The submodule's .git folder is stored in the superproject's .git folder and is replaced by a file that references the new location. The submodule uses the worktrees feature to check out to a different folder.
No. Storing the working tree of the submodule outside of the repository is not supported.
Why would you want to do that anyway?
One use case is that you have two repositories, each of which includes the same submodule, as shown below.

📁 A
  📁 B
📁 C
  📁 B
Instead of using two copies, you might think you could make the superprojects refer to the same copy of the submodule.
📁 A (refers to ../B)
📁 B
📁 C (refers to ../B)
With this setup, a commit made to B from within A would immediately be available in C. The problem is that A and C may need to refer to different commits of B.

Whereas you can manually move a submodule outside of the repository after you've cloned it, you cannot configure a superproject's submodules in a way that Git will be able to clone properly. If you try it, you'll probably get an error message like,
fatal: No url found for submodule path 'SUBMODULE.NAME' in .gitmodules
The next section explains how you can share local commits for testing.
Assume, as above, that there are two copies of the submodule, BA and BC. Suppose there are commits in BA that have been tested with A, but should also be tested with C.

One way to test C would be to push the commits in BA and then pull them from BC. That involves a round-trip to the server, which is not optimal, but relatively straightforward.

Another way to test C would be to add the local BA as a remote to BC and then check out the commit from BA directly.
To set up a remote called B_A in BC, execute:
git remote add B_A ../../A/B
The testing flow would be, roughly, as follows (a command-level sketch follows the list):

- Make a change and test it with A
- Commit #1 in BA
- Fetch B_A into BC
- Check out commit #1 in BC
- Test with C
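In terms of commands, and assuming the directory layout from above (A and C are siblings, the B_A remote has already been added, and <commit-id> stands in for the commit to test), the middle steps could look something like this:

git -C C/B fetch B_A
git -C C/B checkout <commit-id>

After that, build and test C against the checked-out commit.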
A clone of a superproject (a repository with submodules) fetches submodules only when required (e.g., when --recurse-submodules is included). If submodules are fetched, then git sets the checked-out commit in each submodule to the commit ID specified for that module in the superproject. This makes sense because that is the correct commit to use. However, this also means that, after a clone, all submodules will be in a detached head state.
On an initial clone, git creates a local branch in the superproject corresponding to the checked-out branch in the clone command (either the default branch or the branch specified in the -b option, if included).
Git does not create local branches in any of the submodules. Git assumes that you will be working in the root repository and not in the submodules. The checked-out branch in the submodule is irrelevant to the superproject.
If you want to work in (one or more of) the submodules anyway, then you have to create a local branch for yourself and check it out.
The detached head situation is not “weird” but “entirely expected” and “working as designed”. All detached head means is that a commit ID has been checked out rather than a named, local branch.
If, however, you want the submodule to be checked out to the same branch as that checked out in the superproject (e.g., main), then the way to address that is to call git switch main in the submodule repository.
This will have no effect on the superproject if the main branch in the submodule repository is at the same commit ID as the one pointed to by the superproject. If it is not, then switching to the main branch in the submodule repository will show up as a change in the superproject (the change being that the submodule repository is now pointing to a different commit). To accept that change in the superproject, simply git add the submodule folder and commit the change.
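For example, with a submodule checked out in a folder called SharedRepo (the folder name here is just illustrative), the whole sequence might look like this:

git -C SharedRepo switch main
git add SharedRepo
git commit -m "Point SharedRepo at the head of main"

The commit in the superproject records only the new submodule commit ID, not the submodule's contents.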
What does --remote-submodules do?

The --remote-submodules option does the following (according to the official documentation):
“Git will use the status of the submodule’s remote-tracking branch to update the submodule, rather than the superproject’s recorded SHA-1 (i.e. “commit ID”)”
That means that using this parameter may cause changes in the working tree of the superproject if the remote-tracking branch in the submodule repository does not point to the same commit as that referenced by the superproject.
The basic submodule registration looks like this in the .gitmodules file.
[submodule "SharedRepo"]
path = SharedRepo
url = git@ssh.dev.azure.com:v3/ustertechnologies/uster.quantum/PoC.IMHSharedRepo
If you don't plan on using --remote-submodules, then that's all you need.
However, if you want to set up your git submodules so that the superproject knows which branch it should “track” in the submodule, use the following configuration:
[submodule "SharedRepo"]
path = SharedRepo
url = git@ssh.dev.azure.com:v3/ustertechnologies/uster.quantum/PoC.IMHSharedRepo
branch = .
update = rebase
Note that the branch name is ".". This tells git to use the same branch name as that which is checked out in the superproject (if it exists; if it doesn't, then git does nothing further). This allows you to set up the .gitmodules once and it works as expected for all branches. Otherwise, you run the risk of merging in a .gitmodules file that references a specific feature branch (for example) and you end up syncing with that feature branch by accident if you call submodule update with --remote.
The update action indicates how git should get to the desired commit if it needs to make a change. Again, this only applies if you explicitly tell git to use the head commit for the given branch on the remote instead of just using whichever commit is already referenced locally.
A superproject will see an update if it follows a branch in the submodule (as outlined in the preceding section) and that branch in the submodule has gained new commits since the last time the superproject was updated (i.e., the superproject still references a commit in the submodule that does not correspond to the current HEAD of the branch in the submodule).
Using the --remote-submodules option is a way of cloning a superproject, but also updating its submodules to the latest commits instead of just checking out whatever is referenced in the superproject. It is a useful way of cloning a superproject with the latest commits in not only the superproject's repository, but also all submodules. However, you are then not only checking out the current state of the repository, but also requesting updates to the referenced submodules.
This only works if the submodule reference specifies a branch, though. If it doesn’t, then git has no way of knowing which branch in the submodule repository it should update to. As noted above, setting this branch doesn’t mean that git will create a local branch in the submodule with that name and check it out; it just means that it will change the commit ID referenced by the superproject for that submodule if the commit referenced by that branch in the submodule is different than the commit currently referenced by the superproject.
Phew! We now know enough to determine the commands to use.
We now have the base knowledge to work with git and submodules using the command line. This will be useful for e.g. setting up agents.
Imagine we have two repositories
The examples will use something like the following diagram to show results. The bold indicates the commit and branch that are checked out. A bold commit with a non-bold branch name indicates a detached head.
The diagram below shows the situation outlined above, with main checked out.
To clone a repository with submodules and check out the default branch in the superproject, execute the following:
git clone --recurse-submodules <URL>
This results in:
Using the example from the start of this section, after executing this command, we will see:
No change from the example is expected.
To do the same as above, but check out a particular branch, execute the following:
git clone -b feature/setup --recurse-submodules <URL>
This results in the same as above, but the superproject is checked out to “feature/setup”. Using the example from the start of this section, after executing this command, we will see:
To update submodules after an initial clone (not necessary immediately after a clone, of course), execute the following:
git submodule update
This results in:
Submodules where a change to the checked-out commit is required are in detached head state. If no change is made, then the submodule remains at whichever (possibly detached) commit or branch was previously checked out.
As with an initial clone, this command does not update any references to submodule commits.
To not only clone a superproject and all of its submodules, but to also update references to those submodules' latest HEADs (as outlined in the remote-submodules section above), execute the following:
git clone --recurse-submodules --remote-submodules <URL>
This results in:
If, for example, the remote branch main in repository B had been updated to BID2, then the reference from A to B would also have been updated to BID2:
To update submodules after an initial clone and update references (as outlined in the remote-submodules section above), execute the following:
git submodule update --remote
This results in:
As when calling clone with --remote-submodules, this command updates submodule references. Therefore, if the remote branch main in repository B had been updated to ID3, then we would expect to see A referencing that commit in B.
The following links were helpful in writing this documentation:
Published by marco on 17. Mar 2023 07:22:46 (GMT-5)
I’d watched an excellent movie [1] that was primarily in German but had some English parts, with hard-coded English subtitles and soft German subtitles plastered on top of that. I wanted to cite a bunch of interesting sections, so I looked for the subtitles online. Only the English subtitles are available, which I didn’t want. I liked the German formulation and wanted to cite that.
Well, I have the subtitles: they're just trapped in the mkv file. I figured that there was some way of extracting them, but a search turned up a lot of pre-compiled and sketchy-looking software whose trustworthiness I couldn't adequately validate. I want the subtitles, but I don't want to get a virus or be crypto-locked.
I got a good hint to use ffmpeg from How to Extract .SRT Files From MKV File (Reddit). It suggested something like,
ffmpeg -i FILENAME.mkv -map 0:s:0 german.srt
Once I'd installed ffmpeg with Homebrew, I was able to extract a subtitle stream. Unfortunately, it was kind of short, so I'd grabbed the wrong stream.
Part of the output of the command above is a list of available streams, shown below.
Stream #0:0: Video: h264 (Main), … (default)
  Metadata:
    DURATION : 01:25:55.332000000
Stream #0:1(ger): Audio: aac (LC), 48000 Hz, stereo, fltp (default)
  Metadata:
    title    : Stereo
    DURATION : 01:25:55.285000000
Stream #0:2(ger): Subtitle: ass
  Metadata:
    title    : German forced
    DURATION : 01:03:24.130000000
Stream #0:3(ger): Subtitle: ass
  Metadata:
    title    : German
    DURATION : 01:25:43.890000000
Stream #0:4(ger): Subtitle: ass
  Metadata:
    title    : German SDH
    DURATION : 01:25:43.890000000
The ffmpeg documentation isn't particularly illuminating on the -map option, but I finally figured out that a parameter like 0:s:1 breaks down as follows:

- 0 refers to the first input file (the one passed with the -i option)
- s indicates subtitles (I intuit this because it looks like p indicates programs, according to FFMPEG: How to chose a stream from all stream [sic])
- the final number is the zero-based index of the stream within that type

Armed with this information, I was able to select the second subtitle stream, which is the full German subtitles rather than just the German subtitles for the English parts.
ffmpeg -i FILENAME.mkv -map 0:s:1 german.srt
This gave me the desired subtitles in seconds.
Happily, I have what I want, and I didn't have to install any sketchy tools delivered as unvetted binaries. Instead, I'm comfortable installing the well-known tool ffmpeg using the well-known package manager brew.
Published by marco on 5. Mar 2023 21:23:29 (GMT-5)
Updated by marco on 9. Apr 2023 23:29:26 (GMT-5)
Testing is any form of validation that verifies a product. That includes not only structured validation using checklists, test plans, etc. but also informal testing, as when engineers click their way through a UI, emit values in debugging output to a console, or perform operations on hardware.
Automated testing is common for software, as regression-style tests that execute both locally and in CI. This includes unit, integration, and end-to-end tests.
The following discussion focuses primarily on _software testing_ but hopefully contains some insights and information relevant to other engineering disciplines (e.g., embedded and hardware developers).
Testing is primarily a mindset.
Thinking about what you're building in the terms outlined above can help you to determine how and what you're actually going to build. It will help you focus.
You should think of writing tests not as something you _have_ to do, but rather as something you _want_ to do.
Let’s define some of this jargon—use cases? “it works”, etc.—before we continue.
It’s a bit of a provocative question, perhaps, but it makes sense to ask about anything into which you’re going to invest time and money.
So, let’s start a bit further back.
❓ What would we like to do?
“We would like to build a product of high quality”
❓ What’s a product?
“A product is an implementation of a set of requirements.”
❓ Then what’s a requirement?
“A requirement is a collection of use cases.”
❓ OK, fine. What’s a use case?
“A use case comprises a set of initial conditions, an action, a set of inputs, and an expected output.”
❓ What is quality?
“A product that satisfies its requirements is of higher quality than one that does not.”
❓ How can I know that my product has the desired quality?
“We test use cases for a version of a product to determine quality.”
❓ How can I know when my product has enough tests?
“When all of the use cases are covered.”
❓ What if I change the product after I’ve tested it?
“Then you have to test all of the use cases again.”
❗ What the heck? That’s boring! I don’t have time for that!
“It’s called regression-testing. There’s no way around it.”
❓ What if I know that I’ve only changed a tiny thing?
“You might be able to get away with it. But that’s where 🕷 bugs come from.”
❗ I can’t afford to test everything manually every time I make a change!
“That's why you automate as many tests as you can.”
❗ Running the tests ties up my local machine! I can’t work.
“Run tests in another environment (e.g., in the cloud)”
We’ve established both that testing is a mindset and that it is necessary to building high-quality products.
We should keep in mind that the goal is to have a well-tested product with as many of these tests as possible being automated. The question is: how close to the goal state do you stay during development?
In other words, what does the development-feedback loop look like?
The goal of the development-feedback loop is to shorten the time between a change and its verification. In practice, this often manifests as “knowing as soon as possible when you’ve broken something.” The longer it takes from change to verification, the more likely it is that multiple changes will be verified at once. Root-cause analysis becomes more difficult.
That’s why manual tests are undesirable: they are far less likely to be run/applied in a timely manner, increasing the number of changes that have occurred since the last time tests were run.
So, the longer you wait to define tests, the longer your product remains untested. The longer you wait to automate tests, the longer you must do manual testing to verify behavior.
With that idea in mind, let's consider the spectrum of methodologies. At one end, there's TDD (test-driven development), where you write the tests first, letting them fail and then writing the implementation. At the other, there's writing all of your acceptance tests once you've finished the product.
Always writing the tests first is just one extreme, and one that scares a lot of people away from automated testing. As with any dogma, strict adherence is unlikely to be efficient.
Sometimes, you’ll need to try out an implementation to see if it’s even feasible or want to play with an API to see how it feels before you write a ton of tests for it. You don’t want to go too long without testing that you haven’t broken something, but you also don’t want to write tests for code that you’re going to throw away in an hour anyway.
Tests are only one part of the array of techniques a developer can use to verify a product. As discussed in more detail below, a strong type system, linting, and static-code analysis of all kinds help verify a product.
We should always be aware of which parts are necessary during which phases. If certain tools take longer to verify code, consider whether they need to be executed all the time, or perhaps just when pushing to a remote, or before merging into the master branch.
If you wait until you’ve finished the product to write all of your tests, you will still have a well-tested product, but you will not have benefited from testing during development.
Being able to test as you go improves your efficiency tremendously, as you’re not constantly fighting with things that are mysteriously breaking. Instead, you’re usually able to pin the blame on the most recent change you’ve made.
A product of nontrivial complexity can be written more reliably and quickly if there are tests. It also becomes possible for one team member to write the tests while another provides the implementation that satisfies it.
The spectrum in between is where most developers live, writing tests as they go, but not always before they’ve implemented something.
It's understandable that there will always be certain tests that are difficult, if not impossible, to automate. However, the document that follows will provide some tools for extracting the testable bits from the untestable ones to increase coverage. Anything that can be tested automatically can be executed by all team members all the time, as well as by pipelines in the cloud.
You’re almost certainly already testing.
You might be clicking through the UI or emitting statements in a command-line application, but you’re verifying your code somehow. I mean … you are, right? RIGHT?
I’m kidding. Of course you’re not just writing code, building it, and committing it. You’re validating it somehow.
That’s testing.
If you're really good, you might even keep a list of these validations. Once you have a list, you can run through it whenever you change something.
This is fine, but it's still a manual process, with all of the drawbacks that entails.
Automated testing means that you codify those validations.
Don’t panic. Almost any code can be tested. In fact, if you can’t get at it with a test, then you might have found an architectural problem.
See? Automating tests will even help you write better code!
Just start somewhere. It doesn’t matter where. Don’t worry about coverage. Just get the feeling for writing a proof about a facet of your code. Any bit of logic can—and should—be tested.
What if you still don’t know where to begin? Ask someone for help! Don’t be shy. It’s in everyone’s best interest for a project to have good tests. You want everyone’s code to have tests so you know right away when you’ve broken something in a completely unrelated area. This is a good thing!
A project should provide support for mocking devices and external APIs, or for using test-specific datasets.
A reasonably fast test suite will tend to be run more often. We would like a developer to notice a broken test right after the change that broke it, preferably even before pushing it.
Tests a developer runs locally should almost always work in CI. Failing tests in CI should also fail locally.
For example,
The following questions should help you evaluate for yourself where you are on your automated-testing journey.
We never want anyone in a team to get the impression that we’re writing tests just to write tests. We write tests because they help us write better code and because it feels good to be able to prove that something that was working continues to work. You should feel more efficient and productive and feel like you’re producing higher-quality code.
How do you know when there are “enough” automated tests?
Don’t get distracted by trying to achieve a specific coverage percentage. The most important thing is that the major use cases are covered.
If software is stable and there is “only” 40% test-coverage, then maybe there is a lot of code that rarely or never gets used. In that case, you might want to think about removing code that you don't need rather than wasting time writing tests for code that never runs.
New code, though, should always have automated tests. A code reviewer should verify that new functionality is being tested.
Unit tests cover a single unit, mocking away other dependencies where needed. They are useful for verifying simple logic like calculated properties or verifying the results of service methods with given inputs.

Integration tests cover multiple units, possibly mocking unwanted dependencies. They are useful for verifying the behavior of units in composition, as they will be used in the end product. The goal is to cover as much as possible without resorting to more costly end-to-end tests.

End-to-end tests, also called UI tests, verify the entire stack for actual customer use cases. They are very useful, but generally require more maintenance as they tend to be more fragile. They are essential for verifying UI behavior not reflected in a programmatic model and can work with snapshots (e.g., that an error label is red).
The article Write tests. Not too many. Mostly integration. describes a pragmatic approach quite well. Instead of the classic “testing pyramid”, it suggests a “testing trophy”.
This style of development has the following aims:
A project should include analyzers and techniques so that the compiler helps make many tests unnecessary. For example, if you know that a parameter or result can never be null, then you can avoid a whole slew of tests.
Developers should only spend time writing tests that verify semantic aspects that can’t be proven by the compiler.
The .NET world provides many, many analyzers and tools to verify code quality. One of the most important things a project can do is to improve null-checking. The best way to do this is to upgrade to C# 8 or higher and enable null-reference analysis. The default language for .NET Framework is going to stay C# 7.3, but you can enable null-reference analysis for .NET Framework quite easily.
Another option is to use the JetBrains Annotations NuGet package, which provides attributes to indicate whether parameters or results are nullable.
The preferred way, though, is to use the by-now standard nullability-checking available in .NET.
Doing neither is not a good option, as it will be very difficult to avoid null-reference exceptions.
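As a small, hypothetical sketch of what the compiler can then take over for you (the Greeter class and its methods are invented purely for illustration):

#nullable enable

public sealed class Greeter
{
    // With non-nullable reference types, the compiler warns at every call site that
    // could pass a null name, so there is no need for a test that passes null and
    // expects an exception.
    public string Greet(string name) => $"Hello, {name}!";

    // A nullable parameter documents that callers may omit the title, and the compiler
    // checks that this method handles the null case before using it.
    public string GreetFormally(string? title, string name) =>
        title is null ? Greet(name) : $"Hello, {title} {name}!";
}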
Unit tests are very useful for validating requirements and invariants about your code.
These are the easiest tests to write and will generally be the first ones that you will write.
A requirement or an invariant may be specified in the story itself, but it can be anything that you know about the code that's important. It's up to the developer and the reviewer(s) to determine which tests are necessary. It gets easier with experience—and it doesn't take long to get enough experience so that it's no longer so intimidating.
Just as a quick example in .NET, consider the following code,
public bool IsDiagnosticModeRunning
{
    get => _isDiagnosticModeRunning;
    set
    {
        _isDiagnosticModeRunning = value;

        _statusManager.InstrumentState = value ? InstrumentState.DiagnosticMode : InstrumentState.Ready;
    }
}
Here we see a relatively simple property with a getter and a setter. However, we also see that there is an invariant in the implementation: that the _statusManager.InstrumentState is synced with it.
Using many of the techniques described below, we could write the following test:
[DataRow(true, InstrumentState.DiagnosticMode)]
[DataRow(false, InstrumentState.Ready)]
[TestMethod]
public void TestIsDiagnosticModeRunning(bool running, InstrumentState expectedInstrumentState)
{
    var locator = CreateLocator();
    var instrumentControlService = locator.GetInstance<IInstrumentControlService>();
    var statusManager = locator.GetInstance<IStatusManager>();

    Assert.AreNotEqual(expectedInstrumentState, statusManager.InstrumentState);

    instrumentControlService.IsDiagnosticModeRunning = running;

    Assert.AreEqual(expectedInstrumentState, statusManager.InstrumentState);
}
Here, we're using MSTest to create a parameterized test that verifies both of these facts.
We now have code that validates two facts about the system. Should something change where these facts are no longer true, the tests will fail, giving the developer a chance to analyze the situation.
If you’re addressing a bug-fix, though, you might be able to prove that you’ve fixed the bug with a unit test, but it’s also likely that you’ll have to write an integration test instead.
Unit tests have their place, but they are far too emphasized in the testing pyramid. The testing pyramid comes from a time when writing integration tests was much more difficult than it (theoretically) is today.
The “theoretically” above means that the ability to write integration tests as efficiently as unit tests is contingent on a project offering proper tools and support.
One common complaint about integration tests vis-à-vis unit tests is that they run more slowly. Another is that they take longer to develop. Ideally, a project provides support to counteract both of these tendencies. To this end, a project should offer base and support classes that make common integration tests easy to set up and quick to execute.
There are many different ways to solve this problem, each with tradeoffs. For example, a project can load dependencies in Docker containers, either created and started manually (see Testing your ASP.NET Core application − using a real database) or even dynamically with a tool like the Testcontainers NuGet package.
A drawback to unit tests is that, while they can test an individual component well, it’s really the big picture that we want to test. We want to test scenarios that correspond to actual use cases rather than covering theoretical call stacks. It’s not that the second part isn’t important, but that it’s not as important.
Given limited time and resources, we would prefer to have integration tests that also cover a lot of the same code paths that we would have covered with unit tests, rather than to have unit tests, but few to no integration tests.
This, however, leads directly to…
The advantage of a unit test over an integration test is that when it fails, it’s obvious which code failed. An integration test, by its very nature, involves multiple components. When it fails, it might not be obvious which sub-component caused the error.
If you find that you have integration tests failing and it takes a while to figure out what went wrong, then that’s a sign that you should bolster your test suite with more unit tests.
Once an integration test fails and one or more unit tests fail, then you have the best of both worlds: you’ve been made aware that you’ve broken a use case (integration test), but you also know which precise behavior is no longer working as before (unit test).
Testing code is just as important as product code. Use all of the same techniques to improve code quality in testing code as you would in product code. Clean coding, good variable names, avoiding copy/paste coding—all of it applies just as much to tests.
There are two main differences:
This is a big, big topic, of course. There are a few guidelines that make it easier to write tests—or to avoid having to write tests at all.
As noted above, code that can be validated by the compiler (static analysis) doesn't need tests. E.g., you don't have to write a test for how your code behaves when passed a null parameter if you just forbid it. Likewise, you don't have to re-verify that types work as they should in statically typed languages. We can trust the compiler.
Here are a handful of tips.
See the following articles for more ideas.
Investigate your testing library to learn how to write multiple tests without having to write a lot of code. In the MSTest framework, you can use DataRow to parameterize a test. In NUnit, TestCase does the same thing, and Values allows you to provide parameter values for a list of tests that are the Cartesian product of all values.
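For example, an NUnit test along these lines would run once for each combination of its parameters (four times here); the DiscountCalculator type is hypothetical:

[Test]
public void CalculatesDiscountForAllCombinations(
    [Values(true, false)] bool isStudent,
    [Values(true, false)] bool isHappyHour)
{
    // NUnit generates one test per combination of the [Values] parameters.
    var discount = DiscountCalculator.Calculate(isStudent, isHappyHour);

    Assert.That(discount, Is.InRange(0.0m, 0.3m));
}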
Use mocks or fakes to exclude a subsystem from a test. What would you want to exclude? While you will want to make some tests that include database access or REST API calls, there are a lot of tests where you’re proving a fact that doesn’t depend on these results.
For example, suppose a component reads its configuration from the database by default. A test of that component may simply want to see how it reacts with a given input to a given method. Where the configuration came from is irrelevant to that particular test. In that case, you could mock away the component that loads the configuration from the database and instead use a fake object that just provides some standard values.
Another possibility is to fake an external service to see how your code reacts when the service returns an error or an ambiguous response. Without mocks, how would you test how your code reacts when a REST endpoint returns 503 or 404? Without a mock, how would you force the purely external endpoint to give a certain code? You really can’t. With a mock, though, you can replace the service and return a 404 response for a specific test. This is quite a powerful technique.
As noted above, it's much, much easier to use fake objects if you've consistently used interfaces. You can just create your own implementation of the interface whose standard implementation you want to replace, give it a fake implementation (e.g., returning false, empty strings, and null for methods and properties), and then use that class as the implementation.
If you have interfaces that perform a single task (single-responsibility principle), then it doesn’t take too much effort to write the fake object by hand. However, it’s much easier to use a library to create fake objects—and there are other benefits as well, like tracking which methods were called with which parameters. You can assert on this data collected by the fake object.
For .NET, a great library for faking objects is FakeItEasy.
With a fake object, you can indicate which values to return for a given set of parameters without too much effort. Similarly, you can use the same API to query how often these methods have been called. This allows you to verify, for example, that a call to a REST service would have been made. This is a powerful way of proving facts about your code without having to actually interact with external services.
The following code configures a fake object for ITestUnitConfigurationService that returns default data for all properties, except for Configuration and GetTestUnitParameterValues(), which are configured to return specific data.
private static ITestUnitConfigurationService CreateFakeTestUnitConfigurationService()
{
    var result = A.Fake<ITestUnitConfigurationService>();
    var testUnitParameters = CreateTestUnitParameters();
    var testUnitConfiguration = new TestUnitConfiguration(testUnitParameters);

    A.CallTo(() => result.Configuration).Returns(testUnitConfiguration);

    var testUnitParameterValues = CreateTestUnitParameterValues();

    A.CallTo(() => result.GetTestUnitParameterValues()).Returns(testUnitParameterValues);

    return result;
}
In the test, we could get this fake object back out of the IOC (for example) and then verify that certain methods have been called the expected number of times.
var testUnitConfigurationService = locator.GetInstance<ITestUnitConfigurationService>();
A.CallTo(() => testUnitConfigurationService.Configuration).MustHaveHappenedOnceExactly();
A.CallTo(() => testUnitConfigurationService.GetTestUnitParameterValues()).MustHaveHappenedOnceExactly();
You can avoid writing a ton of assertions and a ton of tests with snapshot testing.
For example, imagine you have a test that generates a particular view model. You want to verify 30 different parts of this complex model.
You could navigate the data structure, asserting the 30 values individually.
That would be pretty tedious, though, and lead to fragile and hard-to-maintain testing code.
Instead, you could emit that structure as text and save it as a snapshot in the repository. If a future code change leads to a different snapshot, the test fails and the developer that caused the failure would have to approve the new snapshot (if it’s an expected or innocuous change) or fix the code (if it was inadvertent and wrong).
The upside is that large swaths of assertions are reduced to a simple snapshot assertion. The downside is that the test might break more often for spurious reasons. Generally, you can avoid these spurious failures by being judicious about how you format the snapshot.
See the documentation for the Snapshooter NuGet package.
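With Snapshooter, such a test could look roughly like the following sketch; CreateOrderViewModel() stands in for whatever builds the complex model in your code:

[TestMethod]
public void OrderViewModelMatchesSnapshot()
{
    var viewModel = CreateOrderViewModel();

    // The first run writes a snapshot file next to the test; subsequent runs fail
    // with a diff if the serialized view model no longer matches the stored snapshot.
    Snapshot.Match(viewModel);
}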
There have been many solutions to the problem of automated testing of web UIs over the years. The one many know is Selenium, but tools like Cypress, TestCafe, Puppeteer, and Playwright have largely replaced it. The WebdriverIO library is another option in this space.
Before choosing a tool, you’ll want to consider what your requirements are:
The current front-runner for end-to-end testing is Playwright, an open-source cross-browser, cross-platform, cross-language testing framework.
This pattern is particularly useful when you have a bunch of steps to execute. Instead of executing the steps as you go, you build a plan that describes how those steps would be executed and return that as the result of the planner phase. You can test this plan very easily without worrying about how to mock away the mutating part of the code.
For example, suppose you want to sync an online data source with a local configuration. The classic way would be to do something like the following:
var items = GetItemsFromServer();

foreach (var item in items)
{
    var itemData = GetItemDataFromServer(item);

    if (string.IsNullOrEmpty(itemData.Text))
    {
        SetStandardText(item, itemData);
        SaveItemToServer(item);
    }
}
With so little logic, there’s really no way to question this setup, is there? But think about what happens if there are more decisions to make, more data to retrieve, more data to update on the server. As this logic increases in complexity, the mutating code becomes ever more deeply embedded in read-only logic. That read-only logic ends up being the lion’s share of the code that you want to test, but you have to step very lightly to avoid making changes on the server. You can, of course, mock away services, to make sure that nothing is communicated back to the server, but there is another way.
What if you were to consider the set of operations as phases?
This approach has several advantages:
Once again, we have a pattern that not only makes testing easier, but it makes the entire architecture more robust, opening up possibilities that you wouldn’t have with the straightforward pattern (which would be harder to test).
To finish up this section, let’s take a quick look what that could look like in pseudocode.
var items = GetItemsFromServer();
var commands = new Commands();

foreach (var item in items)
{
    var itemData = GetItemDataFromServer(item);

    if (string.IsNullOrEmpty(itemData.Text))
    {
        var command = CreateCommand(
            $"Set standard text for {item}",
            () =>
            {
                SetStandardText(item, itemData);
                SaveItemToServer(item);
            }
        );

        commands.Add(command);
    }
}

// Present commands to the user; store the commands for later, or execute them…
// This is where tests would verify the commands generated from a given set of
// item data.

foreach (var command in commands)
{
    try
    {
        command.Apply();
    }
    catch
    {
        // Log error and continue?
    }
}
Instead of executing the command immediately, we store what we would want to do with a closure and a description. We can do whatever we want with those commands; executing this is one option, but you can see how useful it would also be for verifying that the logic is correct in tests.
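The Commands collection and the CreateCommand helper are left undefined in the pseudocode above; a minimal way to flesh them out might be:

using System;
using System.Collections.Generic;

// A command pairs a human-readable description with the closure that applies it.
public sealed record Command(string Description, Action Apply);

// The "plan" is just an ordered list of commands.
public sealed class Commands : List<Command>
{
}

With these types, CreateCommand reduces to new Command(description, action), and the final loop can log each Description before invoking Apply().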
Published by marco on 18. Jan 2023 10:09:22 (GMT-5)
It is currently not possible to hide individual folders or files in an Azure DevOps Code Wiki. Folders and files beginning with a . are hidden by default, but you can't influence the structure other than by reordering pages with a .order file in an individual folder.
The topic Hide folders that do not contain Markdown files (Microsoft Developer Community) discusses extending this functionality.
I replied with the following:
There are a lot of good suggestions here.
Changing the name of the folder or file in order to hide it (e.g., by prepending the name with .) is not a practical solution. Wikis based on, e.g., .NET solutions cannot just change the names of folders that would be empty in the Wiki.
Although I think that hiding empty folders by default seems like a good idea, I also understand that clicking an empty folder shows the UI that allows a user to create a page for an empty folder, so hiding that folder would also remove functionality from the online UI.
I think that many code-based Wikis wouldn’t mind losing this functionality, but we probably need a top-level Code Wiki option here where you can decide whether to show or hide empty folders by default.
That takes care of the default behavior, which would cover a lot of use cases for “cleaning up” the wiki’s structure.
However, if you elect not to hide folders by default, or if you just want to hide another file or folder, how can we support that requirement? I would suggest two mechanisms:
- A .wikiignore file that allows globbing à la Git would be powerful (e.g., it would allow you to ignore all Properties folders in all project folders in .NET solutions).
- Extending the .order file to support !, which would hide the folder or file from being displayed. This feature would technically also cover all use cases covered by a .wikiignore file, but would involve quite a bit more work to support (i.e., you would have to add a .order file to every Properties folder instead of just configuring it once, in a root file).
Published by marco on 15. Jan 2023 11:10:38 (GMT-5)
In the article Why tuples in C# are not always a code smell by Dennis Frühauff (dateo. Coding Blog), the author writes the following code for calculating a discount.
The requirements are as follows:
- Premium customers get 20% off.
- Gold customers get 30% off.
- Regular customers, when they are students (< 25 years), get 10% off.
- Regular adult customers get no discount.
- All regular customers get 15% off during happy hour (3 to 8 p.m.).
public decimal CalculateDiscount(Customer customer, DateTime time)
{
    if (customer.CustomerType == CustomerType.Gold)
    {
        return 0.3m;
    }
    else if (customer.CustomerType == CustomerType.Premium)
    {
        return 0.20m;
    }
    else
    {
        if (time.Hour is > 15 and < 20)
        {
            return 0.15m;
        }
        if (customer.Age < 25)
        {
            return 0.1m;
        }
        else
        {
            return 0m;
        }
    }

    return 0m;
}
He doesn’t like this code, and neither do I. But we have different reasons.
The author rewrites the code above with pattern-matching, to make it “pretty much look like the business rules stated above”.
His final version looks like this:
public decimal CalculateDiscount(Customer customer, DateTime time)
{
    return (IsStudent(customer), IsHappyHour(time), customer.CustomerType) switch
    {
        (_, _, CustomerType.Gold) => 0.3m,
        (_, _, CustomerType.Premium) => 0.2m,
        (_, true, CustomerType.Regular) => 0.15m,
        (true, false, CustomerType.Regular) => 0.10m,
        (false, false, CustomerType.Regular) => 0.0m
    };
}

public bool IsStudent(Customer customer) => customer.Age < 25;

public bool IsHappyHour(DateTime datetime) => datetime.Hour is > 15 and < 20;
I strongly disagree that this looks like the original business requirements. In order to figure out who gets a 15% discount, you have to figure out what the first two boolean fields of the tuple indicate, so you look at the ad-hoc-instantiated tuple (which is created only in order to pattern-match on it), where you can see from the local-method names that they indicate whether the customer is a student and whether the sale was made during happy hour, respectively.
I have a few issues with this version; for one, the switch over an ad-hoc tuple (with all of those _ placeholders) makes it look difficult to maintain.

I would tackle this differently, and with classic means. First of all, my main problem with the original version is that it's made unnecessarily long and cluttered by including else statements after returns. Get rid of those and you'll get rid of indenting and, all of a sudden, the original code looks remarkably legible. It's also 100% clear that there are no allocations and we don't have to worry our pretty heads about the efficiency of code generated for either if and return statements or for simple comparisons.
public decimal CalculateDiscount(Customer customer, DateTime time)
{
    if (customer.CustomerType == CustomerType.Gold)
    {
        return 0.3m;
    }

    if (customer.CustomerType == CustomerType.Premium)
    {
        return 0.20m;
    }

    if (time.Hour is > 15 and < 20)
    {
        return 0.15m;
    }

    if (customer.Age < 25)
    {
        return 0.1m;
    }

    return 0m;
}
How much clearer would you like that to be? I suppose we could add some local methods to add some semantics to the comparisons.
public decimal CalculateDiscount(Customer customer, DateTime time)
{
    if (IsLevel(CustomerType.Gold))
    {
        return 0.3m;
    }

    if (IsLevel(CustomerType.Premium))
    {
        return 0.20m;
    }

    if (IsHappyHour())
    {
        return 0.15m;
    }

    if (IsStudent())
    {
        return 0.1m;
    }

    return 0m;

    bool IsLevel(CustomerType customerType) => customer.CustomerType == customerType;
    bool IsStudent() => customer.Age < 25;
    bool IsHappyHour() => time.Hour is > 15 and < 20;
}
To make up for the fact that we lost all of that delicious pattern-matching and those tuples from the author's version, we're using local methods. Is this an improvement? Overall, I think so. The first version was already pretty good, but now we've improved the semantics by taking the guesswork out of the magic numbers. The IsHappyHour method is definitely an improvement. The IsStudent method also imparts more knowledge about what the magic age of 25 means. Also, we've managed to separate the calculation of the rebate from the determination of the conditions that affect the rebate.
Can we do anything with pattern-matching, though? Can we use pattern-matching in a way that’s more legible than the version proposed by the author?
What about this?
public static decimal CalculateDiscount(this Customer customer, DateTime time)
{
    return (customer, time) switch
    {
        ({ CustomerType: CustomerType.Gold }, _) => 0.3m,
        ({ CustomerType: CustomerType.Premium }, _) => 0.2m,
        (_, { Hour: > 15 and < 20 }) => 0.15m,
        ({ Age: < 25 }, _) => 0.1m,
        _ => 0m
    };
}
OK. That’s not as bad as the author’s version. It doesn’t allocate a tuple just to be able to use a tuple, for starters. But is it more legible than the previous version? Not at all. We could, of course, improve the formatting to align all of the return statements, but that’s also no fun to maintain.
The real issue with the pattern-matching solution is that we can no longer use local functions to improve semantics. The only thing we could do would be to add an IsStudent property directly to the class (extension properties are still being discussed (GitHub)). We cannot improve the semantics of the pattern-matching on DateTime because that type is not under our control.
In conclusion, as with anything else in programming, you should be judicious in where you use the new and shiny features, always considering whether they’re actually helping improve your code.
Published by marco on 11. Jan 2023 21:21:28 (GMT-5)
I’ve seen this Noob question: Does anyone use things like git gui? by Collekt (Reddit) again and again.
“Just curious as I’m learning and getting familiar with git. Do real production teams use any kind of tools for git like “git gui” or others? Or does everyone just use it from command line? Thanks for any insight. :)”
You almost certainly have several use cases for your source control:
The command-line isn’t the most efficient or least error-prone for any of these tasks.
For example—something you do every day—a good GUI client will let you very quickly navigate diffs in your working tree with only a few arrow-key presses. You can’t beat that with the command line.
And, once you have to merge … you’ll want a more powerful view on things than you’re going to get from command-line tools. Of course, it’s possible to merge on the command-line! I’m just saying it’s more error-prone and not as efficient—especially for most developers. There are probably a couple of John Henrys out there, but c’mon.
It’s great that the command-line exists! It allows us to build UIs on top of it. It allows us to integrate anything we’d like into a headless process like CI/CD.
However, you’re going to be more efficient with a good GUI. There are pros/cons to the various UIs. I’ve landed quite firmly on SmartGit after an evaluation of all of the other tools (in no particular order: Tower, VS, VSCode, GitLens, Kraken, GitExtensions, GitHub Desktop, SourceTree, Git GUI).
Why an external rather than an integrated Git client?
Why an integrated rather than external Git client?
You can use both, of course! Use whatever helps you be more accurate and efficient and happy.
Visual Studio Code’s default source control is very limited (no code forensics to speak of), so be careful of defaulting to that one. Visual Studio is getting better all the time, though. Still feels a bit weird for me, but it’s 10x better than it was a couple of versions ago.
Of course, YMMV, but please don't continue to believe in the myth that using a command line is somehow a requirement for being a “real” developer. Developers who only use the command line are probably wasting time, probably making mistakes they shouldn't, and almost certainly missing out on powerful enhancements to their workflow.
Published by marco on 11. Dec 2022 22:53:38 (GMT-5)
The article ”Thousand” Values of CSS by Karl Dubost (Otsukare) clarifies the definitions for the various types of value in CSS. While there aren’t a thousand different kinds of value in CSS, there are quite a few. Each has its raison d’être.
The article is informative, but lists the values in what I consider to be an unintuitive order. I’ve changed the order and consolidated a bit. Each term links to the W3C documentation [1] and each definition starts with the official description, a layman’s translation, and a simple code example.
Click to jump to the definition or read them in order to learn how they build on each other.
“Each property has an initial value, defined in the property’s definition table. ”
I.e. the initial value could also be called the default value, as defined in the specification.
p {
  /* the initial value of color is black */
}
“Each property declaration applied to an element contributes a declared value for that property associated with the element.”
I.e. the declared value is the one that you’ve directly assigned to a property in a CSS element.
p {
  color: red; /* declared value is red */
}
“The cascaded value represents the result of the cascade: it is the declared value that wins the cascade (is sorted first in the output of the cascade). If the output of the cascade is an empty list, there is no cascaded value.”
I.e. the cascaded value is the declared value that sorts first in the list generated by the cascade of declared values that apply to that element.
p {
  color: red; /* declared value is red */
}

p {
  color: green; /* declared and cascaded value is green */
}
“The specified value is the value of a given property that the style sheet authors intended for that element. It is the result of putting the cascaded value through the defaulting processes, guaranteeing that a specified value exists for every property on every element.”
I.e., the specified value is the cascaded value, or the default value for that property, if there are no cascaded values.
p {
  color: red; /* declared value is red */
}

p {
  color: green; /* declared, cascaded, and specified value is green. */
  /* Also, the specified value for, e.g., margin-left is 0
     because that's the default, and no value was specified. */
}
“The computed value is the result of resolving the specified value as defined in the “Computed Value” line of the property definition table, generally absolutizing it in preparation for inheritance.”
I.e., the computed value is the specified value, but converted to absolute units (e.g., 2em converts to 32px if the font-size is 16px), or to a special value like auto.
html {
  font-size: 16px;
}

p {
  font-size: 2em;
  /* declared, cascaded, and specified value are 2em,
     but the computed value is 32px. */
  /* The computed value of width is auto because there is no declared
     value, so the specified value is the initial value. */
}
“The used value is the result of taking the computed value and completing any remaining calculations to make it the absolute theoretical value used in the formatting of the document.”
I.e., the used value is the computed value, but special values are converted based on context. E.g., a computed value of width: auto will have a used value of width: 100px if the parent container is 100px wide.
body {
  width: 100px;
}

p {
  width: auto;
  /* declared, cascaded, specified, and computed value are auto,
     but the used value is 100px. */
}
“A used value is in principle ready to be used, but a user agent may not be able to make use of the value in a given environment. For example, a user agent may only be able to render borders with integer pixel widths and may therefore have to approximate the used width. Also, the font size of an element may need adjustment based on the availability of fonts or the value of the font-size-adjust property. The actual value is the used value after any such adjustments have been made.”
I.e., the actual value is the used value, but adjusted as necessary for the output device.
p {
  border-width: 1.1px;
  /* declared, cascaded, specified, computed, and used value are 1.1px,
     but the actual value is 1px. */
}
Despite the name, the value returned by the getComputedStyle() method will be either the computed or the used value, depending on the type of property. The result of this method is called the resolved value.
body {
  width: 100px;
}

p {
  width: auto;
}

const p = document.querySelector('p');
const resolvedValue = window.getComputedStyle(p).width;
/* resolvedValue == "100px" */
Published by marco on 4. Dec 2022 22:11:39 (GMT-5)
As software developers, we are constantly making the decision between make or buy.
Deciding to make something carries with it the obligation to design, develop, test, document, and support it. You’ll have everything under your control, but you’ll also have to do everything yourself.
If a component is not part of your project’s core functionality, then it’s often a good idea to look around and see if you can find someone who’s already built that functionality. Optimally, the component you find will be free and open-source and will have been built by a team whose aim was to provide exactly that functionality.
Because they’ve focused on their task, it’s more likely to be a robust solution to your problem than what you would write yourself (focused, as you hopefully are, on your task). Their solution might go a bit too far (see “Size/Focus”), but that might be fine too (see “Extensibility”).
Is the component good, though? What do we mean by “good”? How can we tell? How do we go about sizing up a dependency?
The following table outlines various facets to consider.
I include my own notes below.
Published by marco on 21. Nov 2022 22:48:57 (GMT-5)
Updated by marco on 7. Dec 2022 22:47:43 (GMT-5)
The articles Twelve C# 11 Features by Oleg Kyrylchuk and Welcome to C# 11 by Mads Torgersen (Microsoft .NET Blog) provide an excellent overview with examples of new features in C# 11, available with .NET 7.0.
I include my own notes below.
“Obvious” to me, at least. The terms link to examples in one of the articles linked above.
Add u8 to the end of a literal string to make it UTF-8 instead of the system-standard UTF-16. For example, “Test string”u8 will be encoded by the compiler as UTF-8 and will have the type ReadOnlySpan<byte>.
With raw string literals, you can finally just paste formatted and indented JSON into C# code, interpolate some variables, and do it all without escaping anything! [1]
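Here is a minimal sketch of my own (not from either article), assuming C# 11 on .NET 7, that combines both features: a UTF-8 literal and an interpolated raw string literal.

using System;

class LiteralsDemo
{
    static void Main()
    {
        // The u8 suffix makes the compiler emit UTF-8 bytes; the type is ReadOnlySpan<byte>.
        ReadOnlySpan<byte> utf8 = "Test string"u8;
        Console.WriteLine(utf8.Length); // 11 bytes

        var name = "world";

        // Raw string literal: quotes and braces need no escaping.
        // The $$ prefix means interpolation uses double braces.
        var json = $$"""
            {
                "greeting": "Hello, {{name}}!",
                "indented": true
            }
            """;
        Console.WriteLine(json);
    }
}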
“In fact .NET 7 comes with a new namespace System.Numerics chock-full of math interfaces, representing the different combinations of operators and other static members that you’d ever want to use. […] All the numeric types in .NET now implement these new interfaces – and you can add them for your own types too! So it’s now easy to write numeric algorithms once and for all – abstracted from the concrete types they work on – instead of having forests of overloads containing essentially the same code.”
See here for an example of using generic parameters in operators, or Generic Math for an example that uses some of the new interfaces, like IAdditionOperators and ISubtractionOperators.
In that vein, there are a lot more interfaces that support generalized computation, like ISpanParsable<TSelf> Interface, which “[d]efines a mechanism for parsing a span of characters to a value.”
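As a quick illustration, here is a minimal sketch of my own (the helper name is made up) that sums any numeric type via INumber<T>:

using System.Collections.Generic;
using System.Numerics;

static class MathHelpers
{
    // Works for int, long, double, decimal, Half, BigInteger, ...
    public static T Sum<T>(IEnumerable<T> values) where T : INumber<T>
    {
        var total = T.Zero;
        foreach (var value in values)
        {
            total += value;
        }

        return total;
    }
}

// MathHelpers.Sum(new[] { 1, 2, 3 });    // 6
// MathHelpers.Sum(new[] { 1.5, 2.25 });  // 3.75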
“Another ongoing theme that we’ve been working on for several releases is improving object creation and initialization. C# 11 continues these improvements with required members.”
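For example, a minimal sketch (the type and property names are my own) of what required members look like:

public class Person
{
    public required string Name { get; init; }
    public required string Email { get; init; }
    public int? Age { get; init; }
}

// OK: all required members are set in the object initializer.
// var ada = new Person { Name = "Ada", Email = "ada@example.com" };

// Compile-time error: required members 'Name' and 'Email' are not set.
// var nobody = new Person();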
Generic attributes: [Generic<MyType>] declares an attribute of type GenericAttribute parametrized with MyType.
Extended nameof scope: you can now use nameof with “method parameter[s] in an attribute on the method or parameter declaration.”
StringSyntaxAttribute: where syntax-aware highlighting and validation used to be limited to a handful of well-known APIs (e.g. RegEx or DateTime.Format), this is a welcome standardization that gives your own APIs the same star treatment. The post What does the StringSyntaxAttribute do? includes a list of the syntaxes supported out-of-the-box. The post StringSyntaxAttribute for syntax highlighting provides examples and screenshots.
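Here is a minimal sketch of my own showing the attribute on a hypothetical logging API (the method is made up; the attribute and its constants live in System.Diagnostics.CodeAnalysis in .NET 7):

using System;
using System.Diagnostics.CodeAnalysis;

public static class Log
{
    // Tells tooling that the argument is a composite format string,
    // so editors can colorize and validate it at the call site.
    public static void Write(
        [StringSyntax(StringSyntaxAttribute.CompositeFormat)] string format,
        params object[] args)
    {
        Console.WriteLine(string.Format(format, args));
    }
}

// Log.Write("Started at {0}", DateTime.Now);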
A few that seem a bit dubious but are, I guess, welcome additions that will be useful to someone:
- List patterns: numbers is [_, >= 2, _, _] returns true if numbers is a four-element list where the second element is greater than or equal to 2.
- The System.Numerics interfaces and the increased generality offered by abstracting over static members (linked above).
- Source-generated regular expressions: this feature leverages the source generation that’s been available since .NET 5 to avoid JIT for regular expressions by generating code for them directly. It’s really great to see the .NET team getting mileage out of the features they’re adding (I’m sure this isn’t a coincidence).
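A minimal sketch of my own (the class and pattern are made up) of what the source-generated approach looks like in C# 11 / .NET 7:

using System.Text.RegularExpressions;

public static partial class Validators
{
    // The source generator emits the matching code at compile time,
    // instead of interpreting or compiling the pattern at runtime.
    [GeneratedRegex(@"^\d{4}-\d{2}-\d{2}$")]
    public static partial Regex IsoDate();
}

// Validators.IsoDate().IsMatch("2022-11-21");  // true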
For another example of source-generation, see Generating PInvoke code for Win32 apis using a Source Generator by Gérald Barré (Meziantou's Blog), which explains how to use Microsoft’s NuGet package Microsoft.Windows.CsWin32
to easily generate source for any Win32 API or type—no more writing this stuff manually!
Check out the following animation of converting an escaped string to a raw string in Rider (from the post Rider 2022.3: Support for .NET 7 SDK, the Latest From C#11, Major Performance Improvements, and More! by Sasha Ivanova (The .NET Tools Blog)):
Published by marco on 20. Sep 2022 21:45:20 (GMT-5)
The article Agile Projects Have Become Waterfall Projects With Sprints by Ben Hosking (ITNext) argues that a lot of projects using agile aren’t agile at all, but are “more like waterfall projects with upfront requirements, fixed deadlines, sprints and 2 weekly demos.”
Overall, I understand where the author is coming from, but I found the tone pretty overwhelmingly negative. I can only imagine what the author has seen to have put them in such a dark place. 😐
I thought that this was an interesting comment in the article:
“You cannot create fixed deadlines unless you know all the requirements and guarantee no requirements are changed.”
However, you can create fixed deadlines (the world kind of expects them sometimes, e.g. when you’re preparing for a conference that happens on a specific day), but then you have to be willing to adjust what will be delivered on that day.
Agile started out in a world where a partial product could be delivered and still have value. That is not the case with all projects. Thus, the designations MVP (Minimum Viable Product) and MMP (Minimum Marketable Product).
Even agile projects have to be honest about what the minimum time frame is for an MVP, though. Where some projects have an advantage is that they can iterate in smaller increments after that, but also can deliver useful, though nonviable pieces as artifacts of iterations. There are some projects where it’s more difficult to carve out such deliverables.
Although there is always work that has been planned and successfully accomplished and documented, it’s sometimes hard to measure or see progress until a larger amount of work has been done. I suppose that’s the art of planning and measuring.
Here, it’s also useful for technical team members or more technically oriented teams to learn how to consider administrative, planning, design, and documentation work as just as useful as producing technical artifacts (be they physical or virtual).
A waterfall process doesn’t help figure out what to do when the delivery cannot be completed on time. It (generally) has no plan for what to drop if you can’t deliver on time. Also, it doesn’t really have any ideas for what to do when new things “crop up”. An agile process is supposed to help you triangulate toward a version of the product that can actually be delivered by the target date—or help you better (and sooner) predict whether it’s even possible to deliver anything useful by that date.
I think you have to be honest about which projects really can be run in an agile way—but then also make sure that they take advantage of agility to be bolder than they have been.
Release early, release often, think about what your MVP is, all of those things are good to take from the agile process. As far as the “ceremony” of the process goes: I have always found value in the review and retros.
Published by marco on 2. Sep 2022 04:28:25 (GMT-5)
Someone asked is there a js library that animates the text word by word like shown? by DemDavors (Reddit).
A bunch of people answered “just do it with CSS!” and one or two recommended using GSAP (Green Sock Animation Platform). I’d just heard about that library in the following instructive video and had had a chance to investigate how it works.
I’d like to expand on the comments recommending to use the “rule of least power”. They are absolutely correct, but you have to consider the entire task:
For those who already know how to do this and are trying to limit JS as much as possible then, by all means, use CSS only.
For anyone else, “least power” means using CSS where possible, but not necessarily excluding JS if doing so improves maintainability, enhances developer speed and accuracy, and reduces errors.
If you look at what GSAP does, it generally maps a high-level JS animation API to CSS animations and transitions. The concession you’ve made is to include animations using a relatively thin layer of JavaScript. That thin layer, though, is a change in technology (more power), which ensures that the animations will no longer work if JavaScript is disabled. However, you’re actually using CSS animations under the hood, benefiting from the high-level and highly optimized implementations in the browser. So you’ve lost flexibility as far as user agents are concerned, but the performance is the same, and you’ve probably saved time debugging and tweaking the implementation.
That might be a better balance for those developers who would have no idea how to animate the given example with native CSS. If they did that, they would have to first learn how to do it, taking up a lot of time, to say nothing of the fact that they might end up creating a suboptimal implementation, both performance- and maintenance-wise.
Telling someone to “just use CSS” is technically correct, but also sounds a lot like answering “just use pipes” when someone asks how to install a toilet. There’s a bit of detail missing there.
Published by marco on 2. Sep 2022 04:27:24 (GMT-5)
I recently answered the question What features from other languages would you like to see in C#? by BatteriVolttas (Reddit)
I think Anchored Declarations and Qualified Anchored Declarations from Eiffel would be very useful.
I like the name “anchored” because you’re anchoring the type of one thing to another. Instead of using int throughout a class, you can just make e.g. a field named _id be an int and then make all other types (e.g. for the parameter passed to a method) refer to the anchor with like _id or typeof _id.
If the type of the field ever needs to change, you only need to update one place. It’s more expressive because the alternative is to explicitly write the type of the parameter, whereas that was never what was going on. The method doesn’t decide what the type is; we’re just used to syncing it to the type of the field manually because there is no way to express the relationship in most languages we’re using.
Here’s an example:
class A
{
int Status { get; set; } = 0;
like Status PriorStatus { get; }
void Start(like Status s) {}
void Stop(like Status s) {}
}
The syntax is similar to how ref
and out
work now, but looking at it takes a bit of getting used to, especially for the property declaration.
TypeScript has this feature, with the typeof operator, but they don’t name it. TypeScript has two advantages here: it places the type after the variable name, which feels a bit more natural when the type is expressed with multiple words, and TypeScript has implicit return types, so you don’t have to write the type at all in many cases.
Because of the implicit typing, TypeScript has technically had anchored types all along!
class A
{
  status: number = 0;

  // The implicit return type here is derived from "status",
  // which "anchors" the type of the getter to that field.
  get priorStatus() { return this.status; }

  // Here we're obliged to constrain the type explicitly.
  start(s: typeof this.status) {}
  stop(s: typeof this.status) {}
}
As of TypeScript 4.7, it supports qualified anchored declarations on private fields as well.
Someone suggested in a response that generics might fill this bill already.
In a way, yes, that’s true. I could define the whole class with a generic type argument and then create a derived type that fixes the type argument to int.
class A<TStatus>
  where TStatus : INumber<TStatus>
{
  TStatus Status { get; set; } = TStatus.Zero;

  TStatus PriorStatus { get; }

  void Start(TStatus s) {}
  void Stop(TStatus s) {}
}

class IntA : A<int> {}
We have to use the newest features from C# 11 in order to be able to initialize the value to 0. If it were a value that doesn’t map to a mathematical concept (like the additive or multiplicative identity), then we wouldn’t be able to use the generic approach.
It feels a bit like misuse of generics, though, when I just wanted a shorthand for letting one type reference another. As I wrote, TypeScript already allows this and seems to have found it a useful addition to generics (you can probably implement it under-the-hood with the same code in the compiler).
I feel the same way about the missing type declaration from TypeScript (or the very similar, but less powerful typedef from C or Pascal).
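For what it’s worth, the closest existing C# equivalent I know of is a using alias, which is file-scoped and much less powerful than a TypeScript type declaration. A minimal sketch (the names are made up):

using CustomerId = System.Int32;
using CustomerLookup = System.Collections.Generic.Dictionary<int, string>;

public static class Orders
{
    // Callers see the intent (CustomerId), but the alias doesn't leave this file.
    public static string FindName(CustomerLookup customers, CustomerId id)
        => customers.TryGetValue(id, out var name) ? name : "unknown";
}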
Published by marco on 21. Mar 2022 22:50:55 (GMT-5)
The article When to Avoid the text-decoration Shorthand Property by Šime Vidas (CSS Tricks) makes a couple of interesting points. Basically, you have a lot of control over how underlines are drawn on text.
- Combine text-decoration: underline (MDN) with text-decoration-thickness (MDN) and text-underline-offset (MDN).
- Use :any-link (MDN) to select links that actually have an href attribute rather than selecting all links.
- text-decoration is a shorthand property, which means that setting it overwrites all of the properties that it might represent (including the underline thickness). (MDN)

The article doesn’t mention these, but,
- text-decoration-skip (MDN) controls how to underline whitespace.
- text-decoration-skip-ink (MDN) controls whether a text decoration (underline or overline) can touch the ascenders or descenders of glyphs.

The following text has the style text-decoration: underline .4em; text-underline-offset: .4em. Note that it doesn’t affect the bounding box.

The following text has the style text-decoration: underline; text-decoration-skip: spaces; text-decoration-skip-ink: all. Note that text-decoration-skip only works with Safari at the time of writing.
Published by marco on 24. Jan 2022 17:20:05 (GMT-5)
Have you noticed that there is more and more content available to help you learn how to program? For every topic under the sun, there seems to be a blog article or video of superficially reasonable quality. For every question on StackOverflow, there’s an effusive answer with examples.
This is all pretty great, honestly.
However, with the increase in content, there is also the need to be able to wade through it.
How old is that StackOverflow answer? How appropriate is the answer to your particular question? Are there other solutions? Maybe easier ones? Maybe more modern ones? Has this solution to this particular problem been addressed in more recent versions? This isn’t new, of course. You should have been asking yourself questions like this for quite a while with these so-called expert-community sites.
However, now, we’re also inundated with content from people hustling to make a living as professional, freelance, advice-givers online. This is not a bad thing, necessarily. It’s great that the unsung masters that formerly only provided value inside of a single company are bringing their didactic abilities to the world. That’s not all that they’re doing, though.
Those who are on a subscriber model have to publish content in order to keep their subscribers. They don’t even necessarily have to produce anything of lasting value—they just have to produce something. They just have to retain and/or grow their subscriber base. This leads to nice-looking, but ultimately useless “fluff” content that rehashes an old concept with a few flashy graphics or an accompanying video. And the videos! Many of them take 15 minutes to explain a concept that you could describe adequately in a paragraph and a code example.
The Microsoft MVP bloggers are very conspicuous these days: there are many who are publishing an article or two per week “explaining” a C# 10 feature that has already been explained to death in dozens of other high-profile articles—to say nothing of the article Welcome to C# 10 by Kathleen Dollard (Microsoft Dev Blogs), which comes straight from the horse’s mouth, is wonderfully written, and, honestly, says all there needs to be said about these features.
But, if you search for “C# 10”, there is a flood of repetitive and, sometimes, outdated, information on C# 10. And these authors are all still churning out the articles. They’re doing it for the clicks, for the ad-views, for the subscribers. It’s a living. I get it. But, overall, it contributes to a very muddled picture that makes it difficult for people looking for advice and assistance.
Published by marco on 22. Jan 2022 12:22:44 (GMT-5)
If you want to test or hone your CSS skills, check out the CSS Speedrun. It lets you warm up with a relatively easy “intro”, then takes you through ten levels. Generally, each level tests a different feature of CSS (usually a specific selector). The final question (pictured) makes you combine what you’ve learned or used from other levels.
The image below is from my second time through. The first time through I needed about nine minutes; the next morning, I got through much more quickly. I guess I’d learned something. 🎉 for me.
Published by marco on 21. Jan 2022 11:12:46 (GMT-5)
I’ve known about nth-child(n) for a long time. It selects the nth child from a structure if that child happens to match the given tag. You can always select the nth child by omitting the tag.
For example, div :nth-child(2) (two selectors) will match the second child of any div, regardless of type. However, div span:nth-child(2) will only match if the second child is also a span.
You cannot write a selector that says “select the second span” using nth-child. That’s where nth-of-type(n) comes in. The selector div span:nth-of-type(2) does exactly that. I can’t recall that I’ve ever had this need before, but it’s also possible that I ended up adding extra tags or convoluted selectors in order to achieve what could have been more elegantly done with nth-of-type.
Additionally, while I was aware that nth-child supported constants and the keywords odd and even, I didn’t know that it also supported a formula an + b. The a is a multiplier and b is an offset. With this formula, you can select every third or fifth (or whatever) element and then move the selection by a given offset.
The selectors first-of-type, last-of-type, etc. also exist, as well as only-of-type, which matches an element when it’s the only child of that type in the parent. See Meet the Pseudo Class Selectors by Chris Coyier (CSS Tricks) for more information.
You may see where this is heading. The article The wondrous world of CSS counters by Chen Hui Jeng includes an example where he writes the famous FizzBuzz program with CSS.
Start with an ordered list,
<ol>
  <li></li>
  <!-- …add more li elements, like 30 of them… -->
  <li></li>
</ol>
Then apply the following CSS to it,
ol { list-style-position: inside } /* To line-up all items neatly */
li:nth-of-type(3n+3),
li:nth-of-type(5n+5),
li:nth-of-type(3n+3):nth-of-type(5n+5) {
list-style: none /* When text of Fizz, Buzz or FizzBuzz appears, get rid of the numbers */
}
li:nth-of-type(3n+3)::before { content: "Fizz" }
li:nth-of-type(5n+5)::before { content: "Buzz" }
li:nth-of-type(3n+3):nth-of-type(5n+5)::before { content: "FizzBuzz" }
Put it all together and you get CSS FizzBuzz.
Published by marco on 28. Dec 2021 23:45:47 (GMT-5)
I recently read through the a11y myths. They’re quite interesting and should be required reading for managers running projects that develop web sites.
From it, I learned about the evils of overlays (see the Overlay Fact Sheet) and that there are really good resources out there, like Understanding Conformance (W3C) with WCAG 2.0 (Web Content Accessibility Guidelines).
“All WCAG 2.0 Success Criteria are written as testable criteria for objectively determining if content satisfies them. Testing the Success Criteria would involve a combination of automated testing and human evaluation. The content should be tested by those who understand how people with different types of disabilities use the Web.”
If you build custom controls, you should use ARIA (MDN). That page includes the following note,
“Many of these widgets were later incorporated into HTML5, and developers should prefer using the correct semantic HTML element over using ARIA, if such an element exists. For instance, native elements have built-in keyboard accessibility, roles and states. However, if you choose to use ARIA, you are responsible for mimicking (the equivalent) browser behavior in script.”
If you do need to use ARIA, then there’s a set of rules for its use in the article Notes on ARIA Use in HTML (W3C).
While we’re on the topic of building your own custom controls instead of using the built-in HTML inputs, we can also talk about how good semantics go a long way toward good accessibility, right out of the gate. So, go ahead and use main, nav, header, footer, aside, section, and article.
There’s some really good advice in there on writing clearly (e.g. use full month names and clarify abbreviations) as well as using meaningful text in links (e.g. don’t just use “click” or “here”).
Published by marco on 26. Dec 2021 09:24:51 (GMT-5)
I hadn’t ever really thought about it because I don’t use the API very much, but it turns out that the border-radius property is not only a shorthand for setting all four corners at once, but also sets the horizontal and vertical lengths simultaneously. To set them individually, use a / between two values.
The corner radii are then calculated using ellipses as shown in the following visualization,
The article CSS Border-Radius Can Do That? by Nils Binder on October 9, 2018 (9 elements) has many more examples. It also introduces a Fancy-Border-Radius tool to help you create the desired shape visually.
CSS includes the much more generalized shape() API (MDN) [1], but it wouldn’t be as easy to define the “blobs” shown above with that API because the “blob” is defined by the intersection of four overlapping ellipses and the shape() API doesn’t allow combining multiple shapes into one shape.
Not only that, but the “blob”, as defined by the eight values shown above, can be quite easily animated by providing the end “blob” to a transition or by providing several “blobs” to tweenable @keyframes. You can see the technique in action in this CodePen. Scroll all the way down in the CSS definition to see that the effect uses a combination of morphing the border-radius and rotating using a transform to achieve a quite-complex and organic effect using only very straightforward and highly available CSS.
@keyframes morph {
0% {border-radius: 40% 60% 60% 40% / 60% 30% 70% 40%;}
100% {border-radius: 40% 60%;}
}
@keyframes spin {
to {
transform: rotate(1turn);
}
}
You can even use “tricks” to create many shapes without using the shape() API either. See The Shapes of CSS by Chris Coyier (CSS-Tricks) for many, many examples.
Published by marco on 23. Dec 2021 15:30:47 (GMT-5)
Most of us know “hackers” from the media—either the news media, television shows like Mr. Robot, or movies like Swordfish. But the fast and easy way of hacking presented in the media actually does a disservice to how incredibly clever these hacks really are.
Less-complex techniques—like guessing or brute-forcing passwords—still work super-well. And you’ve always got social engineering hacks, like just asking someone for their credentials in an official-sounding way. But real, technical hacking involves getting to know a system’s dependencies and memory layout and runtime environment even better than the original programmers ever did.
Note: Both of these issues have been fixed, but it’s fascinating to read about how they did it. It really offers insight into what to avoid doing in your own code (e.g. do not open a WebSocket on 0.0.0.0).
The first article A deep dive into an NSO zero-click iMessage exploit: Remote Code Execution by Ian Beer & Samuel Groß (Google Project Zero) is a longer read, but I found it fascinating how many pieces they needed to chain together in order to hack iMessage—which they managed to do with a 0-click exploit. Just sending a message to the phone with a specially coded picture in it was enough to trigger code to run automatically that, unfortunately, ran before the sandbox. It overwrote memory in a controlled manner—making sure not to crash the app—and set up its own virtual machine to execute arbitrary code, which it then did.
“JBIG2 doesn’t have scripting capabilities, but when combined with a vulnerability, it does have the ability to emulate circuits of arbitrary logic gates operating on arbitrary memory. So why not just use that to build your own computer architecture and script that!? That’s exactly what this exploit does. Using over 70,000 segment commands defining logical bit operations, they define a small computer architecture with features such as registers and a full 64-bit adder and comparator which they use to search memory and perform arithmetic operations. It’s not as fast as Javascript, but it’s fundamentally computationally equivalent.
“The bootstrapping operations for the sandbox escape exploit are written to run on this logic circuit and the whole thing runs in this weird, emulated environment created out of a single decompression pass through a JBIG2 stream. It’s pretty incredible, and at the same time, pretty terrifying.”
The second hack is less wide-reaching, in that it would apply only to certain software developers using certain tools, which automatically limits the audience. The RCE in Visual Studio Code’s Remote WSL for Fun and Negative Profit by Parsia (Hackerman's Hacking Tutorials) describes, in relatively easy-to-follow detail, how the author found a pretty big hole in the remote-debugging support for Visual Studio Code using WSL (Windows Subsystem for Linux).
In order for it to work, the user had to approve opening the port in the Windows Firewall, but it was kind of unconscionable that it opened such a big hole. The developer could be forgiven for thinking that it was OK to approve the request, given that they had just initiated an action to debug between machines. Approving a firewall request in that situation is not only expected, but incredibly common. The dialog box doesn’t provide any information about which ports it wants to open.
The Local WebSocket Server
Every time you see a local WebSocket server, you should check WHO can connect to it.
“WebSocket connections are not bound by the Same-Origin Policy and JavaScript in the browser can connect to local servers.”
— TL;DR WebSockets
WebSockets start with a handshake. It is always a “simple” (in the context of Cross-Origin Resource Sharing or CORS) GET request so the browser sends it without a preflight request.
These bugs can be chained:
- The local WebSocket server is listening on all interfaces. If allowed through the Windows firewall, outside applications may connect to this server.
- The local WebSocket server does not check the Origin header in the WebSocket handshakes or have any mode of authentication. The JavaScript in the browser can connect to this server. This is true even if the server is listening on localhost.
- We can spawn a Node inspector instance on a specific port. It’s also listening on all interfaces. External applications can connect to it.
- If an outside app or a local website can connect to either of these servers, they can run arbitrary code on the target machine.
Published by marco on 23. Dec 2021 09:55:59 (GMT-5)
I just finished reading through the State of CSS 2021. It’s a well-presented [1] summary of a developer survey about CSS.
I liked the following sections:
Published by marco on 13. Nov 2021 13:36:23 (GMT-5)
I’ve been using CSS Grids for a while now. I’ve found many instances where I had used flexbox where grids turn out to be much more appropriate. That is, the grid layout algorithm lets me specify what I want without fiddling about with flex-basis and flex-grow, etc. Flexbox definitely has its place, but I think we all ended up abusing it a bit in our rush to leave tables-for-layout behind.
But that’s all in the past because now we have CSS grids available everywhere and all is well with the world! That being said, if you’ve not used CSS grids yet, then you should check out this CSS-grid super-fan’s many videos. He has a playlist of CSS Grid videos by Kevin Powell (YouTube) that you can work your way through.
He even made a short video (5min) describing how to use the grid inspector in browsers. The grid inspector is super-handy, but not so intuitive to find.
I’m more interested in what the same guy has to say about sub-grids, which are currently only available in Firefox (but it’s been available there for over 2 years now).
The 8-minute video below shows a concrete, real-world example, where you can see how little effort is required to get the browser to just align everything for you, all without fixed minimum or maximum widths (just like it used to be with tables). It should be immediately obvious why this feature is both a good thing and necessary (because the behavior can’t be replicated with existing CSS layout features).
The 11-minute video below shows how the generalized mechanism lets you do the same thing for rows:
You can find the full list of sub-grid videos (so far) in the Subgrid playlist by Kevin Powell (YouTube).
CSS sub-grids are an elegant way of aligning items without hard-coding anything (as required by existing techniques). They will continue to do what you expect regardless of the content added—i.e. there are no fixed minimum or maximum heights to make the alignment work, so you won’t be surprised when one of these artificial restrictions limits the algorithm unnecessarily (as it would with flexbox or regular grids).
You can enable Subgrid (MDN) by including grid-template-columns: subgrid. My advice to the feature designers would be to rename the value to grid-template-columns: inherit because that would be closer to the mark. Several times in the video, Kevin has to correct himself that he’s talking about the same grid rather than a copy of the grid. That’s what the nested container is doing: it’s inheriting the grid from a parent. Since it also has to declare itself as a display: grid, it can choose to inherit or explicitly set a template for its rows and/or columns. I think that would be relatively intuitive, but what do I know?
This feature kind of feels like a generalized way of getting back one of the advantages of the table-layout algorithm. The table-layout algorithm makes the cells in columns the same width throughout the table. This, despite the fact that the cells are all defined in different parents—and columns aren’t even defined as elements at all. I think we all understand why it’s not a good idea to abuse the table semantics just to be able to use the table layout algorithm. It’s nice to see that the advantages of that layout are being rescued—and generalized to be even more powerful.
Published by marco on 10. Nov 2021 11:01:27 (GMT-5)
The video I’m not sure how much longer I can wait! by Kevin Powell is an excellent introduction to sub-grids in CSS. But I was more interested in the fact that he told his viewers that,
“you can use numbers in classes, but if you have a class or id that starts with a number, it’s invalid. […] It’s one of those weird things in CSS that sometimes trips people up.”
I immediately thought to myself, “it’s not weird. Every programming language is like that.”
Then, I thought, “I bet this guy only knows CSS, so he doesn’t have anything to compare it to.”
Then, I thought, “Wait…why can’t you start an identifier with a number?”
And, finally, “I bet it’s a lexing/parsing thing.”
I’ve written several parsers for medium-sized languages and my gut feeling is that letting an identifier start with a number seems like a surefire way of making the lexer more ambiguous or pushing more work into the parsing stage.
For example, if 25L can be either an identifier or a long integer, then the parser has to figure out from context which one it is (e.g. by checking whether that identifier is declared). If it can only be a number, then it comes out of the lexer as a number token and the parser doesn’t have to disambiguate.
Even if your language doesn’t allow suffixes, you’d still have the problem with an identifier like 25, which would be legal unless you introduce the additional restriction that an identifier must have at least one alphabetic character. In that case, though, you might as well make the rule that the identifier has to start with an alphabetic character and avoid the whole ambiguity.
With that common—not weird!—rule, the disambiguation happens in the lexer, where the operation is clearer and less expensive, performance-wise.
It’s actually worse than that, though. In the case of a programming language, you could see how the following would result in a compiler ambiguity:
var 3 = 5; // I'm already confused
//…the compiler gets it, though
var a = 3; // Now, the compiler's confused as well
Is the developer assigning the value 3 to a or the variable 3? Not only is this a terrible idea for readability, the compiler can literally not resolve this ambiguity without additional information. So there have to be restrictions on identifier names in order to avoid clashes with not only reserved words (e.g. if) but also manifest constants (e.g. 3).
In the case of CSS, where you do have suffixes (e.g. 25px) but you can’t really mix class identifiers with values, it’s possible that you could get away with no ambiguities right now. So it’s not weird that you can’t start an identifier with a number—it’s perfectly natural for developers—but it is, in the case of CSS, not required for unambiguous processing. As you can see below, though, it’s still kind of confusing for the user.
What if we have a class named “3”? It’s not very expressive—we’d probably call the class something like “3-part-panel”—but it’s the pathological case. Maybe a class called “3px” would be even worse.
.3-part-panel {
/* This is fine */
}
.3 {
/* Weird, but OK */
}
.3px {
/* Now you're just being obnoxious */
}
Do we actually get any ambiguities, though? I don’t think so. I think in this case, the authors of CSS just used the “standard” (not weird!) definition of an identifier. It’s only when you have people using CSS who have had no exposure to any other programming languages (or parsing/lexing) that you get people thinking it’s “weird” that you can’t start with a number.
The only place where you could get an ambiguity is with CSS custom properties. In that case, though, “[a] custom property is any property whose name starts with two dashes”, according to CSS Custom Properties for Cascading Variables Module Level 1 (W3C). So, variable names in CSS are even more restricted than in most programming languages. Is that weird? Again, no. As in the case above with other programming languages, the end result is more clarity for the user.
For example, the following declares a few CSS custom properties with deliberately obnoxious names.
:root {
red: #F33;
color: #FF0;
0: 1;
3px: 1px;
}
.error-text {
color: var(red);
background-color: var(color);
border-width: var(3px);
opacity: var(0);
}
Although I’ve chosen confusing values and names, this doesn’t—at first glance—seem to cause any ambiguities. As with the examples above, it does force implementations to handle enumerations (e.g. all of the colors) in the parser, rather than the lexer. If the word “red” cannot be used as a variable, then it could (possibly) be recognized as its own token in the lexer, (possibly) improving performance.
The same goes for the property names. If it’s possible for custom properties to use the same names as built-in properties, then the lexer can’t handle them. There is no ambiguity because custom-property values must be resolved using the CSS function var().
The problem is worse than that, though. There is an actual ambiguity that isn’t obvious because we’re using the :root pseudo-class [1]. The example below, using <html>, makes it clearer.
html {
  color: #F33; /* Is this setting the color…
                  or declaring a color variable? */
}
This is an ambiguity that the compiler cannot resolve. So that’s why the CSS designers settled on a prefix for custom properties.
So, to a layman or user of CSS, naming restrictions on class or custom-property identifiers may seem arbitrary and “weird”, but they are a logical requirement of being able to process the grammar unambiguously.
Published by marco on 5. Jun 2021 23:04:48 (GMT-5)
Updated by marco on 11. Nov 2021 08:20:38 (GMT-5)
The article Introducing C# 10 by Ken Bonny discloses some incremental but very welcome changes to the C# language in the iteration that will be released with .NET 6 in November.
In no particular order:
- Use field in property accessors to manipulate the backing field without having to define it. This is a welcome improvement that will clean up useless boilerplate for properties that need to do something with the value before storing it (e.g. field.Trim()).
- A required keyword for properties in any of the supported types (e.g. records, classes, structs, or struct records). This lets types enforce initialization without forcing a constructor parameter. The compiler will force callers to initialize the property in the object initializer instead.
- record struct for records that are value instead of reference types.
- operator overloads in records.
- The with operator will work with anonymous classes as well as declared types.
- global usings for commonly used namespaces (e.g. System) to cut down on clutter in files.
- A namespace declared without braces will put all types in that file into that namespace. This cuts down on an indenting level in all files.
- Static abstract members in interfaces (to round out the default-implementation feature introduced in C# 9).
- Constant interpolated strings: e.g. $”Hello {Name}” is considered constant if Name is also considered constant (recursively, of course). Update on November 11th, 2021 from Dissecting Interpolated Strings Improvements in C# 10 by Sergey Teplyakov (Dissecting the Code): This feature comes with a nice performance improvement, as well. The compiler now understands interpolated strings and emits more efficient code rather than always using string.Format(), which incurred allocations for boxing, time for parsing, etc. There are even attributes to hook the compiler output that could be, e.g., “used by logging frameworks to avoid string creation if the logging level is off.”
- A !! suffix for method arguments that instructs the compiler to generate a null-check for that argument. So, string is not nullable, but not checked (i.e. the developer is responsible for including a check to avoid a NullReferenceException if one slips past the compiler), string? is nullable, and string!! is not nullable and checked. This will avoid a ton of boilerplate argument checks. Can’t wait.
- Use var to declare a variable to which you assign a manifest lambda. E.g. var isEven = (int n) => n % 2 == 0; automatically gets the type Func<int, bool>.
- Mixed declarations in deconstruction, e.g. (x3, int y3) = p; where x3 is a preexisting variable.
I really appreciate how the changes build on changes that came in previous versions. There’s a very noticeable direction that they’re pulling in with these language changes. A minimal sketch combining a few of them follows below.
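Here is a minimal sketch of my own (the names are made up) combining a few of the features above: a file-scoped namespace, a record struct, a constant interpolated string, and a lambda with an inferred natural type.

using System;

namespace Demo; // file-scoped namespace: no braces, one less level of indentation

public record struct Point(int X, int Y); // a value-type record

public static class Greetings
{
    const string Name = "world";
    const string Hello = $"Hello {Name}"; // constant interpolated string

    public static void Run()
    {
        var isEven = (int n) => n % 2 == 0; // natural type: Func<int, bool>
        var point = new Point(2, 4);
        Console.WriteLine($"{Hello}: ({point.X}, {point.Y}) even? {isEven(point.X)}");
    }
}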
For more information, see the csharplang/proposals/ (GitHub) folder. Some of the C# 10 features are in the main folder rather than in the csharp-10.0/ folder.
Published by marco on 5. Jun 2021 22:33:53 (GMT-5)
Out of curiosity, I looked up how dependency injection works in functional languages. I stumbled upon this amazing article series—Six approaches to dependency injection by Scott Wlaschin (F# for Fun and Profit)—that presents five different techniques, from very simple and easily applicable to more complex, but potentially robust.
The article series applies various abstraction techniques to a program that reads input, processes it, and writes it out again. The reading and writing are impure operations and should be abstracted away to make it easier to reason about and test the actual program logic.
The first article details Dependency Retention (hard-code everything; appropriate for scripts and POC projects) and Dependency Rejection (make an impure/pure/impure sandwich that collects program logic in a testable “middle”).
The next article covers Dependency parameterization (passing as parameters and using partial application in a separate abstraction layer). These are all pretty usable techniques.
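To make the idea concrete in C# (a sketch of my own, not from the series): with dependency parameterization, the “dependencies” are just function parameters, so tests can pass fakes without any container.

using System;

public static class Program
{
    // The core logic takes its impure reader/writer as parameters.
    public static void Compare(Func<string> readLine, Action<string> writeLine)
    {
        writeLine("Enter the first value");
        var first = readLine();
        writeLine("Enter the second value");
        var second = readLine();

        writeLine(string.Compare(first, second, StringComparison.Ordinal) switch
        {
            < 0 => "The first value is smaller",
            > 0 => "The first value is bigger",
            _ => "The values are equal",
        });
    }

    // Production wires in the console; a test would pass in-memory delegates instead.
    public static void Main() => Compare(() => Console.ReadLine() ?? "", Console.WriteLine);
}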
The next two articles—The Reader Monad and Dependency interpretation—are more…involved. With both, you end up writing a description of your program that you can then execute by passing in the appropriate parameters. The dependencies are separate from the logic—kind of in a separate layer—but there are drawbacks to these approaches. For one, they are quite complex and require everyone on the team to understand the patterns very well.
This is an example of the program description using the Reader
monad.
The final article applies all of these techniques to a slightly more complex problem domain, namely a user-profile update that receives an update request, reads from a database, compares data to determine updates, and sends an email to confirm address changes. This is complex enough that we can see how the techniques scale. As expected, the more complicated but functionally pure Reader Monad and Dependency Interpretation examples take up 2/3 of the implementation and explanation (with the latter taking 50% all on its own).
All in all, this is impressive work that answered my question superbly. Highly recommended. I’ve only very lightly summarized the pros and cons and descriptions above. The original author does a superb job of explaining these in much more detail—without repeating himself.
Published by marco on 22. Apr 2021 18:20:27 (GMT-5)
Updated by marco on 11. Jan 2022 20:14:34 (GMT-5)
Over the last four months, I’ve been collecting interesting HTML/CSS techniques and ideas.
For both of these goals, I’m focusing on leveraging as much of the power of the browser—especially CSS/HTML—as possible without getting mired in too much JavaScript or client-side libraries.
To that end, I’ve collected the stuff I learned and would like to use in a hopefully semi-readable and searchable format. I tried to split it into coherent sections with supporting information and links. YMMV.
The following guides/manuals contain a wealth of information.
The article What Makes CSS Hard To Master has several interesting examples, but it mostly boils down to: “HTML documents are complex programs”.
It’s always been difficult to tell which styles are applied when—it’s a near-miracle that browsers correctly untangle the myriad ways that style rules interact with an ever-changing DOM and viewport size, to say nothing of doing so with such sheer alacrity.
There are selectors, media queries, related properties (e.g. position), CSS Properties, and much more, and all of it cascades with inheritance everywhere. At least most browsers now handle this similarly with predictable performance.
A tremendous amount of content is generated dynamically using layers of framework code, either on the server or the client. Any one of these moving parts could introduce a seemingly innocuous change that breaks the entire layout or inheritance (e.g. when a component introduces a wrapping <div></div> somewhere, either where it’s flatly invalid (e.g. in a table) or where it’s just unwanted (e.g. in a sequence of flexing containers, where the new container does not flex)).
To control this chaos, most designers and developers impose self-discipline and use guidelines to avoid confusion while still allowing them to leverage the power of CSS to be able to do what they want.
From the article linked above,
“I think mastering CSS comes down to having a good amount of knowledge about it, recognising the subtle dependencies between different declarations, rules, and the DOM, understanding how they make your CSS complex, and how to best avoid them.”
Congratulations: you’ve just described programming at anything but a trivial level of complexity. If a tool has power, then you have to understand it in order to avoid hurting yourself with it. That’s why “everyone codes” is a lost cause doomed to end in failure, broken dreams, and embarrassed disappointment, like so many other quixotic attempts to ignore immanent complexity.
CSS is a moving target. Things that used to be difficult are now easy. [1] But that’s the nature of the game: someone is going to abstract away the thing you spent time learning and make it easier for everyone else. That is the nature of abstraction and frameworks. If the new thing (e.g. grid) replaces the old thing (e.g. float) well and you have time and budget to use the new thing and it’s a priority then, by all means, upgrade to use the new technique and pay down some technical debt, while hopefully gaining some flexibility.
While CSS generators—pre-processors like LESS and SASS—are invaluable, they also introduce another layer of abstraction where code is generated for the developer—sometimes with unpredictable results.
The latest versions of CSS have included some of the features introduced in these generators. Vendor prefixes are less necessary than they used to be; CSS properties and variables and calc() (as well as other standard functions) allow a flexibility beyond even that offered by pre-processor variables. Color and transformation and animation functions are standard now.
Check out the site SmolCSS by Stephanie Eckles for a long list of common layouts, like:
It’s called “smol” because almost all of them do a lot of heavy lifting with very few lines of CSS.
The article Guide to Advanced CSS Selectors − Part One by Stephanie Eckles (Modern CSS) is a good overview with good illustrations and some selectors I’d never heard of, like the General Sibling Combinator, which “[f]or example, p ~ img would style all images that are located somewhere after a paragraph provided they share the same parent.”
That whole site is beautiful and exhibits an absolute mastery of CSS. Check out the use of the skew transform for the cards at the bottom of the page or for the whole series. The rainbow gradients on the :before and :after borders and backgrounds are a great idea and well-executed.
The excellent tutorial Diving into the ::before and ::after Pseudo-Elements by Will Boyd (Coder's Block) is an absolute treasure trove of information, including how to use the ::before/::after pseudo-elements to insert content, but also noting how a classic use of ::after can now be replaced with display: flow-root (the modern clearfix). He also covers ::marker.
The article Three important things you should know about CSS :is() (Bramus) gives a few caveats but also shows the power of this operator to reduce CSS clutter (along with the up-and-coming nesting feature described below). You can use where() (MDN) instead of is() (MDN) to keep the specificity contribution of the clause neutral. The has() (MDN) selector function is defined, but isn’t available anywhere. Combine any of these with not() (MDN) for even more powerful selectors.
The article CSS custom properties are not variables (Web Platform News) explains a common misconception about CSS “variables”.
“A custom property is not a variable, but it defines a variable. Any property can use variables with the var() function whose values are defined by their associated custom properties.
“[…] This distinction is useful because it allows us to talk about “variables with fallback values” (a custom property like any other property cannot have a fallback value) and “properties using variables” (a property cannot use a custom property)”
Another great article is The styled-components Happy Path by Josh W. Comeau, which discusses styling with CSS properties in React components. In it, he references another article of his, CSS Variables in React tutorial, which is more of an introduction to some of the techniques he works with in the first article.
You commonly declare properties with default values on the :root pseudo-class (MDN). [2]
The article What Can You Put in a CSS Variable? shows a lot of nice uses of CSS properties, variables, and functions (W3Schools). CSS Properties can basically hold anything you want: text, concatenated strings, references to variables, images via urls, a single value, multiple values, etc. [3]
“Some properties, like background and box-shadow, can take a list of things. You can use a CSS variable as a single item in the list, a sublist of the list, or the entire list.”
As mentioned above, declaring colors is one of the primary uses of a CSS pre-processor language. CSS Properties handle this job very nicely, without preprocessing and also with full recalculation at runtime. [4] The demo with the set of animated RGB sliders that control the color of a swatch is worth the price of admission. All without any JavaScript at all. Smooth as butter.
As a practical application, the article Make the page count of a 3D book visible using CSS Custom Properties shows how you can use CSS to make a “book” out of a div and a cover image, transforming it in 3D-space and then using a CSS property to determine how many “pages” it looks like it has.
You can play with a demo here.
You can find a simpler and very straightforward demo in the article Sharing data between CSS and JavaScript using custom properties by Christian Heilmann, which shows how to use CSS properties with a few lines of JavaScript to follow the cursor in your document.
Practical Use Cases For CSS Variables by Ahmad Shadeed provides many, many short examples and ideas for using custom properties as an abstraction instead of setting one or more standard properties directly.
Future work: See below for a discussion of proposed but not yet supported extensions and uses of CSS Properties.
The article The complete guide to CSS media queries is a great overview of how media queries work, but also how they’ve changed recently for those who’ve gotten accustomed to them over the years. For example, the section New notations in Media query levels 4 and 5 shows how ranges are easier, how you can now use or, the not() function, and custom media queries, which allow you to basically make aliases for media query combinations that you need to use in several places.
/* Define your custom media query */
@custom-media --small-screen (max-width: 768px);

/* Then use it somewhere in your stylesheet */
@media (--small-screen) {
}

/* You can also combine it with other media features */
@media (--small-screen) and (pointer: fine) {
  /* styling for small screens with a stylus */
}
The article Handling Text Over Images in CSS by Ahmad Shadeed gives a wonderful overview with many examples on how to use gradient overlays on images to make overlay text readable for all types of images. At the end, you can see how many sites are using this (including YouTube for its overlay video controls).
See also the gradients used in borders (hover over a “card”), headers, and other elements in the ModernCSS tutorial.
Check out Animating a CSS Gradient Border, which has no JavaScript. It leverages a newer feature of Chrome-based renderers to avoid writing a lot of keyframe boilerplate, but it’s all in CSS. You could write it all in bog-standard CSS.
Another example is a slide show written with only HTML and CSS. You can keep all slides in a single document and make animated transitions between them. See How to Play and Pause CSS Animations with CSS Custom Properties for ideas. The article An Interactive Guide to CSS Transitions provides a lot of background and interactive examples of how transitions work and how you can influence their behavior.
CSS animations apply to many, many properties—in all modern browsers—as detailed in the article The Surprising Things That CSS Can Animate by Will Boyd (Coder's Block), which shows how easy it is to animate box-shadows (for a “pulsating” effect) or even z-order, with a few other properties, to animate two items “switching places” in a very intuitive way—all without JavaScript.
The article Cooltipz.css — Pure CSS Customisable Tooltips by Bramus van Damme includes a good demonstration of Cooltipz. This library uses very modern, but well-supported techniques to place and format tooltips or flyouts (for non-desktop browsers).
Understanding Clip Path in CSS shows how to work with the standard shape functions and combinators and the clip-path property to make pure-CSS non-rectangular accents and effects that run on all modern browsers.
The article Responsible Web Applications by Joy Heron is an absolutely lovely design that illustrates the power and simplicity of pure CSS. Right at the very top, it uses shape-outside and circle to make text wrap elegantly around a circular shape that contains the navigation.
The key piece of CSS is very compact and understandable.
shape-outside: circle(21rem at 1.5rem 40%);
The page makes liberal use of CSS custom properties (see below) and rem units to make everything scale nicely. It’s kind of a master class in CSS and is well worth reading.
Speaking of clipping, you can assign the background-clip property to determine which part of its element a background covers. In particular, setting it to the value text clips the background to show through only in the area covered by text. It’s been supported for quite some time and allows developers to make dynamic effects that would otherwise have to be hard-coded in graphics.
The article CSS background-clip Demo: Text with Animated Emoji shows a neat demo of an animated SVG ghost moving back and forth behind clipped text.
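A minimal sketch of the technique (class name and gradient are arbitrary); note that several browsers still want the -webkit-prefixed form of the property:
.fancy-heading {
  background: linear-gradient(90deg, #e66465, #9198e5);
  /* Clip the background so it only shows through the glyphs */
  -webkit-background-clip: text;
  background-clip: text;
  /* Make the text itself transparent so the background is visible */
  color: transparent;
}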
In the same ballpark is the backdrop-filter property, which allows you to apply filters to everything behind a particular element. Naturally, you need to make the element at least partially transparent in order to see the effect.
The CSS is very simple and supported on all modern browsers. Being able to create this kind of composition dynamically on the client brings very nice effects without pre-rendered compositing.
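Something like the following (the class name and filter values are invented) is enough for a “frosted glass” panel:
.frosted-panel {
  /* Partially transparent so the blurred backdrop shows through */
  background-color: rgba(255, 255, 255, 0.4);
  /* Blur and slightly brighten whatever is rendered behind the element */
  backdrop-filter: blur(8px) brightness(1.1);
}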
CSS Paper Snowflakes combines transforms, clip-paths, mask-images, and tons of properties and variables to render what look like pre-built graphics using only CSS (well, SCSS in this case).
The article CSS mix-blend-mode not working? Set a background-color! (Bramus) illustrates how to use mix-blend-mode to make sure that text has proper contrast against whichever background it happens to be over.
This is a really nice effect and very handy for usability. You can have the browser ensure that text is always readable, regardless of what kind of background slides into place behind it.
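A rough sketch of the idea (selectors invented): the text inverts against whatever is behind it, and, as the article’s title says, the blend only works if there is an explicit background-color to blend against.
.overlay-title {
  color: white;
  /* "difference" inverts against the backdrop, so the text stays
     visible over light and dark areas alike. */
  mix-blend-mode: difference;
}

body {
  /* The article's point: without an explicit background-color,
     there is nothing to blend against and the effect silently fails. */
  background-color: white;
}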
The article Smooth Scrolling Sticky ScrollSpy Navigation provides a tutorial for building a JS-free TOC with sticky headers. The article Smooth Scrolling and Accessibility by Heather Migliorisi (CSS Tricks) provides some background, history, and advice on honoring user preferences.
The following CSS is enough to get started. The full demo shows how to use a little bit of JS with an IntersectionObserver to implement the ScrollSpy feature in just one line of code.
html {
scroll-behavior: smooth;
}
main > nav {
position: sticky;
top: 2rem;
align-self: start;
}
The article Using position: sticky to create persistent headers in long texts by Christian Heilmann provides a very minimal and highly re-usable example of using this feature for “sticking” headers to the top of the page when scrolling.
h1, h2, h3, h4 {
position: sticky;
top: 0;
}
And there’s also scroll-snap-type, scroll-snap-align, and viewport units (e.g. vw and vh) to basically make a slide show out of an HTML file without any JavaScript (demo, or another demo with some additional JS to highlight the displayed slide/image in a thumbnail browser).
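A minimal, hypothetical setup looks something like this: the container scrolls, and each full-height child snaps into place when scrolling stops.
.slides {
  height: 100vh;
  overflow-y: scroll;
  /* Snap each child slide to the top of the viewport */
  scroll-snap-type: y mandatory;
}

.slide {
  height: 100vh;
  scroll-snap-align: start;
}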
As for “sticky” or “stuck” elements,
“[…] there is one limitation: it is impossible to change the appearance of an element based on whether it is stuck or not, say with a pseudo-class :stuck. This is a general limitation of CSS. In this case, I recommend combining the benefits of position: sticky to keep the element sticking with IntersectionObserver to change its appearance (while taking care not to change its dimensions, to prevent content jumps).”
The article A table with both a sticky header and a sticky first column by Chris Coyier (CSS Tricks) provides a good example of using sticky to make frozen columns in tables.
For a really fancy scroll-spy, see the Progress Nav demo. This is very cool-looking, but it’s a little bit older, so also check out the Progress Nav with IntersectionObserver by Bramus for a linked version that does the same thing, but uses the IntersectionObserver to reduce the amount of code significantly.
For limiting text in a box, you can let the browser do all of the heavy lifting by using line-clamp or the even smoother and also standardized -webkit-line-clamp. See a demo that shows how to use it in a grid layout.
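The widely supported form is the legacy -webkit- syntax, which only takes effect together with a small cluster of other properties; a typical (illustrative) teaser style:
.teaser {
  /* The box/orient/overflow trio is required for the clamp to apply */
  display: -webkit-box;
  -webkit-box-orient: vertical;
  -webkit-line-clamp: 3;
  overflow: hidden;
}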
The line-clamp feature is not to be confused with the clamp() CSS function, which is shorthand for bounding a value between a min() and a max().
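For example, a fluid heading size that scales with the viewport but never leaves a sensible range:
h1 {
  /* Never smaller than 1.5rem, never larger than 3rem, fluid in between */
  font-size: clamp(1.5rem, 4vw, 3rem);
}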
There are a ton of CSS functions, for math, colors, filters, images, fonts, shapes, and more. You can use all of these with variables and custom properties to avoid whole swaths of JavaScript.
You’ll want to use minmax to override the default minimum size of auto, which is content-based sizing and can get quite large, leading to what the cool kids are calling a “grid blowout”. See The Minimum Content Size In CSS Grid (Bramus) for examples, graphics, and more links and guides.
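A sketch of the fix (the column sizes are arbitrary): swapping a plain 1fr for minmax(0, 1fr) lets the column shrink below its content’s intrinsic width instead of blowing out the grid.
.layout {
  display: grid;
  /* minmax(0, 1fr) overrides the auto minimum so wide content
     (long words, code blocks, wide tables) can't stretch the column. */
  grid-template-columns: minmax(0, 1fr) 20rem;
}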
The tutorial Building a Side Navigation pulls a lot of concepts together to create a common UI element that tends to become a time sink if you don’t plan correctly. A lot of the CSS features used in this article help to reduce the work significantly.
If you’ve ever wondered what you need <col> and <colgroup> for, then Highlighting columns in HTML tables by Manuel Matuzovic will show you how to use them to apply styling to a column without much additional markup. He even has an example that styles a “selected” column using the :target pseudo-selector.
You can also use a simple attribute to tell the browser to be proactive about loading images.
The article Alt vs Figcaption by Elaina Natario (ThoughtBot) nicely illustrates how well browsers now handle the figcaption tag, which is yet another feature I’d implemented on earthli long ago, but with custom HTML and extra containers and positioning code. It’s nice to know that I can replace all of that with a single tag that’s been supported for years.
Viewport units let the developer size elements based on the size of the viewport. This includes not only vw and vh, but also vmin and vmax, which are the minimum and maximum of the two viewport dimensions, respectively.
The article Simple Little Use Case for vmin by Chris Coyier shows a very simple way to make a highly responsive header without using media queries.
header {
  padding: 10vmin 1rem;
}
The article Accept several email addresses in a form with the multiple attribute (Bramus) shows you how to use the multiple attribute to have the browser automatically validate multiple email addresses, all without any custom JavaScript.
Once you’re using HTML validations (and you should), you can use the :invalid pseudo-selector to style elements that need correction. Form Validation: You want :not(:focus):invalid, not :invalid (Bramus) shows several ways of combining it with good UX to avoid annoying users with hyperactive validation messages.
A good setup is:
.error-message {
  display: none;
}

input:not(:focus):invalid {
  border-color: var(--color-invalid);
}

input:not(:focus):invalid ~ .error-message {
  display: block;
}

input:not(:focus):not(:placeholder-shown):valid {
  border-color: var(--color-valid);
}
There’s also the new :focus-visible pseudo-class to help perfect focus display in forms.
/* Hide focus styles if they're not needed, for example,
when an element receives focus via the mouse. */
:focus:not(:focus-visible) {
outline: 0;
}
/* Show focus styles on keyboard focus. */
:focus-visible {
outline: 3px solid blue;
}
See :focus-visible Is Here (Bramus) for more information.
Password controls need a bit more love, as documented in the article Perfecting the password field with the HTML passwordrules attribute by Scott Brady, which makes the case for a new attribute, passwordrules, to be standardized. His focus is on making password fields maximally accessible and usable for password tools.
A weaker—but available—alternative to his proposal is to use the pattern attribute to restrict input (helping the user, but not the password generator). To that end, he also mentions that you should set the autocomplete (MDN), autocapitalize (MDN), and autocorrect (MDN) (non-standard) attributes correctly instead of just leaving them at the defaults.
The resize (MDN) CSS property controls the directions in which the user will be able to resize any DOM element.
“The resize CSS property sets whether an element is resizable, and if so, in which directions.”
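For example (selector invented), to let users stretch a notes area vertically but not horizontally; note that resize only applies when overflow is something other than visible:
.notes {
  /* Allow dragging the element taller, but not wider */
  resize: vertical;
  /* resize has no effect with the default overflow: visible */
  overflow: auto;
}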
The article A Complete Guide To Accessible Front-End Components includes everything from guidance to links to tutorials to full-fledged examples and screenshots of HTML/CSS/JS implementations of commonly used controls that are also accessible.
The “Tab Panel” is quite nice in that it responsively switches to an accordion at smaller widths.
The article Building a Settings component by Adam Argyle (web.dev) demonstrates accessible components using a lot of pretty advanced—but generally available—techniques, like properties, grids (w/ align-items, vw, minmax, auto-fit for pretty much automatic responsiveness with nearly no code), dark/light theming, light JS manipulation of controls, FormData, accent-color, and much more. Watch the embedded video (YouTube) for a very quick, 8-minute overview, play with the live demo or grab the source (GitHub).
Styling: Styles Piercing Shadow DOM shows you how to reset all styles in your component, using the :host pseudo-selector.
:host {
/* Reset specific CSS properties */
color: initial;
/* Reset all CSS properties */
all: initial;
}
The article Options for styling web components by Nolan Lawson (Read the Tea Leaves) shows how to design a styling API for a web component using CSS custom properties.
The article Creating Custom Form Controls with ElementInternals by Caleb Williams (CSS Tricks) introduces an interesting concept. The example it uses is to make a single “control” that holds several text inputs, which isn’t groundbreaking, but it does show the power of packaging CSS/HTML/JS as components that show up as simple tags with properties.
None of that is new—we’ve had web components for a while now—but ElementInternals allows deep integration into the form’s workings, including hooking validation, submitting, drawing, and so on.
inherit value
The inherit value is not new, but I often forget to use it as intended. It’s meant to help avoid re-stating a base color.
The following example changes the color for nav tags to red, but wants links to retain the original color.
body { color: black; }
nav { color: red; }
nav a { color: black; }
Instead of repeating the value black, you can use inherit.
body { color: black; }
nav { color: red; }
nav a { color: inherit; }
The initial value is also useful.
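One illustrative use, continuing the example above: instead of inheriting the red from nav, a link can fall back to the property’s initial value and ignore the cascade entirely.
nav a {
  /* Discard the inherited red and use the property's initial value
     (for color, whatever the user agent defines as the default). */
  color: initial;
}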
content-visibility
The article content-visibility: the new CSS property that boosts your rendering performance discusses a very new feature. It landed in official releases of Chrome, Opera, and Edge in September 2020.
“The content-visibility CSS property controls whether or not an element renders its contents at all, along with forcing a strong set of containments, allowing user agents to potentially omit large swathes of layout and rendering work until it becomes needed. Basically, it enables the user agent to skip an element’s rendering work, including layout and painting, until it is needed, making the initial page load much faster.”
Related to this newer property are the existing will-change, object-fit, and contain. See contain-intrinsic-size (MDN) and content-visibility for more information.
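A typical (illustrative) use is deferring rendering work for sections far below the fold; the size hint keeps the scrollbar from jumping while the content is skipped.
.below-the-fold {
  /* Skip layout and paint for this section until it nears the viewport */
  content-visibility: auto;
  /* Reserve an estimated height while the contents are not rendered */
  contain-intrinsic-size: 1000px;
}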
box-decoration-break
The article box-decoration-break helps to define how elements should be rendered across lines by Stefan Judis presents an interesting property that lets you determine how padding, border, and other properties are applied to inline elements that span multiple lines.
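A small illustrative example: with clone, a highlighted phrase that wraps across lines gets its padding, background, and rounded corners repeated on every line fragment instead of only at the very start and end.
.highlight {
  background: #fde68a;
  padding: 0.2em 0.5em;
  border-radius: 0.25em;
  /* Repeat the decoration on each line fragment of the inline box */
  -webkit-box-decoration-break: clone;
  box-decoration-break: clone;
}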
Instead of setting arbitrary z-indexes in your styles, sometimes the isolation property is a better way of creating a stacking context (MDN).
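For example (selector invented), a component can contain its own stacking without inventing a magic z-index:
.card {
  /* Creates a new stacking context, so z-indexes and blend modes
     inside the card can't interfere with the rest of the page. */
  isolation: isolate;
}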
The Page Visibility API (MDN) is available in all browsers and provides a high-level API for running code when showing or hiding a page.
“With tabbed browsing, there is a reasonable chance that any given webpage is in the background and thus not visible to the user. The Page Visibility API provides events you can watch for to know when a document becomes visible or hidden, as well as features to look at the current visibility state of the page.”
Pages can use this to “pause” activity when they’re in the background (e.g. server polling or animations). In the case of animations, though, “Most browsers stop sending requestAnimationFrame() callbacks to background tabs or hidden <iframe>s in order to improve performance and battery life.” They also “throttle setTimeout()”.
The CSS Houdini (MDN) APIs are a low-level way to hook custom JavaScript into various parts of the rendering pipeline. Of particular interest is the part that’s finished and implemented in all browsers: the CSSOM (CSS Object Model) and Houdini, which let a page render custom CSS effects using JavaScript. The collection of low-level APIs is known by the umbrella term Houdini, described in Cross-browser paint worklets and Houdini.how.
From the MDN page linked above:
“Houdini is a set of low-level APIs that exposes parts of the CSS engine, giving developers the power to extend CSS by hooking into the styling and layout process of a browser’s rendering engine. Houdini is a group of APIs that give developers direct access to the CSS Object Model (CSSOM), enabling developers to write code the browser can parse as CSS, thereby creating new CSS features without waiting for them to be implemented natively in browsers.”
And:
“Houdini enables faster parse times than using JavaScript style for style changes. Browsers parse the CSSOM — including layout, paint, and composite processes — before applying any style updates found in scripts. In addition, layout, paint, and composite processes are repeated for JavaScript style updates. Houdini code doesn’t wait for that first rendering cycle to be complete. Rather, it is included in that first cycle — creating renderable, understandable styles. Houdini provides an object-based API for working with CSS values in JavaScript.”
Houdini.how is a collection of open-source CSS extensions that you can use, extend, and learn from. I heard about this from css-houdini-circles — A Houdini Paint Worklet that draws Colorful Background Circles by Bram Van Damme (Bram.us) (see his code (GitHub)).
The following video provides an excellent overview in 12 minutes.
Once you start making custom effects, you’ll run into classic rendering problems, one of which is addressed in the article CSS paint API: Being predictably random, which explains how to use a stable seed to generate predictably random data for animations.
While the painting API is relatively well-supported, the Layout API is still in early days.
“The layout stage of CSS is responsible for generating and positioning fragments from the box tree. […] This specification describes an API which allows developers to layout a box in response to computed style and box tree changes.”
The VisBug Chrome/Opera/Edge Extension is an excellent tool in general, but seems to be indispensable for optimizing Houdini code.
Skip to 23:25 for the VisBug demonstration.
The future of CSS: Higher Level Custom Properties to control multiple declarations by Bramus Van Damme covers a very recent proposal (December 2020), discussed in detail in the issue [css-variables?] Higher level custom properties that control multiple declarations #5624 (GitHub).
The article @property: giving superpowers to CSS variables by Una Kravets (web.dev) provides more examples.
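A minimal sketch of what registering a property looks like (the property name is invented): declaring a type lets the browser interpolate the value, so it can be transitioned instead of being treated as an opaque string.
@property --glow-angle {
  syntax: "<angle>";
  inherits: false;
  initial-value: 0deg;
}

.card {
  background: conic-gradient(from var(--glow-angle), #e66465, #9198e5, #e66465);
  /* Works because --glow-angle is registered as an <angle> */
  transition: --glow-angle 0.5s ease;
}

.card:hover {
  --glow-angle: 180deg;
}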
Another interesting up-and-coming development is container queries (Bram.us), which are like media queries, but address the nearest “root” container among the parent containers of the element to which they’re applied. The article CSS Container Queries: A First Look + Demo takes you step by step through using them. Basically, you write @container (min-width: 38rem) instead of @media (min-width: 38rem) and assign the contain property, like so: contain: layout inline-size.
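Putting those two pieces together, a hypothetical card component might look like the following. This mirrors the early syntax described in the article and may well change as the specification evolves.
.card-wrapper {
  /* Opt this element in as a query container (early proposal syntax) */
  contain: layout inline-size;
}

@container (min-width: 38rem) {
  /* Applies when the wrapper, not the viewport, is at least 38rem wide */
  .card {
    display: flex;
    gap: 1rem;
  }
}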
The article Say Hello To CSS Container Queries by Ahmad Shadeed provides a lot of real-world examples that will make you wonder how we’ve lived with only viewport-based media queries for so long.
One of the main values added by a CSS pre-processor like LESS is nesting, which improves clarity and cuts down on duplicated definitions. The article The future of CSS: Nesting Selectors by Bramus indicates that this feature is coming to mainline CSS, as documented in the CSS Nesting Module (W3C). The document is an editors’ draft, so there’s still quite a way to go.
Nested Media Queries are already supported, though more as a side-effect of the implementations, not necessarily because it was specified that way.
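A small sketch of what the draft syntax looks like (subject to change while the module is still an editors’ draft):
.card {
  padding: 1rem;

  /* & refers to the parent selector, as in SCSS/LESS */
  & .title {
    font-weight: bold;
  }

  &:hover {
    box-shadow: 0 0 0.5rem rgba(0, 0, 0, 0.2);
  }

  /* Nested media query: applies to .card at small widths */
  @media (max-width: 30rem) {
    padding: 0.5rem;
  }
}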
The “logical properties” feature adds aliases for some of the venerable CSS properties like margin-right and margin-left that make it easier to build more agnostic and flexible content using, e.g., margin-inline-start and margin-inline-end. Assigning one of these instead of a hard-coded side means that a style will work in both LTR and RTL layouts (for example).
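For example (selector and values invented), a quote indented from the start of the line works unchanged in both LTR and RTL documents:
blockquote {
  /* Instead of margin-left, which would be wrong in RTL layouts */
  margin-inline-start: 2rem;
  /* Instead of padding-left and padding-right */
  padding-inline: 1rem;
  /* Instead of border-left */
  border-inline-start: 3px solid gray;
}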
The article Digging Into CSS Logical Properties by Ahmad Shadeed provides many more examples. The full list of proposed properties (MDN) is quite extensive. Many of the newer modules like flexbox and grid were designed like this from the very start.
See also CSS Logical Properties Are the Future of the Web & I18N by Daniel Yuschick for more information and tons of examples, with a demystification of the difference between direction (inline axis, or flow) and writing-mode (block axis).
Update: 16.10.2021
Two more interesting logical properties are inline-size (MDN) and block-size (MDN), which correspond to width and height in the horizontal-tb writing-mode (MDN). Using the logical properties means that the layout works even if the writing mode is changed to vertical-lr or vertical-rl.
The article Hands-on with Portals: seamless navigation on the web explains how this new feature in Chrome/Chromium improves support for securely embedding content from other sites (i.e. “portals”), as when using OAuth providers. It also generally improves transitions in MPAs (Multiple Page Applications) by allowing one page to prepare another rendered page in memory and then transition to it and perhaps even back.
“Single Page Applications (SPAs) offer nice transitions but come at the cost of higher complexity to build. Multi-page Applications (MPAs) are much easier to build, but you end up with blank screens between pages.
“Portals offer the best of both worlds: the low complexity of an MPA with the seamless transitions of an SPA. Think of them like an <iframe> in that they allow for embedding, but unlike an <iframe>, they also come with features to navigate to their content.”
The article Page Lifecycle API by Philip Walton (Google Developers) discusses an improvement over even the “Page Visibility” API (discussed above). Instead of just handling visibility, it also provides hooks for suspending and resuming pages.
“The Page Lifecycle API, shipping in Chrome 68, provides lifecycle hooks so your pages can safely handle these browser interventions without affecting the user experience. Take a look at the API to see whether you should be implementing these features in your application.
“[…] While the web platform has long had events that related to lifecycle states — like load, unload, and visibilitychange — these events only allow developers to respond to user-initiated lifecycle state changes.”
:root, then check out the list of Pseudo-elements to see which extra parts of a document you have access to with CSS (e.g. the ::file-selector-button selector is a relatively new addition that lets you style the button in an upload control).
Over the last several years, I’ve used many other IDEs, like Visual Studio Code for documentation, advanced search, and JavaScript/TypeScript or PHPStorm for PHP, Android... [More]
Published by marco on 18. Apr 2021 22:50:04 (GMT-5)
Updated by marco on 23. Apr 2021 08:59:44 (GMT-5)
Visual Studio with ReSharper has been my main development tool for many, many years. I first started using it in 2008 or 2009.
Over the last several years, I’ve used many other IDEs, like Visual Studio Code for documentation, advanced search, and JavaScript/TypeScript or PHPStorm for PHP, Android Studio for Java/Android, XCode for Swift/iOS, or WebStorm for TypeScript/JavaScript.
JetBrains Rider came on the scene several years ago and was not, at first, a viable alternative, but it has gotten much, much better. It now makes sense to consider using Rider as well as or even instead of Visual Studio/R#.
Before going into the new setup, let’s briefly discuss what we were replacing.
.EditorConfig used only lightly
All inspections and quick-fixes run through ReSharper. Visual Studio “squiggles” are disabled because they’re distracting and contribute nothing additional. StyleCop does a lot of the heavy lifting, but it does a bit too much. It checks spelling in documentation, even though ReSharper already does that natively.
The biggest drawback is that StyleCop uses its own parser, which is not just detrimental to performance—the Roslyn parser, the ReSharper parser, and the StyleCop parser are all running at the same time—but the StyleCop parser is also no longer compatible with some features of C# 8 and 9. It records “syntax errors” for perfectly valid code.
Rider doesn’t support the StyleCop, ReCommended, or the Enhanced Tooltip extensions. Not having Enhanced Tooltip isn’t that big a deal (Rider’s tooltips are OK), but not having StyleCop and ReCommended meant a significant number of style and formatting inspections were not applied in Rider.
Rider supports style and formatting, but it doesn’t warn or indicate when there are issues. This makes it more difficult to help developers use a common style.
StyleCop.Analyzers
The StyleCop.Analyzers project has been around for a while, but making the move is not as straightforward as just installing the package in all projects. You also have to rewrite the configuration. Luckily, they have a good template from which to start and the documentation is very good.
Since the test solution uses Directory.Build.Props, it was also very easy to include the assembly and configuration for all projects. I created a special version for test assemblies that removes the documentation requirement.
StyleCop.Analyzers has its own JSON configuration, but it uses the .NET-standard rulesets to configure inspection severities.
Removing the StyleCop plugin for ReSharper was not without drawbacks; it removed a few minor goodies to which I’d grown accustomed:
Update 22.04.2021: I’ve since discovered that “chop” is available in Visual Studio by positioning on a method, pressing Ctrl + ., and choosing one of the many wrapping options.
Also, documentation-generation is getting better with each point release.
.EditorConfig
Another standard is using the .EditorConfig file for as much configuration as possible. This format is not IDE-specific: Visual Studio, ReSharper, Rider, Visual Studio Code, and many other editors/IDEs make use of it. Keeping as many settings as possible in this file helps ensure that style and formatting are applied correctly no matter which IDE is used. It’s not a guarantee, but there’s a better chance than if these settings are stored in a ReSharper-specific format, as before.
These days, a lot of the configuration can be stored in an .EditorConfig file—all but a handful of the Rider and ReSharper settings are mapped there already and there are a few more with each release.
Directory.Build.Props
I’m also using SDK-style project files together with the Directory.Build.Props feature of the MSBuild system to consolidate configuration to just one or two files.
Visual Studio:
Rider:
Shared:
.EditorConfig used for nearly everything
I have not tested Visual Studio without ReSharper because, although Visual Studio has leapt forward in functionality, there are still too many features I miss without ReSharper. [2]
I use a separate Git client called SmartGit, so I generally turn off as much of the Git integration as possible to save power and memory. The CodeLens (VS)/Code Vision (Rider) is an amazing insight into a ton of statistical information, but I don’t ever use it, so I turned it off. Also, I don’t like how it feels when editing code because it introduces virtual “lines” in too many places. I also would sometimes inadvertently click the links and then have to close detail panels or refocus the editor.
For the same reason, I disable almost all inlay hints in Rider/ReSharper (inline hints in Visual Studio). I do not miss seeing types everywhere. I only care what the actual types are when something doesn’t compile. In Rider, you can long-hold the Ctrl key to show inlay hints on-demand. The only inlay hint I always show is for inherited attributes (e.g. for [NotNull]
annotations).
I’ve also disabled Code Folding (Rider)/Outlining (Visual Studio) because I never use it. I don’t need to see the noise along the left-hand gutter and I don’t need to accidentally click the nodes (or accidentally trigger a folding with an inadvertent key combination).
These are options that I ended up changing from the defaults.
For C# Code style, I ended up adding these extra settings. There are probably others, but these are the ones that made ⌘ + K / ⌘ + D usable for me, especially for the single-line null-check statements that we use a lot.
With the first two settings, the formatter won’t fix some things that it would have fixed before, but it’s also not going to change a whole bunch of stuff that you’d rather it left alone.
It took me a few tries to configure Ctrl+K/Ctrl+D (format document) in Rider, which doesn’t work as loosely as in ReSharper/Visual Studio. In Visual Studio, it leaves single-line argument checks alone. Rider is more … consistent … and reformats all lines, which messes up a lot of formatting.
On the positive side, the configuration for Rider ended up improving “Code Cleanup” in Visual Studio/ReSharper, which had never worked so well before. I eventually figured out how to set things up so that “Format Document” and “Code Cleanup” (Ctrl+E/Ctrl+F) both work flawlessly in Rider and Visual Studio, but it took some time and patience to find all of the settings. The “Detect Formatting Settings” in both ReSharper and Rider were indispensable.
I also finally configured the “File Layout” feature so that “Clean Up Code” works as expected. StyleCop Analyzers supports enforcing an ordering on members, but it doesn’t support configuration of that ordering. The order is fixed as StyleCop wants it. Their default style has fields at the top, which is a no-go for our style.
That means that I’ve disabled the “arrangement” feature of StyleCop and no longer see warnings about out-of-order members. This is OK, though, as re-ordering members just to fix a warning is not that great for reviews and merging. “Clean Up Code”, however, does apply the file-layout rules.
I think that this is a better balance overall, as leaving a method in place when you’ve changed its visibility from public to protected (or vice versa) should not earn a warning.
As noted above, I configured all of the StyleCop, .EditorConfig, and Rider/R# settings to make “format document” and “clean up code” work perfectly with our style. These are just a jumping-off point (even within Encodo). Adjust StyleCop inspection severities in the *.ruleset files.
Adjust formatting preferences in the .EditorConfig whenever you can. Rider/ReSharper will also allow you to override these settings, storing them in the *.sln.DotSettings file, but it’s clearer and more consistent to configure the ruleset and .EditorConfig files because those are more human-readable and better-documented than the *.sln.DotSettings file.
I made this comparison over the last 4 months, during which the setup changed slowly into the configuration outlined above. I have tried to weed out the notes and impressions that no longer apply, but I may have missed some. I do my best to give the impression of what it’s like to work with these IDEs. I left some longer descriptions in place, just to give a feel of what I experienced while using the IDEs.
For small-to-medium projects on my 4-year-old desktop, you barely notice startup. For the larger Quino project, with over 120 projects (for now), startup speed is more noticeable.
All of the IDEs start relatively quickly now. They’re just fast in different places. It really depends on where your focus is. Visual Studio by itself starts very, very quickly. The latest versions of ReSharper start up in parallel, so VS is on the screen and the editor is typable in seconds, even with a solution like Quino. You can’t search at that point, though. [3]
Rider looks like it’s totally up and running, but it mostly can’t search either, not until the projects have been processed and the indexes loaded. The initial Rider project-chooser takes longer to start up than you’d expect. Once it’s up, though, opening a solution from there is very fast. Rider runs all open solutions in a single process. Visual Studio launches a separate process per solution.
While I’m happy that the startup speed has improved all-around, I don’t really care about startup speed, not really. I never reboot unless I have to. I never log out unless I reboot. I just leave my tools running all the time. I have 32GB of RAM. Once it’s running, it’s running, and I don’t care how much RAM it takes (within reason)—I care how fast it does the things I ask of it.
Once I configured StyleCop.Analyzers, my initial solution-load in Rider showed a shocking amount of memory for Quino (an extra 4.5GB just for the Roslyn checker process). It felt fast enough, even though the memory usage kept growing. Rider’s a 64-bit process and I have 32GB of RAM on my desktop, so it was a luxury I could afford.
Luckily, after a restart, the memory usage was still higher than before, but now stable at around 3GB.
Solution-wide analysis is enabled by default in Rider, with no performance degradation noticeable at all. In fairness, there is little to no performance degradation evident with ReSharper in Visual Studio either.
Code Vision is enabled by default in Rider; also no performance-degradation noticeable. I am running everything on a desktop and I have seen CPU usage spike quite high on Rider. Code Lens in Visual Studio and Code Vision in Rider both probably suck the life out of a battery, though. TANSTAAFL.
While it’s nice that Rider uses all available CPU power for certain tasks—e.g. building—I imagine that the CPU fan would be running a lot under heavy usage. Visual Studio probably suffers the same, though its CPU usage seemed to be flatter when I checked.
Solution-reloading is more stable and a bit faster than in Visual Studio. In a recent task where I was constantly cherry-picking and rebasing, making changes to project files and the solution file, Rider just worked. Visual Studio would usually throw up a yellow warning bar at the top sooner or later (usually sooner).
Sometimes, Rider is quite slow at getting its “intention actions”, something I’ve never seen with ReSharper.
This usually clears up after 5-10 seconds, but a couple of times, Rider went looking for inspections for 10 seconds and came up with nothing—repeatedly. It’s odd because, in that case, Rider kept having trouble with the same extension-method call and had to look it up again and again. This effect is noticeable in other places, as well. When you elect to show the dialog to “Configure Inspection Severity”, then sometimes it takes several seconds to show the dialog box (with no user feedback).
And, sometimes, Rider just dies. For example, when I look up sources for a .NET type, like IndentedTextWriter, by using ⌥ + F12. Rider showed a dialog for several seconds, but didn’t seem to be doing anything. It wasn’t downloading, as expected; instead, it just showed “Searching for implementations…”.
This wouldn’t be worth mentioning but, after having dismissed the dialog, now I can’t navigate to anything with F12. I have to restart Rider. This is not the first time that this has happened. This never happened with Visual Studio. It definitely makes the IDE feel much shakier.
In Visual Studio, with R#, I can view the sources for IndentedTextWriter after only a slight pause.
On the subject of reloading: Visual Studio definitely still freezes more (usually showing its yellow warning bar at the top after a few seconds), while Rider is just more subtle about looking loaded while still being unusable. You have to keep an eye on the progress bar at the bottom in both IDEs. In general, Rider reloads more quickly than Visual Studio—and has no UI “hangs”, like VS still does, for a few seconds—but not always.
On the other, other hand, I’ve also experienced more build errors after changing framework targets than with Visual Studio. Rider can’t copy files or it’s looking in the wrong place for files. Restarting Rider fixed that problem, but I shouldn’t have to restart to fix a build. Rebuild should have fixed it, but it didn’t.
I was unable to get Rider to respect the generated_code setting from the .EditorConfig file, something that worked immediately with Visual Studio/Roslyn (ReSharper is not involved). I’ve reported that issue as RIDER-61283. In the meantime, I’m using the “Elements to Skip” feature to ignore the same file masks Rider should be ignoring anyway. That at least works for now.
Still, Rider’s integration is nice because it pulls everything together into a single list, but its quick-fixes for Analyzer inspections aren’t as strong as Visual Studio’s nor can you actually fix everything (see the issue with UTF8 below).
In Visual Studio, the analyzers work quite well, but there is no integration with ReSharper. Instead, the integration with Visual Studio is really good—with Ctrl + . instead of ⌥ + ⏎, you can get quick fixes and even apply them to the entire method, document, project, or solution.
In Visual Studio, there’s a very nice preview mode. In fact, there is useful and accurate user feedback throughout, which was a pleasant surprise. It’s quite fast in collecting fixes for all 120 projects and applying the changes. There’s even good keyboard support for arrowing to the file/project/solution actions. This is a definite boon for getting through thousands of fixes quickly.
In Rider, there are quick fixes, but most of them only work for a single instance of the inspection. Some of the fixes (e.g. each attribute on its own line) can be applied to file/project/solution with ReSharper as well, but not all. Some of the fixes aren’t available at all with ReSharper (e.g. SA1513, insert newline after brace) but are available in Rider.
So, Visual Studio’s integration with Code Analyzers worked better out of the box, but it forces you to use both ReSharper quick fixes (⌥ + ⏎) and VS quick fixes (⌘ + .), depending on which system detected the issue. The inspections also show up in two different panes. This is actually easier to get used to than it sounds, though.
There is no ReCommended extension for Rider (with no plans to add support, according to issue #51: Add support for Rider 2020.2, which was closed as “too much work”). All of these inspections are missing in Rider.
async/await usage
When you add a parameter to the constructor, Rider doesn’t mark the identifier as unused if it has an attribute. In the examples below, you can see that the identifier is grayed out in Visual Studio, but not in Rider.
Sometimes Rider doesn’t indicate when a conditional access is unnecessary (e.g. when ?. can be converted to .). It also doesn’t indicate when an expression that is always false or true could be simplified as reliably as ReSharper does.
Neither Rider nor ReSharper seems to notice when you do a silly pattern-matching check, like if (sender is Person person) when person is already a Person. VS, Rider, and ReSharper simply assume that you’re doing the check in order to assign the variable, I guess.
Now I know why the solution-wide analysis is so fast in Rider: It doesn’t reevaluate warnings when the project changes (e.g. if you change the root namespace). You have to visit each file individually for it to clear the warning. Clicking “Reanalyze all files with errors” doesn’t work on files with warnings, as it does under ReSharper.
You can use ⌥ + ⇧ + PgDn to jump through the warnings, opening each file as you go. It’s pretty fast, but feels clunky. This is especially unfortunate when Rider thinks that there are errors. I suppose that this is a side-effect of repeated solution/project reloads as I’m quickly switching branches.
Changes to the ruleset and stylecop settings are noticed in both IDEs instantly. I changed a rule from warning to info and Rider changed the color of the squiggle in what felt like less than a second. Unfortunately, changes to the .stylecop.json file are not picked up without a reload of the solution.
Here is where ReSharper is perceivably faster than Visual Studio. It’s even a bit faster than Rider. Turn on solution-wide analysis. Remove the last reference to a function. Watch ReSharper gray out the identifier in the declaration nearly immediately. Or remove a method call. Watch ReSharper underline it immediately. Visual Studio/Roslyn? Still feels laggy.
ReSharper’s list of errors and warning updates immediately. Rider’s is pretty good, too, but, mysteriously, not as accurate or quick-to-update as ReSharper’s. Both are much faster than Visual Studio/Roslyn, which often takes long seconds to clear warnings or errors—and sometimes never does, until you force a build.
Roslyn (Visual Studio) is sometimes flaky and won’t clear old warnings/errors until the next build. ReSharper was definitely faster here, even with the extra StyleCop parser. This didn’t use to be an issue, but with the switch to Code Analyzers, I’m now using Visual Studio/Roslyn for a good portion of my inspections (StyleCop).
What does flaky mean? Whereas Rider updates relatively reliably when you make a change in any file, StyleCop Code Analyzers in Visual Studio will only occasionally show the warnings. If the file isn’t open (or in some sort of in-memory cache), then only a “Rebuild All” will make the warning appear. This also only works if you’re not using “ReSharper Build”.
Rider does this much less often, but it still does occasionally have incorrect inspections that can be very difficult to correct. For example, the following screenshots show an unrecognized dictionary.
Visual Studio recognizes the using System.Collections.Generic, but Rider grays it out.
Restarting Rider sorted out this error. Several other cached errors and warnings disappeared with the one noted above.
Rider is very quick, as is ReSharper. Also, it’s generally pretty good on updating inspections, but I’ve also seen flakiness with lingering warnings and errors in the pane, but never in the sources. The only way I’ve found to update the pane is by actually opening the file, at which point Rider re-detects that the issues are gone and clears the inspections. Manually triggering a reanalysis does not help here.
The solution-wide find/replace window in Rider is lightning-fast and supports newlines, copy/paste, and regular expressions, and shows change previews. It’s wonderful. The change previews in Visual Studio Code are just a tiny bit better, but the overall experience is solid and super-fast. The search/replace in Visual Studio looks very dated next to this feature in Rider.
Navigation to other files is so fast in Rider that I sometimes thought it hadn’t navigated (it had!)
There is no way to navigate the warnings in a solution using the keyboard. In general, Rider tends to let panels “steal” the keys for next/previous, so when you try to navigate errors or warnings or find-results, the test session can “steal” these keys and suddenly you’re navigating tests and fixtures instead. I find myself grabbing the mouse more often in Rider than I do in Visual Studio.
Where ReSharper has Ctrl + T as a central search for everything, the same key combination does not include “search everything” in Rider. For that, you need to switch to Ctrl + ⇧ + F. On the other hand, the dedicated “find in solution” panel is lightning fast and makes up for having to switch between panes.
Rider doesn’t really support extending a non-contiguous selection. It has column-selection mode, like Visual Studio, but it doesn’t have ⌘ + Shift + . to select “like” text. In Sublime Text and Visual Studio Code, this feature is available via ⌘ + D. Rider doesn’t seem to have this, which limits editing capabilities. There is documentation for multi-selection, but the shortcut keys are confusing and not the ones I have assigned. Nor can I find anything in the keymap with any of those names. It’s either a new feature or it’s only partially supported.
Update 23.04.2021: I just tried ⌘ + Shift + . in Rider (even though that wasn’t documented) and it works just like in Visual Studio! That’s a nice surprise. I’m not sure if this was always there and just poorly documented or whether they just added it in a recent release. At any rate, good news for editing in Rider.
Pressing Ctrl+K/Ctrl+C comments code. However, instead of commenting again, it uncomments if applied a second time. This means I can’t “double comment” to indicate that this code is temporarily preserved, but should not be flagged as commented code to be removed.
Double-clicking on an identifier uses CamelHumps, if you have CamelHumps enabled (just like all other JetBrains tools). With ReSharper, though, the CamelHumps apply to cursor-based word-selection, but a double-click selects the whole word. I think that’s a better balance because that’s what I expect when I double-click an identifier. I don’t think I’ve ever wanted to select just a part of the double-clicked word by default. It’s not a deal-breaker, but it’s annoying because I have to double-click, then extend the selection manually to get the full identifier.
The undo function in Rider fails much more often than I’m used to from Visual Studio. I’ve deleted lines of documentation and then hit undo and Rider couldn’t get them back.
Once the undo buffer is broken, you have to restart Rider in order to be able to undo again. It feels quite unstable. I’m quite surprised, considering the literally dozens of popular IDEs built on this platform.
Rider creates files as UTF-8, but without the BOM. Then the StyleCop analyzer demands that the file have a BOM, but there is no quick fix in Rider for this, nor is it clear how to convert the file. I end up switching back to Visual Studio, where there’s a quick fix to set the encoding properly.
Typing speed is better in Rider than in Visual Studio/ReSharper. Just a little, but it is. It’s smoother. Even after replacing the StyleCop extension with StyleCop.Analyzers, it still feels a bit smoother, overall. Rider on Mac feels even smoother than on Windows.
I just wasted 10 minutes in Visual Studio trying to figure out from the documentation how to create a StreamWriter with a non-default encoding. The list of overloads did not show any overloads when using a path.
I searched, and the wizards at StackOverflow rather snippily asked why not just use the docs. So I looked at the docs and then switched to the right target (first .NET 2.1, then .NET Standard 2.0), but the desired overloads have been around forever. Back to VS and it is really not showing those overloads. Switch to Rider and … there they are.
It turns out that Visual Studio has a maximum height for its overloads list. The only hint that there are more methods are some heretofore not-noticed dashes at the bottom. The only way to see the other overloads is to select the popup and use the arrow keys. There is no scroll bar or other evidence to indicate that this is possible. There is also no reason why the popup couldn’t be taller.
In Visual Studio, the developer can use the up arrow and down arrow to traverse the various overloads, showing the documentation for them. In Rider, it’s not obvious how to navigate. The trick is to keep hitting ⌘ + ⇧ + space to cycle forward through the list.
Typing a { in a non-interpolated string does not show code-completion. In ReSharper, you can type {, select a variable, and ReSharper automatically makes the string interpolated. If you add a parameter, Rider rightly complains that the data between the curly braces needs to be an index, but doesn’t offer to convert the string to interpolated. You have to go back to the front of the string and add the $ yourself. This is now working in Rider 2020.3.
Rider doesn’t offer to rename related symbols as much as ReSharper does. For example, if you rename a field, ReSharper will offer to rename the constructor parameter that sets that field. Rider does not.
When you insert a new parameter in a method call and then tell Rider to add it to the method, it then shows a panel with other calls that need to be updated, asking how to handle each one. This is the same as in ReSharper and is a welcome feature. As in ReSharper, you can navigate the various calls with the arrow keys and the focus is set correctly. However, I can’t figure out how to activate the choices with the keyboard. I have to use the mouse.
The NuGet integration is nice in Rider and the NuGet Explorer is quite fast. It still doesn’t feel as robust as Visual Studio, but it’s getting there. I rarely went back to Visual Studio to try to resolve an issue I couldn’t solve in the Rider UI.
Rider’s “build” command still doesn’t notice when you’ve changed packages external to the solution and do a nuget restore for you. In fact, when I updated Winform DevEx packages externally (because neither the NuGet UI in Rider nor that in VS could apply the changes without getting tripped up in dependencies because it can’t upgrade multiple projects at once), Rider had no idea what I’d done until I manually deleted the obj folders from the projects that depend on DevEx.
I don’t recall having to do that for Visual Studio, which runs a nuget restore check before each build. Visual Studio was more amenable to finding the actual error with a “rebuild all”. Rider cached more and stayed stuck on the original “error”, which was hiding the real problem (an interface mismatch after the upgrade).
When you update NuGet packages, Rider uses stale data a lot more than Visual Studio does now. This is how Visual Studio used to be, but it’s gotten a lot better with its caches. Rider is still a few steps behind. I just upgraded NuGet packages for a project and then ran the tests. A bunch of them failed with a MissingMethodException.
I know this error, so I forced a full rebuild and ran the tests again. This time everything worked. With Visual Studio, I’d gotten used to no longer having to consider “rebuild all” or “restart the IDE” as possible solutions. With Rider, you still have to occasionally use these solutions, for now.
It’s not the end of the world, but it does waste time and effort—especially if you don’t jump to that conclusion quickly enough. Often enough, you’ll lose a good quarter of an hour chasing phantom errors and warnings instead.
When you edit a unit test to change the parameters to a test case, the test session will update and then move the selection to the top of the list. This is very annoying since it always scrolls away from the test area I had focused. It also has an annoying habit of nearly constantly changing the selected item in the tree, making navigation difficult.
This might be related to when tests are running or a build is running, but there’s always something like that going on—it’s not very nice that the whole IDE has to be quiet before I can use keyboard navigation in a tree without Rider constantly stealing focus and jumping around.
While running tests, Rider does not allow you to collapse nodes in the unit-test session. It quite annoyingly expands it again whenever you try to collapse a node.
Searching in tests is quite slow in both Rider and ReSharper.
Update 23.04.2021: I’ve discovered that I can use F4 in Rider to jump to the source of a test. That’s very handy because double-clicking on a test in either test runner has unpredictable results that seem to depend on whether the test is defined in a base class.
I can’t treat the Unit Test Session window as an editor window in Rider, so it’s harder to switch back and forth. The tests are docked at the bottom by default. You have to switch to that window with a hotkey, then use another hotkey to hide it. I’m getting used to it, but I don’t understand why Rider doesn’t support this feature (none of the other JetBrains IDEs I’ve used has it either).
Integrated debugging with auto-disassemble and sources in Rider is pretty awesome (e.g. I debugged into SimpleInjector without SourceLink). You can open any referenced type in any assembly and either have the original source from SourceLink [4] or disassembly. In either case, you can set breakpoints and debug into it. If the file is disassembled, it’s not always pretty, but it’s amazingly useful for inspection.
The Smart Step-in feature in Rider is a very nice upgrade, to which I’ve already become quite accustomed (just ⇥ to cycle locations). It’s a bit finer-grained than being able to disable property step-in universally in Visual Studio.
On the other hand, I’m not super-happy with the different ways of running an application in Rider. They seem to make it very difficult to debug an application and stop on unexpected errors. I’ve seen other users using Rider just kind of look in the output window as if live debugging wasn’t a feature we should all expect to work. It can be configured, but you have to make sure to run in debug mode and turn on exception-handling.
It’s also much harder to debug a StackOverflowException in tests because Rider doesn’t show a useful stack trace (it instead shows a trace for the LogException in the test runner itself). The “launch log file” is detailed, but provides no additional information. Instead, I was forced to set breakpoints and continually “edge closer” to the crash to find it myself. This is how Visual Studio used to work, but for a couple of years, its handling of stack overflows has been much better.
Also, Rider doesn’t stop on unhandled exceptions by default, either when running tests or running a web server. The stack trace in the debug output when running the web server isn’t highlighted and can’t be clicked.
The debugger in Rider does not make use of the DebuggerTypeProxy to display or format debugging information, which is a shame because Quino has useful customizations for debugger display that I miss in Rider.
I was unable to debug unit tests for a while because Rider complained that my DotNet runtime (AnyCPU) didn’t match the chosen testing target (x86). All of the solutions I’ve opened have been “Any CPU”-only, so I was mystified how Rider came up with the idea to run my tests as x86.
Rider pops up a helpful tip to take me directly to the setting to change the runtime to use. I don’t even have an x86 runtime. And I don’t want to run tests as x86 anyway.
The real fix is to go to Settings => Build, Execution, Deployment => Unit Testing => Default platform architecture and set it to “Automatic”. Mine was hard-coded to x86, for some reason (maybe a settings upgrade from an older version).
Viewing a variable isn’t as easy because Rider uses a much less-stable tooltip than VS. If you have a long value that you want to “view”, you have to cruise your mouse along a long, skinny tooltip for dozens of centimeters before you can click the “view” button (you have to know it’s there) at the end.
Since the tooltip is unstable, Rider has trained me to go down to the variable window and copy the value from there.
Both Rider and VS/ReSharper support navigation using SourceLink as of 2020.3, which is a massive win for usability. Now you can open a type with Ctrl + T or hit F12, ⌥ + Home, ⌥ + End to navigate to a related symbol from source and Rider/ReSharper will navigate within the SourceLink sources, which means that you can easily set breakpoints in code from NuGet packages, as long as they have SourceLink. Rider additionally offers support for setting breakpoints in disassembled code, with mixed results.
However, browsing works less well in Rider. For example, I pressed ⌥ + F12 on EventHandler to “peek” it and it popped up a processing dialog for 15 seconds before I canceled it. When I pressed F12 to navigate there instead, it didn’t show a progress dialog, but it also just seemed to break Rider because syntax-highlighting and code-completion stopped working for subsequently typed code. The “Errors in Solution” pane was similarly crippled, showing files with warnings, but no warnings. The navigation action never showed the code for the EventHandler, but it did make everything else stop working. A restart fixed everything.
In addition, navigation to authenticated sources was only working temporarily. It is broken in the most recent version of Rider, as I’ve documented in RIDER-61280.
The formatting for XML documentation works strangely when Rider inserts text in documentation (e.g. when you apply a fix). We use a tab size of 2 everywhere, but the settings window shows a tab size of 4, while also mentioning that some settings might be overridden by the .EditorConfig. Reformatting or cleaning up code fixes the indentation to where it should be. It’s unclear where Rider is getting its settings for the initial insertion.
Even with the StyleCop Analyzers, there are fewer fixes for XML documentation than with Visual Studio/ReSharper. For example, there is no way to quickly add parameter documentation. Rider does not have any significant support for generating documentation (the initial format is very compact and never formatted according to rules).
Rider’s parameter-completion in documentation works more smoothly (Esc not necessary), but it does not use “smart” sorting for tags. In ReSharper, once I’ve selected paramref once, that is sorted to the top and selected by default. In Rider, the order is unchanged, so I have to arrow down or type out most of the tag name in order to get past param.
Rider still shows a hint to add <inheritdoc/> on the class, even if the class has its own documentation.
There’s an extra item in the action list for “move to separate file” that does nothing. There’s another item that includes the name of the file in the caption that does work.
There’s no Enhanced Tooltip extension (and the tooltips are not as nicely formatted in Rider)
I can’t seem to change colors of icons as I can for ReSharper. I’d gotten used to brighter colors and miss them in Rider.
In ReSharper, you can disable specific inlay hints directly from the completion menu. In Rider, you can do this for some of them, but not all. If it’s not there, you have to select “Configure inlay hints” and then have to find the corresponding checkbox yourself.
Rider doesn’t keep track of the last opened solutions to open from the task list. [As of 2021.1.1, the task list is now populated with recent solutions.]
The “Commit” panel doesn’t refresh very quickly at all. Long after I’d seen the files in SmartGit, they were still not in the panel. When I switched away and then back, the new changes suddenly appeared. I don’t use the integrated Git support, but I’m not going to start, either, after seeing how it works.
I can’t search for the bindings for a key combination in Rider, like I can in Visual Studio. Instead, I have to guess at the name of the operation that I think it’s bound to.
Update 23.04.2021: I’ve found that if you click on the magnifying glass to the right of the search field, you can “Find actions by shortcut”.
Rider also doesn’t have the “show active configurations” panel, for some reason. I’m currently fighting with Rider because it suddenly came up with the idea to format everything with 4 spaces instead of 2 spaces. Just yesterday this was finally working so that I could reformat the document and everything worked. Now, Rider is reindenting everything for me. Visual Studio/ReSharper is showing that I have 2 spaces configured.
Although Visual Studio/ReSharper edged out Rider in most of these categories, you’re well-served with either one. I think if I’d compared Visual Studio by itself to Rider, then Rider would have won easily. It’s only in combination with ReSharper that Visual Studio ends up being a bit better. It’s just more mature and I never found myself going to Rider from Visual Studio, whereas I did have to open Visual Studio a few times to fix something I couldn’t do in Rider.
It’s happening less with each version, though. Over the four months of the evaluation, Rider has improved steadily. [5] I think you’re well-served with either one.
Once Rider files off a few more rough edges and has true feature-parity—perhaps by natively implementing some of the inspections from the ReCommended extension—its slightly smoother editor might help it pull ahead in this comparison.
Most of the above is complaining at a very high level, though. Both IDEs will make anyone who knows how to use them a much more efficient developer of reliable and readable code.
The last time I tried working with Visual Studio without ReSharper was over two years ago, with Visual Studio 2019 Preview 3. Still, I can see much more of Visual Studio working better than ever, taking over more and more of what I use ReSharper for.
I’d installed Visual Studio 2019 Preview 3 to investigate the following,
I installed the desktop and web-development workloads, totaling almost 6GB.
In November of 2020, Richard Lander wrote in the article Announcing .NET 5.0 (MS Blogs) that,
“[…m]oving forward, the idea is that as when we add new features to .NET, we’re also adding corresponding analyzers and code fixers to help you use them correctly, right out of the gate.”
and
“With .NET 5, we have heavily improved our support for static code analysis. This includes an analyzer for platform-specific code and a better mechanism to deal with obsoletions. The .NET 5 SDK includes over 230 analyzers!”
The latest versions of VS also allow you to fine-tune the severity of any warnings directly from the UI/Solution Explorer. This is all a great leap forward for Visual Studio 2019, but ReSharper still improves the following features:
Published by marco on 30. Mar 2021 21:04:45 (GMT-5)
As with installing a dotnet tool on Azure, there isn’t a standard task for setting a Git tag from a pipeline YAML configuration. The Pipeline UI has an option to easily do this, but that hasn’t translated to a task yet, nor does it look like it’s likely to, according to online discussions.
Setting a Git tag is relatively straightforward, but is complicated by permissions (as with installing a dotnet tool). To tag a build, you just execute the git commands in a script.
- task: CmdLine@2
  displayName: Push Git Tag
  inputs:
    script: |
      git tag $(Build.BuildNumber)
      git push origin $(Build.BuildNumber)
If, for whatever reason, you want the tag to be created by the triggering user, then include the following lines as well:
- task: CmdLine@2
  displayName: Push Git Tag
  inputs:
    script: |
      git config user.email $env:BUILD_REQUESTEDFOREMAIL
      git config user.name $env:BUILD_REQUESTEDFOR
      git tag $(Build.BuildNumber)
      git push origin $(Build.BuildNumber)
You should include this step after the version number has been updated.
With the task in place, you have to ensure that you’ve granted permissions to the proper user.
Published by marco on 29. Mar 2021 22:36:59 (GMT-5)
I have a .NET solution (Quino) that contains a project that I publish as a `dotnet` tool. The tool calculates a version number based on the branch and version number found in the solution. I use it from Quino itself and also from other project pipelines.
In order to use it from any pipeline (including Quino itself), I need to install it from the Quino artifact feed. The original solution is a couple of years old: I’d had a secure file for NuGet.Config that included the PAT. This works fine, until the PAT expires.
So, I went searching for a better solution and thought I’d try something a bit more resilient and better-supported. By now, I’m using YAML files for my pipeline, so I tried the DotNet task, but it doesn’t support installing tools.
There are open issues and even a very old open pull-request for supporting a Microsoft tool on Microsoft’s premiere hosting service that Microsoft has steadfastly ignored. There seem to be no plans for supporting dotnet tool install natively, with seamless authentication, as they’ve done for dotnet restore. The example below shows how this works for restore.
- task: DotNetCoreCLI@2
  displayName: 'Restore Server Packages'
  inputs:
    command: 'restore'
    feedsToUse: 'select'
    feedRestore: 'Quino'
    projects: 'server/src/**/*.csproj'
    verbosityRestore: Normal
    includeNuGetOrg: true
I was hoping to follow this pattern to use the dotnet task to install a tool with something like the following:
- task: DotNetCoreCLI@2
  displayName: 'Restore Server Packages'
  inputs:
    command: 'tool install'
    feedsToUse: 'select'
    feedRestore: 'Quino'
    includeNuGetOrg: true
    isGlobal: true
    toolName: quino
There is no support for this. The PR mentioned above would support it, but it’s never been accepted and Microsoft has not seen fit to add automatically authenticated feeds for anything other than restore.
Instead, I use two tasks: the first is a workaround for the lack of proper support in Azure for `dotnet tool install` from authenticated feeds; the second installs the tool. See “dotnet tool install/update” not working with Azure Artifacts #10057 and Add dotnet tool install command to support tools location in Azure Artifact feeds #13401 (the PR) for more information.
I can copy/paste the two tasks below into all of the pipelines that need it. It’s a bit bulky and non-intuitive, but it is both project-agnostic and doesn’t include any passwords or PATs directly. Instead, it uses the $(System.AccessToken). If the project has been granted access to the feed identified by <INTERNALFEEDURL> using the standard feed permissions control panel, then it works.
- task: NuGetCommand@2
  displayName: 'NuGet Add Credentials For Internal Feed'
  inputs:
    command: custom
    arguments: >
      sources add
      -Name "<INTERNALFEEDNAME>"
      -Source "<INTERNALFEEDURL>/nuget/v3/index.json"
      -Username "this_value_could_anything"
      -Password "$(System.AccessToken)"
- task: CmdLine@2
  displayName: Install tools
  inputs:
    script: dotnet tool install <TOOLNAME> --global
Where:
<INTERNALFEEDURL> is obtained from your Azure project
<INTERNALFEEDNAME> doesn’t matter, as long as it doesn’t conflict with any defaults
<TOOLNAME> is the name of the tool to execute
This is utterly unintuitive, but it works and it’s not too much hacking. I think it’s indisputable that it would be much nicer if “install tool” was an option for the “dotnet” command. It’s not like it’s an external tool. This is literally how Microsoft has asked us to work.
It would be nice if I hadn’t had to spend half an afternoon trying to figure out how to get a dotnet tool installed from a feed in the same project on Azure. I’m glad I got it working, but everyone who comes after will also waste time trying to figure this out—or will give up and use a gross hack instead.
Published by marco on 13. Mar 2021 22:09:08 (GMT-5)
The article Getting the Mouse Position using CSS by Bramus talks about a neat trick that uses sibling elements to react to mouse events without using JavaScript. It also features some kick-ass translucency and animation effects with CSS transitions.
As you move the cursor around, the layer of “cells” changes the X and Y positions that the CSS text elements “watch”. This lets the central elements “follow” the mouse, transforming a stack of “CSS” texts into a nicely composed and layered stack. It looks like this.
While this is a nice-looking effect—and it’s impressive that it works purely in the browser and purely in CSS—it kicked in the fan on my iMac, something that rarely happens.
That said, the compositing features of a modern browser are impressive and can save website authors a lot of time and effort. That this is even possible is already really, really nice. Maybe with a bit of tweaking, it can be made less detrimental to battery life.
If you want to try it out yourself or tweak the code, check out the CodePen.
Published by marco on 4. Mar 2021 22:39:04 (GMT-5)
Updated by marco on 4. Mar 2021 22:39:38 (GMT-5)
This a quick note for anyone else who’s downloaded the latest version of SmartGit (20.2.3 #16150) and is seeing mysterious stashes that they know they haven’t created.
There’s a new feature called “discard to stash” that is enabled by default.
What this does is to stash every time you press ⌘ + Z to discard changes. I understand that this is a failsafe “just in case”, but I kept ending up with a dozen stashes I had no use for. On balance, I’d rather have the tiny risk of wanting changes back that I’d discarded—I can’t recall this ever having happened—than the “noise” of stashes muddling the list of actual stashes I’d saved.
I started off trying to train myself to hit right arrow and then enter, or typing D instead of ⏎, but I gave up and found an “advanced” preference to switch the default behavior.
Published by marco on 17. Feb 2021 21:56:03 (GMT-5)
In one recent week, I realized I’d been working in many different areas and on many different projects, so I took an inventory.
For one project, I reconfigured a program with Delphi Pascal, using Delphi 7 (it’s a very old, legacy solution) to run on my local machine instead of in a VM that had swollen to 120GB. For that project, I also used SQL on SQL Server, running in a Docker container that I’d configured with YAML. The solution has several products, among which you can switch, so I wrote a Windows Batch program to transfer and back up versions, so you can nicely diff them with SmartGit using Git. In order to diff SQL, I used a tool written in TypeScript, which I extended with a few fixes and tests written with Jest in Visual Studio Code. I updated the documentation in Markdown.
At the same time, I was working on Quino, written in C# for the .NET platform, using Visual Studio on Windows and Jetbrains Rider on MacOS. I also set up a new solution using Quino, which involved editing a bunch of XML project files as well as configuring SQL Server and PostgreSql with Docker. I again used YAML to define pipelines in Azure DevOps.
For two evenings, I graded final projects for a JavaScript class I’ve just finished teaching. On the other evenings, I researched modern HTML, CSS, and SVG for an upcoming redesign of earthli. I made a few PHP fixes for earthli as well.
I wrote blog posts, wiki entries, and issue analyses in Jira Syntax, Markdown, XWiki syntax, and earthli Syntax in both English and German.
Published by marco on 17. Jan 2021 17:43:34 (GMT-5)
A software product with undocumented or poorly documented commits and a patchy issue-tracker is akin to a shipping pallet with 100 boxes haphazardly stacked on it, all wrapped up in shipping cellophane. You can see some of the labels and some of them you can’t and some of the boxes definitely don’t even have labels at all.
If it looks like the pallet to the right, then you already know you can’t ship it. That’s an obvious train-wreck of a project that’s going to blow up in everyone’s face. But the picture to the left looks…OK…ish. How do you know if it’s legit? Check the shipping manifest and get out your scanner gun, right?
The shipping manifest on your clipboard has 3 and ½ items on it, none of them really helpful. If you really want to be sure about what you’re shipping, you’re going to have to unwrap the whole thing and look at each box individually, noting it on the manifest if it’s missing — and maybe even opening it up to see what’s actually inside. Maybe it’s even broken and leaking on other boxes, somewhere in the middle of that whole pile.
Maybe someone wrapped it in cellophane to give it the sheen of reliability, but you can’t know for sure. Is it possible that you spend all of the time to dot the i’s and cross the t’s just in order to find out that it was fine, but just drastically under-documented? It’s possible, of course. That’s a risk you take when you try to be professional. The alternative is to become a gambler—shipping something and hoping that it doesn’t come back to haunt you.
A better approach would have been to use a documenting process as you built the product—like engineers rather than cowboys—slowing our awesome selves down a bit, but also—maybe, just maybe—getting faster because we’re more careful and can avoid wasting time on work that doesn’t need to be done.
Documenting the work to be done—e.g. to explain it to other team members—can have the much-appreciated side-effect of focusing you on the work that actually needs doing. This is generally more efficient and satisfying than just shooting out of the gate and doing what you “know needs doin’” and not noticing possible ramifications until it’s too late to do anything but react to them rather than plan for them.
In the end, you have not only solid, well-designed, and tested software, but also good documentation of what was actually done for a given release, as well as analyses for what was not done and what needs doing in the future. That everything is well-documented enough to implement now means you’ve got half a chance of still knowing what it means in ½ or ¾ of a year when you finally get a chance to plan and implement it.
Who knows? You may never need to work on it again—which is just fine. At least you’ll know what you didn’t implement and why. This is very helpful for that time, in a year or two, when you think of this exact same solution and are maybe too stressed or under too much pressure to remember why you decided against it the first time.
A good software product is not just the product itself, but all of the metadata surrounding it: the documentation, the analyses, the release notes, the roadmap.
Published by marco on 17. Jan 2021 00:01:34 (GMT-5)
Until now, PHP debugging involved a fragile balance between the IDE, the server, and the debugger, each with overly verbose configuration. On top of that, using Docker introduced the wrinkle that you were technically debugging on a remote server rather than on the “real” localhost.
It’s been a long journey, but it’s finally a lot easier to set up PHP debugging with a server running in a Docker container. Once you use the most modern tools, everything works with a couple of lines of configuration.
tl;dr:
- Ignore anything you find on StackOverflow from before November of 2020 and use the install-php-extensions project instead (see example below).
- Set environment variables in the docker-compose file to indicate the client and the default mode (debug)
- Use the latest PHPStorm, which supports XDebug 3.x
My setup is as follows:
So far, so good: it’s basically a standard developer setup for PHP where I have an IDE on my machine and am running servers in Docker containers. XDebug initiates a connection from the server in the “web” container back to the IDE on the docker host.
Without further ado, these are the magic configuration files to install extensions and set up XDebug for PHP.
After much searching and rigamarole and fighting with docker-php-ext-install and docker-php-ext-enable and PECL and where the PHP.INI is and whether I need to move one of the default files somewhere so that PECL can update it and downloading dependencies with apt-get and getting the right dependencies, depending on the PHP version and passing the right flags to docker-php-ext-configure if the version is a bit older and, and, and…
After trying a ton of no-longer-relevant and now-overly-complex suggestions on StackOverflow, I finally returned to php on dockerhub and discovered a hint to use the install-php-extensions project, which basically takes care of everything for you.
It does. End of story.
FROM php:7.2.24-apache
ENV DEBIAN_FRONTEND=noninteractive
ADD https://github.com/mlocati/docker-php-extension-installer/releases/latest/download/install-php-extensions /usr/local/bin/
RUN chmod +x /usr/local/bin/install-php-extensions && sync && \
install-php-extensions gd xdebug mysqli exif zip
I pin the PHP version to the one on my server, download the latest version of install-php-extensions [1] and then call it to install the non-standard extensions I use on earthli:
exif: Extract date information from pictures
gd: Generate thumbnails
mysqli: Provide access to MySql using a legacy API
xdebug: Debugging support on the server
zip: Open and read files from ZIP archives
See the web site for the list of supported packages. Your site will likely use different ones (but you should definitely install xdebug because it’s totally easy to use now).
Finally, you just need to set two environment variables to enable debugging for PHP:
XDEBUG_CONFIG: accepts a list of settings, but we only need to set the client_host to tell XDebug which machine hosts the IDE to which to connect (Docker handily provides the host.docker.internal alias for MacOS and Windows)
XDEBUG_MODE: this sets up the tool for step-debugging (see XDEBUG mode for more information).
I’ve included nearly the full docker-compose service definition from earthli, but the only relevant part for debugging is the environment block.
web:
  build: web
  container_name: "${COMPOSE_PROJECT_NAME}-web"
  restart: unless-stopped
  ports:
    - 80:80
  volumes:
    - ../site:/var/www/html
    - ../lib:/var/tmp/earthli.com-lib
    - ../../earthli-webcore/site:/var/tmp/webcore-site
    - ../../earthli-webcore/lib:/var/tmp/webcore-lib
    - ../../earthli-data:/var/tmp/earthli-data
    - ../../earthli-logs:/var/tmp/logs
    - ../config/apache-dev.conf:/etc/apache2/sites-available/000-default.conf
  depends_on:
    - db
  environment:
    XDEBUG_CONFIG: client_host=host.docker.internal
    XDEBUG_MODE: debug
At this point, you’re well on your way to debugging with PHPStorm. From here, follow the instructions in the settings dialog, shown below.
You could trigger a debugging session by including XDEBUG_SESSION=PHPSTORM in the query string, but that gets a bit tedious. Instead, install a browser-debugging extension, which simply injects the cookie XDEBUG_SESSION=PHPSTORM into the request so that PHPStorm knows that debugging is desired. See XDebug’s documentation for more information on other ways of triggering debugging, including from the command line (e.g. when running unit tests).
That’s it. A long and kind of painful journey has finally led to a solid and easy-to-configure debugging experience for PHP.
Published by marco on 5. Jul 2020 21:52:52 (GMT-5)
Groovy is a dynamically typed programming language that executes on the Java Runtime. It mixes its own highly dynamic syntax with islands of Java code. The Android ecosystem and its IDE use Gradle for its build scripts. Gradle uses the Groovy programming language.
A large project I’m working on contains quite a bit of custom Gradle code for integrating framework libraries, making obfuscated builds, configuring publication, and, finally, creating signed builds.
The signed builds are configured using standard Android Gradle DSL commands. Basically, there was a block of code something like the one shown below.
signing {
storeFile = getKeyStoreFile()
storePassword = getKeyStorePassword()
keyAlias = getKeyAlias()
keyPassword = getKeyPassword()
}
The names of the methods (e.g. getKeyAlias) used to be different before I’d refactored them to have more standard names. [1] The methods check whether there are environment variables set by the build server, using sane defaults for developer builds. [2]
This is where I went wrong. Never touch a running system [3], even when you’re trying to pull it back from the precipice of “maintenance nightmare that everyone is terrified to touch, to say nothing of change”. Well, I changed it, and ended up frittering away a couple of hours investigating the Groovy “feature” outlined below.
Groovy performs syntax-checking, but is extremely lenient as far as types and variables are concerned. Variables have to be defined, but pretty much anything can be coerced into anything else. It is transformed to Java code and then Java byte code by the Java compiler. Any typing errors you see are from the Java compiler, not from Groovy itself.
As any programming language would, Groovy resolves identifiers to match the declaration that is closest in scope to the call, even when that declaration is generated at compile-time and couldn’t possibly be the one that the original author had intended to call. This is going to be important later (which is why I put it in scary italics).
The four methods above are defined in an ext {} block. [4] Calling them without a specific target as above automatically resolves to the methods from the ext {} block.
Oddly, of the four properties being set in the example above, only the first two actually called the methods I’d defined in the ext {} block. The calls to getKeyAlias() and getKeyPassword() were not made to the expected functions. I could tell they weren’t being called because the logger.info() calls from those two methods never appeared in the output.
What the hell is going on? If you look carefully, you’ll notice that the first two methods have different names than you would use for writing the getter and setter for the properties being assigned. The second two methods match those names exactly.
When Groovy lowers its syntax to Java code, it declares these getters and setters. The Java compiler, in turn, references these new methods because the calls in the original Groovy code hadn’t been specific about the target of the methods. Instead of lowering to Java code and being explicit about which ext block the method should be called from, Groovy just left the naked call as I’d written it. Probably, if I’d explicitly called ext.getKeyAlias(), it would have avoided calling the dynamically generated this.getKeyAlias() method.
Of course, Groovy had trained me to stop prepending the target ext. on global function calls because ext resolves to different things, depending on the DSL-specific context. Sometimes it’s the root project’s extra variables, sometimes it’s the sub-project’s extra variables, and sometimes ext doesn’t work at all (e.g. in Java classes, naturally, but also in blocks created by special keywords).
Sure, you can try playing around with rootProject.ext. or other similar constructs, but the code quickly becomes even more unreadable than it already would be and the non-prefixed version works 99% of the time.
So what ended up happening was that, instead of calling the method I’d actually called, the Groovy compiler generated a new method with the same name and a higher specificity in the scope, capturing the call. Instead of calling my method, it ended up calling this.setKeyAlias(this.getKeyAlias()), which is basically a NOP that leaves the property empty.
The solution is to use a unique name for the function that does not conflict with any of the auto-generated getters. That is, of course, an unmaintainable nightmare, but part and parcel of working with Gradle.
signing {
storeFile = getKeyStoreFile()
storePassword = getKeyStorePassword()
keyAlias = getSigningKeyAlias()
keyPassword = getSigningKeyPassword()
}
Lo and behold, my log entries appeared and I was back in business.
The compiler authors could have tried harder to avoid altering the semantics of the higher-level Groovy code when replacing it with Java.
One way would be to use more obfuscated auto-generated getters and setters (to the degree that Java even allows this, which I think it does).
Another way was hinted at above: when lowering calls that auto-resolve to functions declared in ext regions, include information about the resolution in the call made in Java. That is, instead of just encoding getKeyAlias() as I’d written it (which is semantically correct at the Groovy level), transform that call to rootProject.ext.getKeyAlias() in Java.
Gradle is a shaky piece of business that automagically generates code that might replace actual, legitimate calls in your own code. It should never have been used for a build system. It makes MSBuild seem like a pretty good idea.
Gradle lets you declare “extra” variables in a scoped block called ext.
The parent of this block depends on the context. It’s usually rootProject, unless you’re executing project-specific code, in which case any declarations will be made in the sub-project-specific ext block instead of the one for the rootProject.
It can get quite confusing if you’re not sure which context you’re in when you declare an ext {} block, which is why some authors try to declare rootProject.ext or project.ext, but then you run into problems when you grab a variable from the wrong extra region.
Not to mention that it gets quite a bit messier to read and if all authors don’t stick to the same style, it becomes difficult to tell which explicit references are necessary and which are just thrown in there “to make sure”.
I settled on just declaring as much as possible in the ext {} and letting Groovy figure out which variable to use from scope. That ended up biting me in the ass exactly once, as detailed above.
Published by marco on 24. May 2020 22:27:00 (GMT-5)
The article Welcome to C# 9.0 by Mads Torgersen (Microsoft Dev Blogs) (May 2020) introduces several nifty new features that I am really looking forward to using.
I still haven’t moved Quino to C# 8, as the only feature I’d love to have there is the non-nullable types, which ReSharper Annotations provide with earlier versions of C#. Not only that, but the nullabilities are properly propagated to users of Quino. It’s understood that recent versions of Visual Studio and runtimes and compilers also do this but, until recently, our customers weren’t up-to-date yet.
In C# 8, we could also replace extension methods with default interface methods—but we’ve also been replacing almost all extension methods in Quino with singletons and composition anyway. A lot of the rest of the features are nice and interesting, but they are targeted optimizations that don’t really apply to a lot of the code that I write. I see how they are eminently useful for lower-level library and runtime optimization—many are clearly made to handle web requests and fine-grained tasks more quickly and without allocation.
Still, the features in C# 9 make an upgrade even more attractive.
Records and with-expressions (e.g. var originalPerson = otherPerson with { LastName = “Hunter” };)
Target-typed new expressions, which let you write this: Point p = new (3, 5); rather than this: var p = new Point(3, 5);
And, finally, covariant return types make an appearance. Java has had these for forever and there is no logical downside to introducing them.
This allows a descendant method to change the return type of an override to a descendant type as well. The most common use case would be for the return type of a Clone() method. The next step would be to allow anchored types (as in Eiffel), which would let a method declare its return type as like this and remove the requirement that each descendant override Clone at all, while still having the desired covariant return type.
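As a quick illustration, here is a minimal sketch of what a covariant return looks like in C# 9; the Shape and Circle types are made up for the example and are not from any real library.

public abstract class Shape
{
    // The base class declares the general return type...
    public abstract Shape Clone();
}

public sealed class Circle : Shape
{
    public double Radius { get; init; }

    // ...and the override narrows it, so callers of Circle.Clone()
    // get a Circle back without any casting.
    public override Circle Clone() => new Circle { Radius = Radius };
}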
I’ve been musing about these features for what feels like most of my career.
Published by marco on 13. Apr 2020 11:20:34 (GMT-5)
Updated by marco on 15. Apr 2020 15:47:26 (GMT-5)
The Web Animations Working Draft (W3C) was published in October of 2018. Can I use “Web Animations” (CanIUse) shows that the only browser that supports this API 100% is the latest technology preview on iOS and MacOS. Chromium-based browsers have had (very) basic support for quite some time, but Safari has thrown down the gauntlet with full support, which I learned about from Web Animations in Safari 13.1 by Antoine Quint (WebKit Blog).
This API is intended to replace many usages of CSS Animations and CSS Transitions, which are not only somewhat verbose and unwieldy for even simple cases, but are also not efficient in that each animation tends to force itself to start, artificially interrupting the browser as it prepares a page. With the Web Animations API, a page can much more declaratively indicate its intent without force-calculating animation target values, as is required now with CSS Animations.
A page can create and launch animations, but it can also get a reference to that animation and change it on-the-fly afterward. You can play it, pause it, change the play position, the play state, hook into the animation lifecycle with a Promise
-based API, and much more. A page can even get all of the animations associated with an element or the entire document and manipulate them wholesale. Safari’s new inspector uses this API to offer much richer display and control of all running animations. Understandably, Safari has reimplemented CSS Animations and CSS Transitions on top of a whole new animation engine that the Web Animations API also controls.
Safari puts a very strong implementation forward, with only two features missing:
Published by marco on 13. Apr 2020 11:18:02 (GMT-5)
Despite the title, from what I can gather from 10 Things I Hate About PostgreSQL by Rick Branson (Medium), the author is a big fan of PostgreSql. However, he has such vast experience with it that he can still list 10 things that don’t work as well as they could.
They seem to boil down to:
The plan-builder doesn’t support planning hints, which means you can’t patch a query in production to buy time: you have to either meta-patch it (i.e. figure out some way of sending a “hint” to the planner through other means) or fix it for real, which can take a lot more time while your production servers are blowing up. From the article,
“I do understand their reasoning, which largely is about preventing users from attacking problems using query hints that should be fixed by writing proper queries. However, this philosophy seems brutally paternalistic when you’re watching a production database spiral into a full meltdown under a sudden and unexpected query plan shift. (Emphasis in original.)”
Published by marco on 21. Mar 2020 18:37:55 (GMT-5)
Updated by marco on 15. Apr 2020 15:50:21 (GMT-5)
The programmable notebook Introduction to D3 by Arvind Satyanarayan (MIT Visualization Group) is part of a full course at MIT about Interactive Data Visualization.
The linked notebook uses D3.js, but previous classes in the course have dealt with Vega, which is,
“[…] a visualization grammar, a declarative language for creating, saving, and sharing interactive visualization designs. With Vega, you can describe the visual appearance and interactive behavior of a visualization in a JSON format, and generate web-based views using Canvas or SVG.”
Vega is a higher-level abstraction than D3 and is, therefore, both more powerful and more limited than it.
If what you want to build fits the higher-level building blocks of Vega (see examples), then you’ll be done more quickly with that; if it doesn’t, then D3.js offers more flexibility as it functions at finer granularity.
“[…] grammars [like Vega] break visualization design down into a process of specifying mappings (or visual encodings) between data fields and the properties of graphical objects called marks. They’re useful for concisely and rapidly creating recognizable visualizations, while giving us more design flexibility (or expressivity) than chart typologies like Microsoft Excel.
“However, describing visualization design in these high-level terms limits the types of visualizations we can create. For example, we can only use the available marks, and can only bind data to supported encoding channels.”
With D3.js, you have to do a bit more legwork yourself, but it offers more graphical flexibility and possibilities. Instead of customizing the settings for predefined renderers (or “marks”), you define the renderers yourself: the notebook includes examples in HTML and SVG. To keep things simple, the SVG examples replicate the HTML examples, but they could render much more that is not so easy to realize in HTML.
Although D3.js has a reputation as a “charting library”, that moniker is actually more appropriate for Vega. D3.js is a generalized data-to-graphics mapping library. As you can see from the examples, it is very useful for charts, but allows a lot more customizability than Vega. Anyone building charts for their site should consider very carefully whether the additional power and complexity are warranted vs. a solution with something like Vega.
That said, it was a lot of fun getting to know D3 with this notebook. The notebook is extremely well-written and organized and it’s absolutely fantastic that it’s available online, for free. I was able to understand and execute all of the exercises and feel like I have a good enough grasp of D3 now to be able to build something with it. Perhaps more importantly, I feel that I can now:
Published by marco on 21. Mar 2020 15:59:18 (GMT-5)
I found the article A half-hour to learn Rust by Amos to be extremely helpful in learning the syntax and mechanics of Rust.
It starts out with the absolute basics:
“let introduces a variable binding […]”
then takes you through
Options
mutables
Index and IndexMut
Results
panic and unwrap, expect() and ?
Fn, FnMut, and FnOnce
move
for … in
and ends up with a function builder that tests strings:
fn make_tester<'a>(answer: &'a str) -> impl Fn(&str) -> bool + 'a {
    move |challenge| {
        challenge == answer
    }
}

fn main() {
    let test = make_tester("hunter2");
    println!("{}", test("*******"));
    println!("{}", test("hunter2"));
}
// output:
// false
// true
Published by marco on 7. Mar 2020 18:43:04 (GMT-5)
Updated by marco on 8. Mar 2020 10:58:06 (GMT-5)
Now that Quino 8.x is out the door, we can look forward to Quino 9.
Quino 8 is a very solid and stable release that has already been test-integrated into many of our current products running on Quino. We don’t anticipate any more low-level API changes, though there will be follow-up bug-fix releases.
There are a few larger-scale changes, improvements, and enhancements, outlined below (and noted in the roadmap).
With this release, we’ve got more coverage than ever. Excluding only generated code (e.g. *Metadata.cs and *.Class.css in the model assemblies), we ended up with a respectable 81% test coverage. Quino has almost 10,000 tests comprising about 51k LOC and covering 82k LOC. [1] Many, many of these are integration and scenario tests. With this level of test coverage, we feel comfortable with refactoring to improve usability and performance.
One of the primary near-term goals is to improve Quino’s documentation story. The aim is to take a new developer through the common tasks of working with a solution based on Quino.
Some of this documentation is currently still out-of-date or will change as we improve the corresponding components. For example:
Nant is no longer relevant
quino tool documentation will no longer be relevant after 8.1 (see tools-related issue in the issue tracker)
The latest table of contents is much more comprehensive than before and we’re still improving it.
We don’t have an integrated search for the conceptual documentation yet, but you can use Google’s site-specific search. For example, search for configuration with the following search text “configuration site:docs.encodo.ch”. The top results are:
Which is pretty decent, overall.
Several of our upcoming products using Quino (two are so new that they’re not yet listed) are replacing legacy products that are highly dependent on a central database that defines the application domain. That is, the model is in the database or in a model description that is not initially a Quino model.
Instead of defining the model in C# code manually and then building the database from that (the standard approach with Quino), these products define the model with varying levels of automation and import and then use the existing database.
The following list shows the various ways that we’re building Quino models, in addition to the standard approach of defining them in C#:
This allows customers with existing databases to relatively quickly and easily produce a Quino model that gets them access to the plethora of features available to Quino applications (e.g. ORM, schema-check and -migration, generated GUI for desktop or web, and so on).
The LOC analyzer included in Visual Studio had slightly different numbers:
Quino has almost one line of testing code per line of library code (43k/56k ~ 77%). Quino has almost 4 lines of non-executable code per line of executable library code (202k/56k ~ 360%).
The disparity between the two results (JetBrains DotCover and Microsoft Visual Studio) just goes to show what a fraught metric LOC really is. According to these two measurements, Quino has between 56k and 83k LOC of executable library code.
Published by marco on 22. Feb 2020 17:43:38 (GMT-5)
The summary below describes major new features, items of note and breaking changes.
The links above require a login.
CreateGuid(), CreateDate(), and CreateTime(). (QNO-6304, QNO-6305)
Before upgrading, products should make sure that they do not depend on any obsolete members in the current version (7.x).
Quino-Web 8.0 is a rewrite and is therefore mostly incompatible with 7.x.
See the Quino-Web/Sandbox.Web project for a working example. This integrates the standard SandboxApplication into a web site using the standard GenericController and MetadataController to provide data and UI to the generic Quino Client.
Some internal types in Quino-Standard have been moved to more appropriate namespaces and assemblies, but the impact on products should be non-existent or very limited.
The following types were moved from Encodo.Quino.Core to Encodo.Quino.Culture:
LanguageTextAttribute
IValueParser
CaptionAttribute
LanguageDescriptionAttribute
The following types were moved from Encodo.Quino.Core to Encodo.Quino.TextFormatting:
IFileSizeFormatter
Quino’s default culture-handling has been overhauled. Instead of tracking its own language, Quino now uses the standard .NET CultureInfo.CurrentUICulture for the default language and CultureInfo.CurrentCulture for default formatting (e.g. times, dates, and currencies). Many fields have been marked as obsolete and are no longer used by Quino.
The default languages in Quino have changed from “en-US” and “de-CH” to “en” and “de”, respectively.
The reasoning behind this is that, while a _requested language_ should be as specific as possible, a _supported language_ should be as general as possible. The standard culture mechanisms and behavior (e.g. .NET Resources) “fall back” to a parent language when a more-specific language cannot be found. If an application claims to only support “en-US”, then a request for “en-GB” fails. If the supported language is “en”, then any request to a language in the “en” family (e.g. “en-US”, “en-GB”, “en-AU”) will use “en”.
An application that supports “en-US” and “de-CH” has, therefore, a more limited palette of languages that it can support.
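A tiny sketch of the fallback chain that makes this reasoning work, using only the standard CultureInfo API:

using System;
using System.Globalization;

// "en-GB" is not directly in a supported list, but its parent culture is plain "en",
// so the standard .NET resource-fallback mechanism can still satisfy the request.
var requested = new CultureInfo("en-GB");

Console.WriteLine(requested.Parent.Name); // prints "en"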
Quino code runs in the context of a user, who has a list of preferred languages, in decreasing order of preference. This context can last the entire duration of an application (e.g. a standalone application like a console or desktop application) or last as long as a web request.
The application itself has a list of languages that it supports, as well as resources and metadata that defines text in these languages. The resources are standard .NET Resources with the standard fallback mechanism (i.e. a request for “en-US” can be satisfied by “en”). The metadata uses DynamicString
objects, which encapsulate a map from language codes (e.g. “en” or “de”) to strings.
During application startup or at the beginning of a web request, the ILanguageResolver
determines the language to use for a given set of requested languages. In ASP.NET Core, the requested languages come from the HTTP headers provided by the browser. In standalone applications, the IRequestedLanguageCalculator
provides the requested languages. The ILanguageInitializer
is responsible for coordinating this during application startup.
The rest of Quino uses the following singletons to work with languages.
IDynamicStringFallbackCalculator: Comes into play when a request is made for a language that is not directly supported. For example, if the application supports “en” and “de”, then a request for “en-US” will ask this singleton how to resolve the request.
IDynamicStringFactory: Creates a dynamic string to describe a given object. The default implementation uses .NET Attributes.
ILanguageResolver: Determines the culture to use from a list of available cultures and a list of requested/preferred cultures.
IRequestedLanguageCalculator: Provides the sequence of languages from which to choose during initial resolution (web requests _do not_ use this).
ILanguageInitializer: Integrates language-selection into the application startup.
ICaptionCalculator: Extracts a single caption for a culture from a given object. Applications should use the IDynamicStringFactory in most cases, instead.
An application can control fallback by registering custom IDynamicStringFallbackCalculator and ILanguageResolver implementations (though this is almost certainly not necessary).
Any product that calls AddEnglishAndGerman()
will automatically be upgraded as well. A product can avoid this change by calling AddAmericanEnglishAndSwissGerman()
instead.
A product that uses the new languages will have to replace all fields in reports targeted at “en-US” and “de-CH” to target “en” and “de” instead.
A product that does use the new default languages will have to determine how to migrate database fields created for languages that are no longer explicitly supported. If the model includes value-lists (enums) or multi-language properties, the application will have to migrate the database schema to update multi-language fields (e.g. “caption_en_us” => “caption_en”).
A product that sets MetaIds
manually will migrate without modification (Quino will rename the property in the database).
A product that does _not_ set MetaIds
(this has been the default in Quino since version 2) will have a MetaID mismatch because the name has changed.
By default, Quino will migrate by attempting to drop, then re-create multi-language properties. In the case of value-list captions, this is harmless (since the data stored in these tables are generated wholly from the metadata). For actual multi-language properties with user data in them, this is _a problem_.
The simple solution is to call UseLegacyLanguageMappingFinalizerBuilder()
during application configuration to ensure a smooth migration (Quino will rename the property in the database).
A product that updates its languages should regenerate code to update any generated language-specific properties. Properties that had previously been generated as, e.g. Caption_en_us
will now be Caption_en
.
Published by marco on 18. Feb 2020 09:08:08 (GMT-5)
I prefer to be very explicit about nullability of references, wherever possible. Happily, most modern languages support non-nullable references natively (e.g. TypeScript, Swift, Rust, Kotlin).
As of version 8, C# also supports non-nullable references, but we haven’t migrated to using that enforcement yet. Instead, we’ve used the JetBrains nullability annotations for years. [1]
Recently, I ended up with code that returned a null even though R# was convinced that the value could never be null.
The following code looks like it could never produce a null value, but somehow it does.
[NotNull] // The R# checker will verify that the method does not return null
public DynamicString GetCaption()
{
    var result = GetDynamic() ?? GetString() ?? new DynamicString();

    return result;
}
[CanBeNull]
private DynamicString GetDynamic() { … }
[CanBeNull]
private string GetString() { … }
So, here we have a method GetCaption() whose result can never be null. It calls two methods that may return null, but then ensures that its own result can never be null by creating a new object if neither of those methods produces a string. The nullability checker in ReSharper is understandably happy with this.
At runtime, though, a call to GetCaption() was returning null. How can this be?
There is a bit of code missing that explains everything. A DynamicString declares implicit operators that allow the compiler to convert objects of that type to and from a string.
public class DynamicString
{
    // …Other stuff

    [CanBeNull]
    public static implicit operator string([CanBeNull] DynamicString dynamicString) => dynamicString?.Value;
}
A DynamicString contains zero or more key/value pairs mapping a language code (e.g. “en”) to a value. If the object has no translations, then it is equivalent to null when converted to a string. Therefore, a null or empty DynamicString converts to null.
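To make the surprise concrete, here is a small, hedged sketch that reuses the DynamicString type from this post (so it is not standalone) and shows the conversion in isolation:

// An empty DynamicString has no translations, so its Value is null and the
// implicit operator shown above therefore yields null.
DynamicString empty = new DynamicString();
string converted = empty;              // runs the implicit conversion
Console.WriteLine(converted == null);  // True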
If we look at the original call, the compiler does the following:
GetDynamic() sets the type of the expression to DynamicString.
The ?? operator can only be applied if both sides are of the same type; otherwise, the code is in error.
Since DynamicString can be coerced to string, the compiler decides on string for the type of the first coalesced expression.
The second ?? triggers the same logic, coercing the right half (DynamicString) to the type it has in common with the left half (string, from before).
Since the expression is a string in the end, even if we fall back to the new DynamicString(), it is coerced to a string and thus, null.
Essentially, what the compiler builds is:
var result =
    (string)GetDynamic() ??
    GetString() ??
    (string)new DynamicString();
The R# nullability checker sees only that the final argument in the expression is a new expression and determines that the [NotNull] constraint has been satisfied. The compiler, on the other hand, executes the final cast to string, converting the empty DynamicString to null.
DynamicString-to-string Conversion
To fix this issue, I avoided the ?? coalescing operator. Instead, I rewrote the code to return DynamicString wherever possible and to implicitly convert from string to DynamicString, where necessary (instead of in the other direction).
public DynamicString GetCaption()
{
    var d = GetDynamic();
    if (d != null)
    {
        return d;
    }

    var s = GetString();
    if (s != null)
    {
        return s; // Implicit conversion to DynamicString
    }

    return GetDefault();
}
The takeaway? Use features like implicit operators sparingly and only where absolutely necessary. A good rule of thumb is to define such operators only for structs which are values and can never be null.
I think the convenience of being able to use a DynamicString as a string outweighs the drawbacks in this case, but YMMV.
Java has @NonNull and @Nullable annotations, although it’s unclear which standard you’re supposed to use. (StackOverflow)
Published by marco on 30. Jan 2020 22:30:05 (GMT-5)
Updated by marco on 30. Jan 2020 22:30:51 (GMT-5)
After years of getting incrementally better at fixing binding redirects, I’ve finally taken the time to document my methodology for figuring out what to put into app.config or web.config files.
The method described below works: when you get an exception because the runtime gets an unexpected version of an assembly—e.g. “The located assembly’s manifest definition does not match the assembly reference”—this technique lets you formulate a binding-redirect that will fix it. You’ll then move on to the next binding issue, until you’ve taken care of them all and your code runs again.
If you have an executable, you can usually get Visual Studio (or MSBuild) to regenerate your binding redirects for you. Just delete them all out of the app.config
or web.config
and Rebuild All. You should see a warning appear that you can double-click to generate binding redirects.
If, however, this doesn’t work, then you’re on your own for discovering which version you actually have in your application. You need to know the version or you can’t write the redirect. You can’t just take any number: it has to match exactly.
Where the automatic generation of binding redirects doesn’t work is for unit-test assemblies.
My most recent experience was when I upgraded Quino-Windows to use the latest Quino-Standard. The Quino-Windows test assemblies were suddenly no longer able to load the PostgreSql driver. The Quino.Data.PostgreSql assembly targets .NET Standard 2.0. The testing assemblies in Quino-Windows target .NET Framework.
After the latest upgrade, many tests failed with the following error message:
This is the version that it was looking for. It will either be the version required by the loading assembly (npgsql in this case) or the version already specified in the app.config (that is almost certainly out of date).
To find out the file version that your application actually uses, you have to figure out which assembly .NET loaded. A good first place to look is in the output folder for your executable assembly (the testing assembly in this case).
If, for whatever reason, you can’t find the assembly in the output folder—or it’s not clear which file is being loaded—you can tease the information out of the exception itself.
When the debugger breaks on the System.IO.FileLoadException, click “View Details” to show the QuickWatch window for the exception. There’s a property called FusionLog that contains more information.
The log is quite detailed and shows you the configuration file that was used to calculate the redirect as well as the file that it loaded.
With the path to the assembly in hand, it’s time to get the assembly version.
Showing the file properties will most likely not show you the assembly version. For third-party assemblies (e.g. Quino), the file version is often the same as the assembly version (for pre-release versions, it’s not). However, Microsoft loves to use a different file version than the assembly version. That means that you have to open the assembly in a tool that can dig that version out of the assembly manifest.
The easiest way to get the version number is to use the free tool JetBrains DotPeek or use the AssemblyExplorer in JetBrains ReSharper or JetBrains Rider.
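If you prefer a script to a GUI, a few lines of C# can read the same information straight from the file’s metadata; the path below is just a placeholder for wherever the loaded assembly actually lives:

using System;
using System.Linq;
using System.Reflection;

// Reads the assembly name, version, and public-key token directly from the file.
var name = AssemblyName.GetAssemblyName(@"C:\path\to\System.Numerics.Vectors.dll");
var token = string.Concat(
    (name.GetPublicKeyToken() ?? Array.Empty<byte>()).Select(b => b.ToString("x2")));

Console.WriteLine($"{name.Name}, Version={name.Version}, PublicKeyToken={token}");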
You can see the three assemblies that I had to track down in the following screenshot.
Armed with the actual versions and the public key-tokens, I was ready to create the app.config
file for my testing assembly.
And here it is in text/code form:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<runtime>
<assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
<dependentAssembly>
<assemblyIdentity
name="System.Numerics.Vectors"
publicKeyToken="B03F5F7F11D50A3A"
culture="neutral"
/>
<bindingRedirect oldVersion="0.0.0.0-4.1.4.0" newVersion="4.1.4.0"/>
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity
name="System.Runtime.CompilerServices.Unsafe"
publicKeyToken="B03F5F7F11D50A3A"
culture="neutral"
/>
<bindingRedirect oldVersion="0.0.0.0-4.0.5.0" newVersion="4.0.5.0"/>
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity
name="System.Threading.Tasks.Extensions"
publicKeyToken="CC7B13FFCD2DDD51"
culture="neutral
"/>
<bindingRedirect oldVersion="0.0.0.0-4.2.0.1" newVersion="4.2.0.1"/>
</dependentAssembly>
</assemblyBinding>
</runtime>
</configuration>
Published by marco on 2. Jan 2020 10:41:06 (GMT-5)
Fossil is a distributed Source Control Manager that claims to offer the same power without the complexity of Git. The article Fossil: Rebase Considered Harmful by D. Richard Hipp (Fossil SCM) is part of the documentation for the tool.
One of the main selling points of Fossil is that it does not support rebase. In the article, the author lays out the many ways in which rebasing causes no end of woes for developers using Git.
I’d heard of Fossil before and I’d even skimmed this document before. This time around, though, I read it through to learn the author’s reasoning. My short take is: I do not want to use an SCM that does not allow rebase. [1] I think a project benefits greatly in clarity if a developer is able to alter the local history before cementing commits into an unalterable history (i.e. pushing to the server).
The following definitions are not complete, but are sufficient for the ensuing discussion.
A rebase is considered a destructive operation because it discards part of the history of the repository by rewriting commits.
If I think about it, though, many of the operations I’m accustomed to making are destructive:
All of these operations are considered destructive because they modify the “true” history of the repository. But what do we mean by “true” history? Where does the story start?
The changes outlined above are not for sharing. It’s not interesting to the final reader that I had to backspace through and re-spell the word “outlined” in the previous sentence. It might be interesting to see different drafts, though, to see how I arrived at the final version. But those changes are at a different level of granularity.
Who decides where one level of granularity stops and the next begins? I think it’s the author of the commits. My workflow over the last ten years is based heavily on being able to massage commits so that I can prepare what I share to the server repository, where it can no longer be changed. I agree that there should be an unalterable history, but disagree with the author on where that history begins.
I agree with the author that developers should not work in silos, massaging their code until it is perfect, pushing only once there are no more errors and no-one could possibly take issue with anything in the feature. At this point, the author purports that many developers squash all of their local commits to a single so-called hairball commit that makes it look like the code sprung from the forehead of the developer as Eve sprung from Adam’s rib: whole and without blemish.
Hairball commits are acknowledged as bad, so attacking them as the prime reason to eliminate the tool that allows them seems to be more of a straw man.
Preventing developers from making any changes to local commits is not the way to solve the problem, though. While Fossil does not allow discarding any single commit from the history, the author acknowledges that Fossil allows developers to apply addenda that the common Fossil tools will show while hiding the original commits. [2]
I see the author’s point—that (potentially) important parts of a history are retained whether the developer wants them or not. That is, it is not up to the developer to decide, but up to the archeologist examining the commits later. This is an interesting idea, but the argument is ultimately not convincing.
Let’s suppose a developer uses an SCM without rebase. Either there will be many commits in the history that—contrary to what the author claims—do not provide any clarity because they are garbage commits (e.g. WIP and other sorts of investigatory commits that were quickly reverted or undone). Or, the developer will be terrified of making a commit before it’s ready and runs the risk of losing work or working less efficiently.
Developers will not magically become ego-less and kowtow to the machine. Instead, they will pick up bad habits that are worse than local rebasing. They will keep work uncommitted for too long or will fail to split up commits properly because they are afraid that they can’t fix them up later. In either case, it’s chaos in the commit history and the project efficiency and reliability suffers.
But the author is arguing with a straw man that doesn’t really exist outside of shitty developer teams with undisciplined developers. One can argue that these are the kind of developers that many projects have, but that can only be addressed with process. Weakening the tools so that disciplined developers are less efficient is a bad idea.
You don’t like hairball commits? Tell developers to stop making them. Enforce the policy with reviews. The Git documentation already encourages developers to make focused commits. Rebase allows a developer to split up commits during or after a code review. Rebase can actually be used to combat hairball commits.
I have personally used it to split up commits that inadvertently mixed a bug fix or two into a large pile of refactoring changes. I’ve also often advised people to redesign their commits so that they tell a better story.
I’ve interspersed citations from the document linked above and included responses and thoughts.
“[…] [some tools] accomplish things that cannot be done otherwise, or at least cannot be done easily. Rebase does not fall into that category, because it provides no new capabilities. (Emphasis added.)”
As discussed above, I think that this is fundamentally wrong. My workflow is considerably different than it was before I used Git or had access to rebase. I would now be much less efficient if I didn’t have rebase. It would make me constantly focus on cleaning up commits before I really care to. You could make the argument that cleaning up afterward takes more time, but I haven’t experienced that to be the case. Instead, I want to be able to set the priorities rather than worry about committing something that I cannot undo.
And it’s not about ego or “looking stupid” to future readers of the history; instead, it’s about having control of the story you tell to those same readers. If you don’t have rebase, then you tell just as poor a story as if you use rebase badly. It’s perhaps closer to the “true” story, but it’s not the “best” story. Without rebase, you’re forcing future archeologists of your code to read all drafts as well as the final version simultaneously.
At Encodo, we don’t focus on ego, we focus on efficiency. We do not obliterate commits that make sense just to squash a whole feature. We retain commits in order to tell a good story about how a feature was built. We do not emphasize being able to build each commit: often we’ll add a failing test in one commit, then fix the bug in another commit, because that tells a better story.
We need rebase in order to massage local commits so that they tell this good story rather than uploading dozens of commits that no-one should ever have to look at (typos, code comments, formatting, etc.). Often, we’ll squash in little fixes and changes that come up during a review. Is the Fossil author suggesting that there is some benefit to seeing these in a separate commit? It would make understanding the commits at a later time that much harder.
I think most of the author’s concerns are addressed by using review and process to enforce better commits. Fossil can’t make this happen because the developers have to create good commits in the first place or, at least, eventually. Rebase helps better developers clean up their own commits and also helps them help others clean up their commits, teaching them how to tell the story of their code.
“A rebase is just a merge with historical references omitted”
Exactly. If I can’t eliminate WIP commits or squash local commits, then my workflow changes. Honestly, what’s the point of keeping each commit? Many are scribbles, unwanted drafts. They’re not part of a history anyone would retell. Once commits are cleaned up and tell a good story, there is no need to keep the old commits around. At that point, you’re wasting the future archeologist’s time.
“Surely a better approach is to record the complete ancestry of every check-in but then fix the tool to show a “clean” history in those instances where a simplified display is desirable and edifying, but retain the option to show the real, complete, messy history for cases where detail and accuracy are more important.”
This feature is an interesting one for commits that can no longer be changed (i.e. have already been pushed), but why make the developer mark every accident and mistake instead of just letting him undo them? The “full” view would be of marginal to no value. Even once the messy commits were deciphered, they would most likely yield no useful information.
What possible benefit is it to keep a jungle of “fix typo” and “add missing file” or “fix broken test” commits just because the developer made a commit before running tests or seeing a warning in the IDE? [3]
“So, another way of thinking about rebase is that it is a kind of merge that intentionally forgets some details in order to not overwhelm the weak history display mechanisms available in Git.”
I honestly think that this guy just wants to make Git look stupid and Fossil look spectacular. I understand fully that it’s silly to argue that Git doesn’t need a feature that Fossil has just because I’ve personally never needed it. A good feature is something that becomes essential once you have it, but you never knew you were missing it or were less efficient without it. Fossil’s ability to easily see which changes were made to a file after a given commit sounds like it might be that kind of feature. However, rebase in Git is such a feature, so if Fossil takes that away, it’s a deal-breaker.
At this point, I think also that the author is considering Git as a command-line application rather than extended by a truly powerful UI like SmartGit, which provides fast access to gobs of historical data with little effort.
“Or, to put it another way, you are doing siloed development. You are not sharing your intermediate work with collaborators. This is not good for product quality.”
What has this guy seen in the wild that he’s reacting this way? Who hurt this poor man? How often does he expect us all to commit and push to the server? Should we code directly on the server? Where does he draw the line for “siloed” work? A day? An hour?
More to the point: who is paying developers (or a project lead) to examine unvetted commits? Do you think we’re made out of free time? Keeping everything around forever is not the most efficient way of optimizing information about your code. It’s a hoarder mentality.
I understand the sentiment: you want to avoid people massaging commits into oblivion, eliminating important information. But, honestly, I’ve seen the opposite problem: commits pushed to the server in the shabbiest form, thereafter unalterable. [4]
“Many developers are drawn to private branches out of sense of ego. “I want to get the code right before I publish it.””
No, that is not my requirement. I want an efficient review that pinpoints (and fixes) errors quickly so no-one wastes time.
The author claims that,
“Rebase adds new check-ins to the blockchain without giving the operator an opportunity to test and verify those check-ins. Just because the underlying three-way merge had no conflict does not mean that the resulting code actually works. Thus, rebase runs the very real risk of adding non-functional check-ins to the permanent record.”
This is true only for the special case of online merges. These should be avoided like the plague, in any case. I know that people really, really trust their tools. I know that they think that merges are infallible, that their CI builds their software and runs their tests and gives their pull request a green flag and a thumbs-up.
But anything other than a trivial pull request should be examined with tools more capable than online repository managers. Not only are they not as good, they are wildly inefficient when compared to a good desktop tool. I know this next generation of developers want to do everything on their phones, but this is ridiculous. The screen is too small and the tools are too limited.
Get a machine with usable screen real estate and learn what being efficient really means. Not only will you be quicker, you’ll be better: your error rate will decrease and you’ll see connections in the commits much better than with the (comparatively) meager online tools. I’ve written before about one such UI, SmartGit, in Git: Managing local commits and branches and Using Git efficiently: SmartGit + BeyondCompare.
Other online tools have similar weaknesses versus their desktop brethren: for example, text editors like Word or Google Docs. It’s definitely a killer feature that they’re online, but their only selling point is that they’re attached to an online document storage. That’s the selling point. As amazing as it is that these tools run in a browser, they are pathetic compared to tools from thirty years ago. My God, I fondly remember WriteNow 4.0 for Mac OS 6 and 7, which handled a 250-page document with aplomb, complete with figures, tables, TOC, numbering, custom styles, … all of those things that an editor should do. Somehow, just because it’s in the cloud means that we should be happy with WordPad instead of a full-fledged editor. It’s a joke.
The author claims that,
“Rebasing is the same as lying By discarding parentage information, rebase attempts to deceive the reader about how the code actually came together.”
Then you should include all command/undo buffers from your IDE, too. At this point in the document, the author is just repeating the same argument over and over, reformulated but not different.
“Unless your project is a work of fiction, it is not a “story” but a “history.” Honorable writers adjust their narrative to fit history. Rebase adjusts history to fit the narrative.”
That’s not even how human history works. It’s not even how your own stories about your own life work. This is the kind of mentality that wants to keep all 6000 pictures from a vacation. Why? Just in case you need that picture of the ground that you took by accident? Because you need all 300 pictures of the Matterhorn? You’re wasting your readers’ time and your own.
“The intent is that development appear as though every feature were created in a single step: no multi-step evolution, no back-tracking, no false starts, no mistakes.”
Again, he proposes to fix a problem—poorly built commits—by not allowing anyone to modify commits.
“We believe it is easier to understand a line of code from the 10-line check-in it was a part of — and then to understand the surrounding check-ins as necessary — than it is to understand a 500-line check-in that collapses a whole branch’s worth of changes down to a single finished feature.”
I agree with this 100%. As already noted above, though, the review should disallow such foolish hairball commits.
“The more comments you have from a given developer on a given body of code, the more concise documentation you have of that developer’s thought process.”
Correct. But you don’t want to see everything. He presents a false choice between all the history and an improperly truncated version. Then he says he’d rather have all of it, and wants to get rid of history-rewriting. This doesn’t fix the problem of shitty programmers making shitty commits. The only way to fix that is gatekeeping reviews and process. Taking a vital tool for clarity (rebasing) away from disciplined programmers is a terrible idea.
“If we rebase each feature branch down into the development branch as a single check-in, pushing only the rebase check-in up to the parent repo, only that fix’s developer has the information locally to perform the cherry-pick of the fix onto the stable branch.”
He really seems to be attacking a repo-management/history-editing process I’ve never used. It sounds horrid.
“Rebasing is an anti-pattern. It is dishonest. It deliberately omits historical information. It causes problems for collaboration. And it has no offsetting benefits.”
Only one of those sentences is true.
Published by marco on 28. Dec 2019 23:23:06 (GMT-5)
Updated by marco on 28. Dec 2019 23:23:47 (GMT-5)
The article Z’s Still Not Dead Baby, Z’s Still Not Dead by Andy Clarke (24 Ways) is well-written, very interesting and taught me a few new CSS tricks of which I was unaware.
Granted, my work usually doesn’t call for fancy effects like those you can achieve with something like `background-blend-mode`, but it can happen. There’s not only `background-blend-mode`; there’s also `mix-blend-mode` and `filter`, all of which apply high-quality effects dynamically.
In the late spring, I had a two-month project where I had to use a lot of transformations and animations—and I was able to get it all done with CSS. Once you know about these kinds of techniques, you keep them in mind, and are able to consider solutions that would seem impossible (or very difficult/time-consuming/unmaintainable) if you didn’t know the technique.
A modern browser can construct the following image by composing and blending a couple of graphics.
It’s actually pretty cool that you can get this type of layout with wide browser support and no hacks. See the linked article for a lot of examples.
I have used CSS Grid before (as the author does). The author mentions subgrids, but ends up using a second grid within the first grid because browser support for nested grids is good, whereas no-one supports subgrids except for the latest version of Firefox. The MDN documentation for Subgrids explains that it differs from nested grids in that
“If you set the value subgrid on grid-template-columns, grid-template-rows or both, instead of creating a new track listing the nested grid uses the tracks defined on the parent.”
The linked page includes many examples and more detail.
As with any advanced techniques, you have to take into account your own target browsers to see whether you can use them in your own projects. It’s a well-written article and I learned a few more techniques that I can hopefully use at some point.
Published by marco on 30. Nov 2019 15:36:51 (GMT-5)
Updated by marco on 4. Oct 2023 21:28:45 (GMT-5)
The discussion React in concurrent mode: 2000 state-connected comps re-rendered at 60FPS (YCombinator) is illuminating mostly in that it shows how ego can impede productivity.
Ego can also be that thing that drives a talented programmer to create something of use to the rest of us, but that’s honestly a very rare case. More often than not, the best case is that a developer improves their skills—and perhaps learns to be more humble instead of shooting off their mouth about how “easy” it is to create a “good” product. Such claims are nearly always made without defining what they mean by “good”.
Some comments are from programmers more interested in a pissing contest of who can write performant code on their own. Their implementations often focus, laser-like, on a specific use case not often found in nature, without tackling the tough question of how to design a more generalized solution that incorporates and balances more than just the one aspect of the system that they think they’re good at (e.g. performance).
That is, they tend to carefully define the application domain based on what they’re already good at. This is not how product development works. Many of the commentators get distracted by the overreaching claims of the reposter (faster than any other WebGL rendering, which is patently not true) instead of reading the much more reasonable claims of Dan Abramov, who is the original poster.
Thankfully, there are others who seem to understand that giving up a logical, declarative paradigm in order to do so is not an acceptable tradeoff in almost any given project. What are some facets other than performance that contribute to a good solution?
Products that try optimizing all facets generally never see the light of day or serve as the base material from which more viable projects are born.
A higher level of abstraction is a good thing. It allows mediocre programmers (and be happy if you have even mediocre programmers) to write programs that aren’t a nightmare to maintain or refactor. It allows good developers to very quickly write maintainable programs. If the underlying framework has a declarative and easily understood paradigm that has only a handful of orthogonal concepts and it offers great performance by default, that’s a win. There are few projects that need spectacular performance as their main feature.
I would argue that most web programming is about making line-of-business apps and pages where look and feel matters, but not so much that it’s worth investing 50% more budget to get near-perfect and smooth updates. If it janks, it janks. There is no time or budget (or, sometimes, programming skill) to “fix” it. And, if “fixing” it means abandoning the high-level declarative programming model that makes working with React so efficient, maintainable and productive, then that’s even more implicit cost bound up in it.
As the commentator Onion2k put it:
“This is a demo of good performance using a web framework on top of a WebGL framework. It’s showing that a future version of React will make building a solid 60fps web app UI […] within the reach of most web developers. Sure, you can hand-roll code to get that performance today if you know how, but this is about putting that performance in the hands of developers who can’t (or, more often, aren’t given the resources to). To argue that is unnecessary or actually bad is ridiculous. Libraries that make it easier to build better apps are universally good things. (Emphasis added.)”
To use React, you have to make concessions to its reactive model in your application definition. But that’s the way programming works. Instead of writing “a person must have a company, while the company has a possibly empty list of people”, we write (example from Quino),
Elements.Module.Company
.AddOneToManyRelation(Elements.Module.Person)
Programming is all about explaining what an application does. The programming language and framework and runtime balance all of the factors listed above to be able to transform the formulation most accessible to a product owner (“I want a CRM”) through a business analyst (“It has a list of companies, each of which has a list of people”) to a programmer (formulation above).
The formulation above is still quite high-level, but satisfactory for 99% of cases. For the remaining 1%, the API has to provide some way of digging into the underpinnings of the implementation without dropping the developer off of a cliff. Quino does this reasonably well, as does React. The focus here is on realizing that a framework’s ability to accommodate that 1% of use cases smoothly is only one aspect of its effectiveness. Given that it doesn’t come up very much, it makes no sense to focus too much effort on optimizing that path, no matter how much more interesting it would be to the developers to do so.
This is one of those silly blogs-posted-as-tweets, but the points in Is Concurrent Mode just a workaround for “virtual DOM diffing” overhead? […] by Dan Abramov (Twitter) are good.
The point is that Concurrent Mode is not a speed improvement only for React. It also improves how your app’s code updates and is scheduled without you having to change your code (much, or at all). The linked article explains how this sea change in rendering components forms the basis of many other performance improvements that apply to existing applications without modification.
It’s exciting that a near-future version of React will make animations and updates even smoother than they are now, even though they are already more than good enough for most apps without tweaking.
React is not a game-programming framework. It makes no sense to claim that React apps will blow away apps written in Unity. We make line-of-business apps with it. React already allows apps to have much better update characteristics with almost no code other than a few functional declarations to define rendering and components and the state that they rely on.
The model is unimpeachable in that it accurately reflects the application model without adding any ceremony.
You make some concessions in order to define your declarations about your program’s logic and states so that the framework can optimize as much as it can, but no more. With hooks, you can declare simple, mutable state or one-time, partially mutable state (memos and callbacks), listeners for lifecycle events (effects) and so on.
On the one hand, you’re forced to define your logic using React’s idioms but, on the other, they still make sense in that they make your assumptions about your app’s logic explicit rather than implicit. Once you’ve done this, the framework knows more about what it can optimize away and what it can’t. And you haven’t wasted time because you’re technically describing salient properties of your application domain.
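To make those idioms concrete, here is a minimal sketch in TypeScript/TSX of the hook vocabulary described above: plain state, a memo, a callback, and an effect. The `SearchBox` component and its props are invented purely for illustration.

```tsx
import { useState, useMemo, useCallback, useEffect } from 'react';

// Hypothetical component, just to illustrate the idioms mentioned above.
function SearchBox({ items }: { items: string[] }) {
  // Simple, mutable state: React re-renders when the setter is called.
  const [query, setQuery] = useState('');

  // Memoized value: recomputed only when `items` or `query` changes.
  const matches = useMemo(
    () => items.filter(item => item.includes(query)),
    [items, query]
  );

  // Stable callback identity across renders, as long as its dependencies don't change.
  const clear = useCallback(() => setQuery(''), []);

  // Lifecycle-style effect: runs after render whenever `query` changes.
  useEffect(() => {
    document.title = `Search: ${query}`;
  }, [query]);

  return (
    <div>
      <input value={query} onChange={e => setQuery(e.target.value)} />
      <button onClick={clear}>Clear</button>
      <ul>{matches.map(m => <li key={m}>{m}</li>)}</ul>
    </div>
  );
}
```

Each hook is a declaration about the component’s state and dependencies, which is exactly the information the framework uses to decide what it can skip.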
That’s the idea behind the `<Suspense/>` component: the app can declaratively determine how it would like components to be updated in different asynchronous situations involving multiple asynchronous tasks. Concurrent Mode allows the framework to work before that update is technically complete because it allows any work to be interrupted—and discarded, if it is no longer relevant.
This allows the reconciliation to benefit a bit from something like the branch predictor in a CPU, where speculative branches are executed in parallel and occasionally discarded. JavaScript imposes a cooperative rather than parallel model, but low-level support for interruptibility (especially when automatically applied) is worlds better than nothing.
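As a rough sketch of that declarative idea, assuming hypothetical `ProfileDetails` and `ProfileTimeline` components that read from Suspense-enabled data sources, the tree itself states what to show while each asynchronous part is pending:

```tsx
import { Suspense } from 'react';

// Hypothetical components that read data from Suspense-enabled sources.
import { ProfileDetails, ProfileTimeline } from './profile';

// The tree declares what to show while each asynchronous part is pending;
// the framework decides how to schedule, interrupt, and discard the work.
function ProfilePage() {
  return (
    <Suspense fallback={<h2>Loading profile…</h2>}>
      <ProfileDetails />
      <Suspense fallback={<p>Loading timeline…</p>}>
        <ProfileTimeline />
      </Suspense>
    </Suspense>
  );
}
```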
Any language—and the combination of the underlying programming language and the framework API, combined, is the language a programmer uses—must have a shape, a paradigm that it enforces. Naturally, a programmer can use a different paradigm than the recommended one. But a good framework finds the balance between a paradigm that is comfortable for a large part of its audience and one that is enough of an abstraction that it has a lot of leeway for applying to the next layers down (until it gets to machine code).
A good framework provides an out-of-the-box experience that provides a clearer programming idiom and better performance than most programmers could do on their own.
In the thread above, Abramov in no way claims that it’s not possible to create a faster application for thousands of components, just that the new renderer is much, much faster than the old one without changing the programming idiom at all. The programming idiom in React is very good, if not great. This is really good news.
Instead, you could say that Abramov’s claim is that anyone who claims to have made a faster renderer is making tradeoffs in other areas (e.g. from the list above). Most likely, the resulting balance is not as good as the clear, declarative syntax of React or it doesn’t cover nearly as many use cases.
Is React’s syntax the best it can be? Maybe not yet. For example, a component declares mutable, internal state with the `useState()` hook, which returns a state variable and a “setter” function to change that state. Svelte, for example, improves on this by allowing the app to just declare the state variable and automatically noticing when that state is updated and generating the state-update code in the transpilation phase. This is an improvement that allows an app developer to work even closer to “normal” code than before.
If Svelte can provide this clearly more readable feature without introducing problems in other facets (e.g. learnability, performance, completeness), then it’s a clear win.
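A side-by-side sketch of the difference described above; the React half is ordinary `useState()` usage, while the Svelte half is paraphrased from its tutorial-style counter and shown in comments, so treat it as an illustration rather than canonical Svelte:

```tsx
// Counter.tsx (React): state is declared via the useState() hook and changed
// only through the setter that the hook returns.
import { useState } from 'react';

export function Counter() {
  const [count, setCount] = useState(0);
  return <button onClick={() => setCount(count + 1)}>{count}</button>;
}

// Counter.svelte (inside its <script> block): a plain declaration; the Svelte
// compiler notices the assignment and generates the DOM-update code itself.
//
//   let count = 0;
//   const increment = () => { count += 1; };
```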
A similar kind of improvement is `async/await`. This feature didn’t actually change how asynchronous code works. Instead, it allowed a programmer to write synchronous code that could be made asynchronous automatically.
This is a sea change for most developers—even those clever and experienced enough to have written that level of asynchronous code themselves. The point is that the developer is no longer wasting time writing what amounts to boilerplate code that is very error-prone and difficult to thoroughly test (which means that it’s often not thoroughly tested).
The idiom of `async/await` imposed minimal “noise” (none, actually) and has a tremendous upside. The code doesn’t necessarily get faster, but it could be made faster without changing it.
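A small example of what that means in practice, using a hypothetical `/api/users` endpoint: the two functions below do the same thing, but only the first spells out the asynchronous plumbing by hand.

```ts
// Promise-chaining style: the asynchrony is spelled out by hand.
function loadUserName(id: string): Promise<string> {
  return fetch(`/api/users/${id}`)            // hypothetical endpoint
    .then(response => response.json())
    .then(user => user.name);
}

// async/await: reads like synchronous code, but compiles down to the same
// asynchronous machinery; nothing blocks while awaiting.
async function loadUserNameAsync(id: string): Promise<string> {
  const response = await fetch(`/api/users/${id}`);  // hypothetical endpoint
  const user = await response.json();
  return user.name;
}
```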
The comment on Fiber Principles: Contributing To Fiber by sebmarkbage (React/Github) is another well-written contribution to this discussion that shows that there are a lot of clever people working on React that are aware of the fine balance between the requirements involved in writing a strong framework.
The user responds to accusations that much of this work would not be necessary if JavaScript had proper threading. The author argues that globally mutable prototypes are an intrinsic concept that is used in many, many JavaScript use cases. However, they also limit the ability of ever bringing threads to JavaScript. The language is limited from the get-go.
That doesn’t mean we should all stop using JavaScript. It just means that this is something that goes in the cons list and must be weighed against all of the pros. Anything that is in the cons list must be compensated with effort. JavaScript has many pros going for it: for example, it’s won the client-side programming-language war.
Perhaps WebAssembly will replace it as a runtime, but only time will tell. By then, it won’t matter, because we’ll be using languages like Elm or TypeScript to write our code. Even this doesn’t matter, though, because these languages must also transpile to the underlying paradigm defined by an engine that must run JavaScript.
That goes—for now—for WebAssembly targets as well. And threading is out for any of this stuff. Until something in this situation changes and we can target a threaded execution engine on the client side, we should be happy that there are very clever people making cooperative multi-tasking transparent and easy to program for the rest of us.
Those of us who worked on Apple OSs before OS X or Windows before 95 know what it’s like to have to deal with cooperative multi-tasking in our own code. I welcome the declarative paradigm that allows excellent performance for a wide range of use cases without making me write and maintain a whole bunch of code that has nothing to do with my application domain.
There’s a reason why everyone with sense is talking about avoiding shared, mutable state. Using shared, mutable state makes it very easy to write the happy path of a single use case, but it makes it very hard to reason about other use cases and branches. It doesn’t scale, extend, test or maintain well. If these requirements don’t apply to your application—e.g. a script or one-off throwaway prototype—then you might be fine.
I would personally advise against practicing or becoming accustomed to techniques that apply to one use-case but that are dangerous in all other situations. You’ll generally end up using the technique to which you’ve become accustomed. While training yourself to build high-quality solutions risks the danger of over-engineering solutions to problems that could have been solved more simply, it’s easier to “downscale” your coding style than to “upscale” it.
With enough practice and the right techniques, you can write quality code from the start just as efficiently as crappy code. I would also say to beware of the seductiveness of bad programming models that promise an initial speed in development that quickly drops off once it’s too late to change.
Prototypes happen to be built into the language in JavaScript’s case, but shared mutable data is the great stumbling block of concurrent programming. Applications that batch work into parallelizable chunks can be optimized to run more quickly by a clever runtime.
It is much simpler to reason about an application without shared mutable data. There are fewer cases and branches. Otherwise, an application must use locks (or fences or some other synchronization concept). The point is that efficient synchronization is not easy, and many naive implementations tend toward speed rather than robustness and are buggy as a result.
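Even without threads, cooperative asynchrony stumbles over shared, mutable state. The sketch below, with an invented counter and a stand-in for a round-trip, shows a classic lost update:

```ts
// Shared, mutable state: two interleaved async updates can lose an increment,
// because each one reads the counter before the other has written it back.
let counter = 0;

async function incrementViaServer(): Promise<void> {
  const current = counter;                    // read shared state
  await new Promise(r => setTimeout(r, 10));  // e.g. a round-trip elsewhere
  counter = current + 1;                      // write back a stale value
}

async function demo() {
  await Promise.all([incrementViaServer(), incrementViaServer()]);
  console.log(counter); // 1, not 2: a lost update without any threads at all
}

demo();
```

Both calls read the counter before either writes it back, so the result is 1 rather than 2; immutable data or a single owner of the state avoids this whole class of bug.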
Though it’s possible to hand-code faster concurrency than standard frameworks, most people can’t do it. And, given time, framework implementations get really, really good at optimizing nearly all cases. C# and .NET, for example, have a tremendously clever runtime underlying `async/await` now that can hardly be beaten for throughput, scheduling, etc. Successive versions have built on new language concepts introduced precisely to allow an application—where needed—to be more declarative in ways that allow even more optimization (e.g. record references, etc.).
It’s nice to see that Concurrent React—much like `async/await` in JS—provides a simple idiom for moving that effort out of the hands of most developers.
Naturally, a developer is free to do that work on their own—and many commentators in the original thread at the start of this article seem to enjoy writing code that has nothing to do with their actual app just to show that they can. But with enhancements like `async/await` or Concurrent React, they don’t have to in order to enjoy performance benefits. That’s a win-win—a free lunch.
The point made above by Onion2k is very salient: very often “developers [aren’t] given the resources to” make the kind of optimizations that React will provide for free. Could a given rockstar developer write something even faster for exactly their application domain? Probably. Are they going to be given the time and budget to do so? Almost certainly not. It’s far better to have a good default that is smooth as silk and more than adequate to the task for almost all conceivable applications.
No-one’s paying you to reinvent the wheel. That’s almost certainly not your job. If you’d like it to be your job, then maybe you should work on a project where you’re inventing the wheel directly (i.e. a framework project). Then, you can build on that experience and your framework to turn around tightly written, maintainable and performant applications for your paying customers.
It’s important to be pragmatic and remember when you’re working on framework code and when you’re working on code that benefits from framework code without reinventing it. Otherwise, you’ve got a terrible situation: you invest in framework/infrastructure on every single project because you never reap the benefits of having written a framework. In the case of frameworks that are completely external to your application, like React (or Quino), you never even had to invest in writing the framework at all.
If you write a framework for just expert developers, there will be no adoption and you don’t help a large part of the community to write better apps. But what do we mean by better?
Continuing with React as an example, the abstract requirements at the start of this article roughly map to:
An application should have to only declare things about itself that are relevant to itself—but that also help to render the application better. Again, these idioms should scale: an application which will not have foreseeable performance issues in most components should be able to write those components with more approachable code.
Individual “islands” of code can provide additional information to optimize hotspots (like memoization, immutability hints, etc.) It’s important to note that these concepts are not introduced by the framework—they are intrinsic to the application’s domain model, but usually kept implicit.
If the application does not describe these aspects of itself, then the framework must make more pessimistic assumptions. Often this doesn’t matter. Where it does matter, the application should be able to use compatible and familiar idioms to improve the granularity of its description about itself. This, in turn, lets the framework use a faster approach where it now knows that it won’t violate the application’s definition.
The simplest of these is to tell React which parts of the state are mutable and which are immutable. When determining what has changed in an application state, a framework can simply compare the reference to the root node of an immutable object graph to the previous root-node reference to determine if that part of the graph has changed. If the object graph does not declare itself as immutable, then the framework must be pessimistic and compare the entire subgraph to determine if it has changed.
This is a concept that is intrinsic to programming. It is hard to conceive of it ever not being relevant. Naturally, if there is more than enough processing power available or the graph is small enough, it won’t matter, but it’s still axiomatically more work to compare potentially mutable graphs than immutable graphs. If an application fails to express immutability where it could have, that small missing bit of information reduces flexibility in choosing an algorithm.
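A minimal sketch of the two change checks, assuming nothing about React’s actual implementation: one relies on the immutability declaration, the other has to walk the graph pessimistically (and this version doesn’t even handle cycles).

```ts
// If a subtree declares itself immutable, "has it changed?" is one reference check.
function hasChangedImmutable<T>(previous: T, next: T): boolean {
  return previous !== next; // an unchanged immutable graph is the same object
}

// Without that declaration, the framework has to be pessimistic and walk the graph.
// (Sketch only: no cycle detection, no special-casing of dates, maps, etc.)
function hasChangedDeep(previous: unknown, next: unknown): boolean {
  if (previous === next) return false;
  if (typeof previous !== 'object' || typeof next !== 'object' ||
      previous === null || next === null) {
    return true;
  }
  const keys = new Set([...Object.keys(previous), ...Object.keys(next)]);
  for (const key of keys) {
    if (hasChangedDeep((previous as any)[key], (next as any)[key])) return true;
  }
  return false;
}
```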
This is not a new thing: most functional languages have immutability baked in as the default. Even C has the notion of `const` and `volatile` to give hints to the compiler about how it can deal with that data. Naturally, higher-level languages try to abstract away these concepts, but it constrains all the layers below.
On this subject, another unavoidable concept is nullability: is a reference assigned or not? Most new languages (and newer versions of languages, like C#) are switching from the age-old—and convenient-for-the-compiler—default of nullable references. Again, reference assignment is a core concept that is unavoidable when thinking about code with pointers.
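A tiny TypeScript illustration, assuming `strictNullChecks` is enabled: the type states whether the reference can be absent, and the compiler only demands a check where absence is actually possible.

```ts
// With strict null checks, the type itself says whether a reference can be absent.
function companyNameLength(name: string): number {
  return name.length;            // no null check needed: `name` cannot be null
}

function maybeNameLength(name: string | null): number {
  return name === null ? 0 : name.length; // the compiler forces the check here
}
```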
Another concept that limits choosing a more performant transformation during compilation is failing to express function purity. Does a function cause a side-effect? A compiler can optimize a function known to be pure in ways that it cannot with impure functions.
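A sketch of the distinction, with invented VAT helpers: the pure version can be cached, reordered, or folded at compile time, while the impure one pins down call order and count.

```ts
// Pure: the result depends only on the arguments, with no side-effects.
// Calls can be reordered, cached, or computed ahead of time.
function vatInclusive(price: number, rate: number): number {
  return price * (1 + rate);
}

// Impure: writes state outside its arguments, so the caller (and the optimizer)
// must assume that the order and count of calls matters.
let auditLog: string[] = [];
function vatInclusiveLogged(price: number, rate: number): number {
  auditLog.push(`calculated VAT for ${price}`); // side-effect
  return price * (1 + rate);
}
```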
All of these features are a balance between programmer convenience, onboarding of new developers, and allowing programmers to focus on application logic rather than making concessions to the language and framework. As discussed above, though, there are concepts intrinsic to programming that have ostensibly nothing to do with application logic, but that an application declares (if not explicitly, then implicitly).
Taking the example from above, if an application declares that a person is in a company, but fails to mention that a person must be in a company, then the underlying software (framework and compiler) must be more pessimistic about that relationship than is strictly necessary.
A good framework encourages software to be precise about its own model by allowing the application to declare the salient parts of its model in a declarative minimal set of idioms.
Published by root on 24. Nov 2019 20:55:24 (GMT-5)
The article In Defense of Utility-First CSS by Sarah Dayan on January 15th, 2018 (Frontstuff) is very long [1], so I’ve summarized a bit with notes and thoughts. [2]
I don’t really care about being pedantic without first knowing some facts. What are the requirements?
If atomic/utility CSS can deliver these things, then it’s probably a fine tool. But—spoiler alert—it seems more like a tool for designers—not programmers. Programmers have other, better tools for building CSS in a way that fulfills the requirements above.
Essentially, these designers are like we programmers used to be: we used to care about cascading when we were still hand-coding our CSS. Now that we’re using LESS or another generator, we can use variables and functions for theming and use local CSS for precision. We can lean on specificity when it suits us and avoid it when it only gets in the way.
We want to declaratively say how we want everything to look and let our tools (LESS, WebPack with plugins) figure out how best to generate the CSS to accommodate supported browsers and also to create the kind of CSS that performs well without blowing up memory client-side. None of these optimizations and accommodations for targets should be up to the programmer/designer/CSS-writer at this point.
Utility-CSS feels functional, but it also feels like something you use when you don’t have LESS. I’ve never used BEM, and I agree that it never really made sense: it departs from several good coding practices, like DRY. That the author is coming from BEM to utility-CSS is not a surprise: BEM was never a good idea.
“Early refactors are a pretty good indicator of unmaintainability.”
I don’t agree. It’s more a sign of shifting priorities or requirements. It’s not uncommon in agile development. The example the author has of changing the meaning of a “card” after there are already components using that style just means that you should make a “card2” class (not a “card-no-ribbon” one) because it’s just a different card type.
The problem is that the design now includes two cards, not that your implementation should somehow be able to easily roll with a confusing design.
Where I see a problem is when a card is supposed to have a certain padding and a border with a certain color (let’s say the “padding-top-8” and “border-bottom-lemon” from the author’s example). But then you don’t want those anymore.
Granted, with proper components, you’ll only have to change the style in one place anyway, right? So it doesn’t matter what you call it. You could have just called it “card” in the local styles and been done with it. So, either you have to remove those highly specific styles in many places in your HTML (as with an old-style web site, like earthli) or you change it in one place anyway (new-fangled, with React components).
I guess it’s the difference between knowing from the HTML what the component is going to look like (`<blockquote class="border-thick-left-red padding-left-medium font-navy">`) and knowing what the component is (`<blockquote class="newspaper">`).
The author writes:
“Yet, the bigger and the more complex a component gets, the less obvious it is to know what class name maps to what element on the screen, or what it looks like.”
But then they include an example where it’s absolutely clear which components do what:
<div class="entry">
  <h2 class="entry-title">The Shining</h2>
  <div class="widget widget-lead">
    <div class="widget-content">
      <p>His breath stopped in a gasp…</p>
    </div>
    <div class="author">
      <img class="author-avatar" src="…">
      <h3 class="author-name">Stephen King</h3>
      <p>Stephen Edwin King …</p>
      <div class="btn-group">
        <a class="btn" href="#">Website</a>
        <a class="btn" href="#">Twitter</a>
      </div>
    </div>
  </div>
</div>
I think this again shows the difference between programmers and designers: the code above is crystal clear to a programmer, so if a programmer is writing the CSS, then there’s no need to change anything.
The author seems to be a designer hell-bent on knowing exactly what the page will look like without actually showing it in a browser. I wish they’d included the version with utility CSS … it would have been a giant block of unreadable code, doubled in size with class names.
The author makes a good case for theming using CSS variables, which can be applied “at runtime” in the browser. The solution to theming with utility CSS turns out to be … making semantic styles instead of precisely named styles. So…not utility CSS.
The author references a few other articles, one of which is Kiss My Classname by Jeffrey Zeldman, which eloquently argues that there is nothing to change. He instead argues that developers and designers should use a visual style guide.
“I don’t believe the problem is the principle of semantic markup or the cascade in CSS. I believe the problem is a dozen people working on something without talking to each other.
“Slapping a visually named class on every item in your markup may indeed make your HTML easier to understand for a future developer who takes over without talking to you, especially if you don’t document your work and create a style guide. But making things easier for yourself and other developers is not your job. And if you want to make things easier for yourself and other developers, talk to them, and create a style guide or pattern library.”
“The present is always compromised, always rushed. We muddle through with half the information we need, praised for our speed and faulted when we stop to contemplate or even breathe. (Emphasis added.)”
Another article they referenced was CSS Utility Classes and “Separation of Concerns” by Adam Wathan on August 7th, 2017 and it’s even longer. It’s almost a jeremiad with the seeming intent of breaking the reader down with a flood of words. I could only skim it, but it seems like these people are styling without programming: that is, some of the utility classes and even the slightly semantic ones they use could very easily be written more cleanly if they just used component-local styles.
For example, this is completely unnecessary with local styles, because you don’t have to worry about specificity biting you in the ass:
<div class="media-card">
  <img class="media-card__image"
       src="https://i.vimeocdn.com/video/585037904_1280x720.webp" alt="">
  <div class="media-card__content">
    <h2 class="media-card__title">Stubbing …</h2>
    <p class="media-card__body">
      In this quick blog post and screencast, …
    </p>
  </div>
</div>
In another article, On the Growing Popularity of Atomic CSS by Ollie Williams on November 24th, 2017, the author mentions that they’re addressing “a mixed-ability team, perhaps involving backend developers with limited interest and knowledge of CSS”. I didn’t have the energy to finish that one either, because a skim indicated that it repeated a lot of what was in the article I did read.
Published by marco on 17. Oct 2019 14:42:13 (GMT-5)
Azure DevOps allows you to link multiple accounts.
Our concrete use case was:
Are we clear so far? U1 and U2 are linked because reasons. U1 is old and busted; U2 is the new hotness.
The linking has an unexpected side-effect when managing SSH keys. If you have an SSH key registered with one of the linked accounts, you cannot register an SSH key with the same signature with any of the other accounts.
This is somewhat understandable (I guess), but while the error message indicates that you have a duplicate, it doesn’t tell you that the duplicate is in another account. When you check the account that you’re using and see no other SSH keys registered, it’s more than a little confusing.
Not only that, but if the user to which you’ve added the SSH key has been removed from the organization, it isn’t at all obvious how you’re supposed to access your SSH key settings for an account that no longer has access to Azure DevOps (in order to remove the SSH key).
Instead, you’re left with an orphan account that’s sitting on an SSH key that you’d like to use with a different account.
So, you could create a new SSH key _or_ you could do the following:
If you can’t add U1 to O1 anymore, then you’ll just have to generate and use a new SSH key for Azure. It’s not an earth-shatteringly bad user experience, but it’s interesting to see how several logical UX decisions led to a place where a couple of IT guys were confused for long minutes.
Published by marco on 17. Oct 2019 13:27:26 (GMT-5)
Updated by marco on 11. Mar 2021 14:33:13 (GMT-5)
I’ve written about using SmartGit (SG) before [1] [2] and I still strongly recommend that developers who manage projects use a UI for Git.
If you’re just developing a single issue at a time and can branch, commit changes and make pull requests with your IDE tools, then more power to you. For this kind of limited workflow, you can get away with a limited tool-set without too big of a safety or efficiency penalty.
However, if you need an overview or need to do more management, then you’re going to sacrifice efficiency and possibly correctness if you use only the command line or IDE tools.
I tend to manage Git repositories, which means I’m in charge of pruning merged or obsolete branches and making sure that everything is merged. A well-rendered log view and overview of branches is indispensable for this kind of work.
I have been and continue to be a proponent of SmartGit for all Git-related work. It not only has a powerful and intuitive UI, it also supports pull requests, including code comments that integrate with BitBucket, GitLab and GitHub, among others.
It has a wonderful log view that I now regularly use as my standard view. It’s fast and accurate (I almost never have to refresh explicitly to see changes) and I have a quick overview of the workspace, the index and recent commits. I can search for files and easily get individual logs and blame.
The file-differ has gotten a lot better and has almost achieved parity with my favorite diffing/merging tool Beyond Compare. Almost, but not quite. The difference is still significant enough to justify Beyond Compare’s purchase price of $60.
What’s better in Beyond Compare [3]?
I could live without the Beyond Compare differ, but not without the merger.
To set up SmartGit to use Beyond Compare:

- Differ (two-way file comparison):
  - Command: `C:\Program Files (x86)\Beyond Compare 4\BCompare.exe`
  - Arguments: `"${leftFile}" "${rightFile}"`
- Merger (conflict solver):
  - Command: `C:\Program Files (x86)\Beyond Compare 4\BCompare.exe`
  - Arguments: `"${leftFile}" "${rightFile}" "${baseFile}" "${mergedFile}"`
I was testing the Git support in Visual Studio Code and ran into a somewhat surprising limitation. For those that use IDE Git integration without an external tool, this would be a pretty disappointing message. What do you do then?
Published by marco on 17. Oct 2019 07:38:00 (GMT-5)
Visual Studio 2019 (VS) asked me this morning if I was interested in taking a survey to convey my level of satisfaction with the IDE.
VS displays the survey in an embedded window using IE11. [1] I captured the screen of the first thing I saw when I agreed to take the survey.
I know it’s the SurveyMonkey script that’s failing, but it’s still not an auspicious start.
Published by marco on 19. May 2019 17:15:28 (GMT-5)
Updated by marco on 13. Jan 2022 09:53:18 (GMT-5)
I’ve just read about a web framework called Svelte in the post Virtual DOM is pure overhead. I think the product itself sounds interesting, but that the author uses unnecessarily misleading arguments.
From what I gather, Svelte is a compile-time reconciliation generator for JSX/TSX components. This pre-calculated generator applies changes to the DOM without needing a virtual DOM and without real-time diffing or reconciliation. That is, instead of having real-time calculation, with possible performance hits [1], the app benefits from having all possible state changes pre-calculated and ready to apply immediately and quickly.
This all sounds pretty good, I think. I’m definitely going to take a look at the more-advanced tutorials. [2]
However, the author wasn’t happy with just presenting his product, but seems to need to mischaracterize why products like React abstracted away from the DOM in the first place. He tells us that the virtual DOM was always slower than manipulating the DOM. But that isn’t the claim React makes. React helps users avoid common performance pitfalls in the model of programming that it replaced—it never claimed to be the final word in performance optimization.
It’s clear that something like Svelte—if it can cover all the needs of an app—is faster than maintaining a virtual DOM.
But that product isn’t what React replaced. React replaced products written in jQuery. React brought an asynchronous frame-based renderer to the web (something that products like WPF have had for decades). It brought us type-safe views (when used with TypeScript) and taught us about the advantages of immutable data structures.
The author characterizes the notion that a virtual DOM is faster as a “meme”. This is silly and imprecise. It is true that React will be more efficient than most hand-coded web sites of a typical level of complexity. jQuery sites tended to teeter and collapse under their own weight. They were unmaintainable and very difficult to optimize without nearly rewriting them. React sites, on the other hand, are modular in nature and the library includes several standard patterns to apply and measures to take to optimize these components. It’s not always easy, but it’s better than it was in the old days.
And there are solutions in React to performance issues. The users must follow patterns and use the APIs correctly. That’s the way it is in every framework or library. Some libraries offer less leeway for users to screw up performance in the way that they shape their APIs.
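As one example of the kind of pattern meant here (not an exhaustive recipe), memoization is declared explicitly; the `Row` and `List` components are invented for illustration:

```tsx
import { memo, useMemo } from 'react';

// Hypothetical row component: `memo` skips re-rendering it when its props
// haven't changed between renders of the parent.
const Row = memo(function Row({ label }: { label: string }) {
  return <li>{label}</li>;
});

function List({ items, filter }: { items: string[]; filter: string }) {
  // `useMemo` avoids re-filtering when `items` and `filter` are unchanged
  // (e.g. when unrelated state elsewhere in the tree changes).
  const visible = useMemo(
    () => items.filter(item => item.includes(filter)),
    [items, filter]
  );
  return <ul>{visible.map(label => <Row key={label} label={label} />)}</ul>;
}
```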
Sometimes the API surface goes too much in that direction and ends up handcuffing users. That is, users can’t write what they want to write in a way that feels natural because the pattern they prefer wouldn’t perform well under their framework. Instead, the user must change how they think about writing apps just to use the framework. This isn’t necessarily a bad thing, but is definitely something to consider. It’s possible that Svelte offers all of the advantages of React with even more flexibility and less opinion.
React—and its companion Redux—was always about being very declarative about state and changes. There is no magic; even the reconciliation algorithm is very predictable. There are other approaches, like MobX, which users claim “does the right thing” with state changes, even if the user fails to declare dependencies as clearly as React would have required. [3] I imagine that Svelte is going in this direction as well.
The claim I think that Svelte is making is that users can write code that feels more natural without changing their paradigm to match the framework. That is, Svelte must have some rules for which state the compiler observes and pre-compiles, but the claim is that it’s much more flexible and forgiving than React’s “straitjacket” (my word). [4]
He goes on to say that React acknowledges its own slowness by giving the user control over `shouldComponentUpdate`. This is a silly argument again. It’s arguing that React bamboozled people in 2013 by convincing them to use their framework instead of a library that the author purports is faster, but that he only started writing in 2017.
There is honestly no need for this kind of bullshit. If your library offers advantages over React, describe them and let them speak for themselves. There is no need to rewrite the whole history of a product that quite clearly inspired your own, pretending that the authors of your own framework’s inspiration are your inferiors because they failed to leap directly to the concepts outlined in your library. He stands on their shoulders, then implies that they were idiots for not having been taller.
Through all of this fluff, it took to about ¾ of the way through the article to find out that Svelte generates update code at build time. I would have been much more intrigued had the author led with that. Now, I’m going to be suspicious of everything about this framework because the author went to such lengths to bamboozle and oversell me. He seems to want me to think I’ve been a fool for having used React in the first place, when his framework has been waiting for me all along, since all the way back to sometime in 2018. [5]
But he waits until the very last paragraph to explain what Svelte actually is—even though he’s been comparing it to React the entire time. It’s a good description:
“It’s important to understand that virtual DOM isn’t a feature. It’s a means to an end, the end being declarative, state-driven UI development. Virtual DOM is valuable because it allows you to build apps without thinking about state transitions, with performance that is generally good enough. That means less buggy code, and more time spent on creative tasks instead of tedious ones.
“But it turns out that we can achieve a similar programming model without using virtual DOM — and that’s where Svelte comes in.”
This is a much fairer characterization of the two libraries: they both base on a very similar model—one that React did a tremendous amount of legwork in establishing as an attractive approach in people’s minds—but that Svelte goes a step further to improve the reconciliation mechanism, moving it from runtime to compile-time. Svelte’s improvement could be a highly welcome one, but it’s incremental, not revolutionary.
That’s wonderful! But it’s actually even more wonderful than his article indicated, because I actually don’t have to learn anything to work with Svelte instead of React. I can work pretty much the same (Svelte doesn’t have hooks [6] because it seems it doesn’t need them) and just kind of “drop in” Svelte instead of React and have better performance, even in places where I’d never noticed I might have had problems.
That is, with Svelte instead of React, my app will be overall faster because performance no longer suffers from “death by a thousand cuts”, as the author puts it. Despite the author’s overzealous mischaracterizations and attempts at hot-take marketing, I’m still going to check out Svelte.
I’m not sure what MobX 5 is up to or what introspection it offers into the web of observables and dependencies in a more-complex application, but older versions of the library were not easy to debug when performance problems arose. From what I’ve read from users, things have gotten much better, but I’m still inclined to think that React’s declarative approach suits me better—it’s easier for me to apply well-established patterns in my own code rather than trying to figure out how to appease the MobX black box. Again, things may be different now than in earlier versions. I’m open to taking another look at MobX.
I’m also not sure how Svelte and MobX compare: MobX requires users to indicate that state is “observable” before it manages it, whereas I assume Svelte determines for itself which state-transitions it should track.
Update January 2022: In going through the tutorial available today, you’re very quickly introduced to _reactive declarations_ to help Svelte determine which compound expressions should be “watched” for changes to sub-elements. That is, if you declare a simple variable, any references to it in view code will be automatically updated, but if you derive another value from it and observe that value, it only updates when the derived value is updated directly. This is unlikely, as the derived value presumably implements an algorithm of some sort and should never be changed directly (i.e. it’s a _calculated property_ in the parlance of other frameworks). For example, given the following code,
let count = 0;
let doubled = count * 2;
Any observers (i.e. embeddings in a view) of the value of `doubled` will not be updated when `count` changes, even though the naive interpretation of a JavaScript developer would be that of course it changes. In order to get the desired effect, you must make it a reactive declaration with `$:`. For example,
let count = 0;
$: doubled = count * 2;
This is perfectly fine, but it is an example of how the “you don’t have to do anything to make your JavaScript work naturally, unlike smelly React” pitch is overselling the advantage. Missing _reactive declarations_ will cause an app to not work as expected, just as much as a missing `useState()` does in React.
The author disparages hooks, saying that they are even worse for performance and linking to a tweet with the words “with predictable results”. The tweet complains about atrocious performance because of constant reconciliation and rendering—but a dozen answers down is the answer: the original poster failed to tell the `useEffect()` hook on which state it relied.
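A reconstruction of that mistake in TypeScript/TSX; the component and `fetchResults` helper are invented, but the dependency array is the point:

```tsx
import { useEffect, useState } from 'react';

// Hypothetical component illustrating the mistake described above.
function Results({ query }: { query: string }) {
  const [results, setResults] = useState<string[]>([]);

  // Broken: no dependency array, so the effect runs after *every* render,
  // and setting state triggers another render: constant re-fetching.
  // useEffect(() => { fetchResults(query).then(setResults); });

  // Fixed: declare which state the effect relies on.
  useEffect(() => {
    fetchResults(query).then(setResults);
  }, [query]);

  return <ul>{results.map(r => <li key={r}>{r}</li>)}</ul>;
}

// Hypothetical data access, just to keep the sketch self-contained.
async function fetchResults(query: string): Promise<string[]> {
  return [`result for ${query}`];
}
```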
That’s kind of a rookie mistake—in that framework. [7] I understand that Svelte claims that it doesn’t need these hints in order to be able to determine at compile-time when a piece of code needs to be executed because the state on which it relies has changed.
React is declarative and requires help from the user whereas presumably the selling point of Svelte is that this user would have wasted less time improving performance and more time focused on application logic because Svelte is smart enough to do all this for you.
I personally think that this sounds awesome and that it is an admirable goal, but have my doubts that Svelte doesn’t also impose its own set of limitations on what kind of state transformations you can do that the compiler can actually detect.
That is, React provides an API with which callers can “help” the reconciliation algorithm avoid work. Svelte claims that this isn’t necessary, but I’m going to guess that there are rules for how sophisticated state changes can be before the Svelte compiler no longer detects them. In that case, what does Svelte do? Fall back to using a React-like virtual DOM to reconcile changes? Or just not update when the user expects? Or just fail to compile, spitting out an error indicating what the user should do to fix the issue (my personal favorite)?
Published by marco on 8. Apr 2019 09:38:17 (GMT-5)
Updated by marco on 12. Jan 2023 16:49:41 (GMT-5)
[[_TOC_]]
## Introduction
Testing is any form of validation that verifies code. That includes not only structured validation using checklists, test plans, etc. but also informal testing, as when developers click their way through a UI or emit values in debugging output to a console.
_Automated testing_ covers the topic of all regression-style tests that execute both locally and in CI. This includes unit, integration, and end-to-end tests.
Testing is primarily a mindset.
You should think of writing tests not as something you _have_ to do, but rather as something you _want_ to do.
- How else do you prove that what you wrote works?
- What does _”it works”_ mean?
- Which _use cases_ are covered?
- How do you answer these questions without tests?
- What do we mean by _writing_ tests?
## You’re already testing!
You’re almost certainly already testing.
You might be clicking through the UI or emitting statements in a command-line application, but you’re verifying your code _somehow_. I mean … you are, right? RIGHT?
I’m kidding. Of course you’re not just writing code, building it, and committing it. You’re validating it somehow.
That’s testing.
### A list of validations
If you’re really good, you might even keep a list of these validations. Once you have a list, then,
1. You don’t have to worry about forgetting to do them in the future
1. Even someone with no knowledge of the system can perform validation
This is fine, but it’s still a manual process. A manual process carries with it the following drawbacks:
1. It gets quite time-consuming, especially as the list of validations grows
1. You’re highly unlikely to perform the validations often enough
− It’s much easier to fix a mistake if you learn about it relatively soon after you made it
1. You’re also unlikely to add _all_ of the validations you need
− Generally, you won’t validate smaller “facts” and will focus on high-level stuff
1. A manual validation process can’t be run as part of CI or CD
### Automating the list
Automated testing means that you _codify_ those validations.
> 😒 Great! I have tests! How the heck do I _codify_ them?
Don’t panic. Almost any code can be tested. In fact, if you can’t get at it with a test, then you might have found an architectural problem.
See? Automating tests will even help you write better code!
> 🤨 How do I get started?
Just start somewhere. It doesn’t matter where. Don’t worry about coverage. Just get the feeling for writing a proof about a facet of your code. Any bit of logic can—and should—be tested.
What if you still don’t know where to begin? Ask someone for help! Don’t be shy. It’s in everyone’s best interest for a project to have good tests. You want everyone’s code to have tests so you know _right away_ when you’ve broken something in a completely unrelated area. This is a good thing!
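To make “just start somewhere” concrete, here is a minimal sketch of what a very first test can look like, using MSTest. The `PriceCalculator` class and its behavior are invented purely for illustration:

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// A deliberately trivial class, invented for this example only.
public class PriceCalculator
{
    private readonly decimal _discountPercent;

    public PriceCalculator(decimal discountPercent) => _discountPercent = discountPercent;

    public decimal Calculate(decimal netPrice) => netPrice * (1 - _discountPercent / 100);
}

[TestClass]
public class PriceCalculatorTests
{
    [TestMethod]
    public void DiscountIsAppliedToNetPrice()
    {
        var calculator = new PriceCalculator(discountPercent: 10);

        // 10% off of 200 is 180.
        Assert.AreEqual(180m, calculator.Calculate(200m));
    }
}
```

Once one such test exists, adding the next one is much easier: the project structure, test runner, and naming pattern are already in place.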
## Goals
> 🤸♀️ Developers should be excited to use tests to prove that their code works.
### Tests should be quick and easy (maybe even fun) to write
A project should provide support for mocking devices and external APIs, or for using test-specific datasets.
### Tests should be reasonably fast
A reasonably fast test suite will tend to be run more often. We would like a developer to notice a broken test right after the change that broke it, preferably even before pushing it.
### Avoid debugging tests in CI
Tests a developer runs locally should almost always work in CI. Failing tests in CI should also fail locally.
## Guidelines
> 🤨 Don’t be pedantic.
For example,
- [Stop requiring only one assertion per unit test: Multiple assertions are fine](https://stackoverflow.blog/2022/11/03/multiple-assertions-per-test-are-fine/)
- Don’t forbid mocking in integration tests and don’t force mocking in unit tests.
− In fact, stop worrying about whether it’s a unit or an integration and just _write useful tests_ that _prove useful things_ about your code.
- Don’t get obsessed with automating _everything_.
− Get the low-hanging fruit first, and leave the rest to manual testing.
− See where you stand.
− If you haven’t automated enough, iterate until done. 🔄
### Tests should be useful
We never want anyone in a team to get the impression that we’re writing tests just to write tests. We write tests because they help us write better code and because it feels good to be able to prove that something that was working continues to work. You should feel more efficient and productive and feel like you’re producing higher-quality code.
- Tests should confirm use cases
- Tests should prove something about your code that you think is worth proving.
- Tests should confirm behavior that either is how the code _currently_ works or how it _should_ work.
- Tests should help you write better code from the get-go.
- Every bug that you need to fix is de-facto a use case that needs a test.
### Code Coverage & Reviews
How do you know when there are “enough” automated tests?
Don’t get distracted by trying to achieve a specific coverage percentage. The most important thing is that the major use cases are covered.
If software is stable and there is “only” 40% test-coverage, then maybe there is a lot of code that rarely or never gets used. In that case, you might want to think about removing code that you don’t need rather than wasting time writing tests for code that never runs.
New code, though, should always have automated tests. A **code reviewer** should verify that new functionality is being tested.
## Types of tests
| Type | Definition | When to use them |
| --- | --- | --- |
| Unit | Cover a single unit, mocking away other dependencies where needed | Useful for verifying simple logic like calculated properties or verifying the results of service methods with given inputs |
| Integration | Cover multiple units, possibly mocking unwanted dependencies| Useful for verifying behavior of units in composition, as they will be used in the end product. The goal is to cover as much as possible without resorting to more costly end-to-end tests |
| End-to-End | Also called _UI Tests_, these tests verify the entire stack for actual customer use cases | Very useful, but generally require more maintenance as they tend to be more fragile. Essential for verifying UI behavior not reflected in a programmatic model. Can work with snapshots (e.g. error label is in red) |
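As a sketch of what the end-to-end flavor can look like (assuming Selenium WebDriver; the URL and element IDs are invented for illustration):

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

[TestClass]
public class LoginEndToEndTests
{
    [TestMethod]
    public void UserSeesDashboardAfterLogin()
    {
        // Drives a real browser against a running instance of the application.
        using var driver = new ChromeDriver();

        driver.Navigate().GoToUrl("https://localhost:5001/login");
        driver.FindElement(By.Id("username")).SendKeys("test-user");
        driver.FindElement(By.Id("password")).SendKeys("test-password");
        driver.FindElement(By.Id("login-button")).Click();

        Assert.IsTrue(driver.FindElement(By.Id("dashboard-title")).Displayed);
    }
}
```

This is exactly the kind of test that is worth having for a handful of central use cases, but that becomes expensive if used for every small behavior.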
## Approach
The article [Write tests. Not too many. Mostly integration.](https://kentcdodds.com/blog/write-tests) describes a pragmatic approach quite well. Instead of the classic “testing pyramid”, it suggests a “testing trophy”.
![image.png](/.attachments/image-6b9cafdf-0bac-4155-bb8f-363a92822bc3.png =300x)
This style of development has the following aims:
1. Verify as much as possible _statically_, with linting and analyzers
1. Make _integration tests_ cheaper because they prove more about your system than _unit tests_
1. Prove as much as possible outside of _end-to-end tests_ because they’re expensive and brittle
## Analysis
> Remember that everything you use has to work both locally and in CI.
### Static-checking
A project should include analyzers and techniques so that the compiler helps make many tests unnecessary. For example, if you know that a parameter or result can never be `null`, then you can avoid a whole slew of tests.
Developers should only spend time writing tests that verify semantic aspects that can’t be proven by the compiler.
#### Null-reference analysis in .NET
The .NET world provides many, many analyzers and tools to verify code quality. One of the most important things a project can do is to improve null-checking. The best way to do this is to upgrade to C# 8 or higher and enable [null-reference analysis](https://learn.microsoft.com/en-us/dotnet/csharp/nullable-references). The [default language for .NET Framework is going to stay C# 7.3](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/configure-language-version), but
you can [enable null-reference analysis for .NET Framework](https://www.infoq.com/articles/CSharp-8-Framework/) quite easily.
Another option is to use the [JetBrains Annotations NuGet package](https://www.nuget.org/packages/JetBrains.Annotations/), which provides attributes to indicate whether parameters or results are nullable.
The preferred way, though, is to use the by-now standard nullability-checking available in .NET.
Doing neither is not a good option, as it will be very difficult to avoid null-reference exceptions.
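As a minimal sketch of why this matters: with nullable reference types enabled, the signature itself documents and enforces the contract, so a “throws on null” test adds nothing:

```csharp
#nullable enable

public static class Greeter
{
    // The compiler warns any caller that tries to pass null here,
    // so there is no need for a test that passes null.
    public static string Greet(string name) => $"Hello, {name}!";

    // If null is a legitimate input, the signature says so explicitly.
    public static string GreetOrDefault(string? name) => $"Hello, {name ?? "anonymous"}!";
}
```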
### Unit-testing
Unit tests are very useful for validating _requirements_ and _invariants_ about your code.
These are the easiest tests to write and will generally be the first ones that you will write.
A requirement or an invariant may be specified in the story itself, but it can be anything that you know about the code that’s important. It’s up to the developer and the reviewer(s) to determine which tests are necessary. It gets easier with experience—and it doesn’t take long to get enough experience so that it’s no longer so intimidating.
#### Unit-testing example
Just as a quick example in .NET, consider the following code,
```csharp
public bool IsDiagnosticModeRunning
{
get => _isDiagnosticModeRunning;
set
{
_isDiagnosticModeRunning = value;
_statusManager.InstrumentState = value ? InstrumentState.DiagnosticMode : InstrumentState.Ready;
}
}
```
Here we see a relatively simple property with a getter and a setter. However, we also see that there is an invariant in the implementation: that the `_statusManager.InstrumentState` is synced with it.
Using many of the [techniques described below](#tools-and-techniques), we could write the following test:
```csharp
[DataRow(true, InstrumentState.DiagnosticMode)]
[DataRow(false, InstrumentState.Ready)]
[TestMethod]
public void TestIsDiagnosticModeRunning(bool running, InstrumentState expectedInstrumentState)
{
var locator = CreateLocator();
var instrumentControlService = locator.GetInstance<IInstrumentControlService>();
var statusManager = locator.GetInstance<IStatusManager>();
Assert.AreNotEqual(expectedInstrumentState, statusManager.InstrumentState);
instrumentControlService.IsDiagnosticModeRunning = running;
Assert.AreEqual(expectedInstrumentState, statusManager.InstrumentState);
}
```
Here, we’re using MSTest to create a parameterized test that,
- Creates the IOC
- Gets the two relevant services from it
- Verifies that the state is not already set to the expected state (in which case the test would succeed even if the tested code doesn’t do anything)
- Sets the property to a given value
- Verifies that the state is correct for that value
We now have code that validates two _facts_ about the system. Should something change where these facts are no longer true, the tests will fail, giving the developer a chance to analyze the situation.
- Was the change inadvertent or deliberate?
- Are the facts still correct? Does the test need to be updated?
If you’re addressing a bug-fix, though, you might be able to _prove_ that you’ve fixed the bug with a unit test, but it’s also likely that you’ll have to write an integration test instead.
### Integration-testing
Unit tests have their place, but they are far too emphasized in the testing pyramid. The testing pyramid comes from a time when writing integration tests was much more difficult than it (theoretically) is today.
The “theoretically” above means that the ability to write integration tests as efficiently as unit tests is contingent on a project offering proper tools and support.
One common complaint about integration tests vis à vis unit tests is that they run more slowly. Another is that they take longer to develop. Ideally, a project provides support to counteract both of these tendencies.
To this end, then, a project should offer base and support classes that make common integration tests easy to set up and quick to execute:
- Interacting with a database
- Setting up a known database schema
- Getting to a clean dataset
- Mocking the database
- Mocking other external dependencies in a project (e.g. loading configuration from an endpoint, sending emails, sending modifications to endpoints)
There are many different ways to solve this problem, each with tradeoffs. For example, a project can load dependencies in Docker containers, either created and started manually (see [Testing your ASP.NET Core application − using a real database](https://josef.codes/testing-your-asp-net-core-application-using-a-real-database/)) or even dynamically with a tool like the [Testcontainers NuGet package](https://github.com/testcontainers/testcontainers-dotnet).
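As a sketch of the dynamic approach (assuming the `Testcontainers.PostgreSql` NuGet package and its builder API; the test itself is invented for illustration):

```csharp
using System.Threading.Tasks;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Testcontainers.PostgreSql;

[TestClass]
public class CustomerRepositoryTests
{
    private static PostgreSqlContainer _database = null!;

    [ClassInitialize]
    public static async Task StartDatabase(TestContext context)
    {
        // Starts a throwaway PostgreSQL instance in Docker for these tests.
        _database = new PostgreSqlBuilder().Build();
        await _database.StartAsync();
    }

    [ClassCleanup]
    public static async Task StopDatabase() => await _database.DisposeAsync();

    [TestMethod]
    public void ConnectionStringPointsAtThrowawayDatabase()
    {
        // The application's data layer would be configured with this connection
        // string instead of a shared development database.
        Assert.IsFalse(string.IsNullOrEmpty(_database.GetConnectionString()));
    }
}
```

The same pattern works for other external dependencies (message brokers, blob storage, etc.), which is what makes “integration tests as cheap as unit tests” realistic rather than aspirational.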
### Comparing Unit and Integration tests
A drawback to unit tests is that, while they can test an individual component well, it’s really the big picture that we want to test. We want to test scenarios that correspond to actual use cases rather than covering theoretical call stacks. It’s not that the second part _isn’t_ important, but that it’s not _as_ important.
Given limited time and resources, we would prefer to have integration tests that also cover a lot of the same code paths that we would have covered with unit tests, rather than to have unit tests, but few to no integration tests.
This, however, leads directly to…
The advantage of a unit test over an integration test is that when it fails, it’s obvious which code failed. An integration test, by its very nature, involves multiple components. When it fails, it might not be obvious which sub-component caused the error.
If you find that you have integration tests failing and it takes a while to figure out what went wrong, then that’s a sign that you should bolster your test suite with more unit tests.
Once an integration test fails _and_ one or more unit tests fail, then you have the best of both worlds: you’ve been made aware that you’ve broken a use case (integration test), but you also know which precise behavior is no longer working as before (unit test).
## Tools and Techniques
### Tests are Code
Test code is just as important as product code. Use all of the same techniques to improve code quality in test code as you would in product code. Clean coding, good variable names, avoid copy/paste coding—all of it applies just as much to tests.
There are two main differences:
- You don’t need to document tests
- You don’t have to write tests for tests. :-)
### Writing testable code
This is a big, big topic, of course. There are a few guidelines that make it easier to write tests—or to avoid having to write tests at all.
As noted above, code that can be validated by the compiler (static analysis) doesn’t need tests. E.g. you don’t have to write a test for how your code behaves when passed a `null` parameter if you just _forbid it_. Likewise, you don’t have to re-verify that types work as they should in statically typed languages. We can trust the compiler.
Here are a handful of tips.
- Prefer composition to inheritance
- A functional programming style is very testable
- An IOC Container is very helpful
- Avoid nullable properties, results, and parameters
- Avoid mutable data
- Interfaces are much easier to fake or mock; use those wherever you can
See the following articles for more ideas.
- [C# Handbook − Chapter 4: Design](https://github.com/mvonballmo/CSharpHandbook/blob/master/4_design.md) (2017)
- [Questions to consider when designing APIs: Part I](https://www.earthli.com/news/view_article.php?id=2996) (2014)
- [Questions to consider when designing APIs: Part II](https://www.earthli.com/news/view_article.php?id=2997) (2014)
- [Why use an IOC? (hint: testing)](https://www.earthli.com/news/view_article.php?id=3487) (2019)
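To make a couple of these tips concrete (interfaces plus an injected dependency), here is a minimal sketch; the types are invented for illustration:

```csharp
using System;

// The interface makes the dependency trivial to replace in a test.
public interface IClock
{
    DateTime Now { get; }
}

public class SystemClock : IClock
{
    public DateTime Now => DateTime.Now;
}

// The class under test receives its dependency instead of creating it,
// so a test (or an IOC container) can supply a fixed, fake clock.
public class GreetingService
{
    private readonly IClock _clock;

    public GreetingService(IClock clock) => _clock = clock;

    public string GetGreeting() => _clock.Now.Hour < 12 ? "Good morning" : "Good afternoon";
}
```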
### Parameterized Tests
Investigate your testing library to learn how to write multiple tests without having to write a lot of code. In the MSTests framework, you can use `DataRow` to parameterize a test. In NUnit, `TestCase` does the same thing, and `Value` allows you to provide parameter values for a list of tests that are the Cartesian product of all values.
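The MSTest flavor (`DataRow`) is shown in the unit-testing example above. As a sketch of the NUnit equivalents:

```csharp
using NUnit.Framework;

[TestFixture]
public class RoundingTests
{
    // One test per TestCase attribute.
    [TestCase(1.24, 1.2)]
    [TestCase(1.26, 1.3)]
    public void RoundsToOneDecimal(double input, double expected)
    {
        Assert.That(System.Math.Round(input, 1), Is.EqualTo(expected).Within(1e-9));
    }

    // Values on multiple parameters yields the Cartesian product: 2 x 2 = 4 tests.
    [Test]
    public void AbsoluteValueIsNeverNegative([Values(-1, 1)] int sign, [Values(0, 5)] int magnitude)
    {
        Assert.That(System.Math.Abs(sign * magnitude), Is.GreaterThanOrEqualTo(0));
    }
}
```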
### Mocking/Faking
Use mocks or fakes to exclude a subsystem from a test. What would you want to exclude? While you will want to make some tests that include database access or REST API calls, there are a lot of tests where you’re proving a fact that doesn’t depend on these results.
#### Focus on what you’re testing
For example, suppose a component reads its configuration from the database by default. A test of that component may simply want to see how it reacts with a given input to a given method. Where the configuration came from is irrelevant to that particular test. In that case, you could mock away the component that loads the configuration from the database and instead use a fake object that just provides some standard values.
#### Test error conditions
Another possibility is to fake an external service to see how your code reacts when the service returns an error or an ambiguous response. Without mocks, how would you test how your code reacts when a REST endpoint returns 503 or 404? Without a mock, how would you force the purely external endpoint to give a certain code? You really can’t. With a mock, though, you can replace the service and return a 404 response for a specific test. This is quite a powerful technique.
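A sketch of the idea (the `IOrderEndpoint` abstraction, the status codes, and the `OrderSender` class are all invented for illustration):

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical abstraction over a REST endpoint.
public interface IOrderEndpoint
{
    int PostOrder(string payload); // returns an HTTP status code
}

// A fake that simulates the endpoint being unavailable.
public class UnavailableOrderEndpoint : IOrderEndpoint
{
    public int PostOrder(string payload) => 503;
}

public class OrderSender
{
    private readonly IOrderEndpoint _endpoint;

    public OrderSender(IOrderEndpoint endpoint) => _endpoint = endpoint;

    public int PendingRetries { get; private set; }

    public void Send(string payload)
    {
        if (_endpoint.PostOrder(payload) >= 500)
        {
            PendingRetries++; // queue for a retry instead of failing outright
        }
    }
}

[TestClass]
public class OrderSenderTests
{
    [TestMethod]
    public void OrderIsQueuedForRetryWhenEndpointIsUnavailable()
    {
        var sender = new OrderSender(new UnavailableOrderEndpoint());

        sender.Send("order-42");

        Assert.AreEqual(1, sender.PendingRetries);
    }
}
```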
#### How to fake?
As noted above, it’s much, much easier to use fake objects if you’ve consistently used interfaces. You can just create your own implementation of the interface whose standard implementation you want to replace, give it a fake implementation (e.g. returning `false` and empty string and `null` for methods and properties), and then use that class as the implementation.
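For example, a hand-written fake for a hypothetical `IEmailSender` interface is only a few lines; it does nothing externally but records what was sent so that a test can assert on it:

```csharp
using System.Collections.Generic;

// Hypothetical production interface.
public interface IEmailSender
{
    void Send(string recipient, string subject, string body);
}

// Hand-written fake: no emails leave the machine, but calls are recorded.
public class FakeEmailSender : IEmailSender
{
    public List<(string Recipient, string Subject, string Body)> SentMessages { get; } = new();

    public void Send(string recipient, string subject, string body) =>
        SentMessages.Add((recipient, subject, body));
}
```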
#### Faking/mocking libraries
If you have interfaces that perform a single task (single-responsibility principle), then it doesn’t take too much effort to write the fake object by hand. However, it’s much easier to use a library to create fake objects—and there are other benefits as well, like tracking which methods were called with which parameters. You can assert on this data collected by the fake object.
For .NET, a great library for faking objects is [FakeItEasy](https://fakeiteasy.github.io/).
With a fake object, you can indicate which values to return for a given set of parameters without too much effort. Similarly, you can use the same API to query how often these methods have been called. This allows you to verify, for example, that a call to a REST service _would have been made_. This is a powerful way of proving facts about your code without having to actually interact with external services.
#### An example
The following code configures a fake object for `ITestUnitConfigurationService` that returns default data for all properties, except for `Configuration` and `GetTestUnitParameterValues()`, which are configured to return specific data.
```csharp
private static ITestUnitConfigurationService CreateFakeTestUnitConfigurationService()
{
var result = A.Fake<ITestUnitConfigurationService>();
var testUnitParameters = CreateTestUnitParameters();
var testUnitConfiguration = new TestUnitConfiguration(testUnitParameters);
A.CallTo(() => result.Configuration).Returns(testUnitConfiguration);
var testUnitParameterValues = CreateTestUnitParameterValues();
A.CallTo(() => result.GetTestUnitParameterValues()).Returns(testUnitParameterValues);
return result;
}
```
In the test, we could get this fake object back out of the IOC (for example) and then verify that certain methods have been called the expected number of times.
```csharp
var testUnitConfigurationService = locator.GetInstance<ITestUnitConfigurationService>();
A.CallTo(() => testUnitConfigurationService.Configuration).MustHaveHappenedOnceExactly();
A.CallTo(() => testUnitConfigurationService.GetTestUnitParameterValues()).MustHaveHappenedOnceExactly();
```
### Snapshot-testing
You can avoid writing a ton of assertions and a ton of tests with snapshot testing.
For example, imagine you have a test that generates a particular view model. You want to verify 30 different parts of this complex model.
You _could_ navigate the data structure, asserting the 30 values individually.
That would be pretty tedious, though, and lead to fragile and hard-to-maintain testing code.
Instead, you could emit that structure as text and save it as a _snapshot_ in the repository. If a future code change leads to a different snapshot, the test fails and the developer that caused the failure would have to approve the new snapshot (if it’s an expected or innocuous change) or fix the code (if it was inadvertent and wrong).
The upside is that large swaths of assertions are reduced to a simple snapshot assertion. The downside is that the test might break more often for spurious reasons. Generally, you can avoid these spurious reasons by being judicious about how you format the snapshot:
- Avoid timestamps or data that changes over time
- Avoid using output methods that are too likely to change over time
See the documentation for the [Snapshooter NuGet package](https://swisslife-oss.github.io/snapshooter/).
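As a minimal sketch (using the `Snapshooter.Xunit` flavor; the view model here is an anonymous object invented for illustration):

```csharp
using Snapshooter.Xunit;
using Xunit;

public class InvoiceViewModelTests
{
    [Fact]
    public void InvoiceViewModelMatchesSnapshot()
    {
        // In a real test, this would come from the code under test.
        var viewModel = new
        {
            Number = "2023-0042",
            Currency = "CHF",
            Lines = new[] { new { Text = "Consulting", Amount = 1200m } }
        };

        // Compares against a stored *.snap file; the first run creates it.
        Snapshot.Match(viewModel);
    }
}
```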
Published by marco on 22. Jan 2019 19:47:58 (GMT-5)
If you’re familiar with the topic, you might be recoiling in horror. It would be unclear, though, whether you’re recoiling from the “using Collab” part or the “using Collab with Git” part.
Neither is as straightforward as I’d hoped.
tl;dr: If you have to use Collab with Unity, but want to back it up with Git, disable `core.autocrlf` [1] and add `* -text` to the `.gitattributes`.
Collab is the source-control system integrated into the Unity IDE.
It was built for designers to be able to do some version control, but not much more. Even with its limited scope, it’s a poor tool.
This is really dangerous, especially with Unity projects. There is so much in a Unity project without a proper “Undo” that you very often want to return to a known good version.
So what can we do to improve this situation? We would like to use Git instead of Collab.
However, we have to respect the capabilities and know-how of the designers on our team, who don’t know how to use Git.
On our current project, there’s no time to train everyone on Git—and they already know how to use Collab and don’t feel tremendously limited by it.
Remember, any source control is better than no source control. The designers are regularly backing up their work now. In its defense, Collab is definitely better than nothing (or using a file-share or some other weak form of code-sharing).
Instead, those of us who know Git are using Git alongside Collab.
We started naively, with all of our default settings in Git. Our workflow was:
Unfortunately, we would often end up with a ton of files marked as changed in Collab. These were always line-ending differences. As mentioned above, Collab is not a good tool for reverting changes.
The project has time constraints—it’s a prototype for a conference, with a hard deadline—so, despite its limitations, we reverted in Collab and updated Git with the line-endings that Collab expected.
We limped along like this for a bit, but with two developers on Git/Collab on Windows and one designer on Collab on Mac, we were spending too much time “fixing up” files. The benefit of having Git was outweighed by the problems it caused with Collab.
So we investigated what was really going on. The following screenshots show that Collab doesn’t seem to care about line-endings. They’re all over the map.
Git, on the other hand, really cares about line-endings. By default, Git will transform the line-endings in files that it considers to be text files (this part is important later) to the line-ending of the local platform.
In the repository, all text files are LF-only. If you work on MacOS or Linux, line-endings in the workspace are unchanged; if you work on Windows, Git changes all of these line-endings to CRLF on checkout—and back to LF on commit.
Our first “fix” was to turn off the `core.autocrlf` option in the local Git repository:

```
git config --local core.autocrlf false
```
We thought this would fix everything since now Git was no longer transforming our line-endings on commit and checkout.
This turned out to be only part of the problem, though. As you can see above, the text files in the repository have an arbitrary mix of line-endings already. Even with the feature turned off, Git was still normalizing line-endings to LF on Windows.
The only thing we’d changed so far was to stop converting LF to CRLF on checkout. Any time we ran `git reset`, for example, the line-endings in our workspace would still end up being different than what was in Git or Collab.
What we really want is for Git to stop changing any line-endings at all.
This isn’t part of the command-line configuration, though. Instead, you have to set up `.gitattributes`. Git has default settings that determine which files it treats as which types. We wanted to adjust these default settings by telling Git that, in this repository, it should treat no files as text.
Once we knew this, it was quite easy to configure. Simply add a `.gitattributes` file to the root of the repository, with the following contents:

```
* -text
```
This translates to “do not treat any file as text” (i.e. match all files; disable text-handling).
With these settings, the two developers were able to reset their workspaces and both Git and Collab were happy. Collab is still a sub-par tool, but we can now work with designers and still have Git to allow the developers to use a better workflow.
The designers using only Collab were completely unaffected by our changes.
[1] It may not be necessary to change the `autocrlf` setting. Turning off text-handling in Git should suffice. However, I haven’t tested with this feature left on and, due to time-constraints, am not going to risk it.
Published by marco on 21. Jan 2019 20:26:52 (GMT-5)
Quino contains a Sandbox in the main solution that lets us test a lot of the Quino subsystems in real-world conditions. The Sandbox has several application targets:
The targets that connect directly to a database (e.g. WPF, Winform) were using the PostgreSql driver by default. I wanted to configure all Sandbox applications to be easily configurable to run with SqlServer.
This is pretty straightforward for a Quino application. The driver can be selected directly in the application (directly linking the corresponding assembly) or it can be configured externally.
Naturally, if the Sandbox loads the driver from configuration, some mechanism still has to make sure that the required data-driver assemblies are available.
The PostgreSql driver was in the output folder. This was expected, since that driver works. The SqlServer was not in the output folder. This was also expected, since that driver had never been used.
I checked the direct dependencies of the Sandbox Winform application, but it didn’t include the PostgreSql driver. That’s not really good, as I would like both SqlServer and PostgreSql to be configured in the same way. As it stood, though, I would be referencing SqlServer directly and PostgreSql would continue to show up by magic.
Before doing anything else, I was going to have to find out why PostgreSql was included in the output folder.
I needed to figure out assembly dependencies.
My natural inclination was to reach for NDepend, but I thought maybe I’d see what the other tools have to offer first.
Does Visual Studio include anything that might help? The “Project Dependencies” shows only assemblies on which a project is dependent. I wanted to find assemblies that were dependent on PostgreSql. I have the Enterprise version of Visual Studio and I seem to recall an “Architecture” menu, but I discovered that these tools are no longer installed by default.
According to the VS support team in that link, you have to install the “Visual Studio extension development” workload in the Visual Studio installer. In this package, the “Architecture and analysis tools” feature is available, but not included by default.
Hovering this feature shows a tooltip indicating that it contains “Code Map, Live Dependency Validation and Code Clone detection”. The “Live Dependency Validation” sounds like it might do what I want, but it also sounds quite heavyweight and somewhat intrusive, as described in this blog from the end of 2016 (MSDN). Instead of further modifying my VS installation (and possibly slowing it down), I decided to try another tool.
What about ReSharper? For a while now, it’s included project-dependency graphs and hierarchies. Try as I might, I couldn’t get the tools to show me the transitive dependency on PostgreSql that Sandbox Winform was pulling in from somewhere. The hierarchy view is live and quick, but it doesn’t show all transitive usages.
The graph view is nicely rendered, but shows dependencies by default instead of dependencies and usages. At any rate, the Sandbox wasn’t showing up as a transitive user of PostgreSql.
I didn’t believe ReSharper at this point because something was causing the data driver to be copied to the output folder.
So, as expected, I turned to NDepend. I took a few seconds to run an analysis and then right-clicked the PostgreSql data-driver project to select `NDepend => Select Assemblies… => That are Using Me (Directly or Indirectly)` to show the following query and results.
Bingo. `Sandbox.Model` is indirectly referencing the PostgreSql data driver, via a transitive-dependency chain of 4 assemblies. Can I see which assemblies they are? Of course I can: this kind of information is best shown on a graph, so you can show a graph of any query results by clicking `Export to Graph` to show the graph below.
Now I can finally see that the `SandboxModel` pulls in the `Quino.Testing.Models.Generated` (to use the `BaseTypes` module) which, in turn, has a reference to `Quino.Tests.Base` which, of course, includes the PostgreSql driver because that’s the default testing driver for Quino tests.
Now that I know how the reference is coming in, I can fix the problem. Here I’m on my own: I have to solve this problem without NDepend. But at least NDepend was able to show me exactly what I have to fix (unlike VS or ReSharper).
I ended up moving the test-fixture base classes from `Quino.Testing.Models.Generated` into a new assembly called `Quino.Testing.Models.Fixtures`. The latter assembly still depends on `Quino.Tests.Base` and thus the PostgreSql data driver, but it’s now possible to reference the Quino testing models without transitively referencing the PostgreSql data driver.
A quick re-analysis with NDepend and I can see that the same query now shows a clean view: only testing code and testing assemblies reference the PostgreSql driver.
And now to finish my original task! I ran the Winform Sandbox application with the PostgreSql driver configured and was greeted with an error message that the driver could not be loaded. I now had parity between PostgreSql and SqlServer.
The fix? Obviously, make sure that the drivers are available by referencing them directly from any Sandbox application that needs to connect to a database. This was the obvious solution from the beginning, but we had to quickly fix a problem with dependencies first. Why? Because we hate hacking. :-)
Two quick references added, one build and I was able to connect to both SQL Server and PostgreSql.
Published by marco on 20. Jan 2019 22:37:35 (GMT-5)
Updated by marco on 21. Jan 2019 10:00:49 (GMT-5)
In late 2011 and early 2012, Encodo designed a querying language for Quino. Quino has an ORM that, combined with .NET Linq, provides a powerful querying interface for developers. QQL is a DSL that brings this power to non-developers.
QQL never made it to implementation—only specification. In the meantime, the world moved on and we have common, generic querying APIs like OData. The time for QQL is past, but the specification is still an interesting artifact, in its own right.
Who knows? Maybe we’ll get around to implementing some of it, at some point.
At any rate, you can download the specification from Encodo or here at earthli.
The following excerpts should give you an idea of what you’re in for, should you download and read the 80-page document.
The TOC lists the following top-level chapters:
From the abstract in the document:
“The Quino Query Language (QQL) defines a syntax and semantics for formulating data requests against hierarchical data structures. It is easy to read and learn both for those familiar with SQL and non-programmers with a certain capacity for abstract thinking (i.e. power users). Learning only a few basic rules is enough to allow a user to quickly determine which data will be returned by all but the more complex queries. As with any other language, more complex concepts result in more complex texts, but the syntax of QQL limits these cases.”
From the overview:
“QQL defines a syntax and semantics for writing queries against hierarchical data structures. A query describes a set of data by choosing an initial context in the data and specifying which data are to be returned and how the results are to be organized. An execution engine generates this result by applying the query to the data.”
The following is from chapter 2.1, “Simple Standard Query”:
The following query returns the first and last name of all active people as well as their 10 most recent time entries, reverse-sorted first by last name, then by first name.
```
Person { select { FirstName; LastName; Sample:= TimeEntries { orderby Date desc; limit 10 } } where Active orderby { LastName desc; FirstName desc; } }
```
In chapter 2, there are also “2.2 Intermediate Standard Query” and “2.3 Complex Standard Query” examples.
The following is from chapter 2.4, “Simple Grouping Query”:
The following query groups active people by last name and returns the age of the youngest person and the maximum contracts for each last name. Results are ordered by the maximum contracts for each group and then by last name.
```
group Person { groupby LastName; select { default; Age:= (Now − BirthDate.Min).Year; MaxContracts:= Contracts.Count.Max } where Active; orderby { MaxContracts desc; LastName desc; } }
```
In chapter 2, there are also “2.5 Complex Grouping Query”, “2.6 Standard Query with Grouping Query” and “2.7 Nested Grouping Queries” examples.
Published by marco on 20. Jan 2019 22:19:02 (GMT-5)
Updated by marco on 20. Jan 2019 22:20:10 (GMT-5)
Due to the nature of the language, there are some API changes that almost inevitably lead to breaking changes in C#.
While you can easily make another constructor, marking the old one(s) as obsolete, if you use an IOC that allows only a single public constructor, you’re forced to either remove the old constructor right away or make it `protected`. In either case, the user has a compile error.
There are several known issues with introducing new methods or changing existing methods on an existing interface. For many of these situations, there are relatively smooth upgrade paths.
I encountered a situation recently that I thought worth mentioning. I wanted to introduce a new overload on an existing type.
Suppose you have the following method:
```csharp
bool TryGetValue<T>(
  out T value,
  TKey key = default(TKey),
  [CanBeNull] ILogger logger = null
);
```
We would like to remove the `logger` parameter. So we deprecate the method above and declare the new method.
```csharp
bool TryGetValue<T>(
  out T value,
  TKey key = default(TKey)
);
```
Now the compiler/ReSharper notifies you that there will be an ambiguity if a caller does not pass a `logger`. How to resolve this? Well, we can just remove the default value for that parameter in the obsolete method.
```csharp
bool TryGetValue<T>(
  out T value,
  TKey key = default(TKey),
  [CanBeNull] ILogger logger
);
```
But now you’ve got another problem: the parameter `logger` cannot come after the `key` parameter because it doesn’t have a default value.
So, now you’d have to move the `logger` parameter in front of the `key` parameter. This will cause a compile error in clients, which is what we were trying to avoid in the first place.
In this case, we have a couple of sub-optimal options.
Use a different name for the new API (e.g. `TryGetValueEx` à la Windows) in the next major version, then switch the name back in the version after that and finally remove the obsolete member in yet another version.
That is,
1. `TryGetValue` (with logger) is obsolete and users are told to use `TryGetValueEx` (no logger)
1. `TryGetValueEx` (no logger) is obsolete and users are told to use `TryGetValue` (no logger)
1. `TryGetValueEx` is removed.

This is a lot of work and requires three upgrades to accomplish. You really need to stay on the ball in order to get this kind of change integrated and it takes a non-trivial amount of time and effort.
We generally don’t use this method, as our customers are developers and can deal with a compile error or two, especially when it’s noted in the release notes and the workaround is fairly obvious (e.g. the `logger` parameter is just no longer required).
Published by marco on 20. Jan 2019 22:00:30 (GMT-5)
Any software product should have a version number. This article will answer the following questions about how Encodo works with them.
In decreasing order of expected expertise,
The intended audience of this document is *developers*.
The `quino` command-line tool is installed on all machines. This tool can *read* and *write* version numbers for any .NET solution, regardless of which of the many version-numbering methods a given solution actually uses.
Encodo uses semantic versions. This scheme has a strict ordering that allows you to determine which version is “newer”. It indicates pre-releases (e.g. alphas, betas, rcs) with a “minus”, as shown below.
Version numbers come in two flavors:
- `[Major].[Minor].[Patch].[Build]` (official release)
- `[Major].[Minor].[Patch]-[Label][Build]` (pre-release)
See Microsoft’s NuGet Package Version Reference for more information.
- `0.9.0-alpha34`: A pre-release of 0.9.0
- `0.9.0-beta48`: A pre-release of 0.9.0
- `0.9.0.67`: An official release of 0.9.0
- `1.0.0-rc512`: A pre-release of 1.0.0
- `1.0.0.523`: An official release of 1.0.0

The numbers are strictly ordered. The first three *parts* indicate the “main” version. The final *part* counts strictly upward.
The following list describes each of the parts and explains what to expect when it changes.
This part is also known as “Maintenance” (see [Software versioning](https://en.wikipedia.org/wiki/Software_versioning) on Wikipedia).
There will only ever be one artifact of an official release corresponding to a given “main” version number.
That is, if `1.0.0.523` exists, then there will never be a `1.0.0.524`. This is due to the fact that the build number (e.g. 524) is purely for auditing.
For example, suppose your software uses a NuGet package with version `1.0.0.523`. NuGet will not offer to upgrade to `1.0.0.524`.
There are no restrictions on the labels for pre-releases. However, it’s recommended to use one of the following:
- `alpha`
- `beta`
- `rc`
Be aware that if you choose a different label, then it is ordered alphabetically relative to the other pre-releases.
For example, if you were to use the label `prealpha` to produce the version `0.9.0-prealpha21`, then that version is considered to be higher than `0.9.0-alpha34`. A tool like NuGet will not see the latter version as an upgrade.
The name of a release branch should be the major version of that release, e.g. `release/1` for version 1.x.x.x.
The name of a pre-release branch should be of the form `feature/[label]`, where `[label]` is one of the labels recommended above. It’s also OK to use a personal branch to create a pre-release build, as in `mvb/[label]`.
A developer uses the `quino` tool to set the version.
For example, to set the version to 1.0.1, execute the following:

```
quino fix -v 1.0.1.0
```
The tool will have updated the version number in all relevant files.
The build server calculates a release’s version number as follows,
The name of the Git branch determines which kind of release to produce.
**/release/*
, then it’s an official releaseFor example,
origin/release/1
origin/production/release/new
origin/release/
release/1
production/release/new
release/
The name of the branch doesn’t influence the version number since an official release doesn’t have a label.
The label is taken from the last part of the branch name.
For example,
origin/feature/beta
yields beta
origin/feature/rc
yields rc
origin/mvb/rc
yields rc
The following algorithm ensures that the label can be part of a valid semantic version.
X
after a trailing digitX
if the label is empty (or becomes empty after having removed invalid characters)For example,
origin/feature/rc1
yields rc1X
origin/feature/linuxcompat
yields linuxcompat
origin/feature/12
yields X
Assume that,
Then,
origin/release/1
produces artifacts with version number 0.9.0.522
origin/feature/rc
produces artifacts with version number 0.9.0-rc522
The following are very concise guides for how to produce artifacts.
feature/rc
, master
)quino fix -v 1.0.2.0
release/1
)quino fix -v 1.0.2.0`
)The summary below describes major new features, items of note and breaking changes.
The... [More]
]]>Published by marco on 20. Jan 2019 21:59:55 (GMT-5)
Note: this article was originally published at Encodo.com at the end of October, 2018.
The summary below describes major new features, items of note and breaking changes.
The links above require a login.
At long last, Quino enters the world of .NET Standard and .NET Core. Libraries target .NET Standard 2.0, which means they can all be used with any .NET runtime on any .NET platform (e.g. Mac and Linux). Sample applications and testing assemblies target .NET Core 2.0. Tools like quinogenerate
and quinofix
target .NET Core 2.1 to take advantage of the standardized external tool-support there.
Furthermore, the Windows, Winform and WPF projects have moved to a separate solution/repository called Quino-Windows
.
Quino-Standard
is the core on which both Quino-Windows
and Quino-WebAPI
build.
Quino-Windows
target .NET Framework 4.6.2 because that’s the first framework that can interact with .NET Standard (and under which Windows-specific code runs).Quino-WebAPI
currently target .NET Framework 4.6.2. We plan on targeting .NET Core in an upcoming version (tentatively planned for v7).IIdentity
everywhere (deprecating ICredentials
and IUserCredentials
).6.0 is a pretty major break from the 5.x release. Although almost all assembly names have stayed the same, we had to move some types around to accommodate targeting .NET Standard with 85% of Quino’s code.
We’ve tried to support existing code wherever possible, but some compile errors will be unavoidable (e.g. from namespace changes or missing references). In many cases, R#/VS should be able to help repair these errors.
These are the breaking changes that are currently known.
IRunSettings
and RunMode
from Encodo.Application
to Encodo.Core
.Any .NET Framework executable that uses assemblies targeting .NET Standard must reference .NET Standard itself. The compiler (MSBuild
) in Visual Studio will alert you to add a reference to .NET Standard using NuGet. This applies not just to Winform executables, but also to any unit-test assemblies.
One piece that has changed significantly is the tool support formerly provided with Quino.Utils
. As of version 6, Quino no longer uses NAnt
, instead providing dotnet
-compatible tools that you can install using common .NET commands. Currently, Quino supports:
dotnet quinofix
dotnet quinogenerate
dotnet quinopack
Please see the tools documentation for more information on how to install and use the new tools.
The standalone Winforms-based tools are in the Quino-Windows
download, in the Tools.zip
archive.
Quino.Migrator
Quino.PasswordEncryptor
Quino.Utils
is no longer supported as a NuGet package.
In recent articles, we outlined a roadmap to .NET Standard and .NET Core and a roadmap for deployment and debugging. These two roadmaps taken together illustrate our plans to extend as much of Quino as possible to other... [More]
]]>Published by marco on 20. Jan 2019 21:59:29 (GMT-5)
Note: this article was originally published at Encodo.com in July, 2018.
In recent articles, we outlined a roadmap to .NET Standard and .NET Core and a roadmap for deployment and debugging. These two roadmaps taken together illustrate our plans to extend as much of Quino as possible to other platforms (.NET Standard/Core) and to make development with Quino as convenient as possible (getting/upgrading/debugging).
To round it off, we’ve made good progress on another vital piece of any framework: documentation.
We recently set up a new server to host Quino documentation. There, you can find documentation for current releases. Going forward, we’ll also retain documentation for any past releases.
We’re generating our documentation with DocFX, which is the same system that powers Microsoft’s own documentation web site. We’ve integrated documentation-generation as a build step in Quino’s nightly build on TeamCity, so it’s updated every night (Zürich time) [1].
The documentation includes conceptual documentation which provides an overview/tutorials/FAQ for basic concepts in Quino. The API Reference includes comprehensive documentation about the types and methods available in Quino.
While we’re happy to announce that we have publicly available documentation for Quino, we’re aware that we’ve got work to do. The next steps are:
Even though there’s still work to do, this is a big step in the right direction. We’re very happy to have found DocFX, which is a very comprehensive, fast and nice-looking solution to generating documentation for .NET code. [2]
In a recent article, we outlined a roadmap to .NET Standard and .NET Core. We’ve made really good progress on that front: we have a branch of Quino-Standard that targets .NET Standard for class libraries and .NET Core for... [More]
]]>Published by marco on 20. Jan 2019 21:58:53 (GMT-5)
Note: this article was originally published at Encodo.com in July, 2018.
In a recent article, we outlined a roadmap to .NET Standard and .NET Core. We’ve made really good progress on that front: we have a branch of Quino-Standard that targets .NET Standard for class libraries and .NET Core for utilities and tests. So far, we’ve smoke-tested these packages with Quino-WebApi. Our next steps there are to convert Quino-WebApi to .NET Standard and .NET Core as well. We’ll let you know when it’s ready, but progress is steady and promising.
With so much progress on several fronts, we want to address how we get Quino from our servers to our customers and users.
Currently, we provide access to a private fileshare for customers. They download the NuGet packages for the release they want. They copy these to a local folder and bind it as a NuGet source for their installations.
In order to make a build available to customers, we have to publish that build by deploying it and copying the files to our file share. This process has been streamlined considerably so that it really just involves telling our CI server (TeamCity) to deploy a new release (official or pre-). From there, we download the ZIP and copy it to the fileshare.
Encodo developers don’t have to use the fileshare because we can pull packages directly from TeamCity as soon as they’re available. This is a much more comfortable experience and feels much more like working with nuget.org directly.
The debugging story with external code in .NET is much better than it used to be (spoiler: it was almost impossible, even with Microsoft sources), but it’s not as smooth as it should be. This is mostly because NuGet started out as a packaging mechanism for binary dependencies published by vendors with proprietary/commerical products. It’s only in recent year(s) that packages are predominantly open-source.
In fact, debugging with third-party sources—even without NuGet involved—has never been easy with .NET/Visual Studio.
Currently, all Quino developers must download the sources separately (also available from TeamCity or the file-share) in order to use source-level debugging.
Binding these sources to the debugger is relatively straightforward but cumbersome. Binding these sources to ReSharper is even more cumbersome and somewhat unreliable, to boot. I’ve created the issue Add an option to let the user search for external sources explicitly (as with the VS debugger) when navigating in the hopes that this will improve in a future version. JetBrains has already fixed one of my issues in this are (Navigate to interface/enum/non-method symbol in Nuget-package assembly does not use external sources), so I’m hopeful that they’ll appreciate this suggestion, as well.
The use case I cited in the issue above is,
Developers using NuGet packages that include sources or for which sources are available want to set breakpoints in third-party source code. Ideally, a developer would be able to use R# to navigate through these sources (e.g. via F12) to drill down into the code and set a breakpoint that will actually be triggered in the debugger.
As it is, navigation in these sources is so spotty that you often end up in decompiled code and are forced to use the file-explorer in Windows to find the file and then drag/drop it to Visual Studio where you can set a breakpoint that will work.
The gist of the solution I propose is to have R# ask the user where missing sources are before decompiling (as the Visual Studio debugger does).
There is hope on the horizon, though: Nuget is going to address the debugging/symbols/sources workflow in an upcoming release. The overview is at NuGet Package Debugging & Symbols Improvements and the issue is Improve NuGet package debugging and symbols experience.
Once this feature lands, Visual Studio will offer seamless support for debugging packages hosted on nuget.org. Since we’re using TeamCity to host our packages, we need JetBrains to Add support for NuGet Server API v3 [1] in order to benefit from the improved experience. Currently, our customers are out of luck even if JetBrains releases simultaneously (because our TeamCity is not available publicly).
I’ve created an issue for Quino, Make Quino Nuget packages available publicly to track our progress in providing Quino packages to our customers in a more convenient way that also benefits from improvements to the debugging workflow with Nuget Packages.
If we published Quino packages to NuGet (or MyGet, which allows private packages), then we would have the benefit of the latest Nuget protocol/improvements for both ourselves and our customers as soon as it’s available. Alternatively, we could also proxy our TeamCity feed publicly. We’re still considering our options there.
As you can see, we’re always thinking about the development experience for both our developers and our customers. We’re fine-tuning on several fronts to make developing and debugging with Quino a seamless experience for all developers on all platforms.
We’ll keep you posted.
The title is a bit specific for this blog post, but that’s the gist of it: we ended up with a bunch of references to an in-between version of .NET (4.6.1) that was falsely advertising itself as a more optimal candidate for... [More]
]]>Published by marco on 20. Jan 2019 21:55:36 (GMT-5)
Note: this article was originally published at Encodo.com in July, 2018.
The title is a bit specific for this blog post, but that’s the gist of it: we ended up with a bunch of references to an in-between version of .NET (4.6.1) that was falsely advertising itself as a more optimal candidate for satisfying 4.6.2 dependencies. This is a known issue; there are several links to MS GitHub issues below.
In this blog, I will discuss direct vs. transient dependencies as well as internal vs. runtime dependencies.
If you’ve run into problems with an application targeted to .NET Framework 4.6.2 that does not compile on certain machines, it’s possible that the binding redirects Visual Studio has generated for you use versions of assemblies that aren’t installed anywhere but on a machine with Visual Studio installed.
How I solved this issue:
C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461\
directorybin/
and obj/
folders.vs
folder (may not be strictly necessary)<AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects>
to your project)The product should now run locally and on other machines.
For more details, background and the story of how I ran into and solved this problem, read on.
Note: I published a recent article, .NET Tips and Resources, containing a link to a video by Immo Landwerth, in which says “If you want to be compatible with .NET Core 1.5 or lower, then you can use .NET Framework 4.6.1. For .NET Standard compatibility, you should definitely use .NET Framework 4.7.2 instead.” That will probably fix the problem as well. Moving to .NET Core will also fix the problem, as all binding is handled automatically there.
What do we mean when we say that we “build” an application?
Building is the process of taking a set of inputs and producing an artifact targeted at a certain runtime. Some of these inputs are included directly while others are linked externally.
The machine does exactly what you tell it to, so it’s up to you to make sure that your instructions are as precise as possible. However, you also want your application to be flexible so that it can run on as wide an array of environments as possible.
Your source code consists of declarations. We’ve generally got the direct inputs under control. The code compiles and produces artifacts as expected. It’s the external-input declarations where things go awry.
What kind of external inputs does our application have?
How is this stitched together to produce the application that is executed?
The NuGet dependencies are resolved at build time. All resources are pulled and added to the release on the build machine. There are no run-time decisions to make about which versions of which assemblies to use.
Dependencies come in two flavors:
It is with the transient references that we run into issues. The following situations can occur:
An application generally includes an app.config
(desktop applications or services) or web.config
XML file that includes a section where binding redirects are listed. A binding redirect indicates the range of versions that can be mapped (or redirected) to a certain fixed version (which is generally also included as a direct dependency).
A redirect looks like this (a more-complete form is further below):
<bindingRedirect oldVersion="0.0.0.0-4.0.1.0" newVersion="4.0.1.0"/>
When the direct dependency is updated, the binding redirect must be updated as well (generally by updating the maximum version number in the range and the version number of the target of the redirect). NuGet does this for you when you’re using package.config
. If you’re using Package References, you must update these manually. This situation is currently not so good, as it increases the likelihood that your binding redirects remain too restrictive.
NuGet packages are resolved at build time. These dependencies are delivered as part of the deployment. If they could be resolved on the build machine, then they are unlikely to cause issues on the deployment machine.
Where the trouble comes in is with dependencies that are resolved at execution time rather than build time. The .NET Framework assemblies are resolved in this manner. That is, an application that targets .NET Framework expects certain versions of certain assemblies to be available on the deployment machine.
We mentioned above that the algorithm sometimes chooses the desired version or higher. This is not the case for dependencies that are in the assembly-binding redirects. Adding an explicit redirect locks the version that can be used.
This is generally a good idea as it increases the likelihood that the application will only run in a deployment environment that is extremely close or identical to the development, building or testing environment.
How can we avoid these pesky run-time dependencies? There are several ways that people have come up with, in increasing order of flexibility:
To sum up:
Our application targets .NET Framework (for now). We’re looking into .NET Core, but aren’t ready to take that step yet.
To sum up the information from above, problems arise when the build machine contains components that are not available on the deployment machine.
How can this happen? Won’t the deployment machine just use the best match for the directives included in the build?
Ordinarily, it would. However, if you remember our discussion of assembly-binding redirects above, those are set in stone. What if you included binding redirects that required versions of system dependencies that are only available on your build machine … or even your developer machine?
We actually discovered an issue in our deployment because the API server was running, but the Authentication server was not. The Authentication server was crashing because it couldn’t find the runtime it needed in order to compile its Razor views (it has ASP.Net MVC components). We only discovered this issue on the deployment server because the views were only ever compiled on-the-fly.
To catch these errors earlier in the deployment process, you can enable pre-compiling views in release mode so that the build server will fail to compile instead of producing a build that will sometimes fail to run.
Add <MvcBuildViews>true</MvcBuildViews> to any MVC projects in the PropertyGroup for the release build, as shown in the example below:
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
<DebugType>pdbonly</DebugType>
<Optimize>true</Optimize>
<OutputPath>bin</OutputPath>
<DefineConstants>TRACE</DefineConstants>
<ErrorReport>prompt</ErrorReport>
<WarningLevel>4</WarningLevel>
<LangVersion>6</LangVersion>
<MvcBuildViews>true</MvcBuildViews>
</PropertyGroup>
We mentioned above that NuGet is capable of updating these redirects when the target version changes. An example is shown below. As you can see, they’re not very easy to write:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<runtime>
<assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
<dependentAssembly>
<assemblyIdentity
name="System.Reflection.Extensions"
publicKeyToken="B03F5F7F11D50A3A"
culture="neutral"/>
<bindingRedirect oldVersion="0.0.0.0-4.0.1.0" newVersion="4.0.1.0"/>
</dependentAssembly>
<!-- Other bindings… -->
</assemblyBinding>
</runtime>
</configuration>
Most bindings are created automatically when MSBuild emits a warning that one would be required in order to avoid potential runtime errors. If you compile with MSBuild in Visual Studio, the warning indicates that you can double-click the warning to automatically generate a binding.
If the warning doesn’t indicate this, then it will tell you that you should add the following to your project file:
<AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects>
After that, you can rebuild to show the new warning, double-click it and generate your assembly-binding redirect.
When MSBuild generates a redirect, it uses the highest version of the dependency that it found on the build machine. In most cases, this will be the developer machine. A developer machine tends to have more versions of the runtime targets installed than either the build or the deployment machine.
A Visual Studio installation, in particular, includes myriad runtime targets, including many that you’re not using or targeting. These are available to MSBuild but are ordinarily ignored in favor of more appropriate ones.
That is, unless there’s a bit of a bug in one or more of the assemblies included with one of the SDKs…as there is with the net461 distribution in Visual Studio 2017.
Even if you are targeting .NET Framework 4.6.2, MSBuild will still sometimes reference assemblies from the 461 distribution because the assemblies are incorrectly marked as having a higher version than those in 4.6.2 and are taken first.
I found the following resources somewhat useful in explaining the problem (though none really offer a solution):
How can you fix the problem if you’re affected?
You’ll generally have a crash on the deployment server that indicates a certain assembly could not be loaded (e.g. System.Runtime). If you show the properties for that reference in your web application, do you see the path C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461 somewhere in there? If so, then your build machine is linking in references to this incorrect version. If you let MSBuild generate binding redirects with those referenced paths, they will refer to versions of runtime components that do not generally exist on a deployment machine.
Tips for cleaning up:
- Check the build output: do you see C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461 in the output? A sample warning message:
Platform:System.Collections.dll and CopyLocal:C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461\lib\System.Collections.dll. Choosing CopyLocal:C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461\lib\System.Collections.dll because AssemblyVersion 4.0.11.0 is greater than 4.0.10.0.
As mentioned above, but reiterated here, this is what I did to finally stabilize my applications:
- Remove (or rename) the C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461\ directory
- Delete the bin/ and obj/ folders
- Delete the .vs folder (may not be strictly necessary)
- Add <AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects> to your project
When you install any update of Visual Studio, it will silently repair these missing files for you. So be aware and check the folder after any installations or upgrades to make sure that the problem doesn’t creep up on you again.
Published by marco on 20. Jan 2019 21:49:13 (GMT-5)
Note: this article was originally published at Encodo.com in May, 2018.
With Quino 5, we’ve gotten to a pretty good place organizationally. Dependencies are well-separated into projects—and there are almost 150 of them.
We can use code-coverage, solution-wide-analysis and so on without a problem. TeamCity runs the ~10,000 tests quickly enough to provide feedback in a reasonable time. The tests run even more quickly on our desktops. It’s a pretty comfortable and efficient experience, overall.
As of Quino 5, all Quino-related code was still in one repository and included in a single solution file. Luckily for us, Visual Studio 2017 (and Rider and Visual Studio for Mac) were able to keep up quite well with such a large solution. Recent improvements to performance kept the experience quite comfortable on a reasonably equipped developer machine.
Having everything in one place is both an advantage and disadvantage: when we make adjustments to low-level shared code, the refactoring is applied in all dependent components, automatically. If it’s not 100% automatic, at least we know where we need to make changes in dependent components. This provides immediate feedback on any API changes, letting us fine-tune and adjust until the API is appropriate for known use cases.
On the other hand, having everything in one place means that you must make sure that your API not only works for but compiles and tests against components that you may not immediately be interested in.
For example, we’ve been pushing much harder on the web front lately. Changes we make in the web components (or in the underlying Quino core) must also work immediately for dependent Winform and WPF components. Otherwise, the solution doesn’t compile and tests fail.
While this setup had its benefits, the drawbacks were becoming more painful. We wanted to be able to work on one platform without worrying about all of the others.
On top of that, all code in one place is no longer possible with cross-platform support. Some code—Winform and WPF—doesn’t run on Mac or Linux. [1]
The time had come to separate Quino into a few larger repositories.
We decided to split along platform-specific lines.
The Quino-WebApi and Quino-Windows solutions will consume Quino-Standard via NuGet packages, just like any other Quino-based product. And, just like any Quino-based product, they will be able to choose when to upgrade to a newer version of Quino-Standard.
Part of the motivation for the split is cross-platform support. The goal is to target all assemblies in Quino-Standard to .NET Standard 2.0. The large core of Quino will be available on all platforms supported by .NET Core 2.0 and higher.
This work is quite far along and we expect to complete it by August 2018.
As of Quino 5.0.5, we’ve moved web-based code to its own repository and set up a parallel deployment for it. Currently, the assemblies still target .NET Framework, but the goal here is to target class libraries to .NET Standard and to use .NET Core for all tests and sample web projects.
We expect to complete this work by August 2018 as well.
We will be moving all Winform and WPF code to its own repository, setting it up with its own deployment (as we did with Quino-WebApi). These projects will remain targeted to .NET Framework 4.6.2 (the lowest version that supports interop with .NET Standard assemblies).
We expect this work to be completed by July 2018.
One goal we have with this change is to be able to use Quino code from Xamarin projects. Any support we build for mobile projects will proceed in a separate repository from the very beginning.
We’ll keep you posted on work and improvements and news in this area.
Customers will, for the most part, not notice this change, except in minor version numbers. Core and platform versions may (and almost certainly will) diverge between major versions. For major versions, we plan to ship all platforms with a single version number.
Published by marco on 20. Jan 2019 21:44:30 (GMT-5)
The earthli blogging format uses HTML-like formatting, described in the lengthy manual (with examples). However, Encodo’s blogging back-end now uses Umbraco, with Markdown for content. I used to be able to cross-post with ease, by copy/pasting. Now, I need to convert the content from Markdown to earthli formatting.
The following steps suffice to convert any article:
### ([^\n]+)$
=> <h level="3">\1</h>
## ([^\n]+)$
=> <h>\1</h>
\[([^!][^\]]+)\]\(([^\)]+)\)
=> <a href="\2">\1</a>
\*\*([^\*]+)\*\*
=> <b>\1</b>
_([^_]+)_
=> <i>\1</i>
```txt\n([^`]+)\n```
=> <pre>\1</pre>
```[a-z]+\n([^`]+)\n```
=> <code>\1</code>
`([^`]+)`
=> <c>\1</c>
I haven’t automated this process yet because I only rarely transfer articles.
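If I ever do automate it, the replacements above would translate almost directly into a chain of Regex.Replace calls in C#. Here’s a rough sketch (my own, untested against real articles; the patterns are copied from the list above and applied in the same order, which matters because the ### rule must run before the ## rule):

using System.Text.RegularExpressions;

public static class EarthliConverter
{
    // Applies the Markdown-to-earthli replacements from the list above, in order.
    public static string Convert(string markdown)
    {
        var result = markdown;

        result = Regex.Replace(result, @"### ([^\n]+)$", "<h level=\"3\">$1</h>", RegexOptions.Multiline);
        result = Regex.Replace(result, @"## ([^\n]+)$", "<h>$1</h>", RegexOptions.Multiline);
        result = Regex.Replace(result, @"\[([^!][^\]]+)\]\(([^\)]+)\)", "<a href=\"$2\">$1</a>");
        result = Regex.Replace(result, @"\*\*([^\*]+)\*\*", "<b>$1</b>");
        result = Regex.Replace(result, "_([^_]+)_", "<i>$1</i>");
        result = Regex.Replace(result, "```txt\n([^`]+)\n```", "<pre>$1</pre>");
        result = Regex.Replace(result, "```[a-z]+\n([^`]+)\n```", "<code>$1</code>");
        result = Regex.Replace(result, "`([^`]+)`", "<c>$1</c>");

        return result;
    }
}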
Published by marco on 8. Jan 2019 22:46:23 (GMT-5)
Updated by marco on 20. Jan 2019 11:22:48 (GMT-5)
“In practice, nearly everything you write is potentially dependent upon the order of evaluation, but in practice it isn’t because you are not a nincompoop.”
He completes the thought with “[b]ut the compiler doesn’t know that. The compiler must adhere to the letter of the language standard, because it has to compile insane code as well as sane code.”
Published by marco on 8. Jan 2019 22:28:18 (GMT-5)
The article Fear, trust and JavaScript: When types and functional programming fail presents issues in JavaScript and a solution: use another language. It lists several newer ones that are completely untested.
But the main problem that the article mentions can’t be solved 100% by any language. The main problem is at the boundaries of your application: inputs.
When you get data from an external source, you have to validate it somehow before passing it along to the rest of the application.
No language can remove this requirement. It doesn’t matter how functional, curryable, immutable or sexy it is; it just can’t do it. What you have instead is languages with more built-in mechanisms for defining types that allow the rest of the program to work safely with the data, once it’s been validated.
So if your language supports immutability and types, then you can validate that the data is OK before hydrating the object from the serialized source (e.g. JSON).
What we’re trying to avoid is unexpected runtime errors, no? Or, at the very least, we want a runtime error of a known type that precisely identifies the problem with the incoming data. That is, the data either conforms to the definition—and the definition is statically typed—or there is an error.
The desire is to push this gatekeeper/conversion to a single place so that the rest of the application works with the compiler to find errors rather than the programmer defensively checking throughout the source.
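To make that gatekeeper concrete in a statically typed language, here’s a minimal C# sketch (my own example, not from the article; the CustomerDto/Customer types and the Newtonsoft.Json dependency are just illustrative choices): deserialize into a permissive DTO at the boundary, validate it once, and hand the rest of the application an immutable type that is valid by construction.

using System;
using Newtonsoft.Json;

// Boundary type: shaped like the incoming JSON and allowed to be incomplete.
public class CustomerDto
{
    public string Name { get; set; }
    public int? Age { get; set; }
}

// Domain type: immutable and always valid once constructed.
public sealed class Customer
{
    public Customer(string name, int age)
    {
        if (string.IsNullOrWhiteSpace(name)) { throw new ArgumentException("Name is required.", nameof(name)); }
        if (age < 0) { throw new ArgumentOutOfRangeException(nameof(age)); }

        Name = name;
        Age = age;
    }

    public string Name { get; }
    public int Age { get; }
}

public static class CustomerGate
{
    // The single place where untrusted input becomes a trusted type.
    public static Customer FromJson(string json)
    {
        var dto = JsonConvert.DeserializeObject<CustomerDto>(json);
        if (dto == null || dto.Age == null)
        {
            throw new ArgumentException("Payload is missing required fields.", nameof(json));
        }

        return new Customer(dto.Name, dto.Age.Value);
    }
}

Everything downstream of FromJson() can then lean on the compiler instead of re-checking the data defensively.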
However, suggesting that PureScript or Elm or ClojureScript are somehow better at doing this than JavaScript is incorrect. Where they are better is in providing language mechanisms that allow you to precisely define the shape of the data.
Despite the author’s suggestions, they are not that much different than TypeScript. The only difference being that TypeScript chose to stay much closer to JavaScript for compatibility reasons. At the time that TypeScript came out, this was a reasonable requirement, since almost no-one wanted to move completely away from JavaScript.
Five years later and the development world is ready for other languages. With WASM (Web Assembly) as a target (instead of just JavaScript), there are more possibilities than ever.
JavaScript as a compile target is still open to runtime errors. When you use a higher-level language, you’re restricting the range of functionality that you can use in the target bytecode/machine code. That is, when you write an if-statement in C, you’re using the JMP statement, but you’re only able to JMP to certain address locations instead of anywhere in addressable memory.
It’s the same with JavaScript as a compile target. It doesn’t really matter that JavaScript allows too much—what matters is what the higher-level language allows. TypeScript may still allow too much, but it’s worlds better than JavaScript.
It’s true that PureScript or Elm or ClojureScript can close some loopholes that TypeScript leaves open. That’s fine. But if you’re going to just use JavaScript (or WASM) as a compile target, then why not choose a more-established language like C# or F#?
Published by marco on 8. Jan 2019 22:24:05 (GMT-5)
The post on Reddit called Someone asked me to make a site for them and I don’t know how the fuck I’m supposed to go about it. is about exactly what it sounds like it’s about. Amid the flurry of comments with recommendations on how to pretend he (or she) knows how to build a web site by using tools he’s (or she’s) never heard of, I chimed in with,
What is it about software that makes people who have never done it think that they can do it professionally?
What if your neighbor had heard you were a carpenter and had asked you to make a dining-room set for “good money”? Would you watch YouTube videos about how to make furniture and then charge money for the first furniture you ever made?
What about if they’d asked for a haircut/trim/style/dye? Would you just go for it, after having asked around on /r/coiffeur for a few minutes?
Or maybe they’d heard you were a chef and offered “good money” to cook their Thanksgiving dinner for them? Would you risk doing that?
Probably not, because if you’ve never done any of those things, you’re not good at them and charging for doing them can only backfire horribly.
Unless your neighbor is a sap and a fool, in which case go for it.
Published by marco on 31. Dec 2018 22:55:26 (GMT-5)
The article Deciphering The Postcard Sized Raytracer by Fabien Sanglard is a wonderfully presented breakdown of how the path tracer found on a postcard does its magic. It’s not super-fast (it takes 3 minutes to produce a much rougher version on the author’s machine). He includes his final cleaned-up source code.
It comes from the same person who made the business card ray-tracer discussed in the article Decyphering The Business Card Raytracer by Fabien Sanglard.
Published by marco on 30. Dec 2018 23:03:19 (GMT-5)
Updated by marco on 30. Dec 2018 23:03:37 (GMT-5)
The article ”Modern” C++ Lamentations by Aras Pranckevicius is a wide-ranging rant about the inefficiency of C++ template programming and the degree to which it’s inappropriate for many of the areas where C++ is used. Aras is one of the developers for the Unity game engine.
In particular, he highlights the disastrous compilation and execution speeds when using a lot of the STL. Not only that, but the debugging time is extremely slow, due to the inordinate amount of extra symbol information associated with hundreds of thousands of lines of code pulled in to implement relatively simple concepts that are standard in other languages, libraries and runtimes.
On top of it all, even the high-level C++ code isn’t very easy to read, despite the tremendous amount of abstraction.
The optimized version of C++ code has an even worse compilation time, but it has a comparable/reasonable run-time to the C/C++-style version. However, it’s very difficult to debug optimized code, which makes it doubly bad for development. Interactive development is hindered because of long compile times and, when debugging is necessary, most introspection tools don’t work (e.g. reading variables) very well. It’s the rare developer who can make headway debugging optimized code.
He compares versions of an algorithm built using “classic” C/C++ programming vs. STL programming. He then compares to C#, which compiles and runs and debugs very quickly—and is very easy to read, to boot.
The problem with C++ boils down to its approach of making “everything a library”. It’s almost like an exercise in abstraction: since a few generic-programming concepts can be used to build everything in the library rather than the language, that’s what C++ does. It’s almost as if it does it to prove that it can be done. I’m all for removing redundancy in a language, but C++ is far from such a language. It’s almost like the designers don’t use their own language.
He cites Christer Ericson (Twitter):
“Goal of programmers is to ship, on time, on budget. It’s not “to produce code.” IMO most modern C++ proponents 1) overassign importance to source code over 2) compile times, debug[g]ability, cognitive load for new concepts and extra complexity, project needs, etc. 2 is what matters.”
Aras continues discussing the future of C++ and how it is currently used in game companies, for example. These are the companies using C++ the most. Rust is making some inroads, but the area is dominated by C/C++.
Finally, he has some good advice for programmers—for any professional, really—on how to take criticism and turn it into something useful.
“Ignoring literal trolls who complain on the internet “just for the lulz”, [the] majority of complaints do have [an] actual issue or problem behind it. It might be worded poorly, or exaggerated, or whoever is complaining did not think about other possible viewpoints, but there is a valid issue behind the complaint anyway.
“What I do whenever someone complains about thing I’ve worked on, is try to forget about “me” and “work I did”, and get their point of view. What are they trying to solve, and what problems do they run into? The purpose of any software/library/language is to help their users solve the problems they have. It might be a perfect tool at solving their problem, an “ok I guess that will work” one, or a terribly bad one at that.”
As a postscript, the article It is fast or it is wrong by Nikita Tonsky discusses a very similar issue with Clojure vs. ClojureScript.
“What do ClojureScript/Google Closure compilers do for so long? They are wasting your time, that’s what. Of course it’s nobody’s fault, but in the end, this whole solution is simply wrong. We can do the same thing much faster, we have proof of that, we have the means to do it, it just happens that we are not. But we could. If we wanted to. That huge overhead you’re paying, you’re paying it for nothing. You don’t get anything from being on JS, except a 2× performance hit and astronomical build times.”
I find these points interesting because programming is very much about which tools you use and how they help you to turn your work around more quickly. I’m in charge of choosing which languages, libraries and tools we use at Encodo and I’m hyper-aware of the efficiency losses when developers are hindered by their tools or libraries. Being the lead developer of our framework Quino makes me doubly aware of this.
If you have a very slow feedback loop, then you’ll take much longer to get your work done. I remember back in the late 90s/early 2000s, working with C++, where I would have to schedule builds because it took over 30 minutes to rebuild all of my static libraries if I made a low-level change. This was on a project that cross-compiled to Mac and Windows. Instead of working on my project, I spent way too much time massaging PCH files and avoiding making low-level changes so that I could continue testing.
Bad tools that run too slowly are a problem. That’s why you should always be very careful in choosing your languages, libraries and environments. Jumping ship to the “new hotness” very often means that you’re going to have your time wasted by tools that aren’t ready for prime time.
Published by marco on 30. Dec 2018 22:12:44 (GMT-5)
If you’re a .NET developer, this is the video you’ve been looking for:
Immo tells you everything you need to know about Nuget, using Package References, switching to .NET Core, and using Assembly-Binding Redirects in .NET Framework (they’re not necessary in .NET Core). He also includes an effusive apology for the nightmare of compatibility issues that accompanied the purported interoperability between .NET 4.6.1 and .NET Core.
If you want to be compatible with .NET Standard 1.5 or lower, then you can use .NET Framework 4.6.1. For .NET Standard 2.0 compatibility, you should definitely use .NET Framework 4.7.2 instead.
He includes a list of resources for digging through open-source code and checking platform and target compatibility.
While you can use Microsoft Docs to find out which targets or platforms support which APIs, this resource lets you do it faster.
You can browse a giant list of namespaces and click on any one of them to see the types, and then drill down to properties and methods. For each level, you can see a nice list of supported targets/platforms and the assemblies to use.
You can also “Search”, which opens what looks like a terminal that lets you camel-case search for your namespace, type or member. Selecting a result takes you to the location in the catalog.
Yes, you read that correctly. I had no idea that this existed—I’ve been digging through decompiled assembly code instead. This is much faster and includes the original documentation and comments. The source is syntax-highlighted and all types, methods and properties are linked.
There’s a document explorer, namespace explorer and project manager, all linked up very nicely. You can click any element and show all references in a separate pane. Clicking one of those references navigates there—and other references in that file are also highlighted.
If that’s not sufficient, you can even download the entire source code as a ZIP file from here—complete with solution and project files so you can open it in Visual Studio for browsing.
This is a NuGet package browser combined with an API browser over all of the assemblies in a package.
It’s an open-source GitHub project, so you could even run your own copy for diffing privately published packages.
Published by marco on 16. Jul 2018 21:55:42 (GMT-5)
I just ran into an issue recently where a concrete implementation registered as a singleton was suddenly not registered as a singleton because of architectural changes.
The changes involved creating mini-applications within a main application, each of which has its own IOC. Instead of creating controllers using the main application, I was now creating controllers with the mini-application instead (to support multi-tenancy, of which more in an upcoming post).
Controllers are, by their nature, transient; a new controller is created to handle each incoming request.
In the original architecture, the concrete singleton was injected into the controller and all controller instances used the same shared instance. In the new architecture, the registration was not present in the mini-application (at first), which led to a (relatively) subtle bug: a transient and freshly created instance was injected into each new controller.
In cases where the singleton is a stateless algorithm, this wouldn’t be a logical problem at all. At the very worst, you’re over-allocating—but you probably wouldn’t notice that, either. In this case, the singleton was a settings object, configured at application startup. The configured object was still in the main application’s IOC, but not registered in the mini-application’s IOC.
Because the singleton was registered on a concrete type rather than an interface, the semantic error occurred silently instead of throwing a lifestyle-mismatch or unregistered-interface exception.
This is only one of the reasons that I recommend using interfaces as the anchoring type of an IOC registration.
To fix the issue, I did exactly this: I extracted an interface from the class and used the interface everywhere (except for the implementing type of the registration). Re-running the test caused an immediate exception rather than a strange data bug (which resulted because the default configuration in the concrete type was just correct enough to allow it to limp to a result).
To show an example, instead of the following,
application.RegisterSingle<ApiSettings>()
I used,
application.RegisterSingle<IApiSettings, ApiSettings>()
This still didn’t fix the crash because the mini-application doesn’t get that registration automatically.
I also can’t use the same registration as above because that would just create a new unconfigured ApiSettings
in each mini-application (the same as I had before, but now as a singleton). To go that route, I would have to replicate the configuration-loading for the ApiSettings
as well. And I don’t want to do that.
Instead, I just injected the IApiSettings
from the main application to the component responsible for creating the mini-application and registered the object as a singleton directly, as shown below.
public class MiniApplicationFactory
{
public MiniApplicationFactory([NotNull] IApiSettings apiSettings)
{
if (apiSettings == null) { throw new ArgumentNullException(nameof(apiSettings)); }
_apiSettings = apiSettings;
}
IApplication CreateApplication()
{
return new Application().UseRegisterSingle(_apiSettings);
}
[NotNull]
private readonly IApiSettings _apiSettings;
}
On a side note, whereas C# syntax has become more concise and powerful from version to version, I still think it has a way to go in terms of terseness for such simple objects. For such things, Kotlin and TypeScript nicely illustrate what such a syntax could look like. [1]
I mentioned above that this is only “one” of the reasons I don’t like registering concrete singletons. The other two reasons are:
I’m still waiting for C# to clean up a bit more of this syntax for me. The [NotNull]
should be a language feature checked by the compiler so that the ArgumentNullException
is no longer needed. On top of that, I’d like to see parameter properties, as in TypeScript (this is where you can prefix a constructor parameter with a keyword to declare and initialize it as a property). With a few more C#-language iterations that included non-nullable reference types and parameter properties, the example could look like the code below:
public class MiniApplicationFactory
{
public MiniApplicationFactory(private IApiSettings apiSettings)
{
}
IApplication CreateApplication()
{
return new Application().UseRegisterSingle(apiSettings);
}
}
Published by marco on 24. May 2018 22:12:33 (GMT-5)
The Quino roadmap shows you where we’re headed. How do we plan to get there?
A few years back, we made a big leap in Quino 2.0 to split up dependencies in anticipation of the initial release of .NET Core. Three tools were indispensable: ReSharper, NDepend and, of course, Visual Studio. Almost all .NET developers use Visual Studio, many use ReSharper and most should have at least heard of NDepend.
At the time, I wrote a series of articles on the migration from two monolithic assemblies (Encodo and Quino) to dozens of layered and task-specific assemblies that allow applications to include our software in a much more fine-grained manner. As you can see from the articles, NDepend was the main tool I used for finding and tracking dependencies. [1] I used ReSharper to disentangle them.
Since then, I’ve not taken advantage of NDepend’s features for maintaining architecture as much as I’d like. I recently fired it up again to see where Quino stands now, with 5.0 in beta.
But, first, let’s think about why we’re using yet another tool for examining our code. Since I started using NDepend, other tools have improved their support for helping a developer maintain code quality.
the IDisposable pattern. The Portability Analysis is essential for moving libraries to .NET Standard but doesn’t offer any insight into architectural violations like NDepend does.
With a concrete .NET Core/Standard project in the wings/under development, we’re finally ready to finish our push to make Quino Core ready for cross-platform development. For that, we’re going to need NDepend’s help, I think. Let’s take a look at where we stand today.
The first step is to choose what you want to cover. In the past, I’ve selected specific assemblies that corresponded to the “Core”. I usually do the same when building code-coverage results, because the UI assemblies tend to skew the results heavily. As noted in a footnote below, we’re starting an effort to separate Quino into high-level components (roughly, a core with satellites like Winform, WPF and Web). Once we’ve done that, the health of the core itself should be more apparent (I hope).
For starters, though, I’ve thrown all assemblies in for both NDepend analysis as well as code coverage. Let’s see how things stand overall.
The amount of information can be quite daunting but the latest incarnation of the dashboard is quite easy to read. All data is presented with a current number and a delta from the analysis against which you’re comparing. Since I haven’t run an analysis in a while, there’s no previous data against which to compare, but that’s OK.
Let’s start with the positive.
Now to the cool part: you can click anything in the NDepend dashboard to see a full list of all of the data in the panel.
Click the “B” on technical debt and you’ll see an itemized and further-drillable list of the grades for all code elements. From there, you can see what led to the grade. By clicking the “Explore Debt” button, you get a drop-down list of pre-selected reports like “Types Hot Spots”.
Click lines of code and you get a breakdown of which projects/files/types/methods have the most lines of code
Click failed quality gates to see where you’ve got the most major problems (Quino currently has 3 categories)
Click “Critical” or “Violated” rules to see architectural rules that you’re violating. As with everything in NDepend, you can pick and choose which rules should apply. I use the default set of rules in Quino.
Most of our critical issues are for mutually-dependent namespaces. This is most likely not root namespaces crossing each other (though we’d like to get rid of those ASAP) but sub-namespaces that refer back to the root and vice-versa. This isn’t necessarily a no-go, but it’s definitely something to watch out for.
There are so many interesting things in these reports:
Click the “Low” issues (Quino has over 46,000!) and you can see that NDepend analyzes your code at an incredibly low level of granularity.
Finally, there’s absolutely everything, which includes boxing/unboxing issues [7], method-names too long, large interfaces, large instances (could also be generated classes).
These are already marked as low, so don’t worry that NDepend just rains information down on you. Stick to the critical/high violations and you’ll have real issues to deal with (i.e. code that might actually lead to bugs rather than code that leads to maintenance issues or incurs technical debt, both of which are more long-term issues).
What you’ll also notice in the screenshots is that NDepend doesn’t just provide pre-baked reports: everything is based on its query language. That is, NDepend’s analysis is lightning fast (it takes only a few seconds for all of Quino), during which it builds up a huge database of information about your code that it then queries in real-time. NDepend provides a ton of pre-built queries linked from all over the UI, but you can adjust any of those queries in the pane at the top to tweak the results. The syntax is Linq to Sql and there are a ton of comments in the query to help you figure out what else you can do with it.
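For a flavor of what those queries look like, the sketch below shows roughly the shape of a typical rule. I’m reproducing the style from memory, so treat the exact property names as approximate rather than as NDepend’s canonical rule text:

// Flag methods that are getting too long (CQLinq-style rule, approximate).
warnif count > 0
from m in JustMyCode.Methods
where m.NbLinesOfCode > 30
orderby m.NbLinesOfCode descending
select new { m, m.NbLinesOfCode }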
As noted above, the amount of information can be overwhelming, but just hang in there and figure out what NDepend is trying to tell you. You can pin or hide a lot of the floating windows if it’s all just a bit too much at first.
In our case, the test assemblies have more technical debt than the code they test. This isn’t optimal, but it’s better than the other way around. You might be tempted to exclude test assemblies from the analysis, to boost your grade, but I think that’s a bad idea. Testing code is production code. Make it just as good as the code it tests to ensure overall quality.
I did a quick comparison between Quino 4 and Quino 5 and we’re moving in the right direction: the estimation of work required to get to grade A was already cut in half, so we’ve made good progress even without NDepend. I’m quite looking forward to using NDepend more regularly in the coming months. I’ve got my work cut out for me.
nant clean command. I’d moved the ndepend out folder to the common folder and our command wiped out the previous results. I’ll work on persisting those better in the future.
I generated coverage data using DotCover, but realized only later that I should have configured it to generate NDepend-compatible coverage data (as detailed in NDepend Coverage Data). I’ll have to do that and run it again. For now, no coverage data in NDepend. This is what it looks like in DotCover, though. Not too shabby:
Published by marco on 31. Mar 2018 23:28:27 (GMT-5)
The long and technical article Files are hard by Dan Luu discusses several low-level and scholarly analyses of how common file-systems and user-space applications deal with read/write errors.
File-system operations work with devices and are thus asynchronous by nature. The analyses discovered similar ordering issues as with multi-threaded code.
“The most common class of error was incorrectly assuming ordering between syscalls. The next most common class of error was assuming that syscalls were atomic2. These are fundamentally the same issues people run into when doing multithreaded programming. Correctly reasoning about re-ordering behavior and inserting barriers correctly is hard. But even though shared memory concurrency is considered a hard problem that requires great care, writing to files isn’t treated the same way, even though it’s actually harder in a number of ways.”
This is why most applications should use a framework or runtime support to access the file system. Even this might not be enough, though, if the implementation is still not robust enough for the application requirements. The .NET runtime has for quite a while now offered an API that uses async/await (i.e. a promise/future-based API), which at the very least indicates the asynchronous nature of these calls, with separate paths for success and error. This is better than nothing, even if the implementation occasionally fails to properly propagate errors (as we see with the POSIX APIs below).
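As a rough illustration (my own sketch, not from the article), this is what consuming that API looks like in practice; note that an I/O failure surfaces as an exception at the await instead of being silently swallowed:

using System.IO;
using System.Threading.Tasks;

public static class TextFileReader
{
    // Reads a file asynchronously; errors propagate as exceptions at the await.
    public static async Task<string> ReadAllTextAsync(string path)
    {
        using (var stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize: 4096, useAsync: true))
        using (var reader = new StreamReader(stream))
        {
            return await reader.ReadToEndAsync();
        }
    }
}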
At any rate, the article drives home the point that programming against file systems is hard.
“People almost always just run some tests to see if things work, rather than making sure they’re coding against what’s legal in a POSIX filesystem.”
Having a few tests is better than nothing, but it’s even better to hoist your code up as many levels of abstraction as possible and avoid having to know about how to interleave fsync
calls at all. Unless you’re writing a database or a source-control system, right?
He goes on to discuss “how much misinformation is out there” and that “it’s hard for outsiders to troll through a decade and a half of mailing list postings to figure out which ones are still valid and which ones have been obsoleted”
This is a common problem that applies not just to low-level systems programming, but to any other programming problem. We have a surfeit of choice: just search online and you’ll find something that matches what you searched.
I recently ran into this phenomenon when learning Docker. Docker has changed and improved so much that the Internet is literally littered with old and overly complicated solutions to problems that either no longer exist or that can be solved with a simple one-liner in a configuration file. If you follow the instructions you find online, it’s possible that you’ll have something that works the way you want it to, but it’s also very likely that you’ll end up with a Frankenstein’s Monster of a setup that kind of works but is fragile in unnecessary ways.
From the article:
“So far, we’ve assumed that the disk works properly, or at least that the filesystem is able to detect when the disk has an error via SMART or some other kind of monitoring. I’d always figured that was the case until I started looking into it, but that assumption turns out to be completely wrong.”
That sounds bad, of course. It’s not something we user-space programmers ever really think about, is it? You read from a file, you write to a file, it works, right? And if it doesn’t work (super-rare, right?), then the runtime throws an exception.
If we assume that the runtime throws an exception, we’re also assuming that the runtime is notified when an error occurs during a read or write operation. This was, apparently, not the case (at least in 2005-2008; we’ll see improvements below).
“In one presentation, one of the authors remarked that the ext3 code had lots of comments like “I really hope a write error doesn’t happen here” in places where errors weren’t handled. […] NTFS is somewhere in between. The authors found that it has many consistency checks built in, and is pretty good about propagating errors to the user. However, like ext3, it ignores write failures.”
Ignoring write failures! That’s kind of incredible, but if you’ve ever relied heavily on NTFS, you know that there are bugs in it. Sometimes files are just mysteriously locked and inaccessible until the system is rebooted. Why does the problem go away on reboot? NTFS is journaled and can recover its data, but it needs to be unmounted and checked. Instead of panicking, the write error is ignored. [1]
“At this point, we know that it’s quite hard to write files in a way that ensures their robustness even when the underlying filesystem is correct, the underlying filesystem will have bugs, and that attempting to repair corruption to the filesystem may damage it further or destroy it.”
The papers referenced in the first article are quite old (a decade or more) but the conclusions are still fascinating. Luu discusses the need for replicating the study and laments that “replications usually give little to no academic credit. This is one of the many cases where the incentives align very poorly with producing real world impact.”
Happily, Luu followed up with another post, called File-system error-handling that reproduces some of the original results with the 2017 versions of the file systems. This is an interesting study in its own right, discussing in detail interesting nuggets like the fact that “apfs doesn’t checksum data because “[apfs] engineers contend that Apple devices basically don’t return bogus data”.” (from APFS in Detail: Data Integrity).
The second article concludes that “Filesystem error handling seems to have improved.” Basic write errors are now propagated to user-space wherever possible (i.e. if the drive is not dead). However, “[m]ost filesystems don’t have checksums for data and leave error detection and correction up to userspace software.” This is probably something that most user-space software developers never think about, but it’s crucially important. Does your software assume that the file system will always throw an error? Or does it “just assume[…] that filesystems and disks don’t have errors”?
The first article concludes with a citation from Butler Lampson:
“Lampson suggests that the best known general purpose solution is to package up all of your parallelism into as small a box as possible and then have a wizard write the code in the box.”
This is generally a good approach for anything complicated: programmers should use as high-level an API as possible for a given task. Problems like security, memory-allocation, file-system access, networking, asynchronous/parallel programming…these all fall into that category. Generally, the advice is, as usual, to get your requirements, make components that satisfy those requirements and include automated tests that verify that the components will continue to satisfy the requirements.
As Lampson says, don’t write code that’s beyond you—get a “wizard” to write it instead. That’s what most of us do when we use the runtime provided with our language. [2]
The best you can usually do is to abstract away access to external systems (including the file system) so that you can improve behavior later, should it be required. The budget and reliability constraints of a project don’t always allow you to program perfectly safely. What you can do is to make sure that the system can be made safer later with a reasonable amount of effort. To be clear: don’t be unnecessarily sloppy, but don’t tank your project guaranteeing NASA-level safety where it’s not needed.
So what does that mean? If you’re programming on .NET, it means you should probably stay away from some constructs that you’ve previously considered safe and not worth wrapping, like File or Directory. Instead of using these directly, use them from an injected service. This level of abstraction is not difficult to enforce if introduced early in a project and will allow for improved testing anyway. If the filesystem is abstracted, components will no longer need their tests to actually write out files in order to work.
As discussed above, this isn’t to say that you jeopardize your deadline to abstract away every single file-system reference. For some applications, file-system access is so intrinsic as to be un-mockable (e.g. databases, source-control, etc.). However, your application is probably not one of those. It’s likely that your application reads/writes files in a highly localizable manner that could be wrapped in a simple component.
This advice is similar to the by-now common practice of not using the global DateTime.UtcNow. How can this be a problem? Well, if code uses an IClock component instead, then tests can adjust “now” to be a point in the past or future and test scheduling components more easily. It’s an easy pattern to follow in new code that pays for itself the first time you need to reproduce a timing problem.
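A minimal sketch of both abstractions follows (the interface and class names are mine, invented for illustration; real frameworks have their own variants):

using System;
using System.IO;

// Components depend on these interfaces instead of File/Directory/DateTime.UtcNow,
// so tests can substitute an in-memory file system or a fixed clock.
public interface IClock
{
    DateTime Now { get; }
}

public interface IFileData
{
    string ReadAllText(string path);
    void WriteAllText(string path, string contents);
}

public class SystemClock : IClock
{
    public DateTime Now => DateTime.UtcNow;
}

public class PhysicalFileData : IFileData
{
    public string ReadAllText(string path) => File.ReadAllText(path);

    public void WriteAllText(string path, string contents) => File.WriteAllText(path, contents);
}

In tests, a fixed clock or an in-memory IFileData takes their place; production code registers the physical implementations in the IOC.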
At the end of the second article, there’s an interesting discussion of how to avoid these kind of bugs—or just bugs, in general.
“There’s a very old debate over how to prevent things like this from accidentally happening.”
Better “tools or processes”? Be “better programmers”? Are tools like guardrails? Does it make sense to keep driving, bashing back and forth across the road, but happy that the guardrails are keeping us on the road at all? Would you do that in a car?
Well, no.
But, yes, if that’s the best option? What’s the other option? Just stop the car and don’t go anywhere anymore? Or get out and walk?
That analogy has been beaten to death—and I don’t think it’s very appropriate (as you can see from my discussion about abstraction above). Tools and processes are better than nothing. Proper programming practices and patterns are, as well. If you train yourself to use tried-and-true patterns, then you automatically avoid common errors.
The point isn’t to be able to say that “there are no bugs”; it’s to be able to say that “these tested bugs won’t happen”. The point is to use practices that avoid whole classes of problems.
“Even better than a static analysis tool would be a language that makes it harder to accidentally forget about checking for an error.”
And now we come to the justification for some of the newer languages out there. Rust is such a language, which attempts to fix many of the shortcomings of C and C++ in the domain of allocating, sharing, modifying and freeing memory.
For error-handling, the article The Error Model by Joe Duffy discusses a very interesting and promising approach taken by a Microsoft Research team with Midori, a 100%-managed version of Windows. The basic insight is to separate bugs from recoverable errors and unrecoverable errors.
A bug is something the user-space application did wrong (e.g. passing a null reference to a method that expects only non-null references). A recoverable error is a validation error encountered when processing user input. An unrecoverable error is a file-read error in a base configuration file or a stack overflow or an out-of-memory error.
For almost all software, file-system errors are something that should just be considered an unrecoverable error. There is no reason why most applications should attempt to continue when e.g. the main configuration cannot be loaded. Most applications don’t even need to be able to recover from that. The problem occurs so rarely that you should just get a file out of backup.
Lower-level applications like Git or PostgreSql have to take more care to deal with file-system errors [5], but your software most likely doesn’t need to handle them. As discussed above, be aware that they can happen, abstract your code from the file-system so you can test error situations and improve handling where needed, but fail fast unless your project has a requirement to be able to recover in error conditions.
Generally, no-one expects a user-space application to include robust file-recovery. It’s expected, though, that the application detects when something is wrong and reports it, failing fast rather than just limping along and corrupting data.
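As a final sketch (again my own, not from the article), failing fast at startup can be as simple as refusing to continue when a required file can’t be read:

using System;
using System.IO;

public static class StartupConfiguration
{
    // Loads a required configuration file; if it can't be read, the application
    // reports the problem and stops instead of limping along with defaults.
    public static string LoadRequiredText(string path)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (Exception exception) when (exception is IOException || exception is UnauthorizedAccessException)
        {
            throw new InvalidOperationException(
                $"Cannot start: required configuration '{path}' could not be read.", exception);
        }
    }
}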
NULL bytes after certain catastrophic operations.
Published by marco on 14. May 2017 21:38:17 (GMT-5)
Updated by marco on 15. May 2017 08:36:05 (GMT-5)
.NET Standard 2.0 is finally publicly available as a preview release. I couldn’t help myself and took a crack at converting parts of Quino to .NET Standard just to see where we stand. To keep me honest, I did all of my investigations on my MacBook Pro in MacOS.
I installed Visual Studio for Mac, the latest JetBrains Rider EAP and .NET Standard 2.0-preview1. I already had Visual Studio Code with the C#/OmniSharp extensions installed. Everything installed easily and quickly and I was up-and-running in no time.
Armed with 3 IDEs and a powerful command line, I waded into the task.
Quino is an almost decade-old .NET Framework solution that has seen continuous development and improvement. It’s quite modern and well-modularized, but we still ran into considerable trouble when experimenting with .NET Core 1.1 almost a year ago. At the time, we dropped our attempts to work with .NET Core, but were encouraged when Microsoft shifted gears from the extremely low–surface-area API of .NET Core to the more inclusive though still considerably cleaned-up API of .NET Standard.
Since it’s an older solution, Quino projects use the older csproj file-format: the one where you have to whitelist the files to include. Instead of re-using these projects, I figured a good first step would be to use the dotnet
command-line tool to create a new solution and projects and then copy files over. That way, I could be sure that I was really only including the code I wanted—instead of random cruft generated into the project files by previous versions of Visual Studio.
The dotnet Command
The dotnet command is really very nice and I was able to quickly build up a list of core projects in a new solution using the following commands:
dotnet new sln
dotnet new classlib -n {name}
dotnet add reference {../otherproject/otherproject.csproj}
dotnet add package {nuget-package-name}
dotnet clean
dotnet build
That’s all I’ve used so far, but it was enough to investigate this brave new world without needing an IDE. Spoiler alert: I like it very much. The API is so straightforward that I don’t even need to include descriptions for the commands above. (Right?)
Everything really seems to be coming together: even the documentation is clean, easy-to-navigate and has very quick and accurate search results.
- Encodo.Core compiles (almost) without change. The only change required was to move project-description attributes that used to be in the AssemblyInfo.cs file to the project file instead (where they admittedly make much more sense). If you don’t do this, the compiler complains about “[CS0579] Duplicate ‘System.Reflection.AssemblyCompanyAttribute’ attribute” and so on.
- Encodo.Expressions references Windows.System.Media for Color and the Colors constants. I changed those references to System.Drawing and Color, respectively—something I knew I would have to do.
- Encodo.Connections references the .NET-Framework–only WindowsIdentity. I will have to move these references to an Encodo.Core.Windows project and move creation of the CurrentCredentials, AnonymousCredentials and UserCredentials to a factory in the IOC.
- Quino.Meta references the .NET-Framework–only WeakEventManager. There are only two references and these are used to implement a CollectionChanged feature that is nearly unused. I will probably have to copy/implement the WeakEventManager for now until we can deprecate those events permanently.
- Quino.Data depends on Quino.Meta.Standard, which references System.Windows.Media (again) as well as a few other things. The Quino.Meta.Standard potpourri will have to be split up.
So far, porting to .NET Standard is a much more rewarding process than our previous attempt at porting to .NET Core.
At this point, I had a shadow copy of a bunch of the core Quino projects with new project files as well as a handful of ad-hoc changes and commented code in the source files. While OK for investigation, this was not a viable strategy for moving forward on a port for Quino.
I want to be able to work in a branch of Quino while I further investigate the viability of:
To test things out, I copied the new Encodo.Core
project file back to the main Quino workspace and opened the old solution in Visual Studio for Mac and JetBrains Rider.
Visual Studio for Mac says it’s a production release, but it stumbled right out of the gate: it failed to compile Encodo.Core
even though dotnet build
had compiled it without complaint from the get-go. Visual Studio for Mac claimed that OperatingSystem was not available. However, according to the documentation, OperatingSystem
is available for .NET Standard—but not in .NET Core. My theory is that Visual Studio for Mac was somehow misinterpreting my project file.
Update: After closing and re-opening the IDE, though, this problem went away and I was able to build Encodo.Core
as well. Shaky, but at least it works now.
Unfortunately, working with this IDE remained difficult. It stumbled again on the second project that I changed to .NET Standard. Encodo.Core
and Encodo.Expressions
both have the same framework property in their project files—<TargetFramework>netstandard2.0</TargetFramework>
—but, as you can see in the screenshot to the left, both are identified as .NETStandard.Library but one has version 2.0.0-preview1-25301-01 and the other has version 1.6.1. I have no idea where the second version number is coming from—it looks like this IDE is mashing up the .NET Framework version and the .NET Standard versions. Not quite ready for primetime.
Also, the application icon is mysteriously the bog-standard MacOS-app icon instead of something more…Visual Studio-y.
JetBrains Rider built the assembly without complaint, just as dotnet build
did on the command line. Rider didn’t stumble as hard as Visual Studio for Mac, but it also had problems building projects after the framework had changed. On top of that, it wasn’t always so easy to figure out what to do to get the framework downloaded and installed. Rider still has a bit of a way to go before I would make it my main IDE.
I also noticed that, while Rider’s project/dependencies view accurately reflects .NET Standard projects, the “project properties” dialog shows the framework version as just “2.0”. The list of version numbers makes this look like I’m targeting .NET Framework 2.0.
Additionally, Rider’s error messages in the build console are almost always truncated. The image to the right is of the IDE trying to inform me that Encodo.Logging
(which was still targeting .NET Framework 4.5) cannot reference Encodo.Core
(which references NET Standard 2.0). If you copy/paste the message into an editor, you can see that’s what it says. [1]
I don’t really know how to get Visual Studio Code to do much more than syntax-highlight my code and expose a terminal from which I can manually call dotnet build
. They write about Roslyn integration where “[o]n startup the best matching projects are loaded automatically but you can also choose your projects manually”. While I saw that the solution was loaded and recognized, I never saw any error-highlighting in VS Code. The documentation does say that it’s “optimized for cross-platform .NET Core development” and my projects targeted .NET Standard so maybe that was the problem. At any rate, I didn’t put much time into VS Code yet.
- Encodo.Core already works and there are only minor adjustments needed to be able to compile Encodo.Expressions and Quino.Meta.
- Quino.Schema, Quino.Data.PostgreSql, Encodo.Parsers.Antlr and Quino.Web: with this core, we’d be able to run the WebAPI server we’re building for a big customer on a Mac or a Linux box.
I’ll keep you posted. [2]
Encodo.Expressions.AssemblyInfo.cs(14, 12): [CS0579] Duplicate ‘System.Reflection.AssemblyCompanyAttribute’ attribute
Microsoft.NET.Sdk.Common.targets(77, 5): [null] Project ‘/Users/marco/Projects/Encodo/quino/src/libraries/Encodo.Core/Encodo.Core.csproj’ targets ‘.NETStandard,Version=v2.0’. It cannot be referenced by a project that targets ‘.NETFramework,Version=v4.5’.
Encodo.Core (NETStandard2.0) cannot be used from Encodo.Expressions (Net462), which doesn’t seem right, but I’m not going to fight with it on this machine anymore. I’m going to try it on a fully updated Windows box next—just to remove the Mono/Mac/NETCore/Visual Studio for Mac factors from the equation. Once I’ve got things running on Windows, I’ll prepare a NETStandard project-only solution that I’ll try on the Mac.
Published by marco on 1. May 2017 21:42:56 (GMT-5)
Updated by marco on 1. May 2017 22:01:15 (GMT-5)
I announced almost exactly one year ago that I was rewriting the Encodo C# Handbook. The original was published almost exactly nine years ago. There were a few more releases as well as a few unpublished chapters.
I finally finished a version that I think I can once again recommend to my employees at Encodo. The major changes are:
Here’s the introduction:
“The focus of this document is on providing a reference for writing C#. It includes naming, structural and formatting conventions as well as best practices for writing clean, safe and maintainable code. Many of the best practices and conventions apply equally well to other languages.”
Check out the whole thing (GitHub)! Or download the PDF that I included in the repository.
Published by marco on 4. Mar 2017 20:20:22 (GMT-5)
I recently fixed a bug in some TypeScript code that compiled just fine—but it looked for all the world like it shouldn’t have.
tl;dr: there is no TypeScript compiler bug, but my faith in the TypeScript language’s type model is badly shaken.
The following code compiles—and well it should.
interface IB {
  name: string;
}

interface IA {
  f(action: (p: IB) => void): IA;
}

class A implements IA {
  f = (action: (p: IB) => void): IA => {
    return this;
  }
}
Some notes on this example:
- IB isn’t relevant to the discussion.
- The purpose of IA is to require implementors to define a method named f that takes a single parameter of type (p: IB) => void and returns IA.
- The class A above satisfies this requirement. It doesn’t do anything with parameter action, but that’s OK.
- A.f() is what a naive user of TypeScript would assume was the only way of satisfying the requirement from IA.
However, the following implementations of IA
also compile.
class A2 implements IA {
  f = (action: () => IB): IA => {
    return this;
  }
}

class A3 implements IA {
  f = (action: (p: IB) => IB): IA => {
    return this;
  }
}

class A4 implements IA {
  f = (action: () => void): IA => {
    return this;
  }
}

class A5 implements IA {
  f = (): IA => {
    return this;
  }
}
The only one I tried that doesn’t compile is shown below.
class A6 implements IA {
  f = (action: (p: number) => void): IA => {
    return this;
  }
}
In this case, the TypeScript compiler rightly shows the following error:
Hovering over the class name A5
shows the following tooltip:
Class ‘A5’ incorrectly implements interface ‘IA’. Types of property ‘f’ are incompatible. Type ‘(action: (p: number) => void) => IA’ is not assignable to type ‘(action: (p: IB) => void) => IA’. Types of parameters ‘action’ and ‘action’ are incompatible. Type ‘(p: IB) => void’ is not assignable to type ‘(p: number) => void’. Types of parameters ‘p’ and ‘p’ are incompatible. Type ‘number’ is not assignable to type ‘IB’.
To summarize, the following types seem to be compatible with (p: IB) => void:
- () => IB
- (p: IB) => IB
- () => void
In a more strongly typed language like C#, it’s clear that none of this would fly. But this is TypeScript, which defines its typing model on compatibility with the dynamic language JavaScript.
It almost looks like the type of the lambda isn’t part of the type signature of the method, which came as quite a surprise to me (and also to my colleague, Urs, who is much more of a TypeScript expert than I am).
But maybe we don’t know enough about the TypeScript type system. Let’s look at the Type compatibility documentation for TypeScript.
This section starts off with a “Note on Soundness”, which suggests that what we have above is completely valid TypeScript.
“The places where TypeScript allows unsound behavior were carefully considered, and throughout this document we’ll explain where these happen and the motivating scenarios behind them.”
The section Comparing two functions starts off explaining some rather surprising things about the type-compatibility of functions: for a function to be type-compatible with another function, the types of its parameters must match the types of the target type’s parameters, but the number of parameters doesn’t have to match. So if the target type has 4 parameters and the lambda to assign has 0 parameters, that lambda is compatible.
From the manual:
let x = (a: number) => 0;
let y = (b: number, s: string) => 0;
y = x; // OK
x = y; // Error
For return types, the matching behavior is opposite. That is, a “bigger” type that satisfies the expected return type is just fine.
let x = () => ({name: "Alice"});
let y = () => ({name: "Alice", location: "Seattle"});
x = y; // OK
y = x; // Error because x() lacks a location property
Armed with this new knowledge, let’s see if the previously bizarre-seeming behavior is actually valid.
To recap, the TypeScript compiler says that the following signatures are compatible with f((p: IB) => void): IA:
- f(() => IB): IA: this is compatible because the zero parameters conform by definition and any return type is OK because void is expected.
- f((p: IB) => IB): IA: this is compatible because the single parameter conforms and any return type is OK because void is expected.
- f(() => void): IA: this is compatible because the zero parameters conform by definition and any return type is OK because void is expected.
- f(): IA: this one looks plain wrong at first, but the same logic applies to the whole function f((p: IB) => void): IA instead of to the lambda parameter for it. The interface expects a function f with a single parameter, returning IA. By the first rule above, a function with zero parameters satisfies that requirement.
- f((number) => void): IA: this does not satisfy the requirement because number is not compatible with IB.
- f(number): IA: this does not satisfy the requirement because number is not compatible with (p: IB) => void.
- f(): void: this does not satisfy the requirement because, while zero parameters is OK, the type void is smaller than IA.
Well, it looks like there’s nothing to see here, folks. The compiler is doing exactly what it’s supposed to. Move along and get on with your day.
Unfortunately, that means that TypeScript is going to be considerably less helpful for ensuring program correctness than I’d previously thought.
In fact, the caveat about TypeScript “allow[ing] unsound behavior [in] carefully considered [places]” seems a bit disingenuous because, to a programmer accustomed to something like C# or Java or Swift, this kind of type-enforcement for method compatibility cannot be relied upon to enforce much of anything.
When I read OOSC2 (Amazon) a long time ago [1], I remember how Bertrand Meyer made the distinction between the formal type of an argument (the type in the method signature) and the actual type of an argument (the runtime type).
The method-type–conformance rules for TypeScript make sense for actual arguments. They ensure compatibility with JavaScript. What’s not clear to me is why this same logic should be applied to formal arguments that are only available in TypeScript. If I declare a specific type signature in an interface, what are the odds that I want the wishy-washy JavaScript-friendly type rules for those situations? From an architect’s point of view, it would certainly be nicer to have more strict type-checking for formal definitions.
Since we don’t have that, this very lenient type-compatibility renders type-checking for lambdas largely useless in interface declarations. The compiler won’t be able to tell you that your implementation no longer matches the interface declaration because almost anything you write will actually match.
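For contrast, here is roughly how the same shape plays out in C# (my own sketch, not from the article): an implementation whose delegate parameter doesn’t match the interface exactly simply doesn’t compile.
using System;

public interface IB { string Name { get; } }

public interface IA
{
    IA F(Action<IB> action);
}

public class A : IA
{
    // Matches the interface exactly: compiles.
    public IA F(Action<IB> action) => this;
}

public class A4 : IA
{
    // Does not compile (CS0535): 'A4' does not implement 'IA.F(Action<IB>)';
    // a parameterless Action is a different delegate type, not a compatible one.
    public IA F(Action action) => this;
}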
Engineering You
Martin Thompson — Video
The keynote was about our place in... [More]
]]>Published by marco on 4. Mar 2017 00:06:33 (GMT-5)
Updated by marco on 4. Mar 2017 12:07:35 (GMT-5)
Encodo presented a short talk at Voxxed Days 2017 this year, called The truth about code reviews. Sebastian and I also attended the rest of the conference. The following is a list of notes and reactions to the talks.
Engineering You
Martin Thompson — Video
The keynote was about our place in the history of software engineering. Martin described us more as alchemists than engineers right now, a sentiment with which I can only agree. There is too little precision, too little reproducibility and too little focus on safety for us to qualify as engineers.
He gave as an example the pride with which car companies brag about the hundreds of millions of lines of code they have running in software in their cars: a claim that should send shivers down your spine. We know how this software is written and how it is tested.
Quino has fewer than 100,000 lines of code (about 85,000, at least 15% of which is obsolete) and we’ve been building that for almost 10 years. How a company whose main business is building automobiles guarantees safety and correctness of 300 million lines of code is beyond my comprehension. I would venture that they don’t.
Highly recommended talk. Very interesting. Lots of good history mixed with common-sense recommendations, like the following:
References:
He discussed a proof-of-concept transport-tracking application. Uses the SBB REST API for vehicle positions (using the same API as exposed for the app). Then there is the OpenData Transport API for station-board information, which provides details about delays. Everything is available as JSON with relatively straightforward data models.
Uses Kafka to handle this real-time data pipeline (kind of like Chronicle, RabbitMQ or EasyMQ, but from Apache). The pipeline includes reformatting the data into the desired format (mostly eliding unwanted data), then shipping it through LogStash into ElasticSearch, which allows easy querying of the stored data. This type of data isn’t fundamentally relational, so a document-based store is appropriate.
The transformation also involves extrapolating the data that you’re interested in from the data you obtained. For example, determining whether a train is stopped. E.g. are there x events with the same position? Is the position near a station?
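As a rough illustration of that kind of extrapolation (my own C# sketch with hypothetical types; the project itself used Scala):
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event type: one position report for one vehicle.
public record PositionEvent(string VehicleId, double Lat, double Lon, DateTime Timestamp);

public static class TrainState
{
    // A train counts as "stopped" if its last few reports all lie within a small radius.
    public static bool IsStopped(IReadOnlyList<PositionEvent> recent, int window = 5, double toleranceMeters = 10)
    {
        if (recent.Count < window)
        {
            return false;
        }

        var last = recent.TakeLast(window).ToList();
        var first = last[0];

        return last.All(e => DistanceMeters(first, e) < toleranceMeters);
    }

    // Rough equirectangular approximation; good enough at the scale of a few meters.
    private static double DistanceMeters(PositionEvent a, PositionEvent b)
    {
        const double earthRadius = 6_371_000;
        var dLat = (b.Lat - a.Lat) * Math.PI / 180;
        var dLon = (b.Lon - a.Lon) * Math.PI / 180 * Math.Cos(a.Lat * Math.PI / 180);
        return earthRadius * Math.Sqrt(dLat * dLat + dLon * dLon);
    }
}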
It was developed in Scala with Akka actors as well as the Play framework for REST. They represented all stations and trains with actors (objects). The actors are async and can run on any number of machines.
After that comes Cassandra? Are they trying to use every possible technology? I’m losing track over here. Deployment on Docker. Also uses Zookeeper in another container for load-balancing/redundancy. OMG buzzwords.
He asks: Why not a single application on a single server? Classic Java on Tomcat? It doesn’t scale. It can only scale up, but not out. The actual solution feels like a lot of moving parts, but each part does a compartmentalized task, handing off to the next piece. It ends up being quite lightweight, using very little CPU overall.
The simple, one-use components scale natively and relatively easily (LogStash, streaming, docker). The app server using Akka can be scaled, but it’s here that you have to invest time to use the available fallback and clustering strategies.
To render the data on the map, they used React to manage the data and d3.js to render. React is fast and scalable (but as Encodo has also discovered, that’s not free either). Also, the client-side CPU usage is not insignificant, even with a lot of nodes.
He also discussed UX and UI with tests. How to visualize possibly overlapping and differently sized elements at different zoom levels.
Used Jupyter to analyze data and produce graphs.
Conclusion: offload the parts of your application that aren’t your core problem to external software and services. Things like managing data streams, transforming data, etc. Focus on your models and analyzing your data.
Functional data structures in Java
Oleg Šelajev — Video
He discussed how to build reusable structures that don’t share mutable state (non-imperative vs. functional).
A void return type is a “code smell” because the only reason to call such a method is to cause a side effect. Prefer pure methods.
Any discussion of data-structure design/implementation will naturally involve balancing performance vs. storage. The safety is baked-in, but performance is always a concern when working with immutable data structures, most especially when changing them.
Even though the average call time for a method is nearly constant (as with most mutable structures), what if you call too many expensive operations and skew the average in real-world use? Well, you can combat this by leveraging the cachability of your collections (as defined above) as a way of memoizing (a well-known performance-optimization technique which carries with it possibly higher storage costs if you can’t share the memoized instances very much.)
In some cases, you can reason about performance in the following way: if you get to a situation where you would have to do an expensive operation (e.g. the reverse implicit in balancing head/tail of a queue), you can only get to this situation by having done n cheap operations first. So it is proven that the average is still constant time.
Destructive behavior (like dequeue) looks different than with a mutable data structure. In those cases, the operation returns both the removed element as well as a reference to the queue that represents the new state of the queue.
Tuple<T, Queue<T>> Dequeue();
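A minimal sketch of that two-list queue in C# (my own illustration, not from the talk): Enqueue pushes onto the back stack; Dequeue serves from the front stack and only reverses the back stack when the front runs dry, which is exactly the amortized argument above.
using System;
using System.Collections.Immutable;

// Persistent FIFO queue built from two immutable stacks (front/back).
public sealed class PersistentQueue<T>
{
    public static readonly PersistentQueue<T> Empty =
        new PersistentQueue<T>(ImmutableStack<T>.Empty, ImmutableStack<T>.Empty);

    private readonly ImmutableStack<T> _front;
    private readonly ImmutableStack<T> _back;

    private PersistentQueue(ImmutableStack<T> front, ImmutableStack<T> back)
    {
        _front = front;
        _back = back;
    }

    public PersistentQueue<T> Enqueue(T value) => new PersistentQueue<T>(_front, _back.Push(value));

    public (T Value, PersistentQueue<T> Rest) Dequeue()
    {
        var front = _front;
        var back = _back;

        if (front.IsEmpty)
        {
            // The expensive reverse only happens after n cheap enqueues.
            while (!back.IsEmpty)
            {
                back = back.Pop(out var item);
                front = front.Push(item);
            }
        }

        if (front.IsEmpty)
        {
            throw new InvalidOperationException("Queue is empty.");
        }

        front = front.Pop(out var value);
        return (value, new PersistentQueue<T>(front, back));
    }
}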
For maps, you need a concept called Zip
that lets you quickly build a representation of the structure where the element viewed at a particular point in an existing structure is different. So even when a desired mutation would require alteration of a lot of the underlying structure, this operation allows reuse of a lot more of the structure than would otherwise be possible. The node can point to different parent and child nodes, referencing the new part of the structure while embedded in as much of the prior version as possible.
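My own back-of-the-envelope C# sketch of that path-copying idea (not from the talk): updating one key rebuilds only the nodes on the search path and shares every untouched subtree with the previous version.
// Persistent (immutable) binary search tree: SetValue copies only the path from the root.
public sealed class TreeNode
{
    public TreeNode(int key, string value, TreeNode left = null, TreeNode right = null)
    {
        Key = key;
        Value = value;
        Left = left;
        Right = right;
    }

    public int Key { get; }
    public string Value { get; }
    public TreeNode Left { get; }
    public TreeNode Right { get; }

    public TreeNode SetValue(int key, string value)
    {
        if (key == Key)
        {
            // A new node for this key; both subtrees are shared with the old version.
            return new TreeNode(Key, value, Left, Right);
        }

        if (key < Key)
        {
            var newLeft = Left == null ? new TreeNode(key, value) : Left.SetValue(key, value);
            return new TreeNode(Key, Value, newLeft, Right);
        }

        var newRight = Right == null ? new TreeNode(key, value) : Right.SetValue(key, value);
        return new TreeNode(Key, Value, Left, newRight);
    }
}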
“Object-oriented programming makes it easier to reason about moving parts. Functional programming makes it easier to minimize moving parts.”
References:
Does diversity really matter?
Sombra González and Brigitte Hulliger — Video
This talk began by posing the following questions to the audience.
Good questions. Good topic. Mostly well-presented, although the middle dragged a bit: Sombra envisioned a (near-)future where women are the same as men in a tech world, a meritocracy. It didn’t add very much.
As with everywhere else, the software industry has to figure out how to deal with long maternity leaves. Some countries have introduced “rainbow” leaves, which allow sharing of the time between partners, so if the partner is male, the industry has to deal with male absence as well. That will probably help increase acceptance of female leave, as it removes the distinction.
For small companies, these kinds of extended leaves are a big hurdle because we can’t so easily absorb so much missing capacity.
We haven’t improved at all in the last quarter-century: there have been proportionally fewer women in technical software positions every year since 1991. The quit rate is much higher (41%) than for men (17%). This is not primarily due to family concerns, though. It’s mostly due to women not feeling comfortable in an industry where they’re often the only female in a meeting, on a team or in a company.
Reference:
The truth about code reviews
Sebastian Greulach — Video
This talk is a reduced version of the code-review talk that Sebastian has been doing for Encodo Systems in both English and German over the last year.
The presentation includes some statistics about the value of code reviews, a discussion of which benefits you can expect to get, which types of reviewers are likely to yield which benefits as well as Encodo’s approach and advice for integrating code reviews into your development process.
This was the most informative and amazing presentation at the entire show. All kidding aside, the room was packed and the ratings were quite good. There seemed to be a lot of interest in process.
Reference:
This guy was supremely entertaining. He is the undisputed master of the animated and reaction GIF in presentations. Informative, spirited and very funny.
…(var), and the tuple elements are unnamed (p1, p2, etc.). C# 6 is still like this, but C# 7 introduced named items for anonymous tuples.
…the UPDATE statement in his projects, where he can), and then you basically have an immutable data structure in a separate process with a really powerful and efficient query language over the graph. [3]
References:
A practical introduction to Category Theory
Daniela Sfregola — Video
Category theory is about Monads, examples of which are Option, Try and Future (promise).
The example she uses shows how to apply category-theory constructs to data-validation. The examples are in Scala, although the API that she presents looks very similar to the terminology used in Java’s Streams API, e.g. flatMap(). That’s SelectMany() for C# developers. Similarly, Option is Nullable, although I can’t think of the type analog for Some or None.
Her validation example is well-made, going from returning an Option, which is no better than a Boolean. Then she shows an Either, but that doesn’t allow for having both sides wrong. This can be done with Either, but it’s painful. That’s why we invented pattern-matching (now available in C# 7). Finally, she introduced Validated, which is capable of returning a list of errors. “Focus on how things compose.”
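The same idea translates readily to C#. Here is a tiny sketch of my own (not her Scala code) of a Validated-style result that accumulates errors instead of stopping at the first one:
using System;
using System.Collections.Generic;
using System.Linq;

// Either a value or a list of errors; combining two failures concatenates the errors.
public sealed class Validated<T>
{
    private Validated(T value, IReadOnlyList<string> errors)
    {
        Value = value;
        Errors = errors;
    }

    public T Value { get; }
    public IReadOnlyList<string> Errors { get; }
    public bool IsValid => Errors.Count == 0;

    public static Validated<T> Valid(T value) => new Validated<T>(value, Array.Empty<string>());
    public static Validated<T> Invalid(params string[] errors) => new Validated<T>(default(T), errors);
}

public static class Validated
{
    // Combine two independent validations; errors from both sides are kept.
    public static Validated<(TA, TB)> Zip<TA, TB>(Validated<TA> a, Validated<TB> b)
    {
        return a.IsValid && b.IsValid
            ? Validated<(TA, TB)>.Valid((a.Value, b.Value))
            : Validated<(TA, TB)>.Invalid(a.Errors.Concat(b.Errors).ToArray());
    }
}
With hypothetical validators, Validated.Zip(ValidateName(name), ValidateAge(age)) would then report a bad name and a bad age at the same time.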
The talk was quite short and didn’t introduce much new. The pattern-matching syntax in Scala is a bit wordy.
g º f patterns
Mario Fusco — Video
Since my previous talk was done early, I joined Sebastian in this one. I saw only the tail-end of it, but man are the Streams libraries still really wordy. Welcome to functional programming, Java! Still, I’m disappointed that I can’t use streams() in the Android project I’m working on because it requires Java 8, which forces API level 24, which excludes a lot of devices.
Sebastian said the talk was pretty good.
What about CSS? Progressive Enhancement and CSS
Ire Aderinokun — Video
Rules:
WTF is the squirrel browser? (It turns out it’s UC Browser, popular in China.) Or the one with the strange globe? (Maybe Flock? Not sure.) Does Opera really have higher market-share than IE? Probably globally, right? Phone browser in India/China/etc.
She showed a really cool graph of how many hours you have to work to use 500MB of data. Germany: 1h, Brazil: 56h, US: 6h. Bandwidth matters. A lot. WWW != Wealthy Western Web ammirite?
…<main> or <header>.
More rules:
…vertical-align is ignored when flexing is enabled.)
What about the future of the web? VR? Old devices handed down from the 1st to the 3rd world.
I asked about testing that the progressive enhancements work as programmed, but no-one had any new ideas for testing. Manual testing is the only way to verify that the enhancements and fallbacks work.
References:
I just hacked your app!
Marcos Placona — Video
He started off the talk as a bandit, reverse-engineering a Base64-encoded name/password. He used Charles to get MITM. It was a nice trick, and it probably works on a lot of devices and apps.
It’s very easy to make a hackable application if you don’t think about security. He uses a nice word-definition slide with pronunciation and usage to make it look all official.
CertificatePinner()
Published by marco on 6. Feb 2017 00:10:55 (GMT-5)
As Microsoft did a couple of years ago, Apple’s language designers are also designing the next version of Swift in public. [1] One example of the new design is the discussion of String Processing For Swift 4 (GitHub). If you read through the relatively long document, you can at least see that they’re giving the API design a tremendous amount of thought.
There are so many factors to weigh when building the API, especially for a low-level construct like String.
- …the String API with a bunch of overloads? (E.g. the discussion of storage for sub-strings.)
- …(Strings are actually structs rather than classes.)
- …Array?
- Should String be a Collection? If so, what is the default item-type?
- Should Character have the same or a similar API as a String? (E.g. why can’t you get the sub-structure of the grapheme cluster for a character without first casting it to a String?)
A good example is the discussion of how to represent string slices: should there be a separate type, called Substring, analogous to the ArraySlice that already exists for an Array?
“Long-term storage of Substring instances is discouraged. A substring holds a reference to the entire storage of a larger string, not just to the portion it presents, even after the original string’s lifetime ends.
“[…]
“The downside of having two types is the inconvenience of sometimes having a Substring when you need a String, and vice-versa. It is likely this would be a significantly bigger problem than with Array and ArraySlice, as slicing of String is such a common operation. It is especially relevant to existing code that assumes String is the currency type – that is, the default string type used for everyday exchange between APIs. To ease the pain of type mismatches, Substring should be a subtype of String in the same way that Int is a subtype of Optional<Int>.”
Collection
or not?For those that watch as the API for Swift evolves from one major version to another—with each change introducing non–backward-compatible incompatibilities—this document should hopefully reassure them that the changes are not made lightly. It may seem like the designers don’t have a plan, but, over the years, designers and opinions change. E.g. Witness the discussion of what the default representation of the string should be.
“[…] in Swift 1.0, String was a collection of Character (extended grapheme clusters). […] In Swift 2.0, String’s Collection conformance was dropped, because we convinced ourselves that its semantics differed from those of Collection too significantly.”
After listing several reasons why the change in Swift 2.0 was not a good direction, they conclude that in 4.0, they should revert to the original behavior.
“It would be much better to legitimize the conformance to Collection and simply document the oddity of any concatenation corner-cases, than to deny users the benefits on the grounds that a few cases are confusing.”
Again, the discussion is open and public and, despite the claims of some who think that they’re just a bunch of cowboys changing stuff willy-nilly, they have a documented plan.
It’s unfortunate that it took them so long to get there, but this kind of design isn’t always easy.
Because Swift uses Unicode grapheme clusters as the default “items” view for strings, the discussion of string indices might seem unnecessarily abstract for developers coming from other languages, where the index is always an int into bytes.
“String currently has four views–characters, unicodeScalars, utf8, and utf16 […]”
Because of these different views, it’s necessary to discuss how to reduce API surface by consolidating the various index types used to refer to individual elements in these different “views” on a String
.
It’s not like C#—and most other mainstream languages—have anything to brag about with their string-handling. In that respect, even Swift 1 and 2 are light-years ahead in Unicode correctness with their focus on grapheme clusters rather than the utterly nonsensical 90s-era bytes
still used in those other languages.
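To make that concrete, here is a quick C# illustration of my own: String.Length counts UTF-16 code units, while System.Globalization.StringInfo gets you closer to user-perceived characters.
using System;
using System.Globalization;

class GraphemeDemo
{
    static void Main()
    {
        // "é" written as 'e' plus a combining acute accent: one user-perceived character.
        var text = "re\u0301sume\u0301";

        Console.WriteLine(text.Length);                               // 8 UTF-16 code units
        Console.WriteLine(new StringInfo(text).LengthInTextElements); // 6 text elements

        var enumerator = StringInfo.GetTextElementEnumerator(text);
        while (enumerator.MoveNext())
        {
            // Each element is a base character plus its combining marks.
            Console.WriteLine(enumerator.GetTextElement());
        }
    }
}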
The Guidance for API Designers shows how they try to build the API so that it makes sense for callers.
“A Substring passed where String is expected will be implicitly copied. When compared to the “same type, copied storage” model, we have effectively deferred the cost of copying from the point where a substring is created until it must be converted to String for use with an API.
“A user who needs to optimize away copies altogether should use this guideline: if for performance reasons you are tempted to add a Range argument to your method as well as a String to avoid unnecessary copies, you should instead use Substring.”
Their goal is noble, though it’s unclear to what degree the vision can be realized. The following citation could be written as the high-level goal of any API.
“We should represent these aspects as orthogonal, composable components, abstracting pattern matchers into a protocol like this one, that can allow us to define logical operations once, without introducing overloads, and massively reducing API surface area.”
Update: At the suggestion of a reader, I searched... [More]
]]>Published by marco on 4. Feb 2017 18:17:03 (GMT-5)
Updated by marco on 5. Feb 2017 23:42:56 (GMT-5)
I encountered some curious behavior while writing a service-locator interface (protocol) in Swift. I’ve reproduced the issue in a stripped-down playground [1] and am almost certain I’ve found a bug in the Swift 3.0.1 compiler included in XCode 8.2.1.
Update: At the suggestion of a reader, I searched and found Apple’s Jira for Swift [2] and reported this issue as A possible tuple-inference/parameter-resolution bug in Swift 3.0.1
We’ll start off with a very basic example, shown below.
The example above shows a very simple function, generic in its single parameter with a required argument label a:
. As expected, the compiler determines the generic type T
to be Int
.
I’m not a big fan of argument labels for such simple functions, so I like to use the _
to free the caller from writing the label, as shown below.
As you can see, the result of calling the function is unchanged.
Let’s try calling the function with some other combinations of parameters and see what happens.
If you’re coming from another programming language, it might be quite surprising that the Swift compiler happily compiles every single one of these examples. Let’s take them one at a time.
- int: This works as expected.
- odd: This is the call that I experienced in my original code. At the time, I was utterly mystified how Swift—a supposedly very strictly typed language—allowed me to call a function with a single parameter with two parameters. This example’s output makes it more obvious what’s going on here: Swift interpreted the two parameters as a Tuple. Is that correct, though? Are the parentheses allowed to serve double-duty both as part of the function-call expression and as part of the tuple expression?
- tuple: With two sets of parentheses, it’s clear that the compiler interprets T as the tuple (Int, Int).
- labels: The issue with double-duty parentheses isn’t limited to anonymous tuples. The compiler treats what looks like two labeled function-call parameters as a tuple with two Ints labeled a: and b:.
- nestedTuple: The compiler seems to be playing fast and loose with parentheses inside of a function call. The compiler sees the same type for the parameter with one, two and three sets of parentheses. [3] I would have expected the type to be ((Int, Int)) instead.
- complexTuple: As with tuple, the compiler interprets the type for this call correctly.
The issue with double-duty parentheses seems to be limited to function calls without argument labels. When I changed the function definition to require a label, the compiler choked on all of the calls, as expected. To fix the problem, I added the argument label for each call and you can see the results below.
- int: This works as expected.
- odd: With an argument label, instead of inferring the tuple type (Int, Int), the compiler correctly binds the label to the first parameter 1. The second parameter 2 is marked as an error.
- tuple: With two sets of parentheses, it’s clear that the compiler interprets T as the tuple (Int, Int).
- labels: This example behaves the same as odd, with the second parameter b: 2 flagged as an error.
- nestedTuple: This example works the same as tuple, with the compiler ignoring the extra set of parentheses, as it did without an argument label.
- complexTuple: As with tuple, the compiler interprets the type for this call correctly.
I claimed above that I was pretty sure that we’re looking at a compiler bug here. I took a closer look at the productions for tuples and functions defined in The Swift Programming Language (Swift 3.0.1) manual available from Apple.
First, let’s look at tuples:
As expected, a tuple expression is created by surrounding zero or more comma-separated expressions (with optional identifiers) in parentheses. I don’t see anything about folding parentheses in the grammar, so it’s unclear why (((1)))
produces the same type as (1)
. Using parentheses makes it a bit difficult to see what’s going on with the types, so I’m going to translate to C# notation.
() => empty tuple [4]
(1) => Tuple<int>
((1)) => Tuple<Tuple<int>>
This seems to be a separate issue from the second, but opposite, problem: instead of ignoring parentheses, the compiler allows one set of parentheses to simultaneously denote the argument clause of a single-arity function call and an argument of type Tuple
encompassing all parameters.
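For comparison, and to stay with the C# notation used above, here is a sketch of my own showing that the C# compiler will not let one set of parentheses do that double duty: a single-parameter generic method called with two arguments simply does not compile.
using System;

static class TupleInference
{
    static string Test<T>(T a) => typeof(T).ToString();

    static void Main()
    {
        Console.WriteLine(Test(1));       // System.Int32
        Console.WriteLine(Test((1, 2)));  // System.ValueTuple`2[System.Int32,System.Int32]
        // Console.WriteLine(Test(1, 2)); // error CS1501: no overload for 'Test' takes 2 arguments
    }
}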
A look at the grammar of a function call shows that the parentheses are required.
Nowhere did I find anything in the grammar that would allow the kind of folding I observed in the compiler, as shown in the examples above. I’m honestly not sure how that would be indicated in grammar notation.
Given how surprising the result is, I can’t imagine this is anything but a bug. Even if it can be shown that the Swift compiler is correctly interpreting these cases, it’s confusing that the type-inference is different with and without labels.
The X-Code playground is a very decent REPL for this kind of example. Here’s the code I used, if you want to play around on your own.
func test<T>(_ a: T) -> String
{
return String(describing: type(of: T.self))
}
var int = test(1)
var odd = test(1, 2)
var tuple = test((1, 2))
var labels = test(a: 1, b: 2)
var nestedTuple = test(((1, 2)))
var complexTuple = test((1, (2, 3)))
Published by marco on 15. Jan 2017 23:40:49 (GMT-5)
Updated by marco on 4. Oct 2023 21:24:02 (GMT-5)
The article Dark Path by Robert C. Martin was an interesting analysis of a recent “stricter” trend in programming languages, as evidenced by Swift and Kotlin. I think TypeScript is also taking some steps along this path, as well as Rust, which I have read a lot about, but haven’t had much occasion to use.
The point Martin makes is that all of these languages seem to be heedlessly improving correctness at the possible cost of expressiveness and maintainability. That is, as types are inferred from implementation, it can become more difficult to pinpoint where the intent of the programmer and the understanding of the compiler parted ways. As well, with increasing strictness—e.g. non-null references, reference-ownership, explicit exceptions, explicit overrides—there comes increasing overhead in maintaining code.
Not only that, but developers must know their types—and hence their design—up front, which restricts evolving design as practiced in the very successful TDD approach and seems to be headed back to the stone age of waterfall design. As well, that level of strictness convinces developers—who are similarly encouraged by the language designers—that once their code compiles, then it runs as expected.
But then they think they don’t need to test, whereas the compiler really has no idea whether your code does what it should do. All it can guarantee is that no exception went unhandled—or explicitly ignored—(e.g. in Kotlin or Swift) or there are no race conditions or deadlocks (Rust) or that there are no null references where not explicitly programmed (Swift, Kotlin, TypeScript).
These compiler-enforced language features are very useful, but are in the same class as the spell-checker in your text editor. Having no red, wavy lines in your document is no guarantee that the document makes any sense whatsoever.
So these are interesting and useful features. They can lead to increased safety. But, they won’t make your program do what it’s supposed to do. At best, they help you avoid writing behavior that you most definitely don’t want.
These features are nice to have, but they are not worth having at any price.
It was an interesting article that I more-or-less agreed with. The follow-up article Types and Tests by Robert C. Martin (Clean Coder Blog) followed close on its heels because Martin apparently wanted to respond to feedback he’d received on the first article. I thought he went a bit far in the second article. For example, he emphasized that,
“No, types are not tests. Type systems are not tests. Type checking is not testing. Here’s why.”
That’s absolutely true, but types are still related to testing. Types help me specify my interface more precisely and I can trust the compiler to enforce them. That’s a lot of tests I don’t have to write.
Otherwise, for every API I write, I’d have to write tests to prove that only the supported types can be passed in—and I’d also have to specify how my API behaves when value with an incorrect type is passed in. Do I fail silently? How do I let the caller know what to expect? This seems not only sloppy but time-consuming. It sounds like busy work, having to think about this kind of stuff for every API.
Martin continues,
“[…] the way f is called has nothing to do with the required behavior of the system. Rather it is a test of an arbitrary constraint imposed by the programmer. A constraint that was likely over[-]specified from the point of view of the system requirements. (Emphasis added.)”
The first sentence is a useful observation. The second is hyperbole. Indicating int rather than object for a parameter called limit hardly seems like an over-specification. In fact, it seems like exactly what I want.
If the requirement says shall allow a user to enter a value for limit… rather than shall allow a user to enter a positive number for limit…, then I would argue that 99% of the time it’s the requirement that isn’t precise enough. I would not assume that the requirements engineer knew just what she was doing when she left the door open for a limit given as a string
.
Without types, our requirements would also become bloated with over-definitions like:
- Throw an ArgumentOutOfRangeException for values that are less than zero or greater than 1000.
- Throw a ClassCastException if the given value cannot be marshaled to a numeric value.
For this specification, a developer could write:
public void SetLimit(object limit)
{
  int limitAsNumber;
  if (!Int32.TryParse(limit?.ToString(), out limitAsNumber))
  {
    throw new ClassCastException("…");
  }
  if (limitAsNumber < 0 || limitAsNumber > 1000)
  {
    throw new ArgumentOutOfRangeException("limit");
  }
  _limit = limitAsNumber;
}
The developer could also write:
public void SetLimit(UInt32 limit)
{
  if (limit > 1000)
  {
    throw new ArgumentOutOfRangeException("limit");
  }
  _limit = limit;
}
That’s actually what we want the developer to write, no? If you choose JavaScript to implement this requirement, then you would need to over-specify because you need to decide how to handle values with unsupported types. If the requirements engineer is allowed to assume that the implementing language has a minimal type system, then the requirements are also easier to write, as shown below.
- Throw an ArgumentOutOfRangeException for values that are less than zero or greater than 1000.
- Throw a ClassCastException if the given value cannot be marshaled to a numeric value.
Assuming a minimal type system in the target language saves time and effort. The requirements engineer can specify more concisely and the software engineer wastes less time writing boilerplate that has nothing to do with application behavior.
Martin finished up with this sentiment,
“So, no, type systems do not decrease the testing load. Not even the tiniest bit. But they can prevent some errors that unit tests might not see. (e.g. Double vs. Int) (Emphasis added.)”
As you can imagine, I strongly disagree with the “[n]ot even the tiniest bit” part, based on my arguments above. If you use JavaScript, then you have to test all valid input and verify its behavior. In JavaScript, literally any data is valid input and it’s up to your method to declare it invalid.
Only tests can provide any protection against your method being called at runtime with invalid data. You have to write a test to verify that your method throws an error when passed a double
rather than an int
. Most people will not write these kinds of tests, which I suspect is why Martin says there’s no change in testing load.
I agree that the pendulum in Swift has swung too far in a restrictive direction. The language does feel pretty overloaded. I also agree that the behavior of the system itself needs to be tested and that types don’t help you there.
Martin again,
“On the other hand, internal self-consistency does not mean the program exhibits the correct behavior. Behavior and self-consistency are orthogonal concepts. Well behaved programs can be, and have been, written in languages with high ambiguity and low internal consistency. Badly behaved programs have been written in languages that are deeply self-consistent and tolerate few ambiguities. (Emphasis added.)”
Agreed.
I think, though, that Martin might be forgetting about all of the people writing software who aren’t the kind of people who can write a well-behaved program in a wildly inconsistent language. I, for example, am so awesome [1] that I wrote my entire web-site software in PHP—one of the worst languages in the world for internal self-consistency—and it’s been running my site for going on 18 years. Programming skill and iron discipline fill the gap left by language consistency.
But for bad programmers? They write utter garbage in PHP. Maybe it’s not a bad idea to create languages that channel poorly disciplined programmers into better practices. I take the point from the previous article (Dark Path) that bad programmers will simply work their way around the rigor, where possible. They will mark every class as open
in Swift instead of thinking about their architecture.
For those of us with discipline, the language will put up roadblocks that force us to write more code rather than less.
As a counterexample, there is Rust, which enforces reference-ownership in a way that guarantees concurrent code with no deadlocks and no race conditions. This is a good thing. It probably gets in your way when you’re trying to write other types of programs, but it’s overall a good thing.
I haven’t had any personal experience with it, but I’ve heard that it’s sometimes difficult to figure out why a given program won’t compile. I would hope that these situations become fewer with experience, but would also be cautious because I remember programming in C++ with templates and know how much time can be lost when you don’t know how to fix your program based on an error message.
I, for one, like that my compiler tells me when I have potential null-reference exceptions. I use attributes in C# to tell me exactly that and I use R# to find all places in my code where I have potential violations. Those are more tests that I don’t have to write, if the compiler can “prove” that this code is never called with a null
reference. [2] It lets me write more concise implementation and spares me a lot of scaffolding.
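For example (a minimal sketch of my own, assuming the JetBrains.Annotations package that R# understands):
using JetBrains.Annotations;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class PersonFormatter
{
    // R# warns at any call site that can pass null here and flags redundant
    // null checks inside the method; no unit test needed for the null case.
    public string Format([NotNull] Person person)
    {
        return person.LastName + ", " + person.FirstName;
    }

    // The inverse contract: callers are warned if they dereference the result
    // without checking it first.
    [CanBeNull]
    public Person FindByName([NotNull] string lastName)
    {
        return null; // e.g. not found
    }
}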
Many years ago, I had the same experience with const
in C++ as Martin discusses. After some time working with const
, I started making everything I possibly could const
in order to eliminate a whole class of mutation errors in my code. That did have consequences, at the time. Changing one thing could—as Martin describes for his hypothetical language TDP—lead to knock-on changes throughout the code base.
Generics can have this effect, as well, with changes leaking into all of the places they’re used. I wrote a blog series on having pulled back from generics in a few central places in Quino.
I often felt the way that Martin does about Java’s throws
declaration. I imagine that I’ll start to feel the same about Swift’s, as well. I read once about a nice typing system in Midori, the managed version of Windows created by Joe Duffy and team at Microsoft Research, that I felt I would like to try (no pun intended).
Martin says that he uses both dynamically and statically typed languages. He acknowledges that certain extensions to the type system can be useful (but just that some languages have gone too far).
I, too, think some innovations can be very helpful. I like immutables (types, declarations, whatever) because they let me reason better about my code. They let me eliminate unwanted code paths with the compiler rather than having to write more rote tests that I think even Martin will agree have nothing to do with the original specification or the behavior of my application.
If I can mark something as readonly because I don’t expect it to ever need to be changed, that’s a little note I’ve left for future programmers that, should they want to modify that value, they will have to make sure to reason differently about the implementation. The value was never intended to be rewritten and there are no tests for that behavior. It’s a nice way of reducing the scope of the implementation.
It simultaneously restricts that scope, but that’s a good thing. A program can, very quickly, do a lot of things that it should not do. I don’t want to write tests for all of this stuff. I have neither the inclination nor the time—nor the budget—to write tests for things that I could instead eliminate entirely from the realm of possibility with a powerful type system.
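A trivial C# illustration of that note-to-the-future (mine, not Martin’s):
public class ReportGenerator
{
    // readonly: assigned once in the constructor and never rewritten afterwards.
    // Anyone who wants to mutate this later has to change the declaration and
    // re-think the reasoning that assumed it was fixed.
    private readonly string _templateName;

    public ReportGenerator(string templateName)
    {
        _templateName = templateName;
    }

    public string Describe() => $"Report based on '{_templateName}'";
}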
I read up on Kotlin and saw a seminar on it last year. I, too, noticed that there seems to be an “everything but the kitchen sink” feel to it. It’s the same feeling I get when I look at Scala’s type system, though that one is less about restriction than about letting you do everything in 3 different ways.
I’ve been reading through the Swift language guide and I’m getting the same feeling. It doesn’t help that they have their own name and keyword for nearly every commonly known programming concept. You can use self.
but the guide prefers just .
, which takes some getting used to. finally
? Nope. Use defer
instead.
To be honest, I’m also a bit dizzy at how quickly the TypeScript type system has gotten more and more complex. TypeScript 2.1: keyof and Lookup Types by Marius Schulz includes details on even more typing specifications that let you infer types from dynamic objects with flow-control analysis.
I think this is quite an interesting approach, akin to more functional languages, like ML and F#, where return types are inferred and even parameter types are inferred. Swift has also gone a long way in this direction. Interfaces are replaced with non-inheritable types that describe the shape of data.
Types can even be inferred by which fields you access within conditionals so that a single variable has a different inferred type depending on which path through the code it takes. It’s all very exciting, but I wonder how much can be used correctly—especially by the aforementioned crappy programmers.
For example, this is the definition for the Object.entries()
method from JavaScript.
interface ObjectConstructor {
  // …
  entries<T extends { [key: string]: any }, K extends keyof T>(o: T): [keyof T, T[K]][];
  // …
}
After having used languages that have explicit return types for methods, I’m still a bit at sea when I read TypeScript code without them. I find myself hovering identifiers to see which type was inferred for them by the real-time compilation.
I agree that the code is cleaner, but maybe something’s gone missing. It’s harder to tell what the hell I’m supposed to pass in as a parameter or what the hell I get back from a function when the type can be a union of 3 or 4 other vaguely and sometimes ad-hoc–defined types.
For example, a lot of code just constantly redefines the hash-table interface rather than just defining a type for it … so the caller isn’t restricted to implementing a specific interface. This is nice for library code, I guess, but it makes it harder to reason about the code because you don’t have good names for types. This is an interesting enough experience for seasoned programmers; I can’t even imagine how average or bad programmers deal with it.
I see where Martin is coming from, that he’s afraid of BDUF, something he’s been fighting for years by arguing that you can design as you go if you’ll just test your code as you write it. If you see that a parameter has to be an IHashMap
, that’s easier to understand than { [key: string]: any }
or { [key: string]: T }
where T is a completely different type. There are advantages and disadvantages.
“Every step down that path increases the difficulty of using and maintaining the language. Every step down that path forces users of the language to get their type models “right” up front; because changing them later is too expensive. Every step down that path forces us back into the regime of Big Design Up Front.”
I agree with the sentiment, but I don’t know if we’re there yet. Martin argues that there is a balance and maybe I need more experience with the languages he’s horrified about. He does write:
“I think Java and C# have done a reasonable job at hovering near the balance point. (If you ignore the horrible syntax for generics, and the ridiculous proscription against multiple inheritance.)”
…which I agree with wholeheartedly. I have learned to live without multiple inheritance, but I regularly railed against its absence for decades. I have given up because the world has moved on. I would love to see proper contravariance and covariant return types and anchored types, but I’ve kind of given up on seeing that kind of stuff in a mainstream language, as well. Instead, I’ve drifted more toward immutable, stateless, functional style—even in C#. I’m ogling F#. I’m working with Swift now and will do much more of that this year.
If you don’t have a license for DataGrip, you can download... [More]
]]>Published by marco on 11. Jan 2017 08:47:45 (GMT-5)
The article Connecting DataGrip to MS SQL Server by Maksim Sobolevskiy on June 21, 2016 (JetBrains Blog) covers all of the points well, with screenshots, but I just wanted to record my steps, collected into a tight list. Screenshots for most of these steps are available in the blog linked above.
If you don’t have a license for DataGrip, you can download a 30-day trial or you can download the JetBrains Rider EAP, which bundles it. Once Rider is released, you’ll have to have a license for it, but—for now—you can use it for free.
We’ll get... [More]
]]>Published by marco on 24. Nov 2016 20:02:47 (GMT-5)
For many years, the C#/.NET world has been dominated by a single main IDE: Visual Studio. MonoDevelop has also been available for a while, as an alternative for users on other platforms. Lately, though, there have been a few new contenders in the .NET IDE arena.
We’ll get this one out of the way first: this is basically Xamarin Studio for Mac, rebranded as Visual Studio for Mac. This IDE is pretty and extremely well-integrated into MacOS, with a lot of animated editor interaction for compiler warnings and errors.
Unlike Rider or Visual Studio 2017 with ReSharper, Xamarin Studio doesn’t benefit from the R# tooling, so there are a few things immediately missing. Navigation is not as smooth as with ReSharper-based IDEs [1], although it’s definitely on-par with what I’ve experienced in Xcode. Xamarin Studio is fast and pretty good and I’ll definitely keep it in the mix for testing Quino on alternate platforms once we start the move to .NET Standard 2.0. [2]
This is only an EAP, so keep that in mind when testing. I installed this IDE on my Mac and Windows. The setup process was very smooth, asking for theme/color preferences and—most importantly—keyboard preferences. This time, the key-mapping for “Visual Studio” turned out to be quite appropriate and good.
I was able to load the Quino solution relatively quickly. The first load kicks off two processes: Nuget Restore and Process Files. On subsequent loads, the Nuget Restore no longer applies and Process Files benefits from Rider having cached everything the first time around.
I couldn’t find any option to add an extra NuGet source, which was odd. There is a tab in the “Nuget Packages” pane called “sources”, but it just lists the NuGet configuration files but doesn’t offer any way to add sources.
On the plus side, the test runner worked immediately. On the minus side, it delivered results inconsistent with VS2015 and VS2017 running on the same machine. It looks and behaves like the same test runner as in ReSharper [3], but the results are different for some (a few hundred) Quino tests.
It loads quickly, can deal with the Quino solution without issues and the test runner works. Everything else felt like Visual Studio with ReSharper—at least for the stuff I use. I’ll keep an eye on this IDE.
I installed this with ReSharper 2016.3EAP9 and was pleasantly surprised to see that it behaved like an actual RC. That is, instead of releasing Alpha/early-beta software as an RC—I’m looking at you, .NET Core—they’ve got a really solid release on their hands.
That said, it’s not quite ready for production use (obvious from the RC moniker) but I was able to use it for productive use over a long weekend. So I was pretty encouraged that I’ll be able to let the guys at Encodo use it sooner rather than later. [4]
That said, here are the things I’ve noticed that are missing:
Everything else seemed to work fine, which speaks well of both VS2017 and R#’s latest EAP.
We discussed ABD in a recent article ABD: Refactoring and refining an API. To cite from that article,
]]>“[…] the most important part of code is to think about how you’re writing it and what you’re building. You shouldn’t write a single line without thinking of the myriad ways in... [More]”
Published by marco on 5. Jun 2016 12:52:31 (GMT-5)
We discussed ABD in a recent article ABD: Refactoring and refining an API. To cite from that article,
“[…] the most important part of code is to think about how you’re writing it and what you’re building. You shouldn’t write a single line without thinking of the myriad ways in which it must fit into existing code and the established patterns and practices.”
With that in mind, I saw another teaching opportunity this week and wrote up my experience designing an improvement to an existing API.
Before we write any code, we should know what we’re doing. [1]
- We use aspects (IMetaAspects) in Quino to add domain-specific metadata (e.g. the IVisibleAspect controls element visibility).
- The existing API for getting an aspect is FindOrAddAspect(). This method does what it advertises: if an aspect with the requested type already exists, it is returned; otherwise, an instance of that type is created, added and returned. The caller gets an instance of the requested type (e.g. IVisibleAspect).
A good example is the IClassCacheAspect
. It exposes five properties, four of which are read-only. You can modify the property (OrderOfMagnitude
) through the interface. This is already not good, as we are forced to work with the implementation type in order to change any property other than OrderOfMagnitude
.
The current way to address this issue would be to make all of the properties settable on the interface. Then we could use the FindOrAddAspect()
method with the IClassCacheAspect
. For example,
var cacheAspect =
  Element.Classes.Person.FindOrAddAspect<IClassCacheAspect>(
    () => new ClassCacheAspect()
  );

cacheAspect.OrderOfMagnitude = 7;
cacheAspect.Capacity = 1000;
For comparison, if the caller were simply creating the aspect instead of getting a possibly-already-existing version, then it would just use an object initializer.
var cacheAspect = Element.Classes.Person.Aspects.Add(
  new ClassCacheAspect()
  {
    OrderOfMagnitude = 7,
    Capacity = 1000
  }
);
This works nicely for creating the initial aspect. But it causes an error if an aspect of that type had already been added. Can we design a single method with all the advantages?
A good way to approach a new API is to ask: How would we want the method to look if we were calling it?
Element.Classes.Person.SetCacheAspectValues(
  a =>
  {
    a.OrderOfMagnitude = 7;
    a.Capacity = 1000;
  }
);
If we only want to change a single property, we can use a one-liner:
Element.Classes.Person.SetCacheAspectValues(a => a.Capacity = 1000);
Nice. That’s even cleaner and has fewer explicit dependencies than creating the aspect ourselves.
Now that we know what we want the API to look like, let’s see if it’s possible to provide it. We request an interface from the list of aspects but want to use an implementation to set properties. The caller has to indicate how to create the instance if it doesn’t already exist, but what if it does exist? We can’t just downcast it because there is no guarantee that the existing aspect is the same implementation.
These are relatively lightweight objects and the requirement above is that the property values on the existing aspect are set on the returned aspect, not that the existing aspect is preserved.
What if we just provided a mechanism for copying properties from an existing aspect onto the new version?
var cacheAspect = new ClassCacheAspect();
var existingCacheAspect =
  Element.Classes.Person.Aspects.FirstOfTypeOrDefault<IClassCacheAspect>();
if (existingCacheAspect != null)
{
  cacheAspect.OrderOfMagnitude = existingCacheAspect.OrderOfMagnitude;
  cacheAspect.Capacity = existingCacheAspect.Capacity;
  // Set all other properties
}

// Set custom values
cacheAspect.OrderOfMagnitude = 7;
cacheAspect.Capacity = 1000;
This code does exactly what we want and doesn’t require any setters on the interface properties. Let’s pack this away into the API we defined above. The extension method is:
public static ClassCacheAspect SetCacheAspectValues(
  this IMetaClass metaClass,
  Action<ClassCacheAspect> setValues)
{
  var result = new ClassCacheAspect();
  var existingCacheAspect =
    metaClass.Aspects.FirstOfTypeOrDefault<IClassCacheAspect>();
  if (existingCacheAspect != null)
  {
    result.OrderOfMagnitude = existingCacheAspect.OrderOfMagnitude;
    result.Capacity = existingCacheAspect.Capacity;
    // Set all other properties
  }

  setValues(result);

  return result;
}
So that takes care of the boilerplate for the IClassCacheAspect
. It hard-codes the implementation to ClassCacheAspect
, but let’s see how big a restriction that is once we’ve generalized below.
We want to see if we can do anything about generalizing SetCacheAspectValues()
to work for other aspects.
Let’s first extract the main body of logic and generalize the aspects.
public static TConcrete SetAspectValues<TService, TConcrete>(
  this IMetaClass metaClass,
  Action<TConcrete, TService> copyValues,
  Action<TConcrete> setValues
)
  where TConcrete : TService, new()
  where TService : IMetaAspect
{
  var result = new TConcrete();
  var existingAspect = metaClass.Aspects.FirstOfTypeOrDefault<TService>();
  if (existingAspect != null)
  {
    copyValues(result, existingAspect);
  }

  setValues(result);

  return result;
}
This isn’t bad, but we’ve required that the TConcrete
parameter implement a default constructor. Instead, we could require an additional parameter for creating the new aspect.
public static TConcrete SetAspectValues<TService, TConcrete>(
  this IMetaClass metaClass,
  Func<TConcrete> createAspect,
  Action<TConcrete, TService> copyValues,
  Action<TConcrete> setValues
)
  where TConcrete : TService
  where TService : IMetaAspect
{
  var result = createAspect();
  var existingAspect = metaClass.Aspects.FirstOfTypeOrDefault<TService>();
  if (existingAspect != null)
  {
    copyValues(result, existingAspect);
  }

  setValues(result);

  return result;
}
Wait, wait, wait. We not only don’t need the new() generic constraint, we also don’t need the createAspect lambda parameter, do we? Can’t we just pass in the object instead of passing in a lambda to create the object and then calling it immediately?
public static TConcrete SetAspectValues<TService, TConcrete>(
  this IMetaClass metaClass,
  TConcrete aspect,
  Action<TConcrete, TService> copyValues,
  Action<TConcrete> setValues
)
  where TConcrete : TService
  where TService : IMetaAspect
{
  var existingAspect = metaClass.Aspects.FirstOfTypeOrDefault<TService>();
  if (existingAspect != null)
  {
    copyValues(aspect, existingAspect);
  }

  setValues(aspect);

  return aspect;
}
That’s a bit more logical and intuitive, I think.
We can now redefine our original method in terms of this one:
public static ClassCacheAspect SetCacheAspectValues(
  this IMetaClass metaClass,
  Action<ClassCacheAspect> setValues)
{
  return metaClass.SetAspectValues<IClassCacheAspect, ClassCacheAspect>(
    new ClassCacheAspect(),
    (aspect, existingAspect) =>
    {
      aspect.OrderOfMagnitude = existingAspect.OrderOfMagnitude;
      aspect.Capacity = existingAspect.Capacity;
      // Set all other properties
    },
    setValues
  );
}
Can we somehow generalize the copying behavior? We could make a wrapper that expects an interface on the TService
that would allow us to call CopyFrom(existingAspect)
.
public static TConcrete SetAspectValues<TService, TConcrete>(
  this IMetaClass metaClass,
  TConcrete aspect,
  Action<TConcrete> setValues
)
  where TConcrete : TService, ICopyTarget
  where TService : IMetaAspect
{
  return metaClass.SetAspectValues<TService, TConcrete>(
    aspect,
    (target, existingAspect) => target.CopyFrom(existingAspect),
    setValues
  );
}
What does the ICopyTarget
interface look like?
public interface ICopyTarget
{
  void CopyFrom(object other);
}
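With this non-generic version, an implementation would look roughly like the following sketch (mine, reusing the ClassCacheAspect from above).
public class ClassCacheAspect : IClassCacheAspect, ICopyTarget
{
  public void CopyFrom(object other)
  {
    // Boilerplate: every implementation has to check and cast first.
    if (!(other is IClassCacheAspect otherAspect))
    {
      throw new ArgumentException("Expected an IClassCacheAspect", nameof(other));
    }

    OrderOfMagnitude = otherAspect.OrderOfMagnitude;
    Capacity = otherAspect.Capacity;
    // Set all other properties
  }
}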
This is going to lead to type-casting code at the start of every implementation to make sure that the other
object is the right type. We can avoid that by using a generic type parameter instead.
public interface ICopyTarget<T>
{
  void CopyFrom(T other);
}
That’s better. How would we use it? Here’s the definition for ClassCacheAspect
:
public class ClassCacheAspect : IClassCacheAspect, ICopyTarget<IClassCacheAspect>
{
  public void CopyFrom(IClassCacheAspect otherAspect)
  {
    OrderOfMagnitude = otherAspect.OrderOfMagnitude;
    Capacity = otherAspect.Capacity;
    // Set all other properties
  }
}
Since the final version of ICopyTarget
has a generic type parameter, we need to adjust the extension method. But that’s not a problem because we already have the required generic type parameter in the outer method.
public static TConcrete UpdateAspect<TService, TConcrete>(
this IMetaClass metaClass,
TConcrete aspect,
Action<TConcrete> setValues
)
where TConcrete : TService, ICopyTarget<TService>
where TService : IMetaAspect
{
return metaClass.UpdateAspect(
aspect,
(target, existingAspect) => target.CopyFrom(existingAspect),
setValues
);
}
Assuming that the implementation of ClassCacheAspect implements ICopyTarget as shown above, then we can rewrite the cache-specific extension method to use the new extension method for ICopyTargets.
public static ClassCacheAspect SetCacheAspectValues(
this IMetaClass metaClass,
Action<ClassCacheAspect> setValues)
{
return metaClass.UpdateAspect<IClassCacheAspect, ClassCacheAspect>(
new ClassCacheAspect(),
setValues
);
}
This is an extension method, so any caller that wants to use its own IClassCacheAspect
could just copy/paste this one line of code and use its own aspect.
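For example, a caller with its own aspect implementation could define the same convenience overload for itself. The CustomCacheAspect name below is invented for illustration:
public static CustomCacheAspect SetCustomCacheAspectValues(
  this IMetaClass metaClass,
  Action<CustomCacheAspect> setValues)
{
  // CustomCacheAspect is a hypothetical caller-defined class that implements
  // IClassCacheAspect and ICopyTarget<IClassCacheAspect>.
  return metaClass.UpdateAspect<IClassCacheAspect, CustomCacheAspect>(
    new CustomCacheAspect(),
    setValues
  );
}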
This is actually pretty neat and clean:
Published by marco on 21. May 2016 10:58:43 (GMT-5)
Updated by marco on 21. May 2016 10:59:27 (GMT-5)
We’ve been doing more internal training at Encodo lately and one topic that we’ve started to tackle is design for architecture/APIs. Even if you’re not officially a software architect—designing and building entire systems from scratch—every developer designs code, on some level.
[A]lways
[B]e
[D]esigning
There are broad guidelines about how to format and style code, about how many lines to put in a method, about how many parameters to use, and so on. We strive for Clean Code™.
But the most important part of code is to think about how you’re writing it and what you’re building. You shouldn’t write a single line without thinking of the myriad ways in which it must fit into existing code and the established patterns and practices.
We’ve written about this before, in the two-part series called “Questions to consider when designing APIs” (Part I and Part II). Those two articles comprise a long list of aspects of a design to consider.
First make a good design, then compromise to fit project constraints.
Your project defines the constraints under which you can design. That is, we should still have our designer caps on, but the options available are much more strictly limited.
But, frustrating as that might be, it doesn’t mean you should stop thinking. A good designer figures out what would be optimal, then adjusts the solution to fit the constraints. Otherwise, you’ll forget what you were compromising from—and your design skills either erode or never get better.
We’ve been calling this concept ABD—Always Be Designing. [1] Let’s take a closer, concrete look, using a recent issue in the schema migration for Quino. Hopefully, this example illustrates how even the tiniest detail is important. [2]
We detected the problem when the schema migration generated an invalid SQL statement.
ALTER TABLE "punchclock__timeentry" ALTER COLUMN "personid" SET DEFAULT ;
As you can see, the default value is missing. It seems that there are situations where the code that generates this SQL is unable to correctly determine that a default value could not be calculated.
The code that calculates the default value is below.
result = Builder.GetExpressionPayload(
null,
CommandFormatHints.DefaultValue,
new ExpressionContext(prop),
prop.DefaultValueGenerator
);
To translate, there is a Builder
that produces a payload. We’re using that builder to get the payload (SQL, in this case) that corresponds to the DefaultValueGenerator
expression for a given property, prop
.
This method is an extension method of the IDataCommandBuilder
, reproduced below in full, with additional line-breaks for formatting:
public static string GetExpressionPayload<TCommand>(
this IDataCommandBuilder<TCommand> builder,
[CanBeNull] TCommand command,
CommandFormatHints hints,
IExpressionContext context,
params IExpression[] expressions)
{
if (builder == null) { throw new ArgumentNullException("builder"); }
if (context == null) { throw new ArgumentNullException("context"); }
if (expressions == null) { throw new ArgumentNullException("expressions"); }
return builder.GetExpressionPayload(
command,
hints,
context,
expressions.Select(
e => new ExecutableQueryItem<IExecutableExpression>(new ExecutableExpression(e))
)
);
}
This method does no more than to package each item in the expressions
parameter in an ExecutableQueryItem
and call the interface method.
The problem isn’t immediately obvious. It stems from the fact that each ExecutableQueryItem
can be marked as Handled
. The extension method ignores this feature, and always returns a result. The caller is unaware that the result may correspond to an only partially handled expression.
Our first instinct is, naturally, to try to figure out how we can fix the problem. [3] In the code above, we could keep a reference to the executable items and then check if any of them were unhandled, like so:
var executableItems = expressions.Select(
e => new ExecutableQueryItem<IExecutableExpression>(new ExecutableExpression(e))
);
var result = builder.GetExpressionPayload(command, hints, context, executableItems);
if (executableItems.Unhandled().Any())
{
// Now what?
}
return result;
}
We can detect if at least one of the input expressions could not be mapped to SQL. But we don’t know what to do with that information.
Do we return null instead? What can we return to indicate that the input expressions could not be mapped? Here we have the same problem as with throwing an exception: all callers assume that the result can be mapped.
So there’s no quick fix. We have to change an API. We have to design.
As with most bugs, the challenge lies not in knowing how to fix the bug, but in how to fix the underlying design problem that led to the bug. The problem is actually not in the extension method, but in the method signature of the interface method.
Instead of a single result, there are actually two results for this method call: whether the input expressions could be mapped at all, and the payload itself, if they could.
Instead of a Get
method, this is a classic TryGet
method.
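In general terms, the difference between the two shapes looks like this (an illustrative sketch, not the actual Quino signatures):
// Get: assumes a result can always be produced; failure surfaces as an exception
// or as a nonsensical value.
string GetPayload(IExpression expression);

// TryGet: failure is part of the contract; the boolean says whether the out
// parameter was actually filled in.
bool TryGetPayload(IExpression expression, out string payload);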
If this code is already in production, then you have to figure out how to introduce the bug fix without breaking existing code. If you already have consumers of your API, you can’t just change the signature and cause a compile error when they upgrade. You have to decorate the existing method with [Obsolete]
and make a new interface method.
So we don’t change the existing method and instead add the method TryGetExpressionPayload() to IDataCommandBuilder.
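As a sketch, the deprecation on the existing interface method might look like the following; the attribute message is invented, and the signature is the existing one shown a little further down:
[Obsolete("Use TryGetExpressionPayload() instead.")]
string GetExpressionPayload(
  [CanBeNull] TCommand command,
  CommandFormatHints hints,
  [NotNull] IExpressionContext context,
  [NotNull] IEnumerable<ExecutableQueryItem<IExecutableExpression>> expressions
);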
Now, let’s figure out what the parameters are going to be.
The method called by the extension method above has a slightly different signature. [5]
string GetExpressionPayload(
[CanBeNull] TCommand command,
CommandFormatHints hints,
[NotNull] IExpressionContext context,
[NotNull] IEnumerable<ExecutableQueryItem<IExecutableExpression>> expressions
);
That last parameter is a bit of a bear. What does it even mean? The signature of the extension method deals with simple IExpression
objects—I know what those are. But what are ExecutableQueryItems
and IExecutableExpressions
?
As an author and maintainer of the data driver, I know that these objects are part of the internal representation of a query as it is processed. But as a caller of this method, I’m almost never going to have a list of these objects, am I?
Let’s find out.
Me: Hey, ReSharper, how many callers of that method are there in the entire Quino source?
ReSharper: Just one, Dave. [6]
So, we defined an API with a signature that’s so hairy no-one calls it except through an extension method that makes the signature more palatable. And it introduces a bug. Lovely.
We’ve now figured out that our new method should accept a sequence of IExpression
objects instead of ExecutableQueryItem
objects.
How’s the signature looking so far?
bool TryGetExpressionPayload(
[CanBeNull] TCommand command,
CommandFormatHints hints,
[NotNull] IExpressionContext context,
[NotNull] IEnumerable<IExpression> expressions,
out string payload
);
Not quite. There are two things that are still wrong with this signature, both important.
One problem is that the rest of the IDataCommandBuilder<TCommand>
deals with a generic payload type and this method only works for builders where the target representation is a string. The Mongo driver, for example, uses MongoStorePayload
and MongoRetrievePayload
objects instead of strings and throws a NotSupportedException
for this API.
That’s not very elegant, but the Mongo driver was forced into that corner by the signature. Can we do better? The API would currently require Mongo to always return false
because our Mongo driver doesn’t know how to map anything to a string. But it could map to one of the aforementioned object representations.
If we change the out
parameter type from a string
to an object
, then any driver, regardless of payload representation, has at least the possibility of implementing this API correctly.
Another problem is that the order of parameters does not conform to the code style for Encodo.
- Passing null as the first parameter looks strange. The command can be null, so it should move after the two non-nullable parameters. If we move it all the way to the end, we can even make it optional. (And the hints should be third.)
- The most important parameter is the expressions, not the context. The first parameter should be the target of the method; the rest of the parameters provide context for that input.
- We can’t use params IExpression[]. Using params allows a caller to provide zero or more expressions, but it’s only allowed on the terminal parameter. Instead, we’ll accept an IEnumerable<IExpression>, which is more standard for the Quino library anyway.
The final method signature is below.
bool TryGetExpressionPayload(
[NotNull] IEnumerable<IExpression> expressions,
[NotNull] IExpressionContext context,
CommandFormatHints hints,
out object payload,
[CanBeNull] TCommand command = default(TCommand)
);
The schema migration called the original API like this:
result = Builder.GetExpressionPayload(
null,
CommandFormatHints.DefaultValue,
new ExpressionContext(prop),
prop.DefaultValueGenerator
);
return true;
The call with the new API—and with the bug fixed—is shown below. The only non-functional addition is that we have to call ToSequence()
on the first parameter (highlighted). Happily, though, we’ve fixed the bug and only include a default value in the field definition if one can actually be calculated.
object payload;
if (Builder.TryGetExpressionPayload(
prop.DefaultValueGenerator.ToSequence(),
new ExpressionContext(prop),
CommandFormatHints.DefaultValue,
out payload)
)
{
result = payload as string ?? payload.ToString();
return true;
}
A good rule of thumb is that if you find yourself explaining something in detail, it might still be too complicated. In that light, the call to ToSequence()
is a little distracting. [7] It would be nice to be able to map a single expression without having to pack it into a sequence.
So we have one more design decision to make: where do we add that method call? Directly to the interface, right? But the method for a single expression can easily be expressed in terms of the method we already have (as we saw above). It would be a shame if every implementor of the interface was forced to produce this boilerplate.
Since we’re using C#, we can instead extend the interface with a static method, as shown below (again, with more line breaks for this article):
public static bool TryGetExpressionPayload<TCommand>(
[NotNull] this IDataCommandBuilder<TCommand> builder, // Extend the builder
[NotNull] IExpression expression,
[NotNull] IExpressionContext context,
CommandFormatHints hints,
out object payload,
[CanBeNull] TCommand command = default(TCommand)
)
{
return builder.TryGetExpressionPayload(
expression.ToSequence(),
context,
hints,
out payload,
command
);
}
We not only avoided cluttering the interface with another method, but now a caller with a single expression doesn’t have to create a sequence for it [8], as shown in the final version of the call below.
object payload;
if (Builder.TryGetExpressionPayload(
prop.DefaultValueGenerator,
new ExpressionContext(prop),
CommandFormatHints.DefaultValue,
out payload)
)
{
result = payload as string ?? payload.ToString();
return true;
}
We saw in this post how we always have our designer/architect cap on, even when only fixing bugs. We took a look at a quick-fix and then backed out and realized that we were designing a new solution. Then we covered, in nigh-excruciating detail, our thought process as we came up with a new solution.
Many thanks to Dani for the original design and Sebastian for the review!
new [] { expression }, which I think is kind of ugly.
Published by marco on 12. May 2016 22:16:43 (GMT-5)
Updated by marco on 12. May 2016 22:30:34 (GMT-5)
Before taking a look at the roadmap, let’s quickly recap how far we’ve come. An overview of the release schedule shows a steady accretion of features over the years, as driven by customer or project needs.
The list below includes more detail on the releases highlighted in the graphic. [1]
We took 1.5 years to get to v1. The initial major version was to signify the first time that Quino-based code went into external production. [2]
After that, it took 6.5 years to get to v2. Although we added several large products that use Quino, we were always able to extend rather than significantly change anything in the core. The second major version was to signify sweeping changes made to address technical debt, to modernize certain components and to prepare for changes coming to the .NET platform.
It took just 5 months to get to v3 for two reasons:
So that’s where we’ve been. Where are we headed?
As you can see above, Quino is a very mature product that satisfies the needs of a wide array of software on all tiers. What more is there to add?
Quino’s design has always been driven by a combination of customer requirements and what we anticipated would be customer requirements.
We’re currently working on the following features.
Replacing the MetaBuilder in v3: we’re creating a more fluent, modern and extensible API for building metadata. We hope to be able to add these changes incrementally without introducing any breaking changes. [6]
A natural use of the rich metadata in Quino is to generate user interfaces for business entities without having to hand-tool each form. From the POC onward, Quino has included support for generating UIs for .NET Winforms.
Winforms has been replaced on the Windows desktop with WPF and UWP. We’ve gotten quite far with being able to generate WPF applications from Quino metadata. The screenshots below come from a pre-alpha version of the Sandbox application included in the Quino solution.
You may have noticed the lovely style of the new UI. [7] We’re using a VSG designed for us by Ergosign, for whom we’ve done some implementation work in the past.
If you’ve been following Microsoft’s announcements, things are moving quickly in the .NET world. There are whole new platforms available, if you target your software to run on them. We’re investigating the next target platforms for Quino. Currently that means getting the core of Quino—Quino.Meta
and its dependencies—to compile under .NET Core.
As you can see in the screenshot, we’ve got one of the toughest assemblies to compile—Encodo.Core
. After that, we’ll try for running some tests under Linux or OS X. The long-term goal is to be able to run Quino-based application and web servers on non-Windows—and, most importantly, non-IIS—platforms. [8]
These changes will almost certainly cause builds using previous versions to break. Look for any additional platform support in an upcoming major-version release.
For example, we split the Encodo and Quino assemblies into dozens of new, smaller and much more focused assemblies. Reorganizing configuration around the IOC and rewriting application startup for more than just desktop applications was another sweeping change.
Another example is the MetaBuilder, which started off as a helper class for assembling application metadata, but became a monolithic and unavoidable dependency, even in v2. In v3, we made the breaking changes to remove this component from its central role and will continue to replace its functionality with components that are more targeted, flexible and customizable.
Published by marco on 12. May 2016 22:11:29 (GMT-5)
The summary below describes major new features, items of note and breaking changes. The full list of issues is also available for those with access to the Encodo issue tracker.
- IDataSession and IApplication now directly implement the IServiceRequestHandler, and helper methods that used to extend IApplication now extend this interface instead, so calls like GetModel() can now be executed against an IApplication or an IDataSession. Many methods have been moved out of the IServiceRequestHandler interface to extension methods declared in the Encodo.IOC namespace. This move will require applications to update their usings. ReSharper will automatically find the correct namespace and apply it for you.
- ApplicationExtensions.GetInstance() has been replaced with a direct implementation of the IServiceRequestHandler by IApplication.
- MetaBuilder.Include() has been replaced with Dependencies.Include().
- When you use the newer overload of CreateModel(), you can no longer call CreateMainModule() because the main module is set up automatically. Although the call is marked as obsolete, it can only be combined with the older overload of CreateModel(). Using it with the newer overload will cause a runtime error as the main module is added to the model twice.
- The various methods to create paths with the MetaBuilder have been replaced by AddPath(). To rewrite a path, use the following style:
Builder.AddPath(
Elements.Classes.A.FromOne("Id"),
Elements.Classes.B.ToMany("FileId"),
path => path.SetMetaId(new Guid("…")).SetDeleteRule(MetaPathRule.Cascade),
idx => idx.SetMetaId(new Guid("…"))
);
Published by marco on 26. Apr 2016 21:40:40 (GMT-5)
Updated by marco on 27. Apr 2016 07:13:40 (GMT-5)
Encodo published its first C# Handbook on its web site in 2008. At the time, we also published it to several other standard places and got some good, positive feedback. Over the next year, I made some more changes and published new versions. The latest version is 1.5.2 and is available from Encodo’s web site. Since then, though, I’ve made a few extra notes and corrected a few errors, but never published an official version again.
This is not because Encodo hasn’t improved or modernized its coding guidelines, but because of several issues, listed below.
Some of the existing advice is outdated (e.g. the var advice) or just plain wrong (e.g. the var advice).
To address these issues and to accommodate the new requirements, here’s what we’re going to do:
These are the requirements and goals for a new version of the C# handbook.
The immediate next steps are:
I hope to have an initial, modern version ready within the next month or so.
Published by marco on 7. Apr 2016 22:27:10 (GMT-5)
“Unwritten code requires no maintenance and introduces no cognitive load.”
As I was working on another part of Quino the other day, I noticed that the oft-discussed registration and configuration methods [1] were a bit clunkier than I’d have liked. To wit, the methods that I tended to use together for configuration had different return types and didn’t allow me to freely mix calls fluently.
Register and Use
The return type for Register methods is IServiceRegistrationHandler and the return type for Use methods is IApplication (a descendant). The Register* methods come from the IOC interfaces, while the application builds on top of this infrastructure with higher-level Use* configuration methods.
This forces developers to write code in the following way to create and configure an application.
public IApplication CreateApplication()
{
var result =
new Application()
.UseStandard()
.UseOtherComponent();
result.
.RegisterSingle<ICodeHandler, CustomCodeHandler>()
.Register<ICodePacket, FSharpCodePacket>();
return result;
}
That doesn’t look too bad, though, does it? It doesn’t seem like it would cramp anyone’s style too much, right? Aren’t we being a bit nitpicky here?
That’s exactly why Quino 2.0 was released with this API. However, here we are, months later, and I’ve written a lot more configuration code and it’s really starting to chafe that I have to declare a local variable and sort my method invocations.
So I think it’s worth addressing. Anything that disturbs me as the writer of the framework—that gets in my way or makes me write more code than I’d like—is going to disturb the users of the framework as well.
Whether they’re aware of it or not.
In the best of worlds, users will complain about your crappy API and make you change it. In the world we’re in, though, they will cheerfully and unquestioningly copy/paste the hell out of whatever examples of usage they find and cement your crappy API into their products forever.
Do not underestimate how quickly calls to your inconvenient API will proliferate. In my experience, programmers really tend to just add a workaround for whatever annoys them instead of asking you to fix the problem at its root. This is a shame. I’d rather they just complained vociferously that the API is crap rather than using it and making me support it side-by-side with a better version for what usually feels like an eternity.
Maybe it’s because I very often have control over framework code that I will just not deal with bad patterns or repetitive code. Also I’ve become very accustomed to having a wall of tests at my beck and call when I bound off on another initially risky but in-the-end rewarding refactoring.
If you’re not used to this level of control, then you just deal with awkward APIs or you build a workaround as a band-aid for the symptom rather than going after the root cause.
So while the code above doesn’t trigger warning bells for most, once I’d written it a dozen times, my fingers were already itching to add [Obsolete]
on something.
I am well-aware that this is not a simple or cost-free endeavor. However, I happen to know that there aren’t that many users of this API yet, so the damage can be controlled.
If I wait, then replacing this API with something better later will take a bunch of versions, obsolete warnings, documentation and re-training until the old API is finally eradicated. It’s much better to use your own APIs—if you can—before releasing them into the wild.
Another more subtle reason why the API above poses a problem is that it’s more difficult to discover, to learn. The difference in return types will feel arbitrary to product developers. Code-completion is less helpful than it could be.
It would be much nicer if we could offer an API that helped users discover it at their own pace instead of making them step back and learn new concepts. Ideally, developers of Quino-based applications shouldn’t have to know the subtle difference between the IOC and the application.
Something like the example below would be nice.
return
new Application()
.UseStandard()
.RegisterSingle<ICodeHandler, CustomCodeHandler>()
.UseOtherComponent()
.Register<ICodePacket, FSharpCodePacket>();
Right? Not a gigantic change, but if you can imagine how a user would write that code, it’s probably a lot easier and more fluid than writing the first example. In the second example, they would just keep asking code-completion for the next configuration method and it would just be there.
In order to do this, I’d already created an issue in our tracker to parameterize the IServiceRegistrationHandler
type in order to be able to pass back the proper return type from registration methods.
I’ll show below what I mean, but I took a crack at it recently because I’d just watched the very interesting video Fun with Generics by Benjamin Hodgson (Vimeo), which starts off with a technique identical to the one I’d planned to use—and that I’d already used successfully for the IQueryCondition
interface. [2]
Let’s redefine the IServiceRegistrationHandler
interface as shown below,
public interface IServiceRegistrationHandler<TSelf>
{
TSelf Register<TService, TImplementation>()
where TService : class
where TImplementation : class, TService;
// …
}
Can you see how we pass the type we’d like to return as a generic type parameter? Then the descendants would be defined as,
public interface IApplication : IServiceRegistrationHandler<IApplication>
{
}
In the video, Hodgson notes that the technique has a name in formal notation, “F-bounded quantification” but that a snappier name comes from the C++ world, “curiously recurring template pattern”. I’ve often called it a self-referencing generic parameter, which seems to be a popular search term as well.
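To see the shape of the pattern outside of Quino, here is a minimal illustration; the names are invented for the example:
// The interface takes "itself" as a type parameter...
public interface IFluentNode<TSelf> where TSelf : IFluentNode<TSelf>
{
  TSelf WithName(string name);
}

// ...and each descendant closes the parameter over its own type, so a chained
// call keeps returning the most-derived interface.
public interface IMenuNode : IFluentNode<IMenuNode>
{
  IMenuNode WithShortcut(string shortcut);
}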
This is only the first step, though. The remaining work is to update all usages of the formerly non-parameterized interface IServiceRegistrationHandler
. This means that a lot of extension methods like the one below
public static IServiceRegistrationHandler RegisterCoreServices(
[NotNull] this IServiceRegistrationHandler handler)
{
}
will now look like this:
public static TSelf RegisterCoreServices<TSelf>(
[NotNull] this IServiceRegistrationHandler<TSelf> handler)
where TSelf : IServiceRegistrationHandler<TSelf>
{
}
This makes defining such methods more complex (again). [3] In my attempt at implementing this, Visual Studio indicated 170 errors remaining after I’d already updated a couple of extension methods.
Instead of continuing down this path, we might just want to follow the pattern we established in a few other places, by defining both a Register method, which uses the IServiceRegistrationHandler, and a Use method, which uses the IApplication.
Here’s an example of the corresponding “Use” method:
public static IApplication UseCoreServices(
[NotNull] this IApplication application)
{
if (application == null) { throw new ArgumentNullException("application"); }
application
.RegisterCoreServices()
.RegisterSingle(application.GetServices())
.RegisterSingle(application);
return application;
}
Though the technique involves a bit more boilerplate, it’s easy to write and understand (and reason about) these methods. As mentioned in the initial sentence of this article, the cognitive load is lower than the technique with generic parameters.
The only place where it would be nice to have an IApplication
return type is from the Register*
methods defined on the IServiceRegistrationHandler
itself.
We already decided that self-referential generic constraints would be too messy. Instead, we could define some extension methods that return the correct type. We can’t name the method the same as the one that already exists on the interface [4], though, so let’s prepend the word Use
, as shown below:
public static IApplication UseRegister<TService, TImplementation>(
[NotNull] this IApplication application)
where TService : class
where TImplementation : class, TService
{
if (application == null) { throw new ArgumentNullException("application"); }
application.Register<TService, TImplementation>();
return application;
}
That’s actually pretty consistent with the other configuration methods. Let’s take it for a spin and see how it feels. Now that we have an alternative way of registering types fluently without “downgrading” the result type from IApplication
to IServiceRegistrationHandler
, we can rewrite the example from above as:
return
new Application()
.UseStandard()
.UseRegisterSingle<ICodeHandler, CustomCodeHandler>()
.UseOtherComponent()
.UseRegister<ICodePacket, FSharpCodePacket>();
Instead of increasing cognitive load by trying to push the C# type system to places it’s not ready to go (yet), we use tiny methods to tweak the API and make it easier for users of our framework to write code correctly. [5]
If you define an extension method for a descendant type that has the same name as a method of an ancestor interface, the method-resolution algorithm for C# will never use it. Why? Because the directly defined method matches the name and all the types and is a “stronger” match than an extension method.
Perhaps an example is in order:
interface IA
{
IA RegisterSingle<TService, TConcrete>();
}
interface IB : IA { }
static class BExtensions
{
static IB RegisterSingle<TService, TConcrete>(this IB b) { return b; }
static IB UseStuff(this IB b) { return b; }
}
Let’s try to call the method from BExtensions
:
public void Configure(IB b)
{
b.RegisterSingle<IFoo, Foo>().UseStuff();
}
The call to UseStuff
cannot be resolved because the return type of the matched RegisterSingle
method is the IA
of the interface method not the IB
of the extension method. There is a solution, but you’re not going to like it (I know I don’t).
public void Configure(IB b)
{
BExtensions.RegisterSingle<IFoo, Foo>(b).UseStuff();
}
You have to specify the extension-method class’s name explicitly, which engenders awkward fluent chaining—you’ll have to nest these calls if you have more than one—but the desired method-resolution was obtained.
But at what cost? The horror…the horror. (IMDb)
Published by marco on 25. Mar 2016 13:41:54 (GMT-5)
The summary below describes major new features, items of note and breaking changes. The full list of issues is also available for those with access to the Encodo issue tracker.
- DateTimeExtensions.GetDayOfWeek() had a leap-day bug (QNO-5051)
- Changed how the sort order of GenericObjects is calculated, which fixes sorting issues in grids, specifically for non-persisted or transient objects (QNO-5137)
- Improved the IAccessControl API for getting groups and users and testing membership (QNO-5133)
- Add support for query aliases (e.g. for joining the same table multiple times) (QNO-531). This changes the API surface only minimally. Applications can pass an alias when calling the Join method, as shown below,
query.Join(Metadata.Project.Deputy, alias: "deputy")
You can find more examples of aliased queries in the TestAliasedQuery(), TestJoinAliasedTables() and TestJoinChildTwice() methods defined in the QueryTests testing fixture.
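For example, joining the same relation twice might look something like the sketch below; the second alias name is invented for illustration:
// Hypothetical: join the same Deputy relation twice under two different aliases
// so that both joins can be referenced independently later in the query.
query.Join(Metadata.Project.Deputy, alias: "deputy");
query.Join(Metadata.Project.Deputy, alias: "secondDeputy");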
- Added an IQueryAnalyzer for optimizations and in-memory mini-drivers (QNO-4830)
- ISchemaManager has been removed. Instead, you should retrieve the interface you were looking for from the IOC. The possible interfaces you might need are IImportHandler, IMappingBuilder, IPlanBuilder or ISchemaCommandFactory.
- ISchemaManagerSettings.GetAuthorized() has been moved to ISchemaManagerAuthorizer.
- The change to how the sort order of GenericObjects is calculated may have an effect on the way your application sorts objects.
- The IParticipantManager (base interface of IAccessControl) no longer has a single method called GetGroups(IParticipant). This method was previously used to get the groups to which a user belongs and the child groups of a given group. This confusing double duty for the API led to an incorrect implementation for both usages. Instead, there are now two methods:
IEnumerable<IGroup> GetGroups(IUser user): Gets the groups for the given user
IEnumerable<IGroup> GetChildGroups(IGroup group): Gets the child groups for the given group
The old method has been removed from the interface because (A) it never worked correctly anyway and (B) it conflicts with the new API.
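Callers now choose the method that matches their intent; in this sketch, the accessControl, user and group variables are assumed to already exist:
IEnumerable<IGroup> groups = accessControl.GetGroups(user);
IEnumerable<IGroup> childGroups = accessControl.GetChildGroups(group);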
Published by marco on 25. Mar 2016 13:41:30 (GMT-5)
This article was originally published on the Encodo Blogs.
This first-ever Voxxed Zürich was hosted at the cinema in the SihlCity shopping center in Zürich on March 3rd. All presentations were in English. The conference was relatively small—333 participants—and largely vendor-free. The overall technical level of the presentations and participants was quite high. I had a really nice time and enjoyed a lot of the presentations.
There was a nice common thread running through all of the presentations, starting with the Keynote. There’s a focus on performance and reliability through immutability, sequences, events, actors, delayed execution (lambdas, which are relatively new to Java), instances in the cloud, etc. It sounds very BUZZWORDY, but instead it came across as a very technically polished conference that reminded me of how many good developers there are trying to do the right thing. Looking forward to next year; hopefully Encodo can submit a presentation.
You can take a look at the VoxxedDays Zürich – Schedule. The talks that I visited are included below, with links to the presentation page, the video on YouTube and my notes and impressions. YMMV.
Life beyond the Illusion of the Present—Jonas Bonér
Kotlin − Ready for production—Hadi Hariri
Reactive Apps with Akka and AngularJS—Heiko Seeberger
During his talk, he took us through the following stages of building a scalable, resilient actor-based application with Akka.
AKKA Distributed Data
AKKA Cluster Sharding
AKKA Persistence
Akka looks pretty good. It guarantees the ordering because ACTORS. Any given actor only exists on any shard once. If a shard goes down, the actor is recreated on a different shard, and filled with information from the persistent store to “recreate” the state of that actor.
DDD (Domain-Driven Design) and the actor model. Watch Hewitt, Meijer and Szyperski: The Actor Model (everything you wanted to know, but were afraid to ask) (Channel9).
Code is on GitHub: seeberger/reactive_flows
Lambda core − hardcore—Jarek Ratajski
Focus on immutability and no side-effects. Enforced by the lambda calculus. Pretty low-level talk about lambda calculus. Interesting, but not applicable. He admitted as much at the top of the talk.
Links:
expect(“poo”).length.toBe(1)—Philip Hofstetter [1]
This was a talk about expectations of the length of a character. The presenter was very passionate about his talk and went into an incredible amount of detail.
How usability fits in UX − it’s no PICNIC—Myriam Jessier
What should a UI be?
Also nice to have:
Book recommendation: Don’t make me think by Steve Krug
Guidelines:
Guidelines for mobile:
Suggested usability testing tools:
React − A trip to Russia isn’t all it seems—Josh Sephton [3]
This talk was about Web UI frameworks and how his team settled on React.
The reactor programming model for composable distributed computing—Aleksandar Prokopec [4]
Published by marco on 25. Mar 2016 13:39:04 (GMT-5)
At the beginning of the year, we worked on an interesting project that dipped into IOT (Internet of Things). The project was to create use cases for Crealogix’s banking APIs in the real world. Concretely, we wanted to show how a customer could use these APIs in their own workflows. The use cases were to provide proof of the promise of flexibility and integrability offered by well-designed APIs.
Watch 7–minute video of the presentation
The first use case is for the treasurer of a local football club. The treasurer wants to be notified whenever an annual club fee is transferred from a member. The club currently uses a Google Spreadsheet to track everything, but it’s updated manually. It would be really nice if the banking API could be connected—via some scripting “glue”—to update the spreadsheet directly, without user intervention. The treasurer would just see the most current numbers whenever he opened the spreadsheet.
The spreadsheet is in addition to the up-to-date view of payments in the banking app. The information is also available there, but not necessarily in the form that he or she would like. Linking automatically to the spreadsheet is the added value.
Imagine a family with a young son who wants to buy a drone. He would have to earn it by doing chores. Instead of tracking this manually, the boy’s chores would be tabulated automatically, moving money from the parents’ account to his own as he did chores. Additionally, a lamp in the boy’s room would glow a color indicating how close he was to his goal. The parents wanted to track the boy’s progress in a spreadsheet, tracking the transfers as they would have had they not had any APIs.
The idea is to provide added value to the boy, who can record his chores by pressing a button and see his progress by looking at a lamp’s color. The parents get to stay in their comfort zone, working with a spreadsheet as usual, but having the data automatically entered in the spreadsheet.
It’s a bit of a stretch, but it sufficed to ground the relatively abstract concept of banking APIs in an example that non-technical people could follow.
So we needed to pull quite a few things together to implement these scenarios.
Either of these—just judging from their websites—would be sufficient to utterly and completely change our lives. The Hue looked like it was going to turn us into musicians, so we went with Lifx, which only threatened to give us horn-rimmed glasses and a beard (and probably skinny jeans and Chuck Taylor knockoffs).
Yeah, we think the marketing for what is, essentially, a light-bulb, is just a touch overblown. Still, you can change the color of the light bulb with a SmartPhone app, or control it via API (which is what we wanted to do).
The button sounds simple. You’d think that, in 2016, these things would be as ubiquitous as AOL CDs were in the 1990s. You’d be wrong.
There’s a KickStarter project called Flic that purports to have buttons that send signals over a wireless connection. They cost about CHF20. Though we ordered some, we never saw any because of manufacturing problems. If you thought the hype and marketing for a light bulb were overblown, then you’re sure to enjoy how Flic presents a button.
We quickly moved along a parallel track to get buttons that can be pressed in real life rather than just viewed from several different angles and in several different colors online.
Amazon has what they have called “Dash” buttons that customers can press to add predefined orders to their one-click shopping lists. The buttons are bound to certain household products that you tend to purchase cyclically: toilet paper, baby wipes, etc.
They sell them dirt-cheap—$5—but only to Amazon Prime customers—and only to customers in the U.S. Luckily, we knew someone in the States willing to let us use his Amazon Prime account to deliver them, naturally only to a domestic address, from which they would have to be forwarded to us here in Switzerland.
That we couldn’t use them to order toilet paper in the States didn’t bother us—we were planning to hack them anyway.
These buttons showed up after a long journey and we started trapping them in our own mini-network so that we could capture the signal they send and interpret it as a trigger. This was not ground-breaking stuff, but we really wanted the demonstrator to be able to press a physical button on stage to trigger the API that would cascade other APIs and so on.
Of course we could have just hacked the whole thing so that someone presses a button on a screen somewhere—and we programmed this as a backup plan—but the physicality of pressing a button was the part of the demonstration that was intended to ground the whole idea for non-technical users. [1]
If you’re going to use an API to modify a spreadsheet, then that spreadsheet has to be available online somewhere. The spreadsheet application in Google Docs is a good candidate.
The API allows you to add or modify existing data, but that’s pretty much it. When you make changes, they show up immediately, with no ceremony. That, unfortunately, doesn’t make for a very nice-looking demo.
Google Docs also offers a JavaScript-like scripting language that lets you do more. We wanted to not only insert rows, we wanted charts to automatically update and move down the page to accommodate the new row. All animated, thank you very much.
This took a couple pages of scripting and a good amount of time. It’s also no longer a solution that an everyday user is likely to make themselves. And, even though we pushed as hard as we could, we also didn’t get everything we wanted. The animation is very jerky (watch the video linked above) but gets the job done.
So we’ve got a bunch of pieces that are all capable of communicating in very similar ways. The final step is to glue everything together with a bit of script. There are several services available online, like IFTTT—If This Then That—that allow you to code simple logic to connect signals to actions.
In our system, we had the following signals:
and the following actions:
So we’re going to betray a tiny secret here. Although the product demonstrated on-stage did actually do what it said, it didn’t do it using the Crealogix API to actually transfer money. That’s the part that we were actually selling and it’s the part we ended up faking/mocking out because the actual transfer is beside the point. Setting up bank accounts is not so easy, and the banks take umbrage at creating them for fake purposes.
Crealogix could have let us use fake testing accounts, but even that would have been more work than it was worth: if we’re already faking, why not just fake in the easiest way possible by skipping the API call to Crealogix and only updating the spreadsheet?
Likewise, the entire UI that we included in the product was mocked up to include only the functionality required by the demonstration. You can see an example here—of the login screen—but other screens are linked throughout this article. Likewise, the Bank2Things screen shown above and to the left is a mockup.
So what did Encodo actually contribute?
As last year—when we helped Crealogix create the prototype for their BankClip for Finovate 2015—we had a lot of fun investigating all of these cutting-edge technologies and putting together a custom solution in time for Finovate 2016.
Published by marco on 27. Feb 2016 12:36:39 (GMT-5)
Updated by marco on 27. Feb 2016 12:52:18 (GMT-5)
In several articles last year [1], I went into a lot of detail about the configuration and startup for Quino applications. Those posts discuss a lot about what led to the architecture Quino has for loading up an application.
Some of you might be wondering: what if I want to start up and run an application that doesn’t use Quino? Can I build applications that don’t use any fancy metadata because they’re super-simple and don’t even need to store any data? Those are the kind of utility applications I make all the time; do you have anything for me, you continue to wonder?
As you probably suspected from the leading question: You’re in luck. Any functionality that doesn’t need metadata is available to you without using any of Quino. We call this the “Encodo” libraries, which are the base on which Quino is built. Thanks to the fundamental changes made in Quino 2, you have a wealth of functionality available in just the granularity you’re looking for.
Instead of writing such small applications from scratch—and we know we could write them—why would we want to leverage existing code? What are the advantages of doing this?
What are potential disadvantages?
A developer unfamiliar with a library—or one who is too impatient to read up on it—will feel these disadvantages more acutely and earlier.
Let’s take a look at some examples below to see how the Encodo/Quino libraries stack up. Are we able to profit from the advantages without suffering from the disadvantages?
We’re going to take a look at two simple applications:
The actual service-registration part is boilerplate generated by Microsoft Visual Studio [2], but we’d like to replace the hard-coded strings with customized data obtained from a configuration file. So how do we get that data?
That doesn’t sound that hard, right? I’m sure you could just whip something together with an XMLDocument
and some hard-coded paths and filenames that would do the trick. [3] It might even work on the first try, too. But do you really want to bother with all of that? Wouldn’t you rather just get the scaffolding for free and focus on the part where you load your settings?
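For the record, the quick-and-dirty approach might look something like the sketch below; the path is invented, and the element names mirror the configuration file listed in the footnotes.
// The hand-rolled version, for contrast.
var doc = new System.Xml.XmlDocument();
doc.Load(@"C:\MyService\service-settings.xml");
var name = doc.SelectSingleNode("/config/service/name")?.InnerText;
var displayName = doc.SelectSingleNode("/config/service/displayName")?.InnerText;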
The following listing shows the main application method, using the Encodo/Quino framework libraries to do the heavy lifting.
[NotNull]
public static ServiceSettings LoadServiceSettings()
{
ServiceSettings result = null;
var transcript = new ApplicationManager().Run(
CreateServiceConfigurationApplication,
app => result = app.GetInstance<ServiceSettings>()
);
if (transcript.ExitCode != ExitCodes.Ok)
{
throw new InvalidOperationException(
"Could not read the service settings from the configuration file." +
new SimpleMessageFormatter().GetErrorDetails(transcript.Messages)
);
}
return result;
}
If you’ve been following along in the other articles (see first footnote below), then this structure should be very familiar. We use an ApplicationManager()
to execute the application logic, creating the application with CreateServiceConfigurationApplication
and returning the settings configured by the application in the second parameter (the “run” action). If anything went wrong, we get the details and throw an exception.
You can’t see it, but the library provides debug/file logging (if you enable it), debug/release mode support (exception-handling, etc.) and everything is customizable/replaceable by registering with an IOC.
Soooo…I can see where we’re returning the ServiceSettings
, but where are they configured? Let’s take a look at the second method, the one that creates the application.
private static IApplication CreateServiceConfigurationApplication()
{
var application = new Application();
application
.UseSimpleInjector()
.UseStandard()
.UseConfigurationFile("service-settings.xml")
.Configure<ServiceSettings>(
"service",
(settings, node) =>
{
settings.ServiceName = node.GetValue("name", settings.ServiceName);
settings.DisplayName = node.GetValue("displayName", settings.DisplayName);
settings.Description = node.GetValue("description", settings.Description);
settings.Types = node.GetValue("types", settings.Types);
}
).RegisterSingle<ServiceSettings>();
return application;
}
- Application, defined in the Encodo.Application assembly. What does this class do? It does very little other than manage the main IOC (see articles linked in the first footnote for details).
- UseSimpleInjector(). Quino includes support for the SimpleInjector IOC out of the box. As you can see, you must include this support explicitly, so you’re also free to assign your own IOC (e.g. one using Microsoft’s Unity). SimpleInjector is very lightweight and super-fast, so there’s no downside to using it.
- UseStandard(), defined in the Encodo.Application.Standard assembly. Since I know that UseStandard() pulls in what I’m likely to need, I’ll just use that. [4]
- The ServiceSettings object that we want to return. For that, there’s a Configure method that returns an object from the IOC along with a specific node from the configuration data. This method is called only if everything started up OK.
- RegisterSingle makes sure that the ServiceSettings object created by the IOC is a singleton (it would be silly to configure one instance and return another, unconfigured one).
Basically, because this application is so simple, it has already accomplished its goal by the time the standard startup completes. At the point that we would “run” this application, the ServiceSettings object is already configured and ready for use. That’s why, in LoadServiceSettings(), we can just get the settings from the application with GetInstance() and exit immediately.
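Tying it back to the service-installer boilerplate mentioned earlier, the caller might then look something like this sketch (the installer properties come from the listing in the footnotes):
var settings = LoadServiceSettings();

var fileService = new ServiceInstaller();
fileService.StartType = ServiceStartMode.Automatic;
fileService.ServiceName = settings.ServiceName;
fileService.DisplayName = settings.DisplayName;
fileService.Description = settings.Description;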
The code generator has a bit more code, but follows the same pattern as the simple application above. In this case, we use the command line rather than the configuration file to get user input.
The main method defers all functionality to the ApplicationManager
, passing along two methods, one to create the application, the other to run it.
internal static void Main()
{
new ApplicationManager().Run(CreateApplication, GenerateCode);
}
As before, we first create an Application
, then choose the SimpleInjector and some standard configuration and registrations with UseStandard()
, UseMetaStandardServices()
and UseMetaTools()
. [6]
We set the application title to “Quino Code Generator” and then include objects with UseSingle()
that will be configured from the command line and used later in the application. [7] And, finally, we add our own ICommandSet
to the command-line processor that will configure the input and output settings. We’ll take a look at that part next.
private static IApplication CreateApplication(
IApplicationCreationSettings applicationCreationSettings)
{
var application = new Application();
return
application
.UseSimpleInjector()
.UseStandard()
.UseMetaStandardServices()
.UseMetaTools()
.UseTitle("Quino Code Generator")
.UseSingle(new CodeGeneratorInputSettings())
.UseSingle(new CodeGeneratorOutputSettings())
.UseUnattendedCommand()
.UseCommandSet(CreateGenerateCodeCommandSet(application))
.UseConsole();
}
The final bit of the application configuration is to see how to add items to the command-line processor.
Basically, each command set consists of required values, optional values and zero or more switches that are considered part of a set.
The one for i simply sets the value of inputSettings.AssemblyFilename
to whatever was passed on the command line after that parameter. Note that it pulls the inputSettings
from the application to make sure that it sets the values on the same singleton reference as will be used in the rest of the application.
The code below shows only one of the code-generator–specific command-line options. [8]
private static ICommandSet CreateGenerateCodeCommandSet(
IApplication application)
{
var inputSettings = application.GetSingle<CodeGeneratorInputSettings>();
var outputSettings = application.GetSingle<CodeGeneratorOutputSettings>();
return new CommandSet("Generate Code")
{
Required =
{
new OptionCommandDefinition<string>
{
ShortName = "i",
LongName = "in",
Description = Resources.Program_ParseCommandLineArgumentIn,
Action = value => inputSettings.AssemblyFilename = value
},
// And others…
},
};
}
Finally, let’s take a look at the main program execution for the code generator. It shouldn’t surprise you too much to see that the logic consists mostly of getting objects from the IOC and telling them to do stuff with each other. [9]
I’ve highlighted the code-generator–specific objects in the code below. All other objects are standard library tools and interfaces.
private static void GenerateCode(IApplication application)
{
var logger = application.GetLogger();
var inputSettings = application.GetInstance<CodeGeneratorInputSettings>();
if (!inputSettings.TypeNames.Any())
{
logger.Log(Levels.Warning, "No types to generate.");
}
else
{
var modelLoader = application.GetInstance<IMetaModelLoader>();
var metaCodeGenerator = application.GetInstance<IMetaCodeGenerator>();
var outputSettings = application.GetInstance<CodeGeneratorOutputSettings>();
var modelAssembly = AssemblyTools.LoadAssembly(
inputSettings.AssemblyFilename, logger
);
outputSettings.AssemblyDetails = modelAssembly.GetDetails();
foreach (var typeName in inputSettings.TypeNames)
{
metaCodeGenerator.GenerateCode(
modelLoader.LoadModel(modelAssembly, typeName),
outputSettings,
logger
);
}
}
}
So that’s basically it: no matter how simple or complex your application, you configure it by indicating what stuff you want to use, then use all of that stuff once the application has successfully started. The Encodo/Quino framework provides a large amount of standard functionality. It’s yours to use as you like and you don’t have to worry about building it yourself. Even your tiniest application can benefit from sophisticated error-handling, command-line support, configuration and logging without lifting a finger.
That boilerplate looks like this:
var fileService = new ServiceInstaller();
fileService.StartType = ServiceStartMode.Automatic;
fileService.DisplayName = "Quino Sandbox";
fileService.Description = "Demonstrates a Quino-based service.";
fileService.ServiceName = "Sandbox.Services";
See the ServiceInstaller.cs
file in the Sandbox.Server
project in Quino 2.1.2 and higher for the full listing.
The standard implementation of Quino’s ITextKeyValueNodeReader supports XML, but it would be trivial to create and register a version that supports JSON (QNO-4993) or YAML. The configuration file for the utility looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<config>
<service>
<name>Quino.Services</name>
<displayName>Quino Utility</displayName>
<description>The application to run all Quino backend services.</description>
<types>All</types>
</service>
</config>
If you look at the implementation of the UseStandard
method [10], it pulls in a lot of stuff, like support for BCrypt, enhanced CSV and enum-value parsing and standard configuration for various components (e.g. the file log and command line). It’s called “Standard” because it’s the stuff we tend to use in a lot of applications.
But that method is just a composition of over a dozen other methods. If, for whatever reason (perhaps dependencies), you don’t want all of that functionality, you can just call the subset of methods that you do want. For example, you could call UseApplication()
from the Encodo.Application
assembly instead. That method includes only the support for:
- ICommandSetManager
- ILocationManager
- IConfigurationDataLoader
- IExternalLoggerFactory
- IApplicationManager
If you want to go even lower than that, you can try UseCore()
, defined in the Encodo.Core
assembly and then pick and choose the individual components yourself. Methods like UseApplication()
and UseStandard()
are tried and tested defaults, but you’re free to configure your application however you want, pulling from the rich trove of features that Quino offers.
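For example, an application could compose its own default out of the smaller methods, as sketched below; the method name and the particular combination are invented for illustration:
// A sketch of a custom composition; UseMyDefaults is an invented name, and the
// combination of calls is illustrative rather than a recommended set.
public static IApplication UseMyDefaults(this IApplication application)
{
  return application
    .UseSimpleInjector()
    .UseApplication();
}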
By default, the application will look for this file next to the executable. You can configure this as well, by getting the location manager with GetLocationManager()
and setting values on it.
You’ll notice that I didn’t use Configure<ILocationManager>()
for this particular usage. That’s ordinarily the way to go if you want to make changes to a singleton before it is used. However, if you want to change where the application looks for configuration files, then you have to change the location manager before it’s used and before any other configuration takes place. It’s a special object that is available before the IOC has been fully configured. To reiterate from other articles (because it’s important), the order of operations we’re interested in here is:
- constructing the application (calling Use*() to build the application)
- loading the configuration data from the location named by LocationNames.Configuration
- applying the Configure() callbacks
If you want to change the configuration-file location, then you have to get in there before the startup starts running—and that’s basically during application construction. Alternatively, you could also call UseConfigurationDataLoader()
to register your own object to actually load configuration data and do whatever the heck you like in there, including returning constant data. :-)
The metadata analog of UseStandard() is UseMetaStandard(), but we don’t call that. Instead, we call UseMetaStandardServices(). Why? The answer is that we want the code generator to be able to use some objects defined in Quino, but the code generator itself isn’t a metadata-based application. We want to include the IOC registrations required by metadata-based applications without adding any of the startup or shutdown actions. Many of the standard Use*() methods included in the base libraries have analogs like this. The Use*Services() analogs are also very useful in automated tests, where you want to be able to create objects but don’t want to add anything to the startup.
Why not just use RegisterSingle()? For almost any object, we could totally do that. But objects used during the first stage of application startup—before the IOC is available—must go in the other IOC, accessed with SetSingle() and GetSingle().
See Program.cs in the Quino.CodeGenerator project in any 2.x version of Quino.
This code uses GetInstance() instead of GetSingle() because the IOC is now available and all singletons are mirrored from the startup IOC to the main IOC. In fact, once the application is started, it’s recommended to use GetInstance() everywhere, for consistency and to prevent the subtle distinction between IOCs—present only in the first stage of startup—from bleeding into your main program logic.
You can use a decompiler like DotPeek on UseStandard() to decompile the method. In the latest DotPeek, the extension methods are even displayed much more nicely in decompiled form.
Published by marco on 17. Jan 2016 22:27:10 (GMT-5)
Updated by marco on 18. Jan 2016 07:12:55 (GMT-5)
The article Learn you Func Prog on five minute quick! by Verity Stob (The Register) provides a typically twisted and unhelpful overview of the state of functional programming in this 21st-century renaissance—heralded decades ago by Lisp programmers. It includes an honest overview of the major players, including Scala, for which the “pro” and “con” are the same (a “[c]lose relationship with Java […]”) and ending with JavaScript, for which the “pro” is “It’s what you’ll end up using.”
The discussion continues with rules: variable immutability, function purity, curryability and monadicity, which is where things really go off the rails. Property 7 dribbles to a shuddering halt with,
“All monads define a unit() function called of(), a bind() function called map() and a type constructor function called…
“Wait a minute. Wait a minute. Perhaps bind() is a functor not a function. I’m pretty sure about that. Hold on to the horses a moment there while I look it up.
“…And I should perhaps clarify that this bind() and map() is nothing to do with any other bind() or map() methods or functions that you might be familiar with, although their actions are in some sense quite similar.
“Summary: It has been an honour and a pleasure to clear all that up for you.
“Final Reader’s comment: My gratitude is inexpressible. [1]”
Which is not to say that I don’t enjoy immensely the functional aspects of C#. I do. I also have read a lot about monads and am completely familiar with the tragically bad and unenlightening explanations. Stob captures this elegantly with the following corollary to Rule 4:
“If you should by some accident come to understand what a Monad is, you will simultaneously lose the ability to explain it to anybody else.”
In this second part, we’ll... [More]
Published by marco on 16. Jan 2016 12:53:04 (GMT-5)
In part I of this series, we discussed some core concepts of profiling. In that article, we discussed not only the problem at hand, but also how to think about fixing performance problems and reducing the likelihood that they get out of hand in the first place.
In this second part, we’ll go into detail and try to fix the problem.
Since we have new requirements for an existing component, it’s time to reconsider the requirements for all stakeholders. In terms of requirements, the IScope
can be described as follows:
There is more detail, but that should give you enough information to understand the code examples that follow.
There are many ways of implementing the functional requirements listed above. While you can implement the feature from the requirements alone, it’s very helpful to know usage patterns when trying to optimize code.
Therefore, we’d like to know exactly what kind of contract our code has to implement—and to not implement any more than was promised.
Sometimes a hopeless optimization task gets a lot easier when you realize that you only have to optimize for a very specific situation. In that case, you can leave the majority of the code alone and optimize a single path through the code to speed up 95% of the calls. All other calls, while perhaps a bit slow, will at least still yield the correct results.
And “optimized” doesn’t necessarily mean that you have to throw all of your language’s higher-level constructs out the window. Once your profiling tool tells you that a particular bit of code has introduced a bottleneck, it often suffices to just examine that particular bit of code more closely. Just picking the low-hanging fruit will usually be more than enough to fix the bottleneck. [1]
I saw in the profiler that creating the ExpressionContext
had gotten considerably slower. Here’s the code in the constructor.
foreach (var value in values.Where(v => v != null))
{
Add(value);
}
I saw a few potential problems immediately:
- Add() had gotten more expensive in order to return the most appropriate object from the GetInstances() method
- AddRange()
The faster version is below:
var scope = CurrentScope;
for (var i = 0; i < values.Length; i++)
{
var value = values[i];
if (value != null)
{
scope.AddUnnamed(value);
}
}
Why is this version faster? The code now uses the fact that we know we’re dealing with an indexable list to avoid allocating an enumerator and to use non-allocating means of checking null. While the Linq code is highly optimized, a for
loop is always going to be faster because it’s guaranteed not to allocate anything. Furthermore, we now call AddUnnamed()
to use the faster registration method because the more involved method is never needed for these objects.
The optimized version is less elegant and harder to read, but it’s not terrible. Still, you should use these techniques only if you can prove that they’re worth it.
CurrentScope
Another minor improvement is that the call to retrieve the scope is made only once regardless of how many objects are added. On the one hand, we might expect only a minor improvement since we noted above that most use cases only ever add one object anyway. On the other, however, we know that we call the constructor 20 million times in at least one test, so it’s worth examining.
The call to CurrentScope
gets the last element of the list of scopes. Even something as innocuous as calling the Linq extension method Last()
can get more costly than it needs to be when your application calls it millions of times. Of course, Microsoft has decorated its Linq calls with all sorts of compiler hints for inlining and, of course, if you decompile, you can see that the method itself is implemented to check whether the target of the call is a list and use indexing, but it’s still slower. There is still an extra stack frame (unless inlined) and there is still a type-check with as
.
Replacing a call to Last()
with getting the item at the index of the last position in the list is not recommended in the general case. However, making that change in a provably performance-critical area shaved a percent or two off a test run that takes about 45 minutes. That’s not nothing.
protected IScope CurrentScope
{
get { return _scopes.Last(); }
}
protected IScope CurrentScope
{
get { return _scopes[_scopes.Count - 1]; }
}
That takes care of the creation & registration side, where I noticed a slowdown when creating the millions of ExpressionContext
objects needed by the data driver in our product’s test suite.
Let’s now look at the evaluation side, where objects are requested from the context.
The offending, slow code is below:
public IEnumerable<TService> GetInstances<TService>()
{
var serviceType = typeof(TService);
var rawNameMatch = this[serviceType.FullName];
var memberMatches = All.OfType<TService>();
var namedMemberMatches = NamedMembers.Select(
item => item.Value
).OfType<TService>();
if (rawNameMatch != null)
{
var nameMatch = (TService)rawNameMatch;
return
nameMatch
.ToSequence()
.Union(namedMemberMatches)
.Union(memberMatches)
.Distinct(ReferenceEqualityComparer<TService>.Default);
}
return namedMemberMatches.Union(memberMatches);
}
As you can readily see, this code isn’t particularly concerned about performance. It is, however, relatively easy to read, and it’s easy to figure out the logic behind returning objects. As long as no-one really needs this code to be fast—if it’s not used that often and not used in tight loops—it doesn’t matter. What matters more is legibility and maintainability.
But we now know that we need to make it faster, so let’s focus on the most-likely use cases. I know the following things:
- Scope instances are created with a single object in them and no other objects are ever added.
- Callers generally ask for a single object (FirstOrDefault()).
These extra bits of information will allow me to optimize the already-correct implementation to be much, much faster for the calls that we’re likely to make.
The optimized version is below:
public IEnumerable<TService> GetInstances<TService>()
{
var members = _members;
if (members == null)
{
yield break;
}
if (members.Count == 1)
{
if (members[0] is TService)
{
yield return (TService)members[0];
}
yield break;
}
object exactTypeMatch;
if (TypedMembers.TryGetValue(typeof(TService), out exactTypeMatch))
{
yield return (TService)exactTypeMatch;
}
foreach (var member in members.OfType<TService>())
{
if (!ReferenceEquals(member, exactTypeMatch))
{
yield return member;
}
}
}
Given the requirements, the handful of use cases and decent naming, you should be able to follow what’s going on above. The code contains many more escape clauses for common and easily handled conditions, handling them in an allocation-free manner wherever possible.
You’ll notice that returning a value added by-name is not a requirement and has been dropped. Improving performance by removing code for unneeded requirements is a perfectly legitimate solution.
And, finally, how did we do? I created tests for the following use cases:
Here are the numbers from the automated tests.
This looks amazing but remember: while the optimized solution may be faster than the original, all we really know is that we’ve just managed to claw our way back from the atrocious performance characteristics introduced by a recent change. We expect to see vast improvements versus a really slow version.
Since I know that these calls showed up as hotspots and were made millions of times in the test, the performance improvement shown by these tests is enough for me to deploy a pre-release of Quino via TeamCity, upgrade my product to that version and run the tests again. Wish me luck! [4]
The All members contained a hidden call to the Linq call Reverse(), which slowed things down even more! I removed the call to reverse all elements because (A) I don’t actually have any tests for the LIFO requirement nor (B) do I have any other code that expects it to happen. I wasn’t about to make the code even more complicated and possibly slower just to satisfy a purely theoretical requirement. That’s the kind of behavior that got me into this predicament in the first place.
Published by marco on 13. Jan 2016 07:05:23 (GMT-5)
An oft-quoted bit of software-development sagacity is
“Premature optimization is the root of all evil.”
As is so often the case with quotes—especially those on the Internet [1]—this one has a slightly different meaning in context. The snippet above invites developers to overlook the word “premature” and interpret the received wisdom as “you don’t ever need to optimize.”
Instead, Knuth’s full quote actually tells you how much of your code is likely to be affected by performance issues that matter (highlighted below).
“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”
In other articles, I’d mentioned that we’d upgraded several solutions to Quino 2 in order to test that the API was solid enough for a more general release. One of these products is both quite large and has a test suite of almost 1500 tests. The product involves a lot of data-import and manipulation and the tests include several scenarios where Quino is used very intensively to load, process and save data.
These tests used to run in a certain amount of time, but started taking about 25% longer after the upgrade to Quino 2.
Before doing anything else—making educated guesses as to what the problem could be, for example—we measure. At Encodo, we use JetBrains DotTrace to collect performance profiles.
There is no hidden secret: the standard procedure is to take a measurement before and after the change and to compare them. However, so much had changed from Quino 1.13 to Quino 2—e.g. namespaces and type names had changed—that while DotTrace was able to show some matches, the comparisons were not as useful as usual.
A comparison between codebases that hadn’t changed so much is much easier, but I didn’t have that luxury.
Even excluding the less-than-optimal comparison, it was an odd profile. Ordinarily, one or two issues stick out right away, but the slowness seemed to suffuse the entire test run. Since the direct profiling comparison was difficult, I downloaded test-speed measurements as CSV from TeamCity for the product where we noticed the issue.
How much slower, you might ask? The test that I looked at most closely took almost 4 minutes (236,187ms) in the stable version, but took 5:41 in the latest build.
This test was definitely one of the largest and longest tests, so it was particularly impacted. Most other tests that imported and manipulated data ranged anywhere from 10% to 30% slower.
When I looked for hot-spots, the profile unsurprisingly showed me that database access took up the most time. The issue was more subtle: while database-access still used the most time, it was using a smaller percentage of the total time. Hot-spot analysis wasn’t going to help this time. Sorting by absolute times and using call counts in the tracing profiles yielded better clues.
The tests were slower when saving and also when loading data. But I knew that the ORM code itself had barely changed at all. And, since the product was using Quino so heavily, the stack traces ran quite deep. After a lot of digging, I noticed that creating the ExpressionContext
to hold an object while evaluating expressions locally seemed to be taking longer than before. This was my first, real clue.
Once I was on the trail, I found that when evaluating calls (getting objects) that used local evaluation, it was also always slower.
Once you start looking for places where performance is not optimal, you’re likely to start seeing them everywhere. However, as noted above, 97% of them are harmless.
To be clear, we’re not optimizing because we feel that the framework is too slow but because we’ve determined that the framework is now slower than it used to be and we don’t know why.
Even after we’ve finished restoring the previous performance (or maybe even making it a little better), we might still be able to easily optimize further, based on other information that we gleaned during our investigation.
But we want to make sure that we don’t get distracted and start trying to FIX ALL THE THINGS instead of just focusing on one task at a time. While it’s somewhat disturbing that we seem to be creating 20 million ExpressionContext
objects in a 4-minute test, that is also how we’ve always done it, and no-one has complained about the speed up until now.
Sure, if we could reduce that number to only 2 million, we might be even faster [3], but the point is that we used to be faster on the exact same number of calls—so fix that first.
I found a likely candidate in the Scope
class, which implements the IScope
interface. This type is used throughout Quino, but the two use-cases that affect performance are:
- The ExpressionContext, which holds the named values and objects to be used when evaluating the value of an IExpression. These expressions are used everywhere in the data driver.
The former usage has existed unchanged for years; its implementation is unlikely to be the cause of the slowdown. The latter usage is new and I recall having made a change to the semantics of which objects are returned by the Scope
in order to make it work there as well.
You may already be thinking: smooth move, moron. You changed the behavior of a class that is used everywhere for a tacked-on use case. That’s definitely a valid accusation to make.
In my defense, my instinct is to reuse code wherever possible. If I already have a class that holds a list of objects and gives me back the object that matches a requested type, then I will use that. If I discover that the object that I get back isn’t as predictable as I’d like, then I improve the predictability of the API until I’ve got what I want. If the improvement comes at no extra cost, then it’s a win-win situation. However, this time I paid for the extra functionality with degraded performance.
Where I really went wrong was that I’d made two assumptions:
I think a few words on process here are important. Can we improve the development process so that this doesn’t happen again? One obvious answer would be to avoid changing a type shared by different systems without considering all stakeholder requirements. That’s a pretty tall order, though. Including this in the process will most likely lead to less refactoring and improvement out of fear of breaking something.
We discussed above how completely reasonable assumptions and design decisions led to the performance degradation. So we can’t be sure it won’t happen again. What we would like, though, is to be notified quickly when there is performance degradation, so that it appears as a test failure.
Our requirements are captured by tests. If all of the tests pass, then the requirements are satisfied. Performance is a non-functional requirement. Where we could improve Quino is to include high-level performance tests that would sound the alarm the next time something like this happens. [5]
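A sketch of what such a high-level performance test might look like with NUnit (the fixture, the loop count and the threshold are purely illustrative and not actual Quino tests; a real guard would be calibrated against measured baselines):
using System.Diagnostics;
using NUnit.Framework;

[TestFixture]
public class PerformanceGuards
{
    [Test]
    public void CreatingManyContextsStaysWithinBudget()
    {
        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < 1000000; i++)
        {
            // Replace with the operation to guard, e.g. creating an
            // ExpressionContext for a sample object.
            var unused = new object();
        }
        stopwatch.Stop();

        // The limit must be generous enough to avoid false alarms on
        // slower build agents, but tight enough to catch a large regression.
        Assert.That(stopwatch.ElapsedMilliseconds, Is.LessThan(2000));
    }
}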
Enough theory: in part II, we’ll describe the problem in detail and take a crack at improving the speed. See you there.
Published by marco on 1. Jan 2016 22:52:49 (GMT-5)
The summary below describes major new features, items of note and breaking changes. The full list of issues is also available for those with access to the Encodo issue tracker.
Quino 2 is finally ready and will go out the door with a 2.1 rather than a 2.0 version number. The reason being that we released 2.0 internally and tested the hell out of it. 2.1 is the result of that testing. It includes a lot of bug fixes as well as API tweaks to make things easier for developers.
On top of that, I’ve gone through the backlog and found many issues that had either been fixed already, were obsolete or had been inadequately specified. The Quino backlog dropped from 682 to 542 issues.
- Quino.Web.Glimpse package to use the support we do have (QNO-4560)
- HtmlHelpers and other client-side rendering (QNO-3921, QNO-3995, QNO-3804, QNO-3797, QNO-3974, QNO-4001, QNO-3992, QNO-3991, QNO-3973, QNO-3970, QNO-3969, QNO-3918, QNO-3866, QNO-3865, QNO-3857, QNO-3849, QNO-3848, QNO-3842, QNO-3839, QNO-3837, QNO-3836, QNO-3834, QNO-3833, QNO-3831, QNO-3824 w/sub-tasks, QNO-3806, QNO-3805, QNO-3802, QNO-2288)
The following changes are marked with Obsolete attributes, so you’ll get a hint as to how to fix the problem. Since these are changes from an unreleased version of Quino, they cause a compile error.
- UseMetaSchemaWinformDxFeedback() has been renamed to UseMetaschemaWinformDx()
- UseSchemaMigrationSupport() has been renamed to UseIntegratedSchemaMigration()
- MetaHttpApplicationBase.MetaApplication has been renamed to BaseApplication
- The IServer.Run() extension method is no longer supported.
- GetStandardFilters, GetStandardFiltersForFormsAuthentication() and GetStandardFiltersForUnrestrictedAuthentication are no longer supported. Instead, you should register filters in the IOC and use the IWebFilterAttributeFactory.CreateFilters() to get the list of supported filters
- ToolRequirementAttribute is no longer supported or used.
- AssemblyExtensions.GetLoadableTypesWithInterface() is no longer supported
- AssemblyTools.GetValidAssembly() has been replaced with AssemblyTools.GetApplicationAssembly(); GetExecutableName() and GetExecutablePath() have been removed.
- Constants on MetaBuilderBase (e.g. EndOfTimeExpression) are obsolete. Use MetaBuilderBase.ExpressionFactory.Constants.EndOfTime instead.
- MetaObjectDescriptionExtensions are obsolete; instead, use the IMetaObjectFormatterSettings from the IOC to change settings on startup.
- GetShortDescription() has been moved to the IMetaObjectFormatter. Obtain an instance from the IOC, as usual.
Published by marco on 28. Dec 2015 10:40:24 (GMT-5)
The summary below describes major new features, items of note and breaking changes. The full list of issues is also available for those with access to the Encodo issue tracker.
In the beta1 and beta2 release notes, we read about changes to configuration, dependency reduction, the data driver architecture, DDL commands, security and access control in web applications and a new code-generation format.
In 2.0 final—which was actually released internally on November 13th, 2015 (a Friday)—we made the following additional improvements:
These notes are being published for completeness and documentation. The first publicly available release of Quino 2.x will be 2.1 or higher (release notes coming soon).
As we’ve mentioned before, this release is absolutely merciless in regard to backwards compatibility. Old code is not retained as Obsolete
. Instead, a project upgrading to 2.0 will encounter compile errors.
The following notes serve as an incomplete guide that will help you upgrade a Quino-based product.
As I wrote in the release notes for beta1 and beta2, if you arm yourself with a bit of time, ReSharper and the release notes (and possibly keep an Encodo employee on speed-dial), the upgrade is not difficult. It consists mainly of letting ReSharper update namespace references for you.
Instead of going through the errors (example shown to the right) one by one, you can take care of a lot of errors with the following search/replace pairs.
Encodo.Quino.Data.Persistence => Encodo.Quino.Data
IMetaApplication => IApplication
ICoreApplication => IApplication
GetServiceLocator() => GetServices()
MetaMethodTools.GetInstance => DataMetaMethodExtensions.GetInstance
application.ServiceLocator.GetInstance => application.GetInstance
Application.ServiceLocator.GetInstance => Application.GetInstance
application.ServiceLocator => application.GetServices()
Application.ServiceLocator => Application.GetServices()
application.Recorder => application.GetLogger()
Application.Recorder => Application.GetLogger()
session.GetRecorder() => session.GetLogger()
Session.GetRecorder() => Session.GetLogger()
Session.Application.Recorder => Session.GetLogger()
FileTools.Canonicalize() => PathTools.Normalize()
application.Messages => application.GetMessageList()
Application.Messages => Application.GetMessageList()
ServiceLocator.GetInstance => Application.GetInstance
MetaLayoutTools => LayoutConstants
GlobalContext.Instance.Application.Configuration.Model => GlobalContext.Instance.Application.GetModel()
IMessageRecorder => ILogger
GetUseReleaseSettings() => IsInReleaseMode()
ReportToolsDX => ReportDxExtensions
Although you can’t just search/replace everything, it gets you a long way.
These replacement pairs, while not recommended for global search/replace, are a handy guide for how the API has generally changed.
*Generator => *Builder
SetUpForModule => CreateModule
Builder.SetElementVisibility(prop, true) => prop.Show()
Builder.SetElementVisibility(prop, false) => prop.Hide()
Builder.SetElementControlIdentifier(prop, ControlIdentifiers => prop.SetInputControl(ControlIdentifiers
Builder.SetPropertyHeightInPixels(prop, 200); => prop.SetHeightInPixels(200);
Constructing a module has also changed. Instead of using the following syntax,
var module = Builder.SetUpForModule<AuditModule>(Name, "ApexClearing.Alps.Core", Name, true);
Replace it with the following direct replacement,
var module = Builder.CreateModule(Name, "ApexClearing.Alps.Core", Name);
Or use this replacement, with the recommended style for the v2 format (no more class prefix for generated classes and a standard namespace):
var module = Builder.CreateModule(Name, typeof(AuditModuleBuilder).GetParentNamespace());
Because of how the module class-names have changed, the standard module ORM classes all have different names. The formula is that the ORM class-name no longer has its module name prepended.
ReportsReportDefinition => ReportDefinition
SecurityUser => User
Furthermore, all modules have been converted to use the v2 code-generation format, which has the metadata separate from the ORM object. Therefore, instead of referencing metadata using the ORM class-name as the base, you use the module name as the base.
ReportReportDefinition.Fields.Name => ReportModule.ReportDefinition.Name.Identifier
ReportReportDefinition.MetaProperties.Name => ReportModule.ReportDefinition.Name
ReportReportDefinition.Metadata => ReportModule.ReportDefinition.Metadata
There’s an upcoming article that will show more examples of the improved flexibility and capabilities that come with the v2-metadata.
The standard action names have moved as well.
Any other, more rarely used action names have been moved back to the actions themselves, so for example SaveApplicationSettingsAction.ActionName.
If you created any actions of your own, then the API there has changed as well. As previously documented in API Design: To Generic or not Generic? (Part II), instead of overriding the following method,
protected override int DoExecute(IApplication application, ConfigurationOptions options, int currentResult)
{
return base.DoExecute(application, options, currentResult);
}
you instead override in the following way,
public override void Execute()
{
base.Execute();
}
If you’re already using Visual Studio 2015, then the NuGet UI is a good choice for managing packages. If you’re still on Visual Studio 2013, then the UI there is pretty flaky and we recommend using the console.
The examples below assume that you have configured a source called “Local Quino” (e.g. a local folder that holds the nupkg
files for Quino).
install-package Quino.Data.PostgreSql.Testing -ProjectName Punchclock.Core.Tests -Source "Local Quino"
install-package Quino.Server -ProjectName Punchclock.Server -Source "Local Quino"
install-package Quino.Console -ProjectName Punchclock.Server -Source "Local Quino"
install-package Quino.Web -ProjectName Punchclock.Web.API -Source "Local Quino"
We recommend using Visual Studio 2015 if at all possible. Visual Studio 2013 is also supported, but we have all migrated to 2015 and our knowhow about 2013 and its debugging idiosyncrasies will deteriorate with time.
These are just brief points of interest to get you set up. As with the NuGet support, these instructions are subject to change as we gain more experience with debugging with packages as well.
- Quino.zip (as part of the package release)
Quino packages are no different than any other NuGet packages. We provide both standard packages as well as packages with symbols and sources. Any complications you encounter with them are due to the whole NuGet experience still being a bit in-flux in the .NET world.
An upcoming post will provide more detail and examples.
We generally use our continuous integration server to create packages, but you can also create packages locally (it’s up to you to make sure the version number makes sense, so be careful). These instructions are approximate and are subject to change. I provide them here to give you an idea of how packages are created. If they don’t work, please contact Encodo for help.
- Switch to the %QUINO_ROOT%\src directory
- Run nant build pack to build Quino and packages
- Add %QUINO_ROOT%\nuget as a package source (one-time only)
- Run nant nuget from your project directory to get the latest Quino build from your local folder
Published by marco on 6. Dec 2015 11:57:57 (GMT-5)
These days nobody who’s anybody in the software-development world is writing software without tests. Just writing them doesn’t help make the software better, though. You also need to be able to execute tests—reliably and quickly and repeatably.
That said, you’ll have to get yourself a test runner, which is a different tool from the compiler or the runtime. That is, just because your tests compile (satisfy all of the language rules) and could be executed doesn’t mean that you’re done writing them yet.
Every testing framework has its own rules for how the test runner selects methods for execution as tests. The standard configuration options are:
Each testing framework will offer different ways of configuring your code so that the test runner can find and execute setup/test/teardown code. To write NUnit tests, you decorate classes, methods and parameters with C# attributes.
The standard scenario is relatively easy to execute—run all methods with a Test
attribute in a class with a TestFixture
attribute on it.
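For example, a minimal NUnit fixture looks something like this (the class and member names are only illustrative):
using NUnit.Framework;

[TestFixture]
public class CalculatorTests
{
    [SetUp]
    public void SetUp()
    {
        // Runs before each test method in this fixture.
    }

    [Test]
    public void AddingTwoNumbersWorks()
    {
        Assert.AreEqual(4, 2 + 2);
    }
}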
When you consider multiple base classes and generic type arguments, each of which may also have NUnit attributes, things get a bit less clear. In that case, not only do you have to know what NUnit offers as possibilities but also whether the test runner that you’re using also understands and implements the NUnit specification in the same way. Not only that, but there are legitimate questions for which even the best specification does not provide answers.
At Encodo, we use Visual Studio 2015 with ReSharper 9.2 and we use the ReSharper test runner. We’re still looking into using the built-in VS test runner—the continuous-testing integration in the editor is intriguing [1]—but it’s quite weak when compared to the ReSharper one.
So, not only do we have to consider what the NUnit documentation says is possible, but we must also know how the R# test runner interprets the NUnit attributes and what is supported.
Where is there room for misunderstanding? A few examples,
- A TestFixture attribute on an abstract class?
- A TestFixture attribute on a class with generic parameters?
- Tests but no TestFixture attribute?
- Tests but no TestFixture attribute, but there are non-abstract descendants that do have a TestFixture attribute?
In our case, the answer to these questions depends on which version of R# you’re using. Even though it feels like you configured everything correctly and it logically should work, the test runner sometimes disagrees.
Throw the TeamCity test runner into the mix—which is ostensibly the same as that from R# but still subtly different—and you’ll have even more fun.
At any rate, now that you know the general issue, I’d like to share how the ground rules we’ve come up with that avoid all of the issues described above. The text below comes from the issue I created for the impending release of Quino 2.
Non-leaf-node base classes should never appear as nodes in test runners. A user should be able to run tests in descendants directly from a fixture or test in the base class.
Non-leaf-node base classes are shown in the R# test runner in both versions 9 and 10. A user must navigate to the descendant to run a test. The user can no longer run all descendants or a single descendant directly from the test.
Relatively recently, in order to better test a misbehaving test runner and accurately report issues to JetBrains, I standardized all tests to the same pattern:
- TestFixture attribute only on leaf nodes
This worked just fine with ReSharper 8.x but causes strange behavior in both R# 9.x and 10.x. We discovered recently that not only did the test runner act strangely (something that they might fix), but also that the unit-testing integration in the files themselves behaved differently when the base class is abstract (something JetBrains is unlikely to fix).
You can see that R# treats a non-abstract class with tests as a testable entity, even when it doesn’t actually have a TestFixture
attribute and even expects a generic type parameter in order to instantiate.
Here it’s not working well in either the source file or the test runner. In the source file, you can see that it offers to run tests in a category, but not the tests from actual descendants. If you try to run or debug anything from this menu, it shows the fixture with a question-mark icon and marks any tests it manages to display as inconclusive. This is not surprising, since the test fixture may not be abstract, but does require a type parameter in order to be instantiated.
Here it looks and acts correctly:
I’ve reported this issue to JetBrains, but our testing structure either isn’t very common or it hasn’t made it to their core test cases, because neither 9 nor 10 handles them as well as the 8.x runner did.
Now that we’re also using TeamCity a lot more to not only execute tests but also to collect coverage results, we’ll capitulate and just change our patterns to whatever makes R#/TeamCity the happiest.
Once more to recap our ground rules for making tests:
- TestFixture only on leafs (classes with no descendants)
- Category or Test attributes anywhere in the hierarchy, but you need to declare the class as abstract.
When you make the change, you can see the improvement immediately. A sketch of this pattern is shown below.
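Here is a small sketch of that pattern (class names are hypothetical): the shared tests live in an abstract base class without a TestFixture attribute, and only the concrete leaf class is marked as a fixture.
using NUnit.Framework;

// Abstract base: no TestFixture attribute; it declares tests that all
// descendants inherit, but never appears as a runnable node itself.
public abstract class DataDriverTestsBase
{
    protected abstract string CreateConnectionString();

    [Test]
    public void ConnectionStringIsNotEmpty()
    {
        Assert.That(CreateConnectionString(), Is.Not.Empty);
    }
}

// Leaf fixture: the only class the test runner should show and execute.
[TestFixture]
public class PostgreSqlDataDriverTests : DataDriverTestsBase
{
    protected override string CreateConnectionString()
    {
        return "Server=localhost;Database=test";
    }
}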
Published by marco on 28. Nov 2015 13:58:45 (GMT-5)
Updated by marco on 19. May 2017 15:18:20 (GMT-5)
As part of the final release process for Quino 2, we’ve upgraded 5 solutions [1] from Quino 1.13 to the latest API in order to shake out any remaining API inconsistencies or even just inelegant or clumsy calls or constructs. A lot of questions came up during these conversions, so I wrote the following blog to provide detail on the exact workings and execution order of a Quino application.
I’ve discussed the design of Quino’s configuration before, most recently in API Design: Running an Application (Part I) and API Design: To Generic or not Generic? (Part II) as well as the three-part series that starts with Encodo’s configuration library for Quino: part I.
The life-cycle of a Quino 2.0 application breaks down into roughly the following stages:
- ServicesInitialized
- ServicesConfigured action
The first stage is all about putting the application together with calls to Use
various services and features. This stage is covered in detail in three parts, starting with Encodo’s configuration library for Quino: part I.
Let’s tackle this one last because it requires a bit more explanation.
Technically, an application can add code to this stage by adding an IApplicationAction
before the ServicesConfigured
action. Use the Configure<TService>()
extension method in stage 1 to configure individual services, as shown below.
application.Configure<IFileLogSettings>(
s => s.Behavior = FileLogBehavior.MultipleFiles
);
The execution stage is application-specific. This stage can be short or long, depending on what your application does.
For desktop applications or single-user utilities, stage 4 is executed in application code, as shown below, in the Run
method, which is called by the ApplicationManager
after the application has started.
var transcript = new ApplicationManager().Run(CreateApplication, Run);
IApplication CreateApplication() { … }
void Run(IApplication application) { … }
If your application is a service, like a daemon or a web server or whatever, then you’ll want to execute stages 1–3 and then let the framework send requests to your application’s running services. When the framework sends the termination signal, execute stage 5 by disposing of the application. Instead of calling Run
, you’ll call CreateAndStartupUp
.
var application = new ApplicationManager().CreateAndStartUp(CreateApplication);
IApplication CreateApplication() { … }
Every application has certain tasks to execute during shutdown. For example, an application will want to close down any open connections to external resources, close files (especially log files) and perhaps inform the user of shutdown.
Instead of exposing a specific “shutdown” method, a Quino 2.0 application can simply be disposed to shut it down.
If you use ApplicationManager.Run()
as shown above, then you’re already sorted—the application will be disposed and the user will be informed in case of catastrophic failure; otherwise, you can shut down and get the final application transcript from the disposed object.
application.Dispose();
var transcript = application.GetTranscript();
// Do something with the transcript…
We’re finally ready to discuss stage 2 in detail.
An IOC has two phases: in the first phase, the application registers services with the IOC; in the second phase, the application uses services from the IOC.
An application should use the IOC as much as possible, so Quino keeps stage 2 as short as possible. Because it can’t use the IOC during the registration phase, code that runs in this stage shares objects via a poor-man’s IOC built into the IApplication
that allows modification and only supports singletons. Luckily, very little end-developer application code will ever need to run in this stage. It’s nevertheless interesting to know how it works.
Obviously, any code in this stage that uses the IOC will cause it to switch from phase one to phase two and subsequent attempts to register services will fail. Therefore, while application code in stage 2 has to be careful, you don’t have to worry about not knowing you’ve screwed up.
Why would we have this stage? Some advocates of using an IOC claim that everything should be configured in code. However, it’s not uncommon for applications to want to run very differently based on command-line or other configuration parameters. The Quino startup handles this by placing the following actions in stage 2:
An application is free to insert more actions before the ServicesInitialized
action, but they have to play by the rules outlined above.
Code in stage 2 shares objects by calling SetSingle()
and GetSingle()
. There are only a few objects that fall into this category.
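For example, sharing one of these objects might look something like the following sketch (the generic signatures of SetSingle() and GetSingle() and the RunSettings class are assumptions, not the documented API):
// Hypothetical stage-2 usage: an early action stores a singleton in the
// poor-man's IOC; a later action retrieves it before the real IOC exists.
application.SetSingle<IRunSettings>(new RunSettings());
var runSettings = application.GetSingle<IRunSettings>();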
The calls UseCore()
and UseApplication()
register most of the standard objects used in stage 2. Actually, while they’re mostly used during stage 2, some of them are also added to the poor man’s IOC in case of catastrophic failure, in which case the IOC cannot be assumed to be available. A good example is the IApplicationCrashReporter
.
Before listing all of the objects, let’s take a rough look at how a standard application is started. The following steps outline what we consider to be a good minimum level of support for any application. Of course, the Quino configuration is modular, so you can take as much or as little as you like, but while you can use a naked Application
—which has absolutely nothing registered—and you can call UseCore()
to have a bit more—it registers a handful of low-level services but no actions—we recommend calling at least UseApplication()
to add most of the functionality outlined below.
- The RunMode is read from the IRunSettings to determine if the application should catch all exceptions or let them go to the debugger. This involves getting the IRunSettings from the application and getting the final value using the IApplicationManagerPreRunFinalizer. This is commonly an implementation that allows setting the value of RunMode from the command-line in debug builds. This further depends on the ICommandSetManager (which depends on the IValueTools) and possibly the ICommandLineSettings (to set the CommandLineConfigurationFilename if it was set by the user).
- The command line is processed into an ICommandProcessingResult, possibly setting other values and adding other configuration steps to the list of startup actions (e.g. many command-line options are switches that are handled by calling Configure<TSettings>() where TSettings is the configuration object in the IOC to modify).
- Configuration data is loaded according to the IConfigurationDataSettings, involving the ILocationManager to find configuration files and the ITextValueNodeReader to read them.
- The ILogger is used throughout by various actions to log application behavior
- The IApplicationCrashReporter uses the IFeedback or the ILogger to notify the user and log the error
- The IInMemoryLogger is used to include all in-memory messages in the IApplicationTranscript
The next section provides detail on each of the individual objects referenced in the workflow above.
You can get any one of these objects from the IApplication
in at least two ways, either by using GetSingle<TService>()
(safe in all situations) or GetInstance<TService>()
(safe only in stage 3 or later) or there’s almost always a method which starts with “Get” and ends in the service name.
The example below shows how to get the ICommandSetManager
[2] if you need it.
application.GetCommandSetManager();
application.GetSingle<ICommandSetManager>(); // Prefer the one above
application.GetInstance<ICommandSetManager>();
All three calls return the exact same object, though. The first two from the poor-man’s IOC; the last from the real IOC.
Only applications that need access to low-level objects or need to mess around in stage 2 need to know which objects are available where and when. Most applications don’t care and will just always use GetInstance()
.
The objects in the poor-man’s IOC are listed below.
- IValueTools: converts values; used by the command-line parser, mostly to translate enumerated values and flags
- ILocationManager: an object that manages aliases for file-system locations, like “Configuration”, from which configuration files should be loaded, or “UserConfiguration”, where user-specific overlay configuration files are stored; used by the configuration loader
- ILogger: a reference to the main logger for the application
- IInMemoryLogger: a reference to an in-memory message store for the logger (used by the ApplicationManager to retrieve the message log from a crashed application)
- IMessageFormatter: a reference to the object that formats messages for the logger
- ICommandSetManager: sets the schema for a command line; used by the command-line parser
- ICommandProcessingResult: contains the result of having processed the command line
- ICommandLineSettings: defines the properties needed to process the command line (e.g. the Arguments and CommandLineConfigurationFilename, which indicates the optional filename to use for configuration in addition to the standard ones)
- IConfigurationDataSettings: defines the ConfigurationData, which is the hierarchical representation of all configuration data for the application, as well as the MainConfigurationFilename from which this data is read; used by the configuration-loader
- ITextValueNodeReader: the object that knows how to read ConfigurationData from the file formats supported by the application [3]; used by the configuration-loader
- IRunSettings: an object that manages the RunMode (“release” or “debug”), which can be set from the command line and is used by the ApplicationManager to determine whether to use global exception-handling
- IApplicationManagerPreRunFinalizer: a reference to an object that applies any options from the command line before the decision of whether to execute in release or debug mode is taken.
- IApplicationCrashReporter: used by the ApplicationManager in the code surrounding the entire application execution and therefore not guaranteed to have a usable IOC available
- IApplicationDescription: used together with the ILocationManager to set application-specific aliases to user-configuration folders (e.g. AppData\{CompanyTitle}\{ApplicationTitle})
- IApplicationTranscript: an object that records the last result of having run the application; returned by the ApplicationManager after Run() has completed, but also available through the application object returned by CreateAndStartUp() to indicate the state of the application after startup.
Each of these objects has a very compact interface and has a single responsibility. An application can easily replace any of these objects by calling UseSingle()
during stage 1 or 2. This call sets the object in both the poor-man’s IOC as well as the real one. For those rare cases where a non-IOC singleton needs to be set after the IOC has been finalized, the application can call SetSingle()
, which does not touch the IOC. This feature is currently used only to set the IApplicationTranscript
, which needs to happen even after the IOC registration is complete.
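To make that concrete, a replacement might look something like this rough sketch (the exact signature of UseSingle() and the custom logger class are assumptions):
// Hypothetical: swap in a custom logger during stage 1 or 2. As described
// above, UseSingle() sets the object in both the poor-man's IOC and the
// real IOC; the generic signature shown here is assumed.
var logger = new MyCustomLogger(); // hypothetical ILogger implementation
application.UseSingle<ILogger>(logger);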
Two large customer solutions, two medium-sized internal solutions (Punchclock and JobVortex) as well as the Demo/Sandbox solution. These solutions include the gamut of application types:
I originally used ITextValueNodeReader
as an example, but that’s one case where the recommended call doesn’t match 1-to-1 with the interface name.
application.GetSingle<ITextValueNodeReader>();
application.GetInstance<ITextValueNodeReader>();
application.GetConfigurationDataReader(); // Recommended
Published by marco on 23. Nov 2015 22:31:29 (GMT-5)
Quino has long included support for connecting to an application server instead of connecting directly to databases or other sources. The application server uses the same model as the client and provides modeled services (application-specific) as well as CRUD for non-modeled data interactions.
We wrote the first version of the server in 2008. Since then, it’s acquired better authentication and authorization capabilities as well as routing and state-handling. We’ve always based it on the .NET HttpListener
.
As late as Quino 2.0-beta2 (which we had deployed in production environments already), the server hierarchy looked like screenshot below, pulled from issue QNO-4927:
This screenshot was captured after a few unneeded interfaces had already been removed. As you can see by the class names, we’d struggled heroically to deal with the complexity that arises when you use inheritance rather than composition.
The state-handling was welded onto an authentication-enabled server, and the base machinery for supporting authentication was spread across three hierarchy layers. The hierarchy only hints at composition in its naming: the “Stateful” part of the class name CoreStatefulHttpServerBase<TState>
had already been moved to a state provider and a state creator in previous versions. That support is unchanged in the 2.0 version.
We mentioned above that implementation was “spread across three hierarchy layers”. There’s nothing wrong with that, in principle. In fact, it’s a good idea to encapsulate higher-level patterns in a layer that doesn’t introduce too many dependencies and to introduce dependencies in other layers. This allows applications not only to be able to use a common implementation without pulling in unwanted dependencies, but also to profit from the common tests that ensure the components works as advertised.
In Quino, the following three layers are present in many components:
- A layer that doesn’t introduce too many dependencies (Encodo.Core).
- A layer that introduces more dependencies (Encodo.Application, Encodo.Connections and so on)
- A metadata-based layer (Quino.Meta, Quino.Application and so on).
The diagram below shows the new hotness in Quino 2. [2]
The hierarchy is now extremely flat. There is an IServer
interface and a Server
implementation, both generic in TListener
, of type IServerListener
. The server manages a single instance of an IServerListener
.
The listener, in turn, has an IHttpServerRequestHandler
, the main implementation of which uses an IHttpServerAuthenticator
.
As mentioned above, the IServerStateProvider
is included in this diagram, but is unchanged from Quino 2.0-beta3, except that it is now used by the request handler rather than directly by the server.
You can see how the abstract layer is enhanced by an HTTP-specific layer (the Encodo.Server.Http
namespace) and the metadata-specific layer is nice encapsulated in three classes in the Quino.Server
assembly.
This type hierarchy has decoupled the main elements of the workflow of handling requests for a server:
It is important to note that this behavior is unchanged from the previous version—it’s just that now each step is encapsulated in its own component. The components are small and easily replaced, with clear and concise interfaces.
Note also that the current implementation of the request handler is for HTTP servers only. Should the need arise, however, it would be relatively easy to abstract away the HttpListener
dependency and generalize most of the logic in the request handler for any kind of server, regardless of protocol and networking implementation. Only the request handler is affected by the HTTP dependency, though: authentication, state-provision and listener-management can all be re-used as-is.
Also of note is that the only full-fledged implementation is for metadata-based applications. At the bottom of the diagram, you can see the metadata-specific implementations for the route registry, state provider and authenticator. This is reflected in the standard registration in the IOC.
These are the service registrations from Encodo.Server
:
return handler
.RegisterSingle<IServerSettings, ServerSettings>()
.RegisterSingle<IServerListenerFactory<HttpServerListener>, HttpServerListenerFactory>()
.Register<IServer, Server<HttpServerListener>>();
And these are the service registrations from Quino.Server
:
handler
.RegisterSingle<IServerRouteRegistry<IMetaServerState>, StandardMetaServerRouteRegistry>()
.RegisterSingle<IServerStateProvider<IMetaServerState>, MetaPersistentServerStateProvider>()
.RegisterSingle<IServerStateCreator<IMetaServerState>, MetaServerStateCreator>()
.RegisterSingle<IHttpServerAuthenticator<IMetaServerState>, MetaHttpServerAuthenticator>()
.RegisterSingle<IHttpServerRequestHandler, HttpServerRequestHandler<IMetaServerState>>()
As you can see, the registration is extremely fine-grained and allows very precise customization as well as easy mocking and testing.
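For example, a product that needs its own authentication could override just that one registration and leave the rest of the standard pipeline alone (the custom class name here is hypothetical):
// Hypothetical: replace only the authenticator; the listener, request
// handler and state provider keep their standard registrations.
handler.RegisterSingle<IHttpServerAuthenticator<IMetaServerState>, ApiKeyHttpServerAuthenticator>();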
Published by marco on 16. Oct 2015 11:44:35 (GMT-5)
In the previous article, we discussed the task of Splitting up assemblies in Quino using NDepend. In this article, I’ll discuss both the high-level and low-level workflows I used with NDepend to efficiently clear up these cycles.
Please note that what follows is a description of how I have used the tool—so far—to get my very specific tasks accomplished. If you’re looking to solve other problems or want to solve the same problems more efficiently, you should take a look at the official NDepend documentation.
To recap briefly: we are reducing dependencies among top-level namespaces in two large assemblies, in order to be able to split them up into multiple assemblies. The resulting assemblies will have dependencies on each other, but the idea is to make at least some parts of the Encodo/Quino libraries opt-in.
On a high-level, I tackled the task in the following loosely defined phases.
Even once you’ve gotten rid of all cycles, you may still have unwanted dependencies that hinder splitting namespaces into the desired constellation of assemblies.
For example, the plan is to split all logging and message-recording into an assembly called Encodo.Logging
. However, the IRecorder
interface (with a single method, Log()
) is used practically everywhere. It quickly becomes necessary to split interfaces and implementation—with many more potential dependencies—into two assemblies for some very central interfaces and support classes. In this specific case, I moved IRecorder
to Encodo.Core
.
Even after you’ve conquered the black hole, you might still have quite a bit of work to do. Never fear, though: NDepend is there to help root out those dependencies as well.
Because we can split off smaller assemblies regardless, these dependencies are less important to clean up for our current purposes. However, once this code is packed into its own assembly, its namespaces become root namespaces of their own and—voila! you have more potentially nasty dependencies to deal with. Granted, the problem is less severe because you’re dealing with a logically smaller component.
In Quino, we use non-root namespaces more for organization and less for defining components. Still, cycles are cycles, and they’re worth examining, at least to pluck the low-hanging fruit.
With the high-level plan described above in hand, I repeated the following steps for the many dependencies I had to untangle. Don’t despair if it looks like your library has a ton of unwanted dependencies. If you’re smart about the ones you untangle first, you can make excellent—and, most importantly, rewarding—progress relatively quickly. [1]
GOTO 1
The high-level plan of attack sounded interesting, but might have left you cold with its abstraction. Then there was the promise of detail with a focus on root-level namespaces, but alas, you might still be left wondering just how exactly do you reduce these much-hated cycles?
I took some screenshots as I worked on Quino, to document my process and point out parts of NDepend I thought were eminently helpful.
I mentioned above that you should “[k]eep zooming in”, but how do you do that? A good first step is to zoom all the way out and show only direct namespace dependencies. This focuses only on using
references instead of the much-more frequent member accesses. In addition, I changed the default setting to show dependencies in only one direction—when a column references a row (blue), but not vice versa (green).
As you can see, the diagrams are considerably less busy than the one shown above. Here, we can see a few black spots that indicate cycles, but it’s not so many as to be overwhelming. [2] You can hover over the offending squares to show more detail in a popup.
If you don’t see any more cycles between namespaces, switch the detail level to “Members”. Another very useful feature is to “Bind Matrix”, which forces the columns and rows to be shown in the same order and concentrates the cycles in a smaller area of the matrix.
As you can see in the diagram, NDepend then highlights the offending area and you can even click the upper-left corner to focus the matrix only on that particular cycle.
Once you’re looking at members, it isn’t enough to know just the namespaces involved—you need to know which types are referencing which types. The powerful matrix view lets you drill down through namespaces to show classes as well.
If your classes are large—another no-no, but one thing at a time—then you can drill down to show which method is calling which method to create the cycle. In the screenshot to the right, you can see where I had to do just that in order to finally figure out what was going on.
In that screenshot, you can also see something that I only discovered after using the tool for a while: the direction of usage is indicated with an arrow. You can turn off the tooltips—which are informative, but can be distracting for this task—and you don’t have to remember which color (blue or green) corresponds to which direction of usage.
Once you’ve drilled your way down from namespaces-only to showing member dependencies, to focusing on classes, and even members, your diagram should be shaping up quite well.
On the right, you’ll see a diagram of all direct dependencies for the remaining area with a problem. You don’t see any black boxes, which means that all direct dependencies are gone. So we have to turn up the power of our microscope further to show indirect dependencies.
On the left, you can see that the scary, scary black hole from the start of our journey has been whittled down to a small, black spot. And that’s with all direct and indirect dependencies as well as both directions of usage turned on (i.e. the green boxes are back). This picture is much more pleasing, no?
For the last cluster of indirect dependencies shown above, I had to unpack another feature, NDepend queries: you can select any element and run a query to show using/used-by assemblies/namespaces. [3] The results are shown in a panel, where you can edit the query and see live updates immediately.
Even with a highly zoomed-in view on the cycle, I still couldn’t see the problem, so I took NDepend’s suggestion and generated a graph of the final indirect dependency between Culture and Enums (through Expression). At this zoom level, the graph becomes more useful (for me) and illuminates problems that remain muddy in the matrix (see right).
In order to finish the job efficiently, here are a handful of miscellaneous tips that are useful, but didn’t fit into the guide above.
And BOOM! just like that [4], phase 1 (root namespaces) for Encodo was complete! Now, on to Quino.dll…
Depending on what shape your library is in, do not underestimate the work involved. Even with NDepend riding shotgun and barking out the course like a rally navigator, you still have to actually make the changes. That means lots of refactoring, lots of building, lots of analysis, lots of running tests and lots of reviews of at-times quite-sweeping changes to your code base. The destination is worth the journey, but do not embark on it lightly—and don’t forget to bring the right tools. [5]
Published by marco on 4. Oct 2015 07:44:39 (GMT-5)
A lot of work has been put into Quino 2.0 [1], with almost no stone left unturned. Almost every subsystem has been refactored and simplified, including but not limited to the data driver, the schema migration, generated code and metadata, model-building, security and authentication, service-application support and, of course, configuration and execution.
Two of the finishing touches before releasing 2.0 are to reorganize all of the code into a more coherent namespace structure and to reduce the size of the two monolithic assemblies: Encodo and Quino.
The first thing to establish is: why are we doing this? Why do we want to reduce dependencies and reduce the size of our assemblies? There are several reasons, but a major reason is to improve the discoverability of patterns and types in Quino. Two giant assemblies are not inviting—they are, in fact, daunting. Replace these assemblies with dozens of smaller ones and users of your framework will be more likely to (A) find what they’re looking for on their own and (B) build their own extensions with the correct dependencies and patterns. Neither of these is guaranteed, but smaller modules are a great start.
Another big reason is portability. .NET Core was released as open-source software some time ago and more and more .NET source code is added to it each day. There are portable targets, non-Windows targets, Universal-build targets and much more. It makes sense to split code up into highly portable units with as few dependencies as possible. That is, the dependencies should be explicit and intended.
Not only that, but NuGet packaging has come to the fore more than ever. Quino was originally designed to keep third-party boundaries clear, but we wanted to make it as easy as possible to use Quino. Just include Encodo and Quino and off you went. However, with NuGet, you can now say you want to use Quino.Standard and you’ll get Quino.Core, Encodo.Core, Encodo.Services.SimpleInjector, Quino.Services.SimpleInjector and other packages.
With so much interesting code in the Quino framework, we want to make it available as much as possible not only for our internal projects but also for customer projects where appropriate and, also, possibly for open-source distribution.
I’ve used NDepend before [2] to clean up dependencies. However, the last analysis I did about a year ago showed quite deep problems [3] that needed to be addressed before any further dependency analysis could bear fruit at all. With that work finally out of the way, I’m ready to re-engage with NDepend and see where we stand with Quino.
As luck would have it, NDepend is in version 6, released at the start of summer 2015. As was the case last year, NDepend has generously provided me with an upgrade license to allow me to test and evaluate the new version with a sizable and real-world project.
Here is some of the feedback I sent to NDepend (Twitter):
- I really, really like the depth of insight NDepend gives me into my code. I find myself thinking “SOLID” much more often when I have NDepend shaking its head sadly at me, tsk-tsking at all of the dependency snarls I’ve managed to build.
- It’s fast and super-reliable. I can work these checks into my workflow relatively easily.
- I’m using the matrix view a lot more than the graphs because even NDepend recommends I don’t use a graph for the number of namespaces/classes I’m usually looking at
- Where the graph view is super-useful is for examining *indirect* dependencies, which are harder to decipher in the matrix
- I’ve found so many silly mistakes/lazy decisions that would lead to confusion for developers new to my framework
- I’m spending so much time with it and documenting my experiences because I want more people at my company to use it
- I haven’t even scratched the surface of the warnings/errors but want to get to that, as well (the Dashboard tells me of 71 rules violated; 9 critical; I’m afraid to look :-)
Before I get more in-depth with NDepend, please note that there are at least two main use cases for this tool [4]:
These two use cases are vastly different. The first is like cleaning a gas-station bathroom for the first time in years; the second is more like the weekly once-over you give your bathroom at home. The tools you’ll need for the two jobs are similar, but quite different in scope and power. The same goes for NDepend: how you’ll use it to claw your way back to architectural purity is different than how you’ll use it to occasionally clean up an already mostly-clean project.
Quino is much better than it was the last time we peeked under the covers with NDepend, but we’re still going to need a bucket of industrial cleaner before we’re done. [5]
The first step is to make sure that you’re analyzing the correct assemblies. Show the project properties to see which assemblies are included. You should remove all assemblies from consideration that don’t currently interest you (especially if your library is not quite up to snuff, dependency-wise; afterwards, you can leave as many clean assemblies in the list as you like). [6]
Running an analysis with NDepend 6 generates a nice report, which includes the following initial dependency graph for the assemblies.
As you can see, Encodo and Quino depend only on system assemblies, but there are components that pull in other references where they might not be needed. The initial dependency matrices for Encodo and Quino both look much better than they did when I last generated one. The images below show what we have to work with in the Encodo and Quino assemblies.
It’s not as terrible as I’ve made out, right? There is far less namespace-nesting, so it’s much easier to see where the bidirectional dependencies are. There are only a handful of cyclic dependencies in each library, with Encodo edging out Quino because of (A) the nature of the code and (B) the extra effort I’d already put into Encodo so far.
I’m not particularly surprised to see that this is relatively clean because we’ve put effort into keeping the external dependencies low. It’s the internal dependencies in Encodo and Quino that we want to reduce.
The goal, as stated in the title of this article, is to split Encodo and Quino into separate assemblies. While removing cyclic dependencies is required for such an operation, it’s not sufficient. Even without cycles, it’s still possible that a given assembly is too dependent on other assemblies.
Before going any farther, I’m going to list the assemblies we’d like to have. By “like to have”, I mean the list that we’d originally planned plus a few more that we added while doing the actual splitting. [7] The images on the right show the assemblies in Encodo, Quino and a partial overview of the dependency graph (calculated with the ReSharper Architecture overview rather than with NDepend, just for variety).
Of these, the following assemblies and their dependencies are of particular interest [8]:
Encodo.Core
Encodo.Core
and Encodo.Expressions
Encodo.Application
and Quino.Meta
Quino.Application
and some Encodo.* assemblies
Quino.Data
This seems like a good spot to stop, before getting into the nitty-gritty detail of how we used NDepend in practice. In the next article, I’ll discuss both the high-level and low-level workflows I used with NDepend to efficiently clear up these cycles. Stay tuned!
Release notes for 2.0 betas:
Articles about design:
I published a two-parter in August and November of 2014.
Here I’m going to give you a tip that confused me for a while, but that I think was due to particularly bad luck and is actually quite a rare occurrence.
If you already see the correct assemblies in the list, you should still check that NDepend picked up the right paths. That is, if you haven’t followed the advice in NDepend’s white paper and still have a different bin
folder for each assembly, you may see something like the following in the tooltip when you hover over the assembly name:
“Several valid .NET assemblies with the name {Encodo} have been found. They all have the same version. The one with the biggest file has been chosen.”
If NDepend has accidentally found an older copy of your assembly, you must delete that assembly. Even if you add an assembly directly, NDepend will not honor the path from which you added it. This isn’t as bad as it sounds, since it’s a very strange constellation of circumstances that led to this assembly hanging around anyway:
I only noticed because I knew I didn’t have that many dependency cycles left in the Encodo assembly.
Encodo.Application.
Published by marco on 26. Sep 2015 11:27:08 (GMT-5)
Updated by marco on 15. Jan 2017 23:18:41 (GMT-5)
In this article, I’m going to continue the discussion started in Part I, where we laid some groundwork about the state machine that is the startup/execution/shutdown feature of Quino. As we discussed, this part of the API still suffers from “several places where generic TApplication parameters [are] cluttering the API”. In this article, we’ll take a closer look at different design approaches to this concrete example—and see how we decided whether to use generic type parameters.
Any decision you take with a non-trivial API is going to involve several stakeholders and aspects. It’s often not easy to decide which path is best for your stakeholders and your product.
For any API you design, consider how others are likely to extend it—and whether your pattern is likely to deteriorate from neglect. Even a very clever solution has to be balanced with simplicity and elegance if it is to have a hope in hell of being used and standing the test of time.
In Quino 2.0, the focus has been on ruthlessly eradicating properties on the IApplication
interface as well as getting rid of the descendant interfaces, ICoreApplication
and IMetaApplication
. Because Quino now uses a pattern of placing sub-objects in the IOC associated with an IApplication
, there is far less need for a generic TApplication
parameter in the rest of the framework. See Encodo’s configuration library for Quino: part I for more information and examples.
This focus raised an API-design question: if we no longer want descendant interfaces, should we eliminate parameters generic in that interface? Or should we continue to support generic parameters for applications so that the caller will always get back the type of application that was passed in?
Before getting too far into the weeds [1], let’s look at a few concrete examples to illustrate the issue.
As discussed in Encodo’s configuration library for Quino: part III in detail, Quino applications are configured with the “Use*” pattern, where the caller includes functionality in an application by calling methods like UseRemoteServer()
or UseCommandLine()
. The latest version of this API pattern in Quino recommends returning the application that was passed in to allow chaining and fluent configuration.
For example, the following code chains the aforementioned methods together without creating a local variable or other clutter.
return new CodeGeneratorApplication().UseRemoteServer().UseCommandLine();
What should the return type of such standard configuration operations be? Taking a method above as an example, it could be defined as follows:
public static IApplication UseCommandLine(this IApplication application, string[] args) { … }
This seems like it would work fine, but the original type of the application that was passed in is lost, which is not exactly in keeping with the fluent style. In order to maintain the type, we could define the method as follows:
public static TApplication UseCommandLine<TApplication>(this TApplication application, string[] args)
where TApplication : IApplication
{ … }
This style is not as succinct but has the advantage that the caller loses no type information. On the other hand, it’s more work to define methods in this way and there is a strong likelihood that many such methods will simply be written in the style in the first example.
Why would other coders do that? Because it’s easier to write code without generics, and because the stronger result type is not needed in 99% of the cases. If every configuration method expects and returns an IApplication
, then the stronger type will never come into play. If the compiler isn’t going to complain, you can expect a higher rate of entropy in your API right out of the gate.
One way the more-derived type would come in handy is if the caller wanted to define the application-creation method with their own type as a result, as shown below:
private static CodeGeneratorApplication CreateApplication()
{
return new CodeGeneratorApplication().UseRemoteServer().UseCommandLine();
}
If the library methods expect and return IApplication
values, the result of UseCommandLine()
will be IApplication
and requires a cast to be used as defined above. If the library methods are defined generic in TApplication
, then everything works as written above.
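For comparison, here is roughly what the caller’s factory method ends up looking like if the configuration methods use the non-generic signatures; the explicit cast is the price paid for the simpler library code. This is an illustrative sketch rather than code from Quino.

private static CodeGeneratorApplication CreateApplication()
{
  // UseRemoteServer() and UseCommandLine() return IApplication in this variant,
  // so the caller has to cast the result back to the type it just created.
  return (CodeGeneratorApplication)new CodeGeneratorApplication()
    .UseRemoteServer()
    .UseCommandLine();
}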
This is definitely an advantage, in that the user gets the exact type back that they created. Generics definitely offer advantages, but it remains to be seen how much those advantages are worth. [2]
IApplicationManager
Before we examine the pros and cons further, let’s look at another example.
In Quino 1.x, applications were created directly by the client program and passed into the framework. In Quino 2.x, the IApplicationManager
is responsible for creating and executing applications. A caller passes in two functions: one to create an application and another to execute an application.
A standard application startup looks like this:
new ApplicationManager().Run(CreateApplication, RunApplication); [3]
Generic types can trigger an avalanche of generic parameters™ throughout your code.
The question is: what should the types of the two function parameters be? Does CreateApplication
return an IApplication
or a caller-specific derived type? What is the type of the application parameter passed to RunApplication
? Also IApplication
? Or the more derived type returned by CreateApplication
?
As with the previous example, if the IApplicationManager
is to return a derived type, then it must be generic in TApplication
and both function parameters will be generically typed as well. These generic types will trigger an avalanche of generic parameters™ throughout the other extension methods, interfaces and classes involved in initializing and executing applications.
That sounds horrible. This sounds like a pretty easy decision. Why are we even considering the alternative? Well, because it can be very advantageous if the application can declare RunApplication
with a strictly typed signature, as shown below.
private static void RunApplication(CodeGeneratorApplication application) { … }
Neat, right? I’ve got my very own type back.
However, if the IApplicationManager
is to call this function, then the signature of CreateAndStartUp()
and Run()
have to be generic, as shown below.
TApplication CreateAndStartUp<TApplication>(
  Func<IApplicationCreationSettings, TApplication> createApplication
)
  where TApplication : IApplication;

IApplicationExecutionTranscript Run<TApplication>(
  Func<IApplicationCreationSettings, TApplication> createApplication,
  Action<TApplication> run
)
  where TApplication : IApplication;
These are quite messy—and kinda scary—signatures. [4] If these core methods are already so complex, any other methods involved in startup and execution would have to be equally complex—including helper methods created by calling applications. [5]
The advantage here is that the caller will always get back the type of application that was created. The compiler guarantees it. The caller is not obliged to cast an IApplication
back up to the original type. The disadvantage is that all of the library code is infected by a generic <TApplication> parameter with its attendant IApplication
generic constraint. [6]
The title of this section seems pretty self-explanatory, but we as designers must remain vigilant against the siren call of what seems like a really elegant and strictly typed solution.
The generics above establish a pattern that must be adhered to by subsequent extenders and implementors. And to what end? So that a caller can attach properties to an application and access those in a statically typed manner, i.e. without casting?
But aren’t properties on an application exactly what we just worked so hard to eliminate? Isn’t the recommended pattern to create a “settings” object and add it to the IOC instead? That is, as of Quino 2.0, you get an IApplication
and obtain the desired settings from its IOC. Technically, the cast is still taking place in the IOC somewhere, but that seems somehow less bad than a direct cast.
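As a rough sketch of that pattern, assume the caller registered a settings object during configuration; the settings interface and the container accessor used here are hypothetical stand-ins, not actual Quino API.

// Hypothetical settings interface registered in the IOC during configuration.
public interface ICodeGeneratorSettings
{
  string OutputFolder { get; }
}

private static void GenerateCode(IApplication application)
{
  // GetServices() stands in for however the application exposes its IOC.
  // The settings are resolved from the container instead of casting the
  // application itself to a more derived type.
  var settings = application.GetServices().GetInstance<ICodeGeneratorSettings>();
  Console.WriteLine("Generating code to " + settings.OutputFolder);
}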
If the framework recommends that users don’t add properties to an application—and ruthlessly eliminated all standard properties and descendants—then why would the framework turn around and add support—at considerable cost in maintenance and readability and extendibility—for callers that expect a certain type of application?
Let’s take a look at the non-generic implementation and see what we lose or gain. The final version of the IApplicationManager
API is shown below, which properly balances the concerns of all stakeholders and hopefully will stand the test of time (or at least last until the next major revision).
IApplication CreateAndStartUp(
  Func<IApplicationCreationSettings, IApplication> createApplication
);

IApplicationExecutionTranscript Run(
  Func<IApplicationCreationSettings, IApplication> createApplication,
  Action<IApplication> run
);
These are the hard questions of API design: ensuring consistency, enforcing intent and balancing simplicity and cleanliness of code with expressiveness.
Run()
method for the desired type of application. Almost all of the startup code is shared and the pattern is the same everywhere.
ApplicationManager
were it to have been defined with generic parameters. Yet another thing to consider when choosing how to define your API.
IApplication
everywhere—and most probably will, because the advantage offered by making everything generic is vanishingly small.
If your API looks this scary, entropy will eat it alive before the end of the week, to say nothing of its surviving to the next major version.
IApplication
as the extended parameter in some cases and TApplication
in others). This issue is in how the application object is registered in the IOC. During development, when the framework was still using generics everywhere (or almost everywhere), some parts of the code were retrieving a reference to the application using the most-derived type whereas the application had been registered in the container as a singleton using IApplication
. The call to retrieve the most derived type returned a new instance of the application rather than the pre-registered singleton, which was a subtle and difficult bug to track down.
Published by marco on 19. Sep 2015 07:29:59 (GMT-5)
Updated by marco on 26. Sep 2015 11:24:56 (GMT-5)
In this article, we’re going to discuss a bit more about the configuration library in Quino 2.0.
Other entries on this topic have been the articles about Encodo’s configuration library for Quino: part I, part II and part III.
The goal of this article is to discuss a concrete example of how we decided whether to use generic type parameters throughout the configuration part of Quino. The meat of that discussion will be in a part 2 because we’re going to have to lay some groundwork about the features we want first. (Requirements!)
As of Quino 2.0-beta2, the configuration library consisted of a central IApplication
interface which has a reference to an IOC container and a list of startup and shutdown actions.
As shown in part III, these actions no longer have a generic TApplication
parameter. This makes it not only much easier to use the framework, but also easier to extend it. In this case, we were able to remove the generic parameter without sacrificing any expressiveness or type-safety.
As of beta2, there were still several places where generic TApplication
parameters were cluttering the API. Could we perhaps optimize further? Throw out even more complexity without losing anything?
One of these places is the actual engine that executes the startup and shutdown actions. This code is a bit trickier than just a simple loop because Quino supports execution in debug mode—without exception-handling—and release mode—with global exception-handling and logging.
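A heavily simplified sketch of such an engine follows; the helper methods and the transcript-creation call are invented for illustration, but the debug/release split is the interesting part: in debug mode, exceptions escape so the debugger stops where they are thrown, while in release mode a global handler logs the problem and shuts down as gracefully as possible.

public IApplicationExecutionTranscript Execute(IApplication application, bool isDebugMode)
{
  if (isDebugMode)
  {
    // No global exception handler: let the debugger stop at the throw site.
    return RunStartupActionsAndMainLoop(application);
  }

  try
  {
    return RunStartupActionsAndMainLoop(application);
  }
  catch (Exception exception)
  {
    // Global exception handler: log, clean up and report a crashed execution.
    LogError(exception);
    return CreateCrashedTranscript(exception);
  }
}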
As with any application that uses an IOC container, there is a configuration phase, during which the container can be changed and an execution phase, during which the container produces objects but can no longer be re-configured.
Until 2.0-beta2, the execution engine was encapsulated in several extension methods called Run()
, StartUp()
and so on. These methods were generally generic in TApplication
. I write “generally” because there were some inconsistencies with extension methods for custom application types like Winform or Console applications.
While extension methods can be really useful, this usage was not really appropriate as it violated the open/closed principle. For the final release of Quino, we wanted to move this logic into an IApplicationManager
so that applications using Quino could (A) choose their own logic for starting an application and (B) add this startup class to a non-Quino IOC container if they wanted to.
So far, so good. Before we discuss how to rewrite the application manager/execution engine, we should quickly revisit what exactly this engine is supposed to do. As it turns out, not only do we want to make an architectural change to make the design more open for extension, but the basic algorithm for starting an application changed, as well.
What does it mean to run an application?
Quino has always acknowledged and kinda/sorta supported the idea that a single application can be run in different ways. Even an execution that results in immediate failure technically counts as an execution, as a traversal of the state machine defined by the application.
If we view an application as the state machine that it is, then every application has at least two terminal nodes: OK and Error.
But what does OK mean for an application? In Quino, it means that all startup actions were executed without error and the run()
action passed in by the caller was also executed without error. Anything else results in an exception and is shunted to Error.
But is that true, really? Can you think of other ways in which an application could run to completion without having executed its main task and yet not have failed? For most applications, the answer is yes. Almost every application—and certainly every Quino application—supports a command line. One of the default options for the command line of a Quino application is -h
, which shows a manual for the other command-line options.
If the application is running in a console, this manual is printed to the console; for a Winform application, a dialog box is shown; and so on.
This “help” mode is actually a successful execution of the application that did not result in the main event loop of the application being executed.
Thought of in this way, any command-line option that controls application execution could divert the application to another type of terminal node in the state machine. A good example is when an application provides support for importing or exporting data via the command line.
A terminal node is also not necessarily only Crashed
or Ok
. Almost any application will also need to have a Canceled
mode that is a perfectly valid exit state. For example,
These are two ways in which a standard Quino application could run to completion without crashing but without having accomplished any of its main tasks. It ran and it didn’t crash, but it also didn’t do anything useful.
This section title sounds a bit pretentious, but that’s exactly what we want to discuss here. Instead of having just start and terminal nodes, the Quino startup supports cycles through intermediate nodes as well. What the hell does that mean? It means that some nodes may trigger Quino to restart in a different mode in order to handle a particular kind of error condition that could be repaired. [1]
A concrete example is desperately needed here, I think. The main use of this feature in Quino right now is to support on-the-fly schema-migration without forcing the user to restart the application. This feature has been in Quino from the very beginning and is used almost exclusively by developers during development. The use case to support is as follows:
This workflow minimizes the amount of trouble that a developer has when either making changes or when integrating changes from other developers. In all cases in which the application model is different from the developer’s database schema, it’s very quick and easy to upgrade and continue working.
How does this work internally in Quino 2.0? The application starts up but somehow encounters an error that indicates that a schema migration might be required. This can happen in one of two ways:
DatabaseException
that is indicative of a schema-mismatch.
In all of these cases, the application that was running throws an ApplicationRestartException
, which the standard IApplicationManager
implementation knows how to handle. It handles it by shutting down the running application instance and asking the caller to create a new application, but this time one that knows how to handle the situation that caused the exception. Concretely, the exception includes an IApplicationCreationSettings
descendant that the caller can use to decide how to customize the application to handle that situation.
The manager then runs this new application to completion (or until a new RestartApplicationException
is thrown), shuts it down, and asks the caller to create the original application again, to give it another go.
In the example above, if the user has successfully migrated the schema, then the application will start on this second attempt. If not, then the manager enters the cycle again, attempting to repair the situation so that it can get to a terminal node. Naturally, the user can cancel the migration and the application also exits gracefully, with a Canceled
state.
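A stripped-down sketch of that loop might look like the following. The real IApplicationManager is more involved; the default settings class, CreateTranscript() and ShutDown() are invented for illustration, and the name of the settings property on the exception is an assumption.

public IApplicationExecutionTranscript Run(
  Func<IApplicationCreationSettings, IApplication> createApplication,
  Action<IApplication> run)
{
  var settings = new StandardCreationSettings(); // hypothetical default settings

  while (true)
  {
    var application = createApplication(settings);
    try
    {
      run(application);
      return CreateTranscript(application); // terminal node: OK, Canceled, …
    }
    catch (ApplicationRestartException restart)
    {
      // Create and run a "repair" application (e.g. the schema migrator)
      // described by the exception, then loop around to give the original
      // application another go. (A nested restart is not handled here.)
      var repairApplication = createApplication(restart.CreationSettings);
      try
      {
        run(repairApplication);
      }
      finally
      {
        ShutDown(repairApplication);
      }
    }
    finally
    {
      ShutDown(application);
    }
  }
}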
A few examples of possible application execution paths:
The pattern is the same for interactive, client applications as for headless applications like test suites, which attempt migration once and abort if not successful. Applications like web servers or other services will generally only support the OK and Error states and fail when they encounter a RestartApplicationException
.
Still, it’s nice to know that the pattern is there, should you need it. It fits relatively cleanly into the rest of the API without making it more complicated. The caller passes two functions to the IApplicationManager
: one to create an application and one to run it.
An example from the Quino CodeGeneratorApplication
is shown below:
internal static void Main()
{
  new ApplicationManager().Run(CreateApplication, GenerateCode);
}

private static IApplication CreateApplication(
  IApplicationCreationSettings applicationCreationSettings
) { … }

private static void GenerateCode(IApplication application) { … }
We’ll see in the next post what the final API looks like and how we arrived at the final version of that API in Quino 2.0.
Published by marco on 19. Sep 2015 07:18:15 (GMT-5)
Encodo first published a Git Handbook for employees in September 2011 and last updated it in July of 2012. Since then, we’ve continued to use Git, refining our practices and tools. Although a lot of the content is still relevant, some parts are quite outdated and the overall organization suffered through several subsequent, unpublished updates.
What did we change from version 2.0?
You can download version 3 of the Git Handbook or get the latest copy from here.
Chapter 3, Basic Concepts and chapter 4, Best Practices have been included in their entirety below.
Focused commits are required; small commits are highly recommended. Keeping the number of changes per commit tightly focused on a single task helps in many cases.
For example, if you are working on a bug fix and discover that you need to refactor a file as well, or clean up the documentation or formatting, you should finish the bug fix first, commit it and then reformat, document or refactor in a separate commit.
Even if you have made a lot of changes all at once, you can still separate changes into multiple commits to keep those commits focused. Git even allows you to split changes from a single file over multiple commits (the Git GUI provides this functionality, as does the index editor in SmartGit).
Use the staging area to make quick snapshots without committing changes but still being able to compare them against more recent changes.
For example, suppose you want to refactor the implementation of a class.
Where you develop new code depends entirely on the project release plan.
Follow these rules for which command to use to combine two branches:
A branching model is required in order to successfully manage a non-trivial project.
Whereas a trivial project generally has a single branch and few or no tags, a non-trivial project has a stable release—with tags and possible hotfix branches—as well as a development branch—with possible feature branches.
A common branching model in the Git world is called Git Flow. Previous versions of this manual included more specific instructions for using the Git Flow-plugin for Git but experience has shown that a less complex branching model is sufficient and that using standard Git commands is more transparent.
However, since Git Flow is a very widely used branching model, retaining the naming conventions helps new developers more easily understand how a repository is organized.
The following list shows the branch types as well as the naming convention for each type:
The main difference from the Git Flow branching model is that there is no explicit stable branch. Instead, the last version tag serves the purpose just as well and is less work to maintain. For more information on where to develop code, see “3.3 – Developing New Code”.
To get a better picture of how these branches are created and merged, the following diagram depicts many of the situations outlined above.
The diagram tells the following story:
Published by marco on 3. Sep 2015 12:30:57 (GMT-5)
Way back in February, I wrote about my experiences with ReSharper 9 when it first came out. The following article provides an update, this time with version 9.2, released just last week.
tl;dr: I’m back to ReSharper 8.2.3 and am a bit worried about the state of the 9.x series of ReSharper. Ordinarily, JetBrains has eliminated performance, stability and functional issues by the first minor version-update (9.1), to say nothing of the second (9.2).
In the previous article, my main gripe was with the unit-test runner, which was unusable due to flakiness in the UI, execution and change-detection. With the release of 9.2, the UI and change-detection problems have been fixed, but the runner is still quite flaky at executing tests.
What follows is the text of the report that I sent to JetBrains when they asked me why I uninstalled R# 9.2.
As with 9.0 and 9.1, I am unable to productively use the 9.2 Test Runner with many of my NUnit tests. These tests are not straight-up, standard tests, but R# 8.2.3 handled them without any issues whatsoever.
What’s special about my tests?
There are quite a few base classes providing base functionality. The top layers provide scenario-specific input via a generic type parameter.
- TestsBase
  - OtherBase<TMixin> (7 of these, one with an NUnit CategoryAttribute)
    - ConcreteTests<TMixin> (defines tests with NUnit TestAttributes)
      - ProviderAConcreteTests<TMixin> (CategoryAttribute)
        - ProtocolAProviderAConcreteTests (TMixin = ProtocolAProviderA; TestFixtureAttribute, CategoryAttributes)
        - ProtocolBProviderAConcreteTests (TMixin = ProtocolBProviderA; TestFixtureAttribute, CategoryAttributes)
      - ProviderBConcreteTests<TMixin> (CategoryAttribute)
        - ProtocolAProviderBConcreteTests (TMixin = ProtocolAProviderB; TestFixtureAttribute, CategoryAttributes)
        - ProtocolBProviderBConcreteTests (TMixin = ProtocolBProviderB; TestFixtureAttribute, CategoryAttributes)

The test runner in 9.2 is not happy with this at all. The test explorer shows all of the tests correctly, with the test counts correct. If I select a node for all tests for ProviderB and ProtocolA (696 tests in 36 fixtures), R# loads 36 non-expandable nodes into the runner and, after a bit of a wait, marks them all as inconclusive. Running an individual test-fixture node does not magically cause the tests to load or appear and also shows inconclusive (after a while; it seems the fixture setup executes as expected but the results are not displayed).
If I select a specific, concrete fixture and add or run those tests, R# loads and executes the runner correctly. If I select multiple test fixtures in the explorer and add them, they also show up as expandable nodes, with the correct test counts, and can be executed individually (per fixture). However, if I elect to run them all by running the parent node, R# once again marks everything as inconclusive.
As I mentioned, 8.2.3 handles this correctly and I feel R# 9.2 isn’t far off—the unit-test explorer does, after all, show the correct tests and counts. In 9.2, it’s not only inconvenient, but I’m worried that my tests are not being executed with the expected configuration.
Also, I really missed the StyleCop plugin for 9.2. There’s a beta version for 9.1 that caused noticeable lag, so I’m still waiting for a more unobtrusive version for 9.2 (or any version at all).
While it’s possible that there’s something I’m doing wrong, or there’s something in my installation that’s strange, I don’t think that’s the problem. As I mentioned, test-running for the exact same solution with 8.2.3 is error-free and a pleasure to use. In 9.2, the test explorer shows all of the tests correctly, so R# is clearly able to interpret the hierarchy and attributes (noted above) as I’ve intended them to be interpreted. This feels very much like a bug or a regression for which JetBrains doesn’t have test coverage. I will try to work with them to help them get coverage for this case.
Additionally, the StyleCop plugin is absolutely essential for my workflow and there still isn’t an official release for any of the 9.x versions. ReSharper 9.2 isn’t supported at all yet, even in prerelease form. The official Codeplex page shows the latest official version as 4.7, released in January of 2012 for ReSharper 8.2 and Visual Studio 2013. One would imagine that VS2015 support is in the works, but it’s hard to say. There is a page for StyleCop in the ReSharper extensions gallery but that shows a beta4, released in April of 2015, that only works with ReSharper 9.1.x, not 9.2. I tested it with 9.1.x, but it noticeably slowed down the UI. While typing was mostly unaffected, scrolling and switching file-tabs was very laggy. Since StyleCop is essential for so many developers, it’s hard to see why the plugin gets so little love from either JetBrains or Microsoft.
The “Go To Word” plugin is not essential but it is an extremely welcome addition, especially with so much more client-side work depending on text-based bindings that aren’t always detected by ReSharper. In those cases, you can find—for example—all the references of a Knockout template by searching just as you would for a type or member. Additionally, you benefit from the speed of the ReSharper indexing engine and search UI instead of using the comparatively slow and ugly “Find in Files” support in Visual Studio. Alternatives suggested in the comments to the linked issue above all depend on building yet another index of data (e.g. Sando Code Search Tool). JetBrains has pushed off integrating go-to-word until version 10. Again, not a deal-breaker, but a shame nonetheless, as I’ll have to do without it in 9.x until version 10 is released.
With so much more client-side development going on in Visual Studio and with dynamic languages and data-binding languages that use name-matching for data-binding, GoToWord is more and more essential. Sure, ReSharper can continue to integrate native support for finding such references, but until that happens, we’re stuck with the inferior Find-in-Files dialog or other extensions that increase the memory pressure for larger solutions.
Published by marco on 30. May 2015 23:51:19 (GMT-5)
The summary below describes major new features, items of note and breaking changes. The full list of issues is also available for those with access to the Encodo issue tracker.
In beta1, we read about changes to configuration, the data driver architecture, DDL commands, and security and access control in web applications.
In beta-2, we made the following additional improvements:
- IApplication, ICoreApplication and IMetaApplication. (QNO-4789, QNO-4788, QNO-4786, QNO-4785, QNO-4671, QNO-4669, QNO-4668, QNO-4667, QNO-4660)
- ICustomCommandBuilder. This was added by customer request, for applications that formulate queries that are beyond what the Quino ORM is currently capable of mapping. A blog post with more detail on how this works is forthcoming. (QNO-4802)
- DataContract and DataMember attributes in metadata and generated code. (QNO-4823, QNO-4826)

This release addressed some issues that have been bugging us for a while (almost 3 years in one case).
You will not be missed.
As we’ve mentioned before, this release is absolutely merciless in regard to backwards compatibility. Old code is not retained as Obsolete. Instead, a project upgrading to 2.0 will encounter compile errors.
That said, if you arm yourself with a bit of time, ReSharper and the release notes (and possibly keep an Encodo employee on speed-dial), the upgrade is not difficult. It consists mainly of letting ReSharper update namespace references for you. In cases where the update is not so straightforward, we’ve provided release notes.
One of the few things you’ll be able to keep (at least for a minor version or two) is the old-style generated code. We made this concession because, while even a large solution can be upgraded from 1.13.0 to 2.0 relatively painlessly in about an hour (we’ve converted our own internal projects to test), changing the generated-code format is potentially a much larger change. Again, an upgrade to the generated-code format isn’t complicated but it might require more than an hour or two’s worth of elbow grease to complete.
Therefore, you’ll be able to not only retain your old generated code, but the code generator will continue to support the old-style code-generation format for further development. Expect the grace period to be relatively short, though.
Regardless of whether you elect to keep the old-style generated code, you’ll have to do a little bit of extra work just to be able to generate code again.
Before you can regenerate, you’ll have to manually update your previously generated code in the main model file, as shown below.
static MyModel()
{
Messages = new InMemoryRecorder();
Loader = new ModelLoader(() => Instance, () => Messages, new MyModelGenerator());
}
public static IMetaModel CreateModel(IExtendedRecorder recorder)
{
if (recorder == null) { throw new ArgumentNullException("recorder"); }
var result = Loader.Generator.CreateModel(recorder);
result.Configure();
return result;
}
// More code …
/// <inheritdoc/>
protected override void DoConfigure()
{
base.DoConfigure();
ConfigurePreferredTypes();
ApplyCustomConfiguration();
}
static MyModel()
{
Messages = new InMemoryRecorder();
Loader = new ModelLoader(() => Instance, () => Messages, new MyModelGenerator());
}
public static IMetaModel CreateModel(IExtendedRecorder recorder)
{
if (recorder == null) { throw new ArgumentNullException("recorder"); }
var result = (MyModel)new MyModelGenerator().CreateModel(
ServiceLocator.Current.GetInstance<IExpressionParser>(),
ServiceLocator.Current.GetInstance<IMetaExpressionFactory>(),
recorder
);
result.ConfigurePreferredTypes();
result.ApplyCustomConfiguration();
return result;
}
/// <inheritdoc/>
protected override void DoConfigure()
{
base.DoConfigure();
ConfigurePreferredTypes();
ApplyCustomConfiguration();
}
In the application configuration, the first time you generate code with Quino 2.0, you should use:
ModelLoader = MyModel.Loader;
this.UseMetaSimpleInjector();
this.UseModelLoader(MyModel.CreateModel);
After regenerating code, you should use the following for version-2 generated code:
ModelLoader = MyModel.Loader;
this.UseMetaSimpleInjector();
this.UseModelLoader(MyModelExtensions.CreateModelAndMetadata);
…and the following for version-1 generated code:
ModelLoader = MyModel.Loader;
this.UseMetaSimpleInjector();
this.UseModelLoader(MyModel.CreateModel);
As you can see, we’ve already done quite a bit of work in beta1 and beta2. We have a few more tasks planned for the feature-complete release candidate for 2.0:
Move the schema-migration metadata table to a module.
The Quino schema-migration extracts most of the information it needs from database schema itself. It also stores extra metadata in a special table. This table has been with Quino since before modules were supported (over seven years) and hence was built in a completely custom manner. Moving this support to a Quino metadata module will remove unnecessary implementation and make the migration process more straightforward. (QNO-4888)
Separate collection algorithm from storage/display method in IRecorder
and descendants.
The recording/logging library has a very good interface but the implementation for the standard recorders has become too complex as we added support for multi-threading, custom disposal and so on. We want to clean this up to make it easier to extend the library with custom loggers. (QNO-4888)
Finish integrating building and publishing NuGet and symbol packages into Quino’s release process.
And, finally, once we have the assemblies split up to our liking, we’ll finalize the NuGet packages for the Quino library and leave the direct-assembly-reference days behind us, ready for Visual Studio 2015.
(QNO-4376)
That’s all we’ve got for now. See you next month for the next (and, hopefully, final) update!
Published by marco on 17. May 2015 17:45:56 (GMT-5)
This discussion about configuration spans three articles:
Registering with an IOC is all well and good, but something has to make calls into the IOC to get the ball rolling.
Even service applications—which start up quickly and wait for requests to do most of their work—have basic operations to execute before declaring themselves ready.
Things can get complex when starting up registered components and performing basic checks and non-IOC configuration.
Part of the complexity of configuration and startup is that developers quickly forget all of the things that they’ve come to expect from a mature product and start from zero again with each application. Encodo and Quino applications take advantage of prior work to include standard behavior for a lot of common situations.
Some components can be configured once and directly by calling a method like UseMetaTranslations(string filePath)
, which includes all of the configuration options directly in the composition call. This pattern is perfect for options that are used only by one action or that wouldn’t make sense to override in a subsequent action.
So, for simple actions, an application can just replace the existing action with its own, custom action. In the example above, an application for which translations had already been configured would just call UseMetaTranslations()
again in order to override that behavior with its own.
Most applications will replace standard actions or customize standard settings
Some components, however, will want to expose settings that can be customized by actions before they are used to initialize the component.
For example, there is an action called SetUpLoggingAction
, which configures logging for the application. This action uses IFileLogSettings
and IEventLogSettings
objects from the IOC during execution to determine which types of logging to configure.
An application is, of course, free to replace the entire SetUpLoggingAction
action with its own, completely custom behavior. However, an application that just wanted to change the log-file behavior or turn on event-logging could use the Configure<TService>()
method [1], as shown below.
application.Configure<IFileLogSettings>(
s => s.Behavior = LogFileBehavior.MultipleFiles
);
application.Configure<IEventLogSettings>(
s => s.Enabled = true
);
A Quino application object has a list of StartupActions
and a list of ShutdownActions
. Most standard middleware methods register objects with the IOC and add one or more actions to configure those objects during application startup.
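A typical middleware method is therefore quite small: it registers the settings its action will need and queues the action itself. The sketch below only illustrates the shape of such a method; the registration helper and the concrete settings class are invented names, while IFileLogSettings and SetUpLoggingAction are the types mentioned above.

public static IApplication UseFileLogging(this IApplication application)
{
  // Register the settings object that the startup action (and anyone else)
  // can later retrieve from the IOC. FileLogSettings is a hypothetical
  // default implementation of IFileLogSettings.
  application.RegisterSingleton<IFileLogSettings>(new FileLogSettings());

  // Queue the action that actually configures logging during startup.
  // (The action's constructor is shown parameter-less for illustration.)
  application.StartupActions.Add(new SetUpLoggingAction());

  return application;
}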
Actions have existed for quite a while in Quino. In Quino 2, they have been considerably simplified and streamlined to the point where all but a handful are little more than a functional interface [2].
The list below will give you an idea of the kind of configuration actions we’re talking about.
For installed/desktop/mobile applications, there’s also:
Quino applications also have actions to configure metadata:
Application shutdown has a smaller set of vital cleanup chores that:
The following example [3] is for the 1.x version of the relatively simple ConfigureDisplayLanguageAction
.
public class ConfigureDisplayLanguageAction<TApplication>
  : ApplicationActionBase<TApplication>
  where TApplication : ICoreApplication
{
  public ConfigureDisplayLanguageAction()
    : base(CoreActionNames.ConfigureDisplayLanguage)
  {
  }

  protected override int DoExecute(
    TApplication application, ConfigurationOptions options, int currentResult)
  {
    // Configuration code…
  }
}
What is wrong with this startup action? The following list illustrates the main points, each of which is addressed in more detail in its own section further below.
- The ConfigurationOptions parameter introduces an unnecessary layer of complexity
- The TApplication parameter complicates declaration, instantiation and extension methods that use the action
- The int return type, along with the currentResult parameter, is a bad way of controlling flow

The same startup action in Quino 2.x has the following changes from the Quino 1.x version above (additions marked with “+”, deletions with “-”).
-public class ConfigureDisplayLanguageAction<TApplication>
+public class ConfigureDisplayLanguageAction
-  : ApplicationActionBase<TApplication>
+  : ApplicationActionBase
-  where TApplication : ICoreApplication
 {
   public ConfigureDisplayLanguageAction()
     : base(CoreActionNames.ConfigureDisplayLanguage)
   {
   }

-  protected override int DoExecute(
-    TApplication application, ConfigurationOptions options, int currentResult)
+  public override void Execute()
   {
     // Configuration code…
   }
 }
As you can see, quite a bit of code and declaration text was removed, all without sacrificing any functionality. The final form is quite simple, inheriting from a simple base class that manages the name of the action and overrides a single parameter-less method. It is now much easier to see what an action does and the barrier to entry for customization is much lower.
public class ConfigureDisplayLanguageAction : ApplicationActionBase
{
  public ConfigureDisplayLanguageAction()
    : base(CoreActionNames.ConfigureDisplayLanguage)
  {
  }

  public override void Execute()
  {
    // Configuration code…
  }
}
In the following sections, we’ll take a look at each of the problems indicated above in more detail.
The ConfigurationOptions parameter
These options are a simple enumeration with values like Client
, Testing
, Service
and so on. They were used only by a handful of standard actions.
These options made it more difficult to decide how to implement the action for a given task. If two tasks were completely different, then a developer would know to create two separate actions. However, if two tasks were similar, but could be executed differently depending on application type (e.g. testing vs. client), then the developer could still have used two separate actions, but could also have used the configuration options. Multiple ways of doing the exact same thing is all kinds of bad.
Parameters like this conflict conceptually with the idea of using composition to build an application. To keep things simple, Quino applications should be configured exclusively by composition. Composing an application with service registrations and startup actions and then passing options to the startup introduced an unneeded level of complexity.
Instead, an application now defines a separate action for each set of options. For example, most applications will need to set up the display language to use—be it for a GUI, a command-line or just to log messages in the correct language. For that, the application can add a ConfigureDisplayLanguageAction
to the startup actions or call the standard method UseCore()
. Desktop or single-user applications can use the ConfigureGlobalDisplayLanguageAction
or call UseGlobalCore()
to make sure that global language resources are also configured.
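In code, the difference between the two approaches is small; the sketch below assumes an application variable that exposes the StartupActions list mentioned above and is not taken from the Quino sources.

// Fine-grained: add exactly the action you want.
application.StartupActions.Add(new ConfigureDisplayLanguageAction());

// Coarse-grained: pull in the standard set of core actions, which includes
// the display-language configuration.
application.UseCore();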
The TApplication generic parameter
The generic parameter to this interface complicates the IApplication<TApplication>
interface and causes no end of trouble in MetaApplication
, which actually inherits from IApplication<IMetaApplication>
for historical reasons.
Originally, this parameter guaranteed that an action could be stateless. However, each action object is attached to exactly one application (in the IApplication<TApplication>.StartupActions
list). So the action that is attached to an application is technically stateless, and a completely different application than the one to which the action is attached could be passed to the IApplicationAction.Execute
…which makes no sense whatsoever.
Luckily, this never happens, and only the application to which the action is attached is passed to that method. If that’s the case, though, why not just create the action with the application as a constructor parameter when the action is added to the StartupActions
list? There is no need to maintain statelessness for a single-use object.
This way, there is no generic parameter for the IApplication
interface, all of the extension methods are much simpler and applications are free to create custom actions that work with descendants of IApplication
simply by requiring that type in the constructor parameter.
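As a sketch of what such a custom action can look like, assume the calling application defines its own IInventoryApplication descendant of IApplication; both the interface and the action-name string are invented for illustration, and the base class is the one shown later in this article.

public class ConfigureInventoryCacheAction : ApplicationActionBase
{
  private readonly IInventoryApplication _application;

  public ConfigureInventoryCacheAction(IInventoryApplication application)
    : base("ConfigureInventoryCache")
  {
    // The action works with a descendant of IApplication simply by requiring
    // that type here; no generic parameter is needed anywhere.
    _application = application;
  }

  public override void Execute()
  {
    // Configuration code that uses _application…
  }
}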
The original startup avoided exceptions, preferring an integer return result instead.
In release mode, a global exception handler is active and is there to help the application exit more or less smoothly—e.g. by logging the error, closing resources where possible, and so on.
A global exception handler is terrible for debugging, though. For exceptions that are caught, the default behavior of the debugger is to stop where the exception is caught rather than where it is thrown. Instead, you want exceptions raised by your application to stop the debugger where they are thrown.
So that’s part of the reason why the startup and shutdown in 1.x used return codes rather than exceptions.
The other reason Quino used result codes is that most non-trivial applications actually have multiple paths through which they could successfully run.
Exactly which path the application should take depends on startup conditions, parameters and so on. Some common examples are:
To show command-line help, an application execute its startup actions in order. It reaches the action that checks whether the user requested command-line help. This action processes the request, displays that help and then wants to smoothly exit the application. The “main” path—perhaps showing the user a desktop application—should no longer be executed.
Non-trivial applications have multiple valid run profiles.
Similarly, the action that checks the database schema determines that the schema in the data provider doesn’t match the model. In this case, it would like to offer the user (usually a developer) the option to update the schema. Once the schema is updated, though, startup should be restarted from the beginning, trying again to run the main path.
The Quino 1.x startup addressed the design requirements above with return codes, but this imposes an undue burden on implementors. There was also confusion as to when it was OK to actually throw an exception rather than returning a special code.
Instead, the Quino 2.x startup always uses exceptions to indicate errors. There are a few special types of exceptions recognized by the startup code that can indicate whether the application should silently—and successfully—exit or whether the startup should be attempted again.
There is of course more detail into which we could go on much of what we discussed in these three articles, but that should suffice for an overview of the Quino configuration library.
Published by marco on 17. May 2015 17:45:20 (GMT-5)
Updated by marco on 19. Sep 2015 07:13:29 (GMT-5)
In this article, we’ll continue the discussion about configuration started in part I. We wrapped up that part with the following principles to keep in mind while designing the new system.
Quino’s configuration inconsistencies and issues have been well-known for several versions—and years—but the opportunity to rewrite it comes only now with a major-version break.
Luckily for us, ASP.NET has been going through a similar struggle and evolution. We were able to model some of our terminology on the patterns from their next version. For example, ASP.NET has moved to a pattern where an application-builder object is passed to user code for configuration. The pattern there is to include middleware (what we call “configuration”) by calling extension methods starting with “Use”.
Quino has had a similar pattern for a while, but the method names varied: “Integrate”, “Add”, “Include”; these methods have now all been standardized to “Use” to match the prevailing .NET winds.
Additionally, Quino used to make a distinction between an application instance and its “configuration”—the template on which an application is based. No more. Too complicated. This design decision, coupled with the promotion of a platform-specific “Feedback” object to first-level citizen, led to an explosion of generic type parameters. [1]
The distinction between configuration (template) and application (instance) has been removed. Instead, there is just an application object to configure.
The feedback object is now to be found in the service locator. An application registers a platform-specific feedback to use as it would any other customization.
[1] The CustomWinformFeedback in the Quino 1.x code at the end of this article provides a glaring example.
ASP.NET vNext has made the service locator a first-class citizen. In ASP.NET, applications receive an IApplicationBuilder in one magic “Configure” method and an IServiceCollection in another magic “ConfigureServices” method.
In Quino 2.x, the application is in charge of creating the service container, though Quino provides a method to create and configure a standard one (SimpleInjector). That service locator is passed to the IApplication object and is subsequently accessible there.
Services can of course be registered directly or by calling pre-packaged Middleware methods. Unlike ASP.NET vNext, Quino 2.x makes no distinction between configuring middleware and including the services required by that middleware.
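A rough sketch of what such a “Use” method looks like; the registration API shown here is illustrative rather than Quino's actual one:

public static class SoftwareUpdaterMiddleware
{
  public static void UseSoftwareUpdater(this IApplication application)
  {
    // Register the services this middleware needs; calling the method is how an
    // application opts in to the functionality.
    application.Services.RegisterSingle<ISoftwareUpdater, SoftwareUpdater>();
    application.Services.RegisterSingle<ISoftwareUpdateFeedback, DefaultSoftwareUpdateFeedback>();
  }
}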
Quino’s configuration library has its roots in a time before we were using an IOC container. The configuration was defined as a hierarchy of configuration classes that modeled the following layers.
For example, an application knows its RunMode (“debug” or “release”), has an exit code and has a logging mechanism (e.g. IRecorder).
While these layers are still somewhat evident, the move to middleware packages has blurred the distinction between them. Instead of choosing a concrete configuration base class, an application now calls a handful of “Use” methods to indicate what kind of application to build.
There are, of course, still helpful top-level methods—e.g. UseCore() and UseMeta()—that pull in all of the middleware for the standard application types. But, crucially, the application is free to tweak this configuration with more granular calls to register custom configuration in the service locator.
This is a flexible and transparent improvement over passing esoteric parameters to monolithic configuration methods, as in the previous version.
Just as a simple example: whereas a Quino 1.x standalone application would set ICoreConfiguration.UseSoftwareUpdater to true, a Quino 2.x application calls UseSoftwareUpdater(). Where a Quino 1.x Winform application would inherit from the WinformFeedback in order to return a customized ISoftwareUpdateFeedback, a Quino 2.x application calls UseSoftwareUpdateFeedback().
The software-update feedback class is defined below and is used by both versions.
public class CustomSoftwareUpdateFeedback : WinformSoftwareUpdateFeedback<IMetaApplication>
{
  protected override ResponseType DoConfirmUpdate(IMetaApplication application, …)
  {
    …
  }
}
That’s where the similarities end, though. The code samples below show the stark difference between the old and new configuration systems.
As explained above, Quino 1.x did not allow registration of a sub-feedback like the software-updater. Instead, the application had to inherit from the main feedback and override a method to create the desired sub-feedback.
class CustomWinformFeedback : WinformFeedback
{
  public virtual ISoftwareUpdateFeedback<TApplication> GetSoftwareUpdateFeedback<TApplication, TConfiguration, TFeedback>()
    where TApplication : ICoreApplication<TConfiguration, TFeedback>
    where TConfiguration : ICoreConfiguration
    where TFeedback : ICoreFeedback
  {
    return new CustomSoftwareUpdateFeedback(this);
  }
}
var configuration = new CustomConfiguration()
{
  UseSoftwareUpdater = true
};

WinformDxMetaConfigurationTools.Run(
  configuration,
  app => new CustomMainForm(app),
  new CustomWinformFeedback()
);
The method-override in the feedback was hideous and scared off a good many developers. Not only that: the pattern was to use a magical, platform-specific WinformDxMetaConfigurationTools.Run method to create an application, run it and dispose it.
Software-update feedback-registration in Quino 2.x adheres to the principles outlined at the top of the article: it is consistent and uses common patterns (functionality is included and customized with methods named “Use”), configuration is opt-in, and the IOC container is used throughout (albeit implicitly with these higher-level configuration methods).
using (var application = new CustomApplication())
{
  application.UseMetaWinformDx();
  application.UseSoftwareUpdater();
  application.UseSoftwareUpdaterFeedback(new CustomSoftwareUpdateFeedback());
  application.Run(app => new CustomMainForm(app));
}
Additionally, the program has complete control over creation, running and disposal of the application. No more magic and implicit after-the-fact configuration.
In the next and (hopefully) final article, we’ll take a look at configuring execution—the actions to execute during startup and shutdown. Registering objects in a service locator is all well and good, but calls into the service locator have to be made in order for anything to actually happen.
Keeping this system flexible and addressing standard application requirements is a challenging but not insurmountable problem. Stay tuned.
Published by marco on 10. Apr 2015 15:36:06 (GMT-5)
In this article, I’ll continue the discussion about configuration improvements mentioned in the release notes for Quino 2.0-beta1. With beta2 development underway, I thought I’d share some more of the thought process behind the forthcoming changes.
what sort of patterns integrate and customize the functionality of libraries in an application?
An application comprises multiple tasks, only some of which are part of that application’s actual domain. For those parts not in the application domain, software developers use libraries. A library captures a pattern or a particular way of doing something, making it available through an abstraction. These simplify and smooth away detail irrelevant to the application.
A runtime and its standard libraries provide many such abstractions: for reading/writing files, connecting to networks and so on. Third-party libraries provide others, like logging, IOC, task-scheduling and more.
Because Encodo’s been writing software for a long time, we have a lot of patterns that we’ve come up with for our applications. These libraries are split into two main groups:
A sort of “meta” library that lies on top of all of this is configuration and startup of applications that use these libraries. That is, what sort of patterns integrate and customize the functionality of libraries in an application?
Almost nowhere in an application is the balance between K.I.S.S. and D.R.Y. more difficult to maintain than in configuration and startup.
So if we already know all of that, why does Quino need a new configuration library?
As mentioned above, there is a lot of commonality between applications in this area. An application will definitely want to incorporate such common configuration from a library. Updates and improvements to that library will then be applied as for any other. This is a good thing.
However, an application will also want to be able to tweak almost any given facet of this shared configuration. That is: just keep the good parts, have those upgraded when they’re changed, but apply customization and extend functionality for the application’s domain. Easy, right?
It is here that a good configuration library will find just the right level of granularity for customization. Too coarse? Then an application ends up throwing out too much common configuration in order to customize a small part of it. Too fine? Then the configuration system is too verbose or complex and the application avoids using it.
Instead, a configuration system should establish clear patterns—optimally, just one—for how to apply customization.
So if we already know all of that, then why does Quino need a new configuration library? Well…
It’s really easy to make things over-complicated and muddy. It’s really easy to end up growing several different kinds of extension systems over the years. Quino ended up with a generics-heavy API that made declaring new configuration components very wordy.
The core of Quino is the metadata definition for an application domain. That part has barely changed at all since we first wrote it lo so many years ago. We declared it to be our core business—the part that we are better than others at—the part we wanted to have under our own control. Our first draft [1] has held up remarkably well.
Many of the other components have undergone quite a bit of flux: changes in requirements and the components themselves as well as new development processes and patterns all contributed to change. Over time, various applications had different needs and made adjustments to a different iteration of the configuration library. We moved from supporting only single-threaded, single-user desktop applications to also supporting multi-user, multi-threaded services and web servers.
…we were left with an ugly configuration system that no-one wanted to extend or
use—so yet another would be invented.
For all of these different applications, we naturally wanted to maintain the common configuration where possible—but customizations for new platforms stretched the capabilities of the configuration library.
Customization would be made to a new version of that library, but applications that couldn’t be upgraded immediately forced backwards-compatibility and thus resulted in several different concurrent ways of configuring a particular facet of an application.
In order to keep things in one place, we ended up breaking the interface-separation rule. Dependencies started clumping drastically, but it was OK because nobody was trying to use one thing without the other ten. But it was hard to see what was going on; customization became a black box for all but one or two gurus. On and on it went, until we were left with an ugly configuration system that no-one wanted to extend or use—so yet another would be invented, ad-hoc. And so it went.
With Quino 2.0, we examined the existing system and came up with a list of principles.
In the next part, we’ll take a look at some concrete examples and documentation for the new patterns. [2]
Published by marco on 28. Mar 2015 23:26:29 (GMT-5)
The summary below describes major new features, items of note and breaking changes. The full list of issues is also available for those with access to the Encodo issue tracker.
These are the big ones that forced a major-version change.
Renamed IMessageRecorder to IRecorder, renamed IMessageStore to IInMemoryRecorder and consolidated IFilteredMessageRecorder into IRecorder. (QNO-4686, QNO-4696, QNO-4750, QNO-4557)
Some smaller, but important changes:
Added the RunInTransaction attribute. Specify the attribute on any IMetaTestFixture to wrap a test or every test in a fixture in a transaction. (QNO-4682)
Oh yeah. You betcha. This is a major release and we’ve knowingly made a decision not to maintain backwards-compatibility at all costs. The good news, though: the changes are relatively straightforward and easy to make if you’ve got a tool like ReSharper that can update using statements automatically.
As we saw in part I and part II of the guide to using NDepend, Quino 2.0 has unsnarled quite a few dependency issues. A large number of classes and interfaces have been moved out of the Encodo.Tools namespace. Many have been moved to Encodo.Core, but others have been scattered into more appropriate and more specific namespaces.
This is one part of the larger changes, easily addressed by using ReSharper to Alt + Enter your way through the compile errors.
Another large change is in renaming IMessageRecorder to IRecorder and IMessageStore to IInMemoryRecorder. Judicious use of search/replace or just a bit of elbow grease will get you through these as well.
Finally, probably the most far-reaching change is in merging IConfiguration into IApplication. In previous versions of Quino, applications would create a configuration object and pass that to a platform-dependent Quino Run() method. Some configuration was provided by the application and some by the platform-specific method.
The example for Quino 1.13.0 below comes from the JobVortex Winform application.
var configuration = new JobVortexConfiguration
{
  MainSettings = Settings.Default
};

configuration.Add(new JobVortexClientConfigurationPackage());

if (!string.IsNullOrEmpty(Settings.Default.DisplayLanguage))
{
  configuration.DisplayLanguage = new Language(Settings.Default.DisplayLanguage);
}

WinformDxMetaConfigurationTools.Run(
  configuration,
  app => new MainForm(app)
);
In Quino 2.0, the code above has been rewritten as shown below.
using (IMetaApplication application = new JobVortexApplication())
{
  application.MainSettings = Settings.Default;
  application.UseJobVortexClient();

  if (!string.IsNullOrEmpty(Settings.Default.DisplayLanguage))
  {
    application.DisplayLanguage = new Language(Settings.Default.DisplayLanguage);
  }

  application.Run(app => new MainForm(app));
}
As you can see, instead of creating a configuration, the program creates an application object. Instead of using configuration packages mixed with extension methods named “Integrate”, “Configure” and so on, the new API uses “Use” everywhere. This should be comfortable for people familiar with the OWIN/Katana configuration pattern.
It does, however, mean that IConfiguration, ICoreConfiguration and IMetaConfiguration don’t exist anymore. Instead, use IApplication, ICoreApplication and IMetaApplication.
Again, a bit of elbow grease will be needed to get through these compile errors, but there’s little to no risk or need for high-level decisions.
There are a lot of these prepackaged methods to help you create common kinds of applications:
UseCoreConsole() (a non-Quino application that uses the console)
UseMetaConsole() (a Quino application that uses the console)
UseCoreWinformDx() (a non-Quino application that uses Winform)
UseMetaWinformDx() (a Quino application that uses Winform)
UseReporting()
UseRemotingServer()
I think you get the idea. Once we have a final release for Quino 2.0, we’ll write more about how to use this new pattern.
This is still just an internal beta of the 2.0 final version. More changes are on the way, including but not limited to:
Removing IConfigurationPackage and standardizing the configuration API to be named “Use” everywhere (QNO-4771)
GenericObject improvements (QNO-4761, QNO-4762)
Moving ICoreApplication and IMetaApplication properties to configuration objects in the service locator; also improving use of and configuration of the service locator (QNO-4659)
See you there!
Published by marco on 13. Mar 2015 08:59:09 (GMT-5)
Microsoft has recently made a lot of their .NET code open-source. Not only is the code for many of the base libraries open-source but also the code for the runtime itself. On top of that, basic .NET development is now much more open to community involvement.
In that spirit, even endeavors like designing the features to be included in the next version of C# are online and open to all: C# Design Meeting Notes for Jan 21, 2015 by Mads Torgerson (GitHub).
You may be surprised at the version number “7”—aren’t we still waiting for C# 6 to be officially released? Yes, we are.
If you’ll recall, the primary feature added to C# 5 was support for asynchronous operations through the async/await keywords. Most .NET programmers are only getting around to using this rather far- and deep-reaching feature, to say nothing of the new C# 6 features that are almost officially available.
C# 6 brings the following features with it and can be used in the CTP versions of Visual Studio 2015 or downloaded from the Roslyn project (GitHub).
Some of the more interesting features of C# 6 are:
An out parameter can now be declared inline with var or a specific type. This avoids the ugly variable declaration outside of a call to a Try* method.
using can now be used with a static class as well as a namespace. Direct access to methods and properties of a static class should clean up some code considerably.
Instead of string.Format() and numbered parameters for formatting, C# 6 allows expressions to be embedded directly in a string (à la PHP): e.g. “{Name} logged in at {Time}”.
The null-propagating operator lets an expression evaluate to null when the target of a call is null. E.g. company.People?[0]?.ContactInfo?.BusinessAddress.Street includes three null-checks.
If the idea of using await correctly or wrapping your head around the C# 6 features outlined above doesn’t already make your poor head spin, then let’s move on to language features that aren’t even close to being implemented yet.
That said, the first set of design notes for C# 7 by Mads Torgerson (GitHub) include several interesting ideas as well.
Metaprogramming: Another focus for C# is reducing boilerplate and capturing common code-generation patterns. They’re thinking of delegation of interfaces through composition. Also welcome would be an improvement in the expressiveness of generic constraints.
Related User Voice issues:
Checking references for null at compile-time (where reasonable—they do acknowledge that they may end up with a “less ambitious approach”).
Lambda capture lists: One of the issues with closures is that they currently just close over any referenced variables. The compiler just makes this happen and for the most part works as expected. When it doesn’t work as expected, it creates subtle bugs that lead to leaks, race conditions and all sorts of hairy situations that are difficult to debug.
If you throw in the increased use of and nesting of lambda calls, you end up with subtle bugs buried in frameworks and libraries that are nearly impossible to tease out.
The idea of this feature is to allow a lambda to explicitly capture variables and perhaps even indicate whether the capture is read-only. Any additional capture would be flagged by the compiler or tools as an error.
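A plain C# illustration of the kind of accidental capture this would prevent (this shows today's behavior, not the proposed syntax):

var actions = new List<Action>();
var buffer = new byte[100000000]; // a large object we never meant to keep alive

for (var i = 0; i < 3; i++)
{
  // The lambda silently captures both i and buffer. All three delegates share the
  // same i, so each prints "3" when invoked later, and buffer stays alive for as
  // long as any delegate is referenced. A capture list would make both explicit.
  actions.Add(() => Console.WriteLine(i + ": " + buffer.Length));
}

foreach (var action in actions)
{
  action(); // prints "3: 100000000" three times
}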
Contracts(!): And, finally, this is the feature I’m most excited about because I’ve been waiting for integrated language support for Design by Contract for literally decades [1], ever since I read the Object-Oriented Software Construction 2 (Amazon) (OOSC2) for the first time. The design document doesn’t say much about it, but mentions that “.NET already has a contract system”, the weaknesses of which I’ve written about before. Torgersen writes:
“When you think about how much code is currently occupied with arguments and result checking, this certainly seems like an attractive way to reduce code bloat and improve readability.”
…and expressiveness and provability!
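For reference, the existing .NET contract system (System.Diagnostics.Contracts) expresses these checks as ordinary method calls rather than as part of the language, roughly like this:

using System.Diagnostics.Contracts;

public static string Capitalize(string text)
{
  Contract.Requires(!string.IsNullOrEmpty(text));                    // precondition
  Contract.Ensures(Contract.Result<string>().Length == text.Length); // postcondition

  return char.ToUpper(text[0]) + text.Substring(1);
}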
There are a bunch of User Voice issues that I can’t encourage you enough to vote for so we can finally get this feature:
With some or all of these improvements, C# 7 would move much closer to a provable language at compile-time, an improvement over being a safe language at run-time.
We can already indicate that instance data or properties are readonly. We can already mark methods as static to prevent the use of this. We can use ReSharper [NotNull] attributes to (kinda) enforce non-null references without using structs and incurring the debt of value-passing and -copying semantics.
I’m already quite happy with C# 5, but if you throw in some or all of the stuff outlined above, I’ll be even happier. I’ll still have stuff I can think of to increase expressiveness—covariant return types for polymorphic methods or anchored types or relaxed contravariant type-conformance—but this next set of features being discussed sounds really, really good.
I love the features of the language Eiffel, but haven’t ever been able to use it for work. The tools and IDE are a bit stuck in the past (very dated on Windows; X11 required on OS X). The language is super-strong, with native support for contracts, anchored types, null-safe programming, contravariant type-conformance, covariant return types and probably much more that C# is slowly but surely including with each version. Unfair? I’ve been writing about this progress for years (from newest to oldest):
Published by marco on 7. Mar 2015 08:11:14 (GMT-5)
In part I of this series, we discussed applications, which provide the model and data provider, and sessions, which encapsulate high-level data context. In part II, we covered command types and inputs to the data pipeline.
In this article, we’re going to take a look at the data pipeline itself.
The primary goal of the data pipeline is, of course, to correctly execute each query to retrieve data or command to store, delete or refresh data. The diagram to the right shows that the pipeline consists of several data handlers. Some of these refer to data sources, which can be anything: an SQL database or a remote service. [1]
The name “pipeline” is only somewhat appropriate: A command can jump out anywhere in the pipeline rather than just at the opposite end. A given command will be processed through the various data handlers until one of them pronounces the command to be “complete”.
In the previous parts, we learned that the input to the pipeline is an IDataCommandContext. To briefly recap, this object comprises the inputs to the command as well as command-specific state, such as a way to set values on objects (SetValue(IMetaProperty)); more detail on this later.
Where the pipeline metaphor holds up is that the command context will always start at the same end. The ordering of data handlers is intended to reduce the amount of work and time invested in processing a given command.
The first stage of processing is to quickly analyze the command to handle cases where there is nothing to do. For example,
The Objects list is empty.
Nothing in Objects has changed.
An object has a null value in the primary key or a foreign key that references a non-nullable, unique key.
It is useful to capture these checks in one or more analyzers for the following reasons,
If the analyzer hasn’t categorically handled the command and the command is to load data, the next step is to check caches. For the purposes of this article, there are two things that affect how long data is cached:
One of these is whether the isolationLevel is stricter than RepeatableRead.
Caches currently include the following standard handlers [2]:
The ValueListDataHandler returns immutable data. Since the data is immutable, it can be used independent of the transaction-state of the session in which the command is executed.
The SessionCacheDataHandler returns data that’s already been loaded or saved in this session, to avoid a call to a possibly high-latency back-end. This data is safe to use within the session with transactions because the cache is rolled back when a transaction is rolled back.
If the analyzer and cache haven’t handled a command, then we’re finally at a point where we can no longer avoid a call to a data source. Data sources can be internal or external.
The most common type is an external database:
Another standard data source is the Quino remote application server, which provides a classic interface- and method-based service layer as well as mapping nearly the full power of Quino’s generalized querying capabilities to an application server. That is, an application can smoothly switch between a direct connection to a database to using the remoting driver to call into a service layer instead.
The remoting driver supports both binary and JSON protocols. Further details are also beyond the scope of this article, but this driver has proven quite useful for scaling smaller client-heavy applications with a single database to thin clients talking to an application server.
And finally, there is another way to easily include “mini” data drivers in an application. Any metaclass can include an IDataHandlerAspect that defines its own data driver as well as its capabilities. Most implementations use this technique to bind in immutable lists of data. But this technique has also been used to load/save data from/to external APIs, like REST services. We can take a look at some examples in more detail in another article.
The mini data driver created for use with an aspect can relatively easily be converted to a full-fledged data handler.
The last step in a command is what Quino calls “local evaluation”. Essentially, if a command cannot be handled entirely within the rest of the data pipeline—either entirely by an analyzer, one or more caches or the data source for that type of object—then the local analyzer completes the command.
What does this mean? Any orderings or restrictions in a query that cannot be mapped to the data source (e.g. a C# lambda is too complex to map to SQL) are evaluated on the client rather than the server. Therefore, any query that can be formulated in Quino can also be evaluated fully by the data pipeline—the question is only of how much of it can be executed on the server, where it would (usually) be more efficient to do so.
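Plain LINQ has the same split (this is only an analogy, not Quino's API): everything before AsEnumerable() is translated by the provider, everything after it is evaluated on the client.

// Hypothetical LINQ-to-SQL/EF-style query; FormatName stands in for logic that is
// too complex to translate to SQL.
var names = context.People
  .Where(p => p.LastName.StartsWith("S")) // mapped to the data source
  .AsEnumerable()                         // switch to client-side (local) evaluation
  .Select(p => FormatName(p))             // runs in memory on the client
  .ToList();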
Please see the article series that starts with Optimizing data access for high-latency networks for specific examples.
In this article, we’ve learned a bit about the ways in which Quino retrieves and stores data using the data pipeline. In the next part, we’ll cover the topic “Builders & Commands”.
Published by marco on 28. Feb 2015 18:36:41 (GMT-5)
In part I, we discussed applications—which provide the model and data provider—and sessions—which encapsulate high-level data context.
In this article, we’re going to take a look at the command types & inputs.
Before we can discuss how the pipeline processes a given command, we should discuss what kinds of commands the data driver supports and what kind of inputs the caller can pass to it. As you can well imagine, the data driver can be used for CRUD—to create, read, update and delete and also to refresh data.
In the top-right corner of the diagram to the right, you can see that the only input to the pipeline is an IDataCommandContext. This object comprises the inputs provided by the caller as well as command-specific state used throughout the driver for the duration of the command.
A caller initiates a command with either a query or an object graph, depending on the type of command. The following commands and inputs are supported:
A query includes information about the data to return (or delete).
FirstName %~ ‘m’
[2]—or the caller can find all people which belong to a company whose name starts with the letter “e”—Company.FirstName %~ ‘e’
. The context for these expressions is naturally the meta-class mentioned above. Additionally, the metadata/model can also include default filters to include.LastName
and then by FirstName
. More complex expressions are supported—for example, you could use the expression “{LastName}, {FirstName}”
, which sorts by a formatted string [3]—but be aware that many data stores have limited support for complex expressions in orderings. Orderings are ignored in a query when used to delete objects.Queries are a pretty big topic and we’ve only really scratched the surface so far. Quino has its own query language—QQL—the specification for which weighs in at over 80 pages, but that’s a topic for another day.
An object graph consists of a sequence of root objects and the sub-objects available along relations defined in the metadata.
It’s actually simpler than it perhaps sounds.
Let’s use the example above: a person is related to a single company, so the graph of a single person will include the company as well (if the object is loaded and/or assigned). Additionally, the company defines a relation that describes the list of people that belong to it. The person=>company relationship is complementary to the company=>person relationship. We call person=>company a 1-1 relation, while company=>person is a 1-n relation.
The following code creates two new companies, assigns them to three people and saves everything at once.
var encodo = new Company { Name = "Encodo Systems AG" };
var other = new Company { Name = "Not Encodo" };
var people = new []
{
new Person { FirstName = "John", LastName = "Doe", Company = other },
new Person { FirstName = "Bob", LastName = "Smith", Company = encodo },
new Person { FirstName = "Ted", LastName = "Jones", Company = encodo }
};
Session.Save(people);
The variable people above is an object graph. The variables encodo and other are also object graphs, but only to parts of the first one. From people, a caller can look up people[0].Company, which is other. The graph contains cycles, so people[0].Company.People[0].Company is also other. From encodo, the caller can get to other people in the same company, but not to people in the other company; for example, encodo.People[0] gets “Bob Smith” and encodo.People[0].Company.People[1] gets “Ted Jones”.
As with queries, object graphs are a big topic and are strongly bound to the kind of metadata available in Quino. Another topic for another day.
Phew. We’re almost to the point where we can create an IDataCommandContext to send into the data pipeline.
We now know what an IDataSession is and know why we need it.
With those inputs, Quino has all it needs from the caller. A glance at the top-left corner of the diagram above shows us that Quino will determine an IMetaClass and an IMetaObjectHandler from these inputs and then use them to build the IDataCommandContext.
An IQuery has a MetaClass property, so that’s easy. With the meta-class and the requested type of object, the data driver checks a list of registered object-handlers and uses the first one that says it supports that type. If the input is an object graph, though, the object-handler is determined first and then the meta-class is obtained from the object-handler using a root object from the graph.
Most objects will inherit from GenericObject, which implements the IPersistable interface required by the standard object handler. However, an application is free to implement an object handler for other base classes—or no base class at all, using reflection to get/set values on POCOs. That is, however, an exercise left up to the reader.
At this point, we have all of our inputs and can create the IDataCommandContext.
In the next part, we’ll take a look at the “Data Pipeline” through which this command context travels.
Published by marco on 21. Feb 2015 08:02:16 (GMT-5)
One part of Quino that has undergone quite a few changes in the last few versions is the data driver. The data driver is responsible for CRUD: create, read, update and delete operations. One part of this is the ORM—the object-relational mapper—that marshals data to and from relational databases like PostgreSql, SQL Server and SQLite.
We’re going to cover a few topics in this series:
But first let’s take a look at an example to anchor our investigation.
An application makes a request to the data driver using commands like Save() to save data and GetObject() or GetList() to get data. How are these high-level commands executed? Quino does an excellent job of shielding applications from the details, but it’s still very interesting to know how this is achieved.
The following code snippet retrieves some data, deletes part of it and saves a new version.
using (var session = application.CreateSession())
{
  var people = session.GetList<Person>();
  people.Query.WhereEquals(Person.Fields.FirstName, "john");
  session.Delete(people);
  session.Save(new Person { FirstName = "bob", LastName = "doe" });
}
In this series, we’re going to answer the following questions…and probably many more.
Let’s tackle the last two questions first.
The application defines common configuration information. The most important bits for the ORM are as follows:
Person
, which has at least the two properties LastName
and FirstName
. There is probably an entity named Company
as well, with a one-to-many relationship to Person
. As you can imagine, Quino uses this information to formulate requests to data stores that contain data in this format. [1] For drivers that support it, Quino also uses this information in order to create that underlying data schema. [2]So that’s the application. There is a single shared application for a process.
But in any non-trivial application—and any non-desktop application—we will have multiple data requests running, possibly in different threads of execution.
That’s where sessions come in. The session encapsulates a data context, which contains the following information:
HttpContext.Current.User
but generalized to be available in any Quino application. All data requests over a session are made in the context of this user.If we go back to the original code sample, we now know that creating a new session with CreateSession()
creates a new data context, with its own user and its own data cache. Since we didn’t pass in any credentials, the session uses the default credentials for the application. [3] All data access made on that session is nicely shielded and protected from any data access made in other sessions (where necessary, of course).
So now we’re no closer to knowing how Quino works with data on our behalf, but we’ve taken the first step: we know all about one of the main inputs to the data driver, the session.
In the next part, we’ll cover the topic “The Data Pipeline”.
Person.Fields.FirstName
in the example), or view models, DTOs or even client-side TypeScript definitions. We also use the model to generate user interfaces—both for entire desktop-application interfaces but also for HTML helpers to build MVC views.This is code that you might use in a single-user application. In a server application, you would most likely just use the session that was created for your request by Quino. If an application wants to create a new session, but using the same user as an existing session, it would call:
var requestCredentials = requestSession.AccessControl.CurrentUser.CreateCredentials();
using (var session = application.CreateSession(requestCredentials))
{
// Work with session
}
Published by marco on 11. Feb 2015 07:11:51 (GMT-5)
We’ve been using ReSharper at Encodo since version 4. And we regularly use a ton of other software from JetBrains [1]—so we’re big fans.
As long-time users of ReSharper, we’ve become accustomed to the following pattern of adoption for new major versions:
This process can take anywhere from several weeks to a couple of months. The reason we do it almost every time is that the newest version of ReSharper almost always has a few killer features. For example, version 8 had initial TypeScript support. Version 9 carries with it a slew of support improvements for Gulp, TypeScript and other web technologies.
Unfortunately, if you need to continue to use the test-runner with C#, you’re in for a bumpy ride.
Any new major version of ReSharper can be judged by its test runner. The test runner seems to be rewritten from the ground-up in every major version. Until the test runner has settled down, we can’t really use that version of ReSharper for C# development.
The 6.x and 7.x versions were terrible at the NUnit TestCase and Values attributes. They were so bad that we actually converted tests back from using those attributes. While 6.x had trouble reliably compiling and executing those tests, 7.x was better at noticing that something had changed without forcing the user to manually rebuild everything.
Unfortunately, this new awareness in 7.x came at a cost: it slowed editing in larger NUnit fixtures down to a crawl, using a tremendous amount of memory and sending VS into a 1.6GB+ memory-churn that made you want to tear your hair out.
8.x fixed all of this and, by 8.2.x, was a model of stability and usefulness, getting the hell out of the way and reliably compiling, displaying and running tests.
And then along came 9.x, with a whole slew of sexy new features that just had to be installed. I tried the new features and they were good. They were fast. I was looking forward to using the snazzy new editor to create our own formatting template. ReSharper seemed to be using less memory, felt snappier, it was lovely.
And then I launched the test runner.
And then I uninstalled 9.x and reinstalled 8.x.
And then I needed the latest version of DotMemory and was forced to reinstall 9.x. So I tried the test runner again, which inspired this post. [2]
So what’s not to love about the test runner? It’s faster and seems much more asynchronous. However, it gets quite confused about which tests to run, how to handle test cases and how to handle abstract unit-test base classes.
Just like 6.x, ReSharper 9.x can’t seem to keep track of which assemblies need to be built based on changes made to the code and which test(s) the user would like to run.
To be fair, we have some abstract base classes in our unit fixtures. For example, we define all ORM query tests in multiple abstract test-fixtures and then create concrete descendants that run those tests for each of our supported databases. If I make a change to a common assembly and run the tests for PostgreSql, then I expect—at the very least—that the base assembly and the PostgreSql test assemblies will be rebuilt. 9.x isn’t so good at that yet, forcing you to “Rebuild All”—something that I’d no longer had to do with 8.2.x.
It’s the same with TestCases: whereas 8.x was able to reliably show changes and to make sure that the latest version was run, 9.x suffers from the same issue that 6.x and 7.x had: sometimes the test is shown as a single node without children and sometimes it’s shown with the wrong children. Running these tests results in a spinning cursor that never ends. You have to manually abort the test-run, rebuild all, reload the runner with the newly generated tests from the explorer and try again. This is a gigantic pain in the ass compared to 8.x, which just showed the right tests—if not in the runner, then at least very reliably in the explorer.
And the explorer in 9.x! It’s a hyperactive, overly sensitive, eager-to-please puppy that reloads, refreshes, expands nodes and scrolls around—all seemingly with a mind of its own! Tests wink in and out of existence, groups expand seemingly at random, the scrollbar extends and extends and extends to accommodate all of the wonderful things that the unit-test explorer wants you to see—needs for you to see. Again, it’s possible that this is due to our abstract test fixtures, but this is new to 9.x. 8.2.x is perfectly capable of displaying our tests in a far less effusive and frankly hyperactive manner.
Even the output formatting has changed in 9.x, expanding all CR/LF pairs from single-spacing to double-spacing. It’s not a deal-breaker, but it’s annoying: copying text is harder, reading stack traces is harder. How could no one have noticed this in testing?
The install/uninstall process is painless and supports jumping back and forth between versions quite well, so I’ll keep trying new versions of 9.x until the test runner is as good as the one in 8.2.x is. For now, I’m back on 8.2.3. Stay tuned.
In no particular order, we have used or are using:
Published by marco on 16. Nov 2014 00:20:42 (GMT-5)
In the previous article, I explained how we were using NDepend to clean up dependencies and the architecture of our Quino framework. You have to start somewhere, so I started with the two base assemblies: Quino and Encodo. Encodo only has dependencies on standard .NET assemblies, so let’s start with that one.
The first step in cleaning up the Encodo assembly is to remove dependencies on the Tools namespace. There seems to be some confusion as to what belongs in the Core namespace versus what belongs in the Tools namespace.
There are too many low-level classes and helpers in the Tools namespace. Just as a few examples, I moved the following classes from Tools to Core:
The names kind of speak for themselves: these classes clearly belong in a core component and not in a general collection of tools.
Now, how did I decide which elements to move to core? NDepend helped me visualize which classes are interdependent.
We see that EnumerableTools depends on StringTools. I’d just moved EnumerableTools to Encodo.Core to reduce dependence on Encodo.Tools. However, since StringTools is still in the Tools namespace, the dependency remains. This is how examining dependencies really helps clarify a design: it’s now totally obvious that something as low-level as StringTools belongs in the Encodo.Core namespace and not in the Encodo.Tools namespace, which has everything but the kitchen sink in it.
Another example in the same vein is shown to the left, where we examine the dependencies of MessageTools on Encodo.Tools. The diagram explains that the colors correspond to the two dependency directions. [1]
We would like the Encodo.Messages namespace to be independent of the Encodo.Tools namespace, so we have to consider either (A) removing the references to ExceptionTools and OperatingSystemTools from MessageTools or (B) moving those two dependencies to the Encodo.Core namespace.
Choice (A) is unlikely while choice (B) beckons with the same logic as the example above: it’s now obvious that tools like ExceptionTools and OperatingSystemTools belong in Encodo.Core rather than the kitchen-sink namespace.
Once you’re done cleaning up your direct dependencies, you still can’t just sit back on your laurels. Now, you’re ready to get started looking at indirect dependencies. These are dependencies that involve more than just two namespaces that use each other directly. NDepend displays these as red bounding blocks. The documentation indicates that these are probably good component boundaries, assuming that the dependencies are architecturally valid.
NDepend can only show you information about your code but can’t actually make the decisions for you. As we saw above, if you have what appear to be strange or unwanted dependencies, you have to decide how to fix them. In the cases above, it was obvious that certain code was just in the wrong namespace. In other cases, it may simply be a few bits of code are defined at too low a level.
For example, our standard practice for components is to put high-level concepts for the component at the Encodo.<ComponentName> namespace. Then we would use those elements from sub-namespaces, like Encodo.<ComponentName>.Utils. However, we also ended up placing types that then used that sub-namespace in the upper-level namespace, like ComponentNameTools.SetUpEnvironment() or something like that. The call to SetUpEnvironment() references the Utils namespace which, in turn, references the root namespace. This is a direct dependency, but if another namespace comes between, we have an indirect dependency.
This happens quite quickly for larger components, like Encodo.Security.
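A stripped-down example of the kind of cycle described above (the namespaces and types are made up for illustration):

namespace Encodo.Widgets
{
  public class WidgetSettings { }

  public static class WidgetTools
  {
    public static void SetUpEnvironment()
    {
      // The root namespace calls into its own sub-namespace...
      Utils.EnvironmentHelper.Initialize();
    }
  }
}

namespace Encodo.Widgets.Utils
{
  public static class EnvironmentHelper
  {
    public static void Initialize()
    {
      // ...and the sub-namespace refers back to a type in the root namespace:
      // a dependency cycle between the two namespaces.
      var settings = new WidgetSettings();
    }
  }
}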
The screenshots below show a high-level snapshot of the indirect dependencies in the Encodo assembly and then also a detail view, with all sub-namespaces expanded. The detail view is much larger but shows you much more information about the exact nature of the cycle. When you select a red bounding box, another panel shows the full details and exact nature of the dependency.
After a bunch of work, I’ve managed to reduce the dependencies to a set of interfaces that are clearly far too dependent on many subsystems.
The white books for NDepend claim that “[t]echnically speaking, the task of merging the source code of several assemblies into one is a relatively light one that takes just a few hours.” However, this assumes that the code has already been properly separated into non-interdependent namespaces that correspond to components. These components can then relatively easily be extracted to separate assemblies.
The issue that I have above with the Encodo assembly is a thornier one: the interfaces themselves embody a pattern that is inherently non-decoupling. I need to change how the configuration and feedback work completely in order to decouple this code.
To that end, I’ve created an issue in the issue-tracker for Quino, QNO-4659 [2], titled “Re-examine how the configuration, feedback and application work together”. The design of these components predates our introduction of a service locator, which means it’s much more tightly coupled (as you can see above).
After some internal discussion, we’ve decided to change the design of the Encodo and Quino library support for application-level configuration and state.
Remove the generic parameters from IApplication<TConfiguration, TFeedback>, leaving us with a base interface, IApplication, that is free of generic arguments.
Any components that currently reference the properties on the ICoreConfiguration can use the service locator to retrieve an instance instead.
As you can see, while NDepend is indispensable for finding dependencies, it can—along with a good refactoring tool (we use ReSharper)—really only help you clean up the low-hanging fruit. While I started out trying to split assemblies, I’ve now been side-tracked into cleaning up an older and less–well-designed component—and that’s a very good thing.
There are some gnarly knots that will feel nearly unsolvable—but with a good amount of planning, those can be re-designed as well. As I mentioned in the previous article, though, we can do so only because we’re making a clean break from the 1.x version of Quino instead of trying to maintain backward compatibility.
It’s worth it, though: the new design already looks much cleaner and is much more easily explained to new developers. Once that rewrite is finished, the Encodo assembly should be clean and I’ll use NDepend to find good places to split up that rather large assembly into sensible sub-assemblies.
If A and B are interdependent, but A should not rely on B, you should make sure A is showing in the column. You can then examine dependencies on row B—and then remove them. This works very nicely with both direct and indirect dependencies.
Published by marco on 12. Nov 2014 22:23:25 (GMT-5)
A while back—this last spring, I believe—I downloaded NDepend to analyze code dependencies. The trial license is fourteen days; needless to say, I got only one afternoon in before I was distracted by other duties. That was enough, however, to convince me that it was worth the $375 to continue to clean up Quino with NDepend.
I decided to wait until I had more time before opening my wallet. In the meantime, however, Patrick Smacchia of NDepend approached me with a free license if I would write about my experiences using NDepend on Encodo’s blog. I’m happy to write about how I used the tool and what I think it does and doesn’t do. [1]
We started working on Quino in the fall of 2007. As you can see from the first commit, the library was super-small and comprised a single assembly.
Fast-forward seven years and Version 1.13 of Quino has 66 projects/assemblies. That’s a lot of code and it was long past time to take a more structured look at how we’d managed the architecture over the years.
I’d already opened a branch in our Quino repository called feature/dependencyChanges and checked in some changes at the beginning of July. Those changes had come as a result of the first time I used NDepend to find a bunch of code that was in the wrong namespace or the wrong assembly, architecturally speaking.
I wasn’t able to continue using this branch, though, for the following reasons.
With each Quino change and release, we try our hardest to balance backward-compatibility with maintainability and effort. If it’s easy enough to keep old functionality under an old name or interface, we do so.
We mark members and types obsolete so that users are given a warning in the compiler but can continue using the old code until they have time to upgrade. These obsolete members are removed in the next major or minor upgrade.
Developers who have not removed their references to obsolete members will at this point be greeted with compiler errors. In all cases, the user can find out from Quino’s release notes how they should fix a warning or error.
The type of high-level changes that we have planned necessitates that we make a major version-upgrade, to Quino 2.0. In this version, we have decided not to maintain backward-compatibility in the code with Obsolete attributes. However, where we do make a breaking change—either by moving code to new or different assemblies or by changing namespaces—we want to maintain a usable change-log for customers who make the upgrade. The giant commit that I’d made previously was not a good start.
Since some of these changes will be quite drastic departures in structure, we want to come up with a plan to make merging from the master branch to the feature/dependencyChanges branch safer, quicker and all-around easier.
I want to include many of the changes I started in the feature/dependencyChanges branch, but would like to re-apply those changes in the following manner:
So, now that I’m ready to start cleaning up Quino for version 2.0, I’ll re-apply the changes from the giant commit, but in smaller commits. At the same time, I’ll use NDepend to find the architectural breaks that caused me to make those changes in the first place and document a bit of that process.
I created an NDepend project and attached it to my solution. Version 1.13 of Quino has 66 projects/assemblies, of which I chose the following “core” assemblies to analyze.
I can change this list at any time. There are a few ways to add assemblies. Unfortunately, the option to “Add Assemblies from VS Solution(s)” showed only 28 of the 66 projects in the Quino solution. I was unable to determine the logic that led to the other 38 projects not being shown. When I did select the projects I wanted from the list, the assemblies were loaded from unexpected directories. For example, it added a bunch of core assemblies (e.g. Encodo.Imaging) from the src/tools/Quino.CodeGenerator/bin/ folder rather than the src/libraries/Encodo.Imaging/bin folder. I ended up just taking the references I was offered by NDepend and added references to Encodo and Quino, which it had not offered to add. [3]
Let’s take a look at the initial NDepend Dashboard.
There’s a lot of detail here. The initial impression of NDepend can be a bit overwhelming, I suppose, but you have to remember the sheer amount of interdependent data that it shows. As you can see on the dashboard, not only are there a ton of metrics, but those metrics are also tracked on a time-axis. I only have one measurement so far.
Any assemblies not included in the NDepend project are considered to be “third-party” assemblies, so you can see external dependencies differently than internal ones. There is also support for importing test-coverage data, but I haven’t tried that yet.
There are a ton of measurements in there, some of which interest me and others that don’t, or with which I disagree. For example, over 1400 warnings are in the Quino* assemblies because the base namespace—Encodo.Quino—doesn’t correspond to a file-system folder—it expects Encodo/Quino, but we use just Quino.
Another 200 warnings are to “Avoid public methods not publicly visible”, which generally means that we’ve declared public methods on internal, protected or private classes. The blog post Internal or public? by Eric Lippert (Fabulous adventures in coding) covered this adequately and came to the same conclusion that we have: you actually should make methods public if they are public within their scope.
There are some White Books about namespace and assembly dependencies that are worth reading if you’re going to get serious about dependencies. There’s a tip in there about turning off “Copy Local” on referenced assemblies to drastically increase compilation speed that we’re going to look into.
One of the white books explains how to use namespaces for components and how to “levelize” an architecture. This means that the dependency graph is acyclic—that there are no dependency cycles and that there are certainly no direct interdependencies. The initial graphs from the Encodo and Quino libraries show that we have our work cut out for us.
The first matrix shows the high-level view of dependencies in the Encodo and Quino namespaces. Click the second and third to see some initial dependency issues within the Encodo and Quino assemblies.
That’s as far as I’ve gotten so far. Tune in next time for a look at how we managed to fix some of these dependency issues and how we use NDepend to track improvement over time.
Published by marco on 12. Nov 2014 22:14:18 (GMT-5)
The long and very technical article Introducing the WebKit FTL JIT provides a fascinating and in-depth look at how a modern execution engine optimizes code for a highly dynamic language like JavaScript.
To make a long story short: the compiler(s) and execution engine optimize by profiling and analyzing code and lowering it to runtimes of ever decreasing abstraction to run as the least dynamic version possible.
What does it mean to “lower” code? A programming language has a given level of abstraction and expressiveness. Generally, the more expressive it is, the more abstracted it is from code that can actually be run in hardware. A compiler transforms or translates from one language to another.
When people started programming machines, they used punch cards. Punch cards did not require any compilation because the programmer was directly speaking the language that the computer understood.
The first layer of abstraction that most of us—older programmers—encountered was assembly language, or assembler. Assembly code still has a more-or-less one-to-one correspondence between instructions and machine-language codes but there is a bit of abstraction in that there are identifiers and op-codes that are more human-readable.
Procedural languages introduced more types of statements like loops and conditions. At the same time, the syntax was abstracted further from assembler and machine code to make it easier to express more complex concepts in a more understandable manner.
At this point, the assembler (which assembled instructions into machine op-codes) became a compiler, which “compiled” a set of instructions from the more abstract language. A compiler made decisions about how to translate these concepts, and could make optimization decisions based on registers, volatility and other settings.
In time, we’d graduated to functional, statically typed and/or object-oriented languages, with much higher levels of abstraction and much more sophisticated compilers.
Generally, a compiler still used assembly language as an intermediate format, which some may remember from their days working with C++ or Pascal compilers and debuggers. In fact, .NET languages are also compiled to IL—the “Intermediate Language”—which corresponds to the instruction set that the .NET runtime exposes. The runtime compiles IL to the underlying machine code for its processor, usually in a process called JIT—Just-In-Time compilation. That is, in .NET, you start with C#, for example, which the compiler transforms to IL, which is, in turn, transformed to assembler and then machine code by the .NET runtime.
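As a tiny illustration of lowering, here is a trivial C# method and, roughly, the IL the compiler produces for it:

static int Add(int a, int b)
{
  return a + b;
}

// Compiles to IL along these lines:
//   ldarg.0   // push a
//   ldarg.1   // push b
//   add       // add the two values on the stack
//   ret       // return the result
// The JIT then lowers this IL further to machine code for the current processor.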
A compiler and execution engine for a statically typed language can make assumptions about the types of variables. The set of possible types is known in advance and types can be checked very quickly in cases where it’s even necessary. That is, the statically typed nature of the language allows the compiler to reason about a given program without making assumptions. Certain features of a program can be proven to be true. A runtime for a statically typed language can often avoid type checks entirely. It benefits from a significant performance boost without sacrificing any runtime safety.
The main characteristic of a dynamic language like JavaScript is that variables do not have a fixed type. Generated code must be ready for any eventuality and must be capable of highly dynamic dispatch. The generated code is highly virtualized. Such a runtime will execute much more slowly than a comparable statically compiled program.
Enter the profile-driven compiler, introduced in WebKit. From the article,
“The only a priori assumption about web content that our engine makes is that past execution frequency of individual functions is a good predictor for those functions’ future execution frequency.”
Here a “function” corresponds to a particular overload of a set of instructions called with parameters with a specific set of types. That is, suppose a JavaScript function is declared with one parameter and is called once with a string and 100 times with an integer. WebKit considers this to be two function overloads and will (possibly) elect to optimize the second one because it is called much more frequently. The first overload will still handle all possible types, including strings. In this way, all possible code paths are still possible, but the most heavily used paths are more highly optimized.
“All of the performance is from the DFG’s type inference and LLVM’s low-level optimizing power. […]
“Profile-driven compilation implies that we might invoke an optimizing compiler while the function is running and we may want to transfer the function’s execution into optimized code in the middle of a loop; to our knowledge the FTL is the first compiler to do on-stack-replacement for hot-loop transfer into LLVM-compiled code.”
Depending on the level of optimization, the code contains the following broad sections:
While WebKit has included some form of profile-driven compilation for quite some time, the upcoming version is the first to carry the same optimization to LLVM-generated machine code.
I recommend reading the whole article if you’re interested in more detail, such as how they avoided LLVM compiler performance issues and how they integrated this all with the garbage collector. It’s really amazing how much of what we take for granted the WebKit JS runtime treats as “hot-swappable”. The article is quite well-written and includes diagrams of the process and underlying systems.
Published by marco on 31. Oct 2014 10:39:12 (GMT-5)
Updated by marco on 1. Nov 2014 08:44:53 (GMT-5)
The summary below describes major new features, items of note and breaking changes in Quino. The full list of issues is also available for those with access to the Encodo issue tracker.
Added CoreServiceBase, which extends the standard .NET ServiceBase. The runner is available in the Encodo.Service assembly.
Improved HttpApplicationBase, especially in situations where the application fails to start. Error-page handling was also improved, including handling for Windows Event Log errors.
Updated the Encodo.Core namespace to use annotations like NotNull and CanBeNull with parameters and results. (QNO-4508)
Generated code now includes a property that returns a ValueListObject for each enum property in the metadata. For example, for a property named State of type CoreState, the generated code includes the former properties for the enum and the foreign key backing it, but now also includes the ValueListObject property. This new property provides easy access to the captions.
public CoreState State { … }
public ValueListObject StateObject { … }
public int? CoreStateIdId { … }
Improved the NAnt fix command in the default build tools to fix the assembly name as well. The build tools are available in bin/tools/build. See the src/demo/Demo.build file for an example of how to use the NAnt build scripts for your own solutions. To change the company name used by the “fix” command, for example, add the following task override:
<target name="fix.before">
<call target="fix.before.base"/>
<property name="InfoCompanyName" value="Foobar Corporation"/>
</target>
Improved IntegrateRemotableMethods to avoid a race condition with remote methods. Also improved the stability of the DataProvider statistics. (QNO-4599)
TRight has been removed from all classes and interfaces in the Encodo.Security.* namespace. In order to fix this code, just remove the int generic parameter wherever it was used. For example, where before you used the interface IUser<int>, you should now use IUser (QNO-4576).
MetaAccessControl.DoGetAccessChecker() has been renamed to MetaAccessControl.GetAccessChecker().
Renamed Encodo.ServiceLocator.SimpleInjector.dll to Encodo.Services.SimpleInjector.dll and Quino.ServiceLocator.SimpleInjector.dll to Quino.Services.SimpleInjector.dll. Also changed the namespace Quino.ServiceLocator to Encodo.Quino.Services.
Renamed HttpApplicationBase.StartMetaApplication() to CreateAndStartUpApplication().
Properties may no longer use the names of properties defined on IMetaReadable (e.g. Deleted, Persisted). The model will no longer validate until the properties have been renamed and the code regenerated. (QNO-4185)
Removed StandardIntRights with integer constants and replaced it with StandardRights with string constants.
IAccessControl.Check() and other related methods now accept a sequence of string rights rather than integers.
IMetaConfiguration.ConfigureSession() has been deprecated. The method will still be called but may have undesired side-effects, depending on why it was overridden. The common use was to initialize a custom AccessControl for the session. Continuing to do so may overwrite the current user set by the default Winform startup. Instead, applications should use the IDataSessionAccessControlFactory and IDataSessionFactory to customize the data sessions and access controls returned for an application. In order to attach an access control, take care to only set your custom access control for sessions that correspond to your application model. [1]
internal class JobVortexDataSessionAccessControlFactory : DataSessionAccessControlFactory
{
public override IAccessControl CreateAccessControl(IDataSession session)
{
if (session.Application.Model.MetaId == JobVortexModelGenerator.ModelGuid)
{
return new JobVortexAccessControl(session);
}
return base.CreateAccessControl(session);
}
}
The default length of the UserModule.User.PasswordHash property has been increased from 100 characters to 1000. This default is more sensible for implementations that use much longer validation tokens instead of passwords. To avoid the schema migration, revert the change by setting the property’s default length back to 100 in your application model, after importing the security module, as shown below.
var securityModule = Builder.Include<SecurityModuleGenerator>();
securityModule.Elements.Classes.User.Properties[
Encodo.Quino.Models.Security.Classes.SecurityUser.Fields.PasswordHash
].MaximumSize = 100;
Application.Credentials has been removed. To fix references, retrieve the IUserCredentialsManager from the service locator. For example, the following code returns the current user:
Session.Application.Configuration.ServiceLocator.GetInstance<IUserCredentialsManager>().Current
If your application uses the WinformMetaConfigurationTools.IntegrateWinformPackages() or WinformDxMetaConfigurationTools.IntegrateWinformDxPackages(), then the IDataSession.AccessControl.CurrentUser will continue to be set correctly. If not, add the SingleUserApplicationConfigurationPackage to your application’s configuration. The user in the remoting server will be set up correctly. Add the WebApplicationConfigurationPackage to web applications in order to ensure that the current user is set up correctly for each request. (QNO-4596)
IDataSession.SyncRoot has been removed as it was no longer needed or used in Quino itself. Sessions should not be used in multiple threads, so there is no need for a SyncRoot. Code that uses it should be reworked to use a separate session for each thread.
Moved IMetaApplication.CreateSession() to an extension method. Add Encodo.Quino.App to the using clauses to fix any compile errors.
Removed IMetaApplication.DataProvider; use IMetaApplication.Configuration.DataProvider instead. (QNO-4604)
ISchemaChange and its descendants have been completely removed. ISchemaAction is no longer part of the external API, although it is still used internally. The ISchemaChangeFactory has been renamed to ISchemaCommandFactory and, instead of creating change objects, which are then applied directly, returns ISchemaCommand objects, which can be either executed or transformed in some other way. IMigrateToolkit.GetActionFor() has also been replaced with CreateCommands(), which mirrors the rest of the API by returning a sequence of commands to address a given ISchemaDifference. This release still has some commands that cannot be transformed to pure SQL, but the goal is to be able to generate pure SQL for a schema migration. (QNO-993, QNO-4579, QNO-4581, QNO-4588, QNO-4591, QNO-4594)
IMigrateSchemaAspect.Apply() has been removed. All aspects will have to be updated to implement GetCommands() instead, or to use one of the available base classes, like UpdateDataAspectBase or ConvertPropertyTypeSchemaAspect. The following example shows how to use the UpdateDataAspectBase to customize migration for a renamed property.
internal class ArchivedMigrationAspect : UpdateDataAspectBase
{
public ArchivedMigrationAspect()
: base("ArchivedMigrationAspect", DifferenceType.RenamedProperty, ChangePhase.Instead)
{
}
protected override void UpdateData(IMigrateContext context, ISchemaDifference difference)
{
using (var session = context.CreateSession(difference))
{
session.ChangeAndSaveAll<Project>(UpdateArchivedFlag);
}
}
private void UpdateArchivedFlag(Project obj)
{
obj.Archived = !obj.Archived;
}
}
The base aspects should cover most needs; if your functionality is completely customized, you can easily pass your previous implementation of Apply() to a DelegateSchemaCommand and return that from your implementation of GetCommands(). See the implementation of UpdateDataAspectBase for more examples. (QNO-4580)
MetaObjectIdEqualityComparer<T> can no longer be constructed directly. Instead, use MetaObjectIdEqualityComparer<Project>.Default.
Renamed MetaClipboardControlDx.UpdateColorSkinaware() to MetaClipboardControlDx.UpdateSkinAwareColors().
IMetaUnique.LogicalParent has been moved to IMetaBase. Since IMetaUnique inherits from IMetaBase, it is unlikely that code is affected (unless reflection or some other direct means was used to reference the property). (QNO-4586)
IUntypedMessage has been removed; the AssociatedObject formerly found there has been moved to IMessage.
ITypedMessage.AssociatedObject has been renamed to ITypedMessage.TypedAssociatedObject. (QNO-4647)
Renamed MetaObjectTools to MetaReadableTools.
GenericObject.GetAsGuid() and GenericObject.GetAsGuidDefault are now available as extension methods in MetaWritableTools.
IMetaFeedback.CreateGlobalContext() has been removed. Instead, the IGlobalContext is created using the service locator.
Published by marco on 24. Oct 2014 12:26:25 (GMT-5)
Quino is a metadata framework for .NET. It provides a means of defining an application-domain model in the form of metadata objects. Quino also provides many components and support libraries that work with that metadata to automate many services and functions. A few examples are an ORM, schema migration, automatically generated user interfaces and reporting tools.
The component we’re going to discuss is the automated schema-migration for databases. A question that recently came up with a customer was: what do all of the options mean in the console-based schema migrator?
Here’s the menu you’ll see in the console migrator:
Advanced Options
(1) Show migration plan
(2) Show significant mappings
(3) Show significant mappings with unique ids
(4) Show all mappings
(5) Show all mappings with unique ids
Main Options
(R) Refresh status
(M) Migrate database
(C) Cancel
The brief summary is:
The other advanced options are more for debugging the migration recommendation if something looks wrong. In order to understand what that means, we need to know what the migrator actually does.
The initial database-import and final command-generation parts of migration are very database-specific. The determination of differences is also partially database-specific (e.g. some databases do not allow certain features so there is no point in detecting a difference that cannot ever be repaired). The rest of the migration logic is database-independent.
The migrator works with two models: the target model and a source model.
Given these two models, the “mapping builder” creates a mapping. In the current implementation of Quino, there is no support for allowing the user to adjust mapping before a migration plan is built from it. However, it would be possible to allow the user to verify and possibly adjust the mapping. Experience has shown that this is not necessary. Anytime we thought we needed to adjust the mapping, the problem was instead that the target model had been configured incorrectly. That is, each time we had an unexpected mapping, it led us directly to a misconfiguration in the model.
The options to show mappings are used to debug exactly such situations. Before we talk about mapping, though, we should talk about what we mean by “unique ids”. Every schema-relevant bit of metadata in a Quino model is associated with a unique id, in the form of a Guid and called a “MetaId” in Quino.
What happens when the import handler generates a model?
The importer runs in two phases:
A Quino application named “demo” will have the following schema:
The migrator reads the following information into a “raw model”
If there is no further information in the database, then the mapper will have to use the raw model only. If, however, the database was created or is being maintained by Quino, then there is additional information stored in the metadata table mentioned above. The importer enhances the raw model with this information, in order to improve mapping and difference-recognition. The metadata table contains all of the Quino modeling information that is not reflected in a standard database schema (e.g. the aforementioned MetaId).
The data available in this table is currently:
SchemaIdentifier
For each schema element in the raw model, the importer does the following:
At this point, the imported model is ready and we can create a mapping between it and the application model. The imported model is called the source model while the application model is called the target model because we’re migrating the “source” to match the “target”.
We generate a mapping by iterating the target model:
The important decisions have already been made in the mapping phase. At this point, the migrator just generates a migration plan, which is a list of differences that must be addressed in order to update the database to match the target model.
This is the plan that is shown to the user by the various migration tools available with Quino. [2]
At this point, we can now understand what the advanced console-migrator commands mean. Significant mappings are those mappings which correspond to a difference in the database (create, drop, rename or alter).
As already stated, the advanced options are really there to help a developer see why the migrator might be suggesting a change that doesn’t correspond to expectations.
At this point, the migrator displays the list of differences that will be addressed by the migrator if the user chooses to proceed.
What happens when the user proceeds? The migrator generates database-specific commands that, when executed against the database, will modify the schema of the database. [3]
Commands are executed for different phases of the migration process. The phases are occasionally extended but currently comprise the following.
The commands are then executed and the results logged.
Afterward, the schema is imported again, to verify that there are no differences between the target model and the database. In some (always rarer) cases, there will still be differences, in which case, you can execute the new migration plan to repair those differences as well.
In development, this works remarkably well and often, without further intervention.
In some cases, there is data in the database that, while compatible with the current database schema, is incompatible with the updated schema. This usually happens when a new property or constraint is introduced. For example, a new required property is added that does not have a default value or a new unique index is added which existing data violates.
In these cases, there are two things that can be done:
In general, it’s strongly advised to perform a migration against a replica of the true target database (e.g. a production database) in order to guarantee that all potential data situations have been anticipated with custom code, if necessary.
It’s important to point out that Quino’s schema migration is considerably different from that employed by EF (which it picked up from Active Record Migrations in Ruby, often used with Ruby on Rails). In those systems, the developer generates specific migrations to move from one model version to another. There is a clear notion of upgrading versus downgrading. Quino only recognizes migrating from an arbitrary model to another arbitrary model. This makes Quino’s migration exceedingly friendly when moving between development branches, unlike EF, whose deficiencies in this area have been documented.
We use Microsoft Entity Framework (EF) Migrations in one of our... [More]
Published by marco on 20. Oct 2014 15:23:19 (GMT-5)
The version of EF Migrations discussed in this article is 5.0.20627. The version of Quino is less relevant: the features discussed have been supported for years. For those in a hurry, there is a tl;dr near the end of the article.
We use Microsoft Entity Framework (EF) Migrations in one of our projects where we are unable to use Quino. We were initially happy to be able to automate database-schema changes. After using it for a while, we have decidedly mixed feelings.
As developers of our own schema migration for the Quino ORM, we’re always on the lookout for new and better ideas to improve our own product. If we can’t use Quino, we try to optimize our development process in each project to cause as little pain as possible.
We ran into problems in integrating EF Migrations into a development process that uses feature branches. As long as a developer stays on a given branch, there are no problems and EF functions relatively smoothly. [1]
However, if a developer switches to a different branch—with different migrations—EF Migrations is decidedly less helpful. It is, in fact, quite cryptic and blocks progress until you figure out what’s going on.
Assume the following not-uncommon situation:
We now have the situation in which two branches have different code and each has its own database schema. Switching from one branch to another with Git quickly and easily addresses the code differences. The database is, unfortunately, a different story.
Let’s assume that developer A switches to branch feature/B to continue working there. The natural thing for A to do is to call “update-database” from the Package Manager Console [2]. This yields the following message—all-too-familiar to EF Migrations developers.
“Unable to update database to match the current model because there are pending changes and automatic migration is disabled. Either write the pending changes to a code-based migration or enable automatic migration. […]”
This situation happens regularly when working with multiple branches. It’s even possible to screw up a commit within a single branch, as illustrated in the following real-world example.
As far as you’re concerned, you committed a single field to the model. When your co-worker runs that migration, it will be applied, but EF Migrations immediately thereafter complains that there are pending model changes to make. How can that be?
Just to focus, we’re actually trying to get real work done, not necessarily debug EF Migrations. We want to answer the following questions:
The underlying reason why EF Migrations has problems is that it does not actually know what the schema of the database is. It doesn’t read the schema from the database itself, but relies instead on a copy of the EF model that it stored in the database when it last performed a successful migration.
That copy of the model is also stored in the resource file generated for the migration. EF Migrations does this so that the migration includes information about which changes it needs to apply and about the model to which the change can be applied.
If the model stored in the database does not match the model stored with the migration that you’re trying to apply, EF Migrations will not update the database. This is probably for the best, but leads us to the second question above: what do we have to do to get the database updated?
The answer has already been hinted at above: we need to fix the model stored in the database for the last migration.
Let’s take a look at the situation above in which your colleague downloaded what you thought was a clean commit.
From the Package Manager Console, run add-migration foo to scaffold a migration for the so-called “pending changes” that EF Migrations detected. That’s interesting: EF Migrations thinks that your colleague should generate a migration to drop the column that you’d only temporarily added but never checked in.
That is, the column isn’t in his database, it’s not in your database, but EF Migrations is convinced that it was once in the model and must be dropped.
How does EF Migrations even know about a column that you added to your own database but that you removed from the code before committing? What dark magic is this?
The answer is probably obvious: you did check in the change. The part that you can easily remove (the C# code) is only half of the migration. As mentioned above, the other part is a binary chunk stored in the resource file associated with each migration. These BLOBs are stored in the _MigrationHistory table in the database.
Here’s the tl;dr: generate a “fake” migration, remove all of the C# code that would apply changes to the database (shown below) and execute update-database from the Package Manager Console.
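Once the scaffolded schema changes have been deleted, the fake migration might look something like the following sketch (the class name and namespace are placeholders; add-migration generates its own):
using System.Data.Entity.Migrations;

namespace MyProject.Migrations
{
    public partial class Foo : DbMigration
    {
        public override void Up()
        {
            // Intentionally empty: no schema changes are applied.
        }

        public override void Down()
        {
            // Intentionally empty.
        }
    }
}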
This may look like it does exactly nothing. What actually happens is that it includes the current state of the EF model in the binary data for the last migration applied to the database (because you just applied it).
Once you’ve applied the migration, delete the files and remove them from the project. This migration was only generated to fix your local database; do not commit it.
Applying the fix above doesn’t mean that you won’t get database errors. If your database schema does not actually match the application model, EF will crash when it assumes fields or tables are available which do not exist in your database.
Sometimes, the only way to really clean up a damaged database—especially if you don’t have the code for the migrations that were applied there [3]—is to remove the misapplied migrations from your database, undo all of the changes to the schema (manually, of course) and then generate a new migration that starts from a known good schema.
The obvious answer to the complaint “it hurts when I do this” is “stop doing that”. We would dearly love to avoid these EF Migrations-related issues but developing without any schema-migration support is even more unthinkable.
We’d have to create upgrade scripts manually or maintain scripts to generate a working development database, and we would have to do this in each branch. When branches are merged, the database-upgrade scripts would have to be merged and tested as well. This would be a significant addition to our development process, would introduce maintainability and quality issues and would probably slow us down even more.
And we’re certainly not going to stop developing with branches, either.
We were hoping to avoid all of this pain by using EF Migrations. That EF Migrations makes us think of going back to manual schema migration is proof that it’s not nearly as elegant a solution as our own Quino schema migration, which never gave us these problems.
Quino actually reads the schema in the database and compares that model directly against the current application model. The schema migrator generates a custom list of differences that map from the current schema to the desired schema and applies them. There is user intervention but it’s hardly ever really required. This is an absolute godsend during development where we can freely switch between branches without any hassle. [4]
Quino doesn’t recognize “upgrade” versus “downgrade” but instead applies “changes”. This paradigm has proven to be a much better fit for our agile, multi-branch style of development and lets us focus on our actual work rather than fighting with tools and libraries.
Downgrade method that is generated with each migration, but perhaps someone with more experience could explain how to properly apply such a thing. If that doesn’t work, the method outlined above is your only fallback.
index.html in a modern web browser... [More]
Published by marco on 14. Sep 2014 16:09:45 (GMT-5)
On Wednesday, August 27th, Tymon gave the rest of Encodo [1] a great introduction to PowerShell. I’ve attached the presentation but a lot of the content was in demonstrations on the command-line.
index.html in a modern web browser (Chrome/Opera/Firefox work the best; IE has some rendering issues)
We learned a few very interesting things:
get-command and get-member than the GUI.
The easiest way to integrate PowerShell into your workflow is to make it eminently accessible by installing ConEmu. ConEmu is a Windows command-line with a tabbed interface and offers a tremendous number of power-user settings and features. You can tweak it to your heart’s content.
I set mine up to look like the one that Tymon had in the demonstrations (shown on my desktop to the right).
The reason we frown on returning null from a method that returns a list or sequence is that we want to be able to freely use these sequences or lists in a functional manner.
It seems to me that the... [More]
Published by marco on 8. Aug 2014 10:20:08 (GMT-5)
I’ve seen a bunch of articles addressing this topic of late, so I’ve decided to weigh in.
The reason we frown on returning null from a method that returns a list or sequence is that we want to be able to freely use these sequences or lists in a functional manner.
It seems to me that the proponents of “no nulls” are generally those who have a functional language at their disposal and the antagonists do not. In functional languages, we almost always return sequences instead of lists or arrays.
In C# and other languages with functional features, we want to be able to do this:
var names = GetOpenItems()
.Where(i => i.OverdueByTwoWeeks)
.SelectMany(i => i.GetHistoricalAssignees()
.Select(a => new { a.FirstName, a.LastName })
);
foreach (var name in names)
{
Console.WriteLine("{1}, {0}", name.FirstName, name.LastName);
}
If either GetHistoricalAssignees() or GetOpenItems() might return null, then we’d have to write the code above as follows instead:
var openItems = GetOpenItems();
if (openItems != null)
{
var names = openItems
.Where(i => i.OverdueByTwoWeeks)
.SelectMany(i => (i.GetHistoricalAssignees() ?? Enumerable.Empty<Person>())
.Select(a => new { a.FirstName, a.LastName })
);
foreach (var name in names)
{
Console.WriteLine("{1}, {0}", name.FirstName, name.LastName);
}
}
This seems like exactly the kind of code we’d like to avoid writing, if possible. It’s also the kind of code that calling clients are unlikely to write, which will lead to crashes with NullReferenceExceptions. As we’ll see below, there are people that seem to think that’s perfectly OK. I am not one of those people, but I digress.
The post, Is it Really Better to ‘Return an Empty List Instead of null’? / Part 1 by Christian Neumanns (Code Project) serves as a good example of an article that seems to be providing information but is just trying to distract people into accepting it as a source of genuine information. He introduces his topic with the following vagueness.
“If we read through related questions in Stackoverflow and other forums, we can see that not all people agree. There are many different, sometimes truly opposite opinions. For example, the top rated answer in the Stackoverflow question Should functions return null or an empty object? (related to objects in general, not specifically to lists) tells us exactly the opposite:
“Returning null is usually the best idea …”
The statement “we can see that not all people agree” is a tautology. I would split the people into groups of those whose opinions we should care about and everyone else. The statement “There are many different, sometimes truly opposite opinions” is also tautological, given the nature of the matter under discussion—namely, a question that can only be answered as “yes” or “no”. Such questions generally result in two camps with diametrically opposed opinions.
As the extremely long-winded pair of articles writes: sometimes you can’t be sure of what an external API will return. That’s correct. You have to protect against those with ugly, defensive code. But don’t use that as an excuse to produce even more methods that may return null. Otherwise, you’re just part of the problem.
The second article Is it Really Better to ‘Return an Empty List Instead of null’? − Part 2 by Christian Neumanns (Code Project) includes many more examples.
I just don’t know what to say about people that write things like “Bugs that cause NullPointerExceptions are usually easy to debug because the cause and effect are short-distanced in space (i.e. location in source code) and time.” While this is kind of true, it’s also even more true that you can’t tell the difference between such an exception being caused by a savvy programmer who’s using it to his advantage and a non-savvy programmer whose code is buggy as hell.
He has a ton of examples that try to distinguish a method that returns an empty sequence from a method that cannot properly answer the question at all. This is a concern and a very real distinction to make, but the answer is not to return null to indicate nonsensical input. The answer is to throw an exception.
The method providing the sequence should not be making decisions about whether an empty sequence is acceptable for the caller. For sequences that cannot logically be empty, the method should throw an exception instead of returning null to indicate “something went wrong”.
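A minimal sketch of that distinction (the Item and Person types and the Assignees property are hypothetical, loosely based on the earlier example):
public IEnumerable<Person> GetHistoricalAssignees(Item item)
{
    if (item == null)
    {
        // Nonsensical input: fail loudly instead of returning null.
        throw new ArgumentNullException("item");
    }

    // A valid item with no assignees yields an empty sequence, never null.
    return item.Assignees ?? Enumerable.Empty<Person>();
}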
A caller may impart semantic meaning to an empty result and also throw an exception (as in his example with a cycling team that has no members). If the display of such a sequence on a web page is incorrect, then that is the fault of the caller, not of the provider of the sequence.
That some calling code makes incorrect assumptions about return values is no reason to start returning values that will make calling code crash with a NullPointerException.
All of his examples are similar: he tries to make the pure-data call to retrieve a sequence of elements simultaneously validate some business logic. That’s not a good idea. If this is really necessary, then the validity check should go in another method.
The example he cites for getting the amount from a list of PriceComponents is exactly why most aggregation functions in .NET throw an exception when the input sequence is empty. But that’s a much better way of handling it—with a precise exception—than by returning null to try to force an exception somewhere in the calling code.
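For example, with plain LINQ (not tied to his PriceComponent types):
var empty = Enumerable.Empty<decimal>();

var total = empty.Sum();                              // 0: Sum() tolerates an empty sequence
// var average = empty.Average();                     // throws InvalidOperationException ("Sequence contains no elements")
var safeAverage = empty.DefaultIfEmpty(0m).Average(); // 0: the fallback is explicit at the call site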
But the upshot for me is: I am not going to write code that, when I call it, forces me to litter other code with null-checks. That’s just ridiculous.
Because we’re talking about latency in these articles, we’d also like to... [More]
Published by marco on 8. Aug 2014 10:20:05 (GMT-5)
In the previous two articles, we managed to reduce the number of queries executed when opening the calendar of Encodo’s time-tracking product Punchclock from one very slow query per person to a single very fast query.
Because we’re talking about latency in these articles, we’d also like to clear away a few other queries that aren’t related to time entries but are still wasting time.
In particular, the queries that “Load values” for person objects look quite suspicious. These queries don’t take a lot of time to execute but they will definitely degrade performance in high-latency networks. [1]
As we did before, we can click on one of these queries to show the query that’s being loaded. In the screenshot below, we see that the person’s picture is being loaded for each person in the drop-down list.
We’re not showing pictures in the drop-down list, though, so this is an extravagant waste of time. On a LAN, we hardly notice how wasteful we are with queries; on a WAN, the product will feel…sluggish.
In order to understand the cause of these queries, you must first know that Quino allows a developer to put metadata properties into different load-groups. A load-group has the following behavior: If the value for a property in a load-group is requested on an object, the values for all of the properties in the load-group are retrieved with a single query and set on that object.
The default load-group of an object’s metadata determines the values that are initially retrieved and applied to objects materialized by the ORM.
The metadata for a person puts the “picture” property of a person into a separate load-group so that the value is not loaded by default when people objects are loaded from the data driver. This is a good balance because business logic will avoid downloading a lot of unwanted picture data by default.
Business logic that needs the pictures can either explicitly include the picture in the query or let the value be lazy-loaded by the ORM when it is accessed. The proper solution depends on the situation.
As before, we can check the stack trace of the query to figure out which application component is triggering the call. In this case, the culprit is the binding list that we are using to attach the list of people to the drop-down control.
The binding list binds the values for all of the properties in a metaclass (e.g. “person”), triggering a lazy load when it accesses the “picture” property. To avoid the lazy-load, we can create a wrapper of the default metadata for a person and remove/hide the property so that the binding list will no longer access it.
This is quite easy [2], as shown in the code below.
var personMetaClass = new WrapMetaClass(Person.Metadata);
personMetaClass.Properties.Remove(Person.MetaProperties.Picture);
var query = new Query(personMetaClass);
With this simple fix, the binding list no longer knows about the picture property, doesn’t retrieve values for that property and therefore no longer triggers any queries to lazily load the pictures from the database for each person object.
The screenshot of the statistics window below shows that we were successful. We have two main queries: one for the list of people to show in the dropdown control and one for the time entries to show in the calendar.
For completeness, here’s the code that Punchclock is using in the current version of Quino (1.11).
var personMetaClass = new WrapMetaClass(Person.Metadata);
personMetaClass.Properties.Remove(Person.MetaProperties.Picture);
var accessToolkit = new PostgreSqlMetaDatabase().AccessToolkit;
var query = new Query(personMetaClass);
query.CustomCommandText = new CustomCommandText();
query.CustomCommandText.SetSection(
CommandTextSections.Where,
CommandTextAction.Replace,
string.Format(
"EXISTS (SELECT id FROM {0} WHERE {1} = {2})",
accessToolkit.GetName(TimeEntry.Metadata),
accessToolkit.GetField(TimeEntry.MetaProperties.PersonId),
accessToolkit.GetField(Person.MetaProperties.Id)
)
);
var people = Session.GetList<Person>(query);
Once we fix the bug in the WhereExists join type mentioned in the previous article and add the fluent methods for constructing wrappers mentioned in the footnote below, the code will be as follows:
var personMetaClass =
Person.Metadata.
Wrap().
RemoveProperty(Person.MetaProperties.Picture);
var accessToolkit = new PostgreSqlMetaDatabase().AccessToolkit;
var people =
Session.GetList<Person>(
new Query(personMetaClass).
Join(Person.MetaRelations.TimeEntries, JoinType.WhereExists).
Query
);
This concludes our investigation into performance issues with Quino and Punchclock.
You may have noticed that these calls to “load values” are technically lazy-loaded but don’t seem to be marked as such in the screenshots. This was a bug in the statistics viewer that I discovered and addressed while writing this article.
This is a rather old API and hasn’t been touched with the “fluent” wand that we’ve applied in other parts of the Quino API. A nicer way of writing it would be to create extension methods called Wrap() and RemoveProperty that return the wrapper class, like so:
var personMetaClass =
Person.Metadata.
Wrap().
RemoveProperty(Person.MetaProperties.Picture);
var query = new Query(personMetaClass);
But that will have to wait for a future version of Quino.
The... [More]
Published by marco on 4. Jul 2014 09:09:05 (GMT-5)
In the previous article, we partially addressed a performance problem in the calendar of Encodo’s time-tracking product, Punchclock. While we managed to drastically reduce the amount of time taken by each query (>95% time saved), we were still executing more queries than strictly necessary.
The query that we’re trying to optimize further is shown below.
var people =
Session.GetList<Person>().
Where(p => Session.GetCount(p.TimeEntries.Query) > 0).
ToList();
This query executes one query to get all the people and then one query per person to get the number of time entries per person. Each of these queries by itself is very fast, but a high-latency connection adds its round-trip cost to every one of them. In order to optimize further, there’s really nothing for it but to reduce the number of queries being executed.
Let’s think back to what we’re actually trying to accomplish: We want to get all people who have at least one time entry. Can’t we get the database to do that for us? Some join or existence check or something? How about the code below?
var people =
Session.GetList<Person>(
Session.CreateQuery<Person>().
Join(Person.MetaRelations.TimeEntries, JoinType.WhereExists).
Query
);
What’s happening in the code above? We’re still getting a list of people but, instead of manipulating the related TimeEntries for each person locally, we’re joining the TimeEntries relation with the Quino query Join() method and changing the join type from the default All to the restrictive WhereExists. This sounds like exactly what we want to happen! There is no local evaluation or manipulation with Linq and, with luck, Quino will be able to map this to a single query on the database.
This is the best possible query: it’s purely declarative and will be executed as efficiently as the back-end knows how.
There’s just one problem: the WhereExists join type is broken in Quino 1.11.
Never fear, though! We can still get it to work, but we’ll have to do a bit of work until the bug is fixed in Quino 1.12. The code below builds on lessons learned in the earlier article, Mixing your own SQL into Quino queries: part 2 of 2 to use custom query text to create the restriction instead of letting Quino do it.
var accessToolkit = new PostgreSqlMetaDatabase().AccessToolkit;
var query = Session.CreateQuery<Person>();
query.CustomCommandText = new CustomCommandText();
query.CustomCommandText.SetSection(
CommandTextSections.Where,
CommandTextAction.Replace,
string.Format(
"EXISTS (SELECT id FROM {0} WHERE {1} = {2})",
accessToolkit.GetName(TimeEntry.Metadata),
accessToolkit.GetField(TimeEntry.MetaProperties.PersonId),
accessToolkit.GetField(Person.MetaProperties.Id)
)
);
var people = Session.GetList<Person>(query);
A look at the statistics is very encouraging:
We’re down to one 29ms query for the people and an even quicker query for all the relevant time entries. [1] We can see our query text appears embedded in the SQL generated by Quino, just as we expected.
There are a few other security-related queries that execute very quickly and hardly need optimization.
We’ve come much farther in this article and we’re almost done. In the next article, we’ll quickly clean up a few other queries that are showing up in the statistics and that have been nagging us since the beginning.
Published by marco on 27. Jun 2014 10:07:40 (GMT-5)
In the previous article, we discussed a performance problem in the calendar of Encodo’s time-tracking product, Punchclock.
Instead of guessing at the problem, we profiled the application using the database-statistics window available to all Quino applications. [1] We quickly discovered that most of the slowdown stems from the relatively innocuous line of code shown below.
var people =
Session.GetList<Person>().
Where(p => p.TimeEntries.Any()).
ToList();
Before doing anything else, we should establish what the code does. Logically, it retrieves a list of people in the database who have recorded at least one time entry.
The first question we should ask at this point is: does the application even need to do this? The answer in this case is ‘yes’. The calendar includes a drop-down control that lets the user switch between the calendars for different users. This query returns the people to show in this drop-down control.
With the intent and usefulness of the code established, let’s dissect how it is accomplishing the task.
The Session.GetList<Person>() portion retrieves a list of all people from the database
The Where() method is applied locally for each object in the list [2]
The Any() method is applied to the full list of time entries
The ToList() method creates a list of all people who match the condition
Though the line of code looks innocuous enough, it causes a huge number of objects to be retrieved, materialized and retained in memory—simply in order to check whether there is at least one object.
This is a real-world example of a performance problem that can happen to any developer. Instead of blaming the developer who wrote this line of code, it’s more important to stay vigilant to performance problems and to have tools available to quickly and easily find them.
The first solution I came up with [3] was to stop creating objects that I didn’t need. A good way of doing this and one that was covered in Quino: partially-mapped queries is to use cursors instead of lists. Instead of using the generated list TimeEntries, the following code retrieves a cursor on that list’s query and materializes at most one object for the sub-query.
var people = Session.GetList<Person>().Select(p =>
{
using (var cursor = Session.CreateCursor<TimeEntry>(p.TimeEntries.Query)) [4]
{
return cursor.Any();
}
}).ToList();
A check of the database statistics shows improvement, as shown below.
Just by using cursors, we’ve managed to reduce the execution time for each query by about 75%. [5] Since all we’re interested in finding out is whether there is at least one time entry for a person, we could also ask the database to count objects rather than to return them. That should be even faster. The following code is very similar to the example above but, instead of getting a cursor based on the TimeEntries query, it gets the count.
var people =
Session.GetList<Person>().
Where(p => Session.GetCount(p.TimeEntries.Query) > 0).
ToList();
How did we do? A check of the database statistics shows even more improvement, as shown below.
We’re now down to a few dozen milliseconds for all of our queries, so we’re done, right? A 95% reduction in query-execution time should be enough.
Unfortunately, we’re still executing just as many queries as before, even though we’re taking far less time to execute them. This is better, but still not optimal. In high-latency situations, the user is still likely to experience a significant delay when opening the calendar since each query’s execution time is increased by the latency of the connection. In a local network, the latency is negligible; on a WAN, we still have a problem.
In the next article, we’ll see if we can’t reduce the number of queries being executed.
For users of the Microsoft Entity Framework (EF), it is important to point out that Quino does not have a Linq-to-Sql mapper. That means that any Linq expressions like Where() are evaluated locally instead of being mapped to the database. There are various reasons for this but the main one is that we ended up preferring a strict boundary between the mappable query API and the local evaluation API.
Anything formulated with the query API is guaranteed to be executed by the data provider (even if it must be evaluated locally) and anything formulated with Linq is naturally evaluated locally. In this way, the code is clear in what is sent to the server and what is evaluated locally. Quino only very, very rarely issues an “unmappable query” exception, unlike EF, which occasionally requires contortions until you’ve figured out which C# formulation of a particular expression can be mapped by EF.
Published by marco on 20. Jun 2014 10:44:29 (GMT-5)
Updated by marco on 24. Jun 2014 13:27:18 (GMT-5)
Punchclock is Encodo’s time-tracking and invoicing tool. It includes a calendar to show time entries (shown to the left). Since the very first versions, it hasn’t opened very quickly. It was fast enough for most users, but those who worked with Punchclock over the WAN through our VPN have reported that it often takes many seconds to open the calendar. So we have a very useful tool that is not often used because of how slowly it opens.
That the calendar opens slowly in a local network and even more slowly in a WAN indicates that there is not only a problem with executing many queries but also with retrieving too much data.
This seemed like a solvable problem, so I fired up Punchclock in debug mode to have a look at the query-statistics window.
To set up the view shown below, I did the following:
I marked a few things on the screenshot. It’s somewhat suspicious that there are 13 queries for data of type “Person”, but we’ll get to that later. Much more suspicious is that there are 52 queries for time entries, which seems like quite a lot considering we’re showing a calendar for a single user. We would instead expect to have a single query. More queries would be OK if there were good reasons for them, but I feel comfortable in deciding that 52 queries is definitely too many.
A closer look at the details for the time-entry queries shows very high durations for some of them, ranging from a tenth of a second to nearly a second. These queries are definitely the reason the calendar window takes so long to load.
If I select one of the time-entry queries and show the “Query Text” tab (see screenshot below), I can see that it retrieves all time entries for a single person, one after another. There are almost six years of historical data in our Punchclock database and some of our employees have been around for all of them. [1] That’s a lot of time entries to load.
I can also select the “Stack Trace” tab to see where the call originated in my source code. This feature lets me pinpoint the program component that is causing these slow queries to be executed.
As with any UI-code stack, you have to be somewhat familiar with how events are handled and dispatched. In this stack, we can see how a MouseUp command bubbled up to create a new form, then a new control and finally, to trigger a call to the data provider during that control’s initialization. We don’t have line numbers but we see that the call originates in a lambda defined in the DynamicSchedulerControl constructor.
The line of code that I pinpoint as the culprit is shown below.
var people = Session.GetList<Person>().Where(p => p.TimeEntries.Any()).ToList();
This looks like a nicely declarative way of getting data, but to the trained eye of a Quino developer, it’s clear what the problem is.
In the next couple of articles, we’ll take a closer look at what exactly the problem is and how we can improve the speed of this query. We’ll also take a look at how we can improve the Quino query API to make it harder for code like the line above to cause performance problems.
Encodo just turned nine years old, but we used a different time-entry system for the first couple of years. If you’re interested in our time-entry software history, here it is:
In particular, we’d... [More]
Published by marco on 18. Jun 2014 08:10:36 (GMT-5)
Updated by marco on 8. Jun 2016 20:51:27 (GMT-5)
In the previous article, we listed a lot of questions that you should continuously ask yourself when you’re writing code. Even when you think you’re not designing anything, you’re actually making decisions that will affect either other team members or future versions of you.
In particular, we’d like to think about how we can reconcile a development process that involves asking so many questions and taking so many facets into consideration with YAGNI.
The implication of this principle is that if you aren’t going to need something, then there’s no point in even thinking about it. While it’s absolutely commendable to adopt a YAGNI attitude, not building something doesn’t mean not thinking about it and identifying potential pitfalls.
A feature or design concept can be discussed within a time-box. Allocate a fixed, limited amount of time to determine whether the feature or design concept needs to be incorporated, whether it would be nice to incorporate it or possibly to jettison it if it’s too much work and isn’t really necessary.
The overwhelming majority of time wasted on a feature is in the implementation, debugging, testing, documentation and maintenance of it, not in the design. Granted, a long design phase can be a time-sink—especially a “perfect is the enemy of the good” style of design where you’re completely blocked from even starting work. With practice, however, you’ll learn how to think about a feature or design concept (e.g. extensibility) without letting it ruin your schedule.
If you don’t try to anticipate future needs at all while designing your API, you may end up preventing that API from being extended in directions that are both logical and could easily have been anticipated. If the API is not extensible, then it will not be used and may have to be rewritten in the future, losing more time at that point rather than up front. This is, however, only a consideration you must make. It’s perfectly acceptable to decide that you currently don’t care at all and that a feature will have to be rewritten at some point in the future.
You can’t do this kind of cost-benefit analysis and risk-management if you haven’t taken time to identify the costs, benefits or risks.
At Encodo, we encourage the person who’s already spent time thinking about this problem to simply document the drawbacks and concessions and possible ideas in an issue-tracker entry that is linked to the current implementation. This allows future users, maintainers or extenders of the API to be aware of the thought process that underlies a feature. It can also help to avoid misunderstandings about what the intended audience and coverage of an API are.
The idea is to eliminate assumptions. A lot of time can be wasted when maintenance developers make incorrect assumptions about the intent of code.
If you don’t have time to do any of this, then you can write a quick note in a task list that you need to more fully document your thoughts on the code you’re writing. And you should try to do that soon, while the ideas are still relatively fresh in your mind. If you don’t have time to think about what you’re doing even to that degree, then you’re doing something wrong and need to get organized better.
That is, if you can’t think about the code you’re writing and don’t have time to document your process, even minimally, then you shouldn’t be writing that code. Either that, or you implicitly accept that others will have to clean up your mess. And “others” includes future versions of you. (E.g. the you who, six months from now, is muttering, “who wrote this crap?!?”)
As an example, we can consider how we go from a specific feature in the context of a project to thinking about where the functionality could fit in to a suite of products—that may or may not yet exist. And remember, we’re only thinking about these things. And we’re thinking about them for a limited time—a time-box. You don’t want to prevent your project from moving forward, but you also don’t want to advance at all costs.
Advancing in an unstructured way is called hacking and, while it can lead to a short-term win, it almost always leads to short-to-medium term deficits. You can still write code that is hacked and looks hacked, if that is the highest current priority, but you’re not allowed to forget that you did so. You must officially designate what you’re doing as a hot-zone of hacking so that the Hazmat team can clean it up later, if needed.
A working prototype that is hacked together just so it works for the next demonstration is great as long as you don’t think that you can take it into production without doing the design and documentation work that you initially skipped.
If you fail to document the deficits that prevent you from taking a prototype to production, then how will you address those deficits? It will cost you much more time and pain to determine the deficits after the fact. Not only that, but unless you do a very good job, it is your users that will most likely be finding deficits—in the form of bugs.
If your product is just a hacked mess of spaghetti code with no rhyme or reason, another developer will be faster and produce more reliable code by just starting over. Trying to determine the flaws, drawbacks and hacks through intuition and reverse-engineering is slower and more error-prone than just starting with a clean slate. Developers on such a project will not be able to save time—and money—by building on what you’ve already made.
Not to be forgotten is a structured approach to error-handling. The more “hacked” the code, the more stringent the error-checking should be. If you haven’t had time yet to write or test code sufficiently, then that code shouldn’t be making broad decisions about what it thinks are acceptable errors.
Fail early, fail often. Don’t try to make a hacked mess of code bullet-proof by catching all errors in an undocumented manner. Doing so is deceptive to testers of the product as well as other developers.
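A sketch of the difference (the importer class and its members are invented for the example):
public class OrderImporter
{
    // Deceptive: swallows every error, so testers and other developers never learn
    // that this code path was never finished.
    public void ImportQuietly(string path)
    {
        try
        {
            Import(path);
        }
        catch (Exception)
        {
            // ignored
        }
    }

    // Honest: unfinished or invalid paths announce themselves immediately.
    public void Import(string path)
    {
        if (string.IsNullOrEmpty(path))
        {
            throw new ArgumentException("A path is required.", "path");
        }

        throw new NotImplementedException("Only the demo scenario is implemented so far.");
    }
}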
If you’re building a demo, make sure the happy path works and stick to it during the demo. If you do have to break this rule, add the hacks to a demo-specific branch of the code that will be discarded later.
If, however, the developer can look at your code and sees accompanying notes (either in an issue tracker, as TODOs in the code or some other form of documentation), that developer knows where to start fixing the code to bring it to production quality.
For example, it’s acceptable to configure an application in code as long as you do it in a central place and you document that the intent is to move the configuration to an external source when there’s time. If a future developer finds code for support for multiple database connections and tests that are set to ignore with a note/issue that says “extend to support multiple databases”, that future developer can decide whether to actually implement the feature or whether to just discard it because it has been deprecated as a requirement.
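A minimal sketch of what such central, documented in-code configuration might look like (the settings and the issue number are placeholders):
public static class AppSettings
{
    // TODO (QNO-XXXX): move these values to an external configuration source.
    public static readonly string DatabaseHost = "localhost";
    public static readonly int CommandTimeoutInSeconds = 30;
}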
Without documentation or structure or an indication which parts of the code were thought-through and which are considered to be hacked, subsequent developers are forced to make assumptions that may not be accurate. They will either assume that hacked code is OK or that battle-tested code is garbage. If you don’t inform other developers of your intent when you’re writing the code—best done with documentation, tests and/or a cleanly designed API—then it might be discarded or ignored, wasting even more time and money.
If you’re on a really tight time-budget and don’t have time to document your process correctly, then write a quick note that you think the design is OK or the code is OK, but tell your future self or other developers what they’re looking at. It will only take you a few minutes and you’ll be glad you did—and so will they.
Published by marco on 3. Jun 2014 10:25:46 (GMT-5)
A big part of an agile programmer’s job is API design. In an agile project, the architecture is defined from on high only in broad strokes, leaving the fine details of component design up to the implementer. Even in projects that are specified in much more detail, implementers will still find themselves in situations where they have to design something.
This means that programmers in an agile team have to be capable of weighing the pros and cons of various approaches in order to avoid causing performance, scalability, maintenance or other problems as the API is used and evolves.
When designing an API, we consider some of the following aspects. This is not meant to be a comprehensive list, but should get you thinking about how to think about the code you’re about to write.
Even if you don’t have time to write tests right now, you should still build your code so that it can be tested. It’s possible that you won’t be writing the tests. Instead, you should prepare the code so that others can use it.
It’s also possible that a future you will be writing the tests and will hate you for having made it so hard to automate testing.
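One way to prepare for tests without writing them is to depend on abstractions rather than concrete services, so that a test can substitute its own implementation later. A minimal sketch, with invented types:
public interface IClock
{
    DateTime Now { get; }
}

public class InvoiceGenerator
{
    private readonly IClock _clock;

    // The clock is injected, so a test can later pass a fake implementation
    // with a fixed time instead of relying on the system clock.
    public InvoiceGenerator(IClock clock)
    {
        if (clock == null) { throw new ArgumentNullException("clock"); }

        _clock = clock;
    }

    public DateTime GetDueDate()
    {
        return _clock.Now.AddDays(30);
    }
}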
This is a very important aspect and involves how your application handles situations outside of the design.
While we’re on the subject of error-handling, I want to emphasize that this is one of the most important parts of API design, regardless of which language or environment you use. [1]
Add preconditions for all method parameters; verify them as non-null and verify ranges. Do not catch all exceptions and log them or—even worse—ignore them. This is even more important in environments—I’m looking at you, client-side web code in general and JavaScript in particular—where the established philosophy is to run anything and to never rap a programmer on the knuckles for having written really knuckle-headed code.
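A minimal sketch of the kind of precondition checks meant here (the method and its parameters are made up for illustration):
public void RegisterUser(string name, int age)
{
    if (name == null) { throw new ArgumentNullException("name"); }
    if (name.Length == 0) { throw new ArgumentException("Name must not be empty.", "name"); }
    if (age < 0 || age > 150) { throw new ArgumentOutOfRangeException("age"); }

    // The rest of the method can now rely on valid inputs instead of silently
    // working around invalid ones.
}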
You haven’t tested the code, so you don’t know what kind of errors you’re going to get. If you ignore everything, then you’ll also ignore assertions, contract violations, null-reference exceptions and so on. The code will never be improved if it never makes a noise. It will just stay silently crappy until someone notices a subtle logical error somewhere and must painstakingly track it down to your untested code.
You might say that production code shouldn’t throw exceptions. This is true, but we’re explicitly not talking about production code here. We’re talking about code that has few to no tests and is acknowledged to be incomplete. If you move code like this into production, then it’s better to crash than to silently corrupt data or degrade the user experience.
A crash will get attention and the code may even be fixed or improved. If you write code that will crash on all but the “happy path” and it never crashes? That’s great. Do not program preemptively defensively in fresh code. If you have established code that interfaces with other (possibly external) components and you sometimes get errors that you can’t work around in any other way, then it’s OK to catch and log those exceptions rather than propagating them. At least you tried.
In the next article, we’ll take a look at how all of these questions and considerations can be reconciled with YAGNI at all. Spoiler alert: we think that they can.
Published by marco on 31. May 2014 08:55:13 (GMT-5)
There’s an old problem in generated WCF clients in which the Dispose()
method calls Close()
on the client irrespective of whether there was a fault. If there was a fault, then the method should call Abort()
instead. Failure to do so causes another exception, which masks the original exception. Client code will see the subsequent fault rather than the original one. A developer running the code in debug mode will be misled as to what really happened.
You can see WCF Clients and the “Broken” IDisposable Implementation by David Barrett for a more in-depth analysis, but that’s the gist of it.
This issue is still present in the ClientBase
implementation in .NET 4.5.1. The linked article shows how you can add your own implementation of the Dispose()
method in each generated client. An alternative is to use a generic adaptor if you don’t feel like adding a custom dispose to every client you create. [1]
public class SafeClient<T> : IDisposable
where T : ICommunicationObject, IDisposable
{
public SafeClient(T client)
{
if (client == null) { throw new ArgumentNullException("client"); }
Client = client;
}
public T Client { get; private set; }
public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
protected virtual void Dispose(bool disposing)
{
if (disposing)
{
if (Client != null)
{
if (Client.State == CommunicationState.Faulted)
{
Client.Abort();
}
else
{
Client.Close();
}
Client = default(T);
}
}
}
}
To use your WCF client safely, you wrap it in the class defined above, as shown below.
using (var safeClient = new SafeClient<SystemLoginServiceClient>(new SystemLoginServiceClient(…)))
{
var client = safeClient.Client;
// Work with "client"
}
If you can figure out how to initialize your clients without passing parameters to the constructor, you could slim it down by adding a “new” generic constraint to the parameter T in SafeClient
and then using the SafeClient
as follows:
using (var safeClient = new SafeClient<SystemLoginServiceClient>())
{
var client = safeClient.Client;
// Work with "client"
}
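Such an adjusted SafeClient might look roughly like the following sketch; the cleanup mirrors the fault-aware Dispose() logic shown above.
public class SafeClient<T> : IDisposable
  where T : ICommunicationObject, IDisposable, new()
{
  public SafeClient()
  {
    Client = new T();
  }

  public T Client { get; private set; }

  public void Dispose()
  {
    if (Client != null)
    {
      // Same fault-aware cleanup as above: Abort() a faulted client, Close() otherwise.
      if (Client.State == CommunicationState.Faulted)
      {
        Client.Abort();
      }
      else
      {
        Client.Close();
      }

      Client = default(T);
    }
  }
}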
Published by marco on 31. May 2014 08:55:09 (GMT-5)
In a project that we’re working on, we’re consuming REST APIs delivered by services built by another team working for the same customer. We had a discussion about what were appropriate error codes to return for various situations. The discussion boiled down to: should a service return a 500 error code or a 400 error code when a request cannot be processed?
I took a quick look at the documentation for a couple of the larger REST API providers and they are using the 500 code only for catastrophic failure and using the 400 code for anything related to query-input validation errors.
Microsoft Azure Common REST API Error Codes
Code 400:
- The requested URI does not represent any resource on the server.
- One of the request inputs is out of range.
- One of the request inputs is not valid.
- A required query parameter was not specified for this request.
- One of the query parameters specified in the request URI is not supported.
- An invalid value was specified for one of the query parameters in the request URI.
Code 500:
- The server encountered an internal error. Please retry the request.
- The operation could not be completed within the permitted time.
- The server is currently unable to receive requests. Please retry your request.
Twitter Error Codes & Responses
Code 400:
“The request was invalid or cannot be otherwise served. An accompanying error message will explain further.”
Code 500:
“Something is broken. Please post to the group so the Twitter team can investigate.”
REST API Tutorial HTTP Status Codes
Code 400:
“General error when fulfilling the request would cause an invalid state. Domain validation errors, missing data, etc. are some examples.”
Code 500:
“A generic error message, given when no more specific message is suitable. The general catch-all error when the server-side throws an exception. Use this only for errors that the consumer cannot address from their end—never return this intentionally.”
“For input validation failure: 400 Bad Request + your optional description. This is suggested in the book “RESTful Web Services”.”
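In ASP.NET Web API terms, the convention above might look roughly like this; the controller and its validation rules are invented for illustration. Anything the caller can fix gets a 400, while unexpected exceptions are left to surface as a 500.
public class OrdersController : ApiController
{
    public IHttpActionResult Get(int id)
    {
        if (id <= 0)
        {
            // Input-validation failure: the caller can correct this, so return 400.
            return BadRequest("id must be a positive integer.");
        }

        var order = FindOrder(id);

        if (order == null)
        {
            return NotFound();
        }

        // Anything unexpected thrown below this point surfaces as a 500,
        // which the consumer cannot address from their end.
        return Ok(order);
    }

    private object FindOrder(int id)
    {
        // Placeholder for the actual data access.
        return null;
    }
}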
Published by marco on 17. Apr 2014 21:30:02 (GMT-5)
In the first installment, we covered the basics of mixing custom SQL with ORM-generated queries. We also took a look at a solution that uses direct ADO database access to perform arbitrarily complex queries.
In this installment, we will see more elegant techniques that make use of the CustomCommandText
property of Quino queries. We’ll approach the desired solution in steps, proceeding from attempt #1 – attempt #5.
tl;dr: Skip to attempt #5 to see the final result without learning why it’s correct.
An application can assign the CustomCommandText
property of any Quino query to override some of the generated SQL. In the example below, we override all of the text, so that Quino doesn’t generate any SQL at all. Instead, Quino is only responsible for sending the request to the database and materializing the objects based on the results.
[Test]
public void TestExecuteCustomCommand()
{
var people = Session.GetList<Person>();
people.Query.CustomCommandText = new CustomCommandText
{
Text = @"
SELECT ALL
""punchclock__person"".""id"",
""punchclock__person"".""companyid"",
""punchclock__person"".""contactid"",
""punchclock__person"".""customerid"",
""punchclock__person"".""initials"",
""punchclock__person"".""firstname"",
""punchclock__person"".""lastname"",
""punchclock__person"".""genderid"",
""punchclock__person"".""telephone"",
""punchclock__person"".""active"",
""punchclock__person"".""isemployee"",
""punchclock__person"".""birthdate"",
""punchclock__person"".""salary""
FROM punchclock__person WHERE lastname = 'Rogers'"
};
Assert.That(people.Count, Is.EqualTo(9));
}
This example solves two of the three problems outlined above.
Let’s see if we can address the third issue by getting Quino to format the SELECT
clause for us.
Letting Quino format the SELECT clause
The following example uses the AccessToolkit
of the IQueryableDatabase
to format the list of properties obtained from the metadata for a Person
. The application no longer makes assumptions about which properties are included in the select statement, what order they should be in or how to format them for the SQL expected by the database.
[Test]
public virtual void TestExecuteCustomCommandWithStandardSelect()
{
var people = Session.GetList<Person>();
var accessToolkit = DefaultDatabase.AccessToolkit;
var properties = Person.Metadata.DefaultLoadGroup.Properties;
var fields = properties.Select(accessToolkit.GetField);
people.Query.CustomCommandText = new CustomCommandText
{
Text = string.Format(
@"SELECT ALL {0} FROM punchclock__person WHERE lastname = 'Rogers'",
fields.FlattenToString()
)
};
Assert.That(people.Count, Is.EqualTo(9));
}
This example fixes the problem with the previous one but introduces a new problem: it no longer works with a remote application because it assumes that the client-side driver is a database with an AccessToolkit
. The next example addresses this problem.
Using a hard-coded AccessToolkit
The version below uses a hard-coded AccessToolkit
so that it doesn’t rely on the external data driver being a direct ADO database. It still makes an assumption about the database on the server but that is usually quite acceptable because the backing database for most applications rarely changes. [1]
[Test]
public void TestCustomCommandWithPostgreSqlSelect()
{
var people = Session.GetList<Person>();
var accessToolkit = new PostgreSqlMetaDatabase().AccessToolkit;
var properties = Person.Metadata.DefaultLoadGroup.Properties;
var fields = properties.Select(accessToolkit.GetField);
people.Query.CustomCommandText = new CustomCommandText
{
Text = string.Format(
@"SELECT ALL {0} FROM punchclock__person WHERE lastname = 'Rogers'",
fields.FlattenToString()
)
};
Assert.That(people.Count, Is.EqualTo(9));
}
We now have a version that satisfies all three conditions to a large degree. The application uses only a single query and the query works with both local databases and remoting servers. It still makes some assumptions about database-schema names (e.g. “punchclock__person” and “lastname”). Let’s see if we can clean up some of these as well.
Replacing the where clause
Instead of replacing the entire query text, an application can replace individual sections of the query, letting Quino fill in the rest of the query with its standard generated SQL. An application can append or prepend text to the generated SQL or replace it entirely. Because the condition for our query is so simple, the example below replaces the entire WHERE
clause instead of adding to it.
[Test]
public void TestCustomWhereExecution()
{
var people = Session.GetList<Person>();
people.Query.CustomCommandText = new CustomCommandText();
people.Query.CustomCommandText.SetSection(
CommandTextSections.Where,
CommandTextAction.Replace,
"lastname = 'Rogers'"
);
Assert.That(people.Count, Is.EqualTo(9));
}
That’s much nicer—still not perfect, but nice. The only remaining quibble is that the identifier lastname
is still hard-coded. If the model changes in a way where that property is renamed or removed, this code will continue to compile but will fail at run-time. This is a not insignificant problem if your application ends up using these kinds of queries throughout its business logic.
Replacing the where clause with generated field names
In order to fix this query and have a completely generic query that fails to compile should anything at all change in the model, we can mix in the technique that we used in attempts #2 and #3: using the AccessToolkit
to format fields for SQL. To make the query 100% statically checked, we’ll also use the generated metadata—LastName
—to indicate which property we want to format as SQL.
[Test]
public void TestCustomWhereExecution()
{
var people = Session.GetList<Person>();
var accessToolkit = new PostgreSqlMetaDatabase().AccessToolkit;
var lastNameField = accessToolkit.GetField(Person.MetaProperties.LastName);
people.Query.CustomCommandText = new CustomCommandText();
people.Query.CustomCommandText.SetSection(
CommandTextSections.Where,
CommandTextAction.Replace,
string.Format("{0} = 'Rogers'", lastNameField)
);
Assert.That(people.Count, Is.EqualTo(9));
}
The query above satisfies all of the conditions we outlined above. It’s clear that the condition is quite simple and that real-world business logic will likely be much more complex. For those situations, the best approach is to fall back to the direct ADO approach, mixed with Quino facilities like the AccessToolkit
as much as possible to create a fully customized SQL text.
Many thanks to Urs for proofreading and suggestions on overall structure.
Published by marco on 13. Apr 2014 17:38:59 (GMT-5)
The Quino ORM [1] manages all CrUD—Create, Update, Delete—operations for your application. This basic behavior is generally more than enough for standard user interfaces. When a user works with a single object in a window and saves it, there really isn’t that much to optimize.
A more complex editing process may include several objects at once and perhaps trigger events that create additional auditing objects. Even in these cases, there are still only a handful of save operations to execute. To keep the architecture clean, an application is encouraged to model these higher-level operations with methods in the metadata (modeled methods).
The advantage to using modeled methods is that they can be executed in an application server as well as locally in the client. When an application uses a remote application server rather than a direct connection to a database, modeled methods are executed in the service layer and therefore have much less latency to the database.
If an application needs even more optimization, then it may be necessary to write custom SQL—or even to use stored procedures to move the query into the database. Mixing SQL with an ORM can be a tricky business. It’s even more of a challenge with an ORM like that in Quino, which generates the database schema and shields the user from tables, fields and SQL syntax almost entirely.
What are the potential pitfalls when using custom query text (e.g. SQL) with Quino?
There are two approaches to executing custom code:
- Getting the underlying ADO connection and executing SQL directly
- Customizing the IQuery object using expressions, though an application can also add text directly to enhance or replace sections of the generated query
All of the examples below are taken directly from the Quino test suite. Some variables—like DefaultDatabase
—are provided by the Quino base testing classes but their purpose, types and implementation should be relatively obvious.
You can use the AdoDataConnectionTools
to get the underlying ADO connection for a given Session
so that any commands you execute are guaranteed to be executed in the same transactions as are already active on that session. If you use these tools, your ADO code will also automatically use the same connection parameters as the rest of your application without having to use hard-coded connection strings.
The first example shows a test from the Quino framework that shows how easy it is to combine results returned from another method into a standard Quino query.
[Test]
public virtual void TestExecuteAdoDirectly()
{
var ids = GetIds().ToList();
var people = Session.GetList<Person>();
people.Query.Where(Person.MetaProperties.Id, ExpressionOperator.In, ids);
Assert.That(people.Count, Is.EqualTo(9));
}
The ADO-access code is hidden inside the call to GetIds()
, the implementation for which is shown below. Your application can get the connection for a session as described above and then create commands using the same helper class. If you call CreateCommand()
directly on the ADO connection, you’ll have a problem when running inside a transaction on SQL Server. The SQL Server ADO implementation requires that you assign the active transaction object to each command. Quino takes care of this bookkeeping for you if you use the helper method.
private IEnumerable<int> GetIds()
{
using (var helper = AdoDataConnectionTools.GetAdoConnection(Session, "Name"))
{
using (var command = helper.CreateCommand())
{
command.AdoCommand.CommandText =
@"SELECT id FROM punchclock__person WHERE lastname = 'Rogers'";
using (var reader = command.AdoCommand.ExecuteReader())
{
while (reader.Read())
{
yield return reader.GetInt32(0);
}
}
}
}
}
There are a few drawbacks to this approach.
In the second part, we will improve on this approach by using the CustomCommandText
property of a Quino query. This will allow us to use only a single query. We will also improve maintainability by reducing the amount of code that isn’t checked by the compiler (e.g. the SQL text above).
Stay tuned for part 2, coming soon!
Many thanks to Urs for proofreading and suggestions on overall structure.
AdoDataConnectionTools
is not available until 1.12. The functionality of this class can, however, be back-ported if necessary.
Mixing your own SQL into Quino queries: part 1 of 2
Published by marco on 28. Mar 2014 15:53:54 (GMT-5)
Updated by marco on 28. Mar 2014 15:56:09 (GMT-5)
This article discusses and compares the initial version of Java 8 and C# as of .NET 4.5.1. I have not used Java 8 and I have not tested that any of the examples—Java or C#—even compile, but they should be pretty close to valid.
Java 8 has finally been released and—drum roll, please—it has closures/lambdas, as promised! I would be greeting this as champagne-cork–popping news if I were still a Java programmer. [1] As an ex-Java developer, I greet this news more with an ambivalent shrug than with any overarching joy. It’s a sunny morning and I’m in a good mood, so I’m able to suppress what would be a more than appropriate comment: “it’s about time”.
Since I’m a C# programmer, I’m more interested in peering over the fence at the pile of goodies that Java just received for its eighth birthday and see if it got something “what I ain’t got”. I found a concise list of new features in the article Will Java 8 Kill Scala? by Ahmed Soliman and was distraught/pleased [2] to discover that Java had in fact gotten two presents that C# doesn’t already have.
As you’ll see, these two features aren’t huge and the lack of them doesn’t significantly impact design or expressiveness, but you know how jealousy works:
Jealousy doesn’t care.
Jealousy is.
I’m sure I’ll get over it, but it will take time. [3]
Java 8 introduces support for static methods on interfaces as well as default methods that, taken together, amount to functionality that is more or less what extension methods bring to C#.
In Java 8, you can define static methods on an interface, which is nice, but it becomes especially useful when combined with the keyword default
on those methods. As defined in Default Methods (Java Tutorials):
“Default methods enable you to add new functionality to the interfaces of your libraries and ensure binary compatibility with code written for older versions of those interfaces.”
In Java, you no longer have to worry that adding a method to an interface will break implementations of that interface in other jar files that have not yet been recompiled against the new version of the interface. You can avoid that by adding a default implementation for your method. This applies only to those methods where a default implementation is possible, of course.
The page includes an example but it’s relatively obvious what it looks like:
public interface ITransformer
{
string Adjust(string value);
string NewAdjust(string value)
{
return value.Replace(' ', '\t');
}
}
How do these compare with extension methods in C#?
Extension methods are nice because they allow you to quasi-add methods to an interface without requiring an implementor to actually implement them. My rule of thumb is that any method that can be defined purely in terms of the public API of an interface should be defined as an extension method rather than added to the interface.
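Using the ITransformer interface from the example above, such a method might look like the following sketch (AdjustAll is a hypothetical helper):
public static class TransformerExtensions
{
    // Written purely in terms of the public API of ITransformer, so it can live
    // outside the interface as an extension method.
    public static IEnumerable<string> AdjustAll(this ITransformer transformer, IEnumerable<string> values)
    {
        return values.Select(transformer.Adjust);
    }
}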
Java’s default methods are a twist on this concept that addresses a limitation of extension methods. What is that limitation? That the method definition in the extension method can’t be overridden by the actual implementation behind the interface. That is, the default implementation can be expressed purely in terms of the public interface, but perhaps a specific implementor of the interface would like to do that plus something more. Or would perhaps like to execute the extension method in a different way, but only for a specific implementation. There is no way to do this with extension methods.
Interface default methods in Java 8 allow you to provide a fallback implementation but also allows any class to actually implement that method and override the fallback.
Functional interfaces are a nice addition, too, and something I’ve wanted in C# for some time. Erik Meijer of Microsoft doesn’t miss an opportunity to point out that this is a must for functional languages (he’s exaggerating, but the point is taken).
Saying that a language supports functional interfaces simply means that a lambda defined in that language can be assigned to any interface with a single method that has the same signature as that lambda.
An example in C# should make things clearer:
public interface ITransformer
{
string Adjust(string value);
}
public static class Utility
{
public static void WorkOnText(string text, ITransformer transformer)
{
// Do work
}
}
In order to call WorkOnText()
in C#, I am required to define a class that implements ITransformer
. There is no other way around it. However, in a language that allows functional interfaces, I could call the method with a lambda directly. The following code looks like C# but won’t actually compile.
Utility.WorkOnText(
"Hello world",
s => s.Replace("Hello", "Goodbye cruel")
);
For completeness, let’s also see how much extra code it takes to do this in C#, which has no functional interfaces.
public class PessimisticTransformer : ITransformer
{
public string Adjust(string value)
{
return value.Replace("Hello", "Goodbye cruel");
}
}
Utility.WorkOnText(
"Hello world",
new PessimisticTransformer()
);
That’s quite a huge difference. It’s surprising that C# hasn’t gotten this functionality yet. It’s hard to see what the downside is for this feature—it doesn’t seem to alter semantics.
While it is supported in Java, there are other restrictions. The signature has to match exactly. What happens if we add an optional parameter to the interface-method definition?
public interface ITransformer
{
string Adjust(string value, ITransformer additional = null);
}
In the C# example, the class implementing the interface would have to be updated, of course, but the code at calling location remains unchanged. The functional interface’s definition is the calling location, so the change would be closer to the implementation instead of more abstracted from it.
public class PessimisticTransformer : ITransformer
{
public string Adjust(string value, ITransformer additional = null)
{
return value.Replace("Hello", "Goodbye cruel");
}
}
// Using a class
Utility.WorkOnText(
"Hello world",
new PessimisticTransformer()
);
// Using a functional interface
Utility.WorkOnText(
"Hello world",
(s, a) => s.Replace("Hello", "Goodbye cruel")
);
I would take the functional interface any day.
As a final note, Java 8 has finally acquired closures/lambdas [4] but there is a limitation on which functions can be passed as lambdas. It turns out that the inclusion of functional interfaces is a workaround for not having first-class functions in the language.
Citing the article,
“[…] you cannot pass any function as first-class to other functions, the function must be explicitly defined as lambda or using Functional Interfaces”
While in C# you can assign any method with a matching signature to a lambda variable or parameter, Java requires that the method first be assigned to a variable that is “explicitly assigned as lambda” in order to be used. This isn’t a limitation on expressiveness but may lead to clutter.
In C# I can write the following:
public static class StringExtensions
{
  public static string Twist(string value)
  {
    return new string(value.Reverse().ToArray());
  }

  public static string Alter(this string value, Func<string, string> func)
  {
    return func(value);
  }

  public static string ApplyTransformations(string value)
  {
    return value.Alter(Twist).Alter(s => new string(s.Reverse().ToArray()));
  }
}
This example shows how you can declare a Func
to indicate that the parameter is a first-class function. I can pass the Twist
function or I can pass an inline lambda, as shown in ApplyTransformations
. However, in Java, I can’t declare a Func
: only functional interfaces. In order to replicate the C# example above in Java, I would do the following:
public String twist(String value)
{
return new StringBuilder(value).reverse().toString();
}
public String alter(String value, ITransformer transformer)
{
return transformer.adjust(value);
}
public String applyTransformations(String value)
{
return alter(alter(value, s -> twist(s)), s -> new StringBuilder(s).reverse().toString());
}
Note that the Java example cannot pass Twist
directly; instead, it wraps it in a lambda so that it can be passed as a functional interface. Also, the C# example uses an extension method, which allows me to “add” methods to class string
, which is not really possible in Java.
Overall, though, while these things feel like deal-breakers to a programming-language snob [5]—especially those who have a choice as to which language to use—Java developers can rejoice that their language has finally acquired features that both increase expressiveness and reduce clutter. [6]
As a bonus, as a C# developer, I find that I don’t have to be so jealous after all.
Though I’d still really like me some functional interfaces.
Published by marco on 13. Mar 2014 21:46:59 (GMT-5)
In Quino: partially-mapped queries we took a look at how Quino seamlessly maps as much as possible to the database, while handling unmappable query components locally as efficiently as possible.
As efficiently as possible can be a bit of a weasel statement. We saw that partial application of restrictions could significantly reduce the data returned. And we saw that efficient handling of that returned data could minimize the impact on both performance and memory, keeping in mind, of course, that the primary goal is correctness.
However, as we saw in the previous article, it’s still entirely possible that even an optimally mapped query will result in an unacceptable memory-usage or performance penalty. In these cases, we need to be able to hint or warn the developer that something non-optimal is occurring. It would also be nice if the developer could indicate whether or not queries with such deficiencies should even be executed.
Why would this be necessary? Doesn’t the developer have ultimate control over which queries are called? The developer has control over queries in business-logic code. But recall that the queries that we are using are somewhat contrived in order to keep things simple. Quino is a highly generic metadata framework: most of the queries are constructed by standard components from expressions defined in the metadata.
For example, the UI may piece together a query from various sources in order to retrieve the data for a particular view. In such cases, the developer has less direct control to “repair” queries with hand-tuning. Instead, the developer has to view the application holistically and make repairs in the metadata. This is one of many reasons why Quino has local evaluation and does not simply throw an exception for partially mapped queries, as EF does.
It is, in general, far better to continue working while executing a possibly sub-optimal and performance-damaging query than it is to simply crash out. Such behavior would increase the testing requirements for generated UIs considerably. Instead, the UI always works and the developer can focus on optimization and fine-tuning in the model, using tools like the Statistics Viewer, shown to the left.
The statistics viewer shows all commands executed in an application, with a stack trace, messages (hints/warnings/info) and the original query and mapped SQL/remote statement for each command. The statistics are available for SQL-based data drivers, but also for remoting drivers for all payload types (including JSON).
The screenshot above is for the statistics viewer for Winform applications; we’ve also integrated statistics into web applications using Glimpse, a plugin architecture for displaying extra information for web-site developers. The screenshot to the right shows a preview-release version that will be released with Quino 1.11 at the end of March.
One place where an application can run into efficiency problems is when the sort order for entities is too complex to map to the server.
If a single restriction cannot be mapped to the database, we can map all of the others and evaluate the unmappable ones locally. What happens if a single sort cannot be mapped to the database? Can we do the same thing? Again, to avoid being too abstract, let’s start with an example.
var query = Session.GetQuery<Person>();
query
.Where(Person.Fields.LastName, ExpressionOperator.StartsWith, "M") // [1]
.OrderBy(Person.Fields.LastName)
.OrderBy(Person.Fields.FirstName)
.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
Assert.That(Session.GetList(query).Count, Is.Between(100, 120));
Both of these sorts can be mapped to the server so the performance and memory hit is very limited. The ORM will execute a single query and will return data for and create about 100 objects.
Now, let’s replace one of the mappable sorts with something unmappable:
var query = Session.GetQuery<Person>();
query
.Where(Person.Fields.LastName, ExpressionOperator.StartsWith, "M") // [1]
.OrderBy(new DelegateExpression(c => c.GetObject<Person>().FirstName))
.OrderBy(Person.Fields.LastName)
.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
Assert.That(Session.GetList(query).Count, Is.Between(100, 120));
What’s happening here? Instead of being able to map both sorts to the database, now only one can be mapped. Or can it? The primary sort can’t be mapped, so there’s obviously no point in mapping the secondary sort. Instead, all sorting must be applied locally.
What if we had been able to map the primary sort but not the secondary one? Then we could have the database apply the primary sort, returning the data partially ordered. We can apply the remaining sort in memory…but that won’t work, will it? If we only applied the secondary sort in memory, then the data would end up sorted only by that value. It turns out that, unlike restrictions, sorting is all-or-nothing. If we can’t map all sorts to the database, then we have to apply them all locally. [1]
In this case, the damage is minimal because the restrictions can be mapped and guarantee that only about 100 objects are returned. Sorting 100 objects locally isn’t likely to show up on the performance radar.
Still, sorting is a potential performance-killer: as soon as you stray from the path of standard sorting, you run the risk of either pulling all of the matching objects into memory in order to sort them locally or of taking a noticeable performance hit.
In the next article, we’ll discuss how we can extract slices from a result set—using limit
and offset
—and what sort of effect this can have on performance in partially mapped queries.
Published by marco on 6. Mar 2014 22:33:32 (GMT-5)
In Quino: an overview of query-mapping in the data driver we took a look at some of the basics of querying data with Quino while maintaining acceptable performance and memory usage.
Now we’ll take a look at what happens with partially-mapped queries. Before explaining what those are, we need a more concrete example to work with. Here’s the most-optimized query we ended up with in the previous article:
var query = Session.GetQuery<Person>();
query.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
Assert.That(Session.GetCount(query), Is.GreaterThanOrEqualTo(140000));
With so many entries, we’ll want to trim down the list a bit more before we actually create objects. Let’s choose only people whose last names start with the letter “M”.
var query = Session.GetQuery<Person>();
query
.Where(Person.Fields.LastName, ExpressionOperator.StartsWith, "M") // [1]
.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
Assert.That(Session.GetCount(query), Is.Between(100, 120));
This is the kind of stuff that works just fine in other ORMs, like Entity Framework. Where Quino goes just a little farther is in being more forgiving when a query can be only partially mapped to the server. If you’ve used EF for anything beyond trivial queries, you’ve surely run into an exception that tells you that some portion of your query could not be mapped. [2]
Instead of throwing an exception, Quino sends what it can to the database and uses LINQ to post-process the data sent back by the database to complete the query.
Unmappable code can easily sneak in through aspects in the metadata that define filters or sorts using local methods or delegates that do not exist on the server. Instead of building a complex case, we’re going to knowingly include an unmappable expression in the query.
var query = Session.GetQuery<Person>();
query
.Where(new DelegateExpression(c => c.GetObject<Person>().LastName.StartsWith("M"))) // [3] [4]
.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
Assert.That(Session.GetCount(query), Is.Between(100, 120));
The new expression performs the same check as the previous example, but in a way that cannot be mapped to SQL. [5] With our new example, we’ve provoked a situation where any of the following could happen:
- The ORM could end up retrieving all Person
objects from the server (all several million of them, if you’ll recall from the previous post).What happens when we evaluate the query above? With partial mapping, we know that the restriction to “IBM” will be applied on the database. But we still have an additional restriction that must be applied locally. Instead of being able to get the count from the server without creating any objects, we’re now forced to create objects in memory so that we can apply the local restrictions and only count the objects that match them all.
But as you’ll recall from the previous article, the number of matches for “IBM” is 140,000 objects. The garbage collector just gave you a dirty look again.
There is no way to further optimize this query because of the local evaluation, but there is a way to avoid another particularly nasty issue: memory bubbles.
What is a memory bubble you might ask? It describes what happens when your application is using nMB and then is suddenly using n + 100MB because you created 140,000 objects all at once. Milliseconds later, the garbage collector is thrashing furiously to clean up all of these objects—and all because you created them only in order to filter and count them. A few milliseconds after that, your application is back at nMB but the garbage collector’s fur is ruffled and it’s still trembling slightly from the shock.
The way to avoid this is to stream the objects through your analyzer one at a time rather than to create them all at once. Quino uses lazily-evaluated IEnumerable<T>
sequences throughout the data driver specifically to prevent memory bubbles.
IEnumerable<T>
sequences
Before tackling how the Quino ORM handles the Count()
, let’s look at how it would return the actual objects from this query.
SELECT
statementIEnumerable<T>
sequence that represents the result of the mapped query
Right, now we have an IEnumerable<T>
that represents the result set, but we haven’t lit the fuse on it yet.
How do we light the fuse? Well, the most common way to do so is to call ToList()
on it. What happens then?
- The IEnumerator<T> requests an element
- A row is read from the underlying IDataReader
- The driver creates a Person object from that row’s data
- The driver applies the local restriction to the Person and yields it if it matches
- The matching Person is added to the list
- Control returns to the IDataReader, which requests another row
, which requests another rowSince the decision to add all objects to a list occurs all the way at the very outer caller, it’s the caller that’s at fault for the memory bubble not the driver. [6] We’ll see in the section how to avoid creating a list when none is needed.
Using cursors to control evaluation
If we wanted to process data but perhaps offer the user a chance to abort processing at any time, we could even get an IDataCursor<T>
from the Quino ORM and control iteration ourselves.
using (var cursor = Session.CreateCursor(query))
{
foreach (var obj in cursor)
{
// Do something with obj
if (userAbortedOperation) { break; }
}
}
But back to evaluating the query above. The Quino ORM handles it like this:
- Execute the mappable portion of the query on the database, without a COUNT statement
- Stream the resulting objects and count the ones that match the local restriction
So, if a count-query cannot be fully mapped to the database, the most efficient possible alternative is to execute a query that retrieves as few objects as possible (i.e. maps as much to the server as it can) and streams those objects to count them locally.
Tune in next time for a look at how to exert more control with limit
and offset
and how those work together with partial mapping.
Use ExpressionOperator.StartsWithCI to perform the check in a case-insensitive manner instead.
The DelegateExpression simply wraps the lambda given in the constructor in a Quino expression object. The parameter c is an IExpressionContext that provides the target object, which is in this case a Person.
The restriction is applied to the LastName field.
Use Session.CreateCursor() to control evaluation yourself and create the right-sized batches of objects to count. The ChangeAndSave() extension method does exactly that to load objects in batches (size adjustable by an optional parameter) rather than one by one.
Published by marco on 24. Feb 2014 23:01:09 (GMT-5)
Updated by marco on 24. Feb 2014 23:13:04 (GMT-5)
I’ve been using CSS since pretty much its inception. It’s powerful but quite low-level and lacks support for DRY. So, I switched to generating CSS with LESS a while back. This has gone quite well and I’ve been pretty happy with it.
Recently, I was converting some older, theme stylesheets for earthli. A theme stylesheet provides no structural CSS, mostly setting text, background and border colors to let users choose the basic color set. This is a perfect candidate for LESS.
So I constructed a common stylesheet that referenced LESS variables that I would define in the theme stylesheet. Very basically, it looks like this:
@body_color: #800;
@import "theme-base";
body
{
background-color: @body_color;
}
This is just about the most basic use of LESS that even an amateur user could possibly imagine. I’m keeping it simple because I’d like to illustrate a subtlety to variables in LESS that tripped me up at first—but for which I’m very thankful. I’ll give you a hint: LESS treats variables as a stylesheet would, whereas SASS treats them as one would expect in a programming language.
Let’s expand the theme-base.less
file with some more default definitions. I’m going to define some other variables in terms of the body color so that themes don’t have to explicitly set all values. Instead, a theme can set a base value and let the base stylesheet calculate derived values. If a calculated value isn’t OK for a theme, the theme can set that value explicitly to override.
Let’s see an example before we continue.
@title_color: darken(@body_color, 25%);
@border_color: @title_color;
body
{
background-color: @body_color;
}
h2
{
color: @title_color;
border: 1px solid @border_color;
}
You’ll notice that I avoided setting a value for @body_color
because I didn’t want to override the value set previously in the theme. But then wouldn’t it be impossible for the theme to override the values for @title_color
and @border_color
? We seem to have a problem here. [1]
I want to be able to set some values and just use defaults for everything that I don’t want to override. There is a construct in SASS called !default
that does exactly this. It indicates that an assignment should only take place if the variable has not yet been assigned. [2] Searching around for an equivalent in LESS took me to this page, Add support for “default” variables (similar to !default in SASS) #1706 (GitHub). There users suggested various solutions and the original poster became ever more adamant—“Suffice it to say that we believe we need default variable setting as we’ve proposed here”—until a LESS developer waded in to state that it would be “a pointless feature in less”, which seemed harsh until an example showed that he was quite right.
The clue is further down in one of the answers:
“If users define overrides after then it works as if it had a default on it. [T]hat’s because even in the imported file it will take the last definition in the same way as css, even if defined after usage. (Emphasis added.)”
It was at this point that the lightbulb went on for me. I was thinking like a programmer where a file is processed top-down and variable values can vary depending on location in the source text. That the output of the following C# code is 12
should amaze no one.
var a = 1;
Console.Write(a);
a = 2;
Console.Write(a);
a = 3;
In fact, we would totally expect our IDE to indicate that the value in the final assignment is never used and can be removed. Using LESS variable semantics, though, where variables are global in scope [3] and assignments are treated as they are in CSS, we would get 33
as output. Why? Because the variable a
has the value 3 because that’s the last value assigned to it. That is, LESS has a cascading approach to variable assignment.
This is exactly as the developer from LESS said: stop fighting it and just let LESS do what it does best. Do you want default values? Define the defaults first, then define your override values. The overridden value will be used even when used for setting the value of another default value that you didn’t even override.
Now let’s go fix our stylesheet to use these terse semantics of LESS. Here’s a first cut at a setup that feels pretty right. I put the files in the order that you would read them so that you can see the overridden values and everything makes sense again. [4]
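// theme-variables.less (the defaults)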
@body_color: white;
@title_color: darken(@body_color, 25%);
@border_color: @title_color;
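// the theme stylesheet itself (one such file per theme; its name varies)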
@import "theme-variables";
@body_color: #800;
@import "theme-base";
body
{
background-color: @body_color;
}
h2
{
color: @title_color;
border: 1px solid @border_color;
}
You can see in the example above that the required variables are all declared, then overridden and then used. From what we learned above, we know that the value of @title_color
in the file theme-variables.less
will use a value of #800
for @body_color
because that was the last value it was assigned.
We can do better though. The example above hasn’t quite embraced the power of LESS fully. Let’s try again.
@body_color: white;
@title_color: darken(@body_color, 25%);
@border_color: @title_color;
body
{
background-color: @body_color;
}
h2
{
color: @title_color;
border: 1px solid @border_color;
}
@import "theme-base";
@body_color: #800;
Boom! That’s all you have to do. Set up everything in your base stylesheet file. Define all variables and define them in terms of each other in as convoluted a manner as you like. The final value of each value is determined before any CSS is generated.
This final version also has the added advantage that a syntax-checking IDE like JetBrains WebStorm or PHPStorm will be able to provide perfect assistance and validity checking. That wasn’t true at all for any of the previous versions, where variable declarations were in different files.
Although I was seriously considering moving away from LESS and over to SASS—because at least they didn’t leave out such a basic feature, as I had thought crossly to myself—I’m quite happy to have learned this lesson and am more happy with LESS than ever.
This is reminiscent of NAnt property tasks, where you can use the now-deprecated overwrite=“false” directive. For the curious, now you’re supposed to use unless=“${property::exists(‘property-name’)}” instead, which is just hideous.
Published by marco on 9. Feb 2014 23:08:59 (GMT-5)
The blog post/article So You Want To Write Your Own Language? by Walter Bright (Dr. Dobbs) contains a lot of interesting information, related not only to parsing, but also to runtime and framework design. Bright is well-known as the designer of the D programming language, so he’s definitely worth a read.
I thought he jumped back and forth between topics a bit, so I summarized the contents for myself below:
Bright identifies Minimizing keystrokes, easy parsing and minimizing the number of keywords as false gods. Do not waste any time trying to satisfy these requirements; instead, let them flow naturally from a good design.
Your language should consist of productions that have only a single non-terminal on the left-hand side. That is, strive to make your language context-free. [1] The implication is that you’re actually going to define the grammar rather than just winging it. This means that you can use a parser generator even though Bright says not to “bother wasting time with lexer or parser generators and other so-called ‘compiler compilers.’”
I instead agree with the article Advice on writing a programming language by Ted Kaminski (Generic Language), which advises providing a grammar that can be used with parser generators because “many of those people eager to contribute either get stuck trying and failing to build a parser or trying and failing to learn to use the daunting internals of your compiler”.
You can either make it easy for people to build compilers for your language or you can maintain a very friendly API for your own compiler. If you choose the API route, it might force you to be more disciplined, but it might also cause you no end of backwards-compatibility headaches as your compiler quickly evolves. Not only that, but you’d then have to make that API available for any number of languages and any number of platforms.
If you take the route of publishing the BNF, that may also not be enough. This is because it can still be daunting to convert a BNF to something that your compiler-generator can use, especially for non-trivial languages. Providing a grammar for a widely supported parser-generator like ANTLR [2] will give those willing to build tools for your language a good jump-start.
“Use an LR parser generator. It’ll keep your language parsable, and make it easier to change early on. When your language becomes popular enough that you have the problem of parsing errors not being friendly enough for your users, celebrate that fact and hand-roll only then.
“And then reap the benefit of all the tooling an easily parsed language gets, since you know you kept it LR(1).”
Introduce redundancy into the language definition (e.g. semicolons as line-terminators in addition to whitespace/newlines) in order to make error-message generation much easier and much more likely to produce friendly output.
Compilers can handle error messages in different ways:
In order to continue parsing/compiling after an error, the machine can take one of two approaches:
Do not re-invent the syntax for everything in your language. Instead, as Bright says, “[s]ave the divergence for features not generally seen before, which also signals the user that this is new.”
A language definition is nothing without a runtime. Bright recommends “taking the common sense approach and using an existing back end, such as the JVM, CLR, gcc, or LLVM. (Of course, I can always set you up with the glorious Digital Mars back end!)” If you can avoid writing your own back-end, you should definitely do so. Similar to the approach recommended for parsing the language: start with a stock runtime and migrate to something custom if the needs of your project warrant it (they almost certainly won’t). This is the approach taken by any number of other popular languages, like Scala.
And then there’s the library/framework that accompanies the language and, arguably, helps to define it for people. Complaints about a language are often complaints about the standard runtime library/framework for the language. Developers quickly associate them and treat them as one entity. Bright’s focus is on very low-level runtimes (such as the one for his language, D) and thus his advice focuses on fast I/O, fast and efficient memory allocation/de-allocation and robust/fast transcendental functions [3]. However, he also offers the following excellent rule of thumb for any framework:
“My general rule is if the explanation for what the function does is more lines than the implementation code, then the function is likely trivia and should be booted out.”
Published by marco on 7. Feb 2014 09:57:07 (GMT-5)
The following article was originally published on the Encodo blogs and is cross-published here.
One of the most-used components of Quino is the ORM. An ORM is an Object-Relational Mapper, which accepts queries and returns data.
This all sounds a bit abstract, so let’s start with a concrete example. Let’s say that we have millions of records in an employee database. We’d like to get some information about that data using our ORM. With millions of records, we have to be a bit careful about how that data is retrieved, but let’s continue with concrete examples.
The following example returns the correct information, but does not satisfy performance or scalability requirements. [1]
var people = Session.GetList<Person>().Where(p => p.Company.Name == "IBM"); // [2]
Assert.That(people.Count(), Is.GreaterThanOrEqualTo(140000));
What’s wrong with the statement above? Since the call to Where
occurs after the call to GetList<Person>()
, the restriction cannot possibly have been passed on to the ORM.
The first line of code doesn’t actually execute anything. It’s in the call to Count()
that the ORM and LINQ are called into action. Here’s what happens, though:
- For each row in the Person table, create a Person object
- For each Person, retrieve the associated Company object
- Check whether the Name
of the person’s company is equal to “IBM”.
The code above benefits from almost no optimization, instantiating a tremendous number of objects in order to yield a scalar result. The only side-effect that can be considered an optimization is that most of the related Company
objects will be retrieved from cache rather than from the database. So that’s a plus.
Still, the garbage collector is going to be running pretty hot and the database is going to see far more queries than necessary. [3]
Let’s try again, using Quino’s fluent querying API. [4] The Quino ORM can map much of this API to SQL. Anything that is mapped to the database is not performed locally and is, by definition, more efficient. [5]
var people = Session.GetList<Person>();
people.Query.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM"); [6]
Assert.That(people.Count, Is.GreaterThanOrEqualTo(140000));
First, we get a list of people from the Session
. As of the first line, we haven’t actually gotten any data into memory yet—we’ve only created a container for results of a certain type (Person
in this case).
The default query for the list we created is to retrieve everything without restriction, as we saw in the first example. In this example, though, we restrict the Query
to only the people that work for a company called “IBM”. At this point, we still haven’t called the database.
The final line is the first point at which data is requested, so that’s where the database is called. We ask the list for the number of entries that match it and it returns an impressive number of employees.
At this point, things look pretty good. In older versions of Quino, this code would already have been sufficiently optimized. It results in a single call to the database that returns a single scalar value with everything calculated on the database. Perfect.
However, since v1.6.0 of Quino [7], the call to the property IDataList.Count
has automatically populated the list with all matching objects as well. We made this change because the following code pattern was pretty common:
var list = Session.GetList<Person>();
// Adjust query here
if (list.Count > 0)
{
// do something with all of the objects here
}
That kind of code resulted in not one, but two calls to the database, which was killing performance, especially in high-latency environments.
That means, however, that the previous example is still going to pull 140,000 objects into memory, all just to count them and add them to a list that we’re going to ignore. The garbage collector isn’t a white-hot glowing mess anymore, but it’s still throwing you a look of disapproval.
Since we know that we don’t want the objects in this case, we can get the old behavior back by making the following adjustment.
var people = Session.GetList<Person>();
people.Query.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
Assert.That(Session.GetCount(people.Query), Is.GreaterThanOrEqualTo(140000));
It would be even clearer to just forget about creating a list at all and work only with the query instead.
var query = Session.GetQuery<Person>();
query.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
Assert.That(Session.GetCount(query), Is.GreaterThanOrEqualTo(140000));
Now that’s a maximally efficient request for a number of people in Quino 1.10 as well.
Tune in next time for a look at what happens when a query can only be partially mapped to the database.
The Person class used here is generated from the application metadata rather than written by the developer, as in other frameworks.
There are different strategies for retrieving associated data. Quino does not yet support retrieving anything other than root objects. That is, the associated Company
object is not retrieved in the same query as the Person
object.
In the example in question, the first indication that the ORM has that a Company
is required is when the lambda retrieves them individually. Even if the original query had somehow indicated that the Company
objects were also desired (e.g. using something like Include(Person.Relations.Company)
as you would in EF), the most optimal mapping strategy is still not clear.
Should the mapper join the company table and retrieve that highly redundant data with each person? Or should it execute a single query for all companies and prime a cache with those? The right answer depends on the latency and bandwidth between the ORM and the database as well as myriad other conditions. When dealing with a lot of data, it’s not hard to find examples where the default behavior of even a clever ORM isn’t maximally efficient—or even very efficient at all.
As we already noted, though, the example in question does everything in memory. If we reasonably assume that the people belong to a relatively small number of companies—say qc—then the millions of calls to retrieve companies associated with people will result in a lot of cache hits and generate “only” qc + 1 queries.
The Person.Relations and Person.Fields static fields are generated with the Person class. These correspond to the application metadata and change when the metadata changes. Developers are encouraged to use these generated constants so that even metadata-based queries can be validated by the compiler.
Published by marco on 1. Feb 2014 17:09:41 (GMT-5)
A while back, I participated in an evaluation of languages that could replace JavaScript for our web front-end development language at Encodo. We took a look at two contenders: Dart and TypeScript. At the time, Dart was weaker for the following reasons:
Though TypeScript has its weaknesses (it has technically not yet hit a 1.0 release), we eventually decided to go in that direction. Tool support in both Visual Studio and ReSharper is improving steadily and has gotten quite good. We’ve had quite positive results in one larger project.
Even with Dart in our wake, I am still curious to see how people are using it. I was surprised by the claims in the article Why Dart should learn JSON while it’s still young by Max Horstmann.
Since Dart is not directly compatible with JavaScript, as TypeScript is, a given JSON-formatted string cannot simply be interpreted as a native object. Instead, you import it using a library function. This is not really a problem, though one wonders whether there are performance penalties for Dart that are not present in JavaScript/TypeScript.
Where the problem arises is in exporting JSON, which does not happen automagically. In non–client-side languages like C#, NewtonSoft’s JSON.Net library can serialize pretty much anything using reflection. JSON isn’t baked into the language, but that isn’t too surprising. However, in Dart, positioned as a contender for taking over from JavaScript as the client-side language of choice, the solution recommended even by Dart language gurus is to implement toJson()
on all objects that you want to export.
Either that, or use a probably non-optimized external library to serialize your object to JSON (likely using introspection, as JSON.Net does). I agree with the author of the blog that this is a red flag for using Dart in production projects. It’s strange that Dart doesn’t produce JSON without relying on external libraries. And the recommended library is, as of this writing, of pre-production/alpha quality—the version number is 0.1.0 and the TODO list includes a bullet point that exhorts the author to “Write tests!”.
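For contrast, here is a minimal sketch of the reflection-based approach in C# with Json.NET; the Invoice type and its values are invented for the example:
using Newtonsoft.Json;

public class Invoice
{
    public int Number { get; set; }
    public string Customer { get; set; }
}

public static class JsonDemo
{
    public static void Main()
    {
        var invoice = new Invoice { Number = 42, Customer = "Acme" };

        // No hand-written toJson() needed: Json.NET discovers the public
        // properties via reflection and produces {"Number":42,"Customer":"Acme"}.
        string json = JsonConvert.SerializeObject(invoice);

        // Deserialization is just as direct.
        var roundTripped = JsonConvert.DeserializeObject<Invoice>(json);
    }
}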
So I’m still waiting to see what becomes of Dart, but the balkiness of the current solution for generating JSON not only makes it a bit of a tough fit for many current web applications, but also makes us urge caution despite its having recently been released (1.0 came out in November 2013).
Published by marco on 1. Feb 2014 12:38:27 (GMT-5)
I have never really examined Ruby in detail, but it seems to be even more of a treasure-trove of ad-hoc features than PHP.
It takes full advantage of being evaluated at run-time to offer features that I haven’t seen in even other dynamic languages. Some of these features seem like they might be nice shortcuts but also seem like they would be difficult to optimize. Not only that, but they seem so obscure that they would likely trip up even more seasoned users of the language.
At any rate, the one I found to be most brash was methods in class definitions by bjeanes (StackOverflow). (The article is a treasure trove of other gems, no pun intended.)
The example below shows how that might work.
class RandomSubclass < [Array, Hash, String, Fixnum, Float, TrueClass].sample
end
RandomSubclass.superclass # could output one of 6 different classes.
The language allows you to call methods from the “extends” clause. The example above creates an array of classes, then calls the sample method on it to yield a base class. The actual base class is not only unknown at compile time, it is also unpredictable at runtime.
The example above is contrived and makes the feature seem like it’s only for the reckless. It’s clear that serious software would have to forbid or strictly limit the use of such a feature, but I can see where it would be useful.
For example, you may want to change your base class depending on deployment parameters. If you’re deploying to a testing or staging environment, you’ll use a base class that includes more logging, profiling and debugging code. For production, you switch to a base class that’s optimized. If the class interface remains the same, then using this feature wouldn’t be as dangerous as it initially appeared.
Still, ensuring quality and enforcing architecture in software written in such a language would require a strict development process and discipline and vigilance from all involved.
Published by marco on 5. Jan 2014 11:46:53 (GMT-5)
It’s well-known that Apple runs a walled garden. Apple makes its developers pay a yearly fee to get access to that garden. In fairness, though, they do provide some seriously nice-looking APIs for their iOS and OS X platforms. They’ve been doing this for years, as listed in the post iOS 7 only is the only sane thing to do by Tal Bereznitskey. It argues that the new stuff in iOS 7 is compelling enough to make developers consider dropping support for all older operating systems, and for pragmatic reasons: having far less of your own code to support correspondingly makes the product cheaper to maintain. It’s best to check your actual target market, but Apple users tend to upgrade very quickly and reliably, so an iOS 7-only strategy is a good option.
Among the improvements that Apple has brought in the recent past are blocks (lambdas), GCD (asynchronous execution management) and ARC (mostly automated memory management), all introduced in iOS 4 and OS X 10.6 Snow Leopard. OS X 10.9 Mavericks and iOS 7 introduced a slew of common UI improvements (e.g. AutoLayout and HTML strings for labels). [1]
To find the videos listed below, browse to WWDC 2013 Development Videos.
For the web, Apple has improved developer tools and support in Safari considerably. There are two pretty good videos demonstrating a lot of these improvements:
For non-web development, Apple has been steadily introducing libraries to provide support for common application tasks, the most interesting of which are related to UI APIs like Core Image, Core Video, Core Animation, etc.
Building on top of these, Apple presents the Sprite Kit—for building 2D animated user interfaces and games—and the Scene Kit—for building 3D animated user interfaces and games. There are some good videos demonstrating these APIs as well.
Published by marco on 29. Dec 2013 23:09:53 (GMT-5)
I recently stumbled upon some Essays from the funniest man in Microsoft Research by Raymond (Old New Thing). The essayist, James Mickens, is such a funny writer that this article will, against convention, consist mostly of citations rather than the even mix of citations and paraphrasing that I naturally consider to be much more lucid and pithy. I quote at length to do the material justice, for documentation and to ensure that you all download the PDFs to see if there is more where that came from (there is). All emphases have been added.
On the delusions of the mobile-computing world:
“Mobile computing researchers are a special kind of menace. They don’t smuggle rockets to Hezbollah, or clone baby seals and then make them work in sweatshops for pennies a day. That’s not the problem with mobile computing people. The problem with mobile computing people is that they have no shame. They write research papers with titles like “Crowdsourced Geolocation-based Energy Profiling for Mobile Devices,” as if the most urgent deficiency of smartphones is an insufficient composition of buzzwords.”
On browsing web pages:
“When I use a mobile browser to load a web page, I literally have no expectation that anything will ever happen. A successful page load is so unlikely, so unimaginable, that mobile browsers effectively exist outside of causality—the browser is completely divorced from all action verbs, and can only be associated with sad, falling-tone sentences like “I had to give up after twenty seconds.” ”
On the fragility of touchscreens:
“Note that, when I say that you will “drop” your touchscreen, I do not mean “drop” in the layperson sense of “to release from a non-trivial height onto a hard surface.” I mean “drop” in the sense of “to place your touchscreen on any surface that isn’t composed of angel feathers and the dreams of earnest schoolchildren.” Phones and tablets apparently require Planck-scale mechanical alignments, such that merely looking at the touchscreen introduces fundamental, quantum dynamical changes in the touchscreen’s dilithium crystals. Thus, if you place your touchscreen on anything, ever, you have made a severe and irreversible life mistake.”
On the sheer touchiness of touchscreens:
“On your touchscreen, your swipes will become pinches, and your pinches will become scrolls, and each one of your scrolls will become a complex thing never before seen on this earth, a leviathan meta-touch event of such breadth and complexity that your phone can only respond like Carrie White at the prom. So, your phone just starts doing stuff, all the stuff that it knows how to do, and it’s just going nuts, and your apps are closing and opening and talking to the cloud and configuring themselves in unnatural ways, and your phone starts vibrating and rumbling with its little rumble pack, and it will gently sing like a tiny hummingbird of hate, and you’ll look at the touchscreen, and you’ll see that things are happening, my god, there are so many happenings, and you’ll try to flip the phone over and take out the battery, because now you just want to kill it and move to Kansas and start over, […]”
On the uselessness of most mobile computing:
“When you purchase a mobile device, you are basically saying, “I endorse the operational inefficiency of the modern bourgeoisie lifestyle, even though I could find a rock and tie a coat hanger around it and have a better chance of having a phone conversation that doesn’t sound like two monsters arguing about German poetry.””
On flying in the early 21st century:
“The point is that flying in airplanes used to be fun, but now it resembles a dystopian bin-packing problem in which humans, carry-on luggage, and five dollar peanut bags compete for real estate while crying children materialize from the ether and make obscure demands in unintelligible, Wookie-like languages while you fantasize about who you won’t be helping when the oxygen masks descend.”
On how awesome it was being a hardware architect before things got all quantum and messy:
“Of course, pride precedes the fall, and at some point, you realize that to implement aggressive out-of-order execution, you need to fit more transistors into the same die size, but then a material science guy pops out of a birthday cake and says YEAH WE CAN DO THAT, and by now, you’re touring with Aerosmith and throwing Matisse paintings from hotel room windows, because when you order two Matisse paintings from room service and you get three, that equation is going to be balanced. It all goes so well, and the party keeps getting better. When you retire in 2003, your face is wrinkled from all of the smiles, and even though you’ve been sued by several pedestrians who suddenly acquired rare paintings as hats, you go out on top, the master of your domain. ”
On quantum-level effects in modern processors:
“They randomly switched states; they leaked voltage; they fell prey to the seductive whims of cosmic rays that, unlike the cosmic rays in comic books, did not turn you into a superhero, but instead made your transistors unreliable and shiftless, like a surly teenager who is told to clean his room and who will occasionally just spray his bed with Lysol and declare victory.”
On scaling in cores when processor speed and more transistors became too messy:
“John did what any reasonable person would do: he cloaked himself in a wall of denial and acted like nothing had happened. “Making processors faster is increasingly difficult,” John thought, “but maybe people won’t notice if I give them more processors.” This, of course, was a variant of the notorious Zubotov Gambit, named after the Soviet-era car manufacturer who abandoned its attempts to make its cars not explode, and instead offered customers two Zubotovs for the price of one […]”
On the main purpose that people have for their computers:
“Lay people use their computers for precisely ten things, none of which involve massive computational parallelism, and seven of which involve procuring a vast menagerie of pornographic data and then curating that data using a variety of fairly obvious management techniques, like the creation of a folder called “Work Stuff,” which contains an inner folder called “More Work Stuff,” where “More Work Stuff” contains a series of ostensible documentaries that describe the economic interactions between people who don’t have enough money to pay for pizza and people who aren’t too bothered by that fact. ”
A summary of the state of the world of hardware design and development:
“[…] you brought the fire down from Olympus, and the mortals do with it what they will. But now, all the easy giants were dead, and John was left to fight the ghosts that Schrödinger had left behind.”
What it’s like to be a systems (low-level) programmer:
“A systems programmer will know what to do when society breaks down, because the systems programmer already lives in a world without law.”
On why people still use C++ (or a response to the snotty question of: “why don’t you just use high-level language X instead?”)
“Why not use a modern language with garbage collection and functional programming and free massages after lunch? Here’s the answer: Pointers are real. They’re what the hardware understands. Somebody has to deal with them. You can’t just place a LISP book on top of an x86 chip and hope that the hardware learns about lambda calculus by osmosis. […] Pointers are like […] real, living things that must be dealt with so that polite society can exist. Make no mistake, I don’t want to write systems software in a language like C++. […] When it’s 3 A.M., and you’ve been debugging for 12 hours, and you encounter a virtual static friend protected volatile templated function pointer, you want to […] find the people who wrote the C++ standard and bring ruin to the things that they love.”
On being thankful for systems programmers:
“That being said, if you find yourself drinking a martini and writing programs in garbage-collected, object-oriented Esperanto, be aware that the only reason that the Esperanto runtime works is because there are systems people who have exchanged any hope of losing their virginity for the exciting opportunity to think about hex numbers and their relationships with the operating system, the hardware, and ancient blood rituals that Bjarne Stroustrup performed at Stonehenge.”
On how difficult it is to work in extremely fragile territory (rather than a safe runtime):
“Indeed, I would [have…checked the log files for errors] if I hadn’t broken every component that a logging system needs to log data. I have a network file system, and I have broken the network, and I have broken the file system, and my machines crash when I make eye contact with them. I HAVE NO TOOLS BECAUSE I’VE DESTROYED MY TOOLS WITH MY TOOLS.”
A backhanded swipe at the utter uselessness of many UI concerns:
“I’m glad that people are working on new kinds of bouncing icons because they believe that humanity has solved cancer and homelessness and now lives in a consequence-free world of immersive sprites.”
Published by marco on 7. Nov 2013 20:56:06 (GMT-5)
On Codecademy, you can learn to program in various languages. It starts off very slowly and is targeted at non-technical users. That’s their claim anyway—the material in the courses I looked at ramps up pretty quickly.
Anyway, the interesting thing I saw was in their introductory test. It struck me as a subtle way to get you to enter your email address. I’d just recently discussed this on a project I’m working on: how can we make it fun for the user to enter personal information? The goal is not to sell that information (not yet anyway, but who knows what the future holds), but to be able to enhance—nay, personalize—the service.
Personalizing has a bad reputation but can be very beneficial. For example, if you’re using a site for free and you’re going to see offers and advertisements anyway, isn’t it better to enter a bit of data that will increase the likelihood that those offers and ads are interesting? Each person can—and should—decide for themselves what to make public, but the answer isn’t always necessarily no.
Here they teach you how to use the “length” method by measuring your email address. Sneaky. I like it.
Even if you don’t give them an address, they re-prompt you to enter your email, but it doesn’t come across as pushy because you’re taking a test.
I thought that this was pretty subtle. Because of the context, people who would ordinarily be sensitive to giving up their email might not even notice. Why? Because they want to answer the question *correctly*. They don’t want the site to judge them for having entered something wrong, so they do as they’re told.
Is Codecademy collecting emails this way? I have no way to be sure, but they’d be silly not to.
Published by marco on 3. Nov 2013 11:17:36 (GMT-5)
The following article was originally published on the Encodo blogs and is cross-published here.
The following article outlines a solution to what may end up being a temporary problem. The conditions are very specific: no server-side logic; HTTP authentication; AppCache as it is implemented by the target platforms—Safari Mobile and Google Chrome—in late 2012/early 2013. The solution is not perfect but it’s workable. We’re sharing it here in the hope that it can help someone else or serve as a base for a better solution.
The application cache is a relatively new HTML5 feature. Web applications can use it to store local content, but different browsers apply different restrictions to the amount of space allocated per domain.
In particular, the Safari Mobile browser cannot update the application cache for files for which it must obtain authentication.
The graphic below illustrates the mechanism by which a content package in a web application can manage content updates and present them to the user.
In order to address the problems described above, the UA products use a separate version file to check for updates independent of the browser’s application-cache mechanism and to trigger this update only when authentication has been reestablished.
This approach worked relatively well for us, although we continue to refine it based on feedback and experience.
Published by marco on 21. Oct 2013 22:56:04 (GMT-5)
Updated by marco on 12. Jun 2018 20:06:15 (GMT-5)
Microsoft just recently released Visual Studio 2013, which includes Entity Framework 6 and introduces a lot of new features. It reminded me of the following query that EF generated for me, way, way back when it was still version 3.5. Here’s hoping that they’ve taken care of this problem since then.
So, the other day EF (v3.5) seemed to be taking quite a while to execute a query on SQL Server. This was a pretty central query and involved a few joins and restrictions, but wasn’t anything too wild. All of the restrictions and joins were on numeric fields backed by indexes.
In these cases, it’s always best to just fire up the profiler and see what kind of SQL is being generated by EF. It was a pretty scary thing (I’ve lost it unfortunately), but I did manage to take a screenshot of the query plan, shown below.
It doesn’t look too bad until you notice that the inset on the bottom right (the black smudgy thing) is a representation of the entire query … and that it just kept going on down the page.
Published by marco on 15. Jul 2013 01:38:50 (GMT-5)
The helpful page, Ignoring files (GitHub), taught me something I didn’t know: there’s a file you can use to ignore files in your local Git repository without changing anyone else’s repository.
Just to recap, here are the ways to ignore a file:
.gitignore (global): you can designate basic exclusion directives that apply to all repositories on your system. This file is not committed to any repository or shared with others. Execute git config --global core.excludesfile ~/.gitignore_global to set the file to ~/.gitignore_global (for example). See the linked article for sample directives.
.git/info/exclude: add directives to this file in any repository. These directives are combined with any system-global directives to form the base exclusions for that repository. This file is not committed with the repository. This is the one I’d never heard of before.
.gitignore: add a file with this name to any directory. The directives in that file are merged with those from the parent directory to define the patterns that are excluded in that directory and all child directories. This is definitely the most common way to exclude files.
git update-index --assume-unchanged path/to/file.txt: tell Git to ignore local changes to a single file. While this can be useful for legacy projects, it’s best to structure new projects so developers don’t have to rely on easily forgotten tricks like this.
Published by marco on 14. Jul 2013 23:17:53 (GMT-5)
I’ve been using CSS since its inception and use many parts of the CSS3 specification for both personal work and work I do for Encodo. Recently, I read about some length units I’d never heard of in the article CSS viewport units: vw, vh, vmin and vmax by Chris Mills (Dev.Opera).
1vw: 1% of viewport width
1vh: 1% of viewport height
1vmin: 1vw or 1vh, whichever is smaller
1vmax: 1vw or 1vh, whichever is larger
These should be eminently useful for responsive designs. While there is wide support for these new units, that support is only available in the absolute latest versions of browsers. See the article for a good example of how these can be used.
While the ones covered in the article are actually new, there are others that have existed for a while but that I’ve never had occasion to use. The Font-relative lengths: the ‘em’, ‘ex’, ‘ch’, ‘rem’ units (CSS Values and Units Module Level 3) section lists the following units:
em: This one is well-known: 1em is equal to the “computed value of the ‘font-size’ property of the element on which it is used.”
ex: Equal to the height of the letter ‘x’ in the font of the element on which it is used. This is useful when you want to size a container based on the height of a lower-case letter—i.e. tighter—rather than on the full size of the font (as you get with em).
ch: “Equal to the advance measure of the “0” (ZERO, U+0030) glyph found in the font used to render it.” Since all digits in a font should be the same width, this unit is probably useful for pages that need to measure and render numbers in a reliable vertical alignment.
rem: The same as em but always returns the value for the root element of the page rather than the current element. Elements that use this unit will all scale against a common size, independently of the font-size of their contents. The article There’s more to the CSS rem unit than font sizing by Roman Rudenko (CSS-Tricks) has a lot more information and examples, as well as an explanation of how rem can stand in for the still nascent support for vw.
Published by marco on 29. Jun 2013 17:00:15 (GMT-5)
Updated by marco on 29. Jun 2013 17:04:02 (GMT-5)
The following article was originally published on the Encodo blogs and is cross-published here.
The article Announcing the .NET Framework 4.5.1 Preview provides an incredible amount of detail about a relatively exciting list of improvements for .NET developers.
First and foremost, the Edit-and-Continue feature is now available for x64 builds as well as x86 builds. Whereas an appropriate cynical reaction is that “it’s about damn time they got that done”, another appropriate reaction is to just be happy that they will finally support x64-debugging as a first-class feature in Visual Studio 2013.
Now that they have feature-parity for all build types, they can move on to other issues in the debugger (see the list of suggestions at the end).
We haven’t had much opportunity to experience the drawbacks of the current debugger vis-à-vis asynchronous debugging, but the experience outlined in the call-stack screenshot below is one that is familiar to anyone who’s done multi-threaded (or multi-fiber, etc.) programming.
Instead of showing the actual stack location in the thread within which the asynchronous operation is being executed, the new and improved version of the debugger shows a higher-level interpretation that places the current execution point within the context of the async operation. This is much more in keeping with the philosophy of the async/await feature in .NET 4.5, which lets developers write asynchronous code in what appears to be a serial fashion. This improved readability has been carried over to the debugger now, as well.
The VS2013 debugger can now show the “direct return values and the values of embedded methods (the arguments)” for the current line. [1] Instead of manually selecting the text segment and using the Quick Watch window, you can now just see the chain of values in the “Autos” debugger pane.
“We are also releasing an update in Visual Studio 2013 Preview to provide better support for apps that indirectly depend on multiple versions of a single NuGet package. You can think of this as sane NuGet library versioning for desktop apps.”
We’ve been bitten by the afore-mentioned issue and are hopeful that the solution in Visual Studio 2013 will fill the gaps in the current release. The article describes several other improvements to the Nuget services, including integration with Windows Update for large-scale deployment. They also mentioned “a curated list of Microsoft .NET Framework NuGet Packages to help you discover these releases, published in OData format on the NuGet site”, but don’t mention whether the Nuget UI in VS2013 has been improved. The current UI, while not as awful and slow as initial versions, is still not very good for discovery and is quite clumsy for installation and maintenance.
You’re not limited to just waiting on the sidelines to see which features Microsoft decides to implement in the latest version of .NET/Visual Studio. You should head over to the User Voice for Visual Studio site to get an account and vote for the issues you’d like them to work on next.
Here’s a list of the ones I found interesting, and some of which I’ve voted on.
Published by marco on 8. Jun 2013 09:43:11 (GMT-5)
Updated by marco on 9. Jun 2013 09:38:28 (GMT-5)
The following article was originally published on the Encodo blogs and is cross-published here.
Many improvements have been made to Microsoft’s Entity Framework (EF) since I last used it in production code. In fact, we’d last used it waaaaaay back in 2008 and 2009 when EF had just been released. Instead of EF, I’ve been using the Quino ORM whenever I can.
However, I’ve recently started working on a project where EF5 is used (EF6 is in the late stages of release, but is not yet generally available for production use). Though I’d been following the latest EF developments via the ADO.Net blog, I finally had a good excuse to become more familiar with the latest version through some hands-on experience.
Entity Framework: Be Prepared was the first article I wrote about working with EF. It’s quite long and documents the pain of using a 1.0 product from Microsoft. That version supported only a database-first approach, the designer was slow and the SQL generated by the mapper was quite primitive. Most of the tips and advice in the linked article, while perhaps amusing, are no longer necessary (especially if you’re using the Code-first approach, which is highly recommended).
Our next update, The Dark Side of Entity Framework: Mapping Enumerated Associations, discusses a very specific issue related to mapping enumerated types in an entity model (something that Quino does very well). This shortcoming in EF has also been addressed but I haven’t had a chance to test it yet.
Our final article was on performance, Pre-generating Entity Framework (EF) Views, which, while still pertinent, no longer needs to be done manually (there’s an Entity Framework Power Tools extension for that now).
So let’s just assume that that was the old EF; what’s the latest and greatest version like?
Well, as you may have suspected, you’re not going to get an article about Code-first or database migrations. [1] While a lot of things have been fixed and streamlined to be not only much more intuitive but also work much more smoothly, there are still a few operations that aren’t so intuitive (or that aren’t supported by EF yet).
One such operation is deleting multiple objects in the database. It’s not that it’s not possible; it’s that the only solution that immediately presents itself is to load the matching objects into memory, remove them from the context one at a time and then save the changes.
The following code illustrates this pattern for a hypothetical list of users.
var users = context.Users.Where(u => u.Name == "John");
foreach (var u in users)
{
context.Users.Remove(u);
}
context.SaveChanges();
This seems somewhat roundabout and quite inefficient. [2]
While the method above is fine for deleting a small number of objects—and is quite useful when removing different types of objects from various collections—it’s not very useful for a large number of objects. Retrieving objects into memory only to delete them is neither intuitive nor logical.
The question is: is there a way to tell EF to delete objects based on a query from the database?
I found an example attached as an answer to the post Simple delete query using EF Code First (Stack Overflow). The gist of it is shown below.
context.Database.SqlQuery<User>(
"DELETE FROM Users WHERE Name = @name",
new [] { new SqlParameter("@name", "John") }
);
To be clear right from the start, using raw SQL strings like this is already sub-optimal because the identifiers are not statically checked. This query will cause a run-time error if the model changes so that the “Users” table no longer exists or the “Name” column no longer exists or is no longer a string.
Since I hadn’t found anything else more promising, though, I continued with this approach, aware that it might not be usable as a pattern because of the compile-time trade-off.
Although the answer had four up-votes, it is not clear that either the author or any of his fans have actually tried to execute the code. The code above returns an IEnumerable<User>
but doesn’t actually do anything.
After I’d realized this, I went to MSDN for more information on the SqlQuery
method. The documentation is not encouraging for our purposes (still trying to delete objects without first loading them), as it describes the method as follows (emphasis added),
“Creates a raw SQL query that will return elements of the given generic type. The type can be any type that has properties that match the names of the columns returned from the query, or can be a simple primitive type.”
This does not bode well for deleting objects using this method. Creating an enumerable does very little, though. In order to actually execute the query, you have to evaluate it.
Die Hoffnung stirbt zuletzt (“hope dies last”) [3], as we like to say on this side of the pond, so I tried evaluating the enumerable. A foreach
should do the trick.
var users = context.Database.SqlQuery<User>(
"DELETE FROM Users WHERE Name = @name",
new [] { new SqlParameter("@name", "John") }
);
foreach (var u in users)
{
// NOP?
}
As indicated by the “NOP?” comment, it’s unclear what one should actually do in this loop because the query already includes the command to delete the selected objects.
Our hopes are finally extinguished with the following error message:
That this approach does not work is actually a relief because it would have been far too obtuse and confusing to use in production.
It turns out that the SqlQuery
only works with SELECT
statements, as was strongly implied by the documentation.
var users = context.Database.SqlQuery<User>(
"SELECT * FROM Users WHERE Name = @name",
new [] { new SqlParameter("@name", "John") }
);
Once we’ve converted to this syntax, though, we can just use the much clearer and compile-time–checked version that we started with, repeated below.
var users = context.Users.Where(u => u.Name == "John");
foreach (var u in users)
{
context.Users.Remove(u);
}
context.SaveChanges();
So we’re back where we started, but perhaps a little wiser for having tried.
As a final footnote, I just want to point out how you would perform multiple deletes with the Quino ORM. It’s quite simple, really. Any query that you can use to select objects you can also use to delete objects [4].
So, how would I execute the query above in Quino?
Session.Delete(Session.CreateQuery<User>().WhereEquals(User.MetaProperties.Name, "John").Query);
To make it a little clearer instead of showing off with a one-liner:
var query = Session.CreateQuery<User>();
query.WhereEquals(User.MetaProperties.Name, "John");
Session.Delete(query);
Quino doesn’t support using Linq to create queries, but its query API is still far more statically checked than a raw SQL string. You can see how the query could easily be extended to restrict on much more complex conditions, even including fields on joined tables.
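For example, here is a sketch of such an extension, combining the join syntax from the Quino example at the top of this article with Session.Delete; the User-to-Company relation and its metadata constants are invented for illustration:
var query = Session.CreateQuery<User>();
query.WhereEquals(User.MetaProperties.Name, "John");
// Restrict on a field of a joined table as well (hypothetical relation and fields):
query.Join(User.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
Session.Delete(query);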
Note that you should not call context.SaveChanges() inside the foreach-loop. Doing so is wasteful and does not give EF an opportunity to optimize the delete calls into a single SQL statement (see footnote below).
With the following caveats, which generally apply to all queries with any ORM: there may be logical differences between DELETE vs. SELECT operations; DELETE operations may perform differently on some database back-ends; and a given construct may be less well-exercised in a DELETE operation than in a SELECT operation simply because that particular combination has never come up before. Some combination of these reasons possibly accounts for EF’s lack of support for batch deletes.
Published by marco on 5. May 2013 21:36:02 (GMT-5)
I was recently asked a question about merge conflicts in source-control systems.
“[…] there keep being issues of files being over written, changes backed out etc. from people coding in the same file from different teams.”
My response was as follows:
tl;dr: The way to prevent this is to keep people who have no idea what they’re doing from merging files.
Let’s talk about bad merges happening accidentally. Any source-control worth its salt will support at least some form of automatic merging.
An automatic merge is generally not a problem because the system will not automatically merge when there are conflicts (i.e. simultaneous edits of the same lines, or edits that are “close” to one another in the base file).
An automatic merge can, however, introduce semantic issues.
For example, if both sides declared a method with the same name, but in different places in the same file, an automatic merge will include both copies but the resulting file won’t compile (because the same method was declared twice).
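A contrived sketch of what that looks like (the class and method names are invented):
// Branch one added the first Validate(); branch two added the second. The
// automatic merge happily includes both, and the file no longer compiles
// (CS0111: a member with the same signature is already declared).
public class OrderService
{
    public void Validate() { /* added on branch one */ }

    public void Validate() { /* added on branch two */ }
}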
Or, another example is as follows:
// The base version of the method:
public void A(B b)
{
  var a = new A();
  b.Do(a);
  b.Do(a);
  b.Do(a);
}
// One developer appends a call to a.Do():
public void A(B b)
{
  var a = new A();
  b.Do(a);
  b.Do(a);
  b.Do(a);
  a.Do();
}
// The other developer changes the initialization so that a is null:
public void A(B b)
{
  A a = null;
  b.Do(a);
  b.Do(a);
  b.Do(a);
}
// The automatically merged result compiles but calls a.Do() on a null reference:
public void A(B b)
{
  A a = null;
  b.Do(a);
  b.Do(a);
  b.Do(a);
  a.Do();
}
The automatically merged result will compile, but it will crash at run-time. Some tools (like ReSharper) will display a warning when the merged file is opened, showing that a method is being called on a provably null variable. However, if the file is never opened or the warning ignored or overlooked, the program will crash when run.
In my experience, though, this kind of automatic-merge “error” doesn’t happen very often. Code-organization techniques like putting each type in its own file and keeping method bodies relatively compact go a long way toward preventing such conflicts. They help to drastically reduce the likelihood that two developers will be working in the same area of a file.
With these relatively rare automatic-merge errors taken care of, let’s move on to errors introduced deliberately through maliciousness or stupidity. This kind of error is also very rare, in my experience, but I work with very good people.
“Let’s say we have two teams:
Team One − branch one
> Works on file 1
Team Two − branch two
> Works on file 1
Team One promotes file 1 into the Master B branch, there are some conflicts that they are working out but the file is promoted.”
I originally answered that I wasn’t sure what it meant to “promote” a file while still working on it. How can a file be committed or checked in without having resolved all of the conflicts?
As it turns out, it can’t. As documented in TFS Server 2012 and Promoting changes (Stack Overflow), promotion simply means telling TFS to pick up local changes and add them to the list of “Pending Changes”. This is part of a new TFS2012 feature called “Local Workspaces”. A promoted change corresponds to having added a file to a change list in Perforce or having staged a file in Git.
The net effect, though, is that the change is purely local. That it has been promoted has nothing to do with merging or committing to the shared repository. Other users cannot see your promoted changes. When you pull down new changes from the server, conflicts with local “promoted” changes will be indicated as usual, even if TFS has already indicated conflicts between a previous change and another promoted, uncommitted version of the same file. Any other behavior would be madness. [1]
“Team Two checks in their file 1 into the Master B branch. They back out the changes that Team One made without telling anyone anything.”
There’s your problem. This should never happen unless Team Two has truly determined that their changes have replaced all of the work that Team One did or otherwise made it obsolete. If people don’t know how to deal with merges, then they should not be merging.
Just as Stevie Wonder’s not allowed behind the wheel of a car, neither should some developers be allowed to deal with merge conflicts. In my opinion, though, any developer who can’t deal with merges in code that he or she is working on should be moved to another team or, possibly, job. You have to know your own code and you have to know your tools. [2]
“Team One figures out the conflicts in their branch and re-promotes file one (and other files) to Master B branch. The source control system remembers that file 1 was backed out by Team Two so it doesn’t promote file 1 but doesn’t let the user know.”
This sounds insane. When a file is promoted—i.e. added to the pending changes—it is assumed that the current version is added to the pending changes, akin to staging a file in Git. When further changes are made to the file locally, the source-control system should indicate that it has changed since having been promoted (i.e. staged).
When you re-promote the file (re-stage it), TFS should treat that as the most recent version in your workspace. When you pull down the changes from Team 2, you will have all-new conflicts to resolve because your newly promoted file will still be in conflict with the changes they made to “file 1”—namely that they threw away all of the changes that you’d made previously.
And, I’m not sure how it works in TFS, but in Git, you can’t “back out” a commit without leaving a trail: either the backed-out changes are reconciled in a merge commit or they are explicitly undone in a “revert” commit.
Either way, your local changes will cause a conflict because they will have altered the same file in the same place as either the “merge” or “revert” commit and—this is important—will have done so after that other commit.
To recap, let me summarize what this sounds like:
I don’t believe that this is really possible—even with TFS—but, if this is a possibility with your source-control system, then you have two problems: a tool that silently discards changes and developers who don’t know how to merge.
There is probably a setting in your source-control system that disallows simultaneous editing for files. This is a pretty huge restriction, but if your developers either can’t or won’t play nice, you probably have no choice.
Published by marco on 12. Feb 2013 21:44:37 (GMT-5)
Updated by marco on 12. Apr 2013 10:01:17 (GMT-5)
The paper Uniqueness and Reference Immutability for Safe Parallelism by Colin S. Gordon, Matthew J. Parkinson, Jared Parsons, Aleks Bromfield, Joe Duffy (Microsoft Research) is quite long (26 pages), detailed and involved. To be frank, most of the notation was foreign to me—to say nothing of making heads or tails of most of the proofs and lemmas—but I found the higher-level discussions and conclusions quite interesting.
The abstract is concise and describes the project very well:
“A key challenge for concurrent programming is that side-effects (memory operations) in one thread can affect the behavior of another thread. In this paper, we present a type system to restrict the updates to memory to prevent these unintended side-effects. We provide a novel combination of immutable and unique (isolated) types that ensures safe parallelism (race freedom and deterministic execution). The type system includes support for polymorphism over type qualifiers, and can easily create cycles of immutable objects. Key to the system’s flexibility is the ability to recover immutable or externally unique references after violating uniqueness without any explicit alias tracking. Our type system models a prototype extension to C# that is in active use by a Microsoft team. We describe their experiences building large systems with this extension. We prove the soundness of the type system by an embedding into a program logic.”
The project proposes a type-system extension with which developers can write provably safe parallel programs—i.e. “race freedom and deterministic execution”—with the amount of actual parallelism determined when the program is analyzed and compiled rather than decided by a programmer creating threads of execution.
The “isolation” part of this type system reminds me a bit of the way that SCOOP addresses concurrency. That system also allows programs to designate objects as “separate” from other objects while also releasing the program from the onus of actually creating and managing separate execution contexts. That is, the syntax of the language allows a program to be written in a provably correct way (at least as far as parallelism is concerned; see the “other provable-language projects” section below). In order to execute such a program, the runtime loads not just the program but also another file that specifies the available virtual processors (commonly mapped to threads). Sections of code marked as “separate” can be run in parallel, depending on the available number of virtual processors. Otherwise, the program runs serially.
In SCOOP, methods are used as a natural isolation barrier, with input parameters marked as “separate”. See SCOOP: Concurrency for Eiffel (Eiffel.com) and SCOOP (software) (Wikipedia) for more details. The paper also contains an entire section listing other projects—many implemented on the JVM—that have attempted to make provably safe programming languages.
The system described in this paper goes much further, adding immutability as well as isolation (the same concept as “separate” in SCOOP). An interesting extension to the type system is that isolated object trees are free to have references to immutable objects (since those can’t negatively impact parallelism). This allows for globally shared immutable state and reduces argument-passing significantly. Additionally, there are readable and writable references: the former can only be read but may be modified by other objects (otherwise it would be immutable); the latter can be read and written and is equivalent to a “normal” object in C# today. In fact, “[…] writable is the default annotation, so any single-threaded C# that does not access global state also compiles with the prototype.”
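As a loose analogy in today’s C# (this is not the paper’s type system; it only approximates the readable-versus-immutable distinction, using the System.Collections.Immutable package):
using System.Collections.Generic;
using System.Collections.Immutable;

class PermissionAnalogy
{
    static void Main()
    {
        var backing = new List<int> { 1, 2, 3 };

        // "readable": this reference cannot be used to write, but other code
        // holding the writable reference can still change the underlying data.
        IReadOnlyList<int> readable = backing;
        backing.Add(4);

        // "immutable": a frozen snapshot that nobody can change; "adding"
        // produces a new list and leaves the original untouched.
        ImmutableList<int> immutable = backing.ToImmutableList();
        ImmutableList<int> extended = immutable.Add(5);
    }
}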
In this safe-parallel extension, a standard type system is extended so that every type can be assigned such a permission and there is “support for polymorphism over type qualifiers”. That is, the extended type system includes the permission in the type, so that, given B => A, a reference to a readable B can be passed to a method that expects a readable A. In addition, covariance is also supported for generic parameter types.
When they say that the “[k]ey to the system’s flexibility is the ability to recover immutable or externally unique references after violating uniqueness without any explicit alias tracking”, they mean that the type system allows programs to specify sections that accept isolated references as input, lets them convert to writable references and then convert back to isolated objects—all without losing provably safe parallelism. This is quite a feat since it allows programs to benefit from isolation, immutability and provably safe parallelism without significantly changing common programming practice. In essence, it suffices to decorate variables and method parameters with these permission extensions to modify the types and let the compiler guide you as to further changes that need to be made. That is, an input parameter for a method will be marked as immutable
so that it won’t be changed and subsequent misuse has to be corrected.
Even better, they found that, in practice, it is possible to use extension methods to allow parallel and standard implementations of collections (lists, maps, etc.) to share most code.
“A fully polymorphic version of a map() method for a collection can coexist with a parallelized version pmap() specialized for immutable or readable collections. […] Note that the parallelized version can still be used with writable collections through subtyping and framing as long as the mapped operation is pure; no duplication or creation of an additional collection just for concurrency is needed.”
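A rough analogy of that coexistence in today’s C#, using extension methods and PLINQ (this is not the paper’s system; plain C# cannot express the permission checks that would make the parallel version provably safe):
using System;
using System.Collections.Generic;
using System.Linq;

static class MapExtensions
{
    // The general-purpose version works on any sequence.
    public static IEnumerable<TResult> Map<T, TResult>(this IEnumerable<T> source, Func<T, TResult> f)
    {
        return source.Select(f);
    }

    // A parallel variant coexists as just another extension method; in the
    // paper's system, it would additionally require an element permission
    // (immutable/readable) to guarantee that the parallelism is safe.
    public static IEnumerable<TResult> PMap<T, TResult>(this IEnumerable<T> source, Func<T, TResult> f)
    {
        return source.AsParallel().Select(f);
    }
}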
Much of the paper is naturally concerned with proving that their type system actually does what it says it does. As mentioned above, at least 2/3 of the paper is devoted to lemmas and large swaths of notation. For programmers, the more interesting part is the penultimate section that discusses the extension to C# and the experiences in using it for larger projects.
“A source-level variant of this system, as an extension to C#, is in use by a large project at Microsoft, as their primary programming language. The group has written several million lines of code, including: core libraries (including collections with polymorphism over element permissions and data-parallel operations when safe), a webserver, a high level optimizing compiler, and an MPEG decoder.”
Several million lines of code is, well, it’s an enormous amount of code. I’m not sure how many programmers they have or how they’re counting lines or how efficiently they write their code, but millions of lines of code suggests generated code of some kind. Still, taken with the next statement on performance, that much code more than proves that the type system is viable.
“These and other applications written in the source language are performance-competitive with established implementations on standard benchmarks; we mention this not because our language design is focused on performance, but merely to point out that heavy use of reference immutability, including removing mutable static/global state, has not come at the cost of performance in the experience of the Microsoft team.”
Not only is performance not impacted, but the nature of the typing extensions allows the compiler to know much more about which values and collections can be changed, which affects how aggressively this data can be cached or inlined.
“In fact, the prototype compiler exploits reference immutability information for a number of otherwise-unavailable compiler optimizations. […] Reference immutability enables some new optimizations in the compiler and runtime system. For example, the concurrent GC can use weaker read barriers for immutable data. The compiler can perform more code motion and caching, and an MSIL-to-native pass can freeze immutable data into the binary.”
In the current implementation, there is an unstrict
block that allows the team at Microsoft to temporarily turn off the new type system and to ignore safety checks. This is a pragmatic approach which allows the software to be run before it has been proven 100% parallel-safe. This is still better than having no provably safe blocks at all. Their goal is naturally to remove as many of these blocks as possible—and, in fact, this requirement drives further refinement of the type system and library.
“We continue to work on driving the number of unstrict blocks as low as possible without over-complicating the type system’s use or implementation.”
The project is still a work-in-progress but has seen quite a few iterations, which is promising. The paper was written in 2012; it would be very interesting to take it for a test drive in a CTP.
A related project at Microsoft Research, Spec#, contributed a lot of basic knowledge about provable programs. The authors even state that the “[…] type system grew naturally from a series of efforts at safe parallelism. […] The earliest version was simply copying Spec#’s [Pure] method attribute, along with a set of carefully designed task-and data-parallelism libraries.” Spec#, in turn, is a “[…] formal language for API contracts (influenced by JML, AsmL, and Eiffel), which extends C# with constructs for non-null types, preconditions, postconditions, and object invariants”.
Though the implementation of this permissions-based type system may have started with Spec#, the primary focus of that project was more a valiant attempt to bring Design-by-Contract principles (examples and some discussion here (encodo.com)) to the .NET world via C#. Though Spec# has downloadable code (CodePlex), the project hasn’t really been updated in years. This is a shame, as support for Eiffel [1] in .NET, mentioned above as one of the key influences of Spec#, was dropped by ISE Eiffel long ago.
Spec#, in turn, was mostly replaced by Microsoft Research’s Contracts project (an older version of which was covered in depth in Microsoft Code Contracts: Not with a Ten-foot Pole (earthli.com)). The Contracts project seems to be alive and well: the most recent release is from October, 2012. I have not checked it out since my initial thumbs-down review (linked above) but did note in passing that the implementation is still (A) library-only and (B) does not support Visual Studio 2012.
The library-only restriction is particularly galling, as such an implementation can lead to repeated code and unwieldy anti-patterns. As documented in the Contracts FAQ, the current implementation of the “tools take care of enforcing consistency and proper inheritance of contracts” but this is presumably accomplished with compiler errors that require the programmer to include contracts from base methods in overrides.
The seminal work Object-oriented Software Construction by Bertrand Meyer (vol. II in particular) goes into tremendous detail on a type system that incorporates contracts directly. The type system discussed in this article covers only parallel safety: null-safety and other contracts are not covered at all. If you’re at all interested in these types of language extensions, the vol.2 of OOSC is a great read. The examples are all in Eiffel but should be relatively accessible. Though some features—generics, notably but also tuples, once routines and agents (earthli.com)—have since made their way into C# and other more commonly used languages, many others—such as contracts, anchored types (contravariance is far too constrained in C# to allow them), covariant return types, covariance everywhere, multiple inheritance, explicit feature removal, loop variants and invariants, etc.—are still not available. Subsequent interesting work has also been done on extensions that allow creation of provably null-safe programs (earthli.com), something also addressed in part by Microsoft Research’s Contracts project.
Published by marco on 3. Feb 2013 23:04:09 (GMT-5)
In order to program in 2013, it is important not to waste any time honing your skills with outdated tools and work-flows. What are the essential pieces of software for developing software in 2013?
Even for the smallest projects, there is no reason to forgo any of these tools.
tl;dr: It’s 2013 and your local commit history is not sacrosanct. No one wants to see how you arrived at the solution; they just want to see clean commits that explain your solution as clearly as possible. Use git; use rebase; use “rebase interactive”; use the index; stage hunks; squash merge; go nuts. [2]
I would like to focus on the “versioning” part of the tool-chain. Source control tells the story of your code, showing how it evolved to where it is at any given point. If you look closely at the “Encodo Branching Model” [3] diagram (click to enlarge), you can see the story of the source code:
Small, precise, well-documented commits are essential in order for others to understand the project—especially those who weren’t involved in developing the code. It should be obvious from which commits you made a release. You should be able to go back to any commit and easily start working from there. You should be able to maintain multiple lines of development, both for maintenance of published versions and for development of new features. The difficulty of merging these branches should be determined by the logical distance between them rather than by the tools. Merging should almost always be automatic.
Nowhere in those requirements does it say that you’re not allowed to lie about how you got to that pristine tree of commits.
A few good articles about Git have recently appeared—Understanding the Git Workflow by Benjamin Sandofsky is one such—explaining better than ever why rewriting history is better than server-side, immutable commits.
In the article cited above, Sandofsky divides his work up into “Short-lived work […] larger work […] and branch bankruptcy.” These concepts are documented to some degree in the Branch Management chapter of the Encodo Git Handbook (of which I am co-author). I will expand on these themes below.
Note: The linked articles deal exclusively with the command line, which isn’t everyone’s favorite user interface (I, for one, like it). We use the SmartGit/Hg client for visualizing diffs, organizing commits and browsing the log. We also use the command-line for a lot of operations, but SmartGit is a very nice tool and version 3 supports nearly all of the operations described in this article.
As you can see from the diagram above, a well-organized and active project will have multiple branches. Merging and rebasing are two different ways of getting commits from one branch into another.
Merging commits into a branch creates a merge commit, which shows up in the history to indicate that n commits were made on a separate branch. Rebasing those commits instead re-applies them to the head of the indicated branch without a merge commit. In both cases there can be conflicts, but one method doesn’t pose a greater risk of them than the other. [4] You cannot tell from the history that rebased commits were developed in a separate branch. You can, however, tell that the commits were rebased because the author date (the time the commit was originally created) differs from the commit date (the last time that the commit was applied).
At Encodo, we primarily work in the master branch because we generally work on very manageable, bite-sized issues that can easily be managed in a day. Developers are free to use local branches but are not required to do so. If some other requirement demands priority, we shunt the pending issue into a private branch. Such single-issue branches are focused and involve only a handful of files. It is not at all important to “remember” that the issue was developed in a branch rather than the master branch. If there are several commits, it may be important for other users to know that they were developed together and a merge-commit can be used to indicate this. Naturally, larger changes are developed in feature branches, but those are generally the exception rather than the rule.
Remember: Nowhere in those requirements does it say that you’re not allowed to lie about how you got to that pristine tree of commits.
Otherwise? Local commit history is absolutely not sacrosanct. We rebase like crazy to avoid unwanted merge commits. That is, when we pull from the central repository, we rebase our local commits on top of the commits that come from the origin. This has worked well for us.
If the local commit history is confusing—and this will sometimes come up during the code review—we use an interactive rebase to reorganize the files into a more soothing and/or understandable set of commits. See Sandofsky’s article for a good introduction to using interactive rebasing to combine and edit commits.
Naturally, we weigh the amount of confusion caused by the offending commits against the amount of effort required to clean up the history. We don’t use bisect [5] very often, so we don’t invest a lot of time in enforcing the clean, compilable commits required by that tool. For us, the history is interesting, but we rarely go back farther than a few weeks in the log. [6]
At Encodo, there are only a few reasons to retain a merge commit in the official history:
There are no rules for local branches: you can name them whatever you like. However, if you promote a local branch to a private branch, at Encodo we use the developer’s initials as the prefix for the branch. My branches are marked as “mvb/feature1”, for example.
What’s the difference between the two? Private branches may get pushed to our common repository. Why would you need to do that? Well, I, for example, have a desktop at work and, if I want to work at home, I have to transfer my workspace somehow to the machine at home. One solution is to work on a virtual machine that’s accessible to both places; another is to remote in to the desktop at work from home; the final one is to just push that work to the central repository and pull it from home. The offline solution has the advantage of speed and less reliance on connectivity.
What often happens to me is that I start work on a feature but can only spend an hour or two on it before I get pulled off onto something else. I push the private branch, work on it a bit more at home, push back, work on another, higher-priority feature branch, merge that into master, work on master, whatever. A few weeks later, I’ve got a private branch with a few ugly commits, some useful changes and a handful of merge commits from the master branch. The commit history is a disgusting mess, and I have a sneaking suspicion that I’ve only made changes to about a dozen files but have a dozen commits for those changes.
That’s where the aforementioned “branch bankruptcy” comes in. You’re not obligated to keep that branch; you can keep the changes, though. As shown in the referenced article, you execute the following git commands:
git checkout master
git checkout -b cleaned_up_branch
git merge --squash private_feature_branch
git reset
The --squash tells git to squash all of the changes from the private_feature_branch into the index (staging); the subsequent reset then empties the index so that those changes end up in the working tree. From here, you can make a single, clean, well-written commit or several commits that correspond logically to the various changes you made.
Git also lets you lose your attachment to checking in all the changes in a file at once: if a file has changes that correspond to different commits, you can add only selected differences in a file to the index (staging). In praise of Git’s index by Aristotle Pagaltzis (Plasmasturm) provides a great introduction. If you, like me, regularly take advantage of refactoring and cleanup tools while working on something else, you’ll appreciate the ability to avoid checking in dozens of no-brainer cleanup/refactoring changes along with a one-liner bug-fix. [8]
I recently renamed several projects in our solution, which involved renaming the folders as well as the project files and all references to those files and folders. Git automatically recognizes these kinds of renames as long as the old file is removed and the new file is added in the same commit.
I selected all of the files for the rename in SmartGit and committed them, using the index editor to stage only the hunks from the project files that corresponded to the rename. Nice and neat. I selected a few other files and committed those as a separate bug-fix. Two seconds later, the UI refreshed and showed me a large number of deleted files that I should have included in the first commit. Now, one way to go about fixing this is to revert the two commits and start all over, picking the changes apart (including playing with the index editor to stage individual hunks).
Instead of doing that, I did the following:
Now my master branch was ready to push to the server, all neat and tidy. And nobody was the wiser.
bisect is a git feature that executes a command against various commits to try to localize the commit that caused a build or test failure. Basically, you tell it the last commit that worked and git uses a binary search to find the offending commit. Of course, if you have commits that don’t compile, this won’t work very well. We haven’t used this feature very much because we know the code in our repositories well and using blame and log is much faster. Bisect is much more useful for maintainers who don’t know the code very well but still need to figure out at which commit it stopped working.
Published by marco on 22. Nov 2012 23:24:02 (GMT-5)
The following ruminations were written seven years ago but have held up remarkably well. They have been published with minor updates.
This article deals with the situation illustrated below, specifically the question raised in the comment.
if (! $folder_id)
{
  $this->db->logged_query ("SELECT folder_id FROM " .
                           $this->app->table_names->objects .
                           " WHERE id = $obj->object_id");
  if ($this->db->next_record ())
    $folder_id = $this->db->f ("folder_id");
  else
    // raise exception? ignore? what to do?
}
Above we see a situation in which you may decide against stricter enforcement because, while the error is clear, the appropriate reaction is not. This is also a big part of working with contracts: deferring reactions. Often—especially when developing libraries—you’re in code so deep that the desired reaction could be one of many, depending on how that code is deployed.
The code above is taken from the publishing loop in the webcore; it’s used to publish comments. In effect, the code has detected that a comment object id has been passed in that doesn’t correspond to anything in the system. It’s bogus. It’s wrong.
Some deployments—I would hazard most—would just like to silently ignore the error and publish as much as possible. Silently ignoring an error will always bite you in the ass in the end (pun intended). The key here is that whereas the person deploying the final system should be perfectly free to ignore the error, you, as the library developer, cannot and must not.
Let’s see what kind of reactions we could have here. Well, isn’t that what exceptions were invented for? They’re for transmitting error conditions up out of deep library code. Problem solved. For more severe errors in which the code cannot continue, the answer is quite clear: you simply throw an exception. However, in the situation above, it’s not so clear.
The problem is easily skipped and most of the rest of the job can be finished. Here is where deferral comes in. Just call a function that will handle it later. This function can log the error or warning, display it to the user, ask to abort/retry/ignore, consult a table for same, throw an exception or just ignore it. It’s not your problem to dispatch solutions to encountered errors. It’s your job to detect them and maintain the integrity of the running code.
Simply throwing an exception, no matter what the error condition, is, in effect, making a decision about how the error will be handled. Control is lost because the exception handler is necessarily higher up. This is a bad thing if you’d actually like the code to do the best it can. As any experience at all will have shown you, some errors are just warnings or hints. It’s not just black and white, error or not. Many deployments of the system containing the code above will actually treat the issue as a warning and log it for the database techs to address.
However, to assume the opposite, that callers want errors to be swallowed, cheats those callers as well. It cheats them because it becomes incredibly hard to find errors; they must be detected by subtle logic or data problems (e.g. Hmmm…the log shows it only sent 500 emails, I thought there were 503 subscribers…). If the system never complains or logs anything, the end user calls you first. It cheats you because you can never adequately test your system because it never complains. Everything’s OK. It kind of works. It mostly works.
The desire for safety or avoiding crashes or exceptions on the client side should never override the desire to have correct code that detects error conditions. If you write library code or end-user code, that code deals mostly with detecting and reporting misused functions. The functionality itself is generally straightforward; it’s wrapping the interface around it that’s hard. The only thing that you’ll probably spend more time on is hunting down memory bugs—and, yes, even if you’re using a garbage-collected runtime, you can still have memory bugs. What would you call it if the memory required by your publication script increases in proportion to the number of subscribers and mails?
If you’re wondering what I ended up doing in the case above, I decided on a function called ‘raise’. It sounds like an exception, and that’s usually what happens: it breaks the code on that line, but a little more elegantly than the usual PHP die statement. While the default handler simply issues a fancy die statement, that handler can be replaced with a different one, one that redirects to an HTML page with a nicely formatted error printout and a form for submitting the error.
Since this code is likely to run inside a script that just wants to send subscriptions and doesn’t care about data integrity errors, the handler would probably be replaced with something that suppresses the exception, but logs the error. That way, once the subscription run is done, you can view the error log and see if there are data integrity problems, and, perhaps more importantly, you see them all at once instead of just one at a time. And, more importantly still, those subscribers for whom there were no problems received their mail on time.
Published by marco on 22. Nov 2012 19:45:47 (GMT-5)
The following article was originally published on the Encodo blogs and is cross-published here.
In the latest version of Quino—version 1.8.5—we took a long, hard look at the patterns we were using to create metadata. The metadata for an application includes all of the usual Quino stuff: classes, properties, paths, relations. With each version, though we’re able to use the metadata in more places. That means that the metadata definition code grows and grows. We needed some way to keep a decent overview of that metadata without causing too much pain when defining it.
In order to provide some background, the following are the high-level requirements that we kept in mind while designing the new pattern and supporting framework.
Quino metadata has always been defined using a .NET language—in our case, we always use C# to define the metadata, using the MetaBuilder or InMemoryMetaBuilder to compose the application model. This approach satisfies the need to leverage existing tools, refactoring and introspection.
Since Quino metadata is an in-memory construct, there will always be a .NET API for creating metadata. This is not to say that there will never be a DSL to define Quino metadata but that such an approach is not the subject of this post.
Quino applications have always been able to define and integrate metadata modules (e.g. reporting or security) using an IMetaModuleBuilder. Modules solved interdependency issues by splitting the metadata-generation into several phases:
In this way, when a module needed to add a path between a class that it had defined and a class defined in another module, it could be guaranteed that classes and foreign keys for all modules had been defined before any paths were created. Likewise for classes that wanted to define relations based on paths defined in other modules.
The limitation of the previous implementation was that a module generator always created its own module and builder and could not simply re-use those created by another generator. Basically, there was no “lightweight” way of splitting metadata-generation into separate files for purely organizational purposes.
There were also a few issues with the implementation of the main model-generation code. The previous pattern depended heavily on local variables, all defined within one mammoth function. Separating code into individual method calls was ad hoc—each project did it a little differently—and involved a lot of migration of local variables to instance variables. With all code in a single method, file-structure navigation tools couldn’t help at all. The previous pattern prescribed using file comments or regions that could be located using “find in file”. This was clearly sub-optimal.
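To make the contrast concrete, here is a purely illustrative sketch of the old style: one mammoth method held together by local variables. The builder calls mirror those shown later in this article (with /*Guid*/ standing in for the actual Guid arguments, as in the other listings), but the class and method themselves are invented for this example.

public class OldStyleModelGenerator
{
  public void GenerateModel(MetaBuilder builder)
  {
    // In a real project, hundreds of lines like these lived in a single method...
    var company = builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/);
    var person = builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/);
    builder.AddInvisibleProperty(person, "CompanyId", MetaType.Key, true, /*Guid*/);

    // ...and the local variables were needed everywhere below, so extracting
    // methods meant manually promoting them to instance variables.
    var companyPersonPath = builder.AddOneToManyPath(company, "Id", person, "CompanyId", /*Guid*/, /*Guid*/);
    builder.AddRelation(company, "People", "", companyPersonPath);
    builder.AddRelation(person, "Company", "", companyPersonPath);
  }
}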
The new pattern, which can be applied to all models, big or small, includes the following parts:

- A model generator that implements the IMetaModelGenerator interface. This class is used by the application configuration and various tools (e.g. the code generator or UML generator) to create the model.
- A model-elements class, whose members are typically created in the AddClasses() step and referenced in the AddPaths, AddProperties and AddLayouts steps. The elements class typically has two properties, called Classes and Paths.

This may sound like a lot of overhead for a simple application, but it’s really not that much extra code. The benefits are:
But enough chatter; let’s take a look at the absolute minimum boilerplate for an empty model.
public class DemoModelElements
{
public DemoModelElements()
{
Classes = new DemoModelClasses();
Paths = new DemoModelPaths();
}
public DemoModelClasses Classes { get; private set; }
public DemoModelPaths Paths { get; private set; }
}
public class DemoModelPaths
{
}
public class DemoModelClasses
{
}
public class DemoCoreGenerator : DependentMetadataGeneratorBase<DemoModelGenerator, DemoModelElements, MetaBuilder>
{
}
public class DemoModelGenerator : MetaBuilderBasedModelGeneratorBase<DemoModelElements>
{
protected override void AddMetadata()
{
Builder.Include<DemoCoreGenerator>();
}
}
The code above is functional but doesn’t actually create any metadata. So what does it do?

- The generic argument of MetaBuilderBasedModelGeneratorBase indicates the type of Elements that will be exposed by this model generator. The elements class is created automatically and is available as the property Elements (as we’ll see in the examples below). Additionally, we’re using a ModelGeneratorBase that is based on a MetaBuilder, which means that the property Builder is also available and is of type MetaBuilder.
- DemoCoreGenerator is a dependent generator—it’s lightweight and uses the elements and builder from its owner. The exact types are shown in the class declaration; it can be read as: get elements of type DemoModelElements and a builder of type MetaBuilder from the generator with type DemoModelGenerator. The initial generic argument can be any other metadata generator that implements the IElementsProvider<TElements, TBuilder> interface.
- AddMetadata is overridden to include the metadata created by DemoCoreGenerator in the model.

Even though it’s not very much code, you can create a snippet or a file template with Visual Studio, or a Live Template or file template with ReSharper, to quickly create a new model.
Now, let’s fill the empty model with some metadata. The first step is to define the model that we’re going to build. That part goes in the AddMetadata() method. [3]
public class DemoModelGenerator : MetaBuilderBasedModelGeneratorBase<DemoModelElements>
{
protected override void AddMetadata()
{
Builder.CreateModel<DemoModel>("Demo", /*Guid*/);
Builder.CreateMainModule("Encodo.Quino");
Builder.Include<DemoCoreGenerator>();
}
}
A typical next step is to define a class. Let’s do that.
public class DemoModelClasses
{
public IMetaClass Company { get; set; }
}
public class DemoCoreGenerator : DependentMetadataGeneratorBase<DemoModelGenerator, DemoModelElements, MetaBuilder>
{
protected override void AddClasses()
{
Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/);
}
}
As you can see, we added a new class to the elements and created and assigned it in the AddClasses() phase of metadata-generation.
An obvious next step is to create another class and define a path between them.
public class DemoModelClasses
{
public IMetaClass Company { get; set; }
public IMetaClass Person { get; set; }
}
public class DemoCoreGenerator : DependentMetadataGeneratorBase<DemoModelGenerator, DemoModelElements, MetaBuilder>
{
protected override void AddClasses()
{
Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/);
Elements.Classes.Person = Builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/);
Builder.AddInvisibleProperty(Elements.Classes.Person, "CompanyId", MetaType.Key, true, /*Guid*/);
}
protected override void AddPaths()
{
Elements.Paths.CompanyPersonPath = Builder.AddOneToManyPath(
Elements.Classes.Company, "Id",
Elements.Classes.Person, "CompanyId",
/*Guid*/, /*Guid*/
);
}
}
Having a path is not enough, though. We can also define how the relations on that path are exposed in the classes.
public class DemoCoreGenerator : DependentMetadataGeneratorBase<DemoModelGenerator, DemoModelElements, MetaBuilder>
{
protected override void AddClasses()
{
Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/);
Elements.Classes.Person = Builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/);
Builder.AddInvisibleProperty(Elements.Classes.Person, "CompanyId", MetaType.Key, true, /*Guid*/);
}
protected override void AddPaths()
{
Elements.Paths.CompanyPersonPath = Builder.AddOneToManyPath(
Elements.Classes.Company, "Id",
Elements.Classes.Person, "CompanyId",
/*Guid*/, /*Guid*/
);
}
protected override void AddProperties()
{
Builder.AddRelation(Elements.Classes.Company, "People", "", Elements.Paths.CompanyPersonPath);
Builder.AddRelation(Elements.Classes.Person, "Company", "", Elements.Paths.CompanyPersonPath);
}
}
OK, now we have a model with two entities—companies and people—that are related to each other so that a company has a list of people and each person belongs to a company.
Now we’d like to make the metadata support German as well as English. Quino naturally supports more generalized ways of doing this (e.g. importing from files), but let’s just add the metadata manually to see what that would look like (unaffected methods are left off for brevity).
public class DemoModelElements
{
public DemoModelElements()
{
Classes = new DemoModelClasses();
Paths = new DemoModelPaths();
}
public ILanguage English { get; set; }
public ILanguage German { get; set; }
public DemoModelClasses Classes { get; private set; }
public DemoModelPaths Paths { get; private set; }
}
public class DemoCoreGenerator : DependentMetadataGeneratorBase<DemoModelGenerator, DemoModelElements, MetaBuilder>
{
protected override void AddCoreElements()
{
Elements.English = Builder.AddDisplayLanguage("en-US", "English");
Elements.German = Builder.AddDisplayLanguage("de-CH", "Deutsch");
}
protected override void AddClasses()
{
var company = Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/);
company.Caption.SetValue(Elements.English, "Company");
company.Caption.SetValue(Elements.German, "Firma");
company.PluralCaption.SetValue(Elements.English, "Companies");
company.PluralCaption.SetValue(Elements.German, "Firmen");
var person = Elements.Classes.Person = Builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/);
Builder.AddInvisibleProperty(person, "CompanyId", MetaType.Key, true, /*Guid*/);
person.Caption.SetValue(Elements.English, "Person");
person.Caption.SetValue(Elements.German, "Person");
person.PluralCaption.SetValue(Elements.English, "People");
person.PluralCaption.SetValue(Elements.German, "Personen");
}
}
Note that I created a local variable for both company and person. One reason for this is to reduce the number of references to the Elements.Classes.Person and Elements.Classes.Company properties. It’s useful to keep the number of references to a minimum in order to get the maximum benefit from searching for usages with a tool like ReSharper. Otherwise, the noise drowns out the signal and you’ll get hundreds of references when there are actually only a few dozen “real” references.

You can see that the metadata-generation code is still manageable, but it’s growing. Once we’ve filled out all of the properties, relations, translations, layouts and view aspects for the person and company classes, we’ll have a file that’s several hundred lines long. A file of that size is still manageable and, since we have methods, it’s eminently navigable with a file-structure browser.
If we don’t mind keeping—or we’d rather keep—everything in one file, we can see more structure by splitting the code into more methods. This is really easy to do because we’re using the elements to reference other parts of metadata instead of local variables. For example, let’s move the class initialization code for the person and company entities to separate methods (unaffected methods are left off for brevity).
public class DemoCoreGenerator : DependentMetadataGeneratorBase<DemoModelGenerator, DemoModelElements, MetaBuilder>
{
protected override void AddClasses()
{
AddCompany();
AddPerson();
}
private void AddCompany()
{
var company = Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/);
company.Caption.SetValue(Elements.English, "Company");
company.Caption.SetValue(Elements.German, "Firma");
company.PluralCaption.SetValue(Elements.English, "Companies");
company.PluralCaption.SetValue(Elements.German, "Firmen");
}
private void AddPerson()
{
var person = Elements.Classes.Person = Builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/);
Builder.AddInvisibleProperty(person, "CompanyId", MetaType.Key, true, /*Guid*/);
person.Caption.SetValue(Elements.English, "Person");
person.Caption.SetValue(Elements.German, "Person");
person.PluralCaption.SetValue(Elements.English, "People");
person.PluralCaption.SetValue(Elements.German, "Personen");
}
}
While this is a good technique for small models—with anywhere up to five entities—most models are larger and include entities with sizable metadata definitions. Another thing to consider is that, when working with larger teams, it’s often best to keep a central item like the metadata definition as modular as possible.
To scale the pattern up for larger models, we can move code for larger entity definitions into separate generators. As soon as we move an entity to its own generator, we’re faced with the question of where we should create paths for that entity. A path doesn’t really belong to one class or the other; in which generator should it go?
Well, we thought about that and came to the conclusion that the pattern should be to just create a separate generator for all paths in the model (or multiple path-only generators if you have a larger model). That is, when a model gets a bit larger, it should include the following generators (using the name “Demo” from the examples above):
DemoCoreGenerator
DemoPathGenerator
DemoCompanyGenerator
DemoPersonGenerator
The DemoCoreGenerator will create metadata and assign elements like the display languages. It’s also recommended to define base types like enumerations and very simple classes [4] in the core as well. Obviously, as the model grows, the core generator may also get larger. This isn’t a problem: just split the contents logically into multiple generators.
For the purposes of this example, though, we only have a single core and a single path generator and two entity generators. Since these generators will all be dependent on the model’s builder and elements, the first step is to define a base class that will be used by the other generators.
internal class DemoDependentGenerator : DependentMetadataGeneratorBase<DemoModelGenerator, DemoModelElements, MetaBuilder>
{
}
public class DemoCoreGenerator : DemoDependentGenerator
{
protected override void AddCoreElements()
{
Elements.English = Builder.AddDisplayLanguage("en-US", "English");
Elements.German = Builder.AddDisplayLanguage("de-CH", "Deutsch");
}
}
public class DemoPathGenerator : DemoDependentGenerator
{
protected override void AddPaths()
{
Elements.Paths.CompanyPersonPath = Builder.AddOneToManyPath(
Elements.Classes.Company, "Id",
Elements.Classes.Person, "CompanyId",
/*Guid*/, /*Guid*/
);
}
}
public class DemoCompanyGenerator : DemoDependentGenerator
{
protected override void AddClasses()
{
var company = Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/);
company.Caption.SetValue(Elements.English, "Company");
company.Caption.SetValue(Elements.German, "Firma");
company.PluralCaption.SetValue(Elements.English, "Companies");
company.PluralCaption.SetValue(Elements.German, "Firmen");
}
protected override void AddProperties()
{
Builder.AddRelation(Elements.Classes.Person, "Company", "", Elements.Paths.CompanyPersonPath);
}
}
public class DemoPersonGenerator : DemoDependentGenerator
{
protected override void AddClasses()
{
var person = Elements.Classes.Person = Builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/);
Builder.AddInvisibleProperty(person, "CompanyId", MetaType.Key, true, /*Guid*/);
person.Caption.SetValue(Elements.English, "Person");
person.Caption.SetValue(Elements.German, "Person");
person.PluralCaption.SetValue(Elements.English, "People");
person.PluralCaption.SetValue(Elements.German, "Personen");
}
protected override void AddProperties()
{
Builder.AddRelation(Elements.Classes.Company, "People", "", Elements.Paths.CompanyPersonPath);
}
}
public class DemoModelGenerator : MetaBuilderBasedModelGeneratorBase<DemoModelElements>
{
protected override void AddMetadata()
{
Builder.CreateModel<DemoModel>("Demo", /*Guid*/);
Builder.CreateMainModule("Encodo.Quino");
Builder.Include<DemoCoreGenerator>();
Builder.Include<DemoPathGenerator>();
Builder.Include<DemoCompanyGenerator>();
Builder.Include<DemoPersonGenerator>();
}
}
You’ll note that we only moved code around and didn’t have to change any implementation or add any new elements or anything that might introduce subtle errors in the metadata. Please note, the classes are all shown in a single code block above, but the pattern dictates that each class should be in its own file.
So far, we’ve only worked with generators that are dependent on the model generator. How do we access information—and elements—generated in other modules? For example, let’s include the security module and change a translation for a caption.
public class DemoModelElements
{
public DemoModelElements()
{
Classes = new DemoModelClasses();
Paths = new DemoModelPaths();
}
public ILanguage English { get; set; }
public ILanguage German { get; set; }
public SecurityModuleElements Security { get; set; }
public DemoModelClasses Classes { get; private set; }
public DemoModelPaths Paths { get; private set; }
}
public class DemoCoreGenerator : DemoDependentGenerator
{
protected override void AddCoreElements()
{
Elements.English = Builder.AddDisplayLanguage("en-US", "English");
Elements.German = Builder.AddDisplayLanguage("de-CH", "Deutsch");
Elements.Security = Builder.Include<SecurityModuleGenerator>().Elements;
}
protected override void AddProperties()
{
Elements.Security.Classes.User.Caption.SetValue(Elements.German, "Benutzer");
}
}
This approach works well with any module that has adhered to the pattern and exposes its elements in a standardized way. [5] In this case, the core module includes the security module and retains a reference to its elements. Any code that uses the core module will now have access not only to the core elements but also to the security elements, as well.
Another major benefit of using this pattern is that the resulting code is quite self-explanatory: it’s no mystery what Elements.Security.Classes.User.Caption refers to.
The previous pattern had a single monolithic file. The new pattern increases the number of files—possibly by quite a lot. It’s recommended to put these new files into the following structure:
[-] Models
    [+] Aspects
    [+] Elements
    [+] Generators
The “Aspects” folder isn’t new to this pattern, but it’s worth mentioning that any model-specific aspects should go into a separate folder.
That’s all for now. Happy modeling!
The IMetaModel is always available, and any part of the generation process can access metadata in the model at any time. However, the API for the model is quite generic and requires knowledge of the unique identifier or index for a piece of metadata.

The DemoCoreGenerator could also set up the builder (since it’s using the same builder object). To do that, you’d override AddCoreElements() and set up the model there. However, it’s clearer to keep it in the generator that actually owns the builder that is being configured.

I.e. the IElementProvider mentioned above.

Published by marco on 21. Nov 2012 23:08:51 (GMT-5)
Updated by marco on 8. Mar 2013 09:44:48 (GMT-5)
I was recently redesigning a web page and wanted to make it easier to use from touch-screen browsers. Links made only of text are relatively easy to click with a mouse, but tend to make poor touch targets. If the layout has enough space around the link, this can be remedied by applying CSS.
Suppose we have a box with three links in it, as shown to the right.
The first step is to make this box taller, so the logical thing to do is to set the height. We’ll have to pick a value, so set height: 40px on the gray box.
This isn’t exactly what we want, though; we’d rather have the vertical space equally distributed. Also, if you hover over the links, you can see that the space below the text is not active. Maybe we can try to add vertical-align: middle to align the content.
Unfortunately, this doesn’t have the desired effect. The vertical-align property works when used this way in table cells, but otherwise has no effect for block elements. Knowing that, we can set display: table-cell for the gray box.
And now the box has become longer, because the 50% width of the box is calculated differently for table cells than for regular boxes (especially when a table cell is found outside of a table).
Let’s abandon the vertical-alignment approach and try using positioning instead. Set position: relative and top: 25% to center the links vertically.
Now that looks much better, but the space above and below the links is still not active. Perhaps we can use the height trick again, to make the individual links taller as well. So we set height: 100% on each of the links.
We didn’t get the expected result, but we should have expected that: the links are inline elements and can only have a height set if we set display: inline-block on each link as well. We use inline-block rather than block so that the links stay on the same line.
The links are now the right size, but they stick out below the gray box, which isn’t what we wanted at all. We’re kind of out of ideas with this approach, but there is another way we can get the desired effect.
Let’s start with the original gray box and, instead of choosing a random height as we did above—40px—let’s set padding: 8px on the gray box to make room above and below the links.
With just one CSS style, we’ve already got the links nicely aligned and, as an added benefit, this technique scales even if the font size is changed. The 8-pixel padding is preserved regardless of how large the font gets. [1]
This approach seems promising, but the links are still not tall enough. The naive approach of setting height: 100% on the links probably won’t work as expected, but let’s try it anyway.
It looks like the links were already 100% of the height of the container; in hindsight it’s obvious, since the height of the gray box is determined by the height of the links. The 100% height refers to the client area of the gray box, which doesn’t include the padding.
We’d actually like the links to have padding above and below, just as the gray box has. As we saw above, the links will only honor the padding if they also have display: inline-block, so let’s set that in addition to padding: 8px.
We’re almost there. The only thing remaining is to make the vertical padding of the links overlap with the vertical padding of the gray box. We can do this by using a negative vertical margin, setting margin: -8px.
We finally have the result we wanted. The links are now large enough for the average finger to strike without trying too hard. Welcome to the CSS-enabled touch-friendly world of web design.
The code for the final example is shown below, with the sizing/positioning styles highlighted:
.gray-box
{
background-color: gray;
border: 1px solid black;
border-width: 1px 0;
width: 50%;
text-align: center;
padding: 8px 0;
}
.gray-box a
{
background-color: #8F8F8F;
display: inline-block;
padding: 8px 20px;
margin: -8px 0;
}
<div class="gray-box">
<a href="#" style="color: goldenrod">First</a>
<a href="#" style="color: gold">Second</a>
<a href="#" style="color: yellowgreen">Third</a>
</div>
You could use .8em instead and then the padding will scale with the font size. This would work just as well with the height. Let’s pretend that we’re working with a specification that requires an 8-pixel padding instead of a flexible one.
Published by marco on 8. Jan 2012 17:13:18 (GMT-5)
This graphic Geeks versus Non-Geeks when Doing Repetitive Tasks (How-to Geek) illustrates quite nicely how programmers approach the world of problem-solving.
The chart does not show just how much time must be spent before the programmer wins, that being dependent on the complexity of the task. The probability that the task will recur is also highly relevant, as automating a smallish, one-time task is useless. Neither of those things will stop a determined programmer, though, who will automate no matter what.
Published by marco on 9. Oct 2011 12:21:54 (GMT-5)
Updated by marco on 14. Sep 2014 10:54:47 (GMT-5)
tl;dr: Encodo Systems AG has moved from Perforce to Git and has written a manual for getting started for other users or companies looking to make the leap. It’s available for free at Encodo Git Handbook.
In the beginning, there was Microsoft Visual SourceSafe. And it was not good.
In 1994, I started working for a small software company. Source control was a structured network share until I started moving projects into Microsoft Visual SourceSafe, which was slow and balky and feature-poor, but it was better than manual merging.
Until it corrupted its own database, losing our entire history. Luckily, we were able to piece together the repository from local workspaces. But the search was on to find a replacement.
In 1997, we moved to Perforce and were very happy for many years. I even used the two-user free license to run a personal Perforce server on earthli for a while.
Then I moved to Switzerland to work for Opus Software AG, a very tech-savvy company, which was, of course, using source-control software. You haven’t heard of it, though, because it was an internal tool. It worked fine and even supported branches, but some operations didn’t scale as well as they should and it was a bit difficult to understand, in general.
So, I started a campaign to move to something else. We evaluated various alternatives, including Subversion and Perforce. Perforce won—mostly because Subversion’s merging support in pre-1.5 versions was laughable—and I was back on the source-control system I’d been using for almost ten years at that point.
For my personal projects, I switched to Mercurial because I was working with other users and the two-user limit for the free Perforce license was no longer adequate, but neither was I willing to cough up $800 per user in order to continue using Perforce. I chose Mercurial because I wanted a DVCS and a good friend/coworker of mine is a lead developer on the project, so he was around to help me when I had questions.
When I left that company to found Encodo Systems AG, Perforce was the logical choice for source control. We used it exclusively for our own projects for several years, using Subversion only to access repositories hosted by two different customers. What finally broke Perforce’s lock on Encodo was offline and remote work. We finally got a customer who wanted to work with Git instead of Perforce or Subversion and the customer is king, so we started to learn Git.
It wasn’t easy at first, especially if you don’t read any documentation or background information on Git concepts. But we got the hang of it and quickly became accustomed to the freedom offered by Git versus a central-server solution like Perforce.
So off we went to do an internal evaluation on source-control systems, this time including Mercurial/Kiln, Git, PlasticSCM, Perforce and TFS. We quickly decided against TFS for several reasons, primarily that it was too tightly-coupled to other Microsoft systems that we weren’t using or prepared to use yet. Perforce was, at the time (February 2011), a wholly centralized solution and had not yet made moves in a DVCS direction. PlasticSCM was good, but didn’t overwhelm us and finally Mercurial was also good, but even my aforementioned colleague—the developer on the Mercurial project—told us that there was no advantage relative to Git if we were already familiar and (relatively) comfortable with Git.
So Encodo moved all of its source code to several Git repositories hosted on an internal Gitorious server. Since Git is more a version-control toolkit/framework than a complete end-user solution, I/we use several tools on top of Git to make it more comfortable and to reduce points-of-failure.
So, after nearly 15 years of using Perforce almost exclusively, I am now almost exclusively a Git user (I still use Subversion and Mercurial very rarely for some personal and customer projects). An Encodo developer, Stephan Hauser, wrote a handbook in June to help everyone get up-to-speed on using Git. I recently updated it to account for the last several months of working with Git and we published it just last week. You can download it for free at Encodo Git Handbook.
Published by marco on 27. Mar 2011 20:35:28 (GMT-5)
The oft thought-provoking XKCD published a flow chart recently, called Good Code (XKCD), which outlines the two branches: doing it fast or doing it right. The chart is linked below.
Despite the panacea of Agile Development, you still can’t have both fast and right. While it is possible to write good code, the odds are good that that code will accomplish a task that no longer requires completion (indicated by the “requirements have changed” block).
Even if it’s decent code, it’s quite likely that it is “code with concessions” and, though it works, there is a forest of TODOs and missing integration tests blighting your conscience.
Published by marco on 26. Mar 2011 11:21:16 (GMT-5)
A few years ago, I developed a utility for syncing ratings, play counts and last-played times between the same set of songs on two different iTunes installations. I haven’t worked on it in years, but it’s quite well-written and full-featured and has rich documentation with a tutorial. You can download the Windows-only software for free.
I originally wrote this software because I was listening to a lot of music at work and rating it. When I got home, I didn’t have these ratings anymore because they were only stored on my work laptop. Likewise, the ratings at home weren’t making their way to my laptop. And it wasn’t only ratings: play count and last-played date also help the digital DJ decide what to play. That was a problem when I decided to select a playlist from the machine in the living room: it had no ratings and couldn’t decide very well which music to choose from its collection.
And it’s not just user data like ratings: there’s also the matter of song data, like album and genre, which are often wrong or incomplete. If you fix it on one machine, you—or your friends or partner—might appreciate having the improved tag information for free.
Since then, the world has moved on a bit, with the Home Sharing feature letting me play music from the office machine on the living room player and services like GrooveShark letting you keep your music collection in the cloud. However, I still have a couple of iTunes libraries around and still want to sync them now and again to have the most up-to-date information from which to launch a smart playlist or run the Genius. On top of that, a lot of people have Apple gadgets that work only with iTunes. There are a lot of iTunes libraries out there that could probably benefit from syncing. See “Who Needs TuneSync?” in the documentation to find out more.
What TuneSync does is load two iTunes library files, compares them using various heuristics and lets you synchronize selected information between the two. You can then store the changes to both files and force iTunes to reload its metadata from this library. You are in full control over the information that is synchronized from one library to the other and vice versa and you can even edit information directly if neither side is 100% correct.
Though the software was developed years ago, it still loads iTunes libraries for versions as recent as 10.2.x. The two libraries I compared had about 8000 and 7500 songs respectively (about 15–16MB XML files) and TuneSync was able to load them both in less than 15 seconds. Memory usage was about 150MB and the application responded smoothly and quickly for all operations.
TuneSync runs on any reasonably modern Windows operating system.
The best place to go for questions is the documentation, but here’s a brief overview of the functionality (with screenshots).
First you choose the two project files:
Once the files are loaded, the libraries are compared with the default heuristic (which is relatively strict) and TuneSync presents you with a comparison view. The one shown below is “Matched Songs”, but you can also see just the songs in each library, all songs or unmatched songs too. See Default Views in the documentation for more information.
As mentioned above, there are tabs for the common filters—all songs, songs in library one, library two, matched songs and unmatched songs—but in each view you can also search and filter by other criteria.
You can filter by type of match or simply by typing in the filter box to restrict the songs shown in any view. The screenshot above shows only songs that have different song data for which there is only one match. The various fields are colored according to the schema outlined above. See Songs in a Library and Column Data in the documentation for more information.
The screenshot above shows the song properties that you can show by selecting one or more songs (if you select multiple songs, the details are collapsed and summarized as much as possible). See Song Info Pane and Selecting Multiple Songs in the documentation for more information.
The default matching heuristic is quite strict and is best for libraries that have either been copied from one another or been synced before. Those matches are easy to synchronize (see below) without too much worry that there are invalid matches.
However, you can ask TuneSync to perform additional matches using custom heuristics, shown above. See Match Options Window in the documentation for more information.
Depending on the options chosen in the Match Window, you will see a lot more red here and will have to be more careful about which matches you accept. See Match Results Window in the documentation for more information.
Once you have all of the matches set up correctly, you can synchronize data between the libraries. You can either do this manually by using the Song Info Pane at the bottom of the window or by synchronizing multiple songs using certain criteria, as shown in the screenshot above. See Synchronize Individual Data, Using the Info Pane and Synchronize Multiple Data in the documentation for more information.
Once you’ve matched and synchronized songs, you can use the filters to search for modified songs in either library to verify the changes before exporting them back to the source files. If you’re running TuneSync on the same machine as the iTunes library that you’re replacing, you can have it replace the library for you; otherwise, you have to copy it to the proper location manually. See Tutorial: Checking Songs, Check Songs, Import from iTunes with TuneSync and Import from iTunes by Hand in the documentation for more information.
Published by marco on 19. Mar 2011 21:08:09 (GMT-5)
I’m currently revising the Encodo C# Handbook to update it for the last year’s worth of programming experience at Encodo, which includes a lot more experience with C# 4.0 features like optional parameters, dynamic types and more. The following is an expanded section on loose and tight coupling. A final draft should be available by the middle of April or so.
Whether to use loose or tight coupling for components depends on several factors. If a component on a lower level must access functionality on a higher level, this can only be achieved with loose coupling: e.g. connecting the two by using one or more delegates or callbacks.

If the component on the higher level needs to be coupled to a component on a lower level, then it’s possible to have them be more tightly coupled by using an interface. The advantage of using an interface over a set of one or more callbacks is that changes to the semantics of how the coupling should occur can be enforced. The example below should make this much clearer.
Imagine a class that provides a single event to indicate that it has received data from somewhere.
public class DataTransmitter
{
public event EventHandler<DataBundleEventArgs> DataReceived;
}
This is the classic way of loosely coupling components; any component that is interested in receiving data can simply attach to this event, like this:
public class DataListener
{
public DataListener(DataTransmitter transmitter)
{
transmitter.DataReceived += TransmitterDataReceived;
}
private void TransmitterDataReceived(object sender, DataBundleEventArgs args)
{
// Do something when data is received
}
}
Another class could combine these two classes in the following, classic way:
var transmitter = new DataTransmitter();
var listener = new DataListener(transmitter);
The transmitter and listener can be defined in completely different assemblies and need no dependency on any common code (other than the .NET runtime) in order to compile and run. If this is an absolute must for your component, then this is the pattern to use for all events. Just be aware that the loose coupling may introduce semantic errors—errors in usage that the compiler will not notice.
For example, suppose the transmitter is extended to include a new event, NoDataAvailableReceived.
public class DataTransmitter
{
public event EventHandler<DataBundleEventArgs> DataReceived;
public event EventHandler NoDataAvailableReceived;
}
Let’s assume that the previous version of the interface threw a timeout exception when it had not received data within a certain time window. Now, instead of throwing an exception, the transmitter triggers the new event instead. The code above will no longer indicate a timeout error (because no exception is thrown) nor will it indicate that no data was transmitted.
One way to fix this problem (once detected) is to hook the new event in the DataListener constructor. If the code is to remain highly decoupled—or if the interface cannot be easily changed—this is the only real solution.
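Here is a sketch of that fix; the listener simply attaches to the new event as well (the handler body is an assumption, since the handbook doesn't show one).

public class DataListener
{
  public DataListener(DataTransmitter transmitter)
  {
    transmitter.DataReceived += TransmitterDataReceived;
    transmitter.NoDataAvailableReceived += TransmitterNoDataAvailableReceived;
  }

  private void TransmitterDataReceived(object sender, DataBundleEventArgs args)
  {
    // Do something when data is received
  }

  private void TransmitterNoDataAvailableReceived(object sender, EventArgs args)
  {
    // React to the timeout condition that is no longer reported as an exception
  }
}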
Imagine now that the transmitter becomes more sophisticated and defines more events, as shown below.
public class DataTransmitter
{
public event EventHandler<DataBundleEventArgs> DataReceived;
public event EventHandler NoDataAvailableReceived;
public event EventHandler ConnectionOpened;
public event EventHandler ConnectionClosed;
public event EventHandler<DataErrorEventArgs> ErrorOccurred;
}
Clearly, a listener that attaches and responds appropriately to all of these events will provide a much better user experience than one that does not. The loose coupling of the interface thus far requires all clients of this interface to be proactively aware that something has changed and, once again, the compiler is no help at all.
If we can change the interface—and if the components can include references to common code—then we can introduce tight coupling by defining an interface with methods instead of individual events.
public interface IDataListener
{
void DataReceived(IDataBundle bundle);
void NoDataAvailableReceived();
void ConnectionOpened();
void ConnectionClosed();
void ErrorOccurred(Exception exception, string message);
}
With a few more changes, we have a more tightly coupled system, but one that will enforce changes on clients: DataTransmitter now calls through the new interface and DataListener implements IDataListener. Now, when the transmitter requires changes to the IDataListener interface, the compiler will enforce that all listeners are also updated.
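For illustration, here is one shape the tightly coupled version might take: the transmitter receives the listener directly and calls the interface methods instead of raising events. The constructor parameter and the ProcessIncomingData method are assumptions for this sketch, not the handbook's actual code.

public class DataTransmitter
{
  private readonly IDataListener _listener;

  public DataTransmitter(IDataListener listener)
  {
    _listener = listener;
  }

  private void ProcessIncomingData(IDataBundle bundle)
  {
    // Instead of raising DataReceived, call the interface method directly.
    // Adding a method to IDataListener now forces every listener to implement it.
    _listener.DataReceived(bundle);
  }
}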
Published by marco on 19. Mar 2011 21:00:03 (GMT-5)
I’m currently revising the Encodo C# Handbook to update it for the last year’s worth of programming experience at Encodo, which includes a lot more experience with C# 4.0 features like optional parameters, dynamic types and more. The following is an expanded section on working with Linq. A final draft should be available by the middle of April or so.
When using expressions from System.Linq, be careful not to sacrifice legibility or performance simply in order to use Linq instead of more common constructs. For example, the following loop sets a property for those elements in a list where a condition holds.
foreach (var pair in Data)
{
if (pair.Value.Property is IMetaRelation)
{
pair.Value.Value = null;
}
}
This seems like a perfect place to use Linq; assuming an extension method ForEach(this IEnumerable<T>), we can write the loop above using the following Linq expression:
Data.Where(pair => pair.Value.Property is IMetaRelation).ForEach(pair => pair.Value.Value = null);
This formulation, however, is more difficult to read because the condition and the loop are now buried in a single line of code, but a more subtle performance problem has been introduced as well. We have made sure to evaluate the restriction (“Where”) first so that we iterate the list (with “ForEach”) with as few elements as possible, but we still end up iterating twice instead of once. This could cause performance problems in border cases where the list is large and a large number of elements satisfy the condition.
Linq is mostly a blessing, but you always have to keep in mind that Linq expressions are evaluated lazily. Therefore, be very careful when using the Count() method, because it will iterate over the entire collection (if the backing collection is of base type IEnumerable<T>). Linq is optimized to check the actual backing collection, so if the IEnumerable<T> you have is actually a list and the count is requested, Linq will use the Count property instead of counting elements naively.
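As a small illustration of the difference (GenerateNumbers() is a made-up helper for this sketch, not something from the handbook):

IEnumerable<int> GenerateNumbers()
{
  for (var i = 0; i < 1000000; i++)
  {
    yield return i; // a lazy sequence with no backing collection
  }
}

var lazy = GenerateNumbers();
var list = lazy.ToList();

var slowCount = lazy.Count(); // enumerates all one million elements to count them
var fastCount = list.Count(); // detects the backing list and reads its Count property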
A few concrete examples of other issues that arise due to lazy evaluation are illustrated below.
You can accidentally change the value of a captured variable before the sequence is evaluated. Since ReSharper will complain about this behavior even when it does not cause unwanted side-effects, it is important to understand which cases are actually problematic.
var data = new[] { "foo", "bar", "bla" };
var otherData = new[] { "bla", "blu" };
var overlapData = new List<string>();
foreach (var d in data)
{
if (otherData.Where(od => od == d).Any())
{
overlapData.Add(d);
}
}
// We expect one element in the overlap, “bla”
Assert.AreEqual(1, overlapData.Count);
The reference to the variable d will be flagged by ReSharper and marked as an “access to a modified closure”. This is a reminder that a variable referenced—or “captured”—by the lambda expression—closure—will have the last value assigned to it rather than the value that was assigned to it when the lambda was created. In the example above, the lambda is created with the first value in the sequence, but since we only use the lambda once, and then always before the variable has been changed, we don’t have to worry about side-effects. ReSharper can only detect that a variable referenced in a closure is being changed within the scope that it checks; it lets you know so that you can verify that there are no unwanted side-effects.
Even though there isn’t a problem, you can rewrite the foreach-statement above as the following code, eliminating the “Access to modified closure” warning.
var overlapData = data.Where(d => otherData.Where(od => od == d).Any()).ToList();
The example above was tame in that the program ran as expected despite capturing a variable that was later changed. The following code, however, will not run as expected:
var data = new[] { "foo", "bar", "bla" };
var otherData = new[] { "bla", "blu" };
var threshold = 2;
var results = data.Where(d => d.Length == threshold);
var overlapData = data.Where(d => otherData.Where(od => od == d).Any());
if (overlapData.Any())
{
threshold += 1;
}
// All elements are three characters long, so we expect no matches
Assert.AreEqual(0, results.Count());
Here we have a problem because the closure is evaluated after a local variable that it captured has been modified, resulting in unexpected behavior. Whereas it’s possible that this is exactly what you intended, it’s not a recommended coding style. Instead, you should move the calculation that uses the lambda after any code that changes the variables that it captures:
var threshold = 2;
var overlapData = data.Where(d => otherData.Where(od => od == d).Any());
if (overlapData.Any())
{
threshold += 1;
}
var results = data.Where(d => d.Length == threshold);
This is probably the easiest way to get rid of the warning and make the code clearer to read.
Published by marco on 19. Mar 2011 20:30:46 (GMT-5)
Updated by marco on 19. Mar 2011 20:35:22 (GMT-5)
PHPDoc is a popular tool for generating documentation for PHP projects. I made a whole lot of improvements to it for PHP5 and updated all the skins to look less boxy, have nicer and more informative icons and be easier to use, and then created an earthli fork. This article includes a full feature list and screenshots.
The earthli WebCore (the software that runs this web site) is open-source. It is also relatively well-documented. The documentation is generated using PHPDoc, but a better version than that available in the main fork of PHPDoc found on the main site or in SourceForge.
Though PHPDoc does a decent job of gathering information and making it available to the templates, there are a few problems with the main fork:
A long time ago—when PHP4 was still young—I contributed a whole new set of templates—modestly called “earthli” and “earthli:DOM”—to the project and brought the rendering up to a decent level. There were still problems, but it was—in my eyes—worlds better than any of the existing templates.
Things stayed like that for a while.
Then I ported my framework from PHP4 to PHP5 in early 2010 and discovered that PHPDoc was again limping along a bit, generating output that no longer met my standards.
So, I made a lot of improvements and again basically rewrote one set of templates (this time called “earthli-v2”) that I thought looked clean and offered the following features:

- UTF-8 and ISO-8859-1 are supported; defaults to UTF-8.
- XHTML and HTML are supported; defaults to HTML.
- object, stdClass and mixed; defaults to mixed.
- access and abstract tags are generated; since these properties are also indicated by the icon, defaults to false.
- css, img and none; defaults to css.
- Defaults to false.
- The default (a PHPDoc standard) and earthli skins are included; defaults to earthli.

The old style isn’t horrible, but it’s a bit dated, with too many borders, blurry icons and too many bold fonts.
The new style is cleaner, has far fewer borders, better margins and alignments and nicer icons (for all elements, with access visibility for all element types) as well as much more legible placement and more information, including direct links to source for all elements, much nicer signature-formatting and tamer colors.
I was in contact with the project maintainer but was never able to upload my changes into the main branch of PHPDoc. There have been no updates on the main line since late 2009 and I don’t know whether the project has died or not. I only just realized that I never officially published my changes, so I’m officially making the earthli fork available as a Mercurial repository or as a compressed archive.
Published by marco on 18. Dec 2010 01:21:38 (GMT-5)
Updated by marco on 22. Nov 2012 19:42:16 (GMT-5)
tl;dr: This is a long-winded way of advising you to always be sure what you're comparing when you build low-level algorithms that will be used with arbitrary generic arguments. The culprit in this case was the default comparator in a HashSet<T>, but it could be anything. It ends with cogitation about software processes in the real world.
Imagine that you have a framework with support for walking arbitrary object graphs in the form of a GraphWalker
. Implementations of this interface complement a generalized algorithm.
This algorithm generates nodes corresponding to various events generated by the graph traversal, like beginning or ending a node or edge or encountering a previously processed node (in the case of graphs with cycles). Such an algorithm is eminently useful for formatting graphs into a human-readable format, cloning said graphs or other forms of processing.
A crucial feature of such a GraphWalker
is to keep track of the nodes it has seen before in order to avoid traversing the same node multiple times and going into an infinite loop in graphs with cycles. For subsequent encounters with a node, the walker handles it differently—generating a reference event rather than a begin node event.
A common object graph is the AST for a programming language. The graph walker can be used to quickly analyze such ASTs for nodes that match particular conditions.
Let’s take a look at a concrete example, with a little language that defines simple boolean expressions:
OR(
(A < 2)
(B > A)
)
It’s just an example and we don’t really have to care about what it does, where A
and B
came from or the syntax. What matters is the AST that we generate from it:
1 Operator (OR)
2 Operator (<)
3 Variable (A)
4 Constant (2)
5 Operator (>)
6 Constant (B)
7 Variable (A)
When the walker iterates over this tree, it generates the following events (note that the numbers at the front of each line correspond to the objects in the diagram above):
1 begin node
1 begin edge
2 begin node
2 begin edge
3 begin node
3 end node
4 begin node
4 end node
2 end edge
2 end node
5 begin node
5 begin edge
6 begin node
6 end node
7 begin node
7 end node
5 end edge
5 end node
1 end edge
Now that’s the event tree we expect. This is also the event tree that we get for the objects that we’ve chosen to represent our nodes (Operator
, Variable
and Constant
in this case). If, for example, we process the AST and pass it through a formatter for this little language, we expect to get back exactly what we put in (namely the code in Listing 1). Given the event tree, it’s quite easy to write such a formatter—namely, by handling the begin node (output the node text), begin edge (output a “(”) and end edge (output a “)”) events.
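For illustration, here is a minimal sketch of such a formatter. The handler names below are assumptions made for this example; the article doesn't show the actual walker API.
class SimpleFormatter
{
    private readonly System.Text.StringBuilder _result = new System.Text.StringBuilder();

    // First encounter with a node: output its text.
    public void BeginNode(object node)
    {
        _result.Append(node).Append(" ");
    }

    // Descending into a node's children: open a parenthesis.
    public void BeginEdge(object node)
    {
        _result.Append("(");
    }

    // All children processed: close the parenthesis.
    public void EndEdge(object node)
    {
        _result.Append(")");
    }

    public override string ToString()
    {
        return _result.ToString();
    }
}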
So far, so good?
However, now imagine that we discover a bug in other code that uses these objects: when two different objects refer to the same variable, they need to be considered equal. That is, we update the equality methods—in the case of .NET, Equals() and GetHashCode()—for Variable.
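A sketch of what that change might look like (the Variable class itself isn't shown in the article, so its exact shape here is an assumption):
class Variable
{
    public Variable(string name)
    {
        Name = name;
    }

    public string Name { get; private set; }

    // Two Variables with the same name are now considered equal...
    public override bool Equals(object obj)
    {
        var other = obj as Variable;
        return other != null && Name == other.Name;
    }

    // ...so they must also produce the same hash code.
    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}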
As soon as we do, however, the sample from Listing 1 now formats as:
OR(
(A < 2)
(B > )
)
Now we have to figure out what happened. A good first step is to see what the corresponding event tree looks like now. We discover the following:
1 begin node
1 begin edge
2 begin node
2 begin edge
3 begin node
3 end node
4 begin node
4 end node
2 end edge
2 end node
5 begin node
5 begin edge
6 reference
7 begin node
7 end node
5 end edge
5 end node
1 end edge
The change is in the sixth node, which has now become a reference because we changed how equality is handled for Variables. The algorithm now considers any two Variables with the same name to be equivalent even if they are two different object references.
If we look back at how we wrote the simple formatter above, we only handled the begin node, begin edge and end edge events. If we throw in a handler for the reference event and output the text of the node, we’re back in business and have “fixed” the formatter.
But we ignore the more subtle problem at our own peril: namely, that the graph walking-code is fragile in that its behavior changes due to seemingly unrelated changes in the arguments that are passed. Though we have a quick fix above, we need to think about providing more stability in the algorithm—especially if we’re providers of low-level framework functionality. [1]
The walker algorithm uses a HashSet<T>
to track the nodes that it has previously encountered. However, the default comparator—again, in .NET—leans on the equality functions of the objects stored in the set to determine membership.
The first solution—or rather, the second one, as we already “fixed” the problem with what amounts to a hack above by outputting references as well—is to change the equality comparator for the HashSet<T>
to explicitly compare references. We make that change and we can once again remove the hack because the algorithm no longer generates references for subsequent variable encounters.
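In .NET, that means passing a custom IEqualityComparer<T> to the HashSet<T> constructor. A minimal sketch (the class name is mine, not the article's):
using System.Collections.Generic;
using System.Runtime.CompilerServices;

sealed class ReferenceEqualityComparer<T> : IEqualityComparer<T>
    where T : class
{
    public bool Equals(T x, T y)
    {
        // Compare object identity, ignoring any overridden Equals().
        return ReferenceEquals(x, y);
    }

    public int GetHashCode(T obj)
    {
        // Hash on identity as well, ignoring any overridden GetHashCode().
        return RuntimeHelpers.GetHashCode(obj);
    }
}

// For example: var visited = new HashSet<object>(new ReferenceEqualityComparer<object>());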
However, we’re still not done. We’ve now not only gotten our code running but we’ve fixed the code for the algorithm itself so the same problem won’t crop up again in other instances. That’s not bad for a day’s work, but there’s still a nagging problem.
What happens if the behavior that was considered unexpected in this case is exactly the behavior that another use of the algorithm expects? That is, it may well be that other types of graph walker will actually want to be able to control what is and is not a reference by changing the equivalence functions for the nodes. [2]
Luckily, callers of the algorithm already pass in the graph walker itself, the methods of which the algorithm already calls to process nodes and edges. A simple solution is to add a method to the graph walker interface to ask it to create the kind of HashSet<T>
that it would like to use to track references.
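Something along these lines, assuming a hypothetical IGraphWalker interface (the article doesn't show the real one):
using System.Collections.Generic;

interface IGraphWalker
{
    // ...the existing node- and edge-processing methods...

    // Each walker decides how previously encountered nodes are recognized: an AST
    // walker can return a set with a reference-equality comparer, while other walkers
    // can rely on the nodes' own Equals() and GetHashCode().
    HashSet<object> CreateVisitedSet();
}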
So how much time does this all take to do? Well, the first solution—the hack in application code—is the quickest, with time spent only on writing the unit test for the AST and verifying that it once again outputs as expected.
If we make a change to the framework, as in the second solution where we change the equality operator, we have to create unit tests to test the behavior of the AST in application code, but using test objects in the framework unit tests. That’s a bit more work and we may not have time for it.
The last suggestion—to extend the graph walker interface—involves even more work because we then have to create two sets of test objects: one set that tests a graph walker that uses reference equality (as the AST in the application code) and one that uses object equality (to make sure that works as well).
It is at this point that we might get swamped and end up working on framework code and unit tests that verify functionality that isn’t even being used—and certainly isn’t being used by the application with the looming deadline. However, we’re right there, in the code, and will never be better equipped to get this all right than we are right now. But what if we just don’t have time? What if there’s a release looming and we should just thank our lucky stars that we found the bug? What if there’s no time to follow the process?
Well, sometimes the process has to take a back seat, but that doesn’t mean we do nothing. Here are a few possibilities:
What about those who quite rightly frown at the third possibility because it would provide a solution for what amounts to a potential—as opposed to actual—problem? It’s really up to the developer here and experience really helps. How much time does it take to write the code? How much does it change the interface? How many other applications are affected? How likely is it that other implementations will need this fix? Are there potential users who won’t be able to make the fix themselves? Who won’t be able to recompile and just have to live with the reference-only equivalence? How likely is it that other code will break subtly if the fix is not made? It’s not an easy decision either way, actually.
Though purists might be appalled at the fast and loose approach to correctness outlined above, pragmatism and deadlines play a huge role in software development. The only way to avoid missing deadlines is to have fallback plans to ensure that the code is clean as soon as possible rather than immediately as a more stringent process would demand.
And thus ends the cautionary tale of making assumptions about how objects are compared and how frameworks are made.
Published by marco on 6. May 2010 22:39:54 (GMT-5)
Updated by marco on 19. Mar 2011 21:03:14 (GMT-5)
According to the official documentation, the sealed
keyword in C# serves the following dual purpose:
“When applied to a class, the sealed modifier prevents other classes from inheriting from it. […] You can also use the sealed modifier on a method or property that overrides a virtual method or property in a base class. This enables you to allow classes to derive from your class and prevent them from overriding specific virtual methods or properties.”
Each inheritable class and overridable method in an API is part of the surface of that API. Functionality on the surface of the API costs money and time because it implies a promise to support that API through subsequent versions. The provider of the API more-or-less guarantees that potential modifications—through inheritance or overriding—will not be irrevocably broken by upgrades. At the very least, it implies that so-called breaking changes are well-documented in a release and that an upgrade path is made available.
In C#, the default setting for classes and methods is that classes are not sealed and methods are sealed (non-virtual, which amounts to the same thing). Additionally, the default visibility in C# is internal, which means that the class or method is only visible to other classes in the assembly. Thus, the default external API for an assembly is empty. The default internal API allows inheritance everywhere.
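To spell out those defaults with a small, illustrative example (not from the article):
// With no modifiers at all, this class is internal (invisible outside the assembly),
// can be inherited from within the assembly, and its method cannot be overridden
// because it is not virtual.
class Widget
{
    public void Refresh()
    {
    }
}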
Some designers recommend the somewhat radical approach of declaring all classes sealed and leaving methods as non-virtual by default. That is, they recommend reducing the surface area of the API to only that which is made available by the implementation itself. The designer should then carefully decide which classes should be extensible—even within the assembly, because designers have to support any API that they expose, even if it’s only internal to the assembly—and unseal them, while deciding which methods should be virtual.
From the calling side of the equation, sealed classes are a pain in the ass. The framework designer, in his ineffable wisdom, usually fails to provide an implementation that does just what the caller needs. With inheritance and virtual methods, the caller may be able to get the desired functionality without rewriting everything from scratch. If the class is sealed, the caller has no recourse but to pull out Reflector™ and make a copy of the code, adjusting the copy until it works as desired.
Until the next upgrade, that is, when the original version gets a few bug fixes or changes and the copied version begins to diverge from it. It's not so clear-cut whether to seal classes or not, but the answer is—as with so many other things—likely a well-thought-out balance of both approaches.
Sealing methods, on the other hand, is simply a way of reverting that method back to the default state of being non-virtual. It can be quite useful, as I discovered in a recent case, shown below.
I started with a class for which I wanted to customize the textual representation—a common task.
class Expression
{
public override string ToString()
{
// Output the expression in human-readable form
}
}
class FancyExpression : Expression
{
public override string ToString()
{
// Output the expression in human-readable form
}
}
So far, so good; extremely straightforward. Imagine dozens of other expression types, each overriding ToString()
and producing custom output.
Time passes and it turns out that the formatting for expressions should be customizable based on the situation. The most obvious solution is to declare an overloaded version of ToString()
and then call the new overload from the overload inherited from the library, like this:
class Expression
{
public override string ToString()
{
return ToString(ExpressionFormatOptions.Compact);
}
public virtual string ToString(ExpressionFormatOptions options)
{
// Output the expression in human-readable form
}
}
Since the new overload is a more powerful version of the basic ToString()
, we just redefine the latter in terms of the former, choosing appropriate default options. That seems simple enough, but now the API has changed in a seemingly unenforceable way. Enforceable, in this context, means that the API can use the semantics of the language to force callers to use it in a certain way. Using the API in non-approved ways should result in a compilation error.
This new version of the API now has two virtual methods, but the overload of ToString()
without a parameter is actually completely defined in terms of the second overload. Not only is there no longer any reason to override it, but it would be wrong to do so—because the API calls for descendants to override the more powerful overload and to be aware of and handle the new formatting options.
But, this is the second version of the API and there are already dozens of descendants that override the basic ToString()
method. There might even be descendants in other application code that isn’t even being compiled at this time. The simplest solution is to make the basic ToString()
method non-virtual and be done with it. Descendants that overrode that method would no longer compile; maintainers could look at the new class declaration—or the example-rich release notes!—to figure out what changed since the last version and how best to return to a compilable state.
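The migration for such a descendant is mechanical; a sketch, reusing the FancyExpression example from above:
class FancyExpression : Expression
{
    // Before, this class overrode the parameterless ToString(); now it overrides
    // the overload that accepts options instead.
    public override string ToString(ExpressionFormatOptions options)
    {
        // Output the expression in human-readable form, honoring the options
    }
}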
But ToString()
comes from the object
class and is part of the .NET system. This is where the sealed keyword comes in handy. Just seal the basic method to prevent overrides and the compiler will take care of the rest.
class Expression
{
public override sealed string ToString()
{
return ToString(ExpressionFormatOptions.Compact);
}
public virtual string ToString(ExpressionFormatOptions options)
{
// Output the expression in human-readable form
}
}
Even without release notes, a competent programmer should be able to figure out what to do. A final tip, though, is to add documentation so that everything’s crystal clear.
class Expression
{
/// <summary>
/// Returns a text representation of this expression.
/// </summary>
/// <returns>
/// A text representation of this expression.
/// </returns>
/// <remarks>
/// This method can no longer be overridden; instead,
/// override <see cref="ToString(ExpressionFormatOptions)"/>.
/// </remarks>
/// <seealso cref="ToString(ExpressionFormatOptions)"/>
public override sealed string ToString()
{
return ToString(ExpressionFormatOptions.Compact);
}
/// <summary>
/// Gets a text representation of this expression using the given
/// <paramref name="options"/>.
/// </summary>
/// <param name="options">The options to apply.</param>
/// <returns>
/// A text representation of this expression using the given
/// <paramref name="options"/>
/// </returns>
public virtual string ToString(ExpressionFormatOptions options)
{
// Output the expression in human-readable form
}
}
Published by marco on 28. Apr 2010 21:25:44 (GMT-5)
Updated by marco on 28. Apr 2010 23:02:57 (GMT-5)
The following tip was developed using Ubuntu 9.1x (Hardy Heron) with OpenVPN 2.1rc19. It builds on the setup from Part I.
Part I of this guide to configuring a local firewall for OpenVPN introduced you to using iptables
on Linux. It also included a script for OpenVPN that opened and closed the firewall for specific IP addresses. If you haven’t read it already, you should probably go do that first.
Unfortunately, it turns out that the firewall configuration from part I is not watertight because it still allows FORWARDs
for all IP addresses. If you’ll recall, we solved this problem for INPUTs
by closing them by default and selectively opening them.
The first step is to ascertain that the firewall is configured as we expect. A call to sudo iptables -nL
elicits the following output:
Chain INPUT (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
As you can see, the default policy for FORWARD
is ACCEPT
, which allows anyone to access other IP addresses from this machine. In Part I, you created a file named /etc/iptables.uprules
in which you stored the default configuration of the firewall. You’ll want to change that as shown below (the changes are highlighted):
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -i eth0 -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A FORWARD -i eth0 -j ACCEPT
-A FORWARD -i lo -j ACCEPT
COMMIT
Restart networking by executing sudo /etc/init.d/networking restart
. A call to sudo iptables -nL
should now elicit the following output (the main changes are highlighted):
Chain INPUT (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain FORWARD (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
Now all IP forwarding requests are blocked by default. Those for the lo
and eth0
interfaces are, of course, still enabled, to allow the machine to be reachable both by itself and the local network.
The final step is to change the firewall configuration script to open up IP forwarding for employees, but not for strangers. Since this is a FORWARD
rule, not an INPUT
one, the script has to make sure to remove all firewall rules for the client IP address instead of just the INPUT rules as it did previously. The script below extends the one from Part I: rules are now added to both the INPUT and FORWARD chains and removed from every chain when a client disconnects.
function inlist
{
less `dirname $0`/$1 | egrep "^${CLIENTCERT}$" > /dev/null
if [ $? -eq 0 ]; then
return 0
else
return 1
fi
}
function get_next_matching_firewall_rule
{
ip_address=$1
channel=$2
RULE="`iptables -L $channel -n --line-numbers | grep $ip_address | head -n 1`"
}
function drop_rule_from_iptables
{
rule="$1"
channel="$2"
echo " Drop rule [$rule] for channel [$channel]"
line_number=`echo "$rule" | awk '{print $1}'`
iptables -D $channel $line_number
}
function add_port_to_iptables
{
source_ip=$1
destination_ip=$2
protocol=$3
port=$4
iptables -A INPUT -i tun0 -s $source_ip -d $destination_ip -p $protocol --dport $port -j ACCEPT
iptables -A FORWARD -i tun0 -s $source_ip -d $destination_ip -p $protocol --dport $port -j ACCEPT
}
function add_destination_to_iptables
{
source_ip=$1
destination_ip=$2
iptables -A INPUT -i tun0 -s $source_ip -d $destination_ip -j ACCEPT
iptables -A FORWARD -i tun0 -s $source_ip -d $destination_ip -j ACCEPT
}
function open_firewall_for_strangers
{
echo " Add route for DNS"
add_port_to_iptables $CLIENTIP 192.168.1.1 "UDP" 53
echo " Add route for Windows shares"
add_port_to_iptables $CLIENTIP 192.168.1.5 "TCP" 139
add_port_to_iptables $CLIENTIP 192.168.1.5 "TCP" 445
return 0
}
function open_firewall_for_employees
{
echo " Add routes for all ip addresses"
iptables -A INPUT -i tun0 -s $CLIENTIP -j ACCEPT
iptables -A FORWARD -i tun0 -s $CLIENTIP -j ACCEPT
return 0
}
function open_firewall
{
echo "Opening firewall for $CLIENTCERT @ [$CLIENTIP]"
# TODO Add filtering for other lists, if desired
# inlist "MYGROUP.list"
#if [ $? -eq 0 ]; then
# echo " Certificate found in MYGROUP list"
# open_firewall_for_MYGROUP
# return 0
#else
inlist "strangers.list"
if [ $? -eq 0 ]; then
echo " Certificate found in strangers list"
open_firewall_for_strangers
return 0
else
inlist "employees.list"
if [ $? -eq 0 ]; then
echo " Certificate found in employee list"
open_firewall_for_employees
return 0
else
echo " Certificate not found in any list"
return 1
fi
fi
}
function close_firewall_channel
{
channel=$1
get_next_matching_firewall_rule $CLIENTIP $channel
while [ -n "$RULE" ]
do
drop_rule_from_iptables "$RULE" $channel
get_next_matching_firewall_rule $CLIENTIP $channel
done
}
function close_firewall
{
echo "CloseFirewall for [$CLIENTIP]"
close_firewall_channel "INPUT"
close_firewall_channel "FORWARD"
close_firewall_channel "OUTPUT"
}
# Main
OPERATION=$1
CLIENTIP=$2
CLIENTCERT=$3
case "$1" in
add)
close_firewall
open_firewall
;;
update)
close_firewall
open_firewall
;;
delete)
close_firewall
;;
*)
echo "Unknown operation"
exit 1
esac
exit $?
Since you only changed the firewall configuration script, there is no need to restart OpenVPN.
You can test to verify that the firewall is updated properly by simply executing the /etc/openvpn/configfirewall.sh
script with various parameters. The expected parameters are an operation ("add" or "delete" for testing purposes), an IP address (which should be chosen so as not to interfere with any addresses assigned by either OpenVPN or a DHCP server) and a name (matched against the names in your lists).
To test what would happen when an employee connects through OpenVPN, execute the following command:
sudo /etc/openvpn/configfirewall.sh add 192.168.40.3 John_Doe
You should see the following output from the script:
CloseFirewall for [192.168.40.3]
OpenFirewall for John_Doe @ [192.168.40.3]
 Certificate found in employee list
 Add routes for all ip addresses
This sounds about right and it looks like the script ran as expected. You can check that the firewall was configured as expected with a call to sudo iptables -nL
, which should now elicit the following output (the main changes are highlighted):
Chain INPUT (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  192.168.40.3         0.0.0.0/0

Chain FORWARD (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  192.168.40.3         0.0.0.0/0
As you can see, the firewall accepts all INPUT
and FORWARD
from employees. Removing this test employee is as simple as executing:
sudo /etc/openvpn/configfirewall.sh delete 192.168.40.3 John_Doe
You should see the following output from the script:
CloseFirewall for [192.168.40.3]
 Drop rule [4    ACCEPT     all  --  192.168.40.3         0.0.0.0/0           ] for channel [INPUT]
 Drop rule [4    ACCEPT     all  --  192.168.40.3         0.0.0.0/0           ] for channel [FORWARD]
A call to sudo iptables -nL
should now elicit the following output, where the rules for the employee have been removed:
Chain INPUT (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain FORWARD (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
You should really test with one user from each list, so the next user to test is a stranger. Add a stranger by calling the script with the stranger's name instead of the employee's name:
sudo /etc/openvpn/configfirewall.sh add 192.168.40.3 John_Stranger
You should see the following output from the script:
CloseFirewall for [192.168.40.3]
OpenFirewall for John_Stranger @ [192.168.40.3]
 Certificate found in strangers list
 Add route for DNS
 Add route for Windows shares
A call to sudo iptables -nL
should now elicit the following output:
Chain INPUT (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     udp  --  192.168.40.3         192.168.1.1          udp dpt:53
ACCEPT     tcp  --  192.168.40.3         192.168.1.5          tcp dpt:139
ACCEPT     tcp  --  192.168.40.3         192.168.1.5          tcp dpt:445

Chain FORWARD (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     udp  --  192.168.40.3         192.168.1.1          udp dpt:53
ACCEPT     tcp  --  192.168.40.3         192.168.1.5          tcp dpt:139
ACCEPT     tcp  --  192.168.40.3         192.168.1.5          tcp dpt:445
For strangers, the firewall accepts only requests on the ports and IP addresses explicitly opened by the script and drops all other FORWARD requests. Removing this test stranger is as simple as executing:
sudo /etc/openvpn/configfirewall.sh delete 192.168.40.3 John_Stranger
You should see the following output from the script:
CloseFirewall for [192.168.40.3]
 Drop rule [4    ACCEPT     udp  --  192.168.40.3         192.168.1.1          udp dpt:53 ] for channel [INPUT]
 Drop rule [4    ACCEPT     tcp  --  192.168.40.3         192.168.1.5          tcp dpt:139 ] for channel [INPUT]
 Drop rule [4    ACCEPT     tcp  --  192.168.40.3         192.168.1.5          tcp dpt:445 ] for channel [INPUT]
 Drop rule [4    ACCEPT     udp  --  192.168.40.3         192.168.1.1          udp dpt:53 ] for channel [FORWARD]
 Drop rule [4    ACCEPT     tcp  --  192.168.40.3         192.168.1.5          tcp dpt:139 ] for channel [FORWARD]
 Drop rule [4    ACCEPT     tcp  --  192.168.40.3         192.168.1.5          tcp dpt:445 ] for channel [FORWARD]
A call to sudo iptables -nL
should now elicit the following output, where the rules for the stranger have been removed:
Chain INPUT (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain FORWARD (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
You can use the script this way to test the firewall configuration without actually logging in through OpenVPN. When everything is set, you should still log in with OpenVPN as a user from each list to verify that the firewall is doing what you think it is doing. In fact, that’s exactly why there is a Part II to this article: We tested by adding a user to the strangers list and logging in and noticed that we were able to ping many more servers than we had configured. Don’t let that happen to you!
So, there’s one more trick that you can use to make testing via OpenVPN easier. Since you have to be outside the network to test tunneling in via VPN, you run into the problem of testing as a stranger because strangers probably won’t have rights to open a shell on the OpenVPN server. That is, you need to be able to do this:
Since you’re a stranger, you can no longer open a shell on the OpenVPN server and alter the configuration.
Here are some ways of getting around this problem:
Another way around this is to add an exception for the OpenVPN server to all configurations (strangers, employees, etc.) so that you can test almost everything. To do this, just add a rule for the OpenVPN server (assumed to be on 192.168.1.1) as follows:
add_destination_to_iptables $CLIENTIP 192.168.1.1
When you’re finished testing, make sure to remove the hack.
Finally, here are samples of all of the files modified in this tutorial. See Part I for the other files.
Published by marco on 20. Apr 2010 22:37:16 (GMT-5)
Updated by marco on 21. Apr 2010 22:25:01 (GMT-5)
The following tip was developed using Ubuntu 9.1x (Hardy Heron) with OpenVPN 2.1rc19. It was originally published on the Encodo blogs and cross-published here.
There are dozens of guides around that describe how to optimally configure the iptables
firewall on Linux for OpenVPN. There’s even a script installed by default that is extremely well-commented and shows how to close down the firewall, then open up only very selected ports and protocols for optimal browsing. However, all of those guides assume that the machine on which OpenVPN is installed is also the firewall separating an external network (the DMZ) from an internal one. Well, what if you have a dedicated firewall and run the OpenVPN server on a machine running in the internal network?
This tutorial assumes that you’ve already followed the instructions for setting up OpenVPN and that you’ve also set up a Public Key Infrastructure (PKI). That means that access to your internal network via OpenVPN is secured and will only authorize users that have a proper certificate and password.
All of the files and scripts mentioned in this tutorial are available for download as files at the end of the article.
Since the external firewall routes requests to OpenVPN directly to the internal machine, it cannot be used to restrict the actions of users that are tunneling into the internal network. Luckily, the default behavior is that users only have access to the OpenVPN server itself, which gives you time to consider how, exactly, you want to open things up.
Here are some questions you need to answer:
For many organizations, the whole point of using OpenVPN is to let users work as if they are on the internal network, but from outside the physical office. In that case, the answers to the questions above will in many cases be:
Let’s take care of that trivial case first, then. Execute sudo iptables -nL
to show the current firewall configuration. You should see something like the following:
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
This table indicates that all input, output and forward requests are accepted. OUTPUT
requests are not interesting for this exercise, as they are generated by software running on the server itself, but INPUT
and FORWARD
requests bear more scrutiny. It looks like the firewall is already configured to allow access to everything your users need: There are no restrictions on inputs, which means that the firewall will allow requests on all ports and protocols for the local machine. There are likewise no restrictions on forwards, which means that requests to other IP addresses in the same subnet will be forwarded to those machines.
So, if FORWARDS
are being, well, forwarded, why can’t you ping any other machines in the same subnet? Once you know the answer, it’s obvious: it’s not the firewall that’s blocking forward requests. Rather, IP forwarding is a networking feature that must be explicitly enabled in the networking configuration. The article How to enable IP Forwarding will help you get this option configured, but the crux of the change is shown below.
Since you’ll probably want to make this change permanent, execute sudo vi /etc/sysctl.conf
and remove the comment from the front of the line containing net.ipv4.ip_forward = 1
. Restart networking by executing sudo /etc/init.d/networking restart
and you’ll be good to go.
The default network is now set up for smaller installations where everybody has the same permissions everywhere. What if, however, your needs are a little more complex? What if you have some users on your VPN that should only have access to certain resources, i.e. certain ports and protocols?
In that case, you’ll have to use a different approach, perhaps something like the following: (1) close the firewall by default, (2) determine the user connected through OpenVPN and (3) open the firewall selectively for that user.
The first step is to close the firewall by default. As you can see from the iptables
listing above, the firewall accepts all INPUT
connections by default. You’re probably not an expert on iptables
configuration (or you wouldn’t be here). There are two ways to get the settings you need:
Use individual calls to iptables to set up the firewall.
Load an iptables configuration from a dump file.
There’s really not much difference, but this tutorial opted for the second option. Once you’ve got a default firewall set up to your liking, use iptables-save
to dump out the rules to a file named /etc/iptables.uprules
(naturally, you can use whatever file name you like; it just has to match the reference from the script below). If this is all very confusing, the values below set up a closed firewall for you, which is probably what you want.
*filter
:INPUT DROP [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -i eth0 -j ACCEPT
-A INPUT -i lo -j ACCEPT
COMMIT
Though FORWARD
and OUTPUT
are still accepted unconditionally, all requests to INPUT
are dropped. The two rules for eth0
and lo
make sure that the machine can communicate with itself. Now that you’ve got the rules you need, you want to somehow alter the default configuration of the firewall.
If you guessed that the next step is to edit /etc/iptables/default.conf
or /etc/default/iptables.conf
, you’d be wrong. That’s pretty intuitive, but wrong. On the latest versions of Ubuntu, networking setup like firewall configuration is best accomplished by adding a script that is executed just before the networking interface is established. This guarantees that the default firewall rules are in place before the network is in any way accessible. To do this, add a file called iptables.sh
to the /etc/network/if-pre-up.d/
folder and add the following lines to it:
#!/bin/sh
iptables-restore < /etc/iptables.uprules
exit 0
This is a super-simple script that loads the firewall configuration from the file you just created above. The iptables-restore
command is convenient because it replaces the whole configuration, so you don’t have to do any resetting of your own.
Save the file and execute sudo chmod +x /etc/network/if-pre-up.d/iptables.sh
to make it executable. Restart networking by executing sudo /etc/init.d/networking restart
.
A call to sudo iptables -nL
should now elicit the following output (the main changes are highlighted):
Chain INPUT (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
Congratulations! You’ve succeeded in locking out everybody again, but in a different way.
How do you get back that coveted VIP status that you had just seconds ago? Now you’re up to step (2) above: “Determine the user connected through OpenVPN”. The basic strategy here is to key on the unique name in the SSL certificate authorized by OpenVPN. For each different group of permissions (IP addresses/ports/protocols) that you want to grant, create a file with the names of people who belong to that group, one name per line. For example:
Joe_Jackson
Phil_Hartman
Jill_Meikenson
Horst_Buchholz
Susan_B_Lazy
This is just one very simple solution to the problem of determining membership. Some installations with much larger user bases might want to instead bind to an external lookup using LDAP or an already existing MySQL database or something similar. That’s obviously beyond the scope of this tutorial, though.
You’re now going to need a script that will use these lists to determine which firewall rules to execute. I’ve added the general form of that script below, with matching for “employees” and “strangers” and TODO statements indicating where you need to extend the script for your own purposes:
function inlist
{
less `dirname $0`/$1 | egrep "^${CLIENTCERT}$" > /dev/null
if [ $? -eq 0 ]; then
return 0
else
return 1
fi
}
function get_next_matching_firewall_rule
{
ip_address=$1
RULE="`iptables -L INPUT -n --line-numbers | grep $ip_address | head -n 1`"
}
function drop_rule_from_iptables
{
rule="$1"
echo " Drop rule [$rule]"
line_number=`echo "$rule" | awk '{print $1}'`
iptables -D INPUT $line_number
}
function add_port_to_iptables
{
source_ip=$1
destination_ip=$2
protocol=$3
port=$4
iptables -A INPUT -i tun0 -s $source_ip -d $destination_ip -p $protocol --dport $port -j ACCEPT
}
function add_destination_to_iptables
{
source_ip=$1
destination_ip=$2
iptables -A INPUT -i tun0 -s $source_ip -d $destination_ip -j ACCEPT
}
function open_firewall_for_strangers
{
echo " Add route for DNS"
add_port_to_iptables $CLIENTIP 192.168.1.1 "UDP" 53
echo " Add route for Windows shares"
add_port_to_iptables $CLIENTIP 192.168.1.5 "TCP" 139
add_port_to_iptables $CLIENTIP 192.168.1.5 "TCP" 445
return 0
}
function open_firewall_for_employees
{
echo " Add routes for all ip addresses"
iptables -A INPUT -i tun0 -s $CLIENTIP -j ACCEPT
return 0
}
function open_firewall
{
echo "Opening firewall for $CLIENTCERT @ [$CLIENTIP]"
# TODO Add filtering for other lists, if desired
# inlist "MYGROUP.list"
#if [ $? -eq 0 ]; then
# echo " Certificate found in MYGROUP list"
# open_firewall_for_MYGROUP
# return 0
#else
inlist "strangers.list"
if [ $? -eq 0 ]; then
echo " Certificate found in strangers list"
open_firewall_for_strangers
return 0
else
inlist "employees.list"
if [ $? -eq 0 ]; then
echo " Certificate found in employee list"
open_firewall_for_employees
return 0
else
echo " Certificate not found in any list"
return 1
fi
fi
}
function close_firewall
{
echo "Closing firewall for [$CLIENTIP]"
get_next_matching_firewall_rule $CLIENTIP
while [ -n "$RULE" ]
do
drop_rule_from_iptables "$RULE"
get_next_matching_firewall_rule $CLIENTIP
done
}
# Main
OPERATION=$1
CLIENTIP=$2
CLIENTCERT=$3
case "$1" in
add)
close_firewall
open_firewall
;;
update)
close_firewall
open_firewall
;;
delete)
close_firewall
;;
*)
echo "Unknown operation"
exit 1
esac
exit $?
Some explanation for those who haven’t scripted in bash much before:
The operation to perform is determined by the case statement at the end of the script. Note that in all recognized cases, the firewall is first closed just to make sure that there are no lingering entries for the given client’s IP address.
close_firewall simply removes all rules for the given client’s IP address, in which case the default DROP action on INPUTS will block all incoming traffic from the address.
open_firewall tries to find the user in one of the files. If successful, the rules for that file are applied to the firewall.

Finally, you need to tell OpenVPN to run your script whenever it has authorized a connection. Execute sudo vi /etc/openvpn/server.conf
and add or modify the following line:
learn-address /etc/openvpn/configfirewall.sh
Restart OpenVPN with sudo /etc/init.d/openvpn restart
and you’re done! Your OpenVPN server now not only authorizes users but also locks down the firewall to allow only those services for which a user has permission.
Finally, here are samples of all of the files used in this tutorial.
Published by marco on 27. Oct 2009 16:52:19 (GMT-5)
Updated by marco on 27. Oct 2009 16:52:51 (GMT-5)
Encodo Systems AG [1] has released Quino 1.1.0.0 to licensed customers; test licenses are available on request. Feel free to contact them at “info [at] encodo [dot] ch”. Read the Quino Fact Sheet for an in-depth overview.
Big, new features include:
More information is available at the Quino home page, including the original Metadata in Software Development paper as well as the aforementioned Quino Fact Sheet (excerpted below).
“What is Quino? Quino is a metadata framework written in C# 3.5. How does a metadata framework differ from an application framework? Application frameworks generally dictate much of the infrastructure of an application. An application can extend the framework only if it offers some way of extending it; if not, the application developer is often left without help at all. If a developer wants to extend or improve the user interface, for example, they have to work within the bounds defined by the user interface support provided by the application framework.
“Quino, being a metadata framework, is different. The philosophy behind Quino is that metadata is great. Metadata enables generic programming and allows a developer to write or generate entire swathes of your application with very little effort—and great results—and therefore free up precious time to fine-tune the parts of the application that make it really stand out. It’s about spending time on the stuff that really matters instead of down in the trenches, connecting to databases, marshalling objects or painstakingly placing controls on forms or web pages.”
Published by marco on 19. Oct 2009 21:45:58 (GMT-5)
Version 1.5.2 of the Encodo C# Handbook is now available for download. It includes the following updates:
It’s also available for download at the MSDN Code Gallery.
Published by marco on 18. Oct 2009 15:59:43 (GMT-5)
Updated by marco on 19. Oct 2009 07:13:11 (GMT-5)
DSL is a buzzword that’s been around for a while and it stands for [D]omain-[S]pecific [L]anguage. That is, some tasks or “domains” are better described with their own language rather than using the same language for everything. This gives a name to what is actually already a standard practice: every time a program assumes a particular format for an input string (e.g. CSV or configuration files), it is using a DSL. On the surface, it’s extremely logical to use a syntax and semantics most appropriate to the task at hand; it would be hard to argue with that. However, that’s assuming that there are no hidden downsides.
And the downsides are not inconsequential. As an example, let’s look at the DSL “Linq”, which arrived with C# 3.5. What’s the problem with Linq? Well, nothing, actually, but only because a lot of work went into avoiding the drawbacks of DSLs. Linq was written by Microsoft and they shipped it at the same time as they shipped a new IDE—Visual Studio 2008—which basically upgraded Visual Studio 2005 in order to support Linq. All of the tools to which .NET developers have become accustomed worked seamlessly with Linq.
However, it took a little while before JetBrains released a version of ReSharper that understood Linq…and that right there is the nub of the problem. Developer tools need to understand a DSL or you might as well just write it in Notepad. [1] The bar for integration into an IDE is quite high: developers expect a lot these days, including:
What sounds, on the surface, like a slam-dunk of an idea, suddenly sounds like a helluva lot more work than just defining a little language [3]. That’s why Encodo decided early on to just use C# for everything in its Quino framework, wherever possible. The main part of a Quino application is its metadata, or the model definition. However, instead of coming up with a language for defining the metadata, Encodo lets the developer define the metadata using a .NET-API, which gives that developer the full power of code-completion, ReSharper and whatever other goodies they may have installed to help them get their jobs done.
Deciding to use C# for APIs doesn’t mean, however, that your job is done quickly: you still have to design an API that not only works, but is intuitive enough to let developers use it with as little error and confusion as possible.
I recently extended the API for building metadata to include being able to group other metadata into hierarchies called “layouts”. Though the API is implementation-agnostic, its primary use will initially be to determine how the properties of a meta-class are laid out in a form. That is, most applications will want to have more control over the appearance than simply displaying the properties of a meta-class in a form from first-to-last, one to a line.
In the metadata itself, a layout is a group of other elements; an element can be a meta-property or another group. A group can have a caption. Essentially, it should look like this when displayed (groups are surrounded by []; elements with <>):
[MainTab]
-----------------------------------
| <Company>
| [MainFieldSet]
| --------------------------------
| | <Contact>
| | [ <FirstName> <LastName> ]
| | <Picture>
| | <Birthdate>
| --------------------------------
| [ <IsEmployee> <Active> ]
-----------------------------------
From the example above, we can extract the following requirements:
One way of constructing this in a traditional programming language like C# is to create a new group when needed, using a constructor with a caption or not, as needed. However, I also wanted to make a DSL, which has as little cruft as possible; that is, I wanted to avoid redundant parameters and unnecessary constructors. I also wanted to avoid forcing the developer to provide direct references to meta-property elements where it would be more comfortable to just use the name of the property instead.
To that end, I decided to avoid making the developer create or necessarily provide the actual destination objects (i.e. the groups and elements); instead, I would build a parallel set of throwaway objects that the developer would either implicitly or explicitly create. The back-end could then use those objects to resolve references to elements and create the target object-graph with proper error-checking and so on. This approach also avoids getting the target metadata “dirty” with properties or methods that are only needed during this particular style of construction.
I started by writing some code in C# that I thought was both concise enough and offered visual hints to indicate what was being built. That is, I used whitespace to indicate grouping of elements, exactly as in the diagram from the requirements above.
Here’s a simple example, with very little grouping:
builder.AddLayout(
personClass, "Basic",
Person.Relations.Contact,
new LayoutGroup(Person.Fields.FirstName, Person.Fields.LastName),
Person.Fields.Picture,
Person.Fields.Birthdate,
new LayoutGroup(Person.Fields.IsEmployee, Person.Fields.Active)
);
The code above creates a new “layout” for the class personClass
named “Basic”. That takes care of the first two parameters; the much larger final parameter is an open array of elements. These are primarily the names of properties to include from personClass
(or they could also be the properties themselves). In order to indicate that two properties are on the same line, the developer must group them using a LayoutGroup
object.
Here’s a more complex sample, with nested groups (this one corresponds to the original requirement from above):
builder.AddLayout(
personClass, "Details",
new LayoutGroup("MainTab",
Person.Relations.Company,
new LayoutGroup("MainFieldSet",
Person.Relations.Contact,
new LayoutGroup(Person.Fields.FirstName, Person.Fields.LastName),
Person.Fields.Picture,
Person.Fields.Birthdate
),
new LayoutGroup(Person.Fields.IsEmployee, Person.Fields.Active)
)
);
In this example, we see that the developer can also use a LayoutGroup
to attach a caption to a group of other items, but that otherwise everything pretty much stays the same as in the simpler example.
Finally, a developer should also be able to refer to other layout definitions in order to avoid repeating code (adhering to the D.R.Y. principle [4]). Here’s the previous example redefined using a reference to another layout (highlighted):
builder.AddLayout(
personClass, "Basic",
Person.Relations.Contact,
new LayoutGroup(Person.Fields.FirstName, Person.Fields.LastName),
Person.Fields.Picture,
Person.Fields.Birthdate
);
builder.AddLayout(
personClass, "Details",
new LayoutGroup("MainTab",
Person.Relations.Company,
new LayoutGroup("MainFieldSet",
new LayoutReference("Basic")
),
new LayoutItems(Person.Fields.IsEmployee, Person.Fields.Active)
)
);
Now that I had an API I thought was good enough to use, I had to figure out how to get the C# compiler to not only accept it, but also to give me the opportunity to build the actual target metadata I wanted.
The trick ended up being to define a few objects for the different possibilities—groups, elements, references, etc.—and make them implicitly convert to a basic LayoutItem
. Using implicit operators allowed me to even convert strings to meta-property references, like this:
public static implicit operator LayoutItem(string identifier)
{
return new LayoutItem(identifier);
}
Each of these items has a reference to each possible type of data and a flag to indicate which of these data are valid and can be extracted from this item. The builder receives a list of such items, each of which may have a sub-list of other items. Processing the list is now as simple as iterating them with foreach
, something like this:
private void ProcessItems(IMetaGroup group, IMetaClass metaClass, LayoutItem[] items)
{
foreach (var item in items)
{
if (!String.IsNullOrEmpty(item.Identifier))
{
var element = metaClass.Properties[item.Identifier];
group.Elements.Add(element);
}
else if (item.Items != null)
{
var subGroup = CreateNextSubGroup(group);
group.Elements.Add(subGroup);
ProcessItems(subGroup, metaClass, item.Items.Items);
}
else if (item.Group != null)
{
…
}
else (…)
}
}
If the item was created from a string, the builder looks up the property to which it refers in the meta-class and adds that to the current group. If the item corresponds to an anonymous group, the builder creates a new group and adds the items to it recursively. Here we can see how this solution spares the application developer the work of looking up each and every referenced property in application code. Instead, the developer’s code stays clean and short.
Naturally, my solution has many more cases but the sample above should suffice to show how the full solution works.
The story didn’t just end there, as there are limitations to forcing C# to do everything we’d like. The primary problem came from distinguishing the string that is the caption from the strings that are references to meta-properties. To avoid this problem, I was forced to introduce a LayoutItems class for anonymous groups and reserve LayoutGroup for groups with captions.
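The constructor shapes below are assumptions that illustrate the distinction; the article doesn't show the actual declarations:
class LayoutGroup
{
    // The first string is always the caption; everything else is content.
    public LayoutGroup(string caption, params LayoutItem[] items)
    {
        // ...
    }
}

class LayoutItems
{
    // No caption: any string among the items is a reference to a meta-property.
    public LayoutItems(params LayoutItem[] items)
    {
        // ...
    }
}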
I was not able to get the implementation to support my requirements exactly as I’d designed them, but it ended up being pretty close. Below is the first example from the requirements, but changed to accommodate the final API; all changes are highlighted.
builder.AddLayout(
personClass, "Details",
new LayoutGroup("MainTab", new LayoutItems(
Person.Relations.Company,
new LayoutGroup("MainFieldSet", new LayoutItems(
Person.Relations.Contact,
new LayoutItems(Person.Fields.FirstName, Person.Fields.LastName),
Person.Fields.Picture,
Person.Fields.Birthdate
)),
new LayoutItems(Person.Fields.IsEmployee, Person.Fields.Active)
))
);
All in all, I’m pretty happy with how things turned out: the API is clear enough that the developer should be able to both visually debug the layouts and easily adjust them to accommodate changes. For example, it’s quite obvious how to add a new property to a group, move a property to another line or put several properties on the same line. Defining this pseudo-DSL in C# lets the developer use code-completion, popup documentation and the full power of ReSharper and frees me from having to either write or maintain a parser or development tools for a DSL.
Published by marco on 18. Oct 2009 13:24:58 (GMT-5)
A usable API doesn’t usually spring forth in its entirety on the first try. A good, usable API generally arises iteratively, improving over time. Naturally, when using words like good and usable, I’m obliged to define what exactly I mean by that. Here are the guidelines I use when designing an API, in decreasing order of importance:
Using those guidelines, I designed an API to manage bits and sets of bits in C#. Having spent a lot of time using Delphi Pascal, I’d become accustomed to set and bit operations with static typing. In C#, the .Net framework provides the Set<T> generic type, but that seems like overkill when the whole idea behind using bits is to use less space. That means using enumerated types and the FlagsAttribute
; however, there are some drawbacks to using the native bit-operations directly in code:
To demonstrate, here is a sample:
[Flags]
enum TestValues
{
None = 0,
One = 1,
Two = 2,
Three = 4,
Four = 8,
All = 15,
}
// Set bits one and two:
var bitsOneAndTwo = TestValues.One | TestValues.Two;
// Remove bit two :
var bitOneOnly = bitsOneAndTwo & ~TestValues.Two;
// Testing for bit two:
if ((bitsOneAndTwo & TestValues.Two) == TestValues.Two)
{
…
}
As you can see in the example above, setting a bit is reasonably intuitive (though it’s understandable to get confused about using |
instead of &
to combine bits). Removing a bit is more esoteric, as the combination of &
with the ~
(inverse) operator is easily forgotten if not often used. Testing for a bit is quite verbose and extending to testing for one of several flags even more so.
Therefore, to make things easier, I decided to make some extension methods for these various functions and ended up with something like the following:
public static void Include<T>(this T flags, T value) { … }
public static void Exclude<T>(this T flags, T value) { … }
public static bool In<T>(this T flags, T value) { … }
public static void ForEachFlag<T>(this T flags, Action<T> action) { … }
These definitions compiled and worked as expected, but had the following major drawbacks:
The methods were intended only for enum values, but code completion was offering them for all objects because there was no generic constraint on T.
The ForEachFlag() function was implemented as a lambda when it is clearly an iteration. Using a lambda makes it impossible to use break or continue with this method.

This version, although it worked, broke several of the rules outlined above; namely: while it did offer compile-time checking, the implementation had a lot of repetition in it and the iteration did not make use of the common library enumeration support (IEnumerable
and foreach
). That the operations were available for all objects and polluted code-completion only added insult to injury.
A natural solution to the namespace-pollution problem is to add a generic constraint to the methods, restricting the operations to objects of type Enum
, as follows:
public static void Include<T>(this T flags, T value)
where T : Enum
{ … }
public static void Exclude<T>(this T flags, T value)
where T : Enum
{ … }
public static bool In<T>(this T flags, T value)
where T : Enum
{ … }
public static void ForEachFlag<T>(this T flags, Action<T> action)
where T : Enum
{ … }
.NET enum
-declarations, however, do not inherit from Enum
; instead, they inherit from Int32
, by default, but can also inherit from a handful of other base types (e.g. byte
, Int16
). This makes sense so that enum
-values can be freely converted to and from these base values. Not only will a generic constraint as defined above not have the intended effect, it’s explicitly disallowed by the compiler. So, that’s a dead-end.
The other, more obvious way of restricting the target type of an extension method is to change the type of the first parameter from T
to something else. However, since enum
types don’t inherit from Enum
, what type do we use? Well, it turns out that Enum
is a strange type, indeed. It can’t be used in a generic constraint and does not serve as the base class for enumerated types but, when used as the target of an extension method, it magically applies to all enumerated types!
I took advantage of this loophole to build the next version of the API, as follows:
public static void Include<T>(this Enum flags, T value) { … }
public static void Exclude<T>(this Enum flags, T value) { … }
public static bool In<T>(this Enum flags, T value) { … }
public static void ForEachFlag<T>(this Enum flags, Action<T> action) { … }
This version had two advantages over the first version:
The methods now appeared in code-completion only for enumerated types, rather than for all objects.
The implementation could use the Enum.GetTypeCode() method instead of the is and as-operators to figure out the type and cast the input accordingly.

After using this version for a little while, it became obvious that there were still problems with the implementation:
Though using Enum as the target type of the extension method was a clever solution, it turns out to be a huge violation of the first design-principle outlined above: the type T for the other parameters is not guaranteed to conform to Enum. That is, the compiler cannot statically verify that the bit being checked (value) is of the same type as the bit-set (flags).
The operations are only available for Enum objects, where they would also be appropriate for Int32 and Int64 objects and so on.
The ForEach method still has the same problems it had in the first version; namely, that it doesn’t allow the use of break and continue and therefore violates the second design-principle above.

A little more investigation showed that the Enum.GetTypeCode()
method is not unique to Enum
but implements a method initially defined in the IConvertible
interface. And, as luck would have it, this interface is implemented not only by the Enum
class, but also by Int32
, Int64
and all of the other types to which we would like to apply bit- and set-operations.
Knowing that, we can hope that the third time’s a charm and redesign the API once again, as follows:
public static void Include<T>(this T flags, T value)
where T : IConvertible
{ … }
public static void Exclude<T>(this T flags, T value)
where T : IConvertible
{ … }
public static bool In<T>(this T flags, T value)
where T : IConvertible
{ … }
public static void ForEachFlag<T>(this T flags, Action<T> action)
where T : IConvertible
{ … }
Now we have methods that apply only to those types that support set- and bit-operations (more or less [1]). Not only that, but the value and action arguments are once again guaranteed to be statically compliant with the flags
arguments.
With two of the drawbacks eliminated with one change, we converted the ForEachFlag
method to return an IEnumerable<T>
instead, as follows:
public static IEnumerable<T> GetEnabledFlags<T>(this T flags)
where T : IConvertible
{ … }
The result of this method can now be used with foreach
and works with break
and continue
, as expected. Since the method also now applies to non-enumerated types, we had to re-implement it to return the set of possible bits for the type instead of simply iterating the possible enumerated values returned by Enum.GetValues()
. [2]
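The implementation itself isn’t reproduced here; purely as a sketch of the idea, an iteration over individual bits might look something like the following (the 64-bit normalization and the culture handling are assumptions of this sketch, not details of the original BitTools):
public static IEnumerable<T> GetEnabledFlags<T>(this T flags)
  where T : IConvertible
{
  // Normalize to a 64-bit value; unsigned values with the high bit set are ignored here.
  var bits = Convert.ToInt64(flags, CultureInfo.InvariantCulture);
  for (var index = 0; index < 64; index++)
  {
    var mask = 1L << index;
    if ((bits & mask) != 0)
    {
      // Convert the single set bit back to the original base type; unboxing the result
      // as T also works for enumerated types with a matching underlying type.
      yield return (T)Convert.ChangeType(mask, flags.GetTypeCode(), CultureInfo.InvariantCulture);
    }
  }
}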
This version satisfies the first design principles (statically-typed, standard practice, elegant) relatively well, but is still forced to make concessions in implementation and CLS-compliance. It turns out that the IConvertible
interface is somehow not CLS-compliant, so I was forced to mark the whole class as non-compliant. On the implementation side, I was avoiding the rather clumsy is
-operator by using the IConvertible.GetTypeCode()
method, but still had a lot of repeated code, as shown below in a sample from the implementation of Is
:
switch (flags.GetTypeCode())
{
case TypeCode.Byte:
return (byte)(object)flags == (byte)(object)value;
case TypeCode.Int32:
return (int)(object)flags == (int)(object)value;
…
}
Unfortunately, bit-testing is so low-level that there is no (obvious) way to refine this implementation further. In order to compare the two convertible values, the compiler must be told the exact base type to use, which requires an explicit cast for each supported type, as shown above. Luckily, this limitation is in the implementation, which affects the maintainer and not the user of the API.
Since implementing the third version of these “BitTools”, I’ve added support for Is
(shown partially above), Has
, and HasOneOf, and it looks like the third time might indeed be the charm, as the saying goes.
[1] The IConvertible interface is actually implemented by other types, to which our bit-operations don’t apply at all, like double, bool and so on. The .NET library doesn’t provide a more specific interface—like “INumeric” or “IIntegralType”—so we’re stuck constraining to IConvertible instead.

[2] Which, coincidentally, fixed a bug in the first and second versions that had returned all detected enumerated values—including combinations—instead of individual bits. For example, given the type shown below, we only ever expect values One and Two, and never None, OneOrTwo or All.
[Flags]
enum TestValues
{
None = 0,
One = 1,
Two = 2,
OneOrTwo = 3,
All = 3,
}
That is, foreach (var flag in TestValues.Two.GetEnabledFlags()) { … } should return only Two and foreach (var flag in TestValues.All.GetEnabledFlags()) { … } should return One and Two.
Published by marco on 17. Oct 2009 22:56:11 (GMT-5)
Updated by marco on 26. Oct 2021 12:17:00 (GMT-5)
C# 3.5 has a limitation where generic classes don’t necessarily conform to each other in the way that one would expect. This problem manifests itself classically in the following way:
class D { }
class E : D { }
class F : D { }
class Program
{
void ProcessListOfD(IList<D> list) { }
void ProcessListOfE(IList<E> list) { }
void ProcessSequenceOfD(IEnumerable<D> sequence) { }
void ProcessSequenceOfE(IEnumerable<E> sequence) { }
void Main()
{
var eList = new List<E>();
var dList = new List<D>();
ProcessListOfD(dList); // OK
ProcessListOfE(dList); // Compiler error, as expected
ProcessSequenceOfD(dList); // OK
ProcessSequenceOfE(dList); // Compiler error, as expected
ProcessListOfD(eList); // Compiler error, unexpected!
ProcessListOfE(eList); // OK
ProcessSequenceOfD(eList); // Compiler error, unexpected!
ProcessSequenceOfE(eList); // OK
}
}
Why are those two compiler errors unexpected? Why shouldn’t a program be able to provide an IList<E>
where an IList<D>
is expected? Well, that’s where things get a little bit complicated. Whereas at first, it seems that there’s no down side to allowing the assignment—E
can do everything expected of D
, after all—further investigation reveals a potential source of runtime errors.
Expanding on the example above, suppose ProcessListOfD()
were to have the following implementation:
void ProcessListOfD(IList<D> list)
{
if (SomeCondition(list))
{
list.Add(new F());
}
}
With such an implementation, the call to ProcessListOfD(eList), which passes an IList<E>, would cause a runtime error if SomeCondition()
were to return true
. So, the dilemma is that allowing co- and contravariance may result in runtime errors.
A language design includes a balance of features that permit good expressiveness while restricting bad expressiveness. C# has implicit conversions, but requires potentially dangerous conversions to be made explicit with casts. Similarly, the obvious type-compatibility outlined in the first example is forbidden and requires a call to the System.Linq.Enumerable.Cast<T>(this IEnumerable)
method instead. Other languages—most notably Eiffel—have always allowed the logical conformance between generic types, at the risk of runtime errors. [1]
Some of these limitations will be addressed in C# 4.0 with the introduction of covariance. See Covariance and Contravariance (C# and Visual Basic) (MSDN) and LINQ Farm: Covariance and Contravariance in C# 4.0 for more information.
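As a quick sketch (using the classes and methods from the first example; this snippet is not from the article itself), the C# 4.0 declaration IEnumerable<out T> makes the sequence cases compile, while IList<T> stays invariant:
IEnumerable<E> eSequence = new List<E>();
IEnumerable<D> dSequence = eSequence; // OK in C# 4.0: IEnumerable<out T> is covariant
ProcessSequenceOfD(eSequence); // also OK, no conversion required
IList<D> anotherList = new List<E>(); // still a compile error: IList<T> remains invariant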
Until then, there’s the aforementioned System.Linq.Enumerable.Cast<T>(this IEnumerable)
method in the system library. However, that method, while very convenient, makes no effort to statically verify that the input and output types are compatible with one another. That is, a call such as the following is perfectly legal:
var numbers = new [] { 1, 2, 3, 4, 5 };
var objects = numbers.Cast<object>(); // OK
var strings = numbers.Cast<string>(); // runtime error (when enumerated)!
Instead of an unchecked cast, a method with a generic constraint on the input and output types would be much more appropriate in those situations where the program is simply avoiding the generic-typing limitation described in detail in the first section. The method below does the trick:
public static IEnumerable<TOutput> Convert<TInput, TOutput>(this IEnumerable<TInput> input)
where TInput : TOutput
{
if (input == null) { throw new ArgumentNullException("input"); }
if (input is IList<TOutput>) { return (IList<TOutput>)input; }
return input.Select(obj => (TOutput)(object)obj);
}
While it’s entirely possible that the Cast()
function from the Linq library is more highly optimized, it’s not as safe as the method above. A check with Redgate’s Reflector would probably reveal just how that method actually works. Correctness come before performance, but YMMV. [2]
The initial examples can now be rewritten to compile without casting:
ProcessListOfD(eList.Convert<E, D>().ToList()); // OK (materialized into a list)
ProcessListOfE(eList); // OK
ProcessSequenceOfD(eList.Convert<E, D>()); // OK
ProcessSequenceOfE(eList); // OK
Unlike the Enumerable.Cast<TOutput>()
method, which has no restrictions and can be used on any IEnumerable
, there will be places where the compiler will not allow an application to use Convert<TOutput>()
. This is because the generic constraint to which TOutput
must conform (TInput
) is, in some cases, not statically provable (i.e. at compile-time). A concrete example is shown below:
abstract class A
{
public abstract IList<TResult> GetObject<TResult>();
}
class B<T> : A
{
public override IList<TResult> GetObject<TResult>()
{
return _objects.Convert<T, TResult>(); // Compile error!
}
private IList<T> _objects;
}
The example above does not compile because TResult
does not provably conform to T
. A generic constraint on TResult
cannot be applied because it would have to be applied to the original, abstract function, which knows nothing of T
. In these cases, the application will be forced to use the System.Linq.Enumerable.Cast<T>(this IEnumerable)
method instead.
[1] … recast definition. Similarly, another runtime plague—null-references—is also addressed in Eiffel, a feature extensively documented in the paper, Attached types and their application to three open problems of object-oriented programming.
Published by marco on 15. Oct 2009 23:18:21 (GMT-5)
Updated by marco on 16. Oct 2009 09:35:31 (GMT-5)
Fluent interfaces—or “method chaining” as it’s also called—provide an elegant API for configuring objects. For example, the Quino query API provides methods to restrict (Where
or WhereEquals
), order (OrderBy
), join (Join
) and project (Select
) data. The first version of this API was very traditional and applications typically contained code like the following:
var query = new Query(Person.Metadata);
query.WhereEquals(Person.Fields.Name, "Müller");
query.WhereEquals(Person.Fields.FirstName, "Hans");
query.OrderBy(Person.Fields.LastName, SortDirection.Ascending);
query.OrderBy(Person.Fields.FirstName, SortDirection.Ascending);
var contactsTable = query.Join(Person.Relations.ContactInfo);
contactsTable.Where(ContactInfo.Fields.Street, ExpressionOperator.EndsWithCI, "Strasse");
(This example gets all people named “Hans Müller” that live on a street with a name that ends in “Strasse” (case-insensitive) sorted by last name, then first name. Fields
and Relations
refer to constants generated from the Quino metadata model.)
The syntax above is very declarative and relatively easy-to-follow, but is a bit wordy. It would be nice to be able to chain together all of these calls and remove the repeated references to query
. The local variable contactsTable
also seems kind of superfluous here (it is only used once).
A fluent version of the query definition looks like this:
var query = new Query(Person.Metadata);
query.WhereEquals(Person.Fields.Name, "Müller")
.WhereEquals(Person.Fields.FirstName, "Hans")
.OrderBy(Person.Fields.LastName, SortDirection.Ascending)
.OrderBy(Person.Fields.FirstName, SortDirection.Ascending)
.Join(Person.Relations.ContactInfo)
.Where(ContactInfo.Fields.Street, ExpressionOperator.EndsWithCI, "Strasse");
The example uses indenting to indicate that the restriction after the join on the “ContactInfo” table applies to the “ContactInfo” table instead of to the “Person” table. The call to Join
logically returns a reference to the joined table instead of the query itself. However, each such table also has a Query
property that refers to the original query. Applications can use this to “jump” back up and apply more joins, as shown in the example below where the query only returns a person if he or she also works in the London office:
var query = new Query(Person.Metadata);
query.WhereEquals(Person.Fields.Name, "Müller")
.WhereEquals(Person.Fields.FirstName, "Hans")
.OrderBy(Person.Fields.LastName, SortDirection.Ascending)
.OrderBy(Person.Fields.FirstName, SortDirection.Ascending)
.Join(Person.Relations.ContactInfo)
.Where(ContactInfo.Fields.Street, ExpressionOperator.EndsWithCI, "Strasse").Query
.Join(Person.Relations.Office)
.WhereEquals(Office.Fields.Name, "London");
A final example shows how even complex queries over multiple table levels can be chained together into one single call. The following example joins on the “ContactInfo” table to dig even deeper into the data by restricting to people whose web sites are owned by people with at least 10 years of experience:
var query = new Query(Person.Metadata);
query.WhereEquals(Person.Fields.Name, "Müller")
.WhereEquals(Person.Fields.FirstName, "Hans")
.OrderBy(Person.Fields.LastName, SortDirection.Ascending)
.OrderBy(Person.Fields.FirstName, SortDirection.Ascending)
.Join(Person.Relations.ContactInfo)
.Where(ContactInfo.Fields.Street, ExpressionOperator.EndsWithCI, "Strasse")
.Join(ContactInfo.Relations.WebSite)
.Join(WebSite.Relations.Owner)
.Where(Owner.Fields.YearsExperience, ExpressionOperator.GreaterThan, 10).Query
.Join(Person.Relations.Office)
.WhereEquals(Office.Fields.Name, "London");
This API might still be a bit too wordy for some (.NET 3.5 Linq would be less wordy), but it’s refactoring-friendly and it’s crystal-clear what’s going on.
When there’s only one class involved, it’s not that hard to conceive of how this API is implemented: each method just returns a reference to this
when it has finished modifying the query. For example, the WhereEquals
method would look like this:
IQuery WhereEquals(IMetaProperty prop, object value)
{
Where(CreateExpression(prop, value));
return this;
}
This isn’t rocket science and the job is quickly done.
However, what if things in the inheritance hierarchy aren’t that simple? What if, for reasons known to the Quino framework architects, IQuery
actually inherits from IQueryCondition
, which defines all of the restriction and ordering operations? The IQuery
provides projection and joining operations, which can easily just return this
, but what type should the operations in IQueryCondition
return?
The problem area is indicated with question marks in the example below:
public interface IQueryCondition
{
??? WhereEquals(IMetaProperty prop, object value);
}
public interface IQueryTable : IQueryCondition
{
IQueryTable Join(IMetaRelation relation);
}
public interface IQuery : IQueryTable
{
IQueryTable SelectDefaultForAllTables();
}
The IQueryCondition
can’t simply return IQueryTable
because it might be used elsewhere [1], but it can’t return IQueryCondition
either: the table couldn’t perform a join after a restriction, because applying the restriction would have narrowed the fluent interface to an IQueryCondition
instead of an IQueryTable
.
The solution is to make IQueryCondition
generic and pass it the type that it should return instead of hard-coding it.
public interface IQueryCondition<TSelf>
{
TSelf WhereEquals(IMetaProperty prop, object value);
}
public interface IQueryTable : IQueryCondition<IQueryTable>
{
IQueryTable Join(IMetaRelation relation);
}
public interface IQuery : IQueryTable
{
IQueryTable SelectDefaultForAllTables();
}
That takes care of the interfaces, on to the implementation. The standard implementation runs into a small problem when returning the generic type:
public class QueryCondition<TSelf> : IQueryCondition<TSelf>
{
TSelf WhereEquals(IMetaProperty prop, object value)
{
// Apply restriction
return (TSelf)this; // causes a compile error
}
}
public class QueryTable : QueryCondition<IQueryTable>, IQueryTable
{
IQueryTable Join(IMetaRelation relation)
{
// Perform the join
return result;
}
}
public class Query : IQuery
{
IQueryTable SelectDefaultForAllTables()
{
// Perform the select
return this;
}
}
One simple solution to the problem is to cast down to object
and back up to TSelf
, but this is pretty bad practice as it short-circuits the static checker in the compiler and defers the problem to a potential runtime one.
public class QueryCondition<TSelf> : IQueryCondition<TSelf>
{
TSelf WhereEquals(IMetaProperty prop, object value)
{
// Apply restriction
return (TSelf)(object)this;
}
}
In this case, it’s guaranteed by the implementation that this
is compliant with TSelf
, but it would be even better to solve the problem without resorting to the double-cast above. As it turns out, there is a simple and quite elegant solution, using an abstract method called ThisAsTSelf
, as illustrated below:
public abstract class QueryCondition<TSelf> : IQueryCondition<TSelf>
{
TSelf WhereEquals(IMetaProperty prop, object value)
{
// Apply restriction
return ThisAsTSelf();
}
protected abstract TSelf ThisAsTSelf();
}
public class Query : QueryCondition<IQueryTable>, IQuery
{
protected override IQueryTable ThisAsTSelf()
{
return this;
}
}
The compiler is now happy without a single cast at all because Query
returns this
, which the compiler knows conforms to TSelf
. The power of a fluent API is now at your disposal without restricting inheritance hierarchies or making end-runs around the compiler. Naturally, the concept extends to multiple levels of inheritance (e.g. if all calls had to return IQuery
instead of IQueryTable
), but it gets much uglier, as it requires nested generic types in the return types, which makes it much more difficult to understand. With a single level, as in the example above, the complexity is still relatively low and the resulting API is very powerful.
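For the curious, here is a rough sketch of what the multi-level variant alluded to above might look like (illustrative only, not the actual Quino interfaces): the table interface itself must become generic so that every call can keep returning the most-derived type.
public interface IQueryCondition<TSelf>
{
  TSelf WhereEquals(IMetaProperty prop, object value);
}
public interface IQueryTable<TSelf> : IQueryCondition<TSelf>
{
  TSelf Join(IMetaRelation relation);
}
public interface IQuery : IQueryTable<IQuery>
{
  IQuery SelectDefaultForAllTables();
}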
[1] … IQueryJoinCondition.
Published by marco on 26. Jun 2009 00:04:09 (GMT-5)
Updated by marco on 26. Jun 2009 08:51:26 (GMT-5)
The following is an analysis and brainstorming of a problem in generalized database browser GUIs, like those generated by the Quino metadata framework.
Let’s start with the user story that generated this idea:
“A user was entering data using our database software and complained of losing data. After verifying that the lost data was not due to an obvious software bug, we determined that it was because of how she was assuming the software worked. That is, she would use the application to browse to the location where she wanted to add data, create a new object, fill out its fields, then save it. For each subsequent object, she simply filled out the form again and clicked save to save it.”
Now, if you know the paradigm of Quino applications – and, indeed, most modern GUI applications – you’re going to see the problem: instead of creating a new entry every time she clicked save, she was simply saving the current object, then editing that same object, then overwriting the previous changes. After filling out the details for dozens of objects, she had only one object saved in the database.
One fix is to improve her training so that she knows how to create multiple objects with a Quino application. That is what we did, so that she could continue with data-entry and get her work done with the current software. A better workflow for entering new records is to select “new”, fill in the data, then select “new” again to store the existing entry and create a blank form for the next entry. A few problems are immediately obvious:
Anyway, once we’d gotten her squared away, we huddled back at Encodo headquarters and asked ourselves how we could avoid similar problems in the future. We agreed that it was a difficult problem and had to break up after a bit to attend to more pressing matters. The problem continued to swirl around in our collective subconscious, though.
In describing the problem to another, non-technical person with a fair amount of computer experience, the following ideas came up.
As mentioned above, when we create a new object, the data entry form is empty, save for a few default values (set from the model) and the parent object on which the object is being created. However, the user might want to do one of several other things when creating a new object:
In the case of “Cloning” and “Templating”, we run into the danger that cropped up in the user story above; namely, that the form is in the same place as the object being cloned or the last object displayed, but it is now showing either an exact copy or a partially filled object instead. An object that is new and unsaved. How can we let the user know that this object is new and unsaved and, conversely, how can we let the user know that when they are making edits to an existing object, they are not saving a new object, but modifying data, which includes replacing existing information with new information?
One way to handle this problem is to leave the GUI as it is, but to use color or decal hinting to let the user know the object state. We could do this in several ways:
Another way to handle the problem is to separate the tasks of editing existing objects and creating new ones. A paradigm to which users are well-accustomed is the dialog box “Ok/Cancel” one. Open a dialog, fill in the data and click ok to save it or cancel to abort. The way the Quino data browser works right now is that Ok and Cancel manifest as “Save” and “Revert” in the toolbar (which is not so clearly connected to the object being edited or created). This is not really that intuitive, especially when considering that editable objects can be nested.
The concept of navigating to an object and editing it in place is a good one, and one which seems to cause little trouble for users. What if, however, we were to change the data-entry mode to use a separate window instead? Instead of simply loading the new, empty object into the panel where existing objects are edited, we open a modal dialog showing the new, empty object. There is little room for error as the user must select “Ok” or “Cancel” to exit the dialog, making an explicit choice to save or discard the new object. The dialog cannot be closed with “Ok” unless the object validates successfully. When the object is saved and the dialog closes, the form from which the dialog was opened is focused on the new object, which appears in the browser in the tree(s) and/or list(s) where it belongs.
The feature above is quite a bare-bones approach and we can do much better. For example, we could offer the following improvements:
It seems that, with such an approach, Quino would offer a much more streamlined and intuitive method of mass or single data entry with far less of a chance of users getting confused by the combination of the global toolbar, auto-saving and the mix of browsing and data-entry.
Published by marco on 21. Jun 2009 14:10:45 (GMT-5)
After what seems like an eternity, a mainstream programming language will finally dip its toe in the Design-by-contract (DBC) pool. DBC is a domain amply covered in one less well-known language called Eiffel (see ISE Eiffel Goes Open-Source for a good overview), where preconditions, postconditions and invariants of various stripes have been available for over twenty years.
Object-oriented languages already include contracts; “classic” signature-checking involves verification of parameter counts and type-conformance. DBC generally means extending this mechanism to include assertions on a higher semantic level. A method’s signature describes the obligations calling code must fulfill in order to execute the method. The degree of enforcement varies from language to language. Statically-typed languages verify types according to conformance at compile-time, whereas dynamically-typed languages do so at run-time. Even the level of conformance-checking differs from language to language, with statically-typed languages requiring hierarchical conformance via ancestors and dynamically-typed languages verifying signatures via duck-typing.
And that’s only for individual methods; methods are typically collected into classes that also have a semantic meaning. DBC is about being able to specify the semantics of a class (e.g. can property A
ever be false
when property B
is true
?) as well as those of method parameters (can parameter a
ever be null
?) using the same programming language.
DBC is relatively tedious to employ without framework or language support. Generally, this takes the form of using Debug.Assert
[1] at the start of a method call to verify arguments, throwing ArgumentExceptions
when the caller did not satisfy the contract. Post-conditions can also be added in a similar fashion, at the end of the function. Naturally, without library support, post-conditions must be added before any return
-statements or enclosed in an artificial finally
-clause around the rest of the method body. Class invariants are even more tedious, as they must be checked both at the beginning and end of every single “entering” method call, where the “entering” method call is the first on the given object. A proper implementation may not check the invariant for method calls that an object calls on itself because it’s perfectly all right for an object to be in an invalid state until the “entering” method returns.
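As a concrete illustration of this poor-man’s approach (the Account class is invented for the example), the checks end up interleaved with the method body:
public class Account
{
  private decimal _balance;

  public void Withdraw(decimal amount)
  {
    CheckInvariant(); // class invariant, checked on entering the object
    if (amount <= 0) { throw new ArgumentException("amount must be positive", "amount"); }
    if (amount > _balance) { throw new ArgumentException("amount exceeds the balance", "amount"); }
    var originalBalance = _balance;

    _balance -= amount;

    Debug.Assert(_balance == originalBalance - amount, "postcondition violated");
    CheckInvariant(); // ...and checked again on the way out
  }

  private void CheckInvariant()
  {
    Debug.Assert(_balance >= 0, "invariant violated: the balance may never be negative");
  }
}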
One assertion that arises quite often is that of requiring that a parameter be non-null
in a precondition. An analysis of most code bases that used poor-man’s DBC will probably reveal that the majority of its assertions are of this form. Therefore, it would be nice to handle this class of assertion separately using a language feature that indicates that a particular type can statically never be null
. Eiffel has added this support with a separate notation for denoting “attached” types (types that are guaranteed to be attached to a non-null
reference). Inclusion of such a feature not only improves the so-called “provability” of programs written in that language, it also transforms null-checking contracts to another notation (e.g. in Eiffel, objects are no longer nullable by default and the ?
-operator is used to denote nullability) and removes much of the clutter from the precondition block.
Without explicit language support, a DBC solution couched in terms of assertions and/or exceptions quickly leads to clutter that obscures the actual program logic. Contracts should be easily recognizable as such by both tools and humans. Ideally, the contract can be extracted and included in documentation and code completion tooltips. Eiffel provides such support with separate areas for pre- and post-conditions as well as class invariants. All assertions can be labeled to give them a human-readable name, like “param1_not_null” or “list_contains_at_most_one_element”. The Eiffel tools provide various views on the source code, including what they call the “short” view, showing method signatures and contracts without implementation, as well as the “short flat” view, which is the “short” view, but includes all inherited methods to present the full interface of a type.
Other than Eiffel, no close-to-mainstream programming language [2] has attempted to make the implicit semantics of a class explicit with DBC. Until now. Code Contracts will be included in C# 4.0, which will be released with Visual Studio 2010. It is available today as a separate assembly and compatible with C# 3.5 and Visual Studio 2008, so no upgrade is required to start using it. Given the lack of an upgrade requirement, we can draw the conclusion that this contracting solution is library-only without any special language support.
That does not bode well; as mentioned above, such implementations will be limited in their support of proper DBC. The user documentation provides an extensive overview of the design and proper use of Code Contracts.
There are, as expected, no new keywords or language support for contracts in C# 4.0. That means that tools and programmers will have to rely on convention in order to extract semantic meaning from the contracts. Pre- and postconditions are mixed together at the top of the method call. Post-conditions have support for accessing the method result and original values of arguments. Contracts can refer to fields not visible to other classes and there is an attribute-based hack to make these fields visible via a proxy property.
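For reference, a minimal (invented) method using the library looks something like this; Contract.Requires, Contract.Ensures, Contract.Result and Contract.OldValue are the actual Code Contracts calls:
public int Increment(int value)
{
  Contract.Requires(value < int.MaxValue);
  Contract.Ensures(Contract.Result<int>() == Contract.OldValue(value) + 1);

  return value + 1;
}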
Contracts for abstract classes and interfaces are, simply put, a catastrophe. Since these constructs don’t have method implementations, they can’t contain contracts. Therefore, in order to attach contracts to these constructs—and, to be clear, the mechanism would be no improvement over the current poor-man’s DBC if there was no way to do this—there is a ContractClass
attribute. Attaching contracts to an interface involves making a fake implementation of that interface, adding contracts there, hacking expected results so that it compiles, presumably adding a private constructor so it can’t be instantiated by accident, then referencing it from the interface via the attribute mentioned above. It works, but it’s far from pretty and it moves the contracts far from the place where it would be intuitive to look for them.
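A sketch of that mechanism, with an invented ICalculator interface, looks roughly like the following; [ContractClass] and [ContractClassFor] are the real attributes from System.Diagnostics.Contracts:
[ContractClass(typeof(CalculatorContracts))]
public interface ICalculator
{
  int Divide(int dividend, int divisor);
}

[ContractClassFor(typeof(ICalculator))]
internal abstract class CalculatorContracts : ICalculator
{
  public int Divide(int dividend, int divisor)
  {
    Contract.Requires(divisor != 0);
    return default(int); // dummy result so that the fake implementation compiles
  }
}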
Just as the specification side is not so pretty, the execution side also suffers. Contracts are, at least, inherited, but preconditions cannot be weakened. That is, a sub-type—and implementations of interfaces with contracts are sub-types—cannot add preconditions; end of story. As soon as a type contains at least one contract on one method, all methods in that type without contracts are interpreted as specifying the “empty” contract.
Instead of simply acknowledging that precondition weakening could be a useful feature, the authors state:
“While we could allow a weaker precondition, we have found that the complications of doing so outweigh the benefits. We just haven’t seen any compelling examples where weakening the precondition is useful.”
Let’s have an example, where we want to extend an existing class with support for a fallback mechanism. In the following case we have a transmitter class that sends data over a server; the contracts require that the server be reachable before sending data. The descendant adds support for a second server over which to send, should the first be unreachable. All examples below have trimmed initialization code that guarantees non-null
properties for clarity’s sake. All contracts are included.
class Transmitter
{
public Server Server { get; }
public virtual void SendData(Data data)
{
Contract.Requires(data != null);
Contract.Requires(Server.IsReachable);
Contract.Ensures(data.State == DataState.Sent);
Server.Send(data);
}
[ContractInvariantMethod]
protected void ObjectInvariant()
{
Contract.Invariant(Server != null);
}
}
class TransmitterWithFallback : Transmitter
{
public Server FallbackServer { get; }
public override void SendData(Data data)
{
// *contract violation*
// If "Server" is not reachable, we will never be given
// the opportunity to send using the fallback server
}
[ContractInvariantMethod]
protected void ObjectInvariant()
{
Contract.Invariant(FallbackServer != null);
}
}
We can’t actually implement the fallback without adjusting the original contracts. With access to the code for the base class, we could address this shortcoming by moving the check for server availability to a separate method, as follows:
class Transmitter
{
public Server Server { get; }
[Pure]
public virtual bool ServerIsReachable
{
get { return Server.IsReachable; }
}
public virtual void SendData(Data data)
{
Contract.Requires(data != null);
Contract.Requires(ServerIsReachable);
Contract.Ensures(data.State == DataState.Sent);
Server.Send(data);
}
[ContractInvariantMethod]
protected void ObjectInvariant()
{
Contract.Invariant(Server != null);
}
}
class TransmitterWithFallback : Transmitter
{
public Server FallbackServer { get; }
[Pure]
public override bool ServerIsReachable
{
get { return Server.IsReachable || FallbackServer.IsReachable; }
}
public override void SendData(Data data)
{
if (Server.IsReachable)
{
base.SendData(data);
}
else
{
FallbackServer.Send(data);
}
}
[ContractInvariantMethod]
protected void ObjectInvariant()
{
Contract.Invariant(FallbackServer != null);
}
}
With careful planning in the class that introduces the first contract—where precondition contracts are required to go—we can get around the lack of extensibility of preconditions. Let’s take a look at how Eiffel would address this. In Eiffel, the example above would look something like the following [3]:
class TRANSMITTER
feature
server: SERVER
send_data(data: DATA) is
require
server.reachable
do
server.send(data)
ensure
data.state = DATA_STATE.sent;
end
end
class TRANSMITTER_WITH_FALLBACK
inherit
TRANSMITTER
redefine
send_data
end
feature
fallback_server: SERVER
send_data (data: DATA) is
require else
fallback_server.reachable
do
if server.reachable then
Precursor (data);
else
fallback_server.send(data)
end
end
end
The Eiffel version has clearly separated boundaries between contract code and implementation code. It also did not require a change to the base implementation in order to implement a useful feature. The author of the library has that luxury, whereas users of the library would not and would be forced to use less elegant solutions.
To sum up, it seems that, once again, the feature designers have taken the way out that makes it easier on the compiler, framework and library authors rather than providing a full-featured design-by-contract implementation. It was the same with the initial generics implementation in C#, without co- or contra-variance. The justification at the time was also that “no one really needed it”. C# 4.0 will finally include this essential functionality, belying the original assertion.
The implementation is so easy-to-use that even the documentation leads off by warning that:
“a word of caution: Static code checking or verification is a difficult endeavor. It requires a relatively large effort in terms of writing contracts, determining why a particular property cannot be proven, and finding a way to help the checker see the light. […] If you are still determined to go ahead with contracts […] To not get lost in a sea of warnings […] (emphasis added)”
Not only is that not ringing, that’s not even an endorsement.
Other notes on implementation include:
- … DEBUG builds. This is a ridiculous restriction as null-checks and other preconditions are useful throughout the development process, not just for pre-release testing. Poor-man’s DBC is currently enabled in all builds; a move to MS Contracts with the recommended separate build would remove this support, weakening the development process.

Because the feature is not a proper language extension, the implementation is forced within the bounds of the existing language features. A more promising implementation was Spec#—which extended the C# compiler itself—but there hasn’t been any activity on that project from Microsoft Research in quite some time. There are, however, a lot of interesting papers available there which offer a more developer-friendly insight into the world of design-by-contract than the highly compiler-oriented point-of-view espoused by the Contracts team.
This author will be taking a pass on the initial version of DBC as embodied by Microsoft Contracts.
Published by marco on 19. May 2009 23:07:24 (GMT-5)
When a .NET application exhibits behavior on a remote server that cannot be reproduced locally, you’ll need to debug the application directly on the server. The following article includes specific instructions for debugging ASP.NET applications, but applies just as well to standalone executables.
There are several prerequisites for remote debugging; don’t even bother trying until you have all of the items on the following list squared away or the Remote Debugger will just chortle at your naiveté.
Before you think you can get all fancy and simply debug remotely without authentication, know this: unauthenticated, native debugging does not support breakpoints, so forget it. You’ll technically be able to connect to a running application but, without breakpoints, you’ll only be able to watch any pre-existing debug output appear on the console, if that.
The following ports must be open in order for Remote Debugging to function correctly in all situations:
Protocol  Port  Service Name
TCP       139   File and Printer Sharing
TCP       445   File and Printer Sharing
UDP       137   File and Printer Sharing
UDP       138   File and Printer Sharing
UDP       4500  IPsec (IKE NAT-T)
UDP       500   IPsec (IKE)
TCP       135   RPC Endpoint Mapper and DCOM infrastructure
Additionally, the application “Microsoft Visual Studio 2008” must be in the exceptions list on the client and “Visual Studio Remote Debugging Monitor” must be in the exceptions list on the server.
Once you’ve satisfied the requirements above, you should probably also heed the following tips: it’s best to read about them now rather than learn about them the hard way later:
Here are steps you can follow to debug an application remotely. These steps worked for me, but the remote debugging situation seems to be extremely hit-or-miss, so your mileage may vary.
You’ve set up the server and attached to it so far. If anything has gone wrong, check the troubleshooting section below to see if your problem is addressed there. Now, the next steps are optional if you think you can identify your process without knowing the PID (Process ID). This is generally the case only when yours is the only .NET application deployed to that server. In that case, your process is the “w3wp.exe” process which includes “managed code”. If you don’t know your PID, follow the optional instructions below to figure out which one is yours.
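One way to do this is with a remote WMI query from the client machine; something like the following (the server name is a placeholder) lists the IIS worker processes and their PIDs:
wmic /node:"SERVERNAME" process where "name='w3wp.exe'" get ProcessId,CommandLine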
If that didn’t work, then you probably aren’t configured to query WMI remotely; your only options are to try to run it remotely using the instructions and tips below or to run it from the server.
Once you have the PID in hand, continue:
As you can probably tell from the massive list of prerequisites and recommendations as well as the 20-step guide to triggering a breakpoint, there’s a lot that can go wrong with Remote Debugging. It’s not insurmountable, but it’s not something you’re going to want to attempt unless your job pretty much depends on it. These are some of the errors I encountered along the way and how I addressed them.
You need to create a local administrator with the same password as the one you’re using on the server to run the debugging monitor.
You opened the firewall, but only for computers on the same subnet. The computer to which you are connecting is probably not on the same subnet, so you’ll need to go to the firewall settings and open them up all the way (Visual Studio will not ask again). To edit the firewall settings, do the following:
It’s also possible that the Remote Debugger is being blocked on the server side. To address this, run the “Visual Studio 2008 Remote Debugger Configuration Wizard” again; if the wizard wants to adjust firewall settings, let it do so (for internal or external networks, as appropriate to your situation – if you’re not sure, use external). To make sure that the settings were applied, run the wizard again; it should ask you about running the service, but should no longer complain about the firewall.
If it still complains about the firewall, then you’ve got another problem, which is that the setup is having trouble adjusting the settings for the firewall but isn’t telling you that it’s utterly failing when it attempts to do so. Verify that you’re running the wizard as a user that has permission to adjust the firewall settings.
The user with which you are executing Visual Studio on the client does not exist on the server or has a different password. In order to avoid adding useless user accounts to the server’s domain, you should restart your IDE using “Run as…” to set the security context to the same user as you have on the server.
You can impersonate other users, but you have to set a registry key; see Remote Debugging Under Another User Account for more information. This doesn’t help, though, if the user you are trying to use doesn’t even have an account on the remote machine.
Remote debugging sounds way cool and is the major difference between the Standard and Professional versions of Visual Studio, but it’s not for the faint of heart or the inexperienced. If you Google around a bit, you’ll notice that most people get a big heap of epic fail when they try it, and I’ve tried to make as comprehensive a guide to remote debugging as my own experience and time constraints allowed.
Here’s hoping you never have to do remote debugging (write a test instead! *smile*) but, if you do, I wish you the best of luck.
Published by marco on 18. May 2009 22:54:44 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
At Encodo, we’re using the Microsoft Entity Framework (EF) to map objects to the database. EF treats everything—and I mean everything—as an object; the foreign key fields by which objects are related aren’t even exposed in the generated code. But I’m getting ahead of myself a bit. We wanted to figure out the most elegant way of mapping what we are going to call enumerated associations in EF. These are associations from a source table to a target table where the target table is a lookup value of type int
. That is, the enumerated association could be mapped to a C# enum
instead of an object. We already knew what we wanted the solution to look like, as we’d implemented something similar in Quino, our metadata framework (see below for a description of how that works).
The goals are as follows:
EF encourages—nay, requires—that one develop the application model in the database. A database model consists of tables, fields and relationships between those tables. EF will map those tables, fields and relationships to classes, properties and sub-objects in your C# code. The properties used to map an association—the foreign keys—are not exposed by the Entity Framework and are simply unavailable in the generated code. You can, however, add custom code to your partial classes to expose those values [1]:
return Child.ParentReference.ID;
However, you can’t use those properties with LINQ queries because those extra properties cannot be mapped to the database by EF. Without restrictions or orderings on those properties, they’re as good as useless, so we’ll have to work within EF itself.
Even though EF has already mapped the constraint from the database as a navigational property, let’s add the property to the model as a scalar property anyway. You’ll immediately be reprimanded for mapping the property twice, with something like the following error message:
Since we’re feeling adventurous, we open the XML file directly (instead of inside the designer) and remove the navigational property and association, then add the property to the conceptual model by hand. Now, we’re reprimanded for not having mapped the association EF found in the database, with something like the following error message:
Not giving up yet, we open the model in the designer again and delete the offending foreign key from the diagram. Now, we get something like the following error message:
The list of line numbers indicates where the foreign key we’ve deleted is still being referenced. Despite having used the designer to delete the key, EF has neglected to maintain consistency in the model, so it’s time to re-open the model as XML and delete the remaining references to ‘FOREIGN_KEY_NAME’ manually.
We’re finally in the clear as far as the designer and compiler are concerned, with the constraint defined as we want it in the database and EF exposing the foreign key as an integer—to which we can assign a typecast enum
—instead of an object. This was the goal, so let’s run the application and see what happens.
Everything works as expected and there are no nasty surprises waiting for us at runtime. We’ve got a much more comfortable way of working with the special case of enumerated types working in EF. This special case, arguably, comes up quite a lot; in the model for our application, about half of the tables contain enumerated data, which are used as lookups for reports.
It wasn’t easy and the solution involved switching from designer to XML-file and back a few times [2], but at least it works. However, before we jump for joy that we at least have a solution, let’s pretend we’ve changed our database again and update the model from the database.
Oops.
The EF-Designer has detected the foreign key we so painstakingly deleted and re-established it without asking for so much as a by-your-leave, giving us the error of type 3007 shown above. We’re basically back where we started … and will be whenever anyone changes the database and updates the model automatically. At this point, it seems that the only way to actually expose the foreign key in the EF model is to remove the association from the database! Removing the constraint in the database, however, is unacceptable as that would destroy the relational integrity just to satisfy a crippled object mapper.
In a last-ditch effort, we can fool EF into thinking that the constraint has been dropped not by removing the constraint but by removing the related table from the EF model. That is, once EF no longer maps the destination table—the one containing the enumerated data—it will no longer try to map the constraint, mapping the foreign key as just another integer field.
This solution finally works and the model can be updated from the designer without breaking it—as long as no one re-adds the table with the enumerated data. This is the solution we’ve chosen for all of our lookup data, establishing a second EF-model to hold those tables.
It’s not a beautiful solution, but it works better than the alternative (using objects for everything). Quino, Encodo’s metadata framework includes an ORM that addresses this problem much more elegantly. In Quino, if you have the situation outlined above—a data table with a relation to a lookup table—you define two classes in the metadata, pretty much as you do with EF. However, in Quino, you can specify that one class corresponds to an enumerated type and both the code generator and schema migrator will treat that meta-class accordingly.
EF has a graphical designer, whereas Quino does not, but the designer only gets in the way for the situation outlined above. Quino offers an elegant solution for lookup values with only two lines of code: one to create the lookup class and indicate which C# enum it represents and one to create a property of that type on the target class. The Quino Demo (not yet publicly available) contains an example.
Published by marco on 18. May 2009 21:46:32 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
At Encodo, we currently run Debian Etch on our servers, with a Xen hypervisor managing a bunch of individual virtual machines (VMs). Most of the VMs also run Debian Etch, but one of them runs Windows Server 2003 instead. We use this machine for testing integration with Microsoft technologies like Sharepoint, Exchange and so on. Recently, we had to re-install the Exchange instance on that server and were faced with the problem of having to change the CD without rebooting the VM. Luckily, we found the article, Xen 3.0.3 change cdrom with windows 2003, which cryptically describes how to do this. The instructions describe pressing ctrl+alt+1, but where?
The trick is to realize that they are assuming three things:
Before you do anything, verify that you have made the physical CD/DVD available to the machine, by specifying something like the following in the XEN configuration file for the VM:
disk = [ 'file:/home/xen/domains/burken/disk1.img,ioemu:hda,w', 'phy:/dev/cdrom,hdc:cdrom,r' ]
The first disk (disk1.img
) is a disk image for the system itself; the second disk (hdc:cdrom
) is the physical CD/DVD. Until you see the CD inside the VM, you don’t have to even worry about trying to eject it.
You also need to make sure the VNC port is available, again with a line in the configuration:
vnc=1
If you make any changes to the configuration, you’ll need to restart the VM before you see the effects. Use the additional configuration option called vncpasswd
to lock down the VNC port.
Once you can see the CD within the VM and you can open a connection with the VNC viewer, you’re ready to actually follow the instructions in the post linked above:
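In rough terms, those instructions amount to switching the VNC session over to the QEMU monitor console (the ctrl+alt+number combinations toggle between the guest display and the monitor), ejecting the virtual drive, swapping the physical disc and attaching it again; the monitor commands look something like this, assuming the CD is attached as hdc as in the configuration above:
(qemu) eject hdc
(qemu) change hdc /dev/cdrom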
At this point, you might think you’re done, but the first step is a stumbling block as you don’t actually type ctrl and alt; instead, you select them from the system menu, as illustrated below:
That’s it; you should see the new CD in the VM and you can continue with your installation.
This assumes that /dev/cdrom corresponds to the CD/DVD drive in question.
Published by marco on 16. Apr 2009 13:51:42 (GMT-5)
Updated by marco on 17. Apr 2009 08:41:13 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
A developer on the Microsoft C# compiler team recently made a post asking readers to post their solutions to a programming exercise in Comma Quibbling by Eric Lippert (Fabulous Adventures in Coding). The requirements are as follows:
On top of that, he stipulated “I am particularly interested in solutions which make the semantics of the code very clear to the code maintainer.”
Before doing anything else, let’s nail down the specification above with some tests, using the NUnit testing framework:
[TestFixture]
public class SentenceComposerTests
{
[Test]
public void TestZero()
{
var parts = new string[0];
var result = parts.ConcatenateWithAnd();
Assert.AreEqual("{}", result);
}
[Test]
public void TestOne()
{
var parts = new[] { "one" };
var result = parts.ConcatenateWithAnd();
Assert.AreEqual("{one}", result);
}
[Test]
public void TestTwo()
{
var parts = new[] { "one", "two" };
var result = parts.ConcatenateWithAnd();
Assert.AreEqual("{one and two}", result);
}
[Test]
public void TestThree()
{
var parts = new[] { "one", "two", "three" };
var result = parts.ConcatenateWithAnd();
Assert.AreEqual("{one, two and three}", result);
}
[Test]
public void TestTen()
{
var parts = new[] { "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten" };
var result = parts.ConcatenateWithAnd();
Assert.AreEqual("{one, two, three, four, five, six, seven, eight, nine and ten}", result);
}
}
The tests assume that the method ConcatenateWithAnd()
is declared as an extension method. With the tests written, I figured I’d take a crack at the solution, keeping the last condition foremost in my mind instead of compactness, elegance or cleverness (as often predominate). Instead, I wanted to make the special cases given in the specification as clear as possible in the code. On top of that, I added the following conditions to the implementation:
That said, here’s my version:
public static string ConcatenateWithAnd(this IEnumerable<string> words)
{
using (var enumerator = words.GetEnumerator())
{
if (!enumerator.MoveNext())
{
return "{}";
}
var firstItem = enumerator.Current;
if (!enumerator.MoveNext())
{
return "{" + firstItem + "}";
}
var secondItem = enumerator.Current;
if (!enumerator.MoveNext())
{
return "{" + firstItem + " and " + secondItem + "}";
}
var builder = new StringBuilder("{");
builder.Append(firstItem);
builder.Append(", ");
builder.Append(secondItem);
var item = enumerator.Current;
while (enumerator.MoveNext())
{
builder.Append(", ");
builder.Append(item);
item = enumerator.Current;
}
builder.Append(" and ");
builder.Append(item);
builder.Append("}");
return builder.ToString();
}
}
Looking at this from a maintenance or understanding point-of-view, I have the following notes:
- Using the enumerator directly means the loop cannot be written as a foreach-statement.
- The repeated calls to StringBuilder.Append() are intentional. I wanted to avoid having to use escaped {} in the format string (e.g. String.Format(“{{{0} and {1}}}”, firstItem, secondItem) is confusing if you’re not aware how curly brackets are escaped in a format string).

Other than those things, it seems relatively compact and efficient. With my own version written, I looked through the comments on the post to see if any other interesting solutions were available. I came up with two that caught my eye, one by Jon Skeet and another by Hristo Deshev, who submitted his in F#.
Hristo’s example in F# is as follows:
#light
let format (words:list<string>) =
let rec makeList (words: list<string>) =
match words with
| [] -> ""
| first :: [] -> first
| first :: second :: [] -> first + " and " + second
| first :: second :: rest -> first + ", " + (makeList (second :: rest))
"{" + (makeList words) + "}"
That’s so cool: the formulation in F# is almost plain English! That’s pretty damned maintainable, I’d say. I have no way of judging the performance of this just-in-time parsing, but it does make use of recursion: lists with thousands of items will incur thousands of nested calls.
Next up is Jon Skeet’s version in C#:
public static string JonSkeetVersion(this IEnumerable<string> words)
{
var builder = new StringBuilder("{");
string last = null;
string penultimate = null;
foreach (string word in words)
{
// Shuffle existing words down
if (penultimate != null)
{
builder.Append(penultimate);
builder.Append(", ");
}
penultimate = last;
last = word;
}
if (penultimate != null)
{
builder.Append(penultimate);
builder.Append(" and ");
}
if (last != null)
{
builder.Append(last);
}
builder.Append("}");
return builder.ToString();
}
This one is very clever and handles all cases in a single loop rather than addressing special cases outside of a loop (as mine did). Also, all of the formatting elements—the curly brackets and item separators—are mentioned only once, improving maintainability. I immediately liked it better than my own solution from a technical standpoint. While I’m drawn to the cleverness and elegance of the solution, I’m not the target audience. Skeet’s version forces you to reason out the special cases; it’s not immediately obvious how the special cases for zero, one and two elements are handled. Also, while I am tickled pink by the aptness of the variable name penultimate
, I wonder how many non-native English speakers would understand its intent without a visit to an online dictionary. The name secondToLast
would have been a better, though far less sexy, choice.
It’s very easy to underestimate how little people are willing to actually read code that they didn’t write. If the code requires a certain amount of study to understand, then they may just leave it well enough alone and seek the original developer. If, however, it looks quite easy and the special cases are made clear—as in my version—they are far more likely to dig further and work with it. Since the problem is defined as three special cases and a general case, it is probably best to offer a solution where these cases are immediately obvious to ease maintainability—as long as you don’t sacrifice performance unnecessarily. Cleverness is wonderful, but you may end up severely limiting the number of people willing—or able—to work on that code.
Published by marco on 5. Apr 2009 21:02:50 (GMT-5)
Once you’ve been coding for a while, you’ll probably have quite a pile of code that you’ve written and are regularly using. It’s possible that you’ve got some older code in use that just works and on which you rely every day. At some point, though, you realize that you have to get back in there and fix a few things. That happened recently with the upgrade of the earthli WebCore and attendant applications from PHP4 to PHP5 (which is ongoing). The earthli codebase was born in 1999 and was originally designed to run on PHP3. It has been quite aggressively upgraded and rewritten since then and is thus in pretty decent condition, from the design and stability side of things.
The code formatting, however, was old-style and broke a few cardinal rules I’d picked up since 1999. There were two problems in particular that I wanted to address; chief among them were single-statement blocks without curly brackets after an if- or else-statement, e.g.:

if ($something_is_true)
  return true;
else
  return false;
Instead of just living with it, however, I did some global search/replace with regular expressions kung fu to get the code back up to snuff. I used the PCRE support in Zend Studio build 20090119, which I assume just uses the standard Eclipse search/replace support. All operations were applied solution-wide with relatively little trouble.
First, I searched for if
-statements whose contents did not start with an opening curly bracket:
Search: ([ ]+)if \(([^\n]+)\)\n[ ]+([^ {][^\n]+)
Replace: \1if (\2)\n\1{\n\1 \3\n\1}
From there, it’s relatively easy to find/replace all single-line else
-statements, by searching for the following:
Search: ([ ]+)else\n[ ]+([^ {][^\n]+)
Replace: \1else\n\1{\n\1 \2\n\1}
Then, I did the more esoteric elseif
:
Search: ([ ]+)elseif \(([^\n]+)\)\n[ ]+([^ {][^\n]+)
Replace: \1elseif (\2)\n\1{\n\1 \3\n\1}
And finally, I replaced all loop constructs:
Search: ([ ]+)foreach \(([^\n]+)\)\n[ ]+([^ {][^\n]+)
Replace: \1foreach (\2)\n\1{\n\1 \3\n\1}
Search: ([ ]+)for \(([^\n]+)\)\n[ ]+([^ {][^\n]+)
Replace: \1for (\2)\n\1{\n\1 \3\n\1}
Once I’d normalized all of the else
-statements, I could clean up else
-statements that included only a return
-statement.
Search: ([ ]+)else\n[ ]+\{\n[ ]+return ([^\n]+)\n[ ]+\}\n
Replace: \n\n\1return \2\n
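For example, a (hypothetical) block inside a function that looks like this before the replacement:

  if (!$is_valid)
  {
    return false;
  }
  else
  {
    return true;
  }

ends up like this afterwards, with the redundant else removed:

  if (!$is_valid)
  {
    return false;
  }

  return true;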
There are a lot more interesting things you can do to globally alter your code if you’re willing to put some time into building your regular expressions. Legibility is better, debugging works better and there are far fewer warnings reported by the compiler.
Published by marco on 17. Mar 2009 23:04:57 (GMT-5)
When someone posts a link to your web site on Facebook, it retrieves a preview and presents that as the default text, along with a selection of pictures it found in the page. Clearly, Facebook has some sort of scraper that extracts what it thinks is the best preview text from a given URL. Sometimes it works well, sometimes not. Luckily, you can tune your pages for Facebook requests, emphasizing the parts you think are important and belong in the preview.
It’s anybody’s guess how the scraper actually works but, at the very least, we know that it uses a special user agent when accessing your site. Given that, you can customize your response when Facebook comes calling. The user agent is given below:
facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
The earthli WebCore just recently got upgraded to detect Facebook. When a browser of unknown capabilities makes a request to a WebCore site, it generally includes a banner in the header, urging the user to download a supported browser (as shown below). Until recently, the message didn’t include HTML paragraph tags; once it acquired them, the Facebook scraper started using the warning text as the suggested summary for every link posted from WebCore sites.
This clearly would not do, therefore the earthli Browser Detector was updated to include support for detecting requests made for the purpose of extracting a preview. [1] Search engines generally frown on content-customization but Facebook can hardly complain. In the WebCore’s case, the default renderer now leaves off both the banner and footer of the page, generating only the page body, where the most important text is most likely to be.
To use the earthli Browser Detector, include the file in your PHP template and do something like the following:
$browser = new BROWSER();
if ($browser->is(Browser_previewer))
{
// Render page for Facebook (and other previewers as they are supported)
}
else
{
// Render page content for standard browsers
}
Of course, you can always do your own user-agent testing; you don’t have to use the browser detector, though it does offer many other useful capability checks and is rock-solid at browser detection.
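For example, a bare-bones check (a sketch only, not the WebCore implementation) might look like this:

$user_agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

if (strpos($user_agent, 'facebookexternalhit') !== false)
{
  // Serve only the page body for the Facebook previewer
}
else
{
  // Serve the full page, including banner and footer
}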
Either way, customizing content for Facebook will go a long way to making links to your sites much easier and faster to create.
Published by marco on 16. Mar 2009 23:15:24 (GMT-5)
PHP is a programming language like any other; like any other, it’s possible to construct a bug complex enough that it can only reasonably be solved with a debugger. Granted, most PHP code is quite simple and limited to single pages with single include files and a limited library or framework. However, the advent of PHP5 has ushered in more than one team with the courage to build a full-fledged web framework. You would think that the state of PHP development had concordantly improved to the point that debugging scripts—on a local web server, at the very least—would be a no-brainer.
You’d be surprised.
The developer of the earthli WebCore [1] was courageous enough to attempt building a framework with PHP3. Since debuggers for PHP at that time (circa 1999) weren’t on anyone’s radar, PHP developers made do with the vaunted echo and print commands to simulate debugging. The WebCore quickly acquired a Javascript-based logger to which logging commands were written. Such methods only take one so far: for more complexly nested and recursive code which is more object-oriented and has a much larger stack, a debugger is really needed.
The port from PHP3 to PHP4 was accomplished without a debugger, but that was long ago. When it came time to port from PHP4 to PHP4.3, things were much harder. It is highly likely that very few developers encountered issues during that upgrade, but complex libraries with heavy use of references felt the pain. PHP4.3 was ostensibly a maintenance release but included fixes for reference handling that not only caused a tsunami of new warnings but also subtly changed how references were handled. This was a portent of the far greater problems encountered when porting from PHP4.3 to PHP5.
Developers who never developed for PHP4 will need a bit of background on how references used to work. Succinctly put, PHP4 by default created copies on variable assignment rather than assigning references. In order to get a reference instead, a developer had to explicitly request one with a special operator (&). Larger libraries with many methods soon became littered with ampersands. Even better, forgetting just one in a parameter caused PHP to create a new copy of the passed object. Changes made to that object within that routine were applied to the copy and mysteriously disappeared when the method call returned.
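To illustrate (a hypothetical example; the function and property names are made up), the difference between the two signatures in PHP4 looked like this:

// Without the ampersand, $pet is a copy; the change is lost when the function returns.
function rename_pet_broken($pet, $new_name)
{
  $pet->name = $new_name;
}

// With the ampersand, $pet refers to the caller's object and the change sticks.
function rename_pet(&$pet, $new_name)
{
  $pet->name = $new_name;
}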
This was obviously an untenable situation so PHP5—based on the Zend 2.0 engine—reversed the default. Under PHP5, assignments that previously created copies now created only references. Mix this with a large library with many such implicit copies hidden throughout the code and let the fun begin. Luckily, incredibly savvy developers who had read of this change enforced an iron discipline and limited these types of implicit copies to only a few, well-marked places. [2] Thus, the massive pain entailed in a port from PHP4 to PHP5 was somewhat ameliorated.
As you, dear reader, can imagine, the need for a debugger at this point became overwhelming. Let the search begin.
PHPEclipse was the default editor throughout the development of the earthli WebCore, so the first step was to check out what they had to offer. It turns out that the current version of PHPEclipse supports both the XDebug and DBG debuggers. The XDebug debugger was mentioned much more often in the forums, so it seemed like a good place to start.
Though you can debug PHP using only the executable (PHP.EXE), you’d have to configure that executable to include all the extensions and settings that are already configured with the local web server you’ve likely got installed. It’s not that it can’t be done, but that the most convenient way to debug would be to just execute the page on the web server you’re already using for testing. So, step one is to get the debugger extension loaded in your local server. If you’re developing on Windows [3], the WAMP server package is an excellent, highly-configurable solution. For porting from PHP4 to PHP5, it also offers the unique ability to change from one to the other within seconds. It has numerous addons corresponding to previous releases of Apache, MySql and PHP which seamlessly integrate with the main installation. What it does not have is an installer for configuring the appropriate version of either XDebug or DBG.
It seems that PHP developers don’t, in general, use debuggers.
As usual with such things online—that is, things that very few people do—instructions are available, but they must be pieced together from several different locations. XDebug was up first and was, after many false starts, loaded by the local server. Some things to watch out for:
Make sure you’re editing the right PHP.INI file; in WAMP, this is the file located in the bin folder of the currently-loaded version of Apache.

Once you are rewarded with the XDebug extension in the phpinfo() page, you’re ready to start debugging. Following the instructions at the PHPEclipse wiki, though confusing, will get you stopping at a break point soon enough. Imagine that! A breakpoint in PHP! Press F6 to step over that line and … wait … and wait … listen to the fan on your laptop start. Apache is using 100% CPU or as close to it as it can. Wait several minutes for things to sort themselves, but they never do. Use the Task Manager to kill the offending instance of Apache and you simply transfer the problem to Eclipse, which begins using 100% CPU. Long story short, debugging with XDebug never got farther than this relatively low point. The initial breakpoint worked, but nothing else.
On to DBG.
To keep things a bit shorter, DBG never even stopped at the initial breakpoint, regardless of settings.
The open-source world, it seems, has nothing to offer on the PHP debugging front.
On to Zend.
Zend makes the scripting engine used by PHP. They make numerous tools for analysis as well as the Zend Developer Studio. It costs 399 Euros and brags about its debugging capabilities right on the home page. It has a 30-day trial. It all sounds so promising. One 330MB-download and installation later and you’ve got the Zend Studio up and running. Once you’ve configured a new project, you can set up a debug configuration, which comes with copious well-written help as well as a “Test Debugger” button right in the configuration window.
As before, you can run the debug session using a local executable, but the more useful setup is to run the debugger through the local web server. To do this, you need to install the Zend debugger extension, which has much better instructions in the Zend Studio help.
Long story short, 399 Euros buys you a working debugger for PHP that flawlessly debugged code in several files—including code located in “Include Paths”—and was exactly like any other debugging experience in Eclipse.
So, if you need to debug PHP, you can either take the cheap route and hope that the open-source solutions work for your code or you can take the plunge and use the Zend Studio—if you’re actually earning money with PHP development, choice (B) is the logical one.
This PHP developer, on the other hand, is going to get his port from PHP4 to PHP5 done in the next 30 days.
Published by marco on 16. Mar 2009 10:01:08 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
In August of 2008, Microsoft released the first service pack (SP1) for Visual Studio 2008. It included the first version (1.0) of Microsoft’s generalized ORM, the Entity Framework. We at Encodo were quite interested as we’ve had a lot of experience with ORMs, having worked on several of them over the years. The first was a framework written in Delphi Pascal that included a sophisticated ORM with support for multiple back-ends (Sql Server, SQLAnywhere and others). In between, we used Hibernate for several projects in Java, but moved on quickly enough. [1] Most recently, we’ve developed Quino in C# and .NET, with which we’ve developed quite a few WinForms and web projects. Though we’re very happy with Quino, we’re also quite interested in the sophisticated integration with LINQ and multiple database back-ends offered by the Entity Framework. Given that, two of our more recent projects are being written with the Entity Framework, keeping an eye out for how we could integrate the experience with the advantages of Quino. [2]
What follows are first impressions acquired while building the data layer for one of our projects. The database model has about 50 tables, is highly normalized and is pretty straightforward with auto-incremented integer primary keys everywhere, and single-field foreign keys and constraints as expected. Cascaded deletes are set for many tables, but there are no views, triggers or stored procedures (yet).
Eventually, EF will map your model and the runtime performs admirably (so far). However, designing that model is not without its quirks:
To be fair, this is a 1.0 release; it is to be expected that there are some wrinkles to iron out. However, one of the wrinkles is that a model with 50 tables is considered “large”.
With 50 tables and the designer slowing down, you’re forced to at least consider options for splitting the model. The graphic below shows the model for our application:
There exists official support for splitting the model into logical modules, but it’s just a bit complex; that is, you have to significantly change the generated files in order to get it to work and there is no design-time support whatsoever for indicating which entities belong to which modules. The blog posts by a member of the ADO.NET team called Working With Large Models In Entity Framework (Part 1 and Part 2) offer instructions for how to do this, but you’ll have to satisfy one of the following conditions:
The low-level, runtime-only solution offered by the ADO.NET team ostensibly works, though it probably isn’t very well-tested at all. Designer and better runtime-integration would be key in supporting larger models, but the comments at the second blog post indicate that designer support likely won’t make version 2 of the Entity Framework. This is a shocking admission, as it means that EF won’t scale on the development side for at least two more versions.
The designer is probably the weakest element of the Entity Framework; it is quite slow and requires a lot of work with the right mouse-button.
If you’re right in the middle of a desperate action to avoid reverting to the last version of your model from source control, you’ll be pleased to discover that, sometimes, Visual Studio will prevent you from opening the model, either in visual- or XML-editing mode. Neither a double-click in the tree nor explicitly selecting “Open” from the shortcut menu will open the file. The only thing for it is to re-open the solution, but at least you don’t lose any changes.
The biggest time-sink in EF is the questionable synchronization with the database. Often, you will be required to intervene and “help” EF figure out how to synchronize—usually by deleting chunks of XML and letting it re-create them.
One example: a relationship whose multiplicity has to be changed by hand from 1 (One) to 0..1 (Zero or One).

Here’s a development note written after making minor changes to the database:
“I added a couple of relationships between existing tables and there were suddenly 17 compile errors. I desperately tried to delete those relationships from the editor, to no avail. I opened it as XML and started deleting the affected sections in the hopes that I would be able to compile again and re-sync with the database. After a few edits, the editor would no longer open and the list of errors was getting longer as the infection spread; I would have to cut out the cancer. The cancer, in this case, was all of the classes involved in the new relationships. Luckily, they were mostly quite small and mostly used the identifiers from the database. [4] Once the model compiled again (the code did not build because it depended on generated code that was no longer generated), I could open the editor and re-sync with the database. Now it worked and had no more problems. All this without touching the database, which places the blame squarely on EF and its tendency to get confused.”
As you can imagine, adventures like these can take quite a bit of time and break up the development flow considerably.
The problem with dates all starts with this error message:
Be prepared to guess which of your several DateTime fields is causing the error because the error message doesn’t mention the field name. Or the class name either, if you’ve had the audacity to add several different types of objects—or, God forbid, a whole tree of objects—before calling SaveChanges().
This error may come as a surprise because you’ve actually set default values in the database for all non-nullable date-time fields. Unfortunately, the EF schema reader does not synchronize non-scalar default values, so the default value of getdate() set in the database is not automatically included in the model. Since the entity model doesn’t know that the database has a default value, but it does know that the field cannot be null, it requires a value. If you don’t provide a value, the mapper automatically assigns DateTime.MinValue. The database does not accept this value, so we have to set it ourselves, even though we’ve already set the desired default on the database.
To add insult to injury, the designer does not allow non-scalar values (e.g. you can’t set DateTime.Now in the property editor), so you have to set non-scalar defaults in the constructors that you’ll declare by hand in the partial classes for all EF objects with dates [5].
In order to figure out which date-time is causing a problem once you think you’ve set them all, your best bet is to debug the Microsoft sources so you can see where ADO.NET is throwing the SqlClientException. The SQL Profiler is unfortunately no use because the validation errors occur before the command is sent to the database. To keep things interesting, the Entity Framework sources are not available yet.
The documentation recommends using TransactionScope transactions, which use the DTS (Distributed Transaction Services). If the database is running locally, you should have no troubles; at the most, you’ll have to start the DTC [6] services. If the database is on a remote server, then you’ll need to do the following:
Any troubles you may experience with the DTC are unrelated to EF development; they’re just the pain of working with highly-integrated and security-aware software. That’s not to say that the experience is pleasant when something is mis-configured, but that I am reserving judgment until a later point in time.
The following section includes solutions for specific errors that crop up more often during EF model development.
Error 1 Error 3007: Problem in Mapping Fragments starting at lines 1383, 1617: Non-Primary-Key column(s) [ColumnName] are being mapped in both fragments to different conceptual side properties − data inconsistency is possible because the corresponding conceptual side properties can be independently modified.
You have most likely mapped the property identified by ColumnName as both a scalar and navigational property. This usually happens in the following situation:
To fix the conflict, simply remove the scalar property manually.
You have most likely created a cascading relationship in the database and the EF editor has failed to properly update the model. It seems that there is no way to determine from the designer whether or not an association has delete or update rules. According to the blog post, Cascade delete in Entity Framework, the designer sometimes fails to update the association in both the physical and entity mappings in the XML file, so you have to add the rule by hand. See the article for instructions.
The database-design phase is more difficult than it should be, but it is navigable. You end up with a very usable, generated set of classes which nicely integrate with data-binding on controls. We will soldier on and bring news of our experiences on the runtime front.
[5] If your tables have timeCreated/timeModified fields, you’ve got a lot of work ahead of you.
[6] The documentation refers to the DTS—the Distributed Transaction Services. However, the actual Windows service is called the DTC—the Distributed Transaction Coordinator.
Published by marco on 15. Feb 2009 21:24:32 (GMT-5)
After nearly a decade of using Perforce for my private source control, I’d decided to switch to Mercurial. Mercurial is a distributed version control system and open-source and all kinds of awesome and I won’t go into why I made the switch here. Suffice it to say it makes it much easier to release code and work with others.
Mercurial itself is an easy installation and I had it running on both my OS X 10.4 and Windows XP machines in a flash. I even installed the newly released TortoiseHg plugin, which works as advertised even though its user interface simply screams open-source. Now, Mercurial is nothing without a server, so I set about setting up one of those too. There’s nothing like jumping in the deep end when you’re a complete neophyte.
I’ve got one project that I’d like to share publicly and a handful of private projects that I’d like to store on the server, but work on from several places (e.g. the Mac or Windows). Now, with Mercurial, every repository contains a complete history of the project, so the designation of the server as the “main” storage is a feature of my deployment system, not a requirement imposed by Mercurial [1].
Mercurial has decent instructions for setting up an http server; they provide CGI scripts for both single and multiple repositories. Once you’ve got that, you’ll want to set up which users are allowed to push updates to your repository. Mercurial strongly recommends you use only SSL connections for push operations; you can shut off the requirement easily enough, but it’s a good recommendation.
So far, no real problems have cropped up. Until you go to the web site and see the circa 1998-style horror that is the default style sheet. Running hg version reveals that you’re running version 0.9.1, released in 2005. Thanks, Debian. Way to stay on top of things. [2] A quick check of Debian Backports reveals that a newer version, 1.0.1.2, is available. Grab that and install it, then enable one of the newer themes—I chose paper, which looked nice and neat—and refresh the page. Sweet! Click a link. Shit! Purple python error messages everywhere. Follow the stack trace to the bottom of the page and it seems to be complaining about an unset variable in a dictionary…bla, bla, bla.
Maybe there are some missing mercurial or python modules. A quick check of the recommended and suggested modules with apt-get reveals nothing significant.
Maybe the Python version is wrong. Well, look at that, Debian Etch is still using Python 2.4 by default. Python 2.5 was released in September of 2006. It’s just possible that Mercurial—especially the perhaps less-well-tested CGI script—might be a wee bit surprised to find itself on a runtime that’s almost three years out-of-date. That’s a decent theory, I think. A quick apt-get install python2.5 grabs the latest version from Etch (which is probably also horribly old, but no matter) and … has no effect. The Debian installer does not set up the newer Python as the system default; it doesn’t even ask if you’d like to do this. Perhaps there’s a good reason for this… [3]
Long story short, I couldn’t get the Mercurial 1.0.1 web CGI any more stable than it was on my initial attempt and must instead assume that it’s just broken in that version. I rolled back to Python 2.4 and Mercurial 0.9.1 and everything started working again, though my eyes still teared up when I had to look at the repository web site. I couldn’t figure out any way of getting a newer version of Mercurial to run on Debian Etch. A pity, but there it is. I ended up making my own theme and adjusting the style sheet enough so that I could use it less than an hour after having eaten without endangering my keyboard.
I thought my Python/Debian adventures had ended until I started my server backup script, which uses rdiff-backup to synchronize several directories from my server. I use rdiff-backup 1.1.15 on my Mac and remotely control an instance on the Debian Etch server. Setting this up with compatible versions was not the easiest thing in the world and seems to have been more fragile than I thought. My Python reconfigurations had removed rdiff-backup from the server because it had the Python2.4 package as a dependency. I quickly re-installed rdiff-backup, but it was permanently offended and continued to give the same error message, which was, once again, something about some file or function or variable that it had expected to be set that was most emphatically not set and that it was going to, as a result, quit in a huff.
I know that it’s my fault for having used Debian Backports instead of being happy with three-year-old software, but knowing that doesn’t make me any happier to be, once again, debugging a version mismatch error in rdiff-backup. I have, in fact, decided to tempt fate and forget about that part of the backup for this week [4]. I’m sure my self from one week from now will bloody hate me for it, but I’m going to bed.
In a pinch, I can just pull from all of those repositories, which happens in a flash. All in all, I’m quite happy with Mercurial, even though with 0.9.1, I’m still in 2005.
Published by marco on 1. Feb 2009 13:25:15 (GMT-5)
Courier IMAP has a default certificate for SSL communication, but it’s only valid for a year and has bogus, default information in it. You can use a utility to generate a new certificate and, with a little perseverance, find the configuration file from which it draws its parameters. With these parameters, you can make a slightly better certificate, but it’s better to use OpenSSL to generate a proper certificate, based either on a trusted certificate or self-signed. However, OpenSSL’s default output does not include the combined private key/certificate file expected by Courier. To do that, I adapted the instructions found in Courier IMAP SSL Certificate Installation to create the combined PEM file and reference it from the courier configuration file.
In my case, I just re-used the certificates I’d already generated for TLS SMTP access with Postfix, which I’d stored at /etc/postfix/keys/
. All instructions are for a Debian Etch installation. Open a text editor and paste the contents of the primary certificate and the private key one after another in the following order:
Include the BEGIN and END tags on each. The result should look like this:
-----BEGIN CERTIFICATE-----
(Your Primary SSL certificate: server.crt)
-----END CERTIFICATE-----
-----BEGIN RSA PRIVATE KEY-----
(Your Private Key: server.key)
-----END RSA PRIVATE KEY-----
Save the combined file as server.pem.
Finally, open the /etc/courier/imapd-ssl file and update the following value to reference the new PEM file:
TLS_CERTFILE=/etc/postfix/keys/server.pem
Restart the Courier server by executing /etc/init.d/courier-imap-ssl restart and you’re done.
Published by marco on 21. Oct 2008 21:09:05 (GMT-5)
If you’re faced with a pile of data that needs to be sorted, you can use the Animated Sorting Algorithms by David R. Martin to decide which algorithm to use, based on what kind of data you think you’re going to have. Click a little green refresh symbol in the rows to watch the algorithms race on the same dataset or click a column header to watch the same algorithm attack best- and worst-case scenarios simultaneously.
Published by marco on 28. Jul 2008 19:14:04 (GMT-5)
The recently-published RFC: Lambda functions and closures by Christian Seiler & Dmitry Stogov (PHP) has all the details about a patch for PHP that adds support for closures to PHP 5.3 and higher. It looks like the proposal was initially made in December of 2007 and the process lasted about half a year. Compare and contrast with Java’s long struggle to get closures, which isn’t over by half [1].
The syntax is pretty straightforward, though not as elegant as the lambda syntax adopted by C# (and likely others of whose syntax I’m not aware, like Ruby or Python). Reference variables are supported, but copy-by-value is the default for local variables (as is the norm for PHP). Local variables referenced in the closure must be listed explicitly, for two reasons detailed in the RFC.
A lambda construct has a nameless function followed by a parameter list followed by an optional use clause that lists the local variables that may be referenced from the closure. A closure defined within a class automatically includes a reference to $this. [3] Here’s an example of a replacer function that defines an anonymous replacer function—customized for the parameters passed to the outer method—which is, in turn, passed to array_map.
function replace_in_array ($search, $replacement, $array) {
$map = function ($text) use ($search, $replacement) {
if (strpos ($text, $search) > 50) {
return str_replace ($search, $replacement, $text);
} else {
return $text;
}
};
return array_map ($map, $array);
}
The closure header and closing brace are highlighted in the example above; the body of the closure is just normal PHP code with the additional restriction that the scope is limited to the contents of the parameters and “use” clauses.
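Calling the outer function is then just like calling any other function; a quick, made-up example:

$paragraphs = array(
  'The colour of the car was a deep red.',
  'Another paragraph of text.',
);

// array_map() applies the customized closure to each element and returns a new array.
$result = replace_in_array('colour', 'color', $paragraphs);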
All in all, it seems like a solid addition to PHP and should lead to cleaner code for those using the latest version.
[3] Unless the closure is declared as a static function.
Published by marco on 22. Apr 2008 18:46:33 (GMT-5)
Updated by marco on 22. Apr 2008 18:53:44 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
Metadata is, by definition, data about an application’s data. It describes the properties and capabilities of and connections between different types of information in a particular application domain. Examples of application domains are bookkeeping, document management, a lending library or inventory management.
Listed below are a few commonly used terms for modeling methodologies:
The approach recommended in the following paper can be generally described as MDD, but diverges from traditional approaches in several key ways.
All software defines and uses metadata, but usually defines it implicitly, encoding it in the constructs of the programming language (e.g. classes, fields, methods, references, etc.) in which it is written. A non-trivial application—one that renders a GUI or responds to a web-service request—needs much, much more information than that. More often than not, that extra information is encoded in application-specific code that is usually not generalized (though it may follow a pattern).
Software increases in complexity and size with time. In order to maintain any sort of control over larger software projects, it is important to strike a balance between the following two precepts:
D.R.Y. (Don’t Repeat Yourself)
K.I.S.S. (Keep It Simple, Stupid)
Using a standardized metadata framework helps an application apply the D.R.Y. principle, but seems to violate K.I.S.S. However, applying K.I.S.S. to the metadata framework itself results in a simple API that makes any application using it also much simpler than it would have been with an ad-hoc solution. Using the same simple pattern and library throughout several applications decreases the overall complexity and drastically improves maintainability.
Imagine an application that generates a list of entities, say Books. Regardless of whether it generates a report, a web page or a graphical user interface (GUI), it needs to identify columns of data with labels. Traditionally, each application solves the problem of retrieval and internationalization of labels on its own. Programmers without much time or experience—and without framework support—will most likely hard-code the text for the labels, making maintenance more time-consuming and error-prone.
A metadata-based application, on the other hand, starts off with a model, in which the labels for bits of data are already defined. This application need concern itself only with transforming the information in the metadata to the output format and not with retrieval and internationalization of the metadata itself. On top of that, the application is free to use other available metadata which maps to the output format, like color, borders or font-size.
Apps using metadata have the following advantages:
Those few systems that do use explicit metadata do so in order to provide object-relational mapping (ORM) and CrUD (Create/Update/Delete) access to a database. To name just a few:
These solutions use various mechanisms to specify the metadata (uniqueness, primary key, relationships, etc.) needed to communicate with a database: Cayenne uses XML files, Hibernate allows either XML files or Java annotations, LINQ uses the class definition along with optional attributes and Django uses inner classes.
Django goes further still by providing metadata for web UIs as well as data access. It uses this metadata to generate the entire administrative back-end—including sortable, filterable lists and CrUD UI for all objects—automatically. The .NET framework has grids that automatically provide CrUD from a LINQ dataset, but these cannot be customized by tweaking central metadata, as in Django. Other frameworks have been similarly inspired—Rails (Ruby) and Grails (Groovy) come to mind—but are still in very early stages and don’t seem to use much central metadata. For example, the BeanForm component in Tapestry generates web forms from objects using Java reflection and Hibernate properties (if present). However, it can only be further customized (e.g. styles and classes or layout) with metadata hard-coded in the HTML document.
All of the approaches above use the class hierarchy as the model and the reflection/introspection services of the language itself as the metadata API. Metadata not directly expressible in the language is attached using attributes (LINQ), annotations (Hibernate) or inner classes (Django).
This raises a few issues:
With metadata so central to the application, it makes no sense to cede control of it to a single external component (like an ORM). Instead, it must be independently and centrally defined and under application control.
In order to be truly useful, metadata must satisfy the following conditions:
None of the requirements listed above places restrictions on the software environment. Rather, it emphasizes only that none of the components gets to control the metadata. For example, it is trivial to generate the data classes needed by Hibernate or LINQ from the metadata. Similarly, nothing prevents a project from generating the in-memory metadata from more traditional modeling approaches like UML.
The application benefits from the centralized metadata, but can continue to use any external components without restriction.
The first step is to define the boundaries of an application domain with entities, properties, operations and relationships (described in section 5 – “Elements of metadata”). However, as mentioned above, a non-trivial application needs specialized metadata in order to work with one or more external components, such as:
To address these different needs—and to stay centralized but decoupled—metadata is split into aspects, each of which encapsulates one of the tasks listed above. For example, an application might want to store the following information in its metadata:
Aspects that are very similar, like “gui” and “web”, can put shared metadata into another aspect (e.g. “view”). This way, common metadata, like “color” and “font-size” are in the “view” aspect, while browser-specific metadata is defined in the “web” aspect. A desktop application includes the “gui” and “view” aspects, while a web application includes “web” and “view”. The reporting tool mentioned above includes only “view” so that it has access to the required display metadata.
The sections below briefly sketch the basic parts of metadata and are not meant to be comprehensive.
Since metadata is descriptive, all of its elements have the following basic properties:
Complicating things slightly, each textual property—like the “singular title” above—has multiple values, one for each language supported by the application. In addition to these basic, shared properties (of which there can be many, many more), specific types of elements add other features.
A lending library application is used in the examples below.
Entities describe particular types of objects in the application domain, like books, authors, publishers or customers. Less obvious entities are languages, media types, genres or lending transactions (when a customer loans a book for a period of time). Entities include the following lists: properties, operations and relationships.
Each of these is described in more detail in the following sections.
A property is a single feature of an entity, like a book’s title, an author’s first or last name or the due date for a loaned book. In addition to the basic features above (title, description), a meta-property has the following features:
There are, of course, dozens of other features that an application might associate with a property, but things like “color”, “font-size” or “visible” are only interesting to particular application domains. Therefore, these features belong in domain-specific aspects (as described above in “Aspects of Meta-Data”).
An operation is an action that can be executed against an entity (or, more precisely, an instance of an entity). In addition to the basic features above (title, description), a meta-operation has the following features:
Including the signature in the metadata is useful for validation, but the implementation is best left in the application itself (including it in the metadata would violate K.I.S.S.).
An application domain consists of more than just free-floating entities: it is the relationships between those entities that truly describe what is possible within a metadata model. A book has a list of authors as well as a publisher, whereas publishers and customers have lists of books. A relationship has the following features:
As with properties, these are the basic features that all relationships have to describe them fully; domain-specific aspects may add more.
Using metadata explicitly is the kind of approach that comes after years of experience working with other methodologies and technologies. It arises from a need to avoid re-inventing the wheel with each new application. We started Encodo after working for years with modeling tools that offered some—though not all—of the advantages mentioned in this paper.
Most available components and libraries, however, don’t work with metadata as we’d grown accustomed to. We tried using them as-is but, soon enough, realized we could adhere to neither K.I.S.S. nor D.R.Y. principles. Having had the advantage of using metadata in previous applications, our standards were raised to a level where we were no longer willing to work without it.
Published by marco on 21. Apr 2008 20:25:27 (GMT-5)
Updated by marco on 21. Apr 2008 20:25:41 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
The first publicly available version of the Encodo C# Handbook is ready for download! It covers many aspects of programming with C#, from naming, structural and formatting conventions to best practices for using existing and developing new code.
Here’s the backstory on how and why we decided to write a formal coding handbook.
Here at Encodo, we started working with C# less than a year ago. We decided early on that we would be building a framework on which we would base our projects, both internal and external. That framework now exists and forms the core of several client projects: it’s called “Quino” and you can find out more at the Quino home page. Since we were library-oriented from the get-go, we were very aware of our coding style and were interested to know how other projects and developers organized and formatted their code and how they worked with the .NET framework.
Naturally, there was a lot of documentation to be found in Microsoft’s MSDN, but it was scattered over dozens of pages and wasn’t very useful as a consolidated reference. It also made recommendations that Microsoft themselves ignored in their own code. Searching with Mr. Google brought up numerous references to a manual from iDesign, which is quite good. Philips also has a pretty extensive manual.
We started with those as well as a bushel of ad-hoc rules we’d developed over the years and an “Encodo Style” slowly evolved. Where we diverged from other companies is that we decided to write it all down. Every last niggling bit of it. The handbook was in a very ad-hoc format when we hired Marc and realized that we’d need to get him up to speed on how we work at Encodo. After an initial formatting effort, there followed a few months of slow accretion of new rules as well as a refinement of existing ones.
Where our guide differs from the others is in the organization; there are clear sections for structure, formatting, naming, language elements and best practices instead of just a hodge-podge of rules. We’ve also done our best to weed out conflicting or repeated rules. The current handbook (version 1.4) also includes rules for those of you, like us, who’ve moved on to VS2008 and the wonderful world of .NET 3.5.
Though there will certainly be updates as we learn more, we hope you like what we’ve got so far and welcome any and all feedback!
For your quick perusal, here’s the current table of contents:
Table of Contents
1 General 1.1 Goals 1.2 Scope 1.3 Fixing Problems in the Handbook 1.4 Fixing Problems in Code 1.5 Working with an IDE
2 Design Guide 2.1 Abstractions 2.2 Inheritance vs. Helpers 2.3 Interfaces vs. Abstract Classes 2.4 Modifying interfaces 2.5 Delegates vs. Interfaces 2.6 Methods vs. Properties 2.7 Virtual Methods 2.8 Choosing Types 2.9 Design-by-Contract 2.10 Controlling API Size
3 Structure 3.1 File Contents 3.2 Assemblies 3.3 Namespaces 3.3.1 Usage 3.3.2 Naming 3.3.3 Standard Prefixes 3.3.4 Standard Suffixes 3.3.5 Encodo Namespaces 3.3.6 Grouping and ordering
4 Formatting 4.1 Indenting and Spacing 4.1.1 Case Statements 4.2 Brackets (Braces) 4.2.1 Properties 4.2.2 Methods 4.2.3 Enumerations 4.2.4 Return Statements 4.3 Parentheses 4.4 Empty Lines 4.5 Line Breaking 4.5.1 Method Calls 4.5.2 Method Definitions 4.5.3 Multi-Line Text 4.5.4 Chained Method Calls 4.5.5 Anonymous Delegates 4.5.6 Lambda Expressions 4.5.7 Ternary and Coalescing Operators
5 Naming 5.1 Basic Composition 5.1.1 Valid Characters 5.1.2 General Rules 5.1.3 Collision and Matching 5.2 Capitalization 5.3 The Art of Choosing a Name 5.3.1 General 5.3.2 Namespaces 5.3.3 Interfaces 5.3.4 Classes 5.3.5 Properties 5.3.6 Methods 5.3.7 Parameters 5.3.8 Local Variables 5.3.9 Events 5.3.10 Enumerations 5.3.11 Generic Parameters 5.3.12 Lambda Expressions 5.4 Common Names 5.4.1 Local Variables and Parameters 5.4.2 User Interface Components 5.4.3 ASP Pages
6 Language Elements 6.1 Declaration Order 6.2 Visibility 6.3 Constants 6.3.1 readonly vs. const 6.3.2 Strings and Resources 6.4 Properties 6.4.1 Indexers 6.5 Methods 6.5.1 Virtual 6.5.2 Overloads 6.5.3 Parameters 6.5.4 Constructors 6.6 Classes 6.6.1 Abstract Classes 6.6.2 Static Classes 6.6.3 Sealed Classes & Methods 6.7 Interfaces 6.8 Structs 6.9 Enumerations 6.9.1 Bit-sets 6.10 Nested Types 6.11 Local Variables 6.12 Event Handlers 6.13 Operators 6.14 Loops & Conditions 6.14.1 Loops 6.14.2 If Statements 6.14.3 Switch Statements 6.14.4 Ternary and Coalescing Operators 6.15 Comments 6.15.1 Formatting & Placement 6.15.2 Styles 6.15.3 Content 6.16 Grouping with #region Tags 6.17 Compiler Variables 6.17.1 The [Conditional] Attribute 6.17.2 #if/#else/#endif
7 Patterns & Best Practices 7.1 Safe Programming 7.2 Side Effects 7.3 Null Handling 7.4 Casting 7.5 Conversions 7.6 Object Lifetime 7.7 Using Dispose and Finalize 7.8 Using base and this 7.9 Using Value Types 7.10 Using Strings 7.11 Using Checked 7.12 Using Floating Point and Integral Types 7.13 Using Generics 7.14 Using Event Handlers 7.15 Using “var” 7.15.1 Examples 7.16 Using out and ref parameters 7.17 Error Handling 7.17.1 Strategies 7.17.2 Error Messages 7.17.3 The Try* Pattern 7.18 Exceptions 7.18.1 Defining Exceptions 7.18.2 Throwing Exceptions 7.18.3 Catching Exceptions 7.18.4 Wrapping Exceptions 7.18.5 Suppressing Exceptions 7.18.6 Specific Exception Types 7.19 Generated code 7.20 Setting Timeouts 7.21 Configuration & File System 7.22 Logging and Tracing 7.23 Performance
8 Processes 8.1 Documentation 8.1.1 Content 8.1.2 What to Document 8.2 Testing 8.3 Releases
Published by marco on 19. Feb 2008 21:41:06 (GMT-5)
Microsoft recently released documentation for their binary office formats in both PDF and their own XPS format. The PDF for Word weighs in at 2.8MB and has 210 pages. The article Why are the Microsoft Office file formats so complicated? by Joel Spolsky provides a lot of good reasons for why the formats are so complicated (most rooted in history), like speed, complexity of the task, purely internal formats (until now), etc.
Where Spolsky veers off the path (and he almost always does) is in reaching a bit too far with his “workarounds”. Instead of trying to load the binary formats yourself, he suggests simply launching Word or Excel as a COM object “directly, even from ASP or ASP.NET code running under IIS”. The caveat comes only later and tells only of a “few gotchas”, like it “not [being] officially supported by Microsoft”. He includes a link to a knowledge base article which uses a lot of words to say the equivalent of “for the love of the sweet baby Jesus, don’t do this.” Clearly, Spolsky was so enchanted by his prose and clever examples that he didn’t think Microsoft explicitly countermanding his idea was enough reason not to publish it to the world [1].
His advice to hide this type of solution behind a web service for Linux servers actually goes for ASP.NET servers as well. If you need to read the Office format (or generate it), there are libraries that do this without using Office itself. The POI java library from Apache works quite well for generating Excel and Word documents. If you’re using .NET, you can hide the POI library behind a web service and call that instead. Even a Tomcat server to run a little web service won’t weigh more than running Office in a Windows 2003 Server. If you do have to run a Windows 2003 Server in a Linux environment, consider running it in a virtual machine under Xen or some other virtualization solution.
Some of the other suggestions also indicate that Spolsky was just trying to fill out his bullet lists, like “[o]pening an Excel workbook, storing some data in input cells, recalculating, and pulling some results out of output cells”—that sounds like the kind of stuff you could just write in .NET or Java directly, no? [2] Or what about “[u]sing Excel to generate charts in GIF format”—there are libraries for that, aren’t there? Do you really have to consider automation in a server process (including a likely bottlenecking nightmare) just to generate a chart?
Happily, he closes strongly with good suggestions for generating the least complex format possible for fulfilling the task, such as using RTF for formatted documents (it’s a text format, reasonably legible, and is well-documented) or CSV for simple Excel data.
In the end, the formats for the office applications are published. This is what Microsoft deals with in their office products—there’s no use complaining that they’re too complicated. They are what they are and most people should be able to avoid having to deal with them—unless you do something silly like joining the Office development team in Redmond.
Published by marco on 5. Feb 2008 23:16:21 (GMT-5)
Updated by marco on 5. Feb 2008 23:48:59 (GMT-5)
The article Two-Party Threaded Chat by Peter Arrenbrecht addresses the problem of multiple threads of discussion within a single conversation. Without face-to-face contact, the threshold for interruption is much lower and answers will not always neatly line up under their questions. A conversation may have many of these “threads”, though individual ones are usually quite short-lived. Chat clients are currently limited by their purely serial approach to inserting text into a conversation.
The solution proposed in the article is quite good, and the discussion below only expands slightly on those concepts, while illustrating them with fake screenshots in iChat-style. iChat’s approach of using talk bubbles from opposing sides is somewhat easier to follow than the text-only examples provided in the original article and even the multi-thread view in the demo (which is likely to be a bit much for the average user).
The problem boils down to insertion points for comments. Chat clients today always place new text at the bottom of the conversation, whether or not that is the most appropriate insertion point. The proposed extension is to give the user control over the insertion point and also to automatically propose the best insertion point, if possible.
In the classic case, one party asks a question or makes a comment. The other party types a message and sends it, inserting it at the end of the conversation. If the first party did not send any messages in the meantime, then the question and answer are lined up correctly. The example below shows the insertion point as displayed in iChat as one party types (l.) and the inserted message (r.).
Whenever a message arrives from one party while the other party is still typing, that counts as an interruption and there are now two possible insertion points: the point in the conversation at which the receiver started typing and the end of the conversation. The user should be able to switch between these insertion points (either with a mouse click or a key combination to toggle between them), but the client should default to inserting into the original location. Once the user selects a point, they both disappear and the text appears where inserted (in the original insertion point in the example below).
As you can see from the example, there can only ever be two insertion points because the user can only be typing one message and can only be interrupted once. As soon as the user sends a message, the “thread” is closed and the client manages a single insertion point again. [1]
Of course, clients that do not support this feature will continue to display conversations from newer clients serially, as they do currently.
Peter mentions being able to specifically reference a comment all the way at the end of his proposal (under “Implementation”). I (and several people with whom I chat) have been using the @-symbol for this purpose for years, using the following syntax:
@[label]: [comment]
The label refers to a unique word in the comment from the other party to which the new comment refers. It’s essentially poor-man’s threading for the dumb chat clients available today—but it’s already quite effective even without client support. When a new client sees such a targeted insertion, it should seek backwards through the conversation to find that word, then insert the new comment immediately after the sentence or comment that contained it. The exact algorithm can be refined, but let’s take a look at a simple example.
Let’s take a look at a conversation initiated by someone who makes several points at once, piling up questions for the other party to answer. The other party then answers the questions in order, using the @-symbol to let the other party know which answers correspond to which questions. In a modern chat client, this looks like one of the screenshots below (all comments in one block on the left; comments separate on the right):
Let’s proceed with the example on the left, showing the first response from the second party below:
Since the response is not targeted, it just shows up below the whole block of text. The next response is targeted—using @dinner—and so causes the sentence with the word “dinner” in it (as well as everything following it, which in this case is nothing) to be split off from the main text in both clients. That response is shown below:
The third response targets a chunk out of the original text—using @troops—so that sentence is extracted and the first response moves up to remain under all the remaining untargeted text. That response is shown below:
A client that understands targets in the text (and adjusts the display accordingly) could also just remove those targets from the displayed text, as shown below.
The client now looks as if the conversation had happened in the correct order, all with the help of targeting.
All of the targets in the example above are still on-screen, so the chat client doesn’t have to do anything other than move the blobs of text around. However, if a target has already scrolled off-screen, the client should display that part of the conversation again. One way to do this is to split the window to show the target as well as the end of the conversation, as shown below.
In this case, both regions are scrollable, showing different locations within the same conversation. Since these targeted discussion threads rarely stay open for more than a few comments, the client doesn’t need to offer any way of switching between the splitter areas. In order to continue the top part of the conversation in the example above, a user could either re-use the @troops label (in which case the comment is inserted after all other comments that used that label) or target a word in that comment, like @insane. If no label is used, the comment is inserted at the end of the conversation, as usual.
This approach keeps the number of things a user has to know to a minimum, letting all new functionality be controlled by the use of labels.
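For client authors, the backward search described above is simple to sketch. The following is a minimal illustration only (it assumes the conversation is a flat array of message strings; a real client would match whole words and track per-label insertion points):

// Returns the index at which a comment targeted with @$label should be inserted:
// directly after the most recent message containing the label, or at the end of
// the conversation if the label doesn't appear anywhere.
function find_insertion_index($messages, $label)
{
  for ($i = count($messages) - 1; $i >= 0; $i--)
  {
    if (stripos($messages[$i], $label) !== false)
    {
      return $i + 1;
    }
  }

  return count($messages);
}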
Published by marco on 4. May 2007 16:16:31 (GMT-5)
Updated by marco on 5. May 2007 08:52:21 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
The term DRY—Don’t Repeat Yourself—has become more and more popular lately as a design principle. This is nothing new and is the main principle underlying object-oriented programming. As OO programmers, we’ve gotten used to using inheritance and polymorphism to encapsulate concepts. Until recently, languages like C# and Java have had only very limited support for re-using functionality across larger swathes of code. [1] To illustrate this, let’s take a look at a simple class with a descendent as well as some code that deals with lists of these objects and their properties.
Let’s start with some basic definitions [2]:
class Pet
{
public string Name
{
get { return _Name; }
}
public bool IsHouseTrained
{
get { return _IsHouseTrained; }
}
private string _Name;
private bool _IsHouseTrained = true;
}
class Dog : Pet
{
public void Bark() {}
}
class Owner
{
public IList<Pet> Pets
{
get { return _Pets; }
}
private IList<Pet> _Pets = new List<Pet>();
}
This is basically boilerplate for articles about inheritance, so let’s move on to working with these classes. Imagine that the Owner wants to find all pets named “Fido”:
IList<Pet> FindPetsNamedFido()
{
IList<Pet> result = new List<Pet>();
foreach (Pet p in Pets)
{
if (p.Name == "Fido")
{
result.Add(p);
}
}
return result;
}
Again, no surprises yet. This is a standard loop in C#, using the foreach construct and generics to loop through the list in a type-safe manner. Applying the DRY principle, however, we see that we’re going to end up writing a lot of these loops—especially if we offer a lot of different ways of analyzing data in the list of pets. Essentially, the code above is a completely standard loop except for the condition—the (p.Name == “Fido”) part. We can then imagine a function with the following form:
IList<Pet> FindPets(??? condition)
{
IList<Pet> result = new List<Pet>();
foreach (Pet p in Pets)
{
if (condition(p))
{
result.Add(p);
}
}
return result;
}
Now we need to figure out what type condition has. From the function body, we see that it takes a parameter of type Pet and returns a bool value. In C#, the definition of a function is called a delegate, which is also a keyword; for the type above, we write:
delegate bool MatchesCondition(Pet item);
As mentioned above, the return type is a bool, the single parameter is of type Pet, and the delegate is identified by the name MatchesCondition. The name of the parameter is purely for documentation. We can then rewrite the function signature above using the delegate we just defined:
IList<Pet> FindPets(MatchesCondition condition) {…}
We’ve managed to move the looping code for many common situations into a shared method. Now, how do we use it? We originally wanted to find all pets named “Fido”, so we need to define a function that does just that, matching the function signature defined by MatchesCondition:
bool IsNamedFido(Pet p)
{
return p.Name == "Fido";
}
In this fashion, we can write any number of methods, which check various conditions on Pets. To use this method, we simply pass it to the shared FindPets method, like this:
IList<Pet> petsNamedFido = FindPets(IsNamedFido);
IList<Pet> petsNamedRex = FindPets(IsNamedRex);
IList<Pet> houseTrainedPets = FindPets(IsHouseTrained);
This is better than the previous situation—in which we would have repeated the loop again and again—but we can do better. The problem with this solution is that it tends to clutter the class (Owner in this case) with many little methods that are useful only in conjunction with FindPets. Even if the methods are private, it’s a shame to have to use a full-fledged method as a kludge for instancing a piece of code to be called. The C# designers thought so too, so they added anonymous methods, which have a parameter list and a body, but no name. Using anonymous methods, we can replace the methods, IsNamedFido, IsNamedRex and IsHouseTrained, with the following code:
IList<Pet> petsNamedFido = FindPets(delegate(Pet p) { return p.Name == "Fido"; });
IList<Pet> petsNamedRex = FindPets(delegate(Pet p) { return p.Name == "Rex"; });
IList<Pet> houseTrainedPets = FindPets(delegate(Pet p) { return p.IsHouseTrained; });
Again, the keyword delegate
introduces a parameter list and body for the anonymous method.
All of the code above uses the generic IList
and List
classes. None of the looping code in FindPets
is dependent on the type of the list element except for the condition
. It would be really nice if we could re-use this code not just for Pet
s, but for any collection of elements. Generic functions to the rescue. A generic function has one or more generic parameters, which can be used throughout the parameter list and implementation body. The first step in making FindPets
fully generic is to change the definition of MatchesCondition
:
delegate bool MatchesCondition<T>(T item);
As with a generic class, the function’s generic arguments appear within pointy brackets after the identifier—in this case, the single generic parameter is named T
. Pet
has been replaced as the type of the parameter as well. In order to finish making FindPets
fully generic, we’ll have to pass it a list to work with (right now it always uses Pets
) and change the name, so as to avoid confusion:
IList<T> FindItems<T>(IList<T> list, MatchesCondition<T> condition)
{
IList<T> result = new List<T>();
foreach (T item in list)
{
if (condition(item))
{
result.Add(item);
}
}
return result;
}
We’re not quite done yet, though. If you look closely at the function body, all it does is enumerate the items in the parameter list
. Therefore, we can loosen the type-constraint of the parameter from IList
to IEnumerable
, so that it can be called with any collection from all of .NET.
IList<T> FindItems<T>(IEnumerable<T> list, MatchesCondition<T> condition) {…}
And … we’re done. Fully generic! Let’s see how that looks using the examples from above:
IList<Pet> petsNamedFido = FindItems<Pet>(Pets, delegate(Pet p) { return p.Name == "Fido"; });
IList<Pet> petsNamedRex = FindItems<Pet>(Pets, delegate(Pet p) { return p.Name == "Rex"; });
IList<Pet> houseTrainedPets = FindItems<Pet>(Pets, delegate(Pet p) { return p.IsHouseTrained; });
Though we’ve lost something in legibility, we’ve gained quite a bit in re-use. Imagine now that an Owner
also has a list of Vehicle
s, a list of Properties
and a list of Relative
s. You only have to write the conditions themselves and you can search any type of container for items matching any condition … all in a statically type-safe manner:
IList<Pet> petsNamedFido = FindItems<Pet>(Pets, delegate(Pet p) { return p.Name == "Fido"; });
IList<Vehicle> redCars = FindItems<Vehicle>(Vehicles, delegate(Vehicle v) { return (v is Car) && (((Car)v).Color == Red); });
IList<Property> bigLand = FindItems<Property>(Properties, delegate(Property p) { return p.Acreage >= 1000; });
IList<Relative> deadBeats = FindItems<Relative>(Relatives, delegate(Relative r) { return r.MoneyOwed > 0; });
Note: C# 2.0 offers this functionality in the .NET library for both the List
and Array
classes. In the official version, MatchesCondition
is called Predicate
and FindItems
is called FindAll
. It is not known why these functions don’t apply to all collections, as illustrated in our example.
Can we do something about the legibility of the solution from the last section? In C# 2.0, we’ve reached the end of the line. If you’ve been following the development of “Orcas” and C# 3.0/3.5, you might have heard of extension methods [3], which allow you to extend existing classes with new functions without inheriting from them. Let’s extend any IEnumerable
with our find function:
public static class MyVeryOwnExtensions
{
public static IList<T> FindItems<T>(this IEnumerable<T> list, MatchesCondition<T> condition)
{
// implementation from above
}
}
The keyword this
highlighted above indicates to the compiler that FindItems
is an extension method for the type following it: IEnumerable<T>
. Now, we can call FindItems
with a bit more legibility and clarity, dropping both the generic parameter and the actual argument (Pet
and Pets
, respectively) and replacing them with a method call on Pets
directly.
IList<Pet> petsNamedFido = Pets.FindItems(delegate(Pet p) { return p.Name == "Fido"; });
For brevity’s sake, the examples in this section assume use of the extension method defined above. To use the examples with C# 2.0, simply rewrite them to use the non-extended syntax.
We use anonymous methods to avoid declaring methods that will be used for one-off calculations. However, larger methods or methods that are reused throughout a class properly belong to the class as full-fledged methods. At the top, we defined a descendent of the Pet
class called Dog
. Imagine that each Owner
has not only a list of Pet
s, but also a list of Dog
s. Then we’d like to bring back our IsNamedFido
method in order to be able to apply it against both lists (copied from above):
bool IsNamedFido(Pet p)
{
return p.Name == "Fido";
}
Now we can use this method to test against lists of pets or lists of dogs:
IList<Pet> petsNamedFido = Pets.FindItems(IsNamedFido);
IList<Dog> dogsNamedFido = Dogs.FindItems(IsNamedFido);
The example above illustrates an interesting property of delegates, called contravariance. Because of this property, we can use IsNamedFido
—which takes a parameter of type Pet
—when calling FindItems<Dog>
. That means that IsNamedFido
can be used with any list containing objects descended from Pet
. Unfortunately, contravariance only applies in this very special case; the type of dogsNamedFido
cannot be IList<Pet>
because IList<Dog>
does not conform to IList<Pet>
. [4]
However, this courtesy extends only to predefined delegates. If we wanted to replace the call to IsNamedFido
with a call to an anonymous method, we’d be forced to specify the exact type for the parameter, as shown below:
IList<Dog> dogsNamedFido = Dogs.FindItems(delegate(Dog d) { return d.Name == "Fido"; });
Using Pet
as the type parameter does not compile even though it is simply an in-place reformulation of the previous example. Enforcing the constraint here does not restrict the expressiveness of the language in any way, but it’s interesting to note that the compiler relaxes the rule against contravariance only when it absolutely has to.
In the previous section, we created a method, IsNamedFido
instead of using an anonymous method to avoid duplicate code. In that spirit, suppose we further believe that having a name-checking function that checks a constant is also not generalized enough [5]. Suppose we write the following function instead:
bool IsNamed(Pet p, string name)
{
return p.Name == name;
}
Unfortunately, there is no way to call this method directly because it takes two parameters and doesn’t match the signature of MatchesCondition
(and even contravariance won’t save us). You can, however, drop back to using a combination of the defined method and an anonymous method:
IList<Pet> petsNamedFido = Pets.FindItems(delegate (Pet p) { return IsNamed(p, "Fido"); });
This version is a good deal less legible, but serves to show how you can at least pack most of the functionality away into an anonymous method, repeating as little as possible. Even if the anonymous method uses local or instance variables, those are captured along with the delegate; note that C# captures the variables themselves, so the code sees their current values when the delegate is eventually invoked, not copies made when it was created.
For comparison, Java does not support proper closures, requiring final
hacks and creation of anonymous classes in order to perform the task outlined above. Various proposals aim to extend Java in this direction, but, as of version 6, none have yet found their way into the language specification.
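To make the comparison concrete, here is a rough Java sketch of the “find pets named Fido” example; the MatchesCondition interface, the findItems helper and the Pet accessor are assumptions made for this illustration, not part of any library:
interface MatchesCondition<T> {
    boolean matches(T item);
}

// Somewhere in a method; the local must be final to be visible inside the anonymous class.
final String name = "Fido";
List<Pet> petsNamedFido = findItems(pets, new MatchesCondition<Pet>() {
    public boolean matches(Pet p) {
        return name.equals(p.getName());
    }
});
Even this small example needs an extra interface and a final local, which is exactly the ceremony being complained about.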
On a final note, it would be nice to have a cleaner notation for formulating the method call above—in which additional parameters to a function must be collected manually into an anonymous method. The Eiffel programming language offers such an alternative, calling its delegates agents instead [6]. The conformance rules for an agent matching a method signature like MatchesCondition<T>
are different, requiring not that the signature match perfectly, but only that all non-conforming parameters be provided at the time the agent is created.
Eiffel uses question marks to indicate where actual arguments are to be mapped to the agent, so in pseudo-C# syntax, the method call above would be written as:
IList<Pet> petsNamedFido = Pets.FindItems(agent IsNamed(?, "Fido"));
This is much more concise and expressive than the C# version. It differs enough from an actual function call—through the rather obvious and syntax-highlightable keyword, agent—but not so much as to suggest an entirely different mechanism. The developer is made aware that it’s not a regular method call, but a delayed one. C# could easily implement such a feature as pure syntactic sugar, compiling the agent expression to the previous formulation automatically. Perhaps in C# 4.0?
All in all, though, C#’s support for generics, closures and DRY programming is eminently useful and looks only to improve with upcoming features like LINQ and inferred typing, mechanisms that will improve legibility and expressiveness dramatically.
This reduces the expressiveness of the language, but C# forbids this because it cannot statically prevent incorrect objects from being added to the resulting list. Building on the example above, if we assume a class Cat
also descended from Pet
, it would then be possible to do the following:
IList<Pet> dogsNamedFido = Dogs.FindItems(IsNamedFido);
dogsNamedFido.Add(new Cat());
This would cause a run-time error because the actual instance attached to dogsNamedFido
can only contain Dog
s. Instead of adding run-time checking for this special case and enhancing the expressiveness of the language—as Eiffel or Scala, for example, do—C# forbids it entirely, as does Java.
For further information, the articles Generic type parameter variance in the CLR and Using ConvertAll to Imitate Native Covariance/Contravariance in C# Generics are also useful. For more information on closures in C#, see C#: Anonymous methods are not closures and The Power of Closures in C#.
Published by marco on 12. Apr 2007 20:47:55 (GMT-5)
If you’ve ever thought that PHP was too fast or used too little memory or that Java’s class encapsulation was too restrictive, boy has Quercus: PHP in Java got the solution for you. At last, PHP developers can enjoy the benefits of enterprise computing complete with abominable startup times, appalling refresh speeds and PermGen
errors every 15 minutes. And Java developers can finally leave their half-assed web frameworks behind and get behind the ultra-organized global namespace with a little something for everyone that is the PHP API.
This is clearly an April Fools prank that got out of control and is being delivered unconscionably late. There is no such thing as an idea so bad that the Internet can’t bring enough people together to make it happen.
Most Tapestry programming involves writing event handlers and operations on page objects. In order to execute these operations, you need access to properties of the form and properties of the session and... [More]
Published by marco on 31. Jan 2007 23:16:42 (GMT-5)
Updated by marco on 11. Feb 2007 21:59:51 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
Most Tapestry programming involves writing event handlers and operations on page objects. In order to execute these operations, you need access to properties of the form and properties of the session and application in which the page resides. For convenience, developers can add references to all sorts of objects in the system using various forms of the @Inject*
annotation (like @InjectPage
, @InjectObject
and so on). Pages are declared abstract and, when instantiated, Tapestry extends the abstract class to fill in all of these injected objects and maintain proper initialization and linkage with the rest of the system.
As long as this works as expected, there’s no problem. However, when Tapestry can’t inject a property as specified, it fails silently instead of throwing an exception. Surely, the failure is logged somewhere, but turning on logging results in a flood of output that is all-too-quickly overwhelming. If an object cannot be injected, there should be an exception—else why would the page have tried to inject it? Because it would appreciate access to that object, but only if it’s not too much trouble?
Injection does not work if the property to inject also happens to override an existing method or implements an interface method. The example below shows what to watch out for:
Given the following interface:
interface IEditorInterface {
IPage getEditorPage();
}
If a page class implements this interface in the following way, by trying to get Tapestry to inject an implementation (which would be quite elegant), the page reference sometimes returns null and sometimes generates a duplicate declaration error caused by a race condition (see Random error acces[s]ing for more information).
public abstract class EditorPage extends BasePage implements IEditorInterface {
@InjectPage("Editor")
public abstract Editor getEditorPage();
}
As mentioned above, this would be quite elegant, but it doesn’t function reliably at all. In the case of @InjectPage
, it is quite shaky, whereas properties declared as abstract
, which are automatically implemented and managed by Tapestry, are always null or result in runtime abstract errors (class instantiated, but method not implemented).
To get around this problem, use two separate methods, one to implement the interface and the other to inject the object:
public abstract class EditorPage extends BasePage implements IEditorInterface {
public IPage getEditorPage() {
return getInjectedEditorPage();
}
@InjectPage("Editor")
public abstract Editor getInjectedEditorPage();
}
In the above example, the function name for the injected page was changed; it is also possible to adjust the interface to use a more specific name, like “getEditorInterfacePage”. This solution lets the class use the name “getEditorPage” for the more specific type (returning Editor instead of IPage).
Using Java 1.5, Tapestry 4.02, Hivemind 1.1.1
Hibernate is a persistence framework for Java. Among the many perks it purports to bring to the table is automatic versioning for objects in the database. That is, when saving an object to the database, it... [More]
Published by marco on 15. Jan 2007 17:31:51 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
Hibernate is a persistence framework for Java. Among the many perks it purports to bring to the table is automatic versioning for objects in the database. That is, when saving an object to the database, it increments a version number. Any process that attempts to store a different version of the same object is rejected. This is all extremely flexible and can be added to a POJO using an annotation:
@Version
private int version;
Nice … a single annotation takes care of people overwriting each other’s data. The exercise of handling the ensuing StaleObjectStateException
in the user interface is left up to the reader.
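A minimal sketch of such handling might look something like the following; the session variable, the entity and the way the conflict is reported to the user are assumptions for illustration only:
try {
    session.saveOrUpdate(entity);
} catch (StaleObjectStateException e) {
    // Another transaction saved a newer version in the meantime;
    // tell the user instead of silently overwriting their changes.
    reportConflict("This record was modified by someone else. Please reload and try again.");
}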
Now, imagine that we have an object—call it a Book—in memory and we render it to a web page. On that page is a button which attaches more information to the object—say an Author—then saves and rerenders the same book in the page. The user can add and save authors or change other book properties and save the book to exit edit mode for that book. Though split over multiple page requests, as far as Hibernate is concerned, the following actions occur on that object [1]:
book.save();
book.addAuthor(new Author(getNameOfAuthor()));
book.save();
book.addAuthor(new Author(getNameOfAuthor()));
book.save();
book.save();
// exit edit mode …
This does not work. Hibernate raises a StaleObjectStateException
on the second execution of save()
because it never updated the version number in the object when it saved it the first time. That is, the automatically managed field version
is not synchronized with the object being saved when it is modified in the database. It’s not like Hibernate doesn’t know how to do this—fields marked with the @Id
annotation are updated as expected.
At this point, there are two things to do:
1. Debug Hibernate to find out why @Version isn’t treated the same as @Id.
2. Work around the problem in our own code.
After a bit of initial debugging pursuing choice (1), it became clear that choice (2) would be much more efficient (not least because the line numbers in the accompanying sources didn’t match the jar file).
The first step was to search online for this problem, but that was relatively fruitless: no one else seemed to have had this problem, didn’t regard it as a problem or hadn’t noticed it yet. With hundreds, if not thousands, of companies using Hibernate, it’s hard to believe that this feature is designed like this, or is fundamentally broken. The internet having failed us, we’re left to fix this problem ourselves.
As some of you may already have been dying to point out, the quick and dirty way of fixing this is to simply update the version number by hand. At the risk of making any programming purists ill, here’s that code:
book.save();
book.setVersion(book.getVersion() + 1);
However, this code assumes that it knows exactly how the automatic versioning feature of Hibernate works (or, rather, doesn’t) and how future versions will work. Not liking that solution, we decide that we’ll probably need to hit the database again; for that purpose there’s the refresh()
function.
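In code, the attempt looks roughly like this (a sketch using the save()/refresh() shortcuts described in the footnote at the end of this article; getVersion() is just the getter for the @Version field):
book.save();
book.refresh(); // delegates to the session's refresh(); we expect the version to be re-read here
int version = book.getVersion();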
…aaaaaand, the version number is still zero.
Taking a look at the database shows that the object with this id clearly has a version number of 2. Let’s turn on query logging to see what Hibernate is doing when it executes a refresh on our object. In the hibernate.cfg.xml
file, set the following property (it’s probably set to false in your default configuration):
<hibernate-configuration>
<session-factory>
…
<property name="hibernate.show_sql">true</property>
This time the console shows that Hibernate does indeed execute a select and does indeed select our version field. It, however, fails to apply that value to the object on which refresh()
was called.
That hack solution at the beginning of this section is starting to look mighty good. At this point, we’re left with the alternative of reloading the object from the database in order to simulate the seemingly non-functional refresh()
. Reloading the object does get the correct version and leaves us with an object that we can use for further editing & saving operations.
book.save();
book = book.refresh();
book.addAuthor(new Author(getNameOfAuthor()));
book.save();
book = book.refresh();
// exit edit mode …
With this solution, however, we’re forced to create a new object, reassigning the reference to book
. Though further reading online turned up references to tantalizing tidbits like load(Object obj,Serializable id)
, it only generated NonUniqueObjectExceptions
; no combination of evict()
and flushing the session was able to avoid this. After more fruitless investigation in this vein—and perusal of the Hibernate documentation, which, while good, doesn’t link to actual examples of these methods in action—the “fake” refresh outlined above was accepted as a general solution.
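For concreteness, here is a hedged sketch of what such a “fake” refresh might look like; the getSession() and getId() helpers are assumptions, not from the original code:
public Book refresh() {
    // Discard the stale in-memory state and load a fresh copy, version number and all.
    return (Book) getSession().get(Book.class, getId());
}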
Be aware, however, that any other references to the object represented by book
are not updated and will still have the wrong version
. In straightforward web applications, where the object is primarily referenced from a single page object, this is less likely to be a problem. In applications with more sophisticated operations—for instance, where the reference is part of a graph of objects being edited—the refresh()
outlined above is not a proper solution.
save()
and refresh()
execute the similarly named functions on the Hibernate session … these assumed shortcut functions make the code easier to read.
Using Java 1.5, Hibernate 3.2
If you are not already familiar with HiveMind, read Setting up a Service in HiveMind for an introduction. [1]
In the article mentioned above, we learned how to set up a new HiveMind service. What if we want to... [More]
Published by marco on 1. Jan 2007 23:29:29 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
If you are not already familiar with HiveMind, read Setting up a Service in HiveMind for an introduction. [1]
In the article mentioned above, we learned how to set up a new HiveMind service. What if we want to replace the implementation for an existing service? Is it even possible? Why would you want to do that? This article answers these questions in the context of a real-life example from one of our applications.
In our Tapestry applications, we use the ExternalLink because it provides a standard URL that can be bookmarked and refreshed across sessions (which is handy during development as well). This link relies on the ExternalService
, which is configured and created by HiveMind. We wanted to experiment with this implementation to include more flexibility as to which method would be called [2]; at first, we just replaced the class in our application and worked with it from there.
This worked fine until we wanted to be able to configure some basic properties we’d added to the service from our application’s configuration file. The easiest way to do this would be to give the external service access to the custom application service we’d created in our own HiveMind file. So, we’re faced with the problem of injecting an application-specific service into a library service from another module.
As usual with HiveMind, you just have to know that there is a specific element for doing this called <implementation>
. If you declare a service using this element instead of <service-point>
, HiveMind searches for the previous service declaration and replaces the entire creation clause for that service. That means that any service properties set by the initial declaration have to be copy/pasted into the implementation override or HiveMind won’t set those properties anymore.
The declaration below does this for the external service, adding in the application-specific property in addition to the two properties stipulated by the original implementation (copied from Tapestry’s configuration).
<implementation service-id="tapestry.services.External">
<invoke-factory>
<construct class="com.encodo.tapestry.ConfigurableExternalService">
<set-object property="responseRenderer" value="infrastructure:responseRenderer"/>
<set-object property="linkFactory" value="infrastructure:linkFactory"/>
<set-service property="applicationService" service-id="CustomApplicationService" />
</construct>
</invoke-factory>
</implementation>
Once you know how, it’s quite simple … and powerful. Any component of Tapestry can be replaced in this rather elegant way to make application-specific implementations of common services. If you’re simply extending existing functionality, you have to remember to copy the entire prior specification; if you’ve replaced the implementation wholesale, then it’s possible you no longer need the properties set by the prior implementation and can leave them out.
The examples assume a module id of com.encodo.customer.project; unqualified service ids are relative to it.
Using Java 1.5, Tapestry 4.02, HiveMind 1.1.1
If you are not already familiar with HiveMind, read Setting up a Service in HiveMind for an introduction. [1]
Almost every application is going to need to have information that is session-specific. This is... [More]
Published by marco on 1. Jan 2007 23:29:20 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
If you are not already familiar with HiveMind, read Setting up a Service in HiveMind for an introduction. [1]
Almost every application is going to need to have information that is session-specific. This is accomplished by adding a member to Tapestry’s application objects list and assigning it the proper scope. With a scope of “session”, HiveMind makes sure that each session in the web application has its own copy.
<contribution configuration-id="tapestry.state.ApplicationObjects">
<state-object name="CustomSessionService" scope="session">
<create-instance class="tapestry.CustomSessionService"/>
</state-object>
</contribution>
The tag names are quite straightforward in this case, with a CustomSessionService
instantiated for each session.
There is, within HiveMind, a concept known as “auto-wiring”, which purports to automatically make the connection between services based on interfaces: if one HiveMind service has a setter accepting an interface as declared by one and only one other HiveMind service, it is automatically applied to that service. That is, assume that the following service is the only one providing the ICustomApplicationService
interface to the application:
<service-point id="CustomApplicationService" interface="tapestry.ICustomApplicationService">
Any other service declared in HiveMind, whose implementation includes a public setter method taking a parameter of type ICustomApplicationService
(as shown below), will have this service automatically injected into it via that method.
public void setCustomApplicationService(ICustomApplicationService _service) {
applicationService = _service;
}
The exact rules for setter naming aren’t known [2]; it is recommended that you stick to using the id of the service prepended with “set”.
Though we have seen auto-wiring work and it works consistently, it’s not always immediately obvious where it won’t work; generally, we found it’s much better to just declare the connection explicitly for two reasons:
If the session needs further configuration or connection with other services, the following, though logical (especially in light of how services are configured), does not work:
<contribution configuration-id="tapestry.state.ApplicationObjects">
<state-object name="CustomSessionService" scope="session">
<construct object="service:CustomSessionService">
<set-service property="applicationService" service-id="CustomApplicationService" />
</construct>
</state-object>
</contribution>
The element <construct>
is not allowed within a state-object declaration. Instead, as with services, the state-object must be created using a factory. If you’re already familiar with the way service factories are declared, declaring a state object factory and connecting it to the session object is straightforward.
<service-point id="QMSSessionServiceFactory" interface="org.apache.tapestry.engine.state.StateObjectFactory">
<invoke-factory>
<construct class="tapestry.CustomSessionServiceFactory">
<set-service property="applicationService" service-id="CustomApplicationService" />
</construct>
</invoke-factory>
</service-point>
<contribution configuration-id="tapestry.state.ApplicationObjects">
<state-object name="CustomSessionService" scope="session">
<invoke-factory object="service:CustomSessionServiceFactory" />
</state-object>
</contribution>
For completeness, we’ll include an example implementation of the state object factory to show which function needs to be overridden:
public class CustomSessionServiceFactory implements StateObjectFactory {
private ICustomApplicationService applicationService;
public Object createStateObject() {
CustomSessionService result = new CustomSessionService();
result.setApplicationService(getApplicationService());
return result;
}
public ICustomApplicationService getApplicationService() {
return applicationService;
}
public void setApplicationService(ICustomApplicationService _service) {
applicationService = _service;
}
}
In this case, it’s the createStateObject()
function that must be implemented in order to create the session service. HiveMind sets the application service using setApplicationService()
(there is an error if this is not defined because the HiveMind configuration for the factory references the applicationService
property) and the factory simply passes the application service to the session service when it creates it. Granted, it would have been much more intuitive to simply be able to specify this all in the HiveMind configuration (as attempted above), but at least there is a workaround.
Web applications handle browser requests, so they generally create a small environment of objects to handle each request. Each request is handled in its own thread; the application can use HiveMind to create and connect these types of objects as well. For example, if you’re working with Hibernate, each request needs its own session—it’s a bad idea to keep them open across multiple requests—so you can delegate creation of this session to HiveMind. Unfortunately, the following attempt won’t get you very far at all (though it is pretty intuitive):
<contribution configuration-id="tapestry.state.ApplicationObjects">
<state-object name="CustomHibernateSession" scope="request">
<create-instance class="tapestry.CustomHibernateSession"/>
</state-object>
</contribution>
Instead, per-request objects are declared as full-fledged services, but use a different invocation model. The declaration looks like that for any other service, but with one small difference, the threaded model:
<service-point id="CustomHibernateSession" interface="tapestry.ICustomHibernatSession">
<invoke-factory model="threaded">
<create-instance class="tapestry.CustomHibernateSession"/>
</invoke-factory>
</service-point>
Remember that the names for state objects and ids for services can be anything, as long as they are unique within the union of all HiveMind configurations in your application (including those from Tapestry and any contributions your application uses).
As we’ve seen in both this article and Setting up a Service in HiveMind, HiveMind can be incredibly useful for dynamic configurations like web applications. However, it’s difficult to build on previous knowledge when tackling a new problem—the syntax for each type of situation is slightly different. Once you know how, it’s not bad, but getting there can be quite a battle.
The examples assume a module id of com.encodo.customer.project; unqualified service ids are relative to it.
Using Java 1.5, Tapestry 4.02, HiveMind 1.1.1
HiveMind is the IOC manager used together with Tapestry; it’s in charge of bootstrapping and connecting all of the myriad objects and services available to a Tapestry application. Applications based on... [More]
Published by marco on 19. Dec 2006 22:58:54 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
HiveMind is the IOC manager used together with Tapestry; it’s in charge of bootstrapping and connecting all of the myriad objects and services available to a Tapestry application. Applications based on Tapestry are encouraged to use it to configure their application- and session-level objects and services as well.
Once it works, it works well. Getting it configured in the first place—especially when new to HiveMind—is an exercise in patience. Larger errors are detected at startup, when HiveMind tries to parse the configuration you’ve entered. Once you’ve gotten a bit better at this, you’re building a configuration that parses correctly, but fails during execution when HiveMind fails to connect one object to another as you expected (usually due to a naming mismatch of some sort or another).
Every module starts with the <module>
tag, like this:
<module id="com.encodo.customer.project" version="1.0.0">
If HiveMind can’t locate a class specified in the configuration, it prepends the module id and tries again. A well-chosen module id makes the enclosed specification much cleaner because most of the canonical class name can be left off.
One of the most common concepts to configure in HiveMind is that of a global service. Each service is a POJO and is instantiated under a particular name by HiveMind. Taking it slowly, let’s look at the introductory tag of a service declaration:
<service-point id="CustomApplicationService" interface="tapestry.CustomApplicationService">
…
</service-point>
This creates a service identified by the id com.encodo.customer.project.CustomApplicationService
(expanded to include the module id). The service is understood to have the interface represented by the class, com.encodo.customer.project.tapestry.CustomApplicationService
(also expanded to include the module id). Do not confuse the two. Though the service id looks like a class name, it does not have to correspond to an existing class. The outer wrapper identifies the service uniquely and indicates which interface it can be expected to have. It remains to tell HiveMind how to create and configure the object that represents the service.
<service-point id="CustomApplicationService" interface="tapestry.CustomApplicationService">
<invoke-factory>
<create-instance class="tapestry.CustomApplicationService"/>
</invoke-factory>
</service-point>
Now HiveMind is happy and will create an instance of the class com.encodo.customer.project.tapestry.CustomApplicationService
when requested.
HiveMind services are loaded on-demand, in order to reduce startup time to only that time required to parse configuration files. If you need the service to be started immediately—for example, if it is a periodic task that is never accessed directly by the application—add the service to HiveMind’s EagerLoad
contribution point.
<contribution configuration-id="hivemind.EagerLoad">
<load service-id="CustomApplicationService"/>
</contribution>
Though recent versions of HiveMind allow a service to be declared without an explicit interface (as done above), HiveMind treats services declared thusly differently. Specifically, it instantiates the service object twice instead of just once. Since this is often an undesired side-effect, it’s better to just extract an interface from the class and use that when declaring the service.
<service-point id="CustomApplicationService" interface="tapestry.ICustomApplicationService">
<invoke-factory>
<create-instance class="tapestry.CustomApplicationService"/>
</invoke-factory>
</service-point>
Multiple interdependent services are configured slightly differently. Instead of simply telling HiveMind to create an instance of an object, you have to use the <construct>
tags, which allow nesting of <set-service>
tags (among others). The example below shows the application service, but now dependent on a fileManagerService.
<service-point id="CustomFileManagerService" interface="tapestry.ICustomFileManagerService">
<invoke-factory>
<create-instance class="tapestry.CustomFileManagerService"/>
</invoke-factory>
</service-point>
<service-point id="CustomApplicationService" interface="tapestry.ICustomApplicationService">
<invoke-factory>
<construct class="tapestry.CustomApplicationService">
<set-service property="fileManagerService" service-id="CustomFileManagerService"/>
</construct>
</invoke-factory>
</service-point>
Since both services are defined in the same module, one can refer to the other with its short name, CustomFileManagerService
instead of the canonical service id, com.encodo.customer.project.CustomFileManagerService
(and, yes, the ids are case-sensitive). The property name must also match the following method in CustomApplicationService
.
public void setFileManagerService(ICustomFileManagerService _service) {
fileManagerService = _service;
}
If it is not available, HiveMind will throw an exception. HiveMind also has no trouble connecting the file manager simultaneously to the application service, like this:
<service-point id="CustomFileManagerService" interface="tapestry.ICustomFileManagerService">
<invoke-factory>
<construct class="tapestry.CustomFileManagerService">
<set-service property="applicationService" service-id="CustomApplicationService"/>
</construct>
</invoke-factory>
</service-point>
The services will be able to access each other through the proxies created and assigned by HiveMind.
Be aware, though, that in the constructor of each service, the properties to be set by HiveMind are still null … because HiveMind hasn’t had the chance to assign them yet. This means that you can’t read properties from one service in order to configure the other service, and you can’t pass the service reference to other objects, because it will always be null in the constructor.
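The practical consequence is to touch connected services only from regular methods, never from the constructor. A rough sketch (the getRootPath() and storeFile() methods are invented for this illustration):
public class CustomFileManagerService {
    private ICustomApplicationService applicationService; // assigned later by HiveMind

    public CustomFileManagerService() {
        // Wrong: applicationService is still null at this point.
        // String root = applicationService.getRootPath();
    }

    public void setApplicationService(ICustomApplicationService _service) {
        applicationService = _service;
    }

    public void storeFile(String name) {
        // Right: by the time a service method is called, HiveMind has assigned the property.
        String root = applicationService.getRootPath();
        // ...
    }
}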
One way around this is to use a service factory to create a service. The pattern works as follows:
1. Declare a separate factory service.
2. Move the <set-service>/<set-property> calls to the service factory instead of the service.
3. Have the service’s <invoke-factory> refer to the factory instead of using <construct> or <create-instance>.
Let’s make a service factory for the file manager service.
<service-point id="CustomFileManagerServiceFactory"
interface="org.apache.hivemind.ServiceImplementationFactory">
<invoke-factory>
<construct class="tapestry.CustomFileManagerServiceFactory">
<set-service property="applicationService" service-id="CustomApplicationService" />
</construct>
</invoke-factory>
</service-point>
As you can see, the application service property is set on the factory now instead of the file manager itself. The file manager can now be created using the factory, like this:
<service-point id="CustomFileManagerService" interface="doclib.CustomFileManagerService">
<invoke-factory service-id="CustomFileManagerServiceFactory"/>
</service-point>
Instantiation of the file manager is delegated to the service factory, which creates the file manager object, assigns the necessary properties (like the application service) and returns it. A possible Java implementation is shown below:
public class CustomFileManagerServiceFactory implements ServiceImplementationFactory {
private ICustomApplicationService applicationService;
public ICustomApplicationService getApplicationService() {
return applicationService;
}
public void setApplicationService(ICustomApplicationService _applicationService) {
applicationService = _applicationService;
}
public Object createCoreServiceImplementation(ServiceImplementationFactoryParameters _factoryParameters) {
CustomFileManagerService result = new CustomFileManagerService();
result.setApplicationService(getApplicationService());
return result;
}
}
createCoreServiceImplementation
is called after the factory has been created and all other HiveMind properties assigned. Though quite a bit more work—and not at all intuitive—it’s at least possible to control configuration at quite a low level … once one knows how. Even the example as given won’t work because of one niggling little detail: HiveMind will parse the configuration without trouble, but will throw an exception when trying to create the service:
The solution is not immediately obvious, but, after digging around on the web and finding Bug with parameter-occurs?, it became clear that the default parameter setting was wrong. The fix is to allow the service factory to be instantiated with 0 parameters (instead of requiring 1):
<service-point id="CustomFileManagerServiceFactory"
interface="org.apache.hivemind.ServiceImplementationFactory">
parameters-occurs="0..n"
…
</service-point>
With this fix in place, HiveMind can create the service factory and the file manager service as expected.
Using Java 1.5, Tapestry 4.02, HiveMind 1.1.1
Every once in a while, when adding a new component to or changing an existing one on a Tapestry page, you’ll make a mistake. Most of the time, the exception handler page is pretty good; sometimes the exception... [More]
Published by marco on 14. Dec 2006 07:08:49 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
Every once in a while, when adding a new component to or changing an existing one on a Tapestry page, you’ll make a mistake. Most of the time, the exception handler page is pretty good; sometimes the exception can be quite confusing. For example, suppose we have a custom component with a single property:
package com.encodo.blogs.samples;
public abstract class CustomComponent extends BaseComponent {
public abstract SomeObject getCustomParameter();
}
To use this component in a page, you just write the following:
<span jwcid="@CustomComponent" customParameter="obj"/>
This looks ok [1], but when loaded in a browser causes the following error:
org.apache.tapestry.BindingException
Error converting value for template parameter customParameter: No type converter for type com.encodo.blogs.samples.SomeObject
is available.
With this kind of error message, you’re ready to start imagining all sorts of horrible things: do you have to register a custom type converter somewhere? Does SomeObject have to implement Serializable?
The missing magic in the above example is ognl:
. Tapestry uses the Object-Graph Navigation Language to process references to Java code in its templates and page/component definitions. However, the default for HTML attributes is literal:
, which performs no extra processing. Since ognl:
is missing, obj
is simply a string, which Tapestry cannot convert to SomeObject
. To fix the problem, just add ognl:
before the object reference, like this:
<span jwcid="@CustomComponent" customParameter="ognl:obj"/>
This being such a common error, it would be nice if Tapestry could do some common-sense handling of it to help emit a better message. One simple way is to specify what, exactly, it was trying to convert to the target object. Compare to the following error message:
org.apache.tapestry.BindingException
Error converting “obj” (interpreted as “literal:obj”) for template parameter customParameter: No type converter for type com.encodo.blogs.samples.SomeObject
is available.
Once the developer sees how Tapestry interpreted the component declaration, it’s much easier to pinpoint the error. Now, for purely aesthetic reasons, let’s make this message more user-friendly:
org.apache.tapestry.BindingException
: The value for template parameter “customParameter”, given as “obj” and interpreted as “literal:obj”, could not be converted to com.encodo.blogs.samples.SomeObject
.
That error message, at least, should no longer inspire panic and desperate restarts of the testing server.
Using Java 1.5
One of the features we expect from a collections library is sorting. You should be able to use generic library mechanisms to sort a list of any kind of element. Most libraries include a generic sort
function,... [More]
Published by marco on 6. Dec 2006 21:32:03 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
One of the features we expect from a collections library is sorting. You should be able to use generic library mechanisms to sort a list of any kind of element. Most libraries include a generic sort
function, to which a comparison functor (object or function pointer) is passed. This functor is called repeatedly on pairs of elements until the list is sorted.
Let’s define the simple class we’ll use in the ensuing examples.
class A {
String fileName;
String getFileName() {
return fileName;
}
}
Now, let’s sort a List<A>
. Is there a sort
function on the list object itself? No. Why not? Legacy reasons. In order to avoid breaking existing code, Java has not made any changes to the List
interface. Ever. Therefore, you will have to search for any new functionality in the global function unit masquerading as a class [1] called Collections
.
There are two sorting functions defined in this class, shown below:
public static <T extends Comparable<? super T>> void sort(List<T> list);
public static <T> void sort(List<T> list, Comparator<? super T> c);
Neither one of these is exactly easy on the eyes and both include the wildcard (?) operator in their definitions. The first version accepts a List<T>
only if T extends Comparable
directly or a generic instantiation of Comparable
which takes T or a superclass as a generic parameter. The second version takes a List<T>
and a Comparator
instantiated with T or a supertype.
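The second form never appears in this article, so here is a hedged sketch of what it looks like, sorting by file name with an explicit Comparator (it assumes a List<A> named list and uses the class A defined above):
Collections.sort(list, new Comparator<A>() {
    public int compare(A left, A right) {
        return left.getFileName().compareTo(right.getFileName());
    }
});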
With our simple class A
above, it would be pretty easy to implement the interface to give the class a standard ordering. Naively, we might add the following:
class A implements Comparable {
String fileName;
String getFileName() {
return fileName;
}
public int compareTo(Object _o) {
return getFileName().compareTo(((A) _o).getFileName());
}
}
The compiler seems pretty happy with it, but actually calling sort()
with a List<A>
results in a warning:
Type safety: Unchecked invocation sort(List<A>) of the generic method sort(List<T>) of type Collections
If you’re using Eclipse, your “Quick-Fix” trigger finger is probably getting mighty itchy right now, but let’s avoid simply adding a @SuppressWarnings annotation above this function and try to find out why the class compiles, but the function call has a problem.
A quick search through Google Groups turns up Collections.sort() in Java 5, which reminds us that “[t]he Comparable interface takes a generic parameter”. Adding in the generic argument (as shown below) fixes the problem and gets rid of the warning.
class A implements Comparable<A> {
…
public int compareTo(A _o) {
return getFileName().compareTo(_o.getFileName());
}
}
On top of that, the generic version of Comparable
allows us to declare a type-specific compareTo
function and get rid of the ugly cast. All’s well that ends well, but it would be much nicer if the compiler could tell us that we are misusing Comparable
than to have to find out from some guy in a newsgroup.
Using Java 1.5
Develop your web application using Firefox. Validate your (X)HTML, validate your CSS, test your JavaScript. Tweak graphics, tweak layout. Get the client to sign off. Now that everything’s looking and working... [More]
Published by marco on 3. Dec 2006 23:16:24 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
Develop your web application using Firefox. Validate your (X)HTML, validate your CSS, test your JavaScript. Tweak graphics, tweak layout. Get the client to sign off. Now that everything’s looking and working just right, it’s time to get it running in IE. Fire up IE and load the application.
?!?!
Lingering problems with PNG graphics, improperly interpreted CSS, imaginative approach to HTML layout—these are the types of problems you expect in IE. But IE refusing to load any page in the application at all? That’s a new one.
The first step was to load the TamperData extension with Firefox to get a look at the HTTP request and response headers. From this, it appeared that the server was using an HTTP 1.1-only feature by setting the Transfer-Encoding
to chunked
. Forcing Firefox to use HTTP 1.0 disables this feature and returns a Content-Length
instead. Go back to IE and force the HTTP compliance to version 1.0 and reload the page. Nothing. Same error.
So it doesn’t seem to be anything on the server … could it possibly be the content? This article, Internet Explorer Programming Bugs, yielded a wealth of information, including the following lead:
“Apparently interacting with innerHTML and possibly using other JScript functionality causes IE to pop up “Internet Explorer cannot open the Internet site http://example.com. Operation aborted.” messages after loading a page. … It seems that IE doesn’t like when somebody is trying to modify content of “document.body” by adding new elements (previous example) or by modifying its innerHTML (my case).”
Go back to the application and remove all scripts from the page, including the dojo libraries included by Tacos, an Ajax/scripting framework for Tapestry. Reload the page in IE and it loads without problems.
Getting closer.
Well, we can’t just shut off Javascripting for a modern web application, so let’s check for dojo/IE6 conflicts. Digging further turns up this post, [Dojo-checkins] [dojo] #557: IE 6 refuses to load page if dojo is loaded in HEAD and a BASE tag exists before that. The title says it all. Re-enabling the scripts and loading in Firefox confirms that there is indeed a base tag before the dojo scripts.
Ok. So that seems to be the problem. We use the @Shell Tapestry component to render the HTML head, which has a renderBaseTag
property. Set this property to false
and the page works in IE as designed [1].
We’re using Tapestry 4.0.1 and had already extended the @Shell
component to accept an array of scripts (analogous to the array of stylesheets it already accepts). Therefore, we simply changed the implementation to output the base tag at the end of the <head>
section, to avoid any future conflicts in IE6.
After a cursory examination of the sources for Tapestry 4.1, it seems that both the delegate
and ajaxDelegate
are rendered before the base tag, if present. This means that Tapestry 4.1 applications that require a base tag will also not function in IE6.
We’ll have to submit a patch in order to get this fixed once and for all.
Using Firefox 1.5, TamperData 9.8.1, IE 6 SP2, Dojo 0.4, Java 1.5, Tapestry 4.02 and Tacos 4.0.1
As of version 1.5, Java has blessed its developers with generics, which increase expressiveness through improved static typing. With generics, Java programmers should be able to get away from the “casting... [More]
Published by marco on 17. Nov 2006 16:34:38 (GMT-5)
Updated by marco on 1. Dec 2006 08:50:31 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
As of version 1.5, Java has blessed its developers with generics, which increase expressiveness through improved static typing. With generics, Java programmers should be able to get away from the “casting orgy” to which Java programming heretofore typically devolved. The implementation in 1.5 does not affect the JVM at all and is restricted to syntactic sugar, wherein the compiler simply performs the casts for you.
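As a quick illustration of what that sugar buys, here is a trivial, hedged example (not from the original article):
// Before generics: everything comes back as Object and must be cast.
List names = new ArrayList();
names.add("Fido");
String first = (String) names.get(0);

// With generics, the compiler checks the element type and inserts the cast for us.
List<String> typedNames = new ArrayList<String>();
typedNames.add("Fido");
String typedFirst = typedNames.get(0);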
Let’s build a class hierarchy and see how much casting Java saves us. Assume that you have defined a generic hierarchy using the following class:
public class DataObject {
private String name;
private List<DataObject> subObjects = new ArrayList<DataObject>();
public String getName() {
return name;
}
public List<DataObject> getSubObjects() {
return subObjects;
}
}
Well, now that’s an improvement! The class can express its intent in a relatively clear syntax without creating a specialized list class for the private field and result type. Assume further that there are various sub-classes of this DataObject
, which want to provide type-specific helper functions for their sub-lists. For example:
public class A extends DataObject {
}
public class B extends DataObject {
public List<A> getAs() {
return getSubObjects();
}
}
Though this is exactly what we would like, it won’t compile. The compiler instead returns the error:
Type mismatch: cannot convert from List<DataObject> to List<A>
In the next section, we’ll find out why.
For some reason, List<A>
does not conform to List<DataObject>
, even though A
inherits from DataObject
. The Generics Tutorial (PDF) Section 3 explains:
“In general, if Foo is a subtype (subclass or subinterface) of Bar, and G is some generic type declaration, it is not the case that G<Foo> is a subtype of G<Bar>. This is probably the hardest thing you need to learn about generics, because it goes against our deeply held intuitions.”
Indeed it is hard to learn and indeed it does go against intuitions. Is there a more specific reason why generics is implemented in this way in Java? Java’s competitor, C#, is limited in exactly the same way and the C# Version 2.0 Specification (DOC) or the Google HTML version offers the following explanation:
“No special conversions exist between constructed reference types other than those described in §6. In particular, unlike array types, constructed reference types do not exhibit “covariant” conversions. This means that a type List<B> has no conversion (either implicit or explicit) to List<A> even if B is derived from A. Likewise, no conversion exists from List<B> to List<object>.
“The rationale for this is simple: if a conversion to List<A> is permitted, then apparently one can store values of type A into the list. But this would break the invariant that every object in a list of type List<B> is always a value of type B, or else unexpected failures may occur when assigning into collection classes.”
The key word here is covariance. Neither Java nor C# supports it (except for return types, where there are no dangers involved) because of function calls that, in the Eiffel world, have long been called “catcalls”. Suffice it to say that both Java and C# have elected to limit expressiveness and legibility in order to prevent this type of error from happening. [1]
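In terms of the classes from this article, the error being prevented looks like this (a sketch; the assignment marked below is exactly what the compiler rejects):
List<A> as = new ArrayList<A>();
// If List<A> were allowed to conform to List<DataObject>...
List<DataObject> objects = as;   // ...this line (which Java rejects) would compile...
objects.add(new DataObject());   // ...a plain DataObject could then sneak into a list of As...
A first = as.get(0);             // ...and the failure would only appear here, at run time.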
Since Java has clearly stated that it neither condones nor supports what we would like to do, we can choose one of several options:
1. Keep returning List<DataObject> and just go back to casting to get the desired <A> when needed.
2. Find a way to return a List<A> without complaints from the compiler.
Since we’re stubborn, we’ll go with (2) above and dig a little deeper into generics. One solution is to create the list on-the-fly and transfer all the elements over to it.
public List<A> getAs() {
List<A> result = new ArrayList<A>();
for (DataObject obj : getSubObjects()) {
result.add((A) obj);
}
return result;
}
Mmmmm…lovely. It does the soul good and makes the heart swell with pride to write code like this. So clear and understandable—and such a lovely mix of new-style iteration with old-style casting! Methinks we’ll try again. In the first attempt, we returned List<DataObject>
from getSubObjects()
. Is there another result type we could use?
Java’s generics include something called wildcards, which allow a restricted form of covariance, in which the character ? acts as a placeholder for any class type at all. Wildcards are especially useful for function arguments, where they allow any list of elements to be passed. Imagine we wanted to pass in a list of DataObjects
to a function to be printed. Using wildcards, we can write the following:
public void printCollection(Collection<?> _objects) {
for (Object o : _objects) {
System.out.println(o);
}
}
The example above takes any collection at all and prints each of its elements. It only works because the compiler knows that any class that replaces ? must inherit from java.lang.Object
, so it can access any methods of that class from within the function. This is extremely limited since we can’t access any DataObject
-specific functions, so Java also includes bounded wildcards, which allow a wildcard to restrict the types of objects that may be used as the generic argument. Let’s rewrite printCollection
so that we can access DataObject
’s members without casting:
public void printCollection(List<? extends DataObject> _objects) {
for (DataObject o : _objects) {
System.out.println(o.getName());
}
}
Whereas this mechanism suffices for the example above, wildcards exact a hidden price: they do not conform to anything. That is, though List<A>
conforms to the formal parameter, List<? extends DataObject>
, you cannot then call add()
on it. That is, the following code doesn’t work:
public void extendCollection(List<? extends DataObject> _objects) {
_objects.add(new DataObject());
}
The parameter of _objects.add()
is of type ? extends DataObject
, which is completely unknown to the Java compiler. Therefore, nothing conforms to it … not even DataObject
itself!
Using the example above, we can recap the different approaches to using generics in Java:
1. List<DataObject> as the formal argument doesn’t allow us to pass a List<A>.
2. List<?> as the formal argument allows us to use only those functions defined in java.lang.Object on elements of the list.
3. List<? extends DataObject> allows us to pass any list of elements whose type conforms to DataObject, but limits the methods that can be called on it.
Let’s return now to our original example and see if we can’t apply our new-found knowledge to find a solution. Let’s redefine the result type of the getSubObjects() function to use a wildcard, while leaving the result type of the getAs() function, defined in B, as it was.
public List<? extends DataObject> getSubObjects() {
return subObjects;
}
However, as we saw in the third case above, this return type uses an unknown (unknowable) generic type and cannot be modified using add()
or remove()
. Not exactly what we were looking for. Let’s instead put it back the way it was and concentrate on using our newfound knowledge to cast (Yay! Casting! I knew you’d be back!) our result to the correct type. Here’s a naive attempt:
public List<A> getAs() {
return (List<A>) getSubObjects();
}
Ok. From the discussion above, it’s clear this won’t work and the compiler rewards us with the following error message:
Cannot cast from List<DataObject> to List<A>
Fine, let’s try again, this time throwing a wildcard into the mix:
public List<A> getAs() {
return (List<A>) (List<? extends DataObject>) getSubObjects();
}
Sweet! It compiles! We’re definitely on the home stretch now, but there’s still a warning from the compiler:
Type safety: The cast from List<capture-of ? extends DataObject> to List<A> is actually checking against the erased type List.
This is Java’s way of saying that you have done a complete end-run around its type-checking. The “erased type list” is actually List
because the compiler uses a strategy called erasure [2] to resolve generic references. The double cast in the example above compiles (and will run), but cannot be statically checked. At this point, there’s nothing more we can do, so we admit defeat the Java way and slap a SuppressWarnings
annotation on the function and continue on our way.
@SuppressWarnings("unchecked")
public List<A> getAs() {
    return (List<A>) (List<? extends DataObject>) getSubObjects();
}
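The warning makes more sense once you see what erasure leaves behind at run time. A quick illustration (mine, not the article's): two differently-parameterized lists share the same runtime class, which is why the double cast above cannot be checked statically.
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        // Both print "class java.util.ArrayList": the type parameters are erased.
        System.out.println(strings.getClass());
        System.out.println(numbers.getClass());
        System.out.println(strings.getClass() == numbers.getClass()); // true
    }
}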
It’s clear that the decision to avoid covariance at all costs has cost the language dearly in terms of expressiveness (and, as a result, type-safety, as evidenced by the casting in the final example). It takes rather a lot of illegible code to express what, at the beginning of the article, seemed a rather simple concept.
Using Java 1.5
Given a recursive object structure in memory, what’s the best—and most efficient—way to render it with Tapestry? First, let’s define a tiny Java class that we’ll use for our example:
public class... [More]
Published by marco on 10. Nov 2006 14:20:21 (GMT-5)
Updated by marco on 10. Nov 2006 14:20:20 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
Given a recursive object structure in memory, what’s the best—and most efficient—way to render it with Tapestry? First, let’s define a tiny Java class that we’ll use for our example:
public class DataObject {
private String name;
private List<DataObject> subObjects = new ArrayList<DataObject>();
public String getName() {
return name;
}
public List<DataObject> getSubObjects() {
return subObjects;
}
}
Imagine an application has built a whole tree of DataObjects
and wants to display them in a page. Since the page doesn’t know how many objects—or nesting levels—there are, it can’t be defined statically. This sounds like the perfect place to use a Tapestry component. Since each component must know about its context object (the DataObject
), there must be an instance of the component for each object. This sounds like the perfect place to use recursion.
Let’s take a crack at defining the template for a component named “DataObjectTree”, which has a single property, context
, which passes in the object to render [1]:
<span jwcid="@Insert" value="ognl:Context.Name">Context Name</span>
<div jwcid="@If" condition="ognl:Context.SubObjects.size() > 0">
<div jwcid="@For" source="ognl:Context.SubObjects" value="ognl:DataObject">
<div jwcid="@DataObjectTree" context="ognl:DataObject"/>
</div>
</div>
If it was that easy, you probably wouldn’t be reading this article, as it wouldn’t have been written. However, the Tapestry template parser is going to have extreme difficulties parsing this self-referential template. This value causes a stack overflow:
<div jwcid="@DataObjectTree" context="ognl:DataObject"/>
That’s a shame, but, with help from the blog entry, Recursive Tapestry Components, it’s possible to solve this problem with very little code (though the recursive solution would still be nicer).
Blocks to fool Tapestry
Tapestry has two components, Block and RenderBlock. Block defines a “floating” piece of template that is not rendered where it is defined, but is rather rendered in a particular place—or places—in a template by a RenderBlock
. The trick boils down to this: replace the recursive call in the component definition with a call to render a block defined in the page. That is, replace the offending line above with the line below:
<div jwcid="@RenderBlock" block="ognl:Page.Components.DataObjectBlock"/>
The DataObjectBlock
, in turn, is defined in the page template and includes a DataObjectTree
component.
<div jwcid="DataObjectBlock@Block">
<div jwcid="@DataObjectTree" context="???"/>
</div>
As shown above, there is a slight problem, as we need to pass a context to the nested DataObjectTree
component. Since we are once again in the page template, we can’t refer to any properties defined in the component. Therefore, the iterator object, DataObject
, used in the recursive (and non-functional) example above, is not available. We’ll have to access it some other way. The other way turns out to be by passing it from the RenderBlock
to the Block
. A RenderBlock
component accepts and stores all properties, so we’ll pass it the context in a “value” property (use any name you like).
<div jwcid="@RenderBlock" block="ognl:Page.Components.DataObjectBlock" value="ognl:DataObject"/>
How can the DataObjectTree
retrieve this property? It needs access to the RenderBlock
that included its parent Block
. That is, with a reference to its surrounding block, it can obtain a reference to the RenderBlock
and retrieve the value from it. The code below shows how to declare the Block
in the page template.
<div jwcid="DataObjectBlock@Block">
<div jwcid="@DataObjectTree" block="ognl:Page.Components.DataObjectBlock"/>
</div>
Tapestry will now give each instance of DataObjectTree
a reference to the Block
instance that encloses it. In order to complete this solution, you’ll have to write a Java class for the DataObjectTree
component itself. The component template refers to a Context
, which represents the DataObject
to display. When displayed from the block, this property is not directly set, so we will have to define code to retrieve it from the appropriate place. [2]
public abstract class DataObjectTree extends BaseComponent {
    public abstract Block getBlock();
    public abstract DataObject getContext();

    public DataObject getBlockContext() {
        if (getBlock() != null) {
            return (DataObject) getBlock().getParameter("value");
        }
        return getContext();
    }
}
If the component’s block
property is set, then the component was instantiated from within a block. In that case, the context is retrieved from the block’s parameters (all properties from the initiating RenderBlock
are automatically passed to the block as parameters). Otherwise, use the context set by the Context
property of the component. Below is the completed template for the component:
<span jwcid="@Insert" value="ognl:BlockContext.Name">Context Name</span>
<div jwcid="@If" condition="ognl:BlockContext.SubObjects.size() > 0">
<div jwcid="@For" source="ognl:BlockContext.SubObjects" value="ognl:DataObject">
<div jwcid="@RenderBlock" block="ognl:Page.Components.DataObjectBlock" value="ognl:DataObject"/>
</div>
</div>
As discussed above, the template now uses the BlockContext
instead of the context directly, so it uses the correct DataObject
. The page, on the other hand, uses the DataObjectTree
component twice, once from the DataObjectBlock
(as shown above) and once from the main template, as highlighted below.
<div jwcid="@DataObjectTree" context="Context"/>
<div jwcid="DataObjectBlock@Block">
<div jwcid="@DataObjectTree" block="ognl:Page.Components.DataObjectBlock"/>
</div>
Note how the instance in the main template gets a Context
representing the root of the tree and defined in the page itself. Though not as simple as the intuitive, recursive solution, the “Tapestry Way” doesn’t end up using too much code, though it did take a little while to figure out.
It’s kind of a shame that the page template has to not only use the component, but declare the block that it uses to render its nodes. Optimally, this part would also go into the component … but that takes us right back to the recursion problem we started with. Also, it’s kind of a shame that a separate component is required. Is there any way around these two warts?
Using the DataObjectTree
recursively is not possible, but Blocks
can seemingly be nested as much as needed. In fact, all that seems to be missing is a way to cleanly access the “value” passed to the Block
by the RenderBlock
. Leaving the definition within a separate component (for now), we can redefine the component HTML as follows:
<div jwcid="@RenderBlock" block="Components.DataObjectBlock" value="Context">
<div jwcid="DataObjectBlock@Block">
<span jwcid="@Insert" value="ognl:BlockContext.Name">Context Name</span>
<div jwcid="@If" condition="ognl:BlockContext.SubObjects.size() > 0">
<div jwcid="@For" source="ognl:BlockContext.SubObjects" value="ognl:DataObject">
<div jwcid="@RenderBlock" block="ognl:Components.DataObjectBlock" value="ognl:DataObject"/>
</div>
</div>
</div>
The entirety of the component’s rendering is contained within a Block
, which accesses the current context through the BlockContext
. The component’s main content is simply a RenderBlock
that renders that block for its Context
by passing it as the “value” for the block. Since the block is now defined within the component, it is accessed using Components.DataObjectBlock
rather than Page.Components.DataObjectBlock
.
The only remaining magic is to implement BlockContext in the component’s Java class. The component retrieves the current instance of the DataObjectBlock out of its component map and returns whatever was set in the “value” parameter. The page uses the getContext() property to pass in the initial context.
public abstract class DataObjectTree extends BaseComponent {
    public abstract DataObject getContext();

    public Object getBlockContext() {
        return ((Block) getComponents().get("DataObjectBlock")).getParameter("value");
    }
}
Now that’s clean! In fact, it’s almost the same as the original recursive solution, but uses the RenderBlock/Block
trick to get around Tapestry’s limitation. Though this example is now still defined in a component, you can just move the code and HTML into a page definition. Since the page already has its own context, you only need to copy in the getBlockContext()
method. The HTML can be copied directly and voila! A clean solution to recursive structures in Tapestry without defining any new components or using messy kludges.
@InvokeListener
Since the BlockContext is called many times from the component—real-world implementations will also likely need more such properties—we’d like to set up all necessary data when starting the DataObjectBlock. To do this, use @InvokeListener from the HTML to call a function on the component (or page) instance.
public abstract class DataObjectTree extends BaseComponent {
    private Object blockContext;

    public abstract DataObject getContext();

    public Object getBlockContext() {
        return blockContext;
    }

    public void updateBlockContext() {
        blockContext = ((Block) getComponents().get("DataObjectBlock")).getParameter("value");
    }
}
From the HTML template, simply call this function at the beginning of the DataObjectBlock
:
<div jwcid="DataObjectBlock@Block">
<span jwcid="@InvokeListener" listener="listener:UpdateContext"/>
…
</div>
This is a good pattern to follow to avoid having getters that are too computationally expensive.
For the non Tapestry-savvy, here are a few tips:
- ognl indicates that an Object-Graph Navigation Language expression is coming – it’s the mechanism Tapestry uses to script objects from a template. This language understands get/set and can read properties.
- For is a loop, If is a conditional and Insert adds text to the template. Text before the @ sign is an explicitly named component, which can be referred to by this name elsewhere in the template.
- For more information, visit Tapestry’s home page.
- Properties declared abstract are automatically given getters and setters and wired up by Tapestry in a dynamically generated descendent (that’s why the class is abstract as well).
Using Java 1.5 and Tapestry 4.0.2
See Finding Conforming Methods for part one of this two-part article.
The problem we’re working on is as follows:
Published by marco on 6. Nov 2006 21:30:31 (GMT-5)
Updated by marco on 1. Dec 2006 08:51:34 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
See Finding Conforming Methods for part one of this two-part article.
The problem we’re working on is as follows: given a target object, a method name and a list of actual parameters, find the matching method, determine from its annotations whether it may be called, and invoke it.
We will use annotations to mark up methods as callable or not. Given the Method
we obtained in part one, it shouldn’t be too hard to find its annotations. Simply pass the class of the desired annotation to getAnnotation()
; if the annotation was specified for that method, we check its contents to determine whether the method can be called or not.
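The article never shows the annotation’s declaration. For getAnnotation() to see it at run time, it must be declared with RUNTIME retention; here is a hedged sketch of what it might look like (the value() accessor and the second enumeration member are assumptions of mine, not taken from the article):
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical declaration; only CallLocation.FromWeb appears in the article's code.
enum CallLocation { FromWeb, FromScript }

@Retention(RetentionPolicy.RUNTIME) // without this, getAnnotation() returns null at run time
@Target(ElementType.METHOD)
@interface Callable {
    CallLocation value();
}
Checking the contents then amounts to something like callable != null && callable.value() == CallLocation.FromWeb.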
In part one, calling getConformingMethod(“giveCommandTo”, {new String(), new Assistant()}, Manager.class)
returns the overridden method from the Manager
class. Unfortunately, a call to getAnnotations()
on this method returns an empty list. Why?
The Java reflection API makes a distinction between annotations that appear directly on an element and all annotations for an element, including ancestors. These two lists can be retrieved from any AnnotatedElement
using the following methods:
Annotation[] getAnnotations();
Annotation[] getDeclaredAnnotations();
The documentation states that getDeclaredAnnotations()
returns “all annotations that are directly present on this element”, whereas getAnnotations()
returns “all annotations present on this element”. The key word here is directly, which is to be interpreted as stated above … for classes. For methods, there is no notion of inheritance per se in the reflection API. That is, if a method in a base class has an annotation and that method is overridden in a descendent, the signature for the method in the descendent returns empty lists for both getDeclaredAnnotations()
and getAnnotations()
.
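A quick way to see this for yourself, using the Person and Manager classes from part one (and assuming the Callable annotation has RUNTIME retention):
import java.lang.reflect.Method;

public class AnnotationLookupDemo {
    public static void main(String[] args) throws NoSuchMethodException {
        Method base = Person.class.getMethod("giveCommandTo", String.class, Person.class);
        Method overridden = Manager.class.getMethod("giveCommandTo", String.class, Person.class);
        System.out.println(base.getAnnotations().length);       // 1: the @Callable annotation
        System.out.println(overridden.getAnnotations().length); // 0: the override "loses" it
    }
}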
This doesn’t make any sense and directly contradicts the documentation. It seems that the all vs. declared distinction only holds for classes, even though it is defined for all elements. A quick look into the Java source shows that Method
inherits from AccessibleObject
, which implements the AnnotatedElement
interface. AccessibleObject
implements getAnnotations()
with the following code:
public Annotation[] getAnnotations() {
return getDeclaredAnnotations();
}
Alrighty then! Method
itself does not override this method, so it’s relatively clear that inherited annotations are not available from a method. In effect, the @Inherited meta-annotation only has an effect for classes, which is a shame. A quick check of the documentation for that annotation verifies this claim:
“Note that this meta-annotation type has no effect if the annotated type is used to annotate anything other than a class.”
So, once again, we’re on our own and must build the functionality in a custom function. The code below shows how to search a method and its inherited implementations for the Callable annotation:
private Callable getCallable(Method m, Object[] actualParameters) {
    Callable result = null;
    if (m != null) {
        result = m.getAnnotation(Callable.class);
        if (result == null) {
            Class<?> parent = m.getDeclaringClass().getSuperclass();
            if (parent != null) {
                Method superMethod = getConformingMethod(m.getName(), actualParameters, parent);
                result = getCallable(superMethod, actualParameters);
            }
        }
    }
    return result;
}
It’s not rocket science, but it involves a lot of digging around in the guts of Java reflection that shouldn’t be necessary.
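To tie the two parts together, here is a sketch of how the pieces might be combined; it assumes the method lives in the same class that defines getConformingMethod() and getCallable(), and the manager instance, command text and value() check are illustrative only:
import java.lang.reflect.Method;

// Sketch: assumes this method sits alongside getConformingMethod() and getCallable().
public void callIfAllowed(Manager someManager) throws Exception {
    Object[] args = { "file the report", new Assistant() };
    Method m = getConformingMethod("giveCommandTo", args, Manager.class);
    Callable callable = getCallable(m, args);
    if (m != null && callable != null && callable.value() == CallLocation.FromWeb) {
        m.invoke(someManager, args); // only methods explicitly marked as callable are ever invoked
    }
}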
Using Java 1.5
This is a two part post illustrating some tricks for working with the Java reflection API. Part two is available here.
Java reflection provides a wealth of information about your code. One interesting use of... [More]
Published by marco on 6. Nov 2006 21:30:22 (GMT-5)
Updated by marco on 1. Dec 2006 08:51:59 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
This is a two part post illustrating some tricks for working with the Java reflection API. Part two is available here.
Java reflection provides a wealth of information about your code. One interesting use of this information is to layer scriptability on top of an application, calling code dynamically. Suppose we wanted to do the following:
- Given a target object, a method name and a list of actual parameters, find the matching method.
- Check the method’s annotations to determine whether it may be called.
- Invoke it.
Let’s tackle step one first: a logical approach is to get the Class
for the target object and call getMethod()
with the method name and list of parameters to get the desired Method
object.
Sounds pretty easy, right? The Java reflection API puts a few stumbling blocks in the way.
getMethod()
finds only methods whose parameter lists are an exact match for the one given, not methods that could actually be called with that list of parameters. That is, it ignores polymorphism completely when performing a search.
For the following discussion, assume the following definitions:
public class Person {
    public void executeCommand(String s) {
    }

    @Callable(CallLocation.FromWeb) // [1]
    public void giveCommandTo(String s, Person p) {
        p.executeCommand(s);
    }
}

public class Assistant extends Person {
}

public class Manager extends Person {
    List<Person> underlings = new ArrayList<Person>();

    public boolean getIsInChainOfCommand(Person p) {
        return underlings.contains(p);
    }

    public void giveCommandTo(String s, Person p) {
        if (!getIsInChainOfCommand(p)) {
            throw new RuntimeException("Cannot order this person around.");
        }
        p.executeCommand(s);
    }
}
Managers can only order their own underlings around. We expect to be able to call giveCommandTo()
with a piece of text and an Assistant
, and Java—polymorphic wunderkind that it is—obliges. As mentioned above, getMethod(“giveCommandTo”, {new String(), new Assistant()})
[2] returns null
because Assistant
, though a conforming actual parameter, is not an exact match for the formal parameter.
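In the raw reflection API, the lookup takes the formal parameter types as Class objects and signals a missing exact match with NoSuchMethodException; a small sketch (mine) of the behavior described above:
import java.lang.reflect.Method;

public class ExactMatchDemo {
    public static void main(String[] args) {
        try {
            // Succeeds: String and Person are exactly the declared formal parameter types.
            Method exact = Manager.class.getMethod("giveCommandTo", String.class, Person.class);
            System.out.println(exact.getName());

            // Fails: Assistant conforms to Person, but is not an exact match
            // for the formal parameter, so no method is found.
            Manager.class.getMethod("giveCommandTo", String.class, Assistant.class);
        } catch (NoSuchMethodException e) {
            System.out.println("No exact match for (String, Assistant)");
        }
    }
}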
That’s a shame. I’m sure getMethod()
is much faster for this optimization, but it doesn’t really work for applications that would like to benefit from polymorphism. Any application that wants to search for methods that can actually be executed will have to do so itself. In English, we want to get the list of methods on a class and iterate them until the name matches the desired method. If the matching method has the same number of formal parameters as actual parameters and each of the formal parameter types isAssignableFrom
the corresponding actual parameter, we have a winner. The code below does this: [3]
protected Method getConformingMethod(String methodName, Object[] actualParameters, Class<?> cls) {
    Method[] publicMethods = cls.getMethods();
    Method m = null;
    int idxMethod = 0;
    while ((m == null) && (idxMethod < publicMethods.length)) {
        m = publicMethods[idxMethod];
        if (m.getName().equals(methodName)) {
            Class<?>[] formalParameters = m.getParameterTypes();
            if (actualParameters.length == formalParameters.length) {
                int idxParam = 0;
                while ((m != null) && (idxParam < formalParameters.length)) {
                    Class<?> param = formalParameters[idxParam];
                    if (!param.isAssignableFrom(actualParameters[idxParam].getClass())) {
                        m = null;
                    }
                    idxParam++;
                }
            } else {
                m = null;
            }
        } else {
            m = null;
        }
        idxMethod++;
    }
    return m;
}
A call to getConformingMethod(“giveCommandTo”, {new String(), new Assistant()})
returns a match where getMethod()
did not. Now that we have our method, we can check whether it can be called or not. For this, we retrieve the annotations on it.
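Once the Method is in hand, invoking it dynamically is a one-liner. A short sketch (assuming it sits in the same class as getConformingMethod(); the target object and command text are illustrative):
import java.lang.reflect.Method;

// Sketch: getConformingMethod() is the function defined above.
public void runScriptedCall() throws Exception {
    Person person = new Person();
    Object[] actual = { "file the report", new Assistant() };
    Method m = getConformingMethod("giveCommandTo", actual, Person.class);
    if (m != null) {
        m.invoke(person, actual); // dynamically calls person.giveCommandTo(...)
    }
}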
Continue on to part two.
Callable
annotation, which takes an enumeration as an argument. It is discussed in more detail in part two.
Using Java 1.5
This is the first of a two-part article on interfaces. Part two is available here.
Delphi Pascal, like many other languages that refuse to implement multiple inheritance, regardless of how appropriate the... [More]
Published by marco on 26. Oct 2006 10:37:41 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
This is the first of a two-part article on interfaces. part two is available here.
Delphi Pascal, like many other languages that refuse to implement multiple inheritance, regardless of how appropriate the solution often is, added interfaces to the mix several years ago. However, Borland failed, at the same time, to add garbage collection, so they opted instead for a COM-like reference-counting model, which automatically frees an interface when there are no more references to it. Any references to the object behind the interface are on their own.
This is not just a theoretical problem; it’s extraordinarily easy to provoke this situation. The definitions below show a simple interface and a class that uses that interface:
ISomeInterface = interface
procedure DoSomethingGreat;
end;
TSomeObject = class( TInterfacedObject, ISomeInterface )
procedure DoSomethingGreat;
end;
Now imagine that an application has a library of functions that accept an interface of type ISomeInterface
(like DoSomething
in the example below). Given the definition above, if it has an instance of TSomeObject
, it can magically profit from this library, even though the library doesn’t know anything about any of the objects in its inheritance chain. ProcessObjects
below uses this library function in the simplest and most direct way possible.
procedure DoSomething( aObj: ISomeInterface );
begin
  aObj.DoSomethingGreat;
end;

procedure ProcessObjects;
var
  obj: TSomeObject;
begin
  obj:= TSomeObject.Create;
  try
    DoSomething( obj );
  finally
    obj.Free;
  end;
end;
At first glance, there is nothing wrong with this code. However, executing it results in an access violation (crash). Why? The short answer is that references to the object (as opposed to references to the interface) do not increase the reference count on the object. In order to better illustrate this point, let’s unroll the DoSomething
function into ProcessObjects
to make the interface assignment explicit. This is shown below, with the reference count of obj
shown before each line:
procedure ProcessObjects;
var
  obj: TSomeObject;
  aObj: ISomeInterface;
begin
  obj:= TSomeObject.Create; // (0)
  try
    aObj:= obj; // (1)
    aObj.DoSomethingGreat;
    aObj:= nil; // (0) obj is freed automatically!
  finally
    obj.Free;
  end;
end;
With reference-counted objects, as soon as the reference count reaches 0, the object is automatically freed. Programming with this kind of pattern is, at best, a touchy affair, so most experienced Delphi programmers have learned one of two things about interfaces: either avoid them entirely, or use only non-reference-counted implementations.
A non-reference-counted interface implementation overrides the _AddRef
and _Release
methods to always return 1, so that the object behind the interface is never automatically released. This avoids a lot of crashes, but not all of them. Part two will show how to avoid the dreaded dangling interface.
Continue to part two.
This is the second of a two-part article on interfaces. part one is available here.
In part one, we saw how to use non-reference-counted interfaces to prevent objects from magically disappearing when using... [More]
Published by marco on 26. Oct 2006 10:37:34 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
This is the second of a two-part article on interfaces. part one is available here.
In part one, we saw how to use non-reference-counted interfaces to prevent objects from magically disappearing when using interfaces in common try…finally…FreeAndNil()
cases. Though this brings the interface problem under control, there is further danger.
A dangling interface is another problem that arises even when using non-reference-counted interfaces. In this case, the crash happens because an object has been freed, but there are still (often implicit) references to it in interfaces. Anytime a reference to an interface is removed—set to nil—the function _Release
is called on the object behind the interface. If this object has already been freed, there is a rather nasty crash deep in library code.
A nice use of interfaces is as a return type, so that objects from various inheritance hierarchies can be used from common code. To better illustrate this problem, consider the two interfaces below:
IRow = interface
  function ValueAtIndex( aIndex: integer ): variant;
end;

ITable = interface
  procedure GoToFirst;
  procedure GoToNext;
  function IsPastEnd: boolean;
  function CurrentRow: IRow;
end;
The two interfaces describe a way of generically iterating a table and retrieving values for each column in a row. Now, take a look at a concrete implementation for the table iterator. [1]
TRow = class( TNonReferenceCountedObject, IRow )
protected
  Values: array of variant;
public
  function ValueAtIndex( aIndex: integer ): variant;
end;

TTable = class( TNonReferenceCountedObject, ITable )
protected
  Index: integer;
  Rows: TObjectList;
public
  procedure GoToFirst;
  procedure GoToNext;
  function IsPastEnd: boolean;
  function CurrentRow: IRow;
end;
The implementation is not shown, but assume that each row allocates a buffer for its values and that the table allocates and frees its rows when destroyed. Assume further the naive implementation for the remaining methods—they are not salient to this discussion.
The example that follows iterates this table in a seemingly innocuous way, but one that causes a crash … sometimes. That’s what makes this class of problem even more difficult—its unpredictability. The lines of code that change a row’s reference count are followed by the reference count. This helps show what is happening behind the scenes and explains the ensuing crash.
procedure DoSomething;
begin
  rowSet:= CreateRowSet;
  try
    rowSet.GoToFirst;
    while not rowSet.IsPastEnd do begin
      val1:= rowSet.CurrentRow.ValueAtIndex( 0 ); // (1)
      val2:= rowSet.CurrentRow.ValueAtIndex( 1 ); // (2)
      rowSet.GoToNext;
    end;
  finally
    FreeAndNil( rowSet );
  end;
end; // (1) CRASH!
The code looks harmless enough; it is not obvious at all that CurrentRow
returns an interface. The two references to an IRow
are left “dangling” in the sense that the code has no references to them. But they exist nonetheless and will be cleared when exiting the function scope—after the objects to which they refer have been freed.
The way to fix this—and to work completely safely with interfaces—is to use only explicit references to interfaces. DoSomething
is rewritten below:
procedure DoSomething;
begin
  rowSet:= CreateRowSet;
  try
    rowSet.GoToFirst;
    while not rowSet.IsPastEnd do begin
      row:= rowSet.CurrentRow; // (1)
      try
        val1:= row.ValueAtIndex( 0 );
        val2:= row.ValueAtIndex( 1 );
      finally
        row:= nil; // (0)
      end;
      rowSet.GoToNext;
    end;
  finally
    FreeAndNil( rowSet );
  end;
end;
Interfaces are very useful, but Delphi Pascal’s implementation leaves a lot to be desired. It is possible to write completely safe code for them, but it takes a lot of practice and care. And, as seen in the examples above, interfaces can easily be hidden and mixed in with objects, so that crashes remain a mystery if the presence of a rogue interface is not detected.
TNonReferenceCountedObject is assumed to be an implementation of the IUnknown methods that prevents reference counting, as illustrated earlier in the article.
Any properties used from a Tapestry template have to be declared in the corresponding Java page class. It is highly recommended to declare these properties as abstract
; Tapestry implements them for you,... [More]
Published by marco on 25. Oct 2006 06:40:34 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
Any properties used from a Tapestry template have to be declared in the corresponding Java page class. It is highly recommended to declare these properties as abstract
; Tapestry implements them for you, automatically including code that re-initializes each property when a page is re-used from the cache. If you implement the properties yourself in the customary Java getter/setter way, it is up to you to clear them in order to ensure that users can’t see one another’s data.
That said, there are a few bumps in Tapestry’s current implementation. For example, the @For
component requires a source list and an iteration value from that list.
The declaration in HTML looks like this:
<tr jwcid="@For"
source="ognl:DataObjectList"
value="ognl:DataObject"
element="tr">
<td>
<span jwcid="@Insert" value="ognl:DataObject.Name">Name</span>
</td>
</tr>
That is, the component iterates the list returned from getDataObjectList()
(Tapestry automatically chops off the “get” part when searching for a method), executing the body of the loop for each value. For each iteration, it assigns the current value to the DataObject
property, which refers to the iteration element in the body of the loop. The example above prints the name of each data object in the list.
The page class in Java looks like this:
public abstract class EditorPage extends BasePage {
    public abstract IDataObject getDataObject();
    public abstract List<DataObject> getDataObjectList();
}
Tapestry will automatically implement appropriate getters, setters and initializers according to the declarations. However, executing the code above results in the following error message:
Unable to read OGNL expression ‘<parsed OGNL expression>’ of $[Generated page class name]: source is null for getProperty(null, “Name”)
What happened? A quick check in the debugger indicates that the list is assigned, contains elements and none of them are null
. So why isn’t DataObject
assigned? If you look more carefully at the declarations in the page class, you’ll see that although getDataObjectList()
returns a list of DataObjects
, getDataObject()
returns an [I]DataObject
.
Tapestry correctly generated getters and setters for these methods, but failed to raise an error when the @For
component tried to assign a DataObject
from the list to the property of type IDataObject
. Instead it silently left it null
and the page crashed later with the wrong error message.
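Presumably the fix is simply to make the two declarations agree; a sketch of the corrected page class (assuming the list's element type is what the page really wants to expose):
public abstract class EditorPage extends BasePage {
    // Matching the element type of the list lets @For assign the iteration value.
    public abstract DataObject getDataObject();
    public abstract List<DataObject> getDataObjectList();
}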
Java supports immutable collections of all kinds, but not in the way you would expect. A naive implementation would declare the immutable (unmodifiable in Java parlance) interface as follows [1]:
interface... [More]
Published by marco on 25. Oct 2006 06:39:45 (GMT-5)
This article was originally published on the Encodo Blogs. Browse on over to see more!
Java supports immutable collections of all kinds, but not in the way you would expect. A naive implementation would declare the immutable (unmodifiable in Java parlance) interface as follows [1]:
interface UnmodifiableList<T> {
    T get(int index);
    int size();
}
There is no way to modify this list—the API is simply not available. That done, we can now create the modifiable version of the list as follows:
interface List<T> extends UnmodifiableList<T> {
    void add(T element);
    void remove(T element);
}
A class can now use these interfaces to carefully control access to a list as follows:
class SomeClass {
    private List<SomeOtherClass> list;

    UnmodifiableList<SomeOtherClass> getList() {
        return list;
    }
}
That would be pretty cool, right? Unfortunately, even if you declared these interfaces yourself, the example above does not work. Java’s generics support amounts to little more than syntactic sugar, so List<SomeOtherClass> does not conform to UnmodifiableList<SomeOtherClass>. There are several solutions to this problem:
- Add a method to the List interface that returns it as an unmodifiable list. This is probably the best solution, as the code for unmodifiability will be defined in one place, the implementing collection.
So that was fun, but how exactly does it work in Java, then? In addition to the limited generics, Java is further hampered by a legacy of old code. This means that they can’t (read: won’t) change existing interfaces because it might break existing code. Here’s how Java defines the two interfaces:
interface List<T> {
    T get(int index);
    int size();
    void add(T element);
    void remove(T element);
}
There is no second interface. All lists have methods for adding and removing elements—even immutable ones. Immutability is enforced at run-time, not compile-time. Pretty cool, huh? Not only that, but List
itself doesn’t even have a method to return an unmodifiable version of itself because Sun didn’t want to add methods to existing interfaces. Instead, you use a static method on the Collections
class to get an immutable version of a list.
class SomeClass {
    private List<SomeOtherClass> list;

    /**
     * Returns an unmodifiable list (treat as read-only).
     */
    List<SomeOtherClass> getList() {
        return Collections.unmodifiableList(list);
    }
}
The type system itself has nothing to say about modifiability. Any calling client can happily add elements to and remove elements from the result without any inkling that what they are doing is wrong. The compiler certainly won’t tell them; the Javadoc offers the only clue—in effect supplementing the type system with comments! When that code is called at run-time, Java will happily issue an UnsupportedOperationException
and smile smugly to itself for a job well done.
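A small illustration (mine, not the article's) of how late the failure shows up:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class UnmodifiableDemo {
    public static void main(String[] args) {
        List<String> frozen =
            Collections.unmodifiableList(new ArrayList<String>(Arrays.asList("a", "b")));
        frozen.add("c"); // compiles without complaint; throws UnsupportedOperationException at run time
    }
}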
Say it with me: backwards-compatibility is king!
Array as a placeholder here.
Published by marco on 21. Jun 2006 20:05:44 (GMT-5)
The Commodore PET (Wikipedia) first came onto the scene in 1977. Why is that interesting? As with most disciplines and careers, programmers like to engage in pissing contests to determine who’s suffered the most under the least expressive language under the most oppressive OS on the most restrictive hardware. One of the most important markers of experience is the “first machine I ever programmed on” metric. Many cut their teeth on BASIC on the Commodore 64; I cut mine on the machine to the left.
Until Wikipedia and the glorious Internets brought it back into sharp focus, the PET hovered fuzzily in memory as only a name (without a manufacturer), an achingly slow tape drive and scrolling lines of green text on a black screen. It turns out that the PET was also manufactured by Commodore—and several years before the Commodore 64 was even created. It understood a pretty basic BASIC, for which our 8- and 9-year-old minds wrote “Choose Your Own Adventure” GI Joe stories. Sadly, the sands of time have worn away these masterpieces, drifting them under the dunes of an obsolescent file format on lost media for which no reader exists. In that way, old-school programmers have another advantage over the young whippersnappers of today: if we’re lucky, all evidence of anything less than a complete mastery of computing has been mercifully lost to the unreachable past, leaving only a gleaming legacy of perfect code.
Yahoo... [More]
Published by marco on 17. May 2006 22:19:01 (GMT-5)
Updated by marco on 17. May 2006 22:19:37 (GMT-5)
Google and Yahoo are tripping all over themselves to help those of us with less time on our hands create reliable, usable web applications. They take different approaches, with Yahoo providing cross-platform JavaScript code and Google providing a new way of building web front-ends.
Yahoo kicked it off with the initial release of their JavaScript Libraries (earthli News), following up with a second release called AutoComplete, Windowing, Menu and More. The library looks really well-organized, has great documentation and examples and is built on a genuine cross-platform hierarchy (which are cross-referenced from everywhere in the documentation). Included are components like Tooltip, Panel, Dialog, and SimpleDialog as well as a complete Menu implementation.
Subscribe to the Yahoo UI blog to keep up-to-date on new developments.
Google has also released a toolkit that takes a different tack. The Google Web Toolkit lets you write your AJAX code in Java, which is translated to cross-browser JavaScript for deployment.
“a Java software development framework that makes writing AJAX applications like Google Maps and Gmail easy for developers who don’t speak browser quirks as a second language. … You write your front end in the Java programming language, and the GWT compiler converts your Java classes to browser-compliant JavaScript and HTML.”
Unlike Yahoo’s library, Google’s lets users build and debug their scripts in a familiar Java environment like Eclipse—so stuff like handling mouse events is even testable and debuggable without “alert” boxes. Google also kindly includes examples of their code in action; though the Dynamic Table example didn’t work in Opera, the Kitchen Sink did. The widgets used in the demos are documented (though not as nicely as Yahoo) and can be used directly, as with Yahoo’s components.
Subscribe to the Google Code Blog to keep up-to-date on new developments.
- It takes the fewest number... [More]
Published by marco on 4. May 2006 23:27:51 (GMT-5)
This is the simplest possible tutorial for creating convincing OS X–style Aqua effects using only vector graphics. The Ultimate Aqua Button takes a designer step-by-step through Fireworks to create a simple oval button. Here are the advantages listed in the tutorial:
- It takes the fewest number of steps (for a technique that doesn’t leave out any design elements)
- It uses fewer objects to complete the design
- All the elements of the button remain fully editable
- The final button is made entirely out of vector objects
This technique applies equally well to other shapes and can be ported to other vector graphics programs quite easily. It’s a very interesting technique in that it makes Aqua-style graphics extremely portable in size, color and so forth and shows how an operating system like OS X could save on memory (sacrificing CPU) by rendering graphic effects on the fly as vectors instead of using prerendered bitmaps.
Published by marco on 30. Apr 2006 13:01:26 (GMT-5)
Updated by marco on 30. Apr 2006 17:24:34 (GMT-5)
At long last, ISE Eiffel has released their development environment and libraries as Open-Source software, as announced in their press release (ISE Eiffel). The project is hosted on a wiki at the ETH and includes downloads for the most recent builds and nightlies (for Linux and Windows). The ISE implementation is the only one that fully supports the ECMA-367 standard, released in June of 2005. The download includes all development tools and libraries.
This is the language that Java and C# should be chasing instead of just trying to be better than one another or C++. Their recent introduction of generics [1] as glorified preprocessors, in which generic classes don’t follow expected inheritance rules, is one such recent debacle. Eiffel is the one OO language that asks the question, “how can we make the language more expressive?” before asking “how can we make the compiler easier to write?”
Eiffel the language has several features that it is relatively safe to say neither Java nor C# will ever have:
- Expanded (value) types: classes can be declared expanded, and the naturally expanded types, like INTEGER, BOOLEAN, FLOAT and so on, are simply optimized by the compiler. The magic happens in the compiler, though, allowing developers to inspect, use and design with INTEGERs in the same way that they design with LISTs, WINDOWs or other compound or reference types.
The generics recently included in Java and C# are a great improvement over the 1970s-style arrays previously available. They allow a much higher level of expressiveness and, most importantly, obviate a lot of iteration code and horrible, horrible casting. However, seemingly innocent limitations—like a lack of parameter covariance and lack of inheritance—cripple these implementations further. [4] Now that both C# and Java have some form of generics, we need to change the following clever tagline [5] from:
“OOP without generics is like a car that only turns left—sure you can go right, just do three lefts.”
to:
“OOP without parameter covariance is like a car that only turns left—sure you can go right, just do three lefts. [6]”
Because of this, the class models in Java and C# will also never offer another wicked feature of Eiffel, anchored types. [7]
Eiffel has many excellent online presentations illustrating Design-by-Contract and comparing itself to Java, C# and Delphi Pascal. They’re well worth a look.
There are some drawbacks to the IDE release, mainly for Mac users. The IDE runs under OS X and can even produce GUI applications using the EiffelVision libraries—but as X11 applications. These don’t integrate very well into OS X. However, given Eiffel’s vaunted interoperability with other languages, it should be possible to build program logic in Eiffel, gaining robustness and clarity from the design powers of the language, and to integrate those objects into OS X applications using Objective-C or Java (using JNI).
To be fair to C#, it doesn’t seem to be as content to linger in the past as Java. The Linq Project is a “set of extensions to the .NET Framework that encompass language-integrated query, set, and transform operations.” The C# 3.0 Language Specification (in Microsoft Word format) [8] describes the feature and its deep integration into the next version of C#. This feature is anchored by type inference, which allows developers to leave off explicit types in cases where the type is clear from the context. The lambda expressions available in C# 3.0 come from functional languages and allow chunks of executable code to be passed around a C# program (analogous to Eiffel agents, though not as flexible). The LINQ Project overview offers more details in a web page (instead of a Word document).
Which Java and C# don’t have anyway, but which can be emulated using assertions and “Do” routines. For example, for the non-virtual public function, f (p1: A), there is a virtual function do_f (p1: A), which is called from f, after it asserts its pre-conditions and before it asserts its post-conditions, as in the example below (shown in Java notation):
public void f(A p1) {
    assert( f_precondition );
    do_f(p1);
    assert( f_postcondition );
}

protected void do_f(A p1) {
    // to be implemented in descendents
}
This approach, though workable, does little to enhance clarity of design and does not solve the problem of inheriting from multiple such implementations to maximize reuse.
Given a class A[G]
, with descendent B[G]
(where G is generic), and X
with descendent Y
, we expect B[Y]
to conform to A[X]
. In Eiffel, it does; in C# and Java, it does not. Further, if A
declares function f (p1: G): G
, we expect the function in A[X]
to be expressed as f (p1: X): X
and the function in B[Y]
to be f (p1: Y): Y
. Since both Java and C# lack parameter covariance, it is a compiler error to use a generic type for a function argument. Since they do support function result covariance, the version in A[X]
is f (p1: X): X
and that in B[Y]
is f (p1: X): Y
. Inside the implementation of f
in B[Y]
, one must once again resort to a cast in order to get the type one would already be assured of in Eiffel.
Consider a class A
, with attribute a1: A
. The function to set this attribute could be declared as set_a1 (a2: like a1)
. If B
inherits this attribute and redefines it (using function result covariance) to a1: B
, the argument to function set_a1
is automatically redefined as well. If this is always desired, the original parameter in A
could be declared as a1: like Current
, where Current
is the equivalent of this
in C# and Java.
When Apple shipped Mac OS X 10.4 “Tiger” last year, it included the Dashboard and Widgets. Widgets are almost completely platform-independent, built with HTML, JavaScript and CSS and the Dashboard is a desktop-sized layer that could be called up instantly to show all... [More]
Published by marco on 14. Apr 2006 11:12:06 (GMT-5)
When Apple shipped Mac OS X 10.4 “Tiger” last year, it included the Dashboard and Widgets. Widgets are almost completely platform-independent, built with HTML, JavaScript and CSS and the Dashboard is a desktop-sized layer that could be called up instantly to show all installed Widgets. The “almost” above is deliberate since the release of the WebCore browser engine in Tiger included special hooks through which scripts could call system utilities (like executing local scripts or getting system information).
It also introduced the Canvas
object, through which JavaScript could perform drawing operations such as those found in the 2-D API of a modern operating system. The fancier widgets built their cool fading, flipping and drawing effects using the canvas (though the developers achieved a lot with transparent PNGs and CSS as well). The canvas was so obvious that Firefox and Opera quickly announced support for it. It has since evolved into a de-facto standard for a modern browser [1].
If you have one of these browsers, you should be able to enjoy the demonstration and sample code found in the Reflection Demo. A quick look at the source reveals a very plain HTML document with several embedded images. The images themselves do not contain the alpha-blended reflection seen beneath each one—that’s the effect applied by the “reflection” class present on each image. The accompanying JavaScript finds all elements with the “reflection” class and extends the image with a custom reflection built by extracting and compositing pixels from the image itself. It’s very fast and extends the power of CSS to image manipulation.
Image manipulation in <canvas> by Arve Bersvendsen has more examples and tips for using the canvas, including individual color channel manipulation and more dynamic effects, like hard and soft spotlights. The effects are rendered extremely quickly and fluidly and bring a gee-whiz effect to simple web pages not seen outside of Flash. These effects could also be achieved with SVG [2] or Flash, but it’s nice to be able to stay in the familiar world of HTML/CSS/JS.
The canvas opens up a new world of integrated, dynamic effects to web developers—effects that 90% of the market will never see, since Internet Explorer shows no signs of including new technologies in any soon-to-be-released version. That means that major sites are unlikely to start dazzling you with download-friendly effects (effects are rendered locally without extra images) to improve your browsing experience. Though many of the uses of the canvas will likely be in making advertising more annoying, it is also highly likely that web application interfaces can be drastically improved in their usability and gesturing (indicating what happened where).
On the other hand, the demos look really nice; specialty web sites that don’t mind spending money on an effect that only appears for 10% of its users will start using the canvas to stand out from the rest of the crowd.
The Yahoo! User Interface Library offers all of... [More]
Published by marco on 21. Feb 2006 22:20:35 (GMT-5)
If you’re looking for good advice on JavaScript programming, take a look at javascript.faqts, which offers a massive list of questions and answers about JavaScript, including many samples and snippets organized by topic.
The Yahoo! User Interface Library offers all of Yahoo’s JavaScript controls, packaged and ready for download as Open Source. It includes GUI-level components for handling drag & drop, or building treeviews or calendars as well as low-level components for managing AJAX connections and browser events. On top of that, they’ve also released the Yahoo Design Pattern Library, which offers hints and strategies for implementing common web application tools, like breadcrumbs or auto-completion.
Whereas there are other cross-platform libraries out there, Yahoo has done an excellent job of preparing their tutorials.
Published by marco on 11. Dec 2005 00:10:20 (GMT-5)
Why Ajax Sucks (Most of the Time) by Jakob Nielson is a critique of Ajax that borrows almost all of its text from a critique of HTML frames made several years ago. The author claims the article is a spoof, but, given that the complaints made about frames were valid at the time — and still are — and that the complaints are just as valid for the current batch of web applications, it’s hard to see what’s so funny about this “spoof”.
There are those who argue that frames in fact have not died out, since they are used on almost every web site to house advertising or as invisible containers for dynamic content. That argument misses the point of the initial critique, which derides the use of frames for navigational purposes. Sites using frames for navigation have gone the way of the dodo — for exactly all of the reasons mentioned all those years ago.
I’m an Opera user and I have mouse gesturing hard-wired into my brain. I sweep the mouse from right to left without even thinking about it. I expect to go back to the previous page in an instant, as Opera also provides the convenience of using its cache to serve up recent pages — unlike Firefox and IE, which still see the need to go online to check whether the page I was just at one second ago needs to be updated.
When I take this — admittedly spoiled — behavior to an Ajax-enabled site like Google Mail, I’m crippled. I frequently sit in front of an empty page because, as mentioned in the article above, back simply does not work. Sweeping the mouse desperately from left to right (to go forward again) brings more white pages, with a small “Loading…” hint in the top left of the page. A glance at the bottom of the page shows no Opera progress bar, which means Google is not loading anything.
The only tool I have left? Reload. The Google Mail page weighs in at over 300KB (Opera’s progress bar also conveniently shows how much page it’s loading), so that’s no small chore. If you’ve managed to go back too much, you might have logged yourself out, in which case you get to completely start over.
It is, indeed, a usability catastrophe. The ideas in the UI are interesting, in that everything happens inside one window, with more and more data loading into the view without replacing the old data. It’s just impossible to navigate.
Published by marco on 10. Jun 2005 08:23:39 (GMT-5)
Mapping Google is an in-depth examination of Google Maps, a new web application that searches the US graphically. There are follow-up articles in Making the Back Button dance and Still more Fun with Maps. The series of articles covers the techniques Google used to bring a full-fledged, usable application to a web browser.
What’s so special about it? It feels like a desktop application.
Since its introduction, Google has added a satellite map feature that overlays the map with satellite data; zoom in as far as possible and see individual cars parked next to the store you’re looking for.
It’s really cool, it’s really fast and it’s the future of web applications.
Published by marco on 20. Jan 2005 21:46:57 (GMT-5)
Updated by marco on 14. Apr 2006 11:25:51 (GMT-5)
Printing XML: Why CSS Is Better than XSL by Håkon Wium Lie and Michael Day (O'Reilly XML.Com) responds to a line drawn in the sand in webarch.pdf by Norman Walsh, a noted XSLFO proponent. In that paper, Walsh said:
“…web browsers suck at printing … they all suck. And CSS is never going to fix it. Did you hear me? CSS is never going to fix it.”
That’s pretty much all he has to say about CSS in that paper; it actually discusses a general style sheet, written in XSL, that transforms an HTML document to printable FO. If you motor on over to that link, even with Firefox’s helpful formatting, well-written XSL is still not exactly legible. The guy, however, knows what he’s talking about and claims to be able to format this Architecture of the World Wide Web, Volume One into a printable FO (which converts easily to PDF) using it.
Now comes the interesting part. As with all standards, we have to ask: “how good is good enough?” Håkon Lie, senior technology officer at Opera and longtime contributor to the CSS standard for the W3C, has “just used CSS to style a 400-page book”. Apparently, CSS is “good enough” for printing at least one book and, here’s the authors’ point: it’s much easier to read and, at only about 200 lines*, much shorter.
*The link from Slashdot includes a 100-line version, which comes from the Prince distribution mentioned below.
The next natural question is: what the hell do I view this in? Browser support for printing is pretty sketchy, at best (though Opera’s printing in the 8.0 beta has gotten much, much better). Prince to the rescue! It generates PDF from XHTML and CSS. You can download it for Linux or Windows, and it includes support for SVG, PNG, JPEG and TIFF graphics files.
If you download the Prince Alpha and run the sample document (which, purely by coincidence, is the same Web Architecture document referenced above) with the ‘forprint.css’ stylesheet, it generates a really nice-looking PDF in just seconds. Granted, PDF is a bit more verbose a format, using 1.5MB instead of the ~200KB for the original HTML file.
There are several examples of the XSL and CSS used to generate various parts of the document. The CSS, in every case, is much clearer and easier to write. I found them more interesting as examples of what you can do with CSS. You can see most of these in action if you download Prince.
@page :left {
@bottom-left {
content: counter(page);
}
}
This puts the current page number in the bottom-left corner of left-hand pages when printed. Cool.
ul.toc a:after {
content: leader('.') target-counter(attr(href), page);
}
This places the page number of a link’s target next to the link’s text in a table of contents (for example), using a ‘leader’ composed of periods.
body { column-count: 2; column-gap: 8mm; }
This one should be obvious, no? I didn’t know columns were that easy with CSS (though I’m not sure which version this requires).
The interesting thing to take away from all this is not that CSS is better than XSLFO or vice-versa. It’s that CSS is capable of generating very nice printed output from an HTML or any XML document. It’s also much easier to learn and maintain than XSLFO. That makes it a far more accessible formatting language. Some people on Slashdot were pointing out that “you are only going to write it once”. You never write something once and leave it alone. Maintainability and extensibility are paramount for most uses.
Also make sure to remember that CSS is not a wholesale replacement for XSL.
“XSL is a Turing-complete language which, in principle, can be used for all programming tasks and is particularly suited for document transformations. Styling documents is only one of many things XSL can do. CSS, on the other hand, has been developed with only one task in mind: styling documents.”
The thing to consider is whether your task needs the complexity of writing XSL transformations or whether you just need to style documents. If you only need to style, CSS is a comfortable and powerful alternative to XSL.
Published by marco on 16. Jan 2005 14:04:15 (GMT-5)
If you make websites, pay careful attention to the XMLHttpRequest JavaScript object: it’s going to change everything about web application interfaces. Web pages can use this object to make an “in-place” request for data from another URL, then inject the results of the request (with optional post-processing in JavaScript) into the current document.
All without refreshing the page.
Google introduced the first really noticeable implementation of a web application using this technology in GMail. It was so noticeable, it prompted Opera to finally implement the object in their JavaScript engine and release a new version of their browser. That means that it’s also uniformly supported on all browsers on all platforms (Safari 1.2x, Mozilla 1.x, Opera 7.6x, IE 5.x).
Dictionary is a simpler example of this technology, which also explains how it works. The gist is that you create an XMLHttpRequest object in your page and tell it to retrieve data from a URL. When the response arrives, it triggers an event handler in your JavaScript and you can work with the response text. In Dictionary’s case, the response is preformatted HTML, which is assigned to the innerHTML property of a given DIV container in the page (which is pretty much the standard way of injecting/updating content in the DOM of a web page).
The request is issued when the user types a new letter into the text box on the page; the page then displays the top ten definitions for the text entered.
Other uses for this construct are limitless. Gone are the days of horrible workarounds for status pages during long operations; a page can request the status from the server directly, which queries the running process for an update and returns text describing the status. I would imagine that the content returned in the response should involve as little client-side processing as possible to avoid having to write complex JavaScript (server-side languages are usually easier to debug).
Similarly, a page no longer needs to reload entirely when loading content for interdependent drop-down boxes. Nor does it need to load all possible variations in JavaScript when the page is initially loaded (both horribly limited solutions, bandwidth- and usability-wise). Simply request the new contents of the dependent drop-down when the user makes a selection in the “master” drop-down.
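Here’s a rough sketch of the server side of that pattern (the servlet, URL and parameter names are all hypothetical, not taken from any of the articles above): the endpoint only has to return a preformatted HTML fragment, which the page then assigns to a container’s innerHTML when the response event fires.

import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical endpoint, e.g. GET /cities?country=CH, called by the page's
// XMLHttpRequest whenever the "master" drop-down changes.
public class DependentListServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        String country = request.getParameter("country");
        // Geography is a stand-in for whatever data source the application really uses.
        List<String> cities = Geography.citiesFor(country);

        // Return a preformatted HTML fragment; the client simply assigns it to the
        // innerHTML of the dependent drop-down's container.
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        for (String city : cities) {
            out.println("<option value=\"" + city + "\">" + city + "</option>");
        }
    }
}

Keeping the response dumb like this pushes all the interesting logic to the server, which, as mentioned above, is usually easier to debug than a pile of client-side JavaScript.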
I, for one, am looking forward to the next generation of web applications based on this technology. For more information, see Apple’s developer documentation for XMLHttpRequest.
Published by marco on 10. Jan 2005 21:54:10 (GMT-5)
Hey, I know Joel Spolsky is well-read in the industry and he often has some interesting topics, but his latest article, Advice for Computer Science College Students, is way more over the top than it needs to be. Maybe he thinks that, since he’s addressing people about to go to college (or those already in college who have not yet chosen a major), he needs to go all MTV on us and “get all up in our faces”, not missing a single opportunity to “dis the man”.
Whatever; it’s annoying.
He takes needless potshots at all majors not immediately important to building a successful IT business, deriding anything that’s not immediately applicable or “useful” as a waste of time. “What are you going to do, major in History?” is one such throwaway comment that’s probably supposed to incite a snigger of contempt for anyone who can’t use a computer, but, yeah, actually it would be nice if some Americans learned history. Americans not knowing history causes a lot more problems in the world than Americans not knowing how to program.
He also disguises ad hominem arguments as legitimate critiques of subject matter — discarding entire subjects because he got a shitty teacher. I too took cultural anthropology and I had a great teacher (Doug Raybeck at Hamilton College). I liked the course; a good teacher makes a big difference. Getting to know how other people tick and learning about other cultures is another one of those things that might help build a little bit of empathy and understanding in this world, which is so sorely lacking in the upper middle class that is his essay’s target. Computer geeks are already smug and superior enough, for God’s sake. Encouraging them to listen to their worst instincts is a terrible idea. I did notice that he encourages people to take the course anyway, but in the way that your dad encourages you to eat vegetables when you know that he hates eating them too. Kind of a grin, grin, wink, wink, just do it to please your mom, we’ll go eat cookies in the basement later kind of way.
So you’re not supposed to learn computer science, but you’re supposed to “[l]earn C before graduating” because “it is still the lingua franca of working programmers”. “Java … Python [are] trendy junk” that are being taught to deliberately mislead you. Don’t be fooled by high-level abstraction; listen to uncle Joel and start right off from the beginning. Standing on the shoulders of giants is for those idiot “scientists”. Sure, I also think you should have a passing knowledge of C, but knowing how the machine processes instructions is unnecessary for most of the software being written today. I mean, Joel’s company’s main product is a web application, for Christ’s sake. Just how close does he think that runs to the processor? I hope all of his developers are properly optimizing the stack in all of their routines. He compares a programmer who doesn’t know C with a “medical doctor who doesn’t know basic anatomy”. I think it’s much closer to a medical doctor who doesn’t know how mitochondria exchange food and oxygen across a cell membrane. Despite that massive gap in her knowledge, my doctor can probably still tell me if my leg’s broken (and set it).
Despite that diatribe, may I offer a few corollaries to the rule about learning C before you graduate:
The advice to get an internship is good, though. Internships are an ad-hoc replacement for the fact that there is no apprenticeship program in the US. He should be using his advocacy position to push for better education in the first place, instead of being happy to hire people out of a system in which “elite schools” that cost “$160,000” don’t even teach you how to program. The main problem is that you do spend time programming in computer science (in most of the courses I took) … you just don’t spend any time designing. Writing the code for a design is pretty much incidental in most cases. Designing good code is what takes practice, experience and guidance.
Mostly, though, he feels that the contempt that programmers feel for the part of the world that can’t (or doesn’t want to) write “while loops” is completely justified. After all, computer scientists are useless because even he, a mere developer, “found a mistake in [dynamic logician] Dr. Zuck’s original proof” after only “a couple of hours” and he “got an A” in a Cultural Anthropology class he found less exciting than “watching grass grow”. I think he sums it up well enough himself with:
“The moral of the story is that computer science is not the same as software development.”
His implication is that it is far, far worse than software development and just slows development down. Think Bush’s attitude toward all forms of science and you’ll get the idea. Personally, I think we could do with more computer science in development, especially for tools and libraries that we all have to live with for years and years and years.
Take the newest buzzword of the moment, Generics, which we can thank Microsoft for inventing. Generics are an extremely useful mechanism for specifying a design very precisely while at the same time reducing the code required by at least a third. C# shipped initially without this feature, as did Java. Generics were not new when these languages were introduced (despite my sarcasm above). Eiffel has (and had) an extremely well thought-out and well-built implementation and C++ had a poor man’s preprocessor that didn’t fit into its existing type system at all.
Eiffel is a language designed by a computer scientist for practical use. C++ is an object-oriented version of C, “the lingua franca of working programmers”. Guess which version C# copied?
You’d be wrong if you guessed either one. C# copied Java’s generics, which use the semantics and type compatibility rules of the much weaker system used in C++. Expressiveness of the language has been lost because a compiler-friendly solution was chosen instead of a programmer-, or even computer scientist-friendly one.
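To make the “lost expressiveness” claim concrete, here’s a small Java sketch (my own example, not one from the article): with erased generics the type parameter no longer exists at runtime, so you can neither create a T nor test for one, and the standard workaround is to pass a Class object around by hand.

public class ErasureDemo<T> {
    // Both of these are illegal under erasure, because T is gone at runtime:
    //   T create() { return new T(); }                                        // does not compile
    //   boolean isListOfT(Object o) { return o instanceof java.util.List<T>; } // does not compile

    // The usual workaround: smuggle the type in as a Class object.
    private final Class<T> type;

    public ErasureDemo(Class<T> type) {
        this.type = type;
    }

    public T create() throws Exception {
        return type.newInstance();
    }

    public static void main(String[] args) throws Exception {
        ErasureDemo<StringBuilder> demo = new ErasureDemo<StringBuilder>(StringBuilder.class);
        StringBuilder sb = demo.create();
        System.out.println(sb.append("created via Class token"));
    }
}

Eiffel’s constrained genericity can state that kind of creation requirement directly in the type system; in Java it remains a hand-rolled convention the compiler never checks.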
Maybe Sun and Microsoft’s marketing departments had something to do with it as well. Maybe my point is that developers and computer scientists shouldn’t fight, but instead unite against the common foe: marketers.
Java provides numerous examples where a language seems to have completely lost its way — its proponents aren’t even qualified to discuss the basic tenets of programming languages. Take a look at these Tech Tips from 2005-01-05. The first one, about VarArgs, spends seven whole pages describing and providing pedantic examples for a feature that provides poor-man’s support for tuples in one specific, limited case (where the arguments are all of the same expanded type). One look at Eiffel’s tuple support (snobby, ivory tower, CS bullshit language) and you’ll be wishing Java had done its homework.
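For context, here’s what that limitation looks like in practice (my own toy example, not one from the Tech Tip): varargs is just sugar for a trailing array, so every argument has to share one component type, and anything genuinely heterogeneous collapses to Object..., throwing away the static types a real tuple would keep.

public class VarargsDemo {
    // Varargs is sugar for a trailing array, so every argument must share one type.
    static int max(int first, int... rest) {
        int result = first;
        for (int value : rest) {
            result = Math.max(result, value);
        }
        return result;
    }

    // The moment the "tuple" is heterogeneous, the best Java can offer is Object...,
    // which discards the static types entirely.
    static void log(Object... fields) {
        StringBuilder line = new StringBuilder();
        for (Object field : fields) {
            line.append(field).append(' ');
        }
        System.out.println(line);
    }

    public static void main(String[] args) {
        System.out.println(max(3, 1, 4, 1, 5));   // homogeneous: fine
        log("request", 42, true);                 // heterogeneous: everything is just Object
    }
}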
The second tech tip is about “covariant parameter types”. They start off by saying that:
“The intent is to demonstrate that although there are good reasons for implementing covariant return types, implementing covariant method parameters is unsound.”
They do nothing of the sort, proving only that Java can’t do them, so they must not be useful. They provide a solution using generics that ends up using a syntax that leaves you yearning for covariant parameter support, so the message is, to say the least, somewhat confused. I’m sure a previous Tech Tip similarly disproved multiple inheritance’s usefulness by showing that:
class A extends B, C {}
fails to compile under Java. I wonder whether the contempt for computer science has anything to do with the level of sophistication found in the new languages being trundled out for us today.
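Back to the covariance tip: here’s a minimal Java sketch of the distinction it dances around (the Animal/Cow classes are mine, not the Tech Tip’s). A narrowed return type really is an override as of Java 5; a narrowed parameter type silently becomes an overload, so calls through a base-class reference never reach it.

class Food {}
class Grass extends Food {}

class Animal {
    Animal reproduce() { return new Animal(); }
    void feed(Food food) { System.out.println("eats anything"); }
}

class Cow extends Animal {
    // Covariant return type: a legal override since Java 5.
    @Override
    Cow reproduce() { return new Cow(); }

    // Narrowing the parameter does NOT override feed(Food); it merely adds an overload,
    // so dynamic dispatch through an Animal reference never reaches it.
    void feed(Grass grass) { System.out.println("eats only grass"); }
}

public class CovarianceDemo {
    public static void main(String[] args) {
        Animal animal = new Cow();
        animal.feed(new Food());                            // prints "eats anything"
        System.out.println(animal.reproduce().getClass());  // prints "class Cow"
    }
}

The generics-based workaround the Tech Tip proposes presumably amounts to parameterizing the class over its food type, which is exactly the kind of syntax that leaves you yearning for real covariant parameter support.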
Computer science can be useful … it’s the way new programming methods are developed and new algorithms are designed (yeah, hackers can find them too, but you can’t get lucky all of the time). Don’t trash the whole subject as ivory tower bullshit just because it doesn’t involve enough development. The best developers I’ve met are the ones that actually read computer science papers, not those that just scan the latest software development magazines for the latest toolbar screenshots.
Published by marco on 11. May 2004 22:44:05 (GMT-5)
Updated by marco on 11. May 2004 22:54:11 (GMT-5)
UML is bandied about so much these days that it’s considered by many to be standard. It’s standardized, but to be standard it’s got to be in use almost everywhere. Everywhere important, at least. Domain-Specific Modeling and Model Driven Architecture by Steve Cook (PDF) assures us that we who feel uncomfortable with UML’s claims to universality are, in fact, in good company.
Where I work, we do pretty much pure object-oriented designs; every once in a while someone dares put an aspect of our design into UML form. If you only casually use UML, it’s never really obvious what it’s trying to say. If everybody in the room only casually uses UML, you can easily (and sometimes gleefully) spend precious meeting time discussing what exactly an open diamond with a star next to it means. Any reasonably complex system is not interpretable at a glance since “[m]odels are typically decorated with a lot of symbols and textual elements that must be carefully inspected to see the real meaning.”
Steve works for Microsoft and has concisely defined their viewpoint of the state of MDA. They are, all in all, quite excited about it.
“The development of domain-specific models, patterns, and frameworks, organized into software value chains — product lines, promises to industrialize the production of software similar to the way in which the production of many household goods was industrialized in the last century.*”
Those who’ve watched Microsoft get ‘excited’ about things over the years will either:
He talks about “domain-specific modeling language” as being far more useful than UML’s more universal approach. This makes sense in light of the effort needed to get UML used at the developer level, where it was intended. It’s not a tool that developers easily take to because it runs against limits so quickly, like its complete inability to be used as a two-way specification (ability to work in both model and generated code easily) and the fact that “it does not translate very directly into [common] technologies.”
“a UML class cannot be used directly to depict a C# class because [it] does not support the notion of properties … a UML interface cannot be used directly to depict a Java interface because [it] does not contain static fields”
While these are tiny details when viewed from the ivory tower of a specification developer, they are showstoppers as far as a developer using UML is concerned. Microsoft is dead-on in using “UML to the extent that it provides recognizable notation for well-understood concepts” and creating “new conventions” where its vaunted universality is not quite up to snuff.
We should hardly be surprised to see that Microsoft is going to ‘embrace and extend’ UML. They’ve gone to the trouble of writing a whole paper giving good, logical reasons why they’re going to use proprietary …ahem, domain-specific … modeling languages in their tools. We can hardly complain though since the promise of UML as a universal tool was ever a mirage. That Microsoft is forging ahead with their own standard is a foregone conclusion, but at least they are quite justified this time. We, as developers, simply run the risk of getting another de-facto standard which is “whatever Microsoft uses and supports”. Those who develop for the web know how much fun it is when Microsoft’s attention wanders and leaves the world addicted to its half-assed standards.
UML − Unified or Universal Modeling Language? by Dave Thomas (JOT) is a fun read, which tells a bit of the history of UML’s development. Throughout, he hits quite often on his main point:
“Doesn’t it seem odd that a language intended to help developers whose most productive tools are textual editors, outliners, and IDEs has no nice syntactic expression?**”
This is essentially the same argument Steve Cook makes: UML is not being accepted because of systemic problems that can never be addressed, since the parts developers see as problems were deliberately designed in by people who never used the language in the real world. UML is “yet another committee attempt to unify the world in a single grand language — the vain quest for a ‘computer Esperanto’.”
No more being chained to code! Or implementations! Or platforms!
“You just draw the pictures; mix in a few textual specifications, and model-driven code generators will eliminate the need for low-level programming (and programmers!).”
This interpretation of the goals of modeling languages is similar to Steve Cook’s excitement at the coming “industrial age” of software engineering. We, as developers, should not allow grinding capitalist interests to use the modeling languages we design to obviate us within our lifetimes. We should make sure that they, first and foremost, fulfill our needs and make our work easier and more fun, not to mention our products more stable/more maintainable/etc.; whether or not a corporate bottom-line is improved is secondary to us. There are already plenty of people designing our world for bottom-lines; you damned well better make sure you’re pushing back to keep the world better for yourself.
“We need to ensure that important new languages for programming and design have actually been used and tested in real applications before they are foisted on programmers. Standards groups can then play their proper role, which is to develop language standards that are based on real-world best practices.”