Programming > earthli News 3.7

Why do so many people resist Git GUIs?

2026-06-28T13:16:43+02:00

Published by marco on 28. Jun 2026 13:16:43
------------------------------------------------------------------------

The post "Why Git is so hard to learn and how best to teach it." included the following phrase,

"Any student given the choice will use a GUI version like desktop."

What's wrong with that? Are there use cases that students can't cover using a UI? Is there some reason they need to use the command line?

My students generally end up using VSC Git UI (or VS Git UI) but I start off by recommending SmartGit because it's much better at visualizing and handling the kinds of problems that students tend to produce.

Merge conflicts have a lot of inherent complexity that there is no way around teaching. Once you've internalized that complexity, and understand how to deal with it, you wonder what the big deal ever was, but I've met many developers and students who need a lot of time to grok it. And many never do. They just do what that XKCD says: nuke it from orbit and start again.

But a good UI (like SmartGit) offers insight into the code base -- at a glance -- that, even if it doesn't help the student, helps me to help them solve their problem relatively quickly.

jj vs. git vs. GUIs

2026-02-16T18:17:45+01:00

Published by marco on 16. Feb 2026 18:17:45
------------------------------------------------------------------------

The article "jj init — Sympolymathesy" by Chris Krycho explains what Jujutsu is and what it does. I was reminded of these notes that I wrote over a year ago when I read "Evolving Git for the next decade" by Joe Brockmeier, which briefly mentioned it as a command-line UX toward which Git itself is working. [1]

[Git is not worse than all the others]

"Jujutsu is two things:

* It is a new front-end to Git. This is by far the less interesting of the two things, but in practice it is a substantial part of the experience of using the tool today. In this regard, it sits in the same notional space as something like gitoxide. Jujutsu’s jj is far more usable for day to day work than gitoxide’s gix and ein so far, though, and it also has very different aims. That takes us to:
* It is a new design for distributed version control. This is by far the more interesting part. In particular, Jujutsu brings to the table a few key concepts — none of which are themselves novel, but the combination of which is really nice to use in practice:
  * Changes are distinct from revisions: an idea borrowed from Mercurial, but quite different from Git’s model.
  * Conflicts are first-class items: an idea borrowed from Pijul and Darcs.
  * The user interface is not only reasonable but actually really good: an idea borrowed from… literally every VCS other than Git."

I don't think that this is true, but I have perhaps more experience with "literally every VCS" than the author does. For example, Perforce changelists are anything but intuitive for new users. TFS and Subversion branches were a horror to deal with. Every feature of VSS was only tolerated because it was amazing to have source control at all. It's not just Git that has difficulties with UX. The author is being a dick here, taking easy swipes that they know no-one will question.
They just expect to get an amen from their chorus who think that, because there is room for improvement in the Git UX, every other tool must be better. This is not true. But it doesn't matter because Git can't just be an amazing tool for which GUIs have filled the gaps in its UX: it has to suck in order for Jujutsu to save us all from it. People are tiring.

[It's a Google thing]

"[...] given it is being actively developed at and by Google for use as a replacement for its current custom VCS setup, it seems like it has a good future ahead of it."

That is such a naive thing to say, given the history of most of the software that comes out of Google. The best that you can hope for is that Google doesn't actively try to kill it while an open-source community tries to keep it alive.

[Jujutsu's features]

"Jujutsu has two discrete operations: describe and new. jj describe lets you provide a descriptive message for any change. jj new starts a new change. You can think of git commit --message "something I did" as being equivalent to jj describe --message "some I did" && jj new. This falls out of the fact that jj describe and jj new are orthogonal, and much more capable than git commit as a result."

"[...] a given change logically the child of four other changes, with identifiers a, b, c, and d? jj new a b c d. That’s it. One neat consequence that falls out of this: a merge in Jujutsu is just jj new with the requirement that it have at least two parents. (“At least two parents” because having multiple parents for a merge is not a special case as with Git’s “octopus” merges.)"

When would you need this? Am I missing a use case? I've never felt particularly like I needed to merge four branches together but I'm just a simple guy with simple needs. I'm sure I'm missing something by not wrangling four simultaneous branches instead of programming. Is this something that LLMs need maybe?

"[...] you can describe the change you are working on and then keep working on it.
The act of describing the change is distinct from the act of “committing” and thus starting a new change. This falls out naturally from the fact that the working copy state is something you can operate on directly: akin to Git’s index, but without its many pitfalls."

What pitfalls does Git's index have? It seems pretty useful to me. At the end of the article, he actually backs off on this and admits that Git's index is kinda necessary for staging parts of the workspace. Also, the whole feature -- separating "describing" and "committing" -- isn't as revelatory as he's making it out to be.

"With jj new -A, you just insert the change directly into the history. Jujutsu will rebase every child in the history, including any merges if necessary; it “just works”. That does not guarantee you will not have conflicts, of course,"

Um. Ok. I mean, how could it possibly have guaranteed that? It's not magic. Conflicts are the nasty part, though, and are always more difficult to deal with when doing operations on multiple commits at once.

[Delaying conflict-resolution is cool]

"Jujutsu can incorporate both the merge and its resolution (whether manual or automatic) directly into commit history. Just having the conflicts in history does not seem that weird. “Okay, you committed the text conflict markers from git, neat.” But: having the conflict and its resolution in history, especially when Jujutsu figured out how to do that resolution for you, as part of a rebase operation? That is just plain wild."

I kind of like the idea of committing conflicts instead of forcing the user to resolve them immediately. That's a good idea. It lets you revisit the merge by preserving the inputs.

"Jujutsu will add conflict markers to a file, not unlike those Git adds in merge conflicts. However, unlike Git, those are not just markers in a file. They are part of a system which understands what conflicts are semantically, and therefore also what resolving a conflict is semantically.
This not only produces nice automatic outcomes like the one I described with my library above; it also means that you have more options for how to accomplish a resolution, and for how to treat a conflict. Git trains you to see a conflict between two branches as a problem. It requires you to solve that problem before moving on. Jujutsu allows you to treat a conflict as a problem which [must eventually] be resolved,"

"Jujutsu allows you to create a merge, leave the conflict in place, and then introduce a resolution in the next commit, telling the whole story with your change history."

"Conflicts are inevitable when you have enough people working on a repository. Honestly: conflicts happen when I am working alone in a repository, as suggested by my anecdote above. Having this ability to keep working with the repository even in a conflicted state, as well as to resolve the conflicts in a more interactive and iterative way is something I now find difficult to live without."

The article discusses retaining conflicts in commits as first-class, semantically valuable artifacts that the conflict resolver can either resolve immediately or later when another commit comes along to make the conflict go away.

This is very interesting for multi-commit rebases where git currently makes you resolve the conflicts every step of the way, even when you know that the conflict will definitely go away further up the chain. Often, you don't even remember how you actually want to resolve the conflict in the "old" commits -- and you don't care.

This only happens with rebase, which I use much, much, much more than merge. When you merge, git considers the sum of all changes in all commits that you're merging, so you get the behavior you want: the sum of the commits eliminates irrelevant conflicts. Rebase in git doesn't benefit from this behavior. In jujutsu, it does.

[Massaging your commits is good, too]

"[...]
this ability to move part of one change into a different change is a really useful thing to be able to do in general. I find it particularly handy when building up a set of changes where I want each one to be coherent — say, for the sake of having a commit history which is easy for others to review."

Yes, of course. Some of us have just been using powerful GUIs to do this for a long time, with hotkeys flying, instead of furiously typing commands into a command line. This is a laudable goal. I'm glad that jujutsu brings that experience -- and that ability -- to command-line users.

[Wait. No branch names?]

"The default log template shows me the current set of branches, and their commit messages are usually sufficiently informative that I do not need anything else."

Sure, ok. No branch label required. That's definitely not a recipe for disaster for most developers. It never struck me as too burdensome to have to make little bookmarks, really. It felt kind of obvious and logical. I guess I didn't notice that I was wasting time when I could have just been reading commit messages instead.

[Acknowledging Git GUIs]

"GUI tools could make all of those much easier. Any number of the Git GUIs have tried, but Git’s underlying model simply makes it clunky. That does not have to be the case with Jujutsu. Likewise, surfacing things like Jujutsu’s operation and change evolution logs should be much easier than surfacing the Git reflog, and provide easier ways to recover lost work or simply to change one’s mind."

First of all, I don't think there's that strong a point to make here. A GUI like SmartGit manages to elide a lot of the complexity and has done so for years. I wonder if Syntevo is working on anything for jj? Or is there no point because SmartGit has actually already been doing what Jujutsu does for a while? Maybe there is an advantage to storing the conflicts that SmartGit could also leverage.
[Editing history for code you don't have checked out]

The article "I'm daily driving Jujutsu, and maybe you should too" by Drew Devault writes,

"As a git power user, I rely heavily on git rebase to edit my git history as I work, frequently squashing and splitting and editing commits as I work, and I used “stacked diffs” without branches before it was cool."

Same. I've just never done it on the command-line, so I've never felt the pain of doing any of this git-fu there. SmartGit makes most of my history-editing seamless, easy, and foolproof. I know all of you console-jockeys hate them but give me drag-and-drop operations any day.

"When I edited this earlier commit, I was in the middle of working on something else and I hadn’t committed or even staged it. I did not run git stash, nor git commit -m"WIP", nor git add, nor git checkout, nor git rebase, at any point. The only command I ran was jj squash. When it was done, I was returned immediately to where I left off, with a half-written, uncommitted change in my workdir. It took all of two seconds to complete this operation and pick up where I left off.

"The “wow” moment came when I realized that I had done this several times that day without finding it particularly remarkable. Jujutsu makes editing history absolutely effortless."

This is, I think, the killer feature of jujutsu: you can edit history that you don't currently have checked out.

While better conflict-handling and editing commits anywhere in the tree are attractive features, I still feel that switching back to using the command-line would incur enough drawbacks to outweigh the benefits for me.

--------------------------------------------------------------------------------

[1] That article wrote, "There are a number of things that Jujutsu got right, he said. For example, history is malleable by default. "It's almost as if you were permanently in an interactive rebase mode, but without all the confusing parts."
When history is rewritten in Jujutsu all dependents update automatically "so if you added a commit, all children are rebased automatically". Conflicts are data, not emergencies. "You can commit them and resolve them at any later point in time." These features are nice to have, he said, and fundamentally change how users think about commits. "You stop treating them as precious artifacts and rather start treating them as drafts that you can freely edit"."

Optimize by keeping only the code you need

2026-02-08T17:16:25+01:00

Published by marco on 8. Feb 2026 17:16:25
------------------------------------------------------------------------

In the video "Context is Everything" by Andreas Fredriksson, the author pinpoints that a dependency in his app -- a JSON-handling library -- is sucking all the performance out of it. So, he takes a look at it. It's a general-purpose library, with a lot of edge cases...edge cases that his input data doesn't have. That is, if he can guarantee a certain context, then he can use an optimized version of the JSON library's code.

This isn't always going to be the solution -- it will, in fact, rarely be the solution for a LOB app for which every line of maintenance is a burden -- but, when you're making something with performance constraints, it's good to be able to think like this. [1]

He takes the original JSON library and profiles it. Then he starts to excise the slow bits -- bits his app doesn't need anyway. This gets him impressive performance boosts. First, he gets it to be 2x faster with a simple linear fix (removing unneeded branches), then boosts it to over 11x faster by using a mixed-parsing mode.

Another profile shows that a function called isspace() is taking up 45% of the processing time now. He trims that down to just handle the whitespace characters his file might actually contain. He also ditches the locale check that happened every single time. 17x faster now.

OK. What else can we do? Ah, we could observe that the data doesn't have to contain spaces at all! That is, instead of parsing the spaces as they come along, you can use a SIMD-based solution combined with a LUT (Look-Up Table) to normalize the input data before you even parse it. He uses a quick-and-dirty Perl script to build the LUT. 22x faster now. That step alone, from 17x to 22x, adds 5x the original speed of the parser.
* We just removed a bunch of poorly predicted branches, nothing else
* Low-level thinking = not paying for things you don't need
* Low-level thinking = partition work in hardware-friendly ways

"[...] We didn't change any of the behavior of the program. All we did was we separated these two passes in a way that was friendly for the hardware. We moved branches from being in the integer control flow to being inside masks in the SIMD flow."

The next step is to reexamine what "white space" actually is: he reinterprets it to mean anything that's not a printable character, which allows him to optimize the mask even further. 29x faster. Over 1GB/s of throughput.

Are we done? Bitch, please.

He moves on to two more levels of optimization that still bring good-sized gains, but at the cost of more complexity. They also contain more assumptions but that's OK if the assumptions will always be correct.

You want to stop optimizing when it makes sense for your use case. If you're writing code for a very tight loop on some low-level hardware -- or in a game where the budget per frame is a maximum of 16ms -- then it might be very important: you might end up saving incredible amounts of time for your users; you might be using a lot less power.

* Solve the right problem
* Ask the right questions
* Consider the liabilities and overall economics of your approach
* Consider the unique context and the potentially massive wins
* Generic means "not tuned for your use case"
* Don't be afraid to look inside

--------------------------------------------------------------------------------

[1] I can't help but notice that this is absolutely not what LLMs tend to do: they don't know how to remove code. They don't know how to remove unnecessary functionality. Maybe you can corral it into doing so but I've not read of too many experiments in that direction.
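
The isspace() step described above can be sketched in C#. This is only an illustration of the idea and not the code from the video: it covers the scalar part (a fixed lookup table instead of a general, locale-aware check) and leaves out the SIMD pass entirely. The class name JsonWhitespace, the method SkipWhitespace, and the assumption that the input only ever contains space, tab, LF, and CR as whitespace are all hypothetical.

```csharp
using System;

// Hypothetical helper: a table-driven whitespace check. It assumes that the
// input only ever contains these four whitespace bytes: space, tab, LF, CR.
public static class JsonWhitespace
{
    // 256-entry lookup table indexed by byte value; built once.
    private static readonly bool[] Table = BuildTable();

    private static bool[] BuildTable()
    {
        var table = new bool[256];
        table[(byte)' '] = true;
        table[(byte)'\t'] = true;
        table[(byte)'\n'] = true;
        table[(byte)'\r'] = true;
        return table;
    }

    // Returns the index of the first non-whitespace byte at or after 'start',
    // or data.Length if only whitespace remains.
    public static int SkipWhitespace(ReadOnlySpan<byte> data, int start)
    {
        var i = start;
        while (i < data.Length && Table[data[i]])
        {
            i++;
        }

        return i;
    }
}
```

The point of the sketch is the one from the talk: the check does no more work than the known input requires, which is only valid as long as the assumption about the input holds.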

Learning about OCaml Effects

2026-01-18T12:08:41+01:00

Published by marco on 18. Jan 2026 12:08:41
------------------------------------------------------------------------

I don't program with OCaml. I never have. I have a good colleague who does, occasionally, write stuff in OCaml, and he sent me a bunch of links about OCaml Effects, starting with a discussion asking "Are we rational? About exceptions and effects" by olleharstedt. The author writes,

"I was thinking about the fact that there’s no consensus about exceptions and whether to include them or not in a programming language. Think about Go. They decided to not add support for exceptions. Did they cite any study to support this decision, that supports the notion that exceptions in general lower the quality[1] of the ecosystem? Not that I know of. Now OCaml goes in the opposite direction - adding more ways to jump around in the code, with effects. Also no studies, no experiments."

This is an excellent point to make: we consider ourselves to be rational engineers and scientists but, very often, we nearly completely elide our reasoning for major decisions about architecture and functionality. [1]

From there, I was intrigued to learn more about this controversial feature in OCaml and I landed on the documentation page called "Chapter 12 Language extensions - 24 Effect handlers", which summarizes its topic as follows,

"Effect handlers are a mechanism for modular programming with user-defined effects. Effect handlers allow the programmers to describe computations that perform effectful operations, whose meaning is described by handlers that enclose the computations. Effect handlers are a generalization of exception handlers and enable non-local control-flow mechanisms such as resumable exceptions, lightweight threads, coroutines, generators and asynchronous I/O to be composably expressed."
While I -- someone who's been reading programming-language specifications for over 30 years -- really like the sound of that, I'm forced to admit that most of the documentation, while comprehensible to someone versed in language constructs and terminology, serves as a perfect example of "why no-one uses OCaml." It is dense. Consider the following description of a concept like exception-handling, which many programmers can grok relatively easily, when it's been abstracted away into a generalized effect mechanism, described as follows, "We run the computation comp1 () under an effect handler that handles the Xchg effect with a continuation bound to k. Here effect is a keyword which signifies that the Xchg n pattern matches effects and not exceptions. As mentioned earlier, effect handlers are a generalization of exception handlers. Similar to exception handlers, when the computation performs the Xchg effect, the control jumps to the corresponding handler, and unhandled effects are forwarded to the outer handler. However, unlike exception handlers, the handler is also provided with the delimited continuation k, which represents the suspended computation between the point of perform and this handler." This is not to take anything away from the documentation -- which is precise, indubitably accurate, extensive, and replete with examples -- but you're not going to be onboarding any newcomers or dilettantes with this kind of stuff. As a sign that even OCaml community members are aware that this might be a problem, the page "Concurrent Programming with Effect Handlers" offers another view on it. It describes OCaml effects as, "An algebraic effect handler is a programming abstraction for manipulating control-flow in a first-class fashion. They generalise common abstractions such as exceptions, generators, asynchronous I/O, or concurrency, as well as other seemingly esoteric programming abstractions such as transactional memory and probabilistic computations. 
"Operationally, effect handlers offer a form of first-class, restartable exception mechanism. In this tutorial, we shall introduce gently algebraic effect and handlers with gentle examples and then continue on to more involved examples." Don't get me wrong, I find reading about a generalized mechanism that collects all of the effect-ful mechanisms hard-coded into other languages fascinating. With effects, OCaml has sacrificed "elegance and approachability of the language" for "provability of the program". Even if the number of people who end up using this feature in OCaml amounts to a rounding error, I think that research into mechanisms like this is vital because it leads to improvements in other, more mainstream languages. Related to this all is a practical implementation using effects for a laudable goal: inversion of control and dependency injection [2], for which the approaches in OCaml are described in detail in "Basic dependency injection with objects" . It discusses two common approaches to DI in OCaml and then proposes a more practical alternative. On the effect-based approach, the author writes, "an Effect system is often described as a systematic way to separate the denotational description of a program, where propagated effects are operational “holes” that are given meaning via a handler, usually providing the ability to control the program’s execution flow (its continuation), unlocking the possibility to describe, for example, concurrent programs." "It’s quite amusing to see that dependency injection and exception capturing can be considered two special cases of effect abstraction, differing only in how the continuation is handled." Spoiler: the author ends up using objects rather than modules (weak type-inference support, overly verbose) or effects (weak type-inference support, complexity). 
-------------------------------------------------------------------------------- [1] Where the OCaml discussion centers on the rationality of how a working group chooses which features to include in a programming language, the article "Why Engineers Can't Be Rational About Programming Languages" by Steve Francia has a very good discussion about the general irrationality that influences choosing a programming language for a project. "[...] what struck me was how broken their reasoning was. If they were making a logical argument, surely they would have considered Go and in doing so with their presented criteria they would have realized Go was a better option and, at the very least, refined their criteria. I pulled the VP aside after the meeting. “Walk me through how you evaluated other language candidates,” I said. His face went blank. “We… didn’t really look at any others,” he admitted. “Everyone’s talking about Rust.” There it was: a 50 million dollar decision made on hype, about to be green lit. For me this was the moment of epiphany, finally an answer to the question for the beginning of my career. The presentation didn’t share an analysis, they hadn’t done one; it was a justification for a choice already made. This was a decision based purely on hype, emotion, and identity." This is utterly unsurprising. No evaluation. Gut feeling. Justify that when things go tits-up. Or maybe -- and stick with me here -- it wouldn't have gone tits-up if you'd done an evaluation. "The researchers’ conclusion was stark: “To consider an alternative view, you have to imagine an alternative version of yourself.” Your brain can’t objectively evaluate challenges to identity based beliefs because doing so requires temporarily dismantling the neural architecture that defines who you are. It’s not a matter of being more rational or trying harder. 
The mechanism that would allow you to see the bias clearly is the same mechanism the bias has compromised."

"Every time an engineer evaluates a language that isn’t “theirs,” their brain is literally working against them. They’re not just analyzing technical trade offs, they’re contemplating a version of themselves that doesn’t exist yet, that feels threatening to the version that does. The Python developer reads case studies about Go’s performance and their amygdala quietly marks each one as a threat to be neutralized. The Rust advocate looks at identical problems and their Default Mode Network constructs narratives about why “only” Rust can solve them."

"The moment you hire a Rust developer to evaluate languages, you’ve already chosen Rust. You’ve just added a $2 million feasibility study to make the predetermined decision feel rational."

"Industry research suggests that technology stack decisions account for 40-60% of total development costs over a product’s lifecycle. Research by Stripe found that developers spend 42% of their time on technical debt."

"Instead of asking “which language is best?” we need to ask “what is this language going to cost us?” Not just in salaries, but in velocity, in technical debt, in hiring difficulty, in operational complexity, in every dimension that actually determines whether you survive."

"Choosing a programming language is the single most expensive economic decision your company will make. It will define your culture, constrain your budget, determine your hiring pipeline, set your operational costs, and ultimately dictate whether you can move fast enough to win your market."

This goes for frameworks and technologies as well.

[2] Something that I've been writing about quite extensively recently. See, e.g.,

* "Discussing DI, IOC, and containers"
* "Thinking about the null-object pattern"
* "The goal is to test everything automatically"

The goal is to test everything automatically

2026-01-16T09:21:34+01:00

Published by marco on 16. Jan 2026 09:21:34
------------------------------------------------------------------------

So it all started with the following line of code in the Startup.cs of a WPF application,

    locator.GetInstance().LogInBasedOnGeneralSettings();

It was to be replaced with these lines of code.

    #if DEBUG
        locator.GetInstance().LogInBasedOnGeneralSettings();
    #else
        locator.GetInstance().Show();
    #endif

[Reduce startup complexity]

Going by the single-responsibility principle, the startup should be responsible for starting the app but not for making decisions. The new code makes a decision, so it should be encapsulated in a component.

The simplest (though not testable) way of reducing "cognitive complexity" (PDF Download) is to move the logic to a function. E.g.,

    private void EnsureLoggedIn(IServiceLocator locator)
    {
    #if DEBUG
        locator.GetInstance().LogInBasedOnGeneralSettings();
    #else
        locator.GetInstance().Show();
    #endif
    }

Now the calling code is much simpler,

    EnsureLoggedIn(locator);

[Define a login service]

I also like to reduce calls to locator.GetInstance() as much as possible, so I prefer to define something like a LoginService that consumes the IAuthenticationService and the LoginViewModel, so that you have something like this:

    class LoginService(
        IAuthenticationService authenticationService,
        LoginViewModel loginViewModel)
    {
        public void EnsureLoggedIn()
        {
    #if DEBUG
            authenticationService.LogInBasedOnGeneralSettings();
    #else
            loginViewModel.Show();
    #endif
        }
    }

Now we can make the method in the startup file use the LoginService instead.

    locator.GetInstance().EnsureLoggedIn()

We don't need to pollute the startup with the nuance of which mode we're in.

[Making it testable]

A colleague responded that,

"But then you have to [...] inject a ViewModel into a Service?"

I'll include the reply to show how the sausage is made: instead of just showing the solution, I often appreciate learning how people think about problems.
So, here was my response,

I'm not trying to be pedantic; it just comes naturally. 😃

* I was going to write that injecting a ViewModel into a service isn't bad because it's just a view model. But then I noticed that it seems to be communicating with the view in order to show something to the user. 😃
* We're trying to abstract away complexity and to make our logic testable.
* We need to call Show() during startup; that's a fact. If we introduce a service, it actually makes that part mockable.
* If we wanted to test that the LogInBasedOnGeneralSettings() is called when expected, we couldn't do that right now, could we?
* If we make it a service, then we could think about verifying the logic with a test.
* Of course, once we want to build the test, we'd then be confronted with the need to abstract away the compiler-define. Otherwise, we wouldn't be able to test both branches without recompiling. That's a code smell, too.
* Which is why I usually end up with some standard settings objects like the following.

    public interface ICompilerSettings
    {
        public bool IsDebug { get; }
    }

    public class CompilerSettings : ICompilerSettings
    {
        public bool IsDebug
        {
            get
            {
    #if DEBUG
                return true;
    #else
                return false;
    #endif
            }
        }
    }

    public interface ILoginServiceSettings
    {
        public bool ForceLogin { get; }
    }

    public class LoginServiceSettings(ICompilerSettings compilerSettings) : ILoginServiceSettings
    {
        public bool ForceLogin => !compilerSettings.IsDebug;
    }

You might think that this is over-engineering, overkill, an excess of ceremony introduced by an architect astronaut! I, on the other hand, think that this is a minimal solution that separates concerns and makes all branches testable. Once we inject the ILoginServiceSettings into the LoginService, we can easily verify the behavior with tests (using fakes, mocks, etc.).
    class LoginService(
        ILoginServiceSettings settings,
        IAuthenticationService authenticationService,
        LoginViewModel loginViewModel)
    {
        public void EnsureLoggedIn()
        {
            // ForceLogin is a property on ILoginServiceSettings, not a method.
            if (!settings.ForceLogin)
            {
                authenticationService.LogInBasedOnGeneralSettings();
            }
            else
            {
                loginViewModel.Show();
            }
        }
    }

It might look like a lot of ceremony but, without it, how else can you say with confidence that the login is required in some cases but not others? We can even verify that it's not required in DEBUG mode by mocking ICompilerSettings. Then the only thing we have to verify without automated tests is that the CompilerSettings are implemented as expected, which is very little code to manually check. We don't need to look at the rest. 👍

[We want to test everything]

My colleague very politely responded,

"Injecting ViewModels into Services is generally considered bad practice. The rest seems to depend on what you wish to test and don't overengineer it..."

At this point, we took the conversation to meatspace, i.e., I ran over to his desk to tell him that "I always want to test everything." I am willing to concede on time constraints, priority, and planning, but my goal is "test all the code paths" eventually. I'm patient, though, so will accept unwritten tests as technical debt. I will design my code so that it could be tested, though, when we eventually find time to do so.

We shouldn't just punt on tests because "it looks difficult" or "it's not much logic". In the first case, the fact that it looks difficult may indicate you're not writing your code in a testable way or may reveal architectural problems. In the second case, those are famous last words. If it's just a little logic, then why wouldn't you just test it instead of investing the time arguing that you don't need a test?

If you have a code base that’s difficult to test because of some unfortunate architectural decisions, then the thing to do is not to ignore it but to slowly chip away at it.
How else would we get a higher percentage of our code covered by tests? Hint: it's not by continuing to write more code without tests. [Addendum: A note on architectural boundaries] He'd also argued about mixing levels -- injecting a ViewModel into a service -- but I convinced him that this is already what was happening whether you wrap a service around it or not. The startup is already instantiating and using a view model. Is that somehow better? I don't think it's a bad thing, as it's just a way of asking the user for input in order to continue starting the application. It's a step in the application startup. If you wrap it in a service, then you can at least test that the code does what you want. This is exactly the kind of thing that everyone is going to forget to test manually. Should the startup be using the view? Maybe, maybe not. It currently does and it makes for a legible workflow. Being too pedantic about architectural boundaries just for the sake of it is often wasted effort and can often lead to overly complex solutions. Ordinarily, the problem would be that introducing a dependency across a boundary weakens testability, but that's obviously not the case here. We can just mock away the view and test all of our logic. I wouldn't worry about this one too much, though it's a good rule of thumb.
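To make the claim about verifying both branches concrete, here is a minimal sketch of such a check, using hand-rolled fakes rather than a mocking library. Everything in it is an assumption made for illustration: `ILoginPrompt` is a hypothetical narrow interface standing in for the `LoginViewModel` (so that the view can be "mocked away"), and the recording fakes and the `LoginServiceChecks` helper are not from the original code base.

```csharp
using System;

// Abstractions mirroring the names used in the post.
public interface ILoginServiceSettings { bool ForceLogin { get; } }
public interface IAuthenticationService { void LogInBasedOnGeneralSettings(); }

// Assumption: a hypothetical interface standing in for LoginViewModel.Show().
public interface ILoginPrompt { void Show(); }

public class LoginService
{
    private readonly ILoginServiceSettings _settings;
    private readonly IAuthenticationService _authenticationService;
    private readonly ILoginPrompt _loginPrompt;

    public LoginService(ILoginServiceSettings settings, IAuthenticationService authenticationService, ILoginPrompt loginPrompt)
    {
        _settings = settings;
        _authenticationService = authenticationService;
        _loginPrompt = loginPrompt;
    }

    public void EnsureLoggedIn()
    {
        if (!_settings.ForceLogin)
        {
            _authenticationService.LogInBasedOnGeneralSettings();
        }
        else
        {
            _loginPrompt.Show();
        }
    }
}

// Hand-rolled fakes (hypothetical) that only record how often they were called.
public class FakeSettings : ILoginServiceSettings
{
    public bool ForceLogin { get; set; }
}

public class RecordingAuthenticationService : IAuthenticationService
{
    public int Calls { get; private set; }
    public void LogInBasedOnGeneralSettings() => Calls++;
}

public class RecordingLoginPrompt : ILoginPrompt
{
    public int Calls { get; private set; }
    public void Show() => Calls++;
}

public static class LoginServiceChecks
{
    // Runs EnsureLoggedIn() once and reports which branch was taken.
    public static (int AutomaticLogins, int PromptsShown) Run(bool forceLogin)
    {
        var authentication = new RecordingAuthenticationService();
        var prompt = new RecordingLoginPrompt();
        var service = new LoginService(new FakeSettings { ForceLogin = forceLogin }, authentication, prompt);

        service.EnsureLoggedIn();

        return (authentication.Calls, prompt.Calls);
    }
}
```

With something like this, `LoginServiceChecks.Run(true)` exercises the prompt branch and `LoginServiceChecks.Run(false)` exercises the automatic-login branch, without recompiling and without a real view.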

Using extensions for operators in C# 14

2026-01-13T22:59:03+01:00

Published by marco on 13. Jan 2026 22:59:03 ------------------------------------------------------------------------ The article "C# 14 Extension Members: Complete Guide to Properties, Operators, and Static Extensions" by Laurent Kempe says, "Perhaps the most powerful C# 14 capability is extension operators. You can now add user-defined operators to types you don’t control, enabling natural mathematical operations."

When I first saw this, I thought it was kind of gimmick-y. But I just realized why it's very nice that you can declare operators separately -- optionally -- from the type. Adding operators by default is a heavy decision in most APIs. You generally don't do it except for the most obvious cases, like matrices, etc., where there is really only one possible way to implement the standard operators. However, for a lot of other types, it would be convenient to have these operators but they might be annoying for some. This way, you can either add them in yourself -- tailoring the implementation for your needs -- or you can pull in a NuGet package that extends standard types with operators. This allows you to opt in to the operators.

With these new extensions, we're probably going to see more lightweight types that are delivered in multiple NuGet packages, the satellite packages being extensions that enhance the base type for specific scenarios. The author demonstrates such a custom operator, using tuples.

    extension(Point point)
    {
        public static Point operator +(Point point, (int dx, int dy) offset) =>
            new Point(point.X + offset.dx, point.Y + offset.dy);
    }

    // Usage:
    Point translated = myPoint + (5, -3);

Nice. [C# Logo]

Thinking about the null-object pattern

2025-12-27T22:27:23+01:00

Published by marco on 27. Dec 2025 22:27:23 ------------------------------------------------------------------------ I had never thought of an if statement as a type-check until a Smalltalk programmer explained it to me in this video. She explained how Smalltalk has six keywords -- according to "Wikipedia" , they're true, false, nil, self, and super, but her list had thisContext on it as well [1] -- and you can get rid of conditions and turn them into message-passing instead, as God intended. [media] From the official video description, "Our code is full of hidden assumptions, things that seem like nothing, secrets that we did not name and thus cannot see. These secrets represent missing concepts and this talk shows you how to expose those concepts with code that is easy to understand, change and extend. Being explicit about hidden ideas makes your code simpler, your apps clearer and your life better. Even very small ideas matter. Everything, even nothing, is something." [The null-object pattern] [image]I often use sentinel (or placeholder) objects so that I don't have to query a condition, like if (a == null) { }. Instead, as Sandi says, you just "pass a message". She calls it the "null-object pattern" or an "active nothing". Fine, cool. Lots of names for it. As she also noted, you don't get rid of the conditional, but you move it to the place where the decision should be made, rather than propagating the decision to every caller or dependency. She spent a lot of time on it, but it's basically about the following pattern, which is drastically simplified from what you'd probably find in the wild. 

    interface IAnimal
    {
        public string Name { get; }
    }

    class Animal : IAnimal
    {
        public string Name { get; init; }
    }

    List<IAnimal> animals = [new Animal { Name = "Pig" }, null, new Animal { Name = "Cow" }];

    foreach (var animal in animals)
    {
        if (animal != null)
        {
            Console.WriteLine(animal.Name);
        }
        else
        {
            Console.WriteLine("no animal");
        }
    }

The condition is the problem, because every client of that list has to deal with the possibility of nulls. One way to handle it would be to move the condition out of the (obvious) loop, eliding nulls from the list, as shown below.

    var actualAnimals = animals.Where(a => a != null);

    foreach (var animal in actualAnimals)
    {
        Console.WriteLine(animal.Name);
    }

You still have the conditional, of course, but you're also handling it just once and then letting the rest of your code be free of needing to deal with possible nulls. However, this hides the length of the original list, which is not always what you want. What if you want to represent the "empty" slots? What if, as the talk is called, "Nothing is Something"? Then you would use the "null-object pattern".

    class MissingAnimal : IAnimal
    {
        public string Name => "no animal";
    }

    var actualAnimals = animals.Select(a => a ?? new MissingAnimal());

    foreach (var animal in actualAnimals)
    {
        Console.WriteLine(animal.Name);
    }

Voila.

[Composition beats inheritance]

In the second act of this 36-minute talk, she demonstrates how to use composition rather than inheritance by ruthlessly applying the single-responsibility principle. She starts with a simple-looking class that returns some data.

    class Thing
    {
        private IEnumerable<string> _data;

        public Thing(IEnumerable<string> data)
        {
            _data = data ?? throw new ArgumentNullException(nameof(data));
        }

        public IEnumerable<string> Data => _data;
    }

She then shows how you can use inheritance to make two descendants, one of which returns the data in a random order and another that returns the data with each entry doubled.

    class Thing
    {
        private IEnumerable<string> _data;

        public Thing(IEnumerable<string> data)
        {
            _data = data ?? throw new ArgumentNullException(nameof(data));
        }

        public virtual IEnumerable<string> Data => _data;
    }

    class RandomThing(IEnumerable<string> data) : Thing(data)
    {
        public override IEnumerable<string> Data => base.Data.Shuffle();
    }

    class DoubleThing(IEnumerable<string> data) : Thing(data)
    {
        public override IEnumerable<string> Data => base.Data.Zip(base.Data, (x, y) => new[] { x, y }).SelectMany(o => o);
    }

Now try to make one that returns the data in a random order and doubles each entry. Don't repeat yourself. With inheritance, you're quickly "in a tight spot". The thing to remember is that you've now introduced two new features to Things, which kind of slipped in there: RandomThing orders the data but does not transform it, whereas DoubleThing transforms the data but doesn't touch the order. It sounds like the Thing now has two responsibilities, i.e., it addresses two concerns.

The answer is to separate out these two concerns into components and then to inject those components into the Thing. It's always the same answer. It's boring, right? Boring is good. This is an intermediate step, to illustrate the simplest form of composition, with the fewest changes. It's going to be more code than we'd like, but let's go ahead and write it.

    class Transformer
    {
        public virtual IEnumerable<string> Transform(IEnumerable<string> data) => data;
    }

    class Doubler : Transformer
    {
        public override IEnumerable<string> Transform(IEnumerable<string> data)
        {
            return data.Zip(data, (x, y) => new[] { x, y }).SelectMany(o => o);
        }
    }

    class Sorter
    {
        public virtual IEnumerable<string> Sort(IEnumerable<string> data) => data;
    }

    class Shuffler : Sorter
    {
        public override IEnumerable<string> Sort(IEnumerable<string> data) => data.Shuffle();
    }

    class Thing(IEnumerable<string> data, Transformer transformer, Sorter sorter)
    {
        public IEnumerable<string> Data => sorter.Sort(transformer.Transform(data));
    }

    new Thing(["A", "B", "C"], new Doubler(), new Shuffler());

This is immediately, obviously suboptimal. First of all, we should recognize that changing the order and transforming the data aren't different operations. They're both functions on a sequence that return another sequence.
Instead of passing in a Sorter and a Transformer, as in the example in the video, we could instead pass in a sequence of transformers to apply.

    class Transformer
    {
        public virtual IEnumerable<string> Transform(IEnumerable<string> data) => data;
    }

    class Doubler : Transformer
    {
        public override IEnumerable<string> Transform(IEnumerable<string> data)
        {
            return data.Zip(data, (x, y) => new[] { x, y }).SelectMany(o => o);
        }
    }

    class Shuffler : Transformer
    {
        public override IEnumerable<string> Transform(IEnumerable<string> data) => data.Shuffle();
    }

    class Thing(IEnumerable<string> data, IEnumerable<Transformer> transformers)
    {
        public IEnumerable<string> Data => transformers.Aggregate(data, (current, t) => t.Transform(current));
    }

    new Thing(["A", "B", "C"], [new Doubler(), new Shuffler()]);

Another thing we can notice is how rigid this all is in the type of the item. Let's make this a more generalized pattern.

    class Transformer<T>
    {
        public virtual IEnumerable<T> Transform(IEnumerable<T> data) => data;
    }

    class Doubler<T> : Transformer<T>
    {
        public override IEnumerable<T> Transform(IEnumerable<T> data)
        {
            return data.Zip(data, (x, y) => new[] { x, y }).SelectMany(o => o);
        }
    }

    class Shuffler<T> : Transformer<T>
    {
        public override IEnumerable<T> Transform(IEnumerable<T> data) => data.Shuffle();
    }

    class Thing<T>(IEnumerable<T> data, IEnumerable<Transformer<T>> transformers)
    {
        public IEnumerable<T> Data => transformers.Aggregate(data, (current, t) => t.Transform(current));
    }

    new Thing<string>(["A", "B", "C"], [new Doubler<string>(), new Shuffler<string>()]);

Note that now we have all of our logic independent of the type of item in the sequences. It's only in creating the Thing that you decide on the item type. The Transformer is called a functional interface -- i.e., an interface with a single function -- which would be type-compatible with a function signature in Java, but still isn't in C#. It's kind of clunky and repeats a bunch of code. Can we get rid of it? Can we also get rid of the dynamic dispatch (i.e., the virtual and override)? Sure, we can.

    class Thing<T>(IEnumerable<T> data, IEnumerable<Func<IEnumerable<T>, IEnumerable<T>>> transformers)
    {
        public IEnumerable<T> Data => transformers.Aggregate(data, (current, t) => t(current));
    }

    new Thing<string>(["A", "B", "C"], [data => data.Zip(data, (x, y) => new[] { x, y }).SelectMany(o => o), Shuffle]);

Well, that's a lot less code, but it's a bit messy at the declaration point. One nice thing, though, is that we're only declaring the item type once now, in the type parameter to Thing. We can clean that up a bit but we're going to be limited by the requirement to specify the type parameter as soon as we leave the constructor of the Thing. The Shuffle part is succinct enough but the Double part isn't at all obvious. How about we encode the types in a static method call?

    public static class ThingTools
    {
        public static IEnumerable<T> Double<T>(this IEnumerable<T> data)
        {
            return data.Zip(data, (x, y) => new[] { x, y }).SelectMany(o => o);
        }
    }

    new Thing<string>(["A", "B", "C"], [ThingTools.Double, Shuffle]);

That's quite a bit better. Now that we already have a helper class, we can keep improving things by making another helper method that allows us to create a Thing by passing in a collection of items without specifying the item type explicitly. Instead, the item type is picked up from the data passed in.

    public static class ThingTools
    {
        public static IEnumerable<T> Double<T>(this IEnumerable<T> data)
        {
            return data.Zip(data, (x, y) => new[] { x, y }).SelectMany(o => o);
        }

        public static Thing<T> Create<T>(IEnumerable<T> data, IEnumerable<Func<IEnumerable<T>, IEnumerable<T>>> transformers)
        {
            return new Thing<T>(data, transformers);
        }
    }

    ThingTools.Create(["A", "B", "C"], [ThingTools.Double, data => data.Shuffle()]);

Isn't that fun? Did I go too far? You decide what you want to use based on your comfort level with any of the versions that use composition (don't use the ones that don't use composition).
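For anyone who wants to try the function-based version, here is a self-contained sketch with the generic type arguments spelled out. It makes a few assumptions that are not from the talk or the text above: it swaps the shuffle for a reversal so that the result is deterministic, it doubles entries with `SelectMany` instead of the `Zip` trick, and the names `DoubleEach` and `ThingDemo` are purely illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Thing<T>
{
    private readonly IEnumerable<T> _data;
    private readonly IEnumerable<Func<IEnumerable<T>, IEnumerable<T>>> _transformers;

    public Thing(IEnumerable<T> data, IEnumerable<Func<IEnumerable<T>, IEnumerable<T>>> transformers)
    {
        _data = data ?? throw new ArgumentNullException(nameof(data));
        _transformers = transformers ?? throw new ArgumentNullException(nameof(transformers));
    }

    // Applies each transformer in order, feeding the output of one into the next.
    public IEnumerable<T> Data => _transformers.Aggregate(_data, (current, transform) => transform(current));
}

public static class ThingDemo
{
    // Hypothetical helper: emits every item twice, preserving the order.
    public static IEnumerable<T> DoubleEach<T>(IEnumerable<T> data) =>
        data.SelectMany(item => new[] { item, item });

    public static string Run()
    {
        // Reverse stands in for Shuffle here so that the result is predictable.
        var transformers = new Func<IEnumerable<string>, IEnumerable<string>>[]
        {
            DoubleEach,
            items => items.Reverse(),
        };

        var thing = new Thing<string>(new[] { "A", "B", "C" }, transformers);

        return string.Join(",", thing.Data);
    }
}
```

`ThingDemo.Run()` doubles first and reverses second, so it yields `C,C,B,B,A,A`. Because the transformers are just a sequence, adding a third behavior means adding a third function rather than a new subclass.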

Discussing DI, IOC, and containers

2025-12-27T22:01:40+01:00

Published by marco on 27. Dec 2025 22:01:40 Updated by marco on 14. Feb 2026 17:48:49 ------------------------------------------------------------------------ I was recently allowed to observe as a team discussed the benefits and drawbacks of using an IOC container. [1] I was asked not to directly participate because it was a team-building exercise; the team needed to convince itself based on the merits of its own arguments. If those for the technology were unable to articulate their convictions sufficiently, then it wouldn't help for an outside authority to dictate the answer. I assisted in the background, with clarification and alternate explanations.

[Why no tests?]

Some team members had a reasonable hesitation about using an IOC container. Why reasonable? Because they'd been hurt in the past by non-pragmatic and overly magical solutions. The main reason that the other team members wanted to use DI and an IOC container was to improve testability. They also appreciated that a side-effect of DI is that it makes it so much easier not only to reason about your system, but to repurpose parts of it.

The disconnect arose because the first group doesn't write automated tests. Therefore, they never felt the pain of trying to replace an annoyingly impure component deep in the program logic with something else in order to test other components. If that situation doesn't come up, then you might not see what the big deal is. So, part of the confusion was that some of the team still had an at-best antiquated -- and at-worst irresponsible and inefficient -- approach to engineering because they did all of their testing manually and in an ad-hoc manner.

Another part of the confusion was terminology, where people were arguing against a technology by naming the concept. This was a missed opportunity for finding common ground and then focusing on the details, where they had different preferred approaches.
[Application components] [Clarifying terminology] My colleagues were contrasting DI with what they were calling "static class trees". I think that expression is quite confusing because what they probably meant was "static object trees," which I think was meant to mean the untestable evil that is "bottom-up-instantiated object graphs." Let's be clear about which kind of static class trees we think are bad. [Not using DI] This is a static class tree rooted at D that does not inject any dependencies. class A {} class B {} class C { A a; B b; C() { this.a = new A(); this.b = new B(); } } class D { A a; C c; D() { this.a = new A(); this.c = new C(); } } D d = new D(); [Using some DI] You might complain that this is pathological because we've created A twice. OK, fine, let's pass it in to C. class A {} class B {} class C { A a; B b; C(A a) { this.a = a; this.b = new B(); } } class D { A a; C c; D() { this.a = new A(); this.c = new C(a); } } D d = new D(); Congratulations! You've injected your first dependency. Now keep going! [Using DI] The following code illustrates a static class tree rooted at D that uses dependency injection for everything. Note that this example uses primary constructors without confusing things too much, shortening the code considerably. All instantiations are under the control of the calling code. class A {} class B {} class C(A a, B b) {} class D(A a, C c) {} A a = new A(); D d = new D(a, new C(a, new B())); [How DI and IOC are related] * IOC is the concept. It stands for "inversion of control", which means that the control over who gets to decide which implementation backs a given interface is no longer with the consumer of the interface but the provider. * DI is a way of implementing IOC. (Usually rounded up to be equivalent.) * An IOC Container services requests for instances based on interface-to-implementation mappings. Looking at the examples above, I think we can all agree that DI is a good thing. 
That is, "dependency injection" and "inversion of control" as concepts are good things.

[What does the IOC container do?]

An IOC container generally consists of two parts:

* A mapping of abstractions to implementations.
* A method to resolve implementations from abstractions.

In .NET, these are two completely separate interfaces, so that you can't register mappings when you should only be using them. You use the IServiceCollection in .NET to register mappings and then use IServiceProvider to request instances. The service provider locates a requested service and constructs an instance where necessary. It recursively locates any parameters to the constructor of a requested service. Obviously, a directly requested service must be registered with a concrete implementation. But every service on which it depends must also be registered with a concrete implementation, recursively until dependencies don't have dependencies of their own.

Using an IOC container carries the following implications.

* ✅ It reduces fragility when constructors are refactored.
* ⚠️ It can make it unclear which constructors are called.
* ✅ It can be helpful for implementing very generalized factories (where you inject the service provider into, say, a "plugin factory").

A white paper I wrote six years ago has an extended example (in Swift, of all things): "Encodo White Papers: DI, IOC and Containers (2019)". As in the examples above, and in the extended one below, most of the steps in the paper do not use a container. You can do DI without a container -- it just gets kind of tedious and wordy. Let the IOC container do the brain-dead stuff for you.

[What's a DAG?]

Some folks might refer to DAGs, which are "directed acyclic graphs".
This is just another way of referring to the graph of objects represented by the composition root, which is the single object you should create in your program's root method -- often called main, but sometimes it's just the main file of your application -- to create your application. It's possible to build a DAG without DI. The first and second examples in the sections above also create DAGs. The reason those DAGs are hard to change or extend is that dependencies are created at lower levels rather than injected. We want a DAG ... but not like that. We want a DAG created with explicit dependencies, as illustrated by the third example. For the most part, it's not really important what you call it, as long as you end up with testable components.

["Refining" your code (separating pure logic from impure)]

Any non-trivial application comprises pure and impure parts. The impure parts are the messy bits that communicate with the unpredictable outside world:

* Reading from the command line
* Reading configuration from a file
* Reading values from the environment
* Reading data from files
* Reading from a database
* Reading user input
* Calling network services
* Etc.

We don't want unpredictability in the application we're testing, so we'll push all of the impure stuff as far out to the edges as possible, leaving a nice, fat pile of pure logic that we can reliably and reproducibly test. The preceding sections have hopefully convinced you that IOC is useful, and that constructor-based DI is a good way of implementing it. We've also discussed the advantages of using an IOC container to improve flexibility and reduce code-duplication. All of these things are going to help achieve our goal of testing as much of the program logic as possible automatically.

[The single-responsibility principle]

Components should not only be pure, but should have a single responsibility. That's the "S" in "SOLID".
For functional languages, parameters serve as dependencies; for C# and many other languages, another natural "injection point" is a constructor. Where injection by parameter mixes parameter types -- that is, which parameters are data and which are tools for the calculation -- a constructor is more clearly a point at which to inject tools. Now that we have a concept -- use IOC to define dependencies as abstractions -- and a mechanism -- use DI via constructor to inject dependencies -- we can write components that address a single responsibility. That is, we have a mechanism for ruthlessly separating concerns. Components will stitch other dependencies together to accomplish their task (i.e., their sole responsibility). If this stitching code becomes too involved, then the act of stitching might be its own task! When each component does only a single thing, it is easier to test its logic in isolation. It is, in many cases, trivial.

[We can't test yet! (We need abstractions.)]

Do we have testable components yet, though? No. Even if we were to use a container with the examples above, we're not done yet! Our goal is to make our logic testable with automated tests. None of the examples above is testable because, although dependencies are injected, they are concrete dependencies. These cannot be replaced with other implementations and thus cannot be mocked away in tests. That is, if the component A above accessed an external service available only in the cloud or when connected to hardware, you cannot test D without having access either to the cloud or hardware because you can only ever pass in A. We want all modules, high and low, to define and depend on abstractions that can be replaced. This is the "D" (the dependency-inversion principle) in "SOLID". A component should receive configuration through an interface rather than either creating it itself or accessing a static, global, concrete instance.
When all components receive all external dependencies as other, injected components, it's extremely easy both to reason about the code and to test it in isolation. So, to be able to write automated tests, our next step will be to inject abstractions rather than concrete implementations. An abstraction defines a narrow interface that makes as few promises as possible while still fulfilling its task. In C#, we typically use interfaces or abstract classes. Interfaces are way better. Just trust me.

[A more concrete example]

Instead of continuing with the toy classes defined above, let's look at how we would make a part of our program logic more testable. Suppose you have the following code:

    class EmailClient
    {
        public void Send(Email email) { ... }
    }

    class SubscriptionManager
    {
        void Notify()
        {
            var client = new EmailClient();
            foreach (var email in _subscriptions.Select(CreateEmail))
            {
                client.Send(email);
            }
        }

        Email CreateEmail(Subscription subscription) { ... }
    }

Now, suppose you'd like to test this code. You can't test it without an email server configured because the EmailClient is hard-coded. If you invert control, though, you can pass that dependency in to the SubscriptionManager. One way to do this is to pass the dependency directly into the method, like this:

    class SubscriptionManager
    {
        void Notify(EmailClient client)
        {
            foreach (var email in _subscriptions.Select(CreateEmail))
            {
                client.Send(email);
            }
        }

        Email CreateEmail(Subscription subscription) { ... }
    }

Is this really solving anything, though? No. The callee is still in control of the type because the type of the parameter is a specific class. The caller has no choice but to pass in an EmailClient, which will try to send mail to an external server over a network. In order to support IOC, the callee needs to abstract its requirement. In C#, you use an interface.

    interface IEmailClient
    {
        void Send(Email email);
    }

    class EmailClient : IEmailClient
    {
        public void Send(Email email) { ... }
    }

    class SubscriptionManager
    {
        void Notify(IEmailClient client)
        {
            foreach (var email in _subscriptions.Select(CreateEmail))
            {
                client.Send(email);
            }
        }

        Email CreateEmail(Subscription subscription) { ... }
    }

We're done. We've implemented inversion of control. The caller now controls the concrete type. We are also using dependency injection but of a very manual kind: the caller is expected to provide the email-sending mechanism. This can be inconvenient and can muddy otherwise legible code because each and every caller has to have a reference to the thing that the SubscriptionManager needs. That is, instead of coupling just the SubscriptionManager to an IEmailClient, we end up coupling any client of the SubscriptionManager as well. Therefore, a common practice is to inject dependencies like this through the constructor.

    class SubscriptionManager
    {
        private readonly IEmailClient _client;

        public SubscriptionManager(IEmailClient client)
        {
            _client = client ?? throw new ArgumentNullException(nameof(client));
        }

        void Notify()
        {
            foreach (var email in _subscriptions.Select(CreateEmail))
            {
                _client.Send(email);
            }
        }

        Email CreateEmail(Subscription subscription) { ... }
    }

The code that calls Notify() no longer has to know anything about the dependency, thus better decoupling the SubscriptionManager interface from its consumers. The SubscriptionManager declares its dependencies in the constructor, which makes good use of that language construct. That is, we're leveraging the language to improve the clarity of our design. At this point, we can still construct the SubscriptionManager manually, passing in the concrete type for IEmailClient, but we can now also consider using an IOC container, as outlined above.

[Another example: Injecting configuration]

Let's take a look at how we can lean on the IOC container to build our application's configuration.

    public class AppSettings : ITimerSettings, IListenerSettings, IPathSettings, IBroadcastSettings, IDashBoardSettings
    {
        /* ... */
    }

At first glance, you may think this is over-engineered, but there's a good reason for it. There is a single object holding all the settings for the app. But each service only needs to know about one part of these settings. That is, it's convenient on the implementation side to have a single object handling all settings, but each service should only be coupled to the settings that it uses. We don't want to increase coupling in the services just because of how the current implementation works. Therefore, we register the single implementation for all of the service-settings interfaces in the IOC container, and each service uses its own settings interface.

[Getting the ball rolling]

For this section, I'm drawing on several older articles I wrote about a framework I used to work on. It's fun that nothing has really changed in the last decade.

* "Encodo’s configuration library for Quino: part III" (2015)
* "API Design: Running an Application (Part I)" (2015)
* "API Design: To Generic or not Generic? (Part II)" (2015)
* "Quino 2: Starting up an application, in detail" (2015)
* "Mini-applications and utilities with Quino" (2016)

So we can see how to construct an application out of components. We see how to stitch them together with very simple rules. We see how we can test those components. But...now we want to run the application. We want it to do the thing that it does. Do we just create a composition root and call ... um ... Run() on it?

    IServiceCollection services = AppTools.CreateServiceCollection();
    IServiceProvider provider = services.BuildServiceProvider();
    IApplication application = provider.GetRequiredService<IApplication>();

    application.Run();

Are we cool? Kind of. Like, that works just fine. The articles referenced above provide a lot more background on providing exception-handling, standard logging, command-line support, etc. But the code above is what we're shooting for, for real applications. What's a real application? Let's flesh this out a bit.
* How many applications do we have?
* What kind of application are we running?
* What even is an application?

Any solution is going to have at least two applications: whatever the real application is, and one or more test runners. An application is any way of executing part or all of your program logic.

["Real" applications have event loops]

A real application might be a console, a GUI, or a server application. These applications have one thing in common: they contain one or more event loops that react to external input. Most applications have an event loop.

* A console application that "watches" for changes in the file system, updating other files based on that. It exits when the user issues a special command like Ctrl + C.
* A web server that "listens" on specific ports, returning responses to requests. It exits when the user issues a special command like Ctrl + C or when the system terminates the service.
* A GUI that "responds" to user input -- mouse and keyboard events -- and runs until the user issues a special command to exit.

[Run-once applications]

Some applications don't have an event loop. They run the parameters through their flowcharts one time and then exit automatically.

* A console application or script that processes a single set of command-line parameters, like processing a file and producing a report.
* A test runner that executes part of the logic in your real application.

[Application "actions"]

The example above doesn't show any detail about what the IApplication does when it runs. As the referenced articles show in more detail, "getting the whole ball rolling" in a nontrivial application always involves several "actions" to execute during "startup". That is, the application is not just a service collection, but also a list of startup actions and a list of shutdown actions. Each action is created by the IOC container, so it can have all of the services it needs injected into it.
The basic loop in Application.Run() is something like this:

    foreach (var action in _startupActions)
    {
        action.Execute();
    }

The actual implementation ended up being more complicated than this, as noted above, to accommodate general error-handling, async startup actions, debugging comfort, and to support re-running the application, e.g., when it showed command-line help, or when it needed to run a schema-migration for a database. See the linked article for more information.

[I heard you like IOC containers...]

Another common problem with IOC containers is: how can you dynamically configure the IOC? What if you want to use an AmazonS3Provider by default, but allow a command-line parameter or configuration file to enable an FTPProvider instead? DI and IOC are awesome, so we want to use them everywhere, right? But we can't, can we? As soon as you get a service provider, you can no longer modify it. As soon as we request the IApplication service above, it's all over for modifying service registrations. I've wrestled with this a lot in the past. The most relevant article is linked above: "Quino 2: Starting up an application, in detail". Basically, the answer is to use two IOCs.

Bootstrap IOC: The first IOC is much smaller and contains registrations for services needed to configure the Main IOC (e.g., configuration-loader, command-line-reader, fs-location-resolver, etc.). These registrations are necessarily a small core of services that cannot be changed by configuration (files, command-line parameters, database values, etc.). That keeps things simple.

Main IOC: Includes all registrations from the Bootstrap IOC, plus overrides that came out of the configuration, plus anything else needed for the main app.

As noted above, an application's startup and shutdown are lists of actions (discussed in "Encodo’s configuration library for Quino: part III"). Specifically, there are actions to execute during:

* the bootstrap phase,
* the main phase,
* and shutdown.
So, the application startup kind of looks like this:

* Configure services and actions for the Bootstrap IOC and Main IOC.
    * Any registration in the Bootstrap IOC is made in the Main IOC as well.
    * Crucially, singletons in the Main IOC are the same as those in the Bootstrap IOC.
* Seal the Bootstrap IOC (i.e., get the service provider from the service collection).
* Execute application-startup actions using the Bootstrap IOC.
    * The first few actions will be stuff like "read command line", "read configuration", etc.
    * These might alter the registrations in the Main IOC and might add or modify actions to execute. That's OK. It's not "sealed" yet.
    * Any attempt to alter a registration in the Bootstrap IOC results in an error.
    * Modifying an action in the list before the app's current position in that list will have no effect.
* At some point, the "bootstrap" actions are finished, and an action executes that "seals" the Main IOC from modification.
* Now we're in the "classic" app startup.
    * Run the main actions.
    * Run the event loop or application logic (e.g., fixed handling for command-line parameters).
    * Run the shutdown actions.

There's more documentation but it’s no longer available because Encodo has taken down all public documentation … and we never published the source code as open source. I'm working from memory and my existing articles. 🤷 But that's the general gist of it. There are clean solutions to anything that might come up. For example, if you need a more "heavyweight" service during the bootstrap -- like a database, which you also use in the main application, but which you want to keep configurable -- consider making an interface like IBootstrapConfigurationDatabase or something like that, which will be its own singleton and not even available in the main application phase.
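The two-container startup described above can be sketched in a few lines. This is not Quino's implementation (which, as noted, is no longer public) and it does not use the .NET service collection; `TinyContainer` is a deliberately tiny, hypothetical stand-in for an IOC container, and `ICommandLine`, `IFileProvider`, the `--ftp` switch, and `TwoPhaseStartup` are all names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an IOC container: maps a type to a factory and can be sealed.
public class TinyContainer
{
    private readonly Dictionary<Type, Func<object>> _factories = new Dictionary<Type, Func<object>>();

    public bool IsSealed { get; private set; }

    public void Register<T>(Func<T> factory) where T : class
    {
        if (IsSealed) throw new InvalidOperationException("The container is sealed.");
        _factories[typeof(T)] = () => factory();
    }

    public void Seal() => IsSealed = true;

    public T Resolve<T>() where T : class
    {
        if (!IsSealed) throw new InvalidOperationException("Seal the container before resolving.");
        return (T)_factories[typeof(T)]();
    }
}

public interface ICommandLine { IReadOnlyList<string> Arguments { get; } }

public class CommandLine : ICommandLine
{
    public CommandLine(IReadOnlyList<string> arguments) { Arguments = arguments; }
    public IReadOnlyList<string> Arguments { get; }
}

public interface IFileProvider { string Name { get; } }
public class AmazonS3Provider : IFileProvider { public string Name => "s3"; }
public class FtpProvider : IFileProvider { public string Name => "ftp"; }

public static class TwoPhaseStartup
{
    public static string Run(string[] args)
    {
        var bootstrap = new TinyContainer();
        var main = new TinyContainer();

        // Configure: bootstrap registrations are mirrored into main, sharing the same singleton.
        var commandLine = new CommandLine(args);
        bootstrap.Register<ICommandLine>(() => commandLine);
        main.Register<ICommandLine>(() => commandLine);
        main.Register<IFileProvider>(() => new AmazonS3Provider()); // the default

        // Seal the bootstrap container; only now can services be resolved from it.
        bootstrap.Seal();

        // Bootstrap actions may still change registrations in the (unsealed) main container.
        var bootstrapActions = new List<Action>
        {
            () =>
            {
                if (bootstrap.Resolve<ICommandLine>().Arguments.Contains("--ftp"))
                {
                    main.Register<IFileProvider>(() => new FtpProvider());
                }
            },
        };

        foreach (var action in bootstrapActions)
        {
            action();
        }

        // Seal the main container; from here on, it's the "classic" startup.
        main.Seal();

        return main.Resolve<IFileProvider>().Name;
    }
}
```

In this sketch, `TwoPhaseStartup.Run(new[] { "--ftp" })` resolves the FTP provider and `TwoPhaseStartup.Run(new string[0])` falls back to the default, while any attempt to register into a sealed container throws. A real implementation would also carry the lists of main and shutdown actions described above.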
-------------------------------------------------------------------------------- [1] This isn't the first time I've taken a run at this topic, although I only recently remembered that I'd written "Why use an IOC? (hint: testing)" in April 2019.

A review of 35 Microsoft Ignite 2025 dotnet videos

2025-12-21T22:47:15+01:00

Published by marco on 21. Dec 2025 22:47:15 ------------------------------------------------------------------------ I watched/listened to 35 videos, each between 20 and 30 minutes long, and each listed below. I've grouped them but retained the order in which I watched them. I've left the notes mostly as I wrote them, which is kind of stream-of-consciousness, kind of snarky. Some videos that I didn't like got a lot of notes, some videos I liked got fewer notes. It might seem like I hated the video from my snarky notes but I still ended up rating some of them as 🆗, which means that I thought that either the video was well-presented even though the material doesn't have much broad appeal or that I thought the material was important even though the presentation wasn't that great. Don't read too much into it. The notes are just to help me remember what I learned from the videos -- and can hopefully give you a bit of an idea of what the video is about and whether you want to spend 20--30 minutes on it. YMMV. 
* "C#, F#, & .NET" <#dotnet>
  * "🆗 Nullable Reference Types: It's Actually About Non-Nullable Reference Types" <#Nullable>
  * "✅ Performance Improvements in .NET 10" <#Performance>
  * "🆗 Community Toolkit Roundup" <#Community>
  * "✅ C# Features you need Habits you want" <#Features>
  * "✅ Smatterings of F#" <#Smatterings>
  * "✅ Rx.NET status and plans" <#Rx>
* "Migration" <#Migration>
  * "⛔ .NET Scores "A Perfect 10"" <#Scores>
  * "✅ .NET Framework 4.8 to .NET 9 Step by Step" <#Framework>
  * "⛔️ Modernizing .NET Applications for the Cloud" <#Modernizing>
* "Tools" <#Tools>
  * "🆗 Visual Studio Debugger: Advanced Techniques" <#Debugger>
  * "🆗 New dotnet test Experience with Microsoft.Testing.Platform" <#Test>
  * "🆗 What's New in NuGet" <#NuGet>
  * "🆗 Real-World .NET Profiling with Visual Studio" <#Profiling>
* "UIs" <#UIs>
  * "⛔ What's New in Windows Forms" <#Forms>
  * "⛔ Modern Windows Development with .NET" <#Windows>
  * "✅ TUIs Are Back (Although They Never Left): Creating Modern CLI Apps in .NET." <#TUIs>
  * "🆗 Ship Faster with .NET MAUI: Real-World Pitfalls and How to Nuke Them" <#Ship>
  * "✅ Building Rock-Solid Avalonia Apps A Guide to Headless Testing with AI Assistance" <#Headless>
  * "✅ What's New in .NET MAUI" <#Maui>
  * "✅ Build better web apps with Blazor in .NET 10" <#Blazor>
* "Aspire" <#Aspire>
  * "✅ Taking .NET out of .NET Aspire - working with non-.NET applications" <#Aspire>
  * "⛔ From Architecture to Docs: .NET Aspire Documented with Copilot" <#Architecure>
  * "✅ Windows 365 Meets Aspire - Supercharging Multi-Repo Microservice Productivity" <#Supercharging>
  * "✅ Deep Dive: Extending and Customizing Aspire" <#DeepDive>
  * "✅ Aspire Unplugged with David and Maddy" <#Maddy>
* "Cloud" <#Cloud>
  * "🆗 What's New in Containers for .NET 10" <#Containers>
  * "✅ What's new in Azure App Service for .NET developers" <#Service>
  * "🆗 Carbon Aware Computing - Using .NET Open Source libraries for more sustainable applications" <#Carbon>
* "AI" <#AI>
  * "⛔️ Architecting an AI-Powered Sales Dashboard with .NET MAUI and Azure OpenAI" <#Dashboard>
  * "🆗 One Question, One Answer: Designing Seamless AI Agents with C#" <#Seamless>
  * "✅ Overcoming the limitations when using AI" <#Overcoming>
* "Everything else" <#Everything else>
  * "🆗 Modernizing a 17th Century Italian-English Dictionary" <#Italian>
  * "🆗 Going Passwordless - A Practical Guide to Passkeys in ASP.NET Core" <#Passkeys>
  * "✅ GitHub Actions DevOps Pipelines as Code using C# and Cake SDK" <#Pipelines>
  * "🆗 If .NET brewed beer..." <#Beer>

[image]

[C#, F#, & .NET]

"🆗 Nullable Reference Types: It's Actually About Non-Nullable Reference Types" by dotnet | Shawn Wildermuth

This is a decent, thorough -- though somewhat slow -- introduction to non-nullable reference types in .NET/C# (which have been available since .NET Core 3.x / C# 8). If you already know about them, then there's nothing new here.
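For anyone who hasn't used the feature, here is a minimal sketch of what non-nullable reference types look like in practice. It is not taken from the video; NullableDemo, FindEmail, and EmailLength are hypothetical names made up for the example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class NullableDemo
{
    private static readonly Dictionary<string, string> Emails = new()
    {
        ["ada"] = "ada@example.com",
    };

    // The `string?` return type tells callers that null is a possible result.
    // The `string` parameter, by contrast, is declared as never null.
    public static string? FindEmail(string userName) =>
        Emails.TryGetValue(userName, out var email) ? email : null;

    public static int EmailLength(string userName)
    {
        string? email = FindEmail(userName);

        // Writing `return email.Length;` here would produce
        // warning CS8602: Dereference of a possibly null reference.
        return email?.Length ?? 0;
    }
}
```

The interesting part is the plain `string` rather than the `string?`: once nullable analysis is enabled, an unannotated reference type documents that the value should never be null, and the compiler flags call sites that might violate that.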
"✅ Performance Improvements in .NET 10" by dotnet | Stephen Toub An in-depth examination of performance improvements in .NET 10. He explains how the various compilers (AOT, JIT, etc.) have been optimized to eliminate allocations and just generally optimized for performance. A reduction in allocations is a multi-win: the performance is better because the allocator isn't working, the memory usage has dropped, and the garbage collector also works less. He compares .NET Framework 4.8 vs. .NET 9 vs. .NET 10. The most impressive improvements are from 4.8 to 9.0, of course, but he highlights some interesting places where .NET 10 eclipses .NET 9, where .NET 9 had already eclipsed .NET Framework 4.8. The last example shows how regular expressions have been continually optimized so that an operation that took 24ms in .NET Framework 4.8 was improved by about 12x to 2.5ms in .NET 9 but has been further improved by about 62,500x to about 40ns in .NET 10. For more coverage, see "Toub’s 232-page tour-de-force on performance in .NET 10" . "🆗 Community Toolkit Roundup" by dotnet | Gerald Versluis, SergioPedri, Michael Hawker They spent some time touting the benefits of the toolkits. * There is an introduction to improvements to the MVVM toolkit. * There is also a toolkit for Aspire, which is interesting. * Then there's the Maui MVVM toolkit, which adds a bunch of media support. * The Windows toolkit added a lot of fixes and controls for WinUI3. They note that a lot of stuff incubates in the toolkits and is often migrated to the official libraries after a while. "✅ C# Features you need Habits you want" by dotnet | Bill Wagner He introduces an existing "magic 8-ball" program, demonstrating its functionality. He doesn't show any tests, though. That does not stop him from refactoring the app to take advantage of "newer" C# features. 
I write it in quotes because, while some of the features he shows aren't necessarily new, it's good to have a video that shows how you should be upgrading your types when you touch old code, to take advantage of better type-checking, to convert potential runtime errors to compile-time errors.

* non-nullable references.
* required and init properties.
* The field element for properties, which is new to C# 14.
* The System.Threading.Lock type instead of System.Object, which allows the compiler to generate more efficient code, all without any change in behavior of the application.
* Using verbatim strings and the newer multi-line verbatim strings.
* Collection expressions. (He explains how the compiler can optimize the capacity for a collection expression, where it cannot for a direct instantiation of new List().)
* The spread operator. (He uses this to replace the explicit call to ToArray(). Again, it's easier to read and the compiler has more optimization opportunities.)
* The with keyword. (He explains how this allows you to more easily work with immutable types and structures.)
* Using a readonly struct (This sets immutability, which also allows much better optimization, such as lowering copying/allocation when passing data through function/stack boundaries.)

He optimizes his pattern-matching, where the compiler helps a lot to figure out exactly how much information is needed in the pattern. If a case can't be reached, it's an error. He removes the lower-bound check on several cases because they're not needed. If you remove too much, the compiler tells you.

AnswerType type = randomIndex switch
{
  >= 0 and <= 5 => AnswerType.Affirmation,
  >= 6 and <= 9 => AnswerType.Encouraging,
  >= 10 and <= 13 => AnswerType.Uncertain,
  >= 14 and <= 16 => AnswerType.Doubtful,
  >= 17 and <= 18 => AnswerType.Rejection,
  19 => AnswerType.Redo,
  _ => AnswerType.Uncertain
};

The following is equivalent:

AnswerType type = randomIndex switch
{
  <= 5 => AnswerType.Affirmation,
  <= 9 => AnswerType.Encouraging,
  <= 13 => AnswerType.Uncertain,
  <= 16 => AnswerType.Doubtful,
  <= 18 => AnswerType.Rejection,
  19 => AnswerType.Redo,
  _ => AnswerType.Uncertain
};

If you were to change the order of the cases, putting the <= 13 case at the top, the compiler reports that the <= 5 and <= 9 cases can never be matched.

AnswerType type = randomIndex switch
{
  <= 13 => AnswerType.Uncertain,
  <= 5 => AnswerType.Affirmation, // Compile error.
  <= 9 => AnswerType.Encouraging, // Compile error.
  <= 16 => AnswerType.Doubtful,
  <= 18 => AnswerType.Rejection,
  19 => AnswerType.Redo,
  _ => AnswerType.Uncertain
};

"✅ Smatterings of F#" by dotnet | Matthew Watt

The first five minutes is an introduction to the programmer himself, which was a bit odd but it's fine. It just might not be very interesting if you're looking for technical guidance. He moves on to an introduction to his blog, which he wrote with F# on the back-end, and React for the front-end. The comments section that he built uses Elmish, which is a library for emulating the highly functional Elm pattern of building code. The whole web site is functional from top to bottom so it's kind of neat to see how that works for a real-world application. He finishes up with five minutes on contributing to open-source code. Again, a nice touch.

"✅ Rx.NET status and plans" by dotnet | Ian Griffiths

He discusses some examples of some new methods in the 6.1 release. These are quite nice, and the concept of RX is just neat, even though I've only ever played with it rather than used it in production. He discusses in detail how some of the new handling for exceptions "bridges between RX's world of observable streams and more ordinary async programming."
In the next section, he discusses how the RX project had to do some extra work because System.Linq.Async is no longer their responsibility. It's now in the standard library. But they had to make sure that their version gets deprecated in favor of the new one. As a library developer, I think that this detail is fascinating, because you can see that the tools available for managing changing APIs and dependencies have gotten quite good. Finally, he discusses the feature set for Rx.NET 7.0. The functionality won't change much; it's mostly library and platform-compatibility. There is a fix for the "bloat" issue, which only affects projects that target UI applications on Windows. It turns out that a design decision in version 4.0 left self-contained deployments with implicit references to UI frameworks, which add dozens of megabytes needlessly. The fix causes a compile error, for which they added an analyzer that nicely explains the fix to apply. This is a neat example of how to help consumers of your library get around compiler errors, which we didn't have available before it was so easy to write and include custom analyzers. Previously, you'd have had to jump through more hoops to avoid giving upgraders compiler errors that weren't warnings in the previous version. Now, if something like that is unavoidable, then you can still provide guidance with a diagnostic. I thought it was a very interesting presentation but I'm a library and framework geek. Your mileage may vary.

[Migration]

"⛔ .NET Scores "A Perfect 10"" by dotnet | Shaun Walker

He describes a successful migration of a large Blazor application to .NET 10 (the open-source "Oqtane"), presumably from .NET 8. This is OK, but he just describes what he did without showing it. Once he gets to the product, he actually ends up demoing the Oqtane software -- and Blazor's capabilities -- more than he showed any details about what migrating to .NET 10 entailed, apart from a few sentences in the slides.
Instead, he spent a bunch of time discussing features introduced by .NET 10 that Oqtane ended up using. That is, instead of covering the migration itself, he discussed the extensions to the product that were enabled by a move to .NET 10. "✅ .NET Framework 4.8 to .NET 9 Step by Step" by dotnet | Michael Christiansen He recommends modernizing the app before retargeting it. This means: * Updating to use the SDK-style project format. * Using package references. * Using the Microsoft.Extensions.* packages, like dependency injection, configuration, logging, and hosting, all of which target the .NET Standard API surface and are therefore available for .NET Framework and .NET. After that, he recommends side-by-side versions of libraries so that you can split them up better without affecting the existing, working version of the code. One of the projects was a tougher nut to crack: it was an old-school ASP.NET application, where the patterns had completely changed in .NET 9 and 10. For that, he managed to have Claude Code do about 90% of the conversion and finished it up manually. The process was very manual -- "spec-driven development" and "very hands-on" -- but Claude Code was quite helpful once he figured out how to steer it properly. If you have a .NET Framework application, then this is a great video. He really has a lot of good advice for how to avoid certain pitfalls (e.g., platform-specific code, like Windows Services). "⛔️ Modernizing .NET Applications for the Cloud" by dotnet | Matt Soucoup Was there ever going to be a chance that he wouldn't start off with telling you that Copilot can do all of the tedious work for you? No. No, there wasn't. Was he ever going to tell you to use your mad skillz with your IDE to apply a ton of changes automatically using tools and refactoring? No, he wasn't. Like the NuGet guy, he's going to get copilot to spend ten minutes running a NuGet one-liner. So like how cool is that? 
Not only do you have a super-old application that you never upgrade but now you don't even have to understand what you're migrating to! I love how he says that going from .NET Framework to .NET 10 is just soooo easy. You know, don't make any stops along the way, just take the express train. What could go wrong? Anyway ... he shows how to install the Copilot modernization tools, then opens a .NET Framework IIS-based project. Once again, we're watching a guy watch a Copilot chat window write a ton of text that he barely reads. He asks it to explain the security problems, as if this is something that you should do. Shouldn't you inform yourself about the packages? Shouldn't you just upgrade the old things? Do you really need the explanation? And, once again, he says that "you're giving up the reins to Copilot," but, like everyone else, just assumes that everything that Copilot returns is bulletproof. This is still not my experience, to this very day. Back to the update plan: I see the attraction, I really do. It's very detailed ... but who is it for? Is he keeping this upgrade plan in the repository? How much control does the plan actually give him? Doesn't the commit that results just show the changes? He says it "took about an hour to upgrade". 😱 Oh, hell no. It just works for an hour for what he calls "a simple app", using God knows how many tokens, and then you still have to review everything? Why not just do it yourself? He really needs to show the diffs. Show us the diffs, bro. I don't think he's going to show us the diffs. He's just going to show us how he has to coddle the tool, which is basically making black-box changes. "That's just the way it is, that's the way it is working with AI-assisted dev tooling." He didn't show the diffs. I have no idea what this tool did for him on this project.
This tool is for people who would have no idea how to go about upgrading a solution on their own, who can use a chat window but run screaming from a command-line upgrade tool. At the very end, he runs the upgraded version but there are warnings in the build that two packages were restored using .NETFramework,Version=4.6.1 (the worst .NET Framework version ever), which strongly indicates that, even after an hour of f@&king around, the solution still references .NET Framework. "The amount of coding that I had to do was basically zero. All I had to do was supervise things." Well done, buddy. You still have old packages and weird references. Check your warnings. I wouldn't touch this tooling with a ten-foot pole.

As I wrote in a comment on the video,

This kind of workflow doesn't translate well to a nearly half-hour-long video. There's nothing to see. He ran a command or two. He didn't even show the diffs at the end, to show us what the tool actually did. You could still see some odd warnings about .NET Framework in the output that he had to pretend weren't there. He did a good job FWIW but a lot of this video is watching the Copilot chat window scroll by. The explanation is good but it would have been better as a blog post.

[Tools]

"🆗 Visual Studio Debugger: Advanced Techniques" by dotnet | Harshada Hole

She takes us through the various live and inline indicators in the debugger, with predictive evaluation, including highlighting of the particular part of a condition that caused it to evaluate to true or false. The debugger has moved much closer to Rider's, showing a lot of calculated values in the whitespace next to code, so you can see return values and calculated values without having to look in the variables or watches panes. This also allows you to use more concise coding while still being able to see interim values while debugging. When showing how to analyze exceptions, she showed how to dig down into the call stack to find out why something's null.
She used right-clicking for everything, which was already slower than it needed to be...but then she decided to ask Copilot. The "quick" analysis took 30 seconds and then she had to ask it to do a "deep analysis", whereupon it found the error that she would have probably found manually much more quickly. Maybe a more complex example wouldn't have had such an obvious fix. Most people suck at debugging and don't really understand their code, so probably Copilot is better at this than they are (or ever will be). So who am I to stand in the way of progress? I'm just John Henry. I cannot stress enough how annoying it is to have to watch people "ask Copilot" and then we all gather around the chat-window output like it's the word of God. It's too bad, because the first few minutes of this video showed interesting deterministic tools before devolving into an orgy of just clicking that stupid little Copilot icon everywhere and then watching the completely useless and always-disregarded text in the chat windows scroll by. I cannot recall any one of these presenters actually reading any of this text. No-one cares. These tools are really trying to reach out to and onboard completely unskilled developers to an unprecedented degree. These kinds of presentations make me sad. It's fine for what it is, but I don't think that this is the final form of software-development. "🆗 New dotnet test Experience with Microsoft.Testing.Platform" by dotnet | Jakub Jares This is a demo video, with the presenter working in Visual Studio Code but only from the command line. He shows how the console UI has been considerably improved. He also gets into new analyzers, assertions, and attributes. The improvement to the assertions is that they start analyzing the expression tree, which I find to be more fragile than the NUnit approach, which uses an explicit API to declare the assertion, with no magic. 
The attributes are for extending the framework, e.g., for determining when and in which environments tests will run. Finally, he shows how the MSTest runner has massively improved execution speed, not in this version (4.0), but already in the 3.0 version. The video is OK but the product is quite exciting, as it is a massive improvement over the previous test-runner.

"🆗 What's New in NuGet" by dotnet | Sean Iyer & Nikolche Kolev

He starts off by threatening us that he will show a bunch of AI stuff. First up: tell us to use the MCP server for NuGet. He uses it to show how to get Copilot to update your dependencies when you have a vulnerability. This is not a hard task and, honestly, you should be aware enough of your dependencies to solve them yourself. It's nice that the warnings are so good now that you can get a tool to fix up all of the things that people never could figure out on their own. Dude, since assembly-binding redirects were fixed in .NET, there's no problem anymore. I don't understand how it's secure to let a hallucinating machine pick your dependencies for you. Now you don't have to understand anything! He spends a bunch of time talking about how to avoid getting outdated implementations that aren't in the training data using an MCP. Or you could, you know, just update to the latest version. I don't know why they're making everything so complicated. In the second half, he talks about security improvements but then just starts talking about how Copilot did all of his work for him. So, like, it's secure but also an only partially reliable machine made all of the changes and he didn't seem to look at them. Nikolche shows how to eliminate vulnerabilities without Copilot (thank God) and shows how to use the pruning option with the audit command to remove unneeded dependencies that might show up in audits unnecessarily.
"🆗 Real-World .NET Profiling with Visual Studio" by dotnet | Nik Karpinsky The first four minutes is a discussion of what profiling even is, with a nice workflow diagram for noobs. Next, he grabs the NLog open-source repository and opens the solution in Visual Studio. "Now I want to talk to the profiler agent." Oh no. He has the agent build a benchmark for a given class. The build fails, though because the solution uses advanced trimming options. Of course, he can figure this out, but if a developer who needs an agent to write benchmarks gets this failure, their day is already over. Copilot is not going to figure something like this out, either. He goes on to generate more code but it's very clear that the agent is a support tool because he brings a lot of knowhow to the table. For example, he sees immediately that the agent's proposed solution never cleared the StringBuilder, which would skew the results toward better initial performance because of thrashing caused by reallocation that affects only subsequent runs. Of course, if you don't notice this, then you have a shit benchmark that you will trust unreservedly because we've all long since stopped doubting the output of our new overlords, LLM agents. What I don't understand is why he keeps having the agent build and run the benchmarks. There are hotkeys for this. Is the future of Visual Studio just a chat interface? Who is the target audience here? Anyway, his new benchmark finds a problem with Boolean boxing issue and the profiler agent jumps on it, optimizing the code. He shows how tedious the stack trace would be to investigate -- which is not tedious at all because he clicks through it quickly -- but we're also supposed to ignore how long that little progress circle next to "Analyzing performance trace" in the agent window is spinning. It takes long minutes while the developer has long since explained what the problem is and would likely have fixed it. 
The agent is really there for people who wouldn't have understood the problem illustrated by the profiling trace and who wouldn't be capable of judging the proposed solution. The solution is wrong. He characterizes it as "the first time I ran it, it came up with a better solution," but that's a cop-out because the solution shown in the video doesn't compile. He begs the agent to return a boolean instead of a string which, like, duh, because the whole problem was with boxing booleans. But, sure, let's run the profiler by writing "run the benchmark again" in the chat window instead of hitting a f@&king hotkey. F@&k, people are absolutely in a cult about these agents! "What's really cool here is that the profiler agent was able to have a, um, successful impact on this code and help me contribute to this repository in a meaningful way when I don't really know anything about this repository." WTF BRO. You just made a video showing non-developers how to pad their GitHub commit histories with performance-improvement PRs that they don't understand (and that might not even work) by spamming open-source projects. I was more excited about this one, and I think it would have worked better without the agent, but he wanted to show the agent. [UIs] "⛔ What's New in Windows Forms" by dotnet | Mary McGalla & Klaus Loeffelmann The two presenters use a giant prompt with Copilot to build a .NET 10 Winforms app to show slides like PowerPoint. As usual, they feed this prompt in to the "planner" to get a more agent-friendly plan that they'll send to the agent. They had to jabber quite a bit because the tool takes a long time to run. The tool generates a list of steps in Markdown with checkboxes and a progress bar that it regenerates as it works. OK? I guess? Is Markdown a UI target now? WTF? Like, how shitty are your WPF or HTML skills when you're hacking a new UI library on top of a Markdown renderer? Who thought that this was a good idea? 
I guess the last state of the UI is preserved and can be fed back in to the planner or agent? It seems to have worked, though, ... except that you can't go to the next slide. Oh, no, wait, cursor keys are supported. As usual, they didn't show any of the content in the gigantic prompt that they wrote. These two fools seem to have no idea how the tool that they spent 25 minutes using works. Also, they barely talk about Winforms. The few things that they mentioned are better covered in the "What's new in Windows Forms for .NET 10" release notes. This video sucked unless you enjoy watching people watch Visual Studio build code for them. "⛔ Modern Windows Development with .NET" by dotnet | Roy & Michael Hawker The two presenters discuss how much the community has done for WinUI3 development, with a huge style guide and much-better integration with the common MVVM toolkit also used in WPF and Maui. The WinUI3 styles can also be used with WPF, so that's neat, I guess. They didn't mention Maui. They talked about open-sourcing WinUI for quite a while. They also pretty much watched Copilot do stuff like generating UI chunks from text examples, converting to JSON then to a view (I think). This was all running locally, on the NPU (Neural Processing Unit) rather than using a model in the cloud, which is kind of nice. However, it's amazing how happy they are to demonstrate brute-forcing regeneration of a tool that generates a JSON then view from text, again and again and again. No-one asks at all anymore whether the generated code is the same, whether it works, whether there are tests to verify it, whether it makes sense to generate umpteen copies, whether the time couldn't be better spent on just doing it yourself, etc. etc. Of course, they never, ever show what was generated or give any indication that they have reviewed the code or consider it necessary to do so. Just run it once, look at it for a second, commit, push, and make a pull request. Hey everyone! 
We've all been wasting our time all of these years with structured development practices. With this tool that's right 70% of the time, you can skip all of that. Look at that UI go! Watch it flicker as it generates a whole bunch of stuff you're never even going to bother looking at until you get a call at 03:00 in the morning because everything blew up. Just kidding. No-one's going to call you. They're going to call other people who were stupid enough to take jobs on an on-call team.

"✅ TUIs Are Back (Although They Never Left): Creating Modern CLI Apps in .NET." by dotnet | Andres Pineda

He goes through the history of UIs for the first third of the video, which is kind of interesting and provides decent context for why we might want a TUI. In the second third, he presents the Spectre.Console framework for building TUIs. The initial version uses an in-memory database, then an SQLite database, and then an external database. It uses dependency injection and the by-now standard application startup. He also discusses Terminal.Gui, which runs on all supported platforms and has Miguel de Icaza as a contributor. This one creates apps that kind of look like the old Borland DOS-mode applications. You build them with MVVM (supports CommunityToolkit.Mvvm) and generated views (not XAML) that you build with a text-console-based visual designer. You kind of have to see it to believe it. It's really pretty cool. If you want to use XAML, though, you can use RazorConsole with Spectre.Console to build UIs with that instead.

"🆗 Ship Faster with .NET MAUI: Real-World Pitfalls and How to Nuke Them" by dotnet | Paul Usher

A lot of the pitfalls he discusses are relatively general: resolution, distribution, deployment, staying up to date with security, etc. Dude recommends Console.WriteLine() as an important debugging tool. Ok, buddy.
On the other hand, it's nice to see someone who shows his whole setup in detail, which, even though some of his tools are outdated (e.g., he uses CodeRush!), is nice to see, especially if you really have no idea how to get started. He goes on to discussing app-store-related problems and how to overcome some of them, which is also quite helpful, as this is a part of the process that few people talk about. It's not particularly enlightening but it's good to discuss, as you can't deploy an app without getting on an app store. Another pitfall is dealing with lifecycle changes and interruptions: is the app in the foreground? Is the device asleep? Is there network connectivity? Is the battery low? Is the app in sleep mode? When do you perform which initialization? Which expectations can you have about connectivity? Everything is asynchronous and the situation outside the app changes all the time. You have to watch all of the events and respond appropriately. He advises using the emulator or simulator for a tighter feedback loop but there's no way to avoid testing on a target device -- or multiple target devices, as their behavior varies as well. He mentions that two recent Android devices (a Pixel and a Samsung) had different behavior in crucial areas affecting his apps.

"✅ Building Rock-Solid Avalonia Apps A Guide to Headless Testing with AI Assistance" by dotnet | Dong Bin

Whereas Avalonia and Maui both support iOS, Android, Windows, and MacOS targets, Avalonia also supports Linux targets, including Linux running on embedded systems. The target that Dong addresses, though, is the headless mode, which is used for end-to-end UI testing. Avalonia's rendering is completely decoupled from the platform, with the headless platform being just another target, like Windows or Mac. God bless him for actually showing us how to write tests in the code editor. He's using Rider on Windows. His code uses ObservableProperty from the Community Toolkit. This is a good demo.
In an advanced demo, he shows how to use "screenshot" rendering, even in headless mode. He also shows how to test controls for performance, both in speed and memory-usage, which is very important for building controls for highly constrained environments like embedded systems. He points out that headless testing won't help you with testing native features, actual visual look-&-feel. Instead, you can use the Skia renderer to approximate tests like that. Finally, he actually introduces a usage of AI that makes sense to me: helping to write all of the unit, integrated, headless, and render tests. He explains how the task is focused, verifiable, and already has a lot of context to keep the generated code on the right path. "✅ What's New in .NET MAUI" by dotnet | David Ortinau This one starts with an overview of the project. SyncFusion contributes heavily, from dozens of PRs to providing over 30 controls as open-source controls. They're also working much more closely with the Uno platform, which is ostensibly a competing framework but seems to be merging or moving closer to Maui. They're working on NativeAOT for Android, SkiaSharp improvements (it's their main rendering library), as well as WebAssembly multi-threading (that's another target that they have that Maui does not, unless you count Blazor integration). His demonstration is kind of neat: he shows a Maui app with SyncFusion controls and Community Toolkit, all running within an Uno Platform App. He shows it running in an Android emulator. This kind of support may extend Maui's reach without having to replicate everything. For example, the WebAssembly target Uno offers works seamlessly with .NET Maui apps. He demos a NuGet browser that was written for desktop, but now running in a browser. Next up is a very prosaic but very welcome addition: global usings/namespace declarations for XAML files. You no longer need to use prefixes and you no longer have a clump of stuff at the top of the file. 
On top of that, they also now support implicit namespaces (the feature is in preview). Now a XAML file for Maui can look like this: This is really nice. There's also XAML source-generation now. This increases speed of debugging and reduces the differences between the debug and release builds massively. This is an opt-in feature but it sounds great. You can debug the generated code instead of relying on a bunch of reflection. Debugging uses 99% less memory and view-inflation is now 1000% faster (10x). Overall app performance is 25% faster with 30% less memory usage. He talks about support for "safe edges" (UI integration with mobile form factors) and improved support for hybrid apps. He briefly discusses Aspire orchestration, which is completely integrated. This is especially interesting with hybrid solutions because the front-end actually has two parts that need to be coordinated. Doing this with Aspire is interesting. You can use the dashboard to inspect telemetry because the standard rendering is integrated as well. This telemetry is also available on the command line if you don't use Aspire. "✅ Build better web apps with Blazor in .NET 10" by dotnet | Daniel Roth The author talks a bit about large-scale apps in the U.S. and Europe that are built with Maui and, specifically, Maui Blazor. His presentation in this part is quite stilted and seems to have been massaged by the PR department. Like, he says that .NET Aspire makes you "cloud-ready," which, if you've watched the Aspire talks, is no longer the focus of Aspire, and hasn't been for a while. Deploying to the cloud is possible and well-supported, but it's not the main use case. He does demo some code, though. He shows passkey-integration for Blazor apps. I love how people watch this and think, "this is great; so much easier to log in," whereas I watch it and have just watched someone log in using a 4-digit PIN rather than a safe password. How is this better? 
It's similar to using a password manager on your device that's always logged in, though. But passkeys are really replicating a bunch of the convenience that you already had with a password manager. Next up is better integration for telemetry, which all appears in the Aspire dashboard. There are also advanced diagnostics, like being able to extract memory dumps and low-level runtime metrics from a running WASM Blazor app using a JavaScript command. The dottrace file can be easily converted to a gcdump file using the dotnet command and can then be analyzed in Visual Studio. This got very technical very quickly and I am here for it. Blazor is also about 20% faster in .NET 10. For developers:

* Hot Reload is better; he demonstrates an over 10x speed improvement, from 38s to about 3s.
* Full-graph form-validation, so complex forms no longer need custom validation.
* Automated browser/end-to-end testing using WebApplicationFactory but then also launching a full-fledged headless browser and then running Playwright tests against it.
* Better state-persistence support, with automatic persistence on idle, pause/resume on idle, etc. This all integrates with the telemetry and can be inspected in the Aspire dashboard.

Very interesting and encouraging.

[Aspire]

"✅ Taking .NET out of .NET Aspire - working with non-.NET applications" by dotnet | David Gardiner

He presents a multi-language, multi-environment solution that uses Python/uv, Rust/cargo, and TypeScript/pnpm, each of which is run manually. From there, he shows a template Aspire solution with a Redis cache, an API service, and a web front-end. He starts with a new Aspire solution, then integrates Mongo support using aspire-add-mongo and then integrates the PowerShell script that populates the data using an Aspire API. With that loaded up, he searches for an Aspire extension that works with his existing Python/uv setup.
He doesn't have to change anything; he just binds the startup of that part into Aspire so that the service is available to his "app host" (and also shows up on the dashboard). The Rust service easily follows, again by using an existing Aspire package to integrate Rust/cargo specifically. Next, he binds the React/Vite/pnpm solution using a node.js extension from the Community Toolkit (again). Where Aspire shines is that you don't need to run these disparate apps from various command lines or scripts, and you don't need to configure containers with YAML; you bind the various components and services with C# code, indicating dependencies between them, which Aspire not only handles but displays in the dashboard. He uses this power to remove hard-coded ports from his services, using the C# variables to read and use the dynamically assigned ports instead. Then, he integrates OpenTelemetry into the Python and Rust services so that the various services show their telemetry in the Aspire console, structured logging, traces, and metrics views. Finally, he adds an extra service that uses a node backend. Adding it once you have Aspire configured is very, very easy. This is an absolutely great 22-minute video that you can send to anyone who asks "what can Aspire do for me?"

"⛔ From Architecture to Docs: .NET Aspire Documented with Copilot" by dotnet | Jorge Fernandez & David Oliva

This video explains the basics of Aspire (like, the very basics), as well as the basics of Copilot and MCP. You can skip that part, as they're just reading from the slides, in what I am forced to note are pretty strong Spanish accents. I honestly can barely tell what's going on here. I feel so bad for these guys because they are probably much better in their native language but it's so much work understanding them in English. They're using Copilot to generate an architecture overview for an existing solution file, using Markdown and ASCII diagrams.
They then upgrade to using Mermaid diagrams. But I dare you to replicate what they did.

"✅ Windows 365 Meets Aspire - Supercharging Multi-Repo Microservice Productivity" by dotnet | Eric Guo & Chuanbo Zhang

This video demonstrates using .NET Aspire to wire up microservice servers with simulated Azure services in order to test InTune deployment software. It's quite a complex use case. They show how you can test locally, using Docker and the Azure-service simulators, and also deploy to Azure infrastructure. They even show how to simulate some of your own microservices by using the VS .http file format to quickly mock responses for a subset of the functionality. In this vein, they also discuss how to configure data-seeding for a stable environment, then finish up by discussing how to use XUnit to run automated tests against this entire infrastructure, both locally and in pipelines. Although the specific use case is quite complex, there is a lot of good stuff to learn about testing automation in this talk. .NET Aspire makes it a lot easier to run locally and in the cloud without different approaches.

"✅ Deep Dive: Extending and Customizing Aspire" by dotnet | David Fowler & Damian Edwards

Fowler shows a single-project solution with a .NET Aspire AppHost project that binds non-.NET dependencies (i.e., they're not their own projects). One of the dependencies is a postgres database that is absolutely a dependency but has classically been managed outside of the solution. Now, you can declare and bind the dependencies with C#. The takeaway is a much slimmer readme file: you just clone and call aspire run. The great thing about this is that it has to stay in-sync, unlike a readme file. Fowler shows the app dashboard with a lot of custom dependencies, including the .NET 10 OpenAPI replacement called Scalar, which is fully integrated into the Aspire dashboard.
Fowler even shows how you can customize the dashboard appearance with C# code, using very standard options customization, as you would see in other host-based applications like ASP.Net (or many other types, Console, Windows Service, etc.). Damian points out what we're all thinking: holy crap, Fowler, WTH you hacked everything into the AppHost.cs file, like hundreds of lines, including a custom database seeder that uses the endpoint spun up by Aspire. It's neat to see how you can bind in that kind of code, though, to just wait until the HTTP REST server is available and then to run some C# code to seed it with data. It's ugly and it's hacky in his code, but it's wonderful that you can prototype and test so quickly with disparate systems and components. He has only one C# file and orchestrates diverse other components and scripts from it. OK, he continues to show how you can bind commands into the Aspire Dashboard that he uses to bind a "reset command" that uses the Aspire interaction service to show a message box requesting approval. Finally, at the very end, he shows how to use an MCP integration with Aspire. This is no more exciting than watching anyone else watch Copilot stumble drunkenly around a dark room. It's only the last two minutes so we're not subjected to too much of this foolishness. It was still writing furiously into the chat as the video ended. Fowler is also using Visual Studio Code rather than Visual Studio. He also speaks very, very quickly, so brace yourself. "✅ Aspire Unplugged with David and Maddy" by dotnet | David Fowler & Maddy Montaquila They have T-Shirts with a great sentiment on them, "Friends don't let friends write YAML." Except that the "Write YAML" part is really, really big for some reason, so it looks like the shirts are exhorting users to actually write YAML. Whatever. The first question is for Fowler, who describes the impetus of Aspire. 
It came from the pains of configuring so many scripts for infrastructure, even with a strong tool like Kubernetes. It grew into a "general-purpose dev tool" for any sort of environment. It was originally scoped as a cloud-native tool but it quickly became obvious that nearly every solution has some sort of orchestration and scripting that always ended up in readme files or PowerShell or Bash scripts: starting the database, starting the backend for a mobile app, whatever. "That became one of our key things, right? Like you want to onboard someone, you model all the stuff in code and then like you don't have to tell someone run this script, run that script, pass the output from this script to that script, string together stuff. Like you can just kind of like put it in code, have it be there." He gives a lot of examples and detail about how polyglot and scalable .NET Aspire is. The other video he did -- "Deep Dive: Extending and Customizing Aspire" by dotnet | David Fowler & Damian Edwards -- showed a lot of code for integrating JavaScript and Python services. Another video -- "Taking .NET out of .NET Aspire - working with non-.NET applications" by dotnet | David Gardiner -- also shows how to integrate a lot of plugins from the community, including a Rust backend service. The next big question is about persisting containers and supporting hot-reload, which is finicky to design and increases the complexity of the architecture significantly, but the upside is huge if they can get it working. They managed a huge rewrite of all of the plumbing to support this type of scenario and are much better positioned for future developments. The next question builds on this, asking about multi-repo support, with what's called the "AppHost in AppHost" question: can you nest .NET Aspire apps? How does that work? It would be nice to be able to group shared services into one AppHost and then reference them from another high-level AppHost (for much larger solutions, obviously).
What happens to the dashboards, though? The idea of Aspire is to work with existing solutions, so aspire init is a much more important workflow than aspire new. That is, you're much more likely to already have a solution into which you'd like to integrate an AppHost or set of projects around which you'd like to wrap an AppHost than you are to be green-fielding a solution and starting with Aspire. I love the dynamic between Fowler and Maddy. You can really tell they love working together, that they really, really respect one another. They love the "adult" Damian as well.

[Cloud]

"🆗 What's New in Containers for .NET 10" by dotnet | Rich Lander & Chet Husk

The two presenters first discuss the history of containers in .NET, including operating systems, support periods, etc. The second half demonstrates using dotnet publish using AOT and multiple OS targets and then deploying them into various containers. These targets are all variations of Linux and for command-line or server apps.

"✅ What's new in Azure App Service for .NET developers" by dotnet | Byron Tardif

He quickly covers when .NET 10 will be available in App Service for Linux (Ubuntu, not Debian) and Windows, then moves on to showing how to use .NET Aspire to build and deploy an application to App Service. Blessedly, he's doing it manually, following a simple guide, rather than "getting Copilot to do it for him." This inspires much more confidence that it's well-designed and simple enough to actually learn, rather than implying that you need to ask a black-box globe-girdling data-model in order to grok it. He's got the standard Aspire app and then types azd up. It takes five minutes for the system to analyze, find a subscription, determine existing resources, and then deploy, creating services where needed. Access to the deployment is automatically configured (e.g., the dashboard is only available for authorized users). He quickly shows the Azure Portal resources that were created for the App Service.
This is nice. .NET Aspire is a worthy and welcome successor to Bicep scripts. He shows a bunch of features of App Services specifically, including scaling options.

"🆗 Carbon Aware Computing - Using .NET Open Source libraries for more sustainable applications" by dotnet | Aydin Mir Mohammadi

This video covers tactics and tooling for running data services in a sustainable manner. E.g., load-shifting from day to night, adjusting available capacity depending on local energy availability, etc. There's a lot of telemetry and real-time monitoring needed to even begin working in a sustainable manner. In the second half, he gets to integrating an SDK that calculates best-execution time. Even libraries like Hangfire have methods like IncludeCarbonAwareExecution() (I'm not kidding!) that wrap all of this in a very high-level abstraction.

[AI]

"⛔️ Architecting an AI-Powered Sales Dashboard with .NET MAUI and Azure OpenAI" by dotnet | Shriram Sankaran

The app he discusses summarizes market data using AI. Did we all just choose to forget that AIs are not good at numbers? Did I miss the technology that we used to fix this problem? Remember "AIs are not good at numbers?" I do! When did we fix that? Anyway, the UI looks decent and it's completely cross-platform thanks to Maui. It uses SyncFusion's controls as well as standard Maui controls. He spends quite a bit of time going over the features of his app. The AI is used to query the app data with a built-in chatbot. When he finally gets to the code, his project is curiously not using CommunityToolkit.MVVM (all of the properties are implemented manually instead of source-generated). He eventually gets to more source but it's not very illuminating. I can't really recommend it.

"🆗 One Question, One Answer: Designing Seamless AI Agents with C#" by dotnet | Mark Miller

The presenter works on CodeRush for DevExpress. He uses CodeRush (I guess?)
in dictation mode to build his calculator app, which, you know, is going to be something that the AI can easily build, as there are probably millions of examples in the training data. The generated code is horrifically defensive and not even close to what I would have made, or what I consider to be maintainable, but it's fine for a prototype. So, here we have another video that's just showing how to program with an AI. He's arguing for a workflow that stays in the code and is delivered via talking -- because it's 2-4 times faster than typing for most people and LLMs are very forgiving of extra words and filler words -- so that you can avoid most of the pain points of working with the by-now "classic" AI-chat interface. He talks a lot about how to optimize the context but I guess his tool does this?

"✅ Overcoming the limitations when using AI" by dotnet | Michael Washington

This guy doesn't show up on the video. His voiceover and cadence are somewhat odd. It sounds very much like a text-to-speech engine. The whole presentation seems fake but the information is quite interesting. I guess he wrote the presentation but then had a machine read it for him. He discusses how LLMs are bad at math, so the solution was to have the LLM create code to calculate answers. It's wild how much f@&king processing power we're willing to invest in getting the correct answer to 43 x 34. The LLM interprets the text, then generates an answer that includes a little Python program that it then executes in a sandbox so that it can include the output in its answer. It's just flat-out nuts. Still, he shows off how he's managed to work around these limitations, but the workarounds are really elaborate. Next up is that "AIs can't write fiction". He discusses AI story-builders, which use text-file databases in order to maintain context and continuity for stories.
He found that page-by-page and chapter-by-chapter doesn't work very well, but that paragraph-by-paragraph is the level of granularity at which an LLM needs guidance. There is a whole program surrounding the LLM's inputs and outputs. Without it, the story goes off the rails immediately. After that, he shows that AI cannot create applications. They can code but they have no idea of architecture and no idea how to deal with complex systems. Find his slides and work at "Overcoming limitations When Using AI".

"🆗 Modernizing a 17th Century Italian-English Dictionary" by dotnet | Wayne Sebbens

This was not uninteresting but it wasn't a lot of programming information. Half of the video is a discussion of European martial arts and its relation to archaic Italian dialects and spellings. He basically made an app for searching these terms using vector databases and ML in .NET. If that sounds like something you want to do, check out the video and his "repo". If not, then you can safely skip the video.

[Everything else]

"🆗 Going Passwordless - A Practical Guide to Passkeys in ASP.NET Core" by dotnet | Maarten Balliauw

This is a decent and thorough introduction to authentication mechanisms, from passwords to MFA to passkeys, illustrating both the differences between passkeys and other methods as well as the .NET support for working with passkeys in your own applications (mostly in the last third of the video).

"✅ GitHub Actions DevOps Pipelines as Code using C# and Cake SDK" by dotnet | Mattias Karlsson

Cake is a build system written in C# with a rich .NET API. Mattias did a bunch of live-coding. The Cake scripts might be useful for defining a bunch of stuff that we currently use Azure Pipeline Definitions for. He demonstrates how provider plugins enable high-level abstractions that make it much easier to specify a declarative pipeline. It's all in C#, so you use a code editor like Rider, with code-completion, refactoring, etc.
You continue to use the YAML pipeline definition to set up the environment but everything else will be in the Cake file. This makes a lot of sense and could be quite powerful. Instead of using a bunch of nested pipeline templates that you can't run or debug, you could have a NuGet package with common APIs for Cake. You can also test a bunch of the Cake script locally (unless you have some highly specific steps like signing with a key only available in the cloud or calling a tool that's only available in the cloud). You can use standard C# to make these optional when testing locally, though.

"🆗 If .NET brewed beer..." by dotnet | Shaun Lawrence

He starts with a 10-minute presentation on his home-brewing setup, finally getting to the point where he discusses the embedded device for which he used .NET: A Meadow F7v2 DevModule. For the next ten minutes, he just kind of muddles about, showing the API surface of the Meadow library. After showing how to integrate a temperature sensor, he shows how to integrate PID control ("Proportional-Integral-Derivative control"), again using the API. He mixes in support for PWM ("Pulse-width Modulation"). Both of these are commonly used algorithms to stabilize the interaction with a sensor: for interpreting and smoothing the signal and for ensuring that the written value corresponds to the desired value without slewing about. At the very end, he shows that his UI is built with Maui but he doesn't get into it too much. It's nice that they provide low-level support for working directly with hardware but it's not too fascinating. It's good to know that C# is increasingly becoming a viable alternative to C, C++, or even Rust or Go for systems programming. He uses Visual Studio Code.
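The talk uses the Meadow API for PID control, which I won't try to reproduce here. As a rough idea of what such a controller does, here is a minimal, self-contained sketch in plain C#. The PidController class, its gains, and the clamping to a 0..1 duty cycle (the value a PWM output would apply to something like a heater) are all my own assumptions for illustration, not anything taken from the video or from Meadow.

```csharp
using System;

// Hypothetical, minimal PID controller; names and gains are illustrative only
// and are not taken from the Meadow API shown in the talk.
public sealed class PidController
{
    private readonly double _kp;
    private readonly double _ki;
    private readonly double _kd;
    private double _integral;
    private double _previousError;
    private bool _hasPrevious;

    public PidController(double kp, double ki, double kd)
    {
        _kp = kp;
        _ki = ki;
        _kd = kd;
    }

    // Returns a duty cycle clamped to [0, 1].
    public double Update(double setpoint, double measured, double dtSeconds)
    {
        if (dtSeconds <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(dtSeconds));
        }

        double error = setpoint - measured;

        // Integral term: accumulated error over time.
        _integral += error * dtSeconds;

        // Derivative term: rate of change of the error (zero on the first call).
        double derivative = _hasPrevious ? (error - _previousError) / dtSeconds : 0.0;
        _previousError = error;
        _hasPrevious = true;

        double output = (_kp * error) + (_ki * _integral) + (_kd * derivative);
        return Math.Clamp(output, 0.0, 1.0);
    }
}
```

You would call Update once per sampling interval with the target and measured temperatures. The sketch deliberately leaves out refinements such as integral anti-windup, which a real controller would likely want.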

Toub’s 232-page tour-de-force on performance in .NET 10

2025-11-18T22:55:56+01:00

Published by marco on 18. Nov 2025 22:55:56 ------------------------------------------------------------------------ [image]The book-length "Performance Improvements in .NET 10" by Stephen Toub arrived a couple of months ago. He explains how the various compilers (AOT, JIT, etc.) have been optimized to eliminate allocations and just generally optimized for performance. A reduction in allocations is a multi-win: the performance is better because the allocator isn't working, the memory usage has dropped, and the garbage collector also works less. See previous coverage in: * "Toub’s 234-page tour-de-force on performance in .NET 9" (2024) * Somehow, I never documented .NET 8. Huh. * "Performance Improvements in .NET 7" (2022). [A presentation at .NET Build 2025] If you prefer a 30-minute video, then you're in luck. [media] He compares .NET Framework 4.8 vs. .NET 9 vs. .NET 10. The most impressive improvements are from 4.8 to 9.0, of course, but he highlights some interesting places where .NET 10 eclipses .NET 9, where .NET 9 had already eclipsed .NET Framework 4.8. The last example shows how regular expressions have been continually optimized so that an operation that took 24ms in .NET Framework 4.8 was improved by about 12x to 2.5ms in .NET 9 but has been further improved by about 62,500x to about 40ns in .NET 10. [Citations and Notes] And now, on to the citations from Toub's book along with my notes. He starts off with a bit of history and context in the wider world. "What made "Tudor’s" ice last halfway around the world wasn’t one big idea. It was a plethora of small improvements, each multiplying the effect of the last. In software development, the same principle holds: big leaps forward in performance rarely come from a single sweeping change, rather from hundreds or thousands of targeted optimizations that compound into something transformative. 
.NET 10’s performance story isn’t about one Disney-esque magical idea; it’s about carefully shaving off nanoseconds here and tens of bytes there, streamlining operations that are executed trillions of times." "As with many languages, .NET historically has had an “abstraction penalty,” those extra allocations and indirections that can occur when using high-level language features like interfaces, iterators, and delegates. Each year, the JIT gets better and better at optimizing away layers of abstraction, so that developers get to write simple code and still get great performance. .NET 10 continues this tradition. The result is that idiomatic C# (using interfaces, foreach loops, lambdas, etc.) runs even closer to the raw speed of meticulously crafted and hand-tuned code." [JIT] "If the compiler can prove an object doesn’t escape, then that object’s lifetime is bounded by the method, and it can be allocated on the stack instead of on the heap. Stack allocation is much cheaper (just pointer bumping for allocation and automatic freeing when the method exits) and reduces GC pressure because, well, the object doesn’t need to be tracked by the GC. .NET 9 had already introduced some limited escape analysis and stack allocation support; .NET 10 takes this significantly further." "[...] where things gets interesting is around what the JIT is able to devirtualize. In .NET 9, it struggles to devirtualize calls to the interface implementations specifically on T[], so it won’t devirtualize either the _list.GetEnumerator() call nor the _list[index] call. However, the enumerator that’s returned is just a normal type that implements IEnumerator, and the JIT has no problem devirtualizing its MoveNext and Current members. Which means that we’re actually paying a lot more going through the indexer, because for N elements, we’re having to make N interface calls, whereas with the enumerator, we only need the one with GetEnumerator interface call and then no more after that." 
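To make the quoted comparison concrete, here is a minimal sketch of the two loop shapes it describes, both walking an array through the IList<int> interface. The class and method names are hypothetical; the quote's own benchmark is not reproduced here, and the comments simply restate its claims about .NET 9.

```csharp
using System.Collections.Generic;

// Hypothetical helper showing the two access patterns from the quote.
public static class ArrayViaInterface
{
    // Enumerator path: one GetEnumerator() call through the interface, after
    // which (per the quote) MoveNext and Current can be devirtualized.
    public static int SumWithEnumerator(IList<int> list)
    {
        int sum = 0;
        foreach (int value in list)
        {
            sum += value;
        }

        return sum;
    }

    // Indexer path: per the quote, each list[i] on an array-backed IList<int>
    // stays an interface call on .NET 9, i.e. N interface calls for N elements.
    public static int SumWithIndexer(IList<int> list)
    {
        int sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += list[i];
        }

        return sum;
    }
}
```

Both methods return the same sum for the same input, e.g. when passed new[] { 1, 2, 3 }; the difference the quote describes is only in how many interface calls the JIT fails to remove.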
To be clear: this has been addressed in .NET 10, so that the indexer is also almost always devirtualized. "dotnet/runtime#110827 from @hez2010 also helps more methods to be inlined by doing another pass looking for opportunities after later phases of devirtualization. The JIT’s optimizations are split up into multiple phases; each phase can make improvements, and those improvements can expose additional opportunities. If those opportunities would only be capitalized on by a phase that already ran, they can be missed. But for phases that are relatively cheap to perform, such as doing a pass looking for additional inlining opportunities, those phases can be repeated once enough other optimization has happened that it’s likely productive to do so again." "The static readonly field is immutable, arrays can’t be resized, and the JIT can guarantee that the field is initialized prior to generating the code for Read. Therefore, when generating the code for Read, it can know with certainty that the array is of length three, and we’re accessing the element at index two. Therefore, the specified array index is guaranteed to be within bounds, and there’s no need for a bounds check." The JIT has been doing these kinds of optimizations for a long time but the number of cases for which it can "prove" increases with each release. "My choice of benchmark in this case was not coincidental. This pattern shows up in the FormattingHelpers.CountDigits internal method that’s used by the core primitive types in their ToString and TryFormat implementations, in order to determine how much space will be needed to store rendered digits for a number. As with the previous example, this routine is considered core enough that it was using unsafe code to avoid the bounds check. With this fix, the code was able to be changed back to using a simple span access, and even with the simpler code, it’s now also faster." "Many of these different optimizations interact with each other. 
Dynamic PGO triggers a form of cloning, as part of the guarded devirtualization (GDV) mentioned earlier: if the instrumentation data reveals that a particular virtual call is generally performed on an instance of a specific type, the JIT can clone the resulting code into one path specific to that type and another path that handles any type. That then enables the specific-type code path to devirtualize the call and possibly inline it. And if it inlines it, that then provides more opportunities for the JIT to see that an object doesn’t escape, and potentially stack allocate it. dotnet/runtime#111473, dotnet/runtime#116978, dotnet/runtime#116992, dotnet/runtime#117222, and dotnet/runtime#117295 enable that, enhancing escape analysis to determine if an object only escapes when such a generated type test fails (when the target object isn’t of the expected common type)." This led to several dozen performance-test improvements across the board when the PR landed. The whole section boils down to the JIT optimization working not only for regular loops and enumerable loops, but also for hand-unrolled code with multiple array accesses (where bounds-checks can now be elided using clever cloning).

[Inlining]

"[...] generally the most benefit from inlining comes from knock-on benefits. Just as a simple example, if you have code like:"

    int i = Divide(10, 5);
    static int Divide(int n, int d) => n / d;

"if Divide doesn’t get inlined, then when Divide is called, it’ll need to perform the actual idiv, which is a relatively expensive operation. In contrast, if Divide is inlined, then the call site becomes:"

    int i = 10 / 5;

"which can be evaluated at compile time and becomes just:"

    int i = 2;

"Just inlining everything would be bad; inlining copies code, which results in more code, which can have significant negative repercussions. For example, inlining’s increased code size puts more pressure on caches.
Processors have an instruction cache, a small amount of super fast memory in a CPU that stores recently used instructions, making them really fast to access again the next time they’re needed (such as the next iteration through a loop, or the next time that same function is called)." "As part of these heuristics, the JIT has the notion of “boosts,” where observations it makes about things methods do boost the chances of that method being inlined. dotnet/runtime#114806 gives a boost to methods that appear to be returning new arrays of a small, fixed length; if those arrays can instead be allocated in the caller’s frame, the JIT might then be able to discover they don’t escape and enable them to be stack allocated. dotnet/runtime#110596 similarly looks for boxing, as the caller could possibly instead avoid the box entirely." [Code Layout] "When the JIT compiler generates assembly from the IL emitted by the C# compiler, it organizes that code into “basic blocks,” a sequence of instructions with one entry point and one exit point, no jumps inside, no branches out except at the end. These blocks can then be moved around as a unit, and the order in which these blocks are placed in memory is referred to as “code layout” or “basic block layout.” This ordering can have a significant performance impact because modern CPUs rely heavily on an instruction cache and on branch prediction to keep things moving fast. If frequently executed (“hot”) blocks are close together and follow a common execution path, the CPU can execute them with fewer cache misses and fewer mispredicted jumps." "Consider a tight loop executed millions of times. A good layout keeps the loop entry, body, and backward edge (the jump back to the beginning of the body to do the next iteration) right next to each other, letting the CPU fetch them straight from the cache. 
In a bad layout, that loop might be interwoven with unrelated cold blocks (say, a catch block for a try in the loop), forcing the CPU to load instructions from different places and disrupting the flow. Similarly, for an if block, the likely path should generally be the next block so no jump is required, with the unlikely branch behind a short jump away, as that better aligns with the sensibilities of branch predictors." [GC Write Barriers] "Whenever there’s a reference write that could cross a generation, the JIT emits a call to a helper that tracks the information in a “card table,” and when the GC runs, it consults this table to see if it needs to scan a portion of the higher generations. That helper is referred to as a “GC write barrier.” Since a write barrier is potentially employed on every reference write, it must be super fast, and in fact the runtime has several different variations of write barriers so that the JIT can pick one optimized for the given situation. Of course, the fastest write barrier is one that doesn’t need to exist at all, so as with bounds checks, the JIT also exerts energy to try to prove when write barriers aren’t needed, eliding them when it can. And it can even more in .NET 10." [Miscellaneous] "As with most compilers, the JIT employs common subexpression elimination (CSE) to find identical computations and avoid doing them repeatedly. dotnet/runtime#106637 teaches the JIT how to do so in a more consistent manner by more fully integrating CSE with its Static Single Assignment (SSA) representation. This in turn allows for more optimizations to kick in, e.g. some of the strength reduction done around loop induction variables in .NET 9 wasn’t applying as much as it should have, and now it will." I just love how Toub manages to keep up his excitement so deep into this document. He's really a great writer. 
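If common subexpression elimination from the quote above is unfamiliar, this minimal sketch shows the idea at source level. The JIT does this on its internal representation, not on C# source, so the second method is only my hand-written picture of the effect; the class and method names are hypothetical.

```csharp
// Hypothetical illustration of common subexpression elimination (CSE).
public static class CseSketch
{
    // As written: the subexpression (x * y) appears twice.
    public static int Original(int x, int y) => (x * y) + 3 * (x * y);

    // Conceptually what CSE achieves: compute the shared value once and reuse it.
    public static int AfterCse(int x, int y)
    {
        int product = x * y;
        return product + 3 * product;
    }
}
```

Both methods compute the same result; the second just does one multiplication of x and y instead of two.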
[Native AOT] "Native AOT [Ahead Of Time [compilation]] is the ability for a .NET application to be compiled directly to assembly code at build-time. The JIT is still used for code generation, but only at build time; the JIT isn’t part of the shipping app at all, and no code generation is performed at run-time. As such, most of the optimizations to the JIT already discussed, as well as optimizations throughout the rest of this post, apply to Native AOT equally." [VM] "With dotnet/runtime#114462, the runtime now uses a single shared “template” for many of the small executable “stubs” it needs at runtime; stubs are tiny chunks of machine code that act as jump points, call counters, or patchable trampolines. Previously, each memory allocation for stubs would regenerate the same instructions over and over. The new approach builds one copy of the stub code in a read-only page and then maps that same physical page into every place it’s needed, while giving each allocation its own writable page for the per-stub data that changes at runtime. This lets hundreds of virtual stub pages all point to one physical code page, cutting memory use, reducing startup work, and improving instruction cache locality." [Threading] "If a thread is blocked on an operation that depends on work items in that thread’s local queue getting processed, that work item being picked off now depends on the global queue being exhausted and another thread coming along and stealing the work item from this thread’s queue. If there’s a steady stream of incoming work into the global queue, though, that will never happen; essentially, the highest priority work item has become the lowest priority work item. "So, back to these PRs. The idea is fairly simple: when the thread is about to block, and in particular when it’s about to block waiting on a Task, it first dumps its entire local queue into the global queue. 
That way, this work which was highest priority for the blocked thread has a fairer chance of being processed by other threads, rather than it being the lowest priority work for everyone." "dotnet/runtime#107843 from @hamarb123 adds two new methods to the Volatile class: ReadBarrier and WriteBarrier. A read barrier has “load acquire” semantics, and is sometimes referred to as a “downward fence”: it prevents instructions from being reordered in such a way that memory accesses below/after the barrier move to above/before it. In contrast, a write barrier has “store release” semantics, and is sometimes referred to as an “upwards fence”: it prevents instructions from being reordered in such a way that memory accesses above/before the barrier move to below/after it." "These barriers are referred to as “half fences”; the read barrier prevents later things from moving earlier, but not the other way around, and the write barrier prevents earlier things from moving later, but not the other way around. (As it happens, though, while not required by specification, today the implementation of lock does use a full barrier on both enter and exit, so nothing before or after a lock will move into it.)" [Reflection] "System.Net.Http sits above System.Security.Cryptography, referencing it for critical features like X509Certificate. But System.Security.Cryptography needs to be able to make HTTP requests in order to download OCSP information, and with System.Net.Http referencing System.Security.Cryptography, System.Security.Cryptography can’t in turn explicitly reference System.Net.Http. It can, however, use reflection or [UnsafeAccessor] and [UnsafeAccessorType] to do so, and it does. It used to use reflection, now in .NET 10 it uses [UnsafeAccessor]." [Primitives and Numerics] "dotnet/runtime#111505 from @alexcovington enables TensorPrimitives.Divide to be vectorized for int. 
The operation already supported vectorization for float and double, for which there’s SIMD hardware-accelerated support for division, but it didn’t support int, which lacks SIMD hardware-accelerated support. This PR teaches the JIT how to emulate SIMD integer division, by converting the ints to doubles, doing double division, and then converting back." That fix, roundabout as it sounds, makes vectorized integer division 4x faster, which is pretty cool. You don't use this, you say? Well, are you sure? Are you sure that there is no code in handshake-negotiation (e.g.) that needs to divide multiple integers in parallel? These are exactly the kinds of improvements that, as noted in Toub's introduction, lead to smoother operation in many other places. This is such a low-level primitive. "We can then reuse those methods to do the same thing that’s already done for scalar operations but do it vectorized: take a vector of Halfs, convert them all to floats, process all the floats, and convert them all back to Halfs. Of course, I already stated that the vector types don’t support Half, so how can we “take a vector of Half”? By reinterpret casting the Span<Half> to Span<short> (or Span<ushort>), which allows us to smuggle the Halfs through. And, as it turns out, even for scalar, the very first thing Half’s float cast operator does is convert it to a short. "The net result is that a ton of operations can now be accelerated for Half." These optimizations improve performance for processing Half in dozens of operations by 11x. "with C# 14, it’s possible for a type to not only define a + operator, it can also define a += operator. If a type defines a += operator, it will be used rather than expanding a += b as shorthand for a = a + b. And that has performance ramifications. "[...] 
that means that such compound operators on the tensor types can just update the target tensor in place rather than allocating a whole new (possibly very large) data structure for each computation. dotnet/runtime#117997 adds all of these compound operators for the tensor types. (Not only are these using C# 14 user-defined compound operators, they’re doing so as extension operators, using the new C# 14 extension types feature. Fun!)" [Collections] "[...] as noted earlier in the JIT section, the JIT has been gaining super powers around dynamic PGO, escape analysis, and stack allocation. This means that in many situations, the JIT is now able to see that the most common concrete type for a given call site is a specific enumerator type and generate code specific to when it is that type, devirtualizing the calls, possibly inlining them, and then, if it’s able to do so sufficiently, stack allocating the enumerator. With the progress that’s been made in .NET 10, this now happens very frequently for arrays and List<T>. While the JIT is able to do this in general regardless of an object’s type, the ubiquity of enumeration makes it all that much more important for IEnumerator<T>, so dotnet/runtime#116978 marks IEnumerator<T> as an [Intrinsic], giving the JIT the ability to better reason about it." "For shorter lists, dynamic PGO will see MoveNextRare invoked a reasonable number of times, and will consider it for inlining. And if all of the calls to the enumerator are inlined, the enumerator instance can avoid escaping the call frame, and can then be stack allocated. But once the list length grows to a much larger amount, that MoveNextRare method will start to look really cold, will struggle to be inlined, and will then allow the enumerator instance to escape, preventing it from being stack allocated." "While OSR is awesome, it unfortunately causes some complications here. 
Once the list gets long enough, an invocation of the tier 0 (unoptimized) method will transition to the OSR optimized method… but OSR methods don’t contain dynamic PGO instrumentation (they used to, but it was removed because it led to problems if the instrumented code never got recompiled again and thus suffered regressions due to forever-more running with the instrumentation probes in place). Without the instrumentation, and in particular without the instrumentation for the tail portion of the method (where the enumerator’s Dispose method is invoked), even though List.Dispose is a nop, the JIT may not be able to do the guarded devirtualization that enables the IEnumerator.Dispose to be devirtualized and inlined. Meaning, ironically, that the nop Dispose causes escape analysis to see the enumerator instance escape, such that it can’t be stack allocated. Whew. "[...] Specifically for enumerators, this PR enables dynamic PGO to infer the missing instrumentation based on the earlier probes used with the other enumerator methods, which then enables it to successfully devirtualize and inline Dispose." "Labels A and B form a loop, but that loop can be entered by jumping to either A or to B. If the compiler could prove that this loop were only ever enterable from A or only ever enterable from B, then the loop would be “reducible.” Irreducible loops are much more complex than reducible loops for a compiler to deal with, as they have more complex control and data flow and in general are harder to analyze. dotnet/runtime#116949 rewrites the MoveNext method to be a more typical while loop, which is not only easier to read and maintain, it’s also reducible and more efficient, and because it’s more streamlined, it’s also inlineable and enables possible stack allocation." This results in a 7x performance improvement when iterating a list of integers. 
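For orientation, the kind of call site these passages are about looks roughly like the sketch below: a foreach over a List<int> that the method only sees through IEnumerable<int>. The method name SumThroughInterface and the sample data are made up for illustration. Whether the enumerator actually ends up devirtualized, inlined, and stack-allocated depends on the runtime version and on what dynamic PGO observes, so treat this as the shape of the code rather than a benchmark.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: the parameter type hides the concrete List<int>,
// so the enumerator is obtained and driven through interfaces unless
// the JIT manages to devirtualize those calls.
static int SumThroughInterface(IEnumerable<int> values)
{
    var total = 0;
    foreach (var value in values)
    {
        total += value;
    }

    return total;
}

var numbers = new List<int> { 1, 2, 3, 4 };
Console.WriteLine(SumThroughInterface(numbers)); // prints 10
```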
There are also a ton of optimizations in Linq, for Contains (with 10x - 400x improvements), Fill (40x), Shuffle (2x - 40x), LeftJoin, and RightJoin (2x). There are also specific improvements for many of the base collection types. [IO] The next section on IO is also interesting, with one case where they didn't actually change any code but instead introduced an analyzer that discourages using the EndOfStream property in asynchronous code, which can lead to pathological cases in which the stream is blocked until more data arrives. [Searching / Regular Expressions] This section includes a longer discussion about the improvements included in previous versions of .NET, especially as it relates to avoiding backtracking. There are normalized forms of regular expressions that incur no backtracking penalty and can thus be evaluated with the faster version of the regular-expression engine that doesn't have to account for it. Here's an example that I've lifted up from much further down in this section. "Given the pattern ^abc|^abd, the code generators would end up emitting this exactly as it’s written, with an alternation with two branches, the first branch checking for the beginning and then matching "abc", the second branch also checking for the beginning and then matching "abd". Now in .NET 10, the anchor can be factored out, such that ^abc|^abd ends up being rewritten as ^ab[cd]." The idea here is to search for pathological formulations for which there is a non-pathological equivalent and automatically use that version under the hood. That is my interpretation of the following rather-dense section. "Consider a pattern a*b. a*b is observably identical to (?>a*)b, which says that the a* should not be backtracked into. That’s because there’s nothing the a* can “give back” (which can only be as) that would satisfy what comes next in the pattern (which is only b). 
It’s thus valid for a backtracking engine to transform how it processes a*b to instead be the equivalent of how it processes (?>a*)b. And the .NET regex engine has been capable of such transformations since .NET 5. This can result in massive improvements to throughput. With backtracking, waving my hands, we effectively need to execute everything after the backtracking construct for each possible position we could backtrack to. So, for example, with \w*SOMEPATTERN, if the \w* successfully initially consumes 100 characters, we then possibly need to try to match SOMEPATTERN up to 100 different times, as we may need to backtrack up to 100 times and re-evaluate SOMEPATTERN each time we give back one of the things initially matched. If we instead make that (?>\w*), we eliminate all but one of those! That makes improvements to this ability to automatically transform backtracking constructs to be non-backtracking possibly massive improvements in performance, and practically every release of .NET since .NET 5 has increased the set of patterns that are automatically transformed. .NET 10 included." There are several detailed examples of 5x--6x improvements in performance for relatively common-looking regular expressions. Stephen Toub loves writing about very-specific regular-expression examples. Like, each paragraph is a blog post just on its own. Needless to say, this section is, at the same time, fascinating, extremely detailed, and eminently uncitable (because it would just entail citing pages of detail that is all necessary to understand the optimization). The improvements are impressive and incredibly well-described. Go check out that section if you like regular expressions and mathematical analysis (equivalence of expressions, reduction of solution space). 
The additional beauty is that the regular-expression evaluators are all source-generated C#, so it's much, much easier to evaluate what's going on than with the assembly-level discussions in the JIT discussion, for example. As a final example, here is the level of holistic analysis we're talking about. "Unfortunately, the helper that emits that IndexOf call was passed the wrong node from the pattern: it was being passed the object representing the (?:.|\n) any-set rather than the "*/" literal, which resulted in it emitting the equivalent of IndexOfAnyInRange((char)0, '\uFFFF') rather than the equivalent of IndexOf("*/"). Oops. It was still functionally correct, in that the IndexOfAnyInRange call would successfully match the first character and the loop would re-evaluate from that location, but that means that rather than efficiently skipping using SIMD over a bunch of positions that couldn’t possibly match, we were doing non-trivial work for each and every position along the way." As in the IO section above, some of the optimizations come in the form of analyzers that recommend an optimization that the user can apply rather than something that the runtime can do automatically. "[...] the .NET 10 SDK includes a new analyzer related to Regex. It’s oddly common to see code that determines whether an input matches a Regex written like this: Regex.Match(...).Success. While functionally correct, that’s much more expensive than Regex.IsMatch(...). For all of the engines, Regex.Match(...) requires allocating a new Match object and supporting data structures (except when there isn’t a match found, in which case it’s able to use an empty singleton); in contrast, IsMatch doesn’t need to allocate such an instance because it doesn’t need to return such an instance (as an implementation detail, it may still use a Match object, but it can reuse one rather than creating a new one each time)." 
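The two formulations that the analyzer distinguishes can be put side by side in a minimal sketch. The input string and the pattern are hypothetical, chosen only for illustration; the two static Regex methods are the ones named in the quote above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input and pattern, chosen only for illustration.
var input = "order-12345";
var pattern = @"^order-\d+$";

// Builds and returns a Match object just to read one Boolean from it.
var viaMatch = Regex.Match(input, pattern).Success;

// Asks the same yes/no question without handing back a Match instance.
var viaIsMatch = Regex.IsMatch(input, pattern);

Console.WriteLine($"{viaMatch} {viaIsMatch}"); // prints True True
```

Both calls answer the same question, so the rewrite that the analyzer suggests should be behavior-preserving wherever only Success is read from the result.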
[MemoryExtensions] "These overloads all parallel existing methods, but remove the IEquatable<T> (or IComparable<T>) constraint on the generic method parameter and accept an optional IEqualityComparer<T>? (or IComparer<T>). When no comparer or a default comparer is supplied, they can fall back to using the same vectorized logic for relevant types, and otherwise can provide as optimal an implementation as they can muster, based on the nature of T and the supplied comparer." This part is very interesting because you see how the improvements to MemoryExtensions led to SearchValues being faster, which, in turn, led to methods like Normalize and Contains being faster (especially when working with strings that are automatically treated as Spans wherever possible). [JSON] A good method to know is RemoveAll(), which accepts a lambda to filter for the elements to remove. If, instead of looping over the items and calling RemoveAt(n), you write _arr.RemoveAll(static n => n!.GetValue<int>() % 2 == 0), you get a huge performance benefit because RemoveAll() adjusts the underlying buffer only once rather than on each call to remove each individual item. "With JSON being used as an encoding for many modern protocols, streaming large JSON payloads has become very common. And for most use cases, it’s already possible to stream JSON well with System.Text.Json. However, in previous releases there wasn’t a good way to stream partial string properties; string properties had to have their values written in one operation. If you’ve got small strings, that’s fine. If you’ve got really, really large strings, and those strings are lazily-produced in chunks, however, you ideally want the ability to write those chunks of the property as you have them, rather than needing to buffer up the value in its entirety. dotnet/runtime#101356 augmented Utf8JsonWriter with a WriteStringValueSegment method, which enables such partial writes. [...] 
These modern protocols often transmit large blobs of binary data within the JSON payloads. Typically, these blobs end up being Base64 strings as properties on some JSON object. Today, outputting such blobs requires Base64-encoding the whole input and then writing the resulting bytes or chars in their entirety into the Utf8JsonWriter. To address that, dotnet/runtime#111041 adds a WriteBase64StringSegment method to Utf8JsonWriter." [Cryptography] "A ton of effort went into cryptography in .NET 10, almost entirely focused on post‑quantum cryptography (PQC). PQC refers to a class of cryptographic algorithms designed to resist attacks from quantum computers, machines that could one day render classic cryptographic algorithms like Rivest–Shamir–Adleman (RSA) or Elliptic Curve Cryptography (ECC) insecure by efficiently solving problems such as integer factorization and discrete logarithms. With the looming threat of “harvest now, decrypt later” attacks (where a well-funded attacker idly captures encrypted internet traffic, expecting that they’ll be able to decrypt and read it later) and the multi-year process required to migrate critical infrastructure, the transition to quantum‑safe cryptographic standards has become an urgent priority. In this light, .NET 10 adds support for ML-DSA (a National Institute of Standards and Technology PQC digital signature algorithm), Composite ML-DSA (a draft Internet Engineering Task Force specification for creating signatures that combine ML-DSA with a classical crypto algorithm like RSA), SLH-DSA (another NIST PQC signature algorithm), and ML-KEM (a NIST PQC key encapsulation algorithm)." [Conclusion] Overall, this is another amazing document -- a book -- that is edited to an incredibly high quality. 
I didn't notice any grammatical or formatting errors, or typos, with two possible exceptions: maybe a missing `?` on IComparer<T> in "These overloads all parallel existing methods, but remove the IEquatable<T> (or IComparable<T>) constraint on the generic method parameter and accept an optional IEqualityComparer<T>? (or IComparer<T>)." and the spelling "frequently-requested" (the hyphen is only correct with adjectives, not adverbs).

The idea of MCP: "Tea. Earl grey. Hot."

2025-05-31T22:36:05+02:00

Published by marco on 31. May 2025 22:36:05 ------------------------------------------------------------------------ The article "A Critical Look at MCP" by Rasmus Holm discusses many of the drawbacks of MCP as it is currently conceived. One of them is the push to build everything in Python, which is a dynamic language that's better designed than JavaScript, but isn't a lot better at helping users write maintainable code. "Am I being pretentious/judgmental in thinking that people in AI only really know Python, and the "well, it works on my computer" approach is still considered acceptable? This should be glaringly obvious to anyone that ever tried to run anything from Hugging Face. "If you want to run MCP locally, wouldn't you prefer a portable language like Rust, Go, or even VM-based options such as Java or C#?" [image] I've been having discussions with people at work about MCP. This post made me think that I haven't been clear about my attitude toward it. I think it would be amazing if we could pose natural-language queries to machines and have them do things for us. Absolutely. Who doesn't want to just order "Tea. Earl grey. Hot."? My doubts are more specific to MCP itself, technically, as a protocol. This article is highly technical, but it boils down to: MCP is such a hype-y protocol right now and it's so technically shaky that we have a responsibility to not just grab the first damned thing that shows up and make it the standard. We did that with JavaScript and it took 2 years until it was everywhere and over 20 years until it was an actual professional tool. I'm an old man and, looking back, very often our industry is just stepping on rakes "that are right there". I just want to clarify that I'm pushing back on the implementation, not the idea.

CSS is a collection of layout algorithms

2025-05-30T16:44:47+02:00

Published by marco on 30. May 2025 16:44:47 ------------------------------------------------------------------------ [image] This is a nice explanation of how CSS is a declarative language, where you describe the metadata of your styles. The layout algorithm determines which property values affect the size and position of the element. Generally, the position and display properties determine which layout algorithm is used for a given element. The layouts are: [media]

* "Normal flow layout" (selected by default)
* "Inline layout" (selected by default for inline elements)
* "Flexible box layout" (selected with display: flex)
* "Grid layout" (selected with display: grid)
* "Table layout" (selected with display: table)
* "Multi-column layout" (selected with column-count or column-width)
* "Positioned layout" (selected with a position other than static)
* "Float layout" (selected with float: left or float: right)

Most properties work the same in all layouts. Some properties only have an effect in a specific layout mode, e.g., grid-template-columns is ignored if the layout is not grid. Other properties are interpreted differently or completely ignored depending on layout mode, e.g., width and vertical margins are ignored in the inline layout.

A good explainer of how the core concept of CSS is layout

2025-05-23T07:37:01+02:00

Published by marco on 23. May 2025 07:37:01 ------------------------------------------------------------------------ This is a nice ~13-minute explanation of how CSS is a declarative language, where you describe the metadata of your styles. The layout algorithm determines which property values affect the size and position of the element. [media] Generally, the position and display properties determine which layout algorithm is used for a given element. The layouts are: [image]

* "Normal flow layout" (selected by default)
* "Inline layout" (selected by default for inline elements)
* "Flexible box layout" (selected with display: flex)
* "Grid layout" (selected with display: grid)
* "Table layout" (selected with display: table)
* "Multi-column layout" (selected with column-count or column-width)
* "Positioned layout" (selected with a position other than static)
* "Float layout" (selected with float: left or float: right)

Most properties work the same in all layouts. Some properties only have an effect in a specific layout mode, e.g., grid-template-columns is ignored if the layout is not grid. Other properties are interpreted differently or completely ignored depending on layout mode, e.g., width and vertical margins are ignored in the inline layout.

Refactoring a dead-simple progress-bar function

2025-05-19T11:07:51+02:00

Published by marco on 19. May 2025 11:07:51 Updated by marco on 30. May 2025 08:35:28 ------------------------------------------------------------------------ I just saw a neat code example from "a Dutch government project" (function starting at line 182). The commenter who posted it at "Reddit" wrote, "Some people laughed at it and suggested all kind of clever one liners to replace it, but to me, that if statement is perfect. The intent is immediately clear and bugs are easy to spot. This is the kind of code you want in critical apps." Here's the code.

    private static string GetPercentageRounds(double percentage)
    {
        if (percentage == 0) return "⚪⚪⚪⚪⚪⚪⚪⚪⚪⚪";
        if (percentage > 0.0 && percentage <= 0.1) return "🔵⚪⚪⚪⚪⚪⚪⚪⚪⚪";
        if (percentage > 0.1 && percentage <= 0.2) return "🔵🔵⚪⚪⚪⚪⚪⚪⚪⚪";
        if (percentage > 0.2 && percentage <= 0.3) return "🔵🔵🔵⚪⚪⚪⚪⚪⚪⚪";
        if (percentage > 0.3 && percentage <= 0.4) return "🔵🔵🔵🔵⚪⚪⚪⚪⚪⚪";
        if (percentage > 0.4 && percentage <= 0.5) return "🔵🔵🔵🔵🔵⚪⚪⚪⚪⚪";
        if (percentage > 0.5 && percentage <= 0.6) return "🔵🔵🔵🔵🔵🔵⚪⚪⚪⚪";
        if (percentage > 0.6 && percentage <= 0.7) return "🔵🔵🔵🔵🔵🔵🔵⚪⚪⚪";
        if (percentage > 0.7 && percentage <= 0.8) return "🔵🔵🔵🔵🔵🔵🔵🔵⚪⚪";
        if (percentage > 0.8 && percentage <= 0.9) return "🔵🔵🔵🔵🔵🔵🔵🔵🔵⚪";
        return "🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵";
    }

This is a cool example because it demonstrates how easy it is to understand the return value when you use a separate constant for each "progress bar" increment instead of using something like new string('🔵', 5), which, as we'll see below, doesn't even compile. Still, every condition after the first needlessly re-checks the lower bound that the previous step already guarantees.
At the very least, we can reduce it to the following:

    private static string GetPercentageRounds(double percentage)
    {
        if (percentage == 0) return "⚪⚪⚪⚪⚪⚪⚪⚪⚪⚪";
        if (percentage <= 0.1) return "🔵⚪⚪⚪⚪⚪⚪⚪⚪⚪";
        if (percentage <= 0.2) return "🔵🔵⚪⚪⚪⚪⚪⚪⚪⚪";
        if (percentage <= 0.3) return "🔵🔵🔵⚪⚪⚪⚪⚪⚪⚪";
        if (percentage <= 0.4) return "🔵🔵🔵🔵⚪⚪⚪⚪⚪⚪";
        if (percentage <= 0.5) return "🔵🔵🔵🔵🔵⚪⚪⚪⚪⚪";
        if (percentage <= 0.6) return "🔵🔵🔵🔵🔵🔵⚪⚪⚪⚪";
        if (percentage <= 0.7) return "🔵🔵🔵🔵🔵🔵🔵⚪⚪⚪";
        if (percentage <= 0.8) return "🔵🔵🔵🔵🔵🔵🔵🔵⚪⚪";
        if (percentage <= 0.9) return "🔵🔵🔵🔵🔵🔵🔵🔵🔵⚪";
        return "🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵";
    }

I would elect to go further, preserving the clarity in constants (or maybe a comment) to avoid repetition in the code. First, let's write a few tests with NUnit.

    [TestCase(0.00, "⚪⚪⚪⚪⚪⚪⚪⚪⚪⚪")]
    [TestCase(0.10, "🔵⚪⚪⚪⚪⚪⚪⚪⚪⚪")]
    [TestCase(0.11, "🔵⚪⚪⚪⚪⚪⚪⚪⚪⚪")]
    [TestCase(0.19, "🔵⚪⚪⚪⚪⚪⚪⚪⚪⚪")]
    [TestCase(0.20, "🔵🔵⚪⚪⚪⚪⚪⚪⚪⚪")]
    [TestCase(0.30, "🔵🔵🔵⚪⚪⚪⚪⚪⚪⚪")]
    [TestCase(0.40, "🔵🔵🔵🔵⚪⚪⚪⚪⚪⚪")]
    [TestCase(0.50, "🔵🔵🔵🔵🔵⚪⚪⚪⚪⚪")]
    [TestCase(0.60, "🔵🔵🔵🔵🔵🔵⚪⚪⚪⚪")]
    [TestCase(0.70, "🔵🔵🔵🔵🔵🔵🔵⚪⚪⚪")]
    [TestCase(0.80, "🔵🔵🔵🔵🔵🔵🔵🔵⚪⚪")]
    [TestCase(0.90, "🔵🔵🔵🔵🔵🔵🔵🔵🔵⚪")]
    [TestCase(1.00, "🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵")]
    public void TestBubbles(double percentage, string expectedOutput)
    {
        var actualOutput = GetPercentageRounds(percentage);
        Assert.That(actualOutput, Is.EqualTo(expectedOutput));
    }

First, let's just reflect that now it's the tests that document how the API works! That means that now we don't have to lean so hard on the examples being obvious in the code itself. We can optimize the code for maintainability because the test offers the high-level explanation. When I refactor this code, I'm not removing the in-code documentation -- I've simply moved it to the test. Next, let's give in to our refactoring instincts and see if a shorter formulation of the algorithm is also understandable. The algorithm is now:

1. Build constant buffers for empty and filled.
2. Calculate the portion of each of these buffers to include in the result (filledCount and emptyCount).
3. Copy the correct number of characters from the buffers using the C# range-operator.

    private static string GetPercentageRounds(double percentage)
    {
        const string empty = "⚪⚪⚪⚪⚪⚪⚪⚪⚪⚪";
        const string filled = "🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵";
        var filledCount = (int)Math.Floor(percentage * 10);
        var emptyCount = 10 - filledCount;
        return filled[..filledCount] + empty[..emptyCount];
    }

This doesn't work, though! The tests fail. For example, the test for 0.8 returns "🔵🔵🔵🔵⚪⚪" instead of "🔵🔵🔵🔵🔵🔵🔵🔵⚪⚪". What's going on? We would get another hint as to what is going on if we were to refactor the constant declarations to use each symbol only once. For example, I could create the string with a special constructor, as shown below.

    var empty = new string('⚪', 10);
    var filled = new string('🔵', 10);

This avoids repeating the symbol several times but it's probably also not as clear what's happening. It also no longer uses constants -- initialized once and stored in the app -- so we're allocating new strings each time. We could declare them as static fields so that they are allocated only once. However, that would also mean that we don't declare them locally in the method, which again decreases readability. On top of that, though, the second initialization doesn't even compile! [image] Strings are encoded in UTF-16 (the standard for .NET). [1] In this encoding, the "⚪" is represented with a single 16-bit code unit (one char), while "🔵" needs two code units (a surrogate pair), so it doesn't fit into a char literal. That knowledge, together with knowing that the range operator counts code units rather than visible symbols, explains why we only got half as many filled-in symbols as expected. Knowing this, we can revert to the original constants and fix the algorithm as follows (the change is the 2 * in the first range).
    private static string GetPercentageRounds(double percentage)
    {
        const string empty = "⚪⚪⚪⚪⚪⚪⚪⚪⚪⚪";
        const string filled = "🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵";
        var filledCount = (int)Math.Floor(percentage * 10);
        var emptyCount = 10 - filledCount;
        return filled[..(2 * filledCount)] + empty[..emptyCount];
    }

OK. Now it's working. We now have two questions:

1. Can we avoid the "hack" for UTF-16 in our calculation?
2. The code is now more maintainable; is the code still as understandable as before?

Let's tackle the first one. It turns out that there is a standard way of indexing by grapheme but you have to opt in to it by using a StringInfo object, which offers a method named SubstringByTextElements().

    private static string GetPercentageRounds(double percentage)
    {
        const string empty = "⚪⚪⚪⚪⚪⚪⚪⚪⚪⚪";
        const string filled = "🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵";
        var filledCount = (int)Math.Floor(percentage * 10);
        var emptyCount = 10 - filledCount;
        return new StringInfo(filled).SubstringByTextElements(0, filledCount)
            + new StringInfo(empty).SubstringByTextElements(0, emptyCount);
    }

Now our code is no longer making assumptions about how many code units represent our empty and filled symbols. But is it better? No. It is absolutely less legible than even the previous version. It also allocates two new StringInfo objects every time it executes. Is it even necessary? Also no. Why wouldn't it be necessary? In the general case, we have to stay flexible and make sure that we're extracting the correct number of graphemes (not characters), but we don't have a general case here. We have two constant strings in a known encoding. We know that each symbol in the empty string takes one code unit and that each symbol in the filled string takes two. These are constants. They will not change. We can make assumptions based on that. That means, after this little excursion, that we'll use our original version but we will also no longer consider it a hack. This takes us to the final point: is the new version more legible than the original?
I think that it is. At first blush, the original looks like it's very self-explanatory -- you can see how the progress bar is built -- but you also have many more points of logic to check to verify that it's actually working as expected. While you can use the test I've defined above to check all of the logic, there are many more conditions to check when something goes wrong. We measure the number of paths through a piece of logic as "cyclomatic complexity" . The lower the better. We have learned that, when you program in the original way, you may actually save time! The original formulation didn't have to concern itself with encodings because it wasn't slicing strings. The original programmer didn't even need to be aware that some characters are encoded with multiple bytes whereas others are encoded with a single byte. They didn't even have to know what a byte was at all! Food for thought. As almost always, there isn't a "best solution" for all situations. There is a solution that minimizes drawbacks for the given requirements, but not for all possible requirements. If one of the requirements were: the reader need not know what a byte is, then the original solution would be more appropriate. The final version below has lower cyclomatic complexity, uses constants to indicate what the result will actually look like, and explains its algorithm reasonably well, if you understand percentages. I've included a comment to explain why we double the number of bytes to select from filled. 
private static string GetPercentageRounds(double percentage)
{
    const string empty = "⚪⚪⚪⚪⚪⚪⚪⚪⚪⚪";
    const string filled = "🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵";

    var filledCount = (int)Math.Floor(percentage * 10);
    var emptyCount = 10 - filledCount;

    // Each 🔵 is two UTF-16 code units (a surrogate pair)
    return filled[..(2 * filledCount)] + empty[..emptyCount];
}

--------------------------------------------------------------------------------

[1] Yes, you can use the u8 suffix to make a "UTF-8" string literal, but that doesn't help because the symbols we want to use are multi-byte sequences in that encoding as well.
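Here's a minimal sketch for checking the counts behind the doubling in GetPercentageRounds. It assumes that the symbols are U+26AA and U+1F535 without any variation selectors, which is why it builds the strings from escapes rather than pasting the symbols. The variable names are mine and not from the original code.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Build the strings from escapes so that no invisible variation selectors sneak in.
var empty = string.Concat(Enumerable.Repeat("\u26AA", 10));      // white circle x 10
var filled = string.Concat(Enumerable.Repeat("\U0001F535", 10)); // blue circle x 10

// string.Length counts UTF-16 code units (char values), not symbols.
var emptyUnits = empty.Length;
var filledUnits = filled.Length;

// StringInfo counts text elements instead.
var emptyElements = new StringInfo(empty).LengthInTextElements;
var filledElements = new StringInfo(filled).LengthInTextElements;

// Slicing works on code units: three blue circles need six char values.
var threeFilled = filled[..(2 * 3)];

Console.WriteLine($"empty: {emptyUnits} units, {emptyElements} elements");
Console.WriteLine($"filled: {filledUnits} units, {filledElements} elements");
Console.WriteLine(threeFilled);
```

If the two unit counts differ by a factor of two while the element counts match, then the constant-based slicing is safe for exactly these two strings and nothing more general than that.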

An LLM use case with function-calling

2025-05-18T21:58:25+02:00

Published by marco on 18. May 2025 21:58:25 ------------------------------------------------------------------------ The article "Function calling using LLMs" by Kiran Prakash states, "It’s important to emphasize that when using function calling, the LLM itself does not execute the function. Instead, it identifies the appropriate function, gathers all required parameters, and provides the information in a structured JSON format. This JSON output can then be easily deserialized into a function call in Python (or any other programming language) and executed within the program’s runtime environment." This is an approach that works very well when you don't have a testing environment: build a plan, evaluate the validity of the plan, and then apply the plan after verification. You should also be able to slice the work into sub-tasks to make verification more reliable. This is the approach I took for a PowerShell script that runs against an ADOS instance: it's production data, so you really want to be sure what is going to be executed. In the implementation, you can see how the code he writes prepares the query to the LLM in a structured way with the required context in an attempt to guide the result. Happily, he begins by writing unit tests! [image] This is another good step-by-step example of working with an LLM, but for a different task: it's using an LLM as an interpreter for the user's input. It's basically a way of adding a natural-language "search-like" interface to an app without forcing the user to structure their input, without developing a UI, and without writing a parser. The advantage is that you get a way of querying a potentially large API surface in a way that is more amenable to more users. I think of an example from Markus Schenkel from Cudos, who talked about using an MCP plugin for working with a CAD/CAM program -- apps that notoriously have dozens of toolbars and thousands of functions.
He could formulate his "novice" request as text, and the LLM, together with the mapping to tool functionality, made relatively good guesses about what he was trying to do. It often took a few attempts -- but he was able to accomplish his task, whereas he would have either given up or had to invest a lot more time to get it done otherwise. I think this is great for products that are in proof-of-concept stage, so that you don't iterate on UIs too early in the design process. But we also have to be aware that we have UIs for a reason. Once there's a well-established set of use cases and functionality, then it's unclear that making users continue to use a command-line interface where they compose text is better than a GUI. At any rate, the article is filled with detail and code (in Python) for using an LLM in the way described above. There's a section on refactoring at the end, a comparison to the rules-engine-based approach that this technique seeks to replace, and also a comparison of function-calling with MCP.
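The article's code is in Python, but the "plan, verify, then apply" idea is easy to sketch in C# as well. The following is a minimal sketch under loud assumptions: the JSON shape (a name plus arguments), the tool names add_to_cart and search_products, and the Dispatch helper are all hypothetical and not taken from the article or from any vendor's API. No LLM is called; the modelOutput string stands in for what a model might return.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical tool registry: function name -> handler that receives the JSON arguments.
var tools = new Dictionary<string, Func<JsonElement, string>>
{
    ["add_to_cart"] = args =>
        $"Added {args.GetProperty("quantity").GetInt32()} x {args.GetProperty("product").GetString()}",
    ["search_products"] = args =>
        $"Searching for '{args.GetProperty("query").GetString()}'",
};

// Verify the structured call first; only execute it if the function is known.
string Dispatch(string json)
{
    using var doc = JsonDocument.Parse(json);
    var root = doc.RootElement;
    var name = root.GetProperty("name").GetString() ?? "";

    if (!tools.TryGetValue(name, out var handler))
    {
        return $"Rejected: unknown function '{name}'";
    }

    return handler(root.GetProperty("arguments"));
}

// Stand-in for the model's structured answer (assumed shape).
var modelOutput = "{\"name\":\"add_to_cart\",\"arguments\":{\"product\":\"coffee\",\"quantity\":2}}";
Console.WriteLine(Dispatch(modelOutput)); // prints "Added 2 x coffee"
```

The point of the lookup table is that the program, not the model, decides what can run: anything that isn't registered is rejected before any side effect happens.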

A subtle failure to pattern-match null in C#

2025-05-18T13:58:53+02:00

Published by marco on 18. May 2025 13:58:53 ------------------------------------------------------------------------ [image] The article "The null check that didn't check for nulls" by Oren Eini points out an interesting and subtle difference in code-generation, depending on whether you use the var keyword. Using var in pattern-matching might lead to a pattern that looks like it checks for null but doesn't. You can see and play with a "live example" but I've replicated the examples below. This is the problematic example:

string Test1(List<string> strs)
{
    if (strs is [var s])
    {
        return s;
    }

    return string.Join(",", strs);
}

It's basically saying that the pattern should match anything that's a collection with one element. Since the type is obvious from the method signature's parameter strs, we use var instead of string. That generates the following code.

internal static string <<Main>$>g__Test1|0_0(List<string> strs)
{
    if (strs != null && strs.Count == 1)
    {
        return strs[0];
    }

    return string.Join(",", strs);
}

Note that it returns the first element without checking it for null. If you change the var to string, which, as noted above, is redundant, then the generated code includes a null-check.

string Test2(List<string> strs)
{
    if (strs is [string s])
    {
        return s;
    }

    return string.Join(",", strs);
}

This is the generated code for the example above.

internal static string <<Main>$>g__Test2|0_1(List<string> strs)
{
    if (strs != null && strs.Count == 1)
    {
        string text = strs[0];
        if (text != null)
        {
            return text;
        }
    }

    return string.Join(",", strs);
}

If you instead use { } to indicate that you want to match a non-null object, then you also get the null-check.

string Test3(List<string> strs)
{
    if (strs is [{} s])
    {
        return s;
    }

    return string.Join(",", strs);
}

This is the generated code for the example above. It is the same as the second example that uses string for the matched parameter.

internal static string <<Main>$>g__Test3|0_2(List<string> strs)
{
    if (strs != null && strs.Count == 1)
    {
        string text = strs[0];
        if (text != null)
        {
            return text;
        }
    }

    return string.Join(",", strs);
}
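The difference is easy to observe at runtime as well. This is a small sketch of my own, assuming C# 11 or later for list patterns; the helper names WithVar, WithType, and WithBraces are hypothetical and only mirror the three variants above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// var pattern: matches any single element, including null.
string? WithVar(List<string?> strs) => (strs is [var s]) ? s : "no match";

// Type pattern: matches only a non-null string.
string? WithType(List<string?> strs) => (strs is [string s]) ? s : "no match";

// Property pattern { }: also matches only a non-null element.
string? WithBraces(List<string?> strs) => (strs is [{ } s]) ? s : "no match";

var input = new List<string?> { null };

Console.WriteLine(WithVar(input) is null); // prints "True": the null element matched
Console.WriteLine(WithType(input));        // prints "no match"
Console.WriteLine(WithBraces(input));      // prints "no match"
```

A list with a single null element is the only input for which the three variants disagree, which is what makes the var version so easy to overlook.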

Zed shows off how to work with AI agents

2025-05-18T13:53:10+02:00

Published by marco on 18. May 2025 13:53:10 ------------------------------------------------------------------------ The article "Zed: The Fastest AI Code Editor" by Richard Feldman includes a great description and video that shows off the behavior of their new "agent" feature. "The entire Zed code editor is open source under GPL version 3, and scratch-built in Rust all the way down to handcrafted GPU shaders and OS graphics API calls. Zed's new AI capabilities are also open-source, just like the rest of the editor, so you can see exactly what the new Agent Panel is doing under the hood." This editor is very, very smooth and more powerful than a standard Visual Studio Code. It's also so much faster. However ... it's currently macOS and Linux-only. The "Windows version is in an early-access phase". [image] Even if you can't use the editor, the ~5-minute video at the beginning of the post is absolutely what I've been looking for: how do you use these tools for real? The video demonstrates the following:

* Using the inline-completion to make small edits
* Using the chat-AI agent (the tool has access to many actions in the editor).
* Running a larger request/action against a large code base (they use the code of Zed itself, written in Rust).
  * The request is to make the number of most recently used values in a list configurable via settings.
  * The settings object already exists.
  * The list already exists.
  * It's hard-coded to six elements right now.
* Viewing the steps taken in running the request.
* Reviewing and adjusting the proposed changes. "The diff is fully editable, so you can easily make changes to whatever the model came up with. It supports multicursor editing, language server integrations, and all the speed you love from the rest of Zed."
* Noting that one of the proposed changes is something that even a senior developer might have forgotten to do in a first attempt at the feature (updating settings
* Final review in a Git diff.
This is hands-down the best demonstration I've seen of extending a workflow comprising what the author nicely describes as deterministic tools -- I've been calling them analysis-based tools -- with AI-based tools (and agents). The section on "costs" is remarkably fair and open.

How pattern-matching in C# is lowered

2025-04-27T11:53:33+02:00

Published by marco on 27. Apr 2025 11:53:33 ------------------------------------------------------------------------ [image] A while back, I wrote "Stop trying so hard to use pattern-matching". I stand by everything I wrote there. I was recently mentoring a very clever programmer who's new to C# but has cut his teeth on Rust. We were discussing switch statements vs. switch expressions. Which pattern-matching features can you use where? Which features can you combine? The article "Tutorial: Use pattern matching to build type-driven and data-driven algorithms" offers a good introduction. Pattern-matching on objects is lovely (and it's been available since "C# 7.0" [1] (2017)). The version they were using still used "switch statements". There's another level called "switch expressions" (available since C# 8 (2019); "C# 9" (2020) added relational and logical patterns) that they could have used if they were returning a value. The article "C# 9.0: Pattern Matching in Switch Expressions" by Thomas Claudius Huber provides the following example,

string favoriteTask = obj switch
{
    Developer dev when dev.YearOfBirth == 1980 => $"{dev.FirstName} listens to metal",
    Developer dev => $"{dev.FirstName} writes code",
    Manager _ => "Create meetings",
    _ => "Do what objects do",
};

Speaking of syntactic sugar, you can check out what the compiler would generate using "SharpLab.IO". Throw in any compiling code on the left, and you get the "lowered" version on the right.
If you throw in the example from above with a bit of extra code to make it compile, using System; public class C { public void M(object obj) { string favoriteTask = obj switch { Developer { YearOfBirth: >= 1980 and <= 1989 and not 1984 } dev => $"{dev.FirstName} listens to heavy metal while coding", Developer dev => $"{dev.FirstName} writes code", Manager _ => "Create meetings", _ => "Do what objects do", }; } private class Developer { public int YearOfBirth { get; } public string FirstName { get; } = string.Empty; } private class Manager { } } You can see that the generated logic is quite straightforward. The snippet below elides the generated code for the Developer and Manager classes. It's not how I would have written it manually, but I bet it's pretty efficient. [NullableContext(1)] public void M(object obj) { Developer developer = obj as Developer; string text; if (developer == null) { text = ((!(obj is Manager)) ? "Do what objects do" : "Create meetings"); } else { int yearOfBirth = developer.YearOfBirth; if (yearOfBirth >= 1980 && yearOfBirth <= 1989 && yearOfBirth != 1984) { Developer developer2 = developer; text = string.Concat(developer2.FirstName, " listens to heavy metal while coding"); } else { text = string.Concat(developer.FirstName, " writes code"); } } string text2 = text; } -------------------------------------------------------------------------------- [1] We're on "C# 13" in April 2025, with "C# 14" in the works for November, 2025.
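To convince yourself that a switch expression and its lowered form really are the same logic, you can write both and compare them. This is a sketch of my own that uses built-in types instead of the Developer and Manager classes; Classify and ClassifyByHand are hypothetical names, and the hand-written version only approximates what the compiler emits.

```csharp
using System;

// Switch expression with type, relational, and logical patterns (C# 9 or later).
string Classify(object obj) => obj switch
{
    int and >= 1980 and <= 1989 and not 1984 => "eighties",
    int => "other year",
    string => "text",
    _ => "something else",
};

// Roughly what the lowering boils down to: plain type tests and comparisons.
string ClassifyByHand(object obj)
{
    if (obj is int)
    {
        int year = (int)obj;
        return (year >= 1980 && year <= 1989 && year != 1984) ? "eighties" : "other year";
    }

    return (obj is string) ? "text" : "something else";
}

foreach (var value in new object[] { 1985, 1984, 2000, "x", 3.5 })
{
    Console.WriteLine($"{value}: {Classify(value)} / {ClassifyByHand(value)}");
}
```

The switch expression reads like a table, whereas the hand-written version makes the order of the checks explicit; which one is more legible depends on how many arms there are.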

Angular is pretty specialized

2025-04-21T11:25:50+02:00

Published by marco on 21. Apr 2025 11:25:50 ------------------------------------------------------------------------ [image]I recently had a conversation about the pros and cons of using Angular and I found this year-old article that I'd prepared from my notes but never published. The article "Two-way binding between Signals and Query Params" by Julio Castro includes the following code snippet. @Component({ selector: "app-root", standalone: true, imports: [AsyncPipe], template: `

Signals Demo

Your first name is: {{ firstName$ | async }}

`, }) export class AppComponent { private activatedRoute = inject(ActivatedRoute); firstName$ = this.activatedRoute.queryParams.pipe( map((allQueryParams) => allQueryParams["firstName"]) ); } I cannot describe how gross I think Angular code is. None of this is "using the platform". This is all JavaScript-first. It's all custom, untyped, string-matching, gobbledygook. The firstName$ in the template isn't checked. There are no type-safe views. [1] What the hell does | async do? I'm sure it's convenient, but this is more obtuse-looking than modern React. The injection is also just magic that you have to know about. And why is it injected differently than the AsyncPipe? There are probably good reasons for it, in Angular, but it looks pretty slapdash and ad hoc as an API. It's like there's a different symbol or character or concept for every possible thing. The imports is in a custom place. It's all packed into a @Component decorator that does a bunch of magic for you to build what is probably a web component (but I'm not sure). They wrapped every single possible API in something custom to Angular. I hope I'm wrong, but this is so unappealing. Reading a bit further, we see an example where some of the noise -- e.g., the async pipe -- has been removed because of the magic of signals. If you're learning Angular, you're not learning anything portable about web-programming. You won't know HTML, you won't know CSS, you won't even necessarily know JavaScript or the browser APIs. You don't use the platform. It's a shame because the platform is already so powerful. In the old days, you needed a framework to shield you from the differences. Nowadays, the platform is well enough specified, supported, and implemented to just write to directly. Learning the platform API is just as easy as learning whatever I'm seeing in Angular.
In fairness to the article, though, it's well-written and offers some good techniques for making the best of a bad situation if you have to work in Angular. 🙃 But then there's this. "Since we are accessing the value of the allQueryParams signal in the effect, it will run every time this signal gets updated, which happens every time Angular emits a new value in the activatedRoute.queryParams observable. "Inside the effect, we are just updating the value of our queryParamValue signal. For that, notice that we need to pass the allowSignalWrites: true option. This is necessary because updating signals in effects could lead to infinite loops and unexpected and intricate situations in general." This is the same kind of black magic for real-life situations as you see in React these days. I'm still a fan of using MobX for the state model, then attaching it to pure reactive web components. I'm still deciding whether that will scale to what I need, but I'm more and more convinced that none of the huge frameworks are the way to go. They're just so much wrapping and bizarre APIs that feel legacy before they're even officially released. I'm not going to copy it in here but the final version of the read/write signal service based on query-parameter values is 41 lines of hairball code. Do I know how much code it would be to achieve something similar outside of Angular? No. No, I don't. I just know that if I ended up having to learn how to do it and write it -- and even if it ended up being more code -- I would have learned the general platform and built a service that can work in any web site, not just one framework. -------------------------------------------------------------------------------- [1] A colleague confirmed for me just last week that this is still the case, even in the latest versions of Angular.

Balancing user experience and performance in a web page

2025-04-21T11:17:49+02:00

Published by marco on 21. Apr 2025 11:17:49 ------------------------------------------------------------------------ [image]This video is just under 30 minutes and provides a lot of useful tips about how to optimize web pages. It's almost a year old, but a lot of the optimizations are good to know, even though they won't apply to most pages out there. It's good to know how the browser works and which heuristics it uses to determine what can be optimized. Knowing these things helps you avoid accidentally formulating your web pages in ways that slow things down unnecessarily. You'll be less likely to suffer under load. [media] The author goes through several optimizations. * He starts by showing how to turn on the "mid-tier mobile" testing mode, which uses fast 3G and slows the CPU down 4x. This makes it easier to spot problems on a developer-class desktop/laptop. * He then shows how to set up and use the profiler, zooming in and out of the extremely rich data recorded for every interaction. * He discovers and removes a polyfill that's no longer needed. It turns out that that version of polyfill was broken and always active -- regardless of whether the feature was supported natively. * Another fix was to remove the background blur when making an element "inert" because it was engaging the GPU and causing a much longer paint when the browser had to animate the drop-shadow moving across the blurred element. * Another fix involved simply moving an interaction away from the initial event handler by executing it in a timer instead. He used a dead-simple debouncing technique to ensure that only the most recent task would be executed. * Another fix was to remove complex logic for avoiding setting the display property on a DOM element. The solution there is to simply let the browser do its thing; it's much more optimized than you think. The code that was trying to avoid touching the DOM was much slower than actually setting a DOM property. 
* Another fix was to defer and chunk appending results as well as setting styling for found terms by using async.

Swift protocol extensions for C#

2025-04-21T11:06:20+02:00

Published by marco on 21. Apr 2025 11:06:20 ------------------------------------------------------------------------ [image] Since this feature is being touted for C# 14 -- this time it's coming for real! -- I thought it would be good to refresh what I'd already learned about it. The title is a bit hyperbolic but it's quite an interesting feature. It's basically protocol extensions from Swift for C#. It's .NET's answer to extending extension methods to properties and, probably, operators. You can't add state, as far as I can tell. But that isn't so surprising. The video below discusses the proposal as it looked for C# 13. The pages "What's new in C# 14: Extension members" and "Extension declaration (C# Reference)" offer more insight into the current shape of the feature. It seems like they mean it this time and that it will land in November 2025. [media] What it primarily is, though, is further work on making it easier to transition APIs. We got the first batch of support with default interface implementations. This feature will allow us to smooth migrations even more. It will also allow us to "add" properties to types that then introduce their own version of those properties in future versions but that's OK, I think. It means that every added property will be a potential breaking change for someone but maybe it will make us start categorizing breaking changes. There are implicit extensions, which are pretty much a new way of defining extension methods, but with support for properties. The following example shows how the property IsLead will be available for any Person without modifying that type. This doesn't seem much different than existing extension methods, other than support for properties, where the this keyword stands in for the parameter that would otherwise have been passed in a classic extension method.
public implicit extension PersonExtension for Person
{
    public bool IsLead => this.Organization
        .Teams
        .Any(team => team.Lead == this);
}

There are also explicit extensions, which are a way of specifying extensions to types that are neither implemented nor inherited, but are instead given to a type without coercion. That is, you can define a type that can be applied to another type (e.g., Lead for Person in the example below), which makes more methods and properties available. It's kind of confusing without an example.

public explicit extension Lead for Person
{
    public IEnumerable<Team> Teams => this.Organization
        .Teams
        .Where(team => team.Lead == this);
}

var person = new Person();
var personTeams = person.Teams; // Compile error

Lead lead = person;
var leadTeams = lead.Teams; // OK

While this might look like a cast, it's not, because Person doesn't implement Lead -- it's extended by Lead in code that isn't necessarily associated with the code that defines Person. In the latest syntax, the first example (IsLead) would look a bit different but the idea is the same. It looks like they've simplified it a bit. Note that the receiver is now the named parameter source rather than this.

public static class PersonExtension
{
    extension(Person source)
    {
        public bool IsLead => source.Organization
            .Teams
            .Any(team => team.Lead == source);
    }
}

* There's a talk called "What’s new in C# 13" by Mads Torgersen and Dustin Campbell for which you have to register. The video will probably come out later on YouTube, though.
* There's the original "[Proposal]: Extensions #5497"
* The most informative link (so far) is ".NET Announcements and Updates from Microsoft Build 2024" by .NET Team

A good intro to .NET Aspire from the 2024 Build Conference

2025-04-21T10:54:46+02:00

Published by marco on 21. Apr 2025 10:54:46 ------------------------------------------------------------------------ This is another 46-minute, 10-month-old video from the last Build conference that I found extremely helpful in explaining what .NET Aspire is and what it's good for. [media] Damian Edwards and David Fowler do a soup-to-nuts demonstration of Aspire. It basically lets you configure your multi-project, distributed applications with code rather than with YAML (e.g. docker-compose.yml). Instead, it writes the files for you and handles the deployment to Docker. This lets you much more easily create and configure things like email servers (for registration workflows), queues, databases, etc. Some of the resources run in Docker containers, some run on Azure if you want. There is a dashboard with deep telemetry, with very nice graphs showing how each service participates in a given request.

Toub and Hanselman at the Build Conference 2024

2025-04-21T10:46:39+02:00

Published by marco on 21. Apr 2025 10:46:39 ------------------------------------------------------------------------ This 46-minute presentation by Scott Hanselman and Stephen Toub is ten months old but is still worth watching. I note below that one of the more significant things Toub shows is not any sort of programming wizardry, but column-selection in a text editor. Half of the things that people use AI for can be solved with column-select and judicious copy/paste. [media] Another fantastic "deep dive" with these two: this time they're optimizing the Humanizer library on-the-fly, on-stage, during a session. This feels nearly completely improvised. Kudos to these two gen-Xers, doing an old-school presentation of just plain programming bravura with no frills. At 38:20, Toub shows how to use column-select to make changes, which wows the audience. I guess it's really not such a well-known feature, but it's an incredible productivity booster. Toub uses the mouse to select when he could have just used the keyboard to select the lines with Alt + Shift + down or by selecting the space and starting double-quote, then Alt + Shift + . to select subsequent matches. After that, he used the mouse again to select the end of the lines, but he could have just left the lines selected from before and hit End to jump to the end of all the lines. It's good that he showed it but, as in previous videos, he's a bit more of a "mouser" than I am.

Getting Docker in the path on macOS

2025-04-21T08:15:12+02:00

Published by marco on 21. Apr 2025 08:15:12 ------------------------------------------------------------------------ I couldn't call Docker from the command line. I had installed Docker a long time ago, but had just restored from a Time Machine backup, so my system was new but the applications had been restored. That meant that Docker had recorded that the executables had been sym-linked to the right folder (/usr/local/bin) but those links were part of the old, dead system. Long story short, go to the settings, as shown below. If you're in the situation that I was in, in which the app was out of sync with the system, then you'll have to change something to get the app to apply the settings again. [image] You can try to simply switch from User to System, click Apply & Restart but that didn't work for me. It never asked for my password or confirmation. However, I found the instructions in this answer to "docker-compose command not available for mac" , which gave the tip to change other things on the page to force Docker to make the update. "I fixed it by going to Docker app -> settings -> Advanced "In advanced I had to check off:" * Allow the default Docker socket to be used (requires password) * Allow privileged port mapping (requires password) "Click Apply & Restart "And enable again options:" * Allow the default Docker socket to be used (requires password) * Allow privileged port mapping (requires password) "Click Apply & Restart" If that still doesn't work, then check whether you actually have a /usr/local/bin folder. Docker fails silently when the folder doesn't exist. If you don't have it, use an administrator and sudo to create the folder and try the procedure above again. That did the trick for me. If you'd like to link the tools to your user's path ($HOME/.docker/bin) instead, then follow the instructions in this answer to "Setting PATH environment variable in OSX permanently" that show you how to modify the paths for your system.

Toub's 234-page tour-de-force on performance in .NET 9

2025-04-20T23:03:54+02:00

Published by marco on 20. Apr 2025 23:03:54 ------------------------------------------------------------------------ [image]The articlebook "Performance Improvements in .NET 9" by Stephen Toub was published about six months ago. It contains a tremendous amount of interesting information, which I've attempted to summarize below, following the document structure in the original. [Tier 0] "Another tier 0 boxing example is dotnet/runtime#90496. There’s a hot path method in the async/await machinery: AsyncTaskMethodBuilder.AwaitUnsafeOnCompleted (see "How Async/Await Really Works in C#" for all the details). It’s really important that this method be optimized well, but it performs various type tests that can end up boxing in tier 0. In a previous release, that boxing was deemed too impactful to startup for async methods invoked early in an application’s lifetime, so [MethodImpl(MethodImplOptions.AggressiveOptimization)] was used to opt the method out of tiering, such that it gets optimized from the get-go. But that itself has downsides, because if it skips tiering up, it also skips dynamic PGO, and thus the optimized code isn’t as good as it possibly could be. So, this PR specifically addresses those type tests patterns that box, removing the boxing in tier 0, enabling removing that AggressiveOptimization from AwaitUnsafeOnCompleted, and thereby enabling better optimized code generation for it." [Loops] "In .NET 8, as part of the work to improve dynamic PGO, a more powerful graph-based loop analyzer was added that was able to recognize many more loops. For .NET 9 with dotnet/runtime#95251, that analyzer was factored out so that it could be used for generalized loop reasoning. 
And then with PRs like dotnet/runtime#96756 for loop alignment, dotnet/runtime#96754 and dotnet/runtime#96553 for loop cloning, dotnet/runtime#96752 for loop unrolling, dotnet/runtime#96751 for loop canonicalization, and dotnet/runtime#96753 for loop hoisting, many of these loop-related optimizations have now been moved to the better scheme. All of that means that more loops get optimized." [ARM SVE] "There are multiple ways such an ISA impacts .NET, and in particular the JIT. The JIT needs to be able to be able to work with the ISA, understand the associated registers and be able to do register allocation, be taught about encoding and emitting the instructions, and so on. The JIT needs to be taught when and where it’s appropriate to use these instructions, so that as part of compiling IL down to assembly, if operating on a machine that supports SVE, the JIT might be able to pick SVE instructions for use in the generated assembly. And the JIT needs to be taught how to represent this data, these vectors, to user code. All of that is a huge amount of work, especially when you consider that there are thousands of operations represented. What makes it even more work is hardware intrinsics." "Designing and enabling the SVE support is a monstrous, multi-year effort, and while the support is functional and folks are encouraged to take it for a spin, it’s not yet baked enough for us to be 100% confident the shape won’t need to evolve (for .NET 9, it’s also restricted to hardware with a vector width of 128 bits, but that restriction will be removed subsequently). Hence, [Experimental]." [AVX512] "So the values are 0 1 1 0 1 0 0 0, which we read as the binary 0b01101000, which is 0x68. That byte is used as a “control code” to the vpternlog instruction to encode which of the 256 possible truth tables that exist for any possible (deterministic) Boolean combination of those inputs is being chosen. 
This PR then teaches the JIT how to analyze the tree structures produced by the JIT to recognize such sequences of Boolean operations, compute the control code, and substitute in the use of the better instruction. Of course, the JIT isn’t going to do the enumeration I did above; turns out there’s a more efficient way to compute the control code, performing the same sequence of operations but on specific byte values instead of Booleans." "This is beneficial for a variety of reasons, including less data to store, less data to load, and if the register containing this state needed to be spilled (meaning something else needs to be put into the register, so the value currently in the register is temporarily stored in memory), reloading it is similarly cheaper." All the considerations are mind-boggling. Does it fit in a cache line? How many registers does it use? Is it colocated with similar data? Is the data aligned on a boundary? [Vectorization] "Of course, you may then wonder, why wasn’t bool.TryFormat reverted to use the simpler code? The unfortunate answer is that this optimization only currently applies to array targets rather than span targets. That’s because there are alignment requirements for performing these kinds of writes, and whereas the JIT can make certain assumptions about the alignment of arrays, it can’t make those same assumptions about spans, which can represent slices of something else at unaligned boundaries. This is now one of the few cases where arrays are better than spans; typically span is as good or better. But I’m hopeful it will be improved in the future." [Object Stack Allocation] "The hardest part of stack allocating objects is ensuring that it’s safe. If a reference to the object were to escape and end up being stored somewhere that outlived the stack frame containing the stack-allocated object, that would be very bad; when the method returned, those outstanding references would be pointing to garbage. 
So, the JIT needs to perform escape analysis to ensure that never happens, and doing that well is extremely challenging." This is the case where you have to be exceedingly clever in order to not have to let pessimism kill the feature entirely. That is, if you can't prove enough, then you end up having to assume that escape is possible in too many cases -- and the optimization ends up applying much less than you'd hoped it would. [VM] "The .NET runtime provides many services to managed code. There’s the GC, of course, and the JIT compiler, and then there’s a whole bunch of functionality around things like assembly and type loading, exception handling, configuration management, virtual dispatch, interop infrastructure, stub management, and so on. All of that functionality is generally referred to as being part of the coreclr virtual machine (VM)." [Mono] "We frequently say “the runtime,” but in reality there are currently multiple runtime implementations in .NET. “coreclr” is the runtime thus far referred to, which is the default runtime used on Windows, Linux, and macOS, and for services and desktop applications, but there’s also “mono,” which is mainly used when the form factor of the target application requires a small runtime: by default, it’s the runtime that’s used when building mobile apps for Android and iOS today, as well as the runtime used for Blazor WASM apps." "[...] when targeting WASM, the interpreter has a form of PGO where after methods have been invoked some number of times and are deemed important, it’ll generate WASM on-the-fly to optimize those methods. This tiering gets better in .NET 9 with dotnet/runtime#92981, which enables keeping track of which methods tiered up, and if the code is running in a browser, storing that information in the browser’s cache for subsequent runs. When the code then runs subsequently, it can incorporate the previous learnings to tier up better and more quickly." 
[Threading / Debugger.NotifyOfCrossThreadDependency] "When you’re debugging a .NET process and you break in the debugger, it pauses all threads in the debuggee process so that nothing is making forward progress while you examine state. However, .NET debuggers, like the one in Visual Studio, support invoking properties and methods in the debuggee while debugging. That can be a big problem if the functionality being invoked relies on one of those paused threads to do something, e.g. if the property you access tries to take a lock that’s held by another thread or tries to Wait on a Task. To mitigate problems here, the Debugger.NotifyOfCrossThreadDependency method exists. Functionality that relies on another thread to do something can call NotifyOfCrossThreadDependency; if there’s no debugger attached, it’s a nop, but if there is a debugger attached, this signals the problem to the debugger, which can then react accordingly. The Visual Studio debugger reacts by stopping the evaluation but then by offering an opt-in option of “slipping” all threads, unpausing all threads until the evaluated operation completes, at which point all threads will be paused again, thereby again trying to mitigate any problems that might occur from the cross-thread dependency." [VM] "The "official .NET memory model" has now been documented. However, some of the practices that were being employed in the core libraries (due to defensive coding or uncertainty of the memory model or out-of-date requirements) are no longer necessary. One of the main tools available for folks coding at a level where memory model is relevant is the volatile keyword / the Volatile class." "Marking fields or operations as volatile can come with an expense, depending on the circumstance and the target platform. For example, it can restrict the C# compiler and the JIT compiler from performing certain optimizations." 
[Reflection] "Delegates in .NET are “multicast,” meaning a single delegate instance might actually represent multiple methods to be invoked; this is how .NET events are implemented. If I invoke a delegate, the delegate implementation handles invoking each constituent method, sequentially, in turn. But what if I want to customize the invocation logic? Maybe I want to wrap each individual method in a try/catch, or maybe I want to track the return values from all of the methods rather than just the last, or some such behavior. To achieve that, delegates expose a way to get an array of delegates, one for each method that’s part of the original." There are a lot of long chapters on number- and text-processing, which are fascinating but not eminently quotable. You can really see how so many of the various improvements build on each other to finally offer incredible speed improvements (e.g. Quaternion.Cosh()). So many operations have been improved to reduce allocations to zero while reducing time to a few percent of the previous time, often with even more code defined in C# rather than in the JIT as native code (see "Move memset/memcpy helpers to managed impl #98623" for an extreme example that touched 68 files in 48 commits). I find this to be quite elegant. It shows that the investment in the new C# constructs is paying off because it allows framework developers to build faster and better primitives without escaping to a different language and runtime. This, in turn, allows other skilled developers to benefit from the same. Not only that, but managed code is accessible to the GC whereas native code is not. It's very clear how .NET and C# are being positioned to take over numeric and text processing from Python and C++/C. Everything is being made more generic and funneled to vectorized types, which, in turn, map to the optimal set of instructions for the myriad supported scenarios, like AOT, ARM, WASM, x64, x86, etc. It's quite an incredible effort.
All of these things combine to make your regular expressions and text searches faster, even if you stick to the existing APIs. In some cases, there are new APIs to use, but not too many. Instead, the beauty of .NET 9 is that it will just make everything so much more efficient -- faster and with fewer allocations and GC churn -- without programmers having to do a thing. A true feat of engineering. "[...] it’s important to recognize that many of the changes discussed thus far implicitly accrue to Regex. Regex already uses SearchValues, and so improvements to SearchValues benefit Regex (it’s one of my favorite things about working at the lowest levels of the stack: improvements there have a multiplicative effect, in that direct use of them improves, but so too does indirect use via intermediate components that instantly get better as the lower level does)." [DFA Limits] There is a ton of detail about the specifics of regular-expression optimization -- enough to make your head spin. Like this: "The non-backtracking implementation works by constructing a finite automata, which can be thought of as a graph, with the implementation walking around the graph as it consumes additional characters from the input and uses those to guide what node(s) it transitions to next. The graph is built out lazily, such that nodes are only added as those states are explored, and the nodes can be one of two kinds: DFA (deterministic) or NFA (non-deterministic). DFA nodes ensure that for any given character that comes next in the input, there’s only ever one possible node to which to transition. Not so for NFA, where at any point in time there’s a list of all the possible nodes the system could be in, and moving to the next state means examining each of the current states, finding all possible transitions out of each, and treating the union of all of those new positions as the next state. 
DFA is thus much cheaper than NFA in terms of the overheads involved in walking around the graph, and we want to fall back to NFA only when we absolutely have to, which is when the DFA graph would be too large: some patterns have the potential to create massive numbers of DFA nodes. Thus, there’s a threshold where once that number of constructed nodes in the graph is hit, new nodes are constructed as NFA rather than DFA. In .NET 8 and earlier, that limit was somewhat arbitrarily set at 10,000. For .NET 9 as part of this PR, analysis was done to show that a much higher limit was worth the memory trade-offs, and the limit was raised to 125,000, which means many more patterns can fully execute as DFA." "The inner matching loop is the hot path for a matching operation: read the next character, look up its minterm, follow the corresponding edge to the next node in the graph, rinse and repeat. Performance of the engine is tied to efficiency of this loop. These PRs recognized that there were some checks being performed in that inner loop which were only relevant to a minority of patterns. For the majority, the code could be specialized such that those checks wouldn’t be needed in the hot path." [Span, Span, and more Span] "The introduction of Span and ReadOnlySpan back in .NET Core 2.1 have revolutionized how we write .NET code (especially in the core libraries) and what APIs we expose (see "A Complete .NET Developer’s Guide to Span" if you’re interested in a deeper dive.) .NET 9 has continued the trend of doubling-down on spans as a great way to both implicitly provide performance boosts and also expose APIs that enables developers to do more for performance in their own code." "One of the really nice optimizations the C# compiler added several years back was the ability to recognize when a new byte/sbyte/bool array was being constructed, filled with only constants, and directly assigned to a ReadOnlySpan. 
In such a case, it would recognize that the data was all blittable and could never be modified, so rather than allocating an array and wrapping a span around it, it would blit the data into the assembly and then just construct a span around a pointer into the assembly data with the appropriate length." This is a wonderful optimization. Clever in a way that only a systems programmer would invent. foreach (Range r in clientSecWebSocketProtocol.AsSpan().Split(',')) { if (clientSecWebSocketProtocol.AsSpan(r).Trim().Equals(acceptProtocol, StringComparison.OrdinalIgnoreCase)) { return true; } } "In doing so, it becomes allocation-free, as this Split doesn’t need to allocate a string[] to hold results and doesn’t need to allocate a string for each segment: instead, it’s returning a ref struct enumerator that yields a Range representing each segment. The caller can then use that Range to slice the input. It’s yielding a Range rather than, say, a ReadOnlySpan, to enable the splitting to be used with original sources other than spans and be able to get the segments in the original form." There is such a strong focus on structs and refs to make allocation-free code. And now we see how they leverage the recently introduced Range to provide indexes into a sequence that the calling code can decide how to extract. This offers maximum flexibility to the caller, as the algorithm isn't making any costly decisions for it. In this case, he's discussing how they've made it relatively easy and intuitive to write code that searches a string without any allocations. The sequence doesn't allocate, examining the chunk as a span doesn't allocate, even the Trim() on a Span doesn't allocate anything. [LINQ] There is a long chapter on LINQ optimizations that boils down to having cleaned up a ton of internal implementation to consolidate on a common base class for custom iteration combinations like Where/First, Where/OrderBy, etc.
Instead of testing for interfaces, it can now test for a single base class and perform a virtual rather than an interface dispatch (which is cheaper). This massive cleanup has the dual benefit of having made many, many LINQ operations 10, 20, and even 100 times faster -- and many of them (if not most) are now completely allocation-free. Reducing allocations reduces churn in the GC, which also makes the app faster. [Core Collections] There is also a long chapter on dictionary optimizations. In particular, you can now store data in a dictionary with string keys but request an alternate view on the dictionary that lets you work with it as if it used ReadOnlySpan, which can drastically reduce allocations as the spans you have don't need to be converted to strings simply in order to do the lookups and stores. The changes apply to HashSets as well. [Compression] This is less about compression and more about the general philosophy and tactics underlying performance optimization in .NET (and, presumably, any runtime). "It’s an important goal of the core .NET libraries to be as platform-agnostic as possible. Things should generally behave the same way regardless of which operating system or which hardware is being used, excepting things that really are operating system or hardware specific (e.g. we purposefully don’t try to paper over casing differences of different file systems). To that end, we generally implement as much as possible in C#, deferring down to the operating system and native platform libraries only when necessary." [Networking] "dotnet/runtime#99364 changes the synchronization mechanism from using a pure lock-based scheme to a more opportunistic concurrency scheme that employs a first-layer of lockless synchronization. There’s now still a lock, but for the hot path it’s avoided as long as there are connections in the pool by using a ConcurrentStack, such that renting is a TryPop and returning is a Push. 
ConcurrentStack itself uses a lock-free algorithm, that’s a lot more scalable than a lock." "UrlEncode had a complicated scheme where it would UTF8-encode into a newly-allocated byte[], percent-encode in place in that (thanks to the ability to reinterpret cast with spans), and then use the resulting chars to create a new string. Instead, string.Create can be used, with all of the work done in-place in the buffer generated for that operation." "[...] updated UrlEncodeToBytes, using stack space instead of allocation for smaller inputs, and using SearchValues to optimize the search for invalid bytes." You can really see how the changes made over the last several versions allow a literal horde of open-source programmers to optimize the hell out of hot paths in the .NET library. Use Spans and ReadOnlySpans and ref structs and readonly ref structs to avoid allocations, allocate on the stack wherever you can when you can't avoid allocations, return enumerators instead of allocating array results, use highly optimized building blocks like SearchValues and ConcurrentStack, which employ lock-free algorithms or include custom implementations for common patterns. It all adds up to being able to just write performant code by default, writing in a legible, maintainable, and concise high-level API that is carefully marshaled down to the processor by the compiler and/or the JIT to super-efficient IL and assembler code. You can visualize your code being analyzed and then "sorted like Plinko chips" until it finally lands in the processor cache as instructions. [Profiling with Benchmark.Net] "There’s another very handy nuget package, Microsoft.VisualStudio.DiagnosticsHub.BenchmarkDotNetDiagnosers, which contains additional “diagnosers” for BenchmarkDotNet. Diagnosers are one of the main extensibility points within BenchmarkDotNet, enabling developers to perform additional tracking and analyses over benchmarks. 
You’ve already seen me use some, including the built-in [MemoryDiagnoser(false)] and [DisassemblyDiagnoser]; there are other built-in ones we haven’t used in this post but that are helpful in various situations, like [ThreadingDiagnoser] and [ExceptionDiagnoser], but diagnosers can come from anywhere, and the aforementioned nuget package provides several more. The purpose of those diagnosers is to collect and export performance traces that Visual Studio’s performance tools can then consume. In my case, I want to collect a CPU trace, so as to understand where CPU consumption is going, so I added a [CPUUsageDiagnoser] attribute to my Tests class" [Lock-free programming] What does lock-free programming look like? You replace a lock with an atomic compare/exchange operation, usually in a retry loop (the same spinning pattern that "spin locks" are built on, though no lock is ever held here). lock (this) { _delta += value; } "it used an interlocked operation to perform the addition atomically. Here _delta is a double, and there’s no Interlocked.Add that works with double values, so instead the standard approach of using a loop around an Interlocked.CompareExchange was employed." double currentValue; do { currentValue = _delta; } while (Interlocked.CompareExchange(ref _delta, currentValue + value, currentValue) != currentValue); [Cache lines] Finally, an optimization that takes CPU cache lines into account. I hadn't seen anything else that low-level so far. "In this benchmark, one thread is incrementing _values[0] and the other thread is incrementing either _values[1] or _values[31]. That index is the only difference, yet the one accessing _values[31] is several times faster than the one accessing _values[1]. That’s because there’s contention here even if it’s not obvious in the code. The contention comes from the fact that the hardware works with memory in groups of bytes called a “cache line.” Most hardware has caches lines of 64 bytes. 
In order to update a particular memory location, the hardware will acquire the whole cache line. If another core wants to update that same cache line, it’ll need to acquire it. That back and forth results in a lot of overhead. It doesn’t matter if one core is touching the first of those 64 bytes and another thread is touching the last, from the hardware’s perspective there’s still sharing happening. “False sharing.” Thus, the Counter fix is using padding around the double values to try to space them out more so as to minimize the sharing that limits scalability." 👏 File that under something I understand but would never have programmed. "In the two benchmarks, we can see that the number of instructions executed is almost the same between when false sharing occurred (Index == 1) and didn’t (Index == 31), but the number of cache misses is more than three times larger in the false sharing case, and reasonably well correlated with the time increase. When one core performs a write, that invalidates the corresponding cache line in the other core’s cache, such that the other core then needs to reload the cache line, resulting in cache misses." [Conclusion] "There are multiple forms of performance improvements covered throughout the post. Some of the improvements you get completely for free just by upgrading the runtime; the implementations in the runtime are better, and so when you run on them, your code just gets better, too. Some of the improvements you get completely for free by upgrading the runtime and recompiling; the C# compiler itself generates better code, often taking advantage of newer surface area exposed in the runtime. And other improvements are new features that, in addition to the runtime and compiler utilizing, you can utilize directly and make your code even faster. Educating about those capabilities and why and where you’d want to utilize them is important to me. 
But beyond the new features, the techniques employed in making all of the rest of the optimizations throughout the runtime are often more broadly applicable. By learning how these optimizations are applied in the runtime, you can extrapolate and apply similar techniques to your own code, making it that much faster." And that is much appreciated, Stephen. Having seen the available tools, I feel much better equipped to not only write but be able to advise on writing performant code.
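The DFA/NFA distinction from the regex section above can be made concrete with a toy sketch. This is only an illustration in Python under my own assumptions, not how the .NET engine is implemented: the transition tables, the state numbering, and the function names `run_dfa` and `run_nfa` are all hypothetical. Both automata accept strings over {a, b} that end in "ab".

```python
# Toy automata for "strings over {a, b} that end in ab" (hypothetical tables).
DFA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
NFA = {
    (0, "a"): {0, 1}, (0, "b"): {0},
    (1, "b"): {2},
}
ACCEPTING = 2


def run_dfa(text: str) -> bool:
    """Follow exactly one edge per character: one lookup, one current state."""
    state = 0
    for ch in text:
        state = DFA[(state, ch)]
    return state == ACCEPTING


def run_nfa(text: str) -> bool:
    """Track every state the automaton could be in after each character."""
    states = {0}
    for ch in text:
        # The next "state" is the union of all transitions out of all current states.
        states = set().union(*(NFA.get((s, ch), set()) for s in states))
    return ACCEPTING in states


for sample in ["ab", "aab", "aba", "bbab"]:
    print(sample, run_dfa(sample), run_nfa(sample))
```

The DFA loop does a single dictionary lookup per character, while the NFA loop has to visit every current state and merge the results, which is the per-character overhead the quote refers to.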

Junior code is insidious

2025-04-20T22:54:49+02:00

Published by marco on 20. Apr 2025 22:54:49 ------------------------------------------------------------------------ [image]The article "Enumerated Science" by Remy Porter describes a train wreck of a code example. It suitably illustrates why we really have to question whether scientists/juniors/etc. should really be writing code with so little training. If they wrote text this poorly, they'd be laughed out of their profession. Somehow, it's perfectly fine to write code like this.

    index = 0
    for index, fname in enumerate(img_list):
        data = np.load(img_list[index])
        img = data[0][:,:]
        img_title = 'img'+str(index).zfill(4)+'.jpg'
        cv2.imwrite(img_title, img)
        index = index + 1

The article points out all of the mistakes but I'll summarize them here.

* Why does the code ignore the iteration item declared in fname? Instead, the code re-indexes into the array being iterated with img_list[index]. Like, why bro? You already had it! You know what img_list[index] is? It's fname, bro.
* Why does the code bother calculating a complicated new filename that has nothing to do with the original filename? Why is the filename called img_title? It's not a title; it's a filename.
* Why does the code increment the index? It has no effect, does it? Or is it possible that this algorithm skips every other item? (It doesn't: enumerate rebinds index on every pass, so the increment is simply discarded.) Honestly, why does Python even allow modification of the iterator variables? They should be const/immutable exactly so you can avoid doing something distracting like this.
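Here is a minimal sketch of how the loop might look with those points addressed, plus a check of what the stray increment actually does. The helper names `output_name` and `visited_indexes` and the sample filenames are hypothetical, the NumPy/OpenCV calls are left out so that the block runs on its own, and deriving the output name from the input name is my assumption about what would have been more useful.

```python
from pathlib import Path


def output_name(fname: str) -> str:
    """Derive the JPEG filename from the input filename (hypothetical helper)."""
    return Path(fname).with_suffix(".jpg").name


def visited_indexes(items):
    """Replay the shape of the original loop and record which indexes were visited."""
    seen = []
    for index, _ in enumerate(items):
        seen.append(index)
        index = index + 1  # No effect: enumerate rebinds index on the next pass.
    return seen


img_list = ["scan_a.npy", "scan_b.npy", "scan_c.npy"]

# Use the iteration item directly instead of re-indexing into the list.
for fname in img_list:
    print(fname, "->", output_name(fname))

# Every index is visited once; nothing is skipped.
print(visited_indexes(img_list))
```

In the real script, the body of the first loop would load `fname` and write the image to `output_name(fname)`; no index is needed at all.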

Web optimization: preload vs. fetchpriority

2025-04-20T22:43:58+02:00

Published by marco on 20. Apr 2025 22:43:58 ------------------------------------------------------------------------ This is a nearly 50-minute video about certain optimizations that used to be useful but which, in modern browsers, often get in the way of heuristic optimizations that browsers apply automatically. [media] "Preload should be applied with surgical precision"

* Specific edge cases (you really know what you're doing)
* If the resource isn't in the HTML
* Fonts
* Dynamic LCP images
* JS imports

Basically, he said if you're using preload, you're almost certainly doing it wrong. For example, you can use fetchpriority=high on an img instead, and get the same performance benefit in the current crop of browsers. [image]These kinds of optimizations aren't for most web sites. Most web sites have much larger performance problems than can be addressed with fetchpriority and preload optimizations. The one place where it's still useful is for preloading fonts, which is a good idea for everyone. While those two settings affect how the browser loads resources during the initial load of a page, setting loading=lazy on a resource takes it out of the initial load entirely: it's loaded on demand, only when needed, e.g., when you scroll down to it.

Guide to being a good person and programmer

2025-04-20T22:35:18+02:00

Published by marco on 20. Apr 2025 22:35:18 ------------------------------------------------------------------------ [image]The article "The Best Programmers I Know" by Matthias Endler seems almost too good to be true. NGL I feel seen. I have cited heavily from it, highlighting the parts I find especially interesting. At the end are a few pallid notes from me, but the meat of this article is the quote. * Read the Reference * Know Your Tools Really Well * Read The Error Message * Break Down Problems If you work as a professional developer, that is the bulk of the work you get paid to do: breaking down problems. If you do it right, it will feel like cheating: you just solve simple problems until you’re done. Don’t Be Afraid To Get Your Hands Dirty [...] read a lot of code and they are not afraid to touch it. They never say “that’s not for me” or “I can’t help you here.” Instead, they just start and learn. Code is just code. They can just pick up any skill that is required with time and effort. Before you know it, they become the go-to person in the team for whatever they touched. Mostly because they were the only ones who were not afraid to touch it in the first place. Always Help Others Great engineers are in high demand and are always busy, but they always try to help. That’s because they are naturally curious and their supportive mind is what made them great engineers in the first place. It’s a sheer joy to have them on your team, because they are problem solvers. Write Most awesome engineers are well-spoken and happy to share knowledge. The best have some outlet for their thoughts: blogs, talks, open source, or a combination of those. I think there is a strong correlation between writing skills and programming. All the best engineers I know have good command over at least one human language – often more. Mastering the way you write is mastering the way you think and vice versa. A person’s writing style says so much about the way they think. 
If it’s confusing and lacks structure, their coding style will be too. If it’s concise, educational, well-structured, and witty at times, their code will be too. Excellent programmers find joy in playing with words. Never Stop Learning If there is a new tool they haven’t tried or a language they like, they will learn it. This way, they always stay on top of things [...] the best engineers don’t follow trends, but they will always carefully evaluate the benefits of new technology. If they dismiss it, they can tell you exactly why, when the technology would be a good choice, and what the alternatives are. * Have Patience * Never Blame the Computer No matter how erratic or mischievous the behavior of a computer seems, there is always a logical explanation: you just haven’t found it yet! The best keep digging until they find the reason. They might not find the reason immediately, they might never find it, but they never blame external circumstances. With this attitude, they are able to make incredible progress and learn things that others fail to. When you mistake bugs for incomprehensible magic, magic is what it will always be. Don’t Be Afraid to Say “I Don’t Know” The best candidates said “Huh, I don’t know, but that’s an interesting question! If I had to guess, I would say…” and then they would proceed to deduce the answer. That’s a sign that you have the potential to be a great engineer. Keep It Simple Clever engineers write clever code. Exceptional engineers write simple code. That’s because most of the time, simple is enough. And simple is more maintainable than complex. I disagree with the "Don’t Guess" one, in that I think "guessing" -- forming a hypothesis -- is the crux of scientific investigation. I think what the author probably meant was "don't leave a guess unproven." An addendum to the "magic" one above is that you should also know when to cut bait, i.e., when it's not worth anyone's time to find out what the real reason was. 
This can happen in one-off scripts, or in tight-deadline situations. Sometimes, you have to back-burner an investigation and either never bring it back to the front burner or learn the lesson at a later time. But, yes, every problem solved is a bit of experience. It's all worth it. A couple of decades of doing that and you might really have something.

Maddy Montaquila (new .NET Aspire PM) talks programming tools

2025-04-06T21:06:03+02:00

Published by marco on 6. Apr 2025 21:06:03 ------------------------------------------------------------------------ This is a wide-ranging, occasionally delightfully foul-mouthed, and brutally honest interview with PM Maddy Montaquila of Microsoft. Kudos to them for letting their best people do these kinds of informative and insightful interviews. [media] At 45:57, "[...] yesterday Dave [Fowler] and I were fighting about if the Visual Studio .gitignore is getting dumber and he was like, 'who cares about that? Why would anyone care about that?' And I was, like, it's 400 lines, dude. Like, we're ignoring things from [...] code-coverage tools that were deprecated five years ago. And then, finally, I start sending him screenshots, and he's, like, wait, why is that in there? Why is that in there?" At 49:00, I absolutely loved the shout-out to "The Year Without a Santa Claus". The plot summary was both accurate and possibly better than the actual movie (except for the musical number, which is worth the price of admission). I'm going to remember that Heat Miser vs. Snow Miser analogy. [1] Working in a company where most people didn't grow up in the U.S. will make it an uphill battle to use it effectively, but I will not be discouraged. At 51:49, "[image]The more I've gotten to understand what customers are doing and talk to people and seeing the convoluted things that people do to develop an app, the more ... I think I probably say once a week. I don't know how anyone ships software. I don't know how any of this stuff runs. This is all crazy to me, because everything is duct-taped together. Like, it is terrifying and you onboard someone and it takes like two weeks to get them to be able to run the app on their device. "Like, what are we doing? What are we doing as a society? This is embarrassing. "We should be able to do more than this and so that's the thing about Aspire that excites me. We're not trying to blackbox anything, right? 
We're not trying to say, 'oh, you use this and then you're vendor-locked into this thing.' It's very much, like, we're just trying to help you get off the ground and then you can grow out of it. "I had done a lot with App Center [...] and my fundamental issue with it [...] was that you couldn't grow up into a big-girl Azure service, is what I used to say. Like, once you hit the limits of apps, you had to start over and I was like, with Firebase or something, everything's actually just GCP and when you're ready you go into a big-girl [service], Google's like, you're ready, you move on. "And so, Aspire was built with that in mind. Like, if you use Aspire for orchestration and then you use the client integrations to do your databases, then at some point you're, like, you know what? I actually don't like the way that they're setting this up. I'm going to do it my own way. You don't rip anything out. You just keep going. "And so that was like a really really big sell for me early. And then deployment was a whole other world that I did not understand and the more I've looked at it, I don't, ... again, I don't know how anyone gets anything done. DevOps is insane. [...] trying to bring that theory of, like, grow-out-able-ness instead of just replacing into deployment has been a very, very fun challenge to try and like tease apart." At 1:11:26, "[...] we were talking to the Dutch police force and they are a completely polyglot shop so they have people running every language and there was one Java guy that came and he was, like, so, like .NET's, like, open-source and stuff now? And I was, like, yeah. And he was, like, but, like, really, like, it doesn't have any ecosystem around it? And I was like what? YES and, like, there are real, like, expert, smart developers out there who just have no idea." This nearly deliberate ignorance about other programming languages, about tooling, about technique -- it's pervasive. 
There are people who care, and really want to find better combinations of tools and techniques to do their jobs better, to do what they love better. But there are just as many who just can't even begin to imagine that there are other languages out there, that there are newer versions of the language they use available, with features that would actually be useful to them. These features are provably useful. They make your code more resilient, readable, and maintainable. They do not care. They don't even know that they don't care. They stopped learning a long time ago. Their curiosity is stunted. It's a pleasure watching people like Maddy and Nick discussing something that they're passionate about. I'm passionate about that thing too, but it's mostly because I understand that there is a good way of doing something -- writing tests with MSTest and their bog-standard assertion library and no test-case-generation infrastructure -- and a better way of doing something -- writing tests with NUnit and their elegant assertion library, excellent error messages, and myriad ways of producing test cases. On the other hand, I keep a watchful eye on .NET's new testing infrastructure that allows you to pre-compile unit tests as executables. This works for NUnit, too, so it's a win all around. That's just one example but it sets the tone. People can't explain why they don't think they need ReSharper. They might not! But they have no idea why. They can't explain why they use VSC instead of WebStorm. They have no idea that the latter actually supports a useful multi-file renaming refactoring whereas VSC still struggles to do a useful rename within a single file. Everyone should be appalled and bitterly disappointed but, instead, they don't even notice. They have no idea what they're missing. So they don't miss it. No wonder LLMs could catch up so quickly. 
-------------------------------------------------------------------------------- [1] I love Heat Miser and Snow Miser, but I really like something I remember a coworker of mine saying decades ago, when I still worked in New York. When two guys would be arguing and spiraling out of control, he would drawl in a southern twang that he'd never lost, "now, now, girls; you're both pretty." I can't use that among non-native speakers either.

tsc is going native

2025-03-16T12:21:36+01:00

Published by marco on 16. Mar 2025 12:21:36 ------------------------------------------------------------------------ The article "A 10x Faster TypeScript" by Anders Hejlsberg includes the following text, as well as a link to the video below, "[...] we’ve begun work on a native port of the TypeScript compiler and tools. The native implementation will drastically improve editor startup, reduce most build times by 10x, and substantially reduce memory usage. By porting the current code-base, we expect to be able to preview a native implementation of tsc capable of command-line type-checking by mid-2025, with a feature-complete solution for project builds and a language service by the end of the year." This port is going to be available very, very soon, probably within the next three months. That's quite ambitious. [media] The discussion "C# vs. Go Concurrency Model" led me to "A 10x Faster TypeScript" , which included a reference to "Why Go? #411" , which explains why Go was chosen, "[...] the most important aspect is that we need to keep the new codebase as compatible as possible, both in terms of semantics and in terms of code structure. We expect to maintain both codebases for quite some time going forward. Languages that allow for a structurally similar codebase offer a significant boon for anyone making code changes because we can easily port changes between the two codebases. In contrast, languages that require fundamental rethinking of memory management, mutation, data structuring, polymorphism, laziness, etc., might be a better fit for a ground-up rewrite, but we're undertaking this more as a port that maintains the existing behavior and critical optimizations we've built into the language. Idiomatic Go strongly resembles the existing coding patterns of the TypeScript codebase, which makes this porting effort much more tractable." The following image, included by a commentator, demonstrates quite nicely how idiomatically similar Go and TypeScript can be. 
[image] If you read the rest of the justification, the similarities extend to the guts of the respective runtimes and their approach to memory-management and concurrency, but the visual illustration makes it much clearer that this is a port and not a rewrite. A C# version -- with its slightly different concurrency model and also a focus on byte-code rather than native code -- would have involved much more change than this. A version in Rust would have the focus on native-code generation but would have been a complete rewrite, as a lot of the concurrency and data-sharing possible in JavaScript would have to be explicitly allowed or worked around, something that you can't always (or completely) hide with helper functions. The additional guarantees required in Rust to ensure safety would have to appear explicitly. Sure, you'd have the safety then, but it's important to remember that, when you're doing a migration, you should make sure you focus on one migration at a time. Going from TypeScript to Go will improve some type-safety (though probably not even much) and massively improve speed with a native target. If you want the additional safety of Rust, then you'd do a separate migration step from Go to Rust. There is another, longer interview video here: [media] The interviewer is the guy who just published "TypeScript types can run DOOM" .

Pointers for large files and repositories in Git

2025-03-16T12:17:16+01:00

Published by marco on 16. Mar 2025 12:17:16 ------------------------------------------------------------------------ "How to Add files to a Large Repository?" Git has opt-in support for handling large files. * Use the "--depth" option to control how much history to clone (good for pipelines, where you're usually only interested in the tip, so depth 1). * Whereas depth controls how much you clone (size of the .git folder), "sparse-checkout" controls the size of your working tree. * [image] Use "LFS (Large File Storage)" to store files. Enabling LFS will not remove large files from existing commits. The feature is seamless to enable and well-supported throughout the ecosystem. * Once you've set up LFS for future commits, you can consider removing large files from already-existing commits using something like "BFG" and then re-adding them with LFS.

Narrowing types to avoid primitive obsession

2025-03-16T00:09:48+01:00

Published by marco on 16. Mar 2025 00:09:48 ------------------------------------------------------------------------ Recently, I saw that the following error had been fixed in a code review. [image] The error shown above is an example of a "design smell" called "Primitive Obsession" . This is where code is "obsessed" with primitives, in that it uses a much "wider" type than is actually acceptable. Whereas C++ has a typedef, TypeScript and Delphi Pascal have a type, C# has ... nothing simple. The "linked article" describes a hand-coded version for making "narrower" types (e.g., MeanLength or ShortFiber). Our go-to generated-source guru "Andrew Lock" describes a solution that uses the "StronglyTypedId" package, but also links to a series from 2020 by Thomas Levesque that "uses records" . It looks like you can use something like public record MeanLength(int Value); to succinctly define a narrower type. While it's nice that C# autogenerates all the necessary machinery (equals, hashCode, etc.) for the record, it's also unfortunate that it's necessary, as we're usually just trying to disambiguate two ints without further validation or restriction. Also, I'm not recommending you leverage the type system to avoid primitive confusion in every code base! I'm just noting that the error that arose is so common that it not only has a name, but that there are well-defined solutions for avoiding that class of problems using the type system. Before you start messing with these types in a language like C# -- where, as noted, you don't have a simple type-disambiguation mechanism -- you need to have everyone in the team on board. Many programmers will consider them to be too heavy-handed (they suspect it affects performance somehow, and aren't willing to trade any potential and unproven performance drawback anywhere for increased type-safety). 
Those are usually the same programmers who write code with a ton of primitive obsession and zero automated tests, so take their critique for what it's worth.
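To make the idea a bit more concrete, here is a minimal sketch in TypeScript. Note that a plain alias like type MeanLength = number is structural in TypeScript and stays interchangeable with any other number, so the sketch uses the common "branding" idiom instead. The Brand helper, the constructor functions, and describeSample are all hypothetical names invented for illustration; only MeanLength and ShortFiber echo the examples mentioned above.

```typescript
// Hypothetical helper: intersects a primitive with a phantom marker property.
// The marker exists only at compile time; at runtime the value is a plain number.
type Brand<T, Name extends string> = T & { readonly __brand: Name };

// Two "narrower" number types that the compiler refuses to mix up.
type MeanLength = Brand<number, "MeanLength">;
type ShortFiber = Brand<number, "ShortFiber">;

// Small constructor functions are the only places where a raw number is cast.
const meanLength = (value: number): MeanLength => value as MeanLength;
const shortFiber = (value: number): ShortFiber => value as ShortFiber;

// Hypothetical consumer: the arguments can no longer be swapped silently.
function describeSample(mean: MeanLength, short: ShortFiber): string {
  return `mean=${mean}, short=${short}`;
}

const summary = describeSample(meanLength(12), shortFiber(3));

// Both of the following calls are rejected at compile time:
// describeSample(shortFiber(3), meanLength(12));
// describeSample(12, 3);
```

The cast inside the constructor functions is also the natural place to add validation later, should the narrower type ever need it.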

Who's going to fix the bad projects?

2025-03-15T14:50:10+01:00

Published by marco on 15. Mar 2025 14:50:10 ------------------------------------------------------------------------ The article "Can You Get Better Doing a Bad Job?" by Jim Nielsen cites Woody Harrelson as saying, "I think when you do your job badly you never really get better at your craft." Of course that's true on the surface: If you manage to avoid learning anything else, then you will only ever get better at doing a bad job. The author expands on this point as follows, "Experience is a hard teacher. Perhaps, from a technical standpoint, my skillset didn’t get any better. But from an experiential standpoint, my judgement got better. I learned to avoid (or try to re-structure) work that’s being carried out in a way that doesn’t align with its own purpose and essence." I agree with the highlighted bit especially. Any experience can be "good" because there is always room for seeing how you can make something useful in the midst of madness, how you can extract enjoyment out of even a poorly managed project. You can hone your programming skills; you can hone your diplomatic skills; you can learn how to turn it around. Maybe. At the worst, you learn how not to turn it around. [image] But then he writes that he "learned to avoid (or try to re-structure)" work that he doesn't like. This is fine, but I would flip those two, to read "I learned to try to restructure (or, in the worst case, avoid)," in order to emphasize that someone should be trying to impart order to chaos -- it might as well be you, if you're so damned smart. If everyone were avoiding bad projects, then where would good projects come from? Does everyone think that they're so precious that good projects have to be prepared for them before they'll even consider participating?

James Mickens is back with more

2025-03-13T09:41:39+01:00

Published by marco on 13. Mar 2025 09:41:39 ------------------------------------------------------------------------ [image] I recently stumbled upon some "Essays from the funniest man in Microsoft Research" by Raymond. He is such a funny writer that this article, against convention and like the one before it ("Brilliant articles by the funniest guy at Microsoft"), will consist mostly of citations rather than an even mix of citations and paraphrasing that I naturally consider to be much more lucid and pithy. I quote at length to do the material justice, for documentation and to ensure that you all download the PDFs to see if there is more where that came from (there is). All emphases have been added. He has since moved on to teach at Harvard University and is publishing things like "The Wisdom of James Mickens" there. That one's about, "Sometimes, when a professor goes on sabbatical, that professor will create a great book or sculpture, or commence upon a tour of the world to propagate some transcendental vision about the stars or aesthetics or our relationship to the divine. I myself spent my sabbatical writing a heavy metal album called “Ten Times Your Master: A King in Every Corner.” The ostensible conceit behind the album is that I, James Mickens, have gone missing, and the heavy metal community has formed a five-person tribute band called Ten Times Your Master to cover my songs. In reality, I, James Mickens, have not gone missing at all, and in fact am at the height of my powers, having created five distinct musical aliases to record a tribute album to myself, James Mickens." Which, you have to admit, is one of the most uniquely funny and interesting things that you've read in a while. The notes below stem from a draft that I wrote in 2014 and am only now publishing, 11 years later. I have not altered them in any way because, well, I haven't gotten any smarter, so who am I to correct my past self? 
"To Wash It All Away" by James Mickens (PDF) discusses the delusions of web programming: "A modern Web page is a catastrophe. It’s like a scene from one of those apocalyptic medieval paintings that depicts what would happen if Galactus arrived: people are tumbling into fiery crevasses and lamenting various lamentable things and hanging from playground equipment that would not pass OSHA safety checks. This kind of stuff is exactly what you’ll see if you look at the HTML, CSS, and JavaScript in a modern Web page. Of course, no human can truly “look” at this content, because a Web page is now like V’Ger from the first “Star Trek” movie, a piece of technology that we once understood but can no longer fathom, a thrashing leviathan of code and markup written by people so untrustworthy that they’re not even third parties, they’re fifth parties who weren’t even INVITED to the party" "In a rational universe, a single uncaught exception would terminate a program, and if a program continued to execute after throwing such an exception, we would know that Ragnarok is here and Odin is not happy. In the browser world, ignoring uncaught exceptions is called “Wednesday, and all days not called ‘Wednesday.’” The JavaScript event loop is quite impervious to conventional notions of software reliability, so if an event handler throws an exception, the event loop will literally pretend like nothing happened and keep running. This ludicrous momentum continues even if, in the case of the seventh error, the Web page tries to call init() on an object that has no init() method. You should feel uncomfortable that a Web page can disagree with itself about the existence of initialization routines, but the page is still allowed to do things with things. Such a dramatic mismatch of expectations would be unacceptable in any other context." 
"it would definitely be horrible if your browser’s scripting language combined the prototype-based inheritance of Self, a quasi-functional aspect borrowed from LISP, a structured syntax adapted from C, and an aggressively asynchronous I/O model that requires elaborate callback chains that span multiple generations of hard-working Americans. OH NO I’VE JUST DESCRIBED JAVASCRIPT." "JavaScript is dynamically typed, and its aggressive type coercion rules were apparently designed by Monty Python. For example, 12 == “12” because the string is coerced into a number. This is a bit silly, but it kind of makes sense. Now consider the fact that null == undefined. That is completely janky; a reference that points to null is not undefined—IT IS DEFINED AS POINTING TO THE NULL VALUE. And now that you’re warmed up, look at this: “\r\n\t” == false. Here’s why: the browser detects that the two operands have different types, so it converts false to 0 and retries the comparison. The operands still have different types (string and number), so the browser coerces “\r\n\t” into the number 0, because somehow, a non-zero number of characters is equal to 0. Voila—0 equals 0! AWESOME." "I obviously get what I deserve if my JavaScript library redefines native prototypes in a way that breaks my own code. However, a single frame in a Web page contains multiple JavaScript libraries from multiple origins, so who knows what kinds of horrendous prototype manipulations those heathen libraries did before my library even got to run. This is just one of the reasons why the phrase “JavaScript security” causes Bibles to burst into flames." "Some JavaScript libraries intentionally begin with an initial semicolon, to ensure that if the library is appended to another one (e.g., to save HTTP roundtrips during download), the JavaScript parser will not try to merge the last statement of the first library and the first statement of the second library into some kind of semicolon-riven statement party. 
Such an initial semicolon is called a “defensive semico- lon.” That is the saddest programming concept that I’ve ever heard, and I am fluent in C++. "I could go on and on about the reasons why JavaScript is a cancer upon the world. I know that there are people who like JavaScript, and I hope that these people find the mental health services that they so desperately need. I don’t know all of the answers in life, but I do know all of the things which aren’t the answers, and JavaScript falls into the same category as Scientology, homeopathic medicine, and making dogs wear tiny sweaters due to a misplaced belief that this is what dogs would do if they had access to looms and opposable thumbs. "The first log entry says that the browser executed a downloaded file as JavaScript, even though the MIME type of the file was text/html. Here’s a life tip: when you’re confused about what something is, DON’T EXECUTE IT TO DISCOVER MORE CLUES. This is like observing that your next-door neighbor is a creepy, bedraggled man with weird eyes, and then you start falling asleep on his doorstep using a chloroform rag as a pillow, just to make sure that he’s not going to tie you to a radiator and force you to paint tiny figurines. Here’s how your life story ends: YOU ARE A PAINTER OF TINY FIGURINES." "This World of Ours" by James Mickens (PDF) discusses the delusions of passwords, cryptography and security research: "Basically, you’re either dealing with Mossad or not-Mossad. If your adversary is not-Mossad, then you’ll probably be fine if you pick a good pass- word and don’t respond to emails from ChEaPestPAiNPi11s@ virus-basket.biz.ru. If your adversary is the Mossad, YOU’RE GONNA DIE AND THERE’S NOTHING THAT YOU CAN DO ABOUT IT. The Mossad is not intimidated by the fact that you employ https://. 
If the Mossad wants your data, they’re going to use a drone to replace your cellphone with a piece of uranium that’s shaped like a cellphone, and when you die of tumors filled with tumors, they’re going to hold a press conference and say “It wasn’t us” as they wear t-shirts that say “IT WAS DEFINITELY US,” and then they’re going to buy all of your stuff at your estate sale so that they can directly look at the photos of your vacation instead of reading your insipid emails about them. In summary, https:// and two dollars will get you a bus ticket to nowhere. Also, SANTA CLAUS ISN’T REAL. When it rains, it pours." "Even worse than the PGP acolytes are the folks who claim that we can use online social networks to bootstrap a key infrastructure. Sadly, the people in an online social network are the same confused, ill-equipped blunderhats who inhabit the physical world. Thus, social network people are the same people who install desktop search toolbars, and who try to click on the monkey to win an iPad, and who are willing to at least entertain the notion that buying a fortune-telling app for any more money than “no money” is a good idea. These are not the best people in the history of people, yet somehow, I am supposed to stitch these clowns into a rich cryptographic tapestry that supports key revocation and verifiable audit trails. One time, I was on a plane, and a man asked me why his laptop wasn’t working, and I tried to hit the power button, and I noticed that the power button was sticky, and I said, hey, why is the power button sticky, and he said, oh, IT’S BECAUSE I SPILLED AN ENTIRE SODA ONTO IT BUT THAT’S NOT A PROBLEM RIGHT? I don’t think that this dude is ready to orchestrate cryptographic operations on 2048-bit integers." "I realize that, in an ideal world, I would recycle my trash, and contribute 10% of my income to charity, and willingly accept the cognitive overhead of fine-grained security labels. 
However, pragmatists understand that I will spend the bulk of my disposable income on comic books, and instead of recycling, I will throw all of my trash into New Jersey." "Similarly, we know that IFC research should not focus on what would happen if I somehow used seventeen types of labels to describe three types of variables. Instead, IFC research should focus on what will happen when I definitely give all my variables The God Label so that my program compiles and I can return to my loved ones." This is the enduring problem of security. This is what always happens when the people charged with providing security aren't well-versed enough to handle the complexity of the chore. Our job as framework developers is to make a system that is as simple as possible, but no simpler. Sometimes, initial efforts miss the mark. Sometimes, there is no way to feasibly bridge the gap between users' capabilities and the complexity of the tasks that they have to solve. "The worst part about growing up is that the world becomes more constrained. As a child, it seems completely reasonable to build a spaceship out of bed sheets, firecrackers, and lawn furniture; as you get older, you realize that the S.S. Improbable will not take you to space, but instead a lonely killing field of fire, Child Protective Services, and awkward local news interviews, not necessarily in that order, but with everything showing up eventually." "Security research is the continual process of discovering that your spaceship is a deathtrap. However, as John F. Kennedy once said, “SCREW IT WE’RE GOING TO THE MOON.” I cannot live my life in fear because someone named PhreakusMaximus at DefConHat 2014 showed that you can induce peanut allergies at a distance using an SMS message and a lock of your victim’s hair. If that’s how it is, I accept it and move on. Thinking about security is like thinking about where to ride your motorcycle: the safe places are no fun, and the fun places are not safe." 
A sample from "Tenure announcement: April 2019" by James Mickens: James Mickens’s Harvard tenure announcement is unorthodox. He’s basically trash-talking to his enemies, who can no longer touch him. Some samples: "My seventh-favorite enemy is obviously Alan Fontaine of Iowa State University. I know that you’re reading this, Alan, because you keep inviting me to NSF panels even though I mailed you a glitter bomb that was shaped like me mailing you a glitter bomb. Your theories on Muppet physiology are childish and naïve, and I viciously refute them in my upcoming article “Parasitic Infections of Muppet Gastrointestinal Hand Holes.” I wish you the best when you lose your endowed professorship and are forced to teach at a lower-tier institution that can’t even afford real Muppets and has to use oven mitts with faces drawn on them." "On that page, I used my grandfather’s charcoal pencil to draw a picture of myself dunking a basketball over your confused, athletically-incompetent body;"

And another thing about MVVM

2025-02-22T19:59:32+01:00

Published by marco on 22. Feb 2025 19:59:32 Updated by marco on 22. Feb 2025 19:59:58 ------------------------------------------------------------------------ I recently wrote "Real quick on MVVM" and now I see that a good colleague and friend has written his own "MVVM understandings" by Austin Jones. His piece got me thinking again about how the concept is a good start but isn't really sufficient. [Justifying the view model] Somewhere near the beginning, he writes, "The View Model’s function is separate from the Model. Abstraction requires discipline to not let two pieces of code that do the same thing become the same thing, purely out of convenience. Things that operate together should be functionally coupled, not just that same code." While I deeply appreciate the sentiment, I think that (A) most people are going to be unconvinced that they need additional complexity for such a vague goal, and (B) there are more concrete reasons to keep them separate. In "Real quick on MVVM", I posited a simple example, repeated below. record Person( string FirstName, string LastName, Company Company, DateTime BirthDate); The view model might want to expose: int Age => DateTime.Now.Year - _model.BirthDate.Year; string FullName => $"{_model.FirstName} {_model.LastName}"; Company Company { get; } IReadOnlyList<Company> AvailableCompanies { get; } The AvailableCompanies property is for the drop-down menu. The data in the model is a different shape than that required by the view. This happens quite quickly and quite often. Anyone who tries to "cheat" by using a type as both a model and view model will quickly be writing spaghetti code. It is the view model's job to marshal data to and from its own shape to that of the model. It is decidedly not the model's job to do that, because it exposes data, while one or more views might display it in different ways. Perhaps another view is showing the birthdate directly, in which case that view model would simply pass the value through unmodified. 
[Thinking through an example] "Most logic seems to fall into the View Model as your business logic rules are often mirrored by presentation rules. E.g. a button has to be disabled if the user hasn’t met some requirement." It may seem too picky but I would use the verb reflect instead of mirror, to say that the view model exposes properties that reflect the state in the model. Just off the top of my head, I can imagine that each component of the architecture has unique duties, as illustrated in the example below, * A model contains several properties that must adhere to certain rules in order to be saved. * A validation service determines whether those rules have been satisfied, returning a list of zero or more validation results. * A view model exposes the most recent list of validations as a property, as well as a property called readyToSubmit. The view model triggers the service to calculate a new list of validations when the view notifies it that a relevant change has been made. * A view binds the validations as it sees fit -- attaching properties directly to the controls that will display their values and allow users to manipulate them, and exposing the list of validations to the user in some way -- and binds the Enabled property of the submission button to the readyToSubmit property. This is just a simple example but we can see that the model is just a data container. In classic OO, the validation logic would have been part of the model objects. However, it's far more flexible to keep the model as a set of "dumb" DTOs and to keep the logic in the service. This makes it much easier to replace the validation logic in specific cases, without touching the data layer, which doesn't need to change. The view model does the work of managing calls to the validation service as well as retaining the results as long as the view needs them. The view model doesn't know anything about buttons. It doesn't need to know that they can be enabled or disabled. 
That's the view's job, which deals with the actual representations presented to the user. This makes the view model, in turn, flexible enough to be used with alternate representations. For example, we can imagine a view or view model that simply auto-saves when readyToSubmit is true, so it would have been a shame to have named that property saveButtonEnabled because it would have been an awkward fit for the hypothetical second view. [Benefits] As you can well imagine, it's incredibly easy to test systems built in this way, as you can very easily construct the data/model that you want and test something like the validation service. You can also very easily build on top of that to verify that the view model updates and notifies as expected. You can even bind to its properties to verify that a potential view would have received the expected notifications. The view doesn't have more logic in it than binding. Views are more finicky to test -- although it's not impossible or even especially difficult with practice, it's also not usually necessary. For most problems that crop up, your tests eliminate the possibility that the bug is anywhere other than the view, so you quickly find the incorrect binding. Obviously, if errors like this are chronic -- or if you have very complex views -- then you'll want to test the view with end-to-end tests. Just remember that testing the view usually requires the most effort, results in the slowest tests, and provides the least benefit, so you should really be doing those last, if at all. [Conclusion] My colleague's example focuses more on how the service layer pertains to persistence, for loading and storing models. I wanted to provide an example that doesn't have anything to do with persistence but shows that there is non-persistence logic that obviously -- at least in hindsight -- doesn't belong anywhere but in the service layer. 
I've been working with this type of abstraction since at least 2002, when I started working on the Atlas framework at Opus Software AG, which was written in Delphi Pascal. We didn't call it MVVM but we had a very clear separation between the object model, the view model, and renderers.
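The division of duties in the validation example above can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions: the Registration model, its two rules, the onChanged notification mechanism, and the submitButton stand-in for a view are all hypothetical names invented for this sketch. Only readyToSubmit comes from the example itself.

```typescript
// The model is a "dumb" data container (hypothetical shape).
interface Registration {
  name: string;
  email: string;
}

interface ValidationResult {
  property: keyof Registration;
  message: string;
}

// The validation service owns the rules; the model knows nothing about them.
class ValidationService {
  validate(model: Registration): ValidationResult[] {
    const results: ValidationResult[] = [];
    if (model.name.trim() === "") {
      results.push({ property: "name", message: "Name is required." });
    }
    if (!model.email.includes("@")) {
      results.push({ property: "email", message: "Email must contain '@'." });
    }
    return results;
  }
}

// The view model retains the latest validations and exposes readyToSubmit.
// It knows nothing about buttons.
class RegistrationViewModel {
  validations: ValidationResult[] = [];
  private listeners: Array<() => void> = [];

  constructor(
    private readonly model: Registration,
    private readonly service: ValidationService,
  ) {
    this.revalidate();
  }

  get readyToSubmit(): boolean {
    return this.validations.length === 0;
  }

  // A view (or a test) subscribes to be told when bound properties change.
  onChanged(listener: () => void): void {
    this.listeners.push(listener);
  }

  // Called by the view when the user edits a field.
  update<K extends keyof Registration>(property: K, value: Registration[K]): void {
    this.model[property] = value;
    this.revalidate();
    this.listeners.forEach((listener) => listener());
  }

  private revalidate(): void {
    this.validations = this.service.validate(this.model);
  }
}

// Stand-in for a view: it "binds" a button's enabled state to readyToSubmit.
const viewModel = new RegistrationViewModel(
  { name: "", email: "" },
  new ValidationService(),
);
const submitButton = { enabled: viewModel.readyToSubmit };
viewModel.onChanged(() => {
  submitButton.enabled = viewModel.readyToSubmit;
});

viewModel.update("name", "Ada");
viewModel.update("email", "ada@example.com");
```

Because the view model only exposes readyToSubmit, a test can subscribe exactly as the stand-in view does here, and a hypothetical auto-saving view could react to the same property without any button at all.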

Avoiding completely failed estimates

2025-02-22T11:54:02+01:00

Published by marco on 22. Feb 2025 11:54:02 ------------------------------------------------------------------------ [Estimate Scrabble]The relatively short post "My Washing Machine Refreshed My Thinking on Software Effort Estimation" by Chris Horsley is kind of interesting, in that it's a cautionary tale about being overconfident about your estimates. As the title suggests, his was a real-world task where he'd assumed that a tenth iteration would go just as smoothly. He draws some good conclusions but for what I think might be the wrong reasons. "[...] while 90% of the project will be the same, there's going to be one critical difference between the last 5 projects and this project that seemed trivial at the time of estimation but will throw off our whole schedule. It could be one or all of:" 1. Our well-used task-running framework we were going to use for a relatively small part of the system is totally unmaintained now and we'd have to fork it to make it fit for purpose again. 2. Our entire development tooling ecosystem was obsoleted 18 months after the last time we did this, so we're going to be learning the sharp edges of a whole new toolchain from scratch. 3. We find that our OS version has moved on and no longer supports key requirements for our existing dependencies, requiring rethinking or developing from scratch. 4. We need our infrastructure stack to use one component we've never used before and it doesn't work anything like we expected. The things that he lists as stuff that you generally don't think about are exactly the kinds of things that you should keep in mind when doing an estimate, though! Like, if you've been doing estimates and any of these things are a surprise to you, then you've just been LARPing at estimates. Even when you copy/paste an existing solution to a similar problem, you have to consider the context in which the original was developed and the degree to which that context might be different this time around. 
It's not easy but it's your job to be aware of limitations and concessions at all times. Always consider that the passage of time is part of the context. It took so long to set up his washing machine because he's a rank amateur at doing that, despite having done it so many times. He got lucky the first nine times because literally nothing that could go wrong went wrong. He thought that his experience promoted him to a senior-level position but he was still a junior-level installer. On his tenth time, everything went wrong and he was totally blindsided by it -- but only because he'd learned nothing about the system he was working on. He didn't learn, for example, what his requirements or environmental expectations were nor that he should quickly check to verify that they were satisfied before he started. It's as if he'd gone downstairs to check his car's oil but hadn't brought his house keys with him because the door to the garage had always been propped open before. When he had to go back upstairs to get his house keys, that was considered a blindsiding showstopper that he couldn't have accounted for. Even after the first setback, he didn't sit back to take stock of the situation and plan his next few steps rather than just his next step. That's the classic misinterpretation of agile: it doesn't mean you should turn your brain off in planning. It doesn't mean you shouldn't do any long-term planning. It means that you should always be prepared to change your plan. It means that you should take a little time to plan for the immediately foreseeable work. That doesn't mean that you're suddenly doing waterfall! It means you're filling your backlog.

Tacking against the winds of Copilot

2025-02-16T08:45:32+01:00

Published by marco on 16. Feb 2025 08:45:32 ------------------------------------------------------------------------ The tweet ""Programming"" by Andrej Karpathy is what some people are calling the future of programming -- with the loudest claiming that it's already here and that you're all missing the boat if you're not programming like this. "There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away. It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works." The people who claim that this represents the future of all human engineering effort are mostly fools and mountebanks, who are either not software developers or engineers, or who claim that they are, but never really understood what we do. Karpathy is, at least, very explicit that he's doing this for "throwaway" projects. He is building, at best, prototypes and, at worst, inscrutable toys that don't do what he wants, and he's not going to bother trying to fix them. This style only works because Karpathy already kind of knows how to program and he's smart. 
This style will not make someone magically able to produce anything other than a crude facsimile of what has already been produced before. They will likely get stuck in myriad cul-de-sacs where the LLM avows very confidently that something will work and it will not work -- sometimes subtly, sometimes overtly. The other day, I was confronted three times with outright inaccurate, or simply outdated or inapplicable suggestions from Copilot. I did not request these suggestions; they were introduced by colleagues during reviews and collaboration. [Clinging to the past] In one case, someone asked in a PR whether we really could eliminate the framework declaration in app.config files. The Microsoft documentation still confidently declares that you need to include this declaration but says nothing about why it's not included in any of its starter templates. Copilot therefore confidently writes that you absolutely need the declaration for reasons that might have been somewhat applicable in 2009, or maybe even 2012. The app in question works with .NET Framework 4.8, a version that has been available for three years. There is no follow-up version in sight. A 4.x version of the framework has been included on all versions of Windows for years. The app doesn't run with a lower version. It already declares this in the element of the project file. We don't need to artificially pin the version with an outdated mechanism. The version of the framework given in the project file is not only sufficient, but adding an outdated alternative that says something slightly different but will be ignored is distracting, to say the least. It's a recipe for configuration cargo cults and future goose chases, which is why I threw it out. Copilot would chirpily encourage people to put it right back in, and to waste time fine-tuning it. 
[Simplistic and incorrect] In another case, a colleague was having trouble upgrading Entity Framework from using the System.Data.SqlClient to using the Microsoft.Data.SqlClient instead. There was an error where a registration wasn't being honored and the app couldn't find the new provider. The Copilot solution was to confidently recommend adding a whole bunch of crap that wouldn't solve the problem in any sane manner -- and couldn't. It just amounted to vague instructions to include the EF framework -- which my colleague was obviously already doing. The solution to the problem was going to have to be more subtle, but my colleague was getting lost in the weeds of trying to figure out the subtlety underlying what Copilot might possibly mean with its naive suggestion. I am continually amazed by how much leeway people give these LLM tools, very optimistically assuming that a given tool "understood" the intent and is suggesting a brilliant solution that will reveal itself if only the developer can rise to the occasion. When I was called in, I read the error message and it was pretty clear that something in the system was overriding the settings we wanted to use. It turned out that there was a reference to the old data provider in one of the core libraries that we'd upgraded. Before we found that, though, we ended up focusing on the fact that the EF support took a default dependency on a relatively old 5.x component but the latest version was 6.x. The suspicion was that perhaps something would work better in 6.x. When my colleague added the newer version to all projects, things started working again, so apparently a bug had been fixed. But which bug? While it was good to have a workaround, adding the reference everywhere was an awkward solution. So my colleague took a small timebox to investigate further. He eventually discovered -- without Copilot -- that there was a registration for the old data provider buried in our library code. 
Removing that single line of configuration in the base library solved the error. Including another registration shouldn't cause the configuration to ignore subsequent registrations of other data providers, though. I think that the 5.x version wasn't capable of properly managing multiple, registered data providers but that the 6.x version could. When the library was forced to use a newer version, everything worked again. The maintainable solution that we ended up using was: 1. Remove the unwanted configuration entry in the base library. 2. Add a single reference to the newer version of the transitive library to ensure that we were using the 6.x version -- with the bug fix that we technically no longer needed -- in all consumers of the library. This step wasn't even strictly necessary but there's no harm in it. We marked the package-inclusion with a reason and a link to the work item that was tracking the work and included a write-up of what had happened. [A solution that doesn't exist] Finally, I had updated a build-pipeline template to accept a new parameter: OutputFolder, which was to default to a value based on the value of another parameter SolutionFolder. The following solution feels like a natural way of expressing this intent. parameters: SolutionFolder: '' OutputFolder: '${{ parameters.SolutionFolder }}\Output' This is what it looked like in the actual template: [image] However, if you try it, you will get an error message saying that "A template expression is not allowed in this context": [image] The machines -- the screenshot below is DuckDuckGo, which is based on ChatGPT by default -- will cheerily tell you that you can totally do exactly what you clearly can't do. [image] This is the "cloud theory link" referenced in the screenshot. There's nothing in there that has anything to do with the question or its answer. 
The post is about Azure pipelines but it's a pretty banal article that explains the basics and it will utterly waste your time if you were looking for a definitive answer on how to reference parameters from one another, as the LLM claims is possible. "The best-case scenario is a mediocre solution that works. "The worst-case scenario is a productive-looking rabbit-hole that leads nowhere." This is my worry about this whole LLM-based tool thing: it's wasting people's time and making them inefficient. The best-case scenario is if it manages to give them a mediocre solution that works. The worst-case scenario is that they get routed down a productive-looking rabbit-hole that leads nowhere but which they can then claim had been productive because they'd been doing "work" and who can blame them if it was the LLM that started them on that path? How do you fix this for real? You use the solution that I came up with after quickly learning from "Allow parameter defaults to be calculated based on other parameters #2737" on GitHub that the feature I was looking for had never been implemented. I used a variable and "coalesce" , as shown below. variables: defaultOutputPath: '${{parameters.SolutionFolder}}/Output' templateOutputFolder: ${{ coalesce(parameters.OutputFolder, defaultOutputPath) }} Curious, though, I went to Copilot to find out what it would suggest (assuming that Copilot is a bit more advanced than the default ChatGPT used by DuckDuckGo). As usual, it returned a lot of boilerplate text that may or may not have actually corresponded to the code it provided. I didn't bother reading most of it and just kept the code examples. 
----------------------- | 🧐 1️⃣ prompt | ----------------------- azure yaml use parameter in default for another parameter ----------------------- ----------------------- | 🤖 1️⃣ answer | ----------------------- parameters: - name: repositoryName type: string default: '' - name: sonarQubeProjectKey type: string default: '' jobs: - job: SetDefaults steps: - script: | echo "##vso[task.setvariable variable=sonarQubeProjectKey]com.company.${{ parameters.repositoryName }}" displayName: 'Set SonarQube Project Key' ----------------------- This is an impenetrable and unmaintainable disaster that uses a script to write a variable to the job's terminal, which also sets it in the pipeline environment as a side-effect. Why would you do it that way when you just need a local variable? I already know what the solution is, so I know that I can keep prompting instead of wasting my time trying out this version, which I think might work but which is not elegant, legible, or particularly maintainable. 🎰 That's the advantage of being a senior developer: I will know a good answer when I see it, so the LLM can save me time as I pull the slot-machine arm until it comes up sevens. In the brusque style I typically use with LLMs, I pressed on. ----------------------- | 🤨 2️⃣ prompt | ----------------------- Not even close. Try harder. ----------------------- ----------------------- | 🤖 2️⃣ answer | ----------------------- parameters: - name: repositoryName type: string default: 'default-repo' - name: sonarQubeProjectKey type: string default: '' variables: - name: sonarQubeProjectKey value: ${{ parameters.sonarQubeProjectKey != '' && parameters.sonarQubeProjectKey || 'com.company.' + parameters.repositoryName }} jobs: - job: Build steps: - script: | echo "SonarQube Project Key: $(sonarQubeProjectKey)" displayName: 'Display SonarQube Project Key' ----------------------- That's a bit better. 
Now there's a variable declaration, even though it still uses the unfortunate old-style name/value syntax that is not super-intuitive. It also uses a bizarre bastardization of the || operator to coalesce to a default value if it's not set. I'm not an expert on pipeline-expression syntax but the placement of the && looks pretty wonky. As with the first answer, I didn't even bother trying to see if that one-liner mess even compiles, to say nothing of getting me what I want. Even if it works, it's an impenetrable and unmaintainable line that no-one will ever dare touch again. Since I already knew the answer, I told it straight-up how I wanted it to write the expression. ----------------------- | 😒 3️⃣ prompt | ----------------------- Use coalesce ----------------------- ----------------------- | 🤖 3️⃣ answer | ----------------------- parameters: - name: repositoryName type: string default: 'default-repo' - name: sonarQubeProjectKey type: string default: '' variables: sonarQubeProjectKey: $[coalesce(parameters.sonarQubeProjectKey, 'com.company.' + parameters.repositoryName)] jobs: - job: Build steps: - script: | echo "SonarQube Project Key: $(sonarQubeProjectKey)" displayName: 'Display SonarQube Project Key' ----------------------- It's unclear why it's using "Runtime expression syntax" instead of the more-standard "Macro syntax variables" . It doesn't matter, though, because, according to the documentation (and as far as I knew already), you have to use the "Template expression syntax" when referencing template parameters (which we most certainly are -- it's right there in the question). Is it complicated that there's a table as well as a section called "What syntax should I use?" that explains which form to use where and what to watch out for? Yes, it is. But I don't make the rules. I just play by them. 
"The tool of the future that you're a fool and knave for not using because junior devs super-powered by it will blow by your ancient ass if you don't learn how to use it" Copilot, on the other hand, is not constrained by petty rules, and would have cheerily helped a less-savvy developer piss away an entire afternoon if not a couple of days chasing phantom errors, trying desperately to get the code produced by the genius machine to work as advertised. The thing about pipeline scripts is...you have to execute them in the cloud. The turnaround time is murder. Let's take a quick peek at my version, the one I had before we started playing around with the tool of the future that you're a fool and knave for not using because junior devs super-powered by it will blow by your ancient ass if you don't learn how to use it. variables: defaultOutputPath: '${{parameters.SolutionFolder}}/Output' templateOutputFolder: ${{ coalesce(parameters.OutputFolder, defaultOutputPath) }} Succinct, easy to read, with useful variable names. Easy, peasy, lemon-squeezy. Also, it works! That sets it apart from the Copilot suggestions above. The only trouble I had in the PR was that my reviewer didn't know what the coalesce function did...and the "top German translation" was for Verbinden, which means to link, tie, or join. That meant he thought it was a synonym for concatenate, which it most certainly is not. It was a good learning opportunity, where I reminded him of the coalesce function in SQL, as well as the "null-coalescing operators" -- ?? and ??= -- in C#. Look, the world is complicated, human languages and creations are messy, and a glorified "Plinko machine" like an LLM doesn't seem quite yet cut out for helping us navigate it.

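For readers who, like the reviewer, read "coalesce" as "concatenate", here is a minimal Python sketch of the difference. It assumes the behavior that the Azure Pipelines expression documentation describes for coalesce (return the first argument that is neither null nor an empty string). The function names and the sample folder values are made up for illustration; none of this is pipeline code.

```python
def coalesce(*values):
    """Return the first argument that is neither None nor an empty string."""
    for value in values:
        if value is not None and value != "":
            return value
    return None


def concatenate(*values):
    """Join all arguments together; this is what coalesce does NOT do."""
    return "".join(v for v in values if v is not None)


def resolve_output_folder(solution_folder, output_folder=""):
    # Mirrors the two pipeline variables: compute the default first,
    # then fall back to it only when no explicit value was passed.
    default_output_path = f"{solution_folder}/Output"
    return coalesce(output_folder, default_output_path)


print(resolve_output_folder("src/App"))          # src/App/Output
print(resolve_output_folder("src/App", "dist"))  # dist
print(concatenate("dist", "src/App/Output"))     # distsrc/App/Output
```

The last line shows what the reviewer expected: both values glued together instead of a choice between them.
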
Building sites with HTML and CSS

2025-02-15T23:00:21+01:00

Published by marco on 15. Feb 2025 23:00:21 ------------------------------------------------------------------------ This is a 40-minute discussion about combining the latest technologies, like scroll-snapping, scroll-driven animations, anchoring, etc. to produce responsive, progressive, animated, modern, and very fast sites without any JavaScript at all. [media] [Carousels] Adam uses it all to build carousels, which is fine for demos and proving the power of the technologies, but ... I'm not a fan. While our two hosts mention that Netflix comprises only carousels, they don't really discuss that Netflix is also a deeply unsatisfying experience for finding content. [Slow adoption rate] [image] They finish up with an interesting discussion of how quickly changes are introduced and the absolutely legitimate reasons why adoption of some features is so slow. It's often difficult for developers to be aware that a feature exists and also to be aware that it would be a solution for the problem that they're having. There's also the fact that most developers and product owners will allow the scope of the familiar to limit their vision of what is possible. You really need people who stay on top of these things and can say that yes, it is possible to animate this now, or it is possible to eliminate a ton of cruft here, and also to be aware of whether that feature is available on all target platforms, or whether it can be made optional with progressive enhancement, or ... it's a very complicated, complex thing to handle. It takes years before a feature is just known and accepted. Often, it takes a new generation of programmers who've grown up with that feature to know how to use it. [Forget about yesterday's cruft!] Just think: today, you can build responsive, progressive, fast, pretty, and accessible web sites with no layout hacks and no JavaScript. Everything just works. 
But you haven't always been able to do that, so there is a large percentage of the web-developer community that is not aware that this is the case because they stopped paying attention a while ago and are stuck on the feature set that they know. At best, they're aware that a feature exists but wasn't ready for primetime when they last checked, even though they've not checked in a while. Even if they're aware of it, they might not have the time or budget to use it in existing projects, where everything has already been tested. Who's going to risk ripping out a ton of custom code to replace it with two lines of CSS, when you have to test everything all over again?

A roundup of .NET 9 release videos

2025-02-15T21:48:49+01:00

Published by marco on 15. Feb 2025 21:48:49 Updated by marco on 11. Apr 2025 11:18:08 ------------------------------------------------------------------------ In November of 2024, Microsoft released a lot of videos about .NET to accompany the release of .NET 9. I watched/listened to a lot of these, with varying levels of attention paid. When something caught my attention, I took notes. These videos are roughly in the order that I watched them, although I reserved the right to shuffle them about a bit to improve grouping. "AI" <#ai> 1. "AI Building Blocks − A new, unified AI layer by Steve Sanderson" <#sanderson> 2. "Discover the Latest GitHub Copilot Features for .NET Developers in Visual Studio" <#Discover> 3. "Building and scaling cloud-native, intelligent applications on Azure and .NET" <#Building> "Aspire" <#aspire-section> 1. "What's new in .NET Aspire" <#Aspire> "ASP.NET" <#asp-section> 1. "Bulletproof ASP.NET Core APIs: The OWASP API Security Top Ten" <#Bulletproof> 2. "What's New for ASP.NET Core & Blazor in .NET 9" <#Blazor> 3. "Exploring the New Fluent UI Blazor Library: Next-Gen Web Components and Architectural Innovations" <#Exploring> 4. "I Confronted Microsoft About Blazor's Future" <#I> 5. "Introducing HybridCache in ASP.NET Core" <#HybridCache> 6. "Easily Improve Web Application Performance using .NET 9 Caching and Redis" <#Redis> 7. "Clean Architecture with ASP.NET Core 9" <#Clean> 8. "New tools in Visual Studio for Web API developers" <#New> "Desktop" <#desktop-section> 1. "Modern WinForms Development with .NET 9" <#Modern> 2. "MVVM Building Blocks for WinUI and WPF Development" <#MVVM> 3. "How Fidelity uses .NET MAUI for Cross-platform desktop" <#How> 4. "Cross-Platform Magic: Transforming WPF Apps with Avalonia XPF" <#Cross> "General" <#general-section> 1. "Advanced Pattern Matching in C#" <#Advanced> 2. "Testing.Platform, the new way to run .NET tests" <#Testing> 3. "Modernising Legacy .NET Codebases with NDepend" <#Modernising> 4. 
"F# - Nullable Reference Types" <#F> 5. "Embedding Python libraries in .NET services and applications" <#Embedding> [AI] [media] The API looks very approachable and straightforward to use. I wish that they would stop papering over the inaccurate responses, though. At about 17:45, he writes that the additional pair of socks has been added to the cart and that it's "gone up in the way that it should." Except that's not what the response showed. The response showed the total number of pairs of socks in the cart, yes, but it showed the price only for the additional pair of socks that was added in the last step. It noted that this was the case but it was quite confusing to show the total number of items in the cart and then write "the total price for that pair is", which would confuse a reader into thinking that it was the total for the cart, unless they read carefully. Using language like "total" for a single item is confusing, if not misleading. This is the kind of stuff that would ordinarily be cleaned up in a final product but it's doubtful that this kind of stuff will be cleaned up in AI-generated prototypes that are scaled up to production, especially since people that will know how to fix it might become rarer. It will be deemed "good enough" and the inexorable erosion of software quality will continue. Also, Sanderson had GitHub Copilot running during the entire demo and he pretty much completely ignored all of its suggestions, choosing instead to copy/paste pre-written snippets. This is fine, of course! It's just that, ... why didn't he just turn off the annoying prompts that constantly plastered completely irrelevant information into his video? It was quite distracting. -------------------------------------------------------------------------------- [media] I suppose this is the future of programming? 
In this video, you can watch a young person ask an AI to add usings because she has no idea what they are and doesn't know that the IDE could just add them for her automatically. She described everything as "awesome" and thought that the interaction loop was super-intuitive and easy to use, as she typed out natural-language command after natural-language command to try to get the machine to do what a programmer could have done in seconds with analysis-based tools. I suppose if you want to program without knowing anything about the technologies, then this is probably going to get you a little bit further. Maybe. It was pretty painful to watch, though, like someone claiming that they were building a house by throwing wood at bags of cement and hoping for the best. -------------------------------------------------------------------------------- [media] This is a longer presentation -- almost an hour -- that goes in-depth on converting an ASP.Net Web application first to an Azure Function, and then adding a .NET Aspire Host project to not only coordinate communication between the front-end client and the Function project, but also to facilitate deployment directly to a Cloud Container. Hunter explains the difference between WithReference(), which indicates that a service depends on another service being started, and WaitFor(), which extends the reference to indicate that a service should wait until a health-check of its dependency indicates success before declaring itself available. If you think about it, almost all references are important in this way, but .NET Aspire still makes the distinction to give your app flexibility in starting up. If you have two services that depend on each other, they can't each wait for each other, or you might get a deadlock (unless the health-check can return success before the service itself is ready). 
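The deadlock concern above is easy to see in a toy model. The following Python sketch is not .NET Aspire's API or its implementation; it's a hypothetical `startup_order` function that treats "waits for" as a dependency graph and refuses to produce a start order when two services wait on each other. The service names are invented.

```python
def startup_order(wait_for):
    """Return an order in which every service starts after the services it waits for.

    wait_for maps a service name to the names it waits on. Raises ValueError
    when the waits form a cycle, since no service in the cycle could ever start.
    """
    order = []
    state = {}  # name -> "visiting" | "done"

    def visit(name, path):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            cycle = " -> ".join(path + [name])
            raise ValueError(f"wait cycle: {cycle}")
        state[name] = "visiting"
        for dependency in wait_for.get(name, []):
            visit(dependency, path + [name])
        state[name] = "done"
        order.append(name)

    for name in wait_for:
        visit(name, [])
    return order


print(startup_order({"web": ["api"], "api": ["db"], "db": []}))
# ['db', 'api', 'web']

try:
    startup_order({"web": ["api"], "api": ["web"]})
except ValueError as error:
    print(error)  # wait cycle: web -> api -> web
```

A plain reference without the wait would simply not appear in `wait_for`, which is why keeping the two concepts separate leaves room for services that know about each other without blocking each other's startup.
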
On top of that, there is a method called WithExternalHttpEndpoints(), which they describe as "doing the right thing" and setting up a virtual private network in the cloud container so that only the web client has access to the Azure Function endpoint. "It's network-isolated by default, which is one of the features of container apps, which is the default way of publishing in an Aspire application." This is very cool and seems a lot easier than writing a bunch of custom Bicep code. The web client can now access the Azure Function at the alias that it assigned in the .NET Aspire host project's configuration. That configuration is all written in C#, with a nice fluent API. I'm a little disappointed that they don't use a shared C# constant to reference the unique strings, but nobody in any of the .NET Aspire demos seems to do that, preferring to ride the ragged edge of disaster with copy/pasted identifiers. Getting back to the demo: He shows how it's published to Azure, using Managed Identity. The .NET Aspire dashboard shows the remote resources with full logging available, also available and published to Azure. It's the same dashboard as you would use locally. Everything they demonstrated is available in several solutions, all listed in "Samples for Building and Deploying Cloud Native Intelligent Apps" by Paul Yuknewicz . They say, "All you do is clone the repo, and then you do azd init and azd up and they're super-easy to get in the cloud and try yourself." It will deploy the resources into your subscription and region of choice. There are detailed instructions for each example, e.g., "azure-functions-openai-aisearch-dotnet" (RAG example). Paul goes into this example in more detail, examining the Bicep scripts (which I'm not sure whether they're hand-written or generated by .NET Aspire). They cover how to build an embedding for RAG and how much support there is in .NET now for making this kind of thing easy. 
They use a standard HTTP body as input but discuss many other potential input streams (queues, etc.), all of which are just as easily supported by default. I learned that "agenting" is just the cool new term for allowing an LLM to use "tools". It's also called "skilling" (adding "skills"). The most approachable epithet is the least-cool-sounding one: "function-calling", which at least explains what it does. So "agenting" is empowering an LLM to execute tools in order to enhance results. Like many of the other videos, they use the ".http Files" feature to store recipes of calls to make against an HTTP server. When their demo doesn't initially work, you see how Microsoft also has a locked-down network that they have to work around. They very smoothly transitioned to Scott discussing security initiatives that Microsoft has taken and is taking to lock things down by default for new applications, including those generated by customers. Secure by default. They then smoothly transition back to the demo, for which Paul has reset his network to a working state. Nice job. The funny thing about some of these demos, though, is that they used an LLM to find out what time it is in New York City...and everyone held their breath to see if it would get it right. It took about 10 seconds to figure it out. That's laughable on its face, but the point is that it's now quite easy to set up a powerful tool to round-trip to an LLM running securely in Azure, built with access to custom functions (agenting) and custom data (RAG). Subsequent demos are more impressive. At 40:00, Paul addresses why it's interesting to solve problems in this way: you can scale up quickly to much higher data volumes without changing the architecture or implementation at all. This is great but you want to remember the aphorisms, "you're not Google" and "you don't have big data." In this case, though, an argument can be made that these technologies are the right way to build it when it's small and when it's large. 
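To make the "function-calling" idea mentioned above a bit more concrete, here is a rough Python sketch of the host side of such a loop, using the New York time question from the demo. Everything in it is an assumption made for illustration: the tool name, the shape of the request dictionary, and the fixed UTC offsets (which ignore daylight saving time). No real LLM or Azure service is involved; the "model request" is a hard-coded stand-in.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets; a real tool would use a time-zone database
# and handle daylight saving time.
UTC_OFFSET_HOURS = {"New York": -5, "Zurich": 1}


def get_local_time(city, now_utc):
    """Tool: return the local time in `city` as HH:MM."""
    offset = timezone(timedelta(hours=UTC_OFFSET_HOURS[city]))
    return now_utc.astimezone(offset).strftime("%H:%M")


# Registry of tools the model is allowed to call.
TOOLS = {"get_local_time": get_local_time}


def handle_tool_call(call, now_utc):
    """Run the tool the model asked for and return the result to feed back to it."""
    tool = TOOLS.get(call["name"])
    if tool is None:
        return {"error": f"unknown tool: {call['name']}"}
    return {"result": tool(now_utc=now_utc, **call["arguments"])}


# Stand-in for what a model might emit; no real LLM is involved here.
model_request = {"name": "get_local_time", "arguments": {"city": "New York"}}
now = datetime(2024, 11, 15, 17, 30, tzinfo=timezone.utc)
print(handle_tool_call(model_request, now))  # {'result': '12:30'}
```

The model never computes the time itself; it only names a tool and its arguments, and the host program does the actual work and hands the answer back.
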
The support and abstractions are good enough that you don't have to choose a non-scalable solution early in the process to save money. On the subject of ignoring warnings in code, though, Paul had a typo in the word "cacluated" that Visual Studio showed him and he still hadn't bothered to quick-fix it, even though the entire world would be seeing it. The final example is showing how to use a "blob trigger", which is a function that reacts when new data is added to an Azure blob container. When you drop a PDF into it, it uses standard, available recognition tools to analyze the document and then funnel that content to the LLM (which is not great at "cracking" PDFs provided as context on its own). These dependent tasks are captured as "activities", which are composed as part of a "durable function", which is essentially a high-level abstraction on top of potentially distributed calls. This is the part that Paul called "orchestration" at the beginning of this section. It's not orchestration like Kubernetes (although possibly related, way down at the low level), it's orchestration of high-level activities and representing them as a single function call that takes an indeterminate amount of time. Paul demonstrates how much tooling and web-based observability there is available for debugging and monitoring solutions. "Again, this is Functions, so you can do this at scale. You can send millions of documents." It just costs money, but you're not otherwise limited by the architecture if you build it with these concepts, this architecture, and these building blocks. [Aspire] [media] This addition to the .NET ecosystem continues to impress me. While this video demonstrates only .NET Aspire, the previous video ("Building and scaling cloud-native, intelligent applications on Azure and .NET" <#Building>) also made heavy use of Aspire. Many of the videos would include Aspire, actually. 
[ASP.NET] [media] This is a very useful introduction to common security issues and how to address them. He talks about how to program by default so that the issues never come up. At around 19:00, he even discusses how to build a threat model. He kind of backs into describing it by talking about the types of risks for which you might need processual mitigations. That is, the threat model talks about something like "the system allows a single user to book multiple seats for themselves on a plane" and then talks about (A) whether you even want to mitigate this and (B) which kinds of mitigations would work against it. -------------------------------------------------------------------------------- [media] This was pretty informative, overall. I wish he'd spent a bit more time on HybridCache, which seems like a big win. Oh, hey, look at that: there's a video for it, called "Introducing HybridCache in ASP.NET Core" <#HybridCache>. -------------------------------------------------------------------------------- [media] This library is quite nice and seems to offer a good basis on which to start projects. It's interesting, though, how people just say things that they've heard. Denis said at one point that Playwright will be integrated soon in order to improve "code quality", which is absolutely not what tests do. Tests do the second thing he mentioned: "avoid regression." If you have tests, your code can be any old quality. The tests don't care, as long as they're green. -------------------------------------------------------------------------------- [media] I mean, honestly, can you just write "interviewed" instead of "confronted"? Do we really have to write everything as if it were a title in a Fleet Street broadsheet? It's a friendly interview. * They discussed the difference between .NET LTS and non-LTS versions. There is no difference in quality. * Why wasn't there so much Blazor news this year? 
The .NET 8 release was so huge that they spent a lot of time in the first half of the year after the release simply stabilizing that release, and then focused on quality improvements in .NET 9. There is a big feature that consolidates the different ways of using Blazor into a more uniform concept. * Who's using Blazor? How big is it? Year-over-year growth is high but the overall usage numbers are still kind of low, relative to other frameworks, like the by-now ancient WebForms, ASP.Net MVC, ASP.Net Core, etc. * Is Blazor going to go the way of Silverlight? No, it's the recommended way to develop web sites on .NET. Nick gave a good intro here, talking about how good WebForms actually was -- taking aside ViewState -- but that frameworks like Next.JS are still re-inventing what WebForms had already offered and pawning it off as a revelation. It's kind of how most of the server-side frameworks now just look like PHP. * .NET Aspire has a super-short support cycle; it goes obsolete with the first point release. You have to upgrade it rather aggressively to stay in-support. It's not part of the .NET release. It's out-of-band. It's super-useful but it's a bunch of tools and wiring without much of its own API. It makes sense to keep support cycles short because, while it's been released for others because it's so damned useful, it's also acknowledged that the surface will potentially change quite a bit as more and more real-world use cases appear. * Why do we even need Blazor? Microsoft isn't using it anywhere, is it? The problem that Blazor solves is trying to build a web site with a team that doesn't know any of the languages, tooling, or paradigms of front-end development. For the vast majority of web sites, you really don't need full-fledged React or Angular or Svelte. While there are developers who can legitimately live in both worlds, Blazor is for those for whom a good web site is good enough. 
You can make anything in Blazor, of course, but it really helps you get to a standard, good-enough view (especially with Blazor Fluent UI) that covers so much of the software being built. Roth describes how the client-side world is such a different beast and that spinning developers up to be productive and happy in that world takes a lot of time, money, and resources. It's a fair point. Many people just can't wrap their heads around that style of development. It's too alien to them. I would venture to say that most web developers aren't very good at software-development, don't really understand the environment or their tools, and are just cargo-culting their way to freedom and happiness. It's why we've had so many RAD dev environments, it's why we have so many frameworks. Even after a good 20 years of development churn, things are still churning. There is almost no consensus on how to address the plethora of non-functional requirements in clients: accessibility, compatibility, graceful degradation, progressive enhancement, etc. The best philosophy seems to be PWA and probably an MPA not an SPA for apps, which is not most web sites. Most web sites are mostly static. So the common web frameworks are ill-suited to those kinds of pages. Etc. Etc. "Blazor lets me get more with less. I can't afford to hire a full-time front-end developer." * Why is .NET trying to do everything? Isn't that a recipe for being mediocre in most things? "I think it's fair to say that there are parts of the .NET stack that have ... more strengths than other parts." I think that there's a real need at Microsoft for doing a lot of things in all areas. MS needs to develop web sites, needs to develop mobile apps, needs to develop cloud-based apps, etc. They're going to develop at this scale anyway. We can be happy that workmanlike versions of these tools are available, and cross-platform. 
Apple makes some amazing technology that only works on their hardware and on their systems and they don't have a cloud. .NET runs on AWS, GCP, Azure, etc. .NET runs on ARM, x86, Linux, Windows, MacOS, etc. The base library is incredibly well-designed. * How big is the Blazor team? There are six full-time devs. It sits atop other parts, like SignalR, which has two full-time devs. The actual framework is six engineers. He mentions that the community does some heavy-lifting here as well. * Who's using Blazor? Roth mentions some customers, and then talks about how it's used in a lot of places internally at Microsoft. They don't use Blazor for Teams, Office365, large customer-facing products. Those use React. A lot of that is historical, because Blazor has only been around for 5 years. They used to use Script#, which was a transpiler for C# to JavaScript, but then they moved to React. A ton of the recent, internal LOB-style products use Blazor. That's what it's for. Smaller teams use Blazor and there are thousands of devs who use Blazor at Microsoft. * Is there anything public-facing that uses Blazor? Very little. The Aspire Dashboard is one of the only things. Part of that reason is "technology fit". Blazor is very good for internal LOB that runs on a known set of devices and capabilities. Think Office, though: they need to be able to run on anything. That's a completely different proposition because it constrains you more. You need more control of the stack. A high-level solution like Blazor doesn't save you time there; it costs you time. For products that need to be optimized in terms of download-size and speed, etc., then you probably should use JavaScript directly. They recommend Blazor if it fits your scenarios. Otherwise, use ASP.Net Core with a JS front-end. He made the comparison to using Node on the server. It's not the optimal thing for performance, but it might match your team best. But you can scale the server-side with money. 
You can't scale the client-side. Roth agrees: if you have millisecond-initial-download constraints, then Blazor isn't for you. He does say that you'd be surprised how many apps aren't like that. I'm not surprised. He says that even the heavyweight Blazor server model, which is basically PHP-style, if we're honest, then you can support dozens of thousands of concurrent users on a single, modestly sized VM. Most apps have expectations of hundreds of concurrent users, at most. Being server-based will restrict your interactivity if you rely on it too much with server-based stuff. For forms and LOB, though? It's fine. You're not Facebook. Relax. * How many people use the web-assembly stuff instead? Quite a few, actually. The server-side Blazor is slightly more popular, though. They're both growing at about the same speed. .NET 10 plans to invest more in the server-side version. They need to solve some problems about server-side state: hydration strategies for longer-lived processes and workflows. * What if you had to make an app for millions of users? If it's B2B or LOB, then consider Blazor. Start with Server-side, then move to interactive server-rendering on the client, move to WASM-based to push individual islands to the client where necessary. If it's customer-facing, then it's going to be .NET on the backend and a JavaScript front-end (he doesn't say "React", notably). I think MS engineers are also seeing the value in writing to the web platform, using JavaScript. He doesn't even say TypeScript, because they're so close these days. It would be amazing if browsers allowed the syntax directly so we didn't have to transpile anymore. There's a "proposal" for this. * Yeah, but which platform would you use if you couldn't use Blazor? TypeScript because static typing is awesome. He mentioned that the Angular 18 release was "pretty compelling". 
I personally looked into it based on this and couldn't figure out what he was talking about; it looked like an incremental upgrade to me. * Daniel asked what about server-side rendering? Would you look at Next.JS? SSR? Server-side components? Nick says Next.JS but I don't agree. Their solutions, just like those of Remix, get ... complicated at scale: their attempts to paper over the difference between client and server parts, so that you aren't forced to know where anything is running, tend to be quite leaky abstractions. I've read quite a bit about Remix and Next.JS and, in both of those, I've seen cracks that people deeply familiar with the technology think "aren't so bad" but that's only because they know that it used to be so much worse. Daniel says "Look around at what is going to be around for a while. Everybody has to plan for the longevity of their career." This is so sad, though. We don't get anything great from people "planning for the longevity of their career." We get cool things from people who just can't help but try to make something better, to make something cool, to make something to help themselves, that interests them. Dude, React is a hype. Most people are using it poorly. You get to leaky abstractions in the first two days of teaching, where you have to tell people how to avoid horrible performance with useMemo() and useCallback(). They're working on a compiled version of React, which is just where Svelte has always been. * What's next for Blazor? Roth talks about SSR a lot. Interaction between SSR and client-side, etc. Performance and caching. I think Blazor will be a better, more well-thought-out and much less ad-hoc approach to SSR than Next.JS and Remix have gotten. Why? Because the people that Microsoft has and the culture that they have tend to produce really good APIs. That's just a fact. The multi-threading feature is on the radar again. 
They're going to try again, but it's not committed yet. Security is the #1 push right now, though. So if Blazor has work to do there, then that takes priority. This will result in a more secure stack for users of Blazor as well. * If you had to work with a different back-end language, what would you use? Daniel responded "Python". Nick said "Kotlin", which he says is how JetBrains fixed Java by making it C#. I would take another look at Swift, which I haven't used for anything real since version 5. Or maybe finally do something in Rust, just to see where the tooling is at. -------------------------------------------------------------------------------- [media] This video covers Redis, HybridCache, and stampede-protection. -------------------------------------------------------------------------------- [media] Moar caching. Pretty good. -------------------------------------------------------------------------------- [media] I follow this guy's blog and find many of his articles to be a bit basic. It feels like he's writing just enough articles to keep his Microsoft MVP badge. In this video, though, he is en fuego, absolutely ripping through a whirlwind introduction to clean architecture, with some demos and some code. You can find the sample projects in "NimblePros / eShopOnWeb" and "ardalis / CleanArchitecture" . He also very quickly demonstrates how to use the new API window to submit requests to the running server; this replaces external tools like Postman, keeping you within Visual Studio (or Rider, which has supported this for even longer). In his full-tilt presentation, he also mentions using "Papercut-SMTP" , which is "a 2-in-1 quick email viewer AND built-in SMTP server (designed to receive messages only)." This is ideal for local devs to test emailing code and can be easily integrated and started with .NET Aspire. After having generated a solution using his clean-architecture template, he says that "this is the slowest part ... 
opening the new solution in Visual Studio," but this is really unfair because the solution is restoring and doing everything in the background, while he's clicking around in the solution explorer. Previous versions of Visual Studio would never have allowed this. He even launches the product and is browsing around in the Aspire Dashboard and the web server's OpenAPI front-end within seconds. It's an impressive demo. -------------------------------------------------------------------------------- [media] Sayed starts off with a good overview of the basics of ".http files" , which showed up with .NET 8 and allow you to keep sets of API calls for testing, much in the way that people have been using Postman or Insomnia. The new feature is that .http files can now store values in variables for transferring results from one call to others. He uses Visual Studio but, as I've noted elsewhere, Rider has supported them for even longer than Visual Studio. He also shows a not-quite-ready-for-primetime-but-coming-soon feature of Visual Studio called the Endpoints Explorer, which is a sort of Swagger/OpenAPI browser available as a Visual Studio panel. [Desktop] [media] This was quite an interesting video, in that it really drives home that WinForms is here to stay. The community pushed hard to make a lot of the code base expose and use nullability. Microsoft has also improved performance in System.Drawing and replaced all interop with code generated by "CSWin32" . There is also improved support on base UI objects for asynchronous calls like Form.ShowAsync and so on. I like that his demo to show text in a color-mode-aware manner failed because he was creating the brush with the right color but he wasn't assigning it anywhere. How do I know he wasn't using it anywhere? Because Visual Studio had grayed out the instance variable to which he had initialized his brush. He'd assigned the brush but hadn't actually assigned it to be used by any control. 
This is why you configure and then pay attention to the warnings and suggestions in your IDE, folks. It really does help you solve otherwise pretty hairy problems. In this case, I was able to diagnose his problem just from a brief flash of less than a second of him scrolling through his file. It hurts me so much to watch people click toolbar buttons to comment/uncomment code. Seriously, you're fired. So, cool video but this guy really shows how outdated some WinForms developers are. 😹 -------------------------------------------------------------------------------- [media] I learned that the latest preview version of the MVVM Community Toolkit is already using partial properties -- which are new to C# 13 in .NET 9 -- to help you write even less code for your view models. Also, you can use x:Bind instead of Binding to have a compile-time, reflection-free binding, which has much better support for code-completion, inspections, etc. -------------------------------------------------------------------------------- [media] This is a pretty impressive demo of what is now possible with Maui. They're using Telerik controls. The neat thing about this Maui app is that it runs on MacOS as well. They talk about WebViews a lot -- and painting to the canvas -- so it also runs in a browser. He does discuss how using SkiaSharp is a valuable place to seek performance but that you are almost certainly going to make usability or accessibility concessions. Use source generators, not reflection. Interestingly, they mention how you can speed things up with Windows Defender by signing assemblies with "HSM methodology", which is something to look into, I think. I just can't help but think that it isn't any faster or better-looking than the trading app that I was lucky enough to be able to build for Peak6 back in 2010--2013. 
That one was a multi-threaded WinForms app that connected to a data hose that shoveled tons of data toward the app -- dozens to hundreds of thousands of refreshed data rows per second -- in dozens of open windows and portals. Each of the grids showed data in grouped, tiered, aggregated, and heat-mapped views. When it really got going, you had 40+ open windows on eight screens, all updating in real-time. Close the app and re-open it and it came right back where it was. It was very, very colorful. -------------------------------------------------------------------------------- [media] This is a great look at Avalonia, which is a very interesting migration path from the Windows-only WPF to a multi-platform approach that doesn't involve rewriting everything in Maui or Blazor. With Avalonia, you can port from WPF, primarily by changing the styles to use the more CSS-style styling of Avalonia. Because of this feature, though, an Avalonia app can relatively easily be deployed as a web application. He discusses a community project "Avalonia Visual Basic 6" by BAndysc , which you can "browse in a demo" . It runs in a browser, it runs on all desktop platforms -- including Linux, which Maui doesn't support. On top of that, it also supports something called Avalonia XPF, which is a complete replacement for the WPF rendering engine, so you can "port" an app just by changing the SDK used in the project files. That's it. He demonstrates it live and it works extremely quickly and seamlessly. Of course, if you have P/Invokes or a lot of custom rendering -- or external components that aren't compatible -- then you'll have to do more work. But it's a huge step forward to getting WPF apps running on other platforms. For charts, he mentions that "SciChart" is the "best" charting library for WPF and that it is compatible with Avalonia. They use the XPF layer to "provide support for Linux platforms" . 
His final demo shows a WPF app (the calculator) running with XPF but targeting a web browser. This is wild. I had completely underestimated Avalonia. [General] [media] At about 8:30, he shows a nice usage of a switch expression with list patterns to make a recursive summing function. static int Sum(Span<int> l) => l switch { [] => 0, [var x, .. var xs] => x + Sum(xs) }; Nice! There were a lot of interesting examples in this video. The final one for refactoring a red-black tree was really cool. It's funny how bad these people are at demos, though. He showed us that he had 513 tests running and passing in 0.5s. Then he says that the passing tests are the only thing that's important. Um, no, it's also important that rebalancing is done in a reasonable amount of time, so we should also keep an eye on the time the tests take with any refactored implementation. To demonstrate that the tests actually test the code he's going to refactor, he wiped out the entire implementation and re-ran the tests. But they didn't run because he was no longer returning a value from his method, so it didn't even compile. He blew right by that and said "see, the tests don't run." Um, no, the program no longer compiles and you haven't proven anything about the connection between the implementation you're going to refactor and the tests. All that aside, though, it's quite an elegant solution that looks just like the original Haskell code. It's not legible at a glance but is a very succinct representation that uses the standard style for these kinds of things. [image] -------------------------------------------------------------------------------- [media] The Testing.Platform is very nice. It promotes test suites to first-class citizens, built as executable files without any reflection-based assembly-scanning at runtime. Instead, the source generator scans and generates code for running the tests. 
It also supports a "watch" mode (called "hot reload", of course), which lets you keep the tests running as a separate app. It's much faster and more reliable, it's AOT-friendly, etc. etc. When it was initially introduced in January of 2024, the only drawback was that you could only use it with MSTest. That's changed! At 23:00, he shows how to enable and run NUnit-based and xUnit-based tests with Testing.Platform. I really, really like how MS-based projects like this embrace open and non-Microsoft standards: the testing platform is to replace VSTest, which only ran on Windows. Testing.Platform is platform-agnostic and testing-framework-agnostic, bringing its reflection-free, AOT-capable, runtime-stable approach to everyone. He demonstrates running a solution with NUnit, xUnit, and MSTest tests running standalone, collectively with dotnet run, and in Visual Studio. "Rider / VSTest" writes "JetBrains Rider can run tests from any custom test framework that uses VSTest or Microsoft.Testing.Platform." "ReSharper / VSTest" writes "ReSharper can run tests from any custom test framework that uses VSTest or Microsoft.Testing.Platform." Good news all around. This thing is ready to be used! -------------------------------------------------------------------------------- [media] I really like the tool NDepend and I was quite interested to see how Lin used it to upgrade a codebase from .NET Standard 2.1 to target .NET 8, and then .NET 9. He's from Singapore so he was not easy for me to understand -- but he definitely knows what he's doing. This video reminded me of a time about 9-10 years ago when I was really heavily using the tool to modernize the Quino code-base, which had grown quite organically and was proving difficult to use for only web servers, especially those running on Linux. We made our own journey from .NET Framework 4.7.2 to .NET Standard 1.0 (didn't work at all), then to .NET Standard 2.0 (success!). I continued to use the tool for the next five years. 
Here is a list of related articles, which I argue go into more depth on how to use NDepend than the video does. NDepend hasn't changed a significant amount in 10 years -- it was an incredibly powerful tool then, and it still is. I haven't used it much at my new job at Uster but time will tell. * "2014: The Road to Quino 2.0: Maintaining architecture with NDepend (part I)" * "2014: The Road to Quino 2.0: Maintaining architecture with NDepend (part II)" * "2015: Splitting up assemblies in Quino using NDepend (Part I)" * "2015: Iterating with NDepend to remove cyclic dependencies (Part II)" * "2018: The Road to Quino 2.0: Maintaining architecture with NDepend (part I)" * "2019: Finding deep assembly dependencies" -------------------------------------------------------------------------------- [media] This is by far the nerdiest video in the entire series. It's chock-full of interesting information about F#, with a focus on the new support for null, which it has in order to better interoperate with .NET modules built in other languages that do support null. F# is a functional language and uses options. However, Tomáš demonstrates how this is not sufficient when working with data coming in from outside of the F# system. The feature piggybacks on the |-operator to allow | null in any type definition. Most of the rest of the presentation shows how the new feature integrates with options, pattern-matching, generic types, etc. There are analyzers in the compiler that help your code shed "nullness" as soon as possible, leaving most of your F# code without nulls, as God intended. Tomáš calls "shadowing" a feature, which is being a bit generous. It's a nice trick to declare a "new" version of the incoming argument that has a type narrowed by a null-check function call. Languages like TypeScript and C# actually have a "feature" in which the type is narrowed without variable-shadowing. It amounts to the same thing, though. 
I suppose F#'s version is less gimmicky and implicit, but shadowing is frowned upon in so many other places, because it's super-confusing when done inadvertently. Using shadowing to narrow a type is a clear use case but it will also prevent analysis from being able to preclude accidental shadowing. This video is mostly just an F# tutorial, though. At the end, a guy named Kinfey pops in to tell Tomáš that he better wrap things up. It's quite unprofessional but also quite funny. -------------------------------------------------------------------------------- [media] This isn't the most organized demo, and they don't really show how to set things up but it does show how integration of Python into .NET is much simplified by source generators that generate bindings for marshaling data to and from Python. Not only that but .NET Aspire is indispensable for configuring a system like this, not only for tying together the moving parts -- PostgreSql, Python API, web front-end -- but also for monitoring not only the startup but also API executions, which you can track in a nice process-graph for each request (shown below). [image] It even shows how the chart is rendered in Python, returned as bytes to C# and then rendered into the body of the response directly (basically sending back an image rendered in Python without conversion). Unlike previous attempts like IronPython, this approach uses .NET Aspire to simplify integration of Python projects and code without changing it. It just integrates it, like taking care of setting up the PostgreSql database and then passing the connection string to the Python code. Anthony discusses at the end how the common data types used in Python ML processing (tensors, etc.) are all supported in an efficient manner, allowing you to pass buffers back and forth from Python to .NET and offloading code like web servers and GUIs to .NET development while benefitting from existing Python libraries. 
I imagine that this is just an interim solution until the .NET ecosystem catches up more with Python -- which it is definitely going to do, if it hasn't largely done so already.

Mads Torgersen on union types, existential types, and C# missteps

2025-02-11T22:39:11+01:00

Published by marco on 11. Feb 2025 22:39:11 ------------------------------------------------------------------------ This is a nearly 100-minute-long interview and discussion about programming-language design and evolution. It gets deep into the weeds on very specific and relatively advanced language features. While a feature may eventually feel quite simple to use, the considerations about how to design it and how to fit it into the landscape of the rest of the language can be very, very complex. There are a lot of moving parts to consider in a language, runtime, community, and ecosystem as established as .NET and C#. [media] [Existential Types] At about 45:00, Mads describes a higher-level typing system that allows for "hiding" some generic-type parameters. "There are various features that are related but not quite the same. There are 'associated types' in some languages and there's what one scholar calls 'abstract types' [I think he meant 'existential types'], which might be my favorite version of the feature, which are kind of an alternative to generics or a kind of generics. "The really short version is they help you not have so damn many type arguments all the time, yeah? Okay; essentially think of it as a class. Instead of a class having a type parameter saying, I'm an animal with a type argument saying which kind of food it eats...that means every time you talk about animals, you have to pass type arguments around. That's really annoying because what kind of food it eats is inherent to it. It shouldn't be like something on the outside; it should be a member, saying my food type is [whatever] ... "...and if you do that, then you can kind of tamp down on a lot of the...sometimes you just end up in generics overload, or passing the same stuff around. And every one of these related types has a type argument for which particular implementation of the other related interfaces it is using and they all carry the same five type arguments around all the time. 
"So that would be a feature, if we can get it right, and if we can work it into the runtime, and it's limited -- and we are occasionally talking about it -- that could be a really really beautiful and quite impactful addition to C#." It is described in detail in "Proposal: Existential types for interfaces and abstract types #8711" . While Mads's description of it made my ears perk up, the proposal does a better job of explaining it. An interface might look like this: interface ICounter<protected T> { T Start { get; } T Next(T current); bool Done { get; } } Describing the visibility of the generic parameter with the keyword protected is part of the proposal. Since the parameter is protected, external users would use the interface without the generic parameter, like this: void M(ICounter ic) { var x = ic.Start; while (!ic.Done) { x = ic.Next(x); } } Only an implementation is required to provide a type parameter: class Counter : ICounter<int> { int Start => 0; int Next(int current) => current + 1; bool Done => current == 42; } [Type unions] At 58:00, they discuss what something like "discriminated unions" or "tagged unions" would look like in a decidedly object-oriented language like C#. "Mads: We arrived at this degree of clarity around what our options are. The type unions -- use of the word 'types' there reflects that one conclusion that we reached is that, in C#, unions should be 'unions of types'. If you look at F# or other functional languages, discriminated unions are not unions of types. They're unions of something with a name, a tag-discriminator, whatever you want to call it, the tagged unions that can then be deconstructed to give you values of one or more types. So the different options are like 'named options' but they're not things in and of their own right. They're just a means to get to what's inside. "In C#, one thing that we agree on is that that has to change. It has to be that the things are types. 
So you don't have to pattern on a union and get an A or you get a dog and immediately you have to decompose it into how many legs it has and what it eats. You know what its name is. You can carry it around as a dog. It makes sense in its own right and it can be its own object. So that's essentially trying to take an object-oriented view on what discriminated unions would look like in C#. "Nick: Fundamentally, discriminated unions are a very functional-programming-like concept and they don't fit in a language that already has inheritance right? Because the idea is it's their version of inheritance. "Mads: Exactly." [Design mistakes in C#] At 01:22:00, Nick asks what Mads's three least-favorite/most-hated features of C# are: things that he wishes had never gone in or that, in a better world, he would remove. Mads ends up naming four. 1. Events, because they should never have been a language feature. They should have been a runtime/library feature, at most. 2. Delegates, because they were forced to be collection types because of events. Their execution dynamics are indeterminate (because of ordering) and they actually break covariance (kind of like arrays). 3. Void should have been a type (as it is in TypeScript and Eiffel). Its being a language feature means that it bifurcates all other support, like requiring a distinction between Func and Action. 4. dynamic is a pretty great feature academically but the value of the feature ended up being much less than they'd hoped. "Performance-wise, it's a disaster. There's a whole bunch of infrastructure to maintain it. It doesn't carry its own weight." [Nullability] The final part of the discussion is about nullability-checking in the compiler, and its limitations and strengths. "The point is that it's so much better than nothing." The feature makes you think about what even should be allowed to be null. My advice is to avoid returning or accepting null unless you absolutely can't avoid it. 
Don't ruin your API, of course, but be absolutely sure that null needs to be an option. It is much easier to write code to APIs that never return null. Consider using sentinel objects instead. The feature in C# has gotten a lot better and is very good now. init and required properties helped finalize the feature.
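The advice about sentinel objects can be sketched roughly as follows. This is a minimal illustration written in TypeScript rather than C#, and every name in it (User, UnknownUser, findUser) is hypothetical, not something from the interview; it only assumes that callers are better served by a well-defined "nothing" value than by null.

```typescript
// Hypothetical domain type, for illustration only.
interface User {
  readonly id: number;
  readonly name: string;
  readonly isKnown: boolean;
}

// The sentinel: one shared, immutable instance that stands in for "no user".
const UnknownUser: User = Object.freeze({ id: -1, name: "(unknown)", isKnown: false });

// Hypothetical in-memory data source.
const users: readonly User[] = [
  { id: 1, name: "Ada", isKnown: true },
  { id: 2, name: "Grace", isKnown: true },
];

// Never returns null or undefined, so callers need no null-checks.
function findUser(id: number): User {
  return users.find((u) => u.id === id) ?? UnknownUser;
}

// Callers can use the result directly ...
const greeting = `Hello, ${findUser(42).name}`;

// ... and can still detect the sentinel when the distinction matters.
const found = findUser(1).isKnown;
```

The trade-off is that the sentinel has to behave sensibly everywhere a real object would, which is why it probably works best for types with an obvious "empty" or "unknown" state.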

CSS: Popovers and anchors

2025-01-27T13:49:06+01:00

Published by marco on 27. Jan 2025 13:49:06 ------------------------------------------------------------------------ The article "Do JavaScript frameworks still need portals?" by Ollie Williams explains a bit more about the relationship between dialog, popover, and anchor as well as how these elements have made "portal" support in frameworks obsolete. The following ~20-minute video shows how to use only CSS and HTML to position and display elements, regardless of their nesting in the document. [media] The main example uses "CSS anchor positioning" , "position-try" and ":popover-open" . You can use position-try to direct the browser to adjust the appearance of the popover when there isn't enough "space" on-screen at its default position. You can also animate everything. [image] .user-button { padding: 0; border-radius: 100vw; aspect-ratio: 1; anchor-name: --profile-button; } .profile-menu { position: absolute; position-anchor: --profile-button; inset: auto; top: anchor(bottom); right: anchor(right); margin: 0; margin-block-start: 6px; } .profile-menu:popover-open { display: grid; } @position-try --bottom { inset: unset; top: anchor(bottom); right: anchor(right); }

Angular 19 sounds ... complicated?

2024-12-23T12:35:30+01:00

Published by marco on 23. Dec 2024 12:35:30 ------------------------------------------------------------------------ Some of the features described in the video below sound pretty interesting but they also sound super-complicated, with deferred-loading, hydration, signals, and event/replay all combined with decisions about whether something loads on the client or the server or whether it's initially built on the server but then enhanced on the client and then run independently from the server, ... I understand that a lot of this technology is for optimizing large web sites, but it's honestly questionable who can actually take advantage of it. That's always been the problem with trying to massively optimize your web applications as if you were Google. It's nice that this stuff is available; it's unfortunate that so much of it has to be opt-in because the cognitive load required to use it well is very, very high. It's the same experience I've had in React. When teaching it, you get into the weeds on performance and useMemo() and useCallback() and profiling very quickly, which makes it...not really a beginner's library. Take a look at the video, though. Maybe something piques your interest. Be warned, though, the presentation style is...interesting, to say the least. It's got a "definitely not robots" vibe to it. More below the video. [media] Man, I thought some of the recent MS .NET videos seemed less-than-professional but those were downright charming compared to whatever this thing from Google is. All of the people featured in this video are so stiff, so obviously touched-up and coifed, that it's throwing me off. * The guys hold their arms on-camera so that you can see that they go to the gym. * Everyone does that super-annoying "I've taken an oration course and have learned to artificially move my hands to emphasize my points" thing that people who are definitely not robots do. * The one (only?) 
lady's diction reveals to me that she definitely filmed her segment a dozen times. The "jokes" fall so flat that I can't even imagine which audience would think this is funny. This video feels like it belongs in a high school. Every line is scripted. There's no spontaneity and it's not good. I'm used to presenters who act much more naturally than this; even when people like Kevin Powell and Nick Chapsas are obviously performing, it doesn't feel like a "don't do drugs" PSA put on by parents.

Using calc-size in CSS

2024-12-23T12:24:31+01:00

Published by marco on 23. Dec 2024 12:24:31 ------------------------------------------------------------------------ The following video shows not only when and how to use calc-size(), it also mixes in advice on generating timing functions for animations, sprinkles CSS variables throughout, and even uses overflow: clip combined with an absolutely positioned element to reveal more content without disturbing the layout. [media] The syntax for calc-size() is, as Kevin says, "weird"; you have to pass two parameters: the first is the name of the logical size you'd like to use, while the second parameter is a formula that uses the placeholder size, which accepts the value of the first parameter. In a sense you are passing the argument, along with a lambda that accepts that argument. The following CSS sets the inline (horizontal in LTR and RTL) size of an element to be whatever the intrinsic size of the element would have been, given the size of its children, plus 3rem (where rem is the "font size of the root element" ). inline-size: calc-size(max-content, size + 3rem); This video accompanies "Kevin's article" for the "12 days of code 2024" , if you'd rather read about it, with more code examples.

How to apply EF migrations

2024-12-23T12:20:02+01:00

Published by marco on 23. Dec 2024 12:20:02 ------------------------------------------------------------------------ The picture and title are, as usual, clickbait-y, because apparently people don't click on videos that sound educational unless you promise them ground-breaking learnings. Still, I don't hate the player; I hate the game. But it's the world we have. The video is quite informative and is 90% not the guy pictured. Instead, it's another guy called Gui Ferrera, who is quite competent. [media] He starts by explaining how to deploy migrations in production -- you don't just run them, as you would in development. Why not? Because, if you run the migrations as part of your application startup, then your application implicitly has permissions to modify the database schema -- permissions that you are unlikely to revoke or downgrade. As an alternative, you can use the dotnet-ef tool to generate an ef migrations bundle (he names the file efbundle), which is an executable that you can then just run, using the pipeline secret that has administrator access to the target database. This executable runs separately and is only in charge of migrating the database to a particular version. Your application will run and fail if the schema is not correct, which is the desired behavior. If it is correct, your application will run with a database user with much lower permissions -- at the very least, it won't be able to issue DDL commands. The bundle option generates a binary; there is also a script option, which generates SQL. This is pretty neat and there's even a flag called idempotent, which allows you to generate a script that will ensure that previous migrations have been applied before continuing with subsequent migrations. The implementation isn't as obviously straightforward as it advertises itself. There must be limitations for custom-migration behavior that uses program logic. 
I know, because Quino [1] had a very similar feature and, although we could generate SQL for some user customizations to the migration process, there was no way to support everything. It's nice to see how solid the EF migrations story has gotten, even though I think the design still suffers when switching branches. You need much more developer discipline to keep your local database usable and in-sync. Anecdotally, I hear that most developers just trash their local database all the time, and rely much more on seeding functions to restore the state. You tend to lose your local ad-hoc-created data and it takes a bit more time, but it works with EF. It's actually not a bad alternative since it forces you to focus more on the seeding function, which will also benefit you while writing tests. For prototyping, though, there was nothing as fast as Quino or Atlas [2], both of which, instead of using a metadata table in the database, read the database schema, compared against the application model and applied custom migrations to address differences. Ferrera finished up with by-now standard advice for adding required columns (and other, similar types of breaking changes): you have to add the column as nullable with a default in the first migration, then get rid of the unwanted default value and nullability in a subsequent migration. You can only remove an unwanted field once the deployed application isn't using it. That is, you have to drop the column for the deployed version after the version that no longer needs the column. Otherwise, you run the risk of breaking the application that is still running against that database (especially if you have multiple clients/API servers running against the same database instance). -------------------------------------------------------------------------------- [1] Quino was a metadata-based ORM/database migrator written in C# that I worked on for years at Encodo Systems AG. 
[2] Atlas was a metadata-based ORM/database migrator written in Delphi Pascal that I worked on for years at Opus Software AG, before I helped found Encodo.

Are you doing the Advent of Code?

2024-12-08T21:32:28+01:00

Published by marco on 8. Dec 2024 21:32:28 ------------------------------------------------------------------------ No. No, I'm not. I was briefly considering it because two good programmer friends [1] of mine asked me, and it seemed like it might be kind of fun to compare our solutions. But ... 24 days, man. I've got other things to do. Like, a lot of other things to do. I am not in any way bored or looking for things to do. I'm not even lacking in programming projects that I could be working on. I'm teaching a JavaScript class right now, for which I'm constantly refining the examples and project code, trimming the sails for the particular people in the current class. I've got a whole web site to which I've been meaning to do a bunch of tweaks. I'm just now getting the time to get around to it; I certainly don't need to prioritize arbitrary programming tasks. I've got a ton of stuff to write that interests me more than writing code for a problem that looks for all the world like a job-interview assessment assignment. I just finished up a 48-page summary of my week's reading and writing in "Links and Notes for November 29th, 2024" . I've got a stack of book reviews a mile high that I've not given up on. Yet. Those kinds of things take time -- and they're important to me, much more so than the aforementioned programming exercises. So, no, I'm not doing Advent of Code. I'd like to compare solutions with my friends, but not enough to reshuffle all of my priorities. I also wonder who came up with the Advent of Code. All of the time that people spend every day working on these arbitrary examples is good training, I guess. But training for what? Job interviews? It's all time that they're not spending on other creative projects or on open-source projects. I'm not trying to knock it at all -- becoming a good programmer means just putting in the time. But the question is: what are you putting your time into?
When I was the same age as one of my (much younger) friends, I poured all of my time into building a web site -- that I'm still using every single day a quarter of a century later. The featured sponsor for December 8th is Optiver, which says it codes "sub-nanosecond trading systems", whatever the hell that means. So, they're trying to find people to make fast trading systems. Just. Stop. We need brilliant coders to be building software useful to society, not to billionaires. Those things are pretty much diametrically opposed. Let's take a closer look at the list of sponsors. [image]

* Jane Street (trading)
* A bunch of online training-course providers...
* Accenture (formerly Andersen Consulting before they had to rename themselves after the Enron debacle, but don't worry, they're far, far larger than ever, having failed upward as expected).
* JPMorgan Chase ('nuff said)
* Lightricks (AI company...🙄)
* Ahrefs ("indexing" company, but almost certainly a funnel for AI content)
* Shopify (they own the world of online storefronts, having homogenized that part of the world for us; thanks)
* Jump Trading
* American Express
* Bank of America
* ING Bank
* ...and a bunch of other, smaller things

It doesn't look great, if I'm being honest. Anyway, I'm doing a meta advent of code, in which I just work on as much stuff as I can, but of my own choosing and on my own schedule.

-------------------------------------------------------------------------------- [1] One of whom is painstakingly and entertainingly detailing his brilliant solutions in Rust on "his blog" .

Real quick on MVVM

2024-12-07T22:50:43+01:00

Published by marco on 7. Dec 2024 22:50:43 ------------------------------------------------------------------------ A little while back, someone wrote "I can't wrap my head around MVVM" , asking for help. I answered with a short example, reproduced below.

"tl;dr: Use the "MVVM Toolkit" and try JetBrains ReSharper or Rider for more IDE assistance for binding and fixing up views. [1]"

The concept is that:

* the (M)odel describes your data in the shape you want to store it, process it, etc.
* a (V)iew describes the elements of the UI.
* a (V)iew(M)odel mediates between these two "shapes".

[image]

Why do we need this? Why not just bind the view directly to the model? Consider a simple person:

record Person(
    string FirstName,
    string LastName,
    Company Company,
    DateTime BirthDate);

The view model might want to expose:

int Age => DateTime.Now.Year - _model.BirthDate.Year;
string FullName => $"{_model.FirstName} {_model.LastName}";
Company Company { get; }
IReadOnlyList<Company> AvailableCompanies { get; }

The AvailableCompanies property is for the drop-down menu. So that's why there are two models. We don't want to pollute the data model with view-specific properties. Each view gets its own view model and you can have multiple views/viewModels on the same model. Nice.

The *mechanics* of binding the view to an object have nothing to do with MVVM. It's *binding*, which is done by magic. This magic is made a lot easier if you use the "MVVM Toolkit" . The latest versions use source generators so you can actually *see* the magic binding code (in separate source-generated files). I would also try JetBrains ReSharper or Rider because either of those tools provides a lot more code-completion, hints, warnings, and fixup assistance than a bare Visual Studio does.
-------------------------------------------------------------------------------- [1] I just saw in a .NET 9 release video that "x:bind" is a better alternative to the classic binding, which has better compiler and code-completion support (because it generates sources rather than using reflection). I haven't had a chance to investigate yet, though.

Writing elegant code

2024-12-04T22:22:42+01:00

Published by marco on 4. Dec 2024 22:22:42 ------------------------------------------------------------------------ I watched this video analyzing a chunk of code, in the hopes of refactoring it. [media] The original code is the laughably overblown example below.

public List<int> ProcessData(List<int> data)
{
    if (data != null)
    {
        if (data.Count > 0)
        {
            var processedData = new List<int>();
            foreach (var d in data)
            {
                processedData.Add(d * 2);
            }
            return processedData;
        }
        else
        {
            return new List<int>();
        }
    }
    else
    {
        return null;
    }
}

Nick rewrote it as the following:

List<int> ProcessData(List<int>? data)
{
    if (data is not { Count: > 0 })
    {
        return [];
    }

    return data.Select(d => d * 2).ToList();
}

Nick's is OK, but I don't understand why he bothers to check for Count > 0 when Select() already short-circuits on this case. @DmitryKandiner rewrote it as the following:

List<int> ProcessData(List<int>? data) => data?.Select(d => d * 2).ToList() ?? [];

This is really short and avoids the unnecessary length-check but it still deals with nullable code, which is silly. There is no need for this function to handle possibly null input data. I commented the following: We can also drop the null-check if we have nullability enabled (which any modern project should). Also, I prefer defining APIs with enumerables rather than lists, but if the design insists, I would do it with two methods. This gives callers the option of building lists but doesn't require them to do so.

List<int> ProcessList(List<int> data)
{
    return ProcessSequence(data).ToList();
}

IEnumerable<int> ProcessSequence(IEnumerable<int> data)
{
    return data.Select(d => d * 2);
}

To which @swozzares replied that I could eliminate the return by using => (called an "expression body").
So I updated the sample with:

List<int> ProcessList(List<int> data) => ProcessSequence(data).ToList();

IEnumerable<int> ProcessSequence(IEnumerable<int> data) => data.Select(d => d * 2);

And I might as well include the test:

[Test]
public void TestProcessSequence()
{
    List<int> input = [1, 2, 3, 4, 5];
    List<int> expected = [2, 4, 6, 8, 10];

    Assert.That(ProcessList(input), Is.EqualTo(expected));
    Assert.That(ProcessSequence(input), Is.EqualTo(expected));
}

Someone named @Me_myself_and_I took offense to my comment that projects should be using nullability.

"Except that there are a lot of projects that pre-date that. Its good for new projects sure, but probably not worth the effort to refactor and re-test existing large projects. Devs really need to learn to think about existing systems and long-term maintainability not just new code."

I think we should be clear about the context. The commenter's argument is the same one I would make when I work on a legacy project. I think we should take care to show that, if you started a project in C# in the last five years, you shouldn't be checking for null everywhere anymore. If you turned off nullable checks, then, yeah, you'll still have to check for it. But you should be designing null-free APIs -- where null is not allowed by default.

You're probably doing code reviews incorrectly

2024-11-11T12:12:33+01:00

Published by marco on 11. Nov 2024 12:12:33 ------------------------------------------------------------------------ The article "Processes and rules make code review less intimidating" by Stefan Judis writes, "[image]Code reviews are, by nature, intimidating. Sometimes even brutal. If you've been in the game for long enough, you probably experienced the following: you worked hard on a feature, you're proud of yourself and open the PR to be praised and land your changes, and then... it rains comments, suggestions and nitpicks. And if it's really bad, you're forced to take multiple feedback and clean-up rounds. It sucks." Oh, wow. Why are people treating writing code like individual school assignments? That is absolutely not how to effectively use code reviews. That is absolutely not how to work in a team. Teams work together, not against each other. Why are you putting your heart and soul into your solution? Why is it "your" solution? Is this how y'all were raised? You know, where you prefer being the star genius in your own story, where you managed to get the perfect solution on the first try? Like, you're the superhero, brilliant, engineer, billionaire playboy? And then, you learn that you aren't, and you're shattered. But you know what? You're on a team that's willing to look at what you made and really try to make it better. Maybe they will make it better! Or maybe they won't. Both are good! If they do ... then it's better. That's a win. If they don't, well, then, you've gotten some evidence that supports your theory that what you've written really is good and will work. Up until someone looks at it, you only had a hypothesis that your solution was good enough. You suspected it because you had some code and you had some tests (yes you did, otherwise you have no right to be offended about code-review comments). That's a hypothesis. You know who else does reviews to verify theories and hypotheses? Scientists. Quit your whining. Quit your bullshit. 
If you're treating code reviews like a gladiator arena, as if you were going on Shark Tank or The Voice, then you're doing it wrong. The best software is written by a team. It is collaborative. Maybe one person is writing all of the actual text, but there are other minds that contribute advice and feedback that hones the final product. You know what that sounds like? A writer and one or more editors or proofreaders. This is how professionals work. Fix your process. Fix your expectations. Fix your fragile ego. Seek validation in a less self-destructive manner. The author proposes a fix. It's a technical solution, so it's not great. He could have suggested that people do "live reviews" instead of PRs because most people are too lazy or incapable to write critical comments that are also constructive. This is sadly often the case because learning how to write and how to empathize is a lot of work. You could start with empathizing, about which more below. Instead, he writes about an insipid system where shittily aggressive review comments like "this is not worded correctly" are somehow made better by prepending the text "suggestion:". No. It does not make it better. Why not? Well, for starters, because the text is not formulated as a suggestion. There is only an implicit suggestion that the reviewer would have worded it correctly. This is passive-aggressive time-wasting behavior. On top of that, everything in a code review is a suggestion unless the power dynamic in your team is so severely skewed that we need to be having a different conversation. The comment should read something like, "I think that something like "... ..." would be a clearer way of writing that." Or, maybe, you could establish a rapport with the people reviewing your code so you're not pants-shittingly terrified that you're going to lose mana when you do one. Maybe you could -- gasp! -- even be friends. 
This would help establish a human connection often summarized as empathy wherein the reviewer would consider for a brief moment how a comment might look to the other side and adjust it accordingly, in a manner that is totally not like how robots would do it. If you've established that code reviews are collaboration and not a gladiator arena where "two enter and one leaves", then the reviewer can be more concise without wasting a lot of time writing curlicue sentences. If you don't have this rapport, then, yes, I'm afraid you're going to have to be...what's the word?...polite. If you can't be polite, then, at the very least, you should write review comments that don't need review comments of their own. You're going to have to follow the rules of error messages. As detailed in "Alerts" , "Avoid using an alert merely to provide information. People don’t appreciate an interruption from an alert that’s informative, but not actionable." Any review comment has to be both informative and actionable. The comment in question -- "this is not worded correctly" -- is neither. It just vaguely points at something and says "you suck." It's clearly attached to a specific line but doesn't indicate what's wrong with it. Even if it said specifically what's wrong, it doesn't suggest how to fix it. An error message from a piece of unthinking software can't go quite that far -- unless it's a spellchecker or grammar-checker! -- but an actual, empathetic human in the role of a collaborator can! That person could formulate a suggestion for how to rewrite it so that the review for that line might end after only one cycle. And, as a bonus, it doesn't end with anyone crying and curled into the fetal position under their desk. Even if you have an informative and actionable comment, we still come to another downside: they're still not very efficient. 
The most efficient way of reviewing code is to do it synchronously or "live", where the collaborators can discuss and improve the code on the fly, together, maybe -- and hear me out here -- even before you've pushed anything! Imagine! If you're stuck using PRs and web UIs to communicate, then writing comments like the one in question just wastes everyone's time. The submitter either will assume what the commenter meant and try again -- NOPE, STILL WRONG -- or they will have to write "what do you mean?" or "how would you have written it?" This is useless churn. Just write your reformulation with your comment. Remember, you're a collaborator. You're not just trying to get through this review as quickly as possible. It's part of your job.

C# vs. TypeScript type-narrowing and coercion

2024-11-10T22:33:55+01:00

Published by marco on 10. Nov 2024 22:33:55 ------------------------------------------------------------------------ I was working with a colleague to get the properties that have a particular attribute. The original formulation returned the properties then got the attributes again, plucking the first one off of the list and asserting that it exists to convince the compiler that everything's OK. We know it exists because otherwise we wouldn't have returned the property -- but the computer doesn't know that. Ok, it works but it's not efficient or elegant. Is there some way to build this so we allocate minimally and don't have to use the null-forgiving ("dammit") operator? I proposed the following formulation. The null-forgiving operator bugs me a bit because I feel like TypeScript would have determined that attribute could no longer be null. C#/Roslyn doesn't do that.

private static IEnumerable<(PropertyInfo PropertyInfo, TAttribute Attribute)> GetPropertiesAndAttributes<TAttribute>(Type type)
    where TAttribute : Attribute
{
    return
        from prop in type.GetProperties()
        let attribute = prop.GetCustomAttributes(typeof(TAttribute), false).FirstOrDefault() as TAttribute
        where attribute != null
        select (prop, attribute!);
}

My collaborator prefers the non-query syntax for Linq, so he rewrote it as follows.

private static IEnumerable<(PropertyInfo PropertyInfo, TAttribute Attribute)> GetPropertiesAndAttributes<TAttribute>(Type type)
    where TAttribute : Attribute
{
    return type
        .GetProperties()
        .Where(prop => prop.GetCustomAttributes(typeof(TAttribute), false).Length != 0)
        .Select(propInfo => (propInfo, propInfo.GetCustomAttribute<TAttribute>()!));
}

I really don't like that it calls both GetCustomAttributes() and GetCustomAttribute(), so I looked into how to emulate let with chained-method syntax. I found "Code equivalent to the 'let' keyword in chained LINQ extension method calls" and rewrote the code as follows.
private static IEnumerable<(PropertyInfo PropertyInfo, TAttribute Attribute)> GetPropertiesAndAttributes<TAttribute>(Type type)
    where TAttribute : Attribute
{
    return type
        .GetProperties()
        .Select(propInfo => (propInfo, attribute: propInfo.GetCustomAttributes(typeof(TAttribute), false).FirstOrDefault() as TAttribute))
        .Where(t => t.attribute != null)
        .Select(t => (t.propInfo, t.attribute!));
}

I still can't get rid of the second Select() because the type of the first Select() is (PropertyInfo PropertyInfo, TAttribute? Attribute) rather than (PropertyInfo PropertyInfo, TAttribute Attribute). As in the other formulations, we still need the null-forgiving operator to coerce the type. In the final formulation, it's much clearer that this is only required for the compiler because the check that attribute is not null is made on the immediately preceding line.

I was curious about TypeScript, though. C# only supports narrowing conversions for inbuilt primitives; TypeScript is fancier. I used "TypeScript Playground" for the examples below. Here's where we stand with C#, rewritten in TypeScript: [image] At this point, TypeScript is making the same complaint as the C# compiler would. [image] However, if you remove null values from the result, TypeScript recognizes that and automatically narrows the type to number[] from (number | null)[]. [image] I just realized that, while this example is interesting, I hadn't replicated the example from C# because I wasn't using tuples. So, let's try again and see how far TypeScript gets. [image] We already have a problem because TypeScript represents tuples with square brackets, which means that this could be an array of two-element tuples or a two-dimensional array. TypeScript defaults to the latter. We fix that with an explicit type. [image] If I assign null to b in one element, we have a problem, as expected. [image] We fix that by adjusting the type of the local variable to be [a: number, b: number | null]. [image] Now, we have the type-conversion error on the result.
[image] We fix that the same way as we do in C#, with a Select, which is called map in TypeScript/JavaScript. [image] Since TypeScript doesn't allow you to directly address tuple elements like C# does, we have to "destructure" the elements with const [a, b] = x. So, we have the same thing as in C#, where we allocate "new" tuples for the result. What we don't have is a ! at the end because TypeScript recognizes the type-narrowing. Clever compiler. Of course, TypeScript can do this because it's just transpiling to JavaScript, which plays very fast and loose with types anyway. In C#, the compiler has to make decisions about the shape of the memory it uses, so a Nullable<int> is going to have a different representation than an int. If you want to go from the former to the latter, then you have to define a conversion operator, either an explicit one or an implicit one. Or, as we did, you have to create a new tuple, which entails an allocation. Unfortunate, but unavoidable. In JavaScript, that's not the case, at least until it's run through a JIT, in which case other heuristics would have indicated how to most efficiently shape the storage for that particular instance. If that sentence doesn't make sense and you have a follow-up question, then me-from-ten-years-ago is here to offer more in the article "Optimizing compilation and execution for dynamic languages" , which summarizes a much-longer document written about the WebKit JavaScript engine. My colleague, who is always game to play with language features, had the good idea to try it with filter(). [image] This confirmed for us that the TypeScript checker doesn't actually know anything about the effects of the methods filter and map. It's just that the narrowing happens in the closure passed to the map version, so it can determine that there is no way that b will ever be null when the closure exits. Since it doesn't know the semantics of filter, it doesn't know that it actually does that too.
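Since the screenshots don't survive in text form, here is a minimal sketch of the two approaches discussed above. The names MaybePair, Pair, keepComplete, and isComplete are hypothetical (they are not taken from the playground examples), and a plain loop stands in for the closure: the null check narrows b inside the block, so no non-null assertion is needed. For filter(), one way to get the same result type is a user-defined type guard, which states the narrowing explicitly instead of relying on the checker to work it out.

```typescript
// Hypothetical types: a pair whose second element may be null, and the
// narrowed pair we want to end up with.
type MaybePair = [a: number, b: number | null];
type Pair = [a: number, b: number];

const input: MaybePair[] = [[1, 10], [2, null], [3, 30]];

// Control-flow narrowing: inside the `if`, `b` has type `number`, so
// `[a, b]` type-checks as a Pair without a `!` assertion.
function keepComplete(pairs: MaybePair[]): Pair[] {
  const result: Pair[] = [];
  for (const [a, b] of pairs) {
    if (b !== null) {
      result.push([a, b]);
    }
  }
  return result;
}

// User-defined type guard: the `pair is Pair` return type tells the
// checker what a `true` result means, so filter() can return Pair[].
function isComplete(pair: MaybePair): pair is Pair {
  return pair[1] !== null;
}

const viaLoop: Pair[] = keepComplete(input);
const viaFilter: Pair[] = input.filter(isComplete);
```

Both variants still allocate a new array, and the loop allocates new tuples as well, just as the C# version does; the difference is only in how much the checker can prove on its own.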

CSS Typography is really good now!

2024-11-10T22:18:20+01:00

Published by marco on 10. Nov 2024 22:18:20 ------------------------------------------------------------------------ I not unexpectedly very much enjoyed this somewhat-rambling 59-minute course on controlling font features from CSS. It's really quite amazing and wonderful what you can do declaratively these days. [media] To begin, Roel Nieskens takes a long look at variable fonts, which can be manipulated via both standard CSS properties, like font-weight, as well as using "font-variation-settings" , all of which can be animated. Variable fonts support a much more granular range of values for font-weight than traditional fonts, all without downloading anything other than the initial font file. You can use the "Wakamaifondue" site to determine which features a specific font has, as well as to play with the values along these axes. The "standard axes" are mapped to CSS properties, like font-stretch, font-style: oblique + angle, font-style: italic, and font-optical-sizing. All of this can also be animated, with the font being able to influence the animation as well. The demonstrations are quite impressive, especially since it's all manipulated using a CSS feature that is "widely available across major browsers" . [image]The next section covers colored fonts, which can contain multiple palettes, each with multiple colors. You can use CSS to override the colors but not directly with color. Instead, you define "@font-palette-values" to choose a different palette or to override individual colors in a palette. All of this can rely on variables, be animated, and so on, with optimized updates as provided for everything else in a modern web browser. Next up, he showed how to set font-variant-numeric to tabular-nums to make the font render numbers so that they line up vertically for tabular display. The font has to support this feature but nearly all of them do. This is a good default for table cells. 
He also shows font-variant-caps and how to set font-variant-numeric to diagonal-fractions, as well as controlling an OpenType feature called "scientific inferiors", which will subscript numbers, as in chemical formulae, by setting font-feature-settings to "sinf". Finally, he talks about standard units like cap (the "nominal height of capital letters", according to "CSS Length Units" ). He shows how to do a "true" vertical-align: middle by setting margin-top to calc(1ex - 1cap), which centers without lending so much weight to the descender or ascender. "Typography is full of details that nobody notices until they're broken or they've gone away." I, for one, welcome the high-powered typography features that will let web pages finally look as good as printed output, like magazines and newspapers, where many of these techniques have been used for decades, if not centuries.

Delimiting multiple CSS classes

2024-11-10T11:04:34+01:00

Published by marco on 10. Nov 2024 11:04:34 ------------------------------------------------------------------------ [image]While investigating "Charts.css" , I learned that you can throw unrecognized special characters like square brackets or pipes into CSS class references and it's just fine. So you can use them to separate longer lists of classes. For more information, see "Cube CSS: grouping" by Andy Bell . So, you can write:

...

and it works just fine, while being more legible. Charts.css uses it to group related classes: ...

More Stephen Toub: Array Pools

2024-11-10T08:36:04+01:00

Published by marco on 10. Nov 2024 08:36:04 ------------------------------------------------------------------------ In this otherwise excellent video, I found myself very much wishing that Toub had written at least a single test for the ArrayPool implementation that he built. Still, check out the selected citation below to get a feeling for how they consider performance implications -- there are no easy answers, there is only testing and benchmarking. [media] At 34:45,

"Hanselman: For folks that may not know what NUMA is: so NUMA is this non-uniform memory access that the computer knows that, like, this CPU is near this memory and...that memory over there, we're going to consider that remote memory. And it's all meant to reduce latency. Is that right?

"Toub: Yeah and this definitely factors into things like the GC, right? And even with threadpool scheduling: you want to put the work where the data is.

"Hanselman: Put the work where the data is. Yes.

"Toub: Otherwise, you spend all your time moving stuff around and thrashing your cache.

"Hanselman: ...and moving things around at different layers of abstraction. Because you would not want to move between NUMA nodes. You don't want CPU zero to be looking at memory one over there. But then there's the higher-level question of 'is the adding of an array pool to my application going to cause memory fragmentation or do I just trust the GC to handle that?' It's a constant series of trade-offs. Like, did the complexity I added give me the performance that I wanted or did I just make things more complicated.

"Toub: Well, it's a great example of where you know I mentioned at the beginning there's a lot of complexity with pools. And this is a great example of it. If you're running on a core over here and you use an object that was last used over here, right? Are you better off using that object? Or are you better off just asking the GC to give you a new one?
That's going to be allocated in memory that's very closely associated with where you are. So these pools aren't always wins."

CSS Magician Roman Komarov plays with sibling-count and sibling-index

2024-11-09T17:32:09+01:00

Published by marco on 9. Nov 2024 17:32:09 ------------------------------------------------------------------------ The article "Possible Future CSS: Tree-Counting Functions and Random Values" by Roman Komarov is another mathematical master class in using CSS variables and calculations to get at values like "sibling count" and "sibling index", two values that are in a future proposal for "CSS Values and Units Module Level 5" . The final demo looks like this, with randomly laid out items squared up into equal columns and rows where possible, all done with only CSS. [image] Here's a taste of the code for getting a random value in CSS:

.random-example {
  & li {
    --random-part-from-sibling:
      pow(var(--sibling-index), 3)
      - pow(var(--sibling-index), 2)
      + var(--sibling-index);
    --random-part-from-count: var(--children-count);
    --random-limit: var(--closest-prime);
    --random-value: calc(
      mod(
        var(--random-part-from-sibling)
        * var(--random-part-from-count)
        * var(--seed, 0)
        ,
        var(--random-limit)
      )
      / var(--random-limit)
    );
  }
}

You should really see the "original article" for the interactive demos. As always, it's stunning how quickly and efficiently the browser's CSS and layout engine updates values, invalidating only the parts that are affected, even with deeply nested calculations. I went through the article in Opera Beta on an M1 MacBook Pro (from 2020), with a relatively new version of Chromium and it was smooth as silk, with no CPU spikes and no sluggishness (as Komarov indicated might happen in Safari). He first defines the sibling-count and sibling-index functions, then builds randomness on top of those. He uses this toolkit to build grids that know how many items they have so that he can keep the grid "a square" with random transforms and coloring. He even "stacks them" , with random overlapping and z-order control. Finally, he links some amazing CSS demos where people built things that could use this functionality in CSS (but have had to make do with JS for now).
See "Ana Tudor's many examples" or "Una Kravets's radial menu" , or "Amit Sheen's demos" .
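To see what the calc() in the quoted CSS actually computes, here is the same arithmetic as a small TypeScript function. This is only a sketch for checking values by hand: pseudoRandom and its parameter names are made up for illustration, and it assumes non-negative integer inputs, for which JavaScript's % operator agrees with the CSS mod() function.

```typescript
// Mirrors the custom properties from the quoted CSS:
//   siblingIndex  -> --sibling-index
//   childrenCount -> --children-count
//   seed          -> --seed
//   prime         -> --closest-prime
function pseudoRandom(
  siblingIndex: number,
  childrenCount: number,
  seed: number,
  prime: number
): number {
  // --random-part-from-sibling: i^3 - i^2 + i
  const partFromSibling =
    Math.pow(siblingIndex, 3) - Math.pow(siblingIndex, 2) + siblingIndex;

  // mod(sibling part * count * seed, prime) / prime
  const product = partFromSibling * childrenCount * seed;
  return (product % prime) / prime;
}

// Third of five siblings, seed 7, prime 11:
// 27 - 9 + 3 = 21; 21 * 5 * 7 = 735; 735 mod 11 = 9; 9 / 11 ≈ 0.818
const sample = pseudoRandom(3, 5, 7, 11);
```

Dividing by the prime keeps the result in the range from 0 (inclusive) to 1 (exclusive). With the fallback seed of 0 from var(--seed, 0), every item gets the value 0, so a non-zero seed is needed for any variation.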

IAsyncEnumerable for and by dummies

2024-11-09T17:25:46+01:00

Published by marco on 9. Nov 2024 17:25:46 ------------------------------------------------------------------------ This isn't a terrible video on IAsyncEnumerable but it's also not nearly as high-level and fast-paced as I've come to expect from the .NET Deep Dive series, which is no-muss/no-fuss with Stephen Toub. Those are much better than this one but, if you're not grokking what IAsyncEnumerable is good for from the documentation or examples, maybe this one-hour video will help. If you're lucky, it will make you feel better about your own skills as a programmer. [media] Maybe I'm just super-smart but I can't understand why so many of Microsoft's .NET videos spend time discussing the _ separators in numbers. Hanselman always points it out whenever Toub uses them in a longer constant like, e.g., 10_000_000. Whereas it seems blindingly f"&king obvious what they're for, Cam Soper in this video just had to ask about them, presumably because, even though he almost certainly knows what they are, he thinks that the audience for a video about IAsyncEnumerables would also be unable to intuit what those symbols might be. [image]So, they get three people involved in a discussion about thousands separators. It's a waste of time. OMG, I started writing this rant at what I thought was the end of the "basic C# syntax" discussion but I was wrong. They continued for thirty more seconds, with a fourth person chiming in. "C# 7; I just verified." Congrats, guys! We solved it! The next thing we got to do was watch the async guru use Task Manager to do memory profiling. Have these guys never heard of BenchmarkDotNet? Or are they just trying to make other developers feel better about themselves? I am fascinated that they don't explain the mechanism behind the IAsyncEnumerable at all. Chase just talks about it as if it were magic rather than an enumerable that returns a sequence of Tasks.
The magic is in the enumerable part, which allows an algorithm to avoid creating all of the data in memory at once. The example at 57:00 with "System.IO.Pipes" , System.Text.Json, and IAsyncEnumerable was quite nice, though. It shows the power of the piping abstraction (which lies below streams). It's a pity they didn't get into that more but maybe a follow-up video by Stephen Toub is in the offing.
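To make the "mechanism" point concrete, here is a minimal sketch (not taken from the video) of an async iterator and of roughly what await foreach expands to. The names AsyncSequenceDemo, CountAsync, SumAsync, and SumManuallyAsync are made up for illustration, and Task.Yield() stands in for real asynchronous work.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncSequenceDemo
{
    // Hypothetical producer: each value is computed only when the consumer asks for it.
    public static async IAsyncEnumerable<int> CountAsync(int count)
    {
        for (var i = 0; i < count; i++)
        {
            await Task.Yield(); // Stand-in for real asynchronous work (I/O, etc.)
            yield return i;
        }
    }

    // The convenient form.
    public static async Task<int> SumAsync(IAsyncEnumerable<int> source)
    {
        var sum = 0;
        await foreach (var item in source)
        {
            sum += item;
        }

        return sum;
    }

    // Roughly what "await foreach" expands to: one awaited MoveNextAsync() per element.
    public static async Task<int> SumManuallyAsync(IAsyncEnumerable<int> source)
    {
        var sum = 0;
        var enumerator = source.GetAsyncEnumerator();
        try
        {
            while (await enumerator.MoveNextAsync())
            {
                sum += enumerator.Current;
            }
        }
        finally
        {
            await enumerator.DisposeAsync();
        }

        return sum;
    }
}
```

Both SumAsync and SumManuallyAsync only ever hold one element at a time, which is the memory benefit described above.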

A quick look at .NET Aspire

2024-11-09T12:33:37+01:00

Published by marco on 9. Nov 2024 12:33:37 ------------------------------------------------------------------------

".NET Aspire" is a newly introduced tool for building distributed solutions that run just as easily locally as they do in the cloud. This video explains how this is a boon for integration testing.

[media]

The concept is very nice and seems to greatly simplify building integration tests. Kudos and thanks for the introduction.

[image]

Still, my hair was standing on end with some of the "fast and loose" programming in this video. I know that people will argue that you have to take a direct path to get it working quickly, but I feel that this degrades programming practice, especially when it comes from an "official" source like Microsoft. There are things like:

* Defining the service-initialization code in the tests, then explaining that it's to ensure that it's the "same as that used by the server". You know how else to do that? Use common initialization code in static helper methods (or whatever).
* Copy/pasting the service-initialization code from test to test
* Copy/pasting the HttpClient code
* Copy/pasting the record definition, as if that won't ever bite you in the butt.
* Manually adding "usings" (Can't you just get the IDE to do that?)

On top of that, the integration tests could have been a lot simpler than they looked if he'd first explained how to set up some common code. Or, perhaps even better, if he'd taken a couple of minutes afterwards to show how to refactor the common code to helper methods (one of which could even be used in the main application so that the app setup is shared with the tests). If he'd used a few more IDE features to speed up coding, he might even have gotten it all in in the same amount of time.

Don't return await unless you have to

2024-11-08T23:08:10+01:00

Published by marco on 8. Nov 2024 23:08:10 ------------------------------------------------------------------------

I finally got around to verifying that defining dependent async methods like the following one is wasteful.

    public async Task<bool> N()
    {
        return await M();
    }

[image]

A less-contrived example looks like this:

    using System.Threading.Tasks;

    public class C
    {
        public Task<bool> M()
        {
            return Task.FromResult(false);
        }

        public async Task<bool> N()
        {
            return await M();
        }

        public async void RunIt()
        {
            var result = await N();
        }
    }

This yields something like the following lowered C# code in "SharpLab.IO". Note that there are two state machines.

    public Task<bool> M()
    {
        return Task.FromResult(false);
    }

    [AsyncStateMachine(typeof(<N>d__1))]
    [DebuggerStepThrough]
    public Task<bool> N()
    {
        <N>d__1 stateMachine = new <N>d__1();
        stateMachine.<>t__builder = AsyncTaskMethodBuilder<bool>.Create();
        stateMachine.<>4__this = this;
        stateMachine.<>1__state = -1;
        stateMachine.<>t__builder.Start(ref stateMachine);
        return stateMachine.<>t__builder.Task;
    }

    [AsyncStateMachine(typeof(<RunIt>d__2))]
    [DebuggerStepThrough]
    public void RunIt()
    {
        <RunIt>d__2 stateMachine = new <RunIt>d__2();
        stateMachine.<>t__builder = AsyncVoidMethodBuilder.Create();
        stateMachine.<>4__this = this;
        stateMachine.<>1__state = -1;
        stateMachine.<>t__builder.Start(ref stateMachine);
    }

If you write the equivalent code without the await in the N method:

    using System.Threading.Tasks;

    public class C
    {
        public Task<bool> M()
        {
            return Task.FromResult(false);
        }

        public Task<bool> N()
        {
            return M();
        }

        public async void RunIt()
        {
            var result = await N();
        }
    }

...then you get the following lowered C#. Note that now there is only one state machine.

    public Task<bool> M()
    {
        return Task.FromResult(false);
    }

    public Task<bool> N()
    {
        return M();
    }

    [AsyncStateMachine(typeof(<RunIt>d__2))]
    [DebuggerStepThrough]
    public void RunIt()
    {
        <RunIt>d__2 stateMachine = new <RunIt>d__2();
        stateMachine.<>t__builder = AsyncVoidMethodBuilder.Create();
        stateMachine.<>4__this = this;
        stateMachine.<>1__state = -1;
        stateMachine.<>t__builder.Start(ref stateMachine);
    }
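The post doesn't spell out the "unless you have to" in the title. One commonly cited case is a method that owns a disposable resource (or a try/catch): dropping the await there changes behavior, not just the generated code. The following is a minimal sketch of that case; Resource, AwaitExamples, and their members are made-up names, and Task.Delay(10) stands in for real work.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical resource that can report whether it was disposed while still in use.
public sealed class Resource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose() => IsDisposed = true;

    public async Task<bool> WasDisposedDuringUseAsync()
    {
        await Task.Delay(10); // Stand-in for real asynchronous work
        return IsDisposed;
    }
}

public static class AwaitExamples
{
    // Without await: the method returns (and disposes the resource) while the task is still running.
    public static Task<bool> Elided()
    {
        using var resource = new Resource();
        return resource.WasDisposedDuringUseAsync();
    }

    // With await: the resource is disposed only after the awaited task has completed.
    public static async Task<bool> Awaited()
    {
        using var resource = new Resource();
        return await resource.WasDisposedDuringUseAsync();
    }
}
```

Awaiting Elided() reports that the resource was disposed mid-operation, while Awaited() reports that it was not. In the pass-through case from the post, where nothing wraps the call, returning the task directly remains the cheaper choice.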

Manim: a bespoke animation editor and engine

2024-11-08T22:58:15+01:00

Published by marco on 8. Nov 2024 22:58:15 ------------------------------------------------------------------------

[image]

This is a fun video that demonstrates an API, runtime, and IDE called "Manim" that lets you interactively build 3-D animations. It's like a game-engine editor [1] in which you build your scenes by calling APIs in Python. There's an interactive Python terminal, a rendering area, and a text editor. It's quite nicely done and he's put it to good use over the years, building hundreds, if not thousands, of videos with it.

[media]

The API is quite high-level and robust but it's so clear how limited the Python syntax is. He's very quick with it, but he also knows the whole API by heart. He barely ever used code-completion, so I thought there wasn't any. But then I saw him hover over a few APIs to show the expected parameters. I wonder how much time a novice would spend with interpreter errors. Still, once you've gotten used to it, it seems to be pretty efficient. Python's interpreter speed will never be a problem.

In particular, the API for integrating formulae via embedded TeX is pretty neat. It even supports identifying manipulable elements from the rendered version for further animation.

His style of mixing functions and code and variable definitions makes sense for the types of projects he makes. There is going to be very little re-use between these projects. Anything that needs to be reused would eventually be added to Manim itself. He doesn't seem to see the need for shared libraries. The code is basically throwaway. It takes more time to define common, well-generalized functions than it would to just quickly rewrite it, ready for specialization within that project.

As when watching Kevin Powell ask an LLM about CSS without any idea about how LLMs work, it's kind of exasperating watching Grant Sanderson discuss workarounds for "bugs" without any decent background in languages, scopes, functions, and closures. It's exasperating to watch people "reinvent" computer science without even thinking that there might be prior art -- especially clever people.

Good for Grant for making this video, though, because I think he realized that there's a lot of room to grow in programming skills. He's already noticed that LLMs aren't going to help him code in Manim -- because he's an expert and the LLM is definitely not; there is no way it will be able to help him.

--------------------------------------------------------------------------------

[1] It kind of reminds me of the good old days when I was working with the "Quake III level editor". The API and tools are very bespoke and very powerful.

C# 13 improvements

2024-09-15T23:18:39+02:00

Published by marco on 15. Sep 2024 23:18:39 ------------------------------------------------------------------------

The final document of "What's new in C# 13" is available. There are no major changes for most end users; the changes listed are of interest to library and framework developers -- especially those interested in writing highly performant code, e.g., Microsoft in its BCL and ASP.NET.

1. Completely unsurprisingly, the params keyword now also applies to IEnumerable<T> (as well as many descendants) as well as Span<T> and ReadOnlySpan<T>.
2. There's now an official Lock object that, when used instead of the standard object, can lead to more efficient locking code. The .NET runtime and BCL have already started using this type everywhere.
3. I am not kidding when I say that the third "feature" in the list is that \e is now an accepted escape sequence in all strings. It represents ESCAPE. Um, ok.
4. The natural-type calculation for method groups has been improved to more closely follow that for overload resolution, allowing Roslyn to better determine a unique type and to extend where var can be safely used.
5. You can now use the "from the end" index operator, ^, in object and collection initializer expressions. Again, this is an improvement that seems like it makes it easier to write input arrays for numeric (data analysis) or tokenizing (LLM) operations.
6. "In C# 13, async methods can declare ref local variables, or local variables of a ref struct type. However, those variables can't be accessed across an await boundary. Neither can they be accessed across a yield return boundary. "[...] You can safely use types like System.ReadOnlySpan<T> [...] "In the same fashion, C# 13 allows unsafe contexts in iterator methods. However, all yield return and yield break statements must be in safe contexts."
7. ref structs can now implement interfaces but it's hard to see the utility because "[t]o ensure ref safety rules, a ref struct type can't be converted to an interface type." Um, ok?
8. The language now supports the allows ref struct (anti-)generic constraint. "This enables types such as System.Span<T> and System.ReadOnlySpan<T> to be used with generic algorithms [...]" The compiler ensures that the generic code complies with all the rules implied by allowing this more specialized construct.
9. Finally, "you can declare partial properties and partial indexers". Expanding support for partial enables more flexibility in code produced by source generators.
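A few of the items above (1, 2, 3, and 5) are easier to picture in code. This is a small sketch that assumes a C# 13 compiler targeting .NET 9; the type and member names (Counter, Countdown, Csharp13Sketch) are invented for illustration.

```csharp
using System;
using System.Threading;

public class Counter
{
    // Item 2: a dedicated System.Threading.Lock instead of locking on a plain object.
    private readonly Lock _lock = new();
    private int _value;

    public int Increment()
    {
        lock (_lock)
        {
            return ++_value;
        }
    }
}

public class Countdown
{
    public int[] Values { get; } = new int[3];
}

public static class Csharp13Sketch
{
    // Item 1: params on a span, so callers can pass arguments without allocating an array.
    public static int Sum(params ReadOnlySpan<int> values)
    {
        var sum = 0;
        foreach (var value in values)
        {
            sum += value;
        }

        return sum;
    }

    // Item 3: \e is the ESCAPE character (U+001B), used here in an ANSI color sequence.
    public const string Red = "\e[31m";

    // Item 5: the "from the end" operator in an object initializer.
    public static Countdown CreateCountdown() =>
        new Countdown { Values = { [^1] = 0, [^2] = 1, [^3] = 2 } };
}
```

Here, Csharp13Sketch.Sum(1, 2, 3) binds to the span overload, and CreateCountdown() fills the three-element array from the back.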

Wrapping text the hard way

2024-08-18T19:39:05+02:00

Published by marco on 18. Aug 2024 19:39:05 ------------------------------------------------------------------------

The work journal "2024-03-27T16:03:51 conversation: 01ht0afgwryks5fepkvvm0kn28" by Simon Willison describes the author's process of using AI prompting to write a console text-wrapping algorithm.

[image]

He prompted with "JavaScript that takes a big string of text and word wraps it at the specified width, adding newlines where necessary."

The answers meandered around a solution space that seemed over-engineered and not particularly fruitful -- the answers all used regular expressions, which seems kind of like overkill when you're really only trying to detect a single character. A simple parser would be more effective; easier to understand, debug, and maintain; and is more likely to be performant.

The original wrapping code was:

    function wordWrap(text, width) {
      const regex = new RegExp(`(?![^\\n]{1,${width}}$)([^\\n]{1,${width}})\\s`, 'g');
      return text.replace(regex, '$1\n');
    }

Things got hairy when Willison asked for a solution that includes an indent before each wrapped line. It's honestly hard to tell whether the example will actually do that, largely because of the non-obvious regular expressions (which make use of negative lookahead assertions).

    function wordWrap(text, width, indent = '') {
      const regex = new RegExp(`(?![^\\n]{1,${width}}$)([^\\n]{1,${width}})\\s`, 'g');
      return text.replace(regex, `$1\n${indent}`).replace(/^/gm, indent);
    }

When Willison noted that the indent appears everywhere but at the front of the text, the proposed code was:

    function wordWrap(text, width, indent = '') {
      const regex = new RegExp(`(?![^\\n]{1,${width}}$)([^\\n]{1,${width}})\\s`, 'g');
      return indent + text.replace(regex, `$1\n${indent}`);
    }

This looks promising, as the eye is drawn to the indent + at the beginning of the answer. If you look carefully, though, you'll see that the second call to replace() has gone missing. This is dangerous, right? Imagine that you didn't have sufficient tests for this code. You can't really understand it, it looks vaguely right, and pieces that used to work are randomly dropped in responses to "refinement prompts".

He tried a few more times, but kept losing rather than gaining fidelity in the response, with one or more corner-cases no longer being supported. When he told it to stop using regular expressions, it responded with:

    function wordWrap(text, width, indent = '') {
      const lines = [];
      let currentLine = '';

      text.split('\n').forEach(line => {
        if (line === '') {
          lines.push('');
        } else {
          line.split(' ').forEach(word => {
            if (currentLine.length + word.length <= width) {
              currentLine += (currentLine ? ' ' : '') + word;
            } else {
              lines.push(indent + currentLine);
              currentLine = word;
            }
          });
          lines.push(indent + currentLine);
          currentLine = '';
        }
      });

      return lines.join('\n');
    }

This actually seems better -- and Willison claims it works -- but it still seems only OK. There is a lot of allocation in this response, as well as a failure to separate concerns. There are a few pieces to this request:

1. Split the string into paragraphs
2. Wrap each paragraph to lines of a given width, including an optional prefix on each line

You can play with his version at "Wrap text at specified width" by Simon Willison.

When I threw the code into WebStorm and added a test, I discovered that it doesn't wrap at the desired width.

* It fails to take the length of the indent into account when wrapping the text.
* It doesn't sanity-check that the indent isn't bigger than the desired wrap length.

At least the algorithm doesn't fall into the pathological trap when a word is too long on its own to fit within the desired width, which would result in an infinite loop.

I noodled around with a solution that uses generators to avoid allocation until you actually snip text. The algorithm looks like this:

* Pass in a text, desired width, and indent.
* Get a generator for all line breaks.
* Get a generator for all word breaks in a line.
* Yield a generator for all line spans.
* Yield a generator from that for all lines.

This strategy ensures that there are no substring allocations until you actually need them. You can get the first line or two lines without allocating more than the substrings for those lines. No substrings are allocated to find word breaks, unlike in the generated algorithm above, which splits the text into lines and words up front.

Is it a more complex algorithm? Of course. Is it relatively easy to understand, especially with the requisite tests? Yes. Does it do its job much more efficiently? Absolutely.
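The generator-based version itself isn't shown in the post. As a rough sketch of the strategy described above, and not the author's actual code, the same idea can be written with C# iterator methods, which play the role of JavaScript generators here. The names (TextWrapper, Split, WrappedSpans, Wrap) and the decision to count the indent toward the width are assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class TextWrapper
{
    // Lazily yields the (start, length) of each piece of the given range; no substrings are created.
    private static IEnumerable<(int Start, int Length)> Split(string text, int start, int length, char separator)
    {
        var end = start + length;
        var pieceStart = start;
        for (var i = start; i < end; i++)
        {
            if (text[i] == separator)
            {
                yield return (pieceStart, i - pieceStart);
                pieceStart = i + 1;
            }
        }

        yield return (pieceStart, end - pieceStart);
    }

    // Lazily yields the span of each wrapped line. A word longer than the width gets a line of its own.
    public static IEnumerable<(int Start, int Length)> WrappedSpans(string text, int width)
    {
        foreach (var line in Split(text, 0, text.Length, '\n'))
        {
            var lineStart = -1;
            var lineEnd = -1;
            foreach (var word in Split(text, line.Start, line.Length, ' '))
            {
                if (word.Length == 0)
                {
                    continue; // Consecutive spaces produce empty pieces
                }

                var wordEnd = word.Start + word.Length;
                if (lineStart >= 0 && wordEnd - lineStart > width)
                {
                    yield return (lineStart, lineEnd - lineStart);
                    lineStart = -1;
                }

                if (lineStart < 0)
                {
                    lineStart = word.Start;
                }

                lineEnd = wordEnd;
            }

            yield return lineStart < 0 ? (line.Start, 0) : (lineStart, lineEnd - lineStart);
        }
    }

    // Validates eagerly, then defers to the iterator so that strings are only built on demand.
    public static IEnumerable<string> Wrap(string text, int width, string indent = "")
    {
        var available = width - indent.Length;
        if (available < 1)
        {
            throw new ArgumentException("The indent must be shorter than the width.", nameof(indent));
        }

        return WrapIterator(text, available, indent);
    }

    private static IEnumerable<string> WrapIterator(string text, int available, string indent)
    {
        foreach (var (start, length) in WrappedSpans(text, available))
        {
            // The only string allocation: one per line that is actually consumed.
            yield return string.Concat(indent.AsSpan(), text.AsSpan(start, length));
        }
    }
}
```

Because everything is lazy, taking only the first line of a long text scans and allocates no more than that line requires. The sketch also covers the two problems found in the generated version: it subtracts the indent from the available width and rejects an indent that leaves no room for text.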

Tactics for automated testing

2024-08-17T16:27:26+02:00

Published by marco on 17. Aug 2024 16:27:26 ------------------------------------------------------------------------

In the article "Prefer test-doubles over mocking frameworks", Steve Dunn writes,

"This is testing implementation and not behaviour. Your SUT called something and there is likely an observable side-effect of that. Test the side-effect and not that a particular method was called. If the code is refactored (e.g. you change the implementation but not the behaviour), then your test that checked that a method was called will likely break, but your test that tested the behaviour should remain unchanged and should still pass."

I think we have to be more careful here. Sometimes you want to test the implementation, no? If you look at the simplest test double that he's written in the article, shown below, you can see that there is an implicit assumption that would have to be tested: that is, that the Get method in the test-double accurately represents the actual implementation.

This is the interface to be tested.

    public interface IProductRepository
    {
        void Store(Product product);
        Product Get(int id);
    }

This is the test using the test double:

    [Fact]
    public void Using_test_doubles()
    {
        var repo = new InMemoryProductRepository();
        var sut = new ProductService(repo);

        sut.OnboardNewProduct(123, "Product 123");

        repo.DidStore(123).Should().BeTrue();
    }

Note that the test calls a test-double-only method called DidStore(), which is assumed to have been implemented as expected. A naive implementation would just return true. Since this is a test double, there are no tests verifying that it doesn't always return true. Shouldn't the test instead verify that the product is not stored first -- i.e., repo.Get(123) returns null -- before calling OnboardNewProduct(123, ...) and then testing repo.Get(123) again to verify that it returns the product?

The following is the implementation of the test-double.

    public class InMemoryProductRepository : IProductRepository
    {
        private readonly List<Product> _products = new();

        public void Store(Product product) => _products.Add(product);

        public Product Get(int id) => _products.FirstOrDefault(p => p.Id == id);

        // This is not part of the interface, but is useful for testing
        public bool DidStore(int id) => Get(id) is not null;
    }

If you leave the test as formulated, there is literally no guarantee that anything changed at all. The author is simply assuming that Store adds a product because he can see that it does.

The author wasn't quite clear why his mock-based implementation isn't good, though. He proposed the code below.

    [Fact]
    public void Using_mocks()
    {
        var repo = Substitute.For<IProductRepository>();
        var sut = new ProductService(repo);

        sut.OnboardNewProduct(123, "Product 123");

        repo.Received().Store(Arg.Is<Product>(p => p.Id == 123));
    }

Do you see how he checked whether the Store() method had been called rather than testing whether Get(123) returns the product? He had to do that because the mock would never return the stored product unless the author had also set up the Get() method to return it when called with 123. Why wouldn't he do this? Because he'd then have just been testing the mock. However, if you look closely at the previous example, the author is also just testing his test-double.

I have another problem with the statement above: sometimes I very much want to verify that a specific method is being called. I'm not trying to verify the behavior of the test-double; I'm trying to verify the behavior of the actual implementation. If, for whatever reason, I can't use the actual implementation, then I want to verify that a certain method was called because, e.g., I know that that method calls a system API directly. That is, I trust that the system API will do what it says on the tin. I'm able to verify manually that the parameters to the method are passed on to the API faithfully. I can't call the API in the test suite -- maybe it's a call to the Windows Registry or maybe it's accessing a USB stick that doesn't exist in CI -- but I can get as close as possible. If something still goes wrong, then I know that I just have to examine the one line of code in the actual implementation. In that way, I've verified a fact about the system that means something.

This comes up often enough in more complex component graphs, where you've had a bug that, under certain circumstances, a certain notification is not sent. In that case, you might be unable to verify that the message arrives -- as we do by testing Get(123) above -- because the actual message would go through an online proxy like Apple and would end up on a mobile device somewhere, and maybe you don't want to build the testing infrastructure that mocks a receiving device that you can check. It wouldn't help you because you'd just be testing the test-double implementation anyway. Instead, you would trigger a high-level API that, eventually, bubbles through several layers until the notifier is triggered with a certain message. In that case, an efficient and effective test would be to test that the INotifier.Send() method was called with the expected parameters.

Even in the author's example, there is presumably an external data store of some sort that is being mocked. I'm not actually interested in testing whether that data store interprets my command to store correctly. I'm going to assume that it does because it's not my code. [1] What I want to confirm is that I sent the command to the store. That is, I want to verify that a particular method was called with particular parameters. Perhaps I'll use a snapshot test to verify that the generated SQL is correct. Then I don't have to actually run the SQL against the database every time.

In the author's case, he's calling a method on one interface and verifying that a property of another interface has changed. He is testing the interplay of those two components. That he used a test-double doesn't help at all -- it's only because the test-double was written correctly that the test means anything. And there are no tests to verify that the test-double actually does what he assumes it is doing.

While I agree that test-doubles have their place, I think that mocking frameworks can also be very helpful. That's why I don't like rules like "test behavior not implementation". I prefer to consider it a guideline, so that I can remember to write high-level, well-abstracted tests where possible but I can also just test that a certain method on a certain component will be executed.

--------------------------------------------------------------------------------

[1] If that promise is broken, then I will have to reevaluate. I could write a test to verify that the external component works as expected -- just in case it breaks again -- I could find a more reliable external component, I could fix the current external component, or some combination of these.
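The before-and-after check suggested above, which goes through the interface's own Get() rather than the double-only DidStore(), could look roughly like the following framework-free sketch. Product and ProductService are not shown in the article, so their shapes here are assumptions, as is the BeforeAndAfterCheck helper.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape; the article doesn't show the Product type.
public class Product
{
    public Product(int id, string name)
    {
        Id = id;
        Name = name;
    }

    public int Id { get; }
    public string Name { get; }
}

public interface IProductRepository
{
    void Store(Product product);
    Product Get(int id);
}

public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> _products = new();

    public void Store(Product product) => _products.Add(product);

    public Product Get(int id) => _products.FirstOrDefault(p => p.Id == id);
}

// Assumed implementation; the article only shows how the service is called.
public class ProductService
{
    private readonly IProductRepository _repo;

    public ProductService(IProductRepository repo) => _repo = repo;

    public void OnboardNewProduct(int id, string name) => _repo.Store(new Product(id, name));
}

public static class BeforeAndAfterCheck
{
    public static bool Run()
    {
        var repo = new InMemoryProductRepository();
        var sut = new ProductService(repo);

        // Before: the product must not be there yet.
        var absentBefore = repo.Get(123) is null;

        sut.OnboardNewProduct(123, "Product 123");

        // After: the product is retrievable through the same interface method.
        var presentAfter = repo.Get(123) is { Name: "Product 123" };

        return absentBefore && presentAfter;
    }
}
```

A double whose Get() always returned null, or always returned a product, would fail one of the two checks, which is the guarantee the original test lacks.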

Upgrading to nullability in C#

2024-08-16T03:38:49+02:00

Published by marco on 16. Aug 2024 03:38:49 ------------------------------------------------------------------------

[image]

The "Talk - Bringing C# nullability into existing code" by Maarten Balliauw is a 66-slide deck that I summarize as follows:

* The C# nullability feature is for build- and design-time. It does not enforce anything at runtime. That means that you still have to check parameters for null.
* The C# nullability feature is available to solutions working with .NET Framework and .NET.
* For .NET Framework, you have to explicitly set the <LangVersion> to 8.0 (however, there are a bunch of cons associated with doing this, as the runtime library itself is not annotated). [1]
* The presentation shows how to enable and disable the feature for the whole solution, project, or an individual code region.
* For new solutions, enable at the solution level.
* For small solutions, enable at the solution level and just work through it.
* For large solutions, enable project-by-project or file-by-file -- or even class-by-class.
* "Start at the center and work outwards."
* While ? suffices in most cases, consider annotations to improve your own APIs.
* Consider redesigning APIs that return null (use the bool TryGet(..., out T) pattern or return a "null" object instead).
* Avoid allowing null parameters (these force a decision on the implementation that is often better handled by the caller).
* Don't use ! except temporarily.
* Don't use suppression except temporarily.
* Start with types that aren't depended on a lot. Those are easy.
* Take types with lots of dependents one-by-one.

--------------------------------------------------------------------------------

[1] For more information, see "C# 8.0 nullable references: supporting older runtimes" by Ian Griffiths, published in July of 2021. Also, the article "Consider using C# 8 with the .NET Framework" cites from "Building C# 8.0" by Mads Torgersen. Both of those articles are from 2018.

"using C# 8.0 is only supported on platforms that implement .NET Standard 2.1"

.NET Framework doesn't implement .NET Standard 2.1. However, the StackOverflow post "Does C# 8 support the .NET Framework?" goes into some detail about which features of C# 8.0 could be supported under .NET Framework. That post notes that syntax-only changes will continue to work, which makes sense. As long as you use a newer compiler that understands the syntax, the lowered code and subsequent generated IL will be compatible with the .NET Framework runtime. That's what syntax-only means: no new functionality was required in the runtime in order to support the generated output.

* "Static local functions"
* "Using declarations"
* "Null-coalescing assignment"
* "Readonly members"
* "Disposable ref structs"
* "Positional patterns"
* "Tuple patterns"
* "Switch expressions"
* "Nullable reference types" are also supported, but the new "nullable attributes" required to design the more complex nullable use cases are not. However, according to "C# 8.0 nullable references: supporting older runtimes", there's a "Nullable NuGet package". Be aware, though, that the .NET Framework is not itself annotated, so you will probably see spurious warnings when the compiler can't tell that a result can never be null.

That's a lot of features, actually! The StackOverflow post linked above lists them quite well, and "C# 8.0 and .NET Standard 2.0 - Doing Unsupported Things" has some more information about which level of change each C# 8.0 feature requires.

That said, "The C# 8/.NET Framework combination is not officially supported by Microsoft. It is, they say, for experts only."
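As an illustration of two of the recommendations in the summary (the TryGet pattern and annotating your own APIs), here is a minimal sketch. The SettingsStore type and its members are invented for this example; NotNullWhenAttribute is one of the nullable attributes the footnote refers to, so on .NET Framework it would have to come from a package or polyfill.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

// Hypothetical type that avoids returning null from a lookup.
public class SettingsStore
{
    private readonly Dictionary<string, string> _values = new();

    public void Set(string key, string value) => _values[key] = value;

    // [NotNullWhen(true)] tells the compiler that value is non-null whenever the method returns true.
    public bool TryGet(string key, [NotNullWhen(true)] out string? value)
        => _values.TryGetValue(key, out value);

    // No ! and no extra null check needed: the annotation carries the guarantee to the caller.
    public int LengthOrZero(string key)
        => TryGet(key, out var value) ? value.Length : 0;
}
```

Without the annotation, the compiler would warn on value.Length because value is declared as string?. As the summary notes, none of this is enforced at runtime.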

Ignoring files with .gitignore

2024-07-07T21:18:27+02:00

Published by marco on 7. Jul 2024 21:18:27 ------------------------------------------------------------------------

[Introduction]

This article defines concepts like repository and working tree and then discusses how you can use .gitignore files to determine the files and folders that Git considers during operations.

[Concepts]

[image]

From a command line, you can run git init in any folder to make that folder a Git repository. Doing so creates a .git folder with a database and configuration files for the local repository. Git considers any folder that contains a .git folder with these configuration and database files in it to be a Git repository. The working tree of such a folder is all files and subfolders in that folder other than the .git folder.

Any non-trivial repository will contain at least some files that should not be committed to the Git repository. Commonly, these are build artifacts produced from the source files that are included in the repository.

[.gitignore]

A repository may include multiple .gitignore files. From the "official documentation":

"Patterns read from a .gitignore file in the same directory as the path, or in any parent directory (up to the top-level of the working tree), with patterns in the higher level files being overridden by those in lower level files down to the directory containing the file. These patterns match relative to the location of the .gitignore file."

Git determines whether a given file is included in the working tree by collecting the patterns from any .gitignore file anywhere in that file's path. Patterns defined "closer" to the file override those defined "farther" away. For example,

    [Root]
      📄 .gitignore
      📁 ProductMedia
        📁 Win10
          📁 Content
            📄 .gitignore
            📁 Deploy
              📁 Control
                📄 .gitignore
                📄 readme.txt

Git determines whether readme.txt is included in the working tree by collecting the rules from files in the following order:

* .gitignore
* ProductMedia/Win10/Content/.gitignore
* ProductMedia/Win10/Content/Deploy/Control/.gitignore

[Patterns]

This section provides a brief overview with examples for common and more advanced cases that have been needed at Uster. For more information, see "all supported patterns" in the official documentation.

A .gitignore file contains zero or more patterns.

* Each pattern excludes the affected files from the working tree.
* Each pattern affects only files in the sub-tree defined by the sub-folder in which the pattern is declared.
* Each pattern is interpreted relative to its containing folder (i.e., the root folder for a pattern is its declaration folder).

The following examples illustrate common patterns. The entry "Disables subsequent patterns" indicates whether subsequent patterns that target files in the ignored sub-tree will have an effect. As detailed below, the presence of a trailing * determines whether Git considers subsequent rules. This only becomes relevant for re-including deeply nested files.

obj/
    All files in any folder named obj anywhere in the sub-tree of the current folder or any sub-folder thereof.
    Disables subsequent patterns: Yes

/obj/
    All files in the root folder named obj or any sub-folder thereof.
    Disables subsequent patterns: Yes

/obj/*
    All files in the root folder named obj or any sub-folder thereof.
    Disables subsequent patterns: No

*.bin
    All files ending in .bin anywhere in the sub-tree of the current folder.
    Disables subsequent patterns: Yes, but irrelevant

src/*/out/bin
    All files in any path named out/bin found in any single sub-folder of any folder named src anywhere in the sub-tree of the current folder.
    Disables subsequent patterns: Yes

src/**/out/bin
    All files in any path named out/bin found anywhere in any combination of sub-folders of the sub-tree of any folder named src anywhere in the sub-tree of the current folder.
    Disables subsequent patterns: Yes

[Re-including files]

You can also re-include files with the ! operator. For example, you might want to exclude everything in a folder but a single configuration file. The following example ignores everything in the /out folder except for the file settings.json. Note the trailing /*: as explained in the next section, a pattern that excludes the folder itself would prevent the re-inclusion from having any effect.

    /out/*
    !/out/settings.json

[Difference between / and /*]

As very nicely explained in the answer on StackOverflow ".gitignore exclude folder but include specific subfolder",

* The pattern dir/ excludes a directory named dir and (implicitly) everything under it. With dir/, Git will never look at anything under dir, and thus will never apply any of the "un-exclude" patterns to anything under dir.
* The pattern dir/* says nothing about dir itself; it just excludes everything under dir. With dir/*, Git will process the direct contents of dir, giving other patterns a chance to "un-exclude" some bit of the content (!dir/sub/).

For example, suppose that you want to ignore all files in the obj and bin folders, except for files named Readme.md in those folders. The following patterns achieve this for obj but not for bin folders (because the missing * prevents Git from considering the rule re-including Readme.md for bin folders).

    /obj/*
    !/obj/Readme.md
    /bin/
    !/bin/Readme.md

This distinction is highly relevant for the next section.

[Re-including deeply nested files]

As linked above, the answer on StackOverflow ".gitignore exclude folder but include specific subfolder" was invaluable in determining how to accomplish the task below.

Suppose we have a folder structure as shown below.

[Root] 📁 Folder1 📁 Folder2 📁 Folder3 📁 ProductMedia 📁 Win10 📁 Content 📄 Other files... 📁 Deploy 📁 Control 📄 CustomerSettings.ini 📄 Bootstrap.ini 📄 Settings.xml 📄 Other files... 📁 Folder1 📁 Folder2 📁 Folder3 📁 Other folders... 📁 Win11 📁 Content 📄 Other files... 📁 Deploy 📁 Control 📄 CustomerSettings.ini 📄 Bootstrap.ini 📄 Settings.xml 📄 Other files... 📁 Folder1 📁 Folder2 📁 Folder3 📁 Other folders... 📁 Folder1 📁 Folder2 📁 Folder3 📁 Other folders...

We want patterns that ignore all files in the Content folder of any sub-folder of ProductMedia except for the CustomerSettings.ini, Bootstrap.ini, and Settings.xml files under the path Content/Deploy/Control/ of any sub-folder of ProductMedia/.

You might think that the following would do the trick:

    # Ignore everything in the "Content" folder of any subfolder of "ProductMedia"
    ProductMedia/*/Content/*

    # Except for the specified files under the path "Content/Deploy/Control/"
    !ProductMedia/*/Content/Deploy/Control/CustomerSettings.ini
    !ProductMedia/*/Content/Deploy/Control/Bootstrap.ini
    !ProductMedia/*/Content/Deploy/Control/Settings.xml

Microsoft Copilot certainly thought so.

This does not have the intended effect. The pattern correctly uses the * to indicate that Git should continue processing patterns for sub-trees of the excluded folder. It also correctly ensures that the rule applies both to Win10 and Win11 folders by using a *. However, while patterns affecting the direct contents of Content will be considered, those affecting anything inside Content/Deploy or any sub-folder thereof will not, because Deploy itself is excluded.

The trick, as outlined in the "StackOverflow answer", is to re-include, then exclude each individual sub-folder in the path, as shown below. (Git only treats a line as a comment when it starts with #, so the comments are on their own lines.)

    # We ignore the generated "Content" subfolders in the ProductMedia folder
    # (the trailing * allows subsequent processing)
    ProductMedia/**/Content/*

    # Except for certain configuration files

    # Include "Deploy", allowing subsequent processing
    !ProductMedia/**/Content/Deploy/
    # Ignore everything in "Deploy"
    ProductMedia/**/Content/Deploy/*
    # Include "Control", allowing subsequent processing
    !ProductMedia/**/Content/Deploy/Control/
    # Ignore everything in "Control"
    ProductMedia/**/Content/Deploy/Control/*

    # Finally, include the desired individual files
    !ProductMedia/**/Content/Deploy/Control/CustomerSettings.ini
    !ProductMedia/**/Content/Deploy/Control/Bootstrap.ini
    !ProductMedia/**/Content/Deploy/Control/Settings.xml

This pattern of re-including, then re-excluding each sub-folder suffices to allow Git to consider the file patterns at the end, while still ignoring all other files in any sub-trees that would otherwise have been included.

Building RegEx from scratch with Stephen Toub

2024-05-18T21:21:32+02:00

Published by marco on 18. May 2024 21:21:32 ------------------------------------------------------------------------ This is another excellent 1-hour tour of another complex corner of .NET. Toub describes and shows how the source-generated RegEx engine works. [media]
* The generated source is human-readable and debuggable.
* It is well-commented.
* It updates in real-time as you change the expression.
* It includes XML documentation that describes the regular expression in plain English.
* They rewrote the compiler in .NET 7, both to better support source generators and to be able to emit source code as well as IL. They rebuilt the emitter to allow more leeway in code-generation -- the first generation emitted C# that looked very much like IL.
* They have a gigantic test-suite that they culled from open-source code: 4M expressions, deduplicated down to about 20,000 unique expressions, which they run against all four RegEx engines to verify that nothing runs pathologically long or with excessive memory.
* There is an analyzer that tries very hard to eliminate greediness. It seeks atomicity. Fascinating.
* At 47:00, he shows a great example of a regex that requires backtracking, which can lead to pathological, exponential performance. These engines support back-references, which are powerful. They can be super-fast for matches, but they have very bad worst-case behavior that may end up in denial-of-service (DoS) behavior. In .NET, you can set a timeout on your regular-expression evaluation to avoid this. You can also set a global timeout. You can also turn off back-tracking. If it can produce the engine to evaluate the expression, then it will evaluate in linear time. If it cannot, it's probably a compile-time error if you're using source generators, which is quite nice.
* They also examine an email-address RegEx, which leads Toub to show how the generated source uses the SearchValues variants, which are a highly-optimized way of searching text, with dozens of algorithms that it chooses by analyzing the input string. They have SIMD/Vector/Arm Intrinsics support where possible and are exactly the kind of optimization that a framework like .NET can offer, but that an app developer would never have time to make.
💙 Stephen Toub. He's absolutely brilliant. Mad props to Scott Hanselman for reining him in and providing a great sparring partner.
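The timeout and non-backtracking options mentioned in the list above can be tried out with a few lines of C#. This is a minimal sketch rather than anything shown in the video: the patterns, the input, the 250 ms timeout, and the Patterns class are all made up for illustration, and it assumes .NET 7 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: a run of 'a' characters followed by one character that forces a mismatch.
string input = new string('a', 40) + "!";

// 1. Per-instance timeout: a backtracking match that runs too long is abandoned.
var guarded = new Regex(@"^(a+)+$", RegexOptions.None, TimeSpan.FromMilliseconds(250));
try
{
    Console.WriteLine(guarded.IsMatch(input));
}
catch (RegexMatchTimeoutException)
{
    Console.WriteLine("Match abandoned after the timeout elapsed.");
}

// 2. Non-backtracking engine: the same pattern is evaluated without backtracking.
var linear = new Regex(@"^(a+)+$", RegexOptions.NonBacktracking);
Console.WriteLine(linear.IsMatch(input)); // False

// 3. Constructs that require backtracking (here, a back-reference) are rejected up front.
bool rejected = false;
try
{
    _ = new Regex(@"(\w)\1", RegexOptions.NonBacktracking);
}
catch (NotSupportedException)
{
    rejected = true;
}
Console.WriteLine(rejected); // True

// 4. Source-generated variant: the implementation is emitted as readable C# at build time.
Console.WriteLine(Patterns.Word().IsMatch("hello")); // True

// Hypothetical holder class for the generated regex.
static partial class Patterns
{
    [GeneratedRegex(@"^\w+$", RegexOptions.CultureInvariant, 250)]
    public static partial Regex Word();
}
```

Whether the first match actually hits the timeout depends on the runtime version and its optimizations, which is why that branch prints either outcome.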

Building async/await from scratch with Stephen Toub

2024-05-18T21:17:22+02:00

Published by marco on 18. May 2024 21:17:22 ------------------------------------------------------------------------ This is another video from Stephen Toub that is just chock-full of useful information. [media] At 27:30, they start to discuss the nomenclature of Task and how it differs from an Action. It's funny that neither of them mentioned that tasks in .NET are called promises pretty much everywhere else (JavaScript, Java, etc.). Some libraries also use the word future. For more information, see "Futures and promises" . As he's building everything, it is really astonishing to note that Hanselman has to tell Toub that you can have Visual Studio generate methods for you. How does he not know that? When he did it, he then used the mouse to select "Find References" from the shortcut menu instead of just pressing F12. When he got to the method, he said "Oh, it didn't implement it," as if disappointed that Copilot hadn't botshit a version in there for him. He was going to write it himself anyway, but it was telling that he's gotten so accustomed to Copilot just filling in implementation. A little while later, he's learned the new tool, telling Hanselman that he's going to use his "trick" to create the method. At around 52:30, he implemented a try/catch to "be a good citizen" and accepted what Copilot had recommended for him, but it didn't match what he said he was writing. He said "so we always set the task result" but the code that he/Copilot wrote returned from the catch, which means that the task result isn't going to be set when there is an exception. Now I don't know which one he meant: what he said he wanted to write (did he misspeak?) or what he actually wrote (which Copilot wrote for him and he might have automatically accepted). Since he has no tests whatsoever, this is exactly the kind of subtle bug that might go undetected for quite a while, as it's in the exception-handling code. It might also be quite difficult to diagnose. 
When he wrote the exact same thing again at 1:00:00, he seemed to indicate that what it wrote was OK: i.e., it either sets the exception or it sets the result. At 56:00, he gets to the point of trying to get to the synchronous calling style supported by await and builds his own logic. It works, but it still can't be used with the await keyword. He quickly implements a TaskAwaiter and voilà! 🧙‍♂️ it works! His very own implementation of the Task pattern that integrates with the compiler.
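The "either sets the exception or sets the result" shape is easier to judge in code. The following is a minimal sketch using the built-in TaskCompletionSource rather than Toub's hand-written task type; the Run helper and its signature are hypothetical and not taken from the video.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

Console.WriteLine(await Run(() => 42)); // 42

try
{
    await Run(() => throw new InvalidOperationException("boom"));
}
catch (InvalidOperationException e)
{
    Console.WriteLine(e.Message); // boom
}

// Hypothetical helper: runs the delegate on the thread pool and completes a task exactly once.
static Task<int> Run(Func<int> work)
{
    var tcs = new TaskCompletionSource<int>();
    ThreadPool.QueueUserWorkItem(_ =>
    {
        int result;
        try
        {
            result = work();
        }
        catch (Exception e)
        {
            // Returning from the catch is fine only because the task is completed first.
            // Without this call, anyone awaiting the task would wait forever.
            tcs.SetException(e);
            return;
        }

        tcs.SetResult(result);
    });
    return tcs.Task;
}
```

The second call is the case that a test would need to cover: a delegate that throws must still leave the task in a completed (faulted) state.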

Building LINQ from scratch with Stephen Toub

2024-05-18T21:11:44+02:00

Published by marco on 18. May 2024 21:11:44 ------------------------------------------------------------------------ This is a great interview with the master of performance-optimization in .NET, Stephen Toub. If you're relatively well-versed in C#, .NET, and LINQ, then you can just jump to the second video (linked below). I actually watched the second one first. I didn't feel like I'd missed anything. [media] Stephen Toub's the guy who writes the 100+-page release notes on performance. See the following links.
* "Performance Improvements in .NET 5" (46 pages)
* "Performance Improvements in .NET 6" (109 pages)
* "Performance Improvements in .NET 7" (170 pages)
* "Performance Improvements in .NET 8" (235 pages)
* ...more to come.
In this video, at 26:00, Scott flubbed the joke. It doesn't really matter but, according to the article "Two Hard Things" by Martin Fowler , the original saying was: "There are only two hard things in Computer Science: cache invalidation and naming things." This was "upgraded" to: "There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors." At 33:00, Toub talks about how the Current property of IEnumerator is not checked for nullability because ... it can technically always be null, but we also don't use it that much and we don't want the compiler yelling at you for possibly-null access when the item type is a value type (for example) or a non-nullable reference type. See the "code" for the annotation and more comments. At 49:00, they discuss the use of goto, where Hanselman says, "there's the kind that you can see; and there are the kind that are hidden," which I feel like he fumbled as well. I think what he meant was that there are implicit gotos everywhere. A goto maps to an assembler jmp, so every if has an implicit goto. The break statement inside of a case block of a switch statement jumps (or goes to) the end of the switch block.
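To make the visible and hidden jumps concrete, here is a minimal sketch of a switch statement with both kinds. The Describe function and its cases are made up for illustration and are not the code from the video. C# does not allow a non-empty case to fall through silently, so the jump to the next case has to be written out with goto case.

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(Describe(1)); // one,two
Console.WriteLine(Describe(2)); // two
Console.WriteLine(Describe(3)); // other

// Hypothetical example function.
static string Describe(int n)
{
    var parts = new List<string>();
    switch (n)
    {
        case 1:
            parts.Add("one");
            goto case 2; // the visible kind: an explicit jump into the next case
        case 2:
            parts.Add("two");
            break;       // the hidden kind: a jump to the end of the switch block
        default:
            parts.Add("other");
            break;
    }

    return string.Join(",", parts);
}
```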
You should be careful about using goto -- i.e., it's a code smell -- but sometimes it's the clearest and most concise way of expressing intent. In their case, they used it to "fall through" from one case statement to the next with goto case 2. I feel the same way about goto as I do about the continue keyword in a loop. I like to write all of my loops in a consistent and idiomatic manner. That means I use break. So, for me, using continue is a code smell, but sometimes it's the most elegant way of expressing the intent. At about 55:00, Hanselman shows Toub how to use Winget to install SysInternals and then to use ZoomIt for presentations. This was a nice bonus that probably a lot of viewers didn't know: (A) ZoomIt is a poor tool compared to Ctrl + mouse-scroll on macOS, but it's better than nothing and (B) running Winget from PowerShell is a pretty powerful install-and-update tool on Windows. In this presentation, I notice that Toub doesn't use the "extract variable" refactoring either. I'm not sure whether something like that is available in VS. He also installs packages with the VS NuGet interface rather than just adding the package reference manually in the project file. I note other inefficiencies in his coding style in the notes to the second video below. He's fast! ...but he could be even faster. [media] At about 22:00, he mentions something interesting that makes me change my opinion about the type to use for private variables. I'd always used interfaces to keep it clear which API my implementations depended on. This included list variables, which I would type as IList instead of List. Toub says that the second formulation is better for performance because the compiler doesn't have to deal with virtual dispatch -- it can just call the functions of List directly. This makes me realize that it's a good reason to use var. When you use var with new, the resulting type of the local variable will automatically be the "right" type for performance. 
The same goes for "target-typed new expressions" , which force you to use a non-interface type for fields. 👉 There are inspections in Visual Studio, ReSharper, and Rider that help you shape your private APIs for optimal performance. At 27:00, they talk about the IDE features for tracking change and inspection states -- the colorful splotches in the left and right gutters and on the vertical scroll bar. First of all, Toub doesn't know left from right, but he's a genius so we'll grant him that. Second of all, neither he nor Hanselman really understands what the file-state markings are. The IDE tracks not only unsaved changes but also uncommitted changes. That's why there is so much green in the gutter on the left and the right: because Toub had rewritten much of the file since the last commit but he's been saving the file the entire time. It's an important distinction to make for understanding what's going on. The right-hand side of the right-hand gutter shows inspections: suggestions, warnings, and errors. It was interesting to see that, since Toub uses Visual Studio without ReSharper, he instinctively used the mouse to copy/paste the name of the constructor from the class after he'd renamed the class. That is, he'd renamed the class and, in order to fix the constructor being named incorrectly, he copy/pasted with the mouse. With ReSharper, he could have just pressed Alt + Enter and selected the quick-fix called "This is a constructor", which just renames the method to the name of the containing class. I could see several places where he used the mouse rather than being able to stay on the keyboard, letting the IDE do his work for him. For example, he also copy/pasted the name of the iterator class again in order to use the more highly specialized version -- but he could have just started typing to have the auto-complete suggest the right type. 
Or he could have used multiple clipboards to paste the type name that he'd just copied before (when he'd fixed the constructor). When the type of the actual argument no longer matched the formal argument, he copy/pasted again to get the more specific type. Here, he could have also asked ReSharper to show variables in scope with an appropriate type. It would have shown only array and been done with it. No mouse, no copy/pasting, no moving away from the keyboard, no guesswork. At about 35:00 or so, shit gets real, as Toub starts hand-optimizing the code for his iterator. He does some obvious stuff first, removing iterator complexity that is no longer needed when the iterator has to handle only fixed-length and integer-indexable arrays. Next, though, he does a neat trick with a uint cast that ensures that a check for i < source.Length will never be true when i is less than 0 (because (uint)-1 wraps to approximately 2 x int.MaxValue, so it will never be less than Length). After that, he talks about how the jitter can use that condition to avoid the automated bounds-checking that comes with .NET's managed code by default when indexing the array. So the cast to uint not only avoids a branch, it also avoids the hidden cost of bounds-checking. Nice! 👏 After running the benchmark again, we can see that he's rebuilt the optimizations available in C# by default. Very nice! 👏 👏 This is why running against a newer runtime and library may increase the performance ⏱ of your own code. Toub mentions that these types of optimizations are great, but they have to balance the value of the optimization versus the additional code to maintain as well as the size of the runtime assemblies. At 49:00, he mentions that the size issue also affects Native AOT, since AOT can't take advantage of PGO or anything else that the jitter has available to eliminate unnecessary code. AOT doesn't have as much information available, so it can't optimize away as much code. 
That is, if it can't guarantee that only the array-optimized version is called, then it has to keep all of the versions, increasing the binary size. There are also vector-optimizations and SIMD-optimizations that may have to be included. For more information, see "AOT, JIT, and PGO in .NET" . The extremely detailed chapter overview.
00:00:00 Deep Dive into Implementing Iteration in Programming
00:01:50 Understanding the Implementation and Functionality of Custom Iterators in Link
00:07:45 Discussing Optimization Strategies and Array Specialization in Programming
00:10:43 Understanding the Use of Sharp Lab and Compiler Optimization in C
00:15:20 Discussing Optimizations in Link Methods in Programming
00:16:39 Understanding SIMD and its Application in Computer Processing
00:20:12 Discussion on Code Analysis and Optimization Techniques in Software Development
00:23:46 Discussing and Implementing Iteration-Based and Manual Arrays in Programming
00:30:30 Exploring Compiler Optimizations and State Management in Programming
00:37:04 Exploring Hyper and Micro Optimization in Programming
00:40:11 Exploring Code Optimizations and Trade-offs in Programming
00:45:41 Discussing the Challenges and Implications of Optimizations in Software Performance
00:47:54 Discussing the Implementation and Optimization of Select in Programming
00:51:49 Discussion on Programming Syntax and Benchmarks
00:54:41 Implementing and Discussing Iteration Code in Programming
00:57:12 Understanding the Functionality and Implementation of C# Compiler Keywords
01:02:42 Improving Functionality and Performance of Manual Implementation of Iteration Methods
01:08:39 Exploring the Optimization and Implementation of Select Operators in Programming
01:12:57 Understanding and Optimizing Iteration Operations in Programming
01:18:09 Implementing and Utilizing LINQ Programming: A Two-Parter
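The uint-cast trick from the 35:00 mark of the second video can be shown in a few lines. This is a minimal sketch and not Toub's iterator: the TryGet helper and the sample array are hypothetical. A single unsigned comparison covers both the negative case and the past-the-end case.

```csharp
using System;

int[] data = { 10, 20, 30 };

Console.WriteLine(TryGet(data, 1, out int value)); // True
Console.WriteLine(value);                          // 20
Console.WriteLine(TryGet(data, -1, out _));        // False
Console.WriteLine(TryGet(data, 3, out _));         // False

// Hypothetical helper: returns false instead of throwing for an out-of-range index.
static bool TryGet(int[] source, int i, out int value)
{
    // A negative i wraps to a very large unsigned number (for example, -1 becomes uint.MaxValue),
    // so one comparison rejects both i < 0 and i >= source.Length.
    if ((uint)i < (uint)source.Length)
    {
        value = source[i];
        return true;
    }

    value = default;
    return false;
}
```

Whether the JIT actually drops the bounds check for the indexing expression depends on the runtime version, so that part is best confirmed by inspecting the generated assembly or by benchmarking.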

It's 2024. How's it going, JavaScript?

2024-04-24T22:40:21+02:00

Published by marco on 24. Apr 2024 22:40:21 ------------------------------------------------------------------------ This video is from a great channel, which published a lot of great videos a while back. They covered pretty much everything already, but circled back to JavaScript for 2024. [media] Some choice quotes from the video. "We push on save." "2024 is the year of the serverlesslessness." "They say that every year, but this year they’re out of VC funding." "Don't write this down, next week all of this is gonna change." This guy just keeps knocking it out of the park. Pretty much everything he mentioned exists and is as described. [image][image]

Avoid primary constructors in C# (for now)

2024-04-17T11:01:32+02:00

Published by marco on 17. Apr 2024 11:01:32 Updated by marco on 17. Apr 2024 11:14:05 ------------------------------------------------------------------------ tl;dr: avoid C# 12's primary constructors for classes except for very small, simple classes, in which case you should consider using a record instead. The following video discusses the downsides of the current implementation of primary constructors: [media] To sum up:
* Primary constructors don't have a readonly backing field; you can still assign to it within the type.
* You can't control the visibility of the generated property or backing field.
* You can't throw exceptions, except in a field-initializer, which isn't as obvious or clean as doing so from within a standard constructor.
Nick contrasts the C# implementation with the language feature in Kotlin, which allows all modifiers in the declaration, but has the same problem that the class definition can get pretty wordy. The article "Primary Constructors – Using C# 12 in Rider and ReSharper" by Matthias Koch describes another ugly phenomenon: double capture. Let’s consider the following example:

public class Person(int age)
{
    // initialization
    public int Age { get; set; } = age;

    // capture
    public string Bio => $"My age is {age}!";
}

In this class, the parameter age is exposed both through the Age and Bio property. As a result, the object stores the state of age twice! For reference types, a double capture leads to an increased memory footprint and possibly even memory leaks. In our concrete example, you will observe the following unintended behavior:

var p = new Person(42);
p.Age.Dump(); // Output: 42
p.Bio.Dump(); // Output: My age is 42!
p.Age++;
p.Age.Dump(); // Output: 43
p.Bio.Dump(); // Output: My age is 42! // !!!!
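One way to sidestep the double capture is to use the constructor parameter only in initializers and to read the stored state everywhere else. The following is a minimal sketch under that assumption, requiring C# 12. Person mirrors the quoted example with Bio changed to read the property, while the Greeter class is hypothetical and shows the field-initializer validation mentioned in the summary.

```csharp
using System;

var p = new Person(42);
Console.WriteLine(p.Bio); // My age is 42!
p.Age++;
Console.WriteLine(p.Bio); // My age is 43!

var g = new Greeter("Ada");
Console.WriteLine(g.Greet()); // Hello, Ada!

public class Person(int age)
{
    // The parameter is used only to initialize the property, so it is not captured.
    public int Age { get; set; } = age;

    // Reads the property rather than the constructor parameter.
    public string Bio => $"My age is {Age}!";
}

// Hypothetical class: copies the parameter into an explicit readonly field.
public class Greeter(string name)
{
    // Validation has to live in the initializer because there is no constructor body.
    private readonly string _name = name ?? throw new ArgumentNullException(nameof(name));

    public string Greet() => $"Hello, {_name}!";
}
```

The explicit readonly field costs one line per parameter, which is part of why a record or a standard constructor may still be the cleaner choice.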

Fighting with Fowler on Continuous Integration

2024-03-18T10:56:49+01:00

Published by marco on 18. Mar 2024 10:56:49 Updated by marco on 18. Mar 2024 11:18:01 ------------------------------------------------------------------------ [CI]The article "Continuous Integration" by Martin Fowler makes many interesting points. It is a compendium of know-how about CI by one of the industry heavyweights, who's been using it for a long time. While I found a lot of what he had to say interesting, I did wonder how applicable CI is for the kinds of teams that I know and work with. He makes several statements toward that end that pretty severely limit the applicability of what he calls "true CI" for many, if not most, teams. I think he should have started his article with a very clear delineation for which kinds of organizations this kind of process is appropriate or efficient. In leaving it out, he seems to suggest that it's the best for everyone, but at the end of the article, he lists what are, for me, quite severe restrictions. For example, * "Continuous Integration is more suited for team working full-time on a product [...]" * "[...] elite teams deployed to production more rapidly, more frequently, and had a dramatically lower incidence of failure when they made these changes." * "If a team attempts Continuous Integration without a strong test suite, they will run into all sorts of trouble" I don't get the impression that Fowler is discussing a dream scenario toward which one works, but rather what he considers to be the absolute minimum process that anyone should be utterly embarrassed about themselves for not already having. I didn't see a single sentence in this 40-page, at-times repetitive document about how to actually get there from here -- or whether that's really appropriate for many projects that people who read Martin Fowler might be working on. I wonder about the wisdom of prioritizing integration seemingly above all else. Below are citations from the long paper, with my comments interleaved. 
"This contrast isn't the result of an expensive and complex tool. The essence of it lies in the simple practice of everyone on the team integrating frequently, at least daily, against a controlled source code repository. This practice is called “Continuous Integration” (or it’s called “Trunk-Based Development”)." He says this a lot, but I never hear about the costs. Is there no amount of time lost on integrations that is too high a price? Is there no task that he doesn't break down into a million pieces in order to accommodate this style of work? Is there no efficiency lost by making each task into 1-hour chunks of coding that the entire team then integrates? Is that what we're doing now? "This will consist of both altering the product code, and also adding or changing some of the automated tests. During that time I run the automated build and tests frequently. After an hour or so I have the moon logic incorporated and tests updated." I'm quite fed up with reading this kind of optimistic bulls%!t. What kind of programmers are these who can accomplish major work in one hour? Or are the tasks that Fowler can conceive of all so simple that they can be accomplished in an hour? I'm very suspicious about these kinds of statements. It reminds me of game developers in the 90s talking about how they'd "written the whole engine in a weekend", but then the game still took five more years to deliver. "Some people do keep the build products in source control, but I consider that to be a smell - an indication of a deeper problem, usually an inability to reliably recreate builds. It can be useful to cache build products, but they should always be treated as disposable, and it's usually good to then ensure they are removed promptly so that people don't rely on them when they shouldn't." Sure. But -- priorities. Your product is not the pipeline. It's your product. You can't make everything a slave to the process. 
Remember to fix that which you can fix quickly, but to focus on your own priorities. Don't polish a build so that Martin Fowler is happy, if it's going to make your customers wait a lot longer for their release. "The tests act as an automated check of the health of the code base, and while tests are the key element of such an automated verification of the code, many programming environments provide additional verification tools. Linters can detect poor programming practices, and ensure code follows a team's preferred formatting style, vulnerability scanners can find security weaknesses. Teams should evaluate these tools to include them in the verification process." "Everyone Pushes Commits To the Mainline Every Day "No code sits unintegrated for more than a couple of hours." This feels completely divorced from reality, but maybe I just "don't get it." "If everyone pushes to the mainline frequently, developers quickly find out if there's a conflict between two developers. The key to fixing problems quickly is finding them quickly. With developers committing every few hours a conflict can be detected within a few hours of it occurring, at that point not much has happened and it's easy to resolve. Conflicts that stay undetected for weeks can be very hard to resolve." I agree with the last sentence, but at what cost? It feels like you're going to spend so much time committing and integrating. How is finding out if you have conflicts the highest-priority task your team has? "Full mainline integration requires that developers push their work back into the mainline. If they don't do that, then other team members can't see their work and check for any conflicts." Who finishes anything non-trivial in an hour? I can't escape the feeling that one-hour chunks is almost too granular, that this size was chosen because it aids integration. 
While that's a noble goal, I wonder how appropriate it is for many tasks, and to what degree the shape of the process affects the size of the solution set. "Since there's only a few hours of changes between commits, there's only so many places where the problem could be hiding. Furthermore since not much has changed we can use Diff Debugging to help us find the bug." But don't you waste time hunting bugs that would have gone away by themselves if the process weren't so frenetic? If you rebase everything, then you'll still encounter every integration conflict. If you merge, though, you can skip many of those interim integrations because subsequent changes might have obviated prior ones that might have caused conflicts. Instead of testing occasional versions, you end up testing absolutely everything you do as if it were a release candidate. I'm not convinced that there's no downside to that. I feel like it's a waste of time if applied so mindlessly. "Often people initially feel they can't do something meaningful in just a few hours, but we've found that mentoring and practice helps us learn." I don't know who you're working with, but I wonder how useful is that? How useful is it to tailor your entire process to ruthlessly chopping up your work into tiny segments? What if that's not how some people work? What if they can't learn? Fire 'em? "Continuous Integration can only work if the mainline is kept in a healthy state. Should the integration build fail, then it needs to be fixed right away. As Kent Beck puts it: “nobody has a higher priority task than fixing the build”." Your goal ends up being to run the process, rather than to build the product. This sounds more and more like a cult. "If the secondary build detects a bug, that's a sign that the commit build could do with another test. 
As much as possible we want to ensure that any later-stage failure leads to new tests in the commit build that would have caught the bug, so the bug stays fixed in the commit build." "A team should thus automatically check for new versions of dependencies and integrate them into the build, essentially as if they were another team member. This should be done frequently, usually at least daily, depending on the rate of change of the dependencies." This seems like another thing that becomes a higher priority than building the product itself. A daily dependency check seems like overkill, but it's automated, so who cares? He's just running builds all the time, like we don't have a climate crisis. "if we rename a database field, we first create a new field with the new name, then write to both old and new fields, then copy data from the existing old fields, then read from the new field, and only then remove the old field. We can reverse any of these steps, which would not be possible if we made such a change all at once. Teams using Continuous Integration often look to break up changes in this way, keeping changes small and easy to undo." "Virtual environments make it much easier than it was in the past to do this. We run production software in containers, and reliably build exactly the same containers for testing, even in a developer's workspace. It's worth the effort and cost to do this, the price is usually small compared to hunting down a single bug that crawled out of the hole created by environment mismatches." I agree with this part, without qualification. At least as a goal. "Being able to automatically revert also reduces a lot of the tension of deployment, encouraging people to deploy more frequently and thus get new features out to users quickly. Blue Green Deployment allows us to both make new versions live quickly, and to roll back equally quickly if needed, by shifting traffic between deployed versions." What about data schemas? 
What if you don't have a product that deploys on a web server or app store? I understand that there are solutions to this, but I wonder how great a fit they are for many teams? If your team is accustomed to SQL programming -- or if you already have a suite of products that use SQL databases -- then how worthwhile to your business is it to prioritize moving away from SQL to a local DB like "SQLite" , a NoSQL document store like "RavenDB" , or even to a completely different back-end like "Rama" ? "Continuous Integration effectively eliminates delivery risk. The integrations are so small that they usually proceed without comment. An awkward integration would be one that takes more than a few minutes to resolve." It sounds very much like it prioritizes eliminating delivery risk over all else. It is only applicable to products built in this way from the beginning. "Having to put work on a new feature aside to debug a problem found in an integration test [or] feature finished two weeks ago saps productivity." So does constantly integrating, though! It can be noise. It's like the noise of micro-reviewing AI responses. You have to figure out the sweet spot for your team and iterate toward that goal, always ensuring that your team can deliver even if the dream process is not already in place. Make a diagram of all the facets and discuss a plan for your project. Pragmatic. Realistic. 
"They found that elite teams deployed to production more rapidly, more frequently, and had a dramatically lower incidence of failure when they made these changes. The research also finds that teams have higher levels of performance when they have three or fewer active branches in the application’s code repository, merge branches to mainline at least once a day, and don’t have code freezes or integration phases." What if you don't have an elite team? "A two week refactoring session may greatly improve the code, but result in long merges because everyone else has been spending the last two weeks working with the old structure. This raises the costs of refactoring to prohibitive levels. Frequent integration solves this dilemma by ensuring that both those doing the refactoring and everyone else are regularly synchronizing their work." Some refactoring can't just be done in mini bites like that. Sometimes, you work on a POC that takes more time to verify. Now what? Throw it away and build it from scratch in bite-sized pieces? Or integrate a long-lived branch, which is verboten? I'm working on a sweeping change to the way solutions are configured. It involves changing packages and versions in four different solutions. Should I have merged to master everywhere and involved the whole team in my project? That sounds stupid. Sure, it takes longer to verify and integrate in one big chunk, but it has the advantage that it didn't make upgrading the solution format the number-one priority for all developers for a sprint or two. "[...] teams that spend a lot of effort keeping their code base healthy deliver features faster and cheaper. Time invested in writing tests and refactoring delivers impressive returns in delivery speed, and Continuous Integration is a core part of making that work in a team setting." For non-legacy projects. Continuous delivery can only really work for web-based products or apps. 
A lot of other products have to be deployed to processes that aren't as easy to update five times a day. "Continuous Integration is more suited for team working full-time on a product, as is usually the case with commercial software. But there is much middle ground between the classical open-source and the full-time model. We need to use our judgment about what integration policy to use that fits the commitment of the team." That is the first time that he's conceded that maybe there are use cases to which this whole article doesn't apply very well. "If a team attempts Continuous Integration without a strong test suite, they will run into all sorts of trouble because they don't have a mechanism for screening out bugs. If they don't automate, integration will take too long, interfering with the flow of development." No kidding. You need some serious test coverage to continuously integrate and deploy. I also wonder about the size of the product for which you can legitimately do this. Can you imagine if your test suite takes ten minutes to run and you integrate three or four times per day? Can you imagine how much time you're not developing software because you're integrating someone else's code? I understand that this happens eventually, but I wonder about the wisdom of prioritizing integration seemingly above all else. "Continuous Integration is about integrating code to the mainline in the development team's environment, and Continuous Delivery is the rest of the deployment pipeline heading to a production release." This is a good definition and I wonder that he rewrote this whole essay and didn't put this right at the top. "Continuous Integration ensures everyone integrates their code at least daily to the mainline in version control. Continuous Delivery then carries out any steps required to ensure that the product is releasable to product[ion] whenever anyone wishes. 
Continuous Deployment means the product is automatically released to production whenever it passes all the automated tests in the deployment pipeline." Also excellent definitions that make the distinction clear. Continuous Delivery is the one that many teams could strive for, even if they will never be able to do Continuous Deployment. The question is: at what cost? "Those who do Continuous Integration deal with this by reframing how code review fits into their workflow." Well, that's an interesting statement. Integration trumps review? Get your code in there and review later? Trust in your tests? Are you kidding me? You should review the design as well as the implementation. If everyone's coding, committing, and pushing within hours, when do they review? Is the idea to have people communicate with each other only when they've already built something?

Web Interop 2024

2024-02-11T22:33:58+01:00

Published by marco on 11. Feb 2024 22:33:58 ------------------------------------------------------------------------ The article "The web just gets better with Interop 2024" by Jen Simmons writes, "[Interop 2024] The Interop project aims to improve interoperability by encouraging browser engine teams to look deeper into specific focus areas. Now, for a third year, "Apple", "Bocoup", "Google", "Igalia", "Microsoft", and "Mozilla" pooled our collective expertise and selected a specific subset of automated tests for 2024. "Some of the technologies chosen have been around for a long time. Other areas are brand new. By selecting some of the highest priority features that developers have avoided for years because of their bugs, we can get them to a place where they can finally be relied on." When we complain about features that remain unimplemented in browsers, we also have to acknowledge that there’s only so much you can do with a given team. There are problems that are technically easier to solve than others. When we complain, we’re actually more concerned about the prioritization of issues. We want to be able to influence what gets fixed when, rather than just having to passively hope that the manufacturer eventually gets around to it. That's where the "Web Platform Tests" come in. The "Interop 2024" project follows iterations from "2023", "2022", and "2021", when it all started. Last year was a banner year. For CSS, "Subgrid, Container Queries, :has(), Motion Path, CSS Math Functions, inert and @property are now supported in every modern browser." For JavaScript, we got "Improved Web APIs include Offscreen Canvas, Modules in Web Workers, Import Maps, Import Assertions, and JavaScript Modules" across all modern browsers. These are all super-important features. 
For example, Import Assertions enable JSON imports, and Modules in Web Workers allow modern and modular programming, making it much easier to offload work, as one would with code running directly on modern operating systems. What's on the schedule for 2024? * Although there was a lot of progress made on CSS nesting last year, it's back on the radar this year to finalize the implementations. * @property will similarly be more polished, as the percentage support is still quite low in many browsers. * It's great to see accessibility improvements for many of these features -- like how sub-grids or display: contents affect element order -- as this means that we will get sites that are automatically accessible, as long as we build our sites logically. * Improvements to IndexedDB will make it easier to write powerful local-first applications (even though something like "Automerge" might be a better fit for apps offering concurrent or collaborative editing). * Browser- and standards-level support for popover with anchors is long overdue, as making usable tooltips and popups is an area fraught with custom code and half-baked solutions. It's nice to see this become an area where you'll no longer need custom JavaScript. * Relative Color Syntax continues the excellent trend of allowing us to write CSS without the support of a CSS preprocessor. With relative colors, dark/light theming support, CSS nesting, and CSS variables, I can't think of a reason I would use a CSS preprocessor anymore. I know some people have used them for so much more, but I've not done so, so my needs are already covered, even without this extension that allows conversion between colorspaces. * @starting-style will fill a gap in CSS that finally allows sites to indicate how an element will transition from or to display: none. See the original article for much more detail.

SourceLink and external sources

2024-01-08T09:50:50+01:00

Published by marco on 8. Jan 2024 09:50:50 Updated by marco on 9. Jan 2024 11:04:29 ------------------------------------------------------------------------ I published a very similar version of the following article in the DevOps Wiki at "Uster Technologies AG" . Since nearly all of that post is general knowledge that I would have been happy to find before I started my investigations, I'm sharing it here. [Overview] When we think about navigating or debugging our code, we usually focus on the code we've written ourselves—local sources in our file system. IDEs have classically focused on being able to debug and navigate this code. More and more, though, we're also interested in navigating and debugging our versioned and compiled dependencies: * Internal NuGet packages * External NuGet packages * The Base Class Library (BCL) Most of these are available as source code. We would ideally like to be able to navigate and debug that code just as easily as we can our own. The following sections define file types and terminology, and then explain how these concepts apply to debugging and navigation for external sources. You can also just jump to the sections on "producing" <#producing-packages> or "consuming" <#consuming-packages> packages (especially as relates to "authentication for private sources" <#authentication-fails>). [Diagram] The following diagram provides an overview of the process of obtaining external packages, along with their symbols and source files. It looks quite complicated, but accommodates the flexibility required by various stakeholders. [image] [File types] There are several types of files associated with debugging and navigation: DLL The executable code generally only includes executable code (instructions). It can include debugging information, but this is relatively rare. PDB The "program database" contains the symbol information for the executable code, which allows the debugger to map instructions back to the source code.

This includes aliases (symbols), file positions, and any other necessary mappings, including references to source code files.

Although invented by Microsoft, the PDB is an industry-standard, platform-independent, and language-independent format. See, e.g., "LLVM's The PDB File Format documentation" or "the Wikipedia entry" . XML The optionally generated XML documentation. Some IDEs can use this file to enhance the developer experience while browsing the source code. *.cs The original source code [Design Considerations] It's reasonable to ask why this process is so complex. [Why can't the nupkg just include the PDB and the *.cs files?] The system was designed for use cases where most sources were closed. That has changed, but the system still reflects the original design choices. The PDB files can also add about 30% to the size of the package. The original use cases preferred to avoid using 30% more space for package downloads that didn't need the debugging information. [Why aren't sources included in the PDB?] Again, historically, the use cases were for providing improved stack traces with symbols, but not to provide access to closed sources. Even if the sources are partially open, access may be restricted to only some users of the packages or symbols. Having the IDE request the sources separately allows an additional authorization phase. [What about open-source?] The defaults still reflect the original use cases, which actually represent fewer and fewer packages as time goes on. These answers aren't particularly satisfying if your use case happens to be "make a package that has symbols for excellent stack traces and sources for excellent debugging". At least we now have IDEs that know how to work with this system and there is a lot of automation for "producing packages" <#producing-packages> with the desired symbol and source-code support. 
[Terminology] [Debugging] A developer debugs source code by interrupting execution of a program—either manually or by setting breakpoints—and then stepping through the instructions, examining the contents of symbols (variables) to investigate the runtime behavior and operation of the system. The debugger uses the PDB to allow source-level debugging, i.e. debugging in the original source code. While debugging in "lower" formats is possible, it's not nearly as reliable as being able to step through the code in the original source code, using the original symbols. How does the debugger obtain the PDB for a given DLL? * First it searches in the same directory. This is by far the most common location where symbols will be found. * Next, it searches on all known "symbol servers" in the order that they're declared. All DLLs and PDBs have unique identifiers that make it possible to request and download the correct file. Once the debugger has the PDB, it has everything it needs—except the source code. [Local sources] If the PDB was generated locally, then it most likely references the source files that are still in the same locations in the file system as when it was built. In that case, the debugger easily finds the source files because they're just at the paths that are directly referenced by the PDB. If the PDB was not generated locally or the source-code paths do not match, then there are other tricks to find the source files. Visual Studio allows you to set "Directories containing source code" for the "Debug Source Files" [image] [External sources] If the sources aren't available locally, e.g., for a NuGet package, then there is a system called "SourceLink" that is extremely well-supported in the .NET world that makes it possible to easily download the source files that generated a DLL and that are referenced by its PDB. Things to be aware of: * The package must have been built with SourceLink enabled (see "producing packages" <#producing-packages>). 
* The sources must be available for download in a known format and structure (e.g., Azure Git Repos). * The IDE must know how to download, cache, and use the sources for debugging or navigation. If the package does not support SourceLink, but the sources are available, then you can download the sources locally and use the solution-level mapping above to tell the debugger where the source files are. You can also just point the debugger to the top-level folder when it asks for the file's location, in which case the debugger makes the entry for you. [Navigation] A developer navigates by requesting the source code for a symbol. For example, if the declared type of a variable in an open source file is the class Setting, then the developer can ask the IDE to show the source of Setting by Ctrl + clicking, by pressing F12 in Visual Studio, or by pressing Ctrl + B in Rider. As with debugging, navigating local sources is straightforward, since the sources are in the local file system. For symbols in NuGet packages, the IDE has to be clever enough to download, cache, and use the sources. Visual Studio on its own does not support navigating to external sources via "SourceLink" <#sourcelink>. Instead, it always decompiles external sources, as shown in the example below. [image] If you have ReSharper installed, then the default setting is to try as hard as possible to avoid showing a decompiled version. You can also add "Folder Substitutions" in the "Advanced Symbol options..." for navigating to "External Sources". The option does not seem to be available in Rider. [image] [SourceLink] SourceLink is a system that provides source files for external sources like NuGet packages for debugging or navigation. If this isn't working, then either the external sources cannot be provided or the client is not properly configured for debugging. 
See "below for troubleshooting information" <#troubleshooting>, especially as relates to "authentication" <#authentication-fails> for packages and source code pulled from authenticated locations. [Decompiled code] A decompiled version of the source code is a reconstruction of the original source from the instructions and information in the DLL and PDB. When sources cannot be located for a given symbol, Visual Studio, ReSharper, and Rider will produce a decompiled version as a fallback. This is often good enough to be able to read the code reasonably well, but it leaves certain common constructs in their "lowered" format. E.g., calls to extension methods appear as static-method calls rather than as calls on the first parameter. This can make debugging difficult, as the instructions don't match the mapping. Rider has support for patching the PDB on-the-fly to allow more comfortable debugging of decompiled sources. This is, however, a fallback solution for external packages over which you have no control. It's best to configure your packages to publish with symbols and sources available to IDEs that support them, as shown in the next section. [Producing packages] The documentation for "Enable debugging and diagnostics with Source Link" is thorough and tells you all you need to know about all of the options. If you're working with Azure DevOps Services, you should include a package reference to "Microsoft.SourceLink.AzureRepos.Git" in your project. With this, you're all set. The package is published to Azure Artifacts, with a corresponding snupkg available on the Azure symbol server and sources available via the repository URL (subject to authorization; see "below" <#authentication-fails> for troubleshooting). [Additional Properties] You can set a few optional properties, detailed below. Most projects won't need to set these, but they are included to spare you the research if you see them in code examples, either in your institution's code or online. 
As noted, the only line you need is the package reference shown above. "EmbedAllSources": Embeds all project source files into the generated PDB. "EmbedUntrackedSources": Embeds anything that's not included in source control (kind of unclear what they're talking about here, though); included in IncludeSymbols. "PublishRepositoryUrl": Ensures that the URL of the repository supplied by the CI server or retrieved from the source-control manager is available in the package information. This is off by default to prevent discovery of private URLs, but it doesn't really matter for packages published from private sources, as they are protected by Azure DevOps (or whatever) authorization. "IncludeSymbols": Indicates that the PDB should be generated and included either with the package (if DebugType is set to embedded) or in a separate symbol package (if SymbolPackageFormat is set to snupkg). This is implied when the NuGet package "Microsoft.SourceLink.AzureRepos.Git" is included, as shown below. "SymbolPackageFormat": Indicates which package format to use. This is set to snupkg when the NuGet package "Microsoft.SourceLink.AzureRepos.Git" is included, as shown below. See the "SourceLink documentation" for more details. Among other details, they also note that projects that target .NET 8 no longer need to include this support explicitly because Azure Repos are supported by default, as detailed in the "readme for the SourceLink project". "If your project uses .NET SDK 8+ and is hosted by the above providers (GitHub, Azure Repos, GitLab, BitBucket) it does not need to reference any Source Link packages or set any build properties." [Conditional packaging] You can also include the packaging conditionally in the Directory.Build.Targets, as shown below. See the appendix for "Directory.Build.Props and Directory.Build.Targets" <#direction-build-props-and-directory-build-targets> for more information about which variables and directives are respected in which file. 
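The conditional packaging described above might look something like the following sketch. The original example is not reproduced here, so the condition (Release builds only) and the package version are assumptions for illustration rather than the author's actual configuration.

```xml
<!-- Directory.Build.Targets (sketch; the condition and version are assumptions) -->
<Project>
  <!-- Hypothetical rule: reference SourceLink only for Release builds. -->
  <ItemGroup Condition="'$(Configuration)' == 'Release'">
    <!-- PrivateAssets="All" keeps this build-time package out of consumers' dependencies. -->
    <PackageReference Include="Microsoft.SourceLink.AzureRepos.Git" Version="8.0.0" PrivateAssets="All" />
  </ItemGroup>
</Project>
```

Per the readme quoted above, projects built with .NET SDK 8 or later and hosted in Azure Repos should not need this reference at all.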
[Consuming packages] [For debugging] If a package has SourceLink enabled and you have access to the online repository from which it was built, then to seamlessly debug into that source code, ensure the following: * "Disable Just My Code" <#disable-_just-my-code_> * "Check that the PDB is available" <#is-it-available%3F> [For navigation] As "noted above" <#navigation>, Visual Studio doesn't support navigating via "SourceLink" <#sourcelink>. To browse external sources with JetBrains tools, ensure the following: * "Check that the PDB is available" <#is-it-available%3F> * "Set up Authentication" <#authentication-fails> [Troubleshooting] [Symbols not loaded] [Disable Just My Code] Once you're sure that the "package supports SourceLink", then you should also make sure that the "Just My Code" setting is disabled. When Just My Code is enabled, the debugger skips over any code that doesn't correspond to source code in one of the local projects. [Is it available?] * Does the package you've downloaded actually include symbols (a .pdb file next to the .dll file)? * If the PDB is not included with the package, is it available on a "Symbol Server"? * If it is included, is it being copied into the output folder with the DLL? If it's available in the package, but is not being copied to the output folder, then if you're using .NET 7.0 SDK or higher, you can set the build property named "CopyDebugSymbolFilesFromPackages" to true in the project file. [Manually load the module] Verify that the symbols for the module you're trying to debug have been loaded. If they aren't loaded, you can try to "load symbols while debugging". For more details and a screenshot, see "Just My Code debugging". [Decompiling rather than downloading] If you're trying to navigate in code, but ReSharper or Rider keeps decompiling instead of getting the sources from "SourceLink" <#sourcelink>, then check your "External Sources" settings in ReSharper or Rider. 
Verify that the tool is configured to check for external sources before it tries decompiling. If the IDE is having trouble "authenticating" <#authentication-fails>, then you will usually see a decompiled version instead. Sometimes the code is so close to the original that it's hard to tell; scroll to the top to see if it includes the "decompiled by JetBrains..." header. Once the IDE has decompiled a source file, it will continue to use this cached copy until you close the tab, or sometimes you have to close and re-open the project. If you're troubleshooting your way through this setup, then you can temporarily disable decompilation as a fallback, which avoids producing the unwanted source-code variant in the first place. [Authentication fails] Visual Studio uses the authentication associated with the logged-in user that you use to enable the IDE. This can be in a weird state if you've recently changed your password or your authentication token is stale or in a non-refreshable state. Try logging out and back in. JetBrains tools (Rider, ReSharper, DotPeek, etc.), on the other hand, need to be given a token. [Configure from the notification] If the tool shows a notification indicating that authentication has failed, then do the following: * Click Configure on the notification to show a dialog * In the resulting dialog, set: * User name to your Azure login, e.g., john.doe@example.com * Token to an "Azure PAT" (click for instructions on how to create one) * Press the Test button to verify that it works (you should see OK 200) * Press Ok to save the credentials [Bug in JetBrains tools] However, there is a bug whereby JetBrains tools fail to show a notification or offer a way to enter credentials. [1] That's going to look something like this: [image] It claims that it can download the source, but it never completes. You have to cancel the dialog. 
If you then look at the ReSharper Output, then you'll see something like this: [image] The relevant text is at the end of the third line, which indicates that the request for the source file returned a "Non-OK HTTP status code".

PdbNavigator: Searching for 'Example.Core.AppConfig.AppConfigKeyAttribute' type sources in C:\Users\john.doe\.nuget\packages\example.core.appconfig\4.1.0\lib\netstandard2.0\Example.Core.AppConfig.pdb
PdbNavigator: File names (1) are inferred for type Example.Core.AppConfig.AppConfigKeyAttribute
PdbNavigator: Downloader: https://dev.azure.com/example/example.Core/_apis/git/repositories/Example.Core.LabInstruments/items?api-version=1.0&versionType=commit&version=8b34c2aa672facd47e835c27152f695fa796a408&path=/Example.Core/DotNetStandard/Example.Core.AppConfig/AppConfigKeyAttribute.cs -> Non-OK HTTP status code

[Configure from the Credentials Manager] The most reliable way to fix this is to create the credentials in the Credential Manager. Be aware that you will need to "create an Azure PAT (personal access token)". * Open Credentials Manager * Switch to Windows Credentials * Scroll until you see JetBrains SourceLink https://dev.azure.com/exampleOrganization [image] If you don't have this entry, then that's the problem. If you have it, but you still can't get the sources, then edit the entry to have valid credentials. To create or edit the record, do the following from the Credentials Manager: * Press Add a generic credential [image] * In the resulting form, set: * Internet or network address to JetBrains SourceLink https://dev.azure.com/exampleOrganization * User name to your Azure login, e.g., john.doe@example.com * Password to an "Azure PAT" (click for instructions on how to create one) 💡 There is no need to restart the JetBrains tools. You will immediately be able to load sources from "SourceLink" <#sourcelink> once valid credentials exist. 
[Alternative: referencing projects, not packages] As you can see above, although publishing a package is relatively straightforward, there are quite a few stumbling blocks on the way to consuming the package for navigation and debugging. Once you have everything set up and working, it's great, but ... there is still one other drawback. You can't edit the code for packages. This is not optimal. Optimally, we'd like to quickly verify that a change to upstream code would address an issue in downstream code without having to generate new packages. It would be great to just edit the upstream code as if it were part of your downstream solution until you're sure that the change would address your downstream issue. At that point, you can copy the changes back to the upstream solution (where the dependency is produced), add tests, and produce a new version, being pretty certain that the change is effective. The shortest possible developer-feedback loop with code in external packages is: * Determine that you need to make a change to code in an external package * Open the solution for that package * Make the change * Build the assembly * Drop it into your output folder (along with the PDB) * Build and run your solution with the updated code If your package has dependencies or your change in the external package's solution touches multiple packages, then you can do the following: * Build packages for the solution locally * Set up a NuGet source that points to that folder * Update to the newer versions of the packages and restore from that source * Build and run your solution with the updated code If it gets too complicated to do locally, then you can always commit, push, and have the CI generate new versions of your packages (hopefully with a prerelease version, e.g., 3.2.4-preview2). The solutions outlined above have a reasonable turnaround time, but sometimes you want to pretend that the external packages are just internal projects instead. 
This basically entails: * Downloading the project or projects corresponding to the packages that you want to be able to edit * Including those projects into your solution * Replacing the external package-references with project references At that point, you can edit, debug, and navigate the code as if it were your own. See the "Project Munging with Tools & PowerShell" section of "How to Debug NuGet Packages with Symbols and Source Link Painlessly" for a PowerShell script that can help you automate part of this. [Directory.Build.Props and Directory.Build.Targets] MSBuild supports including common configuration in project files. While earlier versions required all configuration to be included explicitly, modern versions include configuration files with special names automatically, greatly simplifying common configuration and reducing clutter in project files. [Mechanics] If the file is named Directory.Build.Props or Directory.Build.Targets, it is picked up automatically and included for all projects in that folder or any subfolder. If you use a different name, then you have to explicitly reference that file from a project or from another *.props or *.targets file. If you choose your own name, you don't have to use the *.props or *.targets extension convention, but it's strongly recommended, to avoid confusion. [Directory.Build.Props] You can use a "Directory.Build.Props" file to include settings for all projects in a folder or set of subfolders. For example, a package reference that applies to all projects, like the one for SourceLink, can and should be included in Directory.Build.Props. [Directory.Build.Targets] If you want to include settings conditionally based on build configuration (e.g., Configuration or Platform), then you'll have to use the "Directory.Build.Targets" file, which has access to those variables. [Rules-of-thumb] * Once you're using the SDK-style format for project files [2], you can aggressively consolidate common settings in a Directory.Build.Props file at the root of the solution. 
* If you have groups of projects to which different settings should be applied, then consider splitting those off into corresponding subfolders (e.g., "Tests") so that you can apply those common settings with a configuration file that applies only to that folder. * If you can't or don't want to move projects into subfolders, then you can create a custom `props` file and manually include it in the project files that need it. -------------------------------------------------------------------------------- [1] After having figured out a workaround, I felt well-equipped enough to file a bug with JetBrains: "ReSharper does not ask for authentication when browsing to source from symbol files in assembly explorer". After a couple of days, the responsible developer changed the status from "triage" to "open" and he linked a two-year-old bug report to it: "Pdb files cannot be downloaded from Azure DevOps Symbol Server". Would you like to guess who wrote that bug report? Yours truly. I knew I'd had trouble in this area before, but I'd completely forgotten that I'd reported the bug in such detail. It's still open. Maybe they'll finally address it. [2] This also works for the older project format, but it's hard to keep Visual Studio from repopulating properties in that format. You can use the SDK-style format for nearly all projects these days. The conversion is worth it.

Learning how to use GenAI as a programming tool

2023-12-30T22:46:09+01:00

Published by marco on 30. Dec 2023 22:46:09 ------------------------------------------------------------------------ [image]The article "Exploring Generative AI" by Birgitta Böckeler is chock-full of helpful tips from eight newsletters totaling 25 pages that she wrote throughout 2023. I include some of my own thoughts, but most of this article consists of citations. A lot of my analysis and notes boils down to: you need to know what you're doing to use these tools. They can help you build things that you don't understand, but it's not for medium- or long-term solutions. I've written a lot more about the need for expertise in "How important is human expertise?" "The following are the dimensions of my current mental model of tools that use LLMs (Large Language Models) to support with coding. "Assisted tasks" * Finding information faster, and in context * Generating code * “Reasoning” about code (Explaining code, or problems in the code) * Transforming code into something else (e.g. documentation text or diagram) "These are the types of tasks I see most commonly tackled when it comes to coding assistance, although there is a lot more if I would expand the scope to other tasks in the software delivery lifecycle." "In this particular case of a very common and small function like median, I would even consider using generated code for both the tests and the function. The tests were quite readable and it was easy for me to reason about their coverage, plus they would have helped me remember that I need to look at both even and uneven lengths of input. However, for other more complex functions with more custom code I would consider writing the tests myself, as a means of quality control. Especially with larger functions, I would want to think through my test cases in a structured way from scratch, instead of getting partial scenarios from a tool, and then having to fill in the missing ones." 
"The tool itself might have the answer to what’s wrong or could be improved in the generated code - is that a path to make it better in the future, or are we doomed to have circular conversation with our AI tools?" "[...] generating tests could give me ideas for test scenarios I missed, even if I discard the code afterwards. And depending on the complexity of the function, I might consider using generated tests as well, if it’s easy to reason about the scenarios." "For the purposes of this memo, I’m defining “useful” as “the generated suggestions are helping me solve problems faster and at comparable quality than without the tool”. That includes not only the writing of the code, but also the review and tweaking of the generated suggestions, and dealing with rework later, should there be quality issues." * [...] * Boilerplate: Create boilerplate setups like an ExpressJS server, or a React component, or a database connection and query execution. * Repetitive patterns: It helps speed up typing of things that have very common and repetitive patterns, like creating a new constructor or a data structure, or a repetition of a test setup in a test suite. I traditionally use a lot of copy and paste for these things, and Copilot can speed that up. Interesting. I've just always used the existing templates or made my own expansion templates. At least then it makes exactly what I want -- and even leaves the cursor in the right position afterwards. Another thought I had is that the kind of programmer that this helps doesn't use any generalization for common patterns. Otherwise, the suggestions wouldn't be useful because they can't possibly take advantage of those highly specialized patterns. Or maybe they can, if they're included in the context. It seems unlikely, if only because the sample size is too small to be able to influence the algorithm sufficiently. But maybe enough weight can be given to the immediate context to make that work somehow. 
At that point, though, you're just spending all of your time coaxing your LLM copilot into building the code that you already knew you wanted. This practice seems like it would end up discouraging generalization and abstraction -- unless it can grok your API (as I've noted above). This is an age-old problem that is maybe solved, once and for all. The problem is that when you generalize a solution, it becomes much easier, more efficient, and more economical to maintain, but it can end up being more difficult to understand. If the API is well-made and addresses a problem domain with a complexity that the programmer is actually capable of understanding, then the higher-level API may be easier to use, and perhaps even maintain. However, a non-generalized solution is sometimes easier for a novice or less-experienced programmer to understand and extend. It's questionable whether you'd want your code being extended and maintained by someone who barely -- or doesn't -- understand it, but that situation is sometimes thrust on teams and managers. "This autocomplete-on-steroids effect can be less useful though for developers who are already very good at using IDE features, shortcuts, and things like multiple cursor mode. And beware that when coding assistants reduce the pain of repetitive code, we might be less motivated to refactor." "You can use a coding assistant to explore some ideas when you are getting started with more complex problems, even if you discard the suggestion afterwards." "The larger the suggestion, the more time you will have to spend to understand it, and the more likely it is that you will have to change it to fit your context. Larger snippets also tempt us to go in larger steps, which increases the risk of missing test coverage, or introducing things that are unnecessary." On the other hand, "[...] 
when you do not have a plan yet because you are less experienced, or the problem is more complex, then a larger snippet might help you get started with that plan." This is not unlike using StackOverflow or any other resource. There's no getting around knowing what you're doing, at least a little bit. You can't bootstrap without even a bootstrap. "Experience still matters. The more experienced the developer, the more likely they are to be able to judge the quality of the suggestions, and to be able to use them effectively. As GitHub themselves put it: “It’s good at stuff you forgot.” This study even found that “in some cases, tasks took junior developers 7 to 10 percent longer with the tools than without them.”" "Using coding assistance tools effectively is a skill that is not simply learned from a training course or a blog post. It’s important to use them for a period of time, experiment in and outside of the safe waters, and build up a feeling for when this tooling is useful for you, and when to just move on and do it yourself." This is just like any other tool. There is no shortcut to being good at something complex. The only tasks for which there are shortcuts are the non-complex ones. In that case, you should be asking yourself why your solutions involve so much repetitive programming. "We have found that having the right files open in the editor to enhance the prompt is quite a big factor in improving the usefulness of suggestions. However, the tools cannot distinguish good code from bad code. They will inject anything into the context that seems relevant. (According to this reverse engineering effort, GitHub Copilot will look for open files with the same programming language, and use some heuristic to find similar snippets to add to the prompt.) As a result, the coding assistant can become that developer on the team who keeps copying code from the bad examples in the codebase." 
That will be so much fun, especially if you can get an echo chamber of lower-skilled programmers approving each other's pull requests. 😉 "We also found that after refactoring an interface, or introducing new patterns into the codebase, the assistant can get stuck in the old ways. For example, the team might want to introduce a new pattern like “start using the Factory pattern for dependency injection”, but the tool keeps suggesting the current way of dependency injection because that is still prevalent all over the codebase and in the open files. We call this a poisoned context , and we don’t really have a good way to mitigate this yet." "Using a coding assistant means having to do small code reviews over and over again. Usually when we code, our flow is much more about actively writing code, and implementing the solution plan in our head. This is now sprinkled with reading and reviewing code, which is cognitively different, and also something most of us enjoy less than actively producing code. This can lead to review fatigue, and a feeling that the flow is more disrupted than enhanced by the assistant." "Automation Bias is our tendency “to favor suggestions from automated systems and to ignore contradictory information made without automation, even if it is correct.” Once we have had good experience and success with GenAI assistants, we might start trusting them too much." "[...] once we have that multi-line code suggestion from the tool, it can feel more rational to spend 20 minutes on making that suggestion work than to spend 5 minutes on writing the code ourselves once we see the suggestion is not quite right." "Once we have seen a code suggestion, it’s hard to unsee it, and we have a harder time thinking about other solutions. That is because of the Anchoring Effect, which happens when “an individual’s decisions are influenced by a particular reference point or ‘anchor’”. 
So while coding assistants’ suggestions can be great for brainstorming when we don’t know how to solve something yet, awareness of the Anchoring Effect is important when the brainstorm is not fruitful, and we need to reset our brain for a fresh start." "The framing of coding assistants as pair programmers is a disservice to the practice, and reinforces the widespread simplified understanding and misconception of what the benefits of pairing are." "Pair programming however is also about the type of knowledge sharing that creates collective code ownership, and a shared knowledge of the history of the codebase. It’s about sharing the tacit knowledge that is not written down anywhere, and therefore also not available to a Large Language Model. Pairing is also about improving team flow, avoiding waste, and making Continuous Integration easier. It helps us practice collaboration skills like communication, empathy, and giving and receiving feedback. And it provides precious opportunities to bond with one another in remote-first teams." "LLMs rarely provide the exact functionality we need after a single prompt. So iterative development is not going away yet. Also, LLMs appear to “elicit reasoning” (see linked study) when they solve problems incrementally via chain-of-thought prompting. LLM-based AI coding assistants perform best when they divide-and-conquer problems, and TDD is how we do that for software development." "Some examples of starting context that have worked for us:" * ASCII art mockup * Acceptance Criteria * Guiding Assumptions such as: * “No GUI needed” * “Use Object Oriented Programming” (vs. Functional Programming) "For example, if we are working on backend code, and Copilot is code-completing our test example name to be, “given the user… clicks the buy button”, this tells us that we should update the top-of-file context to specify, “assume no GUI” or, “this test suite interfaces with the API endpoints of a Python Flask app”." 
"Copilot often fails to take “baby steps”. For example, when adding a new method, the “baby step” means returning a hard-coded value that passes the test. To date, we haven’t been able to coax Copilot to take this approach." Knowing a bit about how LLMs work, there's no way you really could train it to do TDD, because it's an iterative process. It doesn't know what TDD is, nor does the way it's built have any mechanism for learning how to do it. Nor does it know what coding is, for that matter. It's just a really, really good guesser. Everything it does is hallucination. It's just that some of it is useful. "As a workaround, we “backfill” the missing tests. While this diverges from the standard TDD flow, we have yet to see any serious issues with our workaround." Changing how you program because of the tool is something you should do deliberately. This is a slippery slope. "For implementation code that needs updating, the most effective way to involve Copilot is to delete the implementation and have it regenerate the code from scratch. If this fails, deleting the method contents and writing out the step-by-step approach using code comments may help. Failing that, the best way forward may be to simply turn off Copilot momentarily and code out the solution manually." Jaysus. That's pretty grim. "The common saying, “garbage in, garbage out” applies to both Data Engineering as well as Generative AI and LLMs. Stated differently: higher quality inputs allow for the capability of LLMs to be better leveraged. In our case, TDD maintains a high level of code quality. This high quality input leads to better Copilot performance than is otherwise possible." "Model-Driven Development (MDD). We would come up with a modeling language to represent our domain or application, and then describe our requirements with that language, either graphically or textually (customized UML, or DSLs). 
Then we would build code generators to translate those models into code, and leave designated areas in the code that would be implemented and customized by developers." "That unreliability creates two main risks: It can affect the quality of my code negatively, and it can waste my time. Given these risks, quickly and effectively assessing my confidence in the coding assistant’s input is crucial." "Can my IDE help me with the feedback loop? Do I have syntax highlighting, compiler or transpiler integration, linting plugins? Do I have a test, or a quick way to run the suggested code manually?" "I have noticed that in CSS, GitHub Copilot suggests flexbox layout to me a lot. Choosing a layouting approach is a big decision though, so I would want to consult with a frontend expert and other members of my team before I use this." That's because you care about architecture. Review was always important, but more so when code is being written by something you never hired. "How long-lived will this code be? If I’m working on a prototype, or a throwaway piece of code, I’m more likely to use the AI input without much questioning than if I’m working on a production system." "[...] it’s also good to know if the AI tool at hand has access to more information than just the training data. If I’m using a chat, I want to be aware if it has the ability to take online searches into account, or if it is limited to the training data." "To mitigate the risk of wasting my time, one approach I take is to give it a kind of ultimatum. If the suggestion doesn’t bring me value with little additional effort, I move on. If an input is not helping me quick enough, I always assume the worst about the assistant, rather than giving it the benefit of the doubt and spending 20 more minutes on making it work." "GitHub Copilot is not a traditional code generator that gives you 100% what you need. But in 40-60% of situations, it can get you 40-80% of the way there, which is still useful. 
When you adjust these expectations, and give yourself some time to understand the behaviours and quirks of the eager donkey, you’ll get more out of AI coding assistants."

AOT, JIT, and PGO in .NET

2023-12-15T13:15:17+01:00

Published by marco on 15. Dec 2023 13:15:17 ------------------------------------------------------------------------ The latest video by Nick Chapsas has a more-than-usually clickbait-y headline. The "big" problem that NativeAOT has is that it's 4% slower during runtime than the JIT-compiled version. [media] That doesn't seem like such a big problem to me, when the point of AOT is to improve cold-start times for applications launched on-demand. For that use-case, AOT shines. It's over 4x faster on startup than the JIT-compiled version. It's incredibly impressive that JIT-compilation takes less than 1/10 of a second, but it's still 4x slower than AOT. [image] So, you get the app started 4x faster, but it then performs 4% more slowly than the non-AOT version. It really depends on the use-case, but it's great for the common one of starting a server to answer a function call -- think Azure Functions or AWS Lambdas -- and then shutting down again, possibly immediately. "Damian P Edwards" (Principal Architect at Microsoft) commented on the post, "[There are a] few things that cause the slightly lower performance in native AOT apps right now. First (in apps using the web SDK) is the new DATAS Server GC mode. This new GC mode uses far less memory than traditional ServerGC by dynamically adapting memory use based on the app's demands, but in this 1st generation it impacts the performance slightly. The goal is to remove the performance impact and enable DATAS for all Server GC apps in the future. "Second is CoreCLR in .NET 8 has Dynamic PGO enabled by default, which allows the JIT to recompile hot methods with more aggressive optimizations based on what it observes while the app is running. Native AOT has static PGO with a default profile applied and by definition can never have Dynamic PGO. "Thirdly, JIT can detect hardware capabilities (e.g. CPU intrinsics) at runtime and target those in the code it generates. 
Native AOT however defaults to a highly compatible target instruction set which won't have those optimizations but you can specify them at compile time based on the hardware you know you're going to run on. "Running the tests in [the] video with DATAS disabled and native AOT configured for the target CPU could improve the results slightly." To summarize:

1. The DATAS GC mode is in-use for AOT, but still being fine-tuned.
2. An AOT-compiled app cannot benefit from dynamic PGO. It benefits from static PGO, but cannot recompile itself on-the-fly because it doesn't have a JIT compiler to do so. The JIT-compiled app can dynamically recompile what it observes as performance hotspots with more highly optimized code. I wrote a bit about how Safari does something similar for JavaScript in "Optimizing compilation and execution for dynamic languages" -- although for JavaScript, dynamic recompilation is sometimes necessary for backing out of an incorrect assumption about what type a variable is going to have.

As well, a JIT-compiled app can take actual hardware capabilities into account, while an AOT-compiled app necessarily targets a static hardware profile. The generic hardware profile is going to be extremely conservative about capabilities because if it assumes a capability that doesn't exist, the app simply won't run. Choosing a hardware profile for AOT that matches the target hardware would boost performance.

I guess that was more of a rephrasing, rather than a summary. Anyway, another commenter asked, "[...] would it be possible in the future for a JIT application with Dynamic PGO that has run for a while and has made all kinds of optimizations to then create a "profile" of sorts that could be used by the Native AOT compiler to build an application that is both fast in startup time and highly optimized for a given workload?" Yes. That should be possible. 
It's unclear what sort of extra performance boost this would give, especially if you'd already fine-tuned the target hardware profile -- which is the first thing you should do. I could imagine adding this sort of profiling as a compilation step, though. You always have to be careful, though, whenever you're running something in production that is different from what you've tested. We put a lot of faith in the JIT and dynamic PGO, don't we? I wanted to also note that, at the end of the video, Chapsas showed Microsoft's numbers, which confirm the performance drop, but also show an over 50% reduction in working set! Dude! How do you not mention that!? The app uses less than half of the memory and runs almost as fast? Yes, please! That's a huge win for people paying for cloud-based services. For once, I'm somewhat surprised to see how naive Nick's take is -- that a 4% drop in performance is at all significant, especially when the "slow" version is still processing 50,000 requests per second in a performance-constrained environment. He did mention a trade-off, but was very excited to tell people that AOT is slower during runtime. There are always trade-offs and you should be very aware of the actual non-functional requirements for your application before you decide whether to use a technology or not. For 99.9% of applications, the 4% drop in performance vis-à-vis a JIT-compiled version won't be the deciding factor. When it's accompanied by a working set that's only 1/2 the size, then it becomes an even more attractive target.

How to replace "warnings as errors" in your process

2023-12-15T11:52:23+01:00

Published by marco on 15. Dec 2023 11:52:23 Updated by marco on 27. Jan 2025 14:25:10 ------------------------------------------------------------------------ A build started failing after a commit. However, the errors had nothing to do with the changes in the commit. A little investigation revealed that the cloud agent had started using a newer version of the build tool that included an expanded set of default warnings. These warnings started appearing first on CI because developers hadn't had the chance to update their tools yet. The "warnings as errors" setting turned what would have been a build with a few extra warnings into a failing build that prevented a developer from being able to apply completely unrelated changes. The setting allowed new, unrelated, and irrelevant warnings to push their way to the top of the priority queue. 👉 tl;dr: I don't think we should use the "warnings as errors" setting anymore. You can get the same benefit -- and even more -- by using newer, finer-grained configuration options. [What are we trying to accomplish?] This section wasn't included in my original draft of this essay. It only occurred to me in the shower that this is the real reason why I wrote a ten-page essay to answer a teammate's question in a PR review. In hindsight, it's obvious: to answer whether we should re-enable the "warnings as errors" setting, we should first think about what doing so would accomplish. What need does it fulfill? The rest of this essay meanders drunkenly along a path toward what I hope is a reasonable answer. [My team doesn't care about warnings] I understand the sentiment. You're in a team that never, or rarely, looks at warnings. You've given up on teaching them how to look at warnings and keep them fixed. Fine. You just make every warning an error and now they absolutely have to fix everything. Problem solved. [My team now only cares about warnings] Except it isn't, is it? Not really. 
What you've now done is ensured that your team will be constantly fixing errors that aren't really errors at times when they wouldn't want or need to be doing so. It’s like forcing people to wash their dishes while they’re still eating their meal. [1] Don't make me waste time pretty-printing code that I'm still writing! How annoying is it when you can't run a test because your comment has an extra line below it? Are you kidding me? [2] [My team is not homogenous [3]] When the runtime changes or any external dependencies change, you'll have to fix all of the new warnings because you're making them into errors. Instead of leaving yourself time to properly address the warnings, fixing them immediately becomes your highest priority because you can no longer work. The practice leaves your team in a position where it might lose the runtime lottery. When the runtime changes, you won't have control over which person will be forced to make those fixes so they can continue working. It may be the wrong person in your team, someone who’s going to choose the quickest way to get compiling again rather than the best way to incorporate the new errors and warnings and inspections. -------------------------------------------------------------------------------- [1] I added this sentence on January 1st, 2025 because I thought of it sometime over winter break and thought it was a good mental image. [2] There are ways to configure formatting automatically to reduce incidences of these. Those ways are discussed a bit below. [3] I added this section on January 1st, 2025 as well. [How much do you trust your team?] If your team does care about warnings, then, ... why do you need to make them errors? Before handcuffing developers with a setting, think about whether there isn't a trust problem first. Are you addressing a symptom rather than the cause? 
While it's possible that applying handcuffs is the best possible solution in your case, consider that there are other solutions along a spectrum that goes from "enforcing discipline" to "relying on individual discipline". Any feature that's enforced at all times will end up hampering efficiency and flexibility in some cases, while any feature that's left up to developers is liable to not be applied consistently. The job of the person setting up code-style configuration is to thread that needle, tailoring the configuration for the team and solution at hand. If you have a lot of solutions and teams, then you also get to consider the maintenance overhead of having too many custom configurations. In that case, you might want to make a few standard bundles that group teams and solutions, like "legacy", "modern", "junior team", etc. You don't have to name them like that, but the name should give you an idea of how loose or restrictive the settings would be. [Let the CI do it, then] I don't have time for all of that. Let's just run them on the CI. Warnings as errors in the cloud FTW! Now you're allowing team members to push all the way up to the server before they realize that they have errors. Granted, they're actually warnings, but you can't merge to master until you fix them, so, yeah, they're errors. This isn't less annoying. But, but, but, what if they're, like, real warnings? Like "possible NullReferenceException" or something like that? That's a good point, sure. But, in most cases, it's something more like "extra line found at end of file", "space missing after parenthesis", "method can be made private", "class should be internal", etc. There are better -- more automated -- ways of addressing some of those, which we'll discuss below. [The CI is not necessarily stable] Also, what if some warnings start appearing in your CI because of a tooling change? That can never happen, though, right? 
Because you've locked down all of your tool versions so that it can never happen? No? You didn't do that? You're using "latest"? Why? "The people building the tools are pretty clever, so we want to know what new things they have to tell us about our code." Oh, right. Because it makes sense. If you lock down your tool versions, you run the very real risk of not knowing when your build will stop running with more-modern tools. You run the risk of it having been years since you last changed anything in your build and your being stuck with those settings and old tools ... until they're obsolete or no longer available on your build server. It's better to use "latest" and have an occasional spike of warnings than to just never know where you stand with newer toolchains. Locking down tool versions leads to things like DevOps having to set up on-site build agents with Visual Studio 2010 on them for certain projects. OK, so we want to use the latest tools, but that means that we might also get new warnings. These are a good thing! The people building the tools are pretty clever, so we want to know what new things they have to tell us about our code. [The future broke the build] What we don't want is for those new things to break builds that used to be running just fine. This usually shows up when someone pushes new commits, runs the CI, and sees that they're getting errors that they didn't see locally. WTH? "My code didn't cause those errors?" The drawback here is that (A) this is annoying and (B) it's very possible that the new errors are a distraction at this point in time. The person's bug fix may be important, but the new warnings have now bumped themselves to the top of the priority queue! And what if the person whose build has failed isn't well-qualified to address these new warnings? Well, then they get to bump the new warnings to the top of someone else's priority queue! Probably a more senior developer. Fun for all! What's the solution then? 
Well, if you realize that the new warnings appeared because of a tool change, then I suppose you should try to pin the tool version on the CI, with all of the drawbacks outlined above. That's assuming that the person to whom this happens is (A) capable of figuring this out and (B) knows how to pin the tool version. And (C) we don't really like that solution, for the reasons outlined above. [What were the requirements again?] What about if we think again about what we're trying to accomplish with "warnings as errors"? Thinking...🤔🤔🤔...

* The system must allow individual configuration of severity. We want certain warnings to be errors.
* The system must not require all team members be capable of configuring it. We want clever tool people to configure things for maximum developer comfort and warning visibility, without everyone having to become a clever tool person (which isn't generally possible).
* The system must be configurable per solution. Each solution should be able to decide what is an error and what is a warning and what is a suggestion. You can't make "possible null-reference exception" an error in some legacy solutions without completely killing forward progress. We want warnings to indicate potential problems, but be careful about forcing a solution to address all of them immediately. It's more realistic to create tasks to slowly eliminate warnings, only switching a setting to an error later, to prevent future transgressions.
* The system must have a versioned configuration. We want the configuration to grow with the project, so older versions have their own configuration, with which they can still be run when needed.
* The system should use default settings wherever possible. No matter how clever your own team's tool person is, the people who designed the whole thing are probably even better. Apply configuration changes knowingly and judiciously.
* The system should not force a developer to change priorities unnecessarily. If the developer is focused on something, they shouldn't be forced to switch modes and prioritize formatting. Use gentle, visible hints, unless it's really, really relevant to what they're working on. For example, a possible NullReferenceException is something to be avoided, but is it really an error in all code? It's definitely a warning, but if the developer knows that it doesn't matter right now, then they should be able to ignore it, no? I mean, they haven't even committed it yet (as far as you know 😉). Maybe they have a breakpoint to see how the heck that variable could be null in the first place and they were just going to bounce the EIP past the crash anyway. YOLO. Anyway, we want to be really careful about how pushy we are with the IDE configuration. We want to strike a balance between missing actual problems and decreasing efficiency. We don't want the developer above to have to write a suppression -- or, even worse, do some other, ad-hoc short-circuit of inspections -- in order to keep working.
* The system should be future-proof. We don't want running builds to stop running just because we've upgraded tools, but made no changes to the code. This won't always be possible but, in this case, the "warnings as errors" setting is a pretty obvious "footgun".
* The system should provide developers everything they need locally to avoid CI failure. Something should fail only on CI as a last resort. That is, a developer must have tools that make it relatively easy to pass CI. This includes being able to see all warnings in the solution, whether warnings would fail the CI, or having an easy way to apply formatting to all files, if incorrect formatting would fail the build. We want to avoid a process that leads to half of our commits being called "fix formatting" and "remove warnings". So, we should consider things like having the IDE auto-reformat files on save.
* The system should discourage allowing inspection violations to be committed. Inspections should be applied and made visible as quickly as possible, to give the developer the opportunity to produce conforming code from the get-go. The path of least resistance should result in committing code that will also pass CI. We don't want to encourage "noisy" commits that "fix up" formatting or other inspection violations. We would rather have a high signal-to-noise ratio in our commits. We want compact, descriptive commits -- so we don't want bug-fix commits to include formatting changes to other parts of the file, if we can avoid it.

Looking at these requirements, we have to conclude that the "warnings as errors" configuration option is an absolute cudgel that we had to use in the old days because we didn't have fine-tuned control of the inspection-configuration. [We are no longer in the dark ages] Can we do better today, with modern tools? Absolutely, we can! Most modern IDEs support .editorconfig, which allows fine-tuned configuration of both code-style and formatting, especially for languages like C# and TypeScript/JavaScript. The wide variety of JetBrains' IntelliJ-based tools use it as well, e.g. "PyCharm", "WebStorm", or "PhpStorm". Visual Studio understands it. Visual Studio Code understands it. Of course, the devil is in the details, and the degree to which code-inspection configuration applies from one IDE to another depends very much on the level of standardization for that language and environment. The .NET/C# world has a high degree of standardization, which is very helpful. [Using EditorConfig] EditorConfig allows you to control almost anything you can think of about your code style or formatting. These are called inspections, each of which you can configure with an inspection-specific value and a severity to assign when the inspection is triggered. 
For example:

dotnet_style_require_accessibility_modifiers = for_non_interface_members:silent
dotnet_style_prefer_auto_properties = true:silent

The two inspections above should be relatively obvious. In both cases, the preferred setting is configured, but the severity is "silent", so the IDE doesn't complain about it. What's the point of configuring a preference and then not showing it to the developer? Ah, because the developer is not the only one modifying the code. Excuse me? [The IDE also writes code] Don't forget that the IDE will auto-format the code when requested. The IDE also writes code when it refactors anything. It needs to know how to format the code that it's inserting or modifying. The IDE uses the configuration in the EditorConfig to determine how to format the code. Your tools guy can configure the EditorConfig to conform to the style that the solution / team wants to use. When the code is auto-formatted or refactored, everything should end up looking just the way they wanted it. [How to apply "silent" inspections?] If you have a "silent" severity, that means it's something that you don't want the team wasting time with during development. However, if no-one ever auto-formats the code, then those inspections will never be applied. You should consider the process by which your solution will be made to conform with silent inspections in the EditorConfig. Visible inspections If the inspection severity is suggestion or higher [4], then the developer sees an indicator in the code when the file is open. Suggestions, warnings, and errors are shown in the build output, as well. Of course, the developer can disable showing warnings and messages (where suggestions appear) in the error-list pane, but you can't control everything -- and you shouldn't try. Give your developers the tools and configuration to be efficient and produce good code, but try not to be too pushy about when they do it. 
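As a rough sketch of how the severities discussed above might sit side by side in a solution's .editorconfig, consider the fragment below. The choice of rules and the severity given to each are assumptions made purely for illustration, not a recommendation: CS8602 is the compiler's "dereference of a possibly null reference" warning and CA1822 is the analyzer rule for members that could be marked static.

```ini
# Illustrative .editorconfig fragment; the rule selection and severities are assumptions.
[*.cs]

# Silent: used by auto-formatting and refactoring, never shown to the developer.
dotnet_style_prefer_auto_properties = true:silent

# Suggestion: a gentle, visible hint in the editor.
csharp_style_var_for_built_in_types = true:suggestion

# Warning: visible in the editor and in the build output.
dotnet_diagnostic.CA1822.severity = warning

# Error: the one inspection this hypothetical team has agreed must never get through.
dotnet_diagnostic.CS8602.severity = error
```

Because the file lives in the repository, it is versioned with the code and applies per solution, so a legacy solution could leave CS8602 at warning while a newer one raises it to error.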
Invisible inspections If the inspection severity is silent or none, then the inspection setting is only used by auto-formatting and refactoring tools. In this case, you'll have to consider when your code will be formatted. Do your developers occasionally auto-format files? Do they auto-format on save? Is there a step in the CI that auto-formats everything before compilation? If so, does it commit those changes? Or does the CI reject for formatting warnings? If you have silent inspections, be honest about when they're going to be applied. If you don't have a plan, then they will be applied seemingly randomly when someone inadvertently triggers the hotkey for auto-formatting a file [5], which may lead to unpleasant surprises and/or messy commits. -------------------------------------------------------------------------------- [4] There is a bit of a mismatch with using EditorConfig versus the JetBrains-native configuration: "JetBrains tools support an additional severity level called "Hint"", which is generally shown as a green squiggly line rather than the blue one for warnings. However, if you set the severity to "hint", Visual Studio interprets it as a warning, showing it as such in both the IDE and in the build output. On top of that, JetBrains seems to think that the silent option is called none, although it seems to understand silent well enough. [5] Probably because of historical reasons, there is a difference between Visual Studio's Format Document (Ctrl + K, D), ReSharper/Rider's Reformat Code (Ctrl + Alt + Enter), and ReSharper/Rider's Cleanup Code (Ctrl + E, C). I can't tell you exactly which inspections are considered in which mode, but I've listed them in order of "number of changes they seem to make to the code", with Visual Studio's native command seeming to make the fewest changes. [Code Style vs. Formatting] Let's clear up the distinction between these two main groups of inspections. 
Code style A code-style inspection expresses a preference for something that affects semantic content. That is, applying the fix for the inspection may change the code in a way that makes it compile differently or may lead to it not compiling at all. Even something as subtle as using var instead of an explicit type can, in very rare cases, lead to code that no longer compiles. By now, many IDE tools are generally clever enough to avoid even suggesting such a change, but it can still happen. Formatting A formatting inspection expresses a preference that affects only syntactic content. That is, applying the fix for the inspection will change the appearance, but will not change how the code compiles. Consistent formatting is very important for direct readability of the code, but also to avoid spurious differences in files when inspecting commits. The fewer differences there are, the less likely it is that conflicts appear when merging or rebasing. [Local vs. CI] So we've examined inspections in detail and talked a lot about setting severity to optimize the developer feedback loop i.e., we don't want to mess with a developer's priority queue unless absolutely necessary. But aren't there some things that we might allow a developer to do locally but not allow to pass CI? That's where the "warnings as errors" setting ensured that the CI never passed, even if the developer forgot to check something locally. For example, it's important to have consistent formatting before attempting a merge. There are other ways to encourage and support proper coding practices, though. Auto-formatting files Most IDEs have an option to run formatting or code cleanup automatically when files are saved. Consider setting this option in your solution. Developers will become accustomed to having the IDE lightly reformat even their WIP code and they will always have correctly formatted files. 
Local pre-commit hooks Pre-commit hooks can run locally, running global formatting on the code base before a developer can commit. This is kind of touchy, as sometimes developers are just committing a WIP to avoid losing their changes. It would be annoying if you had to clean up your formatting just to commit those. You could include auto-formatting in the commit hook, but it's probably better to set up auto-formatting in the IDE. Server pre-commit hooks Instead of a local pre-commit hook, you can configure a hook on the server (in Git terms, a pre-receive hook). This hook could cause a push to be rejected if its head commit doesn't conform to certain conditions. But...isn't that what the CI is for? Well, kind of, but the CI runs only after the commits have landed on the server. It's preferable to have the developer fix commits locally before being able to push, again, to avoid "fix formatting" and "cleanup warnings" commits. You could choose which branch patterns to run these on. My recommendation is to lean as heavily as possible on IDE configuration before getting lost in the weeds with commit hooks. [Avoiding ugly or "noisy" commits] As soon as we start talking about "fixes" for warnings or formatting, we're talking about "noisy" commits. If we enforce inspections more strictly on CI than we do locally, then there will be more "fixup" commits. OK, so what do we do about them? Squash 'em! Right? Right? 🫠 Kind of. Look, the PR machinery allows you to merge, rebase, squash-merge, or squash-rebase. That's OK, but it's not great. A lot of times, you'll have four commits that are descriptive and semantically relevant, describing changes that were made, as well as a few commits that address problems that either came up in CI or as part of the review. Don't you think you should squash those into the four commits and make a clean history instead of just squashing the whole lump into one big hairball? 
Or do you think that each PR should have only one commit, equating a branch with a commit (as e.g. plugins like "Graphite" positively encourage)? I recently wrote "PRs suck. Stop trying to fix them.", which also touches on the workflow outlined below. You see how tool configuration affects everything? You have to think about how your team builds PRs, how they review PRs, how they repair PRs after review -- or whether they even use PRs. I would encourage a more real-time review culture, where possible. 1. Set up the tools so a developer has a good chance of committing conforming code 2. When a developer has a set of commits they want to push, they ask another team member to review, live, explaining what they did. 3. The live reviewer can point out any issues in the code and they can repair them together, all without wasting time writing and reading review comments. 4. Those repairs can be squashed into the appropriate commits before anything's pushed to the server. 5. Once on the server, the CI runs. If something still fails, the original developer can squash in fixes and force-push [6] to keep the commits in the PR clean. 6. You can create a PR to note the integration to master, but then it's mostly a formality. -------------------------------------------------------------------------------- [6] What's the problem? Don't you trust your team members to decide what to do with their own highly ephemeral feature branches? Allowing force-push encourages team members to regularly push to the server to back up their work, but still care about what the commit history looks like. Allowing force-push allows your team members to revise their commit history until it tells a coherent story, even if they've already pushed. See "Rebase Considered Essential" for a longer discussion on rewriting commit history. [Configuring your solution] Phew! So, what have we learned? * There is no one-size-fits-all solution. * Errors are priority interrupts. 
The developer cannot execute until an error is fixed. Be judicious about which inspections you promote to errors. * "warnings as errors" is an inelegant configuration option that promotes any warning to an error. That's not very judicious at all. It has been replaced by a much more flexible and granular system in EditorConfig. * Encourage a culture in your team that pays attention to warnings. There will always be team members who do more than others. That's just life. Don't force those team members least equipped to deal with them to address warnings by making them errors. Baby steps. * Silent inspections will only be applied when the tools apply them, e.g., with auto-formatting or refactoring. Consider when those inspections will be applied. * Keep what's executed locally as close as possible to what's executed in CI. * Consider how you want your whole process to work, right up through integration to the trunk/main/master. If that all sounds like a lot, well -- it is. Building clean, maintainable code is a complex undertaking. There are a lot of tools that can help, but you have to put some time into thinking how you want to use them, and then into configuring them so they help you instead of getting in your way. It's a delicate balancing act: to give developers the best chance of (A) producing conforming code in the first place and (B) avoiding "noisy" commits, while (C) not hitting them with priority interrupts irrelevant to what they're working on. There will be tradeoffs. [Sharing or copying configuration?] Once you've set up a couple of solutions, you can just copy/paste the configuration to others as a starting point. Remember, though, that solutions are usually pretty unique. Only consider generalizing or packaging a configuration if you've considered that, * It will be more difficult for solutions that use the package to override settings that turn out not to be as standard as you thought. 
* It will be more time-consuming to make changes to the configuration because you have to roll out a new version -- e.g., when the underlying tools change -- with potentially multiple stakeholder solutions. * This extra work may discourage solutions from improving the shared configuration. Instead, it will languish, with all users annoyed by the same inconsistencies, but no-one willing to do the work to address them. For these reasons, each solution having its own copy of the configuration is probably better. They can just copy/paste -- the horror! -- improvements where appropriate. If you're worrying about configurations drifting out-of-sync, schedule a work item every few sprints that evaluates and possibly re-syncs configurations. [Conclusion] There are always trade-offs. Improving code-quality is an incremental process. So is configuring the tools that support that process. It gets easier with practice. Good luck!
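
To make the question "when will silent inspections be applied?" concrete, here is a minimal sketch that lists the preferences a developer will never see in the IDE. It assumes the `key = value:severity` form used in the example earlier in this article; the function names and the sample text are made up for illustration, and other ways of setting severity (such as separate `dotnet_diagnostic` lines) are deliberately ignored.

```python
from typing import NamedTuple


class Preference(NamedTuple):
    """One `key = value:severity` line from an EditorConfig file."""
    key: str
    value: str
    severity: str


def parse_preferences(text: str) -> list[Preference]:
    """Collect every option written in the `key = value:severity` form."""
    found = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and section headers such as [*.cs].
        if not line or line[0] in "#;[":
            continue
        key, sep, rest = line.partition("=")
        value, colon, severity = rest.strip().rpartition(":")
        if sep and colon:
            found.append(
                Preference(key.strip(), value.strip(), severity.strip().lower())
            )
    return found


def silent_preferences(text: str) -> list[str]:
    """Names of options that only auto-formatting or refactoring will apply."""
    return [p.key for p in parse_preferences(text) if p.severity in ("silent", "none")]


# Hypothetical sample; the first two options are the ones quoted in the article.
SAMPLE = """\
[*.cs]
dotnet_style_require_accessibility_modifiers = for_non_interface_members:silent
dotnet_style_prefer_auto_properties = true:silent
csharp_style_var_when_type_is_apparent = true:suggestion
indent_size = 2
"""

if __name__ == "__main__":
    for name in silent_preferences(SAMPLE):
        print(name)
```

A list like this is only a conversation starter: for each name it prints, the team still has to decide whether format-on-save, a hook, or a CI step is what will eventually apply it.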

PRs suck. Stop trying to fix them.

2023-12-15T11:37:03+01:00

Published by marco on 15. Dec 2023 11:37:03 Updated by marco on 15. Dec 2023 11:58:51 ------------------------------------------------------------------------ I read through the article "Your GitHub pull request workflow is slowing everyone down" with great interest because I, too, am not thrilled about how PRs work. While I agree with the problems Graphite sees with PRs, I think they miss other problems -- and I don't like their solution very much. [PRs are, apparently, HUGE] "The single most important bottleneck is PR size - large PRs can make code reviews frustrating and ineffective. The average PR on GitHub has 900+ lines of code changes. For speed and quality, PRs should be maintained under 200 lines—with 50 lines being ideal. To put this in perspective, where giant 500+ line PRs take around 9 days to get merged on average, tiny PRs under 100 lines can make it from creation to landing within hours." Holy shit! The average is 900 lines? That's already using the system completely incorrectly. That's so wild. It absolutely confirms my theory that PRs are a terrible way of committing code. I already thought they were terrible just because of the limited UI and lack of introspection of what the code you're reviewing actually does. PRs don't encourage starting and running the change to verify that it actually works as advertised. You're not using any of the tools that you use to develop code to review it. How silly is that? If you load changes into an IDE, you can see how many warnings there are, see if the layout shifts when you format the document, etc. Why would you want to review in a completely different environment? As Robin Williams once eloquently put it, "It's like masturbating with an oven mitt." [image] Not only that, but people probably aren't looking at individual commits, so they're just reviewing 900+ lines at once. The fewer people there are looking at individual commits, the fewer people there will be who make good, individual commits. 
This is a shame because it would counteract the awfulness of reviewing code in the PR web-UI, at least a little bit. [PR web UIs are not good for reviews] There are far better and more efficient ways of reviewing code than with PR web UIs. Reviewing through a PR web UI should be a fallback that you only use when nothing else is possible. If you're in the same time zone and working on the same schedule as the rest of your team, there is absolutely no reason why you should be using the PR web UI instead of real-time reviews of local commits. What the current PR machinery does is fool remote, async teams into thinking that they're reviewing code efficiently. A face-to-face, real-time review will be much more efficient and yield much higher-quality code. I honestly can't believe the high pain threshold that some developers have. If the developer hasn't pushed yet, then: 1. Review with the person who wrote it. If the developer has pushed and is not available for real-time review, then: 1. Pull. 2. Open the branch in SmartGit. 3. Examine the commits. 4. Launch the solution/project. 5. Run the tests locally. 6. Apply your own commits instead of review notes wherever possible. Yes, you can do this! Why not? You're both on the same team. It's a shared code base, not someone's personal zen garden. Instead of explaining what you would want changed, just make your suggestion in the form of a commit. It's often more efficient than writing prose. 7. Add review notes in SmartGit (synced with GitHub, Azure, GitLab, etc.) or use the PR Web UI to add review notes. You can thank me later. [Errors slip in] "Problems can easily get hidden between the diffs, and reviewers often make assumptions instead of testing to avoid feeling overwhelmed. One particularly interesting finding is that as the size of a PR increases (by number of files changed), the amount of time reviewers spend on each file decreases significantly (for PRs with 8 or more files changed)." Obviously! 
But it's good to measure -- this was my intuition. PRs don't encourage local testing or verification in an environment similar to that which the original developer used. [Dumbing down Git] "By default, every PR is restricted to only 1 commit of <200 lines, keeping changes tightly scoped. This forces developers to consciously limit work to related changes—the registration endpoint PR can't sneak in unrelated styling tweaks." Yikes! I don't like the sound of that. So you make multiple PRs rather than one PR with multiple smaller commits? Why don't you just review commits rather than one giant blob? Do you really need to corral each commit into its own branch and PR to force yourselves to actually make useful commits? Yeeess? 🧐 "Stacking centers around breaking down big feature work into chains of smaller pull requests. Each PR is typically limited to 1 commit focused on an isolated change. This restriction guides developers to consciously make only a single change, squashing and rebasing along the way, instead of cluttering the PR with random unnecessary commits like "typo fixes"." This is yet another technique invented to accommodate teams that don't trust each other, or that contain people who, if they can't be trained to do better -- or don't understand what better is -- probably shouldn't be programming yet. Instead of teaching team members how to use their tools, they impose an arbitrary rule. What a kindergarten. [Integrate all the time!?] "Unlike Git workflows, where it is easy to neglect staying updated, Graphite centers your workflow around continually integrating with the current mainline state." Yikes! I don't love the sound of that, either. Doesn't that force you to spend more time on integration that you might have spent working? I understand you don't want to have long-lived branches, but now you're just shooting to the other extreme, forcing integration on every pull. 
It's not bad as long as the integrations are automatic, but might not be appropriate for developers who aren't great at resolving merge conflicts. Even if they know how to deal with them well, might they not waste time resolving conflicts integrating a version of their code that wasn't at all ready to be integrated? I understand that this feature follows from the logic of "if you integrate more often, then integration is easier," but, again, you're taking agency out of developers' hands, implicitly not trusting your team members. I don't like it. If you have several stacked commits, I wonder how much shuffling there is in the working tree (causing unwanted IDE reloads) during the integration cascade. Are they somehow integrating without touching the working tree? I don't know that that's possible. Go ahead and work on the main branch if you want -- I do it all the time -- but this should be more of a choice than it sounds like it is. [I remember this...] "This command will add your changes and create a new branch in one motion. You can then continue iterating by creating and stacking additional branches:" Ah, I see now. They've reinvented "Mercurial's patch queues" . Everything old is new again. A really bright and good friend of mine "added an extension to Mercurial's mq" decades ago that sounds like it works the same. I remember discussing the technique with him as he was developing it. [Conclusion] I'm a bit worried about two things: 1. The one-commit-per-branch thing 2. The auto-integration-cascade "By cleaning up your PR commit history, you ensure a clear and concise main branch history that makes it easy to see exactly what’s changed over time." By enforcing one commit per branch, you dumb everything down. It does seem that, instead of acknowledging that PR supremacy is stupid, Graphite doubles down, strips branches of most of their functionality by equating them to commits, and uses multiple PRs to force people to review by commit. It seems like a waste. 
But, hey, maybe I need to actually try it. I might be missing something. Still, instead of adding another tool, I think you should use git better. * Set up your local tools -- or in-cloud IDEs, whatever -- to support building the kind of code you want. See "How to replace “warnings as errors” in your process" for more information. * Encourage your team members to learn how to use those tools. * Have a review culture centered on real-time reviews where quick fixes and changes can be made before you've even pushed anything. This cultivates a culture of respect for commits. * Use PRs only to "stamp" a set of changes and merge them to the trunk.

"Developer experience" is rarely a requirement

2023-11-30T21:23:21+01:00

Published by marco on 30. Nov 2023 21:23:21 Updated by marco on 30. Nov 2023 21:43:00 ------------------------------------------------------------------------ The article "Some notes on Local-First Development" by Kyle Mathews focuses on a very good trend in app development, but focuses a bit too much on what he calls DX, or developer experience. "I see “local-first” as shifting reads and writes to an embedded database in each client via “sync engines” that facilitate data exchange between clients and servers. [...] The benefits are multiple:" * Simplified state management for developers. * Built-in support for real-time sync, offline usage, and multiplayer collaborative features. * Faster (60 FPS) * CRUD * More robust applications for end-users. I don't want to read too much into it, but he did mention end-users only in the last bullet point. I think the author is focusing too much on the tech and too little on the value. DX is great and all, but it's about the UX, no? Every app would benefit from realtime updates if it's cheap and easy to build. Almost every app is multiplayer, if you think about it a bit. "For almost any real-time use case, I’d choose replicated data structures over raw web sockets as they give you a much simpler DX and robust guarantees that clients will get updates." No, my friend. You've come to the right conclusion for the wrong reason. If the tech is solid, if it doesn't negatively influence debuggability or traceability, if it's predictable, if operations can be correlated, if you don't end up limiting your functionality to fit the framework -- then go for it. What I mean is that it's important that the thought process that leads to the correct conclusion serves all stakeholders. If you're only doing things because they're better for developers, then, eventually, you're going to be deciding against the users. [image] Be aware of the trade-offs, and be sure all of the stakeholders can live with them. 
What does good DX translate to for other stakeholders? Easier maintenance? Less complexity? Easier onboarding? The DX is really mostly secondary unless you're making a framework, in which case it might matter. No-one cares about DX for real-world products. I love good DX, but I'm a developer! As a developer with a lot of experience, I'm forced to admit that it's not at all a primary goal. Having good DX might lead to other desirable things, but that doesn't make it directly desirable. Don't forget that.

Handling long-running projects

2023-11-08T21:50:04+01:00

Published by marco on 8. Nov 2023 21:50:04 ------------------------------------------------------------------------ This is a brilliant interview, in that Oren Eini just talks for about 40 minutes, answering pretty much just one or two questions. [media] At one point (I forget where), he says, "I don't like unit tests." Agreed. I love automated tests. They're indispensable. But I think unit tests are only useful when you want to focus on a failing integration test. David rightly points out that they're really good for pinpointing where a problem actually happens, but Eini says that they also "hinder change" because, by their nature, they lock down a lot of the design and implementation. This is absolutely true. Just to be clear: I think of anything that's not a unit test as an integration test. I generally like "smaller" integration tests. It's probably better to just be agile about it and write unit tests when the situation requires it, i.e., when the cause behind a failing integration test is proving difficult to pin down -- or when you've determined the cause and you want a direct proof that you've fixed the underlying problem. It requires discipline to realize when you need to write more unit tests in order to help pinpoint which component involved in a failing integration test is causing the problem. If you preemptively write all of the unit tests, you're wasting time that could be better spent elsewhere. I have had no small amount of success with a large test suite that was mostly integration tests. It ran relatively quickly (10 minutes for 10,000 tests on a reasonably classed developer desktop) and helped me survive three major refactorings.
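
The difference described above can be sketched in a few lines of Python. Everything here is hypothetical: `order_total` stands in for a public entry point and `parse_line` for one of its internal helpers. The integration test only cares about the result, so it survives a rewrite of the helper; the unit test pins the helper's exact return shape, which is useful for narrowing down a failure but is also what "hinders change".

```python
import unittest


def parse_line(line: str) -> tuple[int, float]:
    """Internal helper: turn '2 apples @ 1.50' into (quantity, unit_price)."""
    quantity, _, rest = line.partition(" ")
    _, _, price = rest.rpartition("@")
    return int(quantity), float(price)


def order_total(text: str) -> float:
    """Public entry point: sum up every line of an order."""
    return sum(q * p for q, p in map(parse_line, text.splitlines()))


class OrderTotalIntegrationTest(unittest.TestCase):
    """Exercises the public behavior; survives a rewrite of the helpers."""

    def test_total_of_two_lines(self):
        self.assertAlmostEqual(
            order_total("2 apples @ 1.50\n1 pear @ 0.75"), 3.75
        )


class ParseLineUnitTest(unittest.TestCase):
    """Pins the helper's return shape; written only once an integration
    failure needed narrowing down to this component."""

    def test_returns_quantity_and_price(self):
        self.assertEqual(parse_line("2 apples @ 1.50"), (2, 1.5))


if __name__ == "__main__":
    unittest.main()
```

If `parse_line` were later changed to return, say, a small record type instead of a tuple, only the second test class would need to be touched, which is the trade-off in miniature.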

Architecture is about intent

2023-10-24T22:39:45+02:00

Published by marco on 24. Oct 2023 22:39:45 ------------------------------------------------------------------------ The following video is a talk by Robert "Uncle Bob" Martin, one of the graybeards worth listening to. This video from 2011 is wide-ranging and contains a lot of brilliant advice. It's stuff that we've known for a long time now, but every generation of programmers needs to re-learn these things about every 5-10 years. You usually can't stop people from just reinventing the wheel because who wants to watch videos of or read blog posts written by old dudes, ammirite? [media] At 10:00, he talks about how the top-level architecture of most applications reflects the framework used to implement the web-delivery mechanism rather than the purpose of the application itself. In his example, he shows how a Ruby-on-Rails application is immediately recognizable as such, but that you have literally no idea what the application does. He urges us to consider what this implies about our priorities as architects and developers. It means that we are much more concerned with the technology than with the functionality. This is not good. He contrasts it with a high-level, 2-D blueprint of the first floor of a church, where the intent is obvious: it's a church (he says). Of course, inferring that it's a church involves applying the appearance of the diagram to a given context -- e.g., a very western one -- but the point is clear: the standard, top-level view of the design of a church screams out that it's a church. It says nothing about how the church is to be built -- or has been built -- it says what it is. "Architecture is about intent." Just to be clear: this presentation is from 12 years ago, and we're still confronted with the same concepts -- still confronted with the same failure to remember these precepts. Our frameworks still push themselves to the fore. 
This is, in a way, the problem with LLM-generated code: we are already terrible at expressing the intent of our software in a way that makes it maintainable and qualitative. We are already mostly terrible at designing and building things in a way that satisfies the nearly-always-implicit non-functional requirements, like maintainability, usability, performance, etc. And now we're taking another piece of software, whose workings we can't yet fathom, but which we know we've built by feeding it all of these terrible versions of software, and asking it to write software for us. All of the theory that we've developed about how to build software will not be respected, except by luck, if the neural net is feeling like that's a high-probability next token. On the one hand, I have to admit that this doesn't sound much different from how software is built today, except that the human builders are potentially capable of following rules, whereas the software-based builders are less trainable. Again, though, we have decades of experience showing that, while people are ostensibly trainable, they are not necessarily practically trainable, at least in the general case for the general type of person who takes part in this field of endeavor we call programming. Which leaves us with the question: have we achieved the maximum potential in software development? We already knew everything we needed to know about how to do it decades ago. What is missing is the will to do it that way. It's definitely possible to train people to do it that way. The hangup is, as always, the cost, specifically, the cost-benefit ratio. The perceived benefit of better software is usually far less than the perceived (initial) cost. And we always perceive only the initial cost because we are super-bad at long-term thinking about complex problems like building software. At 34:00, Uncle Bob says "There's gotta be some better way to do this. [...] This is just 3270 programming poisoned with all sorts of crud. 
How many languages do you have to know to write a web application? Well, there's some programming language, but that's incidental! You've gotta know HTML and CSS and JS and Zazzle and Dazzle and ... and, you know, the guy over here's going: 'let's build communities by leveling people up. Leveling them up! I mean, what we're going to do is hand them a ... OK, now, hold this hammer. Ok? Good. You got that hammer? Now, here's another one. Hold that hammer too. Now I've got a big barrel you've got to hold on your head. We are not helping our cause with this truly terrible mechanism that we have adopted." At 41:00, he says "The database is a detail." This reminds me of "The UI is an afterthought, a detail", an article I wrote recently [1] about a 7-year-old video I watched that expressed the same sentiments about external systems that Martin is expressing in his 12-year-old video. "That's what architecture is: find some place to draw a line and then make sure every dependency that crosses that line goes in the same direction." At 55:45, he says, "There's an interesting case of the database -- the thing that's so incredibly important -- and yet, we took that decision and we just deferred it off the end of the world and then, when somebody needed it, we shimmed it in in a day. Because our architecture had done something right. What is the hallmark of a really good architecture? A good architecture allows major decisions to be deferred." "A good architecture maximizes the number of decisions not made." At 1:00:50, he says, "How do you keep the beast under control? You need a suite of tests you trust with your life. You must never look at that suite of tests and think 'you know? I don't think I really tested everything?' As soon as you think that, you've lost it. Because now you're afraid of your code. The reason we write our tests first is so that we know, that every single line of code we wrote was because of a failing test that we wrote. 
So that we know that every single decision that we made is tested. So that then, we can pull up that code on our screen and say 'Oh my God, that looks like a mess' -- and clean it!...without any fear." Great talk. Add it to the pile of things that we know -- or should know -- better, but don't.
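
The "draw a line and make every dependency cross it in the same direction" idea from the talk can be sketched in Python. This is an illustration under my own assumptions, not code from the talk: `Order`, `OrderStore`, `PlaceOrder`, and `InMemoryOrderStore` are hypothetical names. The use case depends only on a protocol that the application core defines; the storage class is a detail that satisfies it, so the choice of a real database can be deferred.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Order:
    order_id: int
    total: float


class OrderStore(Protocol):
    """The line: the application core states what it needs from storage."""

    def save(self, order: Order) -> None: ...

    def find(self, order_id: int) -> Optional[Order]: ...


class PlaceOrder:
    """Use case in the core; it knows only the OrderStore protocol."""

    def __init__(self, store: OrderStore) -> None:
        self._store = store

    def execute(self, order_id: int, total: float) -> Order:
        if total <= 0:
            raise ValueError("an order needs a positive total")
        order = Order(order_id, total)
        self._store.save(order)
        return order


class InMemoryOrderStore:
    """A detail on the far side of the line; a database-backed adapter
    could replace it later without touching PlaceOrder."""

    def __init__(self) -> None:
        self._orders: dict[int, Order] = {}

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def find(self, order_id: int) -> Optional[Order]:
        return self._orders.get(order_id)


if __name__ == "__main__":
    store = InMemoryOrderStore()
    PlaceOrder(store).execute(7, 19.5)
    print(store.find(7))
```

Note that nothing in `PlaceOrder` imports or names the storage class; the dependency points from the detail toward the core, never the other way around.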

Woefully unqualified "programmers"

2023-10-11T21:15:00+02:00

Published by marco on 11. Oct 2023 21:15:00 Updated by marco on 6. Mar 2024 07:35:06 ------------------------------------------------------------------------ As I was reading the absolute train wreck of a unit test in "Testing with a Lisp", the song "What the fuck is going on?" popped into my head, like it always does when I see that a programmer not only didn't understand the assignment, not only doesn't know how to program, but also doesn't know that they don't know how to program. They are living their best life because they don't think that "knowing how to program" is required in order to be a programmer. Neither does their boss or team, I guess. That's when the music starts to play in my head, and I think of little blind Dillon playing football because a very non-PC friend [1] sent me that video so many years ago. Am I going to link the video? Of course I am. Because I'm a terrible person. [2] [media] And this is the test from the article above.

test("Returned objects arguments immutable (a b)", function() {
    var result = lispParser("(a b)");
    expect(3);
    ok(typeof(result) === 'object', "result is an object");
    var children = result.arguments;
    var newValue = 2;
    var firstChild = children[0];
    if (children[0] == newValue) {
        firstChild = ++newValue;
    }
    notEqual(result.arguments[0], newValue, "Underlying array was immutable");
    equal(result.arguments[0], firstChild, "Underlying array was immutable");
});

[image] Nothing about that test makes any sense. It will always pass. It is, in its own way, a work of art. It is the JavaScript equivalent of Chomsky's "Colorless green ideas sleep furiously", an example of a sentence that is "grammatically well-formed, but semantically nonsensical". Honestly, this looks worse than anything I've seen my students try to write. They usually have enough shame that they don't bother filling in an answer if they really have no idea what's going on. 
I'm also wondering, of course, whether this is the work of an AI -- or the bastard child of a poseur-programmer and an AI. The future is bright. -------------------------------------------------------------------------------- [1] It was the same person who sent me the little cartoon I've included. [2] All kidding aside, I love this kid's confidence. It's off the charts. I hope karma was gentle with him.
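
To see why the quoted test cannot detect anything, it helps to trace its logic. The following is a rough Python transliteration under stated assumptions: the parse result is reduced to its `arguments` collection, the `typeof` check is left out, and both function names are hypothetical. The first function mirrors the two comparisons of the original; the second shows what an immutability check would have to do instead, which is to attempt a write.

```python
def quoted_test_passes(arguments) -> bool:
    """Mirror the two comparisons of the quoted test for a parse result
    whose `arguments` collection is passed in directly."""
    new_value = 2
    first_child = arguments[0]
    if arguments[0] == new_value:
        new_value += 1
        first_child = new_value
    # Nothing above ever writes to `arguments`, so the checks below only
    # compare the first element with values derived from it.
    return arguments[0] != new_value and arguments[0] == first_child


def rejects_mutation(arguments) -> bool:
    """What an immutability check needs: attempt a write, see it refused."""
    try:
        arguments[0] = "changed"
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    mutable, immutable = ["a", "b"], ("a", "b")
    print(quoted_test_passes(mutable), quoted_test_passes(immutable))
    print(rejects_mutation(mutable), rejects_mutation(immutable))
```

For an input like the article's `(a b)`, where the first element is a symbol rather than the number 2, the transliterated test gives the same answer for a mutable list and an immutable tuple, so it says nothing about mutability either way.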

Technology-independent software-development courses

2023-10-05T13:48:10+02:00

Published by marco on 5. Oct 2023 13:48:10 Updated by marco on 10. Oct 2023 06:22:28 ------------------------------------------------------------------------ I was recently asked something like the following question, which I am citing with a few minor edits. We would like to do a course about SW development with Python, preferably an online course, so that we can start at our own pace. We don’t want a Python course, but would instead like a course more about SW development. It would be great if it were in Python because we are comfortable with it. The interesting topics would be: * object-oriented programming * functional programming * design patterns * good coding practices As well as other important topics such as: * Testing * Documenting * Version control * Working in a team with version control The course doesn’t have to contain all these topics. It can be also several courses or it can be toy-projects from somewhere. [Learning how to develop software] I have very little familiarity with courses as I've usually been tasked with figuring out how to do things before others have gotten to it. Of late, I've been teaching courses, not taking them. So, how did I learn what I know about software development? When I started writing software, there was nothing available online, outside of a bunch of GeoCities pages (one of which was mine). MSDN was on CDs or local help files. I read some books, OOSC and OOSC2, as well as the Gang of Four's Design Patterns. I can't remember what else, but that's partly how I leveled up my skills. I had the great fortune of being able to build and work on large frameworks, from which I drew many lessons. I worked with very good people, who challenged me and taught me a lot. Nowadays, I use DuckDuckGo as my online reference. I have developed a relatively advanced skill at searching for what I'm looking for. I very often get it within minutes. I almost never use videos. 
A primary skill in software development is to be able to imagine what you should be looking for. That is, you don't have to know how to do everything without looking it up, but you do have to imagine that it might exist. For example, I don't know how to write automated tests in Python, but I know that it should be possible. I know that I should figure that out very early in my experiments with Python. I know what to expect from an automated-testing environment. I know which settings to look for and expect. That kind of knowledge transfers from one language or development environment to another. I know that code-completion makes me faster, I know that I would like to avoid runtime errors -- how can I best use Python to achieve those ends? [Online Courses] I took a quick look around for online courses, but was not immediately convinced that I am equipped to distinguish between scams and actually worthwhile courses. Does the course even mention general software-development principles? How much time is allocated to that? [Udemy] The "Complete Software Engineering Course with Python" looks as follows: [image] * Only about 1% of students even bothered to rate the course * The pricing feels high-pressure and scammy * The course descriptions are barely in English What about general programming? [image] Just over nine minutes? And you can't even be bothered to describe it in something approaching well-written English? No, thanks. [PluralSight] The course "Learning To Program - Part 2: Abstractions" looks a bit more professional, but it still has some quirks (especially for $29 per month). [image] * Overall more professional than Udemy, but also seems to have been incompletely translated from French * Still, there's definitely more time dedicated to core concepts There is an "assessment" that you can take, but you have to sign up first. 
[image] Maybe PluralSight is able to tell you which courses you need, but I doubt it will err on the "you need fewer courses" side. [Dometrain] Update on 2023-10-06: added the section below I've recently heard from a source I've been watching for a while that this course is quite good for C# developers: "From Zero to Hero: Test-Driven Development in C#" by Guilherme Ferreira. The person recommending it releases quite interesting/advanced videos on YouTube and has "his own range of courses at DomeTrain". [How should software-development be taught?] How would I teach basic software-development principles? I would probably start with very abstract principles that try to answer the classic questions for "use cases": * Requirements: What are you trying to achieve? * Actors: Who is trying to achieve it? * Stakeholders: Are there other points of view than just the actor's? [Which language?] A question people tend to start with is: which programming language should I use? That's the wrong question. The applicability of programming languages to fields differs widely, but most languages have a large overlap in functionality. Where they differ is in the degree of runtime or library support for specific tasks. For example, Python famously has a lot of libraries for number-crunching and data-analysis (although I feel that this advantage is grossly exaggerated) whereas it's terrible for writing Windows GUI applications. C#/.NET has excellent web and desktop technology support. The Python runtime is notoriously slow (with essential libraries written in C++) whereas .NET is known as a very performant cross-platform runtime. Do you see how quickly the conversation turns from "what can the language do?" to "what can the standard runtime/libraries/environment do?" That's because you can do most tasks with most languages. Instead, we want to think about this at a higher level. We want to, * maximize useful expressiveness while minimizing harmful expressiveness. 
* accommodate inherent complexity without introducing accidental complexity. * express our intent explicitly in our programs. * be able to discover and eliminate assumptions. * get compilation errors or warnings, not runtime errors. [Developer discipline] Programming languages exist on several spectra. One of these is "the degree of developer discipline required to use the language effectively and safely." What does that mean? For example, Python and JavaScript have a dynamic type system. There are mechanisms, practices, and IDE support that you can use to set up the guardrails missing in the language, but they are optional, and idiomatically written code in both of these languages tends not to use any of them. It's the wild west, for the most part, with a lot of assumptions that nothing will ever go wrong. More strict languages force you to consider all possibilities before your program even compiles or runs. For example, Haskell and Rust are famously picky. If you have a function that returns a value under certain conditions, those languages will make you explicitly indicate what to return when those conditions don't hold. Forgiving languages will just use some default value, usually null or undefined. This is called "happy path" programming because you only write the code for the hoped-for path through your use case. For example, the user selects a valid file with the expected data format with an acceptable length with no validation or processing errors, generating a data file to which the initiating user has access. * Did the user click cancel? Not handled. * Was the file missing? Not handled. * Was the file in an unexpected encoding? Not handled. * Did the file fail to validate? Not handled. * Was the data empty? Not handled. * Did processing crash? Not handled. * Was the generated output not accessible to the user? Not handled. Writing programs in this fashion is a dangerous thing to do with a strict language, and it's even worse to do in a lax language. 
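To make the unhandled branches above a bit more tangible, here is a minimal sketch in Swift (used only because the code elsewhere on this page is Swift). The `ImportOutcome` type and its cases are hypothetical names invented for this illustration; they model a few of the questions from the list as explicit cases, so that the compiler can point out a forgotten branch.

```swift
// Hypothetical outcomes of an "import a file" use case. The names are
// illustrative and not taken from any real API.
enum ImportOutcome {
    case imported(rowCount: Int)
    case cancelled
    case fileMissing(path: String)
    case unexpectedEncoding
    case validationFailed(reason: String)
    case empty
}

// The switch has no `default` branch, so the compiler rejects this function
// as soon as one of the cases above is left unhandled.
func describe(_ outcome: ImportOutcome) -> String {
    switch outcome {
    case .imported(let rowCount):
        return "Imported \(rowCount) rows."
    case .cancelled:
        return "Nothing imported: the user cancelled."
    case .fileMissing(let path):
        return "Nothing imported: \(path) does not exist."
    case .unexpectedEncoding:
        return "Nothing imported: the file is not in the expected encoding."
    case .validationFailed(let reason):
        return "Nothing imported: \(reason)"
    case .empty:
        return "Nothing imported: the file contains no data."
    }
}

print(describe(.imported(rowCount: 3)))
print(describe(.fileMissing(path: "data.csv")))
```

Adding a new case to the enum makes every such switch fail to compile until the new branch is handled, which is the kind of reminder that otherwise has to come from developer discipline.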
Even the simplest software has many, many branches. The less your language or compiler or IDE reminds you of them, the more you have to fill that gap with developer discipline. [Important language/runtime/IDE features] To get more concrete, some good questions to consider are: * Can you clearly describe and use types? (implicitly typed ≥ explicitly typed ≥ dynamically typed) * Are types statically checked? (yes > no) * Can you primarily work with early binding? E.g., how strongly idiomatic are virtual/late bindings? * Are data and operations idiomatically merged or separate? Is there support for pure data structures (e.g. records vs. classes)? * What about discriminated unions? Range types? Can you avoid primitive obsession? * Can you declare non-nullable references? * Can you designate functions as pure? * Is a functional programming style supported? * Is there a way to make data or records immutable? * To what degree can you optimize performance where needed? * How does error-handling work? * How concise can you be? What kind of abstraction mechanisms are there? Do you have to write a lot of boilerplate? * How does I/O work? Is it a second-class citizen? (I.e., does the language enforce purity so hard that it makes it a pain to read from a file? I'm looking at you, Haskell and Elm) * What is the asynchronous programming model? * How good are the error messages? If these don't make any sense to you, don't worry. But they are questions that are important when you're choosing a tool for building software. [Intent & handling events] The whole point of a programming language is to express intent. You indicate what you intend to happen when a given event occurs. A programmer expresses an intent by writing that, "when this thing happens, I intend for this other thing to happen." For example, * When a filename is provided on the command line, read the contents of the file in a given encoding, process it as lines of text, and save the results to another file. 
* When the user clicks the screen in a particular place when the program is in a particular state (e.g., over a button displayed in a dialog box), then execute an operation. * When an HTTP request arrives, then read the body, process it, and return an HTTP response with the results * When a sensor triggers an interrupt, then change a GUI readout from red to green * When the system sends a shutdown message, then close and flush all logs [Questions to consider] How do we choose a programming language? You're not just choosing a programming language, you're also implicitly deciding which subset of language features to use. This is predicated, of course, on knowing about these features. It's best to inform yourself about what your language/libraries/runtime (let's call it a software-development tool) can do for you -- or find someone who is well-informed to help. For each feature, you should ask yourself: how useful is it? Does it help me achieve my task? Let's take a look at high-level features of a software-development tool that may be important. Maintainability * By whom? What level of programmer? * How much will a certain part need to change? Comprehensibility * By whom? What level of programmer? * Are you applying the "rule of least power"? That is, are you using the simplest, most effective tool in your arsenal? To know this, you have to expand your arsenal ... and then use it judiciously * Low syntax noise Discoverability * Can you use the IDE and code-completion to learn the API? * How much documentation do you need? * How much on-boarding would a new developer need? * How easy would it be to hand off to another person or team? Learnability * Even if not especially discoverable, how learnable is the code? * Are the patterns consistent? * Is the API clear? Strictness / Correctness Error-free is strongly encouraged or enforced Complexity Inherent vs. 
Accidental Idiomatic Is it written in a way that other programmers of this language or framework can easily understand? Testability * Can you write semantically useful tests? * Is it easier to write a test than to debug? Debuggability * Can your debugger set a breakpoint on all of the important bits? * Can you even debug it? * Is there too much magic? * Is there so much generalization that you can't figure out what's going on? Observability * Is there sufficient logging to figure out what's going on without debugging? (This is important once the product is in the field.) * What about error-handling? Are problems separated from errors? [1] * Can the developer predict what's happening and measure it? * How's your telemetry? * Could the software be monitored if needed? Performance * Does it need to be fast? * All of it? Or just parts? For code designed to be reusable (libraries, frameworks), you can also consider: Completeness degree to which the definition/API captures all facets of the problem domain Expressiveness Concise and precise Flexibility Applicability to different problem domains Which of the features above matters more depends on what you're building. A one-off script doesn't need to satisfy many of these features. A full-blown application that needs to be maintained for 10-20 years by different teams has to be much, much more careful. [Other articles] This isn't the first time I've written about these ideas, so I've included links to other, similar articles below. [General Programming Practices] These articles discuss the topic of software-development on a similar level to the discussion above. * "Elegant Code vs.(?) Clean Code" * "Programming-language Features: How much is too much?" * "Applying the rule of least power in practice" * "Framework Design: Programming Paradigms and Idioms" The articles below are more recent, are more-or-less on the same level, but are also more targeted. 
* "How to evaluate dependencies" * "To crash or not to crash; that is the question" * "Stop trying so hard to use pattern-matching" * "The UI is an afterthought, a detail" * "On the usefulness of containers like Docker" * "The “Hustle” culture in Software Development" ["Encodo White papers archive"] These white papers were written from 2006 to 2019 when I was still employed at "Encodo Systems AG". They expand on recommended practices of specific facets of software development. They are presented in reverse-chronological order, but can be read in any order. * "DI, IOC and Containers" (2019) * "Clean and Safe Code" (2019) * "Testing as First-Class Citizen" (2019) * "Code Reviews" (2017) * "Document Everything" (2017) * "Continuous Integration and Delivery" (2017) * "Component-based Design" (2006) * "Design by Contract" (2006) * "Test-driven Development" (2006) [Videos] This is a YouTube playlist I've maintained for years that I continuously update whenever I watch a video that I think would be interesting for other developers. It's only technology videos, but it's pretty eclectic (i.e., it's language- and technology-agnostic). "Developer suggestions" [Conclusion] Pace yourself. You can't have everything all at once. Programming takes wisdom. Wisdom takes time. It takes practice. It comes, or it doesn't. It takes different forms. As "Rainer Maria Rilke wrote in 1903" [2], "Do not now search for the answers that cannot be given to you, because you would not be able to live them. And the point is to live everything. Live the questions now. Perhaps you will then gradually, without noticing it, live your way into the answer on some distant day." Good luck. -------------------------------------------------------------------------------- [1] See "The Error Model" by Joe Duffy. [2] h/t to Ömer for making me aware of this great piece of writing.

To crash or not to crash; that is the question

2023-10-04T21:54:06+02:00

Published by marco on 4. Oct 2023 21:54:06 ------------------------------------------------------------------------ Note: I found this old draft containing my response to a colleague. I 100% agree with you, in general. I absolutely want to know immediately when an assumption I've made does not hold. But...😁 The degree to which I'm willing to crash depends on whose consistency I'm basing my assumptions on. When I call a method in my code from another method in my code, I'm absolutely going to assert that an argument is not null. I can control that. My IDE will tell me when I might be passing null. That is definitely a programming error. When I'm getting external input (e.g. from the Windows registry), I'm a bit more cautious because I'm less sure about how solid my assumption is. I know what the documentation says but a lifetime of programming has taught me that some things (like the Windows registry) are going to work exactly as expected on my (modern) developer machine, but are going to fail mysteriously on a (perhaps less modern) machine in (for me) completely unpredictable ways. Therefore, I'm a bit careful about what I'm willing to pay to find errors. The primary purpose of a program is to bring value to the customer/user. I want to improve my program for more situations, but how am I going to find out in which situations it doesn't work? I can test, of course, but some things will only ever happen in the field. If it happens in the field, then I'm using the customer's/user's time to help me fix my program (they benefit, of course, but not for free). Can I soften the blow to the user of having to help me improve the program without sacrificing consistency or accuracy? Sometimes, the answer is a resounding no. The program absolutely cannot continue if, e.g., the reference to the data it needs to work on is null. That's a no-go. There's no rescuing the program from that or completing any other useful work. 
In the case of this tool, if it crashes, the user no longer gets a report. Would they have been able to get some of the report if it hadn't crashed? In this case, yes. All of the other checks could be run. The checks that crashed would show as "failed" with the exception message. That seems to me to be better than skipping all subsequent checks when one crashes. I can even continue to hope that the user then reports the mysterious error message they got for one of the checks! Hope dies last! I'm delighted to discuss programming and error-handling philosophy in person next week!
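The approach described above (keep running the remaining checks and show the crashed ones as "failed") can be sketched as follows. This is only an illustration in Swift under stated assumptions: `Check`, `CheckResult`, `CheckError` and `runAll` are hypothetical names, and the tool's real check API is not shown in this article.

```swift
// Hypothetical types for illustration only.
struct CheckError: Error, CustomStringConvertible {
    let description: String
}

struct Check {
    let name: String
    let run: () throws -> String
}

enum CheckResult: Equatable {
    case passed(String)
    case failed(String)
}

// Runs every check. A check that throws is recorded as failed, together with
// its error message, instead of aborting the rest of the report.
func runAll(_ checks: [Check]) -> [(name: String, result: CheckResult)] {
    var report: [(name: String, result: CheckResult)] = []
    for check in checks {
        do {
            let message = try check.run()
            report.append((name: check.name, result: .passed(message)))
        } catch {
            report.append((name: check.name, result: .failed("\(error)")))
        }
    }
    return report
}

let report = runAll([
    Check(name: "disk") { "ok" },
    Check(name: "registry") { throw CheckError(description: "key not found") },
    Check(name: "network") { "ok" },
])

for entry in report {
    print(entry.name, entry.result)
}
```

The "network" check still runs and passes even though "registry" failed before it. Truly unrecoverable situations, like the missing data reference mentioned above, would still warrant stopping the program.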

Encodo White Papers Archive

2023-10-04T21:36:27+02:00

Published by marco on 4. Oct 2023 21:36:27 ------------------------------------------------------------------------ This article is a copy of the white papers and process description that I wrote for Encodo Systems while I still worked there. I've preserved a copy of it here and in the linked articles. Through our many years of experience building software, we've accumulated methodologies and principles that lead to quality software. * The experience and know-how we bring to our consulting services also make us excellent partners in development-process consulting. * We have experience in reviewing existing software and proposing changes to improve its reliability and stability. * Not only can we build great software for you, we can help your development teams improve their current software and teach them how to write better software in the future. Listed below are our methodologies. "DI, IOC and Containers" (2019) Applications are graphs of components, each with dependencies and inputs and outputs. DI and IOC are a simple and powerful way of building testable, flexible and easily replaced components. "Clean and Safe Code" (2019) A large part of writing good, clean code is to restrict your options. Avoid nullable data. Avoid mutable data. Avoid state. Code is not a poem: it's actually preferable if it all looks the same and seems "easy" and "obvious". "Testing as First-Class Citizen" (2019) Use tests to execute your code. Automate them to guard against future changes. Make testing easy for developers by building nice integration suites. Tests are code. "Code Reviews" (2017) Finished code is reviewed before it is committed. Reviews make sure that the other parts of the design process were followed (documentation, tests, etc.) and that errors are fixed as quickly as possible. "Document Everything" (2017) Software without documentation is not maintainable. Good naming conventions, logical designs and well-built tests help make code self-documenting. 
High-level tutorials explain how the different components are intended to work together. "Continuous Integration and Delivery" (2017) Low-overhead releases lead to faster turnaround and higher quality. Automated (or nightly) builds pinpoint software errors before they can become problems. "Component-based Design" (2006) Components encapsulate specifications. Software is designed to balance reuse and abstraction against the project goals, all the while optimizing maintainability. "Design by Contract" (2006) The specifications are built right into the code. Software publishes its assumptions instead of leaving its contract implicit. "Test-driven Development" (2006) Tests are a central part of building software. All component features have application-independent tests which can be run automatically and throughout development.

Encodo White Papers: DI, IOC and Containers (2019)

2023-10-04T21:36:20+02:00

Published by marco on 4. Oct 2023 21:36:20 Updated by marco on 20. Apr 2025 12:59:29 ------------------------------------------------------------------------ Encodo keeps the "SOLID" principles in mind when designing software. [DI & IOC] We implement the Inversion of Control (I) pattern with the dependency-injection pattern (D) to allow for a large amount of flexibility in how an application is composed. We've applied this principle throughout the Quino framework and use it in our products as well. What does this mean? It means that the product or framework doesn't make any decisions about which exact components to use. Instead, it indicates the API Surface (interface) that it expects in the form of injected components. That is, the responsibility for deciding which component to use lies not with the lowest level of the software stack, but with the highest level. This inversion means that the application entry point configures the object graph (i.e. which objects will be used). That makes it much easier to isolate and test individual components, especially where those components would depend on native- or web-only functionality in production. See the "How do I DI?" presentation from February 2018 for more information. [Principles] In order to make good use of this pattern, an application should adhere to the following rules: * Prefers composition over inheritance, exposing clear dependencies. * Refers to dependencies via interface or protocol types with as small a surface area as possible. * Obtains dependencies through injection, preferably via the constructor. Components built in this manner are agnostic in their implementation. They can be composed by an application as it sees fit. [Components] An application is a graph of components, each with one responsibility (S) and zero or more dependencies, injected via the constructor. Components are composed with other components to build higher-level functionality (O). 
They are also unaware of the other components' implementations and can be replaced with other implementations (L). Components make software flexible: * Products can replace any component without changing anything else * Products can inject any component without pulling in more functionality than needed [Declaration & Implementation] Components have a very clear purpose (S) indicated through an interface. In most cases, we use an actual "interface" language construct to clearly define the API surface and to not limit a product in its implementation (e.g. with an abstract base class). * Prefer composition to inheritance, exposing clear dependencies * Reference dependencies via interface (or type or protocol, depending on language) with as small a surface area as possible * Obtain dependencies through injection, preferably in the constructor Most components have a single method, amounting to a functional interface and allowing composition with lambdas. While TypeScript has this feature (as does Java), C# does not. We end up defining a lot of single-method classes that implement a single interface. It's more code than we'd like, but it's purely structural syntax and doesn't introduce additional complexity. See "Interfaces, base classes and virtual methods" in the Quino conceptual documentation for more information on, and examples of, the patterns that we use. [Containers] Although it's possible for applications to manually create an object graph (the composition root), we prefer to use an IOC Container. The container provides two services: * Registration: Applications declare the object or type and the lifetime (generally singleton) to use for interfaces * Requests: Applications request objects, which the IOC creates (injecting other registered objects, as necessary) or retrieves, depending on lifetime. A container can create transient objects even for unregistered types. 
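To illustrate the two services, here is a deliberately tiny container sketched in Swift. It is not the Quino container or any other real library: `TinyContainer`, `registerSingle`, `resolve`, `IClock` and `IGreeter` are names made up for this sketch, and only singleton lifetimes are supported.

```swift
// A minimal, illustrative container that supports singleton lifetimes only.
final class TinyContainer {
    private var factories: [ObjectIdentifier: (TinyContainer) -> Any] = [:]
    private var singletons: [ObjectIdentifier: Any] = [:]

    // Registration: declare how to build the object used for a given type.
    // A later registration for the same type replaces the earlier one.
    @discardableResult
    func registerSingle<T>(_ type: T.Type, _ factory: @escaping (TinyContainer) -> T) -> TinyContainer {
        let key = ObjectIdentifier(type)
        factories[key] = { container in factory(container) }
        singletons.removeValue(forKey: key)
        return self
    }

    // Request: create the object on first use, then return the cached instance.
    func resolve<T>(_ type: T.Type) -> T {
        let key = ObjectIdentifier(type)
        if let existing = singletons[key] as? T {
            return existing
        }
        guard let factory = factories[key], let created = factory(self) as? T else {
            fatalError("No registration for \(type)")
        }
        singletons[key] = created
        return created
    }
}

protocol IClock { func now() -> String }
struct FixedClock: IClock { func now() -> String { return "12:00" } }

protocol IGreeter { func greet() -> String }
final class Greeter: IGreeter {
    private let clock: IClock
    init(_ clock: IClock) { self.clock = clock }
    func greet() -> String { return "It is \(clock.now())" }
}

let container = TinyContainer()
    .registerSingle(IClock.self) { _ in FixedClock() }
    .registerSingle(IGreeter.self) { c in Greeter(c.resolve(IClock.self)) }

print(container.resolve(IGreeter.self).greet())
```

The caller asks only for `IGreeter`. Which clock the greeter receives is decided in the registration code, not by the greeter or by its caller.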
The container introduces the following restriction: * A concrete type may have many constructors, but only one may be public The lifetime of an application is as follows: * Collect registrations in the IOC Container * Create composition root with the IOC Container * Apply method to composition root See the "Quino Application Configuration" for more information about application lifecycle. The blog article "Starting up an application, in detail" is a bit older, but provides more detail on how Quino integrates the IOC into the startup. In the long example below, we will first look at how composition even without a container is very powerful. Then we'll look at how a container can improve on that. [Example] Although we generally use C# or TypeScript in our work, these examples were originally written to introduce Swift developers to an iOS framework that we wrote. [Step One: A limited robot simulator] Let's take a look at an example of an application that looks OK at first, but turns out not to be very flexible. Note: The example is small, so some of the steps will feel like over-engineering. It's a good point, but the principles shown here apply just as well to larger systems. The following example defines a simulator that can move a robot along a route, defined by movements. The robot starts at a given location and can travel at a fixed speed. 

    enum Direction {
        case north
        case south
        case east
        case west
    }

    struct Movement {
        let direction: Direction
        let distance: Int
    }

    struct Point {
        var x: Int
        var y: Int
    }

    class FastRobot {
        var speed = 2
        var location: Point = Point(x: 0, y: 0)
        let movements: [Movement] = [Movement(direction: .north, distance: 1)]

        func move() {
            for movement in movements {
                let distance = speed * movement.distance
                switch movement.direction {
                case .north: location.y += distance
                case .south: location.y -= distance
                case .east: location.x += distance
                case .west: location.x -= distance
                }
            }
        }
    }

    class Simulator {
        func run() {
            FastRobot().move()
        }
    }

As mentioned above, this implementation looks well-written, but what if we wanted to verify that the robot ended up at the right location? Let's try that below.

[Step Two: Running the limited robot]

    Simulator().run()
    // Now what?

It turns out that we can't test anything in this application. We can fix this by applying the patterns outlined in the first section.

[Step Three: Decouple the robot from the simulator]

First, let's tackle the Simulator interface:

    class Simulator {
        func run(robot: FastRobot) {
            robot.move()
        }
    }

    let robot = FastRobot()
    Simulator().run(robot: robot)

    XCTAssertEqual(robot.location.x, 0)
    XCTAssertEqual(robot.location.y, 2)

Now we can test that the robot is working as expected. The robot is still quite hard-coded, as is the simulator's relationship to the robot. The robot must be a FastRobot and it can only move along a fixed route.

[Step Four: Reduce the robot "surface"]

We'll first decouple the Simulator from a direct dependence on the FastRobot.

    protocol IRobot {
        func move()
    }

    class FastRobot : IRobot {
        // As above
    }

    class Simulator {
        func run(robot: IRobot) {
            robot.move()
        }
    }

Now the simulator only knows about the protocol IRobot, which has a very small surface area. It's still too small to be very useful.

[Step Five: Make the robot configurable]

Instead of hard-coding everything, we can compose the robot out of parts.
Examining the algorithm, we see three parts that could be externalized:

* The robot's speed is currently fixed. We could make a component that is responsible for calculating the speed of the robot.
* The robot's route is also fixed. We could make a component to represent the route as well.
* Finally, the robot's initial position is also fixed. We could make that configurable as well.

Let's first externalize all of the hard-coded values out of the FastRobot into a generic Robot class.

    class Robot : IRobot {
        let speed: Int
        var location: Point
        let movements: [Movement]

        init(speed: Int, location: Point, movements: [Movement]) {
            self.speed = speed
            self.location = location
            self.movements = movements
        }

        func move() {
            for movement in movements {
                let distance = speed * movement.distance
                switch movement.direction {
                case .north: location.y += distance
                case .south: location.y -= distance
                case .east: location.x += distance
                case .west: location.x -= distance
                }
            }
        }
    }

Now we can create a Robot, injecting all of the initial conditions.

    let origin = Point(x: 0, y: 0)
    let route = [Movement(direction: .north, distance: 1)]
    let robot = Robot(speed: 2, location: origin, movements: route)

    Simulator().run(robot: robot)

    XCTAssertEqual(robot.location.x, 0)
    XCTAssertEqual(robot.location.y, 2)

The same assertions hold as before, but the Robot class is much more generalized. We can now test the robot's movement algorithm with various combinations of origin, speed and route. At this point, we've made the robot and simulator composable and testable. Now we want to have a look at how we can separate the configuration from the usage.

[Using a container to build objects]

We're not nearly done, though. What does this all have to do with a service provider? That's where the inversion part comes in. In the very first example, the Simulator was responsible for creating the robot. This made it impossible to test whether the robot did what it was supposed to do.
So we passed the robot in as a parameter to run(), making the caller responsible for creating the robot instead of the Simulator. This is fine, as long as the caller is the top-level part of the program, responsible for composing the objects that will be used. However, what if the direct caller doesn't know how to do that? Or, put another way, what if the caller should not be doing that? What if the caller is a button handler in a UI? Would we want the button handler (or the UI that contains it) to be responsible for constructing the robot or its initial conditions? This is where the container comes in: we want to register all of the types and instances that we want to use in one place. This configuration can be retrieved at any later point without knowing any more than the interface that's required. This takes us full circle to the original code, except, instead of creating the Simulator directly, we want to get it from a container, called a provider in the following examples.

    let simulator = provider.resolve(ISimulator.self)
    simulator.run()

    let robot = provider.resolve(IRobot.self)

    XCTAssertEqual(robot.location.x, 0)
    XCTAssertEqual(robot.location.y, 2)

Note: For reasons of simplicity, we assume that all objects in the container are singletons.

[Step Six: Configure the container]

Let's take the configurable code above and translate it to a container. Here the registrar is the configurable part and the provider is the part that can be used to retrieve objects based on that configuration. The registrar is sometimes called the composition root. Note: We use the syntax for the Swift IOC, but the examples are hopefully clear enough in their intent. In the example below, we register singletons for each of the objects we want the container to be able to create: Point, Int, [Movement], IRobot and Simulator.

    let registrar = ServiceRegistrar()
        .registerSingle(Int.self) { _ in 2 }
        .registerSingle(Point.self) { _ in Point(x: 0, y: 0) }
        .registerSingle([Movement].self) { _ in [Movement(direction: .north, distance: 1)] }
        .registerSingle(IRobot.self) { p in Robot(
            speed: p.resolve(Int.self),
            location: p.resolve(Point.self),
            movements: p.resolve([Movement].self)
        )}
        .registerSingle(Simulator.self) { p in Simulator(p.resolve(IRobot.self)) }

This is a decent start, but many of the registrations above have no semantic meaning, like Int and Point and [Movement]. For these, it's better to use higher-level abstractions.

[Step Seven: using higher-level abstractions]

We need to define three abstractions (called IOrigin, IRoute and IEngine) with implementations. The IRobot interface also needs to be redesigned to use them.

    protocol IRoute {
        var movements: [Movement] { get }
    }

    protocol IOrigin {
        var point: Point { get }
    }

    protocol IEngine {
        var speed: Int { get }
    }

    // Assumption: the redesigned IRobot also exposes the location, so that
    // callers that only know the protocol can inspect the result.
    protocol IRobot {
        var location: Point { get }
        func move()
    }

    protocol ISimulator {
        func run()
    }

    class Simulator : ISimulator {
        var robot: IRobot

        init(_ robot: IRobot) {
            self.robot = robot
        }

        func run() {
            robot.move()
        }
    }

    struct StandardRoute : IRoute {
        var movements: [Movement] = [Movement(direction: .north, distance: 1)]
    }

    struct StandardOrigin : IOrigin {
        var point: Point = Point(x: 0, y: 0)
    }

    struct FastEngine : IEngine {
        var speed: Int = 2
    }

    class Robot : IRobot {
        var location: Point
        let engine: IEngine
        let route: IRoute

        init(_ engine: IEngine, _ origin: IOrigin, _ route: IRoute) {
            self.engine = engine
            self.route = route
            location = origin.point
        }

        func move() {
            for movement in route.movements {
                let distance = engine.speed * movement.distance
                switch movement.direction {
                case .north: location.y += distance
                case .south: location.y -= distance
                case .east: location.x += distance
                case .west: location.x -= distance
                }
            }
        }
    }

We've created concrete objects for our standard parameters.
An added bonus of the improved semantics is that we can rewrite the init for Robot so that it no longer expects argument labels, because the parameters are now clear without further explanation.

Now we can take another crack at the configuration using these new types. This time, we'll define an extension of the IServiceRegistrar that we can use again below.

    extension IServiceRegistrar {
      func useSimulator() -> IServiceRegistrar {
        return self
          .registerSingle(IEngine.self) { _ in FastEngine() }
          .registerSingle(IOrigin.self) { _ in StandardOrigin() }
          .registerSingle(IRoute.self) { _ in StandardRoute() }
          .registerSingle(IRobot.self) { p in Robot(
            p.resolve(IEngine.self),
            p.resolve(IOrigin.self),
            p.resolve(IRoute.self)
          )}
          .registerSingle(ISimulator.self) { p in Simulator(p.resolve(IRobot.self)) }
      }
    }

We've now configured a system that knows how to create our simulator along with all of its dependencies. You can see that if the ISimulator type is resolved from the container, it will,

1. create a Simulator, which
2. resolves the IRobot, which
3. resolves the IEngine, IOrigin and IRoute

[Step Eight: Changing the speed]

An application can now change the speed of the robot without knowing anything else about the simulator, simply by changing the IEngine that's used.

    class SlowEngine : IEngine {
      var speed: Int = 1
    }

    let provider = ServiceRegistrar()
      .useSimulator()
      .registerSingle(IEngine.self) { _ in SlowEngine() }
      .commit()

As well, any location in the application can either use the IRobot or the ISimulator without having to know anything about how either of the concrete objects is constructed. The simulator might be much more complicated than the very simple one defined above. The robot might do much more when asked to move.

[Step Nine: Using a factory]

What if we wanted to let the robot decide how fast it is, depending on what kind of robot it is? Or what if we want to separate the speed from being fixed in the IEngine?
What we need is a way to create transient objects that require parameters that are not available in the provider. These are types like Int, String, etc., as we had in "Step Six" above. The example below shows a very simple usage of the factory pattern.

Instead of having a single IEngine for the whole application, we want to provide settings that the robot uses to get its engine. The code below sketches the new types and shows how the robot would use them.

    protocol IEngineFactory {
      func createEngine(speed: Int) -> IEngine
    }

    // Class-bound so that the resolved singleton can be modified through a "let" reference
    protocol IRobotSettings : AnyObject {
      var speed: Int { get set }
    }

    class Robot : IRobot {
      init(_ engineFactory: IEngineFactory, _ settings: IRobotSettings, _ origin: IOrigin, _ route: IRoute) {
        self.engine = engineFactory.createEngine(speed: settings.speed)
        // ...
      }
    }

You'll note that we didn't declare any new properties. The robot still just has an engine, but asks the factory to create it based on a speed, rather than having the provider inject its singleton.

The robot's speed can now be configured without replacing the entire implementation.

    let settings = provider.resolve(IRobotSettings.self)
    settings.speed = 10

    let simulator = provider.resolve(ISimulator.self)
    simulator.run()

    let robot = provider.resolve(IRobot.self)
    XCTAssertEqual(robot.location.x, 0)
    XCTAssertEqual(robot.location.y, 10)
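For readers who don't work in Swift, the registrar/provider/factory interplay can be sketched in TypeScript. This is only a minimal illustration under stated assumptions, not the Swift IOC library: Token, Registrar, Provider and the simplified Robot below are hypothetical names invented for the sketch. As in the steps above, every registration is treated as a lazily created singleton and a later registration replaces an earlier one.

```typescript
// Hypothetical minimal container; all names here are assumptions for illustration.
type Factory<T> = (p: Provider) => T;

// A token identifies a registration and carries the resolved type.
interface Token<T> { readonly name: string; readonly _type?: T }
const token = <T>(name: string): Token<T> => ({ name });

class Provider {
  private readonly instances = new Map<string, unknown>();
  constructor(private readonly factories: Map<string, Factory<unknown>>) {}

  // Every registration is a singleton, created on first use.
  resolve<T>(t: Token<T>): T {
    if (!this.instances.has(t.name)) {
      const factory = this.factories.get(t.name);
      if (!factory) throw new Error(`Nothing registered for ${t.name}`);
      this.instances.set(t.name, factory(this));
    }
    return this.instances.get(t.name) as T;
  }
}

class Registrar {
  private readonly factories = new Map<string, Factory<unknown>>();

  // A later registration for the same token replaces the earlier one.
  registerSingle<T>(t: Token<T>, factory: Factory<T>): this {
    this.factories.set(t.name, factory);
    return this;
  }

  commit(): Provider {
    return new Provider(new Map(this.factories));
  }
}

interface Engine { readonly speed: number }
interface EngineFactory { createEngine(speed: number): Engine }
interface RobotSettings { speed: number }

// Simplified robot: it only moves north, which is enough to show the factory.
class Robot {
  y = 0;
  private readonly engine: Engine;

  constructor(engineFactory: EngineFactory, settings: RobotSettings) {
    this.engine = engineFactory.createEngine(settings.speed);
  }

  moveNorth(distance: number): void {
    this.y += this.engine.speed * distance;
  }
}

const ENGINE_FACTORY = token<EngineFactory>("EngineFactory");
const SETTINGS = token<RobotSettings>("RobotSettings");
const ROBOT = token<Robot>("Robot");

const provider = new Registrar()
  .registerSingle(ENGINE_FACTORY, () => ({ createEngine: (speed: number) => ({ speed }) }))
  .registerSingle(SETTINGS, () => ({ speed: 2 }))
  .registerSingle(ROBOT, (p) => new Robot(p.resolve(ENGINE_FACTORY), p.resolve(SETTINGS)))
  .commit();

// Configure the settings before the robot is first resolved.
provider.resolve(SETTINGS).speed = 10;
const robot = provider.resolve(ROBOT);
robot.moveNorth(1);
```

Because the robot is created lazily, the settings have to be changed before the robot is first resolved. A real container would offer more lifetimes than this sketch does.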

Encodo White Papers: Clean and Safe Code (2019)

2023-10-04T21:36:13+02:00

Published by marco on 4. Oct 2023 21:36:13 ------------------------------------------------------------------------ These are the two core principles that guide how we write code: * KISS: Keep It Simple, Stupid * YAGNI: You Ain't Gonna Need It [KISS] This first principle is a constant reminder to ourselves to avoid the seductive call of cleverness. Most code does not need to be clever. Very occasionally, it is necessary to implement something with real flair, that requires explanation. The best code, though, requires no explanation. The best code gets its job done in a very boring way, using the same patterns to achieve different ends. The best code is instantly recognizable to those who know the patterns. The best code doesn't raise any questions. The best code doesn't need comments. The best code is obvious and, yet, does amazing things—like fulfill requirements in a stable, predictable, testable, customizable and high-performance manner. It's kind of obvious: The lower the complexity, the easier it is to reason about systems. The easier it is to reason about a system, the easier it is to prove that either certain things can't happen or will always happen. It should be obvious where to add a customization—because there's only one place that it could logically go. It should be obvious where a bug lies—because there's only one place it could have originated. The best code is readable and understandable not only by the original programmer, but also by another programmer—even if that's the original programmer, six months later. [YAGNI] We'd be lying if we said that we never write code that we don't need, but we keep this principle in mind whenever we build code. There's a bit more wiggle room when building frameworks vs. products. It's easier to determine whether a feature is appropriate for a product than to do the same for a framework. Who knows how a framework might be used? Encodo does have a framework named Quino. 
The point of a framework is to support the development of products that use it. It's not easy to predict what those products might need, even when you're focused only on features that your framework is supposed to provide. However, a framework or library has a purpose and it shouldn't stray from it. Just as an example: Does Quino provide a remote data driver? Yes, because products have used it and the feature fits into the strategy of metadata-supported data. Is there an XML transport protocol? No, because no-one needed it. Do we support any kind of object? Not out of the box, we don't. You can register your own converters, but it's not a generalized protocol. At the very least, we stay away from throwing in everything but the kitchen sink just in case a product that uses Quino might need it. Be prepared for anything, but build only what you need.

[Other Principles]

We apply the following principles to avoid unneeded complexity.

* Separate state from logic
* Use immutable data
* Use non-nullable references
* Avoid side-effects
* Compose functional components
* Use singletons

[Separate state from logic]

From the article "Why OO Sucks" by Joe Armstrong (inventor of Erlang). "State is the root of all evil. In particular, functions with side effects should be avoided." The sentiment in the title is a bit strong, but it's not unfair. OO programming mixes data with operations, leading to more complexity than required by the task. Most applications need some state. That state should be isolated from most components. State should be stored in dumb objects and passed around. A component without state is purely functional, drastically simplifying the things that could possibly happen to it. Its output is completely determined by its inputs. It does not introduce any threading issues beyond those inherent in its input.

[Use immutable data]

A component avoids a whole class of issues if it cannot make changes to the data that flows through it.
As with state, restrict mutability to only certain components. For example, transient objects like DTOs or ORM objects are mutable because it makes the program logic much more understandable. Another example is stateless singletons with configuration settings. Instead of using a single component with mutable properties, define the configuration in a settings component. This has several advantages:

* The settings component is a dumb "state" object (single responsibility)
* The service is a stateless singleton (single responsibility)
* A product can replace the service independently of the settings (and vice versa)
* Service implementations don't repeat boilerplate code to manage properties

[Use non-nullable references]

If references are guaranteed to be non-null, whole swaths of checking code fall away and make the component much simpler. As with immutability, there are far fewer possibilities of what can happen to non-nullable code. TypeScript supports a null-checking mode. C# supports one as well, starting with C# 8. For older versions of C#, use the JetBrains Annotations along with ReSharper to enable real-time/compile-time null-checking.

[Avoid side-effects]

A method should either change state or it should return data. This is the idea behind CQS (the Command-Query Separation principle). That said, we employ a weaker version where only visible state really counts. Techniques like lazy-initialization and caching retrieved data are generally OK. Technically, those behaviors have non-visible state in the sense that they affect performance, but are still OK if used carefully.

[Compose functional components]

We use C# and TypeScript (wonderful OO languages with strong functional support), but we're using less and less of what OO has to offer. Virtual methods are a code smell. Instead, use smaller, testable components with a single purpose. If it's easier to test, it's easier to replace where necessary. Smaller components are more focused and easier to replace without duplicating code.
If logic is separated from data, and services are injected or passed as parameters, then there is less and less need for base classes with many helper functions or virtual/protected methods.

[Use singletons]

If state just flows through a component, then that component can be a singleton, avoiding needless allocation. It's a lot easier to reason about an application that comprises a graph of singletons with transient data flowing through it. Inject factories to create transient services (e.g. a remote-method caller that captures state).

[Conclusion]

As you can see, we put a lot of thought and care into our development practices and patterns. We try really hard to work in a way that ends up with quality software: software that is stable, maintainable, extensible, testable and, most importantly, does what it's supposed to. For more information about specific development patterns, please see the "architecture section of the Quino conceptual documentation". There are sections on "interfaces, base classes and virtual methods", "providers", "tools & toolkits", "task-specific interfaces" and much more.
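The split between a dumb, immutable settings object and a stateless singleton service, as described under "Use immutable data" and "Use singletons", might look like the following TypeScript sketch. The names RetrySettings and RetryPlanner are hypothetical and chosen only for illustration; they are not part of Quino.

```typescript
// Hypothetical example; these names are illustrative and not part of Quino.

// The settings are a dumb, immutable "state" object.
interface RetrySettings {
  readonly maxAttempts: number;
  readonly delayMs: number;
}

// The service is stateless: its output is determined by its injected settings.
class RetryPlanner {
  constructor(private readonly settings: RetrySettings) {}

  // Returns the delay before each attempt without modifying any state.
  planDelays(): readonly number[] {
    return Array.from(
      { length: this.settings.maxAttempts },
      (_, attempt) => this.settings.delayMs * (attempt + 1)
    );
  }
}

const defaults: RetrySettings = Object.freeze({ maxAttempts: 3, delayMs: 100 });
const planner = new RetryPlanner(defaults);
const delays = planner.planDelays();

// A product replaces the settings without touching the service implementation.
const patientPlanner = new RetryPlanner({ ...defaults, maxAttempts: 5 });
```

Because the planner holds no mutable state, a single instance can safely be shared across the whole application.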

Encodo White Papers: Testing as first-class citizen (2019)

2023-10-04T21:36:07+02:00

Published by marco on 4. Oct 2023 21:36:07 ------------------------------------------------------------------------ Tests are code. Writing tests is not a "step"—it is part of writing the code itself. The component is nothing without its tests. It should be easy to verify any requirement with a test. The tests should tell the story of the requirements. A developer can test any component in isolation (unit testing) or can test the component in the constellation in which it normally exists (integration testing). [Tests are code] Just so we've said it: tests are not a place to use a different coding style or different coding practices than in "regular" code. Choose your frameworks wisely. It should be easy to write powerful, elegant and easily understood tests. Build your own support code and libraries where needed. Apply the same coding principles as you would with the code being tested. You have to maintain testing code just like any other code. We discuss below that we prefer integration tests to unit tests—that only works if you provide a way to write high-performance integrated tests without repeating a lot of code. [Unit Tests and Mocking] Unit tests are very easy to write for properly written components. With a proper infrastructure, such tests can just as easily be executed in an integrated environment. In such cases, there is generally no need to invest time (and incur maintenance debt) writing two sets of tests. Automated tests will sometimes replace components and dependencies with fake or mocked objects, in order to isolate and test only a component's logic without incurring the costs of configuring and setting up unrelated components. If integration testing is too complicated or too slow, then a web of unit tests may suffice. In most cases, though, this doesn't apply and we avoid mocking entirely and test components directly in common, integrated settings. 
For example, if a component is commonly used as part of a database-based application, then it is more effective to test that component in such a scenario, rather than expending effort in isolating the component in order to have a "true" unit test. With only unit tests, there is a danger that the component works, but only as tested, not as actually used.

[Testing configuration]

Often, these problems arise in component configuration. A unit test will pass in carefully prepared (and sometimes faked) dependencies and run all-green. However, an integration test will check that the configuration code also works. That is, that the component is configured correctly for products that use it and not just in the tests that verify its behavior. Mocks and fakes must be used judiciously, otherwise you end up either testing only the mock or you end up hiding certain classes of problems, as discussed in more detail below.

[Missed opportunities]

Imagine a UI list that validates and saves entries when the focus changes. This list might work just fine in a test, where notifications and side-effects as a result of saving are disabled with mocks. This is no longer the real-world situation, though. What happens if one of the notifications would have led to a reload of the list or a state-change in one or more objects? What if the list only saves an object if it is marked as "changed", but a spurious event resets that status in integration? This kind of interaction, this kind of bug, represents exactly the kind of thing we would miss when testing the list in too isolated a manner. Because we've mocked away too much, because we focused too tightly on a unit test of the list, we've missed a bug that will come up in production instead.

[Writing tests]

While we don't practice strict TDD at Encodo, we do write tests from the very beginning. It's really the only way to test the code that you're writing, isn't it? What are you going to do instead?
Fire up the web server each time you want to throw data at a controller? Use a browser or Postman to fire those requests? Or are you starting a desktop UI and clicking around and typing? Or did you hack together a little console application in order to debug code? Stop doing all of those things. Use a testing environment instead, so your product acquires a growing stable of automated, repeatable regression tests. It will become second nature to write tests to verify requirements about the components you write. As we said above: the tests are part of the component. [Coverage] A point made above is that unit tests are useful but they're often not complete. Unit tests can fool you with excellent syntactic coverage but sub-standard functional coverage. We have many tools to measure the former, but only experience to measure the latter. Sure, you've covered all of the lines, but did you actually choose a representative set of inputs? Are you making the right assertions? Did you actually test the requirements? [Expectations] One technique that we use a lot is expectation files (called snapshots in some frameworks). Instead of writing several (sometimes, dozens of) assertions, we format output to text and then compare it against the text produced by the previous, presumably correct test run. The idea is to detect when something has changed. We use this in Quino to verify log output during certain operations, or to verify queries or generated SQL or model structure or lists of data. Expectation files increase the depth and robustness of tests while at the same time making it extremely efficient to write and maintain such tests. An expectation (or snapshot) is updated automatically when it changes and shows up as a difference in source control. If the change is expected, the developer commits it. [Experience counts] It takes a lot of experience to write just the right number and kind of tests. 
You don't want to write too many tests: it's code you have to maintain, after all. Also, it can be confusing when the same problem crops up in multiple places in different fixtures. Some components should have unit tests as well as integration tests. For other components, unit tests are redundant because the integration tests cover everything already. Experience guides you in deciding what to write first, what to keep, and what to throw away. [Too many tests?] It is possible to have too many tests. If you're not aware in which layer your code resides, you might end up running the same code in multiple scenarios, when that component behaves the same regardless. For example, if you're testing how expressions are mapped to a database, then that test should definitely run against every supported database. If you're testing how a high-level query composes those expressions before they get to the mapper, then you only really need to run it against one database in integration. [Conclusion] No-one wants to admit to releasing untested software. And no-one really wants to do manual testing. Automating tests reduces turnaround time for changes and enhancements. It also increases confidence for quick turnarounds when going to manual testing or production. Unit tests are good, but prefer coverage in integration tests so that you have the best guarantee that your tests are running your code in a way that emulates the production environment as closely as possible.
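The expectation (snapshot) technique described under "Expectations" can be sketched in a few lines of TypeScript. This is an illustrative sketch rather than the mechanism used in Quino: ExpectationStore and formatUsers are hypothetical names, and the store keeps snapshots in memory where a real one would write files that show up as differences in source control.

```typescript
// Hypothetical in-memory store; a real one would persist files under source control.
class ExpectationStore {
  private readonly snapshots = new Map<string, string>();

  // Compares the output with the previous run and then records the new output,
  // so that a change surfaces as a difference for the developer to review.
  verify(name: string, actual: string): "created" | "match" | "changed" {
    const previous = this.snapshots.get(name);
    this.snapshots.set(name, actual);
    if (previous === undefined) return "created";
    return previous === actual ? "match" : "changed";
  }
}

interface UserRecord { name: string; isAdmin: boolean }

// Format the output deterministically (sorted, one line per item) so that
// only real changes produce a difference.
function formatUsers(users: readonly UserRecord[]): string {
  return [...users]
    .sort((a, b) => a.name.localeCompare(b.name))
    .map((u) => `${u.name} admin=${u.isAdmin}`)
    .join("\n");
}

const store = new ExpectationStore();
const users: UserRecord[] = [
  { name: "bob", isAdmin: false },
  { name: "alice", isAdmin: true },
];

const firstRun = store.verify("users", formatUsers(users));
const secondRun = store.verify("users", formatUsers(users));
users[0].isAdmin = true;
const thirdRun = store.verify("users", formatUsers(users));
```

One formatted comparison stands in for what would otherwise be several separate assertions on names, order and flags.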

Encodo White Papers: Document Everything (2017)

2023-10-04T21:36:01+02:00

Published by marco on 4. Oct 2023 21:36:01 ------------------------------------------------------------------------ Good documentation is part of every piece of quality software. What do we mean by "good", though? Or "documentation", for that matter? Quality software should be self-explanatory, but don't be fooled into thinking that you don't need to write documentation. [Actors & Use Cases] Documentation has an audience. Before writing anything, consider who you're writing it for. What are the possible audiences? [Evaluators] Evaluators are interested in what your software does, how it interacts with other software, its performance characteristics, system requirements, the product roadmap, open issues and so on. If you don't document your software sufficiently, an evaluator won't purchase it in the first place. By "purchase", we mean that an evaluator will decide to use your software. This applies not only to commercial projects, but also to open-source freeware or even internal company software, be it a potentially time-saving Excel spreadsheet, a set of common UI or server components or an enterprise-wide multi-tier application. [Installers] Installers are interested in the basic installation options/paths and how to get from purchase/download to running. Here you need to find a balance between getting them up and running quickly, but also informing them that there is more to your product than just the standard rollout. They need to know that they can get set up efficiently but also that they're not locked in to a single way of doing things (unless that's what you're selling). [Customizers] Customizers are advanced installers: they want to know how to tweak or customize an installation to meet their special needs. These are often the same people as installers, but [New Users] New users are going to use installed/customized software. They want to not only know what your software does, but how they can use it for these standard tasks. 
They are interested in underlying concepts in both the application domain and the user experience. They need both introductory and high-level documentation, with meticulous, step-by-step instructions. These users are likely to navigate documentation in a progressive manner, reading from beginning to end.

[Everyday/experienced Users]

Everyday/experienced users aren't generally interested in introductory documentation. They are interested in how to become more efficient with your software. They will jump around in the documentation, using a search function to find what they need.

[Extenders]

Extenders are users (usually developers) who will be using your software as a building block, integrating it with other software or extending it to meet their needs. These users are interested in command-line options as well as descriptions of available APIs. If the API surface is larger, then functionality should be grouped and examples included to demonstrate how to use the various calls in common workflows.

[Developers]

Last but not least, you have to document for developers. That means writing your code and documenting it in a way that is understandable not only to you but also to other members of your team. Future members of your team will also need to get up to speed. As is often the case, you yourself will be one of those future developers, when you come back to a project or product after a longer absence. Your future you will definitely thank you for leaving well-documented clues.

[How to get started]

Wow! That seems like quite a lot of documentation to write. It is. As with anything else, you'll have to prioritize. We can make a list of the various documentation types we have at our disposal and identify the actors that would use them.
* Description/purpose (evaluators)
* System requirements (evaluators, installers)
* Readme (evaluators, installers, customizers, extenders)
* Concepts (all users)
* Tutorials (users, extenders)
* Training Videos (users, extenders)
* Error messages (users, developers, extenders)
* Command-line help (installers, customizers, extenders)
* Code documentation (developers, extenders)
* Log messages (developers, extenders)
* Issues/stories/bugs/todos/roadmap (evaluators, developers, extenders)

As you can see, we consider anything that helps actors to understand the software to be documentation. That means that writing useful error and logging messages is also an important way of documenting the product. Similarly, a clearly defined roadmap with stories/bugs/todos provides context for evaluators and developers. All of these forms of documenting a product can save everyone a lot of time, money and confusion by offering context-sensitive documentation right where it's needed.

This extends to everything in your software or product: the best documentation is a good design. If the UX is more intuitive or command-line help is clear or the APIs are consistent and well-organized, that can go a very long way already. There is less need for extensive tutorials explaining each and every task when the product documents itself. For example, if you name an API getUsers() and an input variable includeAdministratorUsers, then you don't need to write much more than "Gets a list of users, optionally including administrators."

For those reasons and many others, we recommend getting started early with documentation. If you look at the list above, that's kind of obvious advice.

* If you don't know the purpose or the concepts, then you shouldn't be building the product yet.
* You need to document how to set up and install the product or your own developers won't be able to work.
* If error and log messages are obscure, your developers and supporters will lose a lot of time and money on issues and bugs. Most importantly, the simple act of trying to describe what you are making will lead to a better product. You'll often find that, as you document, you'll notice things that could be done better or more intuitively or more consistently or more easily. The simple act of trying to explain what you're making leads to a better product. If you find it relatively quick and easy to write documentation, then there's a good chance that you've managed to build quality software.

Encodo White Papers: Continuous Integration and Delivery (2017)

2023-10-04T21:35:54+02:00

Published by marco on 4. Oct 2023 21:35:54 ------------------------------------------------------------------------ An important part of the software process is the final step: delivery. If you can't get your software into your customer's hands, then what's the point of writing it at all? [Goals] There are several at-times cross-cutting goals. In descending order of importance, they are: * Improve reliability and quality of releases * Improve efficiency of the release process * Improve testing feedback loop * Improve efficiency of the development process [Definition] There are several aspects to continuous integration and delivery: * Build: create testable artifacts * Test: execute automated tests on a clean machine, separate from any developer's environment * Package: create deployable artifacts (may be same as build output) * Deploy: deploy artifacts to target environments (e.g. Dev, Staging, UAT or Production) [Benefits] As expected, working in an organized manner with increased automation has clear benefits. * An excellent protocol of which software version contains which changes * Automated and centralized archiving of versions * Improved code & configuration quality as all code is tested on a non-developer machine * Practice makes perfect: if you're delivering during the entire life of the project, then the final delivery is much more predictable and far less stressful [Limitations] There are obviously limitations as well. The most immediate one is infrastructure investment: you have to set up build servers or purchase them in the cloud. You also have to make your process work with automated builds and possibly retrain personnel to work with it. You have to plan your project and you have to have patience on the part of all stakeholders. You have to train everyone on the team to not even consider releasing a version of the software from a developer PC. Setup and maintenance of build agents takes time and effort, especially over longer periods of time. 
Operating systems are upgraded, core components changed, build systems upgraded. All of these things will cause the build to fail on a given agent, even though nothing is actually wrong with the product. Here again, though, the agent will act as a canary in the coalmine for your development team. More often than not, the build-server failure will alert the team to avoid a feature that would have otherwise cost them time to integrate before it's ready.

[Deployment types]

The type of deployment depends on the product. For desktop software, you need to build an installer or a compressed archive that users can execute and install. Mobile or UWP applications must be built and then delivered to app stores for installation. Web servers and sites can be deployed directly to in-house servers or into the cloud (e.g. AWS or Azure). These deployment types are for the end users, but there are many more releases than that. Developers need to test their changes locally. Testers need to get these versions in order to provide feedback in a timely manner. We think of all of these releases as part of the build infrastructure, not just the continuous-integration server delivering an end-product.

[Requirements]

* Clean, predictable versioning: preferably semantic versioning, which means that you can support alpha, beta, RC and other pre-release versions
* Scripted packaging: there can be no manual steps in the entire release process
* Infrastructure: One or more agents for executing builds, either hosted locally or in the cloud
* Knowhow: Knowledge of how to configure builds and deployments, preferably distributed among multiple team members (even if access is limited to IT)

[Experience]

At Encodo, we have experience with various systems for various types of software. We started off using Jenkins but moved to JetBrains TeamCity several years ago. Web projects have their own packaging and testing mechanisms (e.g. WebPack, Mocha) that integrate into almost any build infrastructure.
We've also used Fastlane combined with Test Flight for mobile deployment. Our main expertise lies with configuration of .NET deployments paired with TeamCity.

[Recommendations]

* Use the same tools on the build server as your developers do. That is, if you use .NET with R# and the NUnit test runner, then use those same tools on your build server. In this case, TeamCity is a good fit for many of our projects.
* Avoid writing too many custom scripts for the build server. The build server will need to perform some extra tasks (like clearing databases), but make sure that those scripts are in the code repository and can be executed and tested locally as well. This decreases debugging time in the CI environment.
* If you do have to write scripts for the build server, consider whether you can use the same scripts on local developer machines. For example, Encodo uses a lot of NAnt scripts to clean, build, deploy and package solutions. We use those scripts locally as well as on the build server. This increases the likelihood that an issue with the scripts will be detected locally rather than only on the build server (where it's generally more difficult and time-consuming to address).

Encodo White Papers: Design by Contract (2006)

2023-10-04T21:35:48+02:00

Published by marco on 4. Oct 2023 21:35:48
------------------------------------------------------------------------

Design by Contract is a software engineering practice in which software requirements and promises - the "Contract" - are explicitly written into the code. The code is, at the same time, better documented, more reliable and easier to test against. Encodo uses this technique to ensure software quality.

[A brief overview of contracts]

A software contract is composed of several components: preconditions, postconditions and invariants. Preconditions are what a component requires of a client, whereas postconditions are what a component guarantees to a client. In object-oriented programming, these contracts are attached to method calls in a class. Invariants are a list of conditions that must always be true for software. An invariant is typically attached directly to a class; the runtime checks the class invariant when entering and exiting a method call.

Popular programming languages, like Java, C#, Delphi Pascal and others, lack the language constructs needed to express these contracts. However, these languages contain assertion constructs, which allow one to roughly describe the contracts. The section on emulating contracts in other languages shows the most common technique.

Eiffel is a language whose inventor, Bertrand Meyer, pioneered Design by Contract. It includes rich support for expressing contracts, is similar to Pascal in syntax and will be used for the examples below. The FAQ offers more information on why we chose Eiffel for our examples.

[Using contracts]

The best way to show how the use of contracts affects software is with an example. Imagine a database connection class with a method Open. This opens a connection to the database, allocating resources for it and failing if the request is refused.
----------------------------------
| Listing 1 - Initial definition |
----------------------------------
Open is
  do
    -- Execute code to open the connection here
  end
----------------------------------

Any procedural programming language is capable of formulating the code above. However, what happens if Open is called twice in a row on the same connection? One way to handle this is to simply ignore subsequent calls to Open.

-----------------------------------------
| Listing 2 - Ignoring subsequent calls |
-----------------------------------------
Open is
  do
    if not IsOpen then
      -- Execute code to open the connection here
    end
  end
-----------------------------------------

This is not optimal, for several reasons:

* Clients that misbehave by repeatedly calling a powerful function like Open will never know they are doing so.
* Clients that lack the original source will have no idea that the check is already made and will check again, needlessly muddying their code and wasting performance.
* The function fails to open the connection silently, which is an extremely dangerous way of responding to a non-standard condition.

Another way to respond is to accept that this might happen, but to make it non-silent by logging the occurrence to some sort of logging mechanism.

----------------------------------------
| Listing 3 - Logging subsequent calls |
----------------------------------------
Open is
  do
    if not IsOpen then
      -- Execute code to open the connection here
    else
      -- Log a warning
    end
  end
----------------------------------------

This is slightly better and an entirely appropriate solution in some cases. However, the connection is quite a low-level component; it should not be responsible for deciding what to do about repeated calls to Open. We can use a contract to push the responsibility onto the client.
----------------------------
| Listing 4 - Precondition |
----------------------------
Open is
  require
    not IsOpen
  do
    -- Execute code to open the connection here
  end
----------------------------
The require clause contains optionally named boolean expressions. If one evaluates to false, a precondition violation is signaled. The violator can immediately be pinpointed and repaired to conform to the contract (by adding a check for IsOpen before calling Open). What are the benefits? * A client has a list of conditions that must be satisfied before calling a routine. The interface is clear. * A routine has a way of devolving responsibility for certain conditions onto its clients. The contract for this routine is not complete, as it has only published its requirements, but said nothing about guarantees. Given the name of the function, we would expect it to have the following postcondition:
-----------------------------
| Listing 5 - Postcondition |
-----------------------------
Open is
  require
    not IsOpen
  do
    -- Execute code to open the connection here
  ensure
    IsOpen
  end
-----------------------------
The function is now completely defined, having explicitly detailed its requirements and guarantees. The postcondition often looks quite superfluous: the code for opening the connection is right above it, isn't it? Not necessarily. If the function is deferred (abstract in Java and Pascal, virtual in C-style languages), the implementation is in a descendent. The pre- and postconditions apply to the redefinitions as well. This allows a base class to very precisely define its interface with other classes without making any decisions about implementation.
---------------------------------------
| Listing 6 - Deferred implementation |
---------------------------------------
Open is
  require
    not IsOpen
  deferred
  ensure
    IsOpen
  end
---------------------------------------
The precondition can only be expanded (weakened) in a descendent, whereas the postcondition can only be further constrained (strengthened).
That is, a descendent cannot define the precondition to be not IsOpen and DatabaseExists. A client with a reference to the ancestor class sees only the ancestor precondition and cannot be forced to conform to a contract defined in a descendent. Likewise, the postcondition cannot be redefined to be IsOpen or ActionFailed. The original interface has already decided that if the database cannot be opened, the implementation must raise an exception. A client with a reference to the ancestor class does not have access to the ActionFailed feature and cannot accept this as a valid postcondition. The descendent adjusts the contract in a function like this:
------------------------------------
| Listing 7 - Extending a contract |
------------------------------------
Open is
  require else
    AutoCloseIfOpened
  do
    -- Execute code to open the connection here
  ensure then
    not CompactOnOpen or DatabaseIsCompacted
  end
------------------------------------
This descendent has expanded the precondition to allow a call to Open either when IsOpen is false (inherited precondition) or when the AutoCloseIfOpened option has been set. Likewise, it has further constrained the postcondition to promise that, in addition to IsOpen being true (inherited postcondition), the database will be compacted if the CompactOnOpen option is set. [Emulating Contracts in other Languages] So, that's Eiffel. How can other languages express contracts without the proper language constructs? As mentioned above, almost all modern languages include an assert function, which accepts a boolean expression and raises an exception if it is false. This function can emulate pre- and postconditions, but class invariants are largely impractical in languages without some form of pre-processor (a search for Design by Contract in C++ turns up several such libraries).
Here's Listing 5 written in Delphi Pascal:
------------------------------------
| Listing 8 - Emulating a contract |
------------------------------------
procedure Open;
begin
  Assert( not IsOpen );
  // Execute code to open the connection here
  Assert( IsOpen );
end {Open};
------------------------------------
Note how the contract is expressed in the implementation body; this makes contract inheritance difficult. The following pattern illustrates a single level of contract inheritance (which prevents descendants from removing contracts by not calling inherited methods):
----------------------------------------------
| Listing 9 - Emulating Contract inheritance |
----------------------------------------------
procedure Open; // Not overridable
begin
  Assert( not IsOpen );
  DoOpen;
  Assert( IsOpen );
end;

procedure DoOpen; virtual; abstract;
----------------------------------------------
Under this pattern, descendants are required to implement DoOpen and cannot alter Open (Delphi methods are static by default - equivalent to final in Java, sealed in C# or frozen in Eiffel). There are naturally drawbacks to this approach, especially when compared to the rich contract syntax available in Eiffel, but the technique is sufficient for many of the desired contracts. See the further reading below to learn about using old in postconditions and expressing class invariants. [FAQ] [Question 1] "Why is there no try .. finally to ensure that the postcondition is checked in Listing 8?" A postcondition is only guaranteed when the function exits successfully. In the example, it is perfectly legitimate for Open to fail because of an external connection problem. The precondition only guarantees that the connection is not open, not that it can be opened. Such guarantees are useless because they involve performing the action in order to check that the action can be performed.
The function should raise an exception if it cannot open the connection, avoiding evaluation of the postcondition and resulting in an acceptable error condition. An implementation that fails silently will cause a postcondition violation, which is an unacceptable error condition. Using a try .. finally construct to force evaluation of the postcondition under all circumstances would result in both the desired error (connection could not be opened) and a postcondition violation, which is not correct. [Question 2] "What if there is an exit or return statement in Listing 8?" Question 1 proposed using a try .. finally construct to ensure that the postcondition was always executed. As you can see from the answer, this has undesirable side effects. The simple answer is not to use instructions that break the normal instruction flow (e.g. exit or break). The usefulness of such constructs is debatable and the drawbacks are high (especially, as shown above, when the instruction avoids checking contracts). This exposes the weakness of languages without explicit contract constructs: it requires discipline to avoid bad practices. Relying purely on discipline invites error. However, it is better than nothing at all. [Further Reading] * "Object-oriented Software Construction" * "Eiffel.com Presentations"
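For readers who work in neither Eiffel nor Delphi, the pattern from Listing 9 and the answer to Question 1 can be sketched in TypeScript. This is only a sketch under assumptions: the helpers precondition and postcondition, the ContractViolation class and the three connection classes are hypothetical names invented for illustration, and TypeScript has no final keyword, so leaving open alone in descendants is a convention here rather than something the compiler enforces.

```typescript
// Hypothetical error type used to signal a broken contract.
class ContractViolation extends Error {}

// Hypothetical stand-ins for Eiffel's require and ensure clauses.
function precondition(condition: boolean, label: string): void {
  if (!condition) throw new ContractViolation("Precondition violated: " + label);
}

function postcondition(condition: boolean, label: string): void {
  if (!condition) throw new ContractViolation("Postcondition violated: " + label);
}

abstract class DatabaseConnection {
  protected opened = false;

  isOpen(): boolean {
    return this.opened;
  }

  // Plays the role of the non-overridable Open in Listing 9.
  open(): void {
    precondition(!this.isOpen(), "not IsOpen");
    this.doOpen(); // an exception raised here skips the postcondition check
    postcondition(this.isOpen(), "IsOpen");
  }

  // Descendants supply the implementation, but not the contract.
  protected abstract doOpen(): void;
}

class WorkingConnection extends DatabaseConnection {
  protected doOpen(): void {
    this.opened = true;
  }
}

class UnreachableConnection extends DatabaseConnection {
  protected doOpen(): void {
    // Acceptable error condition: the failure is reported to the caller.
    throw new Error("connection refused");
  }
}

class SilentConnection extends DatabaseConnection {
  protected doOpen(): void {
    // Unacceptable: neither opens the connection nor raises an error.
  }
}
```

Calling open twice on a WorkingConnection raises the precondition violation, an UnreachableConnection reports only its own error, and a SilentConnection is caught by the postcondition, which matches the two error conditions distinguished in the answer to Question 1.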

Encodo White Papers: Code Reviews (2017)

2023-10-04T21:35:39+02:00

Published by marco on 4. Oct 2023 21:35:39 ------------------------------------------------------------------------ A healthy and active review culture is essential for any team interested in building quality software. At Encodo, we've been doing reviews for a long time. They've become an essential part of everything we do: * Analysis * Estimates * Design * Architecture * Coding * Style * Performance * Coverage * Security * Deployment * And more... [Definition] What we mean by review is not a formal process at all. It is simply that you prepare work you've done for an informal presentation to a team member. Explaining what you've done in a review is often a good way of collecting your thoughts—you should be able to explain what you've done. Getting a review from a colleague is an efficient and productive way of making sure you can do that. [Limitations] While there are many reasons to do reviews, we've also learned that reviews can't do everything. * They're not very efficient at distributing know-how. A review helps another, well-versed team member learn what you've done, but it won't help non-team members get on-board. * If the reviewer has to learn too much from the review, then they cannot perform a useful review. They can be, at best, a sounding board. * For similar reasons, a review is not a good way of mentoring junior team members. [Benefits] * The quality of anything that you expect to review with someone automatically increases. Just knowing that you will have to tell someone what you did and why increases the likelihood that you think about the solution clearly and consider potential questions. * A major benefit of getting a review is that it makes you check and prepare your own work. You'll often notice that there's still work to be done all on your own—even if the reviewer doesn't say one word. * When you've spent a long time on a problem, a fresh perspective as offered by a reviewer may offer an alternative solution. 
Often we're so deeply involved in a solution that we don't notice when we've diverged from the requirements and are just making more work for ourselves. * When applied to very early stages, like analysis or estimates, the savings in time, money and manpower can be enormous. [Scheduling] It's important to get reviews often enough to avoid wasting time and effort but not so often that your work or the reviewer's work grinds to a halt. It's all about balance. A good rule of thumb is about one review per task. If your task is longer than a day, then think about how to break up that work into phases in order to get a review of earlier phases. That way, you're more likely to catch issues before building on top of mistakes. [Synchronous versus Asynchronous] Encodo prefers live, face-to-face reviews. This is the most efficient manner of reviewing as neither party has to prepare anything other than the work to be reviewed. Issues that come up can often be handled immediately—and such issues are far more likely to be mentioned and fixed. While in-person reviews are superior, video-chat/shared-desktop reviews work quite well, too. If that's not possible, then we have also used tool-based, asynchronous reviews, such as pull requests with review software. However, we find these to be not only less efficient but also less likely to find as many issues. With a live code review, it's relatively easy to ask the submitter to reorder, split or squash commits. It's also easier to point out and quickly fix stylistic issues (like naming or interface usage, etc.). Because the turnaround time is much faster, a reviewer is far more likely to point out smaller fixes that would improve code quality, maintainability and so on. However, in an asynchronous review, a reviewer must decide what is most important. Is it worth rejecting the whole pull request if it's 95% correct with a few details? Do you reject it and ask the submitter to fix up spacing or formatting or missing documentation? 
Do you really write down every last little thing you would have said? Do you reject it and hope that the submitter understands all of your notes? Or do you accept it and just fix those things up yourself? How many iterations do you go through? We prefer synchronous, face-to-face reviews because they're much more efficient. Misunderstandings can be cleared up quickly, iterating until the submitter and reviewer find a consensus. [Rules of Thumb] * A review should not consist of more than 200 lines of code * A review should not last longer than an hour * One reviewer is usually enough; if necessary, pull in at most two reviewers * For non-code, use your judgment to determine an appropriate amount of reviewable work [Getting Faster] We encourage reviews everywhere because we know how to make them faster. * Use your IDE to help submitters improve their code before they ask for a review. Formatting, code-style preferences and other issues can be eliminated from review if the tools enforce them. * Start by focusing on the changes made (diffs) and expand from there if further context is needed. If the reviewer was chosen well, then that context is often already available * Review related commits en masse rather than individually * Smaller, focused commits are much easier to explain and to review * Spend less time on commits that are purely refactoring (trust your tools and your tests) * Practice makes perfect! [Getting Better] Both the reviewer and the submitter need to practice. A reviewer should practice diplomacy and formulate critique in a way that it will be accepted. A submitter must keep an open mind and prepare good arguments or justification for the code. Both sides should stay positive. A review shouldn't be a competition: it's about producing high-quality code together, as a team. Encodo has done presentations on reviews, in both English and German.

Encodo White Papers: Component-based Design (2006)

2023-10-04T21:35:31+02:00

Published by marco on 4. Oct 2023 21:35:31 ------------------------------------------------------------------------ This article is part of an "archive of Encodo White Papers" . What is the best approach when designing a new application, be it a small tool or an end-user application? [Build a Prototype?] Many developers jump straight into a prototype, in order to get a feel for how the application will work. While prototypes are good for demonstrations, they are dangerous: in projects with tight time or budget constraints, the temptation to simply "build out" the prototype becomes irresistible. This leads to applications with nice user interfaces (hereafter called UI), but inflexible and difficult-to-follow implementations. [Build Components] A better first step is to list the requirements and assign them to possible components. This doesn't have to be a long or complete evaluation of the requirements; a few minutes is enough to come up with enough ideas to get started coding. These non-UI components are a natural fit for testing environments and are more likely to define a clean, sensible API (Application Programming Interface). Once the core logic has been built and tested, a prototype can easily be built on top of it. To summarize, the component-based approach is important for the following reasons: * It maintains business logic in clearly-defined units * It improves testability of business logic * It improves portability and reuse * It avoids spreading core logic throughout event handlers, which is a common practice in RAD (Rapid Application Development) environments [RAD considered harmful] A good UI library is a wonderful thing, allowing clean-looking, well-integrated applications to be built in a very short time. However, the allure of this style of programming is dangerous, as it quickly leads to applications without a clearly defined API, which leads to extensibility and maintenance issues. 
These systems entice programmers into working "backwards", building their application logic around events generated by the UI. The first generation of RAD environments were notorious for mixing UI and business code. The latest generations make use of libraries with "code-behind" built right in, automatically supporting core/UI separation in both web and classic UI applications. This separation of core logic and UI events is commonly called the MVC or Model-View-Controller pattern. [What is MVC?] MVC is the official name for the technique described above, in which functionality is contained in a model (M), which communicates state changes to a view (V) through some form of update mechanism. The controller (C) represents user input and applies changes to the model. In many UI libraries, the view and controller layers are merged, making it much easier to apply the pattern to smaller projects. View components are typically bound to model components using the Observer pattern: the view "listens" for changes in the model and reacts accordingly. [Designing with Components] Consider a tool which processes text files and generates output of some kind (perhaps PDF or CSV). The actual task doesn't matter - this is the kind of tool that is often written in a seat-of-the-pants fashion, with the excuse that it is "faster" to get it done this way. Let's take a component-based approach and see what we get. What are the components of the system?
* Transformer - Takes an input, applies one or more actions and generates an output * Actions - Perform an operation on data * Importers - Readers for various input formats; convert to a format the transformer understands * Exporters - Writers for various output formats; convert from the transformer format * Plugin Registry - Registration for recognized input and output formats * Options/Preferences - Global options for the system [Analyzing the Component-based Design] This list took only a few minutes to write and could have been written by anyone familiar with the project. The list contains only domain knowledge; there is no implementation-specific data. Having written down the requirements, we see that there is a need for an internal data representation, which will be used by the importers, exporters and actions. This is a facet of the design that might have gone unnoticed during prototyping, but would have been expressed implicitly nonetheless. [Is it overdesigned?] The list of features above is not an "over-design", but rather an explicit expression of the specifications. While an implementation can avoid using importer, exporter and action components, these concepts are part of the design nonetheless: an implementation without them is simply more difficult to describe, understand and extend. With a little bit of thought, we have designed a system that will scale to multiple import and export formats and even support multiple transformations. Writing the application in this way may involve marginally more initial work, but will result in a far more testable, extensible and reusable framework, decreasing maintenance and support time. [Does it slow development?] Another popular argument is the perceived reduction in programming efficiency. Applications or tools of the "throwaway" kind will take longer to develop when using a clean programming model.
Whereas that may be true in the very short term, the majority of an application's life span is spent in support and maintenance, which takes more time and energy if the application is poorly designed. Though a throwaway prototype may be available marginally quicker, it will be of poorer quality. In addition, subsequent applications cannot benefit from its code. The biggest loss comes in the form of functionality, improvements or bug fixes which are never even attempted because the code is not in a maintainable or testable state. [How does the UI work?] Realization of this design at the core level is not so difficult. Even though the application initially only has one importer and one exporter, it doesn't take much more to define an API that supports multiple plugins. Writing the tests for these components is likewise trivial. The opposite is true in the UI: building an interface to manage and configure all of the functionality that was easily written into the model is prohibitive. There is no reason, however, that the UI has to express all of the details of the underlying model; the application, as specified, need only expose enough functionality in the UI to be able to import and export. The UI stays remarkably simple, but can be easily and quickly extended to offer more features, if desired. Since the model has automatic tests, it can be assumed to be stable and it is easier to accurately estimate the time required to build the new GUI elements. [Analyzing the Prototype-based Design] The standard, quick-prototyping approach would have started coding a main form with some input fields, building the transformation code directly into the form itself. Options and preferences would have likely been encapsulated with a few controls on the main form, which, in turn, would have been responsible for loading and storing them. The design sketched above would be expressed implicitly and partially, at best. 
An application written without these concepts in mind will not be worth refactoring. If the code is re-used at all, it is typically copied to a new project and modified there, resulting in multiple copies of nearly the same code. Fixes and enhancements to one will not necessarily appear in the other. A prototype that is considered "throwaway", but grows into an application, does not benefit from any of the following: * It is easier to document the clearly defined API of a model; good documentation allows multiple developers to support or upgrade the application * Reuse across multiple applications * It is far easier to refactor and repurpose model code that lends itself to test-driven development [Extending the Application] It's obvious from the design above that it can be extended to support multiple importers, exporters and actions. The initial application was assumed to be a GUI which did not expose all of the functionality available in the model. The GUI can be made more powerful, exposing more of the underlying functionality. The extensibility of the design is clear. What about reuse? [A command-line version] The examples below are in Delphi Pascal. In a traditional prototype, command-line support is bolted on to the same application, because the required code is buried in UI structures. Such a command line application will involve something like:
-------------------------------------------
| Listing 1 - Hacking the GUI Application |
-------------------------------------------
if command = 'C' then
begin
  { Create the main form first, so it is treated as the main form by the
    system, then hide both forms so they don't appear in front of the
    command line. }
  form:= MainForm.Create;
  form.Visible:= False;
  prefsForm:= PrefsForm.Create;
  prefsForm.Visible:= False;
  prefsForm.LoadOptions;
  form.EdtFileToUse.Text:= parameterFromCommandLine;
  form.BtnConvertClick( nil );
  form.Close; // Close main form to quit application
end;
-------------------------------------------
Using the elements of the model from the component-based design, we could build a separate application, whose main loop is logical and readable:
------------------------------------
| Listing 2 - Logical and readable |
------------------------------------
if command = 'C' then
begin
  // Create each object just before its try block, so that the matching
  // finally only ever frees an object that was actually created.
  options:= ToolOptions.Create;
  try
    options.Load;
    converter:= FileConverter.Create( options );
    try
      converter.Convert( parameterFromCommandLine );
    finally
      FreeAndNil( converter );
    end;
  finally
    FreeAndNil( options );
  end;
end;
------------------------------------
The second version addresses the requirements in a much clearer, more maintainable fashion. On top of that, the implementation in the GUI application would have a similar pattern. The code above could go into an event handler, passing text from an input control instead of an argument from the command line. The following code assumes that the converter and options from the command line example above are globally available:
------------------------------------------
| Listing 3 - A clean GUI implementation |
------------------------------------------
procedure MainForm.BtnConvertClick( Sender: TObject );
begin
  Converter.Convert( EdtFileToUse.Text );
end;
------------------------------------------
[Conclusions] With a small amount of time invested at the beginning, one can define any application in terms of UI-independent components. An application that was designed in this way lends itself to ready reuse. Applications that use these components need only be concerned with delivering input to a clearly defined API. Fixes and updates to the core components will be reflected in all applications.
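The component list from the design section (transformer, actions, importers, exporters, plugin registry) can also be sketched without any UI at all. The following TypeScript is a minimal illustration under assumptions, not the original tool: every name in it (Table, TableAction, Importer, Exporter, PluginRegistry, FileConverter, the csv and tsv plugins) is hypothetical, and the internal data representation is assumed to be a simple table of strings.

```typescript
// Assumed internal data representation shared by importers, actions and exporters.
type Table = string[][];

interface Importer {
  read(text: string): Table;
}

interface Exporter {
  write(table: Table): string;
}

type TableAction = (table: Table) => Table;

// Registration for recognized input and output formats.
class PluginRegistry {
  private importers: { [format: string]: Importer } = {};
  private exporters: { [format: string]: Exporter } = {};

  registerImporter(format: string, importer: Importer): void {
    this.importers[format] = importer;
  }

  registerExporter(format: string, exporter: Exporter): void {
    this.exporters[format] = exporter;
  }

  importerFor(format: string): Importer {
    const found = this.importers[format];
    if (!found) throw new Error("No importer registered for format: " + format);
    return found;
  }

  exporterFor(format: string): Exporter {
    const found = this.exporters[format];
    if (!found) throw new Error("No exporter registered for format: " + format);
    return found;
  }
}

// The transformer: takes an input, applies the actions and generates an output.
class FileConverter {
  private registry: PluginRegistry;
  private actions: TableAction[];

  constructor(registry: PluginRegistry, actions: TableAction[]) {
    this.registry = registry;
    this.actions = actions;
  }

  convert(input: string, fromFormat: string, toFormat: string): string {
    let table = this.registry.importerFor(fromFormat).read(input);
    for (const action of this.actions) {
      table = action(table);
    }
    return this.registry.exporterFor(toFormat).write(table);
  }
}

// Example plugins and one action, kept deliberately naive.
const csvImporter: Importer = {
  read: (text) =>
    text.split("\n").filter((line) => line.length > 0).map((line) => line.split(",")),
};
const tsvExporter: Exporter = {
  write: (table) => table.map((row) => row.join("\t")).join("\n"),
};
const upperCase: TableAction = (table) =>
  table.map((row) => row.map((cell) => cell.toUpperCase()));

const registry = new PluginRegistry();
registry.registerImporter("csv", csvImporter);
registry.registerExporter("tsv", tsvExporter);
const converter = new FileConverter(registry, [upperCase]);
```

A command-line front end or a button handler would then only need to call converter.convert with its own source of input, which is the same division of labor shown in Listings 2 and 3.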

Encodo White Papers: Test-driven Development (2006)

2023-10-04T21:33:33+02:00

Published by marco on 4. Oct 2023 21:33:33 ------------------------------------------------------------------------ Most people in the software industry have heard of test-driven development; it has become a buzzword with several possible meanings. [The Problem with Unit Testing] One of the more negative associations is the notion of unit testing. Unit testing traditionally involves writing a test for each and every routine in a unit or class, to ensure that it does what it claims. This practice has, of late, declined in popularity, mostly because of the sheer mindlessness of maintaining complete coverage of an ever-growing API. [Component Testing to the rescue] Another form of testing is to write tests for components of a system, ensuring functionality on a higher level than that of the routine. Tests of this kind tend to encapsulate use cases, which are far more closely related to the way in which clients (actual users or other software) make use of an API. Naturally, use cases for extremely low-level components will end up testing individual routines, just as unit testing does. Writing the component tests is not tedious and, in fact, helps tremendously in determining whether a piece of software is complete or not. They can be viewed as software implementations of the requirements documents or specifications. Proper application of Component-based Design makes it quite simple to build tests for the majority of an application's functionality. [Testing in the code] A far better tool for ensuring consistency at the lowest level, where unit testing traditionally comes into play, is Design by Contract. This practice involves including verification mechanisms directly in the software, so that violations of software contracts can be pinpointed and quickly repaired. The most important element of any testing strategy is to stick with it. When a defect is found, the first step is to create a test to replicate the problem.
The next is to fix the error so the problem no longer occurs, but all the other tests still work. Finally, any missing contracts that may have helped pinpoint the problem sooner should be added. Once the test suite runs through without problems, the software is ready for release testing. [The final step: Release Testing] Automated testing is a fantastic way of guaranteeing baseline software quality, but it is not the last step before releasing a product. For server software or software with a command-line interface only, the test suite can provide an extremely high level of coverage (approaching 100%). Software which interacts with humans, however, requires a manual testing regimen to verify that the software functions as desired for all forms of input. Whichever parts of the testing chain cannot be automated (UI testing is notoriously difficult) should be documented in detail to ensure reproducibility between releases.

The UI is an afterthought, a detail

2023-09-07T11:02:51+02:00

Published by marco on 7. Sep 2023 11:02:51 ------------------------------------------------------------------------ [media] "Can we make our UI dumb enough to make our app usable without it?" The video demonstrates navigating through a simple e-commerce site. Then, he shows how the app can be driven from the console by calling the APIs directly -- upon which the URL and UI all update automatically. That is, the logic is not in the UI. He then demonstrates that he can drive the web site without a UI by deleting the rendering to React DOM entirely. He can still manipulate the console API to perform the same operations because the logic is all defined completely independent of the UI. Of course, this is the same command-line interface that can be used in the automated tests, which means that the entire product can be tested without a UI at all. I'm becoming increasingly convinced that neither React nor Angular is the way to go. Both React and Angular mix logic into the UI, putting the UI front and center. This is wrong. Additionally, Angular suffers from a complete inability to speed up the development lifecycle because it's so strongly tied to WebPack. I've used Redux before and the boilerplate becomes prodigious. I've used the React reducers as well, and it's a bit better, but still doesn't feel very natural. I've used "MobX" but long before its current incarnation where it really seems to "just work" as a store of state and reactive programming logic. The when construct (see 16:37 in the video), which takes a predicate and an action, is a very neat concept that allows you to define exactly how your application reacts to state changes without burying it all in the components. "If the view is to be purely derived from the state, then routing should affect state, not the derived component tree." Therefore, a url-change is an action like any other, modifying the state and letting MobX handle notifying all interested parties. 
Once you've gotten that far, you don't even need a UI-specific routing library because you can just configure any router to direct URLs to the store API -- which will automatically update the UI. The UI (e.g., React) doesn't have to have anything to do with routing. A route change triggers an action, which changes the state. The UI reacts. The UI does not do anything with the route -- it just triggers actions. A reactive non-UI component ensures that the route stays in-sync with the state by reacting to changes in the state. In most cases, you can just create a value that calculates what the URL should be, based on the state. This could get complicated, of course, but it's also completely separate from the rest of the application logic and can be thoroughly tested. We can also use the when construct outlined above to simply listen for changes to the calculated URL and update the browser's location and history. This way, the management of the history and URL is not entwined with the rest of the application logic. It's just reacting to state changes, like everything else. Working like this results in automated tests that work naturally and look very much like Playwright tests -- but completely without UI and using semantically meaningful constructs. The UI is an afterthought (as "Michel himself wrote in 2019" ). Playwright is nice, but it's a last resort when you've already botched the job of writing your code in a more testable manner. It's a nice check that the UI is properly wired to the logic of the application, but should not be used to verify application behavior -- simply to verify UI behavior. This all goes very much in the direction of "The Humble Dialog Box" by Martin Fowler , which shows that we've known how to build software correctly for over 20 years -- and we keep getting distracted by "the new shiny", thinking that we can somehow start with the UI and still get maintainable software.
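To make the idea concrete without pulling in any library, here is a minimal hand-rolled sketch in TypeScript. It is not MobX and does not use MobX's API: ShopStore, whenOnce and syncUrl are hypothetical names, the store notifies its listeners manually, and whenOnce is only a stand-in for the kind of predicate-plus-action construct described above. The point is just that a URL change is an action like any other, and that the URL is derived from the state.

```typescript
type Listener = () => void;

// All application logic lives here; nothing in this class knows about a UI.
class ShopStore {
  private listeners: Listener[] = [];
  page: "catalog" | "cart" = "catalog";
  cartItems: string[] = [];

  // Derived value: what the URL should be for the current state.
  currentUrl(): string {
    return this.page === "cart" ? "/cart" : "/catalog";
  }

  addToCart(item: string): void {
    this.cartItems.push(item);
    this.notify();
  }

  // A URL change is an action like any other: it only modifies state.
  navigate(url: string): void {
    this.page = url === "/cart" ? "cart" : "catalog";
    this.notify();
  }

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  private notify(): void {
    for (const listener of this.listeners.slice()) {
      listener();
    }
  }
}

// Runs the action once, the first time the predicate holds.
function whenOnce(store: ShopStore, predicate: () => boolean, action: () => void): void {
  let done = false;
  const check = () => {
    if (!done && predicate()) {
      done = true;
      action();
    }
  };
  store.subscribe(check);
  check();
}

// Non-UI reaction that keeps a history list in sync with the derived URL.
function syncUrl(store: ShopStore, visitedUrls: string[]): void {
  let last = store.currentUrl();
  visitedUrls.push(last);
  store.subscribe(() => {
    const next = store.currentUrl();
    if (next !== last) {
      last = next;
      visitedUrls.push(next);
    }
  });
}
```

A test can drive the store exactly as the console does in the video: call addToCart and navigate, then inspect the state and the recorded URLs. In a browser, the list passed to syncUrl would be replaced by calls that update the real location and history.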

On the usefulness of containers like Docker

2023-08-27T03:32:42+02:00

Published by marco on 27. Aug 2023 03:32:42 ------------------------------------------------------------------------ The article "Works on most machines" by Mark Seemann argues provocatively that containers are a fallback for poorly written software. "When you have general-purpose software, though, do you really need containers?" Well, yes. The point isn't that you need a container to paper over software that isn't sufficiently generic: it's to avoid fixing incompatibilities that have nothing to do with your target deployment systems. I think the author is thinking too much of highly general-purpose software whereas the majority of software doesn't need to run everywhere and anywhere. If it's built for the cloud, it's going to run in a container anyway. If it's built for a specific device, it's going to run on that device. In that case, why not just run that software at the developer side in the same environment? That way, you can avoid wasting a ton of time fixing problems that are related to how it runs in development rather than production. "Ultimately, you may need to query the environment about various things, but in functional programming, querying the environment is impure, so you push it to the boundary of the system. Functional programming encourages you to explicitly consider and separate impure actions from pure functions. This implies that the environment-specific code is small, cohesive, and easy to review." It implies it, but it in no way guarantees it. The author is also forgetting about the quality of the developer that is likely to be building the solution. [image]In this post, he assumes that the developer uses enough tests to thoroughly test the system -- even to the point where he is able to determine where a solution isn't sufficiently generalized yet. 
He assumes that the developer uses a methodology like functional programming to separate pure from impure code, and that the developer is good enough to do all of this in a way that is both efficient and leads to a finished product. This is not at all a guarantee -- or even a likelihood -- in the real world. In the real world, developers are not reaching for the stars -- even if they had the capabilities, which many do not, they're often not given the time to do things correctly -- they are just trying to get it done. If they can "cheat" by restricting the world of possible environments -- rather than accommodating their software to environments it will never encounter in production -- then why not?

It's actually an engineering problem. If you're going to make something that has to work well underwater, the only reason it needs to work out of water is that it makes it easier to work on, not because you think it's worth the time making it function properly in air. If you can make it just as easy to work on underwater as it is in air, then you would just do that instead. Wouldn't you? Why waste your time and your company's when there's a lot of other, more important work to do?

ImageSharp vs. SkiaSharp

2023-05-30T22:04:15+02:00

Published by marco on 30. May 2023 22:04:15 Updated by marco on 11. Sep 2023 13:14:50 ------------------------------------------------------------------------ I watched a great video about image-manipulation using an AWS lambda function. [media] I was curious about the imaging library he was using and searched for ImageProcessingContext (because I saw it in his code). That led me to ImageSharp, after which I searched for comparisons to the cross-platform library used in "Maui" . That led me to the issue "SkiaSharp vs ImageSharp" , which noted that, "Note that JimBobSquarePants, the creator of ImageSharp, contributed some interesting discussion in #47." I read/waded through that whole issue thread and commented the following: tl;dr: Maui.Graphics uses SkiaSharp because it is a 2D-rendering library rather than just an image-manipulation library. [image]For future readers: The discussion itself is not very interesting, but the conclusion is. The title of the issue is "Basic premise of the library is based upon a fallacy and harms existing projects." (referring to Maui.Graphics), which doesn't feel super-constructive (and wasn't). There are long screeds about how harmful MS is for everything OSS. The final comment is worth reading, as it explains that it turns out that the harshness of the issue title was completely unwarranted (as admitted by the original poster). Good conclusion; typically unproductive Internet discussion. There is no conflict. Skia's support for images is weaker than ImageSharp's but it allows using GPU rendering on supported platforms whereas ImageSharp is for in-memory data (CPU-bound). In the "referenced issue" itself, I commented, "That's wonderful. While I'm happy to learn that the issue was resolved, is there any way that we can pin this comment to the top so that future readers don't have to wade through the 80% catfight in the middle? "I was linked to this issue while researching Skia vs. 
ImageSharp and found the initial question and a couple of responses interesting, then waded through 80% chest-thumping, then finally got to this comment that essentially says "hey, we actually talked to each other and it turns out it was a tempest in a teapot", which is what I was hoping to learn." I just got a response today: "No way to pin comments, but I added a link to that comment from the initial issue description." Nice! 👌❤️‍🔥

Working with Git Submodules

2023-03-28T22:15:01+02:00

Published by marco on 28. Mar 2023 22:15:01
------------------------------------------------------------------------

[Introduction]

The intended audience of this document is people interested in knowing which commands to execute to update submodules. The initial analysis section is intended for people interested in knowing how the commands work and what their strengths/weaknesses are.

The inspiration for this documentation was that I was wondering whether submodules were always cloned with detached heads and if there were some way to avoid that. The short answers to these questions are, respectively, "yes" and "no".

Skip to the "examples below" <#useful-commands> to just see the commands and their effects. At the end of the document are "links to pages" <#links> referenced to produce this documentation.

[Terminology]

In the discussion below, the term superproject refers to the repository that directly contains a submodule reference. It comes from the git documentation, where they make the distinction because submodules can be nested. Suppose we have multiple levels of nesting, as shown below.

📁 A
  📁 B
    📁 C

* A is the root repository of both B and C
* A is the superproject of B
* B is the superproject of C

[Where do submodules go?]

Submodules are stored inside another repository. For a simple superproject A with a submodule B, we would see the following:

📁 A
  📁 .git
    📁 modules
      📁 B
        📄 config (worktree = ../../../B)
  📁 B
    📄 .git (points to ../.git/modules/B)

The submodule's git folder is stored in the superproject's git folder and is replaced by a file that references the new location. The submodule's configuration uses the core.worktree setting to check out to a different folder.

[Can I share a local copy of a submodule?]

No. Storing the working tree of the submodule outside of the repository is not supported.

Why would you want to do that anyway? One use case is that you have two repositories, each of which includes the same submodule, as shown below.
📁 A
  📁 B
📁 C
  📁 B

Instead of using two copies, you might think you could make the superprojects refer to the same copy of the submodule.

📁 A (refers to ../B)
📁 B
📁 C (refers to ../B)

* The advantage would be that changes made to B while working in A would immediately be available in C
* However, it would no longer be possible to make A and C refer to different commits

Although you can "manually move a submodule outside of the repository" after you've cloned it, you cannot configure a superproject's submodules in a way that Git will be able to clone properly. If you try it, you'll probably get an error message like,

    fatal: No url found for submodule path 'SUBMODULE.NAME' in .gitmodules

The next section explains how you can share local commits for testing.

[Testing submodule changes in multiple projects]

Assume, as above, that there are two copies of the submodule, B^A and B^C. Suppose there are commits in B^A that have been tested with A, but should also be tested with C.

One way to test C would be to push the commits in B^A and then pull them from B^C. That involves a round-trip to the server, which is not optimal, but relatively straightforward.

Another way to test C would be to add the local B^A as a remote to B^C and then check out the commit from B^A directly. To set up a remote called B_A in B^C, execute:

    git remote add B_A ../../A/B

The testing flow would be, roughly,

* Test changes to submodule B^A in A
* Create commit #1 in B^A
* Fetch from B_A into B^C
* Check out commit #1 in B^C
* Test changes in C
* Repeat as needed

[What to expect when cloning with submodules]

A clone of a superproject (a repository with submodules) fetches submodules only when requested (e.g. when --recurse-submodules is included). If submodules are fetched, then git sets the checked-out commit in each submodule to the commit ID specified for that module in the superproject. This makes sense because that is the correct commit to use.
However, this also means that, after a clone, all submodules will be in a detached head state.

On an initial clone, git creates a local branch in the superproject corresponding to the checked-out branch in the clone command (either the default branch or the branch specified in the -b option, if included). Git does not create local branches in any of the submodules. Git assumes that you will be working in the root repository and not in the submodules. The checked-out branch in the submodule is irrelevant to the superproject. If you want to work in (one or more of) the submodules anyway, then you have to create a local branch for yourself and check it out.

The detached head situation is not "weird" but "entirely expected" and "working as designed". All detached head means is that a commit ID has been checked out rather than a named, local branch.

If, however, you want the submodule to be checked out to the same branch as that checked out in the superproject (e.g. main), then the way to address that is to call git switch main in the submodule repository. This will have no effect on the superproject if the main branch in the submodule repository is at the same commit ID as the one pointed to by the superproject. If it is not, then switching to the main branch in the submodule repository will show up as a change in the superproject (the change being that the submodule repository is now pointing to a different commit). To accept that change in the superproject, simply git add the submodule folder and commit the change.

[What does --remote-submodules do?]

The --remote-submodules option does the following (according to the official documentation):

"Git will use the status of the submodule's remote-tracking branch to update the submodule, rather than the superproject's recorded SHA-1 (i.e. "commit ID")"

That means that using this parameter may cause changes in the working tree of the superproject if the remote-tracking branch in the submodule repository does not point to the same commit as that referenced by the superproject.

["Tracking" a branch in a submodule]

The basic submodule registration looks like this in the .gitmodules file.

    [submodule "SharedRepo"]
        path = SharedRepo
        url = git@ssh.dev.azure.com:v3/ustertechnologies/uster.quantum/PoC.IMHSharedRepo

If you don't plan on using --remote-submodules, then that's all you need. However, if you want to set up your git submodules so that the superproject knows which branch it should "track" in the submodule, use the following configuration:

    [submodule "SharedRepo"]
        path = SharedRepo
        url = git@ssh.dev.azure.com:v3/ustertechnologies/uster.quantum/PoC.IMHSharedRepo
        branch = .
        update = rebase

Note that the branch name is ".". This tells git to use the same branch name as that which is checked out in the superproject (if it exists; if it doesn't, then git does nothing further). This allows you to set up the .gitmodules once and it works as expected for all branches. Otherwise, you run the risk of merging in a .gitmodules file that references a specific feature branch (for example) and you end up syncing with that feature branch by accident if you call submodule update with --remote.

The update action indicates how git should get to the desired commit if it needs to make a change. Again, this only applies if you explicitly tell git to use the head commit for the given branch on the remote instead of just using whichever commit is already referenced locally.

[A remote-update example]

A superproject will see an update if it follows a branch in the submodule (as outlined in the preceding section) and that branch in the submodule has gained new commits since the last time the superproject was updated (i.e.
the superproject still references a commit in the submodule that does not correspond to the current HEAD of the branch in the submodule).

Using the --remote-submodules option is a way of cloning a superproject, but also updating its submodules to the latest commits instead of just checking out whatever is referenced in the superproject. It is a useful way of cloning a superproject with the latest commits in not only the superproject's repository, but also all submodules. However, you are then not only checking out the current state of the repository, but also requesting updates to the referenced submodules.

This only works if the submodule reference specifies a branch, though. If it doesn't, then git has no way of knowing which branch in the submodule repository it should update to. As noted above, setting this branch doesn't mean that git will create a local branch in the submodule with that name and check it out; it just means that it will change the commit ID referenced by the superproject for that submodule if the commit referenced by that branch in the submodule is different than the commit currently referenced by the superproject.

Phew! We now know enough to determine the commands to use.

[Useful Commands]

We now have the base knowledge to work with git and submodules using the command line. This will be useful for e.g. setting up agents.

Imagine we have two repositories:

* Repository A has a main branch that tracks the main branch of submodule B (currently commit ID1)
* The main branch in B points to commit ID1
* Repository A has a feature/setup branch that tracks the feature/setup branch of submodule B (currently commit ID2)

The examples will use something like the following diagram to show results. The bold indicates the commit and branch that are checked out. A bold commit with a non-bold branch name indicates a detached head. The diagram below shows the situation outlined above, with main checked out.
[image]

[Clone with submodules]

To clone a repository with submodules and check out the default branch in the superproject, execute the following:

    git clone --recurse-submodules REPOSITORY_URL

This results in:

* The superproject is cloned and checked out to the default branch
* Each submodule is cloned and checked out to the commit referenced in the respective submodule definition
* Submodules are in detached head state because git does not create local branches in submodules

Using the example from the start of this section, after executing this command, we will see:

[image]

No change from the example is expected.

[Clone with submodules (and check out a branch)]

To do the same as above, but check out a particular branch, execute the following:

    git clone -b feature/setup --recurse-submodules REPOSITORY_URL

This results in the same as above, but the superproject is checked out to "feature/setup". Using the example from the start of this section, after executing this command, we will see:

[image]

[Update submodules after cloning]

To update submodules after an initial clone (not necessary immediately after a clone, of course), execute the following:

    git submodule update

This results in:

* No changes to the superproject
* Missing submodules are cloned
* All submodules are checked out to the commit referenced in the respective submodule definition
* Submodules where a change to the checked-out commit is required are in detached head state. If no change is made, then the submodule remains at whichever detached commit or branch was previously checked out

As with an initial clone, this command does not update any references to submodule commits.
[image]

[Clone with submodules and update remote references]

To not only clone a superproject and all of its submodules, but to also update references to those submodules' latest HEADs (as outlined in the "remote-submodules section" <#what-does-%60--remote-submodules%60-do%3F> above), execute the following:

    git clone --recurse-submodules --remote-submodules REPOSITORY_URL

This results in:

* The superproject is cloned and checked out to the default branch
* Each submodule is cloned and checked out to the latest commit on the branch referenced in the respective submodule definition
* Submodules are in detached head state because git does not create local branches in submodules

If, for example, the remote branch main in repository B had been updated to BID2, then the reference from A to B would also have been updated to BID2:

[image]

[Update submodules to remote references]

To update submodules after an initial clone and update references (as outlined in the "remote-submodules section" <#what-does-%60--remote-submodules%60-do%3F> above), execute the following:

    git submodule update --remote

This results in:

* No changes to the superproject
* Missing submodules are cloned
* All submodules are checked out to the latest commit on the branch referenced in the respective submodule definition
* Submodules where a change to the checked-out commit is required are in detached head state. If no change was made (i.e. the remote commit for that branch in the submodule is still the same commit as that referenced by the superproject), then the submodule remains either with a detached commit or whichever branch was already checked out

As when calling clone with --remote-submodules, this command updates submodule references. Therefore, if the remote branch main in repository B had been updated to ID3, then we would expect to see A referencing that commit in B.

[image]

[Links]

The following links were helpful in writing this documentation:

* "Why is my Git Submodule HEAD detached from master?"
* "Git submodule is in "detached head" state after cloning and submodule update"
* "Why did my Git repo enter a detached HEAD state?"
* "How can I specify a branch/tag when adding a Git submodule?"
* "git clone"
* "git submodule"
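For anyone who'd like to watch the detached-head behavior happen without touching a real project, here is a minimal sketch that builds two throwaway repositories and clones them. The names A and B mirror the examples above; the scratch directory, the README file, the commit messages, and the demo identity are all made up for illustration. It assumes a reasonably recent git (2.28 or newer, for git init -b) and passes protocol.file.allow=always because newer git versions refuse, by default, to clone submodules from local paths.

```shell
#!/bin/sh
set -eu

# Scratch directory; every repository below is a throwaway made up for this demo.
work=$(mktemp -d)
cd "$work"

# Provide an identity through the environment so no global config is needed.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Submodule repository B with a single commit on main.
git init -q -b main B
echo "shared code" > B/README
git -C B add README
git -C B commit -q -m "B: initial commit"

# Superproject A that references B as a submodule.
git init -q -b main A
git -C A -c protocol.file.allow=always submodule --quiet add "$work/B" B
git -C A commit -q -m "A: add submodule B"

# Clone the superproject together with its submodule.
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/A" clone

# `git branch --show-current` prints nothing when HEAD is detached.
after_clone=$(git -C clone/B branch --show-current)
echo "submodule branch after clone:  '$after_clone'"

# Switch the submodule to main, which points at the commit A already references.
git -C clone/B switch -q main
after_switch=$(git -C clone/B branch --show-current)
echo "submodule branch after switch: '$after_switch'"

# The superproject should not see a change, since the commit ID is the same.
superproject_status=$(git -C clone status --porcelain)
echo "superproject status: '$superproject_status'"
```

If git behaves as described in this document, the first line reports an empty branch name (a detached head), the second reports main, and the superproject status stays empty because the submodule still points at the same commit.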

Extracting subtitles from an mkv with ffmpeg

2023-03-17T07:22:46+01:00

Published by marco on 17. Mar 2023 07:22:46
------------------------------------------------------------------------

I'd watched an excellent movie [1] that was primarily in German but had some English parts, with hard-coded English subtitles and soft German subtitles plastered on top of that. I wanted to cite a bunch of interesting sections, so I looked for the subtitles online. Only the English subtitles are available, which I didn't want. I liked the German formulation and wanted to cite that.

Well, I have the subtitles: they're just trapped in the mkv file. I figured that there was some way of extracting them, but a search turned up a lot of pre-compiled and sketchy-looking software whose safety I couldn't adequately verify. I want the subtitles, but I don't want to get a virus or crypto-locked.

I got a good hint to use ffmpeg from "How to Extract .SRT Files From MKV File". It suggested something like,

    ffmpeg -i FILENAME.mkv -map 0:s:0 german.srt

Once I'd installed ffmpeg with "Homebrew", I was able to extract a subtitle stream. Unfortunately, it was kind of short, so I'd grabbed the wrong stream. Part of the output of the command above is a list of available streams, shown below.

    Stream #0:0: Video: h264 (Main), ... (default)
      Metadata:
        DURATION : 01:25:55.332000000
    Stream #0:1(ger): Audio: aac (LC), 48000 Hz, stereo, fltp (default)
      Metadata:
        title : Stereo
        DURATION : 01:25:55.285000000
    Stream #0:2(ger): Subtitle: ass
      Metadata:
        title : German forced
        DURATION : 01:03:24.130000000
    Stream #0:3(ger): Subtitle: ass
      Metadata:
        title : German
        DURATION : 01:25:43.890000000
    Stream #0:4(ger): Subtitle: ass
      Metadata:
        title : German SDH
        DURATION : 01:25:43.890000000

The "ffmpeg documentation" isn't particularly illuminating on the -map option, but I finally figured out that the parameter is something like:

1. The first position seems to be the file selector (you can specify multiple inputs with multiple -i options)
2. The second position seems to select the type of stream, where s indicates subtitles (I intuit this because it looks like p indicates programs, according to "FFMPEG: How to chose a stream from all stream" [sic])
3. The third position selects the index of the stream within that type

Armed with this information, I was able to select the second subtitle stream, which is the full German subtitles rather than just the German subtitles for the English parts.

    ffmpeg -i FILENAME.mkv -map 0:s:1 german.srt

This gave me the desired subtitles in seconds. Happily, I have what I want and I didn't have to install any sketchy tools that are distributed as unvetted binaries. Instead, I'm comfortable installing the well-known tool ffmpeg using the well-known package manager brew.

--------------------------------------------------------------------------------

[1] The movie was "Oeconomia", which is, honestly, must-see viewing for everyone. Every single person should see this movie, to learn how the macro-level economy really works and how we're being used.
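Counting subtitle streams by hand gets error-prone once a file has more than a handful of them. The following is a rough sketch of a helper that numbers the Subtitle lines in the order they appear, which, as far as I can tell from the experiment above, is the index that -map 0:s:N expects. The function name subtitle_maps is made up, the sample listing is abbreviated from the output quoted above, and the helper assumes a single input file (hence the hard-coded 0).

```shell
#!/bin/sh
# Sample listing, abbreviated from the ffmpeg output quoted in the article.
listing='Stream #0:0: Video: h264 (Main)
Stream #0:1(ger): Audio: aac (LC), 48000 Hz, stereo, fltp (default)
Stream #0:2(ger): Subtitle: ass
Stream #0:3(ger): Subtitle: ass
Stream #0:4(ger): Subtitle: ass'

# subtitle_maps is a hypothetical helper: it reads a stream listing on stdin
# and prints, for each subtitle stream, the stream label next to the -map
# argument that would select it. The counter n starts at 0 and only advances
# on Subtitle lines, so it is the index within that stream type.
subtitle_maps() {
  awk '/Stream #[0-9]+:[0-9]+.*: Subtitle:/ { printf "%s -> -map 0:s:%d\n", $2, n++ }'
}

maps=$(printf '%s\n' "$listing" | subtitle_maps)
printf '%s\n' "$maps"
```

For the sample above, the second line of output pairs #0:3(ger): with -map 0:s:1, which matches the stream that produced the full German subtitles. With a real file, the same helper should work on ffmpeg's own listing, which it writes to standard error: ffmpeg -i FILENAME.mkv 2>&1 | subtitle_maps.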

You're already testing; now automate it.

2023-03-05T21:23:29+01:00

Published by marco on 5. Mar 2023 21:23:29
Updated by marco on 3. Oct 2025 06:45:54
------------------------------------------------------------------------

[Introduction]

Testing is any form of validation that verifies a product. That includes not only structured validation using checklists, test plans, etc. but also informal testing, as when engineers click their way through a UI, emit values in debugging output to a console, or perform operations on hardware.

Automated testing is common for software, as regression-style tests that execute both locally and in CI. This includes unit, integration, and end-to-end tests.

The following discussion focuses primarily on software-testing but hopefully contains some insights and information relevant to other engineering disciplines (e.g., embedded and hardware developers).

[The testing mindset]

Testing is primarily a mindset. Thinking about what you're building in the terms outlined above can help you to determine how and what you're actually going to build. It will help you focus,

* going from "this would be a nice feature" 🤩
* to "how would I test it?" 🤨
* to "who would actually use it?" 🙄
* to, perhaps, "it would be neat, but no-one needs it. It's not a requirement." ✋
* or, "the use case is clear and here is how I would test it." 👌

You should think of writing tests not as something you have to do, but rather as something you want to do.

* How else do you prove that what you've made actually works?
* What does "it works" mean?
* Which use cases are covered?
* How do you answer these questions without tests?
* What do we mean by writing tests?

Let's define some of this jargon ("use cases", "it works", etc.) before we continue.

[Why do we test?]

It's a bit of a provocative question, perhaps, but it makes sense to ask about anything into which you're going to invest time and money. So, let's start a bit further back.

❓ What would we like to do?
"We would like to build a product of high quality."

❓ What's a product?
"A product is an implementation of a set of requirements."

❓ Then what's a requirement?
"A requirement is a collection of use cases."

❓ OK, fine. What's a use case?
"A use case comprises a set of initial conditions, an action, a set of inputs, and an expected output."

❓ What is quality?
"A product that satisfies its requirements is of higher quality than one that does not."

❓ How can I know that my product has the desired quality?
"We test use cases for a version of a product to determine quality."

❓ How can I know when my product has enough tests?
"When all of the use cases are covered."

❓ What if I change the product after I've tested it?
"Then you have to test all of the use cases again."

❗ What the heck? That's boring! I don't have time for that!
"It's called regression-testing. There's no way around it."

❓ What if I know that I've only changed a tiny thing?
"You might be able to get away with it. But that's where 🕷 bugs come from."

❗ I can't afford to test everything manually every time I make a change!
"That's why you automate as many tests as you can."

❗ Running the tests ties up my local machine! I can't work.
"Run tests in another environment (e.g., in the cloud)."

[Conclusions]

* A product of quality includes tests.
* A product is considered untested if it has changed since it was last tested.
* Regression-testing is unavoidable.
* Automated tests improve efficiency and reliability.
* Using a separate environment improves robustness.

[Introduction to methodologies]

We've established both that testing is a mindset and that it is necessary to building high-quality products. We should keep in mind that the goal is to have a well-tested product with as many of these tests as possible being automated. The question is: how close to the goal state do you stay during development?

[Developer feedback loop]

In other words, what does the development-feedback loop look like?
The goal of the development-feedback loop is to shorten the time between a change and its verification. In practice, this often manifests as "knowing as soon as possible when you've broken something."

The longer it takes from change to verification, the more likely it is that multiple changes will be verified at once. Root-cause analysis becomes more difficult. That's why manual tests are undesirable: they are far less likely to be run/applied in a timely manner, increasing the number of changes that have occurred since the last time tests were run.

So, the longer you wait to define tests, the longer your product remains untested. The longer you wait to automate tests, the longer you must do manual testing to verify behavior.

With that idea in mind, let's consider the spectrum of methodologies. At one end, there's TDD (test-driven development), where you write the tests first, letting them fail and then writing the implementation. At the other, there's writing all of your acceptance tests once you've finished the product.

[Test-driven Design]

Always writing the tests first is just one extreme, and one that scares a lot of people away from automated testing. As with any dogma, strict adherence is unlikely to be efficient. Sometimes, you'll need to try out an implementation to see if it's even feasible or want to play with an API to see how it feels before you write a ton of tests for it.

You don't want to go too long without testing that you haven't broken something, but you also don't want to write tests for code that you're going to throw away in an hour anyway.

Tests are only one part of the array of techniques a developer can use to verify a product. As discussed in more detail below, a strong type system, linting, and static-code analysis of all kinds help verify a product. We should always be aware of which parts are necessary during which phases.
If certain tools take longer to verify code, consider whether they need to be executed all the time, or perhaps just when pushing to a remote, or before merging into the master branch.

[Acceptance-tests at the end]

If you wait until you've finished the product to write all of your tests, you will still have a well-tested product, but you will not have benefited from testing during development. Being able to test as you go improves your efficiency tremendously, as you're not constantly fighting with things that are mysteriously breaking. Instead, you're usually able to pin the blame on the most recent change you've made.

A product of nontrivial complexity can be written more reliably and quickly if there are tests. It also becomes possible for one team member to write the tests while another provides the implementation that satisfies them.

[A balanced approach]

The spectrum in between is where most developers live, writing tests as they go, but not always before they've implemented something.

It's understandable that there will always be certain tests that are difficult, if not impossible, to automate. However, the document that follows will provide some tools for extracting the testable bits from the untestable ones to increase coverage.

Anything that can be tested automatically can be executed by all team members all the time, as well as by pipelines in the cloud.

[You're already testing!]

You're almost certainly already testing. You might be clicking through the UI or emitting statements in a command-line application, but you're verifying your code somehow. I mean ... you are, right? RIGHT?

I'm kidding. Of course you're not just writing code, building it, and committing it. You're validating it somehow. That's testing.

[A list of validations]

If you're really good, you might even keep a list of these validations. Once you have a list, then,

1. You don't have to worry about forgetting to do them in the future
2. Even someone with no knowledge of the system can perform validation

This is fine, but it's still a manual process. A manual process carries with it the following drawbacks:

1. It gets quite time-consuming, especially as the list of validations grows
2. You're highly unlikely to perform the validations often enough
   * It's much easier to fix a mistake if you learn about it relatively soon after you made it
3. You're also unlikely to add all of the validations you need
   * Generally, you won't validate smaller "facts" and will focus on high-level stuff
4. You're much more likely to make mistakes in manual testing
5. A manual validation process can't be run as part of CI or CD

[Automating the list]

Automated testing means that you codify those validations.

😒 Great! I have tests! How the heck do I codify them?

Don't panic. Almost any code can be tested. In fact, if you can't get at it with a test, then you might have found an architectural problem. See? Automating tests will even help you write better code!

🤨 How do I get started?

Just start somewhere. It doesn't matter where. Don't worry about coverage. Just get the feeling for writing a proof about a facet of your code. Any bit of logic can -- and should -- be tested.

What if you still don't know where to begin? Ask someone for help! Don't be shy. It's in everyone's best interest for a project to have good tests. You want everyone's code to have tests so you know right away when you've broken something in a completely unrelated area. This is a good thing!

[Goals]

🤸‍♀️ Developers should be excited to use tests to prove that their code works.

[Tests should be quick and easy (maybe even fun) to write]

A project should provide support for mocking devices and external APIs, or for using test-specific datasets.

[Tests should be reasonably fast]

A reasonably fast test suite will tend to be run more often. We would like a developer to notice a broken test right after the change that broke it, preferably even before pushing it.
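One cheap way to catch a broken test before pushing is a git pre-push hook: git runs .git/hooks/pre-push before a push and aborts the push if the script exits with a non-zero status. Below is a minimal sketch of such a hook. The function name run_fast_tests is invented for illustration, and the true/false stand-ins simulate a passing and a failing suite; which command actually runs your fast tests (dotnet test, for example) is an assumption that you'd replace with your own.

```shell
#!/bin/sh
# Sketch of a pre-push hook. run_fast_tests is a hypothetical helper that
# runs whatever command it is given and reports whether the push may proceed.
run_fast_tests() {
  if "$@"; then
    echo "pre-push: tests passed"
    return 0
  fi
  echo "pre-push: tests failed, push aborted" >&2
  return 1
}

# Demonstration with stand-in commands. A real hook would instead end with
# something like:  run_fast_tests dotnet test || exit 1
run_fast_tests true  && echo "push would proceed"
run_fast_tests false || echo "push would be blocked"
```

Keeping the command outside the helper makes it easy to run only the fast tests on push and to leave slower suites for CI.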
[Avoid debugging tests in CI]

Tests a developer runs locally should almost always work in CI. Failing tests in CI should also fail locally.

[Guidelines]

🤨 Don't be pedantic. For example,

* Don't get obsessed with automating everything.
  * Get the low-hanging fruit first, and leave the rest to manual testing.
  * See where you stand.
  * If you haven't automated enough, iterate until done. 🔄
* Don't forbid mocking in integration tests and don't force mocking in unit tests.
  * In fact, stop worrying about whether it's a unit test or an integration test and just write useful tests that prove useful things about your code.
* "Stop requiring only one assertion per unit test: Multiple assertions are fine"

[Figure out where you stand]

The following questions should help you evaluate for yourself where you are on your automated-testing journey.

* How much automated testing have you done?
* Do you write automated tests now?
* Do you feel confident that you can verify your work with automated tests?
* Do you understand the limitations?
* Do you understand how system architecture can affect testability?

[Tests should be useful]

We never want anyone in a team to get the impression that we're writing tests just to write tests. We write tests because they help us write better code and because it feels good to be able to prove that something that was working continues to work. You should feel more efficient and productive and feel like you're producing higher-quality code.

* Tests should confirm use cases.
* Tests should prove something about your code that you think is worth proving.
* Tests should confirm behavior that either is how the code currently works or how it should work.
* Tests should help you write better code from the get-go.
* Every bug that you need to fix is de-facto a use case that needs a test.

[Code Coverage & Reviews]

How do you know when there are "enough" automated tests? Don't get distracted by trying to achieve a specific coverage percentage.
The most important thing is that the major use cases are covered. If software is stable and there is "only" 40% test-coverage, then maybe there is a lot of code that rarely or never gets used? In that case, you might want to think about removing code that you don't need rather than wasting time writing tests for code that never runs.

New code, though, should always have automated tests. A code reviewer should verify that new functionality is being tested.

[Types of tests]

Unit
  Cover a single unit, mocking away other dependencies where needed.
  Useful for verifying simple logic like calculated properties or verifying the results of service methods with given inputs.

Integration
  Cover multiple units, possibly mocking unwanted dependencies.
  Useful for verifying behavior of units in composition, as they will be used in the end product.
  The goal is to cover as much as possible without resorting to more costly end-to-end tests.

End-to-End
  Also called UI Tests, these tests verify the entire stack for actual customer use cases.
  Very useful, but generally require more maintenance as they tend to be more fragile.
  Essential for verifying UI behavior not reflected in a programmatic model.
  Can work with snapshots (e.g. error label is in red).

[Approach]

The article "Write tests. Not too many. Mostly integration." describes a pragmatic approach quite well. Instead of the classic "testing pyramid", it suggests a "testing trophy".

[image]

This style of development has the following aims:

1. Verify as much as possible statically, with linting and analyzers
2. Make integration tests cheaper because they prove more about your system than unit tests
3. Prove as much as possible outside of end-to-end tests because they're expensive and brittle

[Analysis]

Remember that everything you use has to work both locally and in CI.

[Static-checking]

A project should include analyzers and techniques so that the compiler helps make many tests unnecessary.
For example, if you know that a parameter or result can never be null, then you can avoid a whole slew of tests. Developers should only spend time writing tests that verify semantic aspects that can't be proven by the compiler. [Null-reference analysis in .NET] The .NET world provides many, many analyzers and tools to verify code quality. One of the most important things a project can do is to improve null-checking. The best way to do this is to upgrade to C# 8 or higher and enable "null-reference analysis". The "default language for .NET Framework is going to stay C# 7.3", but you can "enable null-reference analysis for .NET Framework" quite easily. Another option is to use the "JetBrains Annotations NuGet package", which provides attributes to indicate whether parameters or results are nullable. The preferred way, though, is to use the by-now standard nullability-checking available in .NET. Doing neither is not a good option, as it will be very difficult to avoid null-reference exceptions. [Unit-testing] Unit tests are very useful for validating requirements and invariants about your code. These are the easiest tests to write and will generally be the first ones that you will write. A requirement or an invariant may be specified in the story itself, but it can be anything that you know about the code that's important. It's up to the developer and the reviewer(s) to determine which tests are necessary. It gets easier with experience -- and it doesn't take long to get enough experience so that it's no longer so intimidating. [Unit-testing example] Just as a quick example in .NET, consider the following code,

    public bool IsDiagnosticModeRunning
    {
        get => _isDiagnosticModeRunning;
        set
        {
            _isDiagnosticModeRunning = value;
            _statusManager.InstrumentState = value
                ? InstrumentState.DiagnosticMode
                : InstrumentState.Ready;
        }
    }

Here we see a relatively simple property with a getter and a setter.
However, we also see that there is an invariant in the implementation: that the _statusManager.InstrumentState is synced with it. Using many of the "techniques described below" <#tools-and-techniques>, we could write the following test:

    [DataRow(true, InstrumentState.DiagnosticMode)]
    [DataRow(false, InstrumentState.Ready)]
    [TestMethod]
    public void TestIsDiagnosticModeRunning(bool running, InstrumentState expectedInstrumentState)
    {
        var locator = CreateLocator();
        var instrumentControlService = locator.GetInstance();
        var statusManager = locator.GetInstance();

        Assert.AreNotEqual(expectedInstrumentState, statusManager.InstrumentState);

        instrumentControlService.IsDiagnosticModeRunning = running;

        Assert.AreEqual(expectedInstrumentState, statusManager.InstrumentState);
    }

Here, we're using MSTest to create a parameterized test that, * creates the IOC container * gets the two relevant services from it * verifies that the state is not already set to the expected state (in which case the test would succeed even if the tested code doesn't do anything) * sets the property to a given value * verifies that the state is correct for that value. We now have code that validates two facts about the system. Should something change where these facts are no longer true, the tests will fail, giving the developer a chance to analyze the situation. * Was the change inadvertent or deliberate? * Are the facts still correct? Does the test need to be updated? If you're addressing a bug-fix, though, you might be able to prove that you've fixed the bug with a unit test, but it's also likely that you'll have to write an integration test instead. [Integration-testing] Unit tests have their place, but they are far too emphasized in the testing pyramid. The testing pyramid comes from a time when writing integration tests was much more difficult than it (theoretically) is today.
The "theoretically" above means that the ability to write integration tests as efficiently as unit tests is contingent on a project offering proper tools and support. One common complaint about integration tests vis à vis unit tests is that they run more slowly. Another is that they take longer to develop. Ideally, a project provides support to counteract both of these tendencies. To this end, then, a project should offer base and support classes that make common integration tests easy to set up and quick to execute: * Interacting with a database * Setting up a known database schema * Getting to a clean dataset * "Mocking" the database * Mocking other external dependencies in a project (e.g. loading configuration from an endpoint, sending emails, sending modifications to endpoints) There are many different ways to solve this problem, each with tradeoffs. For example, a project can load dependencies in Docker containers, either created and started manually (see "Testing your ASP.NET Core application - using a real database") or even dynamically with a tool like the "Testcontainers NuGet package". [Comparing Unit and Integration tests] A drawback to unit tests is that, while they can test an individual component well, it's really the big picture that we want to test. We want to test scenarios that correspond to actual use cases rather than covering theoretical call stacks. It's not that the second part isn't important, but that it's not as important. Given limited time and resources, we would prefer to have integration tests that also cover a lot of the same code paths that we would have covered with unit tests, rather than to have unit tests, but few to no integration tests. This, however, leads directly to... The advantage of a unit test over an integration test is that when it fails, it's obvious which code failed. An integration test, by its very nature, involves multiple components.
When it fails, it might not be obvious which sub-component caused the error. If you find that you have integration tests failing and it takes a while to figure out what went wrong, then that's a sign that you should bolster your test suite with more unit tests. Once an integration test fails and one or more unit tests fail, then you have the best of both worlds: you've been made aware that you've broken a use case (integration test), but you also know which precise behavior is no longer working as before (unit test). [Tools and Techniques] [Tests are Code] Testing code is just as important as product code. Use all of the same techniques to improve code quality in testing code as you would in product code. Clean coding, good variable names, avoiding copy/paste coding -- all of it applies just as much to tests. There are two main differences: * You don't need to document tests * You don't have to write tests for tests. :-) [Writing testable code] This is a big, big topic, of course. There are a few guidelines that make it easier to write tests -- or to avoid having to write tests at all. As noted above, code that can be validated by the compiler (static analysis) doesn't need tests. E.g. you don't have to write a test for how your code behaves when passed a null parameter if you just forbid it. Likewise, you don't have to re-verify that types work as they should in statically typed languages. We can trust the compiler. Here are a handful of tips. * Prefer composition to inheritance * A functional programming style is very testable * An IOC Container is very helpful * Avoid nullable properties, results, and parameters * Avoid mutable data * Interfaces are much easier to fake or mock; use those wherever you can * Generally, the "SOLID" principles are a decent guide See the following articles for more ideas.
* "C# Handbook Chapter 4: Design (2017)" * "Questions to consider when designing APIs: Part I (2014)" * "Questions to consider when designing APIs: Part II (2014)" * "Why use an IOC? (hint: testing) (2019)" [Parameterized Tests] Investigate your testing library to learn how to write multiple tests without having to write a lot of code. In the MSTest framework, you can use DataRow to parameterize a test. In NUnit, TestCase does the same thing, and Values allows you to provide parameter values for a list of tests that are the Cartesian product of all values. [Mocking/Faking] Use mocks or fakes to exclude a subsystem from a test. What would you want to exclude? While you will want to make some tests that include database access or REST API calls, there are a lot of tests where you're proving a fact that doesn't depend on these results. [Focus on what you're testing] For example, suppose a component reads its configuration from the database by default. A test of that component may simply want to see how it reacts with a given input to a given method. Where the configuration came from is irrelevant to that particular test. In that case, you could mock away the component that loads the configuration from the database and instead use a fake object that just provides some standard values. [Test error conditions] Another possibility is to fake an external service to see how your code reacts when the service returns an error or an ambiguous response. Without mocks, how would you test how your code reacts when a REST endpoint returns 503 or 404? Without a mock, how would you force the purely external endpoint to give a certain code? You really can't. With a mock, though, you can replace the service and return a 404 response for a specific test. This is quite a powerful technique. [How to fake?] As noted above, it's much, much easier to use fake objects if you've consistently used interfaces.
You can just create your own implementation of the interface whose standard implementation you want to replace, give it a fake implementation (e.g. returning false and empty string and null for methods and properties) and then use that class as the implementation. [Faking/mocking libraries] If you have interfaces that perform a single task (single-responsibility principle), then it doesn't take too much effort to write the fake object by hand. However, it's much easier to use a library to create fake objects—and there are other benefits as well, like tracking which methods were called with which parameters. You can assert on this data collected by the fake object. For .NET, a great library for faking objects is "FakeItEasy" . With a fake object, you can indicate which values to return for a given set of parameters without too much effort. Similarly, you can use the same API to query how often these methods have been called. This allows you to verify, for example, that a call to a REST service would have been made. This is a powerful way of proving facts about your code without having to actually interact with external services. [An example] The following code configures a fake object for ITestUnitConfigurationService that returns default data for all properties, except for Configuration and GetTestUnitParameterValues(), which are configured to return specific data. 
    private static ITestUnitConfigurationService CreateFakeTestUnitConfigurationService()
    {
        var result = A.Fake<ITestUnitConfigurationService>();

        var testUnitParameters = CreateTestUnitParameters();
        var testUnitConfiguration = new TestUnitConfiguration(testUnitParameters);
        A.CallTo(() => result.Configuration).Returns(testUnitConfiguration);

        var testUnitParameterValues = CreateTestUnitParameterValues();
        A.CallTo(() => result.GetTestUnitParameterValues()).Returns(testUnitParameterValues);

        return result;
    }

In the test, we could get this fake object back out of the IOC (for example) and then verify that certain methods have been called the expected number of times.

    var testUnitConfigurationService = locator.GetInstance();
    A.CallTo(() => testUnitConfigurationService.Configuration).MustHaveHappenedOnceExactly();
    A.CallTo(() => testUnitConfigurationService.GetTestUnitParameterValues()).MustHaveHappenedOnceExactly();

[Snapshot-testing] You can avoid writing a ton of assertions and a ton of tests with snapshot testing. For example, imagine you have a test that generates a particular view model. You want to verify 30 different parts of this complex model. You could navigate the data structure, asserting the 30 values individually. That would be pretty tedious, though, and lead to fragile and hard-to-maintain testing code. Instead, you could emit that structure as text and save it as a snapshot in the repository. If a future code change leads to a different snapshot, the test fails and the developer that caused the failure would have to approve the new snapshot (if it's an expected or innocuous change) or fix the code (if it was inadvertent and wrong). The upside is that large swaths of assertions are reduced to a simple snapshot assertion. The downside is that the test might break more often for spurious reasons.
Generally, you can avoid these spurious reasons by being judicious about how you format the snapshot: * Avoid timestamps or data that changes over time * Avoid using output methods that are too likely to change over time See the documentation for the "Snapshooter NuGet package". [End-to-end Testing] There have been many solutions to the problem of automated testing of web UIs over the years. The one many know is "Selenium", but tools like "Cypress", "TestCafe", "Puppeteer" and "Playwright" have largely replaced it. The "WebdriverIO" library is another option. Before choosing a tool, you'll want to consider what your requirements are: * Tests should run quickly * Headless/command-line support for integrating into CI builds * A GUI for running tests is a plus * Traceability of tests * Snapshot-testing * Debugging, including rewinding through the UI events The current front-runner for end-to-end testing is "Playwright", an open-source cross-browser, cross-platform, cross-language testing framework. * Video: "What's new in Playwright 1.32" shows the new UI Mode in action (see the "release notes") * Video: ""Playwright can do this?" -- Microsoft meetup March 2023" (see masking for visual regression at 00:18:00) * "GitHub" * "Example" [Planner / Executor Pattern] This pattern is particularly useful when you have a bunch of steps to execute. Instead of executing the steps as you go, you build a plan that describes how those steps would be executed and return that as the result of the planner phase. You can test this plan very easily without worrying about how to mock away the mutating part of the code. For example, suppose you want to sync an online data source with a local configuration.
The classic way would be to do something like the following:

    var items = GetItemsFromServer();
    foreach (var item in items)
    {
        var itemData = GetItemDataFromServer(item);
        if (string.IsNullOrEmpty(itemData.Text))
        {
            SetStandardText(item, itemData);
            SaveItemToServer(item);
        }
    }

With so little logic, there's really no way to question this setup, is there? But think about what happens if there are more decisions to make, more data to retrieve, more data to update on the server. As this logic increases in complexity, the mutating code becomes ever more deeply embedded in read-only logic. That read-only logic ends up being the lion's share of the code that you want to test, but you have to step very lightly to avoid making changes on the server. You can, of course, mock away services, to make sure that nothing is communicated back to the server, but there is another way. What if you were to consider the set of operations as phases? 1. A planning phase where the program gathers all of the information that it needs to determine which commands to execute in order to "repair" the situation. 2. A much shorter and simpler execution phase where the program loops over the plan and applies it. This approach has several advantages: * There are fewer questions about how to handle exceptions that occur while applying the plan. You don't have to worry about what happens when a mutation occurs deep within the planning logic. * It's easier to test the meat of the logic because the output is a plan that you can snapshot or otherwise verify. * You have the user-friendly option to present the user with a detailed plan of what will happen before applying any changes. * You can even store the plan to execute later, e.g., after it has been audited by a separate team. Once again, we have a pattern that not only makes testing easier, but it makes the entire architecture more robust, opening up possibilities that you wouldn't have with the straightforward pattern (which would be harder to test).
To finish up this section, let's take a quick look at what that could look like in pseudocode.

    var items = GetItemsFromServer();
    var commands = new Commands();

    // Planning phase: decide what to do without changing anything.
    foreach (var item in items)
    {
        var itemData = GetItemDataFromServer(item);
        if (string.IsNullOrEmpty(itemData.Text))
        {
            var command = CreateCommand(
                $"Set standard text for {item}",
                () =>
                {
                    SetStandardText(item, itemData);
                    SaveItemToServer(item);
                }
            );

            // Collect the command instead of executing it.
            commands.Add(command);
        }
    }

    // Present commands to the user; store the commands for later, or execute them...
    // This is where tests would verify the commands generated from a given set of
    // item data.

    // Execution phase: apply the plan.
    foreach (var command in commands)
    {
        try
        {
            command.Apply();
        }
        catch
        {
            // Log error and continue?
        }
    }

Instead of executing the command immediately, we store what we would want to do with a closure and a description. We can do whatever we want with those commands; executing them is one option, but you can see how useful it would also be for verifying that the logic is correct in tests.
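
To make the testing benefit concrete, here is a minimal, self-contained sketch of the same idea in C#. It is only an illustration under stated assumptions: the Item and Command records, the SyncPlanner class, and the saveItem callback are hypothetical names invented for this example (they stand in for the server calls in the pseudocode above), not part of any real library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
public sealed record Item(string Id, string Text);

public sealed record Command(string Description, Action Apply);

public static class SyncPlanner
{
    // Planning phase: read-only. It decides what to do but changes nothing.
    // The saveItem callback stands in for "save to server" (an assumption).
    public static IReadOnlyList<Command> CreatePlan(
        IEnumerable<Item> items,
        Action<Item> saveItem,
        string standardText = "(standard text)")
    {
        var commands = new List<Command>();

        foreach (var item in items)
        {
            if (string.IsNullOrEmpty(item.Text))
            {
                var updated = item with { Text = standardText };

                commands.Add(new Command(
                    $"Set standard text for {item.Id}",
                    () => saveItem(updated)));
            }
        }

        return commands;
    }

    // Execution phase: short and simple; it just applies the plan.
    public static void Execute(IEnumerable<Command> plan)
    {
        foreach (var command in plan)
        {
            command.Apply();
        }
    }
}

public static class SyncPlannerDemo
{
    // Builds a plan and describes it without executing anything.
    public static string DescribePlan()
    {
        var items = new[] { new Item("a", "Hello"), new Item("b", "") };
        var saved = new List<Item>();

        var plan = SyncPlanner.CreatePlan(items, saved.Add);

        return string.Join("; ", plan.Select(c => c.Description)) + $" (saved: {saved.Count})";
    }
}
```

A test can assert on the descriptions in the plan (or snapshot them) and confirm that nothing was saved during planning; only a call to Execute triggers the mutations. Passing the mutation in as a callback is just one way to keep the planner free of side effects.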

Hiding folders in Azure DevOps Code Wikis

2023-01-18T10:09:22+01:00

Published by marco on 18. Jan 2023 10:09:22 ------------------------------------------------------------------------ [image]It is currently not possible to hide individual folders or files in an Azure DevOps Code Wiki. Folders and files beginning with a . are hidden by default, but you can't influence the structure other than by reordering pages with a .order file in an individual folder. The topic "Hide folders that do not contain Markdown files" discusses extending this functionality. I replied with the following: -------------------------------------------------------------------------------- There are a lot of good suggestions here. Changing the name of the folder or file in order to hide it (e.g., by prepending the name with .) is not a practical solution. Wikis based on, e.g., .NET solutions cannot just change the names of folders that would be empty in the Wiki. Although I think that hiding empty folders by default seems like a good idea, I also understand that clicking an empty folder shows the UI that allows a user to create a page for an empty folder, so hiding that folder would also remove functionality from the online UI. I think that many code-based Wikis wouldn't mind losing this functionality, but we probably need a top-level Code Wiki option here where you can decide whether to show or hide empty folders by default. That takes care of the default behavior, which would cover a lot of use cases for "cleaning up" the wiki's structure. However, if you elect not to hide folders by default, or if you just want to hide another file or folder, how can we support that requirement? I would suggest two mechanisms: * As suggested above, a .wikiignore file that allows globbing à la Git would be powerful (e.g., it would allow you to ignore all Properties folders in all project folders in .NET solutions). * We could extend the .order file to support !, which would hide the folder or file from being displayed. 
This feature would technically also cover all use cases covered by a .wikiignore file, but would involve quite a bit more work to support (i.e., you would have to add a .order file to every Properties folder instead of just configuring once, in a root file).

Stop trying so hard to use pattern-matching

2023-01-15T11:10:38+01:00

Published by marco on 15. Jan 2023 11:10:38 ------------------------------------------------------------------------ In the article "Why tuples in C# are not always a code smell" by Dennis Frühauff, the author writes the following code for calculating a discount. The requirements are as follows: * Premium customers get 20% off. * Gold customers get 30% off. * Regular customers, when they are students (< 25 years), get 10% off. * Regular adult customers get no discount. * All regular customers get 15% off during happy hour (3 to 8 p.m.). [The author's original version]

    public decimal CalculateDiscount(Customer customer, DateTime time)
    {
        if (customer.CustomerType == CustomerType.Gold)
        {
            return 0.3m;
        }
        else if (customer.CustomerType == CustomerType.Premium)
        {
            return 0.20m;
        }
        else
        {
            if (time.Hour is > 15 and < 20)
            {
                return 0.15m;
            }

            if (customer.Age < 25)
            {
                return 0.1m;
            }
            else
            {
                return 0m;
            }
        }

        return 0m;
    }

He doesn't like this code, and neither do I. But we have different reasons. [The author's pattern-matching version] The author rewrites the code above with pattern-matching, to make it "pretty much look like the business rules stated above". His final version looks like this:

    public decimal CalculateDiscount(Customer customer, DateTime time)
    {
        return (IsStudent(customer), IsHappyHour(time), customer.CustomerType) switch
        {
            (_, _, CustomerType.Gold) => 0.3m,
            (_, _, CustomerType.Premium) => 0.2m,
            (_, true, CustomerType.Regular) => 0.15m,
            (true, false, CustomerType.Regular) => 0.10m,
            (false, false, CustomerType.Regular) => 0.0m
        };
    }

    public bool IsStudent(Customer customer) => customer.Age < 25;

    public bool IsHappyHour(DateTime datetime) => datetime.Hour is > 15 and < 20;

I strongly disagree that this looks like the original business requirements.
In order to figure out who gets a 15% discount, you have to figure out what the first two boolean fields of the tuple indicate, so you look at the ad-hoc-instantiated tuple (which is created only in order to pattern-match on it), where you can see from the method names that they indicate whether the customer is a student and whether the sale was made during happy hour, respectively. I have a few issues with this version: * As noted above, it is not easily legible * I am not sure about the allocation or efficiency of this code * The extra formatting required (aligning the _ placeholders) makes it look difficult to maintain [Cleaning up the original without pattern-matching] I would tackle this differently, and with classic means. First of all, my main problem with the original version is that it's made unnecessarily long and cluttered by including else statements after returns. Get rid of those and you'll get rid of indenting and all of a sudden, the original code looks remarkably legible. It's also 100% clear that there are no allocations and we don't have to worry our pretty heads about the efficiency of code generated for either if and return statements or for simple comparisons.

    public decimal CalculateDiscount(Customer customer, DateTime time)
    {
        if (customer.CustomerType == CustomerType.Gold)
        {
            return 0.3m;
        }

        if (customer.CustomerType == CustomerType.Premium)
        {
            return 0.20m;
        }

        if (time.Hour is > 15 and < 20)
        {
            return 0.15m;
        }

        if (customer.Age < 25)
        {
            return 0.1m;
        }

        return 0m;
    }

[Improving semantics] How much clearer would you like that to be? I suppose we could add some local methods to add some semantics to the comparisons.
    public decimal CalculateDiscount(Customer customer, DateTime time)
    {
        if (IsLevel(CustomerType.Gold))
        {
            return 0.3m;
        }

        if (IsLevel(CustomerType.Premium))
        {
            return 0.20m;
        }

        if (IsHappyHour())
        {
            return 0.15m;
        }

        if (IsStudent())
        {
            return 0.1m;
        }

        return 0m;

        bool IsLevel(CustomerType customerType) => customer.CustomerType == customerType;
        bool IsStudent() => customer.Age < 25;
        bool IsHappyHour() => time.Hour is > 15 and < 20;
    }

To make up for the fact that we lost all of that delicious pattern-matching and those tuples from the author's version, we're using local methods. Is this an improvement? Overall, I think so. The first version was already pretty good, but now we've improved the semantics by taking the guesswork out of the magic numbers. The IsHappyHour method is definitely an improvement. The IsStudent method also imparts more knowledge about what the magic age of 25 means. Also, we've managed to separate the calculation of the rebate from the determination of the conditions that affect the rebate. [Pattern-matching: take two?] Can we do anything with pattern-matching, though? Can we use pattern-matching in a way that's more legible than the version proposed by the author? What about this?

    public static decimal CalculateDiscount(this Customer customer, DateTime time)
    {
        return (customer, time) switch
        {
            ({ CustomerType: CustomerType.Gold }, _) => 0.3m,
            ({ CustomerType: CustomerType.Premium }, _) => 0.2m,
            (_, { Hour: > 15 and < 20 }) => 0.15m,
            ({ Age: < 25 }, _) => 0.1m,
            _ => 0m
        };
    }

OK. That's not as bad as the author's version. It doesn't allocate a tuple just to be able to use a tuple, for starters. But is it more legible than the previous version? Not at all. We could, of course, improve the formatting to align all of the return values, but that's also no fun to maintain. The real issue with the pattern-matching solution is that we can no longer use local functions to improve semantics.
The only thing we could do would be to add an IsStudent property directly to the class ("extension properties are still being discussed" ). We cannot improve the semantics of the pattern-matching on DateTime because that type is not under our control. In conclusion, as with anything else in programming, you should be judicious in where you use the new and shiny features, always considering whether they're actually helping improve your code.
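
As a rough sketch of the IsStudent-property idea mentioned above, the following shows how a property on the class can give the pattern a name to match on. The Customer record and CustomerType enum here are assumptions made for illustration (the original article does not show their definitions), and the hour range is copied unchanged from the versions above.

```csharp
using System;

public enum CustomerType { Regular, Premium, Gold }

// Hypothetical Customer type; only the members used by the examples are modeled.
public sealed record Customer(CustomerType CustomerType, int Age)
{
    // Names the "magic" age so that a property pattern can refer to it.
    public bool IsStudent => Age < 25;
}

public static class Discounts
{
    public static decimal CalculateDiscount(this Customer customer, DateTime time)
    {
        return (customer, time) switch
        {
            ({ CustomerType: CustomerType.Gold }, _) => 0.3m,
            ({ CustomerType: CustomerType.Premium }, _) => 0.2m,
            // DateTime is not ours to extend, so the hour range stays inline.
            (_, { Hour: > 15 and < 20 }) => 0.15m,
            ({ IsStudent: true }, _) => 0.1m,
            _ => 0m
        };
    }
}
```

The student rule now reads a little better, but the happy-hour rule still carries its magic numbers, which is the asymmetry that the local-function version avoids.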

You should be using a GUI for Git

2023-01-11T21:21:28+01:00

Published by marco on 11. Jan 2023 21:21:28 ------------------------------------------------------------------------ [image]I've seen this "Noob question: Does anyone use things like git gui?" by Collekt again and again. "Just curious as I'm learning and getting familiar with git. Do real production teams use any kind of tools for git like "git gui" or others? Or does everyone just use it from command line? Thanks for any insight. :)" You almost certainly have several use cases for your source control: * clone/push/pull * commit * amend/squash/rebase interactive * merge * diff * code forensics (log/blame, cross-reference, find changes) * work with submodules The command-line isn't the most efficient or least error-prone for any of these tasks. For example -- something you do every day -- a good GUI client will let you very quickly navigate diffs in your working tree with only a few arrow-key presses. You can't beat that with the command line. And, once you have to merge ... you'll want a more powerful view on things than you're going to get from command-line tools. Of course, it's possible to merge on the command-line! I'm just saying it's more error-prone and not as efficient -- especially for most developers. There are probably a couple of "John Henrys" out there, but c'mon. It's great that the command-line exists! It allows us to build UIs on top of it. It allows us to integrate anything we'd like into a headless process like CI/CD. However, you're going to be more efficient with a good GUI. There are pros/cons to the various UIs. I've landed quite firmly on "SmartGit" after an evaluation of all of the other tools (in no particular order: Tower, VS, VSCode, GitLens, Kraken, GitExtensions, GitHub Desktop, SourceTree, Git GUI). Why an external rather than an integrated Git client? * Uniformity regardless of IDE * Hotkeys are more intuitive (in-IDE source-control tends to end up with strange hotkeys) * Ability to integrate a good merging tool (e.g. 
BeyondCompare) * etc. Why an integrated rather than external Git client? * inline change markers * inline history/blame * etc. You can use both, of course! Use whatever helps you be more accurate and efficient and happy. Visual Studio Code's default source control is very limited (no code forensics to speak of), so be careful of defaulting to that one. Visual Studio is getting better all the time, though. Still feels a bit weird for me, but it's 10x better than it was a couple of versions ago. Of course, YMMV, but please don't continue to believe in the myth that using a command line is somehow a requirement to being a "real" developer. Developers who only use the command line are probably wasting time, probably making mistakes they shouldn't, almost certainly missing out on powerful enhancements to their workflow.

Terminology for CSS values

2022-12-11T22:53:38+01:00

Published by marco on 11. Dec 2022 22:53:38 ------------------------------------------------------------------------ [image] The article ""Thousand" Values of CSS" by Karl Dubost clarifies the definitions for the various types of value in CSS. While there aren't a thousand different kinds of value in CSS, there are quite a few. Each has its raison d'être. The article is informative, but lists the values in what I consider to be an unintuitive order. I've changed the order and consolidated a bit. Each term links to the W3C documentation [1] and each definition starts with the official description, followed by a layman's translation and a simple code example. Click to jump to the definition or read them in order to learn how they build on each other. * "Initial value" <#initial> * "Declared value" <#declared> * "Cascaded value" <#cascaded> * "Specified value" <#specified> * "Computed value" <#computed> * "Used value" <#used> * "Actual value" <#actual> * "Resolved value" <#resolved> "Initial value" "Each property has an initial value, defined in the property’s definition table." I.e. the initial value could also be called the default value, as defined in the specification.

    p {
      /* the initial value of color is black */
    }

"Declared value" "Each property declaration applied to an element contributes a declared value for that property associated with the element." I.e. the declared value is the one that you've directly assigned to a property in a CSS rule.

    p {
      color: red; /* declared value is red */
    }

"Cascaded value" "The cascaded value represents the result of the cascade: it is the declared value that wins the cascade (is sorted first in the output of the cascade). If the output of the cascade is an empty list, there is no cascaded value." I.e. the cascaded value is the declared value that sorts first in the list generated by the cascade of declared values that apply to that element.
p { color: red; /* declared value is red */ } p { color: green; /* declared and cascaded value is green */ } "Specified value" "The specified value is the value of a given property that the style sheet authors intended for that element. It is the result of putting the cascaded value through the defaulting processes, guaranteeing that a specified value exists for every property on every element." I.e., the specified value is the cascaded value, or the default value for that property, if there are no cascaded values. p { color: red; /* declared value is red */ } p { color: green; /* declared, cascaded, and selected value is red. */ /* Also, the selected value for, e.g., margin-left is 0 because that's the default, and no value was specified. */ } "Computed value" "The computed value is the result of resolving the specified value as defined in the “Computed Value” line of the property definition table, generally absolutizing it in preparation for inheritance." I.e., the computed value is the specified value, but converted to absolute units (e.g., 2em converts to 32px if the font-size is 16px), or to a special value like auto. html { font-size: 16px; } p { font-size; 2em /* declared, cascaded, and selected value are 2em, but computed value is 32px. */ /* computed value of width is auto because there is no declared value, so the selected value is the initial value. */ } "Used value" "The used value is the result of taking the computed value and completing any remaining calculations to make it the absolute theoretical value used in the formatting of the document." I.e., the used value is the computed value, but special values are converted based on context. E.g., a computed value of width: auto will have a used value of width: 100px if the parent container is 100px wide. body { width: 100px; } p { width; auto /* declared, cascaded, selected, and computed value are 2em, but used value is 100px. 
*/ } "Actual value" "A used value is in principle ready to be used, but a user agent may not be able to make use of the value in a given environment. For example, a user agent may only be able to render borders with integer pixel widths and may therefore have to approximate the used width. Also, the font size of an element may need adjustment based on the availability of fonts or the value of the font-size-adjust property. The actual value is the used value after any such adjustments have been made." I.e., the actual value is the used value, but adjusted as necessary for the output device. p { border-width: 1.1px; /* declared, cascaded, selected, computed, and used value are 1.1px, but actual value is 1px. */ } "Resolved value" Despited the name, the value returned by the "getComputedStyle()" method will be either the computed or the used value, depending on the type of property. The result of this method is called the resolved value. body { width: 100px; } p { width; auto } const p = document.querySelector('p')[0]; const resolvedValue = window.getComputedStyle(p).width; /* resolvedValue == 100px */ -------------------------------------------------------------------------------- [1] The W3C documentation lists the terms in the intuitive order, but is quite extensive and technical. The summary in this article is, I hope, easier to understand.
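The chain of stages described above can also be read as a small pipeline. The following is only a toy sketch in TypeScript, not how a browser implements value processing: every type and function name is made up for illustration, inheritance is left out, the cascade is reduced to "last declaration wins", and rounding to whole pixels stands in for whatever adjustment a real device makes.

```typescript
// Toy model of the CSS value stages for one length-like property.
// All names here are hypothetical; real engines are far more involved.
type Px = { kind: "px"; value: number };
type Em = { kind: "em"; value: number };
type Auto = { kind: "auto" };
type Declared = Px | Em | Auto;

interface Context {
  parentFontSizePx: number; // what 1em resolves against in this toy model
  containerWidthPx: number; // what `auto` resolves to in this toy model
}

// Cascaded: the declaration that wins. Assumption: equal origin and
// specificity, so the last declaration in source order wins.
function cascadedValue(declared: Declared[]): Declared | undefined {
  return declared.length > 0 ? declared[declared.length - 1] : undefined;
}

// Specified: the cascaded value, or the initial value when there is none
// (inheritance is deliberately left out of this sketch).
function specifiedValue(cascaded: Declared | undefined, initial: Declared): Declared {
  return cascaded !== undefined ? cascaded : initial;
}

// Computed: relative units become absolute; keywords like auto survive.
function computedValue(specified: Declared, ctx: Context): Px | Auto {
  if (specified.kind === "em") {
    return { kind: "px", value: specified.value * ctx.parentFontSizePx };
  }
  return specified;
}

// Used: remaining keywords are resolved against the layout context.
function usedValue(computed: Px | Auto, ctx: Context): number {
  return computed.kind === "auto" ? ctx.containerWidthPx : computed.value;
}

// Actual: adjusted for the output device; here, whole pixels only.
function actualValue(usedPx: number): number {
  return Math.round(usedPx);
}
```

Feeding a declared 2em through cascadedValue, specifiedValue and computedValue with a 16px parent font size yields 32px, and an undeclared width falls back to auto and only becomes a number in usedValue, which mirrors the font-size and width examples in the article.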

How to evaluate dependencies

2022-12-04T22:11:39+01:00

Published by marco on 4. Dec 2022 22:11:39 ------------------------------------------------------------------------

As software developers, we are constantly making the decision between make or buy. Deciding to make something carries with it the obligation to design, develop, test, document, and support it. You'll have everything under your control, but you'll also have to do everything yourself.

If a component is not part of your project's core functionality, then it's often a good idea to look around and see if you can find someone who's already built that functionality. Optimally, the component you find will be free and open-source and will have been built by a team whose aim was to provide exactly that functionality. Because they've focused on their task, it's more likely to be a robust solution to your problem than what you would write yourself (focused, as you hopefully are, on your task). Their solution might go a bit too far (see "Size/Focus"), but that might be fine too (see "Extensibility").

Is the component good, though? What do we mean by "good"? How can we tell? How do we go about sizing up a dependency?

[Facets]

The following table outlines various facets to consider.

[Legal]
License: See the "Uster OSSPolicy" for compliance information
Cost: Free? One-time fee? Per-seat license?

[Organizational]
Maturity: How long has the product been around?
Activity: When was the last commit? The last release?
Maintenance Status: Is the project actively maintained? How long is the issue list? Are bugs addressed?
Popularity: How many stars? Is it widely used?
Community: Do questions get answered? Is there help on StackOverflow?
Reputation: Are there known issues with the product or maintainers?

[Technical]
Documentation: Is it sufficient? Are there good examples or tutorials?
Configuration: Can you just include the package? How is the configuration? Does it follow platform standards?
Size / Focus: Does it do one thing well? Or many other things you don't need?
Extensibility: How easy is it to extend the package for additional use cases? Will that matter to your project?
Efficiency / Performance: For this you have to know your non-functional requirements
Portability: Does it work on all target platforms and run-times? Are there unreasonable restrictions?
Transitive Dependencies: What are its dependencies? Are those reasonable?
Quality: What sort of impression does the project make overall? How does the code look?

[References]

* "A Dependency Checklist" by Justin Howard
* "How to Choose the Right Dependencies for Your Project" by Jamie Bullock

C# 11 Features

2022-11-21T22:48:57+01:00

Published by marco on 21. Nov 2022 22:48:57 Updated by marco on 7. Dec 2022 22:47:43 ------------------------------------------------------------------------ The articles "Twelve C# 11 Features" by Oleg Kyrylchuk and "Welcome to C# 11" by Mads Torgersen provide an excellent overview with examples of new features in C# 11, available with .NET 7.0. I include my own notes below. [Interesting and obviously useful] "Obvious" to me, at least. The terms link to examples in one of the articles linked above. "Native UTF-8 Strings" You can now append u8 to the end of a literal string to make it "UTF-8" instead of the system-standard UTF-16. For example, "Test string"u8 will be encoded by the compiler as UTF-8 and will have the type ReadOnlySpan<byte>. "Raw Strings (Here-Doc)" C# finally supports "here documents" (which have been supported in other languages like "Perl" or "PHP" for a long time). In C#, they're called "raw string literals" and, * they begin and end with at least three double-quotes * can be multi-line * can contain unescaped everything (unless you have three double-quotes in a row, in which case, you just add more double quotes to the fences at the beginning and end) * support interpolation * and also automatically trim left indenting. Finally, you can just pass a formatted and indented JSON into C# code, interpolate some variables, and do it all without escaping anything! [1] "Abstracting over static members" "In fact .NET 7 comes with a new namespace System.Numerics chock-full of math interfaces, representing the different combinations of operators and other static members that you’d ever want to use. [...] All the numeric types in .NET now implement these new interfaces – and you can add them for your own types too! So it’s now easy to write numeric algorithms once and for all – abstracted from the concrete types they work on – instead of having forests of overloads containing essentially the same code." 
See "here" for an example of using generic parameters in operators, or "Generic Math" for an example that uses some of the new interfaces, like IAdditionOperators and ISubtractionOperators. In that vein, there are a lot more interfaces that support generalized computation, like "ISpanParsable<TSelf> Interface", which "[d]efines a mechanism for parsing a span of characters to a value." "Required members" "Another ongoing theme that we’ve been working on for several releases is improving object creation and initialization. C# 11 continues these improvements with required members." "Generic Attributes" You can now make attributes generic and use a generic constraint to limit which types may be passed as type parameters (enforced by the compiler, rather than at runtime). E.g. [Generic<MyType>] applies an attribute of type GenericAttribute<T> parametrized with MyType. "Extended nameof Scope" This seems like a small one, but it's a welcome improvement. You can now use nameof with "method parameter[s] in an attribute on the method or parameter declaration." StringSyntaxAttribute [Added on 03.12.2022] This one is not technically part of C# -- it's actually included in .NET 7 -- but it's worth an honorable mention. You can now decorate a parameter to indicate the string-syntax that it supports. This allows IDEs to provide string-syntax-specific code-completion, highlighting, and error-handling. A good example is, of course, for regular expression patterns. While Rider and ReSharper have provided this support for certain constructors and methods (e.g. Regex or DateTime.Format), this is a welcome standardization that gives your own APIs the same star treatment. The post "What does the StringSyntaxAttribute do?" includes a list of the syntaxes supported out-of-the-box. The post "StringSyntaxAttribute for syntax highlighting" provides examples and screenshots. 
[Niche Additions] A few seem a bit dubious but are, I guess, welcome additions that will be useful to someone: "List patterns" You can do some wild matching with these (e.g. numbers is [_, >= 2, _, _] returns true if numbers is a four-element list where the second element is greater than or equal to 2). "Newlines in string-interpolation patterns" I guess it's nice that you can format complex variables inside an interpolated string, but I still think that you should just make a local variable instead. That would be more readable, in any case. "Auto-default Structs" This will allow you to define structs without being so pedantic about defining the constructor. "Unsigned Right-shift Operator" I know I'm almost certainly not going to use this one, but it nicely rounds out the support offered with the new System.Numerics interfaces and the increased generality offered by abstracting over static members (linked above). "File-scoped types" This seems kind of like an analog to unexported types declared in TypeScript, but I don't really see myself using them very much until we get the "type declaration from TypeScript" as well. "Source-generated regular expressions" This feature leverages the "source-generation" that's been available since ".NET 5" to avoid JIT for regular expressions by generating code for it directly. It's really great to see the .NET team getting mileage out of the features they're adding (I'm sure this isn't a coincidence). For another example of source-generation, see "Generating PInvoke code for Win32 apis using a Source Generator" by Gérald Barré, which explains how to use Microsoft's NuGet package Microsoft.Windows.CsWin32 to easily generate source for any Win32 API or type -- no more writing this stuff manually! 
-------------------------------------------------------------------------------- [1] Check out the following animation of converting an escaped string to a raw string in Rider (from the post "Rider 2022.3: Support for .NET 7 SDK, the Latest From C#11, Major Performance Improvements, and More!" by Sasha Ivanova): [image]

Waterfall vs. Agile vs. "Wagile"

2022-09-20T21:45:20+02:00

Published by marco on 20. Sep 2022 21:45:20 ------------------------------------------------------------------------ The article "Agile Projects Have Become Waterfall Projects With Sprints" by Ben Hosking argues that a lot of projects using agile aren't agile at all, but are "more like waterfall projects with upfront requirements, fixed deadlines, sprints and 2 weekly demos." Overall, I understand where the author is coming from, but I found the tone pretty overwhelmingly negative. I can only imagine what the author has seen to have put them in such a dark place. 😐 I thought that this was an interesting comment in the article: "You cannot create fixed deadlines unless you know all the requirements and guarantee no requirements are changed." However, you can create fixed deadlines (the world kind of expects them sometimes, e.g. when you're preparing for a conference that happens on a specific day), but then you have to be willing to adjust what will be delivered on that day. Agile started out in a world where a partial product could be delivered and still have value. That is not the case with all projects. That is why we have the designations MVP (Minimum Viable Product) and MMP (Minimum Marketable Product). Even agile projects have to be honest about what the minimum time frame is for an MVP, though. Where some projects have an advantage is that they can iterate in smaller increments after that, and can also deliver useful, though nonviable, pieces as artifacts of iterations. There are some projects where it's more difficult to carve out such deliverables. Although there is always work that has been planned and successfully accomplished and documented, it's sometimes hard to measure or see progress until a larger amount of work has been done. I suppose that's the art of planning and measuring. 
Here, it's also useful for technical team members or more technically oriented teams to learn how to consider administrative, planning, design, and documentation work as just as useful as producing technical artifacts (be they physical or virtual). A waterfall process doesn't help figure out what to do when the delivery cannot be completed on time. It (generally) has no plan for what to drop if you can't deliver on time. Also, it doesn't really have any ideas for what to do when new things "crop up". An agile process is supposed to help you triangulate toward a version of the product that can actually be delivered by the target date -- or help you better (and sooner) predict whether it's even possible to deliver anything useful by that date. I think you have to be honest about which projects really can be run in an agile way -- but then also make sure that they take advantage of agility to be bolder than they have been. Release early, release often, think about what your MVP is, all of those things are good to take from the agile process. As far as the "ceremony" of the process goes: I have always found value in the review and retros.

Applying the rule of least power in practice

2022-09-02T04:28:25+02:00

Published by marco on 2. Sep 2022 04:28:25 ------------------------------------------------------------------------ Someone asked "is there a js library that animates the text word by word like shown?" by DemDavors. A bunch of people answered "just do it with CSS!" and one or two recommended using "GSAP" (Green Sock Animation Platform). I'd just heard about that library in the following instructive video and had had a chance to investigate how it works. [media] I'd like to expand on the comments recommending the "rule of least power". They are absolutely correct, but you have to consider the entire task: * Learning how to use CSS for this task * Writing maintainable CSS to execute this task For those who already know how to do this and are trying to limit JS as much as possible then, by all means, use CSS only. For anyone else, "least power" means using CSS where possible, but not necessarily excluding JS if doing so improves maintainability, enhances developer speed and accuracy, and reduces errors. If you look at what GSAP does, it generally maps a high-level JS animation API to CSS animations and transitions. The concession you've made is to include animations using a relatively thin layer of JavaScript. That thin layer, though, is a change in technology (more power), which ensures that the animations will no longer work if JavaScript is disabled. However, you're actually using CSS animations under the hood, benefiting from the high-level and highly optimized implementations in the browser. So you've lost flexibility as far as user agents are concerned, but the performance is the same, and you've probably saved time debugging and tweaking the implementation. That might be a better balance for those developers who would have no idea how to animate the given example with native CSS. 
If they did that, they would first have to learn how to do it, taking up a lot of time, to say nothing of the fact that they might end up creating a suboptimal implementation, both performance- and maintenance-wise. Telling someone to "just use CSS" is technically correct, but also sounds a lot like answering "just use pipes" when someone asks how to install a toilet. There's a bit of detail missing there.
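To make the "thin layer" idea concrete, here is a minimal sketch of a word-by-word reveal in which the script only prepares markup and CSS does the animating. This is not GSAP's API; the function names, the "word" class, the "fade-in" keyframes and the 150ms default step are all assumptions made up for this illustration.

```typescript
// The animation itself stays in CSS. A stylesheet along these lines is assumed:
//
//   .word { display: inline-block; opacity: 0; animation: fade-in 0.4s forwards; }
//   @keyframes fade-in { to { opacity: 1; } }

// Escape text so that it can be placed safely inside HTML markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap each word in a span whose animation-delay grows with its position,
// so that the (assumed) CSS animation reveals the words one after another.
function wordsToAnimatedSpans(text: string, stepMs: number = 150): string {
  return text
    .split(/\s+/)
    .filter((word) => word.length > 0)
    .map(
      (word, index) =>
        `<span class="word" style="animation-delay: ${index * stepMs}ms">${escapeHtml(word)}</span>`
    )
    .join(" ");
}
```

The trade-off is the one described above: without JavaScript the spans are never generated, but once they exist the browser's own animation engine does all of the work.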

New feature for C#: Anchored types

2022-09-02T04:27:24+02:00

Published by marco on 2. Sep 2022 04:27:24 ------------------------------------------------------------------------

I recently answered the question "What features from other languages would you like to see in C#?" by BatteriVolttas.

I think "Anchored Declarations" and "Qualified Anchored Declarations" from Eiffel would be very useful. I like the name "anchored" because you're anchoring the type of one thing to another. Instead of using int throughout a class, you can just make e.g. a field named _id be an int and then make all other types (e.g. for the parameter passed to a method) refer to the anchor with like _id or typeof _id. If the type of the field ever needs to change, you only need to update one place.

It's more expressive because the alternative is to explicitly write the type of the parameter, whereas that was never what was going on. The method doesn't decide what the type is; we're just used to _syncing_ it to the type of the field _manually_ because there is no way to express the relationship in most languages we're using.

Here's an example (in hypothetical C# syntax):

    class A
    {
      int Status { get; set; } = 0;
      like Status PriorStatus { get; }
      void Start(like Status s) {}
      void Stop(like Status s) {}
    }

The syntax is similar to how ref and out work now, but looking at it takes a bit of getting used to, especially for the property declaration.

TypeScript has this feature, with the "typeof" operator, but they don't name it. TypeScript has two advantages here: it places the type after the variable name, which feels a bit more natural when the type is expressed with multiple words, and TypeScript has implicit return types, so you don't have to write the type at all in many cases. Because of the implicit typing, TypeScript has technically had anchored types all along!

    class A {
      status: number = 0;

      // The implicit type here is derived from "status",
      // which "anchors" the type of the function to that field.
      get priorStatus() { return this.status; }

      // Here we're obliged to restrain the type explicitly
      start(s: typeof this.status) {}
      stop(s: typeof this.status) {}
    }

As of TypeScript 4.7, it "supports qualified anchored declarations" on private fields as well.

Someone suggested in a response that generics might fill this bill already. In a way, yes, that's true. I could define the whole class with a generic type argument and then create a derived type that fixes the type argument to int.

    class A<TStatus> where TStatus : INumber<TStatus>
    {
      TStatus Status { get; set; } = TStatus.Zero;
      TStatus PriorStatus { get; }
      void Start(TStatus s) {}
      void Stop(TStatus s) {}
    }

    class IntA : A<int> {}

We have to use the newest features from C# 11 in order to be able to initialize the value to 0. If the initial value didn't map to a mathematical concept that the new interfaces expose (e.g. the additive or multiplicative identity), then we wouldn't be able to use the generic approach.

It feels a bit like misuse of generics, though, when I just wanted a shorthand for letting one type reference another. As I wrote, TypeScript already allows this and seems to have found it a useful addition to generics (you can probably implement it under-the-hood with the same code in the compiler). I feel the same way about the missing type declaration from TypeScript (or the very similar, but less powerful typedef from C or Pascal).
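For comparison, TypeScript's indexed access types give another way to anchor one type to another, and they also work outside of the class. The sketch below is only an illustration; the class name Job and the function describeStatus are made up for this example.

```typescript
// "status" is the anchor: it is the only place where the type is spelled out.
class Job {
  status: number = 0;

  // Anchored to the type of "status" via an indexed access type.
  priorStatus: Job["status"] = 0;

  // The parameter follows the anchor as well.
  start(next: Job["status"]): void {
    this.priorStatus = this.status;
    this.status = next;
  }
}

// The same anchor can be used from outside the class.
function describeStatus(status: Job["status"]): string {
  return `status=${status}`;
}
```

If status were changed to, say, a union of string literals, the initializers and the callers that still pass numbers would be flagged by the compiler, while none of the anchored annotations would need to be touched.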

Quick CSS: text-decoration

2022-03-21T22:50:55+01:00

Published by marco on 21. Mar 2022 22:50:55 ------------------------------------------------------------------------ The article "When to Avoid the text-decoration Shorthand Property" by Šime Vidas makes a couple of interesting points. Basically, you have a lot of control over how underlines are drawn on text. * Modern browsers allow you to tweak the "text-decoration: underline" with "text-decoration-thickness" and "text-underline-offset" * Use the ":any-link" to select links that actually have an href attribute rather than selecting all links. * The CSS property "text-decoration is a shorthand property, which means that setting it overwrites all of the properties that it might represent (including the underline thickness)." The article doesn't mention these, but, * "text-decoration-skip" controls how to underline whitespace * "text-decoration-skip-ink" controls whether a text decoration (underline or overline) can touch the ascenders or descenders of glyphs. The following text has the style text-decoration: underline .4em; text-underline-offset: .4em. Note that it doesn't affect the bounding box. squabbling The following text has the style text-decoration: underline; text-decoration-skip: spaces; text-decoration-skip-ink: all. Note that text-decoration-skip only works with Safari at the time of writing. squabbling with boggling

The "Hustle" culture in Software Development

2022-01-24T17:20:05+01:00

Published by marco on 24. Jan 2022 17:20:05 ------------------------------------------------------------------------ Have you noticed that there is more and more content available to help you learn how to program? For every topic under the sun, there seems to be a blog article or video of superficially reasonable quality. For every question on StackOverflow, there's an effusive answer with examples. This is all pretty great, honestly. However, with the increase in content, there is also the need to be able to wade through it. How old is that StackOverflow answer? How appropriate is the answer to your particular question? Are there other solutions? Maybe easier ones? Maybe more modern ones? Has this particular problem been addressed in more recent versions? This isn't new, of course. You should have been asking yourself questions like this for quite a while with these so-called expert-community sites. However, now, we're also inundated with content from people hustling to make a living as professional, freelance advice-givers online. This is not a bad thing, necessarily. It's great that the unsung masters that formerly only provided value inside of a single company are bringing their didactic abilities to the world. That's not all that they're doing, though. Those who are on a subscriber model have to publish content in order to keep their subscribers. They don't even necessarily have to produce anything of lasting value -- they just have to produce something. They just have to retain and/or grow their subscriber base. This leads to nice-looking, but ultimately useless "fluff" content that rehashes an old concept with a few flashy graphics or an accompanying video. And the videos! Many of them take 15 minutes to explain a concept that you could describe adequately in a paragraph and a code example. 
The Microsoft MVP bloggers are very conspicuous these days: there are many who are publishing an article or two per week "explaining" a C# 10 feature that has already been explained to death in dozens of other high-profile articles -- to say nothing of the article "Welcome to C# 10" by Kathleen Dollard , which comes straight from the horse's mouth, is wonderfully written, and, honestly, says all there needs to be said about these features. But, if you search for "C# 10", there is a flood of repetitive and, sometimes, outdated, information on C# 10. And these authors are all still churning out the articles. They're doing it for the clicks, for the ad-views, for the subscribers. It's a living. I get it. But, overall, it contributes to a very muddled picture that makes it difficult for people looking for advice and assistance.

CSS Speedrun

2022-01-22T12:22:44+01:00

Published by marco on 22. Jan 2022 12:22:44 ------------------------------------------------------------------------ If you want to test or hone your CSS skills, check out the "CSS Speedrun" . It lets you warm up with a relatively easy "intro", then takes you through ten levels. Generally, each level tests a different feature of CSS (usually a specific selector). The final question (pictured) makes you combine what you've learned or used from other levels. The image below is from my second time through. The first time through I needed about nine minutes; the next morning, I got through much more quickly. I guess I'd learned something. 🎉 for me. [image]

TIL: nth-of-type() and na+b in CSS

2022-01-21T11:12:46+01:00

Published by marco on 21. Jan 2022 11:12:46 ------------------------------------------------------------------------ I've known about nth-child(n) for a long time. It selects the nth child from a structure if that child happens to match the given tag. You can always select the nth child by omitting the tag. For example, div :nth-child(2) (two selectors) will match the second child of any div, regardless of type. However, div span:nth-child(2) will only match if the second child is also a span. You cannot write a selector that says "select the second span" using nth-child. That's where nth-of-type(n) comes in. The selector div span:nth-of-type(2) does exactly that. I can't recall that I've ever had this need before, but it's also possible that I ended up adding extra tags or convoluted selectors in order to achieve what could have been more elegantly done with nth-of-type. Additionally, while I was aware that nth-child supported constants and the keywords odd and even, I didn't know that it also supported a formula an + b. The a is a multiplier and b is an offset. With this formula, you can select every third or fifth (or whatever) element and then move the selection by a given offset. The selectors first-of-type, last-of-type, etc. also exist, as well as only-of-type, which matches an element when it's the only child of that type in the parent. See "Meet the Pseudo Class Selectors" by Chris Coyier for more information. You may see where this is heading. The article "The wondrous world of CSS counters" by Chen Hui Jeng includes an example where he writes the famous FizzBuzz program with CSS. Start with an ordered list,

Then apply the following CSS to it, ol { list-style-position: inside } /* To line-up all items neatly */ li:nth-of-type(3n+3), li:nth-of-type(5n+5), li:nth-of-type(3n+3):nth-of-type(5n+5) { list-style: none /* When text of Fizz, Buzz or FizzBuzz appears, get rid of the numbers */ } li:nth-of-type(3n+3)::before { content: "Fizz" } li:nth-of-type(5n+5)::before { content: "Buzz" } li:nth-of-type(3n+3):nth-of-type(5n+5)::before { content: "FizzBuzz" } Put it all together and you get "CSS FizzBuzz" .
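To see which list items a formula such as 3n+3 picks out, it can help to evaluate it by hand. The following sketch mimics the an + b matching rule in TypeScript (positions are 1-based and n runs over 0, 1, 2, ...). The function names are made up for this illustration, and the code only models the selection logic; it is not how a browser implements selectors.

```typescript
// Returns true if the 1-based position equals a*n + b for some integer n >= 0.
function matchesNth(a: number, b: number, position: number): boolean {
  if (a === 0) {
    return position === b; // no multiplier: only the offset itself matches
  }
  const n = (position - b) / a;
  return n >= 0 && Math.floor(n) === n;
}

// Mirrors the three rules in the CSS above: the combined selector is more
// specific, so "FizzBuzz" wins wherever both formulas match.
function fizzBuzzLabel(position: number): string {
  const fizz = matchesNth(3, 3, position);
  const buzz = matchesNth(5, 5, position);
  if (fizz && buzz) return "FizzBuzz";
  if (fizz) return "Fizz";
  if (buzz) return "Buzz";
  return String(position);
}
```

Note that 3n+3 selects the same items as 3n: the only extra candidate for 3n is position 0 (for n = 0), which doesn't exist because positions start at 1.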

Accessibility is important

2021-12-28T23:45:47+01:00

Published by marco on 28. Dec 2021 23:45:47 ------------------------------------------------------------------------ [image]I recently read through the "a11y myths" . They're quite interesting and should be required reading for managers running projects that develop web sites. From it, I learned about the evils of overlays (see the "Overlay Fact Sheet" ) and that there are really good resources out there, like "Understanding Conformance" with WCAG 2.0 (Web Content Accessibility Guidelines). "All WCAG 2.0 Success Criteria are written as testable criteria for objectively determining if content satisfies them. Testing the Success Criteria would involve a combination of automated testing and human evaluation. The content should be tested by those who understand how people with different types of disabilities use the Web." If you build custom controls, you should use "ARIA" . That page includes the following note, "Many of these widgets were later incorporated into HTML5, and developers should prefer using the correct semantic HTML element over using ARIA, if such an element exists. For instance, native elements have built-in keyboard accessibility, roles and states. However, if you choose to use ARIA, you are responsible for mimicking (the equivalent) browser behavior in script." If you do need to use ARIA, then there's a set of rules for its use in the article "Notes on ARIA Use in HTML" . While we're on the topic of building your own custom controls instead of using the built-in HTML inputs, we can also talk about how "Good semantics" also goes a long way to having good accessibility, right out of the gate. So, go ahead and use main, nav, header, footer, aside, section, and article. There's some really good advice in there on writing clearly (e.g. use full month names and clarify abbreviations) as well as using meaningful text in links (e.g. don't just use "click" or "here").

TIL CSS border-radius lets you define ellipses

2021-12-26T09:24:51+01:00

Published by marco on 26. Dec 2021 09:24:51 ------------------------------------------------------------------------ I hadn't ever really thought about it because I don't use the API very much, but it turns out that the border-radius property is not only a shorthand for setting all four corners at once, but also sets the horizontal and vertical radii simultaneously. To set them individually, use a / between two sets of values. The corner radii are then calculated using ellipses as shown in the following visualization, [image] The article "CSS Border-Radius Can Do That?" by Nils Binder has many more examples. It also introduces a "Fancy-Border-Radius tool" to help you create the desired shape visually. [image] CSS includes the much more generalized "shape() API" [1], but it wouldn't be as easy to define the "blobs" shown above with that API because the "blob" is defined by the intersection of four overlapping ellipses and the shape() API doesn't allow combining multiple shapes into one shape. Not only that, but the "blob", as defined by the eight values shown above, can be quite easily animated by providing the end "blob" to a transition or by providing several "blobs" to tweenable @keyframes. You can see the technique in action in "this CodePen". Scroll all the way down in the CSS definition to see that the effect uses a combination of morphing the border-radius and rotating using a transform to achieve a quite-complex and organic effect using only very straightforward and highly available CSS. @keyframes morph { 0% {border-radius: 40% 60% 60% 40% / 60% 30% 70% 40%;} 100% {border-radius: 40% 60%;} } @keyframes spin { to { transform: rotate(1turn); } } -------------------------------------------------------------------------------- [1] You can even use "tricks" to create many shapes without using the shape() API either. See "The Shapes of CSS" by Chris Coyier for many, many examples.
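The way the shorthand expands into eight values can be sketched in a few lines of TypeScript. This is only an illustration of the expansion rules (one to four values per side of the slash, filled in the order top-left, top-right, bottom-right, bottom-left), not a full CSS parser: it doesn't validate units and doesn't handle functions like calc(). All names are made up for this example.

```typescript
type CornerRadii = {
  topLeft: string;
  topRight: string;
  bottomRight: string;
  bottomLeft: string;
};

// Expands one to four values. Missing values are copied from the diagonally
// opposite corner; a single value applies to all four corners.
function expandFour(values: string[]): CornerRadii {
  const [a, b = a, c = a, d = b] = values;
  return { topLeft: a, topRight: b, bottomRight: c, bottomLeft: d };
}

// Splits "horizontal / vertical". Without a slash, the vertical radii
// are the same as the horizontal ones.
function expandBorderRadius(shorthand: string): { horizontal: CornerRadii; vertical: CornerRadii } {
  const [h, v] = shorthand.split("/").map((part) => part.trim().split(/\s+/));
  const horizontal = expandFour(h);
  return { horizontal, vertical: v ? expandFour(v) : horizontal };
}
```

Applied to the two keyframes above, "40% 60% 60% 40% / 60% 30% 70% 40%" yields eight distinct-looking radii, while "40% 60%" expands to 40% for the top-left and bottom-right corners and 60% for the other two, both horizontally and vertically.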

Real Hacks are not easy

2021-12-23T15:30:47+01:00

Published by marco on 23. Dec 2021 15:30:47 ------------------------------------------------------------------------ Most of us know "hackers" from the media -- either the news media, television shows like Mr. Robot, or movies like Swordfish. But the fast and easy way of hacking presented in the media actually does a disservice to how incredibly clever these hacks really are. Less-complex techniques -- like guessing or brute-forcing passwords -- still work super-well. And you've always got social engineering hacks, like just asking someone for their credentials in an official-sounding way. But real, technical hacking involves getting to know a system's dependencies and memory layout and runtime environment even better than the original programmers ever did. Note: Both of these issues have been fixed, but it’s fascinating to read about how they did it. It really offers insight into what to avoid doing in your own code (e.g. do not open a WebSocket on 0.0.0.0). [NSO's zero-click iMessage exploit] The first article "A deep dive into an NSO zero-click iMessage exploit: Remote Code Execution" by Ian Beer & Samuel Groß is a longer read, but I found it fascinating how many pieces they needed to chain together in order to hack iMessage -- which they managed to do with a 0-click exploit. Just sending a message to the phone with a specially coded picture in it was enough to trigger code to run automatically that, unfortunately, ran before the sandbox. It overwrote memory in a controlled manner -- making sure not to crash the app -- and set up its own virtual machine to execute arbitrary code, which it then did. "JBIG2 doesn't have scripting capabilities, but when combined with a vulnerability, it does have the ability to emulate circuits of arbitrary logic gates operating on arbitrary memory. So why not just use that to build your own computer architecture and script that!? That's exactly what this exploit does. 
Using over 70,000 segment commands defining logical bit operations, they define a small computer architecture with features such as registers and a full 64-bit adder and comparator which they use to search memory and perform arithmetic operations. It's not as fast as Javascript, but it's fundamentally computationally equivalent. "The bootstrapping operations for the sandbox escape exploit are written to run on this logic circuit and the whole thing runs in this weird, emulated environment created out of a single decompression pass through a JBIG2 stream. It's pretty incredible, and at the same time, pretty terrifying." [VSC with WSL opens an unprotected WebSocket] The second hack is less wide-reaching, in that it would apply only to certain software developers using certain tools, which automatically limits the audience. The "RCE in Visual Studio Code's Remote WSL for Fun and Negative Profit" by Parsia describes, in relatively easy-to-follow detail, how the author found a pretty big hole in the remote-debugging support for Visual Studio Code using WSL (Windows Subsystem for Linux). In order for it to work, the user had to approve opening the port in the Windows Firewall, but it was kind of unconscionable that it opened such a big hole. The developer could be forgiven for thinking that it was OK to approve the request, given that they had just initiated an action to debug between machines. Approving a firewall request in that situation is not only expected, but incredibly common. The dialog box doesn't provide any information about which ports it wants to open. The Local WebSocket Server Every time you see a local WebSocket server, you should check WHO can connect to it. "WebSocket connections are not bound by the Same-Origin Policy and JavaScript in the browser can connect to local servers." -- TL;DR WebSockets WebSockets start with a handshake. 
It is always a "simple" (in the context of Cross-Origin Resource Sharing or CORS) GET request so the browser sends it without a preflight request.
These bugs can be chained:
1. The local WebSocket server is listening on all interfaces. If allowed through the Windows firewall, outside applications may connect to this server.
2. The local WebSocket server does not check the Origin header in the WebSocket handshakes or have any mode of authentication. The JavaScript in the browser can connect to this server. This is true even if the server is listening on localhost.
3. We can spawn a Node inspector instance on a specific port. It's also listening on all interfaces. External applications can connect to it.
4. If an outside app or a local website can connect to either of these servers, they can run arbitrary code on the target machine.
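The first two bugs in the chain suggest what a local server could do differently. The following is only a minimal sketch of that idea, not the fix that was actually shipped: the names BIND_HOST, ALLOWED_ORIGINS, and handshake_allowed, the example origin, and the bearer-token scheme are all assumptions made up for illustration.

```python
import hmac
from typing import AbstractSet, Mapping

# Hypothetical settings for illustration only.
BIND_HOST = "127.0.0.1"  # loopback only, rather than "0.0.0.0" (all interfaces)
ALLOWED_ORIGINS = frozenset({"https://app.example"})


def handshake_allowed(
    headers: Mapping[str, str],
    expected_token: str,
    allowed_origins: AbstractSet[str] = ALLOWED_ORIGINS,
) -> bool:
    """Decide whether to accept a WebSocket upgrade request."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Browsers send an Origin header with the handshake. Since the
    # Same-Origin Policy does not apply, the server has to check it itself.
    origin = normalized.get("origin")
    if origin is not None and origin not in allowed_origins:
        return False

    # Non-browser clients can omit or forge Origin, so a shared secret
    # is required as well (constant-time comparison).
    supplied = normalized.get("authorization", "")
    expected = f"Bearer {expected_token}"
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

A request from a page on an unlisted origin is rejected even when the server only listens on localhost, which is the case that bug 2 describes. Binding to BIND_HOST instead of all interfaces addresses bugs 1 and 3.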

State of CSS 2021

2021-12-23T09:55:59+01:00

Published by marco on 23. Dec 2021 09:55:59 ------------------------------------------------------------------------ I just finished reading through the "State of CSS 2021" . It's a well-presented [1] summary of a developer survey about CSS. I liked the following sections:
* "Features": The sub-sections have a pretty fine-grained listing of CSS features, usage, "caniuse" and "MDN" links, if you're interested in finding out what you might be missing...or want to be smug about all of the CSS features you know about and use.
* "Technologies": The rankings in the sub-sections are broken down by "Satisfaction", "Interest", "Usage", and "Awareness".
* "Awards": "PostCSS" , "vanilla-extract" , "CSS Modules" , and "Stitches" seem worth following up on.
* "Conclusion": They rightly note that "container queries" [2] and "Cascade Layers" [3] will drastically reduce the number of media queries and prefixing and pre- or post-processing needed for CSS. Another step in eliminating pre- or post-processing would be to support CSS nesting natively, but that's a "bit farther out" . [4]
--------------------------------------------------------------------------------
[1] The permalinks next to the titles don't work. I submitted an "issue" and it may already be fixed by the time you browse to it.
[2] Available behind a flag in Blink-based browsers (Chrome, Edge, Opera). In-progress in WebKit/Safari. There's a "container-query polyfill" available.
[3] Available behind a flag in Blink-based browsers and Firefox and the Safari TP. [5] No polyfill, so you really have to wait for non-flag release to use this feature.
[4] There are support bugs for all major browsers, but no released implementation so far, even behind a feature flag.
[5] Technology Preview

CSS sub-grids (and grids) with Kevin Powell

2021-11-13T13:36:23+01:00

Published by marco on 13. Nov 2021 13:36:23 ------------------------------------------------------------------------ I've been using CSS Grids for a while now. I've found many instances where I had used flexbox, where grids turn out to be much more appropriate. That is, the grid layout algorithm lets me specify what I want without fiddling about with flex-basis and flex-grow, etc. Flexbox definitely has its place, but I think we all ended up abusing it a bit in our rush to leave tables-for-layout behind. But that's all in the past because now we have CSS grids "available everywhere" and all is well with the world! That being said, if you've not used CSS grids yet, then you should check out this CSS-grid super-fan's many videos. There is a "playlist of CSS Grid videos" by Kevin Powell that you can work your way through. He even made a short video (5min) describing how to use the grid inspector in browsers. The grid inspector is super-handy, but not so intuitive to find. [media] [Sub-grids] I'm more interested in what the same guy has to say about sub-grids, which are "currently only available in Firefox" (but they've been available there for over 2 years now). The 8-minute video below shows a concrete, real-world example, where you can see how little effort is required to get the browser to just align everything for you, all without fixed minimum or maximum widths (just like it used to be with tables). It should be immediately obvious why this feature is both a good thing and necessary (because the behavior can't be replicated with existing CSS layout features). [media] The 11-minute video below shows how the generalized mechanism lets you do the same thing for rows: [media] You can find the full list of sub-grid videos (so far) in the "Subgrid playlist" by Kevin Powell . CSS sub-grids are an elegant way of aligning items without hard-coding anything (as required by existing techniques). They will continue to do what you expect regardless of the content added -- i.e. 
there are no fixed minimum or maximum heights to make the alignment work, so you won't be surprised when one of these artificial restrictions limits the algorithm unnecessarily (as it would with flexbox or regular grids). You can enable "Subgrid" by including grid-template-columns: subgrid. My advice to the feature designers would be to rename the value to grid-template-columns: inherit because that would be closer to the mark. Several times in the video, Kevin has to correct himself that he's talking about the same grid rather than a copy of the grid. That's what the nested container is doing: it's inheriting the grid from a parent. Since it also has to declare itself as a display: grid, it can choose to inherit or explicitly set a template for its rows and/or columns. I think that would be relatively intuitive, but what do I know? This feature kind of feels like a generalized way of getting back one of the advantages of the table-layout algorithm. The table-layout algorithm makes the cells in columns the same width throughout the table. This, despite the fact that the cells are all defined in different parents -- and columns aren't even defined as elements at all. I think we all understand why it's not a good idea to abuse the table semantics just to be able to use the table layout algorithm. It's nice to see that the advantages of that layout are being rescued -- and generalized to be even more powerful.

Why can't identifiers start with a number?

2021-11-10T11:01:27+01:00

Published by marco on 10. Nov 2021 11:01:27 ------------------------------------------------------------------------ The video "I'm not sure how much longer I can wait!" by Kevin Powell is an excellent introduction to sub-grids in CSS. But I was more interested in the fact that he told his viewers that, "you can use numbers in classes, but if you have a class or id that starts with a number, it's invalid. [...] It's one of those weird things in CSS that sometimes trips people up." I immediately thought to myself, "it's not weird. Every programming language is like that." Then, I thought, "I bet this guy only knows CSS, so he doesn't have anything to compare it to." Then, I thought, "Wait...why can't you start an identifier with a number?" And, finally, "I bet it's a lexing/parsing thing." [Parser or lexer?] I've written several parsers for medium-sized languages and my gut feeling is that letting an identifier start with a number seems like a surefire way of making the lexer more ambiguous or pushing more work into the parsing stage. For example, if 25L can be either an identifier or a long integer, then the parser has to figure out from context which one it is (e.g. by checking whether that identifier is declared). If it can only be a number, then it comes out of the lexer as a number token and the parser doesn't have to disambiguate. Even if your language doesn't allow suffixes, you'd still have the problem with an identifier like 25, which would be legal unless you introduce the additional restriction that an identifier must have at least one alphabetic character. In that case, though, you might as well make the rule that the identifier has to start with an alphabetic character and avoid the whole ambiguity. With that common -- not weird! -- rule, the disambiguation happens in the lexer, where the operation is clearer and less expensive, performance-wise. [Unresolvable ambiguity] It's actually worse than that, though. 
In the case of a programming language, you could see how the following would result in a compiler ambiguity:

    var 3 = 5; // I'm already confused
               //...the compiler gets it, though
    var a = 3; // Now, the compiler's confused as well

Is the developer assigning the value 3 to a, or the value of the variable 3? Not only is this a terrible idea for readability, the compiler can literally not resolve this ambiguity without additional information. So there have to be restrictions on identifier names in order to avoid clashes with not only reserved words (e.g. if) but also manifest constants (e.g. 3). [Where's the problem with CSS?] In the case of CSS, where you do have suffixes (e.g. 25px) but you can't really mix class identifiers with values, it's possible that you could get away with no ambiguities right now. So it's not weird that you can't start an identifier with a number -- it's perfectly natural for developers -- but it is, in the case of CSS, not required for unambiguous processing. As you can see below, though, it's still kind of confusing for the user. What if we have a class named "3"? It's not very expressive -- we'd probably call the class something like "3-part-panel" -- but it's the pathological case. Maybe a class called "3px" would be even worse.

    .3-part-panel { /* This is fine */ }
    .3 { /* Weird, but OK */ }
    .3px { /* Now you're just being obnoxious */ }

Do we actually get any ambiguities, though? I don't think so. I think in this case, the authors of CSS just used the "standard" (not weird!) definition of an identifier. It's only when you have people using CSS who have had no exposure to any other programming languages (or parsing/lexing) that you get people thinking it's "weird" that you can't start with a number. The only place where you could get an ambiguity is with CSS custom properties. In that case, though, "[a] custom property is any property whose name starts with two dashes", according to "CSS Custom Properties for Cascading Variables Module Level 1" . 
So, variable names in CSS are even more restricted than in most programming languages. Is that weird? Again, no. As in the case above with other programming languages, the end result is more clarity for the user. For example, the following declares a few hypothetical, unprefixed CSS custom properties with deliberately obnoxious names.

    :root {
      red: #F33;
      color: #FF0;
      0: 1;
      3px: 1px;
    }

    .error-text {
      color: var(red);
      background-color: var(color);
      border-width: var(3px);
      opacity: var(0);
    }

Although I've chosen confusing values and names, this doesn't -- at first glance -- seem to cause any ambiguities. As with the examples above, it does force implementations to handle enumerations (e.g. all of the colors) in the parser, rather than the lexer. If the word "red" cannot be used as a variable, then it could (possibly) be recognized as its own token in the lexer, (possibly) improving performance. The same goes for the property names. If it's possible for custom properties to use the same names as built-in properties, then the lexer can't handle them. There is no ambiguity when reading a value because custom-property values must be resolved using the CSS function var(). The problem is worse than that, though. There is an actual ambiguity that isn't obvious because we're using the :root pseudo-class [1]. The example below, using <html>, makes it clearer.

    html {
      color: #F33; /* Is this setting the color
                      ...or declaring a color variable? */
    }

This is an ambiguity that the compiler cannot resolve. So that's why the CSS designers settled on a prefix for custom properties. So, to a layman or user of CSS, naming restrictions on class or custom-property identifiers may seem arbitrary and "weird", but they are a logical requirement of being able to process the grammar unambiguously. -------------------------------------------------------------------------------- [1] If you know where I'm headed, then fine, it's obvious to you. Congratulations. I didn't see it immediately, so I'm writing it this way.
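The lexer argument from the "Parser or lexer?" section can be made concrete with a small sketch. It models a hypothetical toy language, not the CSS grammar or any real compiler; the rule names NUMBER, STRICT_IDENT, and LOOSE_IDENT and the classify helper are invented for illustration.

```python
import re
from typing import List

# Token rules for a hypothetical toy language.
NUMBER = re.compile(r"\d+[A-Za-z]*")        # digits plus optional suffix, e.g. 25L
STRICT_IDENT = re.compile(r"[A-Za-z_]\w*")  # the common rule: no leading digit
LOOSE_IDENT = re.compile(r"\w+")            # what allowing a leading digit means


def classify(lexeme: str, ident_rule: "re.Pattern[str]") -> List[str]:
    """Return every token kind whose rule matches the whole lexeme."""
    kinds = []
    if NUMBER.fullmatch(lexeme):
        kinds.append("NUMBER")
    if ident_rule.fullmatch(lexeme):
        kinds.append("IDENT")
    return kinds


# With the common rule, the lexer alone can decide what each lexeme is.
print(classify("25L", STRICT_IDENT))  # ['NUMBER']
print(classify("a3", STRICT_IDENT))   # ['IDENT']

# With a leading digit allowed, the same lexeme matches both rules, so the
# decision has to be deferred to the parser (or cannot be made at all).
print(classify("25L", LOOSE_IDENT))   # ['NUMBER', 'IDENT']
print(classify("3", LOOSE_IDENT))     # ['NUMBER', 'IDENT']
```

Under the strict rule, every lexeme gets at most one token kind, which is what keeps the disambiguation in the lexer.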

C# 10 Features

2021-06-05T23:04:48+02:00

Published by marco on 5. Jun 2021 23:04:48 Updated by marco on 11. Nov 2021 08:20:38 ------------------------------------------------------------------------ The article "Introducing C# 10" by Ken Bonny discloses some incremental but very welcome changes to the C# language in the iteration that will be released with .NET 6 in November. In no particular order:
* field in property accessors to manipulate the backing field without having to define it. This is a welcome improvement that will clean up useless boilerplate for properties that need to do something with the value before storing it (e.g. field.Trim()).
* The required keyword for properties in any of the supported types (e.g. records, classes, structs, or record structs). This lets types enforce initialization without forcing a constructor parameter. The compiler will force callers to initialize the property in the object initializer instead.
* record struct for records that are value instead of reference types
* operator overloads in records
* The with operator will work with anonymous classes as well as declared types.
* global usings for commonly used namespaces (e.g. System) to cut down on clutter in files
* namespace without braces will put all types in that file into that namespace. This cuts down on an indenting level in all files.
* Improvements to lambdas: attributes on parameters and return types, explicit return types
* Static methods on interfaces (to round out the default-implementation feature introduced in C# 8)
* Constant interpolated strings (e.g. $"Hello {Name}" is considered constant if Name is also considered constant (recursively, of course)). Update on November 11th, 2021 from "Dissecting Interpolated Strings Improvements in C# 10" by Sergey Teplyakov : This feature is based on a nice performance improvement, as well. The compiler now understands interpolated strings and emits more efficient code rather than always using string.Format(), which incurred allocations for boxing, time for parsing, etc. There are even attributes to hook the compiler output that could be, e.g., "used by logging frameworks to avoid string creation if the logging level is off."
* Update on June 7th, 2021 from "A Closer Look at 5 New Features in C# 10" by Matthew MacDonald : Introduce !! suffix for method arguments that instructs the compiler to generate a null-check for that argument. So, string is not nullable, but not checked (i.e. the developer is responsible for including a check to avoid a NullReferenceException if one slips past the compiler), string? is nullable, and string!! is not nullable and checked. This will avoid a ton of boilerplate argument-checks. Can't wait.
* Update on November 4th, 2021 from "A quick review of C# 10 new language features" by Thomas Levesque : The compiler will now "[a]utomatically infer a “natural” type for a lambda", so you can now use var to declare a variable to which you assign a manifest lambda. E.g. var isEven = (int n) => n % 2 == 0; automatically gets the type Func<int, bool>.
* Also from the same November 4th article: You can now "[m]ix declarations and variables in deconstruction" so that you can now write (x3, int y3) = p; where x3 is a preexisting variable.
I really appreciate how the changes build on changes that came in previous versions. There's a very noticeable direction that they're pulling in with these language changes:
* Being able to write performant code (records, refs, etc.)
* Cutting down on boilerplate for common use cases (records, field, pattern-matching, etc.)
* Being able to write maintainable, backwards-compatible code (interface default methods, etc.)
* Improving the type system (covariant returns, etc.)
* Turning runtime issues into compile-time issues (nullability, etc.)
For more information, see the "csharplang/proposals/" folder. Some of the C# 10 features are in the main folder rather than in the csharp-10.0/ folder.

Handling Dependencies in Functional Languages

2021-06-05T22:33:53+02:00

Published by marco on 5. Jun 2021 22:33:53 ------------------------------------------------------------------------ Out of curiosity, I looked up how dependency injection works in functional languages. I stumbled upon this amazing article series -- "Six approaches to dependency injection" by Scott Wlaschin -- that presents five different techniques -- from very simple and easily applicable to more complex, but potentially robust. The article series applies various abstraction techniques to a program that reads input, processes it, and writes it out again. The reading and writing are impure operations and should be abstracted away to make it easier to reason about and test the actual program logic. The "first article" details Dependency Retention (hard-code everything; appropriate for scripts and POC projects) and Dependency Rejection (make an impure/pure/impure sandwich that collects program logic in a testable "middle"). The next article covers "Dependency parameterization" (passing dependencies as parameters and using partial application in a separate abstraction layer). These are all pretty usable techniques. The next two articles -- "The Reader Monad" and "Dependency interpretation" -- are more...involved. With both, you end up writing a description of your program that you can then execute by passing in the appropriate parameters. The dependencies are separate from the logic -- kind of in a separate layer -- but there are drawbacks to these approaches. For one, they are quite complex and require everyone on the team to understand the patterns very well. This is an example of the program description using the Reader monad. [image] The "final article" applies all of these techniques to a slightly more complex problem domain, namely a user-profile update that receives an update request, reads from a database, compares data to determine updates, and sends an email to confirm address changes. This is complex enough that we can see how the techniques scale. 
As expected, the more complicated but functionally pure Reader Monad and Dependency Interpretation examples take up 2/3 of the implementation and explanation (with the latter taking 50% all on its own). All in all, this is impressive work that answered my question superbly. Highly recommended. I've only very lightly summarized the pros and cons and descriptions above. The original author does a superb job of explaining these in much more detail -- without repeating himself.
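To give a rough idea of the two simpler techniques, here is a minimal sketch in Python rather than the F# used in the series. The functions transform, run, and run_on_console are invented for illustration and do not come from the articles.

```python
from functools import partial
from typing import Callable


def transform(text: str) -> str:
    """Pure logic: the testable "middle" of the impure/pure/impure sandwich."""
    return text.strip().upper()


def run(read: Callable[[], str], write: Callable[[str], None]) -> None:
    """Dependency parameterization: impure operations arrive as parameters."""
    raw = read()             # impure: obtain the input
    result = transform(raw)  # pure: decide what to output
    write(result)            # impure: emit the output


# Partial application bakes in the real dependencies once, at the edge of
# the program. Calling run_on_console() would read from stdin.
run_on_console = partial(run, input, print)

# In a test, the same logic runs against in-memory stand-ins instead.
captured = []
run(lambda: "  hello ", captured.append)
print(captured)  # ['HELLO']
```

The pure function can be tested without any I/O at all, while run only needs two trivial stand-ins, which is the property that the first two articles are after.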

CSS and HTML Toolbox 2021

2021-04-22T18:20:27+02:00

Published by marco on 22. Apr 2021 18:20:27 Updated by marco on 20. May 2025 10:15:48 ------------------------------------------------------------------------ Over the last four months, I've been collecting interesting HTML/CSS techniques and ideas.
1. I'm planning a bit of a make-over of the earthli style and stylesheets to replace some older cruft with more modern, simpler implementations.
2. I'm planning a new curriculum for the JavaScript class I'll be teaching again this coming winter.
For both of these goals, I'm focusing on leveraging as much of the power of the browser -- especially CSS/HTML -- as possible without getting mired in too much JavaScript or client-side libraries. To that end, I've collected the stuff I learned and would like to use in a hopefully semi-readable and searchable format. I tried to split it into coherent sections with supporting information and links. YMMV. [Guides and Resources] The following guides/manuals contain a wealth of information.
* "MDN CSS": MDN is definitely the OG of web sites, with amazing examples and in-depth documentation of everything HTML and CSS. I have no idea how they stay so up-to-date or who pays for it, but it's amazing and much-appreciated.
* "W3Schools": This site is also very complete and, while it doesn't have as much description or documentation as MDN, it has a lot of interactive samples -- check out the animation and transform ones -- and is easy to browse.
* "CSS-Tricks": This site has a lot of in-depth articles on various CSS features by various authors. You can find an in-depth article for any of the features discussed below.
* "web.dev": This site is run by (sponsored by?) Google and it contains a ton of articles and videos and tutorials for taking advantage of the cutting edge of web technology, with an eye toward using advanced features common to all browsers, but also introducing some of the stuff that Chrome has that hasn't been fully adopted yet.
* "Web Almanac By HTTP Archive": This is an interesting guide that shows usage statistics for certain features. It's enormous and shows an insane amount of detail about e.g. how many sites use flexbox or grid.
* "State of CSS": In particular for this article, you can check out the usage level of specific "features" . This list is definitely worth a look, as a supplement to the features I outline below.
* "CanIUse": While MDN also shows you the browser support for features, this site is very searchable and includes links for a tremendous amount of supporting information about quirks and flags for cutting-edge features.
[Complexity] The article "What Makes CSS Hard To Master" has several interesting examples, but it mostly boils down to: "HTML documents are complex programs". It's always been difficult to tell which styles are applied when -- it's a near-miracle that browsers can untangle the myriad ways that style rules interact with an ever-changing DOM and viewport size correctly, to say nothing of doing so with such sheer alacrity. There are selectors, media queries, related properties (e.g. position), CSS Properties, and much more and all of it cascades with inheritance everywhere. At least most browsers now handle this similarly with predictable performance. A tremendous amount of content is generated dynamically using layers of framework code, either on the server or the client. Any one of these moving parts could introduce a seemingly innocuous change that breaks the entire layout or inheritance (e.g. when a component introduces a wrapping element somewhere, either where it's flatly invalid (e.g. a table) or where it's just unwanted (e.g. in a sequence of flexing containers, where the new container does not flex)). To control this chaos, most designers and developers impose self-discipline and use guidelines to avoid confusion while still allowing them to leverage the power of CSS to be able to do what they want. From the article linked above, "I think mastering CSS comes down to having a good amount of knowledge about it, recognising the subtle dependencies between different declarations, rules, and the DOM, understanding how they make your CSS complex, and how to best avoid them." Congratulations: you've just described programming at anything but a trivial level of complexity. If a tool has power, then you have to understand it in order to avoid hurting yourself with it. That's why "everyone codes" is a lost cause doomed to end in failure, broken dreams, and embarrassed disappointment, like so many other quixotic attempts to ignore immanent complexity. CSS is a moving target. Things that used to be difficult are now easy. [1] But that's the nature of the game: someone is going to abstract away the thing you spent time learning and make it easier for everyone else. That is the nature of abstraction and frameworks. If the new thing (e.g. grid) replaces the old thing (e.g. float) well and you have time and budget to use the new thing and it's a priority then, by all means, upgrade to use the new technique and pay down some technical debt, while hopefully gaining some flexibility. [Generators] While CSS generators -- pre-processors like LESS and SASS -- are invaluable, they also introduce another layer of abstraction where code is generated for the developer -- sometimes with unpredictable results. The latest versions of CSS have included some of the features introduced in these generators. 
Vendor prefixes are less necessary than they used to be; CSS properties and variables and calc() (as well as other standard functions) allow a flexibility beyond even that offered by pre-processor variables. Color and transformation and animation functions are standard now. [Standard Layouts] Check out the site "SmolCSS" by Stephanie Eckles for a long list of common layouts, like:
* Responsive CSS Grid
* Modern Centering
* Avatar List Component
* Stack Layout
* And much more...
It's called "smol" because almost all of them do a lot of heavy lifting with very few lines of CSS. [image] [Selectors] The article "Guide to Advanced CSS Selectors - Part One" by Stephanie Eckles is a good overview with good illustrations and some selectors I'd never heard of, like "General Sibling Combinator" , which "[f]or example, p ~ img would style all images that are located somewhere after a paragraph provided they share the same parent." That whole site is beautiful and exhibits an absolute mastery of CSS. Check out the use of the skew transform for the cards at the bottom of the page or for the "whole series" . The rainbow gradients on the :before and :after borders and backgrounds are a great idea and well-executed. The excellent tutorial "Diving into the ::before and ::after Pseudo-Elements" by Will Boyd is an absolute treasure trove of information, including how to use the ::before/::after pseudo-elements to insert content, but also noting how a classic use of ::after can now be replaced with display: flow-root (the modern clearfix). He also covers ::marker. The article "Three important things you should know about CSS :is()" gives a few caveats but also shows the power of this pseudo-class to reduce CSS clutter (along with the up-and-coming nesting feature described "below"). You can use "where()" instead of "is()" to keep the specificity contribution of the clause neutral. The "has()" selector function is defined, but isn't available anywhere. 
Combine any of these with "not()" for even more powerful selectors. [Properties] The article "CSS custom properties are not variables" explains a common misconception about CSS "variables". "A custom property is not a variable, but it defines a variable. Any property can use variables with the var() function whose values are defined by their associated custom properties. "[...] This distinction is useful because it allows us to talk about “variables with fallback values” (a custom property like any other property cannot have a fallback value) and “properties using variables” (a property cannot use a custom property)" Another great article is "The styled-components Happy Path" by Josh W. Comeau , which discusses styling with CSS properties in React components. In it, he references another article of his, "CSS Variables in React tutorial" , which is more of an introduction to some of the techniques he works with in the first article. You commonly declare properties with default values on the ":root pseudo-selector" . [2] The article "What Can You Put in a CSS Variable?" shows a lot of nice uses of CSS properties, variables, and "functions" . CSS Properties can basically hold anything you want: text, concatenated strings, references to variables, images via urls, a single value, multiple values, etc. [3] "Some properties, like background and box-shadow, can take a list of things. You can use a CSS variable as a single item in the list, a sublist of the list, or the entire list." As mentioned above, declaring colors is one of the primary uses of a CSS pre-processor language. CSS Properties handle this job very nicely, without preprocessing and also with full recalculation at runtime. [4] The "demo" with the set of animated RGB sliders that control the color of a swatch is worth the price of admission. All without any JavaScript at all. Smooth as butter. 
As a practical application, the article "Make the page count of a 3D book visible using CSS Custom Properties" shows how you can use CSS to make a "book" out of a div and a cover image, transforming it in 3D-space and then using a CSS property to determine how many "pages" it looks like it has. [image] "You can play with a demo here." You can find a simpler and very straightforward demo in the article "Sharing data between CSS and JavaScript using custom properties" by Christian Heilmann , which shows how to use CSS properties with a few lines of JavaScript to follow the cursor in your document. "Practical Use Cases For CSS Variables" by Ahmad Shadeed provides many, many short examples and ideas for using custom properties as an abstraction instead of setting one or more standard properties directly. Future work: See "below" <#property_proposals> for a discussion of proposed but not yet supported extensions and uses of CSS Properties. [(Custom) Media Queries] The article "The complete guide to CSS media queries" is a great overview of how media queries work, but also how they've changed recently for those who've gotten accustomed to them over the years. For example, the section "New notations in Media query levels 4 and 5" shows how ranges are easier, how you can now use or, the not() function, and custom media queries, which allows you to basically make aliases for media query combinations that you need to use in several places. /* Define your custom media query */ @custom-media --small-screen (max-width: 768px); /* Then use it somewhere in your stylesheet */ @media (--small-screen) { } /* You can also combine it with other media features */ @media (--small-screen) and (pointer: fine) { /* styling for small screens with a stylus */ } [Gradient Overlays] The article "Handling Text Over Images in CSS" by Ahmad Shadeed gives a wonderful overview with many examples on how to use gradient overlays on images to make overlay text readable for all types of images. 
At the end, you can see how many sites are using this (including YouTube for its overlay video controls). See also the "gradients" used in borders (hover over a "card"), headers, and other elements in the ModernCSS tutorial. [Animations & Transitions] Check out "Animating a CSS Gradient Border" , which has no JavaScript. It leverages a newer feature of Chrome-based renderers to avoid writing a lot of keyframe boilerplates, but it's all in CSS. You could write it all in bog-standard CSS. Another example is a slide show written with only HTML and CSS. You can keep all slides in a single document and make animated transitions between them. See "How to Play and Pause CSS Animations with CSS Custom Properties" for ideas. The article "An Interactive Guide to CSS Transitions" provides a lot of background and interactive examples of how transitions work and how you can influence their behavior. CSS animations apply to many, many properties -- in all modern browsers -- as detailed in the article "The Surprising Things That CSS Can Animate" by Will Boyd , which shows how easy it is to animate box-shadows (for a "pulsating" effect) or even z-order, with a few other properties, to animate two items "switching places" in a very intuitive way -- all without JavaScript. [Tooltips] The article "Cooltipz.css — Pure CSS Customisable Tooltips" by Bramus van Damme includes a good demonstration of "Cooltipz" . This library uses very modern, but well-supported techniques to place and format tooltips or flyouts (for non-desktop browsers). [Shapes] "Understanding Clip Path in CSS" shows how to work with the standard shape functions and combinators and the clip-path property to make pure-CSS non-rectangular accents and effects that run on all modern browsers. The article "Responsible Web Applications" by Joy Heron is an absolutely lovely design that illustrates the power and simplicity of pure CSS. 
Right at the very top, it uses shape-outside and circle to make text wrap elegantly around a circular shape that contains the navigation. [image] The key piece of CSS is very compact and understandable. shape-outside: circle(21rem at 1.5rem 40%); The page makes liberal use of CSS custom properties (see below) and rem units to make everything scale nicely. It's kind of a master class in CSS and is well worth reading. [Background clipping] Speaking of clipping, you can assign the "background-clip" property to determine which part of its element a background covers. In particular, setting it to the value text clips the background to show through only for area covered by text. It's been supported for quite some time and allows developers to make dynamic effects that would otherwise have to be hard-coded in graphics. The article "CSS background-clip Demo: Text with Animated Emoji" shows a neat demo of an animated SVG ghost moving back and forth behind clipped text. [image] [Filters] In the same ballpark is the "backdrop-filter" , which allows you to apply filters to everything behind a particular element. Naturally, you need to make the element at least partially transparent in order to see the effect. [image] The CSS is very simple and supported on all modern browsers. Being able to create this kind of composition dynamically on the client brings very nice effects without pre-rendered compositing. [Clipping and masking] "CSS Paper Snowflakes" combines transforms, clip-paths, mask-images, and tons of properties and variables to render what look like pre-built graphics using only CSS (well, SCSS in this case). The article "CSS mix-blend-mode not working? Set a background-color!" illustrates how to use the mix-blend-mode to make sure that the text has proper contrast versus whichever background it happens to be over. [image] This is a really nice effect and very handy for usability. 
You can have the browser ensure that text is always readable, regardless of what kind of background slides into place behind it. [Sticking & Snapping] The article "Smooth Scrolling Sticky ScrollSpy Navigation" provides a tutorial for building a JS-free TOC with sticky headers. The article "Smooth Scrolling and Accessibility" by Heather Migliorisi provides some background, history, and advice on honoring user preferences. The following CSS is enough to get started. The full demo shows how to use a little bit of JS with an IntersectionObserver to implement the ScrollSpy feature in just one line of code. html { scroll-behavior: smooth; } main > nav { position: sticky; top: 2rem; align-self: start; } The article "Using position: sticky to create persistent headers in long texts" by Christian Heilmann provides a very minimal and highly re-usable example of using this feature for "sticking" headers to the top of the page when scrolling. h1, h2, h3, h4 { position: sticky; top: 0; } And there's also scroll-snap-type, scroll-snap-align, and browser units (e.g. vw and vh) to basically make a slide show out of an HTML file without any JavaScript ("demo" or another "demo" with some additional JS to highlight the displayed slide/image in a thumbnail browser). * The article "Sticky CSS Grid Items" is also good (another "demo" shows an address-book-like implementation) * The article "HTML and CSS techniques to reduce your JavaScript" is a multi-step tutorial to make picture browsers with "snap points" and more tricks to control scrolling with minimal or no code. * The article "Practical CSS Scroll Snapping" by Max Kohler provides many more examples and techniques. As for "sticky" or "stuck" elements, "[...] there is one limitation: it is impossible to change the appearance of an element whether it is stuck or not, say with a pseudo-class :stuck. This is a general limitation of CSS. 
In this case, I recommend combining the benefits of position: sticky to keep the element sticking with IntersectionObserver to change its appearance (while taking care not to change its dimensions, to prevent content jumps)." The article "A table with both a sticky header and a sticky first column" by Chris Coyier provides a good example of using sticky to make frozen columns in tables. For a really fancy scroll-spy, see the "Progress Nav" demo. This is very cool-looking, but it's a little bit older, so also check out the "Progress Nav with IntersectionObserver" by Bramus for a linked version that does the same thing, but uses the IntersectionObserver to reduce the amount of code significantly. [Line clamping] For limiting text in a box, you can let the browser do all of the heavy lifting by using line-clamp or the even smoother and also standardized -webkit-line-clamp. See a "demo" that shows how to use it in a grid layout. [Functions] The line-clamp feature is not to be confused with the "clamp()" CSS function, which is shorthand for bounding a value between a "min()" and "max()" . There are a ton of "CSS functions" , for math, colors, filters, images, fonts, shapes, and more. You can use all of these with variables and custom properties to avoid whole swaths of JavaScript. [Grid layouts] You'll want to use minmax to override the default minimum track size of auto, which sizes the track to its content and can get quite large in what the cool kids are calling a "grid blowout". See "The Minimum Content Size In CSS Grid" for examples, graphics, and more links and guides. The tutorial "Building a Side Navigation" pulls a lot of concepts together to create a common UI element that tends to become a time sink if you don't plan correctly. A lot of the CSS features used in this article help to reduce the work significantly. 
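As a rough illustration of the minmax fix for a "grid blowout" mentioned in the grid-layout notes above, here is a minimal sketch. The .layout class, the two-column arrangement, and the 16rem sidebar width are assumptions made up for this example; they are not taken from the linked articles.

```css
/* Hypothetical two-column layout; the class name and sizes are illustrative. */
.layout {
  display: grid;
  /* A bare 1fr behaves like minmax(auto, 1fr): the track cannot shrink below
     the minimum size of its content, so one long line can widen the grid.
     A minimum of 0 allows the track to shrink below its content size. */
  grid-template-columns: 16rem minmax(0, 1fr);
}

/* With the track allowed to shrink, wide content scrolls inside its cell
   instead of stretching the column. */
.layout pre {
  overflow-x: auto;
}
```

The same idea should apply to repeat() patterns as well, e.g. repeat(3, minmax(0, 1fr)) instead of repeat(3, 1fr).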
[Table Columns] If you've ever wondered what you need the col and colgroup elements for, then "Highlighting columns in HTML tables" by Manuel Matuzovic will show you how to use them to apply styling to a column without much additional markup. He even has an example that styles a "selected" column using the :target pseudo-selector. [Images] You can also use a simple attribute to tell the browser to be "proactive about loading images" . The article "Alt vs Figcaption" by Elaina Natario nicely illustrates how well browsers now handle the FigCaption tag, which is yet another feature I'd implemented on earthli long ago, but with custom HTML and extra containers and positioning code. It's nice to know that I can replace that all with a single tag that's been supported for years. [Viewport units] Viewport units let the developer size elements based on the size of the viewport. This includes not only vw and vh, but also vmin and vmax, which are the minimum and maximum of the two viewport dimensions, respectively. The article "Simple Little Use Case for vmin" by Chris Coyier shows a very simple way to make a highly responsive header without using media queries. header { padding: 10vmin 1rem; } [Controls] The article "Accept several email addresses in a form with the multiple attribute" shows you how to use the multiple attribute to have the browser automatically validate multiple email addresses, all without any custom JavaScript at all. Once you're using HTML validations (and you should), you can use the :invalid pseudo-selector to style elements that need correction. "Form Validation: You want :not(:focus):invalid, not :invalid" shows several ways of combining it with good UX to avoid annoying users with hyperactive validation messages. 
A good setup is: .error-message { display: none; } input:not(:focus):invalid { border-color: var(--color-invalid); } input:not(:focus):invalid ~ .error-message { display: block; } input:not(:focus):not(:placeholder-shown):valid { border-color: var(--color-valid); } There's also the new :focus-visible pseudo-class to help perfect focus-display in forms. /* Hide focus styles if they're not needed, for example, when an element receives focus via the mouse. */ :focus:not(:focus-visible) { outline: 0; } /* Show focus styles on keyboard focus. */ :focus-visible { outline: 3px solid blue; } See ":focus-visible Is Here" for more information. Password controls need a bit more love, as documented in the article "Perfecting the password field with the HTML passwordrules attribute" by Scott Brady , which makes the case for a new attribute passwordrules to be standardized. His focus is on making password fields maximally accessible and usable for password tools. A weaker -- but available -- alternative to his proposal is to use the "pattern" attribute to restrict input (helping the user, but not the password generator). To that end, he also mentions that you should set the "autocomplete" , "autocapitalize" , and "autocorrect" (non-standard) attributes correctly instead of just leaving them at the defaults. [User-resizing] The "resize" CSS property controls the directions in which the user will be able to resize any DOM element. "The resize CSS property sets whether an element is resizable, and if so, in which directions." [Accessible Components] The article "A Complete Guide To Accessible Front-End Components" includes everything from guidance to links to tutorials to full-fledged examples and screenshots of HTML/CSS/JS implementations of commonly used controls that are also accessible. * "Modals" * "Footnotes" * "Tabs" * And "couple of dozen more..." The "Tab Panel" is quite nice in that it responsively switches to an accordion at smaller widths. 
The article "Building a Settings component" by Adam Argyle demonstrates accessible components using a lot of pretty advanced -- but generally available -- techniques, like properties, grids (w/align-items, vw, minmax, auto-fit for pretty much automatic responsiveness with nearly no code), dark/light theming, light JS manipulation of controls, FormData, accent-color, and much more. Watch the "embedded video" for a very quick, 8-minute overview, play with the "live demo" or grab the "source" . [Web Components] "Styling: Styles Piercing Shadow DOM" shows you how to reset all styles in your component, using the :host pseudo-selector. :host { /* Reset specific CSS properties */ color: initial; /* Reset all CSS properties */ all: initial; } The article "Options for styling web components" by Nolan Lawson shows how to design a styling API for a web component using CSS custom properties. [Custom Form Elements] The article "Creating Custom Form Controls with ElementInternals" by Caleb Williams introduces an interesting concept. The example it uses is to make a single "control" that holds several text inputs, which isn't groundbreaking, but it does show the power of packaging CSS/HTML/JS as components that show up as simple tags with properties. None of that is new -- we've had web components for a while now -- but the ElementInternals allows deep integration into the form's workings, including hooking validation, submitting, drawing, and so on. [The inherit value] The inherit value is not new, but I often forget to use it as intended. It's meant to help avoid re-stating a color that is already set on a parent. The following example changes the color for the nav tags to red and wants the links inside them to use that same color instead of the browser's default link color. body { color: black; } nav { color: red; } nav a { color: red; } Instead of repeating the value red, you can instead use inherit, which takes the value from the parent element. body { color: black; } nav { color: red; } nav a { color: inherit; } The "initial" value is also useful. 
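To make the difference between the two keywords concrete, here is a small sketch. The .card and .plain class names are hypothetical and exist only for this illustration.

```css
/* Hypothetical class names, for illustration only. */
.card {
  color: darkred;
  border: 1px solid currentColor;
}

/* inherit: use the parent's computed value, here darkred, instead of the
   browser's default link color. */
.card a {
  color: inherit;
}

/* initial: reset to the property's initial value from the specification,
   ignoring the parent. For border-style that value is none, so the border
   disappears. */
.card.plain {
  border-style: initial;
}
```

The point of the sketch is that inherit follows the parent, while initial ignores it and falls back to the specification's default for that property.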
[content-visibility] The article "content-visibility: the new CSS property that boosts your rendering performance" discusses a very new feature. It landed in official releases of Chrome, Opera, and Edge in September 2020. "The content-visibility CSS property controls whether or not an element renders its contents at all, along with forcing a strong set of containments, allowing user agents to potentially omit large swathes of layout and rendering work until it becomes needed. Basically it enables the user agent to skip an element's rendering work, including layout and painting, until it is needed, makes the initial page load much faster." Related to this newer property are the existing will-change, object-fit, and contain. See "contain-intrinsic-size" and "content-visibility" for more information. [box-decoration-break] Update January 2022: Added the section below. The article "box-decoration-break helps to define how elements should be rendered across lines" by Stefan Judis presents an interesting property that lets you determine how padding, border, and other properties are applied to inline elements that span multiple lines. [image] [Isolating Siblings] Instead of setting arbitrary z-indexes in your styles, sometimes the isolation property "is a better way of creating a stacking context" . [Page Visibility] The "Page Visibility API" is available in all browsers and provides a high-level API for running code when showing or hiding a page. "With tabbed browsing, there is a reasonable chance that any given webpage is in the background and thus not visible to the user. The Page Visibility API provides events you can watch for to know when a document becomes visible or hidden, as well as features to look at the current visibility state of the page." Pages can use this to "pause" activity when they're in the background (e.g. server-polling or animations). 
In the case of animations, though, "Most browsers stop sending requestAnimationFrame() callbacks to background tabs or hidden <iframe>s in order to improve performance and battery life." They also "throttle setTimeout()". [Houdini] The "CSS Houdini" APIs are a low-level way to hook custom JavaScript into various parts of the rendering pipeline. Of particular interest is the part that's finished and implemented in all browsers: the CSSOM (CSS Object Model) and Houdini, which let a page render custom CSS effects using JavaScript. The collection of low-level APIs is known by the umbrella term Houdini, described in "Cross-browser paint worklets and Houdini.how" . From the MDN page linked above: "Houdini is a set of low-level APIs that exposes parts of the CSS engine, giving developers the power to extend CSS by hooking into the styling and layout process of a browser’s rendering engine. Houdini is a group of APIs that give developers direct access to the CSS Object Model (CSSOM), enabling developers to write code the browser can parse as CSS, thereby creating new CSS features without waiting for them to be implemented natively in browsers." And: "Houdini enables faster parse times than using JavaScript style for style changes. Browsers parse the CSSOM — including layout, paint, and composite processes — before applying any style updates found in scripts. In addition, layout, paint, and composite processes are repeated for JavaScript style updates. Houdini code doesn't wait for that first rendering cycle to be complete. Rather, it is included in that first cycle — creating renderable, understandable styles. Houdini provides an object-based API for working with CSS values in JavaScript." "Houdini.How" is a collection of open-source CSS extensions that you can use, extend, and learn from. 
I heard about this from "css-houdini-circles — A Houdini Paint Worklet that draws Colorful Background Circles" by Bram Van Damme (see his "code" ) The following video provides an excellent overview in 12 minutes. [media] Once you start making custom effects, you'll run into classic rendering problems, one of which is addressed in the article "CSS paint API: Being predictably random" , which explains how to use a stable seed to use predictably random data for animation data. While the painting API is relatively well-supported, the "Layout API" is still in early days. "The layout stage of CSS is responsible for generating and positioning fragments from the box tree. [...] This specification describes an API which allows developers to layout a box in response to computed style and box tree changes." [VisBug] The "VisBug Chrome/Opera/Edge Extension" is an excellent tool in general, but seems to be indispensable for optimizing Houdini code. [media] Skip to 23:25 for the VisBug demonstration. -------------------------------------------------------------------------------- As of April 2021, the features described below are experimental and either not implemented yet or only available in canary builds or behind feature flags (or both). [Advanced Properties] "The future of CSS: Higher Level Custom Properties to control multiple declarations" by Bramus Van Damme discusses a very, very recent proposal (December 2020), discussed in detail in the issue "[css-variables?] Higher level custom properties that control multiple declarations #5624" The article "@property: giving superpowers to CSS variables" by Una Kravets provides more examples. [Container Queries] Another interesting up-and-coming development is "container queries" , which are like media queries, but addressing the nearest "root" container in the list of parent containers for the element to which it's applied. The article "CSS Container Queries: A First Look + Demo" takes you step by step through using it. 
Basically, you write @container (min-width: 38rem) instead of @media (min-width: 38rem) and assign the contain property, like so: contain: layout inline-size. The article "Say Hello To CSS Container Queries" by Ahmad Shadeed provides a lot of real-world examples that will make you wonder how we've lived with only viewport-based media queries for so long. [Nesting in CSS] One of the main features added by a CSS pre-processor like LESS is nesting, which improves clarity and cuts down on duplicated definitions. The article "The future of CSS: Nesting Selectors" by Bramus indicates that this feature is coming to mainline CSS, as documented in "CSS Nesting Module" . The document is an editors' draft, so there's still quite a way to go. "Nested Media Queries" are already supported, though more as a side-effect of the implementations, not necessarily because it was specified that way. [Logical properties] The "logical properties" feature will add aliases for some of the venerable CSS properties like margin-right and margin-left that make it easier to build more agnostic and flexible content using, e.g., margin-inline-start and margin-inline-end. Assigning one of these instead of a hard-coded side means that a style will work in both LTR and RTL (for example). The article "Digging Into CSS Logical Properties" by Ahmad Shadeed provides many more examples. The "full list of proposed properties" is quite extensive. Many of the newer modules like flexbox and grid were designed like this from the very start. See also "CSS Logical Properties Are the Future of the Web & I18N" by Daniel Yuschick for more information and tons of examples, with a demystification of the difference between direction (inline axis, or flow) and writing-mode (block axis). Update: 16.10.2021 Two more interesting logical properties are "inline-size" and "block-size" , which correspond to width and height in the horizontal-tb "writing-mode" . 
Using the logical properties means that the layout works even if the writing mode is changed to vertical-lr or vertical-rl. [Portals] The article "Hands-on with Portals: seamless navigation on the web" explains how this new feature in Chrome/Chromium improves support for securely embedding content from other sites (i.e. "portals"), as when using OAuth providers. It also generally improves transitions in MPAs (Multi-Page Applications) by allowing one page to prepare another rendered page in memory and then transition to it and perhaps even back. "Single Page Applications (SPAs) offer nice transitions but come at the cost of higher complexity to build. Multi-page Applications (MPAs) are much easier to build, but you end up with blank screens between pages. "Portals offer the best of both worlds: the low complexity of an MPA with the seamless transitions of an SPA. Think of them like an <iframe> in that they allow for embedding, but unlike an <iframe>, they also come with features to navigate to their content." [Page Lifecycle] The article "Page Lifecycle API" by Philip Walton discusses an improvement over even the "Page Visibility" API (discussed above). Instead of just handling visibility, it also provides hooks for suspending and resuming pages. "The Page Lifecycle API, shipping in Chrome 68, provides lifecycle hooks so your pages can safely handle these browser interventions without affecting the user experience. Take a look at the API to see whether you should be implementing these features in your application. "[...] While the web platform has long had events that related to lifecycle states — like load, unload, and visibilitychange — these events only allow developers to respond to user-initiated lifecycle state changes." * The article "Deep Dive into Page Lifecycle API" by Viduni Wickramarachchi provides some real-world tests and data. * Check out the "page-lifecycle" package for working with the new events. 
-------------------------------------------------------------------------------- [1] The video "Thinking on ways to solve CENTERING" (and accompanying "article" ) is an excellent look at several modern techniques for centering content, most of them one-liners with good-to-great behavior in many situations. [1] While you're at it, take a look at "HSL: a color format for humans" by Paul Hebert , a format that is a lot more intuitive for blending and setting up color schemes than the classic RGB. There are also many guides online for picking a color set, like "Which color scale to use when visualizing data" by Lisa Charlotte Rost and "How to pick more beautiful colors for your data visualizations" by Lisa Charlotte Rost as well as tools for choosing colors that work well together, like "Colordot" , "ColorDot" , "Farbvelo" , or "I Want Hue" . [1] If you didn't know about :root, then check out the list of "Pseudo-elements" to see which extra parts of a document you have access to with CSS (e.g. the ::file-selector-button selector is a relatively new addition that lets you style the button in an upload control). [1] The article "Injecting a JavaScript Attack Vector using CSS Custom Properties" by Bramus shows a site that executes JavaScript stored in a CSS property.

Configuring and using Jetbrains Rider 2021.1.1 and Visual Studio 2019 16.9.4

2021-04-18T22:50:04+02:00

Published by marco on 18. Apr 2021 22:50:04 Updated by marco on 23. Apr 2021 08:59:44 ------------------------------------------------------------------------ Visual Studio with ReSharper has been my main development tool for many, many years. I first started using it in 2008 or 2009. Over the last several years, I've used many other IDEs, like Visual Studio Code for documentation, advanced search, and JavaScript/TypeScript or PHPStorm for PHP, Android Studio for Java/Android, XCode for Swift/iOS, or WebStorm for TypeScript/JavaScript. JetBrains Rider came on the scene several years ago and was not, at first, a viable alternative, but it has gotten much, much better. It now makes sense to consider using Rider as well as or even instead of Visual Studio/R#. tl;dr: Both IDEs are excellent, if configured properly. "Download the solution template" to see an example of a solution configuration that includes all settings discussed below. [Initial Visual Studio Setup] Before going into the new setup, let's briefly discuss what we were replacing. * Visual Studio Community 2019 16.9.4 [1] * ReSharper 2021.1.1 * StyleCop by JetBrains 2020.3.0 * ReCommended Extension for ReSharper 5.5.0 * Enhanced Tooltip 3.15.0 * Heap Allocations Viewer 2020.3.0 * CyclomaticComplexity 2020.3.0 * .EditorConfig used only lightly All inspections and quick-fixes run through ReSharper. Visual Studio "squiggles" are disabled because they're distracting and contribute nothing additional. StyleCop does a lot of the heavy lifting, but it does a bit too much. It checks spelling in documentation, even though ReSharper already does that natively. The biggest drawback is that StyleCop uses its own parser, which is not just detrimental to performance -- the Roslyn parser, the ReSharper parser, and the Style Cop parser are all running at the same time -- but also the StyleCop parser is no longer compatible with some features of C# 8 and 9. It records "syntax errors" for perfectly valid code. 
[Initial Rider Setup] Rider doesn't support the StyleCop, ReCommended, or the Enhanced Tooltip extensions. Not having Enhanced Tooltip isn't that big a deal (Rider's tooltips are OK), but not having StyleCop and ReCommended meant a significant number of style and formatting inspections were not applied in Rider. Rider supports style and formatting, but it doesn't warn or indicate when there are issues. This makes it more difficult to help developers use a common style. [Requirements] * Get as close to feature parity as possible between Rider and Visual Studio/ReSharper. * Retain StyleCop's style and formatting checks, but without the old parser * Reduce inspection redundancy wherever possible * Use human-readable and human-maintainable configuration * Use common configuration for Rider and Visual Studio/ReSharper * Use configuration that both Visual Studio and Visual Studio Code understand, wherever possible or advantageous [Approach] [StyleCop.Analyzers] The StyleCop.Analyzers project has been around for a while, but making the move is not as straightforward as just installing the package in all projects. You also have to rewrite the configuration. Luckily, they have a good template from which to start and the documentation is very good. Since the test solution uses Directory.Build.Props, it was also very easy to include the assembly and configuration for all projects. I created a special version for test assemblies that removes the documentation requirement. StyleCop.Analyzers has its own JSON configuration, but it uses the .NET-standard rulesets to configure inspection severities. Removing the StyleCop plugin for ReSharper was not without drawbacks; it removed a few minor goodies to which I'd grown accustomed: * The "Chop Parameters" quick-fix was nice. * The "Add documentation" was better than the default in either VS or Rider. 
Update 22.04.2021: I've since discovered that "chop" is available in Visual Studio by positioning on a method, pressing Ctrl + ., and choosing one of the many wrapping options. [image] Also, documentation-generation is getting better with each point release. [.EditorConfig] Another standard is using the .EditorConfig file for as much configuration as possible. This format is not IDE-specific: Visual Studio, ReSharper, Rider, Visual Studio Code, and many other editors/IDEs make use of it. Keeping as many settings as possible in this file helps ensure style and formatting are applied correctly no matter which IDE is used. It's not a guarantee, but there's a better chance than if these settings are stored in a ReSharper-specific format, as before. These days, a lot of the configuration can be stored in an .EditorConfig file -- all but a handful of the Rider and ReSharper settings are mapped there already and there are a few more with each release. [.Directory.Build.Props] I'm also using SDK-style project files together with the "Directory.Build.Props" feature of the MSBuild system to consolidate configuration to just one or two files. [Evaluated Setup] Visual Studio: * Visual Studio Community 2019 16.9.4 * ReSharper 2021.1.1 * ReCommended Extension for ReSharper 5.5.0 * Enhanced Tooltip 3.15.0 Rider: * Rider 2021.1.1 Shared: * StyleCop.Analyzers 1.2.0-beta.333 * Heap Allocations Viewer 2020.3.0 * CyclomaticComplexity 2020.3.0 * .EditorConfig used for nearly everything I have not tested Visual Studio without ReSharper because, although Visual Studio has leapt forward in functionality, there are still too many features I miss without ReSharper. [2] [General Options] I use a separate Git client called "SmartGit" , so I generally turn off as much of the Git integration as possible to save power and memory. The CodeLens (VS)/Code Vision (Rider) is an amazing insight into a ton of statistical information, but I don't ever use it, so I turned it off. 
Also, I don't like how it feels when editing code because it introduces virtual "lines" in too many places. I also would sometimes inadvertently click the links and then have to close detail panels or refocus the editor. For the same reason, I disable almost all inlay hints in Rider/ReSharper (inline hints in Visual Studio). I do not miss seeing types everywhere. I only care what the actual types are when something doesn't compile. In Rider, you can long-hold the Ctrl key to show inlay hints on-demand. The only inlay hint I always show is for inherited attributes (e.g. for [NotNull] annotations). I've also disabled Code Folding (Rider)/Outlining (Visual Studio) because I never use it. I don't need to see the noise along the left-hand gutter and I don't need to accidentally click the nodes (or accidentally trigger a folding with an inadvertent key combination). [VS/ReSharper Options] * Adjust shortcuts/keys * (-cmd) + W to close a window * (-cmd) + (-opt) + W to close other windows * (-cmd) + (-shift) + W to close other windows * Change color for boxing allocation to orange (default is red) * Disable inlay hints for parameter names * Disable more inlay hints * Disable Code Style * Disable Outlining * Disable Environment / Intellisense / Automatically show parameter info in ... * Use Package References for NuGet by default [Rider Options] These are options that I ended up changing from the defaults. * Turn on CamelHumps * Turn on exception-handling / tune * Adjust shortcuts/keys * (-cmd) + W to close a window * (-cmd) + (-opt) + W to close other windows * (-cmd) + (-shift) + W to close other windows * Ctrl+R/Ctrl+G key-combo to the same command as Ctrl+(opt-)+O (Optimize Imports). This command pops up an ominous "Cleaning up Code..." progress dialog for a second but, so far, it seems to be doing only the import-cleanup. 
* Alt + < to navigate to methods in the same file * Change color for boxing allocation to orange (default is red) * Change color for matching brace (it's the same as for selected text, which is very confusing, as it always looked like I'd selected the brace as well) * Disable inlay hints for parameter names * Disable more inlay hints * Disable Code Vision (equivalent to Code Lens in Visual Studio) * Disable Code Folding * Disable Editor / General / Code Completion / Show the parameter info popup in ... * Use Package References for NuGet by default * Reduce default font size from 13 to 12 * Reduce default line-spacing from 1.2 to 1.05 [Code Style and Formatting] For C# Code style, I ended up adding these extra settings. There are probably others, but these are the ones that made (-cmd) + K / (-cmd) + D usable for me, especially for the single-line null-check statements that we use a lot. * Check the box for "Keep existing arrangement of embedded blocks" to prevent reformatting of blocks * Set the lines to add "After statements with child blocks" to 0 * Set "Wrap invocation arguments" to "Chop if long or multiline" (I'm not 100% sure this won't mess other things up, but it worked a treat in a long test-fixture file) * Set "Max object and collection initializer elements on a single line" to 1, so he stops reformatting multi-element blocks that we've put on multiple lines. With the first two settings, the formatter won't fix some things that he would have fixed before, but he's also not going to change a whole bunch of stuff that you'd rather he left alone. [image][image][image] It took me a few tries to configure Ctrl+K/Ctrl+D (format document) in Rider, which doesn't work as loosely as in ReSharper/Visual Studio. In Visual Studio, it leaves single-line argument checks alone. Rider is more ... consistent ... and reformats all lines, which messes up a lot of formatting. 
On the positive side, the configuration for Rider ended up improving "Code Cleanup" in Visual Studio/ReSharper, which had never worked so well before. I eventually figured out how to set things up so that "Format Document" and "Code Cleanup" (Ctrl+E/Ctrl+F) both work flawlessly in Rider and Visual Studio, but it took some time and patience to find all of the settings. The "Detect Formatting Settings" feature in both ReSharper and Rider was indispensable. [File Layout] I also finally configured the "File Layout" feature so that "Clean Up Code" works as expected. StyleCop Analyzers supports enforcing an ordering on members, but it doesn't support configuration of that ordering. The order is fixed as StyleCop wants it. Their default style has fields at the top, which is a no-go for our style. That means that I've disabled the "arrangement" feature of StyleCop and no longer see warnings about out-of-order members. This is OK, though, as re-ordering members just to fix a warning is not that great for reviews and merging. "Clean Up Code", however, does apply the file-layout rules. I think that this is a better balance overall, as leaving a method in place when you've changed its visibility from public to protected (or vice versa) should not earn a warning. [Configuration Files] As noted above, I configured all of the StyleCop, .EditorConfig, and Rider/R# settings to make "format document" and "clean up code" work perfectly with our style. These are just a jumping-off point (even within Encodo). Adjust StyleCop inspection severities in the *.ruleset files. Adjust formatting preferences in the .EditorConfig whenever you can. Rider/ReSharper will also allow you to override these settings, storing them in the *.sln.DotSettings file, but it's clearer and more consistent to configure the ruleset and .EditorConfig files because those are more human-readable and better-documented than the *.sln.DotSettings file. 
Tip: "Download the solution template" to see an example of the configuration with all settings discussed above. [Comparison] I made this comparison over the last 4 months, during which the setup changed slowly into the configuration outlined above. I have tried to weed out the notes and impressions that no longer apply, but I may have missed some. I do my best to give the impression of what it's like to work with these IDEs. I left some longer descriptions in place, just to give a feel of what I experienced while using the IDEs. [Launch and Processes] For small-to-medium projects on my 4-year-old desktop, you barely notice startup. For the larger Quino project, with over 120 projects (for now), startup speed is more noticeable. All of the IDEs start relatively quickly now. They're just fast in different places. It really depends on where your focus is. Visual Studio by itself starts very, very quickly. The latest versions of ReSharper start up in parallel, so VS is on the screen and the editor is typable in seconds, even with a solution like Quino. You can't search at that point, though. [3] Rider looks like it's totally up and running, but it mostly can't search either, not until the projects have been processed and the indexes loaded. The initial Rider project-chooser takes longer to start up than you'd expect. Once it's up, though, opening a solution from there is very fast. Rider runs all open solutions in a single process. Visual Studio launches a separate process per solution. While I'm happy that the startup speed has improved all-around, I don't really care about startup speed, not really. I never reboot unless I have to. I never log out unless I reboot. I just leave my tools running all the time. I have 32GB of RAM. Once it's running, it's running, and I don't care how much RAM it takes (within reason) -- I care how fast it does the things I ask of it. 
Once I configured StyleCop.Analyzers, my initial solution-load in Rider showed a shocking amount of memory for Quino (an extra 4.5GB just for the Roslyn checker process). It felt fast enough, even though the memory usage kept growing. Rider's a 64-bit process and I have 32GB of RAM on my desktop, so it was a luxury I could afford. [image][image] Luckily, after a restart, memory usage was still higher than before, but now stable at around 3GB. Conclusion: Neither IDE has a real launch-speed advantage when measured up to the point where a solution is navigable. Both use a reasonable amount of RAM. [Performance Issues and Crashes] Solution-wide analysis is enabled by default in Rider, with no performance degradation noticeable at all. In fairness, there is little to no performance degradation evident with ReSharper in Visual Studio either. Code Vision is enabled by default in Rider; also no performance-degradation noticeable. I am running everything on a desktop and I have seen CPU usage spike quite high on Rider. Code Lens in Visual Studio and Code Vision in Rider both probably suck the life out of a battery, though. "TANSTAAFL" . While it's nice that Rider uses all available CPU power for certain tasks -- e.g. building -- I imagine that the CPU fan would be running a lot under heavy usage. Visual Studio probably suffers the same, though its CPU usage seemed to be flatter when I checked. Solution-reloading is more stable and a bit faster than in Visual Studio. In a recent task where I was constantly cherry-picking and rebasing, making changes to project files and the solution file, Rider just worked. Visual Studio would usually throw up a yellow warning bar at the top sooner or later (usually sooner). Sometimes, Rider is quite slow at getting its "intention actions", something I've never seen with ReSharper. [image] This usually clears up after 5-10 seconds, but a couple of times, Rider went looking for inspections for 10 seconds and came up with nothing -- repeatedly. 
It's odd because, in that case, Rider kept having trouble with the same extension-method call and had to look it up again and again. This effect is noticeable in other places, as well. When you elect to show the dialog to "Configure Inspection Severity", then sometimes it takes several seconds to show the dialog box (with no user feedback). And, sometimes, Rider just dies. For example, when I looked up sources for a .NET type, like IndentedTextWriter, by using (-opt) + F12, Rider showed a dialog for several seconds, but didn't seem to be doing anything. It wasn't downloading, as expected; instead, it just showed "Searching for implementations...". This wouldn't be worth mentioning but, after having dismissed the dialog, I could no longer navigate to anything with F12. I had to restart Rider. This is not the first time that this has happened. This never happened with Visual Studio. It definitely makes the IDE feel much shakier. In Visual Studio, with R#, I can view the sources for IndentedTextWriter after only a slight pause. On the subject of reloading: Visual Studio definitely still freezes more (usually showing its yellow warning bar at the top after a few seconds), whereas Rider is more subtle: it looks loaded, but is still unusable. You have to keep an eye on the progress bar at the bottom in both IDEs. In general, Rider reloads more quickly than Visual Studio -- and has no UI "hangs", like VS still does, for a few seconds -- but not always. On the other, other hand, I've also experienced more build errors after changing framework targets than with Visual Studio. Rider can't copy files or it's looking in the wrong place for files. Restarting Rider fixed that problem, but I shouldn't have to restart to fix a build. Rebuild should have fixed it, but it didn't. Conclusion: The latest version of Visual Studio with ReSharper feels more stable than Rider and has fewer mysterious failures. Rider reloads more quickly. 
[Code Analyzers and Quick Fixes] I was unable to get Rider to respect the generated_code setting from the .EditorConfig file, something that worked immediately with Visual Studio/Roslyn (ReSharper is not involved). I've reported that issue as "RIDER-61283" . In the meantime, I'm using the "Elements to Skip" feature to ignore the same file masks Rider should be ignoring anyway. That at least works for now. Still, Rider's integration is nice because it pulls everything together into a single list, but its quick-fixes for Analyzer inspections aren't as strong as Visual Studio's nor can you actually fix everything (see the issue with UTF8 below). In Visual Studio, the analyzers work quite well, but there is no integration with ReSharper. Instead, the integration with Visual Studio is really good -- with Ctrl + . instead of (-opt) + (-enter), you can get quick fixes and even apply them to the entire method, document, project, or solution. In Visual Studio, there's a very nice preview mode. In fact, there is useful and accurate user feedback throughout, which was a pleasant surprise. It's quite fast in collecting fixes for all 120 projects and applying the changes. There's even good keyboard support for arrowing to the file/project/solution actions. This is a definite boon for getting through thousands of fixes quickly. In Rider, there are quick fixes, but most of them only work for a single instance of the inspection. Some of the fixes (e.g. each attribute on its own line) can be applied to file/project/solution with ReSharper as well, but not all. Some of the fixes aren't available at all with ReSharper (e.g. SA1513, insert newline after brace) but are available in Rider. So, Visual Studio's integration with Code Analyzers worked better out of the box, but it forces you to use both ReSharper quick fixes ((-opt) + (-enter)) and VS quick fixes ((-cmd) + .), depending on which system detected the issue. The inspections also show up in two different panes. 
This is actually easier to get used to than it sounds, though. Conclusion: Rider has a merged view, which is nice. Visual Studio has quick fixes for everything, applicable to method, document, project, and solution. [Inspections] There is no ReCommended extension for Rider (with no plans to add support, according to issue "#51: Add support for Rider 2020.2" , which was closed as "too much work"). All of "these inspections" are missing in Rider. * Check usage of annotations * Check async/await usage * Check usage of lock variables * "Etc." When you add a parameter to the constructor, Rider doesn't mark the identifier as unused if it has an attribute. In the examples below, you can see that the identifier is grayed out in Visual Studio, but not in Rider. [image][image] Sometimes Rider doesn't indicate when a conditional access is unnecessary (e.g. when ?. can be converted to .). It also doesn't indicate, as reliably as ReSharper does, when an expression that is always false or true could be simplified. Neither Rider nor ReSharper seems to notice when you do a silly pattern-matching check, like if (sender is Person person) when sender is already a Person. VS, Rider, and ReSharper simply assume that you're doing the check in order to assign the variable, I guess. Conclusion: Both have a tremendous number of useful inspections, but Visual Studio/ReSharper/ReCommended is a slightly more powerful combination. [Inspection Accuracy & Speed] Now I know why the solution-wide analysis is so fast in Rider: It doesn't reevaluate warnings when the project changes (e.g. if you change the root namespace). You have to visit each file individually for it to clear the warning. Clicking "Reanalyze all files with errors" doesn't work on files with warnings, as it does under ReSharper. You can use (-opt) + (-shift) + PgDn to jump through the warnings, opening each file as you go. It's pretty fast, but feels clunky. This is especially unfortunate when Rider thinks that there are errors. 
I suppose that this is a side-effect of repeated solution/project reloads as I'm quickly switching branches. Changes to the ruleset and StyleCop settings are noticed in both IDEs instantly. I changed a rule from warning to info and Rider changed the color of the squiggle in what felt like less than a second. Unfortunately, changes to the .stylecop.json file are not picked up without a reload of the solution. Here is where ReSharper is perceivably much faster than Visual Studio. It's even a bit faster than Rider. Turn on solution-wide analysis. Remove the last reference to a function. Watch ReSharper gray out the identifier in the declaration nearly immediately. Or remove a method call. Watch ReSharper underline it immediately. Visual Studio/Roslyn? Still feels laggy. ReSharper's list of errors and warnings updates immediately. Rider's is pretty good, too, but, mysteriously, not as accurate or quick-to-update as ReSharper's. Both are much faster than Visual Studio/Roslyn, which often takes long seconds to clear warnings or errors -- and sometimes never does, until you force a build. Roslyn (Visual Studio) is sometimes flaky and won't clear old warnings/errors until the next build. ReSharper was definitely faster here, even with the extra StyleCop parser. This didn't use to be an issue, but with the switch to Code Analyzers, I'm now using Visual Studio/Roslyn for a good portion of my inspections (StyleCop). What does flaky mean? Whereas Rider updates relatively reliably when you make a change in any file, StyleCop Code Analyzers in Visual Studio will only occasionally show the warnings. If the file isn't open (or in some sort of in-memory cache), then only a "Rebuild All" will make the warning appear. This also only works if you're not using "ReSharper Build". Rider does this much less often, but it still does occasionally have incorrect inspections that can be very difficult to correct. For example, the following screenshots show an unrecognized dictionary. 
[image] Visual Studio recognizes the using System.Collections.Generic, but Rider grays it out. [image] Restarting Rider sorted out this error. Several other cached errors and warnings disappeared with the one noted above. Rider is very quick, as is ReSharper. It's also generally pretty good at updating inspections, but I've seen flakiness with lingering warnings and errors in the pane, though never in the sources. The only way I've found to update the pane is by actually opening the file, at which point Rider re-detects that the issues are gone and clears the inspections. Manually triggering a reanalysis does not help here. Conclusion: Both have lingering inspections sometimes but, overall, Rider's inspection speed and accuracy are both better. For either IDE, run "Rebuild All" to see all warnings. [Navigation] The solution-wide find/replace window in Rider is lightning-fast, supports newlines, copy/paste, and regular expressions, and shows change previews. It's wonderful. The change previews in Visual Studio Code are just a tiny bit better, but the overall experience is solid and super-fast. The search/replace in Visual Studio looks very dated next to this feature in Rider. Navigation to other files is so fast in Rider that I sometimes thought it hadn't navigated (it had!). There is no way to navigate the warnings in a solution using the keyboard. In general, Rider tends to let panels "steal" the keys for next/previous, so when you try to navigate errors or warnings or find-results, the test session can "steal" these keys and suddenly you're navigating tests and fixtures instead. I find myself grabbing the mouse more often in Rider than I do in Visual Studio. Where ReSharper has Ctrl + T as a central search for everything, the same key combination does not include "search everything" in Rider. For that, you need to switch to Ctrl + (-shift) + F. On the other hand, the dedicated "find in solution" panel is lightning fast and makes up for having to switch between panes. 
Conclusion: Both offer really good navigation, but I give a slight edge to Visual Studio/ReSharper for consistent keyboard support. [Editing] [Column/Noncontiguous Selection] Rider doesn't really support extending a non-contiguous selection. It has column-selection mode, like Visual Studio, but it doesn't have (-cmd) + Shift + . to select "like" text. In Sublime Text and Visual Studio Code, this feature is available via (-cmd) + D. Rider doesn't seem to have this, which limits editing capabilities. There is documentation for "multi-selection" but the shortcut keys are confusing and not the ones I have assigned. Nor can I find anything in the keymap with any of those names. It's either a new feature or it's only partially supported. Update 23.04.2021: I just tried (-cmd) + Shift + . in Rider (even though that wasn't documented) and it works just like in Visual Studio! That's a nice surprise. I'm not sure if this was always there and just poorly documented or whether they just added it in a recent release. At any rate, good news for editing in Rider. [Commenting] Pressing Ctrl+K/Ctrl+C comments code. However, instead of commenting again, it uncomments if applied a second time. This means I can't "double comment" to indicate that this code is temporarily preserved, but should not be flagged as commented code to be removed. [Double-click Select] Double-clicking on an identifier uses CamelHumps, if you have CamelHumps enabled (just like all other JetBrains tools). With ReSharper, though, the CamelHumps apply to cursor-based word-selection, but a double-click selects the whole word. I think that's a better balance because that's what I expect when I double-click an identifier. I don't think I've ever wanted to select just a part of the double-clicked word by default. It's not a deal-breaker, but it's annoying because I have to double-click, then extend the selection manually to get the full identifier. 
[Undo Buffers] The undo function in Rider fails much more often than I'm used to from Visual Studio. I've deleted lines of documentation and then hit undo and Rider couldn't get them back. [image] Once the undo buffer is broken, you have to restart Rider in order to be able to undo again. It feels quite unstable. I'm quite surprised, considering the literally dozens of popular IDEs built on this platform. [UTF-8 Support] Rider creates files as UTF-8, but without the BOM. Then the StyleCop analyzer demands that the file have a BOM, but there is no quick fix in Rider for this, nor is it clear how to convert the file. I end up switching back to Visual Studio, where there's a quick fix to set the encoding properly. [Typing Speed] Typing speed is better in Rider than in Visual Studio/ReSharper. Just a little, but it is. It's smoother. Even after replacing the StyleCop extension with StyleCop.Analyzers, it still feels a bit smoother, overall. Rider on Mac feels even smoother than Windows. Conclusion: Rider's text-editing is smoother but Visual Studio feels slightly more solid overall. [Code Completion] I just wasted 10 minutes in Visual Studio trying to figure out from the documentation how to create a StreamWriter with a non-default encoding. The list of overloads did not show any overloads when using a path. [image] I searched and the wizards at StackOverflow rather snippily asked why not use the docs? So I looked at the docs and then switched to the right target (first .NET 2.1, then .NET Standard 2.0), but the desired overloads have been around forever. Back to VS and it is really not showing those overloads. Switch to Rider and ... there they are. [image] It turns out that Visual Studio has a maximum height for its overloads list. The only hint that there are more methods are some heretofore not-noticed dashes at the bottom. The only way to see the other overloads is to select the popup and use the arrow keys. 
There is no scroll bar or other evidence to indicate that this is possible. There is also no reason why the popup couldn't be taller. In Visual Studio, the developer can use the up arrow and down arrow to traverse the various overloads, showing the documentation for them. In Rider, it's not obvious how to navigate. The trick is to keep hitting (-cmd) + (-shift) + space to cycle forward through the list. Typing a { in a non-interpolated string does not show code-completion. In ReSharper, you can type {, select a variable and ReSharper automatically makes the string interpolated. If you add a parameter, Rider rightly complains that the data between the curly braces needs to be an index, but doesn't offer to convert the string to interpolated. You have to go back to the front of the string and add the $ yourself. This is now working in Rider 2020.3. Conclusion: Visual Studio's UI is more easily navigable, but Rider's UI is better for longer lists of overloads. [Refactoring] Rider doesn't offer to rename related symbols as much as ReSharper does. For example, if you rename a field, ReSharper will offer to rename the constructor parameter that sets that field. Rider does not. When you insert a new parameter in a method call and then tell Rider to add it to the method, it then shows a panel with other calls that need to be updated, asking how to handle each one. This is the same as in ReSharper and is a welcome feature. As in ReSharper, you can navigate the various calls with the arrow keys and the focus is set correctly. However, I can't figure out how to activate the choices with the keyboard. I have to use the mouse. Conclusion: Both are amazing, but there's a slight edge to Visual Studio for completeness. [Restore & Build] The NuGet integration is nice in Rider and the NuGet Explorer is quite fast. It still doesn't feel as robust as Visual Studio, but it's getting there. I rarely went back to Visual Studio to try to resolve an issue I couldn't solve in the Rider UI. 
Rider's "build" command still doesn't notice when you've changed packages external to the solution, and doesn't do a nuget restore for you. In fact, when I updated Winform DevEx packages externally (because neither the NuGet UI in Rider nor that in VS could apply the changes without getting tripped up in dependencies because it can't upgrade multiple projects at once), Rider had no idea what I'd done until I manually deleted the obj folders from the projects that depend on DevEx. I don't recall having to do that for Visual Studio, which runs a nuget restore check before each build. Visual Studio was more amenable to finding the actual error with a "rebuild all". Rider cached more and stayed stuck on the original "error", which was hiding the real problem (an interface mismatch after the upgrade). When you update NuGet packages, Rider uses stale data a lot more than Visual Studio does now. This is how Visual Studio used to be, but it's gotten a lot better with its caches. Rider is still a few steps behind. I just upgraded NuGet packages for a project and then ran the tests. A bunch of them failed with a MissingMethodException. I know this error, so I forced a full rebuild and ran the tests again. This time everything worked. With Visual Studio, I'd gotten used to no longer having to consider "rebuild all" or "restart the IDE" as possible solutions. With Rider, you still have to occasionally use these solutions, for now. It's not the end of the world, but it does waste time and effort -- especially if you don't jump to that conclusion quickly enough. Often enough, you'll lose a good quarter of an hour chasing phantom errors and warnings instead. Conclusion: They're both about the same for day-to-day use, but Visual Studio is still slightly better at corner cases. [Testing] When you edit a unit test to change the parameters to a test case, the test session will update and then move the selection to the top of the list. 
This is very annoying since it always scrolls away from the test area I had focused. It also has an annoying habit of nearly constantly changing the selected item in the tree, making navigation difficult. This might be related to when tests are running or a build is running, but there's always something like that going on -- it's not very nice that the whole IDE has to be quiet before I can use keyboard navigation in a tree without Rider constantly stealing focus and jumping around. While running tests, Rider does not allow you to collapse nodes in the unit-test session. It quite annoyingly expands it again whenever you try to collapse a node. Searching in tests is quite slow in both Rider and ReSharper. Update 23.04.2021: I've discovered that I can use F4 in Rider to jump to the source of a test. That's very handy because double-clicking on a test in either test runner has unpredictable results that seem to depend on whether the test is defined in a base class. I can't treat the Unit Test Session window as an editor window in Rider, so it's harder to switch back and forth. The tests are docked at the bottom by default. You have to switch to that window with a hotkey, then use another hotkey to hide it. I'm getting used to it, but I don't understand why the JetBrains IDE doesn't support this feature (it doesn't have it in any other JetBrains IDEs I've used either). Conclusion: There's not much difference in testing support between the two. [Debugging] Integrated debugging with auto-disassemble and sources in Rider is pretty awesome (e.g. I debugged into SimpleInjector without SourceLink). You can open any referenced type in any assembly and either have the original source from SourceLink [4] or disassembly. In either case, you can set breakpoints and debug into it. If the file is disassembled, it's not always pretty, but it's amazingly useful for inspection. 
The Smart Step-in feature in Rider is a very nice upgrade, to which I've already become quite accustomed (just (-tab) to cycle locations). It's a bit finer-grained than being able to disable property step-in universally in Visual Studio. On the other hand, I'm not super-happy with the different ways of running an application in Rider. They seem to make it very difficult to debug an application and stop on unexpected errors. I've seen other users using Rider just kind of look in the output window as if live debugging wasn't a feature we should all expect to work. It can be configured, but you have to make sure to run in debug mode and turn on exception-handling. It's also much harder to debug a StackOverflowException in tests because Rider doesn't show a useful stack trace (it instead shows a trace for the LogException in the test runner itself). The "launch log file" is detailed, but provides no additional information. Instead, I was forced to set breakpoints and continually "edge closer" to the crash and find it myself. This is how Visual Studio used to work, but for a couple of years, its handling for stack overflows has been much better. Also, Rider doesn't stop on unhandled exceptions by default, either when running tests or running a web server. The stack trace in the debug output when running the web server isn't highlighted and can't be clicked. The debugger in Rider does not make use of the DebuggerTypeProxy to display or format debugging information, which is a shame because Quino has useful customizations for debugger display that I miss in Rider. [Invalid Targets] I was unable to debug unit tests for a while because Rider complained that my DotNet runtime (AnyCPU) didn't match the chosen testing target (x86). All of the solutions I've opened have been "Any CPU"-only, so I was mystified how Rider came up with the idea to run my tests as x86. Rider pops up a helpful tip to take me directly to the setting to change the runtime to use. 
I don't even have an x86 runtime. And I don't want to run tests as x86 anyway. The real fix is to go to Settings => Build, Execution, Deployment => Unit Testing => Default platform architecture and set it to "Automatic". Mine was hard-coded to x86, for some reason (maybe a settings upgrade from an older version). [Variable Inspection] Viewing a variable isn't as easy because Rider uses a much less-stable tooltip than VS. If you have a long value that you want to "view", you have to cruise your mouse along a long, skinny tooltip for dozens of centimeters before you can click the "view" button (you have to know it's there) at the end. [image] Since the tooltip is unstable, Rider has trained me to go down to the variable window and copy the value from there. Conclusion: Both debuggers are excellent -- each has advantages and disadvantages -- but Visual Studio still has a slight edge. It feels more mature. [External Sources] Both Rider and VS/ReSharper support navigation using SourceLink as of 2020.3, which is a massive win for usability. Now you can open a type with Ctrl + T or hit F12, (-opt) + Home, (-opt) + End to navigate to a related symbol from source and Rider/ReSharper will navigate within the SourceLink sources, which means that you can easily set breakpoints in code from NuGet packages, as long as they have SourceLink. Rider additionally offers support for setting breakpoints in disassembled code, with mixed results. However, browsing works less well in Rider. For example, I pressed (-opt) + F12 on EventHandler to "peek" it and it popped up a processing dialog for 15 seconds before I canceled it. When I pressed F12 to navigate there instead, it didn't show a progress dialog, but it also just seemed to break Rider because syntax-highlighting and code-completion stopped working for subsequently typed code. The "Errors in Solution" pane was similarly crippled, showing files with warnings, but no warnings. 
The navigation action never showed the code for the EventHandler, but it did make everything else stop working. A restart fixed everything. In addition, navigation to authenticated sources was only working temporarily. It is broken in the most recent version of Rider, as I've documented in "RIDER-61280" . Conclusion: Visual Studio is more reliable with SourceLink right now. Rider can debug and set breakpoints in disassembled sources (which it has to use because SourceLink is broken again). [Documentation] The formatting for XML documentation works strangely when Rider inserts text in documentation (e.g. when you apply a fix). We use a tab size of 2 everywhere, but the settings window shows a tab size of 4, although it also mentions that some settings might be overridden by the .EditorConfig. Reformatting or cleaning up code fixes the indentation to where it should be. It's unclear where Rider is getting its settings for the initial insertion. Even with the StyleCop Analyzers, there are fewer fixes for XML documentation than with Visual Studio/ReSharper. For example, there is no way to quickly add parameter documentation. Rider does not have any significant support for generating documentation (the initial format is very compact and never formatted according to rules). Rider's parameter-completion in documentation works more smoothly (Esc not necessary), but it does not use a "smart" sorting for tags. In ReSharper, once I've selected paramref once, that is sorted at the top and selected by default. In Rider, the order is unchanged, so I have to arrow down or type out most of the tag name in order to get past param. Rider still shows a hint to add on the class, even if the class has its own documentation. Conclusion: Visual Studio's XML documentation support is slightly better than Rider's, but they're both quite good. [General UI] [Action Menu] There's an extra item in the action list for "move to separate file" that does nothing. 
There's another item that includes the name of the file in the caption that does work. [Tooltips] There's no Enhanced Tooltip extension (and the tooltips are not as nicely formatted in Rider) [image][image] [Icons] I can't seem to change colors of icons as I can for ReSharper. I'd gotten used to brighter colors and miss them in Rider. [image] [Inlays] In ReSharper, you can disable specific inlay hints directly from the completion menu. In Rider, you can do this for some of them, but not all. If it's not there, you have to select "Configure inlay hints" and then have to find the corresponding checkbox yourself. [Windows Integration] Rider doesn't keep track of the last opened solutions to open from the task list. [As of 2021.1.1, the task list is now populated with recent solutions.] [Git Integration] The "Commit" panel doesn't refresh very quickly at all. Long after I'd seen the files in SmartGit, they were still not in the panel. When I switched away and then back, the new changes suddenly appeared. I don't use the integrated Git support, but I'm not going to start, either, after seeing how it works. [File Structure] * The file structure panel does not highlight the element where the cursor is. * The file structure panel does not support dragging and dropping * The file structure panel does not refresh to reflect changes without closing it (or switching to another panel that hides it) and reopening/reshowing it [Key Maps] I can't search for the bindings for a key combination in Rider, like I can in Visual Studio. Instead, I have to guess at the name of the operation that I think it's bound to. Update 23.04.2021: I've found that if you click on the magnifying glass to the right of the search field, you can "Find actions by shortcut". [Fine-tuning Formatting] Rider also doesn't have the "show active configurations" panel, for some reason. 
I'm currently fighting with Rider because it suddenly came up with the idea to format everything with 4 spaces instead of 2 spaces. Just yesterday this was finally working so that I could reformat the document and everything worked. Now, Rider is reindenting everything for me. Visual Studio/ReSharper is showing that I have 2 spaces configured. [image] Conclusion: Visual Studio w/ReSharper has a slight advantage in that it still feels more complete than Rider does. [Conclusion] Although Visual Studio/ReSharper edged out Rider in most of these categories, you're well-served with either one. I think if I'd compared Visual Studio by itself to Rider, then Rider would have won easily. It's only in combination with ReSharper that Visual Studio ends up being a bit better. It's just more mature and I never found myself going to Rider from Visual Studio, whereas I did have to open Visual Studio a few times to fix something I couldn't do in Rider. It's happening less with each version, though. Over the four months of the evaluation, Rider has improved steadily. [5] Once Rider files off a few more rough edges and has true feature-parity -- perhaps by natively implementing some of the inspections from the ReCommended extension -- its slightly smoother editor might help it pull ahead in this comparison. Most of the above is complaining at a very high level, though. Both IDEs will make anyone who knows how to use them a much more efficient developer of reliable and readable code. -------------------------------------------------------------------------------- [1] I actually have an Enterprise license through Encodo's Silver Partnership. However, when that license lapsed at one point (we forgot to take a test on time), I'd installed Community in order to keep working. A few days later, when my license was restored, I didn't see the need to install Enterprise again. Almost two years later and I still haven't done it. 
The Visual Studio installer doesn't offer to "upgrade" the Community edition and I don't have space for side-by-side installations. I could uninstall and re-install, but everything's working now, and I miss nothing from Enterprise. [1] The last time I tried working with Visual Studio without ReSharper was over two years ago, with Visual Studio 2019 Preview 3. Still, I can see much more of Visual Studio working better than ever, taking over more and more of what I use ReSharper for. I'd installed Visual Studio 2019 Preview 3 to investigate the following, * Memory usage * C# 8 features * How does VS2019 work without ReSharper? I installed the desktop and web-development workloads, totaling almost 6GB. * Memory usage is better, obviously. * I liked the default highlighting with purple for flow keywords (I actually ended up turning this on for ReSharper, as well) * Finding references, ancestors and implementations is still tedious because Visual Studio does everything in panels or tool windows, rather than with popup menus * Solution-wide analysis isn't nearly the same * Source-level highlighting of errors and warnings as I type are noticeably slower * The test runner is much better than it used to be, but still not as good as the one in ReSharper/Rider. * Ctrl + Q to quick-search features works just as well as Ctrl + Shift + A does to find "actions" in ReSharper. In November of 2020, the article "Announcing .NET 5.0" by Richard Lander wrote that, "[...m]oving forward, the idea is that as when we add new features to .NET, we're also adding corresponding analyzers and code fixers to help you use them correctly, right out of the gate." and "With .NET 5, we have heavily improved our support for static code analysis. This includes an analyzer for platform-specific code and a better mechanism to deal with obsoletions. The .NET 5 SDK includes over 230 analyzers!" 
The latest versions of VS also allow you to fine-tune the severity of any warnings directly from the UI/Solution Explorer. This is all a great leap forward for Visual Studio 2019, but ReSharper still improves the following features: * Navigation is much faster * Searching and introspection are more efficient and use better UX than VS's standard lists * There are more refactorings. Visual Studio doesn't yet help with inlining variables and methods in as many places as ReSharper does. Also, there are subtle inspections and quick fixes, like the following two that are only available with ReSharper/Rider. [image] [image] * The test runner is head and shoulders above that of VS * I miss "Find Everywhere". R#'s integration of "find in files" is so much faster than VS. [1] Visual Studio's search is available, but it's weaker than ReSharper's. Enough so that I don't want to use it unless I have to. It's good, don't get me wrong. It's just not as good as what I know I could have were I to install ReSharper. [1] As noted in a few places above.

Set a Git Tag on Azure

2021-03-30T21:04:45+02:00

Published by marco on 30. Mar 2021 21:04:45 ------------------------------------------------------------------------
As with "installing a dotnet tool on Azure", there isn't a standard task for setting a Git tag from a pipeline YAML configuration. The Pipeline UI has an option to easily do this, but that hasn't translated to a task yet, nor does it look like it's likely to, according to online discussions. Setting a Git tag is relatively straightforward, but is complicated by permissions (as with "installing a dotnet tool"). To tag a build, you just have to execute the git commands in a script.

- task: CmdLine@2
  displayName: Push Git Tag
  inputs:
    script: |
      git tag $(Build.BuildNumber)
      git push origin $(Build.BuildNumber)

If, for whatever reason, you want the tag to be created by the triggering user, then include the following lines as well:

- task: CmdLine@2
  displayName: Push Git Tag
  inputs:
    script: |
      git config user.email "$(Build.RequestedForEmail)"
      git config user.name "$(Build.RequestedFor)"
      git tag $(Build.BuildNumber)
      git push origin $(Build.BuildNumber)

You should include this step after the version number has been updated. With the task in place, you have to ensure that you've granted permissions to the proper user.

1. Go to the "project settings"
2. Select "Repos/Repositories"
3. Select the "Permissions" tab
4. Allow the specific permission "Contribute" for the "Project Collection Build Service" user. [1]

[image]
-------------------------------------------------------------------------------- [1] Note: granting the permission to "Project Collection Build Service Accounts" or "[Project Name] Build Services" had no noticeable effect.

Installing a dotnet tool on Azure

2021-03-29T22:36:59+02:00

Published by marco on 29. Mar 2021 22:36:59 ------------------------------------------------------------------------
I have a .NET solution (Quino) that contains a project that I publish as a `dotnet` tool. The tool calculates a version number based on the branch and version number found in the solution. I use it from Quino itself and also from other project pipelines. In order to use it from any pipeline (including Quino itself), I need to install it from the Quino artifact feed.

The original solution is a couple of years old: I'd had a secure file for NuGet.Config that included the PAT. This works fine, until the PAT expires. So, I went searching for a better solution and thought I'd try something a bit more resilient and better-supported. By now, I'm using YAML files for my pipeline, so I tried the DotNet task, but it doesn't support installing tools. There are open issues and even a very old open pull-request for supporting a Microsoft tool on Microsoft's premier hosting service that Microsoft has steadfastly ignored. There seem to be no plans for supporting dotnet tool install natively, with seamless authentication, as they've done for dotnet restore. The example below shows how this works for restore.

- task: DotNetCoreCLI@2
  displayName: 'Restore Server Packages'
  inputs:
    command: 'restore'
    feedsToUse: 'select'
    feedRestore: 'Quino'
    projects: 'server/src/**/*.csproj'
    verbosityRestore: Normal
    includeNuGetOrg: true

I was hoping to follow this pattern to use the dotnet task to install a tool with something like the following:

- task: DotNetCoreCLI@2
  displayName: 'Restore Server Packages'
  inputs:
    command: 'tool install'
    feedsToUse: 'select'
    feedRestore: 'Quino'
    includeNuGetOrg: true
    isGlobal: true
    toolName: quino

There is no support for this. The PR mentioned above would support it, but it's never been accepted and Microsoft has not seen fit to add automatically authenticated feeds for anything other than restore. 
Instead, I use two tasks: the first is a workaround for the lack of proper support in Azure for `dotnet tool install` from authenticated feeds; the second installs the tool. See "'dotnet tool install/update' not working with Azure Artifacts #10057" and "Add dotnet tool install command to support tools location in Azure Artifact feeds #13401" (the PR) for more information. I can copy/paste the two tasks below into all of the pipelines that need it. It's a bit bulky and non-intuitive, but it is both project-agnostic and doesn't include any passwords or PATs directly. Instead, it uses the $(System.AccessToken). If the project has been granted access to the feed identified by using the standard feed permissions control panel, then it works.

- task: NuGetCommand@2
  displayName: 'NuGet Add Credentials For Internal Feed'
  inputs:
    command: custom
    arguments: >
      'sources add -Name "" -Source "/nuget/v3/index.json" -Username "this_value_could_anything" -Password "$(System.AccessToken)"'
- task: CmdLine@3
  displayName: Install tools
  inputs:
    targetType: inline
    script: dotnet tool install --global

Where:

* is obtained from your Azure project
* doesn't matter, as long as it doesn't conflict with any defaults
* is the name of the tool to execute

This is utterly unintuitive, but it works and it's not too much hacking. I think it's indisputable that it would be much nicer if "install tool" were an option for the "dotnet" command. It's not like it's an external tool. This is literally how Microsoft has asked us to work. It would be nice if I hadn't had to spend half an afternoon trying to figure out how to get a dotnet tool installed from a feed in the same project on Azure. I'm glad I got it working, but everyone who comes after will also waste time trying to figure this out -- or will give up and use a gross hack instead.

A nice CSS demo that uses 350% CPU

2021-03-13T22:09:08+01:00

Published by marco on 13. Mar 2021 22:09:08 ------------------------------------------------------------------------ The article "Getting the Mouse Position using CSS" by Bramus talks about a neat trick that uses sibling elements to react to mouse events without using JavaScript. It also features some kick-ass translucency and animation effects with CSS transitions. As you move the cursor around, the layer of "cells" changes the X and Y positions that the CSS text elements "watch". This lets the central elements "follow" the mouse, transforming a stack of "CSS" texts into a nicely composed and layered stack. It looks like this. [image] While this is a nice-looking effect -- and it's impressive that it works purely in the browser and purely in CSS -- it kicked in the fan on my iMac, something that rarely happens. That said, the compositing features of a modern browser are impressive and can save website authors a lot of time and effort. That this is even possible is already really, really nice. Maybe with a bit of tweaking, it can be made less detrimental to battery life. If you want to try it out yourself or tweak the code, check out the "CodePen".

TIHI: SmartGit's new "Discard to Stash" Feature

2021-03-04T22:39:04+01:00

Published by marco on 4. Mar 2021 22:39:04 Updated by marco on 4. Mar 2021 22:39:38 ------------------------------------------------------------------------ This is a quick note for anyone else who's downloaded the latest version of SmartGit (20.2.3 #16150) and is seeing mysterious stashes that they know they haven't created. There's a new feature called "discard to stash" that is enabled by default. [image] What this does is to stash every time you press Cmd + Z to discard changes. I understand that this is a failsafe "just in case", but I kept ending up with a dozen stashes I had no use for. On balance, I'd rather have the tiny risk of wanting changes back that I'd discarded -- I can't recall this ever having happened -- than the "noise" of stashes muddling the list of actual stashes I'd saved. I started off trying to train myself to hit right arrow and then Enter, or typing D instead of Enter, but I gave up and found an "advanced" preference to switch the default behavior. [image] 1. Open Preferences 2. Select "Low-level properties" 3. Select "I accept the risk!" 4. Search for "discard" 5. Change the value to "false" 6. Press OK

A breakdown of programming languages used in a week

2021-02-17T21:56:03+01:00

Published by marco on 17. Feb 2021 21:56:03 ------------------------------------------------------------------------ In one recent week, I realized I'd been working in many different areas and on many different projects, so I took an inventory. For one project, I reconfigured a program with Delphi Pascal, using Delphi 7 (it's a very old, legacy solution) to run on my local machine instead of in a VM that had swollen to 120GB. For that project, I also used SQL on SQL Server, running in a Docker container that I'd configured with YAML. The solution has several products, among which you can switch, so I wrote a Windows Batch program to transfer and back up versions, so you can nicely diff them with SmartGit using Git. In order to diff SQL, I used a tool written in TypeScript, which I extended with a few fixes and tests written with Jest in Visual Studio Code. I updated the documentation in Markdown. At the same time, I was working on "Quino" , written in C# for the .NET platform, using Visual Studio on Windows and Jetbrains Rider on MacOS. I also set up a new solution using Quino, which involved editing a bunch of XML project files as well as configuring SQL Server and PostgreSql with Docker. I again used YAML to define pipelines in Azure DevOps. For two evenings, I graded final projects for a JavaScript class I've just finished teaching. On the other evenings, I researched modern HTML, CSS, and SVG for an upcoming redesign of earthli. I made a few PHP fixes for earthli as well. I wrote blog posts, wiki entries, and issue analyses in Jira Syntax, Markdown, XWiki syntax, and "earthli Syntax" in both English and German.

Software without Process

2021-01-17T17:43:34+01:00

Published by marco on 17. Jan 2021 17:43:34 ------------------------------------------------------------------------ [image]A software product with undocumented or poorly documented commits and a patchy issue-tracker is akin to a shipping pallet with 100 boxes haphazardly stacked on it, all wrapped up in shipping cellophane. You can see some of the labels and some of them you can't and some of the boxes definitely don't even have labels at all. [image]If it looks like the pallet to the right, then you already know you can't ship it. That's an obvious train-wreck of a project that's going to blow up in everyone's face. But the picture to the left looks...OK...ish. How do you know if it's legit? Check the shipping manifest and get out your scanner gun, right? The shipping manifest on your clipboard has 3 and 1/2 items on it, none of them really helpful. If you really want to be sure about what you're shipping, you're going to have to unwrap the whole thing and look at each box individually, noting it on the manifest if it's missing -- and maybe even opening it up to see what's actually inside. Maybe it's even broken and leaking on other boxes, somewhere in the middle of that whole pile. Maybe someone wrapped it in cellophane to give it the sheen of reliability, but you can't know for sure. Is it possible that you spend all of the time to dot the i's and cross the t's just in order to find out that it was fine, but just drastically under-documented? It's possible, of course. That's a risk you take when you try to be professional. The alternative is to become a gambler -- shipping something and hoping that it doesn't come back to haunt you. A better approach would have been to use a documenting process as you built the product -- like engineers rather than cowboys -- slowing our awesome selves down a bit, but also -- maybe, just maybe -- getting faster because we're more careful and can avoid wasting time on work that doesn't need to be done. 
Documenting the work to be done -- e.g. to explain it to other team members -- can have the much-appreciated side-effect of focusing you on the work that actually needs doing. This is generally more efficient and satisfying than just shooting out of the gate and doing what you "know needs doin'" and not noticing possible ramifications until it's too late to do anything but react to them rather than plan for them. In the end, you have not only solid, well-designed, and tested software, but also good documentation of what was actually done for a given release, as well as analyses for what was not done and what needs doing in the future. That everything is well-documented enough to implement now means you've got half a chance of still knowing what it means in 1/2 or 3/4 of a year when you finally get a chance to plan and implement it. Who knows? You may never need to work on it again -- which is just fine. At least you'll know what you didn't implement and why. This is very helpful for that time, in a year or two, when you think of this exact same solution and are maybe too stressed or under too much pressure to remember why you decided against it the first time. A good software product is not just the product itself, but all of the metadata surrounding it: the documentation, the analyses, the release notes, the roadmap.

Set up PHP With Docker, PHPStorm, and XDebug

2021-01-17T00:01:34+01:00

Published by marco on 17. Jan 2021 00:01:34 ------------------------------------------------------------------------ Until now, PHP debugging involved a fragile balance between the IDE, the server, and the debugger, each with overly verbose configuration. On top of that, using Docker introduced the wrinkle that you were technically debugging on a remote server rather than on the "real" localhost. It's been a long journey, but it's finally a lot easier to set up PHP debugging with a server running in a Docker container. Once you use the most modern tools, everything works with a couple of lines of configuration. tl;dr: * Ignore anything you find on StackOverflow from before November of 2020 and use "the install-php-extensions project" instead (see example below). * Set environment variables in the docker-compose file to indicate the client and the default mode (debug) * Use the latest PHPStorm, which supports XDebug 3.x My setup is as follows: * PHPStorm 2020.3.1 running on MacOS * Docker 3.0.4 running on MacOS * "db" container running MySql 5.7.24 * "web" container running PHP 7.2.24 on Apache So far, so good: it's basically a standard developer setup for PHP where I have an IDE on my machine and am running servers in Docker containers. XDebug initiates a connection from the server in the "web" container back to the IDE on the docker host. Without further ado, these are the magic configuration files to install extensions and set up XDebug for PHP. [web/Dockerfile] After much searching and rigmarole and fighting with docker-php-ext-install and docker-php-ext-enable and PECL and where the PHP.INI is and whether I need to move one of the default files somewhere so that PECL can update it and downloading dependencies with apt-get and getting the right dependencies, depending on the PHP version and passing the right flags to docker-php-ext-configure if the version is a bit older and, and, and... 
After trying a ton of no-longer-relevant and now-overly-complex suggestions on StackOverflow, I finally returned to "php on dockerhub" and discovered a hint to use "the install-php-extensions project", which basically takes care of everything for you. It does. End of story.

FROM php:7.2.24-apache
ENV DEBIAN_FRONTEND=noninteractive
ADD https://github.com/mlocati/docker-php-extension-installer/releases/latest/download/install-php-extensions /usr/local/bin/
RUN chmod +x /usr/local/bin/install-php-extensions && sync && \
    install-php-extensions gd xdebug mysqli exif zip

I pin the PHP version to the one on my server, download the latest version of install-php-extensions [1] and then call it to install the non-standard extensions I use on earthli:

* exif: Extract date information from pictures
* gd: Generate thumbnails
* mysqli: Provide access to MySql using a legacy API
* xdebug: Debugging support on the server
* zip: Open and read files from ZIP archives

See the "web site" for the list of supported packages. Your site will likely use different ones (but you should definitely install xdebug because it's totally easy to use now).

[docker-compose.yaml] Finally, you just need to set two environment variables to enable debugging for PHP:

* XDEBUG_CONFIG: accepts a list of settings, but we only need to set the client_host to tell XDebug which machine hosts the IDE to which to connect (Docker handily provides the host.docker.internal alias for MacOS and Windows)
* XDEBUG_MODE: this sets up the tool for step-debugging (see "XDEBUG mode" for more information).

I've included nearly the full "web" service definition from earthli's docker-compose file, but the only relevant part for debugging is in the environment. 
web:
  build: web
  container_name: "${COMPOSE_PROJECT_NAME}-web"
  restart: unless-stopped
  ports:
    - 80:80
  volumes:
    - ../site:/var/www/html
    - ../lib:/var/tmp/earthli.com-lib
    - ../../earthli-webcore/site:/var/tmp/webcore-site
    - ../../earthli-webcore/lib:/var/tmp/webcore-lib
    - ../../earthli-data:/var/tmp/earthli-data
    - ../../earthli-logs:/var/tmp/logs
    - ../config/apache-dev.conf:/etc/apache2/sites-available/000-default.conf
  depends_on:
    - db
  environment:
    XDEBUG_CONFIG: client_host=host.docker.internal
    XDEBUG_MODE: debug

[PHPStorm and Browser] At this point, you're well on your way to debugging with PHPStorm. From here, follow the instructions in the settings dialog, shown below. [image]

* You can click the "Validate" link in the dialog to verify that your XDebug is recognized and working in principle
* Activate step debugging from the browser. You can pass XDEBUG_SESSION=PHPSTORM in the query string, but that gets a bit tedious. Instead, "install a browser-debugging extension", which simply injects the cookie XDEBUG_SESSION=PHPSTORM into the request so that PHPStorm knows that debugging is desired. See "XDebug's documentation" for more information on other ways of triggering debugging, including from the command line (e.g. when running unit tests).
* Make sure PHPStorm is listening for incoming PHP Debug Connections (you can toggle in the dialog shown above or from the toolbar in the IDE).

That's it. A long and kind of painful journey has finally led to a solid and easy-to-configure debugging experience for PHP.
-------------------------------------------------------------------------------- [1] I will probably pin the version to the one I'm using right now because I know it works.

A dynamically generated Groovy foot-gun

2020-07-05T21:52:52+02:00

Published by marco on 5. Jul 2020 21:52:52 ------------------------------------------------------------------------
Groovy is a dynamically typed programming language that executes on the Java Runtime. It mixes its own highly dynamic syntax with islands of Java code. The Android ecosystem and its IDE use Gradle for their build scripts. Gradle uses the Groovy programming language.

[The Problem Code] A large project I'm working on contains quite a bit of custom Gradle code for integrating framework libraries, making obfuscated builds, configuring publication, and, finally, creating signed builds. The signed builds are configured using standard Android Gradle DSL commands. Basically, there was a block of code something like the one shown below.

signing {
  storeFile = getKeyStoreFile()
  storePassword = getKeyStorePassword()
  keyAlias = getKeyAlias()
  keyPassword = getKeyPassword()
}

The names of the methods (e.g. getKeyAlias) used to be different before I'd refactored them to have more standard names. [1] The methods check whether there are environment variables set by the build server, using sane defaults for developer builds. [2] This is where I went wrong. Never touch a running system [3], even when you're trying to pull it back from the precipice of "maintenance nightmare that everyone is terrified to touch, to say nothing of change". Well, I changed it, and ended up frittering away a couple of hours investigating the Groovy "feature" outlined below.

[Lowering Groovy to Java] Groovy performs syntax-checking, but is extremely lenient as far as types and variables are concerned. Variables have to be defined, but pretty much anything can be coerced into anything else. It is transformed to Java code and then Java byte code by the Java compiler. 
Any typing errors you see are from the Java compiler, not the Groovy compiler. As any programming language would, Groovy resolves identifiers to match the declaration that is closest in scope to the call, even when that declaration is generated at compile-time and couldn't possibly be the one that the original author had intended to call. This is going to be important later (which is why I put it in scary italics). The four methods above are defined in an ext {} block [4]. Calling them without a specific target as above automatically resolves to the methods from the ext {} block. Oddly, of the four properties being set in the example above, only the first two actually called the methods I'd defined in the ext {} block. The calls to getKeyAlias() and getKeyPassword() were not made to the expected functions. I could tell they weren't being called because the logger.info() calls from those two methods never appeared in the output. What the hell is going on? If you look carefully, you'll notice that the first two methods have different names than you would use for writing the getter and setter for the properties being assigned. The second two methods match those names exactly.

[Auto-generated Java Code] When Groovy lowers its syntax to Java code, it declares these getters and setters. The Java compiler, in turn, references these new methods because the calls in the original Groovy code hadn't been specific about the target of the methods. Instead of lowering to Java code and being explicit about which ext block the method should be called from, Groovy just left the naked call as I'd written it. Probably, if I'd explicitly called ext.getKeyAlias(), it would have avoided calling the dynamically generated this.getKeyAlias() method. Of course, Groovy had trained me to stop prepending the target ext. to global function calls because ext resolves to different things, depending on the DSL-specific context. 
Sometimes it's the root project's extra variables and sometimes it's the sub-project's extra variables and sometimes ext doesn't work at all (e.g. in Java classes, naturally, but also in blocks created by special keywords). Sure, you can try playing around with rootProject.ext. or other similar constructs, but the code quickly becomes even more unreadable than it already would be and the non-prefixed version works 99% of the time. So what ended up happening was that, instead of calling the method I'd actually called, the Groovy compiler generated a new method with the same name and a higher specificity in the scope, capturing the call. Instead of calling my method, it ended up calling this.setKeyAlias(this.getKeyAlias()), which is basically a NOP that leaves the property empty.

[Fixing the Problem] The solution is to use a unique name for the function that does not conflict with any of the auto-generated getters. That is, of course, an unmaintainable nightmare, but part and parcel of working with Gradle.

signing {
  storeFile = getKeyStoreFile()
  storePassword = getKeyStorePassword()
  keyAlias = getSigningKeyAlias()
  keyPassword = getSigningKeyPassword()
}

Lo and behold, my log entries appeared and I was back in business.

[Fixing the Compiler] The compiler authors could have tried harder to avoid altering the semantics of the higher-level Groovy code when replacing it with Java. One way would be to use more obfuscated auto-generated getters and setters (to the degree that Java even allows this, which I think it does). Another way was hinted at above: when lowering calls that auto-resolve to functions declared in ext regions, include information about the resolution in the call made in Java. That is, instead of just encoding getKeyAlias() as I'd written it (which is semantically correct at the Groovy level), transform that call to rootProject.ext.getKeyAlias() in Java. 
[tl;dr] Gradle is a shaky piece of business that automagically generates code that might replace actual, legitimate calls in your own code. It should never have been used for a build system. It makes MSBuild seem like a pretty good idea. -------------------------------------------------------------------------------- [1] The code had somewhat haphazard names before, which actually ended up protecting it from the bug I ran into. Lesson learned: using obtuse function names is good. [2] Yes, we're going to integrate a secret store like Vault instead of relying on environment variables. That comes after this refactoring. [3] In the end, I was happy I'd refactored everything because I ended up with a lot less code that was much more self-documenting. The job was to refactor and clean up scripts that had grown over the years, so there was no avoiding renaming oddly named methods. The side-effect in this case was unusually painful. [4] Gradle lets you declare "extra" variables in a scoped block called ext. The parent of this block depends on the context. It's usually rootProject, unless you're executing project-specific code, in which case any declarations will be made in the sub-project-specific ext block instead of the one for the rootProject. It can get quite confusing if you're not sure which context you're in when you declare an ext {} block, which is why some authors try to declare rootProject.ext or project.ext, but then you run into problems when you grab a variable from the wrong extra region. Not to mention that it gets quite a bit messier to read and if all authors don't stick to the same style, it becomes difficult to tell which explicit references are necessary and which are just thrown in there "to make sure". I settled on just declaring as much as possible in the ext {} and letting Groovy figure out which variable to use from scope. That ended up biting me in the ass exactly once, as detailed above.
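
The shadowing described in this post can be modelled with a toy scope chain. The sketch below is TypeScript rather than Groovy, and everything in it is hypothetical: the `resolve` helper, the `ext` and `generated` objects, and the returned strings are invented for illustration. Groovy's and Gradle's real resolution machinery is more involved; this only shows how "nearest declaration wins" lets a generated getter capture a call meant for a helper further out.

```typescript
// A scope is just a bag of named zero-argument methods (hypothetical model).
type Scope = Record<string, () => string>;

// Look a name up in the nearest scope first, then work outwards.
function resolve(name: string, scopes: Scope[]): () => string {
  for (const scope of scopes) {
    const method = scope[name];
    if (method !== undefined) return method;
  }
  throw new Error(`No method named ${name}`);
}

// Helpers declared in the outer, "ext"-like scope.
const ext: Scope = {
  getKeyStorePassword: () => "password-from-env",
  getKeyAlias: () => "alias-from-env",
  getSigningKeyAlias: () => "alias-from-env",
};

// The signing block's own scope, including auto-generated property getters.
const signing = { storePassword: "", keyAlias: "" };
const generated: Scope = {
  getStorePassword: () => signing.storePassword,
  getKeyAlias: () => signing.keyAlias,
};

const scopes = [generated, ext];

// No generated getter has this name, so the call reaches the ext helper.
signing.storePassword = resolve("getKeyStorePassword", scopes)();

// The generated getter shadows the helper: the property is assigned to itself.
signing.keyAlias = resolve("getKeyAlias", scopes)();

// A unique helper name sidesteps the generated getter.
const fixedAlias = resolve("getSigningKeyAlias", scopes)();

console.log(signing, fixedAlias);
```

In this model, the only "fix" available to the caller is the one the post settles on: pick a helper name that no generated getter will ever use.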

C# 9: finally, covariant returns

2020-05-24T22:27:00+02:00

Published by marco on 24. May 2020 22:27:00 ------------------------------------------------------------------------ The article "Welcome to C# 9.0" by Mads Torgersen (May 2020) introduces several nifty new features that I am really looking forward to using. [What about C# 8?] I still haven't moved "Quino" to C# 8, as the only feature I'd love to have there is the non-nullable types, which ReSharper Annotations provide with earlier versions of C#. Not only that, but the nullabilities are properly propagated to users of Quino. It's understood that recent versions of Visual Studio and runtimes and compilers also do this but, until recently, our customers weren't up-to-date yet. In C# 8, we could also replace extension methods with default interface methods -- but we've also been replacing almost all extension methods in Quino with singletons and composition anyway. A lot of the "rest of the features" are nice, and interesting, but they are targeted optimizations that don't really apply to a lot of the code that I write. I see how they are eminently useful for lower-level library and runtime optimization -- many are clearly made to be able to handle web requests and fine-grained tasks more quickly and without allocation. [Features in C# 9] Still, the features in C# 9 make an upgrade even more attractive. * Init-Only Properties can only be initialized in the object initializer, after which they are immutable. This extends read-only properties, which can only be initialized to default values or in the constructor, to make them much more useful and allow many, many more data structures to be immutable. * Records reduce a ton of boilerplate for what used to be referred to as DTOs. The declaration support is very similar to the syntax in TypeScript. Record classes automatically get construction, deconstruction, and value-based equality and hash-code support for a very natural way of declaring and working with immutable data. 
* The with keyword and functionality allows code to easily derive new data from existing data (e.g. var originalPerson = otherPerson with { LastName = "Hunter" };) * There are some neat improvements to pattern-matching with relational patterns and logical patterns, but I honestly don't expect to use that too much (my use of advanced pattern-matching so far has been relatively limited...I haven't even availed myself of the extended support in C# 8). * There are improvements to target-typing with some coalescing expressions now compiling as expected rather than requiring what always felt like a superfluous cast. I doubt I'll be using target-typing like this: Point p = new (3, 5); rather than this: var p = new Point(3, 5); * And, finally, covariant return types make an appearance. Java has had these forever and there is no logical downside to introducing them. This allows a descendant method to change the return type of an override to a descendant as well. The most common use case would be for the return type of a Clone() method. The next step would be to allow anchored types (as in Eiffel), which would let a method declare its return type as like this and remove the requirement that each descendant override Clone at all, while still having the desired covariant return type. I've been musing about these features for what feels like most of my career. * "Static-typing for languages with covariant parameters" (June 2003 -- as anchored types) * "Joel “Blowhard” Spolsky Sounds Off" (January 2005) * "Wildcard Generics" (November 2006 -- in Java, not C#) * "Waiting for C# 4.0: A casting problem in C# 3.5" (October 2009 -- footnote linking to PDF about anchored types) * "A provably safe parallel language extension for C#" (February 2013) * "C# 6 Features and C# 7 Design Notes" (March 2015)
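
The difference between a covariant return and an anchored type can be sketched in TypeScript, which happens to support both ideas. This is an analogy only, not C# 9 syntax: the `Shape` and `Circle` classes and the `rename` method are invented for illustration. The overridden `clone()` narrows its return type the way a C# 9 covariant override would, while the polymorphic `this` return type on `rename()` behaves roughly like Eiffel's "like this": descendants get the narrower type without overriding anything.

```typescript
// Hypothetical classes, invented for illustration only.
class Shape {
  readonly name: string;

  constructor(name: string) {
    this.name = name;
  }

  // The base version returns the base type.
  clone(): Shape {
    return new Shape(this.name);
  }

  // Polymorphic `this` return type: roughly what an anchored type would give
  // you. Descendants inherit it and get their own type back, with no override.
  rename(name: string): this {
    const copy = Object.create(Object.getPrototypeOf(this)) as this;
    return Object.assign(copy, this, { name });
  }
}

class Circle extends Shape {
  readonly radius: number;

  constructor(name: string, radius: number) {
    super(name);
    this.radius = radius;
  }

  // Covariant return: the override narrows Shape to Circle.
  clone(): Circle {
    return new Circle(this.name, this.radius);
  }
}

const original = new Circle("unit", 1);
const copy = original.clone(); // typed as Circle, no cast needed
const renamed = original.rename("big"); // also typed as Circle, no override needed

console.log(copy.radius, renamed.radius, renamed.name);
```

Note that `rename()` returns a modified copy and leaves `original` untouched, in keeping with the immutable style that records and `with` encourage.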

W3C Web Animations

2020-04-13T11:20:34+02:00

Published by marco on 13. Apr 2020 11:20:34 Updated by marco on 15. Apr 2020 15:47:26 ------------------------------------------------------------------------ The "Web Animations Working Draft" was published in October of 2018. "Can I use "Web Animations"" shows that the only browser that supports this API 100% is the latest technology preview on iOS and MacOS. Chromium-based browsers have had (very) basic support for quite some time, but Safari has thrown down the gauntlet with full support, which I learned about from "Web Animations in Safari 13.1" by Antoine Quint . This API is intended to replace many usages of CSS Animations and CSS Transitions, which are not only somewhat verbose and unwieldy for even simple cases, but are also not efficient in that each animation tends to force itself to start, artificially interrupting the browser as it prepares a page. With the Web Animations API, a page can much more declaratively indicate its intent without force-calculating animation target values, as is required now with CSS Animations. A page can create and launch animations, but it can also get a reference to that animation and change it on-the-fly afterward. You can play it, pause it, change the play position, the play state, hook into the animation lifecycle with a Promise-based API, and much more. A page can even get all of the animations associated with an element or the entire document and manipulate them wholesale. Safari's new inspector uses this API to offer much richer display and control of all running animations. Understandably, Safari has reimplemented CSS Animations and CSS Transitions on top of a whole new animation engine that the Web Animations API also controls. Safari puts a very strong implementation forward, with only two features missing: * SVG Animations are not running on the new engine yet and cannot be manipulated with the API * "Effect composition" is not yet supported
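
As a rough sketch of what creating an animation and then manipulating it afterward looks like, the TypeScript below uses the API's `animate()`, `getAnimations()`, `pause()`, `play()`, and `finished` members. The `AnimationLike` and `AnimatableLike` interfaces are my own minimal stand-ins for the DOM's `Animation` and `Element` types, so that the sketch can be type-checked outside a browser; the `fadeIn` and `pauseAll` helpers are likewise invented for illustration.

```typescript
// Minimal stand-ins (assumptions, not the real DOM typings) for the parts of
// Animation and Element that this sketch touches.
interface AnimationLike {
  playbackRate: number;
  pause(): void;
  play(): void;
  readonly finished: Promise<unknown>;
}

interface AnimatableLike {
  animate(
    keyframes: Array<Record<string, string | number>>,
    options: { duration: number; easing?: string; fill?: "none" | "forwards" | "backwards" | "both" | "auto" }
  ): AnimationLike;
  getAnimations(): AnimationLike[];
}

// Create and launch an animation, returning the handle for later control.
function fadeIn(target: AnimatableLike, durationMs: number): AnimationLike {
  return target.animate(
    [{ opacity: 0 }, { opacity: 1 }],
    { duration: durationMs, easing: "ease-out", fill: "forwards" }
  );
}

// Pause every animation currently associated with the target.
function pauseAll(target: AnimatableLike): number {
  const animations = target.getAnimations();
  for (const animation of animations) {
    animation.pause();
  }
  return animations.length;
}

// In a browser, usage might look like this (not executed here):
//   const heading = document.querySelector("h1");
//   if (heading) {
//     const animation = fadeIn(heading, 300);
//     animation.playbackRate = 0.5;
//     animation.finished.then(() => console.log("fade finished"));
//   }
```

The point of returning the handle from `fadeIn` is the one made above: unlike a CSS Animation declared in a stylesheet, the page keeps a reference that it can pause, resume, or await.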

PostgreSql Drawbacks

2020-04-13T11:18:02+02:00

Published by marco on 13. Apr 2020 11:18:02 ------------------------------------------------------------------------ Despite the title, from what I can gather from "10 Things I Hate About PostgreSQL" by Rick Branson , the author is a big fan of PostgreSql. However, he has such vast experience with it that he can still list 10 things that don't work as well as they could. They seem to boil down to: * Default replication is still serialized and therefore not as reliable as the alternative async protocol that is much harder to configure and pretty much what every other major database offers. * Obsolete-data-handling requires regular and fine-tuned (for performance-critical) vacuuming instead of just handling it in a cleaner manner like MySQL, Oracle or SQL Server (rollback log or temp tables) * Indexes use more space than they should because they copy the indexed fields rather than referring directly to the data in the table. That is, unlike most other DBs, the index data is physically separate from the table data and therefore can nearly double the required space for a table with few columns. * The plan-builder doesn't support planning hints, which means you can't patch a query in production to buy time: you have to either meta-patch it (i.e. figure out some way of sending a "hint" to the planner through other means) or fix it for real, which can take a lot more time while your production servers are blowing up. From the article, "I do understand their reasoning, which largely is about preventing users from attacking problems using query hints that should be fixed by writing proper queries. However, this philosophy seems brutally paternalistic when you’re watching a production database spiral into a full meltdown under a sudden and unexpected query plan shift. (Emphasis in original.)"

Getting started with D3.js

2020-03-21T18:37:55+01:00

Published by marco on 21. Mar 2020 18:37:55 Updated by marco on 15. Apr 2020 15:50:21 ------------------------------------------------------------------------ [image]The programmable notebook "Introduction to D3" by Arvind Satyanarayan is part of a "full course" at MIT about Interactive Data Visualization. The linked notebook uses "D3.js" , but previous classes in the course have dealt with "Vega" , which is, "[...] a visualization grammar, a declarative language for creating, saving, and sharing interactive visualization designs. With Vega, you can describe the visual appearance and interactive behavior of a visualization in a JSON format, and generate web-based views using Canvas or SVG." Vega is a higher-level abstraction than D3 and is, therefore, both more powerful and more limited than it. If what you want to build fits the higher-level building blocks of Vega (see "examples" ), then you'll be done more quickly with that; if it doesn't, then D3.js offers more flexibility as it functions at finer granularity. "[...] grammars [like Vega] break visualization design down into a process of specifying mappings (or visual encodings) between data fields and the properties of graphical objects called marks. They're useful for concisely and rapidly creating recognizable visualizations, while giving us more design flexibility (or expressivity) than chart typologies like Microsoft Excel. "However, describing visualization design in these high-level terms limits the types of visualizations we can create. For example, we can only use the available marks, and can only bind data to supported encoding channels." With D3.js, you have to do a bit more legwork yourself, but it offers more graphical flexibility and possibilities. Instead of customizing the settings for predefined renderers (or "marks"), you define the renderers yourself: the notebook includes examples in HTML and SVG. 
To keep things simple, the SVG examples replicate the HTML examples, but they could render much more that is not so easy to realize in HTML. Although D3.js has a reputation as a "charting library", that moniker is actually more appropriate for Vega. D3.js is a generalized data-to-graphics mapping library. As you can see from the "examples" , it is very useful for charts, but allows a lot more customizability than Vega. Anyone building charts for their site should consider very carefully whether the additional power and complexity are warranted vs. a solution with something like Vega. That said, it was a lot of fun getting to know D3 with this notebook. The notebook is extremely well-written and organized and it's absolutely fantastic that it's available online, for free. I was able to understand and execute all of the exercises and feel like I have a good enough grasp of D3 now to be able to build something with it. Perhaps more importantly, I feel that I can now: 1. Determine whether D3 or Vega are appropriate for a given project 2. Estimate the amount of time required to build something with either of them

Rust: from zero to pretty-well-versed in 30 minutes

2020-03-21T15:59:18+01:00

Published by marco on 21. Mar 2020 15:59:18 ------------------------------------------------------------------------
I found the article "A half-hour to learn Rust" by Amos to be extremely helpful in learning the syntax and mechanics of Rust. It starts out with the absolute basics:

"let introduces a variable binding [...]"

then takes you through

* Modules
* Blocks
* Conditionals
* Matches
* Options
* Mutables
* Copy/clone semantics
* Traits
* Generic parameters
* Constraints
* Macros
* Enums
* Lifetimes and borrowing
* Generic lifetimes
* Statics vs. owned vs. referenced
* Slices and range literals (Index and IndexMut)
* Results
* Errors, panic and unwrap, expect() and ?
* Closures (Fn, FnMut, and FnOnce)
* move
* for ... in

and ends up with a function builder that tests strings:

    fn make_tester<'a>(answer: &'a str) -> impl Fn(&str) -> bool + 'a {
        move |challenge| {
            challenge == answer
        }
    }

    fn main() {
        let test = make_tester("hunter2");
        println!("{}", test("*******"));
        println!("{}", test("hunter2"));
    }

    // output:
    // false
    // true

Quino 2020 Roadmap

2020-03-07T18:43:04+01:00

Published by marco on 7. Mar 2020 18:43:04 Updated by marco on 8. Mar 2020 10:58:06 ------------------------------------------------------------------------
Now that "Quino 8.x is out the door", we can look forward to Quino 9. Quino 8 is a very solid and stable release that has already been test-integrated into many of our current "products running on Quino". We don't anticipate any more low-level API changes, though there will be follow-up bug-fix releases. There are a few larger-scale changes, improvements, and enhancements, outlined below (and noted in the "roadmap").

[Ready for Change]

With this release, we've got more coverage than ever. Excluding only generated code (e.g. *Metadata.cs and *.Class.css in the model assemblies), we ended up with a respectable 81% test coverage. Quino has almost 10,000 tests comprising about 51k LOC and covering 82k LOC [1]. Many, many of these are integration and scenario tests. With this level of test coverage, we feel comfortable with refactoring to improve usability and performance.

[image]

[Documentation]

One of the primary near-term goals is to improve Quino's documentation story. The aim is to take a new developer through the common tasks of working with a solution based on Quino.

* "Setting up a solution": getting packages and creating standard projects for testing, web, and so on
* "Creating a model"
* "Using expressions"
* "Using quino's command-line tools"
* Generating "ORM" and "metadata" code
* Working with "business logic"
* Integrating the "web"
* Improving the "debugging experience"
* "Writing tests" efficiently, quickly and easily using powerful base classes and tools for snapshot-based tests

Some of this documentation is currently still out-of-date or will change as we improve the corresponding components.
For example:

* Anything that refers to Nant is no longer relevant
* Some of the quino tool documentation will no longer be relevant after 8.1 (see tools-related issue in the "issue tracker")
* The "data documentation" is still very much a work-in-progress

[Search and Index]

The latest "table of contents" is much more comprehensive than before and we're still improving it. We don't have an integrated search for the conceptual documentation yet, but you can use Google's site-specific search. For example, search for configuration with the following search text "configuration site:docs.encodo.ch". The top results are:

* "Configuration"
* "Configuration System"
* "Quino Application Configuration"

Which is pretty decent, overall.

[Database-first]

Several of our upcoming products using Quino (two are so new that they're "not yet listed") are replacing legacy products that are highly dependent on a central database that defines the application domain. That is, the model is in the database or in a model description that is not initially a Quino model. Instead of defining the model in C# code manually and then building the database from that (the standard approach with Quino), these products define the model with varying levels of automation and import, and then use the existing database.

The following list shows the various ways that we're building Quino models, in addition to the standard approach of "defining them in C#":

* Import a model from Microsoft Dynamics CRM
* Import a model from a legacy DSL like the Atlas modeling language
* Import a model from the database schema itself, using Quino's "schema-import/migration support"
* Map a model in C# to an existing database schema, using Quino's new "support for mapping schemas"

This allows customers with existing databases to relatively quickly and easily produce a Quino model that gets them access to "the plethora of features available to Quino applications" (e.g.
ORM, schema-check and -migration, generated GUI for desktop or web, and so on). -------------------------------------------------------------------------------- [1] The LOC analyzer included in Visual Studio had slightly different numbers: * 297k LOC (source) * 198k LOC (docs) * 202k LOC (source/libs) * 99k LOC (executable) * 56k LOC (executable/libs) * 43k LOC (executable/tests) Quino has almost one line of testing code per line of library code (43k/56k ~ 77%). Quino has almost 4 lines of non-executable code per line of executable library code (202k/56k ~ 360%). The disparity between the two results (JetBrains DotCover and Microsoft Visual Studio) just goes to show what a fraught metric LOC really is. According to these two measurements, Quino has between 56k and 83k LOC of executable library code.

Quino v8.0.0: ASP.NET Core, Web Client 2, Culture/Language improvements

2020-02-22T17:43:38+01:00

Published by marco on 22. Feb 2020 17:43:38 ------------------------------------------------------------------------ The summary below describes major new features, items of note and breaking changes. * Resources * "Artifacts" (Note: the URL is a NuGet Source; you can't browse here directly) * "Documentation" Issues/Changelog * "Quino-Standard 8.0.0.2505" * "Quino-WebAPI 8.0.0.2511" * "Quino-Windows 8.0.0.69" The links above require a login. [Highlights] * Quino-Web now targets ASP.NET Core ("QNOWEB-149" , "QNOWEB-84" , "QNOWEB-55" ) * Quino-Web has Serilog enabled by default ("QNOWEB-147" , "QNOWEB-146" , "QNOWEB-145" , "QNOWEB-139" ) * Enabled SourceLink for all packages ("QNOWIN-262" ) * Improved testing and debugging support. ("QNO-6289" , "QNO-6282" , "QNO-6278" , "QNO-6277" , "QNO-6275" , "QNO-6255" , "QNO-6213" ) * Improved culture and language-handling ("QNO-6302" , "QNO-6303" , "QNO-6253" , "QNO-6230" , "QNO-6228" ) * Extended expression functions with CreateGuid(), CreateDate(), and CreateTime(). ("QNO-6304" , "QNO-6305" ) [Breaking Changes] Before upgrading, products should make sure that they do not depend on any obsolete members in the current version (7.x). [ASP.NET Core] Quino-Web 8.0 is a rewrite and is therefore mostly incompatible with 7.x. * The controller returns data in a completely different format * The Quino Client has been completely rewritten to accommodate it * The startup and pipeline have been completely rewritten to integrate with ASP.NET Core * Testing support has been considerably extended to accommodate end-to-end integration testing and in-process hosts See the Quino-Web/Sandbox.Web project for a working example. This integrates the standard SandboxApplication into a web site using the standard GenericController and MetadataController to provide data and UI to the generic Quino Client. 
[Namespace Changes]

Some internal types in Quino-Standard have been moved to more appropriate namespaces and assemblies, but the impact on products should be non-existent or very limited.

The following types were moved from Encodo.Quino.Core to Encodo.Quino.Culture:

* LanguageTextAttribute
* IValueParser
* CaptionAttribute
* LanguageDescriptionAttribute

The following types were moved from Encodo.Quino.Core to Encodo.Quino.TextFormatting:

* IFileSizeFormatter

[Culture- and Language-Handling]

Quino's default culture-handling has been overhauled. Instead of tracking its own language, Quino now uses the standard .NET CultureInfo.CurrentUICulture for the default language and CultureInfo.CurrentCulture for default formatting (e.g. times, dates, and currencies). Many fields have been marked as obsolete and are no longer used by Quino.

[Default Languages]

The default languages in Quino have changed from "en-US" and "de-CH" to "en" and "de", respectively. The reasoning behind this is that, while a _requested language_ should be as specific as possible, a _supported language_ should be as general as possible. The standard culture mechanisms and behavior (e.g. .NET Resources) "fall back" to a parent language when a more-specific language cannot be found. If an application claims to only support "en-US", then a request for "en-GB" fails. If the supported language is "en", then any request for a language in the "en" family (e.g. "en-US", "en-GB", "en-AU") will use "en". An application that supports "en-US" and "de-CH" has, therefore, a more limited palette of languages that it can support.

[Fallback-resolution]

Quino code runs in the context of a user, who has a list of preferred languages, in decreasing order of preference. This context can last the entire duration of an application (e.g. a standalone application like a console or desktop application) or last as long as a web request.
The application itself has a list of languages that it supports, as well as resources and metadata that define text in these languages. The resources are standard .NET Resources with the standard fallback mechanism (i.e. a request for "en-US" can be satisfied by "en"). The metadata uses DynamicString objects, which encapsulate a map from language codes (e.g. "en" or "de") to strings.

During application startup or at the beginning of a web request, the ILanguageResolver determines the language to use for a given set of requested languages. In ASP.NET Core, the requested languages come from the HTTP headers provided by the browser. In standalone applications, the IRequestedLanguageCalculator provides the requested languages. The ILanguageInitializer is responsible for coordinating this during application startup.

The rest of Quino uses the following singletons to work with languages.

* IDynamicStringFallbackCalculator: Comes into play when a request is made for a language that is not directly supported. For example, if the application supports "en" and "de", then a request for "en-US" will ask this singleton how to resolve the request.
* IDynamicStringFactory: Creates a dynamic string to describe a given object. The default implementation uses .NET Attributes.
* ILanguageResolver: Determines the culture to use from a list of available cultures and a list of requested/preferred cultures.
* IRequestedLanguageCalculator: Provides the sequence of languages from which to choose during initial resolution (web requests _do not_ use this).
* ILanguageInitializer: Integrates language-selection into the application startup.
* ICaptionCalculator: Extracts a single caption for a culture from a given object. Applications should use the IDynamicStringFactory in most cases, instead.

An application can control fallback by registering custom IDynamicStringFallbackCalculator and ILanguageResolver implementations (though this is almost certainly not necessary).
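The difference between general and specific supported languages is easy to see in a few lines of code. The following is a minimal sketch of parent-culture fallback built only on the standard .NET CultureInfo.Parent chain. It is not Quino's ILanguageResolver or IDynamicStringFallbackCalculator; the Resolve helper and its signature are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class LanguageFallbackSketch
{
    // Hypothetical helper, not part of Quino: returns the first supported
    // language that matches a requested language or one of its parents,
    // or null if nothing matches.
    public static string Resolve(IEnumerable<string> requested, IEnumerable<string> supported)
    {
        var supportedSet = new HashSet<string>(supported, StringComparer.OrdinalIgnoreCase);

        // Requested languages are tried in decreasing order of preference.
        foreach (var name in requested)
        {
            // Walk from the specific culture ("en-GB") up to its parents ("en").
            // The invariant culture at the top of the chain has an empty name.
            for (var culture = CultureInfo.GetCultureInfo(name);
                 culture.Name.Length > 0;
                 culture = culture.Parent)
            {
                if (supportedSet.Contains(culture.Name))
                {
                    return culture.Name;
                }
            }
        }

        return null;
    }

    public static void Main()
    {
        var general = new[] { "en", "de" };
        var specific = new[] { "en-US", "de-CH" };

        Console.WriteLine(Resolve(new[] { "en-GB" }, general) ?? "(none)");
        Console.WriteLine(Resolve(new[] { "en-GB" }, specific) ?? "(none)");
        Console.WriteLine(Resolve(new[] { "fr-FR", "de-AT" }, general) ?? "(none)");
    }
}
```

With the general list, the request for "en-GB" resolves to "en" and the request for "fr-FR", "de-AT" resolves to "de". With the specific list, the request for "en-GB" finds nothing, which is the failure case described under [Default Languages].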
[Opting in or out] Any product that calls AddEnglishAndGerman() will automatically be upgraded as well. A product can avoid this change by calling AddAmericanEnglishAndSwissGerman() instead. [Reports] A product that uses the new languages will have to replace all fields in reports targeted at "en-US" and "de-CH" to target "en" and "de" instead. [Database Fields] A product that does use the new default languages will have to determine how to migrate database fields created for languages that are no longer explicitly supported. If the model includes value-lists (enums) or multi-language properties , the application will have to migrate the database schema to update multi-language fields (e.g. "caption_en_us" => "caption_en"). [Manual MetaIds] A product that sets MetaIds manually will migrate without modification (Quino will rename the property in the database). [Automatic MetaIds] A product that does _not_ set MetaIds (this has been the default in Quino since version 2) will have a MetaID mismatch because the name has changed. By default, Quino will migrate by attempting to drop, then re-create multi-language properties. In the case of value-list captions, this is harmless (since the data stored in these tables are generated wholly from the metadata). For actual multi-language properties with user data in them, this is _a problem_. The simple solution is to call UseLegacyLanguageMappingFinalizerBuilder() during application configuration to ensure a smooth migration (Quino will rename the property in the database). [Regenerating Code] A product that updates its languages should regenerate code to update any generated language-specific properties. Properties that had previously been generated as, e.g. Caption_en_us will now be Caption_en.

When [NotNull] is null

2020-02-18T09:08:08+01:00

Published by marco on 18. Feb 2020 09:08:08 ------------------------------------------------------------------------
I prefer to be very explicit about nullability of references, wherever possible. Happily, most modern languages support non-nullable references natively (e.g. TypeScript, Swift, Rust, Kotlin). As of version 8, C# also supports non-nullable references, but we haven't migrated to using that enforcement yet. Instead, we've used the JetBrains nullability annotations for years. [1]

Recently, I ended up with code that returned a null even though R# was convinced that the value could never be null. The following code looks like it could never produce a null value, but somehow it does.

    [NotNull] // The R# checker will verify that the method does not return null
    public DynamicString GetCaption()
    {
        var result = GetDynamic() ?? GetString() ?? new DynamicString();

        return result;
    }

    [CanBeNull]
    private DynamicString GetDynamic() { ... }

    [CanBeNull]
    private string GetString() { ... }

So, here we have a method GetCaption() whose result should never be null. It calls two methods that may return null, but then ensures that its own result can never be null by creating a new object if neither of those methods produces a string. The nullability checker in ReSharper is understandably happy with this. At runtime, though, a call to GetCaption() was returning null. How can this be?

[The Culprit: An Implicit Operator]

There is a bit of code missing that explains everything. A DynamicString declares implicit operators that allow the compiler to convert objects of that type to and from a string.

    public class DynamicString
    {
        // ...Other stuff

        [CanBeNull]
        public static implicit operator string([CanBeNull] DynamicString dynamicString) => dynamicString?.Value;
    }

A DynamicString contains zero or more key/value pairs mapping a language code (e.g. "en") to a value. If the object has no translations, then it is equivalent to null when converted to a string.
Therefore, a null or empty DynamicString converts to null. If we look at the original call, the compiler does the following: 1. The call to GetDynamic() sets the type of the expression to DynamicString. 2. The compiler can only apply the ?? operator if both sides are of the same type; otherwise, the code is in error. 3. Since DynamicString can be coerced to string, the compiler decides on string for the type of the first coalesced expression. 4. The next coalesce operator (??) triggers the same logic, coercing the right half (DynamicString) to the type it has in common with the left half (string, from before). 5. Since the type of the expression must be string in the end, even if we fall back to the new DynamicString(), it is coerced to a string and thus, null. Essentially, what the compiler builds is: var result = (string)GetDynamic() ?? GetString() ?? (string)new DynamicString(); The R# nullability checker sees only that the final argument in the expression is a new expression and determines that the [NotNull] constraint has been satisfied. The compiler, on the other hand, executes the final cast to string, converting the empty DynamicString to null. [The Fix: Avoid Implicit DynamicString-to-string Conversion] To fix this issue, I avoided the ?? coalescing operator. Instead, I rewrote the code to return DynamicString wherever possible and to implicitly convert from string to DynamicString, where necessary (instead of in the other direction). public DynamicString GetCaption() { var d = GetDynamic(); if (d != null) { return d; } var s = GetString(); if (s != null) { return s; // Implicit conversion to DynamicString } return GetDefault(); } [Conclusion] The takeaway? Use features like implicit operators sparingly and only where absolutely necessary. A good rule of thumb is to define such operators only for structs which are values and can never be null. 
I think the convenience of being able to use a DynamicString as a string outweighs the drawbacks in this case, but YMMV. -------------------------------------------------------------------------------- [1] Java also has @NonNull and @Nullable annotations, although it's "unclear which standard you're supposed to use."
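The effect can be reproduced with a stripped-down stand-in for DynamicString. This is a sketch under stated assumptions, not the Quino class: the Value property, the two-argument constructor and the behavior of the string-to-DynamicString operator for null input are guesses made for illustration. One detail worth knowing when reading it: the C# specification defines ?? as right-associative, so the compiler types GetString() ?? new DynamicString() first (as string, because the right side converts implicitly to the left side's type) and only then combines that with GetDynamic(). The outcome is the same as described above: the freshly created object is converted to a null string before it can be returned.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the DynamicString described above (an assumption;
// the real class has more members).
public class DynamicString
{
    private readonly Dictionary<string, string> _values = new Dictionary<string, string>();

    public DynamicString() { }

    public DynamicString(string language, string text)
    {
        _values[language] = text;
    }

    // Assumption: an instance without translations has no value at all.
    public string Value => _values.TryGetValue("en", out var text) ? text : null;

    public static implicit operator string(DynamicString d) => d?.Value;

    // Assumption: the reverse conversion maps a null string to a null reference.
    public static implicit operator DynamicString(string text) =>
        text == null ? null : new DynamicString("en", text);
}

public static class CoalesceSketch
{
    private static DynamicString GetDynamic() => null;

    private static string GetString() => null;

    // The fallback object is converted to string (null) inside the ?? chain.
    public static DynamicString GetCaptionWithCoalesce() =>
        GetDynamic() ?? GetString() ?? new DynamicString();

    // The fallback object is returned as-is; only string -> DynamicString is used.
    public static DynamicString GetCaptionExplicit()
    {
        var d = GetDynamic();
        if (!(d is null)) { return d; }

        var s = GetString();
        if (s != null) { return s; }

        return new DynamicString();
    }

    public static void Main()
    {
        // "is null" tests the reference itself and never goes through the
        // implicit conversion to string.
        Console.WriteLine(GetCaptionWithCoalesce() is null); // True
        Console.WriteLine(GetCaptionExplicit() is null);     // False
    }
}
```

The checks use "is null" rather than "== null" on purpose: because DynamicString converts implicitly to string, a comparison with == can bind to string equality, which would make an empty DynamicString look like null as well.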

Configuring .NET Framework Assembly-binding Redirects

2020-01-30T22:30:05+01:00

Published by marco on 30. Jan 2020 22:30:05 Updated by marco on 30. Jan 2020 22:30:51 ------------------------------------------------------------------------ After years of getting incrementally better at fixing binding redirects, I've finally taken the time to document my methodology for figuring out what to put into app.config or web.config files. The method described below works: when you get an exception because the runtime gets an unexpected version of an assembly -- e.g. "The located assembly’s manifest definition does not match the assembly reference" -- this technique lets you formulate a binding-redirect that will fix it. You'll then move on to the next binding issue, until you've taken care of them all and your code runs again. [Automatic Binding Redirects] If you have an executable, you can usually get Visual Studio (or MSBuild) to regenerate your binding redirects for you. Just delete them all out of the app.config or web.config and Rebuild All. You should see a warning appear that you can double-click to generate binding redirects. If, however, this doesn't work, then you're on your own for discovering which version you actually have in your application. You need to know the version or you can't write the redirect. You can't just take any number: it has to match exactly. [Testing Assemblies] Where the automatic generation of binding redirects doesn't work is for unit-test assemblies. My most recent experience was when I upgraded Quino-Windows to use the latest Quino-Standard. The Quino-Windows test assemblies were suddenly no longer able to load the PostgreSql driver. The Quino.Data.PostgreSql assembly targets .NET Standard 2.0. The testing assemblies in Quino-Windows target .NET Framework. After the latest upgrade, many tests failed with the following error message: Could not load file or assembly 'System.Runtime.CompilerServices.Unsafe, Version=4.0.4.1, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. 
The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040) This is the version that it was looking for. It will either be the version required by the loading assembly (npgsql in this case) or the version already specified in the app.config (that is almost certainly out of date). [Which File Was Loaded?] To find out the file version that your application actually uses, you have to figure out which assembly .NET loaded. A good first place to look is in the output folder for your executable assembly (the testing assembly in this case). If, for whatever reason, you can't find the assembly in the output folder -- or it's not clear which file is being loaded -- you can tease the information out of the exception itself. 1. From the exception settings, make sure that the debugger will stop on a System.IO.FileLoadException 2. Debug your test 3. The debugger should break on the exception [image] Click "View Details" to show the QuickWatch window for the exception. There's a property called FusionLog that contains more information. [image] The log is quite detailed and shows you the configuration file that was used to calculate the redirect as well as the file that it loaded. [image] [Which Version Is It?] With the path to the assembly in hand, it's time to get the assembly version. Showing the file properties will most likely not show you the assembly version. For third-party assemblies (e.g. Quino), the file version is often the same as the assembly version (for pre-release versions, it's not). However, Microsoft loves to use a different file version than the assembly version. That means that you have to open the assembly in a tool that can dig that version out of the assembly manifest. The easiest way to get the version number is to use the free tool JetBrains DotPeek or use the AssemblyExplorer in JetBrains ReSharper or JetBrains Rider. 
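If no decompiler is at hand, the same numbers can be read with a few lines of code. The following is a minimal sketch that uses the standard AssemblyName.GetAssemblyName and FileVersionInfo.GetVersionInfo calls to print the assembly version and public-key token next to the file version; the Describe helper and its output format are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class AssemblyVersionSketch
{
    // Hypothetical helper: reads the identity from the assembly manifest
    // without loading the assembly into the application.
    public static string Describe(string path)
    {
        AssemblyName name = AssemblyName.GetAssemblyName(path);

        // The token is null or empty for assemblies that are not strong-named.
        byte[] token = name.GetPublicKeyToken() ?? Array.Empty<byte>();
        string tokenText = BitConverter.ToString(token).Replace("-", "").ToLowerInvariant();

        // The file version is what the file-properties dialog shows; it may
        // differ from the assembly version that a binding redirect needs.
        string fileVersion = FileVersionInfo.GetVersionInfo(path).FileVersion;

        return $"{name.Name}, Version={name.Version}, PublicKeyToken={tokenText} (file version: {fileVersion})";
    }

    public static void Main(string[] args)
    {
        // Falls back to the core library when no path is passed.
        string path = args.Length > 0 ? args[0] : typeof(object).Assembly.Location;
        Console.WriteLine(Describe(path));
    }
}
```

Passing the path from the FusionLog as the first argument prints the version and token that belong in the redirect.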
You can see the three assemblies that I had to track down in the following screenshot. [image] [Writing Binding Redirects] Armed with the actual versions and the public key-tokens, I was ready to create the app.config file for my testing assembly. [image] And here it is in text/code form:

Rebase Considered Essential

2020-01-02T10:41:06+01:00

Published by marco on 2. Jan 2020 10:41:06 ------------------------------------------------------------------------
Fossil is a distributed Source Control Manager that claims to offer the same power as Git without its complexity. The article "Fossil: Rebase Considered Harmful" by D. Richard Hipp is part of the documentation for the tool. One of the main selling points of Fossil is that it does not support rebase. In the article, the author lays out the many ways in which rebasing causes no end of woes for developers using Git.

I'd heard of Fossil before and I'd even skimmed this document before. This time around, though, I read it through to learn the author's reasoning.

My short take is: I do not want to use an SCM that does not allow rebase. [1] I think a project benefits greatly in clarity if a developer is able to alter the local history before cementing commits into an unalterable history (i.e. pushing to the server).

[Terminology and Concepts]

The following definitions are not complete, but are sufficient for the ensuing discussion.

* A repository describes the history of a set of data
* A commit includes instructions for how to change the state of the repository
* A branch points to a commit, but generally refers to a set of commits; a repository may contain multiple, independent branches
* A merge operation integrates two branches with a merge commit that describes the delta; it retains all commits from both branches
* A rebase operation integrates two branches by re-applying each commit since the branches diverged from one branch to the other, possibly changing the original commits; it replaces the commits from one of the branches with new commits

A rebase is considered a destructive operation because it discards part of the history of the repository by rewriting commits.
If I think about it, though, many of the operations I'm accustomed to making are destructive: * Editing the commit message * Amending a commit with more changes * Squashing commits * Re-ordering commits All of these operations are considered destructive because they modify the "true" history of the repository. But what do we mean by "true" history? Where does the story start? The changes outlined above are not for sharing. It's not interesting to the final reader that I had to backspace through and re-spell the word "outlined" in the previous sentence. It might be interesting to see different drafts, though, to see how I arrived at the final version. But those changes are at a different level of granularity. Who decides where one level of granularity stops and the next begins? I think it's the author of the commits. My workflow over the last ten years is based heavily on being able to massage commits so that I can prepare what I share to the server repository, where it can no longer be changed. I agree that there should be an unalterable history, but disagree with the author on where that history begins. [Hairball Commits] I agree with the author that developers should not work in silos, massaging their code until it is perfect, pushing only once there are no more errors and no-one could possibly take issue with anything in the feature. At this point, the author purports that many developers squash all of their local commits to a single so-called hairball commit that makes it look like the code sprung from the forehead of the developer as Eve sprung from Adam's rib: whole and without blemish. Hairball commits are acknowledged as bad, so attacking them as the prime reason to eliminate the tool that allows them seems to be more of a straw man. Preventing developers from making any changes to local commits is not the way to solve the problem, though. 
While Fossil does not allow discarding any single commit from the history, the author acknowledges that Fossil allows developers to apply addenda that the common Fossil tools will show while hiding the original commits. [2] I see the author's point -- that (potentially) important parts of a history are retained whether the developer wants them or not. That is, it is not up to the developer to decide, but up to the archeologist examining the commits later. This is an interesting idea, but the argument is ultimately not convincing. Let's suppose a developer uses an SCM without rebase. Either there will be many commits in the history that -- unlike the author claims -- do not provide any clarity because they are garbage commits (e.g. WIP and other sorts of investigatory commits that were quickly reverted or undone). Or, the developer will be terrified of making a commit before it's ready and runs the risk of losing work or working less efficiently. Developers will not magically become ego-less and kowtow to the machine. Instead, they will pick up bad habits that are worse than local rebasing. They will keep work uncommitted for too long or will fail to split up commits properly because they are afraid that they can't fix them up later. In either case, it's chaos in the commit history and the project efficiency and reliability suffers. But the author is arguing with a straw man that doesn't really exist outside of shitty developer teams with undisciplined developers. One can argue that these are the kind of developers that many projects have, but that can only be addressed with process. Weakening the tools so that disciplined developers are less efficient is a bad idea. You don't like hairball commits? Tell developers to stop making them. Enforce the policy with reviews. The Git documentation already encourages developers to make focused commits. Rebase allows a developer to split up commits during or after a code review. 
Rebase can actually be used to combat hairball commits. I have personally used it to split up commits that inadvertently mixed a bug fix or two into a large pile of refactoring changes. I've also often advised people to redesign their commits so that they tell a better story. [Citations and Responses] I've interspersed citations from the "document" linked above and included responses and thoughts. "[...] [some tools] accomplish things that cannot be done otherwise, or at least cannot be done easily. Rebase does not fall into that category, because it provides no new capabilities. (Emphasis added.)" As discussed above, I think that this is fundamentally wrong. My workflow is considerably different than it was before I used Git or had access to rebase. I would now be much less efficient if I didn't have rebase. It would make me constantly focus on cleaning up commits before I really care to. You could make the argument that cleaning up afterward takes more time, but I haven't experienced that to be the case. Instead, I want to be able to set the priorities rather than worry about committing something that I cannot undo. [Telling a good story] And it's not about ego or "looking stupid" to future readers of the history; instead, it's about having control of the story you tell to those same readers. If you don't have rebase, then you tell just as poor a story as if you use rebase badly. It's perhaps closer to the "true" story, but it's not the "best" story. Without rebase, you're forcing future archeologists of your code to read all drafts as well as the final version simultaneously. At Encodo, we don't focus on ego, we focus on efficiency. We do not obliterate commits that make sense just to squash a whole feature. We retain commits in order to tell a good story about how a feature was built. We do not emphasize being able to build each commit: often we'll add a failing test in one commit, then fix the bug in another commit, because that tells a better story. 
We need rebase in order to massage local commits so that they tell this good story rather than uploading dozens of commits that no-one should ever have to look at (typos, code comments, formatting, etc.). Often, we'll squash in little fixes and changes that come up during a review. Is the Fossil author suggesting that there is some benefit to seeing these in a separate commit? It would make understanding the commits at a later time that much harder. I think most of the author's concerns are addressed by using review and process to enforce better commits. Fossil can't make this happen because the developers have to create good commits in the first place or, at least, eventually. Rebase helps better developers clean up their own commits and also helps them help others clean up their commits, teaching them how to tell the story of their code. "A rebase is just a merge with historical references omitted" Exactly. If I can't eliminate WIP commits or squash local commits, then my workflow changes. Honestly what's the point of keeping each commit? Many are scribbles, unwanted drafts. They're not part of a history anyone would retell. Once commits are cleaned up and tell a good story, there is no need to keep the old commits around. At that point, you're wasting the future archeologist's time. "Surely a better approach is to record the complete ancestry of every check-in but then fix the tool to show a "clean" history in those instances where a simplified display is desirable and edifying, but retain the option to show the real, complete, messy history for cases where detail and accuracy are more important." This feature is an interesting one for commits that can no longer be changed (i.e. have already been pushed), but why make the developer mark every accident and mistake instead of just letting him undo them? The "full" view would be of marginal to no value. Even once the messy commits were deciphered, they would most likely yield no useful information. 
What possible benefit is it to keep a jungle of "fix typo" and "add missing file" or "fix broken test" commits just because the developer made a commit before running tests or seeing a warning in the IDE? [3] [Command Line vs. UI] "So, another way of thinking about rebase is that it is a kind of merge that intentionally forgets some details in order to not overwhelm the weak history display mechanisms available in Git." I honestly think that this guy just wants to make Git look stupid and Fossil look spectacular. I understand fully that it's silly to argue that Git doesn't need a feature that Fossil has just because I've personally never needed it. A good feature is something that becomes essential once you have it, but you never knew you were missing it or were less efficient without it. Fossil's ability to easily see which changes were made to a file after a given commit sounds like it might be that kind of feature. However, rebase in Git is such a feature, so if Fossil takes that away, it's a deal-breaker. At this point, I think also that the author is considering Git as a command-line application rather than extended by a truly powerful UI like SmartGit, which provides fast access to gobs of historical data with little effort. [When does Siloed Development begin?] "Or, to put it another way, you are doing siloed development. You are not sharing your intermediate work with collaborators. This is not good for product quality." What has this guy seen in the wild that he's reacting this way? Who hurt this poor man? How often does he expect us all to commit and push to the server? Should we code directly on the server? Where does he draw the line for "siloed" work? A day? An hour? More to the point: who is paying developers (or a project lead) to examine unvetted commits? Do you think we're made out of free time? Keeping everything around forever is not the most efficient way of optimizing information about your code. It's a hoarder mentality. 
I understand the sentiment: you want to avoid people massaging commits into oblivion, eliminating important information. But, honestly, I've seen the opposite problem: commits pushed to the server in the shabbiest form, thereafter unalterable. [4] "Many developers are drawn to private branches out of sense of ego. "I want to get the code right before I publish it."" No, that is not my requirement. I want an efficient review that pinpoints (and fixes) errors quickly so no-one wastes time. [Online Repository Tools] The author claims that, "Rebase adds new check-ins to the blockchain without giving the operator an opportunity to test and verify those check-ins. Just because the underlying three-way merge had no conflict does not mean that the resulting code actually works. Thus, rebase runs the very real risk of adding non-functional check-ins to the permanent record." This is true only for the special case of online merges. These should be avoided like the plague, in any case. I know that people really, really trust their tools. I know that they think that merges are infallible, that their CI builds their software and runs their tests and gives their pull request a green flag and a thumbs-up. But anything other than a trivial pull request should be examined with tools more capable than online repository managers. Not only are they not as good, they are wildly inefficient when compared to a good desktop tool. I know this next generation of developers want to do everything on their phones, but this is ridiculous. The screen is too small and the tools are too limited. Get a machine with usable screen real estate and learn what being efficient really means. Not only will you be quicker, you'll be better: your error rate will decrease and you'll see connections in the commits much better than with the (comparatively) meager online tools. 
I've written before about one such UI, SmartGit, in "Git: Managing local commits and branches" and "Using Git efficiently: SmartGit + BeyondCompare" . Other online tools have similar weaknesses versus their desktop brethren: for example, text editors like Word or Google Docs. It's definitely a killer feature that they're online, but their only selling point is that they're attached to an online document storage. That's the selling point. As amazing as it is that these tools run in a browser, they are pathetic compared to tools from thirty years ago. My God, I fondly remember WriteNow 4.0 for Mac OS 6 and 7, which handled a 250-page document with aplomb, complete with figures, tables, TOC, numbering, custom styles, ... all of those things that an editor should do. Somehow, just because it's in the cloud means that we should be happy with WordPad instead of a full-fledged editor. It's a joke. [Where does lying begin?] The author claims that, "Rebasing is the same as lying By discarding parentage information, rebase attempts to deceive the reader about how the code actually came together." Then you should include all command/undo buffers from your IDE, too. At this point in the document, the author is just repeating the same argument over and over, reformulated but not different. "Unless your project is a work of fiction, it is not a "story" but a "history." Honorable writers adjust their narrative to fit history. Rebase adjusts history to fit the narrative." That's not even how human history works. It's not even how your own stories about your own life work. This is the kind of mentality that wants to keep all 6000 pictures from a vacation. Why? Just in case you need that picture of the ground that you took by accident? Because you need all 300 pictures of the Matterhorn? You're wasting your readers' time and your own. 
"The intent is that development appear as though every feature were created in a single step: no multi-step evolution, no back-tracking, no false starts, no mistakes." Again, he proposes to fix a problem -- poorly built commits -- by not allowing anyone to modify commits. "We believe it is easier to understand a line of code from the 10-line check-in it was a part of — and then to understand the surrounding check-ins as necessary — than it is to understand a 500-line check-in that collapses a whole branch’s worth of changes down to a single finished feature." I agree with this 100%. As already noted above, though, the review should disallow such foolish hairball commits. "The more comments you have from a given developer on a given body of code, the more concise documentation you have of that developer’s thought process." Correct. But you don't want to see everything. He presents a false choice between all the history and an improperly truncated version. Then he says he'd rather have all of it, and wants to get rid of history-rewriting. This doesn't fix the problem of shitty programmers making shitty commits. The only way to fix that is gatekeeping reviews and process. Taking a vital tool for clarity (rebasing) away from disciplined programmers is a terrible idea. "If we rebase each feature branch down into the development branch as a single check-in, pushing only the rebase check-in up to the parent repo, only that fix’s developer has the information locally to perform the cherry-pick of the fix onto the stable branch." He really seems to be attacking a repo-management/history-editing process I've never used. It sounds horrid. "Rebasing is an anti-pattern. It is dishonest. It deliberately omits historical information. It causes problems for collaboration. And it has no offsetting benefits." Only one of those sentences is true.
--------------------------------------------------------------------------------
[1] Before I used Git, I used Perforce, which allowed a developer to keep changelists around without committing them. Once a changelist had been submitted to the server, it was frozen in the history. Until then, though, the developer could alter files and the commit message.
[2] Or so it sounded -- I have not actually used Fossil.
[3] If you're a SmartGit user, then you can see the trail of commits you're leaving behind by selecting "Recyclable Commits" in the Log View. This shows all commits that will be reclaimed by the garbage collector the next time it runs.
[4] As noted above, Fossil seems to offer a feature with which you can "amend" commits and messages afterwards. I, however, want to be able to prevent bad commits from entering the stream in the first place.

Advanced CSS (blend modes and subgrids)

2019-12-28T23:23:06+01:00

Published by marco on 28. Dec 2019 23:23:06 Updated by marco on 28. Dec 2019 23:23:47 ------------------------------------------------------------------------ The article "Z’s Still Not Dead Baby, Z’s Still Not Dead" by Andy Clarke is well-written, very interesting and taught me a few new CSS tricks of which I was unaware. Granted, my work usually doesn't call for fancy effects like those you can achieve with something like background-blend-mode, but it can happen. There's not only background-blend-mode, there's also mix-blend-mode and filter, all of which apply high-quality effects dynamically. In the late spring, I had a two-month project where I had to use a lot of transformations and animations -- and I was able to get it all done with CSS. Once you know about these kinds of techniques, you keep them in mind, and are able to consider solutions that would seem impossible (or very difficult/time-consuming/unmaintainable) if you didn't know the technique. A modern browser can construct the following image by composing and blending a couple of graphics. [image] It's actually pretty cool that you can get this type of layout with wide browser support and no hacks. See the linked article for a lot of examples. I have used CSS Grid before (as the author does). The author mentions subgrids, but ends up using a second grid within the first grid because browser support for nested grids is good, whereas no-one "supports subgrids except for the latest version of Firefox" . The MDN documentation for "Subgrids" explains that it differs from nested grids in that "If you set the value subgrid on grid-template-columns, grid-template-rows or both, instead of creating a new track listing the nested grid uses the tracks defined on the parent." The linked page includes many examples and more detail. As with any advanced techniques, you have to take into account your own target browsers to see whether you can use them in your own projects. 
It's a well-written article and I learned a few more techniques that I can hopefully use at some point.

Framework Design: Programming Paradigms and Idioms

2019-11-30T15:36:51+01:00

Published by marco on 30. Nov 2019 15:36:51 Updated by marco on 4. Oct 2023 21:28:45 ------------------------------------------------------------------------ The discussion "React in concurrent mode: 2000 state-connected comps re-rendered at 60FPS" is illuminating mostly in that it shows how ego can impede productivity. [Ego-driven design] Ego can also be that thing that drives a talented programmer to create something of use to the rest of us, but that's honestly a very rare case. More often than not, the best case is that a developer improves their skills -- and perhaps learns to be more humble instead of shooting off their mouth about how "easy" it is to create a "good" product. Such claims are nearly always made without defining what they mean by "good". Some comments are from programmers more interested in a pissing contest of who can write performant code on their own. Their implementation often focuses laser-like on a specific use case not often found in nature without tackling the tough question of how to design a more generalized solution that incorporates and balances more than just the one aspect of the system that they think they're good at (e.g. performance). That is, they tend to carefully define the application domain based on what they're already good at. This is not how product development works. Many of the commentators get distracted by the overreaching claims of the reposter (faster than any other WebGL rendering, which is patently not true) instead of reading the much more reasonable claims of Dan Abramov, who is the original poster. [Product-driven design] Thankfully, there are others who seem to understand that giving up a logical, declarative paradigm in order to gain raw performance is not an acceptable tradeoff in almost any given project. What are some facets other than performance that contribute to a good solution?
* Maintainability
* Extendability
* Readability: low syntax noise
* Testability: composition-based
* Discoverability: consistent API
* Flexibility: applicability to different problem domains
* Expressiveness: concise and precise
* Correctness: error-free is strongly encouraged or enforced
* Completeness: degree to which the definition captures all facets of the problem domain
* Learnability: low number of concepts cover application needs
* Scalability: layered API allows more control where needed
* Efficiency: low ratio of effort to product
* High signal-to-noise ratio
* Optimizability/Performance
Products that try optimizing all facets generally never see the light of day or serve as the base material from which more viable projects are born. A higher level of abstraction is a good thing. It allows mediocre programmers (and be happy if you have even mediocre programmers) to write programs that aren't a nightmare to maintain or refactor. It allows good developers to very quickly write maintainable programs. If the underlying framework has a declarative and easily understood paradigm that has only a handful of orthogonal concepts and it offers great performance by default, that's a win. There are few projects that need spectacular performance as their main feature. [Focus on your application domain] I would argue that most web programming is about making line-of-business apps and pages where look and feel does not matter so much that it's worth investing 50% more budget to get near-perfect and smooth updates. If it janks, it janks. There is no time or budget (or, sometimes, programming skill) to "fix" it. And, if "fixing" it means abandoning the high-level declarative programming model that makes working with Reactive so efficient, maintainable and productive, then that's even more implicit cost bound up in it. As the commentator Onion2k put it: "This is a demo of good performance using a web framework on top of a WebGL framework.
It's showing that a future version of React will make building a solid 60fps web app UI [...] within the reach of most web developers. Sure, you can hand-roll code to get that performance today if you know how, but this is about putting that performance in the hands of developers who can't (or, more often, aren't given the resources to). To argue that is unnecessary or actually bad is ridiculous. Libraries that make it easier to build better apps are universally good things. (Emphasis added.)" [Defining an application model] To use React, you have to make concessions to Reactive mode in your application definition. But that's the way programming works. Instead of writing "a person must have a company, while the company has a possibly empty list of people", we write (example from "Quino" ), Elements.Module.Company .AddOneToManyRelation(Elements.Module.Person) Programming is all about explaining what an application does. The programming language and framework and runtime balance all of the factors listed above to be able to transform the formulation most accessible to a product owner ("I want a CRM") through a business analyst ("It has a list of companies, each of which has a list of people") to a programmer (formulation above). The formulation above is still quite high-level, but satisfactory for 99% of cases. For the remaining 1%, the API has to provide some way of digging into the underpinnings of the implementation without dropping the developer off of a cliff. Quino does this reasonably well, as does React. The focus here is on realizing that a framework's ability to accommodate that 1% of use cases smoothly is only one aspect of its effectiveness. Given that it doesn't come up very much, it makes no sense to focus too much effort on optimizing that path, no matter how much more interesting it would be to the developers to do so. 
[Concurrent mode in React] This is one of those silly blogs-posted-as-tweets, but the points in "Is Concurrent Mode just a workaround for “virtual DOM diffing” overhead? [...]" by Dan Abramov are good. The point is that "Concurrent Mode" is not a speed improvement only for React. It also improves how your app's code updates and is scheduled without you having to change your code (much, or at all). The linked article explains how this sea change in rendering components forms the basis of many other performance improvements that apply to existing applications without modification. It's exciting that a near-future version of React will make animations and updates even smoother than they are now. This taking into consideration that they are already more than good enough for most apps without tweaking. [React's idioms] React is not a game-programming framework. It makes no sense to claim that React apps will blow away apps written in Unity. We make line-of-business apps with it. React already allows apps to have much better update characteristics with almost no code other than a few functional declarations to define rendering and components and the state that they rely on. The model is unimpeachable in that it accurately reflects the application model without adding any ceremony. You make some concessions in order to define your declarations about your program's logic and states so that the framework can optimize as much as it can, but no more. With hooks, you can declare simple, mutable state or one-time, partially mutable state (memos and callbacks), listeners for lifecycle events (effects) and so on. On the one hand, you're forced to define your logic using React's idioms but, on the other, they still make sense in that they make your assumptions about your app's logic explicit rather than implicit. Once you've done this, the framework knows more about what it can optimize away and what it can't. 
And you haven't wasted time because you're technically describing salient properties of your application domain. [Declaring Behavior] That's the idea behind the <Suspense/> component: the app can declaratively determine how it would like components to be updated in different asynchronous situations involving multiple asynchronous tasks. Concurrent Mode allows the framework to work before that update is technically complete because it allows any work to be interrupted -- and discarded, if it is no longer relevant. This allows the reconciliation to benefit a bit from something like the branch predictor in a CPU, where speculative branches are executed in parallel and occasionally discarded. JavaScript imposes a cooperative rather than parallel model, but low-level support for interruptibility (especially when automatically applied) is worlds better than nothing. [Working within a paradigm] Any language -- and the combination of the underlying programming language and the framework API is the language a programmer uses -- must have a shape, a paradigm that it enforces. Naturally, a programmer can use a different paradigm than the recommended one. But a good framework finds the balance between a paradigm that is comfortable for a large part of its audience and one that is enough of an abstraction that it has a lot of leeway for applying to the next layers down (until it gets to machine code). A good framework provides an out-of-the-box experience that provides a clearer programming idiom and better performance than most programmers could do on their own. In the thread above, Abramov in no way claims that it's not possible to create a faster application for thousands of components, just that the new renderer is much, much faster than the old one without changing the programming idiom at all. The programming idiom in React is very good, if not great. This is really good news.
Instead, you could say that Abramov's claim is that anyone who claims to have made a faster renderer is making tradeoffs in other areas (e.g. from the list above). Most likely, the resulting balance is not as good as the clear, declarative syntax of React or it doesn't cover nearly as many use cases. [Comparing Idioms (React vs. Svelte)] Is React's syntax the best it can be? Maybe not yet. For example, a component declares mutable, internal state with the useState() hook, which returns a state variable and a "setter" function to change that state. Svelte, for example, improves on this by allowing the app to just declare the state variable and automatically noticing when that state is updated and generating the state-update code in the transpilation phase. This is an improvement that allows an app developer to work even closer to "normal" code than before. If Svelte can provide this clearly more readable feature without introducing problems in other facets (e.g. learnability, performance, completeness), then it's a clear win. [Case study: async/await] A similar kind of improvement is async/await. This feature didn't actually change how asynchronous code works. Instead, it allowed a programmer to write synchronous-looking code that could be made asynchronous automatically. This is a sea change for most developers -- even those clever and experienced enough to have written that level of asynchronous code themselves. The point is that the developer is no longer wasting time writing what amounts to boilerplate code that is very error-prone and difficult to thoroughly test (which means that it's often not thoroughly tested). The idiom of async/await imposed minimal "noise" (none, actually) and has a tremendous upside. The code doesn't necessarily get faster, but it could be made faster without changing it.
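As a rough illustration of the async/await point, the following TypeScript sketch writes the same two-step task twice: once with hand-written promise chaining and once with async/await. The functions fetchUser and fetchOrderCount are hypothetical stand-ins invented for this example; they are not part of any real API.

```typescript
// Hypothetical stand-ins for asynchronous operations (not a real API).
function fetchUser(id: number): Promise<{ id: number; name: string }> {
  return Promise.resolve({ id, name: `user-${id}` });
}

function fetchOrderCount(userName: string): Promise<number> {
  // Assumption for the sketch: the "order count" is just the name length.
  return Promise.resolve(userName.length);
}

// Hand-written continuation style: control flow is spread across callbacks.
function summaryWithThen(id: number): Promise<string> {
  return fetchUser(id).then((user) =>
    fetchOrderCount(user.name).then((count) => `${user.name}: ${count}`)
  );
}

// async/await: the same asynchronous behavior, written like sequential code.
async function summaryWithAwait(id: number): Promise<string> {
  const user = await fetchUser(id);
  const count = await fetchOrderCount(user.name);
  return `${user.name}: ${count}`;
}
```

Both functions return a promise for the same string; only the second reads top to bottom like ordinary synchronous code.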
[Inherent Limitations of Languages/Runtimes] The comment on "Fiber Principles: Contributing To Fiber" by sebmarkbage is another well-written contribution to this discussion that shows that there are a lot of clever people working on React that are aware of the fine balance between the requirements involved in writing a strong framework. The user responds to accusations that much of this work would not be necessary if JavaScript had proper threading. The author argues that globally mutable prototypes are an intrinsic concept that is used in many, many JavaScript use cases. However, they also limit the ability of ever bringing threads to JavaScript. The language is limited from the get-go. That doesn't mean we should all stop using JavaScript. It just means that this is something that goes in the cons list and must be weighed against all of the pros. Anything that is in the cons list must be compensated with effort. JavaScript has many pros going for it: for example, it's won the client-side programming-language war. Perhaps WebAssembly will replace it as a runtime, but only time will tell. By then, it won't matter, because we'll be using languages like Elm or TypeScript to write our code. Even this doesn't matter, though, because these languages must also transpile to the underlying paradigm defined by an engine that must run JavaScript. That goes -- for now -- for WebAssembly targets as well. And threading is out for any of this stuff. Until something in this situation changes and we can target a threaded execution engine on the client side, we should be happy that there are very clever people making cooperative multi-tasking transparent and easy to program for the rest of us. Those of us who worked on Apple OSs before OS X or Windows before 95 know what it's like to have to deal with cooperative multi-tasking in our own code. 
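For readers who never had to do this by hand, here is a minimal TypeScript sketch of cooperative multi-tasking, where every task must voluntarily give up control. The names countTo and runAll are hypothetical and exist only for this illustration; this is not how React's scheduler is implemented.

```typescript
// Each task is a generator that must voluntarily `yield` so others can run.
type Task = Generator<void, void, void>;

function* countTo(label: string, n: number, trace: string[]): Task {
  for (let i = 1; i <= n; i++) {
    trace.push(`${label}${i}`);
    yield; // hand control back to the scheduler
  }
}

// Round-robin scheduler: run one step of each task in turn until all finish.
function runAll(tasks: Task[]): void {
  const queue = [...tasks];
  while (queue.length > 0) {
    const task = queue.shift()!;
    if (!task.next().done) {
      queue.push(task); // not finished yet: back of the line
    }
  }
}

const trace: string[] = [];
runAll([countTo("a", 2, trace), countTo("b", 3, trace)]);
// trace is now ["a1", "b1", "a2", "b2", "b3"]
```

A task that forgets to yield starves every other task, which is exactly the bookkeeping that a framework-level scheduler takes off the application's hands.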
I welcome the declarative paradigm that allows excellent performance for a wide range of use cases without making me write and maintain a whole bunch of code that has nothing to do with my application domain. [Avoiding shared, mutable state] There's a reason why everyone with sense is talking about this concept. Using shared, mutable state makes it very easy to write the happy path of a single use case, but it makes it very hard to reason about other use cases and branches. It doesn't scale, extend, test or maintain well. If these requirements don't apply to your application -- e.g. a script or one-off throwaway prototype -- then you might be fine. I would personally advise against practicing or becoming accustomed to techniques that apply to one use-case but that are dangerous in all other situations. You'll generally end up using the technique to which you've become accustomed. While training yourself to build high-quality solutions risks the danger of over-engineering solutions to problems that could have been solved more simply, it's easier to "downscale" your coding style than to "upscale" it. With enough practice and the right techniques, you can write quality code just as initially efficiently as crappy code. I would also say to beware of the seductiveness of bad programming models that promise an initial speed in development that quickly drops off once it's too late to change. Prototypes happen to be built into the language in JavaScript's case, but shared mutable data is the great stumbling block of concurrent programming. Applications that batch work into parallelizable chunks can be optimized to run more quickly by a clever runtime. It is much simpler to reason about an application without shared mutable data. There are fewer cases and branches. Otherwise, an application must use locks (or fences or some sort of synchronization concept). 
The point is that efficient synchronization is not easy and many laic implementations tend toward speed rather than robustness and are buggy as a result. Though it's possible to hand-code faster concurrency than standard frameworks, most people can't do it. And, given time, framework implementations get really, really good at optimizing nearly all cases. C# and .Net, for example, have a tremendously clever runtime underlying async/await now that can hardly be beaten for throughput, scheduling, etc. Successive versions have built on new language concepts introduced precisely to allow an application -- where needed -- to be more declarative in ways that allow even more optimization (e.g. record references, etc.) It's nice to see that Concurrent React -- much like async/await in JS -- provides a simple idiom for moving that effort out of the hands of most developers. [Reinventing the Wheel] Naturally, a developer is free to do that work on their own -- and many commentators in the original thread at the start of this article seem to enjoy writing code that has nothing to do with their actual app just to show that they can. But with enhancements like async/await or Concurrent React, they don't have to in order to enjoy performance benefits. That's a win-win -- a free lunch. The point made above by Onion2k is very salient: very often "developers [aren't] given the resources to" make the kind of optimizations that React will provide for free. Could a given rockstar developer write something even faster for exactly their application domain? Probably. Are they going to be given the time and budget to do so? Almost certainly not. It's far better to have a good default that is smooth as silk and more than adequate to the task for almost all conceivable applications. No-one's paying you to reinvent the wheel. That's almost certainly not your job. If you'd like it to be your job, then maybe you should work on a project where you're inventing the wheel directly (i.e. 
a framework project). Then, you can build on that experience and your framework to turn around tightly written, maintainable and performant applications for your paying customers. [Inventing the Wheel] It's important to be pragmatic and remember when you're working on framework code and when you're working on code that benefits from framework code without reinventing it. Otherwise, you've got a terrible situation: you invest in framework/infrastructure on every single project because you never reap the benefits of having written a framework. In the case of frameworks that are completely external to your application, like React (or Quino), you never even had to invest in writing the framework at all. If you write a framework for just expert developers, there will be no adoption and you don't help a large part of the community to write better apps. But what do we mean by better? Continuing with React as an example, the abstract requirements at the start of this article roughly map to:
* Maintainable (minimum of idioms)
* Robust (not buggy)
* Scaling performance
* Consistent UX for
  * Error conditions
  * Networking latency (progressive handling, from avoiding needless jank to notifying the user that something is happening when it takes longer)
  * Small/large data/updates
* Tooling support for introspection, code-completion, debugging, discovery, etc.
* Support for common tasks
  * State management, from function-local to multi-function/context
  * Describing state (mutable, non-mutable)
[Minimizing Impedance] An application should only have to declare things about itself that are relevant to itself -- but that also help to render the application better. Again, these idioms should scale: an application which will not have foreseeable performance issues in most components should be able to write those components with more approachable code. Individual "islands" of code can provide additional information to optimize hotspots (like memoization, immutability hints, etc.).
It's important to note that these concepts are not introduced by the framework -- they are intrinsic to the application's domain model, but usually kept implicit. If the application does not describe these aspects of itself, then the framework must make more pessimistic assumptions. Often this doesn't matter. Where it does matter, the application should be able to use compatible and familiar idioms to improve the granularity of its description about itself. This, in turn, lets the framework use a faster approach where it now knows that it won't violate the application's definition. The simplest of these is to tell React which parts of the state are mutable and which are immutable. When determining what has changed in an application state, a framework can simply compare the reference to the root node of an immutable object graph to the previous root-node reference to determine if that part of the graph has changed. If the object graph does not declare itself as immutable, then the framework must be pessimistic and compare the entire subgraph to determine if it has changed. This is a concept that is intrinsic to programming. It is hard to conceive of it ever not being relevant. Naturally, if there is more than enough processing power available or the graph is small enough, it won't matter, but it's still axiomatically more work to compare potentially mutable graphs than immutable graphs. If an application fails to express immutability where it could have, that small missing bit of information reduces flexibility in choosing an algorithm. This is not a new thing: most functional languages have immutability baked in as the default. Even C has the notion of const and volatile to give hints to the compiler about how it can deal with that data. Naturally, higher-level languages try to abstract away these concepts, but it constrains all the layers below. [Other Intrinsic Concepts] On this subject, another unavoidable concept is nullability: is a reference assigned or not? 
Most new languages (and newer versions of languages, like C#) are switching from the age-old -- and convenient-for-the-compiler -- default of nullable references. Again, reference assignment is a core concept that is unavoidable when thinking about code with pointers. Another concept that limits choosing a more performant transformation during compilation is failing to express function purity. Does a function cause a side-effect? A compiler can optimize a function known to be pure in ways that it cannot with impure functions. All of these features are a balance between programmer convenience, onboarding of new developers, and allowing programmers to focus on application logic rather than making concessions to the language and framework. As discussed above, though, there are concepts intrinsic to programming that have ostensibly nothing to do with application logic, but that an application declares (if not explicitly, then implicitly). Taking the example from above, if an application declares that a person is in a company, but fails to mention that a person must be in a company, then the underlying software (framework and compiler) must be more pessimistic about that relationship than is strictly necessary. A good framework encourages software to be precise about its own model by allowing the application to declare the salient parts of its model in a declarative minimal set of idioms.
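The immutability point made above can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions, not React's actual reconciliation code: `hasChangedImmutable` and `hasChangedMutable` are hypothetical helper names, and the deep comparison is deliberately simplified to primitives, plain objects and arrays.

```javascript
// Hypothetical helpers for illustration only; this is not React's reconciliation code.

// With a graph declared immutable, any change produces a new root object,
// so comparing the two root references is enough.
function hasChangedImmutable(previousRoot, nextRoot) {
  return previousRoot !== nextRoot;
}

// Without that guarantee, every node has to be visited and compared.
// Simplified: handles only primitives, plain objects and arrays.
function hasChangedMutable(previous, next) {
  const isObject = (value) => value !== null && typeof value === "object";
  if (!isObject(previous) || !isObject(next)) {
    return !Object.is(previous, next);
  }
  const previousKeys = Object.keys(previous);
  const nextKeys = Object.keys(next);
  if (previousKeys.length !== nextKeys.length) {
    return true;
  }
  return previousKeys.some(
    (key) => !(key in next) || hasChangedMutable(previous[key], next[key])
  );
}

// Immutable case: one reference comparison per check.
const previousState = Object.freeze({ person: Object.freeze({ name: "Ada" }) });
const nextState = Object.freeze({ person: Object.freeze({ name: "Grace" }) });
console.log(hasChangedImmutable(previousState, previousState)); // false
console.log(hasChangedImmutable(previousState, nextState)); // true

// Mutable case: the answer is only known after walking the graph.
const snapshot = { person: { name: "Ada" } };
const current = { person: { name: "Ada" } };
console.log(hasChangedMutable(snapshot, current)); // false
current.person.name = "Grace";
console.log(hasChangedMutable(snapshot, current)); // true
```

The first check costs the same no matter how large the graph is; the second grows with the number of nodes, which is the extra work a framework has to do when the application does not say that its state is immutable.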

Thoughts on Atomic/Utility CSS

2019-11-24T20:55:24+01:00

Published by root on 24. Nov 2019 20:55:24 ------------------------------------------------------------------------ The article "In Defense of Utility-First CSS" by Sarah Dayan is very long [1], so I've summarized a bit with notes and thoughts. [2] I don't really care about being pedantic without first knowing some facts. What are the requirements?

[Requirements]

* Can I quickly make a precise change to a single component?
* Can I make a global, thematic change?
* How quickly can I make changes?
* How maintainable is the result?

If atomic/utility CSS can deliver these things, then it's probably a fine tool. But -- spoiler alert -- it seems more like a tool for designers -- not programmers. Programmers have other, better tools for building CSS in a way that fulfills the requirements above.

[Designers vs. Programmers] Essentially, these designers are like we programmers used to be: we used to care about cascading when we were still hand-coding our CSS. Now that we're using LESS or another generator, we can use variables and functions for theming and use local CSS for precision. We can lean on specificity when it suits us and avoid it when it only gets in the way. We want to declaratively say how we want everything to look and let our tools (LESS, WebPack with plugins) figure out how best to generate the CSS to accommodate supported browsers and also to create the kind of CSS that performs well without blowing up memory client-side. None of these optimizations and accommodations for targets should be up to the programmer/designer/CSS-writer at this point.

[Utility vs. BEM] Utility-CSS feels functional, but it also feels like something you use when you don't have LESS. I've never used BEM and agree that it never really made sense from the standpoint of several good coding practices, like DRY. That the author is coming from BEM to utility-CSS is not a surprise: BEM was never a good idea.

[A Refactoring Use Case] "Early refactors are a pretty good indicator of unmaintainability."
I don't agree. It's more a sign of shifting priorities or requirements. It's not uncommon in agile development. The example the author has of changing the meaning of a "card" after there are already components using that style just means that you should make a "card2" class (not a "card-no-ribbon" one) because it's just a different card type. The problem is that the design now includes two cards, not that your implementation should somehow be able to easily roll with a confusing design. Where I see a problem is when a card is supposed to have a certain padding and a border with a certain color (let's say the "padding-top-8" and "border-bottom-lemon" from the author's example). But then you don't want those anymore. Granted, with proper components, you'll only have to change the style in one place anyway, right? So it doesn't matter what you call it. You could have just called it "card" in the local styles and been done with it. So, either you have to remove those highly specific styles in many places in your HTML (as with an old-style web site, like earthli) or you change it in one place anyway (new-fangled, with React components). [Visuals vs. Semantics] I guess it's the difference between knowing from the HTML what the component is going to look like (

) and knowing what the component is (
). The author writes: "Yet, the bigger and the more complex a component gets, the less obvious it is to know what class name maps to what element on the screen, or what it looks like." But then they include an example where it's absolutely clear which components do what:

The Shining

His breath stopped in a gasp...

Stephen King

Stephen Edwin King ...

Website Twitter

I think this again shows the difference between programmers and designers: the code above is crystal clear to a programmer, so if a programmer is writing the CSS, then there's no need to change anything. The author seems to be a designer hell-bent on knowing exactly what the page will look like without actually showing it in a browser. I wish they'd included the version with utility CSS ... it would have been a giant block of unreadable code, doubled in size with class names. [Don't Change Anything] The author makes a good case for theming using CSS variables, which can be applied "at runtime" in the browser. The solution to theming with utility CSS turns out to be ... making semantic styles instead of precisely named styles. So...not utility CSS. The author references a few other articles, one of which is "Kiss My Classname" by Jeffrey Zeldman , which eloquently argues that there is nothing to change. He instead argues that developers and designers should use a visual style guide. "I don’t believe the problem is the principle of semantic markup or the cascade in CSS. I believe the problem is a dozen people working on something without talking to each other. "Slapping a visually named class on every item in your markup may indeed make your HTML easier to understand for a future developer who takes over without talking to you, especially if you don’t document your work and create a style guide. But making things easier for yourself and other developers is not your job. And if you want to make things easier for yourself and other developers, talk to them, and create a style guide or pattern library." "The present is always compromised, always rushed. We muddle through with half the information we need, praised for our speed and faulted when we stop to contemplate or even breathe. 
(Emphasis added.)" -------------------------------------------------------------------------------- [1] It's also almost two years old, but still seems to describe the state-of-the-nation in utility/atomic CSS. [2] Another article they referenced was "CSS Utility Classes and "Separation of Concerns"" by Adam Wathan and it's even longer. It's almost a jeremiad with the seeming intent of breaking the reader down with a flood of words. I could only skim it, but it seems like these people are styling without programming: that is, some of the utility classes and even the slightly semantic ones they use could very easily be written more cleanly if they just used component-local styles. For example, this is completely unnecessary with local styles, because you don't have to worry about specificity biting you in the ass:

Stubbing ...

In this quick blog post and screencast, ...

In another article, "On the Growing Popularity of Atomic CSS" by Ollie Williams, the author mentions that they're addressing "a mixed-ability team, perhaps involving backend developers with limited interest and knowledge of CSS". I didn't have the energy to finish that one either, because a skim indicated that it repeated a lot of what was in the article I did read.

Azure Linked Accounts and SSH Keys

2019-10-17T14:42:13+02:00

Published by marco on 17. Oct 2019 14:42:13 ------------------------------------------------------------------------ Azure DevOps allows you to link multiple accounts. Our concrete use case was: * User U1 was registered with an Azure DevOps organization O1 * Microsoft did some internal management and gave our partner account a new organization O2, complete with new accounts for all users. Now I have user U2 as well, registered with O2. * U2 was unable to take tests to qualify for partner benefits, so I had to use U1 but link the accounts so that those test results accrued to O2 as well as O1. * We want to start phasing out our users from O1, so we wanted to remove U1 from O1 and add U2 Are we clear so far? U1 and U2 are linked because reasons. U1 is old and busted; U2 is the new hotness. The linking has an unexpected side-effect when managing SSH keys. If you have an SSH key registered with one of the linked accounts, you cannot register an SSH key with the same signature with any of the other accounts. This is somewhat understandable (I guess), but while the error message indicates that you have a duplicate, it doesn't tell you that the duplicate is in another account. When you check the account that you're using and see no other SSH keys registered, it's more than a little confusing. Not only that, but if the user to which you've added the SSH key has been removed from the organization, it isn't at all obvious how you're supposed to access your SSH key settings for an account that no longer has access to Azure DevOps (in order to remove the SSH key). Instead, you're left with an orphan account that's sitting on an SSH key that you'd like to use with a different account. So, you could create a new SSH key _or_ you could do the following: * Re-add U1 to O1 * Remove SSH key SSH1 from U1 * Register SSH key SSH1 with U2 * Profit If you can't add U1 to O1 anymore, then you'll just have to generate and use a new SSH1 key for Azure. 
It's not an earth-shatteringly bad user experience, but interesting to see how several logical UX decisions led to a place where a couple of IT guys were confused for long minutes.

Using Git efficiently: SmartGit + BeyondCompare

2019-10-17T13:27:26+02:00

Published by marco on 17. Oct 2019 13:27:26 Updated by marco on 11. Mar 2021 14:33:13 ------------------------------------------------------------------------ I've written about using SmartGit (SG) before [1] [2] and I still strongly recommend that developers who manage projects use a UI for Git. If you're just developing a single issue at a time and can branch, commit changes and make pull requests with your IDE tools, then more power to you. For this kind of limited workflow, you can get away with a limited tool-set without too big of a safety or efficiency penalty. However, if you need an overview or need to do more management, then you're going to sacrifice efficiency and possibly correctness if you use only the command line or IDE tools. I tend to manage Git repositories, which means I'm in charge of pruning merged or obsolete branches and making sure that everything is merged. A well-rendered log view and overview of branches is indispensable for this kind of work.

[SmartGit] I have been and continue to be a proponent of SmartGit for all Git-related work. It not only has a powerful and intuitive UI, it also supports pull requests, including code comments that integrate with BitBucket, GitLab and GitHub, among others. It has a wonderful log view that I now regularly use as my standard view. It's fast and accurate (I almost never have to refresh explicitly to see changes) and I have a quick overview of the workspace, the index and recent commits. I can search for files and easily get individual logs and blame. The file-differ has gotten a lot better and has almost achieved parity with my favorite diffing/merging tool Beyond Compare. Almost, but not quite. The difference is still significant enough to justify Beyond Compare's purchase price of $60. What's better in Beyond Compare [3]?
[Diffing]

* While both differs have syntax-highlighting (and the supported file-types seem to be about the same), Beyond Compare distinguishes between significant and insignificant (e.g. comments or whitespace) changes. It makes it much easier to see whether code or documentation has changed.
* The intra-line diffing in Beyond Compare is more fine-grained and tends to highlight changes better. SmartGit is catching up in this regard.
* You can re-align a diff manually using F7. This is helpful if you moved code and want to compare two chunks that the standard diff no longer sees as being comparable.

[Merging] I could live without the Beyond Compare differ, but not without the merger. [image]

* The 4-pane view shows left, base and right above as well as the target below, with the target window being editable. Each change has its own color, so you can see afterwards whether you took left, right or made manual changes.
* The merge view includes a line-by-line differ that shows left, base, right and target lines directly above one another, with a scrollbar for longer lines.
* The target view is color-coded to show the origin of each line of text: right, left, base or custom edited.
* Beyond Compare makes a smart recommendation for how to merge a given conflict that is very often exactly what you want, which means that for many conflicts, you can just confirm the recommendation.
* SmartGit has two separate windows for base vs. left/right and right/left vs. target. Long lines are really hard to decipher/merge in SmartGit.

[Integrate Beyond Compare into SmartGit] To set up SmartGit to use Beyond Compare:

1. Select Tools > Diff Tools
   1. Click the "Add..." button
   2. Set File Pattern to *
   3. Select "External diff tool"
   4. Set the command to C:\Program Files (x86)\Beyond Compare 4\BCompare.exe
   5. Set the arguments to "${leftFile}" "${rightFile}"
2. Select Tools > Conflict Solvers
   1. Select "External Conflict Solver"
   2. Set File Pattern to *
   3. Set the command to C:\Program Files (x86)\Beyond Compare 4\BCompare.exe
   4. Set the arguments to "${leftFile}" "${rightFile}" "${baseFile}" "${mergedFile}"

[Update March 11th, 2021] I was testing the Git support in Visual Studio Code and ran into a somewhat surprising limitation. For those that use IDE Git integration without an external tool, this would be a pretty disappointing message. What do you do then? [image] -------------------------------------------------------------------------------- [1] In "Git: Managing local commits and branches" and "Programming in the moderncurrent age" [2] I am in no way affiliated with SmartGit. [3] I am in no way affiliated with Beyond Compare.

Visual Studio 2019 Survey

2019-10-17T07:38:00+02:00

Published by marco on 17. Oct 2019 07:38:00 ------------------------------------------------------------------------ Visual Studio 2019 (VS) asked me this morning if I was interested in taking a survey to convey my level of satisfaction with the IDE. VS displays the survey in an embedded window using IE11. [1] I captured the screen of the first thing I saw when I agreed to take the survey. [image] I know it's the SurveyMonkey script that's failing, but it's still not an auspicious start. -------------------------------------------------------------------------------- [1] I'd just upgraded to Windows 10 build 1903, which includes IE 11.418.18362.0. I can't imagine that they didn't test this combination.

Svelte vs. React (overselling a framework as a revolution)

2019-05-19T17:15:28+02:00

Published by marco on 19. May 2019 17:15:28 Updated by marco on 13. Jan 2022 09:53:18 ------------------------------------------------------------------------ I've just read about a web framework called Svelte in the post "Virtual DOM is pure overhead". I think the product itself sounds interesting, but that the author uses unnecessarily misleading arguments. From what I gather, Svelte is a compile-time reconciliation generator for JSX/TSX components. This pre-calculated generator applies changes to the DOM without needing a virtual DOM and without real-time diffing or reconciliation. That is, instead of having real-time calculation, with possible performance hits [1], the app benefits from having all possible state changes pre-calculated and ready to apply immediately and quickly. This all sounds pretty good, I think. I'm definitely going to take a look at the more-advanced tutorials. [2] However, the author wasn't happy with just presenting his product, but seems to need to mischaracterize why products like React abstracted away from the DOM in the first place. He tells us that the virtual DOM was always slower than manipulating the DOM. But that isn't the claim React makes. React helps users avoid common performance pitfalls in the model of programming that it replaced -- it never claimed to be the final word in performance optimization. It's clear that something like Svelte -- if it can cover all the needs of an app -- is faster than maintaining a virtual DOM. But that product isn't what React replaced. React replaced products written in jQuery. React brought an asynchronous frame-based renderer to the web (something that products like WPF have had for decades). It brought us type-safe views (when used with TypeScript) and taught us about the advantages of immutable data structures. "He stands on their shoulders, then implies that they were idiots for not having been taller." The author characterizes the notion that a virtual DOM is faster as a "meme".
This is silly and imprecise. It is true that React will be more efficient than most hand-coded web sites of a typical level of complexity. jQuery sites tended to teeter and collapse under their own weight. They were unmaintainable and very difficult to optimize without nearly rewriting them. React sites, on the other hand, are modular in nature and the library includes several standard patterns to apply and measures to take to optimize these components. It's not always easy, but it's better than it was in the old days. And there are solutions in React to performance issues. The users must follow patterns and use the APIs correctly. That's the way it is in every framework or library. Some libraries offer less leeway for users to screw up performance in the way that they shape their APIs. Sometimes the API surface goes too much in that direction and ends up handcuffing users. That is, users can't write what they want to write in a way that feels natural because the pattern they prefer wouldn't perform well under their framework. Instead, the user must change how they think about writing apps just to use the framework. This isn't necessarily a bad thing, but is definitely something to consider. It's possible that Svelte offers all of the advantages of React with even more flexibility and less opinion. React -- and its companion Redux -- was always about being very declarative about state and changes. There is no magic, even the reconciliation algorithm is very predictable. There are other approaches, like MobX, which users claim "does the right thing" with state changes, even if the user fails to declare dependencies as clearly as React would have required. [3] I imagine that Svelte is going in this direction as well. The claim I think that Svelte is making is that users can write code that feels more natural without changing their paradigm to match the framework. 
That is, Svelte must have some rules for which state the compiler observes and pre-compiles, but the claim is that it's much more flexible and forgiving than React's "straitjacket" (my word). [4] He goes on to say that React acknowledges its own slowness by giving the user control over shouldComponentUpdate. This is a silly argument again. It's arguing that React bamboozled people in 2013 by convincing them to use their framework instead of a library that the author purports is faster but that he only started in 2017. There is honestly no need for this kind of bullshit. If your library offers advantages over React, describe them and let them speak for themselves. There is no need to rewrite the whole history of a product that quite clearly inspired your own, pretending that the authors of your own framework's inspiration are your inferiors because they failed to leap directly to the concepts outlined in your library. He stands on their shoulders, then implies that they were idiots for not having been taller. Through all of this fluff, it took until about 3/4 of the way through the article to find out that Svelte generates update code at build time. I would have been much more intrigued had the author led with that. Now, I'm going to be suspicious of everything about this framework because the author went to such lengths to bamboozle and oversell me. He seems to want me to think I've been a fool for having used React in the first place, when his framework has been waiting for me all along, since all the way back to sometime in 2018. [5] But he waits until the very last paragraph to explain what Svelte actually is -- even though he's been comparing it to React the entire time. It's a good description: "It's important to understand that virtual DOM isn't a feature. It's a means to an end, the end being declarative, state-driven UI development.
Virtual DOM is valuable because it allows you to build apps without thinking about state transitions, with performance that is generally good enough. That means less buggy code, and more time spent on creative tasks instead of tedious ones. "But it turns out that we can achieve a similar programming model without using virtual DOM — and that's where Svelte comes in." This is a much fairer characterization of the two libraries: they both base on a very similar model -- one that React did a tremendous amount of legwork in establishing as an attractive approach in people's minds -- but that Svelte goes a step further to improve the reconciliation mechanism, moving it from runtime to compile-time. Svelte's improvement could be a highly welcome one, but it's incremental, not revolutionary. That's wonderful! But it's actually even more wonderful than his article indicated, because I actually don't have to learn anything to work with Svelte instead of React. I can work pretty much the same (Svelte doesn't have hooks [6] because it seems it doesn't need them) and just kind of "drop in" Svelte instead of React and have better performance, even in places where I'd never noticed I might have had problems. That is, with Svelte instead of React, my app will be overall faster because performance no longer suffers from "death by a thousand cuts", as the author puts it. Despite the author's overzealous mischaracterizations and attempts at hot-take marketing, I'm still going to check out Svelte. -------------------------------------------------------------------------------- [1] Such performance hits are often irrelevant, as even the author admits. However, if you can choose between two frameworks, one that punishes you with bad performance and another that optimizes instead, you would choose the more forgiving one, if there are no other downsides. 
[2] Here I'm going to lead with my conclusion so as not to follow in the author's footsteps, using a style that I spend the rest of this article upbraiding. [3] I'm not sure what MobX 5 is up to or what introspection it offers into the web of observables and dependencies in a more-complex application, but older versions of the library were not easy to debug when performance problems arose. From what I've read from users, things have gotten much better, but I'm still inclined to think that React's declarative approach suits me better -- it's easier for me to apply well-established patterns in my own code rather than trying to figure out how to appease the MobX black box. Again, things may be different now than in earlier versions. I'm open to taking another look at MobX. I'm also not sure how Svelte and MobX compare: MobX requires users to indicate that state is "observable" before it manages it, whereas I assume Svelte determines for itself which state-transitions it should track. [4] Update January 2022: In going through the tutorial available today, you're very quickly introduced to reactive declarations to help Svelte determine which compound expressions should be "watched" for changes to sub-elements. That is, if you declare a simple variable, any references to it in view code will be automatically updated, but if you derive another simple value from it and observe that value, it only updates when the derived value is updated directly. This is unlikely, as the derived value presumably implements an algorithm of some sort and should never be directly changed (i.e. it's a calculated property in the parlance of other frameworks). For example, given the following code, let count = 0; let doubled = count * 2; Any observers (i.e. embeddings in a view) of the value of doubled will not be updated when count changes, even though the naive interpretation of a JavaScript developer would be that of course it changes.
In order to get the desired effect, you must make it a reactive declaration with $:. For example, let count = 0; $: doubled = count * 2; This is perfectly fine, but it is an example of how the claim that "you don't have to do anything to make your JavaScript work naturally, unlike smelly React" oversells the advantage. Missing reactive declarations will cause an app to not work as expected just as much as a missing useState() does in React. [5] It's hard to tell when it was released because there are 254 releases, one nearly every day and I didn't feel like hitting "older" in GitHub for 30 pages or cloning the repository just to get a better look at the log. [6] The author disparages hooks, saying that they are even worse for performance and linking to a tweet with the words "with predictable results". The tweet complains about atrocious performance because of constant reconciliation and rendering -- but a dozen answers down is the answer: the original poster failed to tell the useEffect() hook on which state it relied. That's kind of a rookie mistake -- in that framework. [7] I understand that Svelte claims that it doesn't need these hints in order to be able to determine at compile-time when a piece of code needs to be executed because the state on which it relies has changed. React is declarative and requires help from the user whereas presumably the selling point of Svelte is that this user would have wasted less time improving performance and more time focused on application logic because Svelte is smart enough to do all this for you. I personally think that this sounds awesome and that it is an admirable goal, but have my doubts that Svelte doesn't also impose its own set of limitations on what kind of state transformations you can do that the compiler can actually detect. That is, React provides an API with which callers can "help" the reconciliation algorithm avoid work.
Svelte claims that this isn't necessary, but I'm going to guess that there are rules for how sophisticated state changes can be before the Svelte compiler no longer detects them. In that case, what does Svelte do? Fall back to using a React-like virtual DOM to reconcile changes? Or just not update when the user expects? Or just fail to compile, spitting out an error indicating what the user should do to fix the issue (my personal favorite)? [7] Update January 2022: It's also a mistake that IDEs will help you avoid. WebStorm issues a warning if you rely on state in an effect that you have not listed in the dependencies.
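The point in the January 2022 footnote about derived values can be illustrated without Svelte at all. The following is a minimal sketch in plain JavaScript, not Svelte code; the helper name `getDoubled` is hypothetical and only there for illustration. It shows that an ordinary assignment captures a value once, which is the behavior you get in a Svelte component unless the declaration is marked with `$:`.

```javascript
let count = 0;

// Ordinary assignment: evaluated once, using the value that count has right now.
const doubled = count * 2;

// A derived value has to be recomputed after count changes. Here that is done
// by hand with a function (hypothetical helper); in Svelte, a `$:` declaration
// asks the compiler to generate the re-evaluation instead.
const getDoubled = () => count * 2;

count = 5;

console.log(doubled); // 0
console.log(getDoubled()); // 10
```

Either way, the developer has to say somewhere that the value is derived, which is the same kind of hint that React asks for.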

Why use an IOC? (hint: testing)

2019-04-08T09:38:17+02:00

Published by marco on 8. Apr 2019 09:38:17 Updated by marco on 14. Feb 2026 17:46:44 ------------------------------------------------------------------------ [Introduction] Testing is any form of validation that verifies code. That includes not only structured validation using checklists, test plans, etc. but also informal testing, as when developers click their way through a UI or emit values in debugging output to a console. Automated testing covers all regression-style tests that execute both locally and in CI. This includes unit, integration, and end-to-end tests. Testing is primarily a mindset. You should think of writing tests not as something you have to do, but rather as something you want to do.

* How else do you prove that what you wrote works?
* What does "it works" mean?
* Which use cases are covered?
* How do you answer these questions without tests?
* What do we mean by writing tests?

[You're already testing!] You're almost certainly already testing. You might be clicking through the UI or emitting statements in a command-line application, but you're verifying your code somehow. I mean ... you are, right? RIGHT? I'm kidding. Of course you're not just writing code, building it, and committing it. You're validating it somehow. That's testing.

[A list of validations] If you're really good, you might even keep a list of these validations. Once you have a list, then,

1. You don't have to worry about forgetting to do them in the future
2. Even someone with no knowledge of the system can perform validation

This is fine, but it's still a manual process. A manual process carries with it the following drawbacks:

1. It gets quite time-consuming, especially as the list of validations grows
2. You're highly unlikely to perform the validations often enough
   * It's much easier to fix a mistake if you learn about it relatively soon after you made it
3. You're also unlikely to add all of the validations you need
   * Generally, you won't validate smaller "facts" and will focus on high-level stuff
4. A manual validation process can't be run as part of CI or CD

[Automating the list] Automated testing means that you codify those validations. "😒 Great! I have tests! How the heck do I codify them?" Don't panic. Almost any code can be tested. In fact, if you can't get at it with a test, then you might have found an architectural problem. See? Automating tests will even help you write better code! "🤨 How do I get started?" Just start somewhere. It doesn't matter where. Don't worry about coverage. Just get the feeling for writing a proof about a facet of your code. Any bit of logic can -- and should -- be tested. What if you still don't know where to begin? Ask someone for help! Don't be shy. It's in everyone's best interest for a project to have good tests. You want everyone's code to have tests so you know right away when you've broken something in a completely unrelated area. This is a good thing!

[Goals] "🤸‍♀️ Developers should be excited to use tests to prove that their code works."

[Tests should be quick and easy (maybe even fun) to write] A project should provide support for mocking devices and external APIs, or for using test-specific datasets.

[Tests should be reasonably fast] A reasonably fast test suite will tend to be run more often. We would like a developer to notice a broken test right after the change that broke it, preferably even before pushing it.

[Avoid debugging tests in CI] Tests a developer runs locally should almost always work in CI. Failing tests in CI should also fail locally.

[Guidelines] "🤨 Don't be pedantic." For example,

* "Stop requiring only one assertion per unit test: Multiple assertions are fine"
* Don't forbid mocking in integration tests and don't force mocking in unit tests.
* In fact, stop worrying about whether it's a unit test or an integration test and just write useful tests that prove useful things about your code. * Don't get obsessed with automating everything. * Get the low-hanging fruit first, and leave the rest to manual testing. * See where you stand. * If you haven't automated enough, iterate until done. 🔄 [Tests should be useful] We never want anyone in a team to get the impression that we're writing tests just to write tests. We write tests because they help us write better code and because it feels good to be able to prove that something that was working continues to work. You should feel more efficient and productive and feel like you're producing higher-quality code. * Tests should confirm use cases * Tests should prove something about your code that you think is worth proving. * Tests should confirm behavior that either is how the code currently works or how it should work. * Tests should help you write better code from the get-go. * Every bug that you need to fix is de-facto a use case that needs a test. [Code Coverage & Reviews] How do you know when there are "enough" automated tests? Don't get distracted by trying to achieve a specific coverage percentage. The most important thing is that the major use cases are covered. If software is stable and there is "only" 40% test-coverage, then maybe there is a lot of code that rarely or never gets used? In that case, you might want to think about removing code that you don't need rather than wasting time writing tests for code that never runs. New code, though, should always have automated tests. A code reviewer should verify that new functionality is being tested. [Types of tests] Unit Cover a single unit, mocking away other dependencies where needed. Useful for verifying simple logic like calculated properties or verifying the results of service methods with given inputs Integration Cover multiple units, possibly mocking unwanted dependencies. 
Useful for verifying behavior of units in composition, as they will be used in the end product. The goal is to cover as much as possible without resorting to more costly end-to-end tests End-to-End Also called UI Tests, these tests verify the entire stack for actual customer use cases. Very useful, but generally require more maintenance as they tend to be more fragile. Essential for verifying UI behavior not reflected in a programmatic model. Can work with snapshots (e.g. error label is in red) [Approach] The article "Write tests. Not too many. Mostly integration." describes a pragmatic approach quite well. Instead of the classic "testing pyramid", it suggests a "testing trophy". [image] This style of development has the following aims: 1. Verify as much as possible statically, with linting and analyzers 2. Make integration tests cheaper because they prove more about your system than unit tests 3. Prove as much as possible outside of end-to-end tests because they're expensive and brittle [Analysis] "Remember that everything you use has to work both locally and in CI." [Static-checking] A project should include analyzers and techniques so that the compiler helps make many tests unnecessary. For example, if you know that a parameter or result can never be null, then you can avoid a whole slew of tests. Developers should only spend time writing tests that verify semantic aspects that can't be proven by the compiler. [Null-reference analysis in .NET] The .NET world provides many, many analyzers and tools to verify code quality. One of the most important things a project can do is to improve null-checking. The best way to do this is to upgrade to C# 8 or higher and enable "null-reference analysis" . The "default language for .NET Framework is going to stay C# 7.3" , but you can "enable null-reference analysis for .NET Framework" quite easily. 
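As a small illustration of what null-reference analysis buys you, the following sketch enables it for a single file with #nullable enable (C# 8 or higher). The names Greeter, Greet, FindNickname and GreetByNickname are hypothetical and only serve the example; they don't come from any project mentioned here.

```csharp
#nullable enable
using System;

public static class Greeter
{
    // 'name' is declared as non-nullable: the compiler warns at any call site
    // that might pass null, so a test for "what if name is null?" is unnecessary.
    public static string Greet(string name) => $"Hello, {name}!";

    // The '?' makes the possibility of "no result" explicit in the signature.
    public static string? FindNickname(string name) =>
        name == "Robert" ? "Bob" : null;

    public static string GreetByNickname(string name)
    {
        var nickname = FindNickname(name);

        // Passing 'nickname' directly to Greet() would produce warning CS8604
        // (possible null reference argument); the fallback satisfies the analyzer.
        return Greet(nickname ?? name);
    }
}
```

The only tests left to write for this class are the semantic ones, e.g. that "Robert" is greeted as "Bob" and that everyone else is greeted by the name that was passed in.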
Another option is to use the "JetBrains Annotations NuGet package" , which provides attributes to indicate whether parameters or results are nullable. The preferred way, though, is to use the by-now standard nullability-checking available in .NET. Doing neither is not a good option, as it will be very difficult to avoid null-reference exceptions. [Unit-testing] Unit tests are very useful for validating requirements and invariants about your code. These are the easiest tests to write and will generally be the first ones that you will write. A requirement or an invariant may be specified in the story itself, but it can be anything that you know about the code that's important. It's up to the developer and the reviewer(s) to determine which tests are necessary. It gets easier with experience—and it doesn't take long to get enough experience so that it's no longer so intimidating. [Unit-testing example] Just as a quick example in .NET, consider the following code, public bool IsDiagnosticModeRunning { get => _isDiagnosticModeRunning; set { _isDiagnosticModeRunning = value; _statusManager.InstrumentState = value ? InstrumentState.DiagnosticMode : InstrumentState.Ready; } } Here we see a relatively simple property with a getter and a setter. However, we also see that there is an invariant in the implementation: that the _statusManager.InstrumentState is synced with it. 
Using many of the "techniques described below" <#tools-and-techniques>, we could write the following test: [DataRow(true, InstrumentState.DiagnosticMode)] [DataRow(false, InstrumentState.Ready)] [TestMethod] public void TestIsDiagnosticModeRunning(bool running, InstrumentState expectedInstrumentState) { var locator = CreateLocator(); var instrumentControlService = locator.GetInstance(); var statusManager = locator.GetInstance(); Assert.AreNotEqual(expectedInstrumentState, statusManager.InstrumentState); instrumentControlService.IsDiagnosticModeRunning = running; Assert.AreEqual(expectedInstrumentState, statusManager.InstrumentState); } Here, we're using MSTest to create a parameterized test that, * creates the IOC * gets the two relevant services from it * Verifies that the state is not already set to the expected state (in which case the test would succeed even if the tested code doesn't do anything) * Sets the property to a given value * Verifies that the state is correct for that value We now have code that validates two facts about the system. Should something change where these facts are no longer true, the tests will fail, giving the developer a chance to analyze the situation. * Was the change inadvertent or deliberate? * Are the facts still correct? Does the test need to be updated? If you're addressing a bug-fix, though, you might be able to prove that you've fixed the bug with a unit test, but it's also likely that you'll have to write an integration test instead. [Integration-testing] Unit tests have their place, but they are far too emphasized in the testing pyramid. The testing pyramid comes from a time when writing integration tests was much more difficult than it (theoretically) is today. The "theoretically" above means that the ability to write integration tests as efficiently as unit tests is contingent on a project offering proper tools and support. One common complaint about integration tests vis à vis unit tests is that they run more slowly. 
Another is that they take longer to develop. Ideally, a project provides support to counteract both of these tendencies. To this end, then, a project should offer base and support classes that make common integration tests easy to set up and quick to execute: * Interacting with a database * Setting up a known database schema * Getting to a clean dataset * Mocking the database * Mocking other external dependencies in a project (e.g. loading configuration from an endpoint, sending emails, sending modifications to endpoints) There are many different ways to solve this problem, each with tradeoffs. For example, a project can load dependencies in Docker containers, either created and started manually (see "Testing your ASP.NET Core application - using a real database" ) or even dynamically with a tool like the "Testcontainers NuGet package" . [Comparing Unit and Integration tests] A drawback to unit tests is that, while they can test an individual component well, it's really the big picture that we want to test. We want to test scenarios that correspond to actual use cases rather than covering theoretical call stacks. It's not that the second part isn't important, but that it's not as important. Given limited time and resources, we would prefer to have integration tests that also cover a lot of the same code paths that we would have covered with unit tests, rather than to have unit tests, but few to no integration tests. This, however, leads directly to... The advantage of a unit test over an integration test is that when it fails, it's obvious which code failed. An integration test, by its very nature, involves multiple components. When it fails, it might not be obvious which sub-component caused the error. If you find that you have integration tests failing and it takes a while to figure out what went wrong, then that's a sign that you should bolster your test suite with more unit tests. 
Once an integration test fails and one or more unit tests fail, then you have the best of both worlds: you've been made aware that you've broken a use case (integration test), but you also know which precise behavior is no longer working as before (unit test). [Tools and Techniques] [Tests are Code] Test code is just as important as product code. Use all of the same techniques to improve code quality in test code as you would in product code. Clean coding, good variable names, avoiding copy/paste coding: all of it applies just as much to tests. There are two main differences: * You don't need to document tests. * You don't have to write tests for tests. 😀 [Writing testable code] This is a big, big topic, of course. There are a few guidelines that make it easier to write tests -- or to avoid having to write tests at all. As noted above, code that can be validated by the compiler (static analysis) doesn't need tests. E.g. you don't have to write a test for how your code behaves when passed a null parameter if you just forbid it. Likewise, you don't have to re-verify that types work as they should in statically typed languages. We can trust the compiler. Here are a handful of tips. * Prefer composition to inheritance * A functional programming style is very testable * An IOC Container is very helpful * Avoid nullable properties, results, and parameters * Avoid mutable data * Interfaces are much easier to fake or mock; use those wherever you can See the following articles for more ideas. * "C# Handbook - Chapter 4: Design" (2017) * "Questions to consider when designing APIs: Part I" (2014) * "Questions to consider when designing APIs: Part II" (2014) * "Why use an IOC? (hint: testing)" (2019) [Parameterized Tests] Investigate your testing library to learn how to write multiple tests without having to write a lot of code. In the MSTest framework, you can use DataRow to parameterize a test. 
In NUnit, TestCase does the same thing, and Values allows you to provide parameter values for a list of tests that are the Cartesian product of all values. [Mocking/Faking] Use mocks or fakes to exclude a subsystem from a test. What would you want to exclude? While you will want to make some tests that include database access or REST API calls, there are a lot of tests where you're proving a fact that doesn't depend on these results. [Focus on what you're testing] For example, suppose a component reads its configuration from the database by default. A test of that component may simply want to see how it reacts with a given input to a given method. Where the configuration came from is irrelevant to that particular test. In that case, you could mock away the component that loads the configuration from the database and instead use a fake object that just provides some standard values. [Test error conditions] Another possibility is to fake an external service to see how your code reacts when the service returns an error or an ambiguous response. Without mocks, how would you test how your code reacts when a REST endpoint returns 503 or 404? Without a mock, how would you force the purely external endpoint to give a certain code? You really can't. With a mock, though, you can replace the service and return a 404 response for a specific test. This is quite a powerful technique. [How to fake?] As noted above, it's much, much easier to use fake objects if you've consistently used interfaces. You can just create your own implementation of the interface whose standard implementation you want to replace, give it a fake implementation (e.g. returning false, empty strings, and null for methods and properties), and then use that class as the implementation. [Faking/mocking libraries] If you have interfaces that perform a single task (single-responsibility principle), then it doesn't take too much effort to write the fake object by hand. 
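A hand-written fake can be as small as the following sketch. The interface IConfigurationLoader and the component ReportTitleBuilder are hypothetical names invented for illustration; the fake returns fixed values instead of reading from a database and also counts how often it was called.

```csharp
using System;

// Hypothetical interface; the standard implementation would read from a database.
public interface IConfigurationLoader
{
    string GetValue(string key);
}

// Hand-written fake: returns fixed values and records the number of calls.
public sealed class FakeConfigurationLoader : IConfigurationLoader
{
    public int CallCount { get; private set; }

    public string GetValue(string key)
    {
        CallCount++;

        return key == "title" ? "Test Report" : string.Empty;
    }
}

// Hypothetical component under test; it only knows about the interface.
public sealed class ReportTitleBuilder
{
    private readonly IConfigurationLoader _loader;

    public ReportTitleBuilder(IConfigurationLoader loader) => _loader = loader;

    public string Build(int year) => $"{_loader.GetValue("title")} {year}";
}
```

A test can then construct new ReportTitleBuilder(new FakeConfigurationLoader()) and assert on the result of Build() as well as on CallCount, without ever touching a database.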
However, it's much easier to use a library to create fake objects -- and there are other benefits as well, like tracking which methods were called with which parameters. You can assert on this data collected by the fake object. For .NET, a great library for faking objects is "FakeItEasy" . With a fake object, you can indicate which values to return for a given set of parameters without too much effort. Similarly, you can use the same API to query how often these methods have been called. This allows you to verify, for example, that a call to a REST service would have been made. This is a powerful way of proving facts about your code without having to actually interact with external services. [An example] The following code configures a fake object for ITestUnitConfigurationService that returns default data for all properties, except for Configuration and GetTestUnitParameterValues(), which are configured to return specific data. private static ITestUnitConfigurationService CreateFakeTestUnitConfigurationService() { var result = A.Fake<ITestUnitConfigurationService>(); var testUnitParameters = CreateTestUnitParameters(); var testUnitConfiguration = new TestUnitConfiguration(testUnitParameters); A.CallTo(() => result.Configuration).Returns(testUnitConfiguration); var testUnitParameterValues = CreateTestUnitParameterValues(); A.CallTo(() => result.GetTestUnitParameterValues()).Returns(testUnitParameterValues); return result; } In the test, we could get this fake object back out of the IOC (for example) and then verify that certain methods have been called the expected number of times. var testUnitConfigurationService = locator.GetInstance<ITestUnitConfigurationService>(); A.CallTo(() => testUnitConfigurationService.Configuration).MustHaveHappenedOnceExactly(); A.CallTo(() => testUnitConfigurationService.GetTestUnitParameterValues()).MustHaveHappenedOnceExactly(); [Snapshot-testing] You can avoid writing a ton of assertions and a ton of tests with snapshot testing. 
For example, imagine you have a test that generates a particular view model. You want to verify 30 different parts of this complex model. You could navigate the data structure, asserting the 30 values individually. That would be pretty tedious, though, and lead to fragile and hard-to-maintain testing code. Instead, you could emit that structure as text and save it as a snapshot in the repository. If a future code change leads to a different snapshot, the test fails and the developer who caused the failure would have to approve the new snapshot (if it's an expected or innocuous change) or fix the code (if it was inadvertent and wrong). The upside is that large swaths of assertions are reduced to a simple snapshot assertion. The downside is that the test might break more often for spurious reasons. Generally, you can avoid these spurious reasons by being judicious about how you format the snapshot: * Avoid timestamps or data that changes over time * Avoid using output methods that are too likely to change over time See the documentation for the "Snapshooter NuGet package" .
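To make the mechanism concrete, here is a hand-rolled sketch of a snapshot assertion. It is deliberately not the Snapshooter API; the class SnapshotAssert and its Matches method are hypothetical, and the sketch assumes that indented JSON from System.Text.Json is a stable enough text format for the model in question.

```csharp
using System;
using System.IO;
using System.Text.Json;

public static class SnapshotAssert
{
    // Serializes 'actual' to indented JSON and compares it with the stored
    // snapshot. The first run creates the snapshot; later runs must match it.
    public static void Matches(object actual, string snapshotPath)
    {
        var options = new JsonSerializerOptions { WriteIndented = true };
        var actualText = JsonSerializer.Serialize(actual, options);

        if (!File.Exists(snapshotPath))
        {
            // No snapshot yet: store the current output for review and commit.
            File.WriteAllText(snapshotPath, actualText);
            return;
        }

        var expectedText = File.ReadAllText(snapshotPath);
        if (expectedText != actualText)
        {
            throw new InvalidOperationException(
                $"Snapshot mismatch for {snapshotPath}.\nExpected:\n{expectedText}\nActual:\n{actualText}");
        }
    }
}
```

Approving an expected change then amounts to deleting or overwriting the snapshot file and committing the new version, which also makes the change visible in code review.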

Using Unity, Collab and Git

2019-01-22T19:47:58+01:00

Published by marco on 22. Jan 2019 19:47:58 ------------------------------------------------------------------------ If you're familiar with the topic, you might be recoiling in horror. It would be unclear, though, whether you're recoiling from the "using Collab" part or the "using Collab with Git" part. Neither is as straightforward as I'd hoped. " tl;dr: If you have to use Collab with Unity, but want to back it up with Git, disable core.autocrlf [1] and add * -text to the .gitattributes. " [Collab's Drawbacks] Collab is the source-control system integrated into the Unity IDE. It was built for designers to be able to do some version control, but not much more. Even with its limited scope, it's a poor tool. [The functionality horror] * The system does not ever show you any differences, neither in the web UI nor the local UI, neither for uncommitted nor committed files * Some changes cannot be reverted. No reason is given. * You can only delete new files from the file system. * There is no support for renaming * Reverting to a known commit has worked for me exactly once out of about 10 tries. The operation fails with an Error occurred and no further information. If you really get stuck, your only choice is to restore the entire workspace by re-cloning/re-downloading it. * Conflict resolution is poorly supported, although it works better than expected (it integrates with BeyondCompare, thank goodness). [The usability horror] * The UI only lets you commit all changed files at once. * There is no notion of "commits". * You can’t commit individual files or chunks. * There is no staging area. * You can't exclude files. * You can ignore them completely, but that doesn't help. * The UI is only accessible via mouse from the menu bar. * You can sometimes revert folders (sometimes you can't, again with an Error occurred message), but you can't revert arbitrary groups of files. * The UI is almost entirely in that custom drop-down menu. 
* You can scroll through your changed files, but you can't expand the menu to show more files at once. * You can show a commit history, but there are no diffs. None. * There aren't even any diffs in the web version of the UI, which is marginally better, but read-only. [Pair Git with Collab] This is really dangerous, especially with Unity projects. There is so much in a Unity project without a proper "Undo" that you very often want to return to a known good version. So what can we do to improve this situation? We would like to use Git instead of Collab. However, we have to respect the capabilities and know-how of the designers on our team, who don't know how to use Git. On our current project, there's no time to train everyone on Git -- and they already know how to use Collab and don't feel tremendously limited by it. Remember, any source control is better than no source control. The designers are regularly backing up their work now. In its defense, Collab is definitely better than nothing (or using a file-share or some other weak form of code-sharing). Instead, those of us who know Git are using Git alongside Collab. [It kind of works...] We started naively, with all of our default settings in Git. Our workflow was: 1. Pull in Unity/Collab 2. Fetch from Git/Rebase to head (we actually just use "pull with rebase") Unfortunately, we would often end up with a ton of files marked as changed in Collab. These were always line-ending differences. As mentioned above, Collab is not a good tool for reverting changes. The project has time constraints -- it's a prototype for a conference, with a hard deadline -- so, despite its limitations, we reverted in Collab and updated Git with the line-endings that Collab expected. We limped along like this for a bit, but with two developers on Git/Collab on Windows and one designer on Collab on Mac, we were spending too much time "fixing up" files. The benefit of having Git was outweighed by the problems it caused with Collab. 
[Know Your Enemy] So we investigated what was really going on. The following screenshots show that Collab doesn't seem to care about line-endings. They're all over the map. [image] [image] [image] [Configuring Git] Git, on the other hand, really cares about line-endings. By default, Git will transform the line-endings in files that it considers to be text files (this part is important later) to the line-ending of the local platform. In the repository, all text files are LF-only. If you work on macOS or Linux, line-endings in the workspace are unchanged; if you work on Windows, Git changes all of these line-endings to CRLF on checkout -- and back to LF on commit. Our first "fix" was to turn off the core.autocrlf option in the local Git repository. git config --local core.autocrlf false We thought this would fix everything since now Git was no longer transforming our line-endings on commit and checkout. This turned out to be only part of the problem, though. As you can see above, the text files in the repository have an arbitrary mix of line-endings already. Even with the feature turned off, Git was still normalizing line-endings to LF on Windows. The only thing we'd changed so far was that Git no longer converted LF to CRLF in the workspace. Any time we ran git reset, for example, the line-endings in our workspace would still end up being different than what was in Git or Collab. [Git: Stop doing stuff] What we really want is for Git to stop changing any line-endings at all. This isn't part of the command-line configuration, though. Instead, you have to set up .gitattributes. Git has default settings that determine which files it treats as which types. We wanted to adjust these default settings by telling Git that, in this repository, it should treat no files as text. Once we knew this, it was quite easy to configure. Simply add a .gitattributes file to the root of the repository, with the following contents: * -text This translates to "do not treat any file as text" (i.e. 
match all files; disable text-handling). [Conclusion] With these settings, the two developers were able to reset their workspaces and both Git and Collab were happy. Collab is still a sub-par tool, but we can now work with designers and still have Git to allow the developers to use a better workflow. The designers using only Collab were completely unaffected by our changes. -------------------------------------------------------------------------------- [1] Technically, I don't think you have to change the autocrlf setting. Turning off text-handling in Git should suffice. However, I haven't tested with this feature left on and, due to time-constraints, am not going to risk it.

Finding deep assembly dependencies

2019-01-21T20:26:52+01:00

Published by marco on 21. Jan 2019 20:26:52 ------------------------------------------------------------------------ Quino contains a Sandbox in the main solution that lets us test a lot of the Quino subsystems in real-world conditions. The Sandbox has several application targets: * WPF * Winform * Remote Data Server * WebAPI Server * Console The targets that connect directly to a database (e.g. WPF, Winform) were using the PostgreSql driver by default. I wanted to make all Sandbox applications easily configurable to run with SqlServer. [Just add the driver, right?] This is pretty straightforward for a Quino application. The driver can be selected directly in the application (directly linking the corresponding assembly) or it can be configured externally. Naturally, if the Sandbox loads the driver from configuration, some mechanism still has to make sure that the required data-driver assemblies are available. The PostgreSql driver was in the output folder. This was expected, since that driver works. The SqlServer driver was not in the output folder. This was also expected, since that driver had never been used. I checked the direct dependencies of the Sandbox Winform application, but they didn't include the PostgreSql driver. That's not really good, as I would like both SqlServer and PostgreSql to be configured in the same way. As it stood, though, I would be referencing SqlServer directly and PostgreSql would continue to show up by magic. Before doing anything else, I was going to have to find out why PostgreSql was included in the output folder. I needed to figure out assembly dependencies. [Visual Studio?] My natural inclination was to reach for NDepend, but I thought maybe I'd see what the other tools have to offer first. Does Visual Studio include anything that might help? The "Project Dependencies" shows only assemblies on which a project is dependent. I wanted to find assemblies that were dependent on PostgreSql. 
I have the Enterprise version of Visual Studio and I seem to recall an "Architecture" menu, but I "discovered that these tools are no longer installed by default" . According to the VS support team in that link, you have to install the "Visual Studio extension development" workload in the Visual Studio installer. In this package, the "Architecture and analysis tools" feature is available, but not included by default. Hovering this feature shows a tooltip indicating that it contains "Code Map, Live Dependency Validation and Code Clone detection". The "Live Dependency Validation" sounds like it might do what I want, but it also sounds quite heavyweight and somewhat intrusive, as described in "this blog from the end of 2016" . Instead of further modifying my VS installation (and possibly slowing it down), I decided to try another tool. [ReSharper?] What about ReSharper? For a while now, it's included project-dependency graphs and hierarchies. Try as I might, I couldn't get the tools to show me the transitive dependency on PostgreSql that Sandbox Winform was pulling in from somewhere. The hierarchy view is live and quick, but it doesn't show all transitive usages. [image] The graph view is nicely rendered, but shows dependencies by default instead of dependencies and usages. At any rate, the Sandbox wasn't showing up as a transitive user of PostgreSql. [image] I didn't believe ReSharper at this point because something was causing the data driver to be copied to the output folder. [NDepend to the rescue ] So, as expected, I turned to NDepend. I took a few seconds to run an analysis and then right-clicked the PostgreSql data-driver project to select NDepend => Select Assemblies... => That are Using Me (Directly or Indirectly) to show the following query and results. [image] Bingo. Sandbox.Model is indirectly referencing the PostgreSql data driver, via a transitive-dependency chain of 4 assemblies. Can I see which assemblies they are? 
Of course I can: this kind of information is best shown on a graph, so you can show a graph of any query results by clicking Export to Graph to show the graph below. [image] Now I can finally see that Sandbox.Model pulls in Quino.Testing.Models.Generated (to use the BaseTypes module) which, in turn, has a reference to Quino.Tests.Base which, of course, includes the PostgreSql driver because that's the default testing driver for Quino tests. Now that I know how the reference is coming in, I can fix the problem. Here I'm on my own: I have to solve this problem without NDepend. But at least NDepend was able to show me exactly what I have to fix (unlike VS or ReSharper). I ended up moving the test-fixture base classes from Quino.Testing.Models.Generated into a new assembly called Quino.Testing.Models.Fixtures. The latter assembly still depends on Quino.Tests.Base and thus the PostgreSql data driver, but it's now possible to reference the Quino testing models without transitively referencing the PostgreSql data driver. A quick re-analysis with NDepend and I can see that the same query now shows a clean view: only testing code and testing assemblies reference the PostgreSql driver. [image] [Finishing up] And now to finish my original task! I ran the Winform Sandbox application with the PostgreSql driver configured and was greeted with an error message that the driver could not be loaded. I now had parity between PostgreSql and SqlServer. The fix? Obviously, make sure that the drivers are available by referencing them directly from any Sandbox application that needs to connect to a database. This was the obvious solution from the beginning, but we had to quickly fix a problem with dependencies first. Why? Because we hate hacking. :-) Two quick references added, one build and I was able to connect to both SQL Server and PostgreSql.
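If no analysis tool is at hand, a rough approximation of the "who is pulling in this assembly?" question can be obtained with reflection. The sketch below is not how NDepend works; DependencyFinder and FindChain are hypothetical names, and the approach only sees references that are recorded in assembly metadata and that can actually be loaded at runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class DependencyFinder
{
    // Returns one reference chain from 'root' to the assembly named
    // 'targetName' (not necessarily the shortest), or an empty list if
    // no chain was found.
    public static IReadOnlyList<string> FindChain(Assembly root, string targetName)
    {
        var chain = new List<string>();
        var visited = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        Search(root, targetName, visited, chain);

        return chain;
    }

    private static bool Search(
        Assembly assembly, string targetName, HashSet<string> visited, List<string> chain)
    {
        var name = assembly.GetName().Name ?? string.Empty;
        if (!visited.Add(name))
        {
            return false; // already examined via another path
        }

        chain.Add(name);
        if (string.Equals(name, targetName, StringComparison.OrdinalIgnoreCase))
        {
            return true;
        }

        foreach (var reference in assembly.GetReferencedAssemblies())
        {
            Assembly loaded;
            try
            {
                loaded = Assembly.Load(reference);
            }
            catch (Exception)
            {
                continue; // skip references that cannot be loaded here
            }

            if (Search(loaded, targetName, visited, chain))
            {
                return true;
            }
        }

        chain.RemoveAt(chain.Count - 1); // dead end: backtrack
        return false;
    }
}
```

Called with the entry assembly of an application and the name of the driver assembly, the result lists the assemblies along one path that leads to the driver, which is the same kind of chain that the graph above makes visible.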

QQL: A Query Language for Quino

2019-01-20T22:37:35+01:00

Published by marco on 20. Jan 2019 22:37:35 Updated by marco on 21. Jan 2019 10:00:49 ------------------------------------------------------------------------ In late 2011 and early 2012, Encodo designed a querying language for Quino. Quino has an ORM that, combined with .NET Linq provides a powerful querying interface for developers. QQL is a DSL that brings this power to non-developers. QQL never made it to implementation -- only specification. In the meantime, the world moved on and we have common, generic querying APIs like OData. The time for QQL is past, but the specification is still an interesting artifact, in its own right. Who knows? Maybe we'll get around to implementing some of it, at some point. At any rate, you can download the specification from "Encodo" or "here at earthli" . The following excerpts should give you an idea of what you're in for, should you download and read the 80-page document. [Details] The TOC lists the following top-level chapters: 1. Introduction 2. Examples 3. Context & Scopes 4. Standard Queries 5. Grouping Queries 6. Evaluation 7. Syntax 8. Data Types and Operators 9. Libraries 10. Best Practices 11. Implementation Details 12. Future Enhancements From the abstract in the document: "The Quino Query Language (QQL) defines a syntax and semantics for formulating data requests against hierarchical data structures. It is easy to read and learn both for those familiar with SQL and non-programmers with a certain capacity for abstract thinking (i.e. power users). Learning only a few basic rules is enough to allow a user to quickly determine which data will be returned by all but the more complex queries. As with any other language, more complex concepts result in more complex texts, but the syntax of QQL limits these cases." From the overview: "QQL defines a syntax and semantics for writing queries against hierarchical data structures. 
A query describes a set of data by choosing an initial context in the data and specifying which data are to be returned and how the results are to be organized. An execution engine generates this result by applying the query to the data." [Examples] [Standard Projections] The following is from chapter 2.1, "Simple Standard Query": The following query returns the first and last name of all active people as well as their 10 most recent time entries, reverse-sorted first by last name, then by first name. Person { select { FirstName; LastName; Sample:= TimeEntries { orderby Date desc; limit 10 } } where Active orderby { LastName desc; FirstName desc; } } In chapter 2, there are also "2.2 Intermediate Standard Query" and "2.3 Complex Standard Query" examples. [Grouping Projections] The following is from chapter 2.4, "Simple Grouping Query": The following query groups active people by last name and returns the age of the youngest person and the maximum contracts for each last name. Results are ordered by the maximum contracts for each group and then by last name. group Person { groupby LastName; select { default; Age:= (Now - BirthDate.Min).Year; MaxContracts:= Contracts.Count.Max } where Active; orderby { MaxContracts desc; LastName desc; } } In chapter 2, there are also "2.5 Complex Grouping Query", "2.6 Standard Query with Grouping Query" and "2.7 Nested Grouping Queries" examples.
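For readers more at home in C#, the simple standard query from chapter 2.1 corresponds roughly to the following LINQ-to-objects sketch. This is neither QQL nor the Quino API: the classes Person, TimeEntry, PersonSample and PersonQueries are hypothetical stand-ins, and the mapping is only one reading of the semantics described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TimeEntry
{
    public DateTime Date { get; set; }
}

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public bool Active { get; set; }
    public List<TimeEntry> TimeEntries { get; set; } = new List<TimeEntry>();
}

// Shape of one result row: the selected names plus the "Sample" sub-query.
public class PersonSample
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<TimeEntry> Sample { get; set; }
}

public static class PersonQueries
{
    public static List<PersonSample> ActivePeopleWithRecentEntries(IEnumerable<Person> people) =>
        people
            .Where(p => p.Active)                   // where Active
            .OrderByDescending(p => p.LastName)     // orderby LastName desc
            .ThenByDescending(p => p.FirstName)     // ... then FirstName desc
            .Select(p => new PersonSample
            {
                FirstName = p.FirstName,
                LastName = p.LastName,
                // Sample := TimeEntries { orderby Date desc; limit 10 }
                Sample = p.TimeEntries.OrderByDescending(t => t.Date).Take(10).ToList()
            })
            .ToList();
}
```

The comparison shows what the DSL was after: the QQL text states the same filter, ordering and nested projection without lambdas or result types.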

Breaking Changes in C#

2019-01-20T22:19:02+01:00

Published by marco on 20. Jan 2019 22:19:02 Updated by marco on 20. Jan 2019 22:20:10 ------------------------------------------------------------------------

Due to the nature of the language, there are some API changes that almost inevitably lead to breaking changes in C#.

[Change constructor parameters]

While you can easily add another constructor and mark the old one(s) as obsolete, if you use an IoC container that allows only a single public constructor, you're forced to either

* remove the obsolete constructor or
* mark the obsolete constructor as protected.

In either case, the user has a compile error.

[Virtual methods/Interfaces]

There are several known issues with introducing new methods or changing existing methods on an existing interface. For many of these situations, there are relatively smooth upgrade paths. I encountered a situation recently that I thought worth mentioning. I wanted to introduce a new overload on an existing type. Suppose you have the following method:

    bool TryGetValue( out T value, TKey key = default(TKey), [CanBeNull] ILogger logger = null );

We would like to remove the logger parameter. So we deprecate the method above and declare the new method.

    bool TryGetValue( out T value, TKey key = default(TKey) );

Now the compiler/ReSharper notifies you that there will be an ambiguity if a caller does not pass a logger. How to resolve this? Well, we can just remove the default value for that parameter in the obsolete method.

    bool TryGetValue( out T value, TKey key = default(TKey), [CanBeNull] ILogger logger );

But now you've got another problem: the logger parameter cannot come after the key parameter because it no longer has a default value. So, now you'd have to move the logger parameter in front of the key parameter. This will cause a compile error in clients, which is what we were trying to avoid in the first place. In this case, we have a couple of sub-optimal options.

Multiple Releases

Use a different name for the new API (e.g. TryGetValueEx à la Windows) in the next major version, then switch the name back in the version after that and finally remove the obsolete member in yet another version. That is,

* in version n, TryGetValue (with logger) is obsolete and users are told to use TryGetValueEx (no logger)
* in version n+1, TryGetValueEx (no logger) is obsolete and users are told to use TryGetValue (no logger)
* in version n+2, we finally remove TryGetValueEx.

This is a lot of work and requires three upgrades to accomplish. You really need to stay on the ball in order to get this kind of change integrated and it takes a non-trivial amount of time and effort. We generally don't use this method, as our customers are developers and can deal with a compile error or two, especially when it's noted in the release notes and the workaround is fairly obvious (e.g. the logger parameter is just no longer required).

Remove instead of deprecating

Accept that there will be a compile error and soften the landing as much as possible for customers by noting it in the release notes.
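To make the "Multiple Releases" option more concrete, here is a minimal sketch of what version n might look like. The Cache class, its Add method and the ILogger stand-in are hypothetical names invented for this illustration; only TryGetValue and TryGetValueEx come from the text above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the logger type mentioned in the article.
public interface ILogger
{
    void Log(string message);
}

// Hypothetical container type, used only to host the two overloads.
public class Cache<TKey, T>
{
    private readonly Dictionary<TKey, T> _values = new Dictionary<TKey, T>();

    public void Add(TKey key, T value) => _values[key] = value;

    // Version n: the old signature stays callable but is marked obsolete.
    [Obsolete("Use TryGetValueEx instead; the logger parameter is no longer required.")]
    public bool TryGetValue(out T value, TKey key = default(TKey), ILogger logger = null)
    {
        logger?.Log("Looking up " + key);
        return TryGetValueEx(out value, key);
    }

    // Version n: the new API gets a temporary name, which avoids the
    // overload ambiguity for callers that pass no logger.
    public bool TryGetValueEx(out T value, TKey key = default(TKey))
    {
        if (key == null)
        {
            // Dictionary<TKey, T> does not accept null keys.
            value = default(T);
            return false;
        }

        return _values.TryGetValue(key, out value);
    }
}
```

In version n+1, the names would swap back as described in the list above: TryGetValueEx would become the obsolete member and a logger-free TryGetValue would take its place.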

Version numbers in .NET Projects

2019-01-20T22:00:30+01:00

Published by marco on 20. Jan 2019 22:00:30 ------------------------------------------------------------------------ Any software product should have a version number. This article will answer the following questions about how Encodo works with them. * How do we choose a version number? * What parts does a version number have? * What do these parts mean? * How do different stakeholders interpret the number? * What conventions exist for choosing numbers? * Who chooses and sets these parts? [Stakeholders] In decreasing order of expected expertise, * Developers: Writes the software; may *change* version numbers * Testers: Tests the software; highly interested in version numbers that make sense * End users: Uses the software as a black box The intended audience of this document is *developers*. [Definitions and Assumptions] * Build servers, not developer desktops, produce artifacts * The source-control system is Git * The quino command-line tool is installed on all machines. This tool can *read* and *write* version numbers for any .NET solution, regardless of which of the many version-numbering methods a given solution actually uses. * A *software library* is a package or product that has a developer as an *end user* * A *breaking change* in a software library causes one of the following * a build error * an API to behave differently in a way that cannot be justified as a bug fix [Semantic versions] Encodo uses semantic versions. This scheme has a strict ordering that allows you to determine which version is "newer". It indicates pre-releases (e.g. alphas, betas, rcs) with a "minus", as shown below. Version numbers come in two flavors: * Official releases: [Major].[Minor].[Patch].[Build] * Pre-releases: [Major].[Minor].[Patch]-[Label][Build] See Microsoft's "NuGet Package Version Reference" for more information. 
[Examples]

* 0.9.0-alpha34: A pre-release of 0.9.0
* 0.9.0-beta48: A pre-release of 0.9.0
* 0.9.0.67: An official release of 0.9.0
* 1.0.0-rc512: A pre-release of 1.0.0
* 1.0.0.523: An official release of 1.0.0

The numbers are strictly ordered. The first three *parts* indicate the "main" version. The final *part* counts strictly upward.

[Parts]

The following list describes each of the parts and explains what to expect when it changes.

[Build]

* Identifies the build task that produced the artifact
* Strictly increasing

[Label]

* An arbitrary designation for the "type" of pre-release

[Patch]

* Introduces bug fixes but no features or API changes
* May introduce obsolete members
* May *not* introduce breaking changes

This part is also known as "Maintenance" (see "Software versioning" on Wikipedia).

[Minor]

* Introduces new features that extend existing functionality
* May include bug fixes
* May cause minor breaking changes
* May introduce obsolete members that cause compile errors
* Workarounds must be documented in release notes or obsolete messages

[Major]

* Introduces major new features
* Introduces breaking changes that require considerable effort to integrate
* Introduces a new data or protocol format that requires migration

[Conventions]

[Uniqueness for official releases]

There will only ever be one artifact of an official release corresponding to a given "main" version number. That is, if 1.0.0.523 exists, then there will never be a 1.0.0.524. This is due to the fact that the build number (e.g. 524) is purely for auditing. For example, suppose your software uses a NuGet package with version 1.0.0.523. NuGet will not offer to upgrade to 1.0.0.524.

[Pre-release Labels]

There are no restrictions on the labels for pre-releases. However, it's recommended to use one of the following:

* alpha
* beta
* rc

Be aware that if you choose a different label, then it is ordered alphabetically relative to the other pre-releases.
For example, if you were to use the label prealpha to produce the version 0.9.0-prealpha21, then that version is considered to be higher than 0.9.0-alpha34, because "prealpha" sorts after "alpha". A tool like NuGet will not see the latter version as an upgrade.

[Release branches]

The name of a release branch should be the major version of that release. E.g. release/1 for version 1.x.x.x.

[Pre-release branches]

The name of a pre-release branch should be of the form feature/[label] where [label] is one of the labels recommended above. It's also OK to use a personal branch to create a pre-release build, as in mvb/[label].

[Setting the base version]

A developer uses the quino tool to set the version. For example, to set the version to 1.0.1, execute the following:

    quino fix -v 1.0.1.0

The tool will have updated the version number in all relevant files.

[Calculating final version]

The build server calculates a release's version number as follows,

* major: Taken from solution
* minor: Taken from solution
* patch (maintenance): Taken from solution
* label: Taken from the Git branch (see below for details)
* build: Provided by the build server

[Git Branches]

The name of the Git branch determines which kind of release to produce.

* If the name of the branch matches the glob **/release/*, then it's an official release
* Everything else is a pre-release

For example, the following branch names all match the glob and produce official releases:

* origin/release/1
* origin/production/release/new
* origin/release/
* release/1
* production/release/new
* release/

Beyond that, the name of the branch doesn't influence the version number, since an official release doesn't have a label.

[Pre-release labels]

The label is taken from the last part of the branch name. For example,

* origin/feature/beta yields beta
* origin/feature/rc yields rc
* origin/mvb/rc yields rc

The following algorithm ensures that the label can be part of a valid semantic version.
* Remove invalid characters
* Append an X after a trailing digit
* Use X if the label is empty (or becomes empty after having removed invalid characters)

For example,

* origin/feature/rc1 yields rc1X
* origin/feature/linuxcompat yields linuxcompat
* origin/feature/12 yields X

[Examples]

Assume that,

* the version number in the solution is 0.9.0.0
* the build counter on the build server is at 522

Then,

* Deploying from branch origin/release/1 produces artifacts with version number 0.9.0.522
* Deploying from branch origin/feature/rc produces artifacts with version number 0.9.0-rc522

[Release Workflow]

The following are very concise guides for how to produce artifacts.

[Pre-release]

* Ensure you are on a non-release branch (e.g. feature/rc, master)
* Verify or set the base version (e.g. quino fix -v 1.0.2.0)
* Push any changes to Git
* Execute the "deploy" task against your branch on the build server

[Release]

* Ensure you are on a release branch (e.g. release/1)
* Verify or set the base version (e.g. quino fix -v 1.0.2.0)
* Push any changes to Git
* Execute the "deploy" task against your branch on the build server
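The branch and label rules above can be sketched in a few lines of C#. This is not the build server's actual implementation: the VersionCalculator class and its method names are invented for illustration, the regular expression is only an approximation of the **/release/* glob, and the rule that a label may contain only ASCII letters and digits and may not start with a digit is an assumption chosen so that the sketch reproduces the examples above (in particular, 12 yielding X).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; names and exact rules are assumptions, not Quino's code.
public static class VersionCalculator
{
    // Approximates the glob **/release/*: a "release" path segment
    // followed by one (possibly empty) final segment.
    public static bool IsReleaseBranch(string branch)
    {
        return Regex.IsMatch(branch, @"(^|/)release/[^/]*$");
    }

    // Derives the pre-release label from the last part of the branch name.
    public static string GetLabel(string branch)
    {
        var lastPart = branch.Substring(branch.LastIndexOf('/') + 1);

        // Assumption: only ASCII letters and digits are valid...
        var cleaned = Regex.Replace(lastPart, "[^A-Za-z0-9]", "");

        // ...and a label may not start with a digit.
        cleaned = cleaned.TrimStart("0123456789".ToCharArray());

        if (cleaned.Length == 0)
        {
            return "X";
        }

        // Append an X after a trailing digit so that the label and the
        // build number that follows it do not run together.
        return char.IsDigit(cleaned[cleaned.Length - 1]) ? cleaned + "X" : cleaned;
    }

    // Combines the base version from the solution with the branch and build counter.
    public static string GetFinalVersion(string baseVersion, string branch, int build)
    {
        var main = string.Join(".", baseVersion.Split('.').Take(3));

        return IsReleaseBranch(branch)
            ? main + "." + build
            : main + "-" + GetLabel(branch) + build;
    }
}
```

With a base version of 0.9.0.0 and a build counter of 522, GetFinalVersion returns 0.9.0.522 for origin/release/1 and 0.9.0-rc522 for origin/feature/rc, matching the examples above.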

v6.0: .NET Standard & Authentication

2019-01-20T21:59:55+01:00

Published by marco on 20. Jan 2019 21:59:55 ------------------------------------------------------------------------ Note: this article was originally published at "Encodo.com" at the end of October, 2018. -------------------------------------------------------------------------------- The summary below describes major new features, items of note and breaking changes. * "Download the artifacts" * See the full list of issues * "Quino-Standard 6.0.0" * "Quino-WebAPI 6.0.0" * "Quino-Windows 6.0.0" The links above require a login. [Overview] At long last, Quino enters the world of .NET Standard and .NET Core. Libraries target .NET Standard 2.0, which means they can all be used with any .NET runtime on any .NET platform (e.g. Mac and Linux). Sample applications and testing assemblies target .NET Core 2.0. Tools like quinogenerate and quinofix target .NET Core 2.1 to take advantage of the standardized external tool-support there. Furthermore, the Windows, Winform and WPF projects have moved to a separate solution/repository called Quino-Windows. Quino-Standard is the core on which both Quino-Windows and Quino-WebAPI build. * All core assemblies target .NET Standard 2.0. * All assemblies in Quino-Windows target .NET Framework 4.6.2 because that's the first framework that can interact with .NET Standard (and under which Windows-specific code runs). * All assemblies in Quino-WebAPI currently target .NET Framework 4.6.2. We plan on targeting .NET Core in an upcoming version (tentatively planned for v7). [Highlights] * Target .NET Standard and .NET Core from Quino-Standard * Split Windows-specific code to Quino-Windows * Improve authentication API to use IIdentity everywhere (deprecating ICredentials and IUserCredentials). [Breaking Changes] 6.0 is a pretty major break from the 5.x release. Although almost all assembly names have stayed the same, we had to move some types around to accommodate targeting .NET Standard with 85% of Quino's code. 
[APIs] We've tried to support existing code wherever possible, but some compile errors will be unavoidable (e.g. from namespace changes or missing references). In many cases, R#/VS should be able to help repair these errors. These are the breaking changes that are currently known. * Moved IRunSettings and RunMode from Encodo.Application to Encodo.Core. [References] Any .NET Framework executable that uses assemblies targeting .NET Standard must reference .NET Standard itself. The compiler (MSBuild) in Visual Studio will alert you to add a reference to .NET Standard using NuGet. This applies not just to Winform executables, but also to any unit-test assemblies. [Tools] One piece that has changed significantly is the tool support formerly provided with Quino.Utils. As of version 6, Quino no longer uses NAnt, instead providing dotnet-compatible tools that you can install using common .NET commands. Currently, Quino supports: * dotnet quinofix * dotnet quinogenerate * dotnet quinopack Please see the "tools documentation" for more information on how to install and use the new tools. The standalone Winforms-based tools are in the Quino-Windows download, in the Tools.zip archive. * Quino.Migrator * Quino.PasswordEncryptor Quino.Utils is no longer supported as a NuGet package.

Learning Quino: a roadmap for documentation and tutorials

2019-01-20T21:59:29+01:00

Published by marco on 20. Jan 2019 21:59:29 ------------------------------------------------------------------------ Note: this article was originally published at "Encodo.com" in July, 2018. -------------------------------------------------------------------------------- In recent articles, we outlined "a roadmap to .NET Standard and .NET Core" and "a roadmap for deployment and debugging" . These two roadmaps taken together illustrate our plans to extend as much of Quino as possible to other platforms (.NET Standard/Core) and to make development with Quino as convenient as possible (getting/upgrading/debugging). To round it off, we've made good progress on another vital piece of any framework: documentation. [Introducing docs.encodo.ch] We recently set up a new server to host "Quino documentation" . There, you can find documentation for current releases. Going forward, we'll also retain documentation for any past releases. We're generating our documentation with "DocFX" , which is the same system that powers Microsoft's own documentation web site. We've integrated documentation-generation as a build step in Quino's nightly build on TeamCity, so it's updated every night (Zürich time) [1]. The documentation includes "conceptual documentation" which provides an overview/tutorials/FAQ for basic concepts in Quino. The "API Reference" includes comprehensive documentation about the types and methods available in Quino. [Next Steps] While we're happy to announce that we have publicly available documentation for Quino, we're aware that we've got work to do. 
The next steps are:

* Improve and extend conceptual documentation ("QNO-3452" , "QNO-3453" , "QNO-3985" , "QNO-5282" , "QNO-5283" , "QNO-5284" , "QNO-5286" , "QNO-5391" , "QNO-5528" , "QNO-5544" , "QNO-5562" , "QNO-5813" , "QNO-5912" )
* Improve the search for API documentation to include members as well as types ("QNO-5934" )
* Integrate documentation for the newly created/separated Quino-WebApi repository ("IT-1293" )
* Retain documentation for the soon-to-be-split-off Quino-Windows repository ("QNO-5904" )

Even though there's still work to do, this is a big step in the right direction. We're very happy to have found DocFX, which is a very comprehensive, fast and nice-looking solution to generating documentation for .NET code. [2]

--------------------------------------------------------------------------------

[1] If the build succeeds, naturally. :-)

[2] We used to use Sandcastle many years ago, but dropped support because it took forever to generate documentation, required its own solution file, didn't look very nice out-of-the-box, wasn't so easily customized and didn't have a very good search (which also didn't work without an IIS running it).

Delivering Quino: a roadmap for deployment

2019-01-20T21:58:53+01:00

Published by marco on 20. Jan 2019 21:58:53 ------------------------------------------------------------------------ Note: this article was originally published at "Encodo.com" in July, 2018. -------------------------------------------------------------------------------- In a recent article, we outlined "a roadmap to .NET Standard and .NET Core" . We've made really good progress on that front: we have a branch of Quino-Standard that targets .NET Standard for class libraries and .NET Core for utilities and tests. So far, we've smoke-tested these packages with Quino-WebApi. Our next steps there are to convert Quino-WebApi to .NET Standard and .NET Core as well. We'll let you know when it's ready, but progress is steady and promising. With so much progress on several fronts, we want to address how we get Quino from our servers to our customers and users. [Getting Quino] Currently, we provide access to a private fileshare for customers. They download the NuGet packages for the release they want. They copy these to a local folder and bind it as a NuGet source for their installations. In order to make a build available to customers, we have to publish that build by deploying it and copying the files to our file share. This process has been streamlined considerably so that it really just involves telling our CI server (TeamCity) to deploy a new release (official or pre-). From there, we download the ZIP and copy it to the fileshare. Encodo developers don't have to use the fileshare because we can pull packages directly from TeamCity as soon as they're available. This is a much more comfortable experience and feels much more like working with nuget.org directly. [Debugging Quino] The debugging story with external code in .NET is much better than it used to be (spoiler: it was almost impossible, even with Microsoft sources), but it's not as smooth as it should be. 
This is mostly because NuGet started out as a packaging mechanism for binary dependencies published by vendors with proprietary/commercial products. It's only in recent years that packages have become predominantly open-source. In fact, debugging with third-party sources -- even without NuGet involved -- has never been easy with .NET/Visual Studio.

Currently, all Quino developers must download the sources separately (also available from TeamCity or the file-share) in order to use source-level debugging. Binding these sources to the debugger is relatively straightforward but cumbersome. Binding these sources to ReSharper is even more cumbersome and somewhat unreliable, to boot. I've created the issue "Add an option to let the user search for external sources explicitly (as with the VS debugger) when navigating" in the hopes that this will improve in a future version. JetBrains has already fixed one of my issues in this area ("Navigate to interface/enum/non-method symbol in Nuget-package assembly does not use external sources" ), so I'm hopeful that they'll appreciate this suggestion, as well.

The use case I cited in the issue above is,

Developers using NuGet packages that include sources or for which sources are available want to set breakpoints in third-party source code. Ideally, a developer would be able to use R# to navigate through these sources (e.g. via F12) to drill down into the code and set a breakpoint that will actually be triggered in the debugger.

As it is, navigation in these sources is so spotty that you often end up in decompiled code and are forced to use the file-explorer in Windows to find the file and then drag/drop it to Visual Studio where you can set a breakpoint that will work. The gist of the solution I propose is to have R# ask the user where missing sources are before decompiling (as the Visual Studio debugger does).

[Nuget Protocol v3 to the rescue?]
There is hope on the horizon, though: Nuget is going to address the debugging/symbols/sources workflow in an upcoming release. The overview is at "NuGet Package Debugging & Symbols Improvements" and the issue is "Improve NuGet package debugging and symbols experience" . Once this feature lands, Visual Studio will offer seamless support for debugging packages hosted on nuget.org. Since we're using TeamCity to host our packages, we need JetBrains to "Add support for NuGet Server API v3" [1] in order to benefit from the improved experience. Currently, our customers are out of luck even if JetBrains releases simultaneously (because our TeamCity is not available publicly). [Quino goes public?] I've created an issue for Quino, "Make Quino Nuget packages available publicly" to track our progress in providing Quino packages to our customers in a more convenient way that also benefits from improvements to the debugging workflow with Nuget Packages. If we published Quino packages to NuGet (or MyGet, which allows private packages), then we would have the benefit of the latest Nuget protocol/improvements for both ourselves and our customers as soon as it's available. Alternatively, we could also proxy our TeamCity feed publicly. We're still considering our options there. As you can see, we're always thinking about the development experience for both our developers and our customers. We're fine-tuning on several fronts to make developing and debugging with Quino a seamless experience for all developers on all platforms. We'll keep you posted. -------------------------------------------------------------------------------- [1] This issue has been closed and released with R# in the latest versions.

Removing unwanted references to .NET 4.6.1 from web applications

2019-01-20T21:55:36+01:00

Published by marco on 20. Jan 2019 21:55:36 ------------------------------------------------------------------------

Note: this article was originally published at "Encodo.com" in July, 2018.

--------------------------------------------------------------------------------

The title is a bit specific for this blog post, but that's the gist of it: we ended up with a bunch of references to an in-between version of .NET (4.6.1) that was falsely advertising itself as a more optimal candidate for satisfying 4.6.2 dependencies. This is a known issue; there are several links to MS GitHub issues below. In this blog, I will discuss direct vs. transient dependencies as well as internal vs. runtime dependencies.

[tl;dr]

If you've run into problems with an application targeted to .NET Framework 4.6.2 that does not compile on certain machines, it's possible that the binding redirects Visual Studio has generated for you use versions of assemblies that aren't installed anywhere but on a machine with Visual Studio installed.

How I solved this issue:

* Remove the C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461\ directory
* Remove all System* binding redirects
* Clean out all bin/ and obj/ folders
* Delete the .vs folder (may not be strictly necessary)
* Build in Visual Studio
* Observe that a few binding-redirect warnings appear
* Double-click them to re-add the binding redirects, but this time to actual 4.6.2 versions (you may need to add <AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects> to your project)
* Rebuild and verify that you have no more warnings

The product should now run locally and on other machines. For more details, background and the story of how I ran into and solved this problem, read on.

Note: I published a recent article, ".NET Tips and Resources" , containing a link to a video by Immo Landwerth, in which he says "If you want to be compatible with .NET Core 1.5 or lower, then you can use .NET Framework 4.6.1.
For .NET Standard compatibility, you should definitely use .NET Framework 4.7.2 instead." That will probably fix the problem as well. Moving to .NET Core will also fix the problem, as all binding is handled automatically there. [Building Software] What do we mean when we say that we "build" an application? Building is the process of taking a set of inputs and producing an artifact targeted at a certain runtime. Some of these inputs are included directly while others are linked externally. * Examples of direct inputs are the binary artifacts produced from the source code that comprises your application * Examples of external inputs are OS components and runtime environments The machine does exactly what you tell it to, so it's up to you to make sure that your instructions are as precise as possible. However, you also want your application to be flexible so that it can run on as wide an array of environments as possible. Your source code consists of declarations. We've generally got the direct inputs under control. The code compiles and produces artifacts as expected. It's the external-input declarations where things go awry. What kind of external inputs does our application have? * System dependencies in the runtime target (assemblies like System.Runtime, System.Data, etc.), each with a minimum version * Third-party dependencies pulled via NuGet, each with a minimum version How is this stitched together to produce the application that is executed? * The output folder contains our application, our own libraries and the assemblies from NuGet dependencies * All other dependencies (e.g. system dependencies) are pulled from the environment The NuGet dependencies are resolved at build time. All resources are pulled and added to the release on the build machine. There are no run-time decisions to make about which versions of which assemblies to use. 
Dependencies come in two flavors: * Direct: A reference in the project itself * Transient: A direct reference inherited from another direct or transient reference It is with the transient references that we run into issues. The following situations can occur: * A transient dependency is referenced one or more times with the same version. This is no problem, as the builder simply uses that version or substitutes a newer version if that version is no longer available (rare, but possible) * A transient dependency is referenced in different versions. In this case, the builder tries to substitute a single version for all requirements. This generally works OK since most dependencies require a given version or higher. It may be that one or another library cannot work with all newer versions, but this is also rare. In this case, the top-level assembly (the application) must include a hint (an assembly-binding redirect) that indicates that the substitution is OK. More on these below. * A transient dependency requires a lower version than the version that is directly referenced. This is also not a problem, as the transient dependency is satisfied by the direct dependency with the higher version. In this case, the top-level application must also include an assembly-binding redirect to allow the substitution without warning. * A transient dependency requires a higher version than the version that is directly referenced. This is an error (no longer just a warning) that must be solved by either downgrading the dependency that leads to the problematic transient dependency or upgrading the direct dependency. Generally, the application will upgrade the direct dependency. [Assembly-Binding Redirects] An application generally includes an app.config (desktop applications or services) or web.config XML file that includes a section where binding redirects are listed. 
A binding redirect indicates the range of versions that can be mapped (or redirected) to a certain fixed version (which is generally also included as a direct dependency). A redirect names the assembly, the range of old versions and the new version to which they are mapped. When the direct dependency is updated, the binding redirect must be updated as well (generally by updating the maximum version number in the range and the version number of the target of the redirect). NuGet does this for you when you're using packages.config. If you're using Package References, you must update these manually. This situation is currently not so good, as it increases the likelihood that your binding redirects remain too restrictive.

[NuGet Packages]

NuGet packages are resolved at build time. These dependencies are delivered as part of the deployment. If they could be resolved on the build machine, then they are unlikely to cause issues on the deployment machine.

[System Dependencies]

Where the trouble comes in is with dependencies that are resolved at execution time rather than build time. The .NET Framework assemblies are resolved in this manner. That is, an application that targets .NET Framework expects certain versions of certain assemblies to be available on the deployment machine. We mentioned above that the algorithm sometimes chooses the desired version or higher. This is not the case for dependencies that are in the assembly-binding redirects. Adding an explicit redirect locks the version that can be used. This is generally a good idea as it increases the likelihood that the application will only run in a deployment environment that is extremely close or identical to the development, building or testing environment.

[Aside: Other Bundling Strategies]

How can we avoid these pesky run-time dependencies? There are several ways that people have come up with, in increasing order of flexibility:

* Deliver hardware and software together.
This is common in industrial applications and used to be much more common for businesses, as well. Nearly bulletproof. If it worked in the factory, it will work for the customer. * Deliver a VM (virtual machine) as your application. This includes the entire execution environment right down to the hardware. Safe, but inefficient. * Use a container (e.g. Docker) to deliver a description of the execution environment. The image is built to match the declaration. This is also quite stable and can avoid many of the substitution errors outlined above. If components are outdated, the machine fails to start and the definition must first be updated (and, presumably, tested). This type of deployment is getting more reliable but is also overkill for many applications. * Deliver the runtime with the application instead of describing the runtime you'd like to have. Targeting .NET Core instead of .NET Framework includes the runtime. This seems like a nice alternative and it's not surprising that Microsoft went in this direction with .NET Core. It's a good solution to the external-dependency issues outlined above. To sum up: * A VM delivers the OS, runtime and application. * A Container delivers a description of the OS and runtime as well as the application itself. * .NET Core includes the runtime and application and is OS-agnostic (within reason). * .NET Framework includes only the application and some directives on the remaining components to obtain from the runtime environment. Our application targets .NET Framework (for now). We're looking into .NET Core, but aren't ready to take that step yet. [Where can the deployment go wrong?] To sum up the information from above, problems arise when the build machine contains components that are not available on the deployment machine. How can this happen? Won't the deployment machine just use the best match for the directives included in the build? Ordinarily, it would. 
However, if you remember our discussion of assembly-binding redirects above, those are set in stone. What if you included binding redirects that required versions of system dependencies that are only available on your build machine ... or even your developer machine?

[Special Tip for Web Applications]

We actually discovered an issue in our deployment because the API server was running, but the Authentication server was not. The Authentication server was crashing because it couldn't find the runtime it needed in order to compile its Razor views (it has ASP.Net MVC components). We only discovered this issue on the deployment server because the views were only ever compiled on-the-fly. To catch these errors earlier in the deployment process, you can enable pre-compiling views in release mode so that the build server will fail to compile instead of producing a build that will sometimes fail to run. Add <MvcBuildViews>true</MvcBuildViews> to any MVC projects in the PropertyGroup for the release build, as shown in the example below:

    <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
      <DebugType>pdbonly</DebugType>
      <Optimize>true</Optimize>
      <OutputPath>bin</OutputPath>
      <DefineConstants>TRACE</DefineConstants>
      <ErrorReport>prompt</ErrorReport>
      <WarningLevel>4</WarningLevel>
      <LangVersion>6</LangVersion>
      <MvcBuildViews>true</MvcBuildViews>
    </PropertyGroup>

[How do I create a redirect?]

We mentioned above that NuGet is capable of updating these redirects when the target version changes. Redirects are not very easy to write by hand. Most bindings are created automatically when MSBuild emits a warning that one would be required in order to avoid potential runtime errors. If you compile with MSBuild in Visual Studio, the warning indicates that you can double-click the warning to automatically generate a binding. If the warning doesn't indicate this, then it will tell you that you should add the following to your project file:

    <AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects>

After that, you can rebuild to show the new warning, double-click it and generate your assembly-binding redirect.

[How did we get the wrong redirects?]

When MSBuild generates a redirect, it uses the highest version of the dependency that it found on the build machine. In most cases, this will be the developer machine.
A developer machine tends to have more versions of the runtime targets installed than either the build or the deployment machine. A Visual Studio installation, in particular, includes myriad runtime targets, including many that you're not using or targeting. These are available to MSBuild but are ordinarily ignored in favor of more appropriate ones. That is, unless there's a bit of a bug in one or more of the assemblies included with one of the SDKs...as there is with the net461 distribution in Visual Studio 2017. Even if you are targeting .NET Framework 4.6.2, MSBuild will still sometimes reference assemblies from the 461 distribution because the assemblies are incorrectly marked as having a higher version than those in 4.6.2 and are taken first. I found the following resources somewhat useful in explaining the problem (though none really offer a solution): * ".dll in Microsoft.NET.Build.Extensions\net461 targets .NET Framework 4.6.1" * "ImplicitlyExpandNETStandardFacades option on MSBuild Extensions corrupts build" (GitHub issue) * "MS Build Extensions file corrupt my bin (Web API) folder" How can you fix the problem if you're affected? You'll generally have a crash on the deployment server that indicates a certain assembly could not be loaded (e.g. System.Runtime). If you show the properties for that reference in your web application, do you see the path C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461 somewhere in there? If so, then your build machine is linking in references to this incorrect version. If you let MSBuild generate binding redirects with those referenced paths, they will refer to versions of runtime components that do not generally exist on a deployment machine. Tips for cleaning up: * Use MSBuild to debug this problem. R# Build is nice, but not as good as MSBuild for this task. * Clean and Rebuild to force all warnings * Check your output carefully. 
* Do you see warnings related to package conflicts? * Ambiguities? * Do you see the path C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461 in the output? A sample warning message: [ResolvePackageFileConflicts] Encountered conflict between Platform:System.Collections.dll and CopyLocal:C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461\lib\System.Collections.dll. Choosing CopyLocal:C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461\lib\System.Collections.dll because AssemblyVersion 4.0.11.0 is greater than 4.0.10.0. [The Solution] As mentioned above, but reiterated here, this is what I did to finally stabilize my applications: * Remove the C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\Microsoft\Microsoft.NET.Build.Extensions\net461\ directory * Remove all System* binding redirects * Clean out all bin/ and obj/ folders * Delete the .vs folder (may not be strictly necessary) * Build in Visual Studio * Observe that a few binding-redirect warnings appear * Double-click them to re-add the binding redirects, but this time to actual 4.6.2 versions (you may need to add <AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects> to your project) * Rebuild and verify that you have no more warnings * Deploy and TADA! [One more thing] When you install any update of Visual Studio, it will silently restore the files that you removed. So be aware and check the folder after any installations or upgrades to make sure that the problem doesn't creep up on you again.
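For reference, an assembly-binding redirect of the kind discussed above typically has the following shape in a web.config or app.config. This is a generic sketch: the assembly name, public-key token and version numbers are illustrative assumptions, not the values from the affected projects.

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <!-- Illustrative assembly; substitute the one named in the MSBuild warning -->
        <assemblyIdentity name="System.Runtime" publicKeyToken="b03f5f7f11d50a3a" culture="neutral" />
        <!-- Every request for a version in the old range is redirected to newVersion -->
        <bindingRedirect oldVersion="0.0.0.0-4.1.2.0" newVersion="4.1.2.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```

The newVersion is the part to verify after a rebuild: it has to name a version that actually exists on the deployment machine, not only on the machine that generated the redirect.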

Quino's Roadmap to .NET Standard and .NET Core

2019-01-20T21:49:13+01:00

Published by marco on 20. Jan 2019 21:49:13 ------------------------------------------------------------------------ Note: this article was originally published at "Encodo.com" in May, 2018. -------------------------------------------------------------------------------- With Quino 5, we've gotten to a pretty good place organizationally. Dependencies are well-separated into projects—and there are almost 150 of them. We can use code-coverage, solution-wide-analysis and so on without a problem. TeamCity runs the ~10,000 tests quickly enough to provide feedback in a reasonable time. The tests run even more quickly on our desktops. It's a pretty comfortable and efficient experience, overall. [Monolithic Solution: Pros and Cons] As of Quino 5, all Quino-related code was still in one repository and included in a single solution file. Luckily for us, Visual Studio 2017 (and Rider and Visual Studio for Mac) were able to keep up quite well with such a large solution. Recent improvements to performance kept the experience quite comfortable on a reasonably equipped developer machine. Having everything in one place is both an advantage and disadvantage: when we make adjustments to low-level shared code, the refactoring is applied in all dependent components, automatically. If it's not 100% automatic, at least we know where we need to make changes in dependent components. This provides immediate feedback on any API changes, letting us fine-tune and adjust until the API is appropriate for known use cases. On the other hand, having everything in one place means that you must make sure that your API not only works for but compiles and tests against components that you may not immediately be interested in. For example, we've been pushing much harder on the web front lately. Changes we make in the web components (or in the underlying Quino core) must also work immediately for dependent Winform and WPF components. Otherwise, the solution doesn't compile and tests fail. 
While this setup had its benefits, the drawbacks were becoming more painful. We wanted to be able to work on one platform without worrying about all of the others. On top of that, all code in one place is no longer possible with cross-platform support. Some code (Winform and WPF) doesn't run on Mac or Linux. [1] The time had come to separate Quino into a few larger repositories. [Separate Solutions] We decided to split along platform-specific lines. * Quino-Standard: all common code, including base libraries, application, configuration and IOC support, metadata, builders and all data drivers * Quino-WebApi: all web-related code, including remaining ASP.NET MVC support * Quino-Windows: all Windows-platform-only code (Windows-only APIs (i.e. native code) as well as Winform and WPF) The Quino-WebApi and Quino-Windows solutions will consume Quino-Standard via NuGet packages, just like any other Quino-based product. And, just like any Quino-based product, they will be able to choose when to upgrade to a newer version of Quino-Standard. [Quino-Standard] Part of the motivation for the split is cross-platform support. The goal is to target all assemblies in Quino-Standard to .NET Standard 2.0. The large core of Quino will be available on all platforms supported by .NET Core 2.0 and higher. This work is quite far along and we expect to complete it by August 2018. [Quino-WebApi] As of Quino 5.0.5, we've moved web-based code to its own repository and set up a parallel deployment for it. Currently, the assemblies still target .NET Framework, but the goal here is to target class libraries to .NET Standard and to use .NET Core for all tests and sample web projects. We expect to complete this work by August 2018 as well. [Quino-Windows] We will be moving all Winform and WPF code to its own repository, setting it up with its own deployment (as we did with Quino-WebApi). 
These projects will remain targeted to .NET Framework 4.6.2 (the lowest version that supports interop with .NET Standard assemblies). We expect this work to be completed by July 2018. [Quino-Mobile] One goal we have with this change is to be able to use Quino code from Xamarin projects. Any support we build for mobile projects will proceed in a separate repository from the very beginning. We'll keep you posted on work and improvements and news in this area. [Conclusion] Customers will, for the most part, not notice this change, except in minor version numbers. Core and platform versions may (and almost certainly will) diverge between major versions. For major versions, we plan to ship all platforms with a single version number. -------------------------------------------------------------------------------- [1] I know, Winform can be made to run on Mac using Mono. And WPF may eventually become a target of Xamarin. But a large part of our Winform UI uses the Developer Express components, which aren't going to run on a Mac. And the plans for WPF on Mac/Linux are still quite up in the air right now.

Convert Markdown to earthli format

2019-01-20T21:44:30+01:00

Published by marco on 20. Jan 2019 21:44:30 ------------------------------------------------------------------------ The earthli blogging format uses HTML-like formatting, described in the "lengthy manual" (with examples). However, Encodo's blogging back-end now uses Umbraco, with Markdown for content. I used to be able to cross-post with ease, by copy/pasting. Now, I need to convert the content from Markdown to earthli formatting. The following steps suffice to convert any article: 1. If there are attached media (e.g. graphics), save those locally 2. Create a new earthli article with the same title as the source article 3. Attach all media to the new article 4. Copy the main text; paste into Sublime Text 5. Find/replace the following regular expressions with the replacements: * ### ([^\n]+)$ => \1 * ## ([^\n]+)$ => \1 * \[([^!][^\]]+)\]$([^$]+)\) => \1 * \*\*([^\*]+)\*\* => \1 * _([^_]+)_ => \1 * ```txt\n([^`]+)\n``` =>

\1

* ```[a-z]+\n([^`]+)\n``` => \1 * `([^`]+)` => \1 6. Manually re-attach all media in the appropriate locations [1] I haven't automated this process yet because I only rarely transfer articles. -------------------------------------------------------------------------------- [1] This is easy since the earthli UI includes an attachment formatter that lets you place, align, link and scale the attachment better than in the original specification anyway.
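If the conversion ever does get automated, the find/replace steps above could be sketched as an ordered list of regular-expression rules, roughly as follows. This is a minimal sketch under stated assumptions: the class name MarkdownToEarthli is hypothetical, and the tags on the replacement side (h, b, i, code, c) are placeholders chosen for illustration, not confirmed earthli syntax. Links and media are left out because they need manual attention anyway.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MarkdownToEarthli
{
    // Ordered (pattern, replacement) pairs. The tags on the right-hand side are
    // placeholders (assumptions), not confirmed earthli syntax.
    private static readonly (string Pattern, string Replacement)[] Rules =
    {
        (@"^### (.+)$", "<h level=\"3\">$1</h>"),
        (@"^## (.+)$", "<h>$1</h>"),
        (@"\*\*([^*]+)\*\*", "<b>$1</b>"),
        // Naive: this also rewrites underscores inside code (e.g. snake_case names).
        (@"_([^_]+)_", "<i>$1</i>"),
        // Fenced blocks must be handled before inline code spans.
        (@"`{3}[a-z]*\n([^`]+)\n`{3}", "<code>$1</code>"),
        (@"`([^`]+)`", "<c>$1</c>"),
    };

    public static string Convert(string markdown)
    {
        var result = markdown;
        foreach (var (pattern, replacement) in Rules)
        {
            // Multiline makes ^ and $ match at line boundaries (assumes LF line endings).
            result = Regex.Replace(result, pattern, replacement, RegexOptions.Multiline);
        }

        return result;
    }
}
```

Calling MarkdownToEarthli.Convert on the pasted article text applies the rules in order; as in the manual process, the order matters because the fenced-block rule has to run before the inline-code rule.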

Compiler Pessimism

2019-01-08T22:46:23+01:00

Published by marco on 8. Jan 2019 22:46:23 Updated by marco on 20. Jan 2019 11:22:48 ------------------------------------------------------------------------ "In practice, nearly everything you write is potentially dependent upon the order of evaluation, but in practice it isn't because you are not a nincompoop." He completes the thought with "[b]ut the compiler doesn't know that. The compiler must adhere to the letter of the language standard, because it has to compile insane code as well as sane code."

Inventing languages for the sake of it

2019-01-08T22:28:18+01:00

Published by marco on 8. Jan 2019 22:28:18 ------------------------------------------------------------------------ The article "Fear, trust and JavaScript: When types and functional programming fail" presents issues in JavaScript and a solution: use another language. It lists several newer ones that are completely untested. But the main problem that the article mentions can't be solved 100% by any language. The main problem is at the boundaries of your application: inputs. When you get data from an external source, you have to validate it somehow before passing it along to the rest of the application. No language can remove this requirement. It doesn't matter how functional, curryable, immutable or sexy it is; it just can't do it. What you have instead is languages with more built-in mechanisms for defining types that allow the rest of the program to work safely with the data, once it's been validated. So if your language supports immutability and types, then you can validate that the data is OK before hydrating the object from the serialized source (e.g. JSON). What we're trying to avoid is unexpected runtime errors, no? Or, at the very least, we want a runtime error of a known type that precisely identifies the problem with the incoming data. That is, the data either conforms to the definition -- and the definition is statically typed -- or there is an error. The desire is to push this gatekeeper/conversion to a single place so that the rest of the application works with the compiler to find errors rather than the programmer defensively checking throughout the source. However, suggesting that PureScript or Elm or ClojureScript are somehow better at doing this than JavaScript is incorrect. Where they are better is in providing language mechanisms that allow you to precisely define the shape of the data. Despite the author's suggestions, they are not that much different from TypeScript. 
The main difference is that TypeScript chose to stay much closer to JavaScript for compatibility reasons. At the time that TypeScript came out, this was a reasonable requirement, since almost no-one wanted to move completely away from JavaScript. Five years later and the development world is ready for other languages. With WASM (WebAssembly) as a target (instead of just JavaScript), there are more possibilities than ever. JavaScript as a compile target is still open to runtime errors. When you use a higher-level language, you're restricting the range of functionality that you can use in the target bytecode/machine code. That is, when you write an if-statement in C, you're using the JMP instruction, but you're only able to JMP to certain address locations instead of anywhere in addressable memory. It's the same with JavaScript as a compile target. It doesn't really matter that JavaScript allows too much -- what matters is what the higher-level language allows. TypeScript may still allow too much, but it's worlds better than JavaScript. It's true that PureScript or Elm or ClojureScript can close some loopholes that TypeScript leaves open. That's fine. But if you're going to just use JavaScript (or WASM) as a compile target, then why not choose a more-established language like C# or F#?
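To make the gatekeeper idea concrete, here is a minimal sketch in C# of validating input once, at the boundary, and handing only an immutable, typed object to the rest of the program. The Order shape, the property names and the OrderGate and InvalidOrderException types are all hypothetical, invented for illustration; the sketch assumes System.Text.Json is available (it ships with .NET Core 3.0 and later).

```csharp
using System;
using System.Text.Json;

// Immutable, fully validated representation used by the rest of the program.
public sealed class Order
{
    public Order(string customer, int quantity)
    {
        Customer = customer;
        Quantity = quantity;
    }

    public string Customer { get; }
    public int Quantity { get; }
}

// The single, known error type for malformed input.
public sealed class InvalidOrderException : Exception
{
    public InvalidOrderException(string message, Exception inner = null)
        : base(message, inner) { }
}

public static class OrderGate
{
    // The only place where untyped data is inspected.
    public static Order Parse(string json)
    {
        JsonDocument document;
        try
        {
            document = JsonDocument.Parse(json);
        }
        catch (JsonException e)
        {
            throw new InvalidOrderException("Input is not valid JSON.", e);
        }

        using (document)
        {
            var root = document.RootElement;
            if (root.ValueKind != JsonValueKind.Object)
                throw new InvalidOrderException("Expected a JSON object.");

            if (!root.TryGetProperty("customer", out var customer)
                || customer.ValueKind != JsonValueKind.String)
                throw new InvalidOrderException("'customer' must be a string.");

            if (!root.TryGetProperty("quantity", out var quantity)
                || quantity.ValueKind != JsonValueKind.Number
                || !quantity.TryGetInt32(out var count)
                || count <= 0)
                throw new InvalidOrderException("'quantity' must be a positive integer.");

            return new Order(customer.GetString(), count);
        }
    }
}
```

Everything downstream of OrderGate.Parse works with Order and lets the compiler check its usage; the only runtime failure left is an InvalidOrderException that names the offending field.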

Anyone Can Be a Programmer, Right?

2019-01-08T22:24:05+01:00

Published by marco on 8. Jan 2019 22:24:05 ------------------------------------------------------------------------ The post on Reddit called "Someone asked me to make a site for them and I don't know how the fuck I'm supposed to go about it." is about exactly what it sounds like it's about. Amid the flurry of comments with recommendations on how to pretend he (or she) knows how to build a web site by using tools he's (or she's) never heard of, I chimed in with, What is it about software that makes people who have never done it think that they can do it professionally? What if your neighbor had heard you were a carpenter and had asked you to make a dining-room set for "good money"? Would you watch YouTube videos about how to make furniture and then charge money for the first furniture you ever made? What about if they'd asked for a haircut/trim/style/dye? Would you just go for it, after having asked around on /r/coiffeur for a few minutes? Or maybe they'd heard you were a chef and offered "good money" to cook their Thanksgiving dinner for them? Would you risk doing that? Probably not, because if you've never done any of those things, you're not good at them and charging for doing them can only backfire horribly. Unless your neighbor is a sap and a fool, in which case go for it.

Ray-tracing on postcards

2018-12-31T22:55:26+01:00

Published by marco on 31. Dec 2018 22:55:26 ------------------------------------------------------------------------ The article "Deciphering The Postcard Sized Raytracer" by Fabien Sanglard is a wonderfully presented breakdown of how the path tracer found on a postcard does its magic. It's not super-fast (it takes 3 minutes to produce a much rougher version on the author's machine). He includes his final "cleaned-up source code" . [image] It comes from the same person who made the business card ray-tracer discussed in the article "Decyphering The Business Card Raytracer" by Fabien Sanglard .

The problem with slow development tools

2018-12-30T23:03:19+01:00

Published by marco on 30. Dec 2018 23:03:19 Updated by marco on 30. Dec 2018 23:03:37 ------------------------------------------------------------------------ The article ""Modern" C++ Lamentations" by Aras Pranckevicius is a wide-ranging rant about the inefficiency of C++ template programming and the degree to which it's inappropriate for many of the areas where C++ is used. Aras is one of the developers for the Unity game engine. In particular, he highlights the disastrous compilation and execution speeds when using a lot of the STL. Not only that, but debugging is extremely slow, due to the inordinate amount of extra symbol information associated with hundreds of thousands of lines of code pulled in to implement relatively simple concepts that are standard in other languages, libraries and runtimes. On top of it all, even the high-level C++ code isn't very easy to read, despite the tremendous amount of abstraction. The optimized version of C++ code has an even worse compilation time, but it has a comparable/reasonable run-time to the C/C++-style version. However, it's very difficult to debug optimized code, which makes it doubly bad for development. Interactive development is hindered because of long compile times and, when debugging is necessary, most introspection tools (e.g. reading variables) don't work very well. It's the rare developer who can make headway debugging optimized code. He compares versions of an algorithm built using "classic" C/C++ programming vs. STL programming. He then compares to C#, which compiles and runs and debugs very quickly -- and is very easy to read, to boot. The problem with C++ boils down to its approach of making "everything a library". It's almost like an exercise in abstraction: since a few generic-programming concepts can be used to build everything in the library rather than the language, that's what C++ does. It's almost as if it does it to prove that it can be done. 
I'm all for removing redundancy in a language, but C++ is far from such a language. It's almost like the designers don't use their own language. He cites "Christer Ericson" "Goal of programmers is to ship, on time, on budget. It’s not “to produce code.” IMO most modern C++ proponents 1) overassign importance to source code over 2) compile times, debug[g]ability, cognitive load for new concepts and extra complexity, project needs, etc. 2 is what matters." Aras continues discussing the future of C++ and how it is currently used in game companies, for example. These are the companies using C++ the most. Rust is making some inroads, but the area is dominated by C/C++. Finally, he has some good advice for programmers -- for any professional, really -- on how to take criticism and turn it into something useful. "Ignoring literal trolls who complain on the internet “just for the lulz”, [the] majority of complaints do have [an] actual issue or problem behind it. It might be worded poorly, or exaggerated, or whoever is complaining did not think about other possible viewpoints, but there is a valid issue behind the complaint anyway. "What I do whenever someone complains about thing I’ve worked on, is try to forget about “me” and “work I did”, and get their point of view. What are they trying to solve, and what problems do they run into? The purpose of any software/library/language is to help their users solve the problems they have. It might be a perfect tool at solving their problem, an “ok I guess that will work” one, or a terribly bad one at that." As a postscript, the article "It is fast or it is wrong" by Nikita Tonsky discusses a very similar issue with Clojure vs. ClojureScript. "What do ClojureScript/Google Closure compilers do for so long? They are wasting your time, that’s what. Of course it’s nobody’s fault, but in the end, this whole solution is simply wrong. 
We can do the same thing much faster, we have proof of that, we have the means to do it, it just happens that we are not. But we could. If we wanted to. That huge overhead you’re paying, you’re paying it for nothing. You don’t get anything from being on JS, except a 2× performance hit and astronomical build times." I find these points interesting because programming is very much about which tools you use and how they help you to turn your work around more quickly. I'm in charge of choosing which languages, libraries and tools we use at Encodo and I'm hyper-aware of the efficiency losses when developers are hindered by their tools or libraries. Being the lead developer of our framework Quino makes me doubly aware of this. If you have a very slow feedback loop, then you'll take much longer to get your work done. I remember back in the late 90s/early 2000s, working with C++, where I would have to schedule builds because it took over 30 minutes to rebuild all of my static libraries if I made a low-level change. This was on a project that cross-compiled to Mac and Windows. Instead of working on my project, I spent way too much time massaging PCH files and avoiding making low-level changes so that I could continue testing. Bad tools that run too slowly are a problem. That's why you should always be very careful in choosing your languages, libraries and environments. Jumping ship to the "new hotness" very often means that you're going to have your time wasted by tools that aren't ready for prime time.

.NET Tips and Resources

2018-12-30T22:12:44+01:00

Published by marco on 30. Dec 2018 22:12:44 ------------------------------------------------------------------------ If you're a .NET developer, this is the video you've been looking for: [media] Immo tells you everything you need to know about NuGet, using Package References, switching to .NET Core, and using Assembly-Binding Redirects in .NET Framework (they're not necessary in .NET Core). He also includes an effusive apology for the nightmare of compatibility issues that accompanied the purported interoperability between .NET 4.6.1 and .NET Core. If you want to be compatible with .NET Core 1.5 or lower, then you can use .NET Framework 4.6.1. For .NET Standard compatibility, you should definitely use .NET Framework 4.7.2 instead. He includes a list of resources for digging through open-source code and checking platform and target compatibility. ".NET API Catalog" While you can use "Microsoft Docs" to find out which targets or platforms support which APIs, this resource lets you do it faster. You can browse a giant list of namespaces and click on any one of them to see the types, and then drill down to properties and methods. For each level, you can see a nice list of supported targets/platforms and the assemblies to use. You can also "Search", which opens what looks like a terminal that lets you camel-case search for your namespace, type or member. Selecting a result takes you to the location in the catalog. ".NET Framework Source-code browser" Yes, you read that correctly. I had no idea that this existed -- I've been digging through decompiled assembly code instead. This is much faster and includes the original documentation and comments. The source is syntax-highlighted and all types, methods and properties are linked. There's a document explorer, namespace explorer and project manager, all linked up very nicely. You can click any element and show all references in a separate pane. 
Clicking one of those references navigates there -- and other references in that file are also highlighted. If that's not sufficient, you can even download the entire source code as a ZIP file from here -- complete with solution and project files so you can open it in Visual Studio for browsing. ".NET Core Source Browser" This is the same thing as the link above, but for .NET Core sources. "FuGet" This is a NuGet package browser combined with an API browser over all of the assemblies in a package. * You can search the package for keywords. * You can browse types and see the formatted XML documentation. * You can diff versions of the package to see which APIs have changed. * You can jump to the code in the Git repository, if it's given in the package. It's an open-source GitHub project, so you could even run your own copy for diffing privately published packages.

Which type should you register in an IOC container?

2018-07-16T21:55:42+02:00

Published by marco on 16. Jul 2018 21:55:42 ------------------------------------------------------------------------ [Use Case] I just ran into an issue recently where a concrete implementation registered as a singleton was suddenly not registered as a singleton because of architectural changes. The changes involved creating mini-applications within a main application, each of which has its own IOC. Instead of creating controllers using the main application, I was now creating controllers with the mini-application instead (to support multi-tenancy, of which more in an upcoming post). [Silent Replacement of Singleton with Transient] Controllers are, by their nature, transient; a new controller is created to handle each incoming request. In the original architecture, the concrete singleton was injected into the controller and all controller instances used the same shared instance. In the new architecture, the registration was not present in the mini-application (at first), which led to a (relatively) subtle bug: a transient and freshly created instance was injected into each new controller. In cases where the singleton is a stateless algorithm, this wouldn't be a logical problem at all. At the very worst, you're over-allocating -- but you probably wouldn't notice that, either. In this case, the singleton was a settings object, configured at application startup. The configured object was still in the main application's IOC, but not registered in the mini-application's IOC. Because the singleton was registered on a concrete type rather than an interface, the semantic error occurred silently instead of throwing a lifestyle-mismatch or unregistered-interface exception. [A Straightforward Fix] This is only one of the reasons that I recommend using interfaces as the anchoring type of an IOC registration. To fix the issue, I did exactly this: I extracted an interface from the class and used the interface everywhere (except for the implementing type of the registration). 
Re-running the test caused an immediate exception rather than a strange data bug (which resulted because the default configuration in the concrete type was just correct enough to allow it to limp to a result). To show an example, instead of the following,

    application.RegisterSingle<ApiSettings>()

I used,

    application.RegisterSingle<IApiSettings, ApiSettings>()

This still didn't fix the crash because the mini-application doesn't get that registration automatically. I also can't use the same registration as above because that would just create a new unconfigured ApiSettings in each mini-application (the same as I had before, but now as a singleton). To go that route, I would have to replicate the configuration-loading for the ApiSettings as well. And I don't want to do that. Instead, I just injected the IApiSettings from the main application to the component responsible for creating the mini-application and registered the object as a singleton directly, as shown below.

    public class MiniApplicationFactory
    {
      public MiniApplicationFactory([NotNull] IApiSettings apiSettings)
      {
        // Fail fast if the main application did not provide the settings
        if (apiSettings == null) { throw new ArgumentNullException(nameof(apiSettings)); }

        _apiSettings = apiSettings;
      }

      public IApplication CreateApplication()
      {
        // Register the already-configured instance as a singleton in the mini-application
        return new Application().UseRegisterSingle(_apiSettings);
      }

      [NotNull]
      private readonly IApiSettings _apiSettings;
    }

On a side note, whereas C# syntax has become more concise and powerful from version to version, I still think it has a way to go in terms of terseness for such simple objects. For such things, Kotlin and TypeScript nicely illustrate what such a syntax could look like. [1] [Other Drawbacks] I mentioned above that this is only "one" of the reasons I don't like registering concrete singletons. The other two reasons are: 1. Complicates replacement: If the registered type is a concrete type, then any replacement must inherit from this type. The base class has to be constructed more carefully in order to allow for all foreseeable customizations. 
With an interface, the implementor is completely free to either use the existing class as a base or to re-implement the interface entirely. 2. Limits Mocking: Related to the first reason is that mocking is limited in its ability to override non-virtual methods. Even without a mocking library, you're just as hard-pressed to work around unwanted behavior in a hand-coded mock as you are with an actual replacement (as described above). Such limitations are non-existent with interfaces. -------------------------------------------------------------------------------- [1] I'm still waiting for C# to clean up a bit more of this syntax for me. The [NotNull] should be a language feature checked by the compiler so that the ArgumentNullException is no longer needed. On top of that, I'd like to see parameter properties, as in "TypeScript" (this is where you can prefix a constructor parameter with a keyword to declare and initialize it as a property). With a few more C#-language iterations that included non-nullable reference types and parameter properties, the example could look like the code below:

    public class MiniApplicationFactory
    {
      public MiniApplicationFactory(private IApiSettings apiSettings) { }

      IApplication CreateApplication()
      {
        return new Application().UseRegisterSingle(apiSettings);
      }
    }
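The silent replacement described in this article can be reproduced with a few lines of C#. The sketch below uses a deliberately tiny, hypothetical container (TinyContainer is not Quino's IOC, and its behavior of creating unregistered concrete types on demand is an assumption made to mirror the bug); the IApiSettings and ApiSettings members are likewise invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public interface IApiSettings
{
    string BaseUrl { get; }
}

public sealed class ApiSettings : IApiSettings
{
    // A default that is "just correct enough" to hide a missing configuration.
    public string BaseUrl { get; set; } = "http://localhost";
}

// Hypothetical container, not Quino's: it auto-creates unregistered concrete types.
public sealed class TinyContainer
{
    private readonly Dictionary<Type, object> _singletons = new Dictionary<Type, object>();

    public void RegisterSingle<TService>(TService instance) where TService : class
    {
        _singletons[typeof(TService)] = instance;
    }

    public TService Resolve<TService>() where TService : class
    {
        var type = typeof(TService);
        if (_singletons.TryGetValue(type, out var instance))
        {
            return (TService)instance;
        }

        if (type.IsInterface || type.IsAbstract)
        {
            throw new InvalidOperationException($"{type.Name} is not registered.");
        }

        // Silent fallback: a fresh, unconfigured (transient) instance.
        return (TService)Activator.CreateInstance(type);
    }
}

public static class Demo
{
    public static void Main()
    {
        // The settings were configured in the main application only.
        var mini = new TinyContainer();

        // Concrete type: resolves silently, but with default values.
        Console.WriteLine(mini.Resolve<ApiSettings>().BaseUrl);

        // Interface: fails immediately and loudly.
        try
        {
            mini.Resolve<IApiSettings>();
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message);
        }

        // The fix: hand the configured instance to the mini-application.
        mini.RegisterSingle<IApiSettings>(new ApiSettings { BaseUrl = "https://example.org/api" });
        Console.WriteLine(mini.Resolve<IApiSettings>().BaseUrl);
    }
}
```

Resolving the concrete type succeeds with the default value, which is the subtle data bug; resolving the interface throws until the configured instance is registered, which is the loud failure that anchoring registrations on interfaces buys you.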

Tools for maintaining Quino

2018-05-24T22:12:33+02:00

Published by marco on 24. May 2018 22:12:33 ------------------------------------------------------------------------ The "Quino roadmap" shows you where we're headed. How do we plan to get there? A few years back, we made a big leap in Quino 2.0 to split up dependencies in anticipation of the initial release of .NET Core. Three tools were indispensable: "ReSharper" , "NDepend" and, of course, Visual Studio. Almost all .NET developers use Visual Studio, many use ReSharper and most should have at least heard of NDepend. At the time, I wrote a series of articles on the migration from two monolithic assemblies (Encodo and Quino) to dozens of layered and task-specific assemblies that allow applications to include our software in a much more fine-grained manner. As you can see from the articles, NDepend was the main tool I used for finding and tracking dependencies. [1] I used ReSharper to disentangle them. * "Splitting up assemblies in Quino using NDepend (Part I)" * "Iterating with NDepend to remove cyclic dependencies (Part II)" * "The Road to Quino 2.0: Maintaining architecture with NDepend (part I)" * "The Road to Quino 2.0: Maintaining architecture with NDepend (part II)" Since then, I've not taken advantage of NDepend's features for maintaining architecture as much as I'd like. I recently fired it up again to see where Quino stands now, with 5.0 in beta. But, first, let's think about why we're using yet another tool for examining our code. Since I started using NDepend, other tools have improved their support for helping a developer maintain code quality. * ReSharper itself has introduced tools for visualizing project and type dependencies with very nice graphs. However, there is currently no support for establishing boundaries and getting ReSharper to tell me when I've inadvertently introduced new dependencies. In fact, ReSharper's only improved its support for quickly pulling in a dependency with its excellent NuGet-package integration. 
ReSharper is excellent for finding lower-level code smells, like formatting, style and null-reference issues, as well as language usage, missing documentation and code-complexity (with an extension). DotCover provides test-coverage data but I haven't used it for real-time analysis yet (I don't use continuous testing with ReSharper on Quino because I feel it would destroy my desktop). * Visual Studio has also been playing catch-up with ReSharper and has done an excellent job in the last couple of years. VS 2017 is much, much faster than its predecessors; without it, we would be foundering badly with a Quino solution with almost 150 projects. [2] Visual Studio provides Code Analysis and Portability Analysis and can calculate Code Metrics. Code Analysis is mostly covered by ReSharper, although it has a few extra inspections related to proper application and usage of the IDisposable pattern. The Portability Analysis is essential for moving libraries to .NET Standard but doesn't offer any insight into architectural violations like NDepend does. * We've recently started working with SonarQube on our TeamCity build server because a customer wanted to use it. It has a very nice UI and very nice reports, but doesn't go much farther than VS/R# inspections. Also, the report isn't in the UI, so it's not as quick to jump into the code. I don't want to review it here, since we only recently started working with it. It looks promising and is a welcome addition to that project. Hopefully more will reveal itself in time. * TeamCity provides a lot of the services that ReSharper also provides: inspections and code-coverage for builds. This takes quite a while, though, so we only run inspections and coverage for the Quino nightly build. The reports are nice but, as with SonarQube, of limited use because of the tenuous integration with Visual Studio. The integration works, but it's balky and we don't use it very much. 
Instead, we analyze inspections in real-time in Visual Studio with ReSharper and don't use real-time code-coverage. [3] * NDepend integrates right into Visual Studio and has a super-fast analysis with a very nice dashboard overview, from which you can drill down into myriad issues and reports and analyses, from technical debt (with very daunting but probably accurate estimates for repair) to type- and assembly-interdependency problems. NDepend can also integrate code-coverage results from DotCover to show how you're doing on that front on the dashboard as well. As with TeamCity and SonarQube, the analyses are retained as snapshots. With NDepend, you can quickly compare them (and comparing against a baseline is even included by default in the dashboard), which is essential to see if you're making progress or regressing. [4] NDepend also integrates with TeamCity, but we haven't set that up (yet). With a concrete .NET Core/Standard project in the wings/under development, we're finally ready to finish our push to make Quino Core ready for cross-platform development. For that, we're going to need NDepend's help, I think. Let's take a look at where we stand today. The first step is to choose what you want to cover. In the past, I've selected specific assemblies that corresponded to the "Core". I usually do the same when building code-coverage results, because the UI assemblies tend to skew the results heavily. As noted in a footnote below, we're starting an effort to separate Quino into high-level components (roughly, a core with satellites like Winform, WPF and Web). Once we've done that, the health of the core itself should be more apparent (I hope). For starters, though, I've thrown all assemblies in for both NDepend analysis as well as code coverage. Let's see how things stand overall. [image] The amount of information can be quite daunting but the latest incarnation of the dashboard is quite easy to read. 
All data is presented with a current number and a delta from the analysis against which you're comparing. Since I haven't run an analysis in a while, there's no previous data against which to compare, but that's OK. * Lines of Code * Code Elements (Types, Methods, etc.) * Comments (documentation) * Technical Debt * Code Coverage [5] * Quality Gates / Rules / Issues Let's start with the positive. * The Quino sources contain almost 50% documentation. That's not unexpected. The XML documentation from which we generate our developer documentation [6] is usually as long as or longer than the method itself. * We have a solid B rating for technical debt, which is really not bad, all things considered. I take that to mean that, even without looking, we instinctively produce code with a reasonable level of quality. Now to the cool part: you can click anything in the NDepend dashboard to see a full list of all of the data in the panel. Click the "B" on technical debt and you'll see an itemized and further-drillable list of the grades for all code elements. From there, you can see what led to the grade. By clicking the "Explore Debt" button, you get a drop-down list of pre-selected reports like "Types Hot Spots". Click lines of code and you get a breakdown of which projects/files/types/methods have the most lines of code. Click failed quality gates to see where you've got the most major problems (Quino currently has 3 categories). [image] Click "Critical" or "Violated" rules to see architectural rules that you're violating. As with everything in NDepend, you can pick and choose which rules should apply. I use the default set of rules in Quino. [image] Most of our critical issues are for mutually-dependent namespaces. This is most likely not root namespaces crossing each other (though we'd like to get rid of those ASAP) but sub-namespaces that refer back to the root and vice-versa. This isn't necessarily a no-go, but it's definitely something to watch out for. 
There are so many interesting things in these reports: * Don't create threads explicitly (this is something we've been trying to reduce; I already knew about the one remaining, but it's great to see it in a report as a tracked metric) * Methods with too many parameters (you can adjust the threshold, of course) * Types too big: we'd have to check these because some of them are probably generated code, in which case we'd remove them from analysis. * Abstract constructors should be protected: ReSharper also indicates this one, but we have it as a suggestion, not a warning, so it doesn't get regularly cleaned up. It's not critical, but a code-style thing. I find the NDepend report much easier to browse than the inspection report in TeamCity. Click the "Low" issues (Quino has over 46,000!) and you can see that NDepend analyzes your code at an incredibly low level of granularity. [image] * There are almost 10,000 cases where methods could have a lower visibility. This is good to know, but definitely low-priority. * Namespace does not correspond to file location: I'm surprised to see 4,400 violations because I thought that ReSharper managed that for us quite well. This one bears investigating – maybe NDepend found something ReSharper didn't or maybe I need to tweak NDepend's settings. Finally, there's absolutely everything, which includes boxing/unboxing issues [7], method-names too long, large interfaces, large instances (could also be generated classes). [image] These are already marked as low, so don't worry that NDepend just rains information down on you. Stick to the critical/high violations and you'll have real issues to deal with (i.e. code that might actually lead to bugs rather than code that leads to maintenance issues or incurs technical debt, both of which are more long-term issues). What you'll also notice in the screenshots is that NDepend doesn't just provide pre-baked reports: everything is based on its query language. 
That is, NDepend's analysis is lightning fast (takes only a few seconds for all of Quino) during which it builds up a huge database of information about your code that it then queries in real-time. NDepend provides a ton of pre-built queries linked from all over the UI, but you can adjust any of those queries in the pane at the top to tweak the results. The syntax is LINQ-based (NDepend calls it CQLinq) and there are a ton of comments in the query to help you figure out what else you can do with it. [image] As noted above, the amount of information can be overwhelming, but just hang in there and figure out what NDepend is trying to tell you. You can pin or hide a lot of the floating windows if it's all just a bit too much at first. In our case, the test assemblies have more technical debt than the code that they test. This isn't optimal, but it's better than the other way around. You might be tempted to exclude test assemblies from the analysis, to boost your grade, but I think that's a bad idea. Testing code is production code. Make it just as good as the code it tests to ensure overall quality. I did a quick comparison between Quino 4 and Quino 5 and we're moving in the right direction: the estimation of work required to get to grade A was already cut in half, so we've made good progress even without NDepend. I'm quite looking forward to using NDepend more regularly in the coming months. I've got my work cut out for me. -------------------------------------------------------------------------------- [1] Many thanks to Patrick Smacchia of NDepend for generously providing an evaluator's license to me over the years. [2] We came up with a plan for reducing the size of the core solution in a recent architecture meeting. More on that in a subsequent blog post. [3] Quino has 10,000 tests, many of which are integration tests, so a change to a highly shared component would trigger thousands of tests to run, possibly for minutes. 
I can't see how it would be efficient to run tests continuously as I type in Quino. I've used continuous testing in smaller projects and it's really wonderful (both with ReSharper and also Wallaby for TypeScript), but it doesn't work so well with Quino because of its size and highly generalized nature. [4] I ran the analysis on both Quino 4 and Quino 5, but wasn't able to directly compare results because I think I inadvertently threw them away with our nant clean command. I'd moved the ndepend out folder to the common folder and our command wiped out the previous results. I'll work on persisting those better in the future. [5] I generated coverage data using DotCover, but realized only later that I should have configured it to generate NDepend-compatible coverage data (as detailed in "NDepend Coverage Data"). I'll have to do that and run it again. For now, no coverage data in NDepend. This is what it looks like in DotCover, though. Not too shabby: [image] [6] Getting that documentation out to our developers is also a work-in-progress. Until recently, we've been stymied by the lack of a good tool and ugly templates. But recently we added "DocFX" support to Quino and the generated documentation is gorgeous. There'll be a post hopefully soon announcing the public availability of Quino documentation. [7] There's probably a lot of low-hanging fruit of inadvertent allocations here. On the other hand, if they're not code hot paths, then they're mostly harmless. It's more a matter of coding consistently. There's also an extension for ReSharper (the "Heap Allocations Viewer") that indicates allocations directly in the IDE, in real-time. I have it installed, and it's nice to see where I'm incurring allocations.

File-system consistency

2018-03-31T23:28:27+02:00

Published by marco on 31. Mar 2018 23:28:27 ------------------------------------------------------------------------ The long and technical article "Files are hard" by Dan Luu discusses several low-level and scholarly analyses of how common file-systems and user-space applications deal with read/write errors. * How theoretically consistent is the file system? * How well-documented are patterns that guarantee consistency? * How well-understood are these patterns in the communities using them? * How do common applications (e.g. source control, databases, etc.) use these patterns? * Are these applications guaranteeing consistency? * What about the file-system designs? Are those airtight? * Are the file-system implementations correct? * How do the various components deal with hardware degradation or failure? [Asynchronous programming is hard] File-system operations work with devices and are thus asynchronous by nature. The analyses discovered ordering issues similar to those found in multi-threaded code. "The most common class of error was incorrectly assuming ordering between syscalls. The next most common class of error was assuming that syscalls were atomic. These are fundamentally the same issues people run into when doing multithreaded programming. Correctly reasoning about re-ordering behavior and inserting barriers correctly is hard. But even though shared memory concurrency is considered a hard problem that requires great care, writing to files isn’t treated the same way, even though it’s actually harder in a number of ways." This is why most applications should use a framework or runtime support to access the file system. Even this might not be enough, though, if the implementation is still not robust enough for the application requirements. The .NET runtime has for quite a while now offered an API that uses async/await (i.e. a promise/future-based API), which at the very least indicates the asynchronous nature of these calls, with separate paths for success and error. 
This is better than nothing, even if the implementation occasionally fails to properly propagate errors (as we see with the POSIX APIs below). At any rate, the article drives home the point that programming against file systems is hard. "People almost always just run some tests to see if things work, rather than making sure they’re coding against what’s legal in a POSIX filesystem." Having a few tests is better than nothing, but it's even better to hoist your code up as many levels of abstraction as possible and avoid having to know about how to interleave fsync calls at all. Unless you're writing a database or a source-control system, right? [A common problem: documentation] He goes on to discuss "how much misinformation is out there" and that "it’s hard for outsiders to troll through a decade and a half of mailing list postings to figure out which ones are still valid and which ones have been obsoleted" This is a common problem that applies not just to low-level systems programming, but to any other programming problem. We have a surfeit of choice: just search online and you'll find something that matches what you searched. * Is the source authoritative? * Is the source even competent? * Is the source relevant? Or just kind of related? * Is the source current? Or outdated? * Are you in an echo chamber that feels authoritative but is just a bunch of low-skill developers at a local maximum when the real answer to your problem is elsewhere and is actually much more elegant? I recently ran into this phenomenon when learning Docker. Docker has changed and improved so much that the Internet is literally littered with old and overly complicated solutions to problems that either no longer exist or that can be solved with a simple one-liner in a configuration file. 
If you follow the instructions you find online, it's possible that you'll have something that works the way you want it to, but it's also very likely that you'll end up with a Frankenstein's Monster of a setup that kind of works but is fragile in unnecessary ways. [Drives are not infallible] From the article: "So far, we’ve assumed that the disk works properly, or at least that the filesystem is able to detect when the disk has an error via SMART or some other kind of monitoring. I’d always figured that was the case until I started looking into it, but that assumption turns out to be completely wrong." That sounds bad, of course. It's not something we user-space programmers ever really think about, is it? You read from a file, you write to a file, it works, right? And if it doesn't work (super-rare, right?), then the runtime throws an exception. If we assume that the runtime throws an exception, we're also assuming that the runtime is notified when an error occurs during a read or write operation. This was, apparently, not the case (at least in 2005-2008; we'll see improvements below). "In one presentation, one of the authors remarked that the ext3 code had lots of comments like “I really hope a write error doesn't happen here" in places where errors weren't handled. [...] NTFS is somewhere in between. The authors found that it has many consistency checks built in, and is pretty good about propagating errors to the user. However, like ext3, it ignores write failures." Ignoring write failures! That's kind of incredible, but if you've ever relied heavily on NTFS, you know that there are bugs in it. Sometimes files are just mysteriously locked and inaccessible until the system is rebooted. Why does the problem go away on reboot? NTFS is journaled and can recover its data, but it needs to be unmounted and checked. Instead of panicking, the write error is ignored. 
[1] "At this point, we know that it's quite hard to write files in a way that ensures their robustness even when the underlying filesystem is correct, the underlying filesystem will have bugs, and that attempting to repair corruption to the filesystem may damage it further or destroy it." [Replicating the results] The papers referenced in the first article are quite old (a decade or more) but the conclusions are still fascinating. Luu discusses the need for replicating the study and laments that "replications usually give little to no academic credit. This is one of the many cases where the incentives align very poorly with producing real world impact." Happily, Luu followed up with another post, called "File-system error-handling" that reproduces some of the original results with the 2017 versions of the file systems. This is an interesting study in its own right, discussing in detail interesting nuggets like the fact that "apfs doesn’t checksum data because “[apfs] engineers contend that Apple devices basically don’t return bogus data”." (from "APFS in Detail: Data Integrity" ). The second article concludes that "Filesystem error handling seems to have improved." Basic write errors are now propagated to user-space wherever possible (i.e. if the drive is not dead). However, "[m]ost filesystems don’t have checksums for data and leave error detection and correction up to userspace software." This is probably something that most user-space software developers never think about, but it's crucially important. Does your software assume that the file system will always throw an error? Or does it "just assume[...] that filesystems and disks don’t have errors"? [Abstract it away!] The first article concludes with a citation from "Butler Lampson" : "Lampson suggests that the best known general purpose solution is to package up all of your parallelism into as small a box as possible and then have a wizard write the code in the box." 
This is generally a good approach for anything complicated: programmers should use as high-level an API as possible for a given task. Problems like security, memory-allocation, file-system access, networking, asynchronous/parallel programming...these all fall into that category. Generally, the advice is, as usual, to get your requirements, make components that satisfy those requirements and include automated tests that verify that the components will continue to satisfy the requirements. As Lampson says, don't write code that's beyond you -- get a "wizard" to write it instead. That's what most of us do when we use the runtime provided with our language. [2] The best you can usually do is to abstract away access to external systems (including the file system) so that you can improve behavior later, should it be required. The budget and reliability constraints of a project don't always allow you to program perfectly safely. What you can do is to make sure that the system can be made safer later with a reasonable amount of effort. To be clear: don't be unnecessarily sloppy, but don't tank your project guaranteeing NASA-level safety where it's not needed. So what does that mean? If you're programming on .NET, it means you should probably stay away from some constructs that you've previously considered safe and not worth wrapping, like File or Directory. Instead of using these directly, use them from an injected service. This level of abstraction is not difficult to enforce if introduced early in a project and will allow for improved testing anyway. If the filesystem is abstracted, components will no longer need their tests to actually write out files in order to work. As discussed above, this isn't to say that you jeopardize your deadline to abstract away every single file-system reference. For some applications, file-system access is so intrinsic as to be un-mockable (e.g. databases, source-control, etc.). However, your application is probably not one of those. 
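The injected-service idea is easier to see in code. The following is a minimal sketch in Python rather than C#, purely to keep it short and runnable; the names FileSystem, DiskFileSystem, InMemoryFileSystem and SettingsStore are hypothetical and do not come from .NET, Quino or any other library. It assumes that a component only needs to read and write whole text files.

```python
from typing import Protocol


class FileSystem(Protocol):
    """Hypothetical minimal file-system service (an assumption, not a real API)."""

    def read_text(self, path: str) -> str: ...

    def write_text(self, path: str, content: str) -> None: ...


class DiskFileSystem:
    """Production implementation that touches the real disk."""

    def read_text(self, path: str) -> str:
        with open(path, encoding="utf-8") as f:
            return f.read()

    def write_text(self, path: str, content: str) -> None:
        with open(path, "w", encoding="utf-8") as f:
            f.write(content)


class InMemoryFileSystem:
    """Test double: keeps "files" in a dict and can simulate write failures."""

    def __init__(self) -> None:
        self.files = {}  # maps path -> content
        self.fail_writes = False

    def read_text(self, path: str) -> str:
        if path not in self.files:
            raise FileNotFoundError(path)
        return self.files[path]

    def write_text(self, path: str, content: str) -> None:
        if self.fail_writes:
            raise OSError("simulated write failure")
        self.files[path] = content


class SettingsStore:
    """Example component that receives its file system instead of opening files itself."""

    def __init__(self, fs: FileSystem, path: str) -> None:
        self._fs = fs
        self._path = path

    def save(self, name: str) -> None:
        self._fs.write_text(self._path, f"name={name}\n")

    def load(self) -> str:
        # The file holds a single "name=<value>" line.
        return self._fs.read_text(self._path).strip().split("=", 1)[1]
```

A test can pass an InMemoryFileSystem to SettingsStore and flip fail_writes to exercise the error path without touching the disk; production code passes a DiskFileSystem, whose write behavior can be hardened later in exactly one place.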
It's likely that your application reads/writes files in a highly localizable manner that could be wrapped in a simple component. This advice is similar to the by-now common practice of not using the global DateTime.UtcNow. How can this be a problem? Well, if code uses an IClock component instead, then tests can adjust "now" to be a point in the past or future and test scheduling components more easily. It's an easy pattern to follow in new code that pays for itself the first time you need to reproduce a timing problem. [Avoiding Bugs] At the end of the second article, there's an interesting discussion of how to avoid these kinds of bugs -- or just bugs, in general. "There's a very old debate over how to prevent things like this from accidentally happening." Better "tools or processes"? Be "better programmers"? Are tools like guardrails? Does it make sense to keep driving, bashing back and forth across the road, but happy that the guardrails are keeping us on the road at all? Would you do that in a car? Well, no. But, yes, if that's the best option? What's the other option? Just stop the car and don't go anywhere anymore? Or get out and walk? That analogy has been beaten to death -- and I don't think it's very appropriate (as you can see from my discussion about abstraction above). Tools and processes are better than nothing. Proper programming practices and patterns are, as well. If you train yourself to use tried-and-true patterns, then you automatically avoid common errors. * Use a language with static type-checking [3] * Abstract away interfaces to the system * Use non-nullable references wherever possible * Use immutable data wherever possible [4] * Segregate mutable data into dumb objects The point isn't to be able to say that "there are no bugs"; it's to be able to say that "these tested bugs won't happen". The point is to use practices that avoid whole classes of problems. [What are better tools?] 
"Even better than a static analysis tool would be a language that makes it harder to accidentally forget about checking for an error." And now we come to the justification for some of the newer languages out there. Rust is such a language, which attempts to fix many of the shortcomings of C and C++ in the domain of allocating, sharing, modifying and freeing memory. For error-handling, the article "The Error Model" by Joe Duffy discusses a very interesting and promising approach taken by a Microsoft Research team with Midori, a 100%-managed version of Windows. The basic insight is to separate bugs from recoverable errors and unrecoverable errors. A bug is something the user-space application did wrong (e.g. passing a null reference to a method that expects only non-null references). A recoverable error is a validation error encountered when processing user input. An unrecoverable error is a file-read error in a base configuration file or a stack overflow or an out-of-memory error. For almost all software, file-system errors are something that should just be considered an unrecoverable error. There is no reason why most applications should attempt to continue when e.g. the main configuration cannot be loaded. Most applications don't even need to be able to recover from that. The problem occurs so rarely that you should just get a file out of backup. Lower-level applications like Git or PostgreSql have to take more care to deal with file-system errors [5], but your software most likely doesn't need to handle them. As discussed above, be aware that they can happen, abstract your code from the file-system so you can test error situations and improve handling where needed, but fail fast unless your project has a requirement to be able to recover in error conditions. Generally, no-one expects a user-space application to include robust file-recovery. 
It's expected, though, that the application detects when something is wrong and reports it, failing fast rather than just limping along and corrupting data. -------------------------------------------------------------------------------- [1] Anecdotally, it's definitely possible to get file corruption: I've had critical configuration files filled with NULL bytes after certain catastrophic operations. [2] Or why we use libraries for tough tasks (e.g. the immutable data-structures libraries for .NET, which provide a performant and correct implementation). [3] Note that I didn't say explicit types. I've used Swift and TypeScript for production code (and played with F#), all of which have inferred types. I still prefer specifying parameter and return types because it helps me localize errors immensely. For most methods, it helps me reason about the code more easily without giving away any flexibility. Still, that's a matter of taste (and, perhaps, familiarity); the point is that static type-checking is a good thing. [4] The article "You can't Rust that" by Armin Ronacher discusses how to use immutability in a language that provides pretty good support for it, Rust. He comes to the proper conclusion that you should "[c]onsider promoting new state instead of interior mutability", which is, not coincidentally, the same concept that the .NET immutable-collections library uses. [5] And if you look at the analysis in the first article, Git wasn't particularly radiant, but PostgreSql and Sqlite both did quite well.
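As a rough illustration of the split between bugs, recoverable errors and unrecoverable errors discussed under "What are better tools?", here is a small Python sketch. It is not Midori's actual model; the exception classes, the function names and the exit codes are all assumptions made up for this example. It treats invalid user input as recoverable, a failed read of the base configuration as unrecoverable, and a broken internal assumption as a bug that is asserted rather than handled.

```python
import sys


class RecoverableError(Exception):
    """Expected failure, e.g. invalid user input; report it and carry on."""


class UnrecoverableError(Exception):
    """Failure the application cannot sensibly continue from."""


def parse_port(text):
    """Validate user input; bad input is a recoverable error, not a bug."""
    valid = text.isascii() and text.isdigit() and 0 < int(text) < 65536
    if not valid:
        raise RecoverableError(f"not a valid port: {text!r}")
    return int(text)


def load_configuration(read_text, path):
    """Load the base configuration through an injected reader function.

    Any I/O failure is wrapped as unrecoverable: fail fast, don't limp along.
    """
    try:
        return read_text(path)
    except OSError as error:
        raise UnrecoverableError(f"cannot load {path}: {error}") from error


def run(read_text, path, port_text):
    """Return an exit code: 0 for success, 1 for user errors, 2 for fatal errors."""
    try:
        config = load_configuration(read_text, path)
        parse_port(port_text)
    except RecoverableError as error:
        print(f"please fix your input: {error}")
        return 1
    except UnrecoverableError as error:
        print(f"fatal: {error}", file=sys.stderr)
        return 2
    # A reader that returns something other than text is a bug in our own
    # code, so it is asserted instead of being handled as an error.
    assert isinstance(config, str), "bug: reader must return text"
    return 0
```

Because the reader is passed in, a test can hand run a function that raises OSError and check that the application stops with the fatal exit code instead of continuing with a missing configuration.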

Adventures in .NET Standard 2.0-preview1

2017-05-14T21:38:17+02:00

Published by marco on 14. May 2017 21:38:17 Updated by marco on 15. May 2017 08:36:05 ------------------------------------------------------------------------ .NET Standard 2.0 is finally publicly available as a preview release. I couldn't help myself and took a crack at converting parts of Quino to .NET Standard just to see where we stand. To keep me honest, I did all of my investigations on my MacBook Pro in MacOS. [IDEs and Tools] I installed Visual Studio for Mac, the latest JetBrains Rider EAP and .NET Standard 2.0-preview1. I already had Visual Studio Code with the C#/OmniSharp extensions installed. Everything installed easily and quickly and I was up-and-running in no time. Armed with 3 IDEs and a powerful command line, I waded into the task. [Porting Quino to .NET Standard] Quino is an almost decade-old .NET Framework solution that has seen continuous development and improvement. It's quite modern and well-modularized, but we still ran into considerable trouble when experimenting with .NET Core 1.1 almost a year ago. At the time, we dropped our attempts to work with .NET Core, but were encouraged when Microsoft shifted gears from the extremely low--surface-area API of .NET Core to the more inclusive though still considerably cleaned-up API of .NET Standard. Since it's an older solution, Quino projects use the older csproj file-format: the one where you have to whitelist the files to include. Instead of re-using these projects, I figured a good first step would be to use the dotnet command-line tool to create a new solution and projects and then copy files over. That way, I could be sure that I was really only including the code I wanted -- instead of random cruft generated into the project files by previous versions of Visual Studio. 
[The dotnet Command] The dotnet command is really very nice and I was able to quickly build up a list of core projects in a new solution using the following commands: * dotnet new sln * dotnet new classlib -n {name} * dotnet add reference {../otherproject/otherproject.csproj} * dotnet add package {nuget-package-name} * dotnet clean * dotnet build That's all I've used so far, but it was enough to investigate this brave new world without needing an IDE. Spoiler alert: I like it very much. The API is so straightforward that I don't even need to include descriptions for the commands above. (Right?) Everything really seems to be coming together: even the "documentation" is clean, easy-to-navigate and has very quick and accurate search results. [Initial Results] * Encodo.Core compiles (almost) without change. The only change required was to move project-description attributes that used to be in the AssemblyInfo.cs file to the project file instead (where they admittedly make much more sense). If you don't do this, the compiler complains about "[CS0579] Duplicate 'System.Reflection.AssemblyCompanyAttribute' attribute" and so on. * Encodo.Expressions references Windows.System.Media for Color and the Colors constants. I changed those references to System.Drawing and Color, respectively -- something I knew I would have to do. * Encodo.Connections references the .NET-Framework--only WindowsIdentity. I will have to move these references to a Encodo.Core.Windows project and move creation of the CurrentCredentials, AnonymousCredentials and UserCredentials to a factory in the IOC. * Quino.Meta references the .NET-Framework--only WeakEventManager. There are only two references and these are used to implement a CollectionChanged feature that is nearly unused. I will probably have to copy/implement the WeakEventManager for now until we can deprecate those events permanently. 
* Quino.Data depends on Quino.Meta.Standard, which references System.Windows.Media (again) as well as a few other things. The Quino.Meta.Standard potpourri will have to be split up. I discovered all of these things using just VS Code and the command-line build. It was pretty easy and straightforward. So far, porting to .NET Standard is a much more rewarding process than our previous attempt at porting to .NET Core. [The Game Plan] At this point, I had a shadow copy of a bunch of the core Quino projects with new project files as well as a handful of ad-hoc changes and commented code in the source files. While OK for investigation, this was not a viable strategy for moving forward on a port for Quino. I want to be able to work in a branch of Quino while I further investigate the viability of: * Targeting parts of Quino to .NET Standard 2.0 while keeping other parts targeting the lowest version of .NET Framework that is compatible with .NET Standard 2.0 (4.6.1). This will, eventually, be only the Winform and WPF projects, which will never be supported under .NET Standard. * Using the new project-file format for all projects, regardless of target (which IDEs can I still use? Certainly the latest versions of Visual Studio et al.) To test things out, I copied the new Encodo.Core project file back to the main Quino workspace and opened the old solution in Visual Studio for Mac and JetBrains Rider. [IDE Pros and Cons] [Visual Studio for Mac] Visual Studio for Mac says it's a production release, but it stumbled right out of the gate: it failed to compile Encodo.Core even though dotnet build had compiled it without complaint from the get-go. Visual Studio for Mac claimed that OperatingSystem was not available. However, according to "the documentation" , OperatingSystem is available for .NET Standard -- but not in .NET Core. My theory is that Visual Studio for Mac was somehow misinterpreting my project file. 
Update: After closing and re-opening the IDE, though, this problem went away and I was able to build Encodo.Core as well. Shaky, but at least it works now. [image] Unfortunately, working with this IDE remained difficult. It stumbled again on the second project that I changed to .NET Standard. Encodo.Core and Encodo.Expressions both have the same framework property in their project files -- netstandard2.0 -- but, as you can see in the screenshot to the left, both are identified as .NETStandard.Library but one has version 2.0.0-preview1-25301-01 and the other has version 1.6.1. I have no idea where the second version number is coming from -- it looks like this IDE is mashing up the .NET Framework version and the .NET Standard versions. Not quite ready for primetime. Also, the application icon is mysteriously the bog-standard MacOS-app icon instead of something more...Visual Studio-y. [JetBrains Rider EAP (April 27th)] JetBrains Rider built the assembly without complaint, just as dotnet build did on the command line. Rider didn't stumble as hard as Visual Studio for Mac, but it also had problems building projects after the framework had changed. On top of that, it wasn't always so easy to figure out what to do to get the framework downloaded and installed. Rider still has a bit of a way to go before I would make it my main IDE. I also noticed that, while Rider's project/dependencies view accurately reflects .NET Standard projects, the "project properties" dialog shows the framework version as just "2.0". The list of version numbers makes this look like I'm targeting .NET Framework 2.0. Additionally, Rider's error messages in the build console are almost always truncated. [image] The image to the right is of the IDE trying to inform me that Encodo.Logging (which was still targeting .NET Framework 4.5) cannot reference Encodo.Core (which references .NET Standard 2.0). If you copy/paste the message into an editor, you can see that's what it says. 
[1] [Visual Studio Code] I don't really know how to get Visual Studio Code to do much more than syntax-highlight my code and expose a terminal from which I can manually call dotnet build. They "write about" Roslyn integration where "[o]n startup the best matching projects are loaded automatically but you can also choose your projects manually". While I saw that the solution was loaded and recognized, I never saw any error-highlighting in VS Code. The documentation does say that it's "optimized for cross-platform .NET Core development" and my projects targeted .NET Standard so maybe that was the problem. At any rate, I didn't put much time into VS Code yet. [Next Steps] 1. Convert all Quino projects to use the new project-file format and target .NET Framework. Once that's all running with the new project-file format, it will be much easier to start targeting .NET Standard with certain parts of the framework 2. Change the target for all projects to .NET Framework 4.6.1 to ensure compatibility with .NET Standard once I start converting projects. 3. Convert projects to .NET Standard wherever possible. As stated above, Encodo.Core already works and there are only minor adjustments needed to be able to compile Encodo.Expressions and Quino.Meta. 4. Continue with conversion until I can compile Quino.Schema, Quino.Data.PostgreSql, Encodo.Parsers.Antlr and Quino.Web. With this core, we'd be able to run the WebAPI server we're building for a big customer on a Mac or a Linux box. 5. Given this proof-of-concept, a next step would be to deploy as an OWIN server to Linux on Amazon and finally see a Quino-based application running on a much leaner OS/Web-server stack than the current Windows/IIS one. I'll keep you posted. 
[2] -------------------------------------------------------------------------------- [1] Encodo.Expressions.AssemblyInfo.cs(14, 12): [CS0579] Duplicate 'System.Reflection.AssemblyCompanyAttribute' attribute Microsoft.NET.Sdk.Common.targets(77, 5): [null] Project '/Users/marco/Projects/Encodo/quino/src/libraries/Encodo.Core/Encodo.Core.csproj' targets '.NETStandard,Version=v2.0'. It cannot be referenced by a project that targets '.NETFramework,Version=v4.5'. [2] Update: I investigated a bit further and I'm having trouble using NETStandard2.0 from NETFramework462 (the Mono version on Mac). I was pretty sure that's how it's supposed to work, but NETFramework (any version) doesn't seem to want to play with NETStandard right now. Visual Studio for Mac tells me that Encodo.Core (NETStandard2.0) cannot be used from Encodo.Expressions (Net462), which doesn't seem right, but I'm not going to fight with it on this machine anymore. I'm going to try it on a fully updated Windows box next -- just to remove the Mono/Mac/NETCore/Visual Studio for Mac factors from the equation. Once I've got things running on Windows, I'll prepare a NETStandard project-only solution that I'll try on the Mac.

C# Handbook 7.0

2017-05-01T21:42:56+02:00

Published by marco on 1. May 2017 21:42:56 Updated by marco on 1. May 2017 22:01:15 ------------------------------------------------------------------------ [image]"I announced almost exactly one year ago" that I was rewriting the Encodo C# Handbook. The original was "published almost exactly nine years ago" . There were a few "more releases" as well as a few "unpublished" "chapters" . I finally finished a version that I think I can once again recommend to my employees at "Encodo" . The major changes are: * The entire book is now a "Git Repository" . All content is now in Markdown. Pull requests are welcome. * I've rewritten pretty much everything. I removed a lot of redundancies, standardized formulations and used a much more economical writing style than in previous versions. * Recommendations now include all versions of C# up to 7 * There is a clearer distinction between general and C#-specific recommendations * There are now four main sections: "Naming" , "Formatting" , "Usage" and Best Practices, which is broken into "Design" , "Safe Programming" , "Error-handling" , "Documentation" and a handful of other, smaller topics. Here's the introduction: "The focus of this document is on providing a reference for writing C#. It includes naming, structural and formatting conventions as well as best practices for writing clean, safe and maintainable code. Many of the best practices and conventions apply equally well to other languages." Check out the "whole thing" ! Or download the "PDF" that I included in the repository.

The weird world of type-compatibility in TypeScript

2017-03-04T20:20:22+01:00

Published by marco on 4. Mar 2017 20:20:22 ------------------------------------------------------------------------ I recently fixed a bug in some TypeScript code that compiled just fine -- but it looked for all the world like it shouldn't have. tl;dr: there is no TypeScript compiler bug, but my faith in the TypeScript language's type model is badly shaken. [A simple example] The following code compiles -- and well it should. interface IB { name: string; } interface IA { f(action: (p: IB) => void): IA; } class A implements IA { f = (action: (p: IB) => void): IA => { return this; } } Some notes on this example: * The shape of interface IB isn't relevant to the discussion. * The intent of interface IA is to require implementors to define a method named f that takes a single parameter of type (IB) => void and returns IA. * The implementation A above satisfies this requirement. It doesn't do anything with parameter action but that's OK. * The definition of A.f() is what a naive user of TypeScript would assume was the only way of satisfying the requirement from IA. [Oddly compatible lambdas] However, the following implementations of IA also compile. class A2 implements IA { f = (action: () => IB): IA => { return this; } } class A3 implements IA { f = (action: (p: IB) => IB): IA => { return this; } } class A4 implements IA { f = (action: () => void): IA => { return this; } } class A5 implements IA { f = (): IA => { return this; } } [Forcing incompatibility] The only one I tried that doesn't compile is shown below. class A6 implements IA { f = (action: (p: number) => void): IA => { return this; } } In this case, the TypeScript compiler rightly shows the following error: [image] Hovering over the class name A6 shows the following tooltip: Class 'A6' incorrectly implements interface 'IA'. Types of property 'f' are incompatible. Type '(action: (p: number) => void) => IA' is not assignable to type '(action: (p: IB) => void) => IA'. 
Types of parameters 'action' and 'action' are incompatible. Type '(p: IB) => void' is not assignable to type '(p: number) => void'. Types of parameters 'p' and 'p' are incompatible. Type 'number' is not assignable to type 'IB'. To summarize, the following types seem to be compatible with (IB) => void: * () => IB * (IB) => IB * () => void * No parameter at all [The nitty-gritty of TypeScript's type system] In a more strongly typed language like C#, it's clear that none of this would fly. But this is TypeScript, which defines its typing model on compatibility with the dynamic language JavaScript. It almost looks like the type of the lambda isn't part of the type signature of the method, which came as quite a surprise to me (and also to my colleague, Urs, who is much more of a TypeScript expert than I am). But maybe we don't know enough about the TypeScript type system. Let's look at the "Type compatibility" documentation for TypeScript. This section starts off with a "Note on Soundness", which contains a note that suggests that what we have above is completely valid TypeScript. "The places where TypeScript allows unsound behavior were carefully considered, and throughout this document we’ll explain where these happen and the motivating scenarios behind them." The section Comparing two functions starts off explaining some rather surprising things about the type-compatibility of functions: for a function to be type-compatible with another function, the types of its parameters must match the types of the target type's parameters, but the number of parameters doesn't have to match. So if the target type has 4 parameters and the lambda to assign has 0 parameters, that lambda is compatible. From the manual: let x = (a: number) => 0; let y = (b: number, s: string) => 0; y = x; // OK x = y; // Error For return types, the matching behavior is opposite. That is, a "bigger" type that satisfies the expected return type is just fine. 
let x = () => ({name: "Alice"}); let y = () => ({name: "Alice", location: "Seattle"}); x = y; // OK y = x; // Error because x() lacks a location property [Reëxamining the oddly compatible lambdas] Armed with this new knowledge, let's see if the previously bizarre-seeming behavior is actually valid. To recap, the TypeScript compiler says that the following signatures are compatible with (IB) => void: * f(() => IB): IA: this is compatible because the zero parameters conform by definition and any return type is OK because void is expected. * f((IB) => IB): IA: this is compatible because the single parameter conforms and any return type is OK because void is expected. * f(() => void): IA: this is compatible because the zero parameters conform by definition and any return type is OK because void is expected. * f() => IA: this one looks plain wrong at first, but the same logic applies to the whole function f((IB) => void) => IA instead of to the lambda parameter for it. The interface expects a function f with a single parameter, returning IA. By the first rule above, a function with zero parameters satisfies that requirement. * f((number) => void): IA: This does not satisfy the requirement because number is not compatible with IB. * f(number): IA: This does not satisfy the requirement because number is not compatible with (IB) => void. * f(): void: This does not satisfy the requirement because while zero parameters is OK, the type void is smaller than IA. Well, it looks like there's nothing to see here, folks. The compiler is doing exactly what it's supposed to. Move along and get on with your day. Unfortunately, that means that TypeScript is going to be considerably less helpful for ensuring program correctness than I'd previously thought. 
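The rules recapped above can be condensed into a small sketch. The names Action, ignoresArgument, returnsArgument and declaredArity are made up for this illustration; only IB comes from the example in this article.

```typescript
interface IB {
  name: string;
}

// The callback shape that IA.f() declares for its parameter.
type Action = (p: IB) => void;

// Fewer parameters than the target type: accepted.
const ignoresArgument: Action = () => {};

// Returns a value where void is expected: accepted.
const returnsArgument: Action = (p: IB) => p;

// A parameter type unrelated to IB: rejected by the compiler.
// Uncomment the next line to see the error.
// const wrongParameter: Action = (p: number) => {};

// Function.length reports the number of declared parameters. Both values
// have the static type Action, but their runtime arities differ (0 and 1).
const declaredArity: number[] = [ignoresArgument, returnsArgument].map(
  (action) => action.length
);

console.log(declaredArity);
```

Both assignments type-check even though neither lambda has exactly the declared shape, which is the leniency discussed here.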
In fact, the caveat about TypeScript "allow[ing] unsound behavior [in] carefully considered [places]" seems a bit disingenuous because, to a programmer accustomed to something like C# or Java or Swift, this kind of type-enforcement for method compatibility cannot be relied upon to enforce much of anything. [Actual vs. Formal Arguments] When I read "OOSC2" a long time ago [1], I remember how Bertrand Meyer made the distinction between the formal type of an argument (the type in the method signature) and the actual type of an argument (the runtime type). The method-type--conformance rules for TypeScript make sense for actual arguments. They ensure compatibility with JavaScript. What's not clear to me is that this same logic should be applied to formal arguments that are only available in TypeScript. If I declare a specific type signature in an interface, what are the odds that I want the wishy-washy JavaScript-friendly type rules for those situations? From an architect's point of view, it would certainly be nicer to have more strict type-checking for formal definitions. Since we don't have that, this very lenient type-compatibility renders type-checking for lambdas largely useless in interface declarations. The compiler won't be able to tell you that your implementation no longer matches the interface declaration because almost anything you write will actually match. -------------------------------------------------------------------------------- [1] I'm a nerd, I read all 1300 pages twice.

My Impressions and Notes from VoxxedDays 2017

2017-03-04T00:06:33+01:00

Published by marco on 4. Mar 2017 00:06:33 Updated by marco on 4. Mar 2017 12:07:35 ------------------------------------------------------------------------ "Encodo" presented a short talk at "Voxxed Days 2017" this year, called "The truth about code reviews" . Sebastian and I also attended the rest of the conference. The following is a list of notes and reactions to the talks. -------------------------------------------------------------------------------- " " ["Engineering You" ] " Martin Thompson -- "Video" " The keynote was about our place in the history of software engineering. Martin described us more as alchemists than engineers right now, a sentiment with which I can only agree. There is too little precision, too little reproducibility and too little focus on safety for us to qualify as engineers. He gave as an example the pride with which car companies brag about the hundreds of millions of lines of code they have running in software in their cars: a claim that should send shivers down your spine. We know how this software is written and how it is tested. "Quino" has fewer than 100,000 lines of code (about 85,000, at least 15% of which is obsolete) and we've been building that for almost 10 years. How a company whose main business is building automobiles guarantees safety and correctness of 300 million lines of code is beyond my comprehension. I would venture that they don't. Highly recommended talk. Very interesting. 
Lots of good history mixed with common-sense recommendations, like the following: * Code reviews * Iterative design * Be an engineer, not an alchemist * Automate * Test References: * "SOFTWARE ENGINEERING: Report on a conference sponsored by the NATO SCIENCE COMMITTEE Garmisch, Germany, 7th to 11th October 1968" (PDF) * "Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems" (PDF) -------------------------------------------------------------------------------- " " ["Swiss Transport in Real Time: Tribulations in the Big Data Stack" ] " "Alexandre Masselot" -- "Video" " He discussed a proof-of-concept transport-tracking application. Uses the SBB REST API for vehicle positions (using the same API as exposed for the app). Then there is the OpenData Transport API for station-board information, which provides details about delays. Everything is available as JSON with relatively straightforward data models. Uses Kafka to handle this real-time data pipeline (kind of like Chronicle, RabbitMQ or EasyMQ, but from Apache). The pipeline includes reformatting the data into the desired format (mostly eliding unwanted data), then store them in LogStash and then to ElasticSearch, which allows easy querying of the stored data. This type of data isn't fundamentally relational, so a document-based store is appropriate. The transformation also involves extrapolating the data that you're interested in from the data you obtained. For example, determining whether a train is stopped. E.g. are there x events with the same position? Is the position near a station? It was developed in Scala with Akka actors as well as the Play framework for REST. They represented all stations and trains with actors (objects). The actors are async and can run on any number of machines. After that comes Cassandra? Are they trying to use every possible technology? I'm losing track over here. Deployment on Docker. 
Also uses Zookeeper in another container for load-balancing/redundancy. OMG buzzwords. He asks: Why not a single application on a single server? Classic Java on Tomcat? It doesn't scale. It can only scale up, but not out. The actual solution feels like a lot of moving parts, but each part does a compartmentalized task, handing off to the next piece. It ends up being quite lightweight, using very little CPU overall. The simple, one-use components scale natively and relatively easily (LogStash, streaming, Docker). The app server using Akka can be scaled, but it's here that you have to invest time to use the available fallback and clustering strategies. To render the data on the map, they used React to manage the data and d3.js to render. React is fast and scalable (but as Encodo has also discovered, that's not free either). Also, the client-side CPU usage is not insignificant, even with a lot of nodes. He also discussed UX and UI with tests. How to visualize possibly overlapping and differently sized elements at different zoom levels. Used Jupyter to analyze data and produce graphs. Conclusion: offload the parts of your application that aren't your core problem to external software and services. Things like managing data streams, transforming data, etc. Focus on your models and analyzing your data. * "Visualizing massive data streams: a public transport use case" * "D3.js transitions killed my CPU! A d3.js & pixi.js comparison" * "An invitation to reproducible computational research" -------------------------------------------------------------------------------- " " ["Functional data structures in Java" ] " Oleg Šelajev -- "Video" " He discussed how to build reusable structures that don't share mutable state (non-imperative vs. functional). * Classic standard libraries define mutable data structures, like lists, arrays, etc. These are not optimal for multi-tasking and asynchronous work. Mutable data structures produce side effects. 
* Void is a "code smell" because the only reason to call it is to cause a side effect. Prefer pure methods. * A functional data structure has to be immutable. * A functional data structure has to be persistent. This is similar to the first property, but it allows for a new structure to be created that is a mutation of the prior version. Obviously, we want to optimize storage here, reusing as much of the prior version as possible (instead of copying). * This is how mutation works, since we know that the prior version will never change, so it can be freely referenced. * Return values from methods on functional data structures are referentially transparent. You can cache the value without worrying that it will ever change or disappear. * This allows an application to work lock-free instead of guarding access to all possibly mutating methods. * It is easier to reason about functional (pure) data structures. Any discussion of data-structure design/implementation will naturally involve balancing performance vs. storage. The safety is baked-in, but performance is always a concern when working with immutable data structures, most especially when changing them. Even though the average call time for a method is nearly constant (as with most mutable structures), what if you call too many expensive operations and skew the average in real-world use? Well, you can combat this by leveraging the cachability of your collections (as defined above) as a way of memoizing (a well-known performance-optimization technique which carries with it possibly higher storage costs if you can't share the memoized instances very much.) In some cases, you can reason about performance in the following way: if you get to a situation where you would have to do an expensive operation (e.g. the reverse implicit in balancing head/tail of a queue), you can only get to this situation by having done n cheap operations first. So it is proven that the average is still constant time. 
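The head/tail balancing argument above can be sketched as a two-list persistent queue. This is a minimal illustration and not the code from the talk: the names List, reverse and PersistentQueue are assumptions, and it is written in TypeScript rather than Java to match the other examples in this feed. The expensive reverse only runs after the front list has been used up by cheap operations.

```typescript
// An immutable singly linked list; null is the empty list.
type List<T> = { readonly head: T; readonly tail: List<T> } | null;

// Builds a reversed copy; the input list is never modified.
function reverse<T>(list: List<T>): List<T> {
  let result: List<T> = null;
  let node = list;
  while (node !== null) {
    result = { head: node.head, tail: result };
    node = node.tail;
  }
  return result;
}

class PersistentQueue<T> {
  private constructor(
    private readonly front: List<T>, // next items to dequeue, in order
    private readonly back: List<T> // newly enqueued items, newest first
  ) {}

  static empty<T>(): PersistentQueue<T> {
    return new PersistentQueue<T>(null, null);
  }

  // Cheap: prepends to the back list and shares the rest of the structure.
  enqueue(item: T): PersistentQueue<T> {
    return new PersistentQueue<T>(this.front, { head: item, tail: this.back });
  }

  // Returns the removed element together with the queue's new state,
  // or undefined when the queue is empty.
  dequeue(): [T, PersistentQueue<T>] | undefined {
    let front = this.front;
    let back = this.back;
    if (front === null) {
      // The expensive step: paid for by the enqueues that filled `back`.
      front = reverse(back);
      back = null;
    }
    if (front === null) {
      return undefined;
    }
    return [front.head, new PersistentQueue<T>(front.tail, back)];
  }
}

const original = PersistentQueue.empty<number>().enqueue(1).enqueue(2);
const firstResult = original.dequeue();
const secondResult = firstResult ? firstResult[1].dequeue() : undefined;
console.log(firstResult?.[0], secondResult?.[0]);
```

Note that the constant-average argument holds when each version of the queue is used once. Dequeuing repeatedly from the same old version, as a persistent structure allows, repeats the reverse each time.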
Destructive behavior (like dequeue) looks different than in a mutable data structure. In those cases, the operation returns both the removed element as well as a reference to the queue that represents the new state of the queue. Tuple<T, Queue<T>> Dequeue(); For maps, you need a concept called Zip that lets you quickly build a representation of the structure where the element viewed at a particular point in an existing structure is different. So even when a desired mutation would require alteration of a lot of the underlying structure, this operation allows reuse of a lot more of the structure than would otherwise be possible. The node can point to different parent and child nodes, referencing the new part of the structure while embedded in as much of the prior version as possible. "Object-oriented programming makes it easier to reason about moving parts. Functional programming makes it easier to minimize moving parts." References: * "What's new in purely functional data structures since Okasaki?" -------------------------------------------------------------------------------- " " ["Does diversity really matter?" ] " Sombra González and Brigitte Hulliger -- "Video" " This talk began by posing the following questions to the audience. * Do you work with women in a technical capacity? (My answer: No. The closest I've come was a programmer I trained as part of a group of 7 others for a customer.) * Can you remember having been in a meeting with two women or more? (My answer: A couple of project meetings over the last couple of years, but no-one in a technical capacity. Also some conference calls, but neither of the two female participants was in a technical capacity.) Good questions. Good topic. Mostly well-presented, although the middle dragged a bit: Sombra envisioned a (near-)future where women are the same as men in a tech world, a meritocracy. It didn't add very much. As with everywhere else, the software industry has to figure out how to deal with long maternity leaves. 
Some countries have introduced "rainbow" leaves, which allow sharing of the time between partners, so if the partner is male, the industry has to deal with male absence as well. That will probably help increase acceptance of female leave, as it removes the distinction. For small companies, these kinds of extended leaves are a big hurdle because we can't so easily absorb so much missing capacity. We haven't improved at all in the last quarter-century: there have been proportionally fewer women in technical software positions every year since 1991. The quit rate is much higher (41%) than for men (17%). This is not primarily due to family concerns, though. It's mostly due to women not feeling comfortable in an industry where they're often the only female in a meeting, on a team or in a company. Reference: * "@WeShapeTech" -------------------------------------------------------------------------------- " " ["The truth about code reviews" ] " Sebastian Greulach -- "Video" " This talk is a reduced version of the code-review talk that Sebastian has been doing for "Encodo Systems" in both English and German over the last year. The presentation includes some statistics about the value of code reviews, a discussion of which benefits you can expect to get, which types of reviewers are likely to yield which benefits as well as Encodo's approach and advice for integrating code reviews into your development process. This was the most informative and amazing presentation at the entire show. All kidding aside, the room was packed and the ratings were quite good. There seemed to be a lot of interest in process. Reference: * "English slides (long form)" * "German slides (long form)" -------------------------------------------------------------------------------- " " ["How Modern SQL Databases Come up with Algorithms that You Would Have Never Dreamed Of" ] " Lukas Eder -- "Video" " This guy was supremely entertaining. 
He is the undisputed master of the animated and reaction GIF in presentations. Informative, spirited and very funny. * SQL is a 4GL. * It's a declarative language. * He shows off with a calculation of the Mandelbrot set with PostgreSql (but that's not the presentation) * He presents an example of how to address business needs (e.g. how much money per film per day). Shows how simple joins are in SQL * Then he shows how to do it with classic Java (which sucks). Basically, he shows how good SQL is by slagging on Java * He shows something that could be on "Annotatiomania" * At this point, the Java code is so long that "they can see our code from space." * Eager-loading is a code smell. You should actually be able to get the objects that you want in the form that you want. The optimal result type is the exact shape of the data that you want, not the ORM objects. [1] * "When does that ever happen? Changing requirements. Never." * He's talking about using SQL instead of code because you don't care about algorithms or storage types or caches -- let the database developers worry about that. They're good at it. And they love it. And the questions that you're answering are higher-level. * He discusses how Java streams look much, much nicer. But I think .NET Linq is even nicer ... and he doesn't mention that at all. So he shows how the more readable API is much better in Java ... but it's now exactly how SQL works. * The Java is now more readable, but it's still lazy-loading a ton of data you don't need. You're doing stuff on the client that the database would do much better. * Java Streams are so much uglier than Linq. They are forced to use explicitly typed Tuples (because there is still no var) and the tuple elements are unnamed (p1, p2, etc.) C# 6 is still like this, but C# 7 introduced named items for anonymous tuples. * The general-purpose languages force us to think about these things when they are not our programming domain. 
* The database is also capable of caching execution plans and optimizing subsequent queries. Prepared queries are da bomb. [2] * Any algorithm that produces the correct result is acceptable. It doesn't matter how you get there. * We don't really know what the database is really doing with a declarative syntax. We probably can't even guess at the optimizations that the clever database is doing. Use "Explain Query" to see the estimated plan and the actual plan (based on the actual data and current situation on the database instance). * The cardinality is a hint that indicates whether to use a linear or logarithmic algorithm. This will also give a hint as to the order in which the database will load data (e.g. to reduce the dataset as quickly as possible before applying further joins, ordering and restrictions). * [image]Conclusion: let the database choose the algorithm based on the dataset available and the current state of the database. He shows an example with a histogram: how a query with one filter might use an index whereas a different filter might be more efficient just scanning the whole table (because 90% of the data is required anyway). The database can take disk-access speed into account. How can the developer predict which algorithm to use? The data and deployment environment aren't known at compile-time. So since you can't know and you're not the guy to decide, then you should offload that decision to the software that does know: the SQL database. What about latencies for remote data? Same thing. Let the database decide. * Unless you're the one writing the database. * The database is really good at this. It remembers how well its estimate matched the actual cost and it uses this to improve its execution plan. * Oracle can actually change the execution plan "in flight" if it sees during execution that an assumption was grievously wrong. * Also, SQL is functional: no side effects. * "Coders want to code; they want to do everything themselves." 
* Also: use production data whenever possible so you have a commensurate dataset size. * Joolambda is a product from his company. Also JOOQ. Looks like Linq, actually. But maybe it works better? Nice clean API which works with arbitrary result sets instead of fixed ORM objects. Can Quino learn something from it? I asked him after the talk where JOOQ gets its metadata and it generates it from the database schema. * Without hash joins there are a whole lot of algorithms that aren't available (MySQL). * Put business logic in the database, but be careful because how do you test it? I talked to Martin, Vlad and Lukas after the talk about testing and we agreed that databases should be immutable (Martin forbids the UPDATE statement in his projects, where he can) and then you basically have an immutable data structure in a separate process with a really powerful and efficient query languages over the graph. [3] * Locks are terribly complicated and performance is unpredictable. He says he's "lucky to only work with read-only databases. So much easier. So much fun." * Summary of chat after talk (some repeat from above): Chatted with Martin, Vlad and Lukas [4] after the talk about testing the database. Martin suggested that you don't use the update statement, only insert. Lukas responded similarly, saying that we should use SQL for read-only logical queries. Jooq has a metadata generator for analyzing your database so that you can query it. It doesn't define objects; you can only define the Tuple that you will return. That is pretty cool. Martin also pointed out that you could enforce immutability and store your data in an immutable, queryable graph by using the database. 
References: * "10 SQL Tricks That You Didn’t Think Were Possible" ("Video" ) * "Big-O Cheat-Sheet" -------------------------------------------------------------------------------- " " ["A practical introduction to Category Theory" ] " Daniela Sfregola -- "Video" " Category theory is about Monads, examples of which are Option, Try and Future (promise). The example she uses shows how to apply category-theory constructs to data-validation. The examples are in Scala, although the API that she presents looks very similar to the terminology used in Java's Streams API. E.g. flatMap(). That's Select() for C# developers. Similarly, Option is Nullable, although I can't think of the type analog for Some or None. Her validation example is well-made, going from returning an Option which is no better than a Boolean. Then she shows an Either but that doesn't allow for having both sides wrong. This can be done with Either but it's painful. That's why we invented pattern-matching (now available in C# 7). Then she introduced a Validated, which is capable of returning a list of errors. "Focus on how things compose." The talk was quite short and didn't introduce much that was new. The pattern-matching syntax in Scala is a bit wordy. -------------------------------------------------------------------------------- " " ["g º f patterns" ] " Mario Fusco -- "Video" " Since my previous talk was done early, I joined Sebastian in this one. I saw only the tail-end of it, but man are the streams() libraries still really wordy. Welcome to functional programming, Java! Still, I'm disappointed that I can't use streams() in the Android project I'm working on because it requires Java 8, which forces API level 24, which excludes a lot of devices. Sebastian said the talk was pretty good. -------------------------------------------------------------------------------- " " ["What about CSS? Progressive Enhancement and CSS" ] " Ire Aderinokun -- "Video" " * She's from Lagos, Nigeria. 
Google talk something or other. * Nicest slides I've seen all day. * Graceful degradation is the solution for only the current best browser. It doesn't necessarily scale to future versions. * Most designers test only one version older than the supported version. Encodo tests the versions required in the spec. * The goal isn't to dazzle the user, but to deliver the information to the widest possible audience. * Admittedly, some sites do have "dazzle the user" as a goal Rules: * Use sparse, semantic markup * Use plain text for the content in the markup * The basic layout should work without CSS * CSS is an enhancement * Enhanced behavior can be provided by unobtrusive JavaScript * End-user browser preferences should be respected (e.g. don't restrict zooming the UI, since a lot of users can't see so well) WTF is the squirrel browser? (It turns out it's "UC Browser" , popular in China.) Or the one with the strange globe? (Maybe Flock? Not sure.) Does Opera really have higher market-share than IE? Probably globally, right? Phone browser in India/China/etc. She showed a really cool graph of how many hours you have to work to use 500MB of data. Germany: 1h, Brazil: 56h, US: 6h. Bandwidth matters. A lot. WWW != Wealthy Western Web ammirite? * Use aria rules if you know that you might run on browsers that don't understand the new tag types (increasingly unlikely). Still, phone browsers in Africa have probably never heard of them. * There is no difference between an unsupported CSS property or a bad value or name in the style, selector, etc. * CSS doesn't have built-in fallbacks More rules: * Start with sensible HTML (same as above) * Go "Mobile-first" * Use media queries * Use flexbox (was designed as a progressive enhancement, so vertical-align is ignored when flexing is enabled.) * What about "Offline-first"? That is, making sure that your app works offline to at least some degree. Syncing data can be a pain, depending on the data, though. If you just have data to log, that's independent of other data, it's OK. * Use CSS Feature queries (detect support or NOT support) * Use progressive enhancement to deal with IE, which doesn't support feature queries * A good tip is that a property with a bad value is ignored by the browser. What about the future of the web? VR? Old devices handed down from the 1st to the 3rd world. I asked about testing that the progressive enhancements work as programmed, but no-one has any new ideas for testing. Manual testing to verify that the enhancements and fallbacks work. References: * "BitsOfCo.De" * "Understanding Progressive Enhancement" by Aaron Gustafson -------------------------------------------------------------------------------- " " ["I just hacked your app!" ] " Marcos Placona -- "Video" " He started off the talk as a bandit, reverse-engineering a Base64-encoded name/password. He used Charles to get MITM. It was a nice trick, and it probably works on a lot of devices and apps. It's very easy to make a hackable application if you don't think about security. He uses a nice word-definition slide with pronunciation and usage to make it look all official. * Pace is a bit slow at first. * Pokemon Hack was a MITM; it wasn't malicious: kids just didn't feel like walking. Important to remember that if motivation is high, a hacker will try really hard. 
* Beer hack was a loyalty hack (Kuba Gretsky) * Encrypt all the values instead of sending plain-text * But be careful of where you put your keys * This one guy Luke Chadwick uploaded his Amazon key to GitHub by accident. Farmers who watch every damned public commit got it, spun up some EC2 instances and started mining BitCoins * Use security features where possible * Use certificate pinning with the CertificatePinner() * Do NOT trust the device * Do NOT trust the app; it can be decompiled. * What about magic strings? * You can get your keys from a server * Or you can encrypt them, but what about the encryption key for the encryption key? * Get the key from the NDK. You can store information in the NDK itself, which is more secure and less decompilable than app code. * Check that the application name hasn't been changed. * Check that the package manager is supported/correct; otherwise, your app has been republished to a new server. * Or you can also check that the installer is Google or Amazon * Check your application signature; you can check whether the app was actually compiled by YOU * Check if the device is rooted (he used some exec() command) * Check for emulator (If the build fingerprint starts with "emulator") * Do not allow users to switch your App into debug mode * Use ProGuard, DexGuard. ProGuard is the lite version. DexGuard supports a lot of the checks listed above. DexGuard uses non-Latin Unicode for obfuscation. :-) Unfortunately, it's a per-user charge. That's per user of your app. * So, SafetyNet it is! That's more like it. * "The Internet is not a Safe Place" (shows a slide of a dirty van with "Hannah Montana Concert Shuttle" sprayed on the side) * Try to hack your own applications. Always. -------------------------------------------------------------------------------- [1] Just thinking out loud: Often, though, you want the object, more or less, so it's OK, no? Can we add this kind of anonymous-loading to Quino? 
It would be interesting to get arbitrary result sets. Like QQL. But it's nice to work with metadata (although we're very close to the SQL model anyway. Can we make it easier to build queries that aggregate?) [1] Quino doesn't use those either. Lots of wasted time there. Can we move toward working closer with the database API? How is it that we've avoided aggregations for so long? We just write those queries by hand. [1] Also thinking of a talk for next year: Schema Migration with Quino. I really like SQL and it's the best answer to a lot of questions. But I'm a programmer and I don't like to define my metadata in table form. I like to describe my model in an agnostic way so that I can re-use my metadata in more than just a database context. Or what about a focus on building metadata-based applications? Capturing your domain model in a way that's not so database-centric? [1] Nice clean API which works with arbitrary result sets instead of fixed ORM objects.

The evolving Swift string API and implementation

2017-02-06T00:10:55+01:00

Published by marco on 6. Feb 2017 00:10:55 ------------------------------------------------------------------------ As Microsoft did a couple of years ago, Apple's language designers are also designing the next version of Swift in public. [1] One example of the new design is the discussion of "String Processing For Swift 4" . If you read through the relatively long document, you can at least see that they're giving the API design a tremendous amount of thought. [API Considerations for Strings] There are so many factors to weigh when building the API, especially for a low-level construct like String. * As they state right at the beginning of the document, they are concerned with "Ergonomics, Correctness, Performance" (probably in that order). * How does the API affect storage? * Is it still possible to use a COW pattern in order to save memory for multiple copies of the same string? Other, similar languages like C# and Java have slowly moved to a more-eager copying mechanism to reduce complexity in the memory-manager for strings, especially when used in multi-tasking. * How allocation-efficient is the base string library? Does the API help the more well-worn code paths avoid allocation unless absolutely necessary? * What about slicing support? Does the API force copying when it would not be needed? Does it at least allow the decision to copy to be delayed until absolutely necessary? * How accessible are the various supported representations? (E.g. UTF8 vs. UTF16) * How compatible/performant is the optimally ergonomic API with the "Objective-C interoperability" ? This is a common case and must be as close to allocation-free as possible and fast (because thunking between Swift code and Objective-C/Cocoa APIs is very common). * Does the API leverage patterns from other parts of the API rather than expanding the String API with a bunch of overloads? (E.g. the "discussion of storage for sub-strings" .) * Is immutable the default, with mutability opt-in? 
(This prevents unwanted copies and dangling references in the reference-counted world of Swift ... although Strings are actually structs rather than classes.) * Does the API do the "right thing" by default? In the case of Swift's string-handling, this means that the caller of the API works with Unicode graphemes, by default. * What about case-sensitive/insensitive comparisons? Accent sensitivity? * What about ordering? Collation? Localization? * Does the API scale nicely to allow increasing specificity, with good defaults? * Is there consistency within the string API? * What about consistency with similar constructs, like Array? * How does the API fit with developer expectations? Should the String be a Collection? If so, what is the default item-type? * Why doesn't the Character have the same or a similar API as a String? (E.g. why can't you get the sub-structure of the grapheme cluster for a character without first casting it to a String?) [Slices/Substrings] A good example is the discussion of how to represent string slices: should there be a separate type, called Substring, analogous to the ArraySlice that already exists for an Array? "Long-term storage of Substring instances is discouraged. A substring holds a reference to the entire storage of a larger string, not just to the portion it presents, even after the original string's lifetime ends. "[...] "The downside of having two types is the inconvenience of sometimes having a Substring when you need a String, and vice-versa. It is likely this would be a significantly bigger problem than with Array and ArraySlice, as slicing of String is such a common operation. It is especially relevant to existing code that assumes String is the currency type -- that is, the default string type used for everyday exchange between APIs. To ease the pain of type mismatches, Substring should be a subtype of String in the same way that Int is a subtype of Optional<Int>." [To implement Collection or not?]
For those who watch the API for Swift evolve from one major version to another -- with each version introducing backward-incompatible changes -- this document should hopefully reassure them that the changes are not made lightly. It may seem like the designers don't have a plan, but, over the years, designers and opinions change. Witness, for example, the "discussion of what the default representation of the string should be" . "[...] in Swift 1.0, String was a collection of Character (extended grapheme clusters). [...] In Swift 2.0, String's Collection conformance was dropped, because we convinced ourselves that its semantics differed from those of Collection too significantly." After listing several reasons why the change in Swift 2.0 was not a good direction, they conclude that in 4.0, they should revert to the original behavior. "It would be much better to legitimize the conformance to Collection and simply document the oddity of any concatenation corner-cases, than to deny users the benefits on the grounds that a few cases are confusing." Again, the discussion is open and public and, despite the claims of some who think that they're just a bunch of cowboys changing stuff willy-nilly, they have a documented plan. It's unfortunate that it took them so long to get there, but this kind of design isn't always easy. [Consolidating Index Types] Because Swift uses Unicode grapheme clusters as the default "items" view for strings, the discussion of "string indices" might seem unnecessarily abstract for developers coming from other languages, where the index is always an int in bytes. "String currently has four views--characters, unicodeScalars, utf8, and utf16 [...]" Because of these different views, it's necessary to discuss how to reduce API surface by consolidating the various index types used to refer to individual elements in these different "views" on a String.
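To make the idea of "views" a bit more concrete for readers coming from one of those int-indexed languages, here is a minimal sketch in TypeScript (not Swift, and not taken from the Swift proposal). TypeScript strings are indexed by UTF-16 code unit, so the other counts have to be computed by hand. The names codePoints, utf8Length and sample are hypothetical helpers invented for this illustration; the mapping to Swift's utf16, unicodeScalars and utf8 views is only approximate.

```typescript
// Illustration only: the helper names below are made up for this sketch.

// Decode a string into Unicode code points by pairing up surrogates.
// (Roughly what Swift's unicodeScalars view exposes.)
function codePoints(s: string): number[] {
  const result: number[] = [];
  for (let i = 0; i < s.length; i++) {
    const hi = s.charCodeAt(i);
    const lo = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
    if (hi >= 0xd800 && hi <= 0xdbff && lo >= 0xdc00 && lo <= 0xdfff) {
      result.push((hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000);
      i++; // skip the low surrogate that was just consumed
    } else {
      result.push(hi);
    }
  }
  return result;
}

// Count the bytes that the string would occupy when encoded as UTF-8.
// (Roughly the length of Swift's utf8 view.)
function utf8Length(s: string): number {
  let bytes = 0;
  for (const cp of codePoints(s)) {
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return bytes;
}

// "e" + combining acute accent + one emoji outside the Basic Multilingual Plane
const sample = "e\u0301\uD83D\uDE00";

const views = {
  utf16: sample.length,                      // UTF-16 code units
  unicodeScalars: codePoints(sample).length, // code points
  utf8: utf8Length(sample),                  // UTF-8 bytes
};

console.log(views);
```

The same text yields a different length in each view, and none of the three matches what a reader would count as characters (two: an accented "e" and an emoji). Counting those requires grapheme segmentation, which this sketch deliberately leaves out. That mismatch is why a single int index is ambiguous and why an index type has to know which view it belongs to.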
[Doing the Right Thing] It's not like C# -- and most other mainstream languages -- have anything to brag about with their string-handling. In that respect, even Swift 1 and 2 are light-years ahead in Unicode correctness with their focus on grapheme clusters rather than the utterly nonsensical 90s-era bytes still used in those other languages. The "Guidance for API Designers" shows how they try to build the API so that it makes sense for callers. "A Substring passed where String is expected will be implicitly copied. When compared to the "same type, copied storage" model, we have effectively deferred the cost of copying from the point where a substring is created until it must be converted to String for use with an API. "A user who needs to optimize away copies altogether should use this guideline: if for performance reasons you are tempted to add a Range argument to your method as well as a String to avoid unnecessary copies, you should instead use Substring." Their goal is noble, though it's unclear to what degree the vision can be realized. The "following citation" could be written as the high-level goal of any API. "We should represent these aspects as orthogonal, composable components, abstracting pattern matchers into a protocol like this one, that can allow us to define logical operations once, without introducing overloads, and massively reducing API surface area." -------------------------------------------------------------------------------- [1] Also as Microsoft did, Apple now has an "official, public issue tracker for Swift." [2] You could have knocked me over with a feather. [1] The downside: it's Atlassian Jira.

A tuple-inference bug in the Swift 3.0.1 compiler

2017-02-04T18:17:03+01:00

Published by marco on 4. Feb 2017 18:17:03 Updated by marco on 5. Feb 2017 23:42:56 ------------------------------------------------------------------------ I encountered some curious behavior while writing a service-locator interface (_protocol_) in Swift. I've reproduced the issue in a stripped-down playground [1] and am almost certain I've found a bug in the Swift 3.0.1 compiler included in XCode 8.2.1. Update: At the suggestion of a reader, I searched and found "Apple's Jira for Swift" [2] and reported this issue as "A possible tuple-inference/parameter-resolution bug in Swift 3.0.1" [A Simple, Generic Function] We'll start off with a very basic example, shown below. [image] The example above shows a very simple function, generic in its single parameter with a required argument label a:. As expected, the compiler determines the generic type T to be Int. I'm not a big fan of argument labels for such simple functions, so I like to use the _ to free the caller from writing the label, as shown below. [image] As you can see, the result of calling the function is unchanged. [Or Maybe Not So Simple?] Let's try calling the function with some other combinations of parameters and see what happens. [image] If you're coming from another programming language, it might be quite surprising that the Swift compiler happily compiles every single one of these examples. Let's take them one at a time. * int: This works as expected * odd: This is the call that I experienced in my original code. At the time, I was utterly mystified how Swift -- a supposedly very strictly typed language -- allowed me to call a function with a single parameter with two parameters. This example's output makes it more obvious what's going on here: Swift interpreted the two parameters as a Tuple. Is that correct, though? Are the parentheses allowed to serve double-duty both as part of the function-call expression and as part of the tuple expression? 
* tuple: With two sets of parentheses, it's clear that the compiler interprets T as tuple (Int, Int). * labels: The issue with double-duty parentheses isn't limited to anonymous tuples. The compiler treats what looks like two labeled function-call parameters as a tuple with two Ints labeled a: and b:. * nestedTuple: The compiler seems to be playing fast and loose with parentheses inside of a function call. The compiler sees the same type for the parameter with one, two and three sets of parentheses. [3] I would have expected the type to be ((Int, Int)) instead. * complexTuple: As with tuple, the compiler interprets the type for this call correctly. [Narrowing Down the Issue] The issue with double-duty parentheses seems to be limited to function calls without argument labels. When I changed the function definition to require a label, the compiler choked on all of the calls, as expected. To fix the problem, I added the argument label for each call and you can see the results below. [image] * int: This works as expected * odd: With an argument label, instead of inferring the tuple type (Int, Int), the compiler correctly binds the label to the first parameter 1. The second parameter 2 is marked as an error. * tuple: With two sets of parentheses, it's clear that the compiler interprets T as tuple (Int, Int). * labels: This example behaves the same as odd, with the second parameter b: 2 flagged as an error. * nestedTuple: This example works the same as tuple, with the compiler ignoring the extra set of parentheses, as it did without an argument label. * complexTuple: As with tuple, the compiler interprets the type for this call correctly. [Swift Grammar] I claimed above that I was pretty sure that we're looking at a compiler bug here. I took a closer look at the productions for tuples and functions defined in The Swift Programming Language (Swift 3.0.1) manual available from Apple. 
First, let's look at tuples: [image] As expected, a tuple expression is created by surrounding zero or more comma-separated expressions (with optional identifiers) in parentheses. I don't see anything about folding parentheses in the grammar, so it's unclear why (((1))) produces the same type as (1). Using parentheses makes it a bit difficult to see what's going on with the types, so I'm going to translate to C# notation. * () => empty tuple [4] * (1) => Tuple<int> * ((1)) => Tuple<Tuple<int>> * ...and so on. This seems to be a separate issue from the second, but opposite, problem: instead of ignoring parentheses, the compiler allows one set of parentheses to simultaneously denote the argument clause of a single-arity function call and an argument of type Tuple encompassing all parameters. A look at the grammar of a function call shows that the parentheses are required. [image] Nowhere did I find anything in the grammar that would allow the kind of folding I observed in the compiler, as shown in the examples above. I'm honestly not sure how that would be indicated in grammar notation. [Conclusion] Given how surprising the result is, I can't imagine this is anything but a bug. Even if it can be shown that the Swift compiler is correctly interpreting these cases, it's confusing that the type-inference is different with and without labels. -------------------------------------------------------------------------------- [1] The Xcode playground is a very decent REPL for this kind of example. Here's the code I used, if you want to play around on your own. func test<T>(_ a: T) -> String { return String(describing: type(of: T.self)) } var int = test(1) var odd = test(1, 2) var tuple = test((1, 2)) var labels = test(a: 1, b: 2) var nestedTuple = test(((1, 2))) var complexTuple = test((1, (2, 3))) [1] I was amazed to find that Apple actually has a normal bug tracker for which I could create an account. Wonders never cease. 
[1] I didn't include the examples, but the type is unchanged with four, five and six sets of parentheses. The compiler treats them as semantically irrelevant, though the Swift grammar doesn't allow for this, as far as I could tell from the BNF in the official manual. [1] This is apparently legal in Swift, but I can't divine its purpose in an actual program

Programming-language Features: How much is too much?

2017-01-15T23:40:49+01:00

Published by marco on 15. Jan 2017 23:40:49 Updated by marco on 4. Oct 2023 21:24:02 ------------------------------------------------------------------------ The article "Dark Path" by Robert C. Martin was an interesting analysis of a recent "stricter" trend in programming languages, as evidenced by Swift and Kotlin. I think TypeScript is also taking some steps along this path, as well as Rust, which I have read a lot about, but haven't had much occasion to use. [Correct vs. Expressive] The point Martin makes is that all of these languages seem to be heedlessly improving correctness at the possible cost of expressiveness and maintainability. That is, as types are inferred from implementation, it can become more difficult to pinpoint where the intent of the programmer and the understanding of the compiler parted ways. As well, with increasing strictness -- e.g. non-null references, reference-ownership, explicit exceptions, explicit overrides -- there comes increasing overhead in maintaining code. Not only that, but developers must know their types -- and hence their design -- up front, which restricts evolving design as practiced in the very successful TDD approach and seems to be headed back to the stone age of waterfall design. As well, that level of strictness convinces developers -- who are similarly encouraged by the language designers -- that once their code compiles, then it runs as expected. But then they think they don't need to test, whereas the compiler really has no idea whether your code does what it should do. All it can guarantee is that no exception went unhandled -- or explicitly ignored -- (e.g. in Kotlin or Swift) or there are no race conditions or deadlocks (Rust) or that there are no null references where not explicitly programmed (Swift, Kotlin, TypeScript). These compiler-enforced language features are very useful, but are in the same class as the spell-checker in your text editor. 
Having no red, wavy lines in your document is no guarantee that the document makes any sense whatsoever. So these are interesting and useful features. They can lead to increased safety. But, they won't make your program do what it's supposed to do. At best, they help you avoid writing behavior that you most definitely don't want. These features are nice to have, but they are not worth having at any price. [Round Two: Defaming Types] It was an interesting article that I more-or-less agreed with. The follow-up article "Types and Tests" by Robert C. Martin followed close on its heels because Martin apparently wanted to respond to feedback he'd received on the first article. I thought he went a bit far in the second article. For example, he emphasized that, "No, types are not tests. Type systems are not tests. Type checking is not testing. Here’s why." That's absolutely true, but types are still related to testing. Types help me specify my interface more precisely and I can trust the compiler to enforce them. That's a lot of tests I don't have to write. Otherwise, for every API I write, I'd have to write tests to prove that only the supported types can be passed in -- and I'd also have to specify how my API behaves when a value with an incorrect type is passed in. Do I fail silently? How do I let the caller know what to expect? This seems not only sloppy but time-consuming. It sounds like busy work, having to think about this kind of stuff for every API. [Precise Requirements] Martin continues, "[...] the way f is called has nothing to do with the required behavior of the system. Rather it is a test of an arbitrary constraint imposed by the programmer. A constraint that was likely over[-]specified from the point of view of the system requirements. (Emphasis added.)" The first sentence is a useful observation. The second is hyperbole. Indicating int rather than object for a parameter called limit hardly seems like an over-specification. 
In fact, it seems like exactly what I want. If the requirement says shall allow a user to enter a value for limit... rather than shall allow a user to enter a positive number for limit..., then I would argue that 99% of the time it's the requirement that isn't precise enough. I would not assume that the requirements engineer knew just what she was doing when she left the door open for a limit given as a string. Without types, our requirements would also become bloated with over-definitions like: * The system shall allow the user to enter a value that defines the upper limit of records to return. * The system shall throw an exception of type ArgumentOutOfRangeException for values that are less than zero or greater than 1000. * The system shall marshal the value to a numeric representation where necessary. * The system shall throw an exception of type ClassCastException if the given value cannot be marshaled to a numeric value. For this specification, a developer could write: public void SetLimit(object limit) { int limitAsNumber; if (!Int32.TryParse(Convert.ToString(limit), out limitAsNumber)) { throw new ClassCastException("..."); } if (limitAsNumber < 0 || limitAsNumber > 1000) { throw new ArgumentOutOfRangeException("limit"); } _limit = limitAsNumber; } The developer could also write: public void SetLimit(UInt32 limit) { if (limit > 1000) { throw new ArgumentOutOfRangeException("limit"); } _limit = limit; } That's actually what we want the developer to write, no? If you choose JavaScript to implement this requirement, then you would need to over-specify because you need to decide how to handle values with unsupported types. If the requirements engineer is allowed to assume that the implementing language has a minimal type system, then the requirements are also easier to write, as shown below. * The system shall allow the user to enter a positive integer that defines the upper limit of records to return. 
* The system shall throw an exception of type ArgumentOutOfRangeException for values that are less than zero or greater than 1000. * The system shall marshal the value to a numeric representation where necessary. * The system shall throw an exception of type ClassCastException if the given value cannot be marshaled to a numeric value. [Expressive Types are a Good Thing] Assuming a minimal type system in the target language saves time and effort. The requirements engineer can specify more concisely and the software engineer wastes less time writing boilerplate that has nothing to do with application behavior. Martin finished up with this sentiment, "So, no, type systems do not decrease the testing load. Not even the tiniest bit. But they can prevent some errors that unit tests might not see. (e.g. Double vs. Int) (Emphasis added.)" As you can imagine, I strongly disagree with the "[n]ot even the tiniest bit" part, based on my arguments above. If you use JavaScript, then you have to test all valid input and verify its behavior. In JavaScript, literally any data is valid input and it's up to your method to declare it invalid. Only tests can provide any protection against your method being called at runtime with invalid data. You have to write a test to verify that your method throws an error when passed a double rather than an int. Most people will not write these kind of tests, which I suspect is why Martin says there's no change in testing load. [Strict Languages for Bad Programmers?] I agree that the pendulum in Swift has swung too far in a restrictive direction. The language does feel pretty overloaded. I also agree that the behavior of the system itself needs to be tested and that types don't help you there. Martin again, "On the other hand, internal self-consistency does not mean the program exhibits the correct behavior. Behavior and self-consistency are orthogonal concepts. 
Well behaved programs can be, and have been, written in languages with high ambiguity and low internal consistency. Badly behaved programs have been written in languages that are deeply self-consistent and tolerate few ambiguities. (Emphasis added.)" Agreed. I think, though, that Martin might be forgetting about all of the people writing software who aren't the kind of people who can write a well-behaved program in a wildly inconsistent language. I, for example, am so awesome [1] that I wrote my entire web-site software in PHP -- one of the worst languages in the world for internal self-consistency -- and it's been running my site for going on 18 years. Programming skill and iron discipline fill the gap left by language consistency. But for bad programmers? They write utter garbage in PHP. Maybe it's not a bad idea to create languages that channel poorly disciplined programmers into better practices. I take the point from the previous article ("Dark Path" ) that bad programmers will simply work their way around the rigor, where possible. They will mark every class as open in Swift instead of thinking about their architecture. For those of us with discipline, the language will put up roadblocks that force us to write more code rather than less. [Unfixable Errors?] As a counterexample, there is Rust, which enforces reference-ownership in a way that guarantees concurrent code with no deadlocks and no race conditions. This is a good thing. It probably gets in your way when you're trying to write other types of programs, but it's overall a good thing. I haven't had any personal experience with it, but I've heard that it's sometimes difficult to figure out why a given program won't compile. I would hope that these situations become fewer with experience, but would also be cautious because I remember programming in C++ with templates and know how much time can be lost when you don't know how to fix your program based on an error message. 
[Non-nullable types are not a step too far] I, for one, like that my compiler tells me when I have potential null-reference exceptions. I use attributes in C# to tell me exactly that and I use R# to find all places in my code where I have potential violations. Those are more tests that I don't have to write, if the compiler can "prove" that this code is never called with a null reference. [2] It lets me write more concise implementation and spares me a lot of scaffolding. [What about const/non-mutable types?] Many years ago, I had the same experience with const in C++ as Martin discusses. After some time working with const, I started making everything I possibly could const in order to eliminate a whole class of mutation errors in my code. That did have consequences, at the time. Changing one thing could -- as Martin describes for his hypothetical language TDP -- lead to knock-on changes throughout the code base. Generics can have this effect, as well, with changes leaking into all of the places they're used. I wrote a "blog series" on having pulled back from generics in a few central places in Quino. [Exceptions in method signatures] I often felt the way that Martin does about Java's throws declaration. I imagine that I'll start to feel the same about Swift's, as well. I read once about a "nice typing system in Midori" , the managed version of Windows created by Joe Duffy and team at Microsoft Research, that I felt I would like to try (no pun intended). [Toward more expressive types] Martin says that he uses both dynamically and statically typed languages. He acknowledges that certain extensions to the type system can be useful (but just that some languages have gone too far). I, too, think some innovations can be very helpful. I like immutables (types, declarations, whatever) because they let me reason better about my code. 
They let me eliminate unwanted code paths with the compiler rather than having to write more rote tests that I think even Martin will agree have nothing to do with the original specification or the behavior of my application. If I can mark something as readonly because I don't expect it to ever need to be changed, that's a little note I've left for future programmers that, should they want to modify that value, they will have to make sure to reason differently about the implementation. The value was never intended to be rewritten and there are no tests for that behavior. It's a nice way of reducing the scope of the implementation. It simultaneously restricts that scope, but that's a good thing. A program can, very quickly, do a lot of things that it should not do. I don't want to write tests for all of this stuff. I have neither the inclination nor the time -- nor the budget -- to write tests for things that I could instead eliminate entirely from the realm of possibility with a powerful type system. [Agreeing with Martin] I read up on Kotlin and saw a seminar on it last year. I, too, noticed that there seems to be an "everything but the kitchen sink" feel to it. It's the same feeling I get when I look at Scala's type system, though that one is less about restriction than about letting you do everything in 3 different ways. I've been reading through the Swift language guide and I'm getting the same feeling. It doesn't help that they have their own name and keyword for nearly every commonly known programming concept. You can use self. but the guide prefers just ., which takes some getting used to. finally? Nope. Use defer instead. [An example from TypeScript] To be honest, I'm also a bit dizzy at how quickly the TypeScript type system has gotten more and more complex. "TypeScript 2.1: keyof and Lookup Types" by Marius Schulz includes details on even more typing specifications that let you infer types from dynamic objects with flow-control analysis. 
I think this is quite an interesting approach, akin to more functional languages, like ML and F#, where return types are inferred and even parameter types are inferred. Swift has also gone a long way in this direction. Interfaces are replaced with non-inheritable types that describe the shape of data. Types can even be inferred by which fields you access within conditionals so that a single variable has a different inferred type depending on which path through the code it takes. It's all very exciting, but I wonder how much can be used correctly -- especially by the aforementioned crappy programmers. For example, this is the definition for the Object.entries() method from JavaScript. interface ObjectConstructor { // ... entries<T extends { [key: string]: any }, K extends keyof T>(o: T): [keyof T, T[K]][]; // ... } [Inferred method return-types] After having used languages that have explicit return types for methods, I'm still a bit at sea when I read TypeScript code without them. I find myself hovering over identifiers to see which type was inferred for them by the real-time compilation. I agree that the code is cleaner, but maybe something's gone missing. It's harder to tell what the hell I'm supposed to pass in as a parameter or what the hell I get back from a function when the type can be a union of 3 or 4 other vaguely and sometimes ad-hoc--defined types. For example, a lot of code just constantly redefines the hash-table interface rather than just defining a type for it ... so the caller isn't restricted to implementing a specific interface. This is nice for library code, I guess, but it makes it harder to reason about the code because you don't have good names for types. This is an interesting enough experience for seasoned programmers; I can't even imagine how average or bad programmers deal with it. I see where Martin is coming from, that he's afraid of BDUF, something he's been fighting for years by arguing that you can design as you go if you'll just test your code as you write it. 
If you see that a parameter has to be an IHashMap, that's easier to understand than { [key: string]: any } or { [key: string]: T } where T is a completely different type. There are advantages and disadvantages. "Every step down that path increases the difficulty of using and maintaining the language. Every step down that path forces users of the language to get their type models “right” up front; because changing them later is too expensive. Every step down that path forces us back into the regime of Big Design Up Front." I agree with the sentiment, but I don't know if we're there yet. Martin argues that there is a balance and maybe I need more experience with the languages he's horrified about. He does write: "I think Java and C# have done a reasonable job at hovering near the balance point. (If you ignore the horrible syntax for generics, and the ridiculous proscription against multiple inheritance.)" ...which I agree with wholeheartedly. I have learned to live without multiple inheritance, but I regularly railed against its absence for decades. I have given up because the world has moved on. I would love to see proper contravariance and covariant return types and anchored types, but I've kind of given up on seeing that kind of stuff in a mainstream language, as well. Instead, I've drifted more toward immutable, stateless, functional style -- even in C#. I'm ogling F#. I'm working with Swift now and will do much more of that this year. -------------------------------------------------------------------------------- [1] Did you expect me to soften that brag down here, in this footnote? [1] I also use R#'s solution-wide analysis.
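As a small illustration of the readonly argument made earlier in this article, here is a minimal C# sketch. The Money type and its members are hypothetical and exist only for this example. Because every field is readonly, the compiler rules out any code path that mutates an instance, so there is nothing left to test for that case.

```csharp
using System;

// Hypothetical value type, invented for illustration. Every field is readonly,
// so the compiler rejects code that tries to change an instance after construction.
public sealed class Money
{
    public readonly decimal Amount;
    public readonly string Currency;

    public Money(decimal amount, string currency)
    {
        if (currency == null) { throw new ArgumentNullException("currency"); }
        Amount = amount;
        Currency = currency;
    }

    // "Changing" a value creates a new instance; the original stays untouched.
    public Money WithAmount(decimal amount)
    {
        return new Money(amount, Currency);
    }
}

public static class ReadonlyDemo
{
    public static string Run()
    {
        var price = new Money(10m, "CHF");
        var discounted = price.WithAmount(8m);

        // price.Amount = 8m; // Does not compile: readonly fields cannot be assigned here.

        return price.Amount + " " + discounted.Amount + " " + discounted.Currency;
    }
}
```

Calling ReadonlyDemo.Run() shows that the original instance still carries its first amount after the "modified" copy has been created.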

Set up JetBrains DataGrip with local SQL Server Express

2017-01-11T08:47:45+01:00

Published by marco on 11. Jan 2017 08:47:45 ------------------------------------------------------------------------ The article "Connecting DataGrip to MS SQL Server" by Maksim Sobolevskiy covers all of the points well, with screenshots, but I just wanted to record my steps, collected into a tight list. Screenshots for most of these steps are available in the article linked above. If you don't have a license for DataGrip, you can "download" a 30-day trial or you can "download" the JetBrains Rider EAP, which bundles it. Once Rider is released, you'll need a license for it, but -- for now -- you can use it for free.
* Open SQL Server Configuration Manager to make sure that the TCP/IP protocol for SQL Server Express in the SQL Server Network Configuration is enabled. Restart the SQL Server Windows service if you made changes.
* Make sure that the SQL Server Browser Windows service is running.
* Open the Database tool window. Add a SQL Server (Microsoft) connection. Make sure the driver is downloaded. Fill out the settings as shown for QuinoTests below. Note that the port has been cleared.
[image]

A Surfeit of C#/.NET IDEs

2016-11-24T20:02:47+01:00

Published by marco on 24. Nov 2016 20:02:47 ------------------------------------------------------------------------ For many years, the C#/.NET world has been dominated by a single main IDE: Visual Studio. MonoDevelop has also been available for a while, as an alternative for users on other platforms. Lately, though, there have been a few new contenders in the .NET IDE arena. [Visual Studio for Mac] We'll get this one out of the way first: this is basically Xamarin Studio for Mac, rebranded as Visual Studio for Mac. This IDE is pretty and extremely well-integrated into MacOS, with a lot of animated editor interaction for compiler warnings and errors. Unlike Rider or Visual Studio 2017 with ReSharper, Xamarin Studio doesn't benefit from the R# tooling, so there are a few things immediately missing. Navigation is not as smooth as with ReSharper-based IDEs [1], although it's definitely on-par with what I've experienced in Xcode. Xamarin Studio is fast and pretty good and I'll definitely keep it in the mix for testing Quino on alternate platforms once we start the move to .NET Standard 2.0. [2] [Project Rider] [image]This is only an EAP, so keep that in mind when testing. I installed this IDE on my Mac and Windows. The setup process was very smooth, asking for theme/color preferences and -- most importantly -- keyboard preferences. This time, the key-mapping for "Visual Studio" turned out to be quite appropriate and good. I was able to load the Quino solution relatively quickly. The first load kicks off two processes: Nuget Restore and Process Files. On subsequent loads, the Nuget Restore no longer applies and Process Files benefits from Rider having cached everything the first time around. I couldn't find any option to add an extra NuGet source, which was odd. There is a tab in the "Nuget Packages" pane called "sources", but it just lists the NuGet configuration files but doesn't offer any way to add sources. On the plus side, the test runner worked immediately. 
But, on the minus side, it delivered results inconsistent with VS2015 and VS2017 running on the same machine. It looks and behaves like the same test runner as in ReSharper [3], but the results are different for some (a few hundred) Quino tests. It loads quickly, can deal with the Quino solution without issues and the test runner works. Everything else felt like Visual Studio with ReSharper -- at least for the stuff I use. I'll keep an eye on this IDE.

[Visual Studio 2017 RC1]

[image] I installed this with ReSharper 2016.3 EAP9 and was pleasantly surprised to see that it behaved like an actual RC. That is, instead of releasing Alpha/early-beta software as an RC -- I'm looking at you, .NET Core -- they've got a really solid release on their hands. That said, it's not quite ready for production use (obvious from the RC moniker), but I was able to work productively with it over a long weekend. So I was pretty encouraged that I'll be able to let the guys at "Encodo" use it sooner rather than later. [4] Here are the things I've noticed that are missing:
* They claim that EditorConfig is included, but it didn't work for me. I found an EditorConfig extension that I installed, but I'd already set the configuration manually by then, so I'm not sure what effect it had.
* I'd turned CodeLens off, but I still saw a rogue Git process. I haven't seen this process before or since, so I can only assume it came from 2017.
* The StyleCop by JetBrains extension is not available yet, so that's a lot of hints and warnings missing. The maintainers of this extension write that "they will release as soon as ReSharper does".
* The standard Visual Studio TestRunner is still unusable. I can't imagine working without the ReSharper TestRunner. [5]
Everything else seemed to work fine, which speaks well of both VS2017 and R#'s latest EAP.

[Conclusion]
* Xamarin Studio/Visual Studio for Mac: this won't be in the mix yet. It doesn't offer anything more than Rider on Mac.
Plus, the Quino solution isn't ready for Mac development yet. Once .NET Standard 2.0 is ready and we can target Quino to that, we'll take another look.
* Project Rider: I'll be keeping an eye on future releases of JetBrains's Project Rider, but won't use the current EAP version again (tests don't work reliably).
* Visual Studio 2017: Once this is released, I feel good about installing it and releasing to other Encodo developers much sooner than with other versions, especially since ReSharper's support is already so solid and once the StyleCop extension is available. Fingers crossed that the quality stays the same.
--------------------------------------------------------------------------------
[1] I.e. go to Base/Inheriting implementations, and so on.
[2] Although .NET Core 1.1 has been released, it seems that .NET Standard 2.0 will not be available until 2017. This isn't exactly unexpected, but we aren't going to move Quino onto a new platform target until this is released. It's possible that we could target .NET Core as well, but we burned our fingers once, so we're going to hold off until .NET Standard.
[3] E.g. I've been having sporadic test failures when the runner is unable to load the latest FakeItEasy assembly in Visual Studio on my desktop -- but never my laptop or our TeamCity build server. A bunch of test failures in Rider were due to the same reason.
[4] We usually wait until the first service pack -- at least.
[5] Which has been -- thankfully -- stable since 2016.2.

ABD: Improving the Aspect-modeling API for Quino

2016-06-05T12:52:31+02:00

Published by marco on 5. Jun 2016 12:52:31 ------------------------------------------------------------------------ [Overview] We discussed ABD in a recent article "ABD: Refactoring and refining an API" . To cite from that article, "[...] the most important part of code is to think about how you’re writing it and what you’re building. You shouldn’t write a single line without thinking of the myriad ways in which it must fit into existing code and the established patterns and practices." With that in mind, I saw another teaching opportunity this week and wrote up my experience designing an improvement to an existing API. [Requirements] Before we write any code, we should know what we're doing. [1] * We use aspects (IMetaAspects) in Quino to add domain-specific metadata (e.g. the IVisibleAspect controls element visibility) * Suppose we have such an aspect with properties A1...AN. When we set property A1 to a new value, we want to retain the values of properties A2...AN (i.e. we don't want to discard previously set values) * The current pattern is to call FindOrAddAspect(). This method does what it advertises: If an aspect with the requested type already exists, it is returned; otherwise, an instance of that type is created, added and returned. The caller gets an instance of the requested type (e.g. IVisibleAspect). * Any properties on the requested type that you want to change must have setters. * If the requested type is an interface, then we end up defining our interface as mutable. * Other than when building the metadata, every other use of these interfaces should not make changes. * We would like to be able to define the interface as read-only (no setters) and make the implementation mutable (has setters). Code that builds the metadata uses both the interface and the implementation type. Although we're dealing concretely with aspects in Quino metadata, the pattern and techniques outlined below apply equally well to other, similar domains. 
[The current API]

A good example is the IClassCacheAspect. It exposes five properties, four of which are read-only. You can modify only one property (OrderOfMagnitude) through the interface. This is already not good, as we are forced to work with the implementation type in order to change any property other than OrderOfMagnitude. The current way to address this issue would be to make all of the properties settable on the interface. Then we could use the FindOrAddAspect() method with the IClassCacheAspect. For example,

    var cacheAspect = Element.Classes.Person.FindOrAddAspect<IClassCacheAspect>(
      () => new ClassCacheAspect()
    );
    cacheAspect.OrderOfMagnitude = 7;
    cacheAspect.Capacity = 1000;

For comparison, if the caller were simply creating the aspect instead of getting a possibly-already-existing version, then it would just use an object initializer.

    var cacheAspect = Element.Classes.Person.Aspects.Add(
      new ClassCacheAspect
      {
        OrderOfMagnitude = 7,
        Capacity = 1000
      }
    );

This works nicely for creating the initial aspect. But it causes an error if an aspect of that type had already been added. Can we design a single method with all the advantages?

[The new API]

A good way to approach a new API is to ask: How would we want the method to look if we were calling it?

    Element.Classes.Person.SetCacheAspectValues(
      a =>
      {
        a.OrderOfMagnitude = 7;
        a.Capacity = 1000;
      }
    );

If we only want to change a single property, we can use a one-liner:

    Element.Classes.Person.SetCacheAspectValues(a => a.Capacity = 1000);

Nice. That's even cleaner and has fewer explicit dependencies than creating the aspect ourselves.

[Making it work for one aspect type]

Now that we know what we want the API to look like, let's see if it's possible to provide it. We request an interface from the list of aspects but want to use an implementation to set properties. The caller has to indicate how to create the instance if it doesn't already exist, but what if it does exist?
We can't just downcast it because there is no guarantee that the existing aspect is the same implementation. These are relatively lightweight objects and the requirement above is that the property values on the existing aspect are set on the returned aspect, not that the existing aspect is preserved. What if we just provided a mechanism for copying properties from an existing aspect onto the new version?

    var cacheAspect = new ClassCacheAspect();
    var existingCacheAspect =
      Element.Classes.Person.Aspects.FirstOfTypeOrDefault<IClassCacheAspect>();
    if (existingCacheAspect != null)
    {
      cacheAspect.OrderOfMagnitude = existingCacheAspect.OrderOfMagnitude;
      cacheAspect.Capacity = existingCacheAspect.Capacity;
      // Set all other properties
    }

    // Set custom values
    cacheAspect.OrderOfMagnitude = 7;
    cacheAspect.Capacity = 1000;

This code does exactly what we want and doesn't require any setters on the interface properties. Let's pack this away into the API we defined above. The extension method is:

    public static ClassCacheAspect SetCacheAspectValues(
      this IMetaClass metaClass,
      Action<ClassCacheAspect> setValues)
    {
      var result = new ClassCacheAspect();
      var existingCacheAspect =
        metaClass.Aspects.FirstOfTypeOrDefault<IClassCacheAspect>();
      if (existingCacheAspect != null)
      {
        result.OrderOfMagnitude = existingCacheAspect.OrderOfMagnitude;
        result.Capacity = existingCacheAspect.Capacity;
        // Set all other properties
      }

      setValues(result);

      return result;
    }

So that takes care of the boilerplate for the IClassCacheAspect. It hard-codes the implementation to ClassCacheAspect, but let's see how big a restriction that is once we've generalized below.

[Generalize the aspect type]

We want to see if we can do anything about generalizing SetCacheAspectValues() to work for other aspects. Let's first extract the main body of logic and generalize the aspects.
    public static TConcrete SetAspectValues<TService, TConcrete>(
      this IMetaClass metaClass,
      Action<TConcrete, TService> copyValues,
      Action<TConcrete> setValues
    )
      where TConcrete : TService, new()
      where TService : IMetaAspect
    {
      var result = new TConcrete();
      var existingAspect = metaClass.Aspects.FirstOfTypeOrDefault<TService>();
      if (existingAspect != null)
      {
        copyValues(result, existingAspect);
      }

      setValues(result);

      return result;
    }

[Remove constructor restriction]

This isn't bad, but we've required that the TConcrete parameter implement a default constructor. Instead, we could require an additional parameter for creating the new aspect.

    public static TConcrete SetAspectValues<TService, TConcrete>(
      this IMetaClass metaClass,
      Func<TConcrete> createAspect,
      Action<TConcrete, TService> copyValues,
      Action<TConcrete> setValues
    )
      where TConcrete : TService
      where TService : IMetaAspect
    {
      var result = createAspect();
      var existingAspect = metaClass.Aspects.FirstOfTypeOrDefault<TService>();
      if (existingAspect != null)
      {
        copyValues(result, existingAspect);
      }

      setValues(result);

      return result;
    }

[Just pass in the new aspect to use]

Wait, wait, wait. We not only don't need the new() generic constraint, we also don't need the createAspect lambda parameter, do we? Can't we just pass in the object instead of passing in a lambda to create the object and then calling it immediately?

    public static TConcrete SetAspectValues<TService, TConcrete>(
      this IMetaClass metaClass,
      TConcrete aspect,
      Action<TConcrete, TService> copyValues,
      Action<TConcrete> setValues
    )
      where TConcrete : TService
      where TService : IMetaAspect
    {
      var existingAspect = metaClass.Aspects.FirstOfTypeOrDefault<TService>();
      if (existingAspect != null)
      {
        copyValues(aspect, existingAspect);
      }

      setValues(aspect);

      return aspect;
    }

That's a bit more logical and intuitive, I think.
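To try the generalized method outside of Quino, here is a self-contained sketch. Everything apart from the shape of SetAspectValues() is an assumption: IMetaAspect, IMetaClass and MetaClass are minimal stand-ins for the real Quino types, OfType/FirstOrDefault replaces FirstOfTypeOrDefault(), and the sketch additionally stores the new aspect in place of the existing one so that a second call has something to copy from.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-ins for the Quino types named in the article (assumptions).
public interface IMetaAspect { }

public interface IMetaClass
{
    IList<IMetaAspect> Aspects { get; }
}

public class MetaClass : IMetaClass
{
    private readonly IList<IMetaAspect> _aspects = new List<IMetaAspect>();
    public IList<IMetaAspect> Aspects { get { return _aspects; } }
}

// The interface is read-only; only the implementation has setters.
public interface IClassCacheAspect : IMetaAspect
{
    int OrderOfMagnitude { get; }
    int Capacity { get; }
}

public class ClassCacheAspect : IClassCacheAspect
{
    public int OrderOfMagnitude { get; set; }
    public int Capacity { get; set; }
}

public static class AspectSketch
{
    public static TConcrete SetAspectValues<TService, TConcrete>(
        this IMetaClass metaClass,
        TConcrete aspect,
        Action<TConcrete, TService> copyValues,
        Action<TConcrete> setValues)
        where TConcrete : TService
        where TService : class, IMetaAspect
    {
        var existingAspect = metaClass.Aspects.OfType<TService>().FirstOrDefault();
        if (existingAspect != null)
        {
            copyValues(aspect, existingAspect);
            // Assumption of this sketch: the new aspect replaces the existing one.
            metaClass.Aspects.Remove(existingAspect);
        }

        setValues(aspect);
        metaClass.Aspects.Add(aspect);

        return aspect;
    }

    private static void CopyCacheValues(ClassCacheAspect target, IClassCacheAspect source)
    {
        target.OrderOfMagnitude = source.OrderOfMagnitude;
        target.Capacity = source.Capacity;
    }

    // Sets two values, then changes only one of them in a second call.
    public static int[] Run()
    {
        var person = new MetaClass();

        person.SetAspectValues<IClassCacheAspect, ClassCacheAspect>(
            new ClassCacheAspect(),
            CopyCacheValues,
            a => { a.OrderOfMagnitude = 7; a.Capacity = 1000; });

        var updated = person.SetAspectValues<IClassCacheAspect, ClassCacheAspect>(
            new ClassCacheAspect(),
            CopyCacheValues,
            a => a.Capacity = 2000);

        return new[] { updated.OrderOfMagnitude, updated.Capacity, person.Aspects.Count };
    }
}
```

In Run(), the second call only touches Capacity; OrderOfMagnitude survives because it was copied from the aspect that the first call stored. The type arguments are spelled out because the compiler cannot infer TService from the lambdas alone.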
[Redefine original method]

We can now redefine our original method in terms of this one:

    public static ClassCacheAspect SetCacheAspectValues(
      this IMetaClass metaClass,
      Action<ClassCacheAspect> setValues)
    {
      return metaClass.SetAspectValues<IClassCacheAspect, ClassCacheAspect>(
        new ClassCacheAspect(),
        (aspect, existingAspect) =>
        {
          aspect.OrderOfMagnitude = existingAspect.OrderOfMagnitude;
          aspect.Capacity = existingAspect.Capacity;
          // Set all other properties
        },
        setValues
      );
    }

[Generalize copying values]

Can we somehow generalize the copying behavior? We could make a wrapper that expects an interface on TConcrete that would allow us to call CopyFrom(existingAspect).

    public static TConcrete SetAspectValues<TService, TConcrete>(
      this IMetaClass metaClass,
      TConcrete aspect,
      Action<TConcrete> setValues
    )
      where TConcrete : TService, ICopyTarget
      where TService : IMetaAspect
    {
      return metaClass.SetAspectValues<TService, TConcrete>(
        aspect,
        (target, existingAspect) => target.CopyFrom(existingAspect),
        setValues
      );
    }

What does the ICopyTarget interface look like?

    public interface ICopyTarget
    {
      void CopyFrom(object other);
    }

This is going to lead to type-casting code at the start of every implementation to make sure that the other object is the right type. We can avoid that by using a generic type parameter instead.

    public interface ICopyTarget<T>
    {
      void CopyFrom(T other);
    }

That's better. How would we use it? Here's the definition for ClassCacheAspect:

    public class ClassCacheAspect : IClassCacheAspect, ICopyTarget<IClassCacheAspect>
    {
      public void CopyFrom(IClassCacheAspect otherAspect)
      {
        OrderOfMagnitude = otherAspect.OrderOfMagnitude;
        Capacity = otherAspect.Capacity;
        // Set all other properties
      }
    }

Since the final version of ICopyTarget has a generic type parameter, we need to adjust the extension method. But that's not a problem because we already have the required generic type parameter in the outer method.
    public static TConcrete SetAspectValues<TService, TConcrete>(
      this IMetaClass metaClass,
      TConcrete aspect,
      Action<TConcrete> setValues
    )
      where TConcrete : TService, ICopyTarget<TService>
      where TService : IMetaAspect
    {
      return metaClass.SetAspectValues<TService, TConcrete>(
        aspect,
        (target, existingAspect) => target.CopyFrom(existingAspect),
        setValues
      );
    }

[Final implementation]

Assuming that ClassCacheAspect implements ICopyTarget as shown above, we can rewrite the cache-specific extension method to use the new extension method for ICopyTargets.

    public static ClassCacheAspect SetCacheAspectValues(
      this IMetaClass metaClass,
      Action<ClassCacheAspect> setValues)
    {
      return metaClass.SetAspectValues<IClassCacheAspect, ClassCacheAspect>(
        new ClassCacheAspect(),
        setValues
      );
    }

This is an extension method, so any caller that wants to use its own IClassCacheAspect could just copy/paste this one line of code and use its own aspect.

[Conclusion]

This is actually pretty neat and clean:
* We have a pattern where all properties on the interface are read-only.
* We have a pattern where an aspect can indicate how its values are to be copied from another instance. This is basically boilerplate, but must be written only once per aspect -- and it can be located right in the implementation itself rather than in an extension method.
* A caller building metadata passes in a single lambda to set values. Existing values are handled automatically.
* Adding support for more aspects is straightforward and involves very little boilerplate.
--------------------------------------------------------------------------------
[1] You would think that would be axiomatic. You'd be surprised.
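To make the last point of the conclusion concrete, here is a hedged, self-contained sketch of what supporting a second aspect could look like. The IVisibleAspect name comes from the requirements list, but its properties (Visible, Caption) are invented for this example, MetaClass is a stand-in for the Quino metadata class, and, as an assumption of the sketch, the new aspect is stored in place of the existing one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for Quino types (assumptions made for this sketch).
public interface IMetaAspect { }

public interface ICopyTarget<T>
{
    void CopyFrom(T other);
}

public class MetaClass
{
    public readonly List<IMetaAspect> Aspects = new List<IMetaAspect>();
}

// Hypothetical properties; the real IVisibleAspect may look different.
public interface IVisibleAspect : IMetaAspect
{
    bool Visible { get; }
    string Caption { get; }
}

public class VisibleAspect : IVisibleAspect, ICopyTarget<IVisibleAspect>
{
    public bool Visible { get; set; }
    public string Caption { get; set; }

    // The copy logic lives in the implementation, written once per aspect.
    public void CopyFrom(IVisibleAspect other)
    {
        Visible = other.Visible;
        Caption = other.Caption;
    }
}

public static class CopyTargetSketch
{
    public static TConcrete SetAspectValues<TService, TConcrete>(
        this MetaClass metaClass,
        TConcrete aspect,
        Action<TConcrete> setValues)
        where TConcrete : TService, ICopyTarget<TService>
        where TService : class, IMetaAspect
    {
        var existingAspect = metaClass.Aspects.OfType<TService>().FirstOrDefault();
        if (existingAspect != null)
        {
            aspect.CopyFrom(existingAspect);
            // Assumption of this sketch: the new aspect replaces the existing one.
            metaClass.Aspects.Remove(existingAspect);
        }

        setValues(aspect);
        metaClass.Aspects.Add(aspect);

        return aspect;
    }

    // The only aspect-specific boilerplate outside of the implementation class.
    public static VisibleAspect SetVisibleAspectValues(
        this MetaClass metaClass,
        Action<VisibleAspect> setValues)
    {
        return metaClass.SetAspectValues<IVisibleAspect, VisibleAspect>(
            new VisibleAspect(),
            setValues);
    }

    public static string Run()
    {
        var person = new MetaClass();
        person.SetVisibleAspectValues(a => { a.Visible = true; a.Caption = "Person"; });
        var updated = person.SetVisibleAspectValues(a => a.Caption = "Employee");

        return updated.Visible + "/" + updated.Caption + "/" + person.Aspects.Count;
    }
}
```

The per-aspect cost is the CopyFrom() implementation plus the one-line SetVisibleAspectValues() wrapper; the generic method is shared by all aspects.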

ABD: Refactoring and refining an API

2016-05-21T10:58:43+02:00

Published by marco on 21. May 2016 10:58:43 Updated by marco on 21. May 2016 10:59:27 ------------------------------------------------------------------------ We've been doing more internal training at "Encodo" lately and one topic that we've started to tackle is design for architecture/APIs. Even if you're not officially a software architect -- designing and building entire systems from scratch -- every developer designs code, on some level. "[A]lways [B]e [D]esigning" There are broad guidelines about how to format and style code, about how many lines to put in a method, about how many parameters to use, and so on. We strive for Clean Code(tm). But the most important part of code is to think about how you're writing it and what you're building. You shouldn't write a single line without thinking of the myriad ways in which it must fit into existing code and the established patterns and practices. We've written about this before, in the two-part series called "Questions to consider when designing APIs" ("Part I" and "Part II" ). Those two articles comprise a long list of aspects of a design to consider. "First make a good design, then compromise to fit project constraints." Your project defines the constraints under which you can design. That is, we should still have our designer caps on, but the options available are much more strictly limited. But, frustrating as that might be, it doesn't mean you should stop thinking. A good designer figures out what would be optimal, then adjusts the solution to fit the constraints. Otherwise, you'll forget what you were compromising from -- and your design skills either erode or never get better. We've been calling this concept ABD -- Always Be Designing. [1] Let's take a closer, concrete look, using a recent issue in the schema migration for Quino. Hopefully, this example illustrates how even the tiniest detail is important. 
[2]

[A bug in the schema migrator]

We detected the problem when the schema migration generated an invalid SQL statement.

    ALTER TABLE "punchclock__timeentry" ALTER COLUMN "personid" SET DEFAULT ;

As you can see, the default value is missing. It seems that there are situations where the code that generates this SQL is unable to correctly determine that a default value could not be calculated. The code that calculates the default value is below.

    result = Builder.GetExpressionPayload(
      null,
      CommandFormatHints.DefaultValue,
      new ExpressionContext(prop),
      prop.DefaultValueGenerator
    );

To translate, there is a Builder that produces a payload. We're using that builder to get the payload (SQL, in this case) that corresponds to the DefaultValueGenerator expression for a given property, prop. This method is an extension method of the IDataCommandBuilder, reproduced below in full, with additional line-breaks for formatting:

    public static string GetExpressionPayload<TCommand>(
      this IDataCommandBuilder<TCommand> builder,
      [CanBeNull] TCommand command,
      CommandFormatHints hints,
      IExpressionContext context,
      params IExpression[] expressions)
    {
      if (builder == null) { throw new ArgumentNullException("builder"); }
      if (context == null) { throw new ArgumentNullException("context"); }
      if (expressions == null) { throw new ArgumentNullException("expressions"); }

      return builder.GetExpressionPayload(
        command,
        hints,
        context,
        expressions.Select(
          e => new ExecutableQueryItem<IExecutableExpression>(new ExecutableExpression(e))
        )
      );
    }

This method does no more than package each item in the expressions parameter in an ExecutableQueryItem and call the interface method. The problem isn't immediately obvious. It stems from the fact that each ExecutableQueryItem can be marked as Handled. The extension method ignores this feature, and always returns a result. The caller is unaware that the result may correspond to an only partially handled expression.

[Is there a quick fix?]
Our first instinct is, naturally, to try to figure out how we can fix the problem. [3] In the code above, we could keep a reference to the executable items and then check if any of them were unhandled, like so:

    // Materialize the items so that the Handled flags set by the builder
    // are still visible when we check them below.
    var executableItems = expressions.Select(
      e => new ExecutableQueryItem<IExecutableExpression>(new ExecutableExpression(e))
    ).ToList();

    var result = builder.GetExpressionPayload(command, hints, context, executableItems);

    if (executableItems.Unhandled().Any())
    {
      // Now what?
    }

    return result;

We can detect if at least one of the input expressions could not be mapped to SQL. But we don't know what to do with that information.
* Do we throw an exception? No, we can't just do that. None of the callers are expecting an exception, so that's an API change. [4]
* Do we return null? What can we return to indicate that the input expressions could not be mapped? Here we have the same problem as with throwing an exception: all callers assume that the result can be mapped.
So there's no quick fix. We have to change an API. We have to design.

[Part of the result is missing]

As with most bugs, the challenge lies not in knowing how to fix the bug, but in how to fix the underlying design problem that led to the bug. The problem is actually not in the extension method, but in the method signature of the interface method. Instead of a single result, there are actually two results for this method call:
* Can the given expressions be mapped to a string (the target representation)?
* If so, what is that text?
Instead of a Get method, this is a classic TryGet method.

[How to Introduce the Change]

If this code is already in production, then you have to figure out how to introduce the bug fix without breaking existing code. If you already have consumers of your API, you can't just change the signature and cause a compile error when they upgrade. You have to decorate the existing method with [Obsolete] and make a new interface method.
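The non-breaking route can be sketched in a few lines. This is a deliberately simplified illustration, not the Quino API: IPayloadBuilder, UpperCaseBuilder and the string-based "expressions" are all hypothetical, and an empty string stands in for an expression that cannot be mapped.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified builder interface (not the Quino IDataCommandBuilder).
public interface IPayloadBuilder
{
    // Existing callers keep compiling, but get a compiler warning.
    [Obsolete("Use TryGetPayload() instead; this method cannot report unmapped expressions.")]
    string GetPayload(IEnumerable<string> expressions);

    // The new method returns both results: success and the payload.
    bool TryGetPayload(IEnumerable<string> expressions, out string payload);
}

public class UpperCaseBuilder : IPayloadBuilder
{
    public bool TryGetPayload(IEnumerable<string> expressions, out string payload)
    {
        var parts = new List<string>();
        foreach (var expression in expressions)
        {
            if (string.IsNullOrWhiteSpace(expression))
            {
                // In this sketch, an empty expression cannot be mapped.
                payload = null;
                return false;
            }

            parts.Add(expression.ToUpperInvariant());
        }

        payload = string.Join(", ", parts);
        return true;
    }

    [Obsolete("Use TryGetPayload() instead; this method cannot report unmapped expressions.")]
    public string GetPayload(IEnumerable<string> expressions)
    {
        // The old method stays available and forwards to the new one,
        // but still has no way to tell the caller about a failure.
        string payload;
        TryGetPayload(expressions, out payload);
        return payload;
    }
}

public static class ObsoleteSketch
{
    public static string Run()
    {
        IPayloadBuilder builder = new UpperCaseBuilder();

        string payload;
        var mapped = builder.TryGetPayload(new[] { "now()" }, out payload);

        string ignored;
        var unmapped = builder.TryGetPayload(new[] { "" }, out ignored);

        return mapped + ":" + payload + ":" + unmapped;
    }
}
```

Callers of the old method are nudged toward the new one by the compiler warning, while callers of the new one can decide what to do when mapping fails, e.g. leave out the DEFAULT clause entirely.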
So we don't change the existing method and instead add the method TryGetExpressionPayload() to IDataCommandBuilder.

[What are the parameters?]

Now, let's figure out what the parameters are going to be. The method called by the extension method above has a slightly different signature. [5]

    string GetExpressionPayload(
      [CanBeNull] TCommand command,
      CommandFormatHints hints,
      [NotNull] IExpressionContext context,
      [NotNull] IEnumerable<ExecutableQueryItem<IExecutableExpression>> expressions
    );

That last parameter is a bit of a bear. What does it even mean? The signature of the extension method deals with simple IExpression objects -- I know what those are. But what are ExecutableQueryItems and IExecutableExpressions? As an author and maintainer of the data driver, I know that these objects are part of the internal representation of a query as it is processed. But as a caller of this method, I'm almost never going to have a list of these objects, am I? Let's find out.

Me: Hey, ReSharper, how many callers of that method are there in the entire Quino source?
ReSharper: Just one, Dave. [6]

So, we defined an API with a signature that's so hairy no-one calls it except through an extension method that makes the signature more palatable. And it introduces a bug. Lovely. We've now figured out that our new method should accept a sequence of IExpression objects instead of ExecutableQueryItem objects. How's the signature looking so far?

    bool TryGetExpressionPayload(
      [CanBeNull] TCommand command,
      CommandFormatHints hints,
      [NotNull] IExpressionContext context,
      [NotNull] IEnumerable<IExpression> expressions,
      out string payload
    );

[Are We Done?]

Not quite. There are two things that are still wrong with this signature, both important.

[Fix the Result Type]

One problem is that the rest of the IDataCommandBuilder deals with a generic payload type and this method only works for builders where the target representation is a string.
The Mongo driver, for example, uses MongoStorePayload and MongoRetrievePayload objects instead of strings and throws a NotSupportedException for this API. That's not very elegant, but the Mongo driver was forced into that corner by the signature. Can we do better? The API would currently require Mongo to always return false because our Mongo driver doesn't know how to map anything to a string. But it could map to one of the aforementioned object representations. If we change the out parameter type from a string to an object, then any driver, regardless of payload representation, has at least the possibility of implementing this API correctly. [Fix parameters] Another problem is that the order of parameters does not conform to the code style for Encodo. * We prefer to place all non-nullable parameters first. Otherwise, a call that passes null as the first parameter looks strange. The command can be null, so it should move after the two non-nullable parameters. If we move it all the way to the end, we can even make it optional. * Also, primitives should come after the references. (So hints should be third.) * Also, semantically, the call is getting the payload for the expressions not the context. The first parameter should be the target of the method; the rest of the parameters provide context for that input. * The original method accepted params IExpression[]. Using params allows a caller to provide zero or more expressions, but it's only allowed on the terminal parameter. Instead, we'll accept an IEnumerable, which is more standard for the Quino library anyway. The final method signature is below. 
    bool TryGetExpressionPayload(
      [NotNull] IEnumerable<IExpression> expressions,
      [NotNull] IExpressionContext context,
      CommandFormatHints hints,
      out object payload,
      [CanBeNull] TCommand command = default(TCommand)
    );

[Our API in Action]

The schema migration called the original API like this:

    result = Builder.GetExpressionPayload(
      null,
      CommandFormatHints.DefaultValue,
      new ExpressionContext(prop),
      prop.DefaultValueGenerator
    );

    return true;

The call with the new API -- and with the bug fixed -- is shown below. The only non-functional addition is that we have to call ToSequence() on the first parameter. Happily, though, we've fixed the bug and only include a default value in the field definition if one can actually be calculated.

    object payload;
    if (Builder.TryGetExpressionPayload(
      prop.DefaultValueGenerator.ToSequence(),
      new ExpressionContext(prop),
      CommandFormatHints.DefaultValue,
      out payload)
    )
    {
      result = payload as string ?? payload.ToString();

      return true;
    }

[One More Design Decision...]

A good rule of thumb is that if you find yourself explaining something in detail, it might still be too complicated. In that light, the call to ToSequence() is a little distracting. [7] It would be nice to be able to map a single expression without having to pack it into a sequence. So we have one more design decision to make: where do we add that method call? Directly to the interface, right? But the method for a single expression can easily be expressed in terms of the method we already have (as we saw above). It would be a shame if every implementor of the interface was forced to produce this boilerplate.
Since we're using C#, we can instead extend the interface with a static method, as shown below (again, with more line breaks for this article):

    public static bool TryGetExpressionPayload<TCommand>(
      [NotNull] this IDataCommandBuilder<TCommand> builder, // Extend the builder
      [NotNull] IExpression expression,
      [NotNull] IExpressionContext context,
      CommandFormatHints hints,
      out object payload,
      [CanBeNull] TCommand command = default(TCommand)
    )
    {
      return builder.TryGetExpressionPayload(
        expression.ToSequence(),
        context,
        hints,
        out payload,
        command
      );
    }

We not only avoided cluttering the interface with another method, but now a caller with a single expression doesn't have to create a sequence for it [8], as shown in the final version of the call below.

    object payload;
    if (Builder.TryGetExpressionPayload(
      prop.DefaultValueGenerator,
      new ExpressionContext(prop),
      CommandFormatHints.DefaultValue,
      out payload)
    )
    {
      result = payload as string ?? payload.ToString();

      return true;
    }

[Conclusion]

We saw in this post how we always have our designer/architect cap on, even when only fixing bugs. We took a look at a quick-fix and then backed out and realized that we were designing a new solution. Then we covered, in nigh-excruciating detail, our thought process as we came up with a new solution. Many thanks to Dani for the original design and Sebastian for the review!
--------------------------------------------------------------------------------
[1] This is a bit of a riff on ABC -- Always Be Closing -- as popularized by Alec Baldwin in the movie "Glengarry Glen Ross".
[2] Also, understand that it took much longer to write this blog post and itemize each individual step of how we thought about the issue. In reality, we took only a couple of minutes to work through this chain of reasoning and come up with the solution we wanted. It was only after we'd finished designing that I realized that this was a good example of ABD.
[1] Actually, our first instinct is to make sure that there is a failing test for this bug. But, this article deals with how to analyze problems and design fixes, not how to make sure that the code you write is tested. That's super-important, too, though, just so you know. Essential, even. [1] Even though C# doesn't include the exceptions thrown in the signature of a method, as Java does. Where the Java version is fraught with issues, see the "Recoverable Errors: Type-Directed Exceptions" chapter of "Midori: The Error Model" by Joe Duffy for a really nice proposal/implementation of a language feature that includes expected exceptions in the signature of a method. [1] Which is why we defined the extension method in the first place. [1] I'm fully aware that my name isn't Dave. It's just what ReSharper calls me. "Old-school reference." [1] This was pointed out, by the way, by a reviewer of this blog post and escaped the notice of both designers and the code-reviewer. API design is neither easy nor is it done on the first try. It's only finished after multiple developers have tried it out. Then, you'll probably be able to live with it. [1] Most developers would have used new [] { expression }, which I think is kind of ugly.
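The article never shows what ToSequence() looks like, so the following is only a minimal sketch of how such a helper could be written, assuming it does nothing more than wrap a single item in a one-element sequence. The class names SequenceExtensions and Demo, and the CountAll method, are hypothetical stand-ins for illustration and are not taken from Quino.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceExtensions
{
    // Hypothetical helper (an assumption, not Quino's actual code):
    // wraps a single item in a lazily evaluated one-element sequence.
    public static IEnumerable<T> ToSequence<T>(this T item)
    {
        yield return item;
    }
}

public static class Demo
{
    // Stand-in for an API that only accepts sequences.
    public static int CountAll(IEnumerable<string> items) => items.Count();

    public static void Main()
    {
        // Both calls pass exactly one element; the first reads more fluently.
        var viaHelper = CountAll("expression".ToSequence());
        var viaArray = CountAll(new[] { "expression" });

        Console.WriteLine(viaHelper == viaArray);
    }
}
```

One design caveat with a helper like this: because it extends every type T, it shows up in code-completion for all values, which is a trade-off some teams might not want.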

v2.2: Winform fixes and Query Improvements

2016-05-12T22:22:36+02:00

Published by marco on 12. May 2016 22:22:36 ------------------------------------------------------------------------ The summary below describes major new features, items of note and breaking changes. The "full list of issues" is also available for those with access to the Encodo issue tracker. [Highlights] * Lots of bug fixes and improvements for the Winform UI and German translations with the release of Punchclock on this version. ("QNO-5162" , "QNO-5159" , "QNO-5158" , "QNO-5157" , "QNO-5156" , "QNO-5140" , "QNO-5155" , "QNO-5145" , "QNO-5111" , "QNO-5107" , "QNO-5106" , "QNO-5104" , "QNO-5015" ) * DateTimeExtensions.GetDayOfWeek() had a leap-day bug ("QNO-5051" ) * Fixed how the hash code for GenericObjects is calculated, which fixes sorting issues in grids, specifically for non-persisted or transient objects ("QNO-5137" ) * Improvements to the IAccessControl API for getting groups and users and testing membership ("QNO-5133" ) * Add support for query aliases (e.g. for joining the same table multiple times) ("QNO-531" ) This changes the API surface only minimally. Applications can pass an alias when calling the Join method, as shown below, query.Join(Metadata.Project.Deputy, alias: "deputy") You can find more examples of aliased queries in the TestAliasedQuery(), TestJoinAliasedTables(), TestJoinChildTwice() defined in the QueryTests testing fixture. * Add a standalone IQueryAnalyzer for optimizations and in-memory mini-drivers ("QNO-4830" ) [Breaking changes] * ISchemaManager has been removed. Instead, you should retrieve the interface you were looking for from the IOC. The possible interfaces you might need are IImportHandler, IMappingBuilder, IPlanBuilder or ISchemaCommandFactory. * ISchemaManagerSettings.GetAuthorized() has been moved to ISchemaManagerAuthorizer. 
* The hash-code fix for GenericObjects may have an effect on the way your application sorts objects. * The IParticipantManager (base interface of IAccessControl) no longer has a single method called GetGroups(IParticipant). This method was previously used to get the groups to which a user belongs and the child groups of a given group. This confusing double duty for the API led to an incorrect implementation for both usages. Instead, there are now two methods: * IEnumerable GetGroups(IUser user): Gets the groups for the given user * IEnumerable GetChildGroups(IGroup group): Gets the child groups for the given group The old method has been removed from the interface because (A) it never worked correctly anyway and (B) it conflicts with the new API.
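To illustrate why splitting the double-duty method helps, here is a small, self-contained sketch. Everything in it is an assumption made for illustration: the interfaces are simplified stand-ins that only borrow the names from the release notes, the IEnumerable<IGroup> return types are a guess, and InMemoryParticipantManager is hypothetical. It is not Quino's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins; the real Quino interfaces are not shown in the notes.
public interface IParticipant { string Name { get; } }
public interface IUser : IParticipant { }
public interface IGroup : IParticipant { }

public interface IParticipantManager
{
    IEnumerable<IGroup> GetGroups(IUser user);          // membership of a user
    IEnumerable<IGroup> GetChildGroups(IGroup group);   // children of a group
}

public sealed class User : IUser { public string Name { get; set; } }

public sealed class Group : IGroup
{
    public string Name { get; set; }
    public List<IUser> Members { get; } = new List<IUser>();
    public List<IGroup> Children { get; } = new List<IGroup>();
}

// Hypothetical in-memory implementation, for demonstration only.
public sealed class InMemoryParticipantManager : IParticipantManager
{
    private readonly List<Group> _groups;

    public InMemoryParticipantManager(IEnumerable<Group> groups)
    {
        _groups = groups.ToList();
    }

    public IEnumerable<IGroup> GetGroups(IUser user) =>
        _groups.Where(g => g.Members.Contains(user));

    public IEnumerable<IGroup> GetChildGroups(IGroup group)
    {
        var known = group as Group;
        return known == null ? Enumerable.Empty<IGroup>() : known.Children;
    }
}

public static class Demo
{
    public static string Run()
    {
        var alice = new User { Name = "alice" };
        var frontend = new Group { Name = "frontend" };
        var devs = new Group { Name = "devs" };
        devs.Members.Add(alice);
        devs.Children.Add(frontend);

        var manager = new InMemoryParticipantManager(new[] { devs, frontend });

        var memberOf = manager.GetGroups(alice).Select(g => g.Name);
        var children = manager.GetChildGroups(devs).Select(g => g.Name);
        return string.Join(",", memberOf) + "|" + string.Join(",", children);
    }

    public static void Main() => Console.WriteLine(Run()); // prints "devs|frontend"
}
```

With two methods, each parameter type states which question is being asked, so an implementation no longer has to branch on whether it was handed a user or a group.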

Quino Retrospective and Roadmap

2016-05-12T22:16:43+02:00

Published by marco on 12. May 2016 22:16:43 Updated by marco on 12. May 2016 22:30:34 ------------------------------------------------------------------------ [History] Before taking a look at the roadmap, let's quickly recap how far we've come. An overview of the release schedule shows a steady accretion of features over the years, as driven by customer or project needs. [image] The list below includes more detail on the releases highlighted in the graphic. [1] * "0.1" : Proof of concept with metadata, PostgreSql (data and schema-migration) and Winforms UI * "1.0" : First customer product with PostgreSql, DevExpress Winforms UI and Reporting * "1.0.5" : MS-SQL driver (parity with PostgreSql driver) * "1.5.0" : Remoting data driver; require .NET 4.0 * "1.6.0" : Mongo/NoSQL data driver * "1.8.0" : Rewrite data driver to use sessions * "1.8.5" : Support "improved metadata-generation pattern" * "1.9.0" : Add plugin/overlay support * "1.10.0" : Require .NET 4.5; add JSON-based remoting protocol; Windows-service support * "1.13.0" : Rewrite security API * "v2.0-beta1" : Rewrite configuration, logging and schema-migration APIs * "v2.0-beta2" : Add V2 generated-code format * "2.0" : Finish configuration/IOC rewrite; produce NuGet packages for delivery * "2.2" : Stabilize Winform; support aliased tables in queries * "3.0" : Rewrite MetaBuilder API; improve support for plugins We took 1.5 years to get to v1. The initial major version was to signify the first time that Quino-based code went into external production. [2] After that, it took 6.5 years to get to v2. Although we added several large products that use Quino, we were always able to extend rather than significantly change anything in the core. The second major version was to signify sweeping changes made to address technical debt, to modernize certain components and to prepare for changes coming to the .NET platform. It took just 5 months to get to v3 for two reasons: 1. 
Although we were able to make a lot of planned changes in v2 [3], we had to leave some breaking changes for future versions. [4] 2. We now strictly adhere to the rule that a breaking change anywhere in the software's API -- and Quino's API surface is large -- leads automatically to a major-version change. [5] [Roadmap] So that's where we've been. Where are we headed? As you can see above, Quino is a very mature product that satisfies the needs of a wide array of software on all tiers. What more is there to add? Quino's design has always been driven by a combination of customer requirements and what we anticipated would be customer requirements. We're currently working on the following features. Modeling improvements This work builds on the API changes made to the MetaBuilder in v3. We're creating a more fluent, modern and extensible API for building metadata. We hope to be able to add these changes incrementally without introducing any breaking changes. [6] WPF / VSG A natural use of the rich metadata in Quino is to generate user interfaces for business entities without having to hand-tool each form. From the POC onward, Quino has included support for generating UIs for .NET Winforms. Winforms has been replaced on the Windows desktop with WPF and UWP. We've gotten quite far with being able to generate WPF applications from Quino metadata. The screenshots below come from a pre-alpha version of the Sandbox application included in the Quino solution. [image][image][image][image] You may have noticed the lovely style of the new UI. [7] We're using a VSG designed for us by "Ergosign" , for whom we've done some implementation work in the past. .NET Core If you've been following Microsoft's announcements, things are moving quickly in the .NET world. There are whole new platforms available, if you target your software to run on them. We're investigating the next target platforms for Quino. 
Currently that means getting the core of Quino -- Quino.Meta and its dependencies -- to compile under .NET Core. [image] As you can see in the screenshot, we've gotten one of the toughest assemblies to compile -- Encodo.Core. After that, we'll try for running some tests under Linux or OS X. The long-term goal is to be able to run Quino-based application and web servers on non-Windows -- and, most importantly, non-IIS -- platforms. [8] These changes will almost certainly cause builds using previous versions to break. Look for any additional platform support in an upcoming major-version release. -------------------------------------------------------------------------------- [1] There were, of course, more minor and patch releases throughout, but those didn't introduce any major new functionality. [2] Punchclock, our time-entry and invoicing software -- and Quino "dogfood" product -- had been in use internally at Encodo earlier than that. (Dogfooding is when a developer uses their own code for their own daily needs. Being a user as well as a developer creates the user empathy that is the hallmark of good software.) [3] E.g. splitting the monolithic Encodo and Quino assemblies into dozens of new, smaller and much more focused assemblies. Reorganizing configuration around the IOC and rewriting application startup for more than just desktop applications was another sweeping change. [4] One of those breaking changes was to the MetaBuilder, which started off as a helper class for assembling application metadata, but became a monolithic and unavoidable dependency, even in v2. In v3, we made the breaking changes to remove this component from its central role and will continue to replace its functionality with components that are more targeted, flexible and customizable. [5] In the years between v1 and v2, we used the minor-version number to indicate when breaking changes could be made. We also didn't try as hard to avoid breaking changes by gracefully deprecating code. 
The new approach tries very hard to avoid breaking changes but accepts the consequences when it's deemed necessary by the team. [6] That is, when users upgrade to a version with the newer APIs, they will get obsolete warnings but their existing code will continue to build and run, as before the upgrade. In this way, customers can smoothly upgrade without breaking any builds. [7] You may also have noticed that the "Sandbox Dialog View" includes a little tag in it for the "XAML Spy", a tool that we use for WPF development. Just so you know the screenshots aren't faked... :-) [8] As with the WPF interface, we're likely to dogfood all of these technologies with Punchclock, our time-tracking and invoicing system written with Quino. The application server and web components that run on Windows could be migrated to run on one of our many Linux machines instead.

Quino v3.0: Metadata builders and code-generation improvements

2016-05-12T22:11:29+02:00

Published by marco on 12. May 2016 22:11:29 ------------------------------------------------------------------------ The summary below describes major new features, items of note and breaking changes. The "full list of issues" is also available for those with access to the Encodo issue tracker. [Highlights] * Metadata builders are more lightweight now and have fewer restrictions on their API. ("QNO-4994" ) * Improve support for plugins ("QNO-5189" , "QNO-5190" , "QNO-5196" ) * Improve and clean up expression library ("QNO-5174" , "QNO-5176" , "QNO-5182" , "QNO-5183" , "QNO-5184" , "QNO-5188" , "QNO-5198" , "QNO-5205" ) * Improve schema-migration tools ("QNO-5179" , "QNO-5206" , "QNO-5212" , "QNO-5213" ) [Breaking changes] * IDataSession and IApplication now directly implement the IServiceRequestHandler and helper methods that used to extend IApplication now extend this interface instead, so calls like GetModel() can now be executed against an IApplication or an IDataSession. Many methods have been moved out of the IServiceRequestHandler interface to extension methods declared in the Encodo.IOC namespace. This move will require applications to update the usings. ReSharper will automatically find the correct namespace and apply it for you. * Similarly, the extension method ApplicationExtensions.GetInstance() has been replaced with a direct implementation of the IServiceRequestHandler by IApplication. * MetaBuilder.Include() has been replaced with Dependencies.Include() * When you call the new version of CreateModel(), you can no longer call CreateMainModule() because the main module is set up automatically. Although the call is marked as obsolete, it can only be combined with the older overload of the CreateModel(). Using it with the newer overload will cause a runtime error as the main module is added to the model twice. * The various methods to create paths with the MetaBuilder have been replaced by AddPath(). 
To rewrite a path, use the following style: Builder.AddPath( Elements.Classes.A.FromOne("Id"), Elements.Classes.B.ToMany("FileId"), path => path.SetMetaId(new Guid("...")).SetDeleteRule(MetaPathRule.Cascade), idx => idx.SetMetaId(new Guid("...")) );

C# Handbook Rewrite

2016-04-26T21:40:40+02:00

Published by marco on 26. Apr 2016 21:40:40 Updated by marco on 27. Apr 2016 07:13:40 ------------------------------------------------------------------------ Encodo published its first C# Handbook to "its web site" in 2008. At the time, we also published to several other standard places and got some good, positive feedback. Over the next year, I made some more changes and published new versions. The latest version is 1.5.2 and is available from "Encodo's web site" . Since then, I've made a few extra notes and corrected a few errors, but never published an official version again. This is not because Encodo hasn't improved or modernized its coding guidelines, but because of several issues, listed below. * At 72 pages, it's really quite long * A more compact, look-up reference would be nice * It contains a mix of C#-specific, Encodo-specific and library-specific advice * It's maintained in Microsoft Word * Code samples are manually formatted * New versions are simply new copies in versioned folders (no source control) * Collaboration is nearly impossible * There is nothing about any .NET version newer than 3.5 * There is no mention of any other programming language (e.g. TypeScript, JavaScript) * A lot of stuff is overly complicated (e.g. var advice) or just plain wrong (e.g. var advice) To address these issues and to accommodate the new requirements, here's what we're going to do: * Convert the entire document from Word to Markdown and put it in a Git repository * Collaboration? Pull requests. Branches. * Versioning? Standard diffing of commits. * Code samples? Automatic highlighting from GitLab (Encodo's internal server) or GitHub (external repository). * Separate the chapters into individual files and keep them shorter and more focused on a single topic * Separate all of the advice and rules into the following piles: * General programming advice and best practices * C#-specific * Encodo-specific * Library-specific (e.g. 
Quino) These are the requirements and goals for a new version of the C# handbook. The immediate next steps are: 1. Convert current version from Microsoft Word to Markdown (done) 2. Add everything to a Git repository (done) 3. Overhaul the manual to remove incorrect and outdated material; address issues above (in progress) 4. Mirror externally ("GitHub" or "GitLab" or both) I hope to have an initial, modern version ready within the next month or so.

API Design: The Road Not Taken

2016-04-07T22:27:10+02:00

Published by marco on 7. Apr 2016 22:27:10 ------------------------------------------------------------------------ "Unwritten code requires no maintenance and introduces no cognitive load." As I was working on another part of Quino the other day, I noticed that the oft-discussed registration and configuration methods [1] were a bit clunkier than I'd have liked. To wit, the methods that I tended to use together for configuration had different return types and didn't allow me to freely mix calls fluently. [The difference between Register and Use] The return type for Register methods is IServiceRegistrationHandler and the return type for Use methods is IApplication (a descendant). The Register* methods come from the IOC interfaces, while the application builds on top of this infrastructure with higher-level Use* configuration methods. This forces developers to write code in the following way to create and configure an application. public IApplication CreateApplication() { var result = new Application() .UseStandard() .UseOtherComponent(); result .RegisterSingle() .Register(); return result; } That doesn't look too bad, though, does it? It doesn't seem like it would cramp anyone's style too much, right? Aren't we being a bit nitpicky here? That's exactly why Quino 2.0 was released with this API. However, here we are, months later, and I've written a lot more configuration code and it's really starting to chafe that I have to declare a local variable and sort my method invocations. So I think it's worth addressing. Anything that disturbs me as the writer of the framework -- that gets in my way or makes me write more code than I'd like -- is going to disturb the users of the framework as well. Whether they're aware of it or not. [Developers are the Users of a Framework] In the best of worlds, users will complain about your crappy API and make you change it. 
In the world we're in, though, they will cheerfully and unquestioningly copy/paste the hell out of whatever examples of usage they find and cement your crappy API into their products forever. Do not underestimate how quickly calls to your inconvenient API will proliferate. In my experience, programmers really tend to just add a workaround for whatever annoys them instead of asking you to fix the problem at its root. This is a shame. I'd rather they just complained vociferously that the API is crap rather than using it and making me support it side-by-side with a better version for what usually feels like an eternity. Maybe it's because I very often have control over framework code that I will just not deal with bad patterns or repetitive code. Also, I've become very accustomed to having a wall of tests at my beck and call when I bound off on another initially risky but in-the-end rewarding refactoring. If you're not used to this level of control, then you just deal with awkward APIs or you build a workaround as a band-aid for the symptom rather than going after the root cause. [Better Sooner than Later] So while the code above doesn't trigger warning bells for most, once I'd written it a dozen times, my fingers were already itching to add [Obsolete] on something. I am well-aware that this is not a simple or cost-free endeavor. However, I happen to know that there aren't that many users of this API yet, so the damage can be controlled. If I wait, then replacing this API with something better later will take a bunch of versions, obsolete warnings, documentation and re-training until the old API is finally eradicated. It's much better to use your own APIs -- if you can -- before releasing them into the wild. Another more subtle reason why the API above poses a problem is that it's more difficult to discover, to learn. The difference in return types will feel arbitrary to product developers. Code-completion is less helpful than it could be. 
It would be much nicer if we could offer an API that helped users discover it at their own pace instead of making them step back and learn new concepts. Ideally, developers of Quino-based applications shouldn't have to know the subtle difference between the IOC and the application. [A Better Way] Something like the example below would be nice. return new Application() .UseStandard() .RegisterSingle() .UseOtherComponent() .Register(); Right? Not a gigantic change, but if you can imagine how a user would write that code, it's probably a lot easier and more fluid than writing the first example. In the second example, they would just keep asking code-completion for the next configuration method and it would just be there. [Attempt #1: Use a Self-referencing Generic Parameter] In order to do this, I'd already created an issue in our tracker to parameterize the IServiceRegistrationHandler type in order to be able to pass back the proper return type from registration methods. I'll show below what I mean, but I took a crack at it recently because I'd just watched the very interesting video "Fun with Generics" by Benjamin Hodgson , which starts off with a technique identical to the one I'd planned to use -- and that I'd already used successfully for the IQueryCondition interface. [2] Let's redefine the IServiceRegistrationHandler interface as shown below, public interface IServiceRegistrationHandler<TSelf> { TSelf Register<TService, TImplementation>() where TService : class where TImplementation : class, TService; // ... } Can you see how we pass the type we'd like to return as a generic type parameter? Then the descendants would be defined as, public interface IApplication : IServiceRegistrationHandler<IApplication> { } In the video, Hodgson notes that the technique has a name in formal notation, "F-bounded quantification" but that a snappier name comes from the C++ world, "curiously recurring template pattern". 
I've often called it a self-referencing generic parameter, which seems to be a popular search term as well. 
This is only the first step, though. The remaining work is to update all usages of the formerly non-parameterized interface IServiceRegistrationHandler. This means that a lot of extension methods like the one below public static IServiceRegistrationHandler RegisterCoreServices( [NotNull] this IServiceRegistrationHandler handler) { } will now look like this: public static TSelf RegisterCoreServices<TSelf>( [NotNull] this IServiceRegistrationHandler<TSelf> handler) where TSelf : IServiceRegistrationHandler<TSelf> { } This makes defining such methods more complex (again). [3] In my attempt at implementing this, Visual Studio indicated 170 errors remaining after I'd already updated a couple of extension methods. [Attempt #2: Simple Extension Methods] Instead of continuing down this path, we might just want to follow the pattern we established in a few other places, by defining both a Register method, which uses the IServiceRegistrationHandler, and a Use method, which uses the IApplication. Here's an example of the corresponding "Use" method: public static IApplication UseCoreServices( [NotNull] this IApplication application) { if (application == null) { throw new ArgumentNullException("application"); } application .RegisterCoreServices() .RegisterSingle(application.GetServices()) .RegisterSingle(application); return application; } Though the technique involves a bit more boilerplate, it's easy to write and understand (and reason about) these methods. As mentioned in the initial sentence of this article, the cognitive load is lower than the technique with generic parameters. The only place where it would be nice to have an IApplication return type is from the Register* methods defined on the IServiceRegistrationHandler itself. We already decided that self-referential generic constraints would be too messy. Instead, we could define some extension methods that return the correct type. 
We can't name the method the same as the one that already exists on the interface [4], though, so let's prepend the word Use, as shown below: public static IApplication UseRegister<TService, TImplementation>( [NotNull] this IApplication application) where TService : class where TImplementation : class, TService { if (application == null) { throw new ArgumentNullException("application"); } application.Register<TService, TImplementation>(); return application; } That's actually pretty consistent with the other configuration methods. Let's take it for a spin and see how it feels. Now that we have an alternative way of registering types fluently without "downgrading" the result type from IApplication to IServiceRegistrationHandler, we can rewrite the example from above as: return new Application() .UseStandard() .UseRegisterSingle() .UseOtherComponent() .UseRegister(); Instead of increasing cognitive load by trying to push the C# type system to places it's not ready to go (yet), we use tiny methods to tweak the API and make it easier for users of our framework to write code correctly. [5] -------------------------------------------------------------------------------- [1] See Encodo’s configuration library for Quino "Part 1" , "Part 2" and "Part 3" as well as API Design: Running an Application "Part 1" and "Part 2" and, finally, "Starting up an application, in detail" . [2] The video goes into quite a bit of depth on using generics to extend the type system in the direction of dependent types. Spoiler alert: he doesn't make it because the C# type system can't be abused in this way, but the journey is informative. [3] As detailed in the links in the first footnote, I'd just gotten rid of this kind of generic constraint in the configuration calls because it was so ugly and offered little benefit. [4] If you define an extension method for a descendant type that has the same name as a method of an ancestor interface, the method-resolution algorithm for C# will never use it. Why? 
Because the directly defined method matches the name and all the types and is a "stronger" match than an extension method. Perhaps an example is in order: interface IA { IA RegisterSingle(); } interface IB : IA { } static class BExtensions { public static IB RegisterSingle(this IB b) { return b; } public static IB UseStuff(this IB b) { return b; } } Let's try to call the method from BExtensions: public void Configure(IB b) { b.RegisterSingle().UseStuff(); } The call to UseStuff cannot be resolved because the return type of the matched RegisterSingle method is the IA of the interface method, not the IB of the extension method. There is a solution, but you're not going to like it (I know I don't). public void Configure(IB b) { BExtensions.RegisterSingle(b).UseStuff(); } You have to specify the extension-method class's name explicitly, which engenders awkward fluent chaining -- you'll have to nest these calls if you have more than one -- but the desired method-resolution was obtained. But at what cost? "The horror...the horror." [5] The final example does not run against Quino 2.2, but will work in an upcoming version of Quino, probably 2.3 or 2.4.
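For readers who want to try the self-referencing generic parameter from Attempt #1 in isolation, the following is a minimal, compilable sketch. All of the types in it (IRegistrationHandler, IApp, App, ILogger, ConsoleLogger, Demo) are hypothetical, simplified stand-ins invented for this illustration; they are not Quino's actual types, and the "registration" merely records what was called.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a registration interface. TSelf is the
// self-referencing generic parameter: the type that fluent calls return.
public interface IRegistrationHandler<TSelf>
    where TSelf : IRegistrationHandler<TSelf>
{
    TSelf Register<TService, TImplementation>()
        where TService : class
        where TImplementation : class, TService;
}

// The descendant passes itself as TSelf, so Register returns IApp.
public interface IApp : IRegistrationHandler<IApp>
{
    IApp UseStandard();
}

public sealed class App : IApp
{
    // Records calls so that the demo has something observable.
    public List<string> Log { get; } = new List<string>();

    public IApp Register<TService, TImplementation>()
        where TService : class
        where TImplementation : class, TService
    {
        Log.Add("Register " + typeof(TService).Name + " -> " + typeof(TImplementation).Name);
        return this;
    }

    public IApp UseStandard()
    {
        Log.Add("UseStandard");
        return this;
    }
}

public interface ILogger { }
public sealed class ConsoleLogger : ILogger { }

public static class Demo
{
    public static string Run()
    {
        var app = new App();

        // Register returns IApp rather than the base handler type,
        // so Use* and Register* calls can be mixed freely.
        app.UseStandard()
           .Register<ILogger, ConsoleLogger>()
           .UseStandard();

        return string.Join("; ", app.Log);
    }

    public static void Main() => Console.WriteLine(Run());
    // prints "UseStandard; Register ILogger -> ConsoleLogger; UseStandard"
}
```

The sketch compiles cleanly because it has only one interface and no extension methods. The cost described in the article appears once every extension method on the handler also has to carry the TSelf parameter and its constraint, which this small example deliberately leaves out.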

v2.2: Winform fixes and Query Improvements

2016-03-25T13:41:54+01:00

Published by marco on 25. Mar 2016 13:41:54 ------------------------------------------------------------------------ The summary below describes major new features, items of note and breaking changes. The "full list of issues" is also available for those with access to the Encodo issue tracker. [Highlights] * Lots of bug fixes and improvements for the Winform UI and German translations with the release of Punchclock on this version. ("QNO-5162" , "QNO-5159" , "QNO-5158" , "QNO-5157" , "QNO-5156" , "QNO-5140" , "QNO-5155" , "QNO-5145" , "QNO-5111" , "QNO-5107" , "QNO-5106" , "QNO-5104" , "QNO-5015" ) * DateTimeExtensions.GetDayOfWeek() had a leap-day bug ("QNO-5051" ) * Fixed how the hash code for GenericObjects is calculated, which fixes sorting issues in grids, specifically for non-persisted or transient objects ("QNO-5137" ) * Improvements to the IAccessControl API for getting groups and users and testing membership ("QNO-5133" ) * Add support for query aliases (e.g. for joining the same table multiple times) ("QNO-531" ) This changes the API surface only minimally. Applications can pass an alias when calling the Join method, as shown below, query.Join(Metadata.Project.Deputy, alias: "deputy") You can find more examples of aliased queries in the TestAliasedQuery(), TestJoinAliasedTables(), TestJoinChildTwice() defined in the QueryTests testing fixture. * Add a standalone IQueryAnalyzer for optimizations and in-memory mini-drivers ("QNO-4830" ) [Breaking changes] * ISchemaManager has been removed. Instead, you should retrieve the interface you were looking for from the IOC. The possible interfaces you might need are IImportHandler, IMappingBuilder, IPlanBuilder or ISchemaCommandFactory. * ISchemaManagerSettings.GetAuthorized() has been moved to ISchemaManagerAuthorizer. 
* The hash-code fix for GenericObjects may have an effect on the way your application sorts objects. * The IParticipantManager (base interface of IAccessControl) no longer has a single method called GetGroups(IParticipant). This method was previously used to get the groups to which a user belongs and the child groups of a given group. This confusing double duty for the API led to an incorrect implementation for both usages. Instead, there are now two methods: * IEnumerable GetGroups(IUser user): Gets the groups for the given user * IEnumerable GetChildGroups(IGroup group): Gets the child groups for the given group The old method has been removed from the interface because (A) it never worked correctly anyway and (B) it conflicts with the new API.

Voxxed Zürich 2016: Notes

2016-03-25T13:41:30+01:00

Published by marco on 25. Mar 2016 13:41:30 ------------------------------------------------------------------------ This article was originally published on the "Encodo Blogs" . -------------------------------------------------------------------------------- This first-ever "Voxxed Zürich" was hosted at the cinema in the SihlCity shopping center in Zürich on March 3rd. All presentations were in English. The conference was relatively small -- 333 participants -- and largely vendor-free. The overall technical level of the presentations and participants was quite high. I had a really nice time and enjoyed a lot of the presentations. There was a nice common thread running through all of the presentations, starting with the Keynote. There's a focus on performance and reliability through immutability, sequences, events, actors, delayed execution (lambdas, which are relatively new to Java), instances in the cloud, etc. It sounds very BUZZWORDY, but instead it came across as a very technically polished conference that reminded me of how many good developers there are trying to do the right thing. Looking forward to next year; hopefully Encodo can submit a presentation. You can take a look at the "VoxxedDays Zürich -- Schedule" . The talks that I visited are included below, with links to the presentation page, the video on YouTube and my notes and impressions. YMMV. [Keynote: Life beyond the Illusion of the Present] "Life beyond the Illusion of the Present" -- Jonas Bonér [media] [Notes] * He strongly recommended reading "The Network is reliable" by Peter Bailis. * This talk is about event-driven, CQRS programming. * Focus on immutable state, very much like Joe Duffy, etc.; transactional accrual of facts. * Never delete data, annotate with more facts. * The reality at any point in time can be calculated by aggregating facts up to that point. Like the talk I once wrote up some notes about ("Runaway Complexity in Big Data, and a Plan to Stop It" by Nathan Marz ). 
* Everything else is a performance optimization. Database views, tables are all caches on the transaction log. Stop throwing the log away, though. * Define smaller atomic units. Not a whole database. Smaller. Consistency boundary. Services? * Availability trumps consistency. Use causal consistency through mechanisms other than time stamps. Local partial better than global. * He talked about data-flow programming; fingers crossed that we get some language support in C# 7 * Akka (Akka.NET) is the main product. [Kotlin - Ready for production] "Kotlin - Ready for production" -- Hadi Hariri [media] * Used at JetBrains, open-source. 14k+ users. It's not a ground-breaking language. Scala was the first language they tried to use (Java already being off the table), but they didn't like it, so they invented Kotlin. * Interoperable with Java (of course). Usable from all sorts of systems, but IntelliJ IDEA has first-class support. * Much less code, less maintenance. Encapsulates some concepts like "data classes", which do what they're supposed to for DTO definitions. * Inferred type on declarations. No nulls. Null-safe by design. Opt-in for nulls. * Implicit casts as well * Interface delegation * Lazy delegation * Deconstruction * Global infix operators; very expressive * Also defaults to/focuses on immutability * Algebraic data types/ data flow * Anko is statically typed XML views for Android * JavaScript target exists and is the focus of work. Replacement for TypeScript? [Reactive Apps with Akka and AngularJS] "Reactive Apps with Akka and AngularJS" -- Heiko Seeberger [media] * He strongly recommended reading "the reactive manifesto" * Responsive: timely response / non-functional / also under load / scale up/down/out * Resilient: fail early * Message-driven: async message-passing is a way of getting reactive/responsive. 
Automatic decoupling leads to better error-handling, no data loss * Akka provides support for: * Actor-based model (actors are services); watch video from Channel 9 * Akka HTTP Server is relatively new * Akka is written in Scala * There's a Scala DSL for defining the controller (define routes) * The Scala compiler is pure crap. Sooooo slooooowww (62 seconds for 12 files) During his talk, he took us through the following stages of building a scalable, resilient actor-based application with Akka. * First he started with static HTML * Then he moved on to something connected to AKKA, but not refreshing * W3C Server-Sent Events is a unidirectional channel from the server to the client. He next used this to have instant refresh on the client; not available on IE. Probably used by SignalR (or whatever replaced it)? Nothing is typed, though, just plain old JavaScript * Then he set up sharding * Then persistence (Cassandra, Kafka) AKKA Distributed Data * Deals with keeping replicas consistent without central coordination * Conflict-free replicated data types * Fully distributed, has pub/sub semantics * Uses the Gossip protocol * Supports various consistency strategies * Using AKKA gives you automated scaling support (unlike the SignalR demo Urs and I did over 2 years ago, but that was a chat app as well) AKKA Cluster Sharding * Partitioning of actors/services across clusters * Supports various strategies * Default strategy is to distribute unbalanced actors to new shards * The ShardRegion is another actor that manages communication with sharded actors (entities). This introduces a new level of indirection, which must be honored in the code (?) AKKA Persistence * Event-sourcing: validate commands, journal events, apply the event after persistence.
* An event is applied to local state only after the journal/persistence has indicated that the command was journaled * On recovery, events are replayed * Supports snapshotting (caching points in time) * Requires a change to the actor/entity to use it. All written in Scala. Akka looks pretty good. It guarantees the ordering because ACTORS. Any given actor exists on only one shard at a time. If a shard goes down, the actor is recreated on a different shard, and filled with information from the persistent store to "recreate" the state of that actor. DDD (Domain-Driven Design) and the actor model. Watch "Hewitt, Meijer and Szyperski: The Actor Model (everything you wanted to know, but were afraid to ask)" . Code is on GitHub: seeberger/reactive_flows [Lambda core - hardcore] "Lambda core - hardcore" -- Jarek Ratajski [media] Focus on immutability and no side-effects. Enforced by the lambda calculus. Pretty low-level talk about lambda calculus. Interesting, but not applicable. He admitted as much at the top of the talk. Links: * "Dixin's Blog" * "Badlam visualization" * "His presentation engine" [expect("poo").length.toBe(1)] "expect("poo").length.toBe(1)" -- Philip Hofstetter [1] [media] This was a talk about expectations of the length of a character. The presenter was very passionate about his talk and went into an incredible amount of detail. * What is a string? This is the kind of stuff every programmer needs to know. [2] * String is not a collection of bytes. It's a sequence of graphemes. string <> char[] * UTF-16 is crap. What about the in-memory representation? Why in God's name did Python 3 use UTF-32? Unicode Transformation Format. * What is the length of a string? ä is how many? Single character (diaeresis included) or a with combining diaeresis? * In-memory representation in Java and C# is UCS-2 (UNICODE 1); stuck in 1996, before Unicode 2.0 came out. This leaks into APIs because of how strings are returned ...
string APIs use UTF-16, encoding with surrogate pairs to get to characters outside of the BMP (understood by convention, but not by the APIs that expect UTF-16 ... which has no idea what surrogate pairs are ... and counting algorithms, find, etc. won't work). * ECMAScript hasn't really fixed this, either. substr() can break strings; charAt() is still available and has no idea about code points. Does this apply to ES6? String-equality doesn't work for the diaeresis above. * So we're stuck with server-side. Who does it right? Perl. Swift. Python. Ruby. Python went through hell with backwards compatibility but with 3.3 they're doing OK again. Ruby strings are a tuple of encoding and data. All of the others have their string libraries dealing in graphemes. How did Perl always get it right? Perl has three methods for asking questions about length, in graphemes, code points or bytes * What about those of us using JavaScript? C#? Java? There are external libraries that we should be using. Not just for DateTime, but for string-handling as well. Even ECMAScript 2015 still uses code points rather than graphemes, so the count varies depending on how the grapheme is constructed. * Security concerns: certificate authorities have to be aware of homographs (e.g. a character that looks like another one, but has a different encoding/byte sequence). * He recommended the book "Unicode explained" by Jukka K. Korpela. [How usability fits in UX - it's no PICNIC] "How usability fits in UX - it's no PICNIC" -- Myriam Jessier [media] What should a UI be? 1. Functional 2. Reliable 3. Usable 4. Convenient 5. Pleasurable Also nice to have: 1. Desirable 2. Delightful 3. Memorable 4. Learnable 5. 3 more Book recommendation: "Don't make me think" by Steve Krug * Clicks should be mindless and unambiguous. Don't count clicks, count useless shit you need to do. * Let the words go. People's attention will wander. * UX is going to be somewhat subjective. Don't try to please everyone.
* OMG She uses hyphens correctly. * She discussed the difference between UX, CX, UI. * Personas are placeholders for your users. See "Personapp" to get started working with personas. Guidelines: * Consistent and standardized UI * Guide the user (use visual cues, nudging) * Make the CallToAction (CTA) interactive objects obvious * Give feedback on progress, interaction * Never make a user repeat something they already told you. You're software, you should have eidetic memory * Always have default values in forms (e.g. show the expected format) * Explain how the entered information will be used (e.g. for marketing purposes) * No more "reset" button or mass-delete buttons. Don't make it possible/easy to wipe out all of someone's data * Have clear and explanatory error or success messages (be encouraging) * Include a clear and visual hierarchy and navigation Guidelines for mobile: * Make sure it works on all phones * Give incentives for sharing and purpose (engagement rates make marketing happy. CLICK THE BUTTON) * Keep usability and conversion in mind (not necessarily money, but you actually want people to be using your app correctly) * Usability (can you use your app on the lowest screen-brightness?) * ...and more... * Make it pretty (some people don't care, e.g. she very clearly said that she's not aesthetically driven, it's not her field; other people do care. A lot). * Give all the information a customer needs to purchase * Design for quick movement (no lag) * Do usability testing through video * Leverage expectations. Fit in to the environment. Search is on the left? Behind a button? Do that. Don't make a new way of searching. * If you offer choices, then make them as mutually exclusive as possible. When a company talks to itself (e.g. industry jargon), then users get confused * The registration process should be commensurate to the thing that you're registering for * Small clickable ads on mobile. Make click targets appropriate.
* Don't blame negative feedback on "fear of change". It's probably you. If people don't like it, then it might not be user-friendly. The example with Twitter's star vs. heart. It's interesting how we let the world frame our interactions. Why not both? Too complex? Would people really be confused by two buttons? One to "like" and one for "read later"? Suggested usability testing tools: * Crazy Egg is $9 per month for heatmaps. * Qualaroo * Optimizely (A/B testing) * Usabilla * Userfeel * TryMyUI [React - A trip to Russia isn't all it seems] "React - A trip to Russia isn't all it seems" -- Josh Sephton [3] [media] This talk was about Web UI frameworks and how his team settled on React. * Angular too "all or nothing". * Backbone has no data-binding. * React looks good. Has its own routing for SPAs. Very component-heavy. Everything's a component. Nothing new here so far. * They built their React app to replace a WordPress-based administration form * Stateful components are a bad idea * React components are like self-contained actors/services * They started with Flux, but ended up with Redux. We're using Redux in our samples. I'm eyeballing how to integrate Akka.Net (although I'm not sure if that has anything to do with this). * ReactNative: write once, use on any device * Kind of superficial and kinda short but I knew all about this in React already [The reactor programming model for composable distributed computing] "The reactor programming model for composable distributed computing" -- Aleksandar Prokopec [4] [media] * Reactive programming, with events as sequences of event objects * Events are equivalent to a list/sequence/streams (enumerable in C#) * This talk is also about managing concurrency * There must be a boundary between outer concurrent events vs.
how your application works on them * That's why most UI toolkits are single-threaded * Asynchronous is the antonym of concurrency (at least in the dictionary) * Filter the stream of events to compress them to frames, then render and log, so the events come in, are marshaled through the serializing bottleneck and are then dispatched asynchronously to different tasks * Reactor lets clients create their own channels (actors) from which they read events and which they register with a server so that it can publish * Akka supports setting up these things, Reactor is another implementation? * Dammit I want destructuring of function results (C# 7?) * It's very easy to build client/server and broadcast and even ordered synchronization using UIDs (or that pattern mentioned by Jonas in the keynote) The UID needs to be location-specific, though. That's not sufficient either, what you need is client-specific. For this, you need special data structures to store the data in a way that edits are automatically correctly ordered. Events sent for these changes make sure that the events are ordered correctly * What is the CRDT? We just implemented an online collaborative editor: composes nicely and provides a very declarative, safe and scalable way of defining software. This is just a function (feeds back into the idea of lambdas here, actually, immutability, encapsulation) * "Reactors" -------------------------------------------------------------------------------- [1] I am aware of the irony that the emoji symbol for "poo" is not supported on this blogging software. That was basically the point of the presentation -- that encoding support is difficult to get right. There's an issue for it: "Add support for UTF8 as the default encoding" . [2] In my near-constant striving to be the worst conversational partner ever, I once gave a similar encoding lesson to my wife on a two-hour walk around a lake when she dared ask why mails sometimes have those "stupid characters" in them.
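The string-length question from the "poo" talk can be made concrete with a small sketch. It is written in Python only because the standard library suffices; the helper name lengths() is made up for this illustration and is not something shown in the talk. It counts the same text in code points, UTF-16 code units and UTF-8 bytes, and shows why the two spellings of "ä" do not compare equal until they are normalized.

```python
import unicodedata


def lengths(text: str) -> dict:
    """Count one string in three different units (hypothetical helper)."""
    return {
        # Python's len() counts Unicode code points, not graphemes.
        "code_points": len(text),
        # UTF-16 uses two bytes per code unit; surrogate pairs count as two units.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "utf8_bytes": len(text.encode("utf-8")),
    }


precomposed = "\u00e4"   # "ä" as a single code point
combining = "a\u0308"    # "a" followed by COMBINING DIAERESIS
poo = "\U0001F4A9"       # PILE OF POO, outside the BMP

print(lengths(precomposed))  # 1 code point, 1 UTF-16 unit, 2 bytes
print(lengths(combining))    # 2 code points, 2 UTF-16 units, 3 bytes
print(lengths(poo))          # 1 code point, 2 UTF-16 units, 4 bytes

# The two spellings of "ä" render identically but are different sequences.
print(precomposed == combining)                                # False
print(unicodedata.normalize("NFC", combining) == precomposed)  # True
```

Note that even here the combining form reports a length of 2: counting user-perceived characters (graphemes) needs segmentation rules that Python's standard library does not provide, so a third-party library would be required for that.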

Finovate 2016: Bank2Things

2016-03-25T13:39:04+01:00

Published by marco on 25. Mar 2016 13:39:04 ------------------------------------------------------------------------ "[image]" "[image]" At the beginning of the year, we worked on an interesting project that dipped into IOT (Internet of Things). The project was to create use cases for Crealogix's banking APIs in the real world. Concretely, we wanted to show how a customer could use these APIs in their own workflows. The use cases were to provide proof of the promise of flexibility and integrability offered by well-designed APIs. "Watch 7-minute video of the presentation" [The Use Cases] [Football Club Treasurer] [image]The first use case is for the treasurer of a local football club. The treasurer wants to be notified whenever an annual club fee is transferred from a member. The club currently uses a Google Spreadsheet to track everything, but it's updated manually. It would be really nice if the banking API could be connected -- via some scripting "glue" -- to update the spreadsheet directly, without user intervention. The treasurer would just see the most current numbers whenever he opened the spreadsheet. The spreadsheet is in addition to the up-to-date view of payments in the banking app. The information is also available there, but not necessarily in the form that he or she would like. Linking automatically to the spreadsheet is the added value. [Chore & Goal Tracker] [image][image]Imagine a family with a young son who wants to buy a drone. He would have to earn it by doing chores. Instead of tracking this manually, the boy's chores would be tabulated automatically, moving money from the parents' account to his own as he did chores. Additionally, a lamp in the boy's room would glow a color indicating how close he was to his goal. The parents wanted to track the boy's progress in a spreadsheet, tracking the transfers as they would have had they not had any APIs.
The idea is to provide added value to the boy, who can record his chores by pressing a button and see his progress by looking at a lamp's color. The parents get to stay in their comfort zone, working with a spreadsheet as usual, but having the data automatically entered in the spreadsheet. [The Plan] It's a bit of a stretch, but it sufficed to ground the relatively abstract concept of banking APIs in an example that non-technical people could follow. So we needed to pull quite a few things together to implement these scenarios. * A lamp that can be controlled via API * A button that can trigger an API * A spreadsheet accessible via API * An API that can transfer money between accounts * "Glue" logic that binds these APIs together [The Lamp] [image][image] We looked at two lamps: * "Philips Hue" * "Lifx" Either of these -- just judging from their websites -- would be sufficient to utterly and completely change our lives. The Hue looked like it was going to turn us into musicians, so we went with Lifx, which only threatened to give us horn-rimmed glasses and a beard (and probably skinny jeans and Chuck Taylor knockoffs). Yeah, we think the marketing for what is, essentially, a light-bulb, is just a touch overblown. Still, you can change the color of the light bulb with a smartphone app, or control it via API (which is what we wanted to do). [The Button] The button sounds simple. You'd think that, in 2016, these things would be as ubiquitous as AOL CDs were in the 1990s. You'd be wrong. [image]There's a KickStarter project called "Flic" that purports to have buttons that send signals over a wireless connection. They cost about CHF20. Though we ordered some, we never saw any because of manufacturing problems. If you thought the hype and marketing for a light bulb were overblown, then you're sure to enjoy how Flic presents a button.
We quickly moved along a parallel track to get buttons that can be pressed in real life rather than just viewed from several different angles and in several different colors online. [image]Amazon has what they have called "Dash" buttons that customers can press to add predefined orders to their one-click shopping lists. The buttons are bound to certain household products that you tend to purchase cyclically: toilet paper, baby wipes, etc. They sell them dirt-cheap -- $5 -- but only to Amazon Prime customers -- and only to customers in the U.S. Luckily, we knew someone in the States willing to let us use his Amazon Prime account to deliver them, naturally only to a domestic address, from which they would have to be forwarded to us here in Switzerland. That we couldn't use them to order toilet paper in the States didn't bother us -- we were planning to hack them anyway. These buttons showed up after a long journey and we started trapping them in our own mini-network so that we could capture the signal they send and interpret it as a trigger. This was not ground-breaking stuff, but we really wanted the demonstrator to be able to press a physical button on stage to trigger the API that would cascade other APIs and so on. Of course we could have just hacked the whole thing so that someone presses a button on a screen somewhere -- and we programmed this as a backup plan -- but the physicality of pressing a button was the part of the demonstration that was intended to ground the whole idea for non-technical users. [1] [The Spreadsheet] [image][image]If you're going to use an API to modify a spreadsheet, then that spreadsheet has to be available online somewhere. The spreadsheet application in Google Docs is a good candidate. The API allows you to add or modify existing data, but that's pretty much it. When you make changes, they show up immediately, with no ceremony. That, unfortunately, doesn't make for a very nice-looking demo. 
Google Docs also offers a JavaScript-like scripting language that lets you do more. We wanted to not only insert rows, we wanted charts to automatically update and move down the page to accommodate the new row. All animated, thank you very much. This took a couple pages of scripting and a good amount of time. It's also no longer a solution that an everyday user is likely to make themselves. And, even though we pushed as hard as we could, we also didn't get everything we wanted. The animation is very jerky (watch the video linked above) but gets the job done. [The Glue] [image]So we've got a bunch of pieces that are all capable of communicating in very similar ways. The final step is to glue everything together with a bit of script. There are several services available online, like IFTTT -- If This Then That -- that allow you to code simple logic to connect signals to actions. In our system, we had the following signals: * Transfer was made to a bank account * Button was pressed and the following actions: * Insert data into Google Spreadsheet * Set color of lamp [The Crealogix API and UI] [image][image][image]So we're going to betray a tiny secret here. Although the product demonstrated on-stage did actually do what it said, it didn't do it using the Crealogix API to actually transfer money. That's the part that we were actually selling and it's the part we ended up faking/mocking out because the actual transfer is beside the point. Setting up bank accounts is not so easy, and the banks take umbrage at creating them for fake purposes. Crealogix could have let us use fake testing accounts, but even that would have been more work than it was worth: if we're already faking, why not just fake in the easiest way possible by skipping the API call to Crealogix and only updating the spreadsheet? Likewise, the entire UI that we included in the product was mocked up to include only the functionality required by the demonstration.
You can see an example here -- of the login screen -- but other screens are linked throughout this article. Likewise, the Bank2Things screen shown above and to the left is a mockup. [Wrapup] So what did Encodo actually contribute? * We used the Crealogix UX and VSG to mock up all of the app screens that you see linked in this article. We did all of the animation and logic and styling. * We built two Google Spreadsheets and hooked them up to everything else * We hooked up the Lifx lamp API into our system * We hacked the Amazon Dash buttons to communicate in our own network instead of beaming home to the mothership * We built a web site to handle any mocking/faking that needed to be done for the demo and through which the devices communicated * We provided a VM (Virtual Machine) on which everything ran (other than the Google Spreadsheets) As last year -- when we helped Crealogix create the prototype for their "BankClip for Finovate 2015" -- we had a lot of fun investigating all of these cutting-edge technologies and putting together a custom solution in time for Finovate 2016. -------------------------------------------------------------------------------- [1] As it turns out, if you watch the "7-minute video of the presentation" , nowhere do you actually see a button. Maybe they could see them from the audience.
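The signal-to-action "glue" described in this article can be sketched in a few lines. This is only an illustration of the idea, not the script we used for the demo: every name here (Glue, progress_color, the signal name, the color thresholds, the goal amount) is hypothetical, and the spreadsheet and lamp are plain in-memory stand-ins rather than calls to any real Google, Lifx or IFTTT API.

```python
from collections import defaultdict


class Glue:
    """Minimal signal -> actions dispatcher (hypothetical, IFTTT-like)."""

    def __init__(self):
        self._rules = defaultdict(list)

    def when(self, signal, action):
        """Register an action to run whenever the signal is emitted."""
        self._rules[signal].append(action)

    def emit(self, signal, payload):
        """Run all actions registered for the signal, in registration order."""
        for action in self._rules[signal]:
            action(payload)


def progress_color(saved, goal):
    """Map progress toward the goal to a lamp color (made-up thresholds)."""
    ratio = 0.0 if goal <= 0 else min(saved / goal, 1.0)
    if ratio < 0.5:
        return "red"
    if ratio < 1.0:
        return "yellow"
    return "green"


# In-memory stand-ins for the real spreadsheet and lamp.
spreadsheet_rows = []
lamp = {"color": "red"}
GOAL = 20.0  # assumed price of the drone, for illustration only


def insert_row(payload):
    spreadsheet_rows.append((payload["chore"], payload["amount"]))


def update_lamp(payload):
    saved = sum(amount for _, amount in spreadsheet_rows)
    lamp["color"] = progress_color(saved, GOAL)


glue = Glue()
glue.when("button_pressed", insert_row)
glue.when("button_pressed", update_lamp)

glue.emit("button_pressed", {"chore": "dishes", "amount": 5.0})
glue.emit("button_pressed", {"chore": "lawn", "amount": 15.0})
print(spreadsheet_rows, lamp)
```

The order of registration matters in this sketch: the row is inserted before the lamp is updated, so the lamp always reflects the latest total.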

Mini-applications and utilities with Quino

2016-02-27T12:36:39+01:00

result = app.GetInstance() ); if (transcript.ExitCode != ExitCodes.Ok) { throw new InvalidOperationException( "Could not read the service settings from the configuration file." + new SimpleMessageFormatter().GetErrorDetails(transcript.Messages) ); } return result; } If you've been following along in the other articles (see first footnote below), then this structure should be very familiar. We use an ApplicationManager() to execute the application logic, creating the application with CreateServiceConfigurationApplication and returning the settings configured by the application in the second parameter (the "run" action). If anything went wrong, we get the details and throw an exception. You can't see it, but the library provides debug/file logging (if you enable it), debug/release mode support (exception-handling, etc.) and everything is customizable/replaceable by registering with an IOC. [Configuring the Settings Loader] Soooo...I can see where we're returning the ServiceSettings, but where are they configured? Let's take a look at the second method, the one that creates the application. private static IApplication CreateServiceConfigurationApplication() { var application = new Application(); application .UseSimpleInjector() .UseStandard() .UseConfigurationFile("service-settings.xml") .Configure( "service", (settings, node) => { settings.ServiceName = node.GetValue("name", settings.ServiceName); settings.DisplayName = node.GetValue("displayName", settings.DisplayName); settings.Description = node.GetValue("description", settings.Description); settings.Types = node.GetValue("types", settings.Types); } ).RegisterSingle(); return application; } 1. First, we create a standard Application, defined in the Encodo.Application assembly. What does this class do? It does very little other than manage the main IOC (see articles linked in the first footnote for details). 2. The next step is to choose an IOC, which we do by calling UseSimpleInjector(). 
Quino includes support for the SimpleInjector IOC out of the box. As you can see, you must include this support explicitly, so you're also free to assign your own IOC (e.g. one using Microsoft's Unity). SimpleInjector is very lightweight and super-fast, so there's no downside to using it. 3. Now we have an application with an IOC that doesn't have any registrations on it. How do we get more functionality? By calling methods like UseStandard(), defined in the Encodo.Application.Standard assembly. Since I know that UseStandard() pulls in what I'm likely to need, I'll just use that. [4] 4. The next line tells the application the name of the configuration file to use. [5] 5. The very next line is already application-specific code, where we configure the ServiceSettings object that we want to return. For that, there's a Configure method that returns an object from the IOC along with a specific node from the configuration data. This method is called only if everything started up OK. 6. The final call to RegisterSingle makes sure that the ServiceSettings object created by the IOC is a singleton (it would be silly to configure one instance and return another, unconfigured one). Basically, because this application is so simple, it has already accomplished its goal by the time the standard startup completes. At the point that we would "run" this application, the ServiceSettings object is already configured and ready for use. That's why, in LoadServiceSettings(), we can just get the settings from the application with GetInstance() and exit immediately. [Code Generator] The code generator has a bit more code, but follows the same pattern as the simple application above. In this case, we use the command line rather than the configuration file to get user input. [Execution] The main method defers all functionality to the ApplicationManager, passing along two methods, one to create the application, the other to run it. 
internal static void Main() { new ApplicationManager().Run(CreateApplication, GenerateCode); } [Configuration] As before, we first create an Application, then choose the SimpleInjector and some standard configuration and registrations with UseStandard(), UseMetaStandardServices() and UseMetaTools(). [6] We set the application title to "Quino Code Generator" and then include objects with UseSingle() that will be configured from the command line and used later in the application. [7] And, finally, we add our own ICommandSet to the command-line processor that will configure the input and output settings. We'll take a look at that part next. private static IApplication CreateApplication( IApplicationCreationSettings applicationCreationSettings) { var application = new Application(); return application .UseSimpleInjector() .UseStandard() .UseMetaStandardServices() .UseMetaTools() .UseTitle("Quino Code Generator") .UseSingle(new CodeGeneratorInputSettings()) .UseSingle(new CodeGeneratorOutputSettings()) .UseUnattendedCommand() .UseCommandSet(CreateGenerateCodeCommandSet(application)) .UseConsole(); } [Command-line Processing] The final bit of the application configuration is to see how to add items to the command-line processor. Basically, each command set consists of required values, optional values and zero or more switches that are considered part of a set. The one for i simply sets the value of inputSettings.AssemblyFilename to whatever was passed on the command line after that parameter. Note that it pulls the inputSettings from the application to make sure that it sets the values on the same singleton reference as will be used in the rest of the application. The code below shows only one of the code-generator--specific command-line options. 
[8] private static ICommandSet CreateGenerateCodeCommandSet( IApplication application) { var inputSettings = application.GetSingle(); var outputSettings = application.GetSingle(); return new CommandSet("Generate Code") { Required = { new OptionCommandDefinition { ShortName = "i", LongName = "in", Description = Resources.Program_ParseCommandLineArgumentIn, Action = value => inputSettings.AssemblyFilename = value }, // And others... }, }; } [Code-generation] Finally, let's take a look at the main program execution for the code generator. It shouldn't surprise you too much to see that the logic consists mostly of getting objects from the IOC and telling them to do stuff with each other. [9] I've highlighted the code-generator--specific objects in the code below. All other objects are standard library tools and interfaces. private static void GenerateCode(IApplication application) { var logger = application.GetLogger(); var inputSettings = application.GetInstance(); if (!inputSettings.TypeNames.Any()) { logger.Log(Levels.Warning, "No types to generate."); } else { var modelLoader = application.GetInstance(); var metaCodeGenerator = application.GetInstance(); var outputSettings = application.GetInstance(); var modelAssembly = AssemblyTools.LoadAssembly( inputSettings.AssemblyFilename, logger ); outputSettings.AssemblyDetails = modelAssembly.GetDetails(); foreach (var typeName in inputSettings.TypeNames) { metaCodeGenerator.GenerateCode( modelLoader.LoadModel(modelAssembly, typeName), outputSettings, logger ); } } } So that's basically it: no matter how simple or complex your application, you configure it by indicating what stuff you want to use, then use all of that stuff once the application has successfully started. The Encodo/Quino framework provides a large amount of standard functionality. It's yours to use as you like and you don't have to worry about building it yourself. 
Even your tiniest application can benefit from sophisticated error-handling, command-line support, configuration and logging without lifting a finger. -------------------------------------------------------------------------------- [1] See Encodo’s configuration library for Quino "Part 1" , "Part 2" and "Part 3" as well as API Design: Running an Application "Part 1" and "Part 2" and, finally, "Starting up an application, in detail" . [1] That boilerplate looks like this: var fileService = new ServiceInstaller(); fileService.StartType = ServiceStartMode.Automatic; fileService.DisplayName = "Quino Sandbox"; fileService.Description = "Demonstrates a Quino-based service."; fileService.ServiceName = "Sandbox.Services"; See the ServiceInstaller.cs file in the Sandbox.Server project in Quino 2.1.2 and higher for the full listing. [1] The standard implementation of Quino's ITextKeyValueNodeReader supports XML, but it would be trivial to create and register a version that supports JSON ("QNO-4993" ) or YAML. The configuration file for the utility looks like this: Quino.Services Quino Utility The application to run all Quino backend services.

Verity Stob Teaches Functional Programming

2016-01-17T22:27:10+01:00

Published by marco on 17. Jan 2016 22:27:10 Updated by marco on 18. Jan 2016 07:12:55 ------------------------------------------------------------------------ The article "Learn you Func Prog on five minute quick!" by Verity Stob provides a typically twisted and unhelpful overview of the state of functional programming in this 21st-century renaissance -- heralded decades ago by Lisp programmers. It includes an honest overview of the major players, including Scala, for which the "pro" and "con" are the same (a "[c]lose relationship with Java [...]") and ending with JavaScript, for which the "pro" is "It's what you'll end up using." The discussion continues with rules: variable immutability, function purity, curryability and monadicity, which is where things really go off the rails. Property 7 dribbles to a shuddering halt with, "All monads define a unit() function called of(), a bind() function called map() and a type constructor function called... "Wait a minute. Wait a minute. Perhaps bind() is a functor not a function. I'm pretty sure about that. Hold on to the horses a moment there while I look it up. "...And I should perhaps clarify that this bind() and map() is nothing to do with any other bind() or map() methods or functions that you might be familiar with, although their actions are in some sense quite similar. "Summary: It has been an honour and a pleasure to clear all that up for you. "Final Reader's comment: My gratitude is inexpressible. [1]" -------------------------------------------------------------------------------- [1] I've been reading Verity Stob since the days when she was published in Dr. Dobbs. Nice to see her still going with the same sense of humour (sic) and insight into the at-times puffed-up and self-important programming world. [2] [2] Which is not to say that I don't enjoy immensely the functional aspects of C#. I do. I also have read a lot about monads and am completely familiar with the tragically bad and unenlightening explanations.
Stob captures this elegantly with the following corollary to Rule 4: "If you should by some accident come to understand what a Monad is, you will simultaneously lose the ability to explain it to anybody else."

Profiling: that critical 3% (Part II)

2016-01-16T12:53:04+01:00

Published by marco on 16. Jan 2016 12:53:04 ------------------------------------------------------------------------ [image]In "part I" of this series, we discussed some core concepts of profiling. In that article, we discussed not only the problem at hand, but also how to think about both fixing performance problems and reducing the likelihood that they get out of hand in the first place. In this second part, we'll go into detail and try to fix the problem. [Reëvaluating the Requirements] Since we have new requirements for an existing component, it's time to reconsider the requirements for all stakeholders. In terms of requirements, the IScope can be described as follows: 1. Hold a list of objects in LIFO order 2. Hold a list of key/value pairs with a unique name as the key 3. Return the value/reference for a key 4. Return the most appropriate reference for a given requested type. The most appropriate object is the one that was added with exactly the requested type. If no such object was added, then the first object that conforms to the requested type is returned 5. These two piles of objects are entirely separate: if an object is added by name, we do not expect it to be returned when a request for an object of a certain type is made There is more detail, but that should give you enough information to understand the code examples that follow. [Usage Patterns] There are many ways of implementing the functional requirements listed above. While you can implement the feature with only the requirements, it's very helpful to know usage patterns when trying to optimize code. Therefore, we'd like to know exactly what kind of contract our code has to implement -- and to not implement any more than was promised. Sometimes a hopeless optimization task gets a lot easier when you realize that you only have to optimize for a very specific situation.
In that case, you can leave the majority of the code alone and optimize a single path through the code to speed up 95% of the calls. All other calls, while perhaps a bit slow, will at least still yield the correct results. And "optimized" doesn't necessarily mean that you have to throw all of your language's higher-level constructs out the window. Once your profiling tool tells you that a particular bit of code has introduced a bottleneck, it often suffices to just examine that particular bit of code more closely. Just picking the low-hanging fruit will usually be more than enough to fix the bottleneck. [1] [Create scopes faster [2]] I saw in the profiler that creating the ExpressionContext had gotten considerably slower. Here's the code in the constructor. foreach (var value in values.Where(v => v != null)) { Add(value); } I saw a few potential problems immediately. * The call to Add() had gotten more expensive in order to return the most appropriate object from the GetInstances() method * The Linq replaced a call to AddRange() The faster version is below: var scope = CurrentScope; for (var i = 0; i < values.Length; i++) { var value = values[i]; if (value != null) { scope.AddUnnamed(value); } } Why is this version faster? The code now uses the fact that we know we're dealing with an indexable list to avoid allocating an enumerator and to use non-allocating means of checking null. While the Linq code is highly optimized, a for loop is always going to be faster because it's guaranteed not to allocate anything. Furthermore, we now call AddUnnamed() to use the faster registration method because the more involved method is never needed for these objects. The optimized version is less elegant and harder to read, but it's not terrible. Still, you should use these techniques only if you can prove that they're worth it. [Optimizing CurrentScope] Another minor improvement is that the call to retrieve the scope is made only once regardless of how many objects are added. 
On the one hand, we might expect only a minor improvement since we noted above that most use cases only ever add one object anyway. On the other, however, we know that we call the constructor 20 million times in at least one test, so it's worth examining. The call to CurrentScope gets the last element of the list of scopes. Even something as innocuous as calling the Linq extension method Last() can get more costly than it needs to be when your application calls it millions of times. Of course, Microsoft has decorated its Linq calls with all sorts of compiler hints for inlining and, of course, if you decompile, you can see that the method itself is implemented to check whether the target of the call is a list and use indexing, but it's still slower. There is still an extra stack frame (unless inlined) and there is still a type-check with as. Replacing a call to Last() with getting the item at the index of the last position in the list is not recommended in the general case. However, making that change in a provably performance-critical area shaved a percent or two off a test run that takes about 45 minutes. That's not nothing. protected IScope CurrentScope { get { return _scopes.Last(); } } protected IScope CurrentScope { get { return _scopes[_scopes.Count - 1]; } } That takes care of the creation & registration side, where I noticed a slowdown when creating the millions of ExpressionContext objects needed by the data driver in our product's test suite. [Get objects faster] Let's now look at the evaluation side, where objects are requested from the context. 
The offending, slow code is below: public IEnumerable<TService> GetInstances<TService>() { var serviceType = typeof(TService); var rawNameMatch = this[serviceType.FullName]; var memberMatches = All.OfType<TService>(); var namedMemberMatches = NamedMembers.Select( item => item.Value ).OfType<TService>(); if (rawNameMatch != null) { var nameMatch = (TService)rawNameMatch; return nameMatch .ToSequence() .Union(namedMemberMatches) .Union(memberMatches) .Distinct(ReferenceEqualityComparer.Default); } return namedMemberMatches.Union(memberMatches); } As you can readily see, this code isn't particularly concerned about performance. It is, however, relatively easy to read, and the logic behind returning objects is easy to figure out. As long as no-one really needs this code to be fast -- if it's not used that often and not used in tight loops -- it doesn't matter. What matters more is legibility and maintainability. But we now know that we need to make it faster, so let's focus on the most-likely use cases. I know the following things: * Almost all Scope instances are created with a single object in them and no other objects are ever added. * Almost all object-retrievals are made on such single-object scopes * Though the scope should be able to return all matching instances, sorted by the rules laid out in the requirements, all existing calls get the FirstOrDefault() object. These extra bits of information will allow me to optimize the already-correct implementation to be much, much faster for the calls that we're likely to make. 
The optimized version is below: public IEnumerable<TService> GetInstances<TService>() { var members = _members; if (members == null) { yield break; } if (members.Count == 1) { if (members[0] is TService) { yield return (TService)members[0]; } yield break; } object exactTypeMatch; if (TypedMembers.TryGetValue(typeof(TService), out exactTypeMatch)) { yield return (TService)exactTypeMatch; } foreach (var member in members.OfType<TService>()) { if (!ReferenceEquals(member, exactTypeMatch)) { yield return member; } } } Given the requirements, the handful of use cases and decent naming, you should be able to follow what's going on above. The code contains many more escape clauses for common and easily handled conditions, handling them in an allocation-free manner wherever possible. 1. Handle empty case 2. Handle single-element case 3. Return exact match 4. Return all other matches [3] You'll notice that returning a value added by-name is not a requirement and has been dropped. Improving performance by removing code for unneeded requirements is a perfectly legitimate solution. [Test Results] And, finally, how did we do? I created tests for the following use cases: * Create scope with multiple objects * Get all matching objects in an empty scope * Get first object in an empty scope * Get all matching objects in a scope with a single object * Get first object in a scope with a single object * Get all matching objects in a scope with multiple objects * Get first object in a scope with multiple objects Here are the numbers from the automated tests. 
[image] [image] * Create scope with multiple objects -- 12x faster * Get all matching objects in an empty scope -- almost 2.5x faster * Get first object in an empty scope -- almost 3.5x faster * Get all matching objects in a scope with a single object -- over 3x faster * Get first object in a scope with a single object -- over 3.25x faster * Get all matching objects in a scope with multiple objects -- almost 3x faster * Get first object in a scope with multiple objects -- almost 2.25x faster This looks amazing but remember: while the optimized solution may be faster than the original, all we really know is that we've just managed to claw our way back from the atrocious performance characteristics introduced by a recent change. We expect to see vast improvements versus a really slow version. Since I know that these calls showed up as hotspots and were made millions of times in the test, the performance improvement shown by these tests is enough for me to deploy a pre-release of Quino via TeamCity, upgrade my product to that version and run the tests again. Wish me luck! [4] -------------------------------------------------------------------------------- [1] Or, most likely, push it to some other piece of code. [1] The best approach at this point is to create issues for the other performance investigations you could make. For example, I opened an issue called "Optimize allocations in the data handlers (start with IExpressionContexts)" , documented everything I had analyzed and quickly got back to the issue on which I'd started. [1] For those with access to the Quino Git repository, the diffs shown below come from commit a825d5030ce6f65a452e1db85a308e1351288b96. [1] If you're following along very, very carefully, you'll recall at this point that the requirement stated above is that objects are returned in LIFO order. The faster version of the code returns objects in FIFO order. 
You can't tell from the code shown, but the original, slow version did guarantee LIFO ordering, though only because the call to get All members contained a hidden call to the Linq method Reverse(), which slowed things down even more! I removed the call to reverse all elements because I have neither (A) any tests for the LIFO requirement nor (B) any other code that expects it. I wasn't about to make the code even more complicated and possibly slower just to satisfy a purely theoretical requirement. That's the kind of behavior that got me into this predicament in the first place. [1] Spoiler alert: it worked. ;-) The fixes cut the testing time from about 01:30 to about 01:10 for all tests on the build server, so we won back the lost 25%.
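
To make the lookup order concrete, here is a minimal, self-contained sketch of the optimized lookup. It is not the Quino implementation: SketchScope, Animal and Dog are hypothetical names, the dictionary stands in for whatever TypedMembers really is, and the rule that the last registration per exact type wins is an assumption made only for this example. It only uses reference types, since the ReferenceEquals() check would not recognize a boxed value type as the exact match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Quino's Scope; NOT the real implementation.
public sealed class SketchScope
{
    private List<object> _members;                  // stays null until the first object is added
    private Dictionary<Type, object> _typedMembers; // assumed shape of "TypedMembers"

    public void AddUnnamed(object value)
    {
        if (value == null) { throw new ArgumentNullException(nameof(value)); }

        if (_members == null)
        {
            _members = new List<object>();
            _typedMembers = new Dictionary<Type, object>();
        }

        _members.Add(value);
        _typedMembers[value.GetType()] = value; // assumption: last registration per exact type wins
    }

    public IEnumerable<TService> GetInstances<TService>()
    {
        var members = _members;
        if (members == null) { yield break; }        // 1. empty case

        if (members.Count == 1)                      // 2. single-element case
        {
            if (members[0] is TService) { yield return (TService)members[0]; }
            yield break;
        }

        object exactTypeMatch;
        if (_typedMembers.TryGetValue(typeof(TService), out exactTypeMatch))
        {
            yield return (TService)exactTypeMatch;   // 3. exact match comes first
        }

        foreach (var member in members.OfType<TService>())
        {
            // 4. all other matches, in the order in which they were added (FIFO)
            if (!ReferenceEquals(member, exactTypeMatch)) { yield return member; }
        }
    }
}

public class Animal { public string Name; }
public class Dog : Animal { }

public static class Program
{
    public static void Main()
    {
        var scope = new SketchScope();
        scope.AddUnnamed(new Dog { Name = "first dog" });
        scope.AddUnnamed(new Animal { Name = "plain animal" });
        scope.AddUnnamed(new Dog { Name = "second dog" });

        // Prints "plain animal", then "first dog", then "second dog".
        foreach (var animal in scope.GetInstances<Animal>())
        {
            Console.WriteLine(animal.Name);
        }
    }
}
```

The object registered with exactly the requested type is returned first, followed by the remaining conforming objects in insertion order, which is the FIFO behavior described in the footnote above.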

Profiling: that critical 3% (Part I)

2016-01-13T07:05:23+01:00

Published by marco on 13. Jan 2016 07:05:23 ------------------------------------------------------------------------ An oft-quoted bit of software-development sagacity is "Premature optimization is the root of all evil." As is so often the case with quotes -- especially those on the Internet [1] -- this one has a slightly different meaning in context. The snippet above invites developers to overlook the word "premature" and interpret the received wisdom as "you don't ever need to optimize." Instead, Knuth's full quote actually tells you how much of your code is likely to be affected by performance issues that matter (highlighted below). "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." [An Optimization Opportunity in Quino [2]] In other articles, I'd mentioned that we'd upgraded several solutions to Quino 2 in order to test that the API was solid enough for a more general release. One of these products is both quite large and has a test suite of almost 1500 tests. The product involves a lot of data-import and manipulation and the tests include several scenarios where Quino is used very intensively to load, process and save data. These tests used to run in a certain amount of time, but started taking about 25% longer after the upgrade to Quino 2. [Measuring Execution Speed] Before doing anything else -- making educated guesses as to what the problem could be, for example -- we measure. At Encodo, we use JetBrains DotTrace to collect performance profiles. There is no hidden secret: the standard procedure is to take a measurement before and after the change and to compare them. 
However, so much had changed from Quino 1.13 to Quino 2 -- e.g. namespaces and type names had changed -- that while DotTrace was able to show some matches, the comparisons were not as useful as usual. A comparison between codebases that hadn't changed so much is much easier, but I didn't have that luxury. [Tracking the Problem] Even excluding the less-than-optimal comparison, it was an odd profile. Ordinarily, one or two issues stick out right away, but the slowness seemed to suffuse the entire test run. Since the direct profiling comparison was difficult, I downloaded test-speed measurements as CSV from TeamCity for the product where we noticed the issue. How much slower, you might ask? The test that I looked at most closely took almost 4 minutes (236,187ms) in the stable version, but took 5:41 in the latest build. [image] This test was definitely one of the largest and longest tests, so it was particularly impacted. Most other tests that imported and manipulated data ranged anywhere from 10% to 30% slower. When I looked for hot-spots, the profile unsurprisingly showed me that database access took up the most time. The issue was more subtle: while database-access still used the most time, it was using a smaller percentage of the total time. Hot-spot analysis wasn't going to help this time. Sorting by absolute times and using call counts in the tracing profiles yielded better clues. The tests were slower when saving and also when loading data. But I knew that the ORM code itself had barely changed at all. And, since the product was using Quino so heavily, the stack traces ran quite deep. After a lot of digging, I noticed that creating the ExpressionContext to hold an object while evaluating expressions locally seemed to be taking longer than before. This was my first, real clue. Once I was on the trail, I found that when evaluating calls (getting objects) that used local evaluation, it was also always slower. 
[Don't Get Distracted] [image]Once you start looking for places where performance is not optimal, you're likely to start seeing them everywhere. However, as noted above, 97% of them are harmless. To be clear, we're not optimizing because we feel that the framework is too slow but because we've determined that the framework is now slower than it used to be and we don't know why. Even after we've finished restoring the previous performance (or maybe even making it a little better), we might still be able to easily optimize further, based on other information that we gleaned during our investigation. But we want to make sure that we don't get distracted and start trying to FIX ALL THE THINGS instead of just focusing on one task at a time. While it's somewhat disturbing that we seem to be creating 20 million ExpressionContext objects in a 4-minute test, that is also how we've always done it, and no-one has complained about the speed up until now. Sure, if we could reduce that number to only 2 million, we might be even faster [3], but the point is that we used to be faster on the exact same number of calls -- so fix that first. [A Likely Culprit: Scope] I found a likely candidate in the Scope class, which implements the IScope interface. This type is used throughout Quino, but the two use-cases that affect performance are: 1. As a base for the ExpressionContext, which holds the named values and objects to be used when evaluating the value of an IExpression. These expressions are used everywhere in the data driver. 2. As a base for the poor-man's IOC used in Stage 2 of application execution. [4] The former usage has existed unchanged for years; its implementation is unlikely to be the cause of the slowdown. The latter usage is new and I recall having made a change to the semantics of which objects are returned by the Scope in order to make it work there as well. [How could this happen?] You may already be thinking: smooth move, moron. 
You changed the behavior of a class that is used everywhere for a tacked-on use case. That's definitely a valid accusation to make. In my defense, my instinct is to reuse code wherever possible. If I already have a class that holds a list of objects and gives me back the object that matches a requested type, then I will use that. If I discover that the object that I get back isn't as predictable as I'd like, then I improve the predictability of the API until I've got what I want. If the improvement comes at no extra cost, then it's a win-win situation. However, this time I paid for the extra functionality with degraded performance. Where I really went wrong was that I'd made two assumptions: 1. I assumed that all other usages were also interested in improved predictability. 2. I assumed that all other usages were not performance-critical. When I wrote the code you'll see below, I distinctly remember thinking: it's not fast, but it'll do and I'll make it faster if it becomes a problem. Little did I know how difficult it would be to find the problem. [Preventing future slippage] "Avoid changing a type shared by different systems without considering all stakeholder requirements." I think a few words on process here are important. Can we improve the development process so that this doesn't happen again? One obvious answer would be to avoid changing a type shared by different systems without considering all stakeholder requirements. That's a pretty tall order, though. Including this in the process will most likely lead to less refactoring and improvement out of fear of breaking something. We discussed above how completely reasonable assumptions and design decisions led to the performance degradation. So we can't be sure it won't happen again. What we would like, though, is to be notified quickly when there is performance degradation, so that it appears as a test failure. "Notify quickly when there is performance degradation" Our requirements are captured by tests. 
If all of the tests pass, then the requirements are satisfied. Performance is a non-functional requirement. Where we could improve Quino is to include high-level performance tests that would sound the alarm the next time something like this happens. [5] Enough theory: in "part II" , we'll describe the problem in detail and take a crack at improving the speed. See you there. -------------------------------------------------------------------------------- [1] In fairness, the quote is at least properly attributed. It really was Donald Knuth who wrote it. [1] By "opportunity", of course, I mean that I messed something up that made Quino slower in the new version. [1] See the article "Quino 2: Starting up an application, in detail" for more information on this usage. [1] I'm working on this right now, in issue "Add standard performance tests" for release 2.1.
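
As a rough illustration of what such a high-level performance test could look like, here is a minimal sketch. It is not the test suite planned for Quino 2.1: PerformanceGuard, Measure() and AssertWithinBudget() are hypothetical names, and the iteration count and time budget are placeholder values that would have to be calibrated against the build server.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; the names and the budget values are assumptions for illustration.
public static class PerformanceGuard
{
    // Runs the action once as a warm-up, then times the given number of iterations.
    public static TimeSpan Measure(Action action, int iterations)
    {
        if (action == null) { throw new ArgumentNullException(nameof(action)); }

        action(); // warm-up, so that JIT compilation is not part of the measurement

        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();

        return stopwatch.Elapsed;
    }

    // Throws if the measured time exceeds the budget, so a regression shows up as a test failure.
    public static void AssertWithinBudget(string name, Action action, int iterations, TimeSpan budget)
    {
        var elapsed = Measure(action, iterations);
        if (elapsed > budget)
        {
            throw new InvalidOperationException(string.Format(
                "{0} took {1} ms for {2} iterations; the budget is {3} ms.",
                name, elapsed.TotalMilliseconds, iterations, budget.TotalMilliseconds));
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var values = new object[] { "a", null, "b" };

        PerformanceGuard.AssertWithinBudget(
            "count non-null values",
            () =>
            {
                var count = 0;
                for (var i = 0; i < values.Length; i++)
                {
                    if (values[i] != null) { count++; }
                }
            },
            100000,
            TimeSpan.FromSeconds(5)); // placeholder budget

        Console.WriteLine("within budget");
    }
}
```

A fixed budget is the simplest possible choice; it has to be generous enough to tolerate a busy build agent, which also makes it blind to small regressions. Comparing against recorded timings from earlier builds would be more sensitive, at the cost of more infrastructure.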

Quino v2.1: API-smoothing and performance

2016-01-01T22:52:49+01:00

Published by marco on 1. Jan 2016 22:52:49 ------------------------------------------------------------------------ The summary below describes major new features, items of note and breaking changes. The "full list of issues" is also available for those with access to the Encodo issue tracker. [Highlights] Quino 2 is finally ready and will go out the door with a 2.1 rather than a 2.0 version number. The reason being that we released 2.0 internally and tested the hell out of it. 2.1 is the result of that testing. It includes a lot of bug fixes as well as API tweaks to make things easier for developers. * Made more improvements to the Startup/Configuration API for products ("QNO-4860" , "QNO-4991" , "QNO-4990" , "QNO-4911" ) * Further refined the independent and properly decoupled assemblies ("QNO-5001" , "QNO-4609" , "QNO-2514" , "QNO-4980" , "QNO-4974" , "QNO-4865" ) * Improved support for continuous integration and deployment of Quino packages and symbols via TeamCity ("QNO-5003" , "QNO-3738" , "QNO-4998" , "QNO-4756" , "QNO-4995" ) * Improved the Web integration with ASP.NET WebAPI and MVC authentication filters ("QNO-4711" ) * Addressed some performance regressions from 1.13 and added a suite of performance tests to keep better track of performance targets. On top of that, I've gone through the backlog and found many issues that had either been fixed already, were obsolete or had been inadequately specified. The Quino backlog dropped from 682 to 542 issues. 
* 18 issues marked as won't fix and 46 issues marked as obsolete * Stop supporting Glimpse, although there is a Quino.Web.Glimpse package to use the support we do have ("QNO-4560" ) * Stop supporting HtmlHelpers and other client-side rendering ("QNO-3921" , "QNO-3995" , "QNO-3804" , "QNO-3797" , "QNO-3974" , "QNO-4001" , "QNO-3992" , "QNO-3991" , "QNO-3973" , "QNO-3970" , "QNO-3969" , "QNO-3918" , "QNO-3866" , "QNO-3865" , "QNO-3857" , "QNO-3849" , "QNO-3848" , "QNO-3842" , "QNO-3839" , "QNO-3837" , "QNO-3836" , "QNO-3834" , "QNO-3833" , "QNO-3831" , "QNO-3824" w/sub-tasks, "QNO-3806" , "QNO-3805" , "QNO-3802" , "QNO-2288" ) * 12 issues marked as incomplete (not sufficiently specified) * 2 issues marked as cannot reproduce * 97 issues marked as fixed [Breaking changes] The following changes are marked with Obsolete attributes, so you'll get a hint as to how to fix the problem. Since these are changes from an unreleased version of Quino, they cause a compile error. * UseMetaSchemaWinformDxFeedback() has been renamed to UseMetaschemaWinformDx() * UseSchemaMigrationSupport() has been renamed to UseIntegratedSchemaMigration() * MetaHttpApplicationBase.MetaApplication has been renamed to BaseApplication * The IServer.Run() extension method is no longer supported. * GetStandardFilters(), GetStandardFiltersForFormsAuthentication() and GetStandardFiltersForUnrestrictedAuthentication() are no longer supported. Instead, you should register filters in the IOC and use the IWebFilterAttributeFactory.CreateFilters() to get the list of supported filters * The ToolRequirementAttribute is no longer supported or used. * AssemblyExtensions.GetLoadableTypesWithInterface() is no longer supported * AssemblyTools.GetValidAssembly() has been replaced with AssemblyTools.GetApplicationAssembly(); GetExecutableName() and GetExecutablePath() have been removed. * All of the constant expressions on the MetaBuilderBase (e.g. EndOfTimeExpression) are obsolete. 
Instead, use MetaBuilderBase.ExpressionFactory.Constants.EndOfTime. * All of the global values on MetaObjectDescriptionExtensions are obsolete; instead, use the IMetaObjectFormatterSettings from the IOC to change settings on startup. * Similarly, the set of extension methods that included GetShortDescription() has been moved to the IMetaObjectFormatter. Obtain an instance from the IOC, as usual.
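
For readers unfamiliar with the mechanism, the following sketch shows how an Obsolete attribute with its error flag set turns a call to a renamed method into a compile error that carries a hint. Only the two method names come from the list above; SketchApplication, SketchStartupExtensions and the signatures are hypothetical and do not reflect the actual Quino declarations.

```csharp
using System;

// Hypothetical application type; stands in for whatever the real methods extend.
public sealed class SketchApplication
{
    public bool WinformDxConfigured { get; set; }
}

public static class SketchStartupExtensions
{
    // The new name, as listed in the release notes.
    public static SketchApplication UseMetaschemaWinformDx(this SketchApplication application)
    {
        application.WinformDxConfigured = true;
        return application;
    }

    // The old name. Passing true as the second argument makes any call a
    // compile error (CS0619) whose message contains the hint below.
    [Obsolete("Use UseMetaschemaWinformDx() instead.", true)]
    public static SketchApplication UseMetaSchemaWinformDxFeedback(this SketchApplication application)
    {
        return application.UseMetaschemaWinformDx();
    }
}

public static class Program
{
    public static void Main()
    {
        var application = new SketchApplication().UseMetaschemaWinformDx();
        Console.WriteLine(application.WinformDxConfigured); // prints "True"

        // The following line would not compile:
        // new SketchApplication().UseMetaSchemaWinformDxFeedback();
    }
}
```

Without the second argument (or with false), the same attribute produces only a warning, which is the gentler option for changes to an already released API.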

Quino v2.0: Logging, Dependencies, New Assemblies & Nuget

2015-12-28T10:40:24+01:00

Published by marco on 28. Dec 2015 10:40:24 ------------------------------------------------------------------------ The summary below describes major new features, items of note and breaking changes. The "full list of issues" is also available for those with access to the Encodo issue tracker. [Highlights] In the "beta1" and "beta2" release notes, we read about changes to configuration, dependency reduction, the data driver architecture, DDL commands, security and access control in web applications and a new code-generation format. In 2.0 final -- which was actually released internally on November 13th, 2015 (a Friday) -- we made the following additional improvements: * Moved the metadata table maintained for the schema-migrator to a proper Quino module. ("QNO-4741" ) * Rebuilt the logging and messaging API and drastically simplified the implementation throughout ("QNO-4688" w/sub-tasks, "QNO-4954" ) * Split Encodo and Quino into dozens of new, independent and properly decoupled assemblies ("QNO-4678" , "QNO-4672" , "QNO-4670" , "QNO-4376" , "QNO-4920" , "QNO-4926" ) * Rebuilt the configuration and application-startup API ("QNO-4855" , "QNO-4051" , "QNO-4895" , "QNO-4931" , "QNO-4930" , "QNO-4949" , "QNO-4659" w/sub-tasks, "QNO-4950" , "QNO-4857" , "QNO-4910" , "QNO-4934" , "QNO-4898" , "QNO-4935" , "QNO-4937" ) * Changed delivery and deployment for Quino and all products to Nuget packages ("QNO-4916" ) * Added scripting and support for continuous integration and deployment of Quino packages and symbols via TeamCity ("QNO-4871" , "QNO-4932" , "QNO-3437" , "QNO-4433" , "QNO-4494" , "QNO-4871" ) * Restructured and refactored the standard testing base-classes and testing support for Quino products ("QNO-4963" ) * Improved and fixed code-generation for both v1 and v2 formats ("QNO-4804" , "QNO-4828" , "QNO-4897" ) * Refactored the application server API to improve decoupling and extensibility ("QNO-4927" ) These notes are being published for completeness and 
documentation. The first publicly available release of Quino 2.x will be 2.1 or higher (release notes coming soon). [Breaking changes] [image]As we've mentioned before, this release is absolutely merciless in regard to backwards compatibility. Old code is not retained as Obsolete. Instead, a project upgrading to 2.0 will encounter compile errors. The following notes serve as an incomplete guide that will help you upgrade a Quino-based product. As I wrote in the release notes for "beta1" and "beta2" , if you arm yourself with a bit of time, ReSharper and the release notes (and possibly keep an Encodo employee on speed-dial), the upgrade is not difficult. It consists mainly of letting ReSharper update namespace references for you. [Global Search/Replace] Instead of going through the errors (example shown to the right) one by one, you can take care of a lot of errors with the following search/replace pairs. * Encodo.Quino.Data.Persistence => Encodo.Quino.Data * IMetaApplication => IApplication * ICoreApplication => IApplication * GetServiceLocator() => GetServices() * MetaMethodTools.GetInstance => DataMetaMethodExtensions.GetInstance * application.ServiceLocator.GetInstance => application.GetInstance * Application.ServiceLocator.GetInstance => Application.GetInstance * application.ServiceLocator => application.GetServices() * Application.ServiceLocator => Application.GetServices() * application.Recorder => application.GetLogger() * Application.Recorder => Application.GetLogger() * session.GetRecorder() => session.GetLogger() * Session.GetRecorder() => Session.GetLogger() * Session.Application.Recorder => Session.GetLogger() * FileTools.Canonicalize() => PathTools.Normalize() * application.Messages => application.GetMessageList() * Application.Messages => Application.GetMessageList() * ServiceLocator.GetInstance => Application.GetInstance * MetaLayoutTools => LayoutConstants * GlobalContext.Instance.Application.Configuration.Model => 
GlobalContext.Instance.Application.GetModel() * IMessageRecorder => ILogger * GetUseReleaseSettings() => IsInReleaseMode() * ReportToolsDX => ReportDxExtensions Although you can't just search/replace everything, it gets you a long way. [Model-Building Fixes] These replacement pairs, while not recommended for global search/replace, are a handy guide for how the API has generally changed. * *Generator => *Builder * SetUpForModule => CreateModule * Builder.SetElementVisibility(prop, true) => prop.Show() * Builder.SetElementVisibility(prop, false) => prop.Hide() * Builder.SetElementControlIdentifier(prop, ControlIdentifiers => prop.SetInputControl(ControlIdentifiers * Builder.SetPropertyHeightInPixels(prop, 200); => prop.SetHeightInPixels(200); Constructing a module has also changed. Instead of using the following syntax, var module = Builder.SetUpForModule(Name, "ApexClearing.Alps.Core", Name, true); Replace it with the following direct replacement, var module = Builder.CreateModule(Name, "ApexClearing.Alps.Core", Name); Or use this replacement, with the recommended style for the v2 format (no more class prefix for generated classes and a standard namespace): var module = Builder.CreateModule(Name, typeof(AuditModuleBuilder).GetParentNamespace()); [Standard Modules (e.g. Reporting, Security, etc.)] Because of how the module class-names have changed, the standard module ORM classes all have different names. The formula is that the ORM class-name is no longer prepended with its module name. * ReportsReportDefinition => ReportDefinition * SecurityUser => User * And so on... Furthermore, all modules have been converted to use the v2 code-generation format, which has the metadata separate from the ORM object. Therefore, instead of referencing metadata using the ORM class-name as the base, you use the module name as the base. 
* ReportReportDefinition.Fields.Name => ReportModule.ReportDefinition.Name.Identifier * ReportReportDefinition.MetaProperties.Name => ReportModule.ReportDefinition.Name * ReportReportDefinition.Metadata => ReportModule.ReportDefinition.Metadata * And so on... There's an upcoming article that will show more examples of the improved flexibility and capabilities that come with the v2-metadata. [Action names] The standard action names have moved as well. * ActionNames => ApplicationActionNames * MetaActionNames => MetaApplicationActionNames Any other, more rarely used action names have been moved back to the actions themselves, so for example SaveApplicationSettingsAction.ActionName. If you created any actions of your own, then the API there has changed as well. As previously documented in "API Design: To Generic or not Generic? (Part II)" , instead of overriding the following method, protected override int DoExecute(IApplication application, ConfigurationOptions options, int currentResult) { return base.DoExecute(application, options, currentResult); } you instead override in the following way, public override void Execute() { base.Execute(); } [Using NuGet] If you're already using Visual Studio 2015, then the NuGet UI is a good choice for managing packages. If you're still on Visual Studio 2013, then the UI there is pretty flaky and we recommend using the console. The examples below assume that you have configured a source called "Local Quino" (e.g. a local folder that holds the nupkg files for Quino). install-package Quino.Data.PostgreSql.Testing -ProjectName Punchclock.Core.Tests -Source "Local Quino" install-package Quino.Server -ProjectName Punchclock.Server -Source "Local Quino" install-package Quino.Console -ProjectName Punchclock.Server -Source "Local Quino" install-package Quino.Web -ProjectName Punchclock.Web.API -Source "Local Quino" [Debugging Support] We recommend using Visual Studio 2015 if at all possible. 
Visual Studio 2013 is also supported, but we have all migrated to 2015 and our knowhow about 2013 and its debugging idiosyncrasies will deteriorate with time. These are just brief points of interest to get you set up. As with the NuGet support, these instructions are subject to change as we gain more experience with debugging with packages as well. * Hook up to a working symbol-source server (e.g. TeamCity) * Get the local sources for your version * If you don't have a source server or it's flaky, then get the PDBs for the Quino version you're using (provided in Quino.zip as part of the package release) * Add the path to the PDBs to your list of symbol sources in the VS debugging options * Tell Visual Studio where the sources are when it asks during debugging * Tell R# how to map from the source folder (c:\BuildAgent\work\9a1bb0adebb73b1f for Quino 2.0.0-1765) to the location of your sources Quino packages are no different than any other NuGet packages. We provide both standard packages as well as packages with symbols and sources. Any complications you encounter with them are due to the whole NuGet experience still being a bit in-flux in the .NET world. An upcoming post will provide more detail and examples. [Creating Nuget Packages] We generally use our continuous integration server to create packages, but you can also create packages locally (it's up to you to make sure the version number makes sense, so be careful). These instructions are approximate and are subject to change. I provide them here to give you an idea of how packages are created. If they don't work, please contact Encodo for help. * Open PowerShell * Change to the %QUINO_ROOT%\src directory * Run nant build pack to build Quino and packages * Set up a local NuGet Source named "Local Quino" to %QUINO_ROOT%\nuget (one-time only) * Change to the directory where your Quino packages are installed for your solution. 
* Delete all of the Encodo/Quino packages * Execute nant nuget from your project directory to get the latest Quino build from your local folder

Improving NUnit integration with testing harnesses

2015-12-06T11:57:57+01:00

Published by marco on 6. Dec 2015 11:57:57 ------------------------------------------------------------------------ [image]These days nobody who's anybody in the software-development world is writing software without tests. Just writing them doesn't help make the software better, though. You also need to be able to execute tests -- reliably and quickly and repeatably. That said, you'll have to get yourself a test runner, which is a different tool from the compiler or the runtime. That is, just because your tests compile (satisfy all of the language rules) and could be executed doesn't mean that you're done writing them yet. [Testing framework requirements] Every testing framework has its own rules for how the test runner selects methods for execution as tests. The standard configuration options are: * Which classes should be considered as test fixtures? * Which methods are considered tests? * Where do parameters for these methods come from? * Is there startup/teardown code to execute for the test or fixture? Each testing framework will offer different ways of configuring your code so that the test runner can find and execute setup/test/teardown code. To write NUnit tests, you decorate classes, methods and parameters with C# attributes. The standard scenario is relatively easy to execute -- run all methods with a Test attribute in a class with a TestFixture attribute on it. [Test-runner Requirements] "There are legitimate questions for which even the best specification does not provide answers." When you consider multiple base classes and generic type arguments, each of which may also have NUnit attributes, things get a bit less clear. In that case, not only do you have to know what NUnit offers as possibilities but also whether the test runner that you're using also understands and implements the NUnit specification in the same way. Not only that, but there are legitimate questions for which even the best specification does not provide answers. 
At Encodo, we use Visual Studio 2015 with ReSharper 9.2 and we use the ReSharper test runner. We're still looking into using the built-in VS test runner -- the continuous-testing integration in the editor is intriguing [1] -- but it's quite weak when compared to the ReSharper one.

So, not only do we have to consider what the NUnit documentation says is possible, but we must also know how the R# test runner interprets the NUnit attributes and what is supported.

[Getting More Complicated]

Where is there room for misunderstanding? A few examples:

* What if there's a TestFixture attribute on an abstract class?
* How about a TestFixture attribute on a class with generic parameters?
* Ok, how about a non-abstract class with Tests but no TestFixture attribute?
* And, finally, a non-abstract class with Tests but no TestFixture attribute, but there are non-abstract descendants that do have a TestFixture attribute?

In our case, the answer to these questions depends on which version of R# you're using. Even though it feels like you configured everything correctly and it logically should work, the test runner sometimes disagrees.

* Sometimes it shows your tests as expected, but refuses to run them (Inconclusive FTW!)
* Or other times, it obstinately includes generic base classes that cannot be instantiated into the session, then complains that you didn't execute them. When you try to delete them, it brings them right back on the next build. When you try to run them -- perhaps not noticing that it's those damned base classes -- then it complains that it can't instantiate them.

Look of disapproval.

Throw the TeamCity test runner into the mix -- which is ostensibly the same as that from R# but still subtly different -- and you'll have even more fun.

[Improving Integration with the R# Test Runner]

At any rate, now that you know the general issue, I'd like to share the ground rules we've come up with that avoid all of the issues described above.
The text below comes from the "issue" I created for the impending release of Quino 2.

[Environment]

* Windows 8.1 Enterprise
* Visual Studio 2015
* ReSharper 9.2

[Expected behavior]

Non-leaf-node base classes should never appear as nodes in test runners. A user should be able to run tests in descendants directly from a fixture or test in the base class.

[Observed behavior]

Non-leaf-node base classes are shown in the R# test runner in both versions 9 and 10. A user must navigate to the descendant to run a test. The user can no longer run all descendants or a single descendant directly from the test.

[Analysis]

Relatively recently, in order to better test a misbehaving test runner and accurately report issues to JetBrains, I standardized all tests to the same pattern:

* Do not use abstract anywhere (the base classes don't technically need it)
* Use the TestFixture attribute only on leaf nodes

This worked just fine with ReSharper 8.x but causes strange behavior in both R# 9.x and 10.x. We discovered recently that not only did the test runner act strangely (something that they might fix), but also that the unit-testing integration in the files themselves behaved differently when the base class is abstract (something JetBrains is unlikely to fix).

You can see that R# treats a non-abstract class with tests as a testable entity, even when it doesn't actually have a TestFixture attribute and even when it expects a generic type parameter in order to be instantiated.

Here it's not working well in either the source file or the test runner. In the source file, you can see that it offers to run tests in a category, but not the tests from actual descendants. If you try to run or debug anything from this menu, it shows the fixture with a question-mark icon and marks any tests it manages to display as inconclusive. This is not surprising, since the test fixture may not be abstract, but does require a type parameter in order to be instantiated.
[image]

Here it looks and acts correctly:

[image]

I've reported this issue to JetBrains, but our testing structure either isn't very common or it hasn't made it to their core test cases, because neither 9 nor 10 handles them as well as the 8.x runner did.

Now that we're also using TeamCity a lot more to not only execute tests but also to collect coverage results, we'll capitulate and just change our patterns to whatever makes R#/TeamCity the happiest.

[Solution]

* Make all testing base classes that include at least one Test or Category attribute abstract. Base classes that do not have any testing attributes do not need to be made abstract.

Once more to recap our ground rules for making tests:

* Include TestFixture only on leaves (classes with no descendants)
* You can put Category or Test attributes anywhere in the hierarchy, but then you need to declare the class as abstract.
* Base classes that have no testing attributes do not need to be abstract
* If you feel you need to execute tests in both a base class and one of its descendants, then you're probably doing something wrong. Make two descendants of the base class instead.

When you make the change, you can see the improvement immediately.

[image]

--------------------------------------------------------------------------------
[1] ReSharper 10.0 also offers continuous testing, but our experiments with the EAP builds and the first RTM build left us underwhelmed and we downgraded to 9.2 until JetBrains manages to release a stable 10.x.
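The ground rules recapped in this article can be illustrated with a small hierarchy. This is only a sketch: the class names (InMemoryStore, StoreTestsBase, InMemoryStoreTests) and the "Storage" category are invented for illustration and do not come from Quino. It assumes a reference to the NUnit framework and uses only the TestFixture, Test and Category attributes.

```csharp
using NUnit.Framework;

// Hypothetical type, used only to give the generic base class something to create.
public class InMemoryStore
{
    public int Count => 0;
}

// Rule: a base class that carries Test or Category attributes is abstract
// and has no TestFixture attribute, so a runner never tries to instantiate it.
[Category("Storage")]
public abstract class StoreTestsBase<TStore>
    where TStore : new()
{
    [Test]
    public void NewStoreCanBeCreated()
    {
        Assert.That(new TStore(), Is.Not.Null);
    }
}

// Rule: TestFixture appears only on leaves (classes with no descendants).
[TestFixture]
public class InMemoryStoreTests : StoreTestsBase<InMemoryStore>
{
    [Test]
    public void NewStoreIsEmpty()
    {
        Assert.That(new InMemoryStore().Count, Is.EqualTo(0));
    }
}
```

To cover a second store type under the same rules, you would add another leaf class with its own TestFixture attribute rather than making the base class concrete.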

Quino 2: Starting up an application, in detail

2015-11-28T13:58:45+01:00

Published by marco on 28. Nov 2015 13:58:45
Updated by marco on 10. Nov 2025 10:54:46
------------------------------------------------------------------------

As part of the final release process for Quino 2, we've upgraded 5 solutions [1] from Quino 1.13 to the latest API in order to shake out any remaining API inconsistencies or even just inelegant or clumsy calls or constructs. A lot of questions came up during these conversions, so I wrote the following blog to provide detail on the exact workings and execution order of a Quino application.

I've discussed the design of Quino's configuration before, most recently in "API Design: Running an Application (Part I)" and "API Design: To Generic or not Generic? (Part II)" as well as the three-part series that starts with "Encodo’s configuration library for Quino: part I".

[Quino Execution Stages]

The life-cycle of a Quino 2.0 application breaks down into roughly the following stages:

1. Build Application: Register services with the IOC, add objects needed during configuration and add actions to the startup and shutdown lists
2. Load User Configuration: Use non-IOC objects to bootstrap configuration from the command line and configuration files; IOC is initialized and can no longer be modified after action ServicesInitialized
3. Apply Application Configuration: Apply code-based configuration to IOC objects; ends with the ServicesConfigured action
4. Execute: execute the loop, event-handler, etc.
5. Shut Down: dispose of the application, shutting down services in the IOC, setting the exit code, etc.

[Stage 1]

The first stage is all about putting the application together with calls to Use various services and features. This stage is covered in detail in three parts, starting with "Encodo’s configuration library for Quino: part I".

[Stage 2]

Let's tackle this one last because it requires a bit more explanation.
[Stage 3]

Technically, an application can add code to this stage by adding an IApplicationAction before the ServicesConfigured action. Use the Configure() extension method in stage 1 to configure individual services, as shown below.

application.Configure( s => s.Behavior = FileLogBehavior.MultipleFiles );

[Stage 4]

The execution stage is application-specific. This stage can be short or long, depending on what your application does.

For desktop applications or single-user utilities, stage 4 is executed in application code, as shown below, in the Run method, which is called by the ApplicationManager after the application has started.

var transcript = new ApplicationManager().Run(CreateApplication, Run);

IApplication CreateApplication() { ... }
void Run(IApplication application) { ... }

If your application is a service, like a daemon or a web server or whatever, then you'll want to execute stages 1--3 and then let the framework send requests to your application's running services. When the framework sends the termination signal, execute stage 5 by disposing of the application. Instead of calling Run, you'll call CreateAndStartUp.

var application = new ApplicationManager().CreateAndStartUp(CreateApplication);

IApplication CreateApplication() { ... }

[Stage 5]

Every application has certain tasks to execute during shutdown. For example, an application will want to close down any open connections to external resources, close files (especially log files) and perhaps inform the user of shutdown.

Instead of exposing a specific "shutdown" method, a Quino 2.0 application can simply be disposed to shut it down.

If you use ApplicationManager.Run() as shown above, then you're already sorted -- the application will be disposed and the user will be informed in case of catastrophic failure; otherwise, you can shut down and get the final application transcript from the disposed object.
application.Dispose();
var transcript = application.GetTranscript();
// Do something with the transcript...

[Stage 2 Redux]

We're finally ready to discuss stage 2 in detail.

An IOC has two phases: in the first phase, the application registers services with the IOC; in the second phase, the application uses services from the IOC.

An application should use the IOC as much as possible, so Quino keeps stage 2 as short as possible. Because it can't use the IOC during the registration phase, code that runs in this stage shares objects via a poor-man's IOC built into the IApplication that allows modification and only supports singletons. Luckily, very little end-developer application code will ever need to run in this stage. It's nevertheless interesting to know how it works.

Obviously, any code in this stage that uses the IOC will cause it to switch from phase one to phase two and subsequent attempts to register services will fail. Therefore, while application code in stage 2 has to be careful, you don't have to worry about not knowing you've screwed up.

Why would we have this stage? Some advocates of using an IOC claim that everything should be configured in code. However, it's not uncommon for applications to want to run very differently based on command-line or other configuration parameters. The Quino startup handles this by placing the following actions in stage 2:

* Parse and apply command-line
* Import and apply external configuration (e.g. from file)

An application is free to insert more actions before the ServicesInitialized action, but they have to play by the rules outlined above.

["Single" objects]

Code in stage 2 shares objects by calling SetSingle() and GetSingle(). There are only a few objects that fall into this category.

The calls UseCore() and UseApplication() register most of the standard objects used in stage 2.
Actually, while they're mostly used during stage 2, some of them are also added to the poor man's IOC in case of catastrophic failure, in which case the IOC cannot be assumed to be available. A good example is the IApplicationCrashReporter.

[Executing Stages]

Before listing all of the objects, let's take a rough look at how a standard application is started. The following steps outline what we consider to be a good minimum level of support for any application. Of course, the Quino configuration is modular, so you can take as much or as little as you like, but while you can use a naked Application -- which has absolutely nothing registered -- and you can call UseCore() to have a bit more -- it registers a handful of low-level services but no actions -- we recommend calling at least UseApplication() to add most of the functionality outlined below.

1. Create application: This involves creating the IOC and most of the IOC registration as well as adding most of the application startup actions (stage 1)
2. Set debug mode: Get the final value of RunMode from the IRunSettings to determine if the application should catch all exceptions or let them go to the debugger. This involves getting the IRunSettings from the application and getting the final value using the IApplicationManagerPreRunFinalizer. This is commonly an implementation that allows setting the value of RunMode from the command-line in debug builds. This further depends on the ICommandSetManager (which depends on the IValueTools) and possibly the ICommandLineSettings (to set the CommandLineConfigurationFilename if it was set by the user).
3. Process command line: Set the ICommandProcessingResult, possibly setting other values and adding other configuration steps to the list of startup actions (e.g. many command-line options are switches that are handled by calling Configure<TSettings>() where TSettings is the configuration object in the IOC to modify).
4. Read configuration file: Load the configuration data into the IConfigurationDataSettings, involving the ILocationManager to find configuration files and the ITextValueNodeReader to read them.
5. The ILogger is used throughout by various actions to log application behavior
6. If there is an unhandled error, the IApplicationCrashReporter uses the IFeedback or the ILogger to notify the user and log the error
7. The IInMemoryLogger is used to include all in-memory messages in the IApplicationTranscript

The next section provides detail on each of the individual objects referenced in the workflow above.

[Available Objects]

You can get any one of these objects from the IApplication in at least two ways: by using GetSingle() (safe in all situations) or GetInstance() (safe only in stage 3 or later). There's also almost always a method whose name starts with "Get" and ends in the service name.

The example below shows how to get the ICommandSetManager [2] if you need it.

application.GetCommandSetManager();
application.GetSingle<ICommandSetManager>(); // Prefer the one above
application.GetInstance<ICommandSetManager>();

All three calls return the exact same object, though. The first two from the poor-man's IOC; the last from the real IOC.

Only applications that need access to low-level objects or need to mess around in stage 2 need to know which objects are available where and when. Most applications don't care and will just always use GetInstance().

The objects in the poor-man's IOC are listed below.
[Core]

* IValueTools: converts values; used by the command-line parser, mostly to translate enumeration values and flags
* ILocationManager: an object that manages aliases for file-system locations, like "Configuration", from which configuration files should be loaded or "UserConfiguration" where user-specific overlay configuration files are stored; used by the configuration loader
* ILogger: a reference to the main logger for the application
* IInMemoryLogger: a reference to an in-memory message store for the logger (used by the ApplicationManager to retrieve the message log from a crashed application)
* IMessageFormatter: a reference to the object that formats messages for the logger

[Command line]

* ICommandSetManager: sets the schema for a command line; used by the command-line parser
* ICommandProcessingResult: contains the result of having processed the command line
* ICommandLineSettings: defines the properties needed to process the command line (e.g. the Arguments and CommandLineConfigurationFilename, which indicates the optional filename to use for configuration in addition to the standard ones)

[Configuration]

* IConfigurationDataSettings: defines the ConfigurationData which is the hierarchical representation of all configuration data for the application as well as the MainConfigurationFilename from which this data is read; used by the configuration-loader
* ITextValueNodeReader: the object that knows how to read ConfigurationData from the file formats supported by the application [3]; used by the configuration-loader

[Run]

* IRunSettings: an object that manages the RunMode ("release" or "debug"), which can be set from the command line and is used by the ApplicationManager to determine whether to use global exception-handling
* IApplicationManagerPreRunFinalizer: a reference to an object that applies any options from the command line before the decision of whether to execute in release or debug mode is taken.
* IApplicationCrashReporter: used by the ApplicationManager in the code surrounding the entire application execution and therefore not guaranteed to have a usable IOC available
* IApplicationDescription: used together with the ILocationManager to set application-specific aliases to user-configuration folders (e.g. AppData\{CompanyTitle}\{ApplicationTitle})
* IApplicationTranscript: an object that records the last result of having run the application; returned by the ApplicationManager after Run() has completed, but also available through the application object returned by CreateAndStartUp() to indicate the state of the application after startup.

Each of these objects has a very compact interface and has a single responsibility. An application can easily replace any of these objects by calling UseSingle() during stage 1 or 2. This call sets the object in both the poor-man's IOC as well as the real one. For those rare cases where a non-IOC singleton needs to be set after the IOC has been finalized, the application can call SetSingle(), which does not touch the IOC. This feature is currently used only to set the IApplicationTranscript, which needs to happen even after the IOC registration is complete.

--------------------------------------------------------------------------------
[1] Two large customer solutions, two medium-sized internal solutions (Punchclock and JobVortex) as well as the Demo/Sandbox solution. These solutions include the gamut of application types:

* 3 ASP.NET MVC applications
* 2 ASP.NET WebAPI applications
* 2 Windows services
* 3 Winform/DevExpress applications
* 2 Winform/DevExpress utilities
* 4 Console applications and utilities

[2] I originally used ITextValueNodeReader as an example, but that's one case where the recommended call doesn't match 1-to-1 with the interface name.
application.GetSingle<ITextValueNodeReader>();
application.GetInstance<ITextValueNodeReader>();
application.GetConfigurationDataReader(); // Recommended

[3] Currently only XML, but "JSON" is on the way when someone gets a free afternoon.
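The difference between the "single" objects of stage 2 and the real IOC can be sketched in a few lines. This is not Quino's implementation: SingleRegistry and TwoPhaseContainer are hypothetical classes written for illustration, and only the method names SetSingle, GetSingle and GetInstance are borrowed from the article. The sketch assumes that the first resolution is what switches a container from the registration phase to the usage phase.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the "poor-man's IOC": singletons only, always modifiable.
public sealed class SingleRegistry
{
    private readonly Dictionary<Type, object> _singles = new Dictionary<Type, object>();

    public void SetSingle<T>(T instance) where T : class
    {
        _singles[typeof(T)] = instance ?? throw new ArgumentNullException(nameof(instance));
    }

    public T GetSingle<T>() where T : class
    {
        return _singles.TryGetValue(typeof(T), out var value)
            ? (T)value
            : throw new InvalidOperationException($"No single registered for {typeof(T).Name}.");
    }
}

// Hypothetical two-phase container: registration is allowed only until the first resolution.
public sealed class TwoPhaseContainer
{
    private readonly Dictionary<Type, Func<object>> _factories = new Dictionary<Type, Func<object>>();
    private bool _inUse;

    public void Register<T>(Func<T> factory) where T : class
    {
        if (_inUse)
        {
            // Failing loudly means that a stage-2 mistake cannot go unnoticed.
            throw new InvalidOperationException("Cannot register services after the first resolution.");
        }

        _factories[typeof(T)] = () => factory();
    }

    public T GetInstance<T>() where T : class
    {
        _inUse = true; // phase two starts with the first use
        return (T)_factories[typeof(T)]();
    }
}
```

In this sketch, a call to SetSingle() always succeeds, while a call to Register() after any GetInstance() throws, which mirrors the article's point that you don't have to worry about not knowing you've made a mistake in stage 2.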

IServer: converting hierarchy to composition

2015-11-23T22:31:29+01:00

Published by marco on 23. Nov 2015 22:31:29
------------------------------------------------------------------------

Quino has long included support for connecting to an application server instead of connecting directly to databases or other sources. The application server uses the same model as the client and provides modeled services (application-specific) as well as CRUD for non-modeled data interactions.

We wrote the first version of the server in 2008. Since then, it's acquired better authentication and authorization capabilities as well as routing and state-handling. We've always based it on the .NET HttpListener.

[Old and Busted]

As late as Quino 2.0-beta2 (which we had deployed in production environments already), the server hierarchy looked like the screenshot below, pulled from issue "QNO-4927":

[image]

This screenshot was captured after a few unneeded interfaces had already been removed. As you can see by the class names, we'd struggled heroically to deal with the complexity that arises when you use inheritance rather than composition. The state-handling was welded onto an authentication-enabled server, and the base machinery for supporting authentication was spread across three hierarchy layers.

The hierarchy only hints at composition in its naming: the "Stateful" part of the class name CoreStatefulHttpServerBase had already been moved to a state provider and a state creator in previous versions. That support is unchanged in the 2.0 version.

[Implementation Layers]

We mentioned above that implementation was "spread across three hierarchy layers". There's nothing wrong with that, in principle. In fact, it's a good idea to encapsulate higher-level patterns in a layer that doesn't introduce too many dependencies and to introduce dependencies in other layers.
This allows applications not only to be able to use a common implementation without pulling in unwanted dependencies, but also to profit from the common tests that ensure the components work as advertised.

In Quino, the following three layers are present in many components:

1. Abstract: a basic encapsulation of a pattern with almost no dependencies (generally just Encodo.Core).
2. Standard: a functional implementation of the abstract pattern with dependencies on non-metadata assemblies (e.g. Encodo.Application, Encodo.Connections and so on)
3. Quino: an enhancement of the standard implementation that makes use of metadata to fill in implementation left abstract in the previous layer. Dependencies can include any of the Quino framework assemblies (e.g. Quino.Meta, Quino.Application and so on).

[The New Hotness [1]]

The diagram below shows the new hotness in Quino 2. [2]

[image]

The hierarchy is now extremely flat. There is an IServer interface and a Server implementation, both generic in TListener, of type IServerListener. The server manages a single instance of an IServerListener. The listener, in turn, has an IHttpServerRequestHandler, the main implementation of which uses an IHttpServerAuthenticator.

As mentioned above, the IServerStateProvider is included in this diagram, but is unchanged from Quino 2.0-beta3, except that it is now used by the request handler rather than directly by the server.

You can see how the abstract layer is enhanced by an HTTP-specific layer (the Encodo.Server.Http namespace) and the metadata-specific layer is nicely encapsulated in three classes in the Quino.Server assembly.
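The shape of the flat hierarchy can be sketched with a few small types. This is a deliberately simplified illustration and not the Quino API: the interfaces IListener, IRequestHandler and IAuthenticator, their members and the string-based requests are all invented here; they only mirror the roles of the listener, request handler and authenticator described above.

```csharp
// Hypothetical roles; the member signatures are invented for this sketch.
public interface IAuthenticator
{
    bool IsAuthenticated(string user);
}

public interface IRequestHandler
{
    string Handle(string user, string path);
}

public interface IListener
{
    string Dispatch(string user, string path);
}

public sealed class SingleUserAuthenticator : IAuthenticator
{
    private readonly string _allowedUser;

    public SingleUserAuthenticator(string allowedUser) { _allowedUser = allowedUser; }

    public bool IsAuthenticated(string user) => user == _allowedUser;
}

// The request handler uses an authenticator instead of inheriting authentication support.
public sealed class RequestHandler : IRequestHandler
{
    private readonly IAuthenticator _authenticator;

    public RequestHandler(IAuthenticator authenticator) { _authenticator = authenticator; }

    public string Handle(string user, string path) =>
        _authenticator.IsAuthenticated(user) ? "200 " + path : "401";
}

// The listener has a request handler and does nothing but dispatch to it.
public sealed class Listener : IListener
{
    private readonly IRequestHandler _handler;

    public Listener(IRequestHandler handler) { _handler = handler; }

    public string Dispatch(string user, string path) => _handler.Handle(user, path);
}

// The server is generic in the listener type and manages a single listener.
public sealed class Server<TListener> where TListener : IListener
{
    public Server(TListener listener) { Listener = listener; }

    public TListener Listener { get; }
}
```

Because each part receives its collaborator through the constructor, a test can swap in a different authenticator or request handler without touching the server or the listener.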
[Server Components and Flow]

This type hierarchy has decoupled the main elements of the workflow of handling requests for a server:

* The server manages listeners (currently a single listener), created by a listener factory
* The listener, in turn, dispatches requests to the request handler
* The request handler uses the route handler to figure out where to direct the request
* The route handler uses a registry to map requests to response items
* The request handler asks the state provider for the state for the given request
* The state provider checks its cache for the state (the default support uses persistent states to cache sessions for a limited time); if not found, it creates a new one
* Finally, the request handler checks whether the user for the request is authenticated and/or authorized to execute the action and, if so, executes the response items

It is important to note that this behavior is unchanged from the previous version -- it's just that now each step is encapsulated in its own component. The components are small and easily replaced, with clear and concise interfaces.

Note also that the current implementation of the request handler is for HTTP servers only. Should the need arise, however, it would be relatively easy to abstract away the HttpListener dependency and generalize most of the logic in the request handler for any kind of server, regardless of protocol and networking implementation. Only the request handler is affected by the HTTP dependency, though: authentication, state-provision and listener-management can all be re-used as-is.

Also of note is that the only full-fledged implementation is for metadata-based applications. At the bottom of the diagram, you can see the metadata-specific implementations for the route registry, state provider and authenticator. This is reflected in the standard registration in the IOC.
These are the service registrations from Encodo.Server:

return handler
  .RegisterSingle()
  .RegisterSingle, HttpServerListenerFactory>()
  .Register>();

And these are the service registrations from Quino.Server:

handler
  .RegisterSingle, StandardMetaServerRouteRegistry>()
  .RegisterSingle, MetaPersistentServerStateProvider>()
  .RegisterSingle, MetaServerStateCreator>()
  .RegisterSingle, MetaHttpServerAuthenticator>()
  .RegisterSingle>()

As you can see, the registration is extremely fine-grained and allows very precise customization as well as easy mocking and testing.

--------------------------------------------------------------------------------
[1] Any Men in Black fans out there? Tommy Lee Jones was "old and busted" while Will Smith was "the new hotness"? No? Just me? All righty then...

[2] This diagram brought to you by the diagramming and architecture tools in ReSharper 9.2. Just select the files or assemblies you want to diagram in the Solution Explorer and choose the option to show them in a diagram. You can right-click any type or assembly to show dependent or referenced modules or types. For type diagrams, you can easily control which relationships are to be shown (e.g. I hide aggregations to avoid clutter) and how the elements are to be grouped (e.g. I grouped by namespace to include the boxes in my diagram).

Iterating with NDepend to remove cyclic dependencies (Part II)

2015-10-16T11:44:35+02:00

Published by marco on 16. Oct 2015 11:44:35
Updated by marco on 7. Dec 2024 23:11:14
------------------------------------------------------------------------

In the previous article, we discussed the task of "Splitting up assemblies in Quino using NDepend." In this article, I'll discuss both the high-level and low-level workflows I used with NDepend to efficiently clear up these cycles.

Please note that what follows is a description of how I have used the tool -- so far -- to get my very specific tasks accomplished. If you're looking to solve other problems or want to solve the same problems more efficiently, you should take a look at the "official NDepend documentation".

[What were we doing?]

To recap briefly: we are reducing dependencies among top-level namespaces in two large assemblies, in order to be able to split them up into multiple assemblies. The resulting assemblies will have dependencies on each other, but the idea is to make at least some parts of the Encodo/Quino libraries opt-in.

[The plan of attack]

At a high level, I tackled the task in the following loosely defined phases.

Remove direct, root-level dependencies

This is the big first step -- to get rid of the little black boxes. I made NDepend show only direct dependencies at first, to reduce clutter. More on specific techniques below.

Remove indirect dependencies

[image]

Crank up the magnification to show indirect dependencies as well. This will help you root out the remaining cycles, which can be trickier if you're not showing enough detail. Conversely, if you turn on indirect dependencies too soon, you'll be overwhelmed by darkness (see the depressing initial state of the Encodo assembly to the right).

Examine dependencies between root-level namespaces

Even once you've gotten rid of all cycles, you may still have unwanted dependencies that hinder splitting namespaces into the desired constellation of assemblies.
For example, the plan is to split all logging and message-recording into an assembly called Encodo.Logging. However, the IRecorder interface (with a single method, Log()) is used practically everywhere. It quickly becomes necessary to split interfaces and implementation -- with many more potential dependencies -- into two assemblies for some very central interfaces and support classes. In this specific case, I moved IRecorder to Encodo.Core.

Even after you've conquered the black hole, you might still have quite a bit of work to do. Never fear, though: NDepend is there to help root out those dependencies as well.

Examine cycles in non-root namespaces

Because we can split off smaller assemblies regardless, these dependencies are less important to clean up for our current purposes. However, once this code is packed into its own assembly, its namespaces become root namespaces of their own and -- voila! you have more potentially nasty dependencies to deal with. Granted, the problem is less severe because you're dealing with a logically smaller component.

In Quino, we use non-root namespaces more for organization and less for defining components. Still, cycles are cycles and they're worth examining and at least plucking the low-hanging fruit.

[Removing root-level namespace cycles]

With the high-level plan described above in hand, I repeated the following steps for the many dependencies I had to untangle. Don't despair if it looks like your library has a ton of unwanted dependencies. If you're smart about the ones you untangle first, you can make excellent -- and, most importantly, rewarding -- progress relatively quickly. [1]

1. Show the dependency matrix
2. Choose the same assembly in the row and column
3. Choose a square that's black
4. Click the name of the namespace in the column to show sub-namespaces
5. Do the same in a row
6. Keep zooming until you can see where there are dependencies that you don't want
7. Refactor/compile/run NDepend analysis to show changes
8. GOTO 1

[Once again, with pictures!]

The high-level plan of attack sounded interesting, but might have left you cold with its abstraction. Then there was the promise of detail with a focus on root-level namespaces, but alas, you might still be left wondering just how exactly do you reduce these much-hated cycles?

I took some screenshots as I worked on Quino, to document my process and point out parts of NDepend I thought were eminently helpful.

[Show only namespaces]

[image][image]

I mentioned above that you should "[k]eep zooming in", but how do you do that? A good first step is to zoom all the way out and show only direct namespace dependencies. This focuses only on using references instead of the much-more frequent member accesses. In addition, I changed the default setting to show dependencies in only one direction -- when a column references a row (blue), but not vice versa (green).

As you can see, the diagrams are considerably less busy than the one shown above. Here, we can see a few black spots that indicate cycles, but it's not so many as to be overwhelming. [2] You can hover over the offending squares to show more detail in a popup.

[Show members]

[image][image]

If you don't see any more cycles between namespaces, switch the detail level to "Members". Another very useful feature is to "Bind Matrix", which forces the columns and rows to be shown in the same order and concentrates the cycles in a smaller area of the matrix. As you can see in the diagram, NDepend then highlights the offending area and you can even click the upper-left corner to focus the matrix only on that particular cycle.

[Drill down to classes]

[image][image]

Once you're looking at members, it isn't enough to know just the namespaces involved -- you need to know which types are referencing which types. The powerful matrix view lets you drill down through namespaces to show classes as well.
If your classes are large -- another no-no, but one thing at a time -- then you can drill down to show which method is calling which method to create the cycle. In the screenshot to the right, you can see where I had to do just that in order to finally figure out what was going on.

In that screenshot, you can also see something that I only discovered after using the tool for a while: the direction of usage is indicated with an arrow. You can turn off the tooltips -- which are informative, but can be distracting for this task -- and you don't have to remember which color (blue or green) corresponds to which direction of usage.

[Indirect dependencies]

[image][image]

Once you've drilled your way down from namespaces-only to showing member dependencies, to focusing on classes, and even members, your diagram should be shaping up quite well.

On the right, you'll see a diagram of all direct dependencies for the remaining area with a problem. You don't see any black boxes, which means that all direct dependencies are gone. So we have to turn up the power of our microscope further to show indirect dependencies.

On the left, you can see that the scary, scary black hole from the start of our journey has been whittled down to a small, black spot. And that's with all direct and indirect dependencies as well as both directions of usage turned on (i.e. the green boxes are back). This picture is much more pleasing, no?

[Queries and graphs]

[image][image][image]

For the last cluster of indirect dependencies shown above, I had to unpack another feature: NDepend queries: you can select any element and run a query to show using/used by assemblies/namespaces. [3] The results are shown in a panel, where you can edit the query to see live updates immediately.

Even with a highly zoomed-in view on the cycle, I still couldn't see the problem, so I took NDepend's suggestion and generated a graph of the final indirect dependency between Culture and Enums (through Expression).
At this zoom level, the graph becomes more useful (for me) and illuminates problems that remain muddy in the matrix (see right). [Crossing the finish line] In order to finish the job efficiently, here are a handful of miscellaneous tips that are useful, but didn't fit into the guide above. [image] * I set NDepend to automatically re-run an analysis on a successful build. The matrix updates automatically to reflect changes from the last analysis and won't lose your place. * If you have ReSharper, you'll generally be able to tell whether you've fixed the dependencies because the usings will be grayed out in the offending file. You can make several fixes at once before rebuilding and rerunning the analysis. * At higher zoom levels (e.g. having drilled down to methods), it is useful to toggle display of row dependencies back on because the dependency issue is only clear when you see the one green box in a sea of blue. * Though Matrix Binding is useful for localizing, remember to toggle it off when you want to drill down in the row independently of the namespace selected in the column. And BOOM! just like that [4], phase 1 (root namespaces) for Encodo was complete! Now, on to Quino.dll... [Conclusion] [image]Depending on what shape your library is in, do not underestimate the work involved. Even with NDepend riding shotgun and barking out the course like a rally navigator, you still have to actually make the changes. That means lots of refactoring, lots of building, lots of analysis, lots of running tests and lots of reviews of at-times quite-sweeping changes to your code base. The destination is worth the journey, but do not embark on it lightly -- and don't forget to bring the right tools. [5] -------------------------------------------------------------------------------- [1] This can be a bit distracting: you might get stuck trying to figure out which of all these offenders to fix first. 
[2] I'm also happy to report that my initial forays into maintaining a relatively clean library -- as opposed to cleaning it -- with NDepend have been quite efficient. [3] And much more: I don't think I've even scratched the surface of the analysis and reporting capabilities offered by this ability to directly query the dependency data. [4] I'm just kidding. It was a lot of time-consuming work. [5] In this case, in case it's not clear: NDepend for analysis and good ol' ReSharper for refactoring. And ReSharper's new(ish) architecture view is also quite good, though not even close to detailed enough to replace NDepend: it shows assembly-level dependencies only.

Splitting up assemblies in Quino using NDepend (Part I)

2015-10-04T07:44:39+02:00

Published by marco on 4. Oct 2015 07:44:39 ------------------------------------------------------------------------ [image]A lot of work has been put into Quino 2.0 [1], with almost no stone left unturned. Almost every subsystem has been refactored and simplified, including but not limited to the data driver, the schema migration, generated code and metadata, model-building, security and authentication, service-application support and, of course, configuration and execution. Two of the finishing touches before releasing 2.0 are to reorganize all of the code into a more coherent namespace structure and to reduce the size of the two monolithic assemblies: Encodo and Quino. [A Step Back] The first thing to establish is: why are we doing this? Why do we want to reduce dependencies and reduce the size of our assemblies? There are several reasons, but a major reason is to improve the discoverability of patterns and types in Quino. Two giant assemblies are not inviting -- they are, in fact, daunting. Replace these assemblies with dozens of smaller ones and users of your framework will be more likely to (A) find what they're looking for on their own and (B) build their own extensions with the correct dependencies and patterns. Neither of these is guaranteed, but smaller modules are a great start. Another big reason is portability. The .NET Core was released as open-source software some time ago and more and more .NET source code is added to it each day. There are portable targets, non-Windows targets, Universal-build targets and much more. It makes sense to split code up into highly portable units with as few dependencies as possible. That is, the dependencies should be explicit and intended. Not only that, but NuGet packaging has come to the fore more than ever. Quino was originally designed to keep third-party boundaries clear, but we wanted to make it as easy as possible to use Quino. Just include Encodo and Quino and off you went. 
However, with NuGet, you can now say you want to use Quino.Standard and you'll get Quino.Core, Encodo.Core, Encodo.Services.SimpleInjector, Quino.Services.SimpleInjector and other packages. With so much interesting code in the Quino framework, we want to make it available as much as possible not only for our internal projects but also for customer projects where appropriate and, also, possibly for open-source distribution. [NDepend] I've used NDepend before [2] to clean up dependencies. However, the last analysis I did about a year ago showed quite deep problems [3] that needed to be addressed before any further dependency analysis could bear fruit at all. With that work finally out of the way, I'm ready to re-engage with NDepend and see where we stand with Quino. As luck would have it, NDepend is in version 6, released at the start of summer 2015. As was the case last year, NDepend has generously provided me with an upgrade license to allow me to test and evaluate the new version with a sizable and real-world project. Here is some of the "feedback I sent to NDepend" : I really, really like the depth of insight NDepend gives me into my code. I find myself thinking "SOLID" much more often when I have NDepend shaking its head sadly at me, tsk-tsking at all of the dependency snarls I've managed to build. * It's fast and super-reliable. I can work these checks into my workflow relatively easily. 
* I'm using the matrix view a lot more than the graphs because even NDepend recommends I don't use a graph for the number of namespaces/classes I'm usually looking at * Where the graph view is super-useful is for examining *indirect* dependencies, which are harder to decipher with the matrix * I've found so many silly mistakes/lazy decisions that would lead to confusion for developers new to my framework * I'm spending so much time with it and documenting my experiences because I want more people at my company to use it * I haven't even scratched the surface of the warnings/errors but want to get to that, as well (the Dashboard tells me of 71 rules violated; 9 critical; I'm afraid to look :-) [Use Cases] Before I get more in-depth with NDepend, please note that there are at least two main use cases for this tool [4]: 1. Clean up a project or solution that has never had a professional dependency checkup 2. Analyze and maintain separation and architectural layers in a project or solution These two use cases are vastly different. The first is like cleaning a gas-station bathroom for the first time in years; the second is more like the weekly once-over you give your bathroom at home. The tools you'll need for the two jobs are similar, but quite different in scope and power. The same goes for NDepend: how you'll use it to claw your way back to architectural purity is different than how you'll use it to occasionally clean up an already mostly-clean project. Quino is much better than it was the last time we peeked under the covers with NDepend, but we're still going to need a bucket of industrial cleaner before we're done. [5] The first step is to make sure that you're analyzing the correct assemblies. Show the project properties to see which assemblies are included. 
You should remove all assemblies from consideration that don't currently interest you (especially if your library is not quite up to snuff, dependency-wise; afterwards, you can leave as many clean assemblies in the list as you like). [6] [Industrial-strength cleaner for Quino] Running an analysis with NDepend 6 generates a nice report, which includes the following initial dependency graph for the assemblies. [image] As you can see, Encodo and Quino depend only on system assemblies, but there are components that pull in other references where they might not be needed. The initial dependency matrices for Encodo and Quino both look much better than they did when I last generated one. The images below show what we have to work with in the Encodo and Quino assemblies. [image][image] It's not as terrible as I've made out, right? There is far less namespace-nesting, so it's much easier to see where the bidirectional dependencies are. There are only a handful of cyclic dependencies in each library, with Encodo edging out Quino because of (A) the nature of the code and (B) I'd put more effort into Encodo so far. I'm not particularly surprised to see that this is relatively clean because we've put effort into keeping the external dependencies low. It's the internal dependencies in Encodo and Quino that we want to reduce. [Small and Focused Assemblies] [image][image][image]The goal, as stated in the title of this article, is to split Encodo and Quino into separate assemblies. While removing cyclic dependencies is required for such an operation, it's not sufficient. Even without cycles, it's still possible that a given assembly is still too dependent on other assemblies. Before going any farther, I'm going to list the assemblies we'd like to have. By "like to have", I mean the list that we'd originally planned plus a few more that we added while doing the actual splitting. 
[7] The images on the right show the assemblies in Encodo, Quino and a partial overview of the dependency graph (calculated with the ReSharper Architecture overview rather than with NDepend, just for variety). Of these, the following assemblies and their dependencies are of particular interest [8]: * Encodo.Core: System dependencies only * Encodo.Application: basic application support [9] * Encodo.Application.Standard: configuration methods for non-metadata applications that don't want to pick and choose packages/assemblies * Encodo.Expressions: depends only on Encodo.Core * Quino.Meta: depends only on Encodo.Core and Encodo.Expressions * Quino.Meta.Standard: Optional, but useful metadata extensions * Quino.Application: depends only on Encodo.Application and Quino.Meta * Quino.Application.Standard: configuration methods for metadata applications that don't want to pick and choose packages/assemblies * Quino.Data: depends on Quino.Application and some Encodo.* assemblies * Quino.Schema: depends on Quino.Data This seems like a good spot to stop, before getting into the nitty-gritty detail of how we used NDepend in practice. In the next article, I'll discuss both the high-level and low-level workflows I used with NDepend to efficiently clear up these cycles. Stay tuned! -------------------------------------------------------------------------------- [1] Release notes for 2.0 betas: * "v2.0-beta1: Configuration, services and web" * "v2.0-beta2: Code generation, IOC and configuration" Articles about design: * "Encodo’s configuration library for Quino: part I" * "Encodo’s configuration library for Quino: part II" * "Encodo’s configuration library for Quino: part III" * "API Design: Running an Application (Part I)" * "API Design: To Generic or not Generic? (Part II)" [2] I published a two-parter in August and November of 2014. 
* "The Road to Quino 2.0: Maintaining architecture with NDepend (part I)" * "The Road to Quino 2.0: Maintaining architecture with NDepend (part II)" [3] You can see a lot of the issues associated with these changes in the release notes for "Quino 2.0-beta1" (mostly the first point in the "Highlights" section) and "Quino 2.0-beta2" (pretty much all of the points in the "Highlights" section). [4] I'm sure there are more, but those are the ones I can think of that would apply to my project (for now). [5] ...to stretch the gas-station metaphor even further. [6] Here I'm going to give you a tip that confused me for a while, but that I think was due to particularly bad luck and is actually quite a rare occurrence. If you already see the correct assemblies in the list, you should still check that NDepend picked up the right paths. That is, if you haven't followed the advice in NDepend's white paper and still have a different bin folder for each assembly, you may see something like the following in the tooltip when you hover over the assembly name: "Several valid .NET assemblies with the name {Encodo} have been found. They all have the same version. the one with the biggest file has been chosen." If NDepend has accidentally found an older copy of your assembly, you must delete that assembly. Even if you add an assembly directly, NDepend will not honor the path from which you added it. This isn't as bad as it sounds, since it's a very strange constellation of circumstances that led to this assembly hanging around anyway: * The project is no longer included in the latest Quino but lingers in my workspace * The version number is unfortunately the same, even though the assembly is wildly out of date I only noticed because I knew I didn't have that many dependency cycles left in the Encodo assembly. 
[7] Especially for larger libraries like Quino, you'll find that your expectations about dependencies between modules will be largely correct, but will still have gossamer filaments connecting them that prevent a clean split. In those cases, we just created new assemblies to hold these common dependencies. Once an initial split is complete, we'll iterate and refactor to reduce some of these ad-hoc assemblies. [8] Screenshots, names and dependencies are based on a pre-release version of Quino, so while the likelihood is small, everything is subject to change. [9] Stay tuned for an upcoming post on the details of starting up an application, which is the support provided in Encodo.Application.
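The planned dependencies in the list above can be written down as data and checked for cycles before any code is moved. The following is only a minimal sketch in C#; it is not part of Quino or NDepend. The PlannedAssemblies class and its members are hypothetical, and the data includes only the dependencies that the list states explicitly (the "some Encodo.* assemblies" used by Quino.Data are left out, as are the unstated dependencies of Encodo.Application).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of Quino or NDepend.
public static class PlannedAssemblies
{
  // Only the dependencies that the list above states explicitly.
  public static readonly Dictionary<string, string[]> Dependencies =
    new Dictionary<string, string[]>
    {
      ["Encodo.Core"] = new string[0],
      ["Encodo.Application"] = new string[0], // dependencies not stated in the list
      ["Encodo.Expressions"] = new[] { "Encodo.Core" },
      ["Quino.Meta"] = new[] { "Encodo.Core", "Encodo.Expressions" },
      ["Quino.Application"] = new[] { "Encodo.Application", "Quino.Meta" },
      ["Quino.Data"] = new[] { "Quino.Application" },
      ["Quino.Schema"] = new[] { "Quino.Data" },
    };

  // Returns true if following dependencies from any assembly leads back to it.
  public static bool HasCycle(IDictionary<string, string[]> graph)
  {
    var done = new HashSet<string>();
    var inProgress = new HashSet<string>();

    return graph.Keys.Any(name => Visit(name, graph, done, inProgress));
  }

  // Depth-first search; "inProgress" holds the assemblies on the current path.
  private static bool Visit(
    string name,
    IDictionary<string, string[]> graph,
    HashSet<string> done,
    HashSet<string> inProgress)
  {
    if (done.Contains(name))
    {
      return false;
    }

    if (!inProgress.Add(name))
    {
      return true; // already on the current path, so this is a cycle
    }

    string[] dependencies;
    if (graph.TryGetValue(name, out dependencies))
    {
      foreach (var dependency in dependencies)
      {
        if (Visit(dependency, graph, done, inProgress))
        {
          return true;
        }
      }
    }

    inProgress.Remove(name);
    done.Add(name);

    return false;
  }
}
```

A check like this says nothing about the existing code, of course; it only confirms that the target structure itself is free of cycles. Finding the cycles in the real assemblies is still a job for NDepend.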

API Design: To Generic or not Generic? (Part II)

2015-09-26T11:27:08+02:00

Published by marco on 26. Sep 2015 11:27:08 Updated by marco on 15. Jan 2017 23:18:41 ------------------------------------------------------------------------ [image]In this article, I'm going to continue the discussion started in "Part I" , where we laid some groundwork about the state machine that is the startup/execution/shutdown feature of Quino. As we discussed, this part of the API still suffers from "several places where generic TApplication parameters [are] cluttering the API". In this article, we'll take a closer look at different design approaches to this concrete example -- and see how we decided whether to use generic type parameters. [Consistency through Patterns and API] Any decision you take with a non-trivial API is going to involve several stakeholders and aspects. It's often not easy to decide which path is best for your stakeholders and your product. For any API you design, consider how others are likely to extend it -- and whether your pattern is likely to deteriorate from neglect. Even a very clever solution has to be balanced with simplicity and elegance if it is to have a hope in hell of being used and standing the test of time. In Quino 2.0, the focus has been on ruthlessly eradicating properties on the IApplication interface as well as getting rid of the descendant interfaces, ICoreApplication and IMetaApplication. Because Quino now uses a pattern of placing sub-objects in the IOC associated with an IApplication, there is far less need for a generic TApplication parameter in the rest of the framework. See "Encodo’s configuration library for Quino: part I" for more information and examples. This focus raised an API-design question: if we no longer want descendant interfaces, should we eliminate parameters generic in that interface? 
Or should we continue to support generic parameters for applications so that the caller will always get back the type of application that was passed in? Before getting too far into the weeds [1], let's look at a few concrete examples to illustrate the issue. [Do Fluent APIs require generic return-parameters?] As discussed in "Encodo’s configuration library for Quino: part III" in detail, Quino applications are configured with the "Use*" pattern, where the caller includes functionality in an application by calling methods like UseRemoteServer() or UseCommandLine(). The latest version of this API pattern in Quino recommends returning the application that was passed in to allow chaining and fluent configuration. For example, the following code chains the aforementioned methods together without creating a local variable or other clutter. return new CodeGeneratorApplication().UseRemoteServer().UseCommandLine(); What should the return type of such standard configuration operations be? Taking a method above as an example, it could be defined as follows: public static IApplication UseCommandLine(this IApplication application, string[] args) { ... } This seems like it would work fine, but the original type of the application that was passed in is lost, which is not exactly in keeping with the fluent style. In order to maintain the type, we could define the method as follows: public static TApplication UseCommandLine<TApplication>(this TApplication application, string[] args) where TApplication : IApplication { ... } This style is not as succinct but has the advantage that the caller loses no type information. On the other hand, it's more work to define methods in this way and there is a strong likelihood that many such methods will simply be written in the style of the first example. Why would other coders do that? 
Because it's easier to write code without generics, and because the stronger result type is not needed in 99% of the cases. If every configuration method expects and returns an IApplication, then the stronger type will never come into play. If the compiler isn't going to complain, you can expect a higher rate of entropy in your API right out of the gate. One way the more-derived type would come in handy is if the caller wanted to define the application-creation method with their own type as a result, as shown below: private static CodeGeneratorApplication CreateApplication() { return new CodeGeneratorApplication().UseRemoteServer().UseCommandLine(); } If the library methods expect and return IApplication values, the result of UseCommandLine() will be IApplication and requires a cast to be used as defined above. If the library methods are defined generic in TApplication, then everything works as written above. This is definitely an advantage, in that the user gets the exact type back that they created. Generics definitely offer advantages, but it remains to be seen how much those advantages are worth. [2] [Another example: The IApplicationManager] Before we examine the pros and cons further, let's look at another example. In Quino 1.x, applications were created directly by the client program and passed into the framework. In Quino 2.x, the IApplicationManager is responsible for creating and executing applications. A caller passes in two functions: one to create an application and another to execute an application. A standard application startup looks like this: new ApplicationManager().Run(CreateApplication, RunApplication); [3] "Generic types can trigger an avalanche of generic parameters(tm) throughout your code." The question is: what should the types of the two function parameters be? Does CreateApplication return an IApplication or a caller-specific derived type? What is the type of the application parameter passed to RunApplication? Also IApplication? 
Or the more derived type returned by CreateApplication? As with the previous example, if the IApplicationManager is to return a derived type, then it must be generic in TApplication and both function parameters will be generically typed as well. These generic types will trigger an avalanche of generic parameters(tm) throughout the other extension methods, interfaces and classes involved in initializing and executing applications. That sounds horrible. This sounds like a pretty easy decision. Why are we even considering the alternative? Well, because it can be very advantageous if the application can declare RunApplication with a strictly typed signature, as shown below. private static void RunApplication(CodeGeneratorApplication application) { ... } Neat, right? I've got my very own type back. [Where Generics Goes off the Rails] However, if the IApplicationManager is to call this function, then the signatures of CreateAndStartUp() and Run() have to be generic, as shown below. TApplication CreateAndStartUp<TApplication>( Func<IApplicationCreationSettings, TApplication> createApplication ) where TApplication : IApplication; IApplicationExecutionTranscript Run<TApplication>( Func<IApplicationCreationSettings, TApplication> createApplication, Action<TApplication> run ) where TApplication : IApplication; These are quite messy -- and kinda scary -- signatures. [4] If these core methods are already so complex, any other methods involved in startup and execution would have to be equally complex -- including helper methods created by calling applications. [5] The advantage here is that the caller will always get back the type of application that was created. The compiler guarantees it. The caller is not obliged to cast an IApplication back up to the original type. The disadvantage is that all of the library code is infected by a generic parameter with its attendant IApplication generic constraint. 
[6] [Don't add Support for Conflicting Patterns] The title of this section seems pretty self-explanatory, but we as designers must remain vigilant against the siren call of what seems like a really elegant and strictly typed solution. The generics above establish a pattern that must be adhered to by subsequent extenders and implementors. And to what end? So that a caller can attach properties to an application and access those in a statically typed manner, i.e. without casting? But aren't properties on an application exactly what we just worked so hard to eliminate? Isn't the recommended pattern to create a "settings" object and add it to the IOC instead? That is, as of Quino 2.0, you get an IApplication and obtain the desired settings from its IOC. Technically, the cast is still taking place in the IOC somewhere, but that seems somehow less bad than a direct cast. If the framework recommends that users don't add properties to an application -- and ruthlessly eliminated all standard properties and descendants -- then why would the framework turn around and add support -- at considerable cost in maintenance and readability and extensibility -- for callers that expect a certain type of application? [Wrapping up] Let's take a look at the non-generic implementation and see what we lose or gain. The final version of the IApplicationManager API is shown below, which properly balances the concerns of all stakeholders and hopefully will stand the test of time (or at least last until the next major revision). IApplication CreateAndStartUp( Func<IApplicationCreationSettings, IApplication> createApplication ); IApplicationExecutionTranscript Run( Func<IApplicationCreationSettings, IApplication> createApplication, Action<IApplication> run ); These are the hard questions of API design: ensuring consistency, enforcing intent and balancing simplicity and cleanliness of code with expressiveness. 
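To make the trade-off discussed above concrete, here is a minimal, self-contained sketch of the two styles of configuration method side by side. It is only an illustration: the IApplication and CodeGeneratorApplication types are empty stand-ins rather than the real Quino types, and the method name UseCommandLineTyped and the FluentExample class are hypothetical names chosen so that both variants can live in one file.

```csharp
using System;

// Empty stand-ins for illustration; these are not the real Quino types.
public interface IApplication { }

public class CodeGeneratorApplication : IApplication { }

public static class ConfigurationExtensions
{
  // Non-generic style: easy to write, but the caller gets back an IApplication.
  public static IApplication UseCommandLine(this IApplication application, string[] args)
  {
    // (Configuration of the application would happen here.)
    return application;
  }

  // Generic style (hypothetical name): the caller gets back the type passed in.
  public static TApplication UseCommandLineTyped<TApplication>(this TApplication application, string[] args)
    where TApplication : IApplication
  {
    // (Configuration of the application would happen here.)
    return application;
  }
}

public static class FluentExample
{
  public static CodeGeneratorApplication CreateTyped(string[] args)
  {
    // Compiles without a cast because the generic method preserves the type.
    return new CodeGeneratorApplication().UseCommandLineTyped(args);
  }

  public static CodeGeneratorApplication CreateLoose(string[] args)
  {
    // The non-generic method returns IApplication, so a cast is required.
    return (CodeGeneratorApplication)new CodeGeneratorApplication().UseCommandLine(args);
  }
}
```

Both methods return the same object; the only difference is the static type that the caller sees. That is the whole benefit of the generic version, and it has to be paid for with the extra type parameter and constraint on every method in the chain.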
-------------------------------------------------------------------------------- [1] A predilection of mine, I'll admit, especially when writing about a topic about which I've thought quite a lot. In those cases, the instinct to just skip "the object" and move on to the esoteric details that stand in the way of an elegant, perfect solution, is very, very strong. [2] This more-realized typing was so attractive that we used it in many places in Quino without properly weighing the consequences. This article is the result of reconsidering that decision. [3] This call looks the same for all UI (console, Winform, WPF, etc.), all services (e.g. ASP.NET, Windows-services, etc.) as well as for automated tests. This fact isn't germane to the discussion above, but it's pretty neat in its own right. All an application has to do is define two methods with the right signatures and call the appropriate Run() method for the desired type of application. Almost all of the startup code is shared and the pattern is the same everywhere. [4] Yes, the C# compiler will allow you to elide generics for most method calls (so long as the compiler can determine the types of the parameters without it). However, generics cannot be removed from constructor calls. These must always specify all generic parameters, which makes for messier-looking, lengthy code in the caller e.g. when creating the ApplicationManager were it to have been defined with generic parameters. Yet another thing to consider when choosing how to define your API. [5] As already mentioned elsewhere (but it bears repeating): callers can, of course, eschew the generic types and use IApplication everywhere -- and most probably will, because the advantage offered by making everything generic is vanishingly small. If your API looks this scary, entropy will eat it alive before the end of the week, to say nothing of its surviving to the next major version. 
[6] A more subtle issue that arises is if you do end up -- even accidentally -- mixing generic and non-generic calls (i.e. using IApplication as the extended parameter in some cases and TApplication in others). This issue is in how the application object is registered in the IOC. During development, when the framework was still using generics everywhere (or almost everywhere), some parts of the code were retrieving a reference to the application using the most-derived type whereas the application had been registered in the container as a singleton using IApplication. The call to retrieve the most derived type returned a new instance of the application rather than the pre-registered singleton, which was a subtle and difficult bug to track down.
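The singleton mismatch described in the last footnote above is easy to reproduce in miniature. The sketch below uses a deliberately tiny, hypothetical container (TinyContainer) rather than the container that Quino actually uses, and it assumes that the container creates unregistered concrete types on demand, as some containers do. The IApplication and CodeGeneratorApplication types are empty stand-ins.

```csharp
using System;
using System.Collections.Generic;

// Empty stand-ins for illustration; these are not the real Quino types.
public interface IApplication { }

public class CodeGeneratorApplication : IApplication { }

// A deliberately tiny, hypothetical container; real IOC containers differ in detail.
public class TinyContainer
{
  private readonly Dictionary<Type, object> _singletons = new Dictionary<Type, object>();

  // Registers the instance under exactly the requested service type.
  public void RegisterSingleton<TService>(TService instance)
  {
    _singletons[typeof(TService)] = instance;
  }

  public TService Resolve<TService>() where TService : class
  {
    object instance;
    if (_singletons.TryGetValue(typeof(TService), out instance))
    {
      return (TService)instance;
    }

    // Assumption: unregistered concrete types are created on demand.
    return (TService)Activator.CreateInstance(typeof(TService));
  }
}

public static class SingletonMismatchDemo
{
  // Returns true only if both lookups yield the same application object.
  public static bool ResolvesSameInstance()
  {
    var container = new TinyContainer();

    // The application is registered as a singleton using the interface...
    container.RegisterSingleton<IApplication>(new CodeGeneratorApplication());

    var viaInterface = container.Resolve<IApplication>();

    // ...but retrieved using the most-derived type, which was never registered.
    var viaConcreteType = container.Resolve<CodeGeneratorApplication>();

    return ReferenceEquals(viaInterface, viaConcreteType);
  }
}
```

In this sketch, the second lookup silently creates a second application instead of failing, so ResolvesSameInstance() returns false. Using IApplication consistently for both registration and retrieval avoids the problem.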

API Design: Running an Application (Part I)

2015-09-19T07:29:59+02:00

Published by marco on 19. Sep 2015 07:29:59 Updated by marco on 26. Sep 2015 11:24:56 ------------------------------------------------------------------------ In this article, we're going to discuss a bit more about the configuration library in Quino 2.0. Other entries on this topic have been the articles about Encodo’s configuration library for Quino: "part I" , "part II" and "part III" . The goal of this article is to discuss a concrete example of how we decided whether to use generic type parameters throughout the configuration part of Quino. The meat of that discussion will be in a part 2 because we're going to have to lay some groundwork about the features we want first. (Requirements!) [A Surfeit of Generics] As of "Quino 2.0-beta2" , the configuration library consisted of a central IApplication interface which has a reference to an IOC container and a list of startup and shutdown actions. As shown in "part III" , these actions no longer have a generic TApplication parameter. This makes it not only much easier to use the framework, but also easier to extend it. In this case, we were able to remove the generic parameter without sacrificing any expressiveness or type-safety. As of beta2, there were still several places where generic TApplication parameters were cluttering the API. Could we perhaps optimize further? Throw out even more complexity without losing anything? [Starting up an application] One of these places is the actual engine that executes the startup and shutdown actions. This code is a bit trickier than just a simple loop because Quino supports execution in debug mode -- without exception-handling -- and release mode -- with global exception-handling and logging. As with any application that uses an IOC container, there is a configuration phase, during which the container can be changed and an execution phase, during which the container produces objects but can no longer be re-configured. 
Until 2.0-beta2, the execution engine was encapsulated in several extension methods called Run(), StartUp() and so on. These methods were generally generic in TApplication. I write "generally" because there were some inconsistencies with extension methods for custom application types like Winform or Console applications. While extension methods can be really useful, this usage was not really appropriate as it violated the open/closed principle. For the final release of Quino, we wanted to move this logic into an IApplicationManager so that applications using Quino could (A) choose their own logic for starting an application and (B) add this startup class to a non-Quino IOC container if they wanted to. [Application Execution Modes] So far, so good. Before we discuss how to rewrite the application manager/execution engine, we should quickly revisit what exactly this engine is supposed to do. As it turns out, not only do we want to make an architectural change to make the design more open for extension, but the basic algorithm for starting an application changed, as well. "What does it mean to run an application?" Quino has always acknowledged and kinda/sorta supported the idea that a single application can be run in different ways. Even an execution that results in immediate failure technically counts as an execution, as a traversal of the state machine defined by the application. If we view an application as the state machine that it is, then every application has at least two terminal nodes: OK and Error. But what does OK mean for an application? In Quino, it means that all startup actions were executed without error and the run() action passed in by the caller was also executed without error. Anything else results in an exception and is shunted to Error. [image]But is that true, really? Can you think of other ways in which an application could successfully execute without really having failed? For most applications, the answer is yes. 
Almost every application -- and certainly every Quino application -- supports a command line. One of the default options for the command line of a Quino application is -h, which shows a manual for the other command-line options. If the application is running in a console, this manual is printed to the console; for a Winform application, a dialog box is shown; and so on. This "help" mode is actually a successful execution of the application that did not result in the main event loop of the application being executed. Thought of in this way, any command-line option that controls application execution could divert the application to another type of terminal node in the state machine. A good example is when an application provides support for importing or exporting data via the command line. ["Canceled" Terminal Nodes] A terminal node is also not necessarily only Crashed or Ok. Almost any application will also need to have a Canceled mode that is a perfectly valid exit state. For example, * If the application requires a login during execution (startup), but the user aborts authentication * If the application supports schema migration, but the user aborts without migrating the schema These are two ways in which a standard Quino application could run to completion without crashing but without having accomplished any of its main tasks. It ran and it didn't crash, but it also didn't do anything useful. [Intermediate Nodes in the Application State Machine] This section title sounds a bit pretentious, but that's exactly what we want to discuss here. Instead of having just start and terminal nodes, the Quino startup supports cycles through intermediate nodes as well. What the hell does that mean? It means that some nodes may trigger Quino to restart in a different mode in order to handle a particular kind of error condition that could be repaired. [1] A concrete example is desperately needed here, I think. 
The main use of this feature in Quino right now is to support on-the-fly schema-migration without forcing the user to restart the application. This feature has been in Quino from the very beginning and is used almost exclusively by developers during development. The use case to support is as follows: 1. Developer is running an application 2. Developer makes a change to the model (or pulls changes from the server) 3. Developer runs the application with a schema-change 4. Application displays migration tool; developer can easily migrate the schema and continue working This workflow minimizes the amount of trouble that a developer has when either making changes or when integrating changes from other developers. In all cases in which the application model is different from the developer's database schema, it's very quick and easy to upgrade and continue working. ["Rescuing" an application in Quino 2.0] How does this work internally in Quino 2.0? The application starts up but somehow encounters an error that indicates that a schema migration might be required. This can happen in one of two ways: 1. The schema-verification step in the standard Quino startup detects a change in the application model vis à vis the data schema 2. Some other part of the startup accesses the database and runs into a DatabaseException that is indicative of a schema-mismatch In both cases, the application that was running throws an ApplicationRestartException, which the standard IApplicationManager implementation knows how to handle. It handles it by shutting down the running application instance and asking the caller to create a new application, but this time one that knows how to handle the situation that caused the exception. Concretely, the exception includes an IApplicationCreationSettings descendant that the caller can use to decide how to customize the application to handle that situation.
The manager then runs this new application to completion (or until a new RestartApplicationException is thrown), shuts it down, and asks the caller to create the original application again, to give it another go. In the example above, if the user has successfully migrated the schema, then the application will start on this second attempt. If not, then the manager enters the cycle again, attempting to repair the situation so that it can get to a terminal node. Naturally, the user can cancel the migration and the application also exits gracefully, with a Canceled state. A few examples of possible application execution paths: * Standard => OK * Standard => Error * Standard => Canceled * Standard => Restart => Migrator => Standard => OK * Standard => Restart => Migrator => Canceled The pattern is the same for interactive, client applications as for headless applications like test suites, which attempt migration once and abort if not successful. Applications like web servers or other services will generally only support the OK and Error states and fail when they encounter a RestartApplicationException. Still, it's nice to know that the pattern is there, should you need it. It fits relatively cleanly into the rest of the API without making it more complicated. The caller passes two functions to the IApplicationManager: one to create an application and one to run it. An example from the Quino CodeGeneratorApplication is shown below:

    internal static void Main()
    {
      new ApplicationManager().Run(CreateApplication, GenerateCode);
    }

    private static IApplication CreateApplication(
      IApplicationCreationSettings applicationCreationSettings
    )
    { ... }

    private static void GenerateCode(IApplication application)
    { ... }

We'll see in the next post what the final API looks like and how we arrived at the final version of that API in Quino 2.0.
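The restart cycle described above can be sketched roughly as follows. This is only an illustration of the loop, not Quino's implementation: ExitState, RestartException, SketchManager, the string-based rescue mode and the maxRestarts limit are all hypothetical names and assumptions, and shutting down each application instance is left out for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration; none of these types are the Quino API.
public enum ExitState { OK, Error, Canceled }

public sealed class RestartException : Exception
{
    public RestartException(string rescueMode)
        : base("Restart requested in mode: " + rescueMode)
    {
        RescueMode = rescueMode;
    }

    // Tells the caller which kind of application to create next (e.g. "Migrator").
    public string RescueMode { get; }
}

public sealed class SketchManager
{
    // Records the nodes visited, e.g. Standard, Restart, Migrator, Standard, OK.
    public List<string> Path { get; } = new List<string>();

    // create(mode) builds an application; mode is null for the standard application.
    // run(app) executes it and reports how it ended.
    public ExitState Run<TApp>(Func<string, TApp> create, Func<TApp, ExitState> run, int maxRestarts = 3)
    {
        string mode = null;
        var restarts = 0;

        while (true)
        {
            Path.Add(mode ?? "Standard");
            try
            {
                var state = run(create(mode));
                if (mode == null || state != ExitState.OK)
                {
                    // Terminal node: the standard application finished,
                    // or the rescue application failed or was canceled.
                    return Finish(state);
                }

                mode = null; // the rescue succeeded: give the standard application another go
            }
            catch (RestartException e)
            {
                if (++restarts > maxRestarts)
                {
                    return Finish(ExitState.Error);
                }

                Path.Add("Restart");
                mode = e.RescueMode;
            }
            catch (Exception)
            {
                return Finish(ExitState.Error);
            }
        }
    }

    private ExitState Finish(ExitState state)
    {
        Path.Add(state.ToString());
        return state;
    }
}
```

In this sketch, a standard application that throws a RestartException requesting "Migrator" mode, followed by a migrator run that returns OK, walks the path Standard => Restart => Migrator => Standard => OK. A migrator run that returns Canceled ends the cycle with Standard => Restart => Migrator => Canceled.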
-------------------------------------------------------------------------------- [1] Or rescued, using the nomenclature from Eiffel exception-handling, which actually does something very similar. The exception handling in most languages lets you clean up and move on, but the intent isn't necessarily to re-run the code that failed. In Eiffel, this is exactly how exception-handling works: fix whatever was broken and re-run the original code. Quino now works very much like this as well.

Encodo Git Handbook 3.0

2015-09-19T07:18:15+02:00

Published by marco on 19. Sep 2015 07:18:15 ------------------------------------------------------------------------ Encodo first published a Git Handbook for employees in September 2011 and last updated it in July of 2012. Since then, we've continued to use Git, refining our practices and tools. Although a lot of the content is still relevant, some parts are quite outdated and the overall organization suffered through several subsequent, unpublished updates. What did we change from version 2.0? * We removed all references to the Encodo Git Shell. This shell was a custom environment based on Cygwin. It configured the SSH agent, set up environment variables and so on. Since tools for Windows have improved considerably, we no longer need this custom tool. Instead, we've moved to PowerShell and PoshGit to handle all of our Git command-line needs. * We removed all references to Enigma. This was a Windows desktop application developed by Encodo to provide an overview, eager-fetching and batch tasks for multiple Git repositories. We stopped development on this when SmartGit included all of the same functionality in versions 5 and 6. * We removed all detailed documentation for Git submodules. Encodo stopped using submodules (except for one legacy project) several years ago. We used to use submodules to manage external binary dependencies but have long since moved to NuGet instead. * We reorganized the chapters to lead off with a quick overview of Basic Concepts followed by a focus on Best Practices and our recommended Development Process. We also reorganized the Git-command documentation to use a more logical order. You can download "version 3 of the Git Handbook" or "get the latest copy from here" . Chapter 3, Best Practices, and chapter 4, Development Process, have been included in their entirety below. [3 Best Practices] [3.1 Focused Commits] Focused commits are required; small commits are highly recommended.
Keeping the number of changes per commit tightly focused on a single task helps in many cases. * They are easier to resolve when merge conflicts occur * They can be more easily merged/rebased by Git * If a commit addresses only one issue, it is easier for a reviewer or reader to decide whether it should be examined. For example, if you are working on a bug fix and discover that you need to refactor a file as well, or clean up the documentation or formatting, you should finish the bug fix first, commit it and then reformat, document or refactor in a separate commit. Even if you have made a lot of changes all at once, you can still separate changes into multiple commits to keep those commits focused. Git even allows you to split changes from a single file over multiple commits (the Git Gui provides this functionality as does the index editor in SmartGit). [3.2 Snapshots] Use the staging area to make quick snapshots without committing changes but still being able to compare them against more recent changes. For example, suppose you want to refactor the implementation of a class. * Make some changes and run the tests; if everything’s ok, stage those changes * Make more changes; now you can diff these new changes not only against the version in the repository but also against the version in the index (that you staged). * If the new version is broken, you can revert to the staged version or at least more easily figure out where you went wrong (because there are fewer changes to examine than if you had to diff against the original) * If the new version is ok, you can stage it and continue working [3.3 Developing New Code] Where you develop new code depends entirely on the project release plan. * Code for releases should be committed to the release branch (if there is one) or to the develop branch if there is no release branch for that release * If the new code is a larger feature, then use a feature branch. 
If you are developing a feature in a hotfix or release branch, you can use the optional base parameter to base the feature on that branch instead of the develop branch, which is the default. [3.4 Merging vs. Rebasing] Follow these rules for which command to use to combine two branches: * If both branches have already been pushed, then merge. There is no way around this, as you won’t be able to push a non-merged result back to the origin. * If you work with branches that are part of the standard branching model (e.g. release, feature, etc.), then merge. * If both you and someone else made changes to the same branch (e.g. develop), then rebase. This will be the default behavior during development [4 Development Process] A branching model is required in order to successfully manage a non-trivial project. Whereas a trivial project generally has a single branch and few or no tags, a non-trivial project has a stable release—with tags and possible hotfix branches—as well as a development branch—with possible feature branches. A common branching model in the Git world is called Git Flow. Previous versions of this manual included more specific instructions for using the Git Flow-plugin for Git but experience has shown that a less complex branching model is sufficient and that using standard Git commands is more transparent. However, since Git Flow is a very widely used branching model, retaining the naming conventions helps new developers more easily understand how a repository is organized. [4.1 Branch Types] The following list shows the branch types as well as the naming convention for each type: * master is the main development branch. All other branches should be merged back to this branch (unless the work is to be discarded). Developers may apply commits and create tags directly on this branch. * feature/name is a feature branch. Feature branches are for changes that require multiple commits or coordination between multiple developers. 
When the feature is completed and stable, it is merged to the master branch after which it should be removed. Multiple simultaneous feature branches are allowed. * release/vX.X.X is a release branch. Although a project can be released (and tagged) directly from the master branch, some projects require a longer stabilization and testing phase before a release is ready. Using a release branch allows development on the develop branch to continue normally without affecting the release candidate. Multiple simultaneous release branches are strongly discouraged. * hotfix/vX.X.X is a hotfix branch. Hotfix branches are always created from the release tag for the version in which the hotfix is required. These branches are generally very short-lived. If a hotfix is needed in a feature or release branch, it can be merged there as well (see the optional arrow in the following diagram). The main difference from the Git Flow branching model is that there is no explicit stable branch. Instead, the last version tag serves the purpose just as well and is less work to maintain. For more information on where to develop code, see “3.3 – Developing New Code”. [4.2 Example] To get a better picture of how these branches are created and merged, the following diagram depicts many of the situations outlined above. 
The diagram tells the following story: * Development began on the master branch * v1.0 was released directly from the master branch * Development on feature “B” began * A bug was discovered in v1.0 and the v1.0.1 hotfix branch was created to address it * Development on feature “A” began * The bug was fixed, v1.0.1 was released and the fix was merged back to the master branch * Development continued on master as well as features “A” and “B” * Changes from master were merged to feature “A” (optional merge) * Release branch v1.1 was created * Development on feature “A” completed and was merged to the master branch * v1.1 was released (without feature “A”), tagged and merged back to the master branch * Changes from master were merged to feature “B” (optional merge) * Development continued on both the master branch and feature “B” * v1.2 was released (with feature “A”) directly from the master branch [image] Legend: * Circles depict commits * Blue balloons are the first commit in a branch * Grey balloons are a tag * Solid arrows are a required merge * Dashed arrows are an optional merge

ReSharper Unit Test Runner 9.x update

2015-09-03T12:30:57+02:00

Published by marco on 3. Sep 2015 12:30:57 ------------------------------------------------------------------------ Way back in February, I wrote about "my experiences with ReSharper 9 when it first came out" . The following article provides an update, this time with version 9.2, released just last week. tl;dr: I'm back to ReSharper 8.2.3 and am a bit worried about the state of the 9.x series of ReSharper. Ordinarily, JetBrains has eliminated performance, stability and functional issues by the first minor version-update (9.1), to say nothing of the second (9.2). [Test Runner] In the previous article, my main gripe was with the unit-test runner, which was unusable due to flakiness in the UI, execution and change-detection. With the release of 9.2, the UI and change-detection problems have been fixed, but the runner is still quite flaky at executing tests. What follows is the text of the report that I sent to JetBrains when they asked me why I uninstalled R# 9.2. As with 9.0 and 9.1, I am unable to productively use the 9.2 Test Runner with many of my NUnit tests. These tests are not straight-up, standard tests, but R# 8.2.3 handled them without any issues whatsoever. What's special about my tests? There are quite a few base classes providing base functionality. The top layers provide scenario-specific input via a generic type parameter. 
- TestsBase
  - OtherBase (7 of these, one with an NUnit CategoryAttribute)
    - ConcreteTests (defines tests with NUnit TestAttributes)
      - ProviderAConcreteTests (CategoryAttribute)
        - ProtocolAProviderAConcreteTests (TMixin = ProtocolAProviderA; TestFixtureAttribute, CategoryAttributes)
        - ProtocolBProviderAConcreteTests (TMixin = ProtocolBProviderA; TestFixtureAttribute, CategoryAttributes)
      - ProviderBConcreteTests (CategoryAttribute)
        - ProtocolAProviderBConcreteTests (TMixin = ProtocolAProviderB; TestFixtureAttribute, CategoryAttributes)
        - ProtocolBProviderBConcreteTests (TMixin = ProtocolBProviderB; TestFixtureAttribute, CategoryAttributes)

The test runner in 9.2 is not happy with this at all. The test explorer shows all of the tests correctly, with the test counts correct. If I select a node for all tests for ProviderB and ProtocolA (696 tests in 36 fixtures), R# loads 36 non-expandable nodes into the runner and, after a bit of a wait, marks them all as inconclusive. Running an individual test-fixture node does not magically cause the tests to load or appear and also shows inconclusive (after a while; it seems the fixture setup executes as expected but the results are not displayed). If I select a specific, concrete fixture and add or run those tests, R# loads and executes the runner correctly. If I select multiple test fixtures in the explorer and add them, they also show up as expandable nodes, with the correct test counts, and can be executed individually (per fixture). However, if I elect to run them all by running the parent node, R# once again marks everything as inconclusive. As I mentioned, 8.2.3 handles this correctly and I feel R# 9.2 isn't far off -- the unit-test explorer does, after all, show the correct tests and counts. In 9.2, it's not only inconvenient, but I'm worried that my tests are not being executed with the expected configuration. Also, I really missed the StyleCop plugin for 9.2.
There's a beta version for 9.1 that caused noticeable lag, so I'm still waiting for a more unobtrusive version for 9.2 (or any version at all). While it's possible that there's something I'm doing wrong, or there's something in my installation that's strange, I don't think that's the problem. As I mentioned, test-running for the exact same solution with 8.2.3 is error-free and a pleasure to use. In 9.2, the test explorer shows all of the tests correctly, so R# is clearly able to interpret the hierarchy and attributes (noted above) as I've intended them to be interpreted. This feels very much like a bug or a regression for which JetBrains doesn't have test coverage. I will try to work with them to help them get coverage for this case. [Real-Time StyleCop rules] Additionally, the StyleCop plugin is absolutely essential for my workflow and there still isn't an official release for any of the 9.x versions. ReSharper 9.2 isn't supported at all yet, even in prerelease form. The "official Codeplex page" shows the latest official version as 4.7, released in January of 2012 for ReSharper 8.2 and Visual Studio 2013. One would imagine that VS2015 support is in the works, but it's hard to say. There is a "page for StyleCop in the ReSharper extensions gallery" but that shows a beta4, released in April of 2015, that only works with ReSharper 9.1.x, not 9.2. I tested it with 9.1.x, but it noticeably slowed down the UI. While typing was mostly unaffected, scrolling and switching file-tabs was very laggy. Since StyleCop is essential for so many developers, it's hard to see why the plugin gets so little love from either JetBrains or Microsoft. [GoTo Word] The ""Go To Word" plugin" is not essential but it is an extremely welcome addition, especially with so much more client-side work depending on text-based bindings that aren't always detected by ReSharper. 
In those cases, you can find -- for example -- all the references of a "Knockout template" by searching just as you would for a type or member. Additionally, you benefit from the speed of the ReSharper indexing engine and search UI instead of using the comparatively slow and ugly "Find in Files" support in Visual Studio. Alternatives suggested in the comments to the linked issue above all depend on building yet another index of data (e.g. "Sando Code Search Tool" ). JetBrains has pushed off integrating go-to-word until version 10. Again, not a deal-breaker, but a shame nonetheless, as I'll have to do without it in 9.x until version 10 is released. With so much more client-side development going on in Visual Studio and with dynamic languages and data-binding languages that use name-matching for data-binding, GoToWord is more and more essential. Sure, ReSharper can continue to integrate native support for finding such references, but until that happens, we're stuck with the inferior Find-in-Files dialog or other extensions that increase the memory pressure for larger solutions.

v2.0-beta2: Code generation, IOC and configuration

2015-05-30T23:51:19+02:00

Published by marco on 30. May 2015 23:51:19 ------------------------------------------------------------------------ The summary below describes major new features, items of note and breaking changes. The "full list of issues" is also available for those with access to the Encodo issue tracker. [Highlights] In "beta1" , we read about changes to configuration, the data driver architecture, DDL commands, and security and access control in web applications. In beta2, we made the following additional improvements: * Introduced a new generated-code version that avoids all global references and provides a much-improved API for working with metadata. See below for more details. ("QNO-4821" , "QNO-4820" , "QNO-4819" , "QNO-4818" , "QNO-4817" , "QNO-4816" , "QNO-4815" , "QNO-4814" , "QNO-4813" , "QNO-4812" , "QNO-4507" , "QNO-4506" , "QNO-4117" ) * Continued to update and standardize configuration and execution for applications, as documented in Encodo's configuration library for Quino: "part I" , "part II" and "part III" . ("QNO-4834" , "QNO-4831" , "QNO-4809" , "QNO-4772" , "QNO-4663" , "QNO-4664" , "QNO-4360" , "QNO-4771" , "QNO-4676" , "QNO-4675" , "QNO-4673" , "QNO-3765" ) * Moved all globals and static classes with global dependencies to the application's IOC container. ("QNO-4791" , "QNO-4792" , "QNO-4790" , "QNO-4784" , "QNO-4782" , "QNO-4778" , "QNO-4787" , "QNO-4774" ) * Reduced direct dependencies and properties in IApplication, ICoreApplication and IMetaApplication. ("QNO-4789" , "QNO-4788" , "QNO-4786" , "QNO-4785" , "QNO-4671" , "QNO-4669" , "QNO-4668" , "QNO-4667" , "QNO-4660" ) * Removed references to GlobalContext and ServiceLocator from the Winforms components ("QNO-4832" ) * Standardized naming and namespaces to conform to industry, StyleCop and .NET conventions. ("QNO-4807" , "QNO-4806" ) * Added support for parameterized custom SQL queries with the ICustomCommandBuilder.
This was added by customer request, for applications that formulate queries that are beyond what the Quino ORM is currently capable of mapping. A blog post with more detail on how this works is forthcoming. ("QNO-4802" ) * Further cleanup and consolidation in the data-driver hierarchy. This work was a direct result of Daniel Roth's Bachelor's thesis work in which he integrated NHibernate as an alternative ORM for Quino. ("QNO-4808" ; still to do by RTM: "QNO-4749" ) * Discontinued support for DataContract and DataMember attributes in metadata and generated code. ("QNO-4823" , "QNO-4826" ) [Goodbye, old friends] This release addressed some issues that have been bugging us for a while (almost 3 years in one case). * "QNO-3765" (32 months): After a schema migration caused by a DatabaseException on login, restart the application * "QNO-4117" (27 months): PreferredType registration for models is not always executed * "QNO-4408" (18 months): When access to the remoting server is unauthorized, the web site should respond with an error * "QNO-4506" (14 months): The code generator should generate the persistent object and metadata references in separate classes * "QNO-4507" (14 months): Business objects for modules should not rely on GlobalContext in generated code You will not be missed. [Breaking changes] As we've mentioned before, this release is absolutely merciless in regard to backwards compatibility. Old code is not retained as Obsolete. Instead, a project upgrading to 2.0 will encounter compile errors. That said, if you arm yourself with a bit of time, ReSharper and the release notes (and possibly keep an Encodo employee on speed-dial), the upgrade is not difficult. It consists mainly of letting ReSharper update namespace references for you. In cases where the update is not so straightforward, we've provided release notes.
[V1 generated code support] One of the few things you'll be able to keep (at least for a minor version or two) is the old-style generated code. We made this concession because, while even a large solution can be upgraded from 1.13.0 to 2.0 relatively painlessly in about an hour (we've converted our own internal projects to test), changing the generated-code format is potentially a much larger change. Again, an upgrade to the generated-code format isn't complicated but it might require more than an hour or two's worth of elbow grease to complete. Therefore, not only will you be able to retain your old generated code, but the code generator will also continue to support the old-style code-generation format for further development. Expect the grace period to be relatively short, though. Regardless of whether you elect to keep the old-style generated code, you'll have to do a little bit of extra work just to be able to generate code again. 1. Manually update a couple of generated files, as shown below. 2. Compile the solution 3. Generate code with the Quino tools Before you can regenerate, you'll have to manually update your previously generated code in the main model file, as shown below. [Previous version]

    static MyModel()
    {
      Messages = new InMemoryRecorder();
      Loader = new ModelLoader(() => Instance, () => Messages, new MyModelGenerator());
    }

    public static IMetaModel CreateModel(IExtendedRecorder recorder)
    {
      if (recorder == null) { throw new ArgumentNullException("recorder"); }

      var result = Loader.Generator.CreateModel(recorder);

      result.Configure();

      return result;
    }

    // More code ...

    ///
    protected override void DoConfigure()
    {
      base.DoConfigure();

      ConfigurePreferredTypes();
      ApplyCustomConfiguration();
    }

[Manually updated version]

    static MyModel()
    {
      Messages = new InMemoryRecorder();
      Loader = new ModelLoader(() => Instance, () => Messages, new MyModelGenerator());
    }

    public static IMetaModel CreateModel(IExtendedRecorder recorder)
    {
      if (recorder == null) { throw new ArgumentNullException("recorder"); }

      var result = (MyModel)new MyModelGenerator().CreateModel(
        ServiceLocator.Current.GetInstance(),
        ServiceLocator.Current.GetInstance(),
        recorder
      );

      result.ConfigurePreferredTypes();
      result.ApplyCustomConfiguration();

      return result;
    }

    ///
    protected override void DoConfigure()
    {
      base.DoConfigure();

      ConfigurePreferredTypes();
      ApplyCustomConfiguration();
    }

[Integrate into the model builder] The first time you generate code with Quino 2.0, the application configuration should use:

    ModelLoader = MyModel.Loader;
    this.UseMetaSimpleInjector();
    this.UseModelLoader(MyModel.CreateModel);

After regenerating code, you should use the following for version-2 generated code:

    ModelLoader = MyModel.Loader;
    this.UseMetaSimpleInjector();
    this.UseModelLoader(MyModelExtensions.CreateModelAndMetadata);

...and the following for version-1 generated code:

    ModelLoader = MyModel.Loader;
    this.UseMetaSimpleInjector();
    this.UseModelLoader(MyModel.CreateModel);

[Still to do by RTM] As you can see, we've already done quite a bit of work in beta1 and beta2. We have a few more tasks planned for the feature-complete release candidate for 2.0: Move the schema-migration metadata table to a module. The Quino schema-migration extracts most of the information it needs from the database schema itself. It also stores extra metadata in a special table. This table has been with Quino since before modules were supported (over seven years) and hence was built in a completely custom manner.
Moving this support to a Quino metadata module will remove unnecessary implementation and make the migration process more straightforward. ("QNO-4888" ) Separate collection algorithm from storage/display method in IRecorder and descendants. The recording/logging library has a very good interface but the implementation for the standard recorders has become too complex as we added support for multi-threading, custom disposal and so on. We want to clean this up to make it easier to extend the library with custom loggers. ("QNO-4888" ) Split up Encodo and Quino assemblies based on functionality. There are only a very few dependencies left to untangle ("QNO-4678" , "QNO-4672" , "QNO-4670" ); after that, we'll split up the two main Encodo and Quino assemblies along functional lines. ("QNO-4376" ) Finish integrating building and publishing NuGet and symbol packages into Quino's release process. And, finally, once we have the assemblies split up to our liking, we'll finalize the NuGet packages for the Quino library and leave the direct-assembly-reference days behind us, ready for Visual Studio 2015. ("QNO-4376" ) That's all we've got for now. See you next month for the next (and, hopefully, final) update!

Encodo's configuration library for Quino: part III

2015-05-17T17:45:56+02:00

Published by marco on 17. May 2015 17:45:56 ------------------------------------------------------------------------ [image] This discussion about configuration spans three articles: 1. "part I" discusses the history of the configuration system in Quino as well as a handful of principles we kept in mind while designing the new system 2. "part II" discusses the basic architectural changes and compares an example from the old configuration system to the new. 3. part III takes a look at configuring the "execution order" -- the actions to execute during application startup and shutdown [Introduction] Registering with an IOC is all well and good, but something has to make calls into the IOC to get the ball rolling. Even service applications -- which start up quickly and wait for requests to do most of their work -- have basic operations to execute before declaring themselves ready. Things can get complex when starting up registered components and performing basic checks and non-IOC configuration. * In which order are the components and configuration elements executed? * How do you indicate dependencies? * How can an application replace a piece of the standard startup? * What kind of startup components are there? Part of the complexity of configuration and startup is that developers quickly forget all of the things that they've come to expect from a mature product and start from zero again with each application. Encodo and Quino applications take advantage of prior work to include standard behavior for a lot of common situations. [Configuration Patterns] Some components can be configured once and directly by calling a method like UseMetaTranslations(string filePath), which includes all of the configuration options directly in the composition call. This pattern is perfect for options that are used only by one action or that it wouldn't make sense to override in a subsequent action.
So, for simple actions, an application can just replace the existing action with its own, custom action. In the example above, an application for which translations had already been configured would just call UseMetaTranslations() again in order to override that behavior with its own. "Most applications will replace standard actions or customize standard settings" Some components, however, will want to expose settings that can be customized by actions before they are used to initialize the component. For example, there is an action called SetUpLoggingAction, which configures logging for the application. This action uses IFileLogSettings and IEventLogSettings objects from the IOC during execution to determine which types of logging to configure. An application is, of course, free to replace the entire SetUpLoggingAction action with its own, completely custom behavior. However, an application that just wanted to change the log-file behavior or turn on event-logging could use the Configure() method [1], as shown below.

    application.Configure<IFileLogSettings>(
      s => s.Behavior = LogFileBehavior.MultipleFiles
    );
    application.Configure<IEventLogSettings>(
      s => s.Enabled = true
    );

[Actions] A Quino application object has a list of StartupActions and a list of ShutdownActions. Most standard middleware methods register objects with the IOC and add one or more actions to configure those objects during application startup. Actions have existed for quite a while in Quino. In Quino 2, they have been considerably simplified and streamlined to the point where all but a handful are little more than a functional interface [2]. The list below will give you an idea of the kind of configuration actions we're talking about. * Load configuration data * Process command line * Set up logging * Upgrade settings/configuration (e.g. silent upgrade) * Log a header (e.g. user/date/file locations/etc.; for console apps, this might be mirrored to the console) * Load plugins * Set up standard locations (e.g.
file-system locations) For installed/desktop/mobile applications, there's also: * Initialize UI components * Provide loading feedback * Check/manage multiple running instances * Check software update * Login/authentication Quino applications also have actions to configure metadata: * Configure expression engine * Load metadata * Load metadata-overlays * Validate metadata * Check data-provider connections * Check/migrate schema * Generate default data Application shutdown has a smaller set of vital cleanup chores that: * dispose of connection managers and other open resources * write out to the log, flush it and close it * show final feedback to the user [Anatomy of an Action] The following example [3] is for the 1.x version of the relatively simple ConfigureDisplayLanguageAction.

    public class ConfigureDisplayLanguageAction<TApplication> : ApplicationActionBase<TApplication>
      where TApplication : ICoreApplication
    {
      public ConfigureDisplayLanguageAction()
        : base(CoreActionNames.ConfigureDisplayLanguage)
      { }

      protected override int DoExecute(
        TApplication application, ConfigurationOptions options, int currentResult)
      {
        // Configuration code...
      }
    }

What is wrong with this startup action? The following list illustrates the main points, each of which is addressed in more detail in its own section further below. * The ConfigurationOptions parameter introduces an unnecessary layer of complexity * The generic parameter TApplication complicates declaration, instantiation and extension methods that use the action * The int return type along with the currentResult parameter are a bad way of controlling flow. The same startup action in Quino 2.x has the following changes from the Quino 1.x version above (legend: + additions; - deletions).

    - public class ConfigureDisplayLanguageAction<TApplication> : ApplicationActionBase<TApplication>
    -   where TApplication : ICoreApplication
    + public class ConfigureDisplayLanguageAction : ApplicationActionBase
      {
        public ConfigureDisplayLanguageAction()
          : base(CoreActionNames.ConfigureDisplayLanguage)
        { }

    -   protected override int DoExecute(
    -     TApplication application, ConfigurationOptions options, int currentResult)
    +   public override void Execute()
        {
          // Configuration code...
        }
      }

As you can see, quite a bit of code and declaration text was removed, all without sacrificing any functionality. The final form is quite simple, inheriting from a simple base class that manages the name of the action and overriding a single parameter-less method. It is now much easier to see what an action does and the barrier to entry for customization is much lower.

    public class ConfigureDisplayLanguageAction : ApplicationActionBase
    {
      public ConfigureDisplayLanguageAction()
        : base(CoreActionNames.ConfigureDisplayLanguage)
      { }

      public override void Execute()
      {
        // Configuration code...
      }
    }

In the following sections, we'll take a look at each of the problems indicated above in more detail. [Remove the ConfigurationOptions parameter] These options are a simple enumeration with values like Client, Testing, Service and so on. They were used only by a handful of standard actions. These options made it more difficult to decide how to implement the action for a given task. If two tasks were completely different, then a developer would know to create two separate actions. However, if two tasks were similar, but could be executed differently depending on application type (e.g. testing vs. client), then the developer could still have used two separate actions, but could also have used the configuration options. Multiple ways of doing the exact same thing is all kinds of bad. "Multiple ways of doing the exact same thing is all kinds of bad." Parameters like this conflict conceptually with the idea of using composition to build an application. To keep things simple, Quino applications should be configured exclusively by composition. 
Composing an application with service registrations and startup actions and then passing options to the startup introduced an unneeded level of complexity. Instead, an application now defines a separate action for each set of options. For example, most applications will need to set up the display language to use -- be it for a GUI, a command-line or just to log messages in the correct language. For that, the application can add a ConfigureDisplayLanguageAction to the startup actions or call the standard method UseCore(). Desktop or single-user applications can use the ConfigureGlobalDisplayLanguageAction or call UseGlobalCore() to make sure that global language resources are also configured. [Remove the TApplication generic parameter] The generic parameter to this interface complicates the IApplication interface and causes no end of trouble in MetaApplication, which actually inherits from IApplication for historical reasons. "There is no need to maintain statelessness for a single-use object." Originally, this parameter guaranteed that an action could be stateless. However, each action object is attached to exactly one application (in the IApplication.StartupActions list). So the action that is attached to an application is technically stateless, and a completely different application than the one to which the action is attached could be passed to IApplicationAction.Execute()... which makes no sense whatsoever. Luckily, this never happens, and only the application to which the action is attached is passed to that method. If that's the case, though, why not just create the action with the application as a constructor parameter when the action is added to the StartupActions list? There is no need to maintain statelessness for a single-use object. 
This way, there is no generic parameter for the IApplication interface, all of the extension methods are much simpler and applications are free to create custom actions that work with descendants of IApplication simply by requiring that type in the constructor parameter. [Debugging is important] "A global exception handler is terrible for debugging" The original startup avoided exceptions, preferring an integer return result instead. In release mode, a global exception handler is active and is there to help the application exit more or less smoothly -- e.g. by logging the error, closing resources where possible, and so on. A global exception handler is terrible for debugging, though. For exceptions that are caught, the default behavior of the debugger is to stop where the exception is caught rather than where it is thrown. Instead, you want exceptions raised by your application to stop the debugger where they are thrown. So that's part of the reason why the startup and shutdown in 1.x used return codes rather than exceptions. [Multiple valid code paths] The other reason Quino used result codes is that most non-trivial applications actually have multiple paths through which they could successfully run. Exactly which path the application should take depends on startup conditions, parameters and so on. Some common examples are: * Show command-line help * Migrate an application schema * Import, export or generate data To show command-line help, an application executes its startup actions in order. It reaches the action that checks whether the user requested command-line help. This action processes the request, displays that help and then wants to smoothly exit the application. The "main" path -- perhaps showing the user a desktop application -- should no longer be executed. "Non-trivial applications have multiple valid run profiles." Similarly, the action that checks the database schema determines that the schema in the data provider doesn't match the model. 
In this case, it would like to offer the user (usually a developer) the option to update the schema. Once the schema is updated, though, startup should be restarted from the beginning, trying again to run the main path. [Use exceptions to indicate errors] The Quino 1.x startup addressed the design requirements above with return codes, but this imposed an undue burden on implementors. There was also confusion as to when it was OK to actually throw an exception rather than returning a special code. Instead, the Quino 2.x startup always uses exceptions to indicate errors. There are a few special types of exceptions recognized by the startup code that can indicate whether the application should silently -- and successfully -- exit or whether the startup should be attempted again. [Conclusion] There is of course more detail into which we could go on much of what we discussed in these three articles, but that should suffice for an overview of the Quino configuration library. -------------------------------------------------------------------------------- [1] This pattern is echoed in the latest beta of the ASP.NET libraries, as described in the article "Strongly typed routing for ASP.NET MVC 6 with IApplicationModelConvention" . [2] If C# had them, that is. See "Java 8" for an explanation of what they are. [3] Please note that formatting for the code examples has been adjusted to reduce horizontal space. The formatting does not conform to the "Encodo C# Handbook" .
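
As a rough illustration of the exception-based control flow described under "Use exceptions to indicate errors", the sketch below runs a list of startup actions and reacts to two special exception types: one that means "exit silently and successfully" and one that means "run startup again". All of the names here (ApplicationExitException, ApplicationRestartException, StartupRunner, the maxAttempts limit) are hypothetical and chosen for this example only; they are not Quino's actual types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception types for this sketch; not Quino's actual API.
public class ApplicationExitException : Exception { }     // exit silently and successfully
public class ApplicationRestartException : Exception { }  // attempt startup again from the top

public static class StartupRunner
{
    // Returns true if the "main" path should run, false if the application should exit.
    // Any other exception propagates to the caller as a genuine error.
    public static bool RunStartup(IReadOnlyList<Action> startupActions, int maxAttempts = 3)
    {
        for (var attempt = 0; attempt < maxAttempts; attempt++)
        {
            try
            {
                foreach (var action in startupActions)
                {
                    action();
                }

                return true;
            }
            catch (ApplicationExitException)
            {
                // E.g. command-line help was shown; nothing more to do.
                return false;
            }
            catch (ApplicationRestartException)
            {
                // E.g. the schema was migrated; loop around and try again.
            }
        }

        throw new InvalidOperationException(
            "Startup did not complete after " + maxAttempts + " attempts.");
    }
}

public static class Program
{
    public static void Main()
    {
        var schemaIsCurrent = false;

        var actions = new List<Action>
        {
            () => Console.WriteLine("Set up logging"),
            () =>
            {
                if (!schemaIsCurrent)
                {
                    Console.WriteLine("Migrating schema");
                    schemaIsCurrent = true;
                    throw new ApplicationRestartException();
                }
            }
        };

        var runMainPath = StartupRunner.RunStartup(actions);
        Console.WriteLine(runMainPath ? "Running main path" : "Exiting");
    }
}
```

The first pass migrates the schema and requests a restart; the second pass completes and the main path runs. The point of the design is that individual actions no longer thread a result code through every call; they either finish or throw.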

Encodo's configuration library for Quino: part II

2015-05-17T17:45:20+02:00

Published by marco on 17. May 2015 17:45:20 Updated by marco on 19. Sep 2015 07:13:29 ------------------------------------------------------------------------ In this article, we'll continue the discussion about configuration started in "part I" . We wrapped up that part with the following principles to keep in mind while designing the new system. * Consistency * Opt-in configuration * Inversion of Control * Configuration vs. Execution * Common Usage [Borrowing from ASP.NET vNext] Quino's configuration inconsistencies and issues have been well-known for several versions -- and years -- but the opportunity to rewrite it comes only now with a major-version break. Luckily for us, ASP.NET has been going through a similar struggle and evolution. We were able to model some of our terminology on the patterns from their next version. For example, ASP.NET has moved to a pattern where an application-builder object is passed to user code for configuration. The pattern there is to include middleware (what we call "configuration") by calling extension methods starting with "Use". Quino has had a similar pattern for a while, but the method names varied: "Integrate", "Add", "Include"; these methods have now all been standardized to "Use" to match the prevailing .NET winds. [Begone configuration and feedback] Additionally, Quino used to make a distinction between an application instance and its "configuration" -- the template on which an application is based. No more. Too complicated. This design decision, coupled with the promotion of a platform-specific "Feedback" object to first-level citizen, led to an explosion of generic type parameters. [1] The distinction between configuration (template) and application (instance) has been removed. Instead, there is just an application object to configure. The feedback object is now to be found in the service locator. An application registers a platform-specific feedback to use as it would any other customization. 
[1] The CustomWinformFeedback in the Quino 1.x code at the end of this article provides a glaring example. [Hello service locator] ASP.NET vNext has made the service locator a first-class citizen. In ASP.NET, applications receive an IApplicationBuilder in one magic "Configure" method and receive an IServiceCollection in another magic "ConfigureServices" method. In Quino 2.x, the application is in charge of creating the service container, though Quino provides a method to create and configure a standard one (SimpleInjector). That service locator is passed to the IApplication object and subsequently accessible there. Services can of course be registered directly or by calling pre-packaged Middleware methods. Unlike ASP.NET vNext, Quino 2.x makes no distinction between configuring middleware and including the services required by that middleware. [Begone configuration hierarchy] Quino's configuration library has its roots in a time before we were using an IOC container. The configuration was defined as a hierarchy of configuration classes that modeled the following layers. * A base implementation that makes only the most primitive assumptions about an application. For example, that it has a RunMode ("debug" or "release") or an exit code or that it has a logging mechanism (e.g. IRecorder). * The "Core" layer comprises application components that are very common, but do not depend on Quino's metadata. * And, finally, the "Meta" layer includes configuration for application components that extend the core with metadata-dependent versions as well as specific components required by Quino applications. While these layers are still somewhat evident, the move to middleware packages has blurred the distinction between them. Instead of choosing a concrete configuration base class, an application now calls a handful of "Use" methods to indicate what kind of application to build. There are, of course, still helpful top-level methods -- e.g. 
UseCore() and UseMeta() methods -- that pull in all of the middleware for the standard application types. But, crucially, the application is free to tweak this configuration with more granular calls to register custom configuration in the service locator. This is a flexible and transparent improvement over passing esoteric parameters to monolithic configuration methods, as in the previous version. [An example: Configure a software updater] Just as a simple example, whereas a Quino 1.x standalone application would set ICoreConfiguration.UseSoftwareUpdater to true, a Quino 2.x application calls UseSoftwareUpdater(). Where a Quino 1.x Winform application would inherit from the WinformFeedback in order to return a customized ISoftwareUpdateFeedback, a Quino 2.x application calls UseSoftwareUpdateFeedback(). The software-update feedback class is defined below and is used by both versions. public class CustomSoftwareUpdateFeedback : WinformSoftwareUpdateFeedback { protected override ResponseType DoConfirmUpdate(TApplication application, ...) { ... } } That's where the similarities end, though. The code samples below show the stark difference between the old and new configuration systems. [Quino 1.x] As explained above, Quino 1.x did not allow registration of a sub-feedback like the software-updater. Instead, the application had to inherit from the main feedback and override a method to create the desired sub-feedback. class CustomWinformFeedback : WinformFeedback { public virtual ISoftwareUpdateFeedback GetSoftwareUpdateFeedback<TApplication, TConfiguration, TFeedback>() where TApplication : ICoreApplication where TConfiguration : ICoreConfiguration where TFeedback : ICoreFeedback { return new CustomSoftwareUpdateFeedback(this); } } var configuration = new CustomConfiguration() { UseSoftwareUpdater = true }; WinformDxMetaConfigurationTools.Run( configuration, app => new CustomMainForm(app), new CustomWinformFeedback() ); The method-override in the feedback was hideous and scared off a good many developers. 
Not only that, the pattern was to use a magical, platform-specific WinformDxMetaConfigurationTools.Run method to create an application, run it and dispose it. [Quino 2.x] Software-update feedback-registration in Quino 2.x adheres to the principles outlined at the top of the article: it is consistent and uses common patterns (functionality is included and customized with methods named "Use"), configuration is opt-in, and the IOC container is used throughout (albeit implicitly with these higher-level configuration methods). using (var application = new CustomApplication()) { application.UseMetaWinformDx(); application.UseSoftwareUpdater(); application.UseSoftwareUpdaterFeedback(new CustomSoftwareUpdateFeedback()); application.Run(app => new CustomMainForm(app)); } Additionally, the program has complete control over creation, running and disposal of the application. No more magic and implicit after-the-fact configuration. [What comes after configuration?] In the next and (hopefully) final article, we'll take a look at configuring execution -- the actions to execute during startup and shutdown. Registering objects in a service locator is all well and good, but calls into the service locator have to be made in order for anything to actually happen. Keeping this system flexible and addressing standard application requirements is a challenging but not insurmountable problem. Stay tuned.
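
To make the "Use" pattern a little more concrete, here is a minimal sketch of how such extension methods might register a service and add a startup action. Everything in it is an assumption made for illustration: the Application class, the dictionary standing in for a real IOC container, and the AlwaysConfirmFeedback class are hypothetical, and the method names only mirror the ones mentioned in the article rather than reproducing Quino's implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for this sketch; none of these are Quino's actual types.
public interface ISoftwareUpdateFeedback
{
    bool ConfirmUpdate(string version);
}

public class AlwaysConfirmFeedback : ISoftwareUpdateFeedback
{
    public bool ConfirmUpdate(string version) => true;
}

public class Application
{
    // A dictionary stands in for a real IOC container.
    public Dictionary<Type, object> Services { get; } = new Dictionary<Type, object>();

    public List<Action> StartupActions { get; } = new List<Action>();

    public void Run()
    {
        foreach (var action in StartupActions)
        {
            action();
        }
    }
}

public static class ApplicationExtensions
{
    // Opt-in registration: nothing is configured unless the program asks for it.
    public static Application UseSoftwareUpdateFeedback(
        this Application application, ISoftwareUpdateFeedback feedback)
    {
        application.Services[typeof(ISoftwareUpdateFeedback)] = feedback;
        return application;
    }

    public static Application UseSoftwareUpdater(this Application application)
    {
        application.StartupActions.Add(() =>
        {
            // The feedback is looked up only when the action executes, so it can
            // be registered (or replaced) at any point during configuration.
            var feedback =
                (ISoftwareUpdateFeedback)application.Services[typeof(ISoftwareUpdateFeedback)];

            if (feedback.ConfirmUpdate("2.0"))
            {
                Console.WriteLine("Updating to 2.0");
            }
        });

        return application;
    }
}

public static class Program
{
    public static void Main()
    {
        var application = new Application();
        application.UseSoftwareUpdater();
        application.UseSoftwareUpdateFeedback(new AlwaysConfirmFeedback());
        application.Run();
    }
}
```

Because the startup action resolves the feedback at execution time, the order of the two "Use" calls does not matter here, which is one way to read the article's separation of configuration from execution.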

Encodo’s configuration library for Quino: part I

2015-04-10T15:36:06+02:00

Published by marco on 10. Apr 2015 15:36:06 ------------------------------------------------------------------------ In this article, I'll continue the discussion about configuration improvements mentioned in the "release notes for Quino 2.0-beta1" . With beta2 development underway, I thought I'd share some more of the thought process behind the forthcoming changes. [Software Libraries] "what sort of patterns integrate and customize the functionality of libraries in an application?" An application comprises multiple tasks, only some of which are part of that application's actual domain. For those parts not in the application domain, software developers use libraries. A library captures a pattern or a particular way of doing something, making it available through an abstraction. These simplify and smooth away detail irrelevant to the application. A runtime and its standard libraries provide many such abstractions: for reading/writing files, connecting to networks and so on. Third-party libraries provide others, like logging, IOC, task-scheduling and more. Because Encodo's been writing software for a long time, we have a lot of patterns that we've come up with for our applications. These libraries are split into two main groups: * Encodo.*: extensions to the .NET framework or third-party libraries that don't depend on Quino metadata. * Quino.*: extensions to the .NET framework, third-party libraries or Encodo libraries that depend on Quino metadata. A sort of "meta" library that lies on top of all of this is configuration and startup of applications that use these libraries. That is, what sort of patterns integrate and customize the functionality of libraries in an application? [Balancing K.I.S.S. and D.R.Y] Almost nowhere in an application is the balance between K.I.S.S. and D.R.Y. more difficult to maintain than in configuration and startup. "So if we already know all of that, why does Quino need a new configuration library?" 
As mentioned above, there is a lot of commonality between applications in this area. An application will definitely want to incorporate such common configuration from a library. Updates and improvements to that library will then be applied as for any other. This is a good thing. However, an application will also want to be able to tweak almost any given facet of this shared configuration. That is: just keep the good parts, have those upgraded when they're changed, but apply customization and extend functionality for the application's domain. Easy, right? It is here that a good configuration library will find just the right level of granularity for customization. Too coarse? Then an application ends up throwing out too much common configuration in order to customize a small part of it. Too fine? Then the configuration system is too verbose or complex and the application avoids using it. Instead, a configuration system should establish clear patterns -- optimally, just one -- for how to apply customization. * The builder of the underlying configuration library has to consider the myriad situations that might face a library developer and distill those requirements to a common pattern. * The library developer needs to think about which parts an application might want to customize and think about how to expose them. So if we already know all of that, then why does Quino need a new configuration library? Well... [History of Quino's Configuration Library] It's really easy to make things over-complicated and muddy. It's really easy to end up growing several different kinds of extension systems over the years. Quino ended up with a generics-heavy API that made declaring new configuration components very wordy. The core of Quino is the metadata definition for an application domain. That part has barely changed at all since we first wrote it lo so many years ago. 
We declared it to be our core business -- the part that we are better than others at -- the part we wanted to have under our own control. Our first draft [1] has held up remarkably well. Many of the other components have undergone quite a bit of flux: changes in requirements and the components themselves as well as new development processes and patterns all contributed to change. Over time, various applications had different needs and made adjustments to a different iteration of the configuration library. We moved from supporting only single-threaded, single-user desktop applications to also supporting multi-user, multi-threaded services and web servers. "...we were left with an ugly configuration system that no-one wanted to extend or use -- so yet another would be invented." For all of these different applications, we naturally wanted to maintain the common configuration where possible -- but customizations for new platforms stretched the capabilities of the configuration library. Customization would be made to a new version of that library, but applications that couldn't be upgraded immediately forced backwards-compatibility and thus resulted in several different concurrent ways of configuring a particular facet of an application. In order to keep things in one place, we ended up breaking the interface-separation rule. Dependencies started clumping drastically, but it was OK because nobody was trying to use one thing without the other ten. But it was hard to see what was going on; customization became a black box for all but one or two gurus. On and on it went, until we were left with an ugly configuration system that no-one wanted to extend or use -- so yet another would be invented, ad-hoc. And so it went. [Principles for Quino 2.0 Configuration] With Quino 2.0, we examined the existing system and came up with a list of principles. * Consistency: there should be only one way of customizing settings and components. 
When a developer asks how to change something, the answer should always be the same pattern. If not, there better be a damned good reason (see "Configuration vs. Execution" below). * Opt-in configuration: No more magic methods or base classes that automatically add components and settings in black boxes. Even if the application has to call one or two more methods, it's better to be declarative than clever(tm). * Inversion of Control: Standardize configuration to use an IOC container or service locator wherever possible. Instead of clumping settings in configuration or application objects, create discrete settings and put them in the container. Make dependencies explicit (constructor parameters!) and resolved through the container wherever possible. * Configuration vs. Execution: Be very aware of the difference between the "configuration" phase and the "execution" phase. During configuration, the service locator is used in write-only mode; during execution, the service locator is in read-only mode. Code executed during configuration must rely only on explicit dependency-injection via constructor. * Common Usage: Establish a pattern for calling configuration methods, from least to most specific. E.g. call Quino's base configuration methods before any application-specific customization. Establish patterns for how to configure a single startup action or how to create settings for a larger component that could be further customized in subsequent phases. In the next part, we'll take a look at some concrete examples and documentation for the new patterns. [2] -------------------------------------------------------------------------------- [1] To be fair, it wasn't our first attempt at metadata. In one way or another, we'd been defining metadata structures for generic programming for more years than we'd be comfortable divulging. A h/t of course to Opus Software's Atlas libraries -- 1 and 2 -- where many of us contributed. 
Also, I had experience with cross-platform, generic libraries in C++ stretching all the way back to the late 90s as well as the generalized/meta elements of the "earthli WebCore" . So it was more like the fourth or fifth shot at it, if we're going to be honest -- but at least we got it right. :-) [2] In particular, I'll add more detail about "Common Usage" for those who might feel I've left them hanging a bit in the last bullet point. Sorry 'bout that. The day is only so long. See you next time...
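
The "Configuration vs. Execution" principle above can be sketched as a service locator that accepts registrations only during configuration and answers lookups only during execution. This is a deliberately small illustration under stated assumptions: PhasedServiceLocator, BeginExecution() and LogSettings are hypothetical names invented for the example and do not describe how Quino or any particular IOC container enforces the two phases.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical locator for this sketch: write-only while configuring,
// read-only while executing.
public class PhasedServiceLocator
{
    private readonly Dictionary<Type, Func<object>> _factories =
        new Dictionary<Type, Func<object>>();

    private bool _executing;

    public void Register<T>(Func<T> factory) where T : class
    {
        if (_executing)
        {
            throw new InvalidOperationException(
                "Registration is only allowed during configuration.");
        }

        // A later registration for the same type replaces the earlier one.
        _factories[typeof(T)] = () => factory();
    }

    public void BeginExecution()
    {
        _executing = true;
    }

    public T Resolve<T>() where T : class
    {
        if (!_executing)
        {
            throw new InvalidOperationException(
                "Resolution is only allowed during execution.");
        }

        return (T)_factories[typeof(T)]();
    }
}

// A discrete settings object kept in the container rather than on the application.
public class LogSettings
{
    public bool Enabled { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var locator = new PhasedServiceLocator();

        // Configuration phase: least specific first, then the application's override.
        locator.Register(() => new LogSettings { Enabled = false });
        locator.Register(() => new LogSettings { Enabled = true });

        locator.BeginExecution();

        // Execution phase: read-only access.
        Console.WriteLine(locator.Resolve<LogSettings>().Enabled);
    }
}
```

Registering the standard settings first and the application's version second also reflects the "Common Usage" principle: calls go from least to most specific, and the last registration wins.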

Quino v2.0-beta1: Configuration, services and web

2015-03-28T23:26:29+01:00

Published by marco on 28. Mar 2015 23:26:29 ------------------------------------------------------------------------ The summary below describes major new features, items of note and breaking changes. The "full list of issues" is also available for those with access to the Encodo issue tracker. [Highlights] These are the big ones that forced a major-version change. * Rewrote configuration and application API. ("QNO-4666" , "QNO-4679" , "QNO-4659" , "QNO-4772" , "QNO-4663" , "QNO-4664" , "QNO-4360" ) * Data-driver architecture has been consolidated. All drivers, ADO, Mongo and Remote, now use the same base implementation, logic and optimization. ("QNO-4461" , "QNO-2913" , "QNO-4683" ) * Rewrote schema migration to return a list of DDL commands. ("QNO-4732" , "QNO-4726" , "QNO-4727" ) Also improved schema-migration and mapping to database. ("QNO-4708" , "QNO-4709" , "QNO-4605" , "QNO-4725" , "QNO-4728" , "QNO-4720" ) * Updated the entire security and access-control and authentication API, including adding extensive support for tokens for both ASP.NET MVC and WebAPI. ("QNO-4754" , "QNO-4757" , "QNO-4747" ) * Renamed IMessageRecorder to IRecorder, IMessageStore to IInMemoryRecorder and consolidated IFilteredMessageRecorder to IRecorder. ("QNO-4686" , "QNO-4696" , "QNO-4750" , "QNO-4557" ) Some smaller, but important changes: * Added support for RunInTransaction attribute. Specify the attribute on any IMetaTestFixture to wrap a test or every test in a fixture in a transaction. ("QNO-4682" ) * Shared connection manager is now disposed when an application is disposed. ("QNO-4752" ) [Breaking changes] Oh yeah. You betcha. This is a major release and we've knowingly made a decision not to maintain backwards-compatibility at all costs. Good news, though, the changes to make are relatively straightforward and easy to make if you've got a tool like ReSharper that can update using statements automatically. 
[Namespace changes] As we saw in "part I" and "part II" of the guide to using NDepend, Quino 2.0 has unsnarled quite a few dependency issues. A large number of classes and interfaces have been moved out of the Encodo.Tools namespace. Many have been moved to Encodo.Core but others have been scattered into more appropriate and more specific namespaces. This is one part of the larger changes, easily addressed by using ReSharper to Alt + Enter your way through the compile errors. [Logging changes] Another large change is in renaming IMessageRecorder to IRecorder and IMessageStore to IInMemoryRecorder. Judicious use of search/replace or just a bit of elbow grease will get you through these as well. [Configuration changes] Finally, probably the most far-reaching change is in merging IConfiguration into IApplication. In previous versions of Quino, applications would create a configuration object and pass that to a platform-dependent Quino Run() method. Some configuration was provided by the application and some by the platform-specific method. The example for Quino 1.13.0 below comes from the JobVortex Winform application. var configuration = new JobVortexConfiguration { MainSettings = Settings.Default }; configuration.Add(new JobVortexClientConfigurationPackage()); if (!string.IsNullOrEmpty(Settings.Default.DisplayLanguage)) { configuration.DisplayLanguage = new Language(Settings.Default.DisplayLanguage); } WinformDxMetaConfigurationTools.Run( configuration, app => new MainForm(app) ); In Quino 2.0, the code above has been rewritten as shown below. using (IMetaApplication application = new JobVortexApplication()) { application.MainSettings = Settings.Default; application.UseJobVortexClient(); if (!string.IsNullOrEmpty(Settings.Default.DisplayLanguage)) { application.DisplayLanguage = new Language(Settings.Default.DisplayLanguage); } application.Run(app => new MainForm(app)); } As you can see, instead of creating a configuration, the program creates an application object. 
Instead of using configuration packages mixed with extension methods named "Integrate", "Configure" and so on, the new API uses "Use" everywhere. This should be comfortable for people familiar with the OWIN/Katana configuration pattern. It does, however, mean that the IConfiguration, ICoreConfiguration and IMetaConfiguration don't exist anymore. Instead, use IApplication, ICoreApplication and IMetaApplication. Again, a bit of elbow grease will be needed to get through these compile errors, but there's little to no risk or need for high-level decisions. There are a lot of these prepackaged methods to help you create common kinds of applications: * UseCoreConsole() (a non-Quino application that uses the console) * UseMetaConsole() (a Quino application that uses the console) * UseCoreWinformDx() (a non-Quino application that uses Winform) * UseMetaWinformDx() (a Quino application that uses Winform) * UseReporting() * UseRemotingServer() * Etc. I think you get the idea. Once we have a final release for Quino 2.0, we'll write more about how to use this new pattern. [Looking ahead to 2.0 Final] This is still just an internal beta of the 2.0 final version. More changes are on the way, including but not limited to: * Remove IConfigurationPackage and standardize the configuration API to be named "Use" everywhere ("QNO-4771" ) * GenericObject improvements ("QNO-4761" , "QNO-4762" ) * Change compile location for all projects ("QNO-4756" ) * Move a lot of properties from ICoreApplication and IMetaApplication to configuration objects in the service locator. Also improve use of and configuration of service locator ("QNO-4659" ) * More improvements to the recorders and logging ("QNO-4688" ) * Changes to how ORM objects and metadata are generated. ("QNO-4506" ) * Separate Encodo and Quino assemblies into multiple, smaller assemblies ("QNO-4376" ) See you there!

C# 6 Features and C# 7 Design Notes

2015-03-13T08:59:09+01:00

Published by marco on 13. Mar 2015 08:59:09 ------------------------------------------------------------------------ Microsoft has recently made a lot of their .NET code open-source. Not only is the code for many of the base libraries open-source but also the code for the runtime itself. On top of that, basic .NET development is now much more open to community involvement. In that spirit, even endeavors like designing the features to be included in the next version of C# are online and open to all: "C# Design Meeting Notes for Jan 21, 2015" by Mads Torgersen . [C# 6 Recap] You may be surprised at the version number "7" -- aren't we still waiting for C# 6 to be officially released? Yes, we are. If you'll recall, the primary feature added to C# 5 was support for asynchronous operations through the async/await keywords. Most .NET programmers are only now getting around to using this rather far- and deep-reaching feature, to say nothing of the new C# 6 features that are almost officially available. C# 6 brings the following features with it and can be used in the CTP versions of "Visual Studio 2015" or downloaded from "the Roslyn project" . Some of the more "interesting features of C# 6" are: * Auto-Property Initializers: initialize a property in the declaration rather than in the constructor or on an otherwise unnecessary local variable. * Out Parameter Declaration: An out parameter can now be declared inline with var or a specific type. This avoids the ugly variable declaration outside of a call to a Try* method. * Using Static Class: using can now be used with a static class as well as a namespace. Direct access to methods and properties of a static class should clean up some code considerably. * String Interpolation: Instead of using string.Format() and numbered parameters for formatting, C# 6 allows expressions to be embedded directly in a string (à la PHP): e.g. 
"{Name} logged in at {Time}" * nameof(): This language feature gets the name of the element passed to it; useful for data-binding, logging or anything that refers to variables or properties. * Null-conditional operator: This feature reduces conditional, null-checking cruft by returning null when the target of a call is null. E.g. company.People?[0]?.ContactInfo?.BusinessAddress.Street includes three null-checks [Looking ahead to C# 7] If the idea of using await correctly or wrapping your head around the C# 6 features outlined above doesn't already make your poor head spin, then let's move on to language features that aren't even close to being implemented yet. That said, the "first set of design notes for C# 7" by Mads Torgerson include several interesting ideas as well. * Pattern-matching: C# has been ogling its similarly named colleague F# for a while. One of the major ideas on the table for C# is improving the ability to represent as well as match against various types of pure data, with an emphasis on immutable data. * Metaprogramming: Another focus for C# is reducing boilerplate and capturing common code-generation patterns. They're thinking of delegation of interfaces through composition. Also welcome would be an improvement in the expressiveness of generic constraints. Related User Voice issues: * "Expand Generic Constraints for constructors" * "[p]roper (generic) type ali[a]sing" * Controlling Nullability: Another idea is to be able to declare reference types that can never be null at compile-time (where reasonable -- they do acknowledge that they may end up with a "less ambitious approach"). * Readonly parameters and locals: Being able to express when change is allowed is a powerful form of expressiveness. C# 7 may include the ability to make local variables and parameters readonly. This will help avoid accidental side-effects. * Lambda capture lists: One of the issues with closures is that they currently just close over any referenced variables. 
The compiler just makes this happen and, for the most part, it works as expected. When it doesn't work as expected, it creates subtle bugs that lead to leaks, race conditions and all sorts of hairy situations that are difficult to debug. If you throw in the increased use of and nesting of lambda calls, you end up with subtle bugs buried in frameworks and libraries that are nearly impossible to tease out. The idea of this feature is to allow a lambda to explicitly capture variables and perhaps even indicate whether the capture is read-only. Any additional capture would be flagged by the compiler or tools as an error. * Contracts(!): And, finally, this is the feature I'm most excited about because I've been waiting for integrated language support for Design by Contract for literally decades [1], ever since I read "Object-Oriented Software Construction 2" (OOSC2) for the first time. The design document doesn't say much about it, but mentions that ".NET already has a contract system", the weaknesses of which I've "written about before" . Torgersen writes: "When you think about how much code is currently occupied with arguments and result checking, this certainly seems like an attractive way to reduce code bloat and improve readability." ...and expressiveness and provability! There are a bunch of User Voice issues that I can't encourage you enough to vote for so we can finally get this feature: * "Integrate Code Contracts more deeply in the .NET Framework" * "Integrate Code Contract Keywords into the main .Net Languages" With some or all of these improvements, C# 7 would move much closer to a provable language at compile-time, an improvement over being a safe language at run-time. We can already indicate that instance data or properties are readonly. We can already mark methods as static to prevent the use of this.
We can use ReSharper [NotNull] attributes to (kinda) enforce non-null references without using structs and incurring the debt of value-passing and -copying semantics. I'm already quite happy with C# 5, but if you throw in some or all of the stuff outlined above, I'll be even happier. I'll still have stuff I can think of to increase expressiveness -- covariant return types for polymorphic methods or anchored types or relaxed contravariant type-conformance -- but this next set of features being discussed sounds really, really good. -------------------------------------------------------------------------------- [1] I love the features of the language "Eiffel" , but haven't ever been able to use it for work. The tools and IDE are a bit stuck in the past (very dated on Windows; X11 required on OS X). The language is super-strong, with native support for contracts, anchored types, null-safe programming, contravariant type-conformance, covariant return types and probably much more that C# is slowly but surely including with each version. Unfair? I've been writing about this progress for years (from newest to oldest): * ".NET 4.5.1 and Visual Studio 2013 previews are available" * "A provably safe parallel language extension for C#" * "Waiting for C# 4.0: A casting problem in C# 3.5" * "Microsoft Code Contracts: Not with a Ten-foot Pole" * "Generics and Delegates in C#" * "Wildcard Generics" (this one was actually about Java) * "An analysis of C# language design" * "Static-typing for languages with covariant parameters" * "What is .NET?"
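The C# 6 features recapped at the start of this article are easier to judge in code than in prose. The following is a minimal sketch, assuming the syntax as it shipped in the released C# 6 compiler (where interpolated strings need a `$` prefix). The types `Person`, `Company`, `ContactInfo` and `Address` are made up for illustration and are not taken from the design notes.

```csharp
using System;
using System.Collections.Generic;
using static System.Math; // "using static": call Max() without the class name

// Hypothetical types, for illustration only
public class Address { public string Street { get; set; } }
public class ContactInfo { public Address BusinessAddress { get; set; } }

public class Person
{
    // Auto-property initializer: no constructor needed for the default
    public string Name { get; set; } = "unknown";
    public ContactInfo ContactInfo { get; set; }
}

public class Company
{
    public List<Person> People { get; set; } // deliberately left null below
}

public static class CSharp6Demo
{
    public static string Describe(Person person, DateTime time)
    {
        if (person == null)
        {
            // nameof() keeps the argument name correct when it is renamed
            throw new ArgumentNullException(nameof(person));
        }

        // String interpolation instead of string.Format("{0} ...", ...)
        return $"{person.Name} logged in at {time}";
    }

    public static string StreetOfFirstPerson(Company company)
    {
        // Each ?. or ?[ yields null instead of throwing when its target is null
        return company?.People?[0]?.ContactInfo?.BusinessAddress?.Street;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new Person(), DateTime.Now));
        Console.WriteLine(StreetOfFirstPerson(new Company()) ?? "(no street)");
        Console.WriteLine(Max(6, 7)); // resolved through "using static System.Math"
    }
}
```

Note that the null-conditional operator only guards against null targets: an empty (non-null) `People` list would still throw on `[0]`.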

Quino Data Driver architecture, Part III: The Pipeline

2015-03-07T08:11:14+01:00

Published by marco on 7. Mar 2015 08:11:14 ------------------------------------------------------------------------ In "part I" of this series, we discussed applications, which provide the model and data provider, and sessions, which encapsulate high-level data context. In "part II" , we covered command types and inputs to the data pipeline. In this article, we're going to take a look at the data pipeline itself. 1. "Applications & Sessions" 2. "Command types & inputs" 3. The Data Pipeline 4. Builders & Commands 5. Contexts and Connections 6. Sessions, resources & objects [Overview] [image]The primary goal of the data pipeline is, of course, to correctly execute each query to retrieve data or command to store, delete or refresh data. The diagram to the right shows that the pipeline consists of several data handlers. Some of these refer to data sources, which can be anything: an SQL database or a remote service. [1] The name "pipeline" is only somewhat appropriate: A command can jump out anywhere in the pipeline rather than just at the opposite end. A given command will be processed through the various data handlers until one of them pronounces the command to be "complete". [Command context: recap] In the previous parts, we learned that the input to the pipeline is an IDataCommandContext. To briefly recap, this object has the following properties: * Session: Defines the context within which to execute the command * Handler: Implements an abstraction for reading/writing values and flags to the objects (e.g. SetValue(IMetaProperty)); more detail on this later * Objects: The sequence of objects on which to operate (e.g. for save commands) or to return (e.g. for load commands) * ExecutableQuery: The query to execute when loading or deleting objects * MetaClass: The metadata that describes the root object in this command; more detail on this later as well [Handlers] Where the pipeline metaphor holds up is that the command context will always start at the same end.
The ordering of data handlers is intended to reduce the amount of work and time invested in processing a given command. [Analyzers] The first stage of processing is to quickly analyze the command to handle cases where there is nothing to do. For example, * The command is to save or delete, but the sequence of Objects is empty * The command is to save or reload, but none of the objects in the sequence of Objects has changed * The command is to load data but the query restricts to a null value in the primary key or a foreign key that references a non-nullable, unique key. It is useful to capture these checks in one or more analyzers for the following reasons, 1. All drivers share a common implementation for efficiency checks 2. Optimizations are applied independent of the data sources used 3. Driver code focuses on driver-specifics rather than general optimization [Caches] If the analyzer hasn't categorically handled the command and the command is to load data, the next step is to check caches. For the purposes of this article, there are two things that affect how long data is cached: 1. If the session is in a transacted state, then only immutable data, data that was loaded before the transaction began or data loaded within that transaction can be used. Data loaded/saved by other sessions -- possibly to global caches -- is not visible to a session in a transaction with an isolationLevel stricter than RepeatableRead. 2. The metadata associated with the objects can include configuration settings that control maximum caching lifetime as well as an access-timeout. The default settings are good for general use but can be tweaked for specific object types. Caches currently include the following standard handlers [2]: * The ValueListDataHandler returns immutable data. Since the data is immutable, it can be used independent of the transaction-state of the session in which the command is executed. 
* The SessionCacheDataHandler returns data that's already been loaded or saved in this session, to avoid a call to a possibly high-latency back-end. This data is safe to use within the session with transactions because the cache is rolled back when a transaction is rolled back. [Data sources] If the analyzer and cache haven't handled a command, then we're finally at a point where we can no longer avoid a call to a data source. Data sources can be internal or external. [Databases] The most common type is an external database: * PostgreSql 8.x and higher (PostgreSql 9.x for schema migration) * Sql Server 2008 and higher (w/schema migration) * Mongo (no schema; no migration) * SQlite (not yet released) [Remoting] Another standard data source is the Quino remote application server, which provides a classic interface- and method-based service layer as well as mapping nearly the full power of Quino's generalized querying capabilities to an application server. That is, an application can smoothly switch between a direct connection to a database to using the remoting driver to call into a service layer instead. The remoting driver supports both binary and JSON protocols. Further details are also beyond the scope of this article, but this driver has proven quite useful for scaling smaller client-heavy applications with a single database to thin clients talking to an application server. [Custom/Aspect-based] And finally, there is another way to easily include "mini" data drivers in an application. Any metaclass can include an IDataHandlerAspect that defines its own data driver as well as its capabilities. Most implementations use this technique to bind in immutable lists of data. But this technique has also been used to load/save data from/to external APIs, like REST services. We can take a look at some examples in more detail in another article. The mini data driver created for use with an aspect can relatively easily be converted to a full-fledged data handler. 
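The handler stages described so far (analyzers, then caches, then data sources) can be sketched as a simple chain in which the first handler that completes a command ends the processing. This is only an illustration of the idea and not Quino's implementation: `Command`, `ICommandHandler` and the three handler classes are hypothetical names, and the real IDataCommandContext carries far more state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the command context; not Quino's IDataCommandContext
public class Command
{
    public string Kind;                                // "load", "save", ...
    public List<string> Objects = new List<string>();  // objects to save
    public string Key;                                 // what a load command asks for
    public string Result;
    public string HandledBy;
}

public interface ICommandHandler
{
    // Returns true when the command is complete and should leave the pipeline
    bool TryComplete(Command command);
}

public class EmptySaveAnalyzer : ICommandHandler
{
    public bool TryComplete(Command command)
    {
        // Nothing to save: done without touching a cache or a data source
        return command.Kind == "save" && command.Objects.Count == 0;
    }
}

public class CacheHandler : ICommandHandler
{
    private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();

    public void Store(string key, string value) { _cache[key] = value; }

    public bool TryComplete(Command command)
    {
        return command.Kind == "load"
            && command.Key != null
            && _cache.TryGetValue(command.Key, out command.Result);
    }
}

public class DataSourceHandler : ICommandHandler
{
    public bool TryComplete(Command command)
    {
        // Pretend to call a database; a real driver would do I/O here
        if (command.Kind == "load") { command.Result = "row for " + command.Key; }
        return true;
    }
}

public static class Pipeline
{
    public static Command Execute(Command command, IEnumerable<ICommandHandler> handlers)
    {
        foreach (var handler in handlers)
        {
            if (handler.TryComplete(command))
            {
                command.HandledBy = handler.GetType().Name;
                break; // the command "jumps out" of the pipeline here
            }
        }
        return command;
    }

    public static void Main()
    {
        var cache = new CacheHandler();
        cache.Store("person/1", "cached person 1");
        var handlers = new ICommandHandler[] { new EmptySaveAnalyzer(), cache, new DataSourceHandler() };

        Console.WriteLine(Execute(new Command { Kind = "save" }, handlers).HandledBy);
        Console.WriteLine(Execute(new Command { Kind = "load", Key = "person/1" }, handlers).HandledBy);
        Console.WriteLine(Execute(new Command { Kind = "load", Key = "person/2" }, handlers).HandledBy);
    }
}
```

In this sketch, the empty save never gets past the analyzer, the first load is answered by the cache and only the second load reaches the (pretend) data source.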
[Local evaluation] The last step in a command is what Quino calls "local evaluation". Essentially, if a command cannot be handled entirely within the rest of the data pipeline -- either entirely by an analyzer, one or more caches or the data source for that type of object -- then the local analyzer completes the command. What does this mean? Any orderings or restrictions in a query that cannot be mapped to the data source (e.g. a C# lambda is too complex to map to SQL) are evaluated on the client rather than the server. Therefore, any query that can be formulated in Quino can also be evaluated fully by the data pipeline -- the only question is how much of it can be executed on the server, where it would (usually) be more efficient to do so. Please see the article series that starts with "Optimizing data access for high-latency networks" for specific examples. In this article, we've learned a bit about the ways in which Quino retrieves and stores data using the data pipeline. In the next part, we’ll cover the topic “Builders & Commands”. -------------------------------------------------------------------------------- [1] E.g. Quino uses a ProtoBuf-like protocol to communicate with its standard application server. [2] There is an open issue to "Introduce a global cache for immutable objects or objects used not in a transaction" .
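The split between server-side and local evaluation described in this article can be illustrated with plain LINQ. This is a rough sketch under stated assumptions: the `Restriction` type and its `MappableToServer` flag are made up, the "server" is just an in-memory list, and Quino's actual query mapping is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person { public string FirstName; public string LastName; }

// Hypothetical: a restriction plus a flag saying whether the data source can evaluate it
public class Restriction
{
    public Func<Person, bool> Predicate;
    public bool MappableToServer;
}

public static class LocalEvaluationDemo
{
    // Stand-in for a database table
    private static readonly List<Person> Table = new List<Person>
    {
        new Person { FirstName = "John", LastName = "Doe" },
        new Person { FirstName = "Bob", LastName = "Smith" },
        new Person { FirstName = "Ted", LastName = "Jones" }
    };

    public static List<Person> Load(IEnumerable<Restriction> restrictions)
    {
        var all = restrictions.ToList();

        // "Server": apply only the restrictions that the data source understands
        IEnumerable<Person> rows = Table;
        foreach (var restriction in all.Where(x => x.MappableToServer))
        {
            rows = rows.Where(restriction.Predicate);
        }
        var transferred = rows.ToList(); // only these rows cross the network

        // "Client": evaluate the remaining restrictions locally
        IEnumerable<Person> result = transferred;
        foreach (var restriction in all.Where(x => !x.MappableToServer))
        {
            result = result.Where(restriction.Predicate);
        }
        return result.ToList();
    }

    public static void Main()
    {
        var restrictions = new[]
        {
            // Simple comparison: assumed to be mappable to the data source
            new Restriction { Predicate = p => p.FirstName != "John", MappableToServer = true },
            // Treated here as arbitrary C# logic that must run locally
            new Restriction { Predicate = p => p.LastName.EndsWith("s", StringComparison.Ordinal), MappableToServer = false }
        };

        foreach (var person in Load(restrictions))
        {
            Console.WriteLine(person.FirstName + " " + person.LastName);
        }
    }
}
```

The result is the same no matter where a restriction runs; what changes is how many rows have to be transferred before the local step can filter them.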

Quino Data Driver architecture, Part II: Command types & inputs

2015-02-28T18:36:41+01:00

Published by marco on 28. Feb 2015 18:36:41 ------------------------------------------------------------------------ In "part I" , we discussed applications -- which provide the model and data provider -- and sessions -- which encapsulate high-level data context. In this article, we're going to take a look at the command types & inputs. 1. "Applications & Sessions" 2. Command types & inputs [1] 3. The Data Pipeline 4. Builders & Commands 5. Contexts and Connections 6. Sessions, resources & objects [Overview] [image]Before we can discuss how the pipeline processes a given command, we should discuss what kinds of commands the data driver supports and what kind of inputs the caller can pass to it. As you can well imagine, the data driver can be used for CRUD -- to create, read, update and delete data -- and also to refresh it. In the top-right corner of the diagram to the right, you can see that the only input to the pipeline is an IDataCommandContext. This object comprises the inputs provided by the caller as well as command-specific state used throughout the driver for the duration of the command. [Command types] A caller initiates a command with either a query or an object graph, depending on the type of command. The following commands and inputs are supported: * Load: returns a cursor for the objects that match a query * Count: returns the number of objects that match a query * Save: saves an object graph * Reload: refreshes the data in an object graph * Delete: deletes an object graph or the objects that match a query [Queries] A query includes information about the data to return (or delete). * Metadata: The meta-class represents the type of the root object for the command. For example, a "person" or "company". * Filtering: Filters restrict the objects to return. A filter can address properties of the root object, but also properties of objects related to the root object.
A caller can query for people whose first names start with the letter "m" -- FirstName %~ 'm' [2] -- or the caller can find all people who belong to a company whose name starts with the letter "e" -- Company.Name %~ 'e'. The context for these expressions is naturally the meta-class mentioned above. Additionally, the metadata/model can also include default filters to include. * Ordering: Orderings determine the order in which the data is returned. Orderings are also specified with the expression language, but are usually simpler, like ordering first by LastName and then by FirstName. More complex expressions are supported -- for example, you could use the expression "{LastName}, {FirstName}", which sorts by a formatted string [3] -- but be aware that many data stores have limited support for complex expressions in orderings. Orderings are ignored in a query when used to delete objects. Queries are a pretty big topic and we've only really scratched the surface so far. Quino has its own query language -- QQL -- the specification for which weighs in at over 80 pages, but that's a topic for another day. [Object graphs] An object graph consists of a sequence of root objects and the sub-objects available along relations defined in the metadata. It's actually simpler than it perhaps sounds. Let's use the example above: a person is related to a single company, so the graph of a single person will include the company as well (if the object is loaded and/or assigned). Additionally, the company defines a relation that describes the list of people that belong to it. The person=>company relationship is complementary to the company=>person relationship. We call person=>company a 1-1 relation, while company=>person is a 1-n relation. The following code creates two new companies, assigns them to three people and saves everything at once.

    var encodo = new Company { Name = "Encodo Systems AG" };
    var other = new Company { Name = "Not Encodo" };
    var people = new []
    {
      new Person { FirstName = "John", LastName = "Doe", Company = other },
      new Person { FirstName = "Bob", LastName = "Smith", Company = encodo },
      new Person { FirstName = "Ted", LastName = "Jones", Company = encodo }
    };
    Session.Save(people);

The variable people above is an object graph. The variables encodo and other are also object graphs, but each covers only part of the first one. From people, a caller can look up people[0].Company, which is other. The graph contains cycles, so people[0].Company.People[0].Company is also other. From encodo, the caller can get to other people in the same company, but not to people in the other company. For example, encodo.People[0] gets "Bob Smith" and encodo.People[0].Company.People[1] gets "Ted Jones". As with queries, object graphs are a big topic and are strongly bound to the kind of metadata available in Quino. Another topic for another day. [Determining Inputs] Phew. We're almost to the point where we can create an IDataCommandContext to send into the data pipeline. * We have an IDataSession and "know why we need it" * We know what type of command we want to execute (e.g. "Load") * We have either a query or an object graph With those inputs, Quino has all it needs from the caller. A glance at the top-left corner of the diagram above shows us that Quino will determine an IMetaClass and an IMetaObjectHandler from these inputs and then use them to build the IDataCommandContext. An IQuery has a MetaClass property, so that's easy. With the meta-class and the requested type of object, the data driver checks a list of registered object-handlers and uses the first one that says it supports that type. If the input is an object graph, though, the object-handler is determined first and then the meta-class is obtained from the object-handler using a root object from the graph.
Most objects will inherit from GenericObject which implements the IPersistable interface required by the standard object handler. However, an application is free to implement an object handler for other base classes -- or no base class at all, using reflection to get/set values on POCOs. That is, however, an exercise left up to the reader. At this point, we have all of our inputs and can create the IDataCommandContext. In the next part, we'll take a look at the "Data Pipeline" through which this command context travels. -------------------------------------------------------------------------------- [1] You'll notice, perhaps, that this topic is new to this article. I'm expanding the series as I go along, trying to provide enough information to understand the process while keeping the individual blog entries to a digestible size. [2] "%~" is actually the case-insensitive begins-with operator. You can find out more about comparison operators in the "Quino documentation" . Browse to "Encodo Base Library" and then "Expressions". [3] For more information on how to use Quino's unique take on interpolated strings, see the documentation in the footnote above.
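The "first registered handler that supports the type wins" lookup described in this article, along with the reflection-based handler for POCOs that was left as an exercise, can be sketched roughly as follows. The interface `IObjectHandler` and both handler classes are hypothetical and much simpler than Quino's IMetaObjectHandler; they only show the selection mechanism.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical interface; Quino's IMetaObjectHandler is richer than this
public interface IObjectHandler
{
    bool Supports(Type objectType);
    object GetValue(object obj, string propertyName);
}

// Handles objects that store their values in a dictionary
public class DictionaryObjectHandler : IObjectHandler
{
    public bool Supports(Type objectType)
    {
        return typeof(IDictionary<string, object>).IsAssignableFrom(objectType);
    }

    public object GetValue(object obj, string propertyName)
    {
        return ((IDictionary<string, object>)obj)[propertyName];
    }
}

// Fallback: handles any plain object (POCO) through reflection
public class ReflectionObjectHandler : IObjectHandler
{
    public bool Supports(Type objectType) { return true; }

    public object GetValue(object obj, string propertyName)
    {
        return obj.GetType().GetProperty(propertyName).GetValue(obj, null);
    }
}

public class Person { public string FirstName { get; set; } }

public static class HandlerRegistry
{
    // Order matters: more specific handlers are registered before the fallback
    private static readonly IObjectHandler[] Handlers =
    {
        new DictionaryObjectHandler(),
        new ReflectionObjectHandler()
    };

    public static IObjectHandler Find(Type objectType)
    {
        return Handlers.First(h => h.Supports(objectType));
    }

    public static void Main()
    {
        var person = new Person { FirstName = "Bob" };
        var bag = new Dictionary<string, object> { { "FirstName", "Ted" } };

        Console.WriteLine(Find(person.GetType()).GetValue(person, "FirstName")); // Bob
        Console.WriteLine(Find(bag.GetType()).GetValue(bag, "FirstName"));       // Ted
    }
}
```

Because the lookup stops at the first match, the catch-all reflection handler has to come last; registering it first would hide every other handler.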

Quino Data Driver architecture, Part I: Applications & Sessions

2015-02-21T08:02:16+01:00

Published by marco on 21. Feb 2015 08:02:16 ------------------------------------------------------------------------ One part of Quino that has undergone quite a few changes in the last few versions is the data driver. The data driver is responsible for CRUD: create, read, update and delete operations. One part of this is the ORM -- the object-relational mapper -- that marshals data to and from relational databases like PostgreSql, SQL Server and SQLite. We're going to cover a few topics in this series: 1. Applications & Sessions 2. The Data Pipeline 3. Builders & Commands 4. Contexts and Connections 5. Sessions, resources & objects But first let's take a look at an example to anchor our investigation. [Introduction] An application makes a request to the data driver using commands like Save() to save data and GetObject() or GetList() to get data. How are these high-level commands executed? Quino does an excellent job of shielding applications from the details but it's still very interesting to know how this is achieved. The following code snippet retrieves some data, deletes part of it and saves a new version.

    using (var session = application.CreateSession())
    {
      var people = session.GetList<Person>();
      people.Query.WhereEquals(Person.Fields.FirstName, "john");
      session.Delete(people);
      session.Save(new Person { FirstName = "bob", LastName = "doe" });
    }

In this series, we're going to answer the following questions...and probably many more. * Where does the data come from? * What kind of sources are supported? How? * Is at least some of the data cached? * Can I influence the cache? * What is a session? Why do I need one? * Wait...what is the application? Let's tackle the last two questions first. [Application] The application defines common configuration information. The most important bits for the ORM are as follows: * Model: The model is the central part of any Quino application. The model defines entities, their properties, relationships between entities and so on.
Looking at the example above, the model will include a definition for a Person, which has at least the two properties LastName and FirstName. There is probably an entity named Company as well, with a one-to-many relationship to Person. As you can imagine, Quino uses this information to formulate requests to data stores that contain data in this format. [1] For drivers that support it, Quino also uses this information in order to create that underlying data schema. [2] * DataProvider: The data provider encapsulates all of the logic and components needed to map the model to data sources. This is the part of the process on which this series will concentrate. * ConfigurationData: The configuration data describes which parts of the model are connected to which parts of the data provider. The default is, of course, that the entire model is mapped to a single data source. However, even in that case, the configuration indicates which data source: Sql Server? PostgreSql? A remote application server (2nd tier)? With a high-level API as described above, all of these decisions can be made in the configuration rather than assumed throughout the application. Yes, this means that you can change your Quino application from a two-tier to a three-tier application with a single configuration change. [Sessions] So that's the application. There is a single shared application for a process. But in any non-trivial application -- and any non-desktop application -- we will have multiple data requests running, possibly in different threads of execution. * Each request in a web application is a separate data context. Changes made in one request should not affect any other request. Each request may be authenticated as a different user. * A remote application-server is very similar to a web application. It handles requests from multiple users. Since it's generally the second layer, it will most likely have direct connections to one or more databases. 
In this case, it will probably be in charge of executing business logic, most likely in a database transaction. In that case, we definitely don't want one request using the transaction context from another request. * Even a non-web client-side application may want to execute some logic in the background or in a separate thread. In those cases, we probably want to keep the data used there separate from the data or objects used to render the other parts of the application. That's where sessions come in. The session encapsulates a data context, which contains the following information: * Application: The application will, as described above, tell the session which model and data provider to use. * Current user: For those familiar with ASP.NET, this is very similar to the HttpContext.Current.User but generalized to be available in any Quino application. All data requests over a session are made in the context of this user. * Access control: The access control provides information about the security context of an application. An application generally uses the access control to perform authorization checks. * Cache: Each session also has its own cache. There are global caches, but those are for immutable data. The session's cache is always available, even when using transactions. * ConnectionManager: Many external data sources have transactable/shared state in the form of a connection. As with data, connections can sometimes be shared between sessions and sometimes they can't. The connection manager takes care of knowing all of that for you. If we go back to the original code sample, we now know that creating a new session with CreateSession() creates a new data context, with its own user and its own data cache. Since we didn't pass in any credentials, the session uses the default credentials for the application. [3] All data access made on that session is nicely shielded and protected from any data access made in other sessions (where necessary, of course). 
So now we're no closer to knowing how Quino works with data on our behalf, but we've taken the first step: we know all about one of the main inputs to the data driver, the session. In the next part, we'll cover the topic "The Data Pipeline". -------------------------------------------------------------------------------- [1] The domain model is used for everything in a Quino application -- not just for the ORM and schema migration. We use the model to generate C# code like concrete ORM objects, metadata references (e.g. the Person.Fields.FirstName in the example), or view models, DTOs or even client-side TypeScript definitions. We also use the model to generate user interfaces -- both entire desktop-application interfaces and HTML helpers to build MVC views. [2] See the article "Schema migration in Quino 1.13" for more information on how that works. [3] This is code that you might use in a single-user application. In a server application, you would most likely just use the session that was created for your request by Quino. If an application wants to create a new session, but using the same user as an existing session, it would call:

    var requestCredentials = requestSession.AccessControl.CurrentUser.CreateCredentials();
    using (var session = application.CreateSession(requestCredentials))
    {
      // Work with session
    }
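The idea that one shared application hands out sessions that each carry their own user and their own cache can be sketched in a few lines. The `Application` and `Session` classes below are hypothetical, heavily simplified stand-ins written for this illustration; they are not Quino's classes and leave out access control and connection management entirely.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, heavily simplified types; these are not Quino's classes
public class Application
{
    public string DefaultUser = "default";

    public Session CreateSession(string user = null)
    {
        // Without explicit credentials, fall back to the application's default user
        return new Session(this, user ?? DefaultUser);
    }
}

public class Session : IDisposable
{
    // Each session owns its cache; nothing here is shared between sessions
    private readonly Dictionary<string, object> _cache = new Dictionary<string, object>();

    public Application Application { get; private set; }
    public string CurrentUser { get; private set; }

    public Session(Application application, string user)
    {
        Application = application;
        CurrentUser = user;
    }

    public void Cache(string key, object value) { _cache[key] = value; }
    public bool IsCached(string key) { return _cache.ContainsKey(key); }
    public void Dispose() { _cache.Clear(); }
}

public static class SessionDemo
{
    public static void Main()
    {
        var application = new Application(); // one per process

        using (var first = application.CreateSession())
        using (var second = application.CreateSession("bob"))
        {
            first.Cache("person/1", "John Doe");

            Console.WriteLine(first.IsCached("person/1"));  // True
            Console.WriteLine(second.IsCached("person/1")); // False
            Console.WriteLine(second.CurrentUser);          // bob
        }
    }
}
```

Data cached in the first session is invisible to the second, which is the isolation that a per-request session is meant to provide.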

Are you ready for ReSharper 9? Not for testing, you aren't.

2015-02-11T07:11:51+01:00

Published by marco on 11. Feb 2015 07:11:51 ------------------------------------------------------------------------ We've been using ReSharper at Encodo since version 4. And we regularly use a ton of other software from JetBrains [1] -- so we're big fans. [How to Upgrade R#] As long-time users of ReSharper, we've become accustomed to the following pattern of adoption for new major versions: [EAP] 1. Read about cool new features and improvements on the JetBrains blog 2. Check out the EAP builds page 3. Wait for star ratings to get higher than 2 out of 5 4. Install EAP of next major version 5. Run into issues/problems that make testing EAP more trouble than it's worth 6. Re-install previous major version [RTM] 1. Major version goes RTM 2. Install immediately; new features! Yay! 3. Experience teething problems in x.0 version 4. Go through hope/disappointment cycle for a couple of bug-fix versions (e.g. x.0.1, x.0.2) 5. Install first minor-version release immediately; stability! Yay! This process can take anywhere from several weeks to a couple of months. The reason we do it almost every time is that the newest version of ReSharper almost always has a few killer features. For example, version 8 had initial TypeScript support. Version 9 carries with it a slew of support improvements for Gulp, TypeScript and other web technologies. Unfortunately, if you need to continue to use the test-runner with C#, you're in for a bumpy ride. [History of the Test Runner] Any new major version of ReSharper can be judged by its test runner. The test runner seems to be rewritten from the ground-up in every major version. Until the test runner has settled down, we can't really use that version of ReSharper for C# development. The 6.x and 7.x versions were terrible at the NUnit TestCase and Values attributes. They were so bad that we actually converted tests back from using those attributes. 
While 6.x had trouble reliably compiling and executing those tests, 7.x was better at noticing that something had changed without forcing the user to manually rebuild everything. Unfortunately, this new awareness in 7.x came at a cost: it slowed editing in larger NUnit fixtures down to a crawl, using a tremendous amount of memory and sending VS into a 1.6GB+ memory-churn that made you want to tear your hair out. 8.x fixed all of this and, by 8.2.x was a model of stability and usefulness, getting the hell out of the way and reliably compiling, displaying and running tests. [The 9.x Test Runner] And then along came 9.x, with a whole slew of sexy new features that just had to be installed. I tried the new features and they were good. They were fast. I was looking forward to using the snazzy new editor to create our own formatting template. ReSharper seemed to be using less memory, felt snappier, it was lovely. And then I launched the test runner. And then I uninstalled 9.x and reinstalled 8.x. And then I needed the latest version of DotMemory and was forced to reinstall 9.x. So I tried the test runner again, which inspired this post. [2] So what's not to love about the test runner? It's faster and seems much more asynchronous. However, it gets quite confused about which tests to run, how to handle test cases and how to handle abstract unit-test base classes. Just like 6.x, ReSharper 9.x can't seem to keep track of which assemblies need to be built based on changes made to the code and which test(s) the user would like to run. [image][image] To be fair, we have some abstract base classes in our unit fixtures. For example, we define all ORM query tests in multiple abstract test-fixtures and then create concrete descendants that run those tests for each of our supported databases. If I make a change to a common assembly and run the tests for PostgreSql, then I expect -- at the very least -- that the base assembly and the PostgreSql test assemblies will be rebuilt. 
9.x isn't so good at that yet, forcing you to "Rebuild All" -- something that I'd no longer had to do with 8.2.x. [TestCases and the Unit Test Explorer] It's the same with TestCases: whereas 8.x was able to reliably show changes and to make sure that the latest version was run, 9.x suffers from the same issue that 6.x and 7.x had: sometimes the test is shown as a single node without children and sometimes it's shown with the wrong children. Running these tests results in a spinning cursor that never ends. You have to manually abort the test-run, rebuild all, reload the runner with the newly generated tests from the explorer and try again. This is a gigantic pain in the ass compared to 8.x, which just showed the right tests -- if not in the runner, then at least very reliably in the explorer. [image][image] And the explorer in 9.x! It's a hyperactive, overly sensitive, eager-to-please puppy that reloads, refreshes, expands nodes and scrolls around -- all seemingly with a mind of its own! Tests wink in and out of existence, groups expand seemingly at random, the scrollbar extends and extends and extends to accommodate all of the wonderful things that the unit-test explorer wants you to see -- needs for you to see. Again, it's possible that this is due to our abstract test fixtures, but this is new to 9.x. 8.2.x is perfectly capable of displaying our tests in a far less effusive and frankly hyperactive manner. [One last thing: output-formatting] Even the output formatting has changed in 9.x, expanding all CR/LF pairs from single-spacing to double-spacing. It's not a deal-breaker, but it's annoying: copying text is harder, reading stack traces is harder. How could no one have noticed this in testing? [image] [Conclusion] The install/uninstall process is painless and supports jumping back and forth between versions quite well, so I'll keep trying new versions of 9.x until the test runner is as good as the one in 8.2.x is. For now, I'm back on 8.2.3. Stay tuned.
-------------------------------------------------------------------------------- [1] In no particular order, we have used or are using: * DotMemory * DotTrace * DotPeek * DotCover * TeamCity * PHPStorm * WebStorm * PyCharm [2] Although I was unable to install DotMemory without upgrading to ReSharper 9.x, I was able to uninstall ReSharper 9.x afterwards and re-install ReSharper 8.x.
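For readers who haven't seen the pattern that gives the test runner so much trouble, here is roughly what an abstract NUnit fixture with TestCase attributes and per-database descendants looks like. This is a simplified sketch and not our actual test code: `IPersonStore` and `InMemoryStore` are made-up stand-ins, and both concrete fixtures use the in-memory store where real ones would each connect to their own database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;

// Hypothetical stand-in for a database driver
public interface IPersonStore
{
    IEnumerable<string> FindByPrefix(string prefix);
}

public class InMemoryStore : IPersonStore
{
    private readonly string[] _names = { "John", "Bob", "Ted" };

    public IEnumerable<string> FindByPrefix(string prefix)
    {
        return _names.Where(n => n.StartsWith(prefix, StringComparison.Ordinal));
    }
}

// Abstract fixture: defines the tests once; it is only run through its descendants
public abstract class QueryTestsBase
{
    protected abstract IPersonStore CreateStore();

    // Each TestCase becomes its own child node under the test in the runner
    [TestCase("J", 1)]
    [TestCase("B", 1)]
    [TestCase("X", 0)]
    public void FindByPrefixReturnsMatches(string prefix, int expectedCount)
    {
        var store = CreateStore();

        Assert.That(store.FindByPrefix(prefix).Count(), Is.EqualTo(expectedCount));
    }
}

// One concrete fixture per supported database
[TestFixture]
public class PostgreSqlQueryTests : QueryTestsBase
{
    protected override IPersonStore CreateStore() { return new InMemoryStore(); }
}

[TestFixture]
public class SqlServerQueryTests : QueryTestsBase
{
    protected override IPersonStore CreateStore() { return new InMemoryStore(); }
}
```

A runner has to discover three test cases for each of the two concrete fixtures, even though the test method itself exists only once in the base class, which may be part of what trips up the explorer.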

The Road to Quino 2.0: Maintaining architecture with NDepend (part II)

2014-11-16T00:20:42+01:00

Published by marco on 16. Nov 2014 00:20:42 ------------------------------------------------------------------------ In the "previous article" , I explained how we were using NDepend to clean up dependencies and the architecture of our Quino framework. You have to start somewhere, so I started with the two base assemblies: Quino and Encodo. Encodo only has dependencies on standard .NET assemblies, so let's start with that one. The first step in cleaning up the Encodo assembly is to remove dependencies on the Tools namespace. There seems to be some confusion as to what belongs in the Core namespace versus what belongs in the Tools namespace. There are too many low-level classes and helpers in the Tools namespace. Just as a few examples, I moved the following classes from Tools to Core: * BitTools * ByteTools * StringTools * EnumerableTools The names kind of speak for themselves: these classes clearly belong in a core component and not in a general collection of tools. Now, how did I decide which elements to move to core? NDepend helped me visualize which classes are interdependent. [Direct Dependencies] [image]We see that EnumerableTools depends on StringTools. I'd just moved EnumerableTools to Encodo.Core to reduce dependence on Encodo.Tools. However, since StringTools is still in the Tools namespace, the dependency remains. This is how examining dependencies really helps clarify a design: it's now totally obvious that something as low-level as StringTools belongs in the Encodo.Core namespace and not in the Encodo.Tools namespace, which has everything but the kitchen sink in it. [image]Another example in the same vein is shown to the left, where we examine the dependencies of MessageTools on Encodo.Tools. The diagram explains that the colors correspond to the two dependency directions. 
[1] We would like the Encodo.Messages namespace to be independent of the Encodo.Tools namespace, so we have to consider either (A) removing the references to ExceptionTools and OperatingSystemTools from MessageTools or (B) moving those two dependencies to the Encodo.Core namespace. Choice (A) is unlikely while choice (B) beckons with the same logic as the example above: it's now obvious that tools like ExceptionTools and OperatingSystemTools belong in Encodo.Core rather than the kitchen-sink namespace. [Indirect Dependencies] Once you're done cleaning up your direct dependencies, you still can't just rest on your laurels. Now, you're ready to get started looking at indirect dependencies. These are dependencies that involve more than just two namespaces that use each other directly. NDepend displays these as red bounding blocks. The documentation indicates that these are probably good component boundaries, assuming that the dependencies are architecturally valid. NDepend can only show you information about your code but can't actually make the decisions for you. As we saw above, if you have what appear to be strange or unwanted dependencies, you have to decide how to fix them. In the cases above, it was obvious that certain code was just in the wrong namespace. In other cases, it may simply be that a few bits of code are defined at too low a level. [Improper use of namespaces] For example, our standard practice for components is to put high-level concepts for the component in the Encodo.<ComponentName> namespace. Then we would use those elements from sub-namespaces, like Encodo.<ComponentName>.Utils. However, we also ended up placing types that then used that sub-namespace in the upper-level namespace, like ComponentNameTools.SetUpEnvironment() or something like that. The call to SetUpEnvironment() references the Utils namespace which, in turn, references the root namespace. This is a direct dependency, but if another namespace comes between, we have an indirect dependency.
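To make the distinction between direct and indirect dependencies concrete, here is a minimal sketch in Python. It is not how NDepend works internally; it only classifies pairs of namespaces in a hand-written dependency map. The function names and the sub-namespaces of Encodo.Security in the example data are hypothetical.

```python
def reachable(graph, start):
    """Return every namespace that `start` depends on, directly or transitively."""
    seen = set()
    stack = list(graph[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def classify_cycles(dependencies):
    """Split mutually dependent namespace pairs into direct and indirect ones.

    `dependencies` maps a namespace to the set of namespaces it uses.
    A pair is "direct" when each side references the other itself and
    "indirect" when at least one direction only exists via other namespaces.
    """
    graph = {name: set(targets) for name, targets in dependencies.items()}
    for targets in list(graph.values()):
        for target in targets:
            graph.setdefault(target, set())  # namespaces that are only used

    reach = {name: reachable(graph, name) for name in graph}
    direct, indirect = set(), set()
    names = sorted(graph)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if b in reach[a] and a in reach[b]:
                if b in graph[a] and a in graph[b]:
                    direct.add((a, b))
                else:
                    indirect.add((a, b))
    return direct, indirect


# Hypothetical example data: a root namespace and its Utils namespace use
# each other (direct), while three security namespaces form a longer loop.
example = {
    "Encodo.Component": {"Encodo.Component.Utils"},
    "Encodo.Component.Utils": {"Encodo.Component"},
    "Encodo.Security": {"Encodo.Security.Utils"},
    "Encodo.Security.Utils": {"Encodo.Security.Roles"},
    "Encodo.Security.Roles": {"Encodo.Security"},
}
direct_pairs, indirect_pairs = classify_cycles(example)
```

In this sketch, the root/Utils pair ends up in `direct_pairs`, while every pair of the three security namespaces ends up in `indirect_pairs`, because at least one direction of each dependency only exists by way of the third namespace.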
This happens quite quickly for larger components, like Encodo.Security. The screenshots below show a high-level snapshot of the indirect dependencies in the Encodo assembly and then also a detail view, with all sub-namespaces expanded. The detail view is much larger but shows you much more information about the exact nature of the cycle. When you select a red bounding box, another panel shows the full details and exact nature of the dependency. [image][image][image] [Base Camp Two: base library almost cleaned up] [image][image]After a bunch of work, I've managed to reduce the dependencies to a set of interfaces that are clearly far too dependent on many subsystems. * ICoreConfiguration: references configuration options for optional subsystems like the software updater, the login, the incident reporter and more * ICoreFeedback: references feedbacks for several optional processes, like software-update, logins and more * ICoreApplication: references both the core configuration and feedback The white books for NDepend claim that "[t]echnically speaking, the task of merging the source code of several assemblies into one is a relatively light one that takes just a few hours." However, this assumes that the code has already been properly separated into non-interdependent namespaces that correspond to components. These components can then relatively easily be extracted to separate assemblies. The issue that I have above with the Encodo assembly is a thornier one: the interfaces themselves embody a pattern that is inherently non-decoupling. I need to change how the configuration and feedback work completely in order to decouple this code. [Roadmap for startup and configuration] To that end, I've created an issue in the issue-tracker for Quino, "QNO-4659" [2], titled "Re-examine how the configuration, feedback and application work together". 
The design of these components predates our introduction of a service locator, which means it's much more tightly coupled (as you can see above). After some internal discussion, we've decided to change the design of the Encodo and Quino library support for application-level configuration and state.

Merge the configuration and application
To date, the configuration has contained all of the information necessary to run an application. The configuration was more-or-less stateless and corresponded to the definition of an application, akin to how a class is the underlying stateless definition, while an object is an instance of that definition. In practice, though, we always use a single application per configuration and the distinction is irrelevant, for all practical purposes. This will simplify all referencing code, as we will no longer need to pass around an IApplication.

Move the feedback to the service locator
Instead of treating the feedback like a first-class citizen, with a direct reference on the application, make consumers use the service locator to retrieve an instance. This will remove the remaining generic argument in the definition of IApplication, leaving us with a base interface that is free of generic arguments.

Move specific configuration objects to the service locator
The specific sub-interfaces that introduce dependencies are as follows:
* IncidentReporter
* SoftwareUpdater
* CommandSetManager
* LocationManager
* ConnectionSettingsManager
Any components that currently reference the properties on the ICoreConfiguration can use the service locator to retrieve an instance instead.

Move specific settings to sub-objects
The configuration object is not only dependent on sub-objects, but is also overloaded with individual settings that are only used by very few specific sub-components. These will also be extracted into interfaces and moved into the service locator.
* ILoginConfiguration
* ISoftwareUpdateConfiguration
* IFileLogConfiguration
As you can see, while NDepend is indispensable for finding dependencies, it can -- along with a good refactoring tool (we use ReSharper) -- really only help you clean up the low-hanging fruit. While I started out trying to split assemblies, I've now been side-tracked into cleaning up an older and less well-designed component -- and that's a very good thing. There are some gnarly knots that will feel nearly unsolvable -- but with a good amount of planning, those can be re-designed as well. As I mentioned in the previous article, though, we can do so only because we're making a clean break from the 1.x version of Quino instead of trying to maintain backward compatibility. It's worth it, though: the new design already looks much cleaner and is much more easily explained to new developers. Once that rewrite is finished, the Encodo assembly should be clean and I'll use NDepend to find good places to split up that rather large assembly into sensible sub-assemblies.
--------------------------------------------------------------------------------
[1] There is a setting to turn off showing the green dependencies -- where the row depends on the column -- to make it easier to read the matrix. If you do that, though, you have to make sure to select the class from which you're trying to remove dependencies in the column. For example, if class A and B are interdependent, but A should not rely on B, you should make sure A is showing in the column. You can then examine dependencies on row B -- and then remove them. This works very nicely with both direct and indirect dependencies.
[2] This link is to the Quino issue tracker, which requires a login.

The Road to Quino 2.0: Maintaining architecture with NDepend (part I)

2014-11-12T22:23:25+01:00

Published by marco on 12. Nov 2014 22:23:25 ------------------------------------------------------------------------ [Full disclosure] A while back -- this last spring, I believe -- I downloaded "NDepend" to analyze code dependencies. The trial license is fourteen days; needless to say, I got only one afternoon in before I was distracted by other duties. That was enough, however, to convince me that it was worth the $375 to continue to clean up Quino with NDepend. I decided to wait until I had more time before opening my wallet. In the meantime, however, Patrick Smacchia of NDepend approached me with a free license if I would write about my experiences using NDepend on Encodo's blog. I'm happy to write about how I used the tool and what I think it does and doesn't do. [1] [History & Background] [image] We started working on Quino in the fall of 2007. As you can see from the first commit, the library was super-small and comprised a single assembly. Fast-forward seven years and Version 1.13 of Quino has 66 projects/assemblies. That's a lot of code and it was long past time to take a more structured look at how we'd managed the architecture over the years. I'd already opened a branch in our Quino repository called feature/dependencyChanges and checked in some changes at the beginning of July. Those changes had come as a result of the first time I used NDepend to find a bunch of code that was in the wrong namespace or the wrong assembly, architecturally speaking. [Sidebar: Keeping branches mergeable] I wasn't able to continue using this branch, though, for the following reasons. 1. I got the hang of NDepend relatively quickly and got a bit carried away. Using ReSharper, I was able to make a lot of changes and fixes in a relatively short amount of time. 2. I checked in all of these changes in one giant commit. 3. I did this all five months ago. 4.
There have been hundreds of subsequent commits on the master branch, many of which also include global refactoring and cleanup. 5. As a result of the above, merging master into feature/dependencyChanges is more trouble than it's worth. [Release Methodology] With each Quino change and release, we try our hardest to balance backward-compatibility with maintainability and effort. If it's easy enough to keep old functionality under an old name or interface, we do so. We mark members and types obsolete so that users are given a warning in the compiler but can continue using the old code until they have time to upgrade. These obsolete members are removed in the next major or minor upgrade. Developers who have not removed their references to obsolete members will at this point be greeted with compiler errors. In all cases, the user can find out from Quino's release notes how they should fix a warning or error. The type of high-level changes that we have planned necessitate that we make a major version-upgrade, to Quino 2.0. In this version, we have decided not to maintain backward-compatibility in the code with Obsolete attributes. However, where we do make a breaking change -- either by moving code to new or different assemblies or by changing namespaces -- we want to maintain a usable change-log for customers who make the upgrade. The giant commit that I'd made previously was not a good start. [Take Two] Since some of these changes will be quite drastic departures in structure, we want to come up with a plan to make merging from the master branch to the feature/dependencyChanges branch safer, quicker and all-around easier. 
I want to include many of the changes I started in the feature/dependencyChanges branch, but would like to re-apply those changes in the following manner:
* Split the giant commit into several individual commits, each of which encapsulates exactly one change; smaller commits are much easier to merge
* Document breaking changes in the release notes for Quino 2.0
* Blog about/document the process of using NDepend to clean up Quino [2]
So, now that I'm ready to start cleaning up Quino for version 2.0, I'll re-apply the changes from the giant commit, but in smaller commits. At the same time, I'll use NDepend to find the architectural breaks that caused me to make those changes in the first place and document a bit of that process. [Setting up the NDepend Project] I created an NDepend project and attached it to my solution. Version 1.13 of Quino has 66 projects/assemblies, of which I chose the following "core" assemblies to analyze. [image] I can change this list at any time. There are a few ways to add assemblies. Unfortunately, the option to "Add Assemblies from VS Solution(s)" showed only 28 of the 66 projects in the Quino solution. I was unable to determine the logic that led to the other 38 projects not being shown. When I did select the projects I wanted from the list, the assemblies were loaded from unexpected directories. For example, it added a bunch of core assemblies (e.g. Encodo.Imaging) from the src/tools/Quino.CodeGenerator/bin/ folder rather than the src/libraries/Encodo.Imaging/bin folder. I ended up just taking the references I was offered by NDepend and added references to Encodo and Quino, which it had not offered to add. [3] [The NDepend Dashboard] Let's take a look at the initial NDepend Dashboard. [image] There's a lot of detail here. The initial impression of NDepend can be a bit overwhelming, I suppose, but you have to remember the sheer amount of interdependent data that it shows.
As you can see on the dashboard, not only are there a ton of metrics, but those metrics are also tracked on a time-axis. I only have one measurement so far. Any assemblies not included in the NDepend project are considered to be "third-party" assemblies, so you can see external dependencies differently than internal ones. There is also support for importing test-coverage data, but I haven't tried that yet. There are a ton of measurements in there, some of which interest me and others that don't, or with which I disagree. For example, over 1400 warnings are in the Quino* assemblies because the base namespace -- Encodo.Quino -- doesn't correspond to a file-system folder -- it expects Encodo/Quino, but we use just Quino. Another 200 warnings are to "Avoid public methods not publicly visible", which generally means that we've declared public methods on internal, protected or private classes. The blog post "Internal or public?" by Eric Lippert covered this adequately and came to the same conclusion that we have: you actually should make methods public if they are public within their scope. There are some "White Books" about namespace and assembly dependencies that are worth reading if you're going to get serious about dependencies. There's a tip in there about turning off "Copy Local" on referenced assemblies to drastically increase compilation speed that we're going to look into. [Dependencies and cycles] One of the white books explains how to use namespaces for components and how to "levelize" an architecture. This means that the dependency graph is acyclic -- that there are no dependency cycles and that there are certainly no direct interdependencies. The initial graphs from the Encodo and Quino libraries show that we have our work cut out for us. [image][image][image] The first matrix shows the high-level view of dependencies in the Encodo and Quino namespaces. Click the second and third to see some initial dependency issues within the Encodo and Quino assemblies. 
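As a rough illustration of what "levelizing" means (a sketch of the general idea, not NDepend's algorithm), the following Python snippet assigns each namespace a level that is one higher than the highest level among its dependencies, and gives up as soon as a cycle makes that impossible. The `levelize` function and the dependency map are assumptions made for this example; only the namespace names themselves come from the Encodo assembly.

```python
def levelize(dependencies):
    """Assign a level to each namespace in an acyclic dependency graph.

    `dependencies` maps a namespace to the set of namespaces it uses.
    Namespaces without dependencies get level 0; every other namespace sits
    one level above its highest dependency. Raises ValueError if some
    namespaces can never be placed, which means there is a cycle.
    """
    graph = {name: set(targets) for name, targets in dependencies.items()}
    for targets in list(graph.values()):
        for target in targets:
            graph.setdefault(target, set())  # namespaces that are only used

    levels = {}
    remaining = dict(graph)
    while remaining:
        # A namespace is ready once all of its dependencies have a level.
        ready = [
            name
            for name, targets in remaining.items()
            if all(target in levels for target in targets)
        ]
        if not ready:
            raise ValueError(
                "cyclic (or cycle-dependent) namespaces: "
                + ", ".join(sorted(remaining))
            )
        for name in ready:
            levels[name] = 1 + max((levels[t] for t in graph[name]), default=-1)
        for name in ready:
            del remaining[name]
    return levels


# Hypothetical dependencies between namespaces mentioned in these articles.
example_levels = levelize({
    "Encodo.Core": set(),
    "Encodo.Tools": {"Encodo.Core"},
    "Encodo.Messages": {"Encodo.Core"},
    "Encodo.Security": {"Encodo.Tools", "Encodo.Messages"},
})
```

With this made-up input, Encodo.Core lands on level 0, Encodo.Tools and Encodo.Messages on level 1 and Encodo.Security on level 2. Adding a reference from Encodo.Core back to Encodo.Tools would make the function raise instead, which is the situation the matrices above are showing.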
That's as far as I've gotten so far. Tune in next time for a look at how we managed to fix some of these dependency issues and how we use NDepend to track improvement over time. -------------------------------------------------------------------------------- [1] I believe that takes care of full disclosure. [2] This is something I'd neglected to do before. Documenting this process will help me set up a development process where we use NDepend more regularly -- more than every seven years -- and don't have to clean up so much code at once. [3] After having read the recommendations in the NDepend White Book -- "Partitioning code base through .NET assemblies and Visual Studio projects" (PDF) -- it's clear why this happens: NDepend recommends using a single /bin folder for all projects in a solution.

Optimizing compilation and execution for dynamic languages

2014-11-12T22:14:18+01:00

Published by marco on 12. Nov 2014 22:14:18 Updated by marco on 29. May 2025 23:49:58 ------------------------------------------------------------------------ The long and very technical article "Introducing the WebKit FTL JIT" by Filip Pizlo provides a fascinating and in-depth look at how a modern execution engine optimizes code for a highly dynamic language like JavaScript. To make a long story short: the compiler(s) and execution engine optimize by profiling and analyzing code and lowering it to runtimes of ever decreasing abstraction to run as the least dynamic version possible. [A brief history lesson] What does it mean to "lower" code? A programming language has a given level of abstraction and expressiveness. Generally, the more expressive it is, the more abstracted it is from code that can actually be run in hardware. A compiler transforms or translates from one language to another. When people started programming machines, they used punch cards. Punch cards did not require any compilation because the programmer was directly speaking the language that the computer understood. The first layer of abstraction that most of us -- older programmers -- encountered was assembly language, or assembler. Assembly code still has a more-or-less one-to-one correspondence between instructions and machine-language codes but there is a bit of abstraction in that there are identifiers and op-codes that are more human-readable. Procedural languages introduced more types of statements like loops and conditions. At the same time, the syntax was abstracted further from assembler and machine code to make it easier to express more complex concepts in a more understandable manner. At this point, the assembler (which assembled instructions into machine op-codes) became a compiler, which "compiled" a set of instructions from the more abstract language.
A compiler made decisions about how to translate these concepts, and could make optimization decisions based on registers, volatility and other settings. In time, we'd graduated to functional, statically typed and/or object-oriented languages, with much higher levels of abstraction and much more sophisticated compilers. Generally, a compiler still used assembly language as an intermediate format, which some may remember from their days working with C++ or Pascal compilers and debuggers. In fact, .NET languages are also compiled to IL -- the "Intermediate Language" -- which corresponds to the instruction set that the .NET runtime exposes. The runtime compiles IL to the underlying machine code for its processor, usually in a process called JIT -- Just-In-Time compilation. That is, in .NET, you start with C#, for example, which the compiler transforms to IL, which is, in turn, transformed to assembler and then machine code by the .NET runtime. [Static vs. Dynamic compilation] A compiler and execution engine for a statically typed language can make assumptions about the types of variables. The set of possible types is known in advance and types can be checked very quickly in cases where it's even necessary. That is, the statically typed nature of the language allows the compiler to reason about a given program without making assumptions. Certain features of a program can be proven to be true. A runtime for a statically typed language can often avoid type checks entirely. It benefits from a significant performance boost without sacrificing any runtime safety. The main characteristic of a dynamic language like JavaScript is that variables do not have a fixed type. Generated code must be ready for any eventuality and must be capable of highly dynamic dispatch. The generated code is highly virtualized. Such a runtime will execute much more slowly than a comparable statically compiled program. 
[Profile-driven compilation] Enter the profile-driven compiler, introduced in WebKit. From the article, "The only a priori assumption about web content that our engine makes is that past execution frequency of individual functions is a good predictor for those functions’ future execution frequency." Here a "function" corresponds to a particular overload of a set of instructions called with parameters with a specific set of types. That is, suppose a JavaScript function is declared with one parameter and is called once with a string and 100 times with an integer. WebKit considers this to be two function overloads and will (possibly) elect to optimize the second one because it is called much more frequently. The first overload will still handle all possible types, including strings. In this way, all code paths remain possible, but the most heavily used paths are more highly optimized. "All of the performance is from the DFG’s type inference and LLVM’s low-level optimizing power. [...] Profile-driven compilation implies that we might invoke an optimizing compiler while the function is running and we may want to transfer the function’s execution into optimized code in the middle of a loop; to our knowledge the FTL is the first compiler to do on-stack-replacement for hot-loop transfer into LLVM-compiled code." Depending on the level of optimization, the code contains the following broad sections:
* Original: code that corresponds to instructions written by the author
* Profiling: code to analyze which types actually appear in a given code path
* Switching: code to determine when a function has been executed often enough to warrant further optimization
* Bailout: code to abandon an optimization level if any of the assumptions made at that level no longer apply
[image] While WebKit has included some form of profile-driven compilation for quite some time, the upcoming version is the first to carry the same optimization to LLVM-generated machine code.
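The four sections listed above can be modeled in a few lines of Python. This is only a toy: it has nothing to do with how WebKit actually generates or swaps machine code, and the `ProfiledFunction` class, its attributes and the hand-written "fast path" are all made up for illustration. It simply shows a generic version that handles every type, a counter that profiles argument types, a threshold that switches to a specialized version and a guard that bails out when the type assumption fails.

```python
class ProfiledFunction:
    """Toy model of profile-driven specialization with a bailout guard."""

    def __init__(self, generic, fast_paths, threshold=100):
        self.generic = generic        # original: works for every argument type
        self.fast_paths = fast_paths  # hypothetical specialized versions by type
        self.threshold = threshold    # calls needed before we specialize
        self.counts = {}              # profiling data: argument type -> calls
        self.hot_type = None          # type assumed by the optimized version
        self.bailouts = 0

    def __call__(self, arg):
        arg_type = type(arg)
        if self.hot_type is not None:
            if arg_type is self.hot_type:
                # Guard passed: the assumption holds, so run the fast path.
                return self.fast_paths[arg_type](arg)
            # Bailout: the assumption failed, so fall back to the generic code.
            self.bailouts += 1
            return self.generic(arg)

        # Profiling: record which types actually show up.
        self.counts[arg_type] = self.counts.get(arg_type, 0) + 1
        # Switching: specialize once a type has been seen often enough.
        if self.counts[arg_type] >= self.threshold and arg_type in self.fast_paths:
            self.hot_type = arg_type
        return self.generic(arg)


# The generic version doubles strings and numbers alike; the fast path
# assumes an integer and uses a shift instead.
double = ProfiledFunction(
    generic=lambda value: value + value,
    fast_paths={int: lambda value: value << 1},
    threshold=100,
)

first = double("ab")                          # one call with a string
results = [double(n) for n in range(100)]     # 100 calls with integers
after_switch = double(21)                     # now served by the fast path
after_bailout = double("x")                   # guard fails, generic code runs
```

Mirroring the example from the text, one call with a string never triggers any optimization, whereas the hundred integer calls do. A later string argument still produces the right answer; it just takes the slower generic route and is counted as a bailout.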
I recommend reading the "whole article" if you're interested in more detail, such as how they avoided LLVM compiler performance issues and how they integrated this all with the garbage collector. It's really amazing how much of what we take for granted is treated as "hot-swappable" by the WebKit JS runtime. The article is quite well-written and includes diagrams of the process and underlying systems.

Quino v1.13.0: Schema migration, remoting, services and web apps

2014-10-31T10:39:12+01:00

Published by marco on 31. Oct 2014 10:39:12 Updated by marco on 1. Nov 2014 08:44:53 ------------------------------------------------------------------------ The summary below describes major new features, items of note and breaking changes in "Quino" . The "full list of issues" is also available for those with access to the Encodo issue tracker. [Highlights] [Data & Schema] * Applications can now choose a base class for generated ORM objects. ("QNO-3107" ) * Data driver: made various bug fixes and improvements. ("QNO-4538" , "QNO-4554" , "QNO-4551" ) * Improved schema migration for fields and constraint-violation messages under SQL Server. ("QNO-4490" , "QNO-4111" , "QNO-4582" ) * Improved the console migrator and APIs, input queries and exit-code handling for console applications in general. Also changed the default command from [R]efresh to [S]how differences. ("QNO-4649" , "QNO-4646" , "QNO-4648" , "QNO-4650" , "QNO-4651" , "QNO-4615" , "QNO-4645" , "QNO-4616" ) [Remoting & services] * Fixed several issues in the remoting driver (client and server parts). ("QNO-4626" , "QNO-4630" , "QNO-4631" , "QNO-4388" , "QNO-4575" , "QNO-4629" , "QNO-4573" , "QNO-4625" , "QNO-4633" , "QNO-4575" ) * Added a runner for Windows services that allows debugging and shows logging output for applications that use the CoreServiceBase, which extends the standard .NET ServiceBase. The runner is available in the Encodo.Service assembly. [Web] * Improved default and custom authentication in web applications and the remoting server. Also improved support for authorization for remote-method routes as well as MVC controllers. * Improved configuration, error-handling and stability of the HttpApplicationBase, especially in situations where the application fails to start. Error-page handling was also improved, including handling for Windows Event Log errors. * Improved appearance of the web-based schema migrator. 
("QNO-4559" , "QNO-4561" , "QNO-4563" , "QNO-4548" , "QNO-4487" , "QNO-4486" , "QNO-4488" ) [Winform]
* Data-provider statistics: improved the WinForm-based statistics form. ("QNO-4231" , "QNO-4545" , "QNO-4546" )
* Standard forms: updated the standard WinForm about window and splash screen to use Encodo web-site CI. ("QNO-4529" )
[System & Tools]
* Removed the dependency on the SmartWeakEvents library from Quino; the Quino and Encodo assemblies no longer have any external dependencies. ("QNO-4645" )
* Image handling: the Encodo and Quino libraries now use the Windows Imaging Components instead of System.Drawing. ("QNO-4536" )
* Windows 8.1: fixed culture-handling for en-US and de-CH that is broken in Windows 8.1. ("QNO-4534" , "QNO-4553" )
* R# annotations have been added to the Encodo assembly. Tell R# to look in the Encodo.Core namespace to use annotations like NotNull and CanBeNull with parameters and results. ("QNO-4508" )
* Generated code now includes a property that returns a ValueListObject for each enum property in the metadata. For example, for a property named State of type CoreState, the generated code includes the former properties for the enum and the foreign key backing it, but now also includes the ValueListObject property. This new property provides easy access to the captions.

    public CoreState State { ... }
    public ValueListObject StateObject { ... }
    public int? CoreStateIdId { ... }

* Improved the nant fix command in the default build tools to fix the assembly name as well. The build tools are available in bin/tools/build. See the src/demo/Demo.build file for an example on how to use the Nant build scripts for your own solutions. To change the company name used by the "fix" command, for example, add the following task override:
* Fixed the implementation of IntegrateRemotableMethods to avoid a race condition with remote methods. Also improved the stability of the DataProvider statistics.
("QNO-4599" ) [Breaking changes]
* The generic argument TRight has been removed from all classes and interfaces in the Encodo.Security.* namespace. In order to fix this code, just remove the int generic parameter wherever it was used. For example, where before you used the interface IUser<int>, you should now use IUser ("QNO-4576" ).
* The overridable method MetaAccessControl.DoGetAccessChecker() has been renamed to MetaAccessControl.GetAccessChecker().
* Renamed the Encodo.ServiceLocator.SimpleInjector.dll to Encodo.Services.SimpleInjector.dll and Quino.ServiceLocator.SimpleInjector.dll to Quino.Services.SimpleInjector.dll. Also changed the namespace Quino.ServiceLocator to Encodo.Quino.Services.
* Renamed HttpApplicationBase.StartMetaApplication() to CreateAndStartUpApplication().
* Classes may no longer contain properties with names that conflict with properties of IMetaReadable (e.g. Deleted, Persisted). The model will no longer validate until the properties have been renamed and the code regenerated. ("QNO-4185" )
* Removed StandardIntRights with integer constants and replaced it with StandardRights with string constants.
* The IAccessControl.Check() and other related methods now accept a sequence of string rights rather than integers.
* IMetaConfiguration.ConfigureSession() has been deprecated. The method will still be called but may have undesired side-effects, depending on why it was overridden. The common use was to initialize a custom AccessControl for the session. Continuing to do so may overwrite the current user set by the default Winform startup. Instead, applications should use the IDataSessionAccessControlFactory and IDataSessionFactory to customize the data sessions and access controls returned for an application. In order to attach an access control, take care to only set your custom access control for sessions that correspond to your application model.
[1]

    internal class JobVortexDataSessionAccessControlFactory : DataSessionAccessControlFactory
    {
      public override IAccessControl CreateAccessControl(IDataSession session)
      {
        // Only attach the custom access control to sessions for the application model
        if (session.Application.Model.MetaId == JobVortexModelGenerator.ModelGuid)
        {
          return new JobVortexAccessControl(session);
        }

        return base.CreateAccessControl(session);
      }
    }

* The default length of the UserModule.User.PasswordHash property has been increased from 100 characters to 1000. This default is more sensible for implementations that use much longer validation tokens instead of passwords. To avoid the schema migration, revert the change by setting the property's maximum size back to 100 in your application model, after importing the security module, as shown below.

    var securityModule = Builder.Include();
    securityModule.Elements.Classes.User.Properties[
      Encodo.Quino.Models.Security.Classes.SecurityUser.Fields.PasswordHash
    ].MaximumSize = 100;

* Application.Credentials has been removed. To fix references, retrieve the IUserCredentialsManager from the service locator. For example, the following code returns the current user:

    Session.Application.Configuration.ServiceLocator.GetInstance<IUserCredentialsManager>().Current

  If your application uses the WinformMetaConfigurationTools.IntegrateWinformPackages() or WinformDxMetaConfigurationTools.IntegrateWinformDxPackages(), then the IDataSession.AccessControl.CurrentUser will continue to be set correctly. If not, add the SingleUserApplicationConfigurationPackage to your application's configuration. The user in the remoting server will be set up correctly. Add the WebApplicationConfigurationPackage to web applications in order to ensure that the current user is set up correctly for each request. ("QNO-4596" )
* IDataSession.SyncRoot has been removed as it was no longer needed or used in Quino itself. Sessions should not be used in multiple threads, so there is no need for a SyncRoot. Code that uses it should be reworked to use a separate session for each thread.
* Moved IMetaApplication.CreateSession() to an extension method. Add Encodo.Quino.App to the using clauses to fix any compile errors.
* Removed IMetaApplication.DataProvider; use IMetaApplication.Configuration.DataProvider instead. ("QNO-4604" )
* The schema migration API has been completely overhauled. ISchemaChange and its descendants have been completely removed. ISchemaAction is no longer part of the external API, although it is still used internally. The ISchemaChangeFactory has been renamed to ISchemaCommandFactory and, instead of creating change objects, which are then applied directly, returns ISchemaCommand objects, which can be either executed or transformed in some other way. IMigrateToolkit.GetActionFor() has also been replaced with CreateCommands(), which mirrors the rest of the API by returning a sequence of commands to address a given ISchemaDifference. This release still has some commands that cannot be transformed to pure SQL, but the goal is to be able to generate pure SQL for a schema migration. ("QNO-993" , "QNO-4579" , "QNO-4581" , "4588" , "4591" , "QNO-4594" )
* IMigrateSchemaAspect.Apply() has been removed. All aspects will have to be updated to implement GetCommands() instead, or to use one of the available base classes, like UpdateDataAspectBase or ConvertPropertyTypeSchemaAspect. The following example shows how to use the UpdateDataAspectBase to customize migration for a renamed property.

    internal class ArchivedMigrationAspect : UpdateDataAspectBase
    {
      public ArchivedMigrationAspect()
        : base("ArchivedMigrationAspect", DifferenceType.RenamedProperty, ChangePhase.Instead)
      {
      }

      protected override void UpdateData(IMigrateContext context, ISchemaDifference difference)
      {
        using (var session = context.CreateSession(difference))
        {
          session.ChangeAndSaveAll(UpdateArchivedFlag);
        }
      }

      private void UpdateArchivedFlag(Project obj)
      {
        obj.Archived = !obj.Archived;
      }
    }

  The base aspects should cover most needs; if your functionality is completely customized, you can easily pass your previous implementation of Apply() to a DelegateSchemaCommand and return that from your implementation of GetCommands(). See the implementation of UpdateDataAspectBase for more examples. ("QNO-4580" )
* MetaObjectIdEqualityComparer can no longer be constructed directly. Instead, use MetaObjectIdEqualityComparer.Default.
* Renamed MetaClipboardControlDx.UpdateColorSkinaware() to MetaClipboardControlDx.UpdateSkinAwareColors().
* IMetaUnique.LogicalParent has been moved to IMetaBase. Since IMetaUnique inherits from IMetaBase, it is unlikely that code is affected (unless reflection or some other direct means was used to reference the property). ("QNO-4586" )
* IUntypedMessage has been removed; the AssociatedObject formerly found there has been moved to IMessage.
* ITypedMessage.AssociatedObject has been renamed to ITypedMessage.TypedAssociatedObject. ("QNO-4647" )
* Renamed MetaObjectTools to MetaReadableTools.
* Redefined the protected methods GenericObject.GetAsGuid() and GenericObject.GetAsGuidDefault() as extension methods in MetaWritableTools.
* IMetaFeedback.CreateGlobalContext() has been removed. Instead the IGlobalContext is created using the service locator.
-------------------------------------------------------------------------------- [1] The schema migration creates a metadata model for your model -- meta-metadata -- and uses the Quino ORM to load data when importing a model from a database. If you aren't careful, as shown in the code example, then you'll attach your custom access control to the sessions created for the schema migration's data-access, which will more than likely fail when it tries to load user data from a table that does not exist in that model.

Schema migration in Quino 1.13

2014-10-24T12:26:25+02:00

Published by marco on 24. Oct 2014 12:26:25 ------------------------------------------------------------------------ Quino is a metadata framework for .NET. It provides a means of defining an application-domain model in the form of metadata objects. Quino also provides many components and support libraries that work with that metadata to automate many services and functions. A few examples are an ORM, schema migration, automatically generated user interfaces and reporting tools. [The schema-migration tool] The component we're going to discuss is the automated schema-migration for databases. A question that recently came up with a customer was: what do all of the options mean in the console-based schema migrator? Here's the menu you'll see in the console migrator:

    Advanced Options
    (1) Show migration plan
    (2) Show significant mappings
    (3) Show significant mappings with unique ids
    (4) Show all mappings
    (5) Show all mappings with unique ids

    Main Options
    (R) Refresh status
    (M) Migrate database
    (C) Cancel

The brief summary is: * The only action that actually makes changes is (M). * Option (1) is the only advanced option you will likely ever use; use this to show the changes that were detected. The other advanced options are more for debugging the migration recommendation if something looks wrong. In order to understand what that means, we need to know what the migrator actually does. [image] 1. Provide the application model as input 2. Import a model from the database as input 3. Generate a mapping between the two models 4. Create a migration plan to update the database to reflect the application model 5. Generate a list of commands that can be applied to the database to enact the plan 6. Execute the commands against the database. The initial database-import and final command-generation parts of migration are very database-specific. The determination of differences is also partially database-specific (e.g.
some databases do not allow certain features so there is no point in detecting a difference that cannot ever be repaired). The rest of the migration logic is database-independent. [Gathering data for migration] The migrator works with two models: the target model and a source model. * The target model is provided as part of the application and is usually loaded from a core assembly. * The source model is imported from the database schema by the "import handler". Given these two models, the "mapping builder" creates a mapping. In the current implementation of Quino, there is no support for allowing the user to adjust the mapping before a migration plan is built from it. However, it would be possible to allow the user to verify and possibly adjust the mapping. Experience has shown that this is not necessary. Anytime we thought we needed to adjust the mapping, the problem was instead that the target model had been configured incorrectly. That is, each time we had an unexpected mapping, it led us directly to a misconfiguration in the model. The options to show mappings are used to debug exactly such situations. Before we talk about mapping, though, we should talk about what we mean by "unique ids". Every schema-relevant bit of metadata in a Quino model is associated with a unique id, in the form of a Guid and called a "MetaId" in Quino. [Importing a model from a database] What happens when the import handler generates a model? The importer runs in two phases: 1. Extract the "raw model" from the database schema 2. 
Enhance the "raw model" with data pulled from the application-specific Quino metadata table in the same database. A Quino application named "demo" will have the following schema: * All modeled tables are named "demo__*" * The metadata table is named "demometadata__elementdescription". The migrator reads the following information into a "raw model": * Tables => MetaClasses * Fields/Columns => MetaProperties * Indexes => MetaIndexes * Foreign Keys => MetaPaths. If there is no further information in the database, then the mapper will have to use the raw model only. If, however, the database was created or is being maintained by Quino, then there is additional information stored in the metadata table mentioned above. The importer enhances the raw model with this information, in order to improve mapping and difference-recognition. The metadata table contains all of the Quino modeling information that is not reflected in a standard database schema (e.g. the aforementioned MetaId). The data available in this table is currently: * SchemaIdentifier: the identifier used in the raw model/database schema * Identifier: the actual identifier of the metadata element that corresponds to the element identified by the SchemaIdentifier * MetaId: the unique id for the metadata element * ObjectType: the type of metadata (one of: class, property, index, path, model) * ParentMetaId: the unique id of the metadata element that is the logical parent of this one; only allowed to be empty for elements with ObjectType equal to "model" * Data: Custom data associated with the element, as key/value pairs * DataVersion: Identifies the format type of the "Data" element (1.0.0.0 corresponds to CSV). For each schema element in the raw model, the importer does the following: 1. Looks up the data associated with that SchemaIdentifier and ObjectType (e.g. "punchclock__person" and "class") 2. Updates the "Identifier" 3. Sets the "MetaId" 4. 
Loads the key/value pairs from the Data field and applies that data to the element. [Generating a mapping] At this point, the imported model is ready and we can create a mapping between it and the application model. The imported model is called the source model while the application model is called the target model because we're migrating the "source" to match the "target". We generate a mapping by iterating the target model: 1. Find the corresponding schema element in the source model using MetaIds [1] 2. If an element can be found, create a mapping for those two elements 3. If no element can be found, create a mapping with only the target element. This will cause the element to be created in the database. 4. For all elements in the source model that have no corresponding element in the target model, create a mapping with only the source element. This will cause the element to be dropped from the database. [Creating a migration plan] The important decisions have already been made in the mapping phase. At this point, the migrator just generates a migration plan, which is a list of differences that must be addressed in order to update the database to match the target model. * If the mapping has both a source and a target element, create a difference if the element has been renamed and create a difference if the element has been altered (e.g. a property has a different type or is now nullable; an index has new properties or is no longer unique; etc.). * If the mapping has only a source, generate a difference that the element is unneeded and should be dropped. * If the mapping has only a target, generate a difference that the element is missing and should be created. This is the plan that is shown to the user by the various migration tools available with Quino. [2] [The advanced console-migrator commands] At this point, we can now understand what the advanced console-migrator commands mean. 
Significant mappings are those mappings which correspond to a difference in the database (create, drop, rename or alter). * Show significant mappings: show significant mappings to see more detail about the names on each side * Show significant mappings with unique ids: same as above, but also include the MetaIds for each side. Use this to debug when you suspect that you might have copy/pasted a MetaId incorrectly or inadvertently moved one. * Show all mappings: Same detail level as the first option, but with all mappings, including those that are 100% matches * Show all mappings with unique ids: same as above, but with MetaIds As already stated, the advanced options are really there to help a developer see why the migrator might be suggesting a change that doesn't correspond to expectations. [Generating commands for the plan] At this point, the migrator displays the list of differences that will be addressed by the migrator if the user chooses to proceed. What happens when the user proceeds? The migrator generates database-specific commands that, when executed against the database, will modify the schema of the database. [3] Commands are executed for different phases of the migration process. The phases are occasionally extended but currently comprise the following. * Initialize: perform any required initialization before doing anything to the schema * DropConstraintsAndIndexes: drop all affected constraints and indexes that would otherwise prevent the desired modification of the elements involved in the migration. * AddUpdateOrRenameSchema: Create new tables, columns and indexes and perform any necessary renaming. The changes in this phase are non-destructive * UpdateData: Perform any necessary data updates before any schema elements are removed. This is usually the phase in which custom application code is executed, to copy existing data from other tables and fields before they are dropped in the next phase. 
For example, if there is a new required 1--1 relation, the custom code might analyze the other data in the rows of that table to determine which value that row should have for the new foreign key. * DropSchema: Drop any unneeded schema elements and data * CreatePrimaryKeys: Create primary keys required by the schema. This includes both new primary keys as well as reestablishing primary keys that were temporarily dropped in the second phase. * CreateConstraintsAndIndexes: Create constraints and indexes required by the schema. This includes both new constraints and indexes as well as reestablishing constraints and indexes that were temporarily dropped in the second phase. * UpdateMetadata: Update the Quino-specific metadata table for the affected elements. [Executing the migration plan] The commands are then executed and the results logged. Afterward, the schema is imported again, to verify that there are no differences between the target model and the database. In some (always rarer) cases, there will still be differences, in which case, you can execute the new migration plan to repair those differences as well. In development, this works remarkably well and often, without further intervention. [Fixing failed migrations] In some cases, there is data in the database that, while compatible with the current database schema, is incompatible with the updated schema. This usually happens when a new property or constraint is introduced. For example, a new required property is added that does not have a default value or a new unique index is added which existing data violates. In these cases, there are two things that can be done: * Either the database data is cleaned up in a way that makes it compatible with the target schema [4] * Or the developer must add custom logic to the metadata elements involved. This usually means that the developer must set a default value on a property. In rarer cases, the developer must attach logic to the affected metadata (e.g. 
the property or index that is causing the issue) that runs during schema migration to create new data or copy it from elsewhere in order to ensure that constraints are satisfied when they are reestablished at the end of the migration. In general, it's strongly advised to perform a migration against a replica of the true target database (e.g. a production database) in order to guarantee that all potential data situations have been anticipated with custom code, if necessary. [Quino Migration versus EF Migrations] It's important to point out that Quino's schema migration is considerably different from that employed by EF (which it picked up from Active Record Migrations, commonly used with Ruby on Rails). In those systems, the developer generates specific migrations to move from one model version to another. There is a clear notion of upgrading versus downgrading. Quino only recognizes migrating from an arbitrary model to another arbitrary model. This makes Quino's migration exceedingly friendly when moving between development branches, unlike EF, whose "deficiencies in this area have been documented". -------------------------------------------------------------------------------- [1] The default is to use only MetaIds. There is a mode in which identifiers are used as a fallback but it is used only for tools that import schemas that were not generated by Quino. Again, if the Quino metadata table hasn't been damaged, this strict form of mapping will work extremely well. [2] The Winform and Web user interfaces for Quino both include built-in feedback for interacting with the schema migration. There are also two standalone tools to migrate database schemas: a Winform application and a Windows console application. [3] The form of these commands is currently a mix of SQL and custom C# code. A future feature of the migration will be to have all commands available as SQL text so that the commands, instead of being executed directly, could be saved as a file and reviewed and executed by DBAs instead of letting the tool do it. We're not quite there yet, but proceeding nicely. [4] This is generally what a developer does with his or her local database. The data contained therein can usually be more or less re-generated. If there is a conflict during migration, a developer can determine whether custom code is necessary or can sometimes determine that the data situation that causes the problem isn't something that comes up in production anyway and just remove the offending elements or data until the schema migration succeeds.
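The mapping and planning steps described in this article can be sketched roughly as follows. This is only an illustration of the idea, written in modern C#: the types Element and Mapping and the methods BuildMappings and BuildPlan are hypothetical names invented for this sketch and are not part of Quino's API. The plan is reduced to plain strings to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a schema-relevant metadata element
// (class, property, index or path). Not a Quino type.
public sealed class Element
{
    public Element(Guid metaId, string identifier, string definition)
    {
        MetaId = metaId;
        Identifier = identifier;
        Definition = definition;
    }

    public Guid MetaId { get; }
    public string Identifier { get; }

    // Simplified description (e.g. a type name), compared to detect alterations
    public string Definition { get; }
}

// Pairs a source (database) element with a target (application) element.
// Either side may be null.
public sealed class Mapping
{
    public Mapping(Element source, Element target)
    {
        Source = source;
        Target = target;
    }

    public Element Source { get; }
    public Element Target { get; }
}

public static class MigrationSketch
{
    public static List<Mapping> BuildMappings(IEnumerable<Element> source, IEnumerable<Element> target)
    {
        var sourceById = source.ToDictionary(e => e.MetaId);
        var mappings = new List<Mapping>();

        foreach (var t in target)
        {
            // Match strictly on MetaId; "s" stays null if there is no match,
            // which yields a target-only mapping
            sourceById.TryGetValue(t.MetaId, out var s);
            mappings.Add(new Mapping(s, t));
            sourceById.Remove(t.MetaId);
        }

        // Whatever is left in the source has no counterpart in the target
        mappings.AddRange(sourceById.Values.Select(s => new Mapping(s, null)));

        return mappings;
    }

    public static IEnumerable<string> BuildPlan(IEnumerable<Mapping> mappings)
    {
        foreach (var m in mappings)
        {
            if (m.Source == null)
            {
                yield return "create " + m.Target.Identifier;
            }
            else if (m.Target == null)
            {
                yield return "drop " + m.Source.Identifier;
            }
            else
            {
                if (m.Source.Identifier != m.Target.Identifier)
                {
                    yield return "rename " + m.Source.Identifier + " to " + m.Target.Identifier;
                }

                if (m.Source.Definition != m.Target.Definition)
                {
                    yield return "alter " + m.Target.Identifier;
                }
            }
        }
    }
}
```

Because the match is made on the MetaId rather than on the name, a renamed element shows up as a single "rename" difference instead of a "drop" followed by a "create", which is what preserves the data in that element.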

EF Migrations troubleshooting

2014-10-20T15:23:19+02:00

Published by marco on 20. Oct 2014 15:23:19 ------------------------------------------------------------------------ "The version of EF Migrations discussed in this article is 5.0.20627. The version of Quino is less relevant: the features discussed have been supported for years. For those in a hurry, there is a tl;dr near the end of the article." We use Microsoft Entity Framework (EF) Migrations in one of our projects where we are unable to use Quino. We were initially happy to be able to automate database-schema changes. After using it for a while, we have decidedly mixed feelings. As developers of our own schema migration for the Quino ORM, we're always on the lookout for new and better ideas to improve our own product. If we can't use Quino, we try to optimize our development process in each project to cause as little pain as possible. [EF Migrations and branches] We ran into problems in integrating EF Migrations into a development process that uses feature branches. As long as a developer stays on a given branch, there are no problems and EF functions relatively smoothly. [1] However, if a developer switches to a different branch -- with different migrations -- EF Migrations is decidedly less helpful. It is, in fact, quite cryptic and blocks progress until you figure out what's going on. Assume the following not-uncommon situation: * The project is created in the master branch * The project has an initial migration BASE * Developers A and B migrate their databases to BASE * Developer A starts branch feature/A and includes migration A in her database * Developer B starts branch feature/B and includes migration B in his database. We now have the situation in which two branches have different code and each has its own database schema. Switching from one branch to another with Git quickly and easily addresses the code differences. The database is, unfortunately, a different story. 
Let's assume that developer A switches to branch feature/B to continue working there. The natural thing for A to do is to call "update-database" from the Package Manager Console [2]. This yields the following message -- all-too-familiar to EF Migrations developers. [image] "Unable to update database to match the current model because there are pending changes and automatic migration is disabled. Either write the pending changes to a code-based migration or enable automatic migration. [...]" This situation happens regularly when working with multiple branches. It's even possible to screw up a commit within a single branch, as illustrated in the following real-world example. * Add two fields to an existing class * Generate a migration with code that adds two fields * Migrate the database * Realize that you don't need one of the two fields * Remove the C# code from the migration for that field * Tests run green * Commit everything and push it As far as you're concerned, you committed a single field to the model. When your co-worker runs that migration, it will be applied, but EF Migrations immediately thereafter complains that there are pending model changes to make. How can that be? [Out-of-sync migrations != outdated database] Just to focus, we're actually trying to get real work done, not necessarily debug EF Migrations. We want to answer the following questions: 1. Why is EF Migrations having a problem updating the schema? 2. How do I quickly and reliably update my database to use the current schema if EF Migrations refuses to do it? The underlying reason why EF Migrations has problems is that it does not actually know what the schema of the database is. It doesn't read the schema from the database itself, but relies instead on a copy of the EF model that it stored in the database when it last performed a successful migration. That copy of the model is also stored in the resource file generated for the migration. 
EF Migrations does this so that the migration includes information about which changes it needs to apply and about the model to which the change can be applied. If the model stored in the database does not match the model stored with the migration that you're trying to apply, EF Migrations will not update the database. This is probably for the best, but leads us to the second question above: what do we have to do to get the database updated? [Generate a migration for those "pending changes"] The answer has already been hinted at above: we need to fix the model stored in the database for the last migration. Let's take a look at the situation above in which your colleague downloaded what you thought was a clean commit. From the Package Manager Console, run add-migration foo to scaffold a migration for the so-called "pending changes" that EF Migrations detected. That's interesting: EF Migrations thinks that your colleague should generate a migration to drop the column that you'd only temporarily added but never checked in. That is, the column isn't in his database, it's not in your database, but EF Migrations is convinced that it was once in the model and must be dropped. How does EF Migrations even know about a column that you added to your own database but that you removed from the code before committing? What dark magic is this? The answer is probably obvious: you did check in the change. The part that you can easily remove (the C# code) is only half of the migration. As mentioned above, the other part is a binary chunk stored in the resource file associated with each migration. These BLOBs are stored in the __MigrationHistory table in the database. [image][image] [How to fix this problem and get back to work] Here's the tl;dr: generate a "fake" migration, remove all of the C# code that would apply changes to the database (shown below) and execute update-database from the Package Manager Console. [image] This may look like it does exactly nothing. 
What actually happens is that it includes the current state of the EF model in the binary data for the last migration applied to the database (because you just applied it). Once you've applied the migration, delete the files and remove them from the project. This migration was only generated to fix your local database; do not commit it. [Everything's cool now, right?] Applying the fix above doesn't mean that you won't get database errors. If your database schema does not actually match the application model, EF will crash when it assumes fields or tables are available which do not exist in your database. Sometimes, the only way to really clean up a damaged database -- especially if you don't have the code for the migrations that were applied there [3] -- is to remove the misapplied migrations from your database, undo all of the changes to the schema (manually, of course) and then generate a new migration that starts from a known good schema. [Conclusions and comparison to Quino] The obvious answer to the complaint "it hurts when I do this" is "stop doing that". We would dearly love to avoid these EF Migrations-related issues but developing without any schema-migration support is even more unthinkable. We'd have to create upgrade scripts manually or would have to maintain scripts to generate a working development database and this in each branch. When branches are merged, the database-upgrade scripts have to be merged and tested as well. This would be a significant addition to our development process, has maintainability and quality issues and would probably slow us down even more. And we're certainly not going to stop developing with branches, either. We were hoping to avoid all of this pain by using EF Migrations. That EF Migrations makes us think of going back to manual schema migration is proof that it's not nearly as elegant a solution as our own Quino schema migration, which never gave us these problems. 
Quino actually reads the schema in the database and compares that model directly against the current application model. The schema migrator generates a custom list of differences that map from the current schema to the desired schema and applies them. There is user intervention but it's hardly ever really required. This is an absolute godsend during development where we can freely switch between branches without any hassle. [4] Quino doesn't recognize "upgrade" versus "downgrade" but instead applies "changes". This paradigm has proven to be a much better fit for our agile, multi-branch style of development and lets us focus on our actual work rather than fighting with tools and libraries. -------------------------------------------------------------------------------- [1] EF Migrations as we use it is tightly bound to SQL Server. Just as one example, the inability of SQL Server to resolve cyclic cascade dependencies is in no way shielded by EF Migrations. Though the drawback originates in SQL Server, EF Migrations simply propagates it to the developer, even though it purports to provide an abstraction layer. Quino, on the other hand, does the heavy lifting of managing triggers to circumvent this limitation. [2] As an aside, this is a spectacularly misleading name for a program feature. It should just be called "Console". [3] I haven't ever been able to use the Down method that is generated with each migration, but perhaps someone with more experience could explain how to properly apply such a thing. If that doesn't work, the method outlined above is your only fallback. [4] That is, the aforementioned database-script maintenance or having only very discrete schema-update points or maintaining a database per branch and switching with configuration files or using database backups or any other schemes that end up distracting you from working.
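For reference, the "fake" migration described in the tl;dr might look something like the following once the generated statements have been removed. This is a minimal sketch, assuming the EntityFramework package is referenced; the class name SyncLocalModel is hypothetical (the scaffolded class is named after whatever you pass to add-migration), and the scaffolded designer and resource files that accompany the class are not shown.

```csharp
using System.Data.Entity.Migrations;

// Hypothetical name; the scaffolder names the class after the migration.
public partial class SyncLocalModel : DbMigration
{
    public override void Up()
    {
        // Intentionally empty: no schema changes are applied.
        // The scaffolded statements were deleted from here.
    }

    public override void Down()
    {
        // Intentionally empty as well.
    }
}
```

The only effect of applying such a migration is the one described above: the model snapshot that accompanies it is recorded in the local database. It is not meant to be committed.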

An introduction to PowerShell

2014-09-14T16:09:45+02:00

Published by marco on 14. Sep 2014 16:09:45 ------------------------------------------------------------------------ On Wednesday, August 27th, Tymon gave the rest of Encodo [1] a great introduction to PowerShell. I've attached "the presentation" but a lot of the content was in demonstrations on the command-line. 1. "Download the presentation" 2. Unzip to a local folder 3. Open index.html in a modern web browser (Chrome/Opera/Firefox work the best; IE has some rendering issues) We learned a few very interesting things: * PowerShell is pre-installed on every modern Windows computer * You can PowerShell to other machines (almost like ssh!) * Windows developers should definitely learn how to use PowerShell. * Unix administrators who have to work on Windows machines should definitely learn how to use PowerShell. The underlying functionality of the operating system is much more discoverable via command line, get-command and get-member than the GUI. * You should definitely install ConEmu * When running ConEmu, make sure that you start a PowerShell session rather than the default Cmd session. * If you're writing scripts, you should definitely install and use the ISE, which is an IDE for PowerShell scripts with debugging, code-completion, lists of available commands and much better copy/paste than the standard console. * The "PowerShell Language Reference v3" is a very useful and compact reference for beginners and even for more advanced users [ConEmu Setup] The easiest way to integrate PowerShell into your workflow is to make it eminently accessible by installing ConEmu. ConEmu is a Windows command-line with a tabbed interface and offers a tremendous number of power-user settings and features. You can tweak it to your heart's content. [image]I set mine up to look like the one that Tymon had in the demonstrations (shown on my desktop to the right). 1. Download "ConEmu" ; I installed version 140814, the most recent version marked as "beta". 
There is no official release yet, but the software is quite mature. 2. Install it and run it. I didn't allow the Win + Num support because I know that I'd never use it. YMMV and you can always change your choice from the preferences. 3. Show the settings to customize your installation. There are a ton of settings, so I listed the ones I changed below. 4. [image]Set the window size to something a bit larger than the standard settings, especially if you have a larger monitor. I use 120 x 40. 5. [image]Choose the color scheme you want to use. I'm using the standard PowerShell colors but a lot of popular, darker schemes are also available (e.g. Monokai). 6. [image]Check out the hotkeys and set them up accordingly. The only key I plan on using is the one to show ConEmu. On the Swiss-German keyboard, it's Ctrl + ¨. 7. [image]The default console is not transparent, but there are those of us who enjoy a bit of transparency. Again, YMMV. I turned it on and left the slider at the default setting. 8. [image]And, finally, you can turn on Quake-style console mode to make it drop down from the top of your primary monitor instead of appearing in a free-floating window. -------------------------------------------------------------------------------- [1] and one former Encodo employee -- hey "Stephan" !

Should you return null or an empty list?

2014-08-08T10:20:08+02:00

Published by marco on 8. Aug 2014 10:20:08 ------------------------------------------------------------------------ I've seen a bunch of articles addressing this topic of late, so I've decided to weigh in. The reason we frown on returning null from a method that returns a list or sequence is that we want to be able to freely use these sequences or lists in a functional manner. It seems to me that the proponents of "no nulls" are generally those who have a functional language at their disposal and the antagonists do not. In functional languages, we almost always return sequences instead of lists or arrays. In C# and other functional languages, we want to be able to do this:

    var names = GetOpenItems()
      .Where(i => i.OverdueByTwoWeeks)
      .SelectMany(i => i.GetHistoricalAssignees()
        .Select(a => new { a.FirstName, a.LastName })
      );

    foreach (var name in names)
    {
      Console.WriteLine("{1}, {0}", name.FirstName, name.LastName);
    }

If either GetHistoricalAssignees() or GetOpenItems() might return null, then we'd have to write the code above as follows instead:

    var openItems = GetOpenItems();
    if (openItems != null)
    {
      // "Assignee" is an assumed name for the element type
      var names = openItems
        .Where(i => i.OverdueByTwoWeeks)
        .SelectMany(i => (i.GetHistoricalAssignees() ?? Enumerable.Empty<Assignee>())
          .Select(a => new { a.FirstName, a.LastName })
        );

      foreach (var name in names)
      {
        Console.WriteLine("{1}, {0}", name.FirstName, name.LastName);
      }
    }

This seems like exactly the kind of code we'd like to avoid writing, if possible. It's also the kind of code that calling clients are unlikely to write, which will lead to crashes with NullReferenceExceptions. As we'll see below, there are people that seem to think that's perfectly OK. I am not one of those people, but I digress. The post, "Is it Really Better to 'Return an Empty List Instead of null'? / Part 1" by Christian Neumanns serves as a good example of an article that seems to be providing information but is just trying to distract people into accepting it as a source of genuine information. 
He introduces his topic with the following vagueness. "If we read through related questions in Stackoverflow and other forums, we can see that not all people agree. There are many different, sometimes truly opposite opinions. For example, the top rated answer in the Stackoverflow question Should functions return null or an empty object? (related to objects in general, not specifically to lists) tells us exactly the opposite: 'Returning null is usually the best idea ...'" The statement "we can see that not all people agree" is a tautology. I would split the people into groups of those whose opinions we should care about and everyone else. The statement "There are many different, sometimes truly opposite opinions" is also tautological, given the nature of the matter under discussion -- namely, a question that can only be answered as "yes" or "no". Such questions generally result in two camps with diametrically opposed opinions. As the extremely long-winded pair of articles points out: sometimes you can't be sure of what an external API will return. That's correct. You have to protect against those with ugly, defensive code. But don't use that as an excuse to produce even more methods that may return null. Otherwise, you're just part of the problem. The second article "Is it Really Better to 'Return an Empty List Instead of null'? - Part 2" by Christian Neumanns includes many more examples. I just don't know what to say about people that write things like "Bugs that cause NullPointerExceptions are usually easy to debug because the cause and effect are short-distanced in space (i.e. location in source code) and time." While this is kind of true, it's also even more true that you can't tell the difference between such an exception being caused by a savvy programmer who's using it to his advantage and a non-savvy programmer whose code is buggy as hell. 
He has a ton of examples that try to distinguish between a method that returns an empty sequence and a method that cannot properly answer a question. This is a concern and a very real distinction to make, but the answer is not to return null to indicate nonsensical input. The answer is to throw an exception. The method providing the sequence should not be making decisions about whether an empty sequence is acceptable for the caller. For sequences that cannot logically be empty, the method should throw an exception instead of returning null to indicate "something went wrong". A caller may impart semantic meaning to an empty result and also throw an exception (as in his example with a cycling team that has no members). If the display of such a sequence on a web page is incorrect, then that is the fault of the caller, not of the provider of the sequence. * If data is not yet available, but should be, throw an exception. * If data is not available but the provider isn't qualified to decide, return an empty sequence. * If the caller receives an empty sequence and knows that it should not be empty, then it is responsible for indicating an error. That there exists calling code that makes assumptions about return values that are incorrect is no reason to start returning values that will make calling code crash with a NullPointerException. All of his examples are similar: he tries to make the pure-data call to retrieve a sequence of elements simultaneously validate some business logic. That's not a good idea. If this is really necessary, then the validity check should go in another method. The example he cites for getting the amount from a list of PriceComponents is exactly why most aggregation functions in .NET throw an exception when the input sequence is empty. But that's a much better way of handling it -- with a precise exception -- than by returning null to try to force an exception somewhere in the calling code. 
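The division of responsibility described above can be sketched as follows. This is an illustration rather than code from either article: the OrEmpty() helper, the shape of PriceComponent (a single Amount property) and the dictionary-backed store are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceExtensions
{
    // For the boundary to an external API that may return null:
    // normalize once, so that the rest of the code needs no null-checks.
    public static IEnumerable<T> OrEmpty<T>(this IEnumerable<T> source)
    {
        return source ?? Enumerable.Empty<T>();
    }
}

// Assumed shape; only the name is borrowed from the article under discussion.
public class PriceComponent
{
    public decimal Amount { get; set; }
}

public static class Pricing
{
    // Provider: pure data access. Nonsensical input throws;
    // "nothing found" is an empty sequence, never null.
    public static IEnumerable<PriceComponent> GetComponents(
        IDictionary<string, List<PriceComponent>> store, string productId)
    {
        if (productId == null)
        {
            throw new ArgumentNullException(nameof(productId));
        }

        return store.TryGetValue(productId, out var components)
            ? components
            : Enumerable.Empty<PriceComponent>();
    }

    // Caller: decides that "no components" is an error in this context.
    // Max() throws an InvalidOperationException for an empty sequence of
    // non-nullable values, which is more precise than a null dereference.
    public static decimal GetHighestAmount(IEnumerable<PriceComponent> components)
    {
        return components.Max(c => c.Amount);
    }
}
```

With this split, a caller that merely lists components can enumerate the result without any checks, while a caller that needs an aggregate gets a precise exception at the point where the empty sequence actually stops making sense.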
But the upshot for me is: I am not going to write code that, when I call it, forces me to litter other code with null-checks. That's just ridiculous.
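The three rules above can be sketched in a few lines of C#. This is only a minimal illustration: the `Pricing` class, its method names and the dictionary-based store are hypothetical and do not come from either of the articles under discussion. It leans on the fact that LINQ's `Max()` would itself throw an `InvalidOperationException` for an empty sequence of non-nullable values; the caller below simply raises a more precise error first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
public sealed class PriceComponent
{
    public PriceComponent(decimal amount) { Amount = amount; }
    public decimal Amount { get; }
}

public static class Pricing
{
    // Provider: never returns null. An unknown product is nonsensical input, so it
    // throws; a known product without components yields an empty sequence.
    public static IReadOnlyList<PriceComponent> GetComponents(
        IDictionary<string, List<PriceComponent>> store, string productId)
    {
        if (store == null) throw new ArgumentNullException(nameof(store));
        if (productId == null) throw new ArgumentNullException(nameof(productId));

        if (!store.TryGetValue(productId, out var components))
        {
            throw new KeyNotFoundException($"Unknown product '{productId}'.");
        }

        return components;
    }

    // Caller: it knows that a price needs at least one component, so it reports
    // the error itself instead of waiting for a null reference to blow up later.
    public static decimal GetHighestAmount(IReadOnlyList<PriceComponent> components)
    {
        if (components == null) throw new ArgumentNullException(nameof(components));
        if (components.Count == 0)
        {
            throw new InvalidOperationException("A price needs at least one component.");
        }

        return components.Max(c => c.Amount);
    }
}
```

The provider stays a pure data call; whether "empty" is an error is decided only by the caller that has the business knowledge to make that call.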

Optimizing data access for high-latency networks: part IV

2014-08-08T10:20:05+02:00

Published by marco on 8. Aug 2014 10:20:05
------------------------------------------------------------------------
[image] In the "previous" "two articles", we managed to reduce the number of queries executed when opening the calendar of Encodo's time-tracking product Punchclock from one very slow query per person to a single very fast query. Because we're talking about latency in these articles, we'd also like to clear away a few other queries that aren't related to time entries but are still wasting time.
[Lazy-loading unneeded values]
In particular, the queries that "Load values" for person objects look quite suspicious. These queries don't take a lot of time to execute but they will definitely degrade performance in high-latency networks. [1]
[image]
As we did before, we can click on one of these queries to show the query that's being loaded. In the screenshot below, we see that the person's picture is being loaded for each person in the drop-down list.
[image]
We're not showing pictures in the drop-down list, though, so this is an extravagant waste of time. On a LAN, we hardly notice how wasteful we are with queries; on a WAN, the product will feel...sluggish.
[What is a load-group?]
In order to understand the cause of these queries, you must first know that Quino allows a developer to put metadata properties into different load-groups. A load-group has the following behavior:
* If the value for a property in a load-group is requested on an object, the values for all of the properties in the load-group are retrieved with a single query and set on that object.
* The default load-group of an object's metadata determines the values that are initially retrieved and applied to objects materialized by the ORM.
The metadata for a person puts the "picture" property of a person into a separate load-group so that the value is not loaded by default when person objects are loaded from the data driver. 
This is a good balance because business logic will avoid downloading a lot of unwanted picture data by default. Business logic that needs the pictures can either explicitly include the picture in the query or let the value be lazy-loaded by the ORM when it is accessed. The proper solution depends on the situation.
[Lazy-loaded property values]
As before, we can check the stack trace of the query to figure out which application component is triggering the call. In this case, the culprit is the binding list that we are using to attach the list of people to the drop-down control. The binding list binds the values for all of the properties in a metaclass (e.g. "person"), triggering a lazy load when it accesses the "picture" property. To avoid the lazy-load, we can create a wrapper of the default metadata for a person and remove/hide the property so that the binding list will no longer access it. This is quite easy [2], as shown in the code below.
  var personMetaClass = new WrapMetaClass(Person.Metadata);
  personMetaClass.Properties.Remove(Person.MetaProperties.Picture);
  var query = new Query(personMetaClass);
With this simple fix, the binding list no longer knows about the picture property, doesn't retrieve values for that property and therefore no longer triggers any queries to lazily load the pictures from the database for each person object. The screenshot of the statistics window below shows that we were successful. We have two main queries: one for the list of people to show in the drop-down control and one for the time entries to show in the calendar.
[image]
[Final version]
For completeness, here's the code that Punchclock is using in the current version of Quino (1.11). 
  var personMetaClass = new WrapMetaClass(Person.Metadata);
  personMetaClass.Properties.Remove(Person.MetaProperties.Picture);
  var accessToolkit = new PostgreSqlMetaDatabase().AccessToolkit;
  var query = new Query(personMetaClass);
  query.CustomCommandText = new CustomCommandText();
  query.CustomCommandText.SetSection(
    CommandTextSections.Where,
    CommandTextAction.Replace,
    string.Format(
      "EXISTS (SELECT id FROM {0} WHERE {1} = {2})",
      accessToolkit.GetName(TimeEntry.Metadata),
      accessToolkit.GetField(TimeEntry.MetaProperties.PersonId),
      accessToolkit.GetField(Person.MetaProperties.Id)
    )
  );
  var people = Session.GetList(query);
[Future, improved version]
Once we fix the bug in the WhereExists join type mentioned in the previous article and add the fluent methods for constructing wrappers mentioned in the footnote below, the code will be as follows:
  var personMetaClass = Person.Metadata.
    Wrap().
    RemoveProperty(Person.MetaProperties.Picture);
  var accessToolkit = new PostgreSqlMetaDatabase().AccessToolkit;
  var people = Session.GetList(
    new Query(personMetaClass).
      Join(Person.MetaRelations.TimeEntries, JoinType.WhereExists).
      Query
  );
This concludes our investigation into performance issues with Quino and Punchclock.
--------------------------------------------------------------------------------
[1] You may have noticed that these calls to "load values" are technically lazy-loaded but don't seem to be marked as such in the screenshots. This was a bug in the statistics viewer that I discovered and addressed while writing this article. [image]
[2] This is a rather old API and hasn't been touched with the "fluent" wand that we've applied in other parts of the Quino API. A nicer way of writing it would be to create extension methods called Wrap() and RemoveProperty() that return the wrapper class, like so:
  var personMetaClass = Person.Metadata.
    Wrap().
    RemoveProperty(Person.MetaProperties.Picture);
  var query = new Query(personMetaClass);
But that will have to wait for a future version of Quino.
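For readers who don't have Quino at hand, the load-group behavior described in this article can be imitated with a toy model. The sketch below does not use or mirror the real Quino API: `FakeDatabase`, `LazyPerson` and `BindingDemo` are hypothetical names, and the "database" only counts lazy loads. It assumes that the default load-group (Id, Name) arrives with the query that materializes the object and that "Picture" sits in its own load-group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for a data provider; it only counts lazy-load queries.
public sealed class FakeDatabase
{
    public int LazyLoadCount { get; private set; }

    // Pretends to fetch the values of all properties in one load-group with a single query.
    public Dictionary<string, object> LoadGroup(int personId, IEnumerable<string> properties)
    {
        LazyLoadCount++;
        return properties.ToDictionary(p => p, p => (object)$"{p} of person {personId}");
    }
}

public sealed class LazyPerson
{
    // Property name -> load-group name. "Picture" lives in its own group.
    private static readonly Dictionary<string, string> LoadGroups =
        new Dictionary<string, string>
        {
            ["Id"] = "Default",
            ["Name"] = "Default",
            ["Picture"] = "Picture",
        };

    private readonly FakeDatabase _database;
    private readonly int _id;
    private readonly Dictionary<string, object> _values = new Dictionary<string, object>();

    public LazyPerson(FakeDatabase database, int id, string name)
    {
        _database = database;
        _id = id;
        // The default load-group is delivered by the query that materialized the object.
        _values["Id"] = id;
        _values["Name"] = name;
    }

    public object GetValue(string property)
    {
        if (!_values.ContainsKey(property))
        {
            // Lazy load: one query retrieves every property in the same load-group.
            var group = LoadGroups[property];
            var members = LoadGroups.Where(p => p.Value == group).Select(p => p.Key);
            foreach (var pair in _database.LoadGroup(_id, members))
            {
                _values[pair.Key] = pair.Value;
            }
        }

        return _values[property];
    }
}

public static class BindingDemo
{
    // Mimics a binding list that reads the given properties of every person.
    public static int CountLazyLoads(int personCount, params string[] boundProperties)
    {
        var database = new FakeDatabase();
        for (var id = 1; id <= personCount; id++)
        {
            var person = new LazyPerson(database, id, $"Person {id}");
            foreach (var property in boundProperties)
            {
                person.GetValue(property);
            }
        }

        return database.LazyLoadCount;
    }
}
```

Binding all three properties for 13 people triggers 13 lazy loads in this model; binding only "Id" and "Name" triggers none, which is the effect that hiding the picture property from the binding list is meant to achieve.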

Optimizing data access for high-latency networks: part III

2014-07-04T09:09:05+02:00

Published by marco on 4. Jul 2014 09:09:05
------------------------------------------------------------------------
[image] In the "previous article", we partially addressed a performance problem in the calendar of Encodo's time-tracking product, Punchclock. While we managed to drastically reduce the amount of time taken by each query (>95% time saved), we were still executing more queries than strictly necessary. The query that we're trying to optimize further is shown below.
  var people = Session.GetList().
    Where(p => Session.GetCount(p.TimeEntries.Query) > 0).
    ToList();
This query executes one query to get all the people and then one query per person to get the number of time entries per person. Each of these queries by itself is very fast, but high latency makes every single round trip slow. In order to optimize further, there's really nothing for it but to reduce the number of queries being executed. Let's think back to what we're actually trying to accomplish: We want to get all people who have at least one time entry. Can't we get the database to do that for us? Some join or existence check or something? How about the code below?
  var people = Session.GetList(
    Session.CreateQuery().
      Join(Person.MetaRelations.TimeEntries, JoinType.WhereExists).
      Query
  );
What's happening in the code above? We're still getting a list of people but, instead of manipulating the related TimeEntries for each person locally, we're joining the TimeEntries relation with the Quino query Join() method and changing the join type from the default All to the restrictive WhereExists. This sounds like exactly what we want to happen! There is no local evaluation or manipulation with Linq and, with luck, Quino will be able to map this to a single query on the database. This is the best possible query: it's purely declarative and will be executed as efficiently as the back-end knows how. There's just one problem: the WhereExists join type is broken in Quino 1.11. Never fear, though! 
We can still get it to work, but we'll have to do a bit of work until the bug is fixed in Quino 1.12. The code below builds on lessons learned in the earlier article, "Mixing your own SQL into Quino queries: part 2 of 2" to use custom query text to create the restriction instead of letting Quino do it.
  var accessToolkit = new PostgreSqlMetaDatabase().AccessToolkit;
  var query = Session.CreateQuery();
  query.CustomCommandText = new CustomCommandText();
  query.CustomCommandText.SetSection(
    CommandTextSections.Where,
    CommandTextAction.Replace,
    string.Format(
      "EXISTS (SELECT id FROM {0} WHERE {1} = {2})",
      accessToolkit.GetName(TimeEntry.Metadata),
      accessToolkit.GetField(TimeEntry.MetaProperties.PersonId),
      accessToolkit.GetField(Person.MetaProperties.Id)
    )
  );
  var people = Session.GetList(query);
A look at the statistics is very encouraging:
[image]
We're down to one 29ms query for the people and an even quicker query for all the relevant time entries. [1] We can see our query text appears embedded in the SQL generated by Quino, just as we expected. There are a few other security-related queries that execute very quickly and hardly need optimization. We've come much farther in this article and we're almost done. In the next article, we'll quickly clean up a few other queries that are showing up in the statistics and that have been nagging us since the beginning.
--------------------------------------------------------------------------------
[1] The time-entry query is not representative because my testing data set didn't include time entries for the current day and I was too lazy to page around to older data.
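To see why the number of queries matters more than their individual speed on a slow link, a back-of-the-envelope model is enough. The sketch below is an assumption-laden illustration, not a measurement: `LatencyModel` is a hypothetical helper, it assumes that queries run strictly one after another, and the round-trip time is whatever you plug in.

```csharp
using System;

// A rough model: every sequential query pays the network round trip once
// in addition to its own execution time.
public static class LatencyModel
{
    // Returns the total wall-clock time in milliseconds for queryCount sequential queries.
    public static double TotalMilliseconds(int queryCount, double executionMs, double roundTripMs)
    {
        if (queryCount < 0) throw new ArgumentOutOfRangeException(nameof(queryCount));
        if (executionMs < 0) throw new ArgumentOutOfRangeException(nameof(executionMs));
        if (roundTripMs < 0) throw new ArgumentOutOfRangeException(nameof(roundTripMs));

        return queryCount * (executionMs + roundTripMs);
    }
}
```

With an assumed round trip of 100ms, 51 queries of 20ms each (`LatencyModel.TotalMilliseconds(51, 20, 100)`) add up to 6120ms, while two queries of 29ms each (`LatencyModel.TotalMilliseconds(2, 29, 100)`) take 258ms. The counts and durations here are illustrative rather than taken from the screenshots.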

Optimizing data access for high-latency networks II

2014-06-27T10:07:40+02:00

Published by marco on 27. Jun 2014 10:07:40
------------------------------------------------------------------------
[image] In the "previous article", we discussed a performance problem in the calendar of Encodo's time-tracking product, Punchclock. Instead of guessing at the problem, we profiled the application using the database-statistics window available to all Quino applications. [1] We quickly discovered that most of the slowdown stems from the relatively innocuous line of code shown below.
  var people = Session.GetList().
    Where(p => p.TimeEntries.Any()).
    ToList();
[First things first: what does the code do?]
Before doing anything else, we should establish what the code does. Logically, it retrieves a list of people in the database who have recorded at least one time entry. The first question we should ask at this point is: does the application even need to do this? The answer in this case is 'yes'. The calendar includes a drop-down control that lets the user switch between the calendars for different users. This query returns the people to show in this drop-down control. With the intent and usefulness of the code established, let's dissect how it is accomplishing the task.
1. The Session.GetList() portion retrieves a list of all people from the database
2. The Where() method is applied locally for each object in the list [2]
3. For a given person, the list of TimeEntries is accessed
4. This access triggers a lazy load of the list
5. The Any() method is applied to the full list of time entries
6. The ToList() method creates a list of all people who match the condition
Though the line of code looks innocuous enough, it causes a huge number of objects to be retrieved, materialized and retained in memory -- simply in order to check whether there is at least one object. This is a real-world example of a performance problem that can happen to any developer. 
Instead of blaming the developer who wrote this line of code, it's more important to stay vigilant to performance problems and to have tools available to quickly and easily find them.
[Stop creating all of the objects]
The first solution I came up with [3] was to stop creating objects that I didn't need. A good way of doing this and one that was covered in "Quino: partially-mapped queries" is to use cursors instead of lists. Instead of using the generated list TimeEntries, the following code retrieves a cursor on that list's query and materializes at most one object for the sub-query. [4]
  var people = Session.GetList().Where(p =>
  {
    using (var cursor = Session.CreateCursor(p.TimeEntries.Query))
    {
      return cursor.Any();
    }
  }).ToList();
A check of the database statistics shows improvement, as shown below.
[image]
Just by using cursors, we've managed to reduce the execution time for each query by about 75%. [5] Since all we're interested in finding out is whether there is at least one time entry for a person, we could also ask the database to count objects rather than to return them. That should be even faster. The following code is very similar to the example above but, instead of getting a cursor based on the TimeEntries query, it gets the count.
  var people = Session.GetList().
    Where(p => Session.GetCount(p.TimeEntries.Query) > 0).
    ToList();
How did we do? A check of the database statistics shows even more improvement, as shown below.
[image]
We're now down to a few dozen milliseconds for all of our queries, so we're done, right? A 95% reduction in query-execution time should be enough. Unfortunately, we're still executing just as many queries as before, even though we're taking far less time to execute them. This is better, but still not optimal. In high-latency situations, the user is still likely to experience a significant delay when opening the calendar since each query's execution time is increased by the latency of the connection. 
In a local network, the latency is negligible; on a WAN, we still have a problem. In the next article, we'll see if we can't reduce the number of queries being executed.
--------------------------------------------------------------------------------
[1] This series of articles shows the statistics window as it appears in Winforms applications. The data-provider statistics are also available in Quino web applications as a Glimpse plug-in.
[2] It is important to point out for users of the Microsoft Entity Framework (EF) that Quino does not have a Linq-to-Sql mapper. That means that any Linq expressions like Where() are evaluated locally instead of being mapped to the database. There are various reasons for this but the main one is that we ended up preferring a strict boundary between the mappable query API and the local evaluation API. Anything formulated with the query API is guaranteed to be executed by the data provider (even if it must be evaluated locally) and anything formulated with Linq is naturally evaluated locally. In this way, the code is clear in what is sent to the server and what is evaluated locally. Quino only very, very rarely issues an "unmappable query" exception, unlike EF, which occasionally requires contortions until you've figured out which C# formulation of a particular expression can be mapped by EF.
[3] Well, the first answer I'm going to pretend I came up with. I actually thought of another answer first, but then quickly discovered that Quino wasn't mapping that little-used feature correctly. I added an issue to tackle that problem at a later date and started looking for workarounds. That fix will be covered in the next article in this series.
[4] Please note that cursors are disposable and that the calling application is responsible for cleanup. Failure to dispose of a cursor that has been at least partially iterated will result in an open connection in the underlying database providers associated with the query and will eventually lead to connection-pool exhaustion on those databases.
[5] Please ignore the fact that we also dropped 13 person queries. This was not due to any fix that we made but rather that I executed the test slightly differently...and was too lazy to make a new screenshot. The 13 queries are still being executed and we'll tackle those in the last article in this series.
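The difference between materializing a full list and streaming from a cursor can be reproduced without Quino using a plain C# iterator. The following is a sketch under stated assumptions: `CountingSource` and `MaterializationDemo` are hypothetical names, the iterator merely stands in for a cursor, and the `finally` block stands in for releasing a database connection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for a data source; this is not the Quino cursor API.
public sealed class CountingSource
{
    private readonly int _rowCount;

    public CountingSource(int rowCount) { _rowCount = rowCount; }

    public int Materialized { get; private set; }
    public int OpenCursors { get; private set; }

    // Streams rows one at a time, like a cursor. The finally block runs when the
    // enumerator is exhausted or disposed, which models releasing the connection.
    public IEnumerable<string> Stream()
    {
        OpenCursors++;
        try
        {
            for (var i = 0; i < _rowCount; i++)
            {
                Materialized++;
                yield return $"time entry {i}";
            }
        }
        finally
        {
            OpenCursors--;
        }
    }
}

public static class MaterializationDemo
{
    // Loads the full list first and only then asks whether it has any element.
    public static int RowsTouchedByList(int rowCount)
    {
        var source = new CountingSource(rowCount);
        _ = source.Stream().ToList().Any();
        return source.Materialized;
    }

    // Streams and stops after the first row; LINQ's Any() disposes the enumerator.
    public static int RowsTouchedByStream(int rowCount)
    {
        var source = new CountingSource(rowCount);
        _ = source.Stream().Any();
        return source.Materialized;
    }

    // Reads one row by hand and reports how many "cursors" are still open afterwards.
    public static int OpenCursorsAfterPartialRead(bool dispose)
    {
        var source = new CountingSource(10);
        var enumerator = source.Stream().GetEnumerator();
        enumerator.MoveNext();
        if (dispose)
        {
            enumerator.Dispose();
        }

        return source.OpenCursors;
    }
}
```

In this model, the list-based check touches every row while the streaming check touches exactly one. The last method shows the pitfall from the footnote on disposal: a partially iterated enumerator that is never disposed stays "open".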

Optimizing data access for high-latency networks: part I

2014-06-20T10:44:29+02:00

Published by marco on 20. Jun 2014 10:44:29 Updated by marco on 24. Jun 2014 13:27:18
------------------------------------------------------------------------
[image] Punchclock is Encodo's time-tracking and invoicing tool. It includes a calendar to show time entries (shown to the left). Since the very first versions, it hasn't opened very quickly. It was fast enough for most users, but those who worked with Punchclock over the WAN through our VPN have reported that it often takes many seconds to open the calendar. So we have a very useful tool that is not often used because of how slowly it opens. That the calendar opens slowly in a local network and even more slowly in a WAN indicates that there is not only a problem with executing many queries but also with retrieving too much data.
[Looking at query statistics]
This seemed like a solvable problem, so I fired up Punchclock in debug mode to have a look at the query-statistics window. To set up the view shown below, I did the following:
1. Start your Quino application (Punchclock in this case) in debug mode (so that the statistics window is available)
2. Open the statistics window from the debug menu
3. Reset the statistics to clear out anything logged during startup
4. Group the grid by "Meta Class"
5. Open the calendar to see what kind of queries are generated
6. Expand the "TimeEntry" group in the grid to show details for individual queries
[image]
I marked a few things on the screenshot. It's somewhat suspicious that there are 13 queries for data of type "Person", but we'll get to that later. Much more suspicious is that there are 52 queries for time entries, which seems like quite a lot considering we're showing a calendar for a single user. We would instead expect to have a single query. More queries would be OK if there were good reasons for them, but I feel comfortable in deciding that 52 queries is definitely too many. 
A closer look at the details for the time-entry queries shows very high durations for some of them, ranging from a tenth of a second to nearly a second. These queries are definitely the reason the calendar window takes so long to load. [Why are these queries taking so long?] If I select one of the time-entry queries and show the "Query Text" tab (see screenshot below), I can see that it retrieves all time entries for a single person, one after another. There are almost six years of historical data in our Punchclock database and some of our employees have been around for all of them. [1] That's a lot of time entries to load. [image] I can also select the "Stack Trace" tab to see where the call originated in my source code. This feature lets me pinpoint the program component that is causing these slow queries to be executed. [image] As with any UI-code stack, you have to be somewhat familiar with how events are handled and dispatched. In this stack, we can see how a MouseUp command bubbled up to create a new form, then a new control and finally, to trigger a call to the data provider during that control's initialization. We don't have line numbers but we see that the call originates in a lambda defined in the DynamicSchedulerControl constructor. The line of code that I pinpoint as the culprit is shown below. var people = Session.GetList().Where(p => p.TimeEntries.Any()).ToList(); This looks like a nicely declarative way of getting data, but to the trained eye of a Quino developer, it's clear what the problem is. In the next couple of articles, we'll take a closer look at what exactly the problem is and how we can improve the speed of this query. We'll also take a look at how we can improve the Quino query API to make it harder for code like the line above to cause performance problems. 
--------------------------------------------------------------------------------
[1] Encodo just turned nine years old, but we used a different time-entry system for the first couple of years. If you're interested in our time-entry software history, here it is:
1. 06.2005 -- Start off with Open Office spreadsheets
2. 04.2007 -- Switch to a home-grown, very lightweight time tracker based on an older framework we'd written (Punchclock 1.0)
3. 08.2008 -- Start development of Quino
4. 04.2010 -- Initial version of Punchclock 2.0; start dogfooding Quino

Questions to consider when designing APIs: Part II

2014-06-18T08:10:36+02:00

Published by marco on 18. Jun 2014 08:10:36 Updated by marco on 8. Jun 2016 20:51:27
------------------------------------------------------------------------
In the "previous article", we listed a lot of questions that you should continuously ask yourself when you're writing code. Even when you think you're not designing anything, you're actually making decisions that will affect either other team members or future versions of you. In particular, we'd like to think about how we can reconcile a development process that involves asking so many questions and taking so many facets into consideration with YAGNI.
[Designing != Implementing]
The implication of this principle is that, if you aren't going to need something, then there's no point in even thinking about it. While it's absolutely commendable to adopt a YAGNI attitude, not building something doesn't mean not thinking about it and identifying potential pitfalls. A feature or design concept can be discussed within a time-box. Allocate a fixed, limited amount of time to determine whether the feature or design concept needs to be incorporated, whether it would be nice to incorporate it or possibly to jettison it if it's too much work and isn't really necessary. The overwhelming majority of time wasted on a feature is in the implementation, debugging, testing, documentation and maintenance of it, not in the design. Granted, a long design phase can be a time-sink -- especially a "perfect is the enemy of the good" style of design where you're completely blocked from even starting work. With practice, however, you'll learn how to think about a feature or design concept (e.g. extensibility) without letting it ruin your schedule. If you don't try to anticipate future needs at all while designing your API, you may end up preventing that API from being extended in directions that are both logical and could easily have been anticipated. 
If the API is not extensible, then it will not be used and may have to be rewritten in the future, losing more time at that point rather than up front. This is, however, only a consideration you must make. It's perfectly acceptable to decide that you currently don't care at all and that a feature will have to be rewritten at some point in the future. You can't do this kind of cost-benefit analysis and risk-management if you haven't taken time to identify the costs, benefits or risks.
[Document your process]
At Encodo, we encourage the person who's already spent time thinking about this problem to simply document the drawbacks and concessions and possible ideas in an issue-tracker entry that is linked to the current implementation. This allows future users, maintainers or extenders of the API to be aware of the thought process that underlies a feature. It can also help to avoid misunderstandings about what the intended audience and coverage of an API are. The idea is to eliminate assumptions. A lot of time can be wasted when maintenance developers make incorrect assumptions about the intent of code. If you don't have time to do any of this, then you can write a quick note in a task list that you need to more fully document your thoughts on the code you're writing. And you should try to do that soon, while the ideas are still relatively fresh in your mind. If you don't have time to think about what you're doing even to that degree, then you're doing something wrong and need to get better organized. That is, if you can't think about the code you're writing and don't have time to document your process, even minimally, then you shouldn't be writing that code. Either that, or you implicitly accept that others will have to clean up your mess. And "others" includes future versions of you. (E.g. 
the you who, six months from now, is muttering, "who wrote this crap?!?") [Be Honest about Hacking] As an example, we can consider how we go from a specific feature in the context of a project to thinking about where the functionality could fit in to a suite of products -- that may or may not yet exist. And remember, we're only thinking about these things. And we're thinking about them for a limited time -- a time-box. You don't want to prevent your project from moving forward, but you also don't want to advance at all costs. Advancing in an unstructured way is called hacking and, while it can lead to a short-term win, it almost always leads to short-to-medium term deficits. You can still write code that is hacked and looks hacked, if that is the highest current priority, but you're not allowed to forget that you did so. You must officially designate what you're doing as a hot-zone of hacking so that the Hazmat team can clean it up later, if needed. A working prototype that is hacked together just so it works for the next demonstration is great as long as you don't think that you can take it into production without doing the design and documentation work that you initially skipped. If you fail to document the deficits that prevent you from taking a prototype to production, then how will you address those deficits? It will cost you much more time and pain to determine the deficits after the fact. Not only that, but unless you do a very good job, it is your users that will most likely be finding deficits -- in the form of bugs. If your product is just a hacked mess of spaghetti code with no rhyme or reason, another developer will be faster and produce more reliable code by just starting over. Trying to determine the flaws, drawbacks and hacks through intuition and reverse-engineering is slower and more error-prone than just starting with a clean slate. Developers on such a project will not be able to save time -- and money -- by building on what you've already made. 
[A note on error-handling]
Not to be forgotten is a structured approach to error-handling. The more "hacked" the code, the more stringent the error-checking should be. If you haven't had time yet to write or test code sufficiently, then that code shouldn't be making broad decisions about what it thinks are acceptable errors. Fail early, fail often. Don't try to make a hacked mess of code bullet-proof by catching all errors in an undocumented manner. Doing so is deceptive to testers of the product as well as other developers. If you're building a demo, make sure the happy path works and stick to it during the demo. If you do have to break this rule, add the hacks to a demo-specific branch of the code that will be discarded later.
[Working with a documented project]
If, however, the developer can look at your code and see accompanying notes (either in an issue tracker, as TODOs in the code or some other form of documentation), that developer knows where to start fixing the code to bring it to production quality. For example, it's acceptable to configure an application in code as long as you do it in a central place and you document that the intent is to move the configuration to an external source when there's time. If a future developer finds code that supports multiple database connections and tests that are set to ignore with a note/issue that says "extend to support multiple databases", that future developer can decide whether to actually implement the feature or whether to just discard it because it has been deprecated as a requirement. Without documentation or structure or an indication which parts of the code were thought-through and which are considered to be hacked, subsequent developers are forced to make assumptions that may not be accurate. They will either assume that hacked code is OK or that battle-tested code is garbage. 
If you don't inform other developers of your intent when you're writing the code -- best done with documentation, tests and/or a cleanly designed API -- then it might be discarded or ignored, wasting even more time and money. If you're on a really tight time-budget and don't have time to document your process correctly, then write a quick note that you think the design is OK or the code is OK, but tell your future self or other developers what they're looking at. It will only take you a few minutes and you'll be glad you did -- and so will they.

Questions to consider when designing APIs: Part I

2014-06-03T10:25:46+02:00

Published by marco on 3. Jun 2014 10:25:46
------------------------------------------------------------------------
A big part of an agile programmer's job is API design. In an agile project, the architecture is defined from on high only in broad strokes, leaving the fine details of component design up to the implementer. Even in projects that are specified in much more detail, implementers will still find themselves in situations where they have to design something. This means that programmers in an agile team have to be capable of weighing the pros and cons of various approaches in order to avoid causing performance, scalability, maintenance or other problems as the API is used and evolves. When designing an API, we consider some of the following aspects. This is not meant to be a comprehensive list, but should get you thinking about how to think about the code you're about to write.
[Reusing Code]
* Will this code be re-used inside the project?
* How about outside of the project?
* If the code might be used elsewhere, where does that need lie on the time axis?
* Do other projects already exist that could use this code?
* Are there already other implementations that could be used?
* If there are implementations, then are they insufficient?
* Or perhaps not sufficiently encapsulated for reuse as written?
* How likely is it that there will be other projects that need to do the same thing?
* If another use is likely, when would the other project or projects need your API?
[Organizing Code]
* Where should the API live in the code?
* Is your API local to this class?
* Is it private?
* Protected?
* Are you making it public in an extension method?
* Or internal?
* Which namespace should it belong to?
* Which assembly?
[Testing Code]
* What about testability?
* How can the functionality be tested?
Even if you don't have time to write tests right now, you should still build your code so that it can be tested. It's possible that you won't be the one writing the tests; in that case, you should prepare the code so that others can. It's also possible that a future you will be writing the tests and will hate you for having made it so hard to automate testing.
[Managing Dependencies]
* Is multi-threading a consideration?
* Does the API manage state?
* What kind of dependencies does the API have?
* Which dependencies does it really need?
* Is the API perhaps composed of several aspects?
* With a core aspect that is extended by others?
* Can core functionality be extracted to avoid making an API that is too specific?
[Documenting Code]
* How do callers use the API?
* What are the expected values?
* Are these expectations enforced?
* What is the error mechanism?
* What guarantees does the API make?
* Is the behavior of the API enforced?
* Is it at least documented?
* Are known drawbacks documented?
[Error-handling]
This is a very important one and involves how your application handles situations outside of the design.
* If you handle externally provided data, then you have to handle extant cases
* Are you going to log errors?
* In which format?
* Is there a standard logging mechanism?
* How are you going to handle and fix persistent errors?
* Are you even going to handle weird cases?
* Or are you going to fail early and fail often?
* For which errors should your code even be responsible?
* How does your chosen philosophy (and you should be enforcing contracts) fit with the other code in the project?
[Fail fast; enforce contracts]
While we're on the subject of error-handling, I want to emphasize that this is one of the most important parts of API design, regardless of which language or environment you use. [1] Add preconditions for all method parameters; verify them as non-null and verify ranges. Do not catch all exceptions and log them or -- even worse -- ignore them. 
This is even more important in environments -- I'm looking at you, client-side web code in general and JavaScript in particular -- where the established philosophy is to run anything and to never rap a programmer on the knuckles for having written really knuckle-headed code.

You haven't tested the code, so you don't know what kind of errors you're going to get. If you ignore everything, then you'll also ignore assertions, contract violations, null-reference exceptions and so on. The code will never be improved if it never makes a noise. It will just stay silently crappy until someone notices a subtle logical error somewhere and must painstakingly track it down to your untested code.

You might say that production code shouldn't throw exceptions. This is true, but we're explicitly not talking about production code here. We're talking about code that has few to no tests and is acknowledged to be incomplete. If you move code like this into production, then it's better to crash than to silently corrupt data or impair the user experience.

A crash will get attention and the code may even be fixed or improved. If you write code that will crash on all but the "happy path" and it never crashes? That's great. Do not program preemptively defensively in fresh code. If you have established code that interfaces with other (possibly external) components and you sometimes get errors that you can't work around in any other way, then it's OK to catch and log those exceptions rather than propagating them. At least you tried.

In the next article, we'll take a look at how all of these questions and considerations can be reconciled with YAGNI. Spoiler alert: we think that they can.

--------------------------------------------------------------------------------
[1] I recently read "Erlang and code style" by Jesper L. Andersen , which seems to have less to do with programming Erlang and much more to do with programming properly.
The advice contained in it seems to be only for Erlang programmers, but the idea of strictly enforcing APIs between software components is neither new nor language-specific.
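
The advice in the fail-fast section above (preconditions on every parameter, no blanket catch) can be sketched in a few lines. This is only an illustration in Java: the `Discount` class, its fields and its limits are hypothetical names invented for the example and do not come from any project mentioned in the article.

```java
import java.util.Objects;

// Hypothetical value class; names and ranges are assumptions made for this sketch.
final class Discount {
    private final String code;
    private final int percent;

    Discount(String code, int percent) {
        // Fail fast: reject null and out-of-range arguments at the API boundary.
        this.code = Objects.requireNonNull(code, "code");
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in [0, 100] but was " + percent);
        }
        this.percent = percent;
    }

    // Returns the reduced price; the precondition is checked before any work is done.
    long applyTo(long priceInCents) {
        if (priceInCents < 0) {
            throw new IllegalArgumentException("priceInCents must not be negative");
        }
        return priceInCents - (priceInCents * percent) / 100;
    }

    String code() {
        return code;
    }
}

final class FailFastDemo {
    public static void main(String[] args) {
        System.out.println(new Discount("SPRING", 25).applyTo(1000)); // prints 750

        try {
            new Discount("BROKEN", 150);
        } catch (IllegalArgumentException e) {
            // Only the specific, expected exception is caught, and only to show it.
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The contract violation surfaces at the call that caused it rather than later as a subtly wrong price, which is the point the section makes about noisy failures.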

Dealing with improper disposal in WCF clients

2014-05-31T08:55:13+02:00

Published by marco on 31. May 2014 08:55:13
------------------------------------------------------------------------
There's an old problem in generated WCF clients in which the Dispose() method calls Close() on the client irrespective of whether there was a fault. If there was a fault, then the method should call Abort() instead. Failure to do so causes another exception, which masks the original exception. Client code will see the subsequent fault rather than the original one. A developer running the code in debug mode will be misled as to what really happened.

You can see "WCF Clients and the "Broken" IDisposable Implementation" by David Barrett for a more in-depth analysis, but that's the gist of it.

This issue is still present in the ClientBase implementation in .NET 4.5.1. The linked article shows how you can add your own implementation of the Dispose() method in each generated client. An alternative is to use a generic adaptor if you don't feel like adding a custom dispose to every client you create. [1]

    public class SafeClient<T> : IDisposable
      where T : ICommunicationObject, IDisposable
    {
      public SafeClient(T client)
      {
        if (client == null) { throw new ArgumentNullException("client"); }

        Client = client;
      }

      public T Client { get; private set; }

      public void Dispose()
      {
        Dispose(true);
        GC.SuppressFinalize(this);
      }

      protected virtual void Dispose(bool disposing)
      {
        if (disposing)
        {
          if (Client != null)
          {
            // Abort a faulted channel; calling Close() on it would throw again
            if (Client.State == CommunicationState.Faulted)
            {
              Client.Abort();
            }
            else
            {
              Client.Close();
            }

            Client = default(T);
          }
        }
      }
    }

To use your WCF client safely, you wrap it in the class defined above, as shown below.

    using (var safeClient = new SafeClient<SystemLoginServiceClient>(new SystemLoginServiceClient(...)))
    {
      var client = safeClient.Client;

      // Work with "client"
    }

If you can figure out how to initialize your clients without passing parameters to the constructor, you could slim it down by adding a "new" generic constraint to the parameter T in SafeClient and then using the SafeClient as follows:

    using (var safeClient = new SafeClient<SystemLoginServiceClient>())
    {
      var client = safeClient.Client;

      // Work with "client"
    }

--------------------------------------------------------------------------------
[1] The code included in this article is a sketch of a solution and has not been tested. It does compile, though.

REST API Status codes (400 vs. 500)

2014-05-31T08:55:09+02:00

Published by marco on 31. May 2014 08:55:09
------------------------------------------------------------------------
In a project that we're working on, we're consuming REST APIs delivered by services built by another team working for the same customer. We had a discussion about what were appropriate error codes to return for various situations. The discussion boiled down to: should a service return a 500 error code or a 400 error code when a request cannot be processed?

I took a quick look at the documentation for a couple of the larger REST API providers and they are using the 500 code only for catastrophic failure and using the 400 code for anything related to query-input validation errors.

"Microsoft Azure Common REST API Error Codes"

Code 400:

* The requested URI does not represent any resource on the server.
* One of the request inputs is out of range.
* One of the request inputs is not valid.
* A required query parameter was not specified for this request.
* One of the query parameters specified in the request URI is not supported.
* An invalid value was specified for one of the query parameters in the request URI.

Code 500:

* The server encountered an internal error. Please retry the request.
* The operation could not be completed within the permitted time.
* The server is currently unable to receive requests. Please retry your request.

"Twitter Error Codes & Responses"

Code 400: "The request was invalid or cannot be otherwise served. An accompanying error message will explain further."

Code 500: "Something is broken. Please post to the group so the Twitter team can investigate."

"REST API Tutorial HTTP Status Codes"

Code 400: "General error when fulfilling the request would cause an invalid state. Domain validation errors, missing data, etc. are some examples."

Code 500: "A generic error message, given when no more specific message is suitable. The general catch-all error when the server-side throws an exception.
Use this only for errors that the consumer cannot address from their end—never return this intentionally." "REST HTTP status codes" "For input validation failure: 400 Bad Request + your optional description. This is suggested in the book "RESTful Web Services"."
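
The rule that emerges from the sources quoted above (400 for input the caller can correct, 500 for anything else) can be sketched as a small mapping. This is an illustration in plain Java with no web framework; `ValidationException`, `StatusCodes`, `parseLimit` and the `limit` parameter are all hypothetical names made up for the example.

```java
// Hypothetical exception type for problems that the caller can fix by changing the request.
final class ValidationException extends RuntimeException {
    ValidationException(String message) {
        super(message);
    }
}

final class StatusCodes {
    // Input-validation failures map to 400; every other failure is a server-side 500.
    static int statusFor(Throwable error) {
        return (error instanceof ValidationException) ? 400 : 500;
    }

    // Validates a hypothetical "limit" query parameter (assumed range: 1 to 100).
    static int parseLimit(String raw) {
        if (raw == null) {
            throw new ValidationException("A required query parameter was not specified: limit");
        }
        int limit;
        try {
            limit = Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new ValidationException("limit is not a number: " + raw);
        }
        if (limit < 1 || limit > 100) {
            throw new ValidationException("limit is out of range: " + limit);
        }
        return limit;
    }

    // Stand-in for a request handler: returns the status code that would be sent.
    static int handle(String rawLimit) {
        try {
            parseLimit(rawLimit);
            return 200;
        } catch (RuntimeException e) {
            return statusFor(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("10"));   // prints 200
        System.out.println(handle("abc"));  // prints 400
        System.out.println(statusFor(new IllegalStateException("bug"))); // prints 500
    }
}
```

Keeping the decision in one place means that a 500 is never returned intentionally: it only appears when something other than validation goes wrong.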

Mixing your own SQL into Quino queries: part 2 of 2

2014-04-17T21:30:02+02:00

Published by marco on 17. Apr 2014 21:30:02
------------------------------------------------------------------------
In the "first installment" , we covered the basics of mixing custom SQL with ORM-generated queries. We also took a look at a solution that uses direct ADO database access to perform arbitrarily complex queries.

In this installment, we will see more elegant techniques that make use of the CustomCommandText property of Quino queries. We'll approach the desired solution in steps, proceeding from attempt #1 to attempt #5.

tl;dr: Skip to attempt #5 to see the final result without learning why it's correct.

[Attempt #1: Replacing the entire query with custom SQL]

An application can assign the CustomCommandText property of any Quino query to override some of the generated SQL. In the example below, we override all of the text, so that Quino doesn't generate any SQL at all. Instead, Quino is only responsible for sending the request to the database and materializing the objects based on the results.

    [Test]
    public void TestExecuteCustomCommand()
    {
      var people = Session.GetList<Person>();

      people.Query.CustomCommandText = new CustomCommandText
      {
        Text = @"
    SELECT ALL
      ""punchclock__person"".""id"",
      ""punchclock__person"".""companyid"",
      ""punchclock__person"".""contactid"",
      ""punchclock__person"".""customerid"",
      ""punchclock__person"".""initials"",
      ""punchclock__person"".""firstname"",
      ""punchclock__person"".""lastname"",
      ""punchclock__person"".""genderid"",
      ""punchclock__person"".""telephone"",
      ""punchclock__person"".""active"",
      ""punchclock__person"".""isemployee"",
      ""punchclock__person"".""birthdate"",
      ""punchclock__person"".""salary""
    FROM punchclock__person WHERE lastname = 'Rogers'"
      };

      Assert.That(people.Count, Is.EqualTo(9));
    }

This example solves two of the three problems outlined above:

* It uses only a single query.
* It will work with a remote application server (although it makes assumptions about the kind of SQL expected by the backing database on that server).
* But it is even more fragile than the previous example as far as hard-coded SQL goes. You'll note that the fields expected by the object-materializer have to be explicitly included in the correct order.

Let's see if we can address the third issue by getting Quino to format the SELECT clause for us.

[Attempt #2: Generating the SELECT clause]

The following example uses the AccessToolkit of the IQueryableDatabase to format the list of properties obtained from the metadata for a Person. The application no longer makes assumptions about which properties are included in the select statement, what order they should be in or how to format them for the SQL expected by the database.

    [Test]
    public virtual void TestExecuteCustomCommandWithStandardSelect()
    {
      var people = Session.GetList<Person>();

      var accessToolkit = DefaultDatabase.AccessToolkit;
      var properties = Person.Metadata.DefaultLoadGroup.Properties;
      var fields = properties.Select(accessToolkit.GetField);

      people.Query.CustomCommandText = new CustomCommandText
      {
        Text = string.Format(
          @"SELECT ALL {0} FROM punchclock__person WHERE lastname = 'Rogers'",
          fields.FlattenToString()
        )
      };

      Assert.That(people.Count, Is.EqualTo(9));
    }

This example fixes the problem with the previous one but introduces a new problem: it no longer works with a remote application because it assumes that the client-side driver is a database with an AccessToolkit. The next example addresses this problem.

[Attempt #3: Using a hard-coded AccessToolkit]

The version below uses a hard-coded AccessToolkit so that it doesn't rely on the external data driver being a direct ADO database. It still makes an assumption about the database on the server but that is usually quite acceptable because the backing database for most applications rarely changes.
[1]

    [Test]
    public void TestCustomCommandWithPostgreSqlSelect()
    {
      var people = Session.GetList<Person>();

      var accessToolkit = new PostgreSqlMetaDatabase().AccessToolkit;
      var properties = Person.Metadata.DefaultLoadGroup.Properties;
      var fields = properties.Select(accessToolkit.GetField);

      people.Query.CustomCommandText = new CustomCommandText
      {
        Text = string.Format(
          @"SELECT ALL {0} FROM punchclock__person WHERE lastname = 'Rogers'",
          fields.FlattenToString()
        )
      };

      Assert.That(people.Count, Is.EqualTo(9));
    }

We now have a version that satisfies all three conditions to a large degree. The application uses only a single query and the query works with both local databases and remoting servers. It still makes some assumptions about database-schema names (e.g. "punchclock__person" and "lastname"). Let's see if we can clean up some of these as well.

[Attempt #4: Replacing only the where clause]

Instead of replacing the entire query text, an application can replace individual sections of the query, letting Quino fill in the rest of the query with its standard generated SQL. An application can append or prepend text to the generated SQL or replace it entirely. Because the condition for our query is so simple, the example below replaces the entire WHERE clause instead of adding to it.

    [Test]
    public void TestCustomWhereExecution()
    {
      var people = Session.GetList<Person>();

      people.Query.CustomCommandText = new CustomCommandText();
      people.Query.CustomCommandText.SetSection(
        CommandTextSections.Where,
        CommandTextAction.Replace,
        "lastname = 'Rogers'"
      );

      Assert.That(people.Count, Is.EqualTo(9));
    }

That's much nicer -- still not perfect, but nice. The only remaining quibble is that the identifier lastname is still hard-coded. If the model changes in a way where that property is renamed or removed, this code will continue to compile but will fail at run-time. This is a not insignificant problem if your application ends up using these kinds of queries throughout its business logic.
[Attempt #5: Replacing the where clause with generated field names]

In order to fix this query and have a completely generic query that fails to compile should anything at all change in the model, we can mix in the technique that we used in attempts #2 and #3: using the AccessToolkit to format fields for SQL. To make the query 100% statically checked, we'll also use the generated metadata -- LastName -- to indicate which property we want to format as SQL.

    [Test]
    public void TestCustomWhereExecution()
    {
      var people = Session.GetList<Person>();

      var accessToolkit = new PostgreSqlMetaDatabase().AccessToolkit;
      var lastNameField = accessToolkit.GetField(Person.MetaProperties.LastName);

      people.Query.CustomCommandText = new CustomCommandText();
      people.Query.CustomCommandText.SetSection(
        CommandTextSections.Where,
        CommandTextAction.Replace,
        string.Format("{0} = 'Rogers'", lastNameField)
      );

      Assert.That(people.Count, Is.EqualTo(9));
    }

The query above satisfies all of the conditions we outlined above. It's clear that the condition is quite simple and that real-world business logic will likely be much more complex. For those situations, the best approach is to fall back to using the direct ADO approach mixed with using Quino facilities like the AccessToolkit as much as possible to create a fully customized SQL text.

Many thanks to "Urs" for proofreading and suggestions on overall structure.

--------------------------------------------------------------------------------
[1] If an application needs to be totally database-agnostic, then it will need to do some extra legwork that we won't cover in this post.

Mixing your own SQL into Quino queries: part 1 of 2

2014-04-13T17:38:59+02:00

Published by marco on 13. Apr 2014 17:38:59 ------------------------------------------------------------------------ The Quino ORM [1] manages all CrUD -- Create, Update, Delete -- operations for your application. This basic behavior is generally more than enough for standard user interfaces. When a user works with a single object in a window and saves it, there really isn't that much to optimize. [Modeled methods] A more complex editing process may include several objects at once and perhaps trigger events that create additional auditing objects. Even in these cases, there are still only a handful of save operations to execute. To keep the architecture clean, an application is encouraged to model these higher-level operations with methods in the metadata (modeled methods). The advantage to using modeled methods is that they can be executed in an application server as well as locally in the client. When an application uses a remote application server rather than a direct connection to a database, modeled methods are executed in the service layer and therefore have much less latency to the database. [When Quino's query language isn't enough] If an application needs even more optimization, then it may be necessary to write custom SQL -- or even to use stored procedures to move the query into the database. Mixing SQL with an ORM can be a tricky business. It's even more of a challenge with an ORM like that in Quino, which generates the database schema and shields the user from tables, fields and SQL syntax almost entirely. What are the potential pitfalls when using custom query text (e.g. SQL) with Quino? * Schema element names: An application needs to figure out the names of database objects like table and columns. It would be best not to hard-code them so that when the model changes, the custom code will be automatically updated. * If the query is in a stored procedure, then the database may ensure that the code is updated or at least checked when the schema changes. 
[2]
* If the query is in application code, then care can be taken to keep that query in-sync with the model
* Materialization: In particular, the selected fields in a projection must match the expectations of the ORM exactly so that it can materialize the objects properly. We'll see how to ensure this in examples below.

There are two approaches to executing custom code:

* ADO: Get a reference to the underlying ADO infrastructure to execute queries directly without using Quino at all. With this approach, Quino can still help an application retrieve properly configured connections and commands.
* CustomCommandText: An application commonly adds restrictions and sorts to the IQuery object using expressions, but can also add text directly to enhance or replace sections of the generated query.

All of the examples below are taken directly from the Quino test suite. Some variables -- like DefaultDatabase -- are provided by the Quino base testing classes but their purpose, types and implementation should be relatively obvious.

[Using ADO directly]

You can use the AdoDataConnectionTools to get the underlying ADO connection for a given Session so that any commands you execute are guaranteed to be executed in the same transactions as are already active on that session. If you use these tools, your ADO code will also automatically use the same connection parameters as the rest of your application without having to use hard-coded connection strings.

The first example is a test from the Quino framework that shows how easy it is to combine results returned from another method into a standard Quino query.

    [Test]
    public virtual void TestExecuteAdoDirectly()
    {
      var ids = GetIds().ToList();
      var people = Session.GetList<Person>();

      people.Query.Where(Person.MetaProperties.Id, ExpressionOperator.In, ids);

      Assert.That(people.Count, Is.EqualTo(9));
    }

The ADO-access code is hidden inside the call to GetIds(), the implementation for which is shown below.
Your application can get the connection for a session as described above and then create commands using the same helper class. If you call CreateCommand() directly on the ADO connection, you'll have a problem when running inside a transaction on SQL Server. The SQL Server ADO implementation requires that you assign the active transaction object to each command. Quino takes care of this bookkeeping for you if you use the helper method.

    private IEnumerable<int> GetIds()
    {
      using (var helper = AdoDataConnectionTools.GetAdoConnection(Session, "Name"))
      {
        using (var command = helper.CreateCommand())
        {
          command.AdoCommand.CommandText =
            @"SELECT id FROM punchclock__person WHERE lastname = 'Rogers'";

          using (var reader = command.AdoCommand.ExecuteReader())
          {
            while (reader.Read())
            {
              yield return reader.GetInt32(0);
            }
          }
        }
      }
    }

There are a few drawbacks to this approach:

* Your application will make two queries instead of one.
* The hard-coded SQL will break if you make model changes that affect those tables and fields.
* The ADO approach only works if the application has a direct connection to the database. An application that uses ADO will not be able to switch to an application-server driver without modification.

In the second part, we will improve on this approach by using the CustomCommandText property of a Quino query. This will allow us to use only a single query. We will also improve maintainability by reducing the amount of code that isn't checked by the compiler (e.g. the SQL text above).

Stay tuned for part 2, coming soon!

Many thanks to "Urs" for proofreading and suggestions on overall structure.

--------------------------------------------------------------------------------
[1] This article uses features of Quino that will only become available in version 1.12. Almost all of the examples will also work in earlier versions but the AdoDataConnectionTools is not available until 1.12. The functionality of this class can, however, be back-ported if necessary.
[2] More likely, though, is that the Quino schema migration will be prevented from applying updates if there are custom stored procedures that use tables and columns that need to be changed.

Java 8

2014-03-28T15:53:54+01:00

Published by marco on 28. Mar 2014 15:53:54 Updated by marco on 28. Mar 2014 15:56:09
------------------------------------------------------------------------
[image]This article discusses and compares the initial version of Java 8 and C# 4.5.1. I have not used Java 8 and I have not tested that any of the examples -- Java or C# -- even compile, but they should be pretty close to valid.

Java 8 has finally been released and -- drum roll, please -- it has closures/lambdas, as promised! I would be greeting this as champagne-cork--popping news if I were still a Java programmer. [1] As an ex-Java developer, I greet this news more with an ambivalent shrug than with any overarching joy. It's a sunny morning and I'm in a good mood, so I'm able to suppress what would be a more than appropriate comment: "it's about time".

Since I'm a C# programmer, I'm more interested in peering over the fence at the pile of goodies that Java just received for its eighth birthday and see if it got something "what I ain't got". I found a concise list of new features in the article "Will Java 8 Kill Scala?" by Ahmed Soliman and was distraught/pleased [2] to discover that Java had in fact gotten two presents that C# doesn't already have.

As you'll see, these two features aren't huge and the lack of them doesn't significantly impact design or expressiveness, but you know how jealousy works: Jealousy doesn't care. Jealousy is. I'm sure I'll get over it, but it will take time. [3]

[Default methods and static interface methods]

Java 8 introduces support for static methods on interfaces as well as default methods that, taken together, amount to functionality that is more or less what extension methods bring to C#.

In Java 8, you can define static methods on an interface, which is nice, but it becomes especially useful when combined with the keyword default on those methods.
As defined in "Default Methods" :

"Default methods enable you to add new functionality to the interfaces of your libraries and ensure binary compatibility with code written for older versions of those interfaces."

In Java, you no longer have to worry that adding a method to an interface will break implementations of that interface in other jar files that have not yet been recompiled against the new version of the interface. You can avoid that by adding a default implementation for your method. This applies only to those methods where a default implementation is possible, of course.

The page includes an example but it's relatively obvious what it looks like:

    public interface ITransformer
    {
      String adjust(String value);

      default String newAdjust(String value)
      {
        return value.replace(' ', '\t');
      }
    }

How do these compare with extension methods in C#?

Extension methods are nice because they allow you to quasi-add methods to an interface without requiring an implementor to actually implement them. My rule of thumb is that any method that can be defined purely in terms of the public API of an interface should be defined as an extension method rather than added to the interface.

Java's default methods are a twist on this concept that addresses a limitation of extension methods. What is that limitation? That the method definition in the extension method can't be overridden by the actual implementation behind the interface. That is, the default implementation can be expressed purely in terms of the public interface, but perhaps a specific implementor of the interface would like to do that plus something more. Or would perhaps like to execute the extension method in a different way, but only for a specific implementation. There is no way to do this with extension methods.

Interface default methods in Java 8 allow you to provide a fallback implementation but also allow any class to actually implement that method and override the fallback.
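
To make the override point concrete, here is a small Java 8 sketch. The names `Transformer`, `UpperCaseTransformer` and `LoudTransformer` are assumptions chosen for illustration; the default is written purely in terms of the interface's own method, as the rule of thumb above suggests.

```java
import java.util.Locale;

interface Transformer {
    String adjust(String value);

    // Fallback implementation, defined only in terms of the interface's public API.
    default String newAdjust(String value) {
        return adjust(value).replace(' ', '\t');
    }
}

// Inherits the default: no change was needed when newAdjust was added to the interface.
final class UpperCaseTransformer implements Transformer {
    @Override
    public String adjust(String value) {
        return value.toUpperCase(Locale.ROOT);
    }
}

// Overrides the default, which is what a C# extension method cannot offer.
final class LoudTransformer implements Transformer {
    @Override
    public String adjust(String value) {
        return value.toUpperCase(Locale.ROOT);
    }

    @Override
    public String newAdjust(String value) {
        return adjust(value) + "!";
    }
}

final class DefaultMethodDemo {
    public static void main(String[] args) {
        System.out.println(new UpperCaseTransformer().newAdjust("a b")); // prints A<tab>B
        System.out.println(new LoudTransformer().newAdjust("a b"));      // prints A B!
    }
}
```

Callers see the same `newAdjust` method in both cases; which body runs is decided by the implementing class, not by the call site.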
[Functional Interfaces]

Functional interfaces are a nice addition, too, and something I've wanted in C# for some time. "Erik Meijer" of Microsoft doesn't miss an opportunity to point out that this is a must for functional languages (he's exaggerating, but the point is taken).

Saying that a language supports functional interfaces simply means that a lambda defined in that language can be assigned to any interface with a single method that has the same signature as that lambda.

An example in C# should make things clearer:

    public interface ITransformer
    {
      string Adjust(string value);
    }

    public static class Utility
    {
      public static void WorkOnText(string text, ITransformer transformer)
      {
        // Do work
      }
    }

In order to call WorkOnText() in C#, I am required to define a class that implements ITransformer. There is no other way around it. However, in a language that allows functional interfaces, I could call the method with a lambda directly. The following code looks like C# but won't actually compile.

    Utility.WorkOnText(
      "Hello world",
      s => s.Replace("Hello", "Goodbye cruel")
    );

For completeness, let's also see how much extra code it is to do this in C#, which has no functional interfaces.

    public class PessimisticTransformer : ITransformer
    {
      public string Adjust(string value)
      {
        return value.Replace("Hello", "Goodbye cruel");
      }
    }

    Utility.WorkOnText(
      "Hello world",
      new PessimisticTransformer()
    );

That's quite a huge difference. It's surprising that C# hasn't gotten this functionality yet. It's hard to see what the downside is for this feature -- it doesn't seem to alter semantics.

While it is supported in Java, there are other restrictions. The signature has to match exactly. What happens if we add an optional parameter to the interface-method definition?

    public interface ITransformer
    {
      string Adjust(string value, ITransformer additional = null);
    }

In the C# example, the class implementing the interface would have to be updated, of course, but the code at the calling location remains unchanged.
The functional interface's definition is the calling location, so the change would be closer to the implementation instead of more abstracted from it.

    public class PessimisticTransformer : ITransformer
    {
      public string Adjust(string value, ITransformer additional = null)
      {
        return value.Replace("Hello", "Goodbye cruel");
      }
    }

    // Using a class
    Utility.WorkOnText(
      "Hello world",
      new PessimisticTransformer()
    );

    // Using a functional interface
    Utility.WorkOnText(
      "Hello world",
      (s, a) => s.Replace("Hello", "Goodbye cruel")
    );

I would take the functional interface any day.

[Java Closures]

As a final note, Java 8 has finally acquired closures/lambdas [4] but there is a limitation on which functions can be passed as lambdas. It turns out that the inclusion of functional interfaces is a workaround for not having first-class functions in the language.

Citing the article,

"[...] you cannot pass any function as first-class to other functions, the function must be explicitly defined as lambda or using Functional Interfaces"

While in C# you can assign any method with a matching signature to a lambda variable or parameter, Java requires that the method be first assigned to a variable that is "explicitly assigned as lambda" in order to use it. This isn't a limitation on expressiveness but may lead to clutter.

In C# I can write the following:

    public string Twist(string value)
    {
      // Reverse() yields characters, so build a new string from them
      return new string(value.Reverse().ToArray());
    }

    public static string Alter(this string value, Func<string, string> func)
    {
      return func(value);
    }

    public string ApplyTransformations(string value)
    {
      return value.Alter(Twist).Alter(s => new string(s.Reverse().ToArray()));
    }

This example shows how you can declare a Func to indicate that the parameter is a first-class function. I can pass the Twist function or I can pass an inline lambda, as shown in ApplyTransformations. However, in Java, I can't declare a Func: only functional interfaces.
In order to replicate the C# example above in Java, I would do the following:

    public String twist(String value)
    {
      return new StringBuilder(value).reverse().toString();
    }

    public String alter(String value, ITransformer transformer)
    {
      return transformer.adjust(value);
    }

    public String applyTransformations(String value)
    {
      return alter(alter(value, s -> twist(s)), s -> new StringBuilder(s).reverse().toString());
    }

Note that the Java example wraps twist in a lambda so that it can be passed as a functional interface; Java 8 method references (e.g. this::twist) offer a shorter way to do the same thing. Also, the C# example uses an extension method, which allows me to "add" methods to class string, which is not really possible in Java.

Overall, though, while these things feel like deal-breakers to a programming-language snob [5] -- especially those who have a choice as to which language to use -- Java developers can rejoice that their language has finally acquired features that both increase expressiveness and reduce clutter. [6]

As a bonus, as a C# developer, I find that I don't have to be so jealous after all.

Though I'd still really like me some functional interfaces.

--------------------------------------------------------------------------------
[1] Even if I were still a Java programmer, the champagne might still stay in the bottle because adoption of the latest runtime in the Java world is extremely slow-paced. Many projects and products require a specific, older version of the JVM and preclude updating to take advantage of newer features. The .NET world naturally has similar limitations but the problem seems to be less extreme.
[2] Distraught because the features look quite interesting and useful and C# doesn't have them and pleased because (A) I am not so immature that I can't be happy for others and (B) I know that innovation in other languages is an important driver in your own language.
[3] Totally kidding here. I'm not insane. Take my self-diagnosis with a grain of salt.
[4] I know that lambdas and closures are not by definition the same and I'm not supposed to use them interchangeably. I'm trying to make sure that a C# developer who reads this article doesn't read "closure" (which is technically what a lambda in C# is because it's capable of "closing over" or capturing variables) and not understand that it means "lambda".
[5] Like yours truly.
[6] Even if most of those developers won't be able to use those features for quite some time because they work on projects or products that are reluctant to upgrade.
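
For readers who want to try the functional-interface idea from this article, the following is a compact Java 8 sketch. The names `TextTransformer` and `FunctionalInterfaceDemo` are assumptions made for the example. It passes an existing method with a method reference and an inline lambda to the same single-method interface; `@FunctionalInterface` is optional and only asks the compiler to check that the interface has exactly one abstract method.

```java
// A single-method interface: any lambda or method reference with a matching
// signature can be used wherever a TextTransformer is expected.
@FunctionalInterface
interface TextTransformer {
    String adjust(String value);
}

final class FunctionalInterfaceDemo {
    static String twist(String value) {
        return new StringBuilder(value).reverse().toString();
    }

    static String alter(String value, TextTransformer transformer) {
        return transformer.adjust(value);
    }

    static String applyTransformations(String value) {
        // An existing method, passed as a method reference
        String twisted = alter(value, FunctionalInterfaceDemo::twist);

        // An inline lambda, passed to the same parameter type
        return alter(twisted, s -> s.replace("Hello", "Goodbye cruel"));
    }

    public static void main(String[] args) {
        System.out.println(applyTransformations("dlrow olleH")); // prints Goodbye cruel world
    }
}
```

No class implementing `TextTransformer` is declared anywhere, which is the saving over the PessimisticTransformer approach shown for C#.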

Quino: efficiency, hinting and local sorting

2014-03-13T21:46:59+01:00

Published by marco on 13. Mar 2014 21:46:59 ------------------------------------------------------------------------ In "Quino: partially-mapped queries" we took a look at how Quino seamlessly maps as much as possible to the database, while handling unmappable query components locally as efficiently as possible. [Correctness is more important than efficiency] As efficiently as possible can be a bit of a weasel statement. We saw that partial application of restrictions could significantly reduce the data returned. And we saw that efficient handling of that returned data could minimize the impact on both performance and memory, keeping in mind, of course, that the primary goal is correctness. However, as we saw in the previous article, it's still entirely possible that even an optimally mapped query will result in an unacceptable memory-usage or performance penalty. In these cases, we need to be able to hint or warn the developer that something non-optimal is occurring. It would also be nice if the developer could indicate whether or not queries with such deficiencies should even be executed. [When do things slow down?] Why would this be necessary? Doesn't the developer have ultimate control over which queries are called? The developer has control over queries in business-logic code. But recall that the queries that we are using are somewhat contrived in order to keep things simple. Quino is a highly generic metadata framework: most of the queries are constructed by standard components from expressions defined in the metadata. For example, the UI may piece together a query from various sources in order to retrieve the data for a particular view. In such cases, the developer has less direct control to "repair" queries with hand-tuning. Instead, the developer has to view the application holistically and make repairs in the metadata. 
This is one of many reasons why Quino has local evaluation and does not simply throw an exception for partially mapped queries, as EF does. [Debugging data queries] [image]It is, in general, far better to continue working while executing a possibly sub-optimal and performance-damaging query than it is to simply crash out. Such behavior would increase the testing requirements for generated UIs considerably. Instead, the UI always works and the developer can focus on optimization and fine-tuning in the model, using tools like the Statistics Viewer, shown to the left. [image]The statistics viewer shows all commands executed in an application, with a stack trace, messages (hints/warnings/info) and the original query and mapped SQL/remote statement for each command. The statistics are available for SQL-based data drivers, but also for remoting drivers for all payload types (including JSON). The screenshot above is for the statistics viewer for Winform applications; we've also integrated statistics into web applications using "Glimpse" , a plugin architecture for displaying extra information for web-site developers. The screenshot to the right shows a preview-release version that will be released with Quino 1.11 at the end of March. [Sorting is all or nothing] One place where an application can run into efficiency problems is when the sort order for entities is too complex to map to the server. If a single restriction cannot be mapped to the database, we can map all of the others and evaluate the unmappable ones locally. What happens if a single sort cannot be mapped to the database? Can we do the same thing? Again, to avoid being too abstract, let's start with an example. 
    var query = Session.GetQuery();
    query
      .Where(Person.Fields.LastName, ExpressionOperator.StartsWith, "M")
      .OrderBy(Person.Fields.LastName)
      .OrderBy(Person.Fields.FirstName)
      .Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
    Assert.That(Session.GetList(query).Count, Is.Between(100, 120));

Both of these sorts can be mapped to the server, so the performance and memory hit is very limited. The ORM will execute a single query and will return data for and create about 100 objects.

Now, let's replace one of the mappable sorts with something unmappable:

    var query = Session.GetQuery();
    query
      .Where(Person.Fields.LastName, ExpressionOperator.StartsWith, "M")
      .OrderBy(new DelegateExpression(c => c.GetObject().FirstName))
      .OrderBy(Person.Fields.LastName)
      .Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
    Assert.That(Session.GetList(query).Count, Is.Between(100, 120));

What's happening here? Instead of being able to map both sorts to the database, now only one can be mapped. Or can it? The primary sort can't be mapped, so there's obviously no point in mapping the secondary sort. Instead, all sorting must be applied locally.

What if we had been able to map the primary sort but not the secondary one? Then we could have the database apply the primary sort, returning the data partially ordered. We can apply the remaining sort in memory...but that won't work, will it? If we only applied the secondary sort in memory, then the data would end up sorted only by that value.

It turns out that, unlike restrictions, sorting is all-or-nothing. If we can't map all sorts to the database, then we have to apply them all locally. [1]

In this case, the damage is minimal because the restrictions can be mapped and guarantee that only about 100 objects are returned. Sorting 100 objects locally isn't likely to show up on the performance radar.
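The all-or-nothing argument is easy to check with a few rows. What follows is a minimal sketch in Python rather than C#, and it does not use any Quino API: the rows, the variable names and the choice of last name as the "mappable" primary sort are all assumptions made for illustration.

```python
from operator import itemgetter

# Hypothetical rows as (last_name, first_name) tuples.
rows = [("Miller", "Zoe"), ("Mason", "Bob"), ("Miller", "Al"), ("Mason", "Amy")]

# Pretend the database applied only the mappable primary sort (last name).
partially_sorted = sorted(rows, key=itemgetter(0))

# Applying just the secondary sort (first name) locally throws away the
# primary ordering: the result is ordered by first name alone.
secondary_only = sorted(partially_sorted, key=itemgetter(1))

# Applying every sort locally yields the ordering that the query asked for.
fully_sorted = sorted(rows, key=itemgetter(0, 1))

print(secondary_only)
print(fully_sorted)
```

In the second list, the last names are grouped again and the first names are ordered within each group, which is what the query requested.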
Still, sorting is a potential performance-killer: as soon as you stray from the path of standard sorting, you run the risk of either: * Choosing a sort that is mappable but not covered by an index on the database * Choosing a sort that is unmappable and losing out on index-optimized sorting on the database In the next article, we'll discuss how we can extract slices from a result set -- using limit and offset -- and what sort of effect this can have on performance in partially mapped queries. -------------------------------------------------------------------------------- [1] The mapper also doesn't bother adding any ordering to the generated query if at least one ordering is unmappable. There's no point in wasting time on the database with a sort that will be re-applied locally.
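The first risk, a sort that is mappable but not covered by an index, can be made visible on most databases by asking for the query plan. The following is a rough sketch that uses Python's built-in sqlite3 module as a stand-in for whatever database the application really uses; the table, the index name and the plan() helper are hypothetical and have nothing to do with Quino.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT)"
)

def plan(sql):
    """Return the 'detail' column of SQLite's query plan for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT last_name, first_name FROM person ORDER BY last_name"

# Without an index, SQLite has to sort the rows in a temporary B-tree.
plan_without_index = plan(query)

# With an index on the sort column, the rows can be read in index order.
conn.execute("CREATE INDEX idx_person_last_name ON person (last_name)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan varies between SQLite versions, so the thing to look for is whether a temporary B-tree for the ORDER BY shows up or not.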

Quino: partially-mapped queries

2014-03-06T22:33:32+01:00

Published by marco on 6. Mar 2014 22:33:32 ------------------------------------------------------------------------ In "Quino: an overview of query-mapping in the data driver" we took a look at some of the basics of querying data with Quino while maintaining acceptable performance and memory usage. Now we'll take a look at what happens with partially-mapped queries. Before explaining what those are, we need a more concrete example to work with. Here's the most-optimized query we ended up with in the previous article: var query = Session.GetQuery(); query.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM"); Assert.That(Session.GetCount(query), Is.GreaterThanEqual(140000)); With so many entries, we'll want to trim down the list a bit more before we actually create objects. Let's choose only people whose last names start with the letter "M". var query = Session.GetQuery(); query .Where(Person.Fields.LastName, ExpressionOperator.StartsWith [1], "M") .Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM"); Assert.That(Session.GetCount(query), Is.Between(100, 120)); This is the kind of stuff that works just fine in other ORMs, like Entity Framework. Where Quino goes just a little farther is in being more forgiving when a query can be only partially mapped to the server. If you've used EF for anything beyond trivial queries, you've surely run into an exception that tells you that some portion of your query could not be mapped. [2] Instead of throwing an exception, Quino sends what it can to the database and uses LINQ to post-process the data sent back by the database to complete the query. [Introducing unmappable expressions] Unmappable code can easily sneak in through aspects in the metadata that define filters or sorts using local methods or delegates that do not exist on the server. Instead of building a complex case, we're going to knowingly include an unmappable expression in the query. 
    var query = Session.GetQuery();
    query
      .Where(new DelegateExpression [3](c => c.GetObject().LastName.StartsWith("M") [4]))
      .Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
    Assert.That(Session.GetCount(query), Is.Between(100, 120));

The new expression performs the same check as the previous example, but in a way that cannot be mapped to SQL. [5]

With our new example, we've provoked a situation where any of the following could happen:

* The ORM could throw up its hands and declare the query unmappable, pushing the responsibility for separating mappable from unmappable onto the shoulders of the developers. As noted above, this is what EF does.
* The ORM could determine that the query is unmappable and evaluate everything locally, retrieving only the initial set of Person objects from the server (all several million of them, if you'll recall from the previous post).
* The ORM could map part of the query to the database, retrieving the minimal set of objects necessary in order to guarantee the correct result. This is what Quino does. The strategy works in many cases, but is not without its pitfalls.

What happens when we evaluate the query above? With partial mapping, we know that the restriction to "IBM" will be applied on the database. But we still have an additional restriction that must be applied locally. Instead of being able to get the count from the server without creating any objects, we're now forced to create objects in memory so that we can apply the local restrictions and only count the objects that match them all.

But as you'll recall from the previous article, the number of matches for "IBM" is 140,000 objects. The garbage collector just gave you a dirty look again.

[Memory bubbles]

There is no way to further optimize this query because of the local evaluation, but there is a way to avoid another particularly nasty issue: memory bubbles. What is a memory bubble, you might ask?
It describes what happens when your application is using n MB and then is suddenly using n + 100 MB because you created 140,000 objects all at once. Milliseconds later, the garbage collector is thrashing furiously to clean up all of these objects -- and all because you created them only in order to filter and count them. A few milliseconds after that, your application is back at n MB but the garbage collector's fur is ruffled and it's still trembling slightly from the shock.

The way to avoid this is to stream the objects through your analyzer one at a time rather than to create them all at once. Quino uses lazily-evaluated IEnumerable sequences throughout the data driver specifically to prevent memory bubbles.

[Streaming with IEnumerable sequences]

Before tackling how the Quino ORM handles the Count(), let's look at how it would return the actual objects from this query.

* Map the query to create a SELECT statement
  * At this point, it doesn't matter whether the entire query could be mapped
* Create an IEnumerable sequence that represents the result of the mapped query
  * At this point, nothing has been executed and no objects have been returned
* Wrap the sequence in another sequence that applies all of the "unhandled" parts of the query
* Return that sequence as the result of executing the query
  * At this point, we still haven't actually executed anything on the database or created any objects

Right, now we have an IEnumerable that represents the result set, but we haven't lit the fuse on it yet. How do we light the fuse? Well, the most common way to do so is to call ToList() on it. What happens then?
* The IEnumerator requests an element
* The query is executed against the database and returns an IDataReader
* The reader requests a row and creates a Person object from that row's data
* The wrapper that performs the local evaluation applies its filter(s) to this Person and yields it if it matches
* If it matched the local filters, the Person is added to the list
* Control returns to the IDataReader, which requests another row
* Repeat until no more rows are returned from the database

Since the decision to add all objects to a list occurs all the way out at the outermost caller, it's the caller that's at fault for the memory bubble, not the driver. [6] We'll see in the next section how to avoid creating a list when none is needed.

[Using cursors to control evaluation]

If we wanted to process data but perhaps offer the user a chance to abort processing at any time, we could even get an IDataCursor from the Quino ORM to control iteration ourselves.

    using (var cursor = Session.CreateCursor(query))
    {
      foreach (var obj in cursor)
      {
        // Do something with obj
        if (userAbortedOperation) { break; }
      }
    }

[And finally, the count query]

But back to evaluating the query above. The Quino ORM handles it like this:

* Try to map the query to create a COUNT statement
* Notice that at least one restriction could not be mapped
* Create a cursor to SELECT all of the objects for the same query (shown above) and count all the objects returned by that instead

So, if a count-query cannot be fully mapped to the database, the most efficient possible alternative is to execute a query that retrieves as few objects as possible (i.e. maps as much to the server as it can) and streams those objects to count them locally.

Tune in next time for a look at how to exert more control with limit and offset and how those work together with partial mapping.

--------------------------------------------------------------------------------

[1] If we were worried that the last names in our database might not necessarily be capitalized, we would use the ExpressionOperator.StartsWithCI to perform the check in a case-insensitive manner instead.

[2] These exceptions are not always obvious, either. Often, the exception complains about not being able to call a static method on a non-static object or class, which occurs when mapping the Expression object to SQL. EF's error messages have in general gotten much better but the mapping failures can be quite confusing.

[3] A DelegateExpression simply wraps the lambda given in the constructor in a Quino expression object. The parameter c is an IExpressionContext that provides the target object, which is in this case a Person.

[4] I'm going to go ahead and assume that my database schema doesn't allow nulls for the LastName field.

[5] If Quino had a LINQ-to-SQL provider, there's a chance that more of these delegates could be mapped, but we don't have one...and they can't.

[6] Did we still create 140,000 objects? Yes we did, but not all at once. Now, there are probably situations where it is better to create several objects rather than streaming them individually, but I'm confident that keeping this as the default is the right choice. If you find that your particular situation warrants different behavior, feel free to use Session.CreateCursor() to control evaluation yourself and create the right-sized batches of objects to count. The ChangeAndSave() extension method does exactly that to load objects in batches (size adjustable by an optional parameter) rather than one by one.
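The streaming and batching ideas from this article can be tried out without an ORM. Below is a small sketch that uses Python generators in place of C# IEnumerable sequences; read_rows, local_filter, count_streaming and batches are made-up names that stand in for the data reader, the local-evaluation wrapper and the caller, and are not part of Quino.

```python
from itertools import islice

created = 0  # how many row objects have been materialized so far


def read_rows(n):
    """Stand-in for a data reader: creates and yields one row at a time."""
    global created
    for i in range(n):
        created += 1
        yield {"id": i, "last_name": "Miller" if i % 1000 == 0 else "Smith"}


def local_filter(rows):
    """Stand-in for the wrapper that applies the unmappable restriction."""
    return (row for row in rows if row["last_name"].startswith("M"))


def count_streaming(rows):
    """Count matches without ever holding more than one row at a time."""
    return sum(1 for _ in rows)


def batches(rows, size):
    """Group a lazy sequence into lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Building the pipeline executes nothing: the fuse hasn't been lit yet.
pipeline = local_filter(read_rows(140_000))
created_before_iteration = created

# Iterating lights the fuse; rows are created, filtered and dropped one by one.
matches = count_streaming(pipeline)

# A caller that prefers batches can build them on top of the same stream.
batch_sizes = [len(b) for b in batches(local_filter(read_rows(5_000)), 2)]
```

All 140,000 rows are still created in the first run, but only one is alive at any given moment, which is the difference between a steady trickle and a memory bubble.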

LESS vs. SASS: Variable semantics

2014-02-24T23:01:09+01:00

Published by marco on 24. Feb 2014 23:01:09 Updated by marco on 24. Feb 2014 23:13:04 ------------------------------------------------------------------------ I've been using CSS since pretty much its inception. It's powerful but quite low-level and lacks support for DRY. So, I switched to generating CSS with LESS a while back. This has gone quite well and I've been pretty happy with it. Recently, I was converting some older, theme stylesheets for earthli. A theme stylesheet provides no structural CSS, mostly setting text, background and border colors to let users choose the basic color set. This is a perfect candidate for LESS. So I constructed a common stylesheet that referenced LESS variables that I would define in the theme stylesheet. Very basically, it looks like this: ---------------- | crimson.less | ---------------- @body_color: #800; @import "theme-base"; ---------------- ------------------- | theme-base.less | ------------------- body { background-color: @body_color; } ------------------- This is just about the most basic use of LESS that even an amateur user could possibly imagine. I'm keeping it simple because I'd like to illustrate a subtlety to variables in LESS that tripped me up at first -- but for which I'm very thankful. I'll give you a hint: LESS treats variables as a stylesheet would, whereas SASS treats them as one would expect in a programming language. Let's expand the theme-base.less file with some more default definitions. I'm going to define some other variables in terms of the body color so that themes don't have to explicitly set all values. Instead, a theme can set a base value and let the base stylesheet calculate derived values. If a calculated value isn't OK for a theme, the theme can set that value explicitly to override. Let's see an example before we continue. 
------------------- | theme-base.less | ------------------- @title_color: darken(@body_color, 25%); @border_color: @title_color; body { background-color: @body_color; } h2 { color: @title_color; border: 1px solid @border_color; } ------------------- You'll notice that I avoided setting a value for @body_color because I didn't want to override the value set previously in the theme. But then wouldn't it be impossible for the theme to override the values for @title_color and @border_color? We seem to have a problem here. [1] I want to be able to set some values and just use defaults for everything that I don't want to override. There is a construct in SASS called !default that does exactly this. It indicates that an assignment should only take place if the variable has not yet been assigned. [2] Searching around for an equivalent in LESS took me to this page, "Add support for "default" variables (similar to !default in SASS) #1706" . There users suggested various solutions and the original poster became ever more adamant -- "Suffice it to say that we believe we need default variable setting as we've proposed here" -- until a LESS developer waded in to state that it would be "a pointless feature in less", which seemed harsh until an example showed that he was quite right. The clue is further down in one of the answers: "If users define overrides after then it works as if it had a default on it. [T]hat's because even in the imported file it will take the last definition in the same way as css, even if defined after usage. (Emphasis added.)" It was at this point that the lightbulb went on for me. I was thinking like a programmer where a file is processed top-down and variable values can vary depending on location in the source text. That the output of the following C# code is 12 should amaze no one. 
    var a = 1;
    Console.Write(a);
    a = 2;
    Console.Write(a);
    a = 3;

In fact, we would totally expect our IDE to indicate that the value in the final assignment is never used and can be removed. Using LESS variable semantics, though, where variables are global in scope [3] and assignments are treated as they are in CSS, we would get 33 as output. Why? Because a has the value 3 everywhere: that's the last value assigned to it. That is, LESS has a cascading approach to variable assignment.

This is exactly as the developer from LESS said: stop fighting it and just let LESS do what it does best. Do you want default values? Define the defaults first, then define your override values. The overridden value will be used even when used for setting the value of another default value that you didn't even override.

Now let's go fix our stylesheet to use these terse semantics of LESS. Here's a first cut at a setup that feels pretty right. I put the files in the order that you would read them so that you can see the overridden values and everything makes sense again. [4]

------------------------
| theme-variables.less |
------------------------
@body_color: white;
@title_color: darken(@body_color, 25%);
@border_color: @title_color;
------------------------

----------------
| crimson.less |
----------------
@import "theme-variables";
@body_color: #800;
@import "theme-base";
----------------

-------------------
| theme-base.less |
-------------------
body { background-color: @body_color; }
h2 { color: @title_color; border: 1px solid @border_color; }
-------------------

You can see in the example above that the required variables are all declared, then overridden and then used. From what we learned above, we know that the value of @title_color in the file theme-variables.less will use a value of #800 for @body_color because that was the last value it was assigned.

We can do better though. The example above hasn't quite embraced the power of LESS fully. Let's try again.
-------------------
| theme-base.less |
-------------------
@body_color: white;
@title_color: darken(@body_color, 25%);
@border_color: @title_color;
body { background-color: @body_color; }
h2 { color: @title_color; border: 1px solid @border_color; }
-------------------

----------------
| crimson.less |
----------------
@import "theme-base";
@body_color: #800;
----------------

Boom! That's all you have to do. Set up everything in your base stylesheet file. Define all variables and define them in terms of each other in as convoluted a manner as you like. The final value of each variable is determined before any CSS is generated.

This final version also has the added advantage that a syntax-checking IDE like JetBrains WebStorm or PhpStorm will be able to provide perfect assistance and validity checking. That wasn't true at all for any of the previous versions, where variable declarations were in different files.

Although I was seriously considering moving away from LESS and over to SASS -- because at least they didn't leave out such a basic feature, as I had thought crossly to myself -- I'm quite happy to have learned this lesson and am more happy with LESS than ever.

--------------------------------------------------------------------------------

[1] For those of you who already know how to fix this, stop smirking. I'm writing this post because it wasn't intuitive for me -- although now I see the utter elegance of it.

[2] I'd also seen the same concept in NAnt property tasks where you can use the now-deprecated overwrite="false" directive. For the curious, now you're supposed to use unless="${property::exists('property-name')}" instead, which is just hideous.

[3] There are exceptions, but "variables are global in LESS" is a good rule of thumb. One example is that if a parameter for a mixin has the same name as a globally assigned variable, the value within that mixin is taken from the parameter rather than the global.

[4] Seriously, LESS experts, stop smirking. I'm taking a long time to get there because a programmer's intuitive understanding of how variables work is a hard habit to break. Almost there.
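For programmers who want to see the two variable models side by side, here is a toy sketch in Python. It is not how the LESS compiler is implemented, and it ignores scopes and mixins entirely; resolve_top_down and resolve_cascading are hypothetical helpers, and darken() is left as an unevaluated string so that the example only shows which @body_color each model picks up.

```python
def resolve_top_down(definitions):
    """Programmer's intuition: each assignment sees the values assigned so far."""
    values = {}
    for name, expr in definitions:
        values[name] = expr(values.__getitem__) if callable(expr) else expr
    return values


def resolve_cascading(definitions):
    """Cascading model: the last definition of each name wins everywhere,
    and expressions are only evaluated once all definitions are known."""
    final = {}
    for name, expr in definitions:
        final[name] = expr  # a later definition replaces an earlier one

    cache = {}

    def lookup(name):
        if name not in cache:
            expr = final[name]
            cache[name] = expr(lookup) if callable(expr) else expr
        return cache[name]

    return {name: lookup(name) for name in final}


# The definitions from theme-base.less followed by the override in crimson.less.
definitions = [
    ("body_color", "white"),
    ("title_color", lambda get: "darken(" + get("body_color") + ", 25%)"),
    ("border_color", lambda get: get("title_color")),
    ("body_color", "#800"),
]

top_down = resolve_top_down(definitions)
cascading = resolve_cascading(definitions)

print(top_down["title_color"])
print(cascading["title_color"])
```

With the top-down model, the title color is still derived from white; with the cascading model, it is derived from the override, which is why an explicit "default" marker isn't needed.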

Rolling your own languages and frameworks

2014-02-09T23:08:59+01:00

Published by marco on 9. Feb 2014 23:08:59 ------------------------------------------------------------------------

The blog post/article "So You Want To Write Your Own Language?" by Walter Bright contains a lot of interesting information, related not only to parsing, but also to runtime and framework design. Bright is well-known as the designer of the D programming language, so he's definitely worth a read. I thought he jumped back and forth between topics a bit, so I summarized the contents for myself below:

[Parsing]

Bright identifies minimizing keystrokes, easy parsing and minimizing the number of keywords as false gods. Do not waste any time trying to satisfy these requirements; instead, let them flow naturally from a good design.

Your language should consist of productions that have only a single non-terminal on the left-hand side. That is, strive to make your language context-free. [1] The implication is that you're actually going to define the grammar rather than just winging it. This means that you can use a parser generator even though Bright says not to "bother wasting time with lexer or parser generators and other so-called 'compiler compilers.'"

I instead agree with the article "Advice on writing a programming language" by Ted Kaminski, which advises providing a grammar that can be used with parser generators because "many of those people eager to contribute either get stuck trying and failing to build a parser or trying and failing to learn to use the daunting internals of your compiler".

You can either make it easy for people to build compilers for your language or you can maintain a very friendly API for your own compiler. If you choose the API route, it might force you to be more disciplined, but it might also cause you no end of backwards-compatibility headaches as your compiler quickly evolves. Not only that, but you'd then have to make that API available for any number of languages and any number of platforms.
If you take the route of publishing the BNF, that may also not be enough. This is because it can still be daunting to convert a BNF to something that your parser generator can use, especially for non-trivial languages. Providing a grammar for a widely supported parser generator like "ANTLR" [2] will give those willing to build tools for your language a good jump-start.

"Use an LR parser generator. It’ll keep your language parsable, and make it easier to change early on. When your language becomes popular enough that you have the problem of parsing errors not being friendly enough for your users, celebrate that fact and hand-roll only then.

"And then reap the benefit of all the tooling an easily parsed language gets, since you know you kept it LR(1)."

[Error-handling]

Introduce redundancy into the language definition (e.g. semicolons as line-terminators in addition to whitespace/newlines) in order to make error-message generation much easier and much more likely to produce friendly output.

Compilers can handle error messages in different ways:

Bail out on the first error
  This is a good fallback, but it saves the developer a lot of work if you identify all of the root errors in source -- that is, errors that are not a consequence of another error.

Collect multiple errors
  In order to continue parsing/compiling after an error, the machine can take one of two approaches:

  * "Guess what the programmer intended, repair the syntax trees, and continue." (Bright) Bad guesses lead to spurious and inscrutable error messages which lead to developers no longer trusting their compilers. Avoid this approach as it is very difficult to get right.
  * Take the approach that Bright did with the D compiler: consider any part of the code that has an error as "poisoned". He likens it to the way that "floating-point NaNs are handled. Any operation with a NaN operand silently results in a NaN."
With this approach, "the compiler is able to detect multiple errors as long as the errors are in sections of code with no dependency between them", which yields only high-quality and relevant error messages for the developer.

[Stand on the shoulders of giants]

Do not re-invent the syntax for everything in your language. Instead, as Bright says, "[s]ave the divergence for features not generally seen before, which also signals the user that this is new."

[The runtime]

A language definition is nothing without a runtime. Bright recommends "taking the common sense approach and using an existing back end, such as the JVM, CLR, gcc, or LLVM. (Of course, I can always set you up with the glorious Digital Mars back end!)"

If you can avoid writing your own back-end, you should definitely do so. Similar to the approach recommended for parsing the language: start with a stock runtime and migrate to something custom if the needs of your project warrant it (they almost certainly won't). This is the approach taken by any number of other popular languages, like Scala.

[The framework]

And then there's the library/framework that accompanies the language and, arguably, helps to define it for people. Complaints about a language are often complaints about the standard runtime library/framework for the language. Developers quickly associate them and treat them as one entity.

Bright's focus is on very low-level runtimes (such as the one for his language, D) and thus his advice focuses on fast I/O, fast and efficient memory allocation/de-allocation and robust/fast transcendental functions [3]. However, he also offers the following excellent rule of thumb for any framework:

"My general rule is if the explanation for what the function does is more lines than the implementation code, then the function is likely trivia and should be booted out."

--------------------------------------------------------------------------------

[1] See "Example of why C++ is NOT a context free grammar?" by Kaz Kylheku and "Context-free grammar" for more information.

[2] As of publication time, the current stable release of ANTLR for C# available via NuGet -- and used by most other packages -- is version 3.5.1. However, the official home page is now "ANTLR" rather than "ANTLR3". The NuGet packages for version 4 are in pre-release, but it's very nice to see some progress being made after years of relatively minor upgrades. In particular, the new version of the IDE "ANTLRWorks" looks much nicer and seems to have been based on the JetBrains IDE framework. I'm definitely looking forward to checking it out in more detail.

[3] These are functions that cannot be composed of other functions in the framework, what I would call "core" or "root" functions. These are the functions that the developer would find it either impossible or incredibly difficult to replicate efficiently in the language itself.
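The "poisoning" approach to collecting multiple errors can be illustrated with a toy type checker. The sketch below is written in Python and is only an interpretation of the idea as summarized above; it makes no claim about how the D compiler implements it. The tuple-based expression format, the POISON sentinel and the check() function are all invented for this example.

```python
POISON = object()  # marks a sub-expression that already produced an error


def check(expr, symbols, errors):
    """Return the type of expr, or POISON if it contains an error."""
    kind = expr[0]
    if kind == "num":
        return "int"
    if kind == "var":
        name = expr[1]
        if name not in symbols:
            errors.append(f"undefined variable '{name}'")
            return POISON
        return symbols[name]
    if kind == "add":
        left = check(expr[1], symbols, errors)
        right = check(expr[2], symbols, errors)
        if left is POISON or right is POISON:
            # The root error was already reported; propagate silently,
            # much like an operation with a NaN operand yields NaN.
            return POISON
        if left != right:
            errors.append(f"cannot add {left} and {right}")
            return POISON
        return left
    raise ValueError(f"unknown expression kind: {kind}")


symbols = {"a": "int", "s": "string"}
errors = []

# Independent statements: each root error is reported exactly once.
check(("add", ("var", "x"), ("num",)), symbols, errors)
check(("add", ("var", "a"), ("var", "s")), symbols, errors)

# Dependent error: 'y' is undefined, so the outer addition stays silent
# instead of piling a type-mismatch complaint on top of the root error.
check(("add", ("add", ("var", "y"), ("num",)), ("var", "s")), symbols, errors)

for message in errors:
    print(message)
```

Three statements with problems produce exactly three messages, each of which points at a root cause rather than at a consequence of an earlier error.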

Quino: an introduction to query-mapping in the ORM

2014-02-07T09:57:07+01:00

Published by marco on 7. Feb 2014 09:57:07 ------------------------------------------------------------------------ The following article was originally published on the "Encodo blogs" and is cross-published here. --------------------------------------------------------------------------------

One of the most-used components of Quino is the ORM. An ORM is an Object-Relational Mapper, which accepts queries and returns data.

* Applications formulate queries in Quino using application metadata
* The ORM maps this query to the query language of the target database
* The ORM transforms the results returned by the database to objects (the classes for which were also generated from application metadata).

This all sounds a bit abstract, so let's start with a concrete example. Let's say that we have millions of records in an employee database. We'd like to get some information about that data using our ORM. With millions of records, we have to be a bit careful about how that data is retrieved.

[Attempt #1: Get your data and refine it locally]

The following example returns the correct information, but does not satisfy performance or scalability requirements. [1]

    var people = Session.GetList<Person>().Where(p => p.Company.Name == "IBM");
    Assert.That(people.Count(), Is.GreaterThanOrEqualTo(140000));

What's wrong with the statement above? Since the call to Where occurs after the call to GetList(), the restriction cannot possibly have been passed on to the ORM. The first line of code doesn't actually execute anything. It's in the call to Count() that the ORM and LINQ are called into action. Here's what happens, though:

* For each row in the Person table, create a Person object
* For each person object, create a corresponding Company object
* Count all people where the Name of the person's company is equal to "IBM".

The code above benefits from almost no optimization, instantiating a tremendous number of objects in order to yield a scalar result. The only side-effect that can be considered an optimization is that most of the related Company objects will be retrieved from cache rather than from the database. So that's a plus. Still, the garbage collector is going to be running pretty hot and the database is going to see far more queries than necessary. [3]

[Attempt #2: Refine results on the database]

Let's try again, using Quino's fluent querying API. [4] The Quino ORM can map much of this API to SQL. Anything that is mapped to the database is not performed locally and is, by definition, more efficient. [5]

    var people = Session.GetList<Person>();
    people.Query.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM"); // [6]
    Assert.That(people.Count, Is.GreaterThanOrEqualTo(140000));

First, we get a list of people from the Session. As of the first line, we haven't actually gotten any data into memory yet -- we've only created a container for results of a certain type (Person in this case). The default query for the list we created is to retrieve everything without restriction, as we saw in the first example. In this example, though, we restrict the Query to only the people that work for a company called "IBM". At this point, we still haven't called the database. The final line is the first point at which data is requested, so that's where the database is called. We ask the list for the number of entries that match it and it returns an impressive number of employees.

At this point, things look pretty good. In older versions of Quino, this code would already have been sufficiently optimized. It results in a single call to the database that returns a single scalar value with everything calculated on the database. Perfect.

[Attempt #3: Avoid creating objects at all]

However, since "v1.6.0" of Quino [7], the call to the property IDataList.Count has automatically populated the list with all matching objects as well. We made this change because the following code pattern was pretty common:

    var list = Session.GetList<Person>();
    // Adjust query here
    if (list.Count > 0)
    {
      // do something with all of the objects here
    }

That kind of code resulted in not one, but two calls to the database, which was killing performance, especially in high-latency environments. That means, however, that the previous example is still going to pull 140,000 objects into memory, all just to count them and add them to a list that we're going to ignore. The garbage collector isn't a white-hot glowing mess anymore, but it's still throwing you a "look of disapproval".

Since we know that we don't want the objects in this case, we can get the old behavior back by making the following adjustment.

    var people = Session.GetList<Person>();
    people.Query.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
    Assert.That(Session.GetCount(people.Query), Is.GreaterThanOrEqualTo(140000));

It would be even clearer to just forget about creating a list at all and work only with the query instead.

    var query = Session.GetQuery<Person>();
    query.Join(Person.Relations.Company).WhereEqual(Company.Fields.Name, "IBM");
    Assert.That(Session.GetCount(query), Is.GreaterThanOrEqualTo(140000));

Now that's a maximally efficient request for a number of people in Quino 1.10 as well.

Tune in next time for a look at what happens when a query can only be partially mapped to the database.

--------------------------------------------------------------------------------

[1] I suppose it depends on what those requirements are, but if you think your application's performance requirements are so loose that it's OK to create millions of objects in memory just in order to count them, then you're probably not in the target audience for this article.

[2] Instead of using a code-first approach, Quino uses a "metadata-first" approach, so the Person class used here is generated from the application metadata rather than written by the developer, as in other frameworks.

[3] There are different strategies for retrieving associated data. Quino does not yet support retrieving anything other than root objects. That is, the associated Company object is not retrieved in the same query as the Person object. In the example in question, the first indication that the ORM has that a Company is required is when the lambda retrieves them individually. Even if the original query had somehow indicated that the Company objects were also desired (e.g. using something like Include(Person.Relations.Company) as you would in EF), the optimal mapping strategy is still not clear. Should the mapper join the company table and retrieve that highly redundant data with each person? Or should it execute a single query for all companies and prime a cache with those? The right answer depends on the latency and bandwidth between the ORM and the database as well as myriad other conditions. When dealing with a lot of data, it's not hard to find examples where the default behavior of even a clever ORM isn't maximally efficient -- or even very efficient at all. As we already noted, though, the example in question does everything in memory. If we reasonably assume that the people belong to a relatively small number of companies -- say qc -- then the millions of calls to retrieve companies associated with people will result in a lot of cache hits and generate "only" qc + 1 queries.

[4] Quino does not have LINQ to SQL support. I'm not even going to write "yet" at the end of that sentence because it's not at all clear that we're ever going to have it. Popular demand might convince us otherwise, but for now we're quite happy with our API (and soon-to-be-revealed query language QQL).

[5] That's an assumption I'm going to make for which counterexamples certainly exist, but none of which apply to the simple examples we'll address in this article.

[6] The Person.Relations and Person.Fields static fields are generated with the Person class. These correspond to the application metadata and change when the metadata changes. Developers are encouraged to use these generated constants so that even metadata-based queries can be validated by the compiler.

[7] That was almost three years ago, in June of 2011.
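
The difference between counting locally and counting on the database can be sketched outside of Quino as well. The following is only an illustration in Python using the standard sqlite3 module; it is not Quino's API, and the schema, table names and helper functions are assumptions made up for the example.

```python
import sqlite3


def build_database() -> sqlite3.Connection:
    """Create a tiny in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, company_id INTEGER)")
    conn.executemany("INSERT INTO company VALUES (?, ?)", [(1, "IBM"), (2, "Encodo")])
    # 1000 people, alternating between the two companies
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(i, 1 + i % 2) for i in range(1000)],
    )
    return conn


def count_locally(conn: sqlite3.Connection, company_name: str) -> int:
    """Attempt #1 style: fetch every row, then filter and count in memory."""
    companies = dict(conn.execute("SELECT id, name FROM company"))
    people = conn.execute("SELECT id, company_id FROM person").fetchall()
    return sum(1 for _, company_id in people if companies[company_id] == company_name)


def count_on_database(conn: sqlite3.Connection, company_name: str) -> int:
    """Final style: the database joins, filters and returns a single scalar."""
    row = conn.execute(
        "SELECT COUNT(*) FROM person p "
        "JOIN company c ON c.id = p.company_id "
        "WHERE c.name = ?",
        (company_name,),
    ).fetchone()
    return row[0]


conn = build_database()
local_count = count_locally(conn, "IBM")
database_count = count_on_database(conn, "IBM")
```

Both functions return the same number, but the first one transfers and materializes every row in order to produce it, while the second transfers a single value.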

Generating JSON from Dart object graphs

2014-02-01T17:09:41+01:00

Published by marco on 1. Feb 2014 17:09:41 ------------------------------------------------------------------------ [image] A while back, I participated in an evaluation of languages that could replace JavaScript for our web front-end development language at "Encodo". We took a look at two contenders: Dart and TypeScript. At the time, Dart was weaker for the following reasons:

* It had not yet been released
* It had little to no tool support
* Integration with existing JS libraries was somewhat laborious

Though TypeScript has its weaknesses (it has technically not yet hit a 1.0 release), we eventually decided to go in that direction. Tool support in both Visual Studio and ReSharper is improving steadily and has gotten quite good. We've had quite positive results in one larger project.

Even with Dart in our wake, I am still curious to see how people are using it. I was surprised by the claims in the article "Why Dart should learn JSON while it’s still young" by Max Horstmann. Since Dart, unlike TypeScript, is not directly compatible with JavaScript, a JSON-formatted string cannot be implicitly interpreted either. Instead, you can import it using a library function. This is not really a problem, though one wonders if there are performance penalties for Dart that are not present in JavaScript/TypeScript.

Where the problem arises is in exporting JSON, which does not happen automagically. In non-client-side languages like C#, Newtonsoft's Json.NET library can serialize pretty much anything using reflection. JSON isn't baked into the language, but that isn't too surprising. However, in Dart, positioned as a contender for taking over from JavaScript as the client-side language of choice, the solution recommended even by Dart language gurus is to implement toJson() on all objects that you want to export. Either that, or use a probably non-optimized external library to serialize your object to JSON (likely using introspection, as Json.NET does).

I agree with the author of the blog that this is a red flag for using Dart in production projects. It's strange that Dart doesn't produce JSON without relying on external libraries. And the recommended library is, as of this writing, of pre-production/alpha quality -- the version number is 0.1.0 and the TODO list includes a bullet point that exhorts the author to "Write tests!".

So I'm still waiting to see what becomes of Dart, but the balkiness of the current solution for generating JSON not only makes it a bit of a tough fit for many current web applications, but also makes us urge caution despite its having recently been released (1.0 came out in November 2013).
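
To make the two export strategies concrete, here is a small sketch of the difference between a hand-written toJson()-style method on every class and a generic serializer that uses introspection. It is written in Python rather than Dart, so it only illustrates the pattern; the Person and Address classes and the reflect function are hypothetical.

```python
import json


class Address:
    def __init__(self, city):
        self.city = city

    def to_json(self):
        # Hand-written, per-class export (the toJson() style)
        return {"city": self.city}


class Person:
    def __init__(self, name, address):
        self.name = name
        self.address = address

    def to_json(self):
        # Every class has to know how to export its own children
        return {"name": self.name, "address": self.address.to_json()}


def reflect(obj):
    """Generic export via introspection: walk the object's attributes."""
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    if isinstance(obj, (list, tuple)):
        return [reflect(item) for item in obj]
    if isinstance(obj, dict):
        return {key: reflect(value) for key, value in obj.items()}
    return {key: reflect(value) for key, value in vars(obj).items()}


person = Person("Ada", Address("Zurich"))
manual = json.dumps(person.to_json(), sort_keys=True)
reflected = json.dumps(reflect(person), sort_keys=True)
```

The hand-written variant has to be maintained on every class that is exported; the reflective variant needs no per-class code but pays for the introspection at run time.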

The Ruby language: where you can randomize your base class

2014-02-01T12:38:27+01:00

Published by marco on 1. Feb 2014 12:38:27 ------------------------------------------------------------------------ [image] I have never really examined Ruby in detail but it seems to be even more of a treasure-trove of ad-hoc features than PHP. It takes full advantage of being evaluated at run-time to offer features that I haven't seen in even other dynamic languages. Some of these features seem like they might be nice shortcuts but also seem like they would be difficult to optimize. Not only that, but they seem so obscure that they would likely trip up even more seasoned users of the language.

At any rate, the one I found to be most brash was "methods in class definitions" by bjeanes. (The article is a treasure trove of other gems, no pun intended.) The example below shows how that might work.

    class RandomSubclass < [Array, Hash, String, Fixnum, Float, TrueClass].sample
    end

    RandomSubclass.superclass # could output one of 6 different classes.

The language allows you to call methods from the "extends" clause. The example above creates an array of classes, then calls the sample method on it to yield a base class. The actual base class is not only unknown at compile time, it is also unpredictable at runtime.

The example above is contrived and makes the feature seem like it's only for the reckless. It's clear that serious software would have to forbid or strictly limit the use of such a feature, but I can see where it would be useful. For example, you may want to change your base class depending on deployment parameters. If you're deploying to a testing or staging environment, you'll use a base class that includes more logging, profiling and debugging code. For production, you switch to a base class that's optimized. If the class interface remains the same, then using this feature wouldn't be as dangerous as it initially appeared.

Still, ensuring quality and enforcing architecture in software written in such a language would require a strict development process as well as discipline and vigilance from all involved.
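
The deployment-dependent base class described above can be sketched as follows. The sketch uses Python instead of Ruby, since Python also accepts an arbitrary expression in the base-class list; the class names, the pick_base helper and the environment strings are all hypothetical.

```python
class InstrumentedBase:
    """Hypothetical base class with extra logging for testing/staging."""

    def __init__(self):
        self.log = []

    def run(self, value):
        self.log.append(f"run({value})")
        return value * 2


class OptimizedBase:
    """Hypothetical lean base class for production."""

    def run(self, value):
        return value * 2


def pick_base(environment):
    # The base class is chosen by an ordinary expression at run time
    return OptimizedBase if environment == "production" else InstrumentedBase


def make_service(environment):
    # The base-class list contains a function call, as in the Ruby example
    class Service(pick_base(environment)):
        pass

    return Service


staging_service = make_service("staging")()
production_service = make_service("production")()
```

Because both base classes expose the same run method, callers cannot tell the two variants apart by their interface; only the side effects (here, the log) differ.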

Apple Developer Videos

2014-01-05T11:46:53+01:00

Published by marco on 5. Jan 2014 11:46:53 ------------------------------------------------------------------------ It's well-known that Apple runs a walled garden. Apple makes its developers pay a yearly fee to get access to that garden. In fairness, though, they do provide some seriously nice-looking APIs for their iOS and OS X platforms. They've been doing this for years, as listed in the post "iOS 7 only is the only sane thing to do" by Tal Bereznitskey . It argues that the new stuff in iOS 7 is compelling enough to make developers consider dropping support for all older operating systems. And this for pragmatic reasons, such as having far less of your own code to support and correspondingly making the product cost less to support. It's best to check your actual target market, but Apple users tend to upgrade very quickly and reliably, so an iOS 7-only strategy is a good option. Among the improvements that Apple has brought in the recent past are blocks (lambdas), GCD (asynchronous execution management) and ARC (mostly automated memory management), all introduced in iOS 4 and OS X 10.6 Snow Leopard. OS X 10.9 Mavericks and iOS 7 introduced a slew of common UI improvements (e.g. AutoLayout and HTML strings for labels). [1] To find the videos listed below, browse to "WWDC 2013 Development Videos" . For the web, Apple has improved developer tools and support in Safari considerably. There are two pretty good videos demonstrating a lot of these improvements: #601: Getting to Know Web Inspector This video shows a lot of improvements to Safari 7 debugging, in the form of a much more fluid and intuitive Web Inspector and the ability to save changes made there directly back to local sources. #603: Getting the Most Out of Web Inspector This video shows how to use the performance monitoring and analysis tools in Safari 7. The demonstration of how to optimize rendering and compositing layers was really interesting. 
For non-web development, Apple has been steadily introducing libraries to provide support for common application tasks, the most interesting of which are related to UI APIs like Core Image, Core Video, Core Animation, etc. Building on top of these, Apple presents the Sprite Kit -- for building 2D animated user interfaces and games -- and the Scene Kit -- for building 3D animated user interfaces and games. There are some good videos demonstrating these APIs as well. #500: What’s New in Scene Kit An excellent presentation content-wise; the heavily accented English is sometimes a bit difficult to follow, but the material is top-notch. #502: Introduction to Sprite Kit This is a good introduction to nodes, textures, actions, physics and the pretty nice game engine that Apple delivers for 2D games. #503: Designing Games with Sprite Kit The first half is coverage of tools and assets management along with more advanced techniques. The second half is with game designers Graeme Devine [2] and Spencer Lindsay, who designed the full-fledged online multi-player game Adventure to showcase the Sprite Kit. -------------------------------------------------------------------------------- [1] Disclaimer: I work with C# for Windows and HTML5 applications of all stripes. I don't actually work with any of the technologies that I listed above. The stuff looks fascinating, though, and, as a framework developer, I'm impressed by the apparent cohesiveness of their APIs. Take recommendations with a grain of salt; it could very well be that things are a good deal less rosy when you actually have to work with these technologies. [2] Formerly of Trilobyte and then id Software, now at Apple.

Brilliant articles by the funniest guy at Microsoft

2013-12-29T23:09:53+01:00

Published by marco on 29. Dec 2013 23:09:53 ------------------------------------------------------------------------ I recently stumbled upon some "Essays from the funniest man in Microsoft Research" by Raymond . He is such a funny writer that this article, against convention, will consist mostly of citations rather than an even mix of citations and paraphrasing that I naturally consider to be much more lucid and pithy. I quote at length to do the material justice, for documentation and to ensure that you all download the PDFs to see if there is more where that came from (there is). All emphases have been added. "Mobile Computing Research Is a Hornet’s Nest of Deception and Chicanery" by James Mickens (PDF) On the delusions of the mobile-computing world: "Mobile computing researchers are a special kind of menace. They don’t smuggle rockets to Hezbollah, or clone baby seals and then make them work in sweatshops for pennies a day. That’s not the problem with mobile computing people. The problem with mobile computing people is that they have no shame. They write research papers with titles like “Crowdsourced Geolocation-based Energy Profiling for Mobile Devices,” as if the most urgent deficiency of smartphones is an insufficient composition of buzzwords." On browsing web pages: "When I use a mobile browser to load a web page, I literally have no expectation that anything will ever happen. 
A successful page load is so unlikely, so unimaginable, that mobile browsers effectively exist outside of causality—the browser is completely divorced from all action verbs, and can only be associated with sad, falling-tone sentences like “I had to give up after twenty seconds.” " On the fragility of touchscreens: "Note that, when I say that you will “drop” your touchscreen, I do not mean “drop” in the layperson sense of “to release from a non-trivial height onto a hard surface.” I mean “drop” in the sense of “to place your touchscreen on any surface that isn’t composed of angel feathers and the dreams of earnest schoolchildren.” Phones and tablets apparently require Planck-scale mechanical alignments, such that merely looking at the touchscreen introduces fundamental, quantum dynamical changes in the touchscreen’s dilithium crystals. Thus, if you place your touchscreen on anything, ever, you have made a severe and irreversible life mistake." On the sheer touchiness of touchscreens: "On your touchscreen, your swipes will become pinches, and your pinches will become scrolls, and each one of your scrolls will become a complex thing never before seen on this earth, a leviathan meta-touch event of such breadth and complexity that your phone can only respond like Carrie White at the prom. 
So, your phone just starts doing stuff, all the stuff that it knows how to do, and it’s just going nuts, and your apps are closing and opening and talking to the cloud and configuring themselves in unnatural ways, and your phone starts vibrating and rumbling with its little rumble pack, and it will gently sing like a tiny hummingbird of hate, and you’ll look at the touchscreen, and you’ll see that things are happening, my god, there are so many happenings, and you’ll try to flip the phone over and take out the battery, because now you just want to kill it and move to Kansas and start over, [...]" On the uselessness of most mobile computing: "When you purchase a mobile device, you are basically saying, “I endorse the operational inefficiency of the modern bourgeoisie lifestyle, even though I could find a rock and tie a coat hanger around it and have a better chance of having a phone conversation that doesn’t sound like two monsters arguing about German poetry.”" "The Slow Winter" by James Mickens (PDF) On flying in the early 21st century: "The point is that flying in airplanes used to be fun, but now it resembles a dystopian bin-packing problem in which humans, carry-on luggage, and five dollar peanut bags compete for real estate while crying children materialize from the ether and make obscure demands in unintelligible, Wookie-like languages while you fantasize about who you won’t be helping when the oxygen masks descend." 
On how awesome it was being a hardware architect before things got all quantum and messy: "Of course, pride precedes the fall, and at some point, you realize that to implement aggressive out-of-order execution, you need to fit more transistors into the same die size, but then a material science guy pops out of a birthday cake and says YEAH WE CAN DO THAT, and by now, you’re touring with Aerosmith and throwing Matisse paintings from hotel room windows, because when you order two Matisse paintings from room service and you get three, that equation is going to be balanced. It all goes so well, and the party keeps getting better. When you retire in 2003, your face is wrinkled from all of the smiles, and even though you’ve been sued by several pedestrians who suddenly acquired rare paintings as hats, you go out on top, the master of your domain. " On quantum-level effects in modern processors: "They randomly switched states; they leaked voltage; they fell prey to the seductive whims of cosmic rays that, unlike the cosmic rays in comic books, did not turn you into a superhero, but instead made your transistors unreliable and shiftless, like a surly teenager who is told to clean his room and who will occasionally just spray his bed with Lysol and declare victory." On scaling in cores when processor speed and more transistors became too messy: "John did what any reasonable person would do: he cloaked himself in a wall of denial and acted like nothing had happened. 
“Making processors faster is increasingly difficult,” John thought, “but maybe people won’t notice if I give them more processors.” This, of course, was a variant of the notorious Zubotov Gambit, named after the Soviet-era car manufacturer who abandoned its attempts to make its cars not explode, and instead offered customers two Zubotovs for the price of one [...]" On the main purpose that people have for their computers: "Lay people use their computers for precisely ten things, none of which involve massive computational parallelism, and seven of which involve procuring a vast menagerie of pornographic data and then curating that data using a variety of fairly obvious management techniques, like the creation of a folder called “Work Stuff,” which contains an inner folder called “More Work Stuff,” where “More Work Stuff” contains a series of ostensible documentaries that describe the economic interactions between people who don’t have enough money to pay for pizza and people who aren’t too bothered by that fact. " A summary of the state of the world of hardware design and development: "[...] you brought the fire down from Olympus, and the mortals do with it what they will. But now, all the easy giants were dead, and John was left to fight the ghosts that Schrödinger had left behind." "The Night Watch" by James Mickens (PDF) What it's like to be a systems (low-level) programmer: "A systems programmer will know what to do when society breaks down, because the systems programmer already lives in a world without law." On why people still use C++ (or a response to the snotty question of: "why don't you just use high-level language X instead?") "Why not use a modern language with garbage collection and functional programming and free massages after lunch? Here’s the answer: Pointers are real. They’re what the hardware understands. Somebody has to deal with them. 
You can’t just place a LISP book on top of an x86 chip and hope that the hardware learns about lambda calculus by osmosis. [...] Pointers are like [...] real, living things that must be dealt with so that polite society can exist. Make no mistake, I don’t want to write systems software in a language like C++. [...] When it’s 3 A.M., and you’ve been debugging for 12 hours, and you encounter a virtual static friend protected volatile templated function pointer, you want to [...] find the people who wrote the C++ standard and bring ruin to the things that they love." On being thankful for systems programmers: "That being said, if you find yourself drinking a martini and writing programs in garbage-collected, object-oriented Esperanto, be aware that the only reason that the Esperanto runtime works is because there are systems people who have exchanged any hope of losing their virginity for the exciting opportunity to think about hex numbers and their relationships with the operating system, the hardware, and ancient blood rituals that Bjarne Stroustrup performed at Stonehenge." On how difficult it is to work in extremely fragile territory (rather than a safe runtime): "Indeed, I would [have...checked the log files for errors] if I hadn’t broken every component that a logging system needs to log data. I have a network file system, and I have broken the network, and I have broken the file system, and my machines crash when I make eye contact with them. I HAVE NO TOOLS BECAUSE I’VE DESTROYED MY TOOLS WITH MY TOOLS." A backhanded swipe at the utter uselessness of many UI concerns: "I’m glad that people are working on new kinds of bouncing icons because they believe that humanity has solved cancer and homelessness and now lives in a consequence-free world of immersive sprites."

How to fool people into giving up their email address

2013-11-07T20:56:06+01:00

Published by marco on 7. Nov 2013 20:56:06 ------------------------------------------------------------------------ On "Codecademy", you can learn to program in various languages. It starts off very slowly and is targeted at non-technical users. That's their claim anyway -- the material in the courses I looked at ramps up pretty quickly. Anyway, the interesting thing I saw was in their introductory test. It struck me as a subtle way to get you to enter your email address. I'd just recently discussed this on a project I'm working on: how can we make it fun for the user to enter personal information? The goal is not to sell that information (not yet anyway, but who knows what the future holds), but to be able to enhance -- nay, personalize -- the service. Personalizing has a bad reputation but can be very beneficial. For example, if you're using a site for free and you're going to see offers and advertisements anyway, isn't it better to enter a bit of data that will increase the likelihood that offers and ads are interesting? Each person can -- and should -- decide for themselves what to make public, but the answer isn't always necessarily no.

[How Codecademy gets your email]

[image] Here they teach you how to use the "length" method by measuring your email address. Sneaky. I like it.

[image] Even if you don't give them an address, they re-prompt you to enter your email, but it doesn't come across as pushy because you're taking a test.

I thought that this was pretty subtle. Because of the context, people who would ordinarily be sensitive to giving up their email might not even notice. Why? Because they want to answer the question *correctly*. They don't want the site to judge them for having entered something wrong, so they do as they're told. Is Codecademy collecting emails this way? I have no way to be sure, but they'd be silly not to.

The HTML5 AppCache and HTTP Authentication

2013-11-03T11:17:36+01:00

Published by marco on 3. Nov 2013 11:17:36 ------------------------------------------------------------------------ The following article was originally published on the "Encodo blogs" and is cross-published here. --------------------------------------------------------------------------------

"The following article outlines a solution to what may end up being a temporary problem. The conditions are very specific: no server-side logic; HTTP authentication; AppCache as it is implemented by the target platforms -- Safari Mobile and Google Chrome -- in late 2012/early 2013. The solution is not perfect but it's workable. We're sharing it here in the hope that it can help someone else or serve as a base for a better solution."

[The HTML5 AppCache]

The application cache is a relatively new feature with the following characteristics:

* It is supported by all modern browsers
* It uses a manifest file that indicates which files to cache
* The browser checks the manifest for changes
* If there are changes, all files are refreshed
* External links work when online
* When offline, the application works with the local cache
* External links to non-cached content are redirected to fallback links

[AppCache Limitations]

Web applications can use the HTML5 application-cache to store local content, but different browsers apply different restrictions to the amount of space allocated per domain.

* Safari Mobile is limited to 50MB per domain. This means that the restriction will generally apply to all content packages downloaded from the same server/domain
* Google Chrome is limited as well, but the actual limit is a bit of a moving target

[Optimizing the HTML5 AppCache for Authenticated Content]

In particular, the Safari Mobile browser cannot update the application cache for files for which it must obtain authentication.

* Some requests do not trigger authentication
  * Manifest file
  * Home-screen icons
* A lost connection or timeout can invalidate the authentication token
* Version checks are not reliable
* Open pages/running apps do not check for status updates
* Home-screen apps don’t reliably check on startup
* This can lead to out-of-date or missing content

[Checking for and presenting updates to the user]

The graphic below illustrates the mechanism by which a content package in a web application can manage content updates and present them to the user.

* When online, the software regularly checks whether an update for the package is available
* The user can determine whether to install an update
* When an update has been found, the software stops checking for updates until the user has applied the latest update
* If the user delays the update, the user interface displays an “update” button
* The software will automatically start checking for updates whenever it detects that it is online
* There is no way for a user to ignore updates
* When the user proceeds with an update, the latest version is retrieved at that time, ensuring that the user has the latest version

[image]

[Solving the problems with authenticated data and the AppCache]

In order to address the problems described above, the UA products use a separate version file to check for updates independent of the browser’s application-cache mechanism and to trigger this update only when authentication has been reestablished.

[image]

* The cache.version.txt file is publicly available but is very small and includes only a unique version number that is also included in the cache.manifest file (both of which are generated by a deployment script).
* The software compares this version number against the last known good version number. If it differs, it knows that the server has been updated with new content for this package
* Before the software can kick off the HTML5 AppCache update process, it must ensure that the user is authenticated and authorized to retrieve the update package (because most browsers will simply fail silently if this is not the case).
* The software pulls the force.password.txt file from the private zone with an explicit request. The browser will ask the user to authenticate, if necessary. This file is also very small to avoid needlessly downloading a large amount of data simply to force re-authentication.
* Once the user has authenticated, the software lets the automated HTML5 AppCache update take over, retrieving first the cache.manifest file and then updating files as needed. The user is notified that this download is taking place asynchronously.
* The software receives a notification from the browser that the update is complete and can record the version number and then notify the user that the update has been applied and is ready to use.

This approach worked relatively well for us, although we continue to refine it based on feedback and experience.
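
The order of the steps listed above can be sketched as plain logic. The following is a simulation in Python, not the browser-side code that the article describes; the function names are hypothetical, and the callables stand in for the requests for cache.version.txt and force.password.txt and for the browser's own AppCache update.

```python
def check_for_update(fetch_public_version, last_known_version):
    """Return the new version if the server has newer content, else None."""
    server_version = fetch_public_version().strip()
    return server_version if server_version != last_known_version else None


def apply_update(new_version, force_authentication, run_appcache_update):
    """Authenticate first, then let the (simulated) AppCache update run."""
    if not force_authentication():
        return None  # not authenticated: do not start the cache update
    run_appcache_update()
    return new_version  # the caller records this as the last known good version


# Simulated run: the server reports version "42", the client last saw "41"
events = []
new_version = check_for_update(lambda: "42\n", last_known_version="41")
recorded = apply_update(
    new_version,
    force_authentication=lambda: events.append("auth") or True,
    run_appcache_update=lambda: events.append("update"),
)
```

The point of the ordering is that the cache update is never started unless the explicit, authenticated request has succeeded first.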

Entity Framework Generated SQL

2013-10-21T22:56:04+02:00

Published by marco on 21. Oct 2013 22:56:04 Updated by marco on 12. Jun 2018 20:06:15 ------------------------------------------------------------------------ Microsoft just recently released "Visual Studio 2013" , which includes Entity Framework 6 and introduces "a lot of new features" . It reminded me of the following query that EF generated for me, way, way back when it was still version 3.5. Here's hoping that they've taken care of this problem since then. So, the other day EF (v3.5) seemed to be taking quite a while to execute a query on SQL Server. This was a pretty central query and involved a few joins and restrictions, but wasn't anything too wild. All of the restrictions and joins were on numeric fields backed by indexes. In these cases, it's always best to just fire up the profiler and see what kind of SQL is being generated by EF. It was a pretty scary thing (I've lost it unfortunately), but I did manage to take a screenshot of the query plan, shown below. [image] It doesn't look too bad until you notice that the inset on the bottom right (the black smudgy thing) is a representation of the entire query ... and that it just kept going on down the page.

Ignoring files with Git

2013-07-15T01:38:50+02:00

Published by marco on 15. Jul 2013 01:38:50 ------------------------------------------------------------------------ The helpful page, "Ignoring files", taught me something I didn't know: there's a file you can use to ignore files in your local Git repository without changing anyone else's repository. Just to recap, here are the ways to ignore a file:

* Global .gitignore: you can designate basic exclusion directives that apply to all repositories on your system. This file is not committed to any repository or shared with others. Execute git config --global core.excludesfile ~/.gitignore_global to set the file to ~/.gitignore_global (for example). See the "linked article" for sample directives.
* Per-repository global exclusions: add directives to the .git/info/exclude file in any repository. These directives are combined with any system-global directives to form the base exclusions for that repository. This file is not committed with the repository. This is the one I'd never heard of before.
* .gitignore: add a file with this name to any directory. The directives in that file are merged with those from the parent directory to define the patterns that are excluded in that directory and all child directories. This is definitely the most common way to exclude files.
* Exclude versioned files: and, finally, if your repository has files that are changed but not committed (e.g. configuration files), you can ignore future changes to those files with a call to git update-index --assume-unchanged path/to/file.txt. While this can be useful for legacy projects, it's best to structure new projects so developers don't have to rely on easily forgotten tricks like this.

Some new CSS length units (and some lesser-known ones)

2013-07-14T23:17:53+02:00

Published by marco on 14. Jul 2013 23:17:53 ------------------------------------------------------------------------ I've been using CSS since its inception and use many parts of the CSS3 specification for both personal work and work I do for "Encodo". Recently, I read about some length units I'd never heard of in the article "CSS viewport units: vw, vh, vmin and vmax" by Chris Mills.
* 1vw: 1% of viewport width
* 1vh: 1% of viewport height
* 1vmin: 1vw or 1vh, whichever is smaller
* 1vmax: 1vw or 1vh, whichever is larger
These should be eminently useful for responsive designs. While there is wide support for these new units, that support is only available in the absolute latest versions of browsers. See the article for a good example of how these can be used. While the ones covered in the article are actually new, there are others that have existed for a while but that I've never had occasion to use. The "Font-relative lengths: the ‘em’, ‘ex’, ‘ch’, ‘rem’ units" section lists the following units:
* em: This one is well-known: 1em is equal to the "computed value of the 'font-size' property of the element on which it is used."
* ex: Equal to the height of the letter 'x' in the font of the element on which it is used. This is useful when you want to size a container based on the height of a lower-case letter -- i.e. tighter -- rather than on the full size of the font (as you get with em).
* ch: "Equal to the advance measure of the "0" (ZERO, U+0030) glyph found in the font used to render it." Since all digits have the same width in many fonts (those with tabular figures), this unit is probably useful for pages that need to measure and render numbers in a reliable vertical alignment.
* rem: The same as em but always resolves against the font-size of the root element of the page rather than that of the current element. Elements that use this unit will all scale against a common size, independently of the font-size of their contents.
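The arithmetic behind the viewport units and the em/rem pair can be written down in a few lines. This is only an illustrative sketch in Python, not browser code: the function names are invented, lengths are plain pixel numbers, and ex and ch are left out because they depend on real font metrics.

```python
def viewport_length_to_px(value, unit, viewport_width, viewport_height):
    """Convert a vw/vh/vmin/vmax length to pixels for a given viewport size."""
    reference = {
        "vw": viewport_width,
        "vh": viewport_height,
        "vmin": min(viewport_width, viewport_height),
        "vmax": max(viewport_width, viewport_height),
    }[unit]
    # One unit is 1% of the reference dimension.
    return value * reference / 100


def font_relative_length_to_px(value, unit, element_font_size, root_font_size):
    """Convert an em/rem length to pixels from the relevant font size."""
    reference = {"em": element_font_size, "rem": root_font_size}[unit]
    return value * reference
```

For a 1280x720 viewport, 10vw and 10vmax both come out at 128 pixels, while 10vh and 10vmin come out at 72. With a 20px element inside a page whose root font size is 16px, 2em is 40 pixels and 2rem is 32.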
"There’s more to the CSS rem unit than font sizing" by Roman Rudenko has a lot more information and examples, as well as an explanation of how rem can stand in for the still nascent support for vw.

.NET 4.5.1 and Visual Studio 2013 previews are available

2013-06-29T17:00:15+02:00

Published by marco on 29. Jun 2013 17:00:15 Updated by marco on 29. Jun 2013 17:04:02 ------------------------------------------------------------------------ The following article was originally published on the "Encodo blogs" and is cross-published here. -------------------------------------------------------------------------------- The article "Announcing the .NET Framework 4.5.1 Preview" provides an incredible amount of detail about a relatively exciting list of improvements for .NET developers. [x64 Edit & Continue] First and foremost, the Edit-and-Continue feature is now available for x64 builds as well as x86 builds. Whereas an appropriate cynical reaction is that "it's about damn time they got that done", another appropriate reaction is to just be happy that they will finally support x64-debugging as a first-class feature in Visual Studio 2013. Now that they have feature-parity for all build types, they can move on to other issues in the debugger (see the list of suggestions at the end). [Async-aware debugging] We haven't had much opportunity to experience the drawbacks of the current debugger vis à vis asynchronous debugging, but the experience outlined in the call-stack screenshot below is one that is familiar to anyone who's done multi-threaded (or multi-fiber, etc.) programming. [image] Instead of showing the actual stack location in the thread within which the asynchronous operation is being executed, the new and improved version of the debugger shows a higher-level interpretation that places the current execution point within the context of the async operation. This is much more in keeping with the philosophy of the async/await feature in .NET 4.5, which lets developers write asynchronous code in what appears to be a serial fashion. This improved readability has now been carried over to the debugger as well. 
[image] [Return-value inspection] The VS2013 debugger can now show the "direct return values and the values of embedded methods (the arguments)" for the current line. [1] Instead of manually selecting the text segment and using the Quick Watch window, you can now just see the chain of values in the "Autos" debugger pane. [image] [NuGet Improvements] "We are also releasing an update in Visual Studio 2013 Preview to provide better support for apps that indirectly depend on multiple versions of a single NuGet package. You can think of this as sane NuGet library versioning for desktop apps." We've been bitten by the aforementioned issue and are hopeful that the solution in Visual Studio 2013 will fill the gaps in the current release. The article describes several other improvements to the NuGet services, including integration with Windows Update for large-scale deployment. They also mention "a curated list of Microsoft .NET Framework NuGet Packages to help you discover these releases, published in OData format on the NuGet site", but don't say whether the NuGet UI in VS2013 has been improved. The current UI, while not as awful and slow as initial versions, is still not very good for discovery and is quite clumsy for installation and maintenance. [User Voice for Visual Studio/.NET] You're not limited to just waiting on the sidelines to see which feature Microsoft has decided to implement in the latest version of .NET/Visual Studio. You should head over to the "User Voice for Visual Studio" site to get an account and vote for the issues you'd like them to work on next. Here's a list of the ones I found interesting, and some of which I've voted on. * Support Edit & Continue for all method bodies: While it is, to some degree, understandable that methods, fields and constants cannot be added or removed easily without restarting the debugger, method bodies can be modified. Unless they include lambdas or generics. 
Experience has shown that this means that we can't use Edit & Continue for most of our code. VS needs to "allow the modification of lambdas and anonymous methods" or the even higher-rated issue, "Debug Lambda expressions" . The other missing piece of this puzzle is to add "Support for generics under Edit & Continue" . * Better Generics and Design-by-Contract support: Language feature requests include "Expand Generic Constraints for constructors" , allow "[p]roper (generic) type ali[a]sing" and, near and dear to my heart, "Integrate Code Contracts more deeply in the .NET Framework" or "Integrate Code Contract Keywords into the main .Net Languages" . * There seem to be a lot of people asking Microsoft to work on "Silverlight 6" . Having developed several applications in Silverlight 5, we wouldn't be averse to seeing continued support rather than obsolescence for those projects. [2] * Make Visual Studio run faster and leaner: There are also pleas to "Decrease the Memory Footprint" (of Visual Studio) or "Make Visual Studio startup and shutdown faster" and "Make [Visual Studio] Solutions Load Faster" or at least "Improve performance of Visual Studio builds and other uses better when working with solutions with several projects (ex. 40+)" . Other ideas in this vein include adding a "Visual Studio Task Manager" so we can at least see which components, plugins and panes are causing trouble. If all else fails, just "[m]ake VS scalable by switching to 64 bit" . -------------------------------------------------------------------------------- [2] In a similar vein, I found the issue "Bring back Classic Visual Basic, an improved version of VB6" to be interesting, simply because of the large number of votes for it (1712 at the time of writing). While it's understandable that VB6 developers don't understand the programming paradigm that came with the transition to .NET, the utterly reactionary desire to go back to VB6 is somewhat unfathomable. 
It's 2013, you can't put the dynamic/lambda/jitted genie back in the bottle. If you can't run with the big dogs, you'll have to stay on the porch...and stop being a developer. There isn't really any room for software written in a glorified batch language anymore. [1] This feature has been available for the unmanaged-code debugger (read: C++) for a while now.

Deleting multiple objects in Entity Framework

2013-06-08T09:43:11+02:00

Published by marco on 8. Jun 2013 09:43:11 Updated by marco on 9. Jun 2013 09:38:28 ------------------------------------------------------------------------ The following article was originally published on the "Encodo blogs" and is cross-published here. -------------------------------------------------------------------------------- Many improvements have been made to Microsoft's Entity Framework (EF) since I last used it in production code. In fact, we'd last used it waaaaaay back in 2008 and 2009 when EF had just been released. Instead of EF, I've been using the "Quino" ORM whenever I can. However, I've recently started working on a project where EF5 is used (EF6 is in the late stages of release, but is not yet generally available for production use). Though I'd been following the latest EF developments via the "ADO.Net blog" , I finally had a good excuse to become more familiar with the latest version with some hands-on experience. [Our history with EF] "Entity Framework: Be Prepared" was the first article I wrote about working with EF. It's quite long and documents the pain of using a 1.0 product from Microsoft. That version supported only a database-first approach, the designer was slow and the ORM SQL-mapper was quite primitive. Most of the tips and advice in the linked article, while perhaps amusing, are no longer necessary (especially if you're using the Code-first approach, which is highly recommended). Our next update, "The Dark Side of Entity Framework: Mapping Enumerated Associations" , discusses a very specific issue related to mapping enumerated types in an entity model (something that Quino does very well). This shortcoming in EF has also "been addressed" but I haven't had a chance to test it yet. Our final article was on performance, "Pre-generating Entity Framework (EF) Views" , which, while still pertinent, no longer needs to be done manually (there's an "Entity Framework Power Tools" extension for that now). 
So let's just assume that that was the old EF; what's the latest and greatest version like? Well, as you may have suspected, you're not going to get an article about Code-first or database migrations. [1] While a lot of things have been fixed and streamlined to be not only much more intuitive but also work much more smoothly, there are still a few operations that aren't so intuitive (or that aren't supported by EF yet). [Standard way to delete objects] One such operation is deleting multiple objects in the database. It's not that it's not possible, but that the only solution that immediately appears is to, * load the objects to delete into memory, * then remove these objects from the context * and finally save changes to the context, which will remove them from the database The following code illustrates this pattern for a hypothetical list of users. var users = context.Users.Where(u => u.Name == "John"); foreach (var u in users) { context.Users.Remove(u); } context.SaveChanges(); This seems somewhat roundabout and quite inefficient. [2] [Support for batch deletes?] While the method above is fine for deleting a small number of objects -- and is quite useful when removing different types of objects from various collections -- it's not very useful for a large number of objects. Retrieving objects into memory only to delete them is neither intuitive nor logical. The question is: is there a way to tell EF to delete objects based on a query from the database? I found an example attached as an answer to the post "Simple delete query using EF Code First" . The gist of it is shown below. context.Database.SqlQuery<User>( "DELETE FROM Users WHERE Name = @name", new [] { new SqlParameter("@name", "John") } ); To be clear right from the start, using ESQL is already sub-optimal because the identifiers are not statically checked. 
This query will cause a run-time error if the model changes so that the "Users" table no longer exists or the "Name" column no longer exists or is no longer a string. Since I hadn't found anything else more promising, though, I continued with this approach, aware that it might not be usable as a pattern because of the compile-time trade-off. Although the answer had four up-votes, it is not clear that either the author or any of his fans have actually tried to execute the code. The code above returns an IEnumerable but doesn't actually do anything. After I'd realized this, I went to MSDN for more information on the SqlQuery method. The documentation is not encouraging for our purposes (still trying to delete objects without first loading them), as it describes the method as follows (emphasis added), "Creates a raw SQL query that will return elements of the given generic type. The type can be any type that has properties that match the names of the columns returned from the query, or can be a simple primitive type." This does not bode well for deleting objects using this method. Creating an enumerable does very little, though. In order to actually execute the query, you have to evaluate it. Die Hoffnung stirbt zuletzt [3] as we like to say on this side of the pond, so I tried evaluating the enumerable. A foreach should do the trick. var users = context.Database.SqlQuery<User>( "DELETE FROM Users WHERE Name = @name", new [] { new SqlParameter("@name", "John") } ); foreach (var u in users) { // NOP? } As indicated by the "NOP?" comment, it's unclear what one should actually do in this loop because the query already includes the command to delete the selected objects. Our hopes are finally extinguished with the following error message: System.Data.EntityCommandExecutionException : The data reader is incompatible with the specified 'Demo.User'. A member of the type, 'Id', does not have a corresponding column in the data reader with the same name. 
That this approach does not work is actually a relief because it would have been far too obtuse and confusing to use in production. It turns out that the SqlQuery only works with SELECT statements, as was strongly implied by the documentation. var users = context.Database.SqlQuery<User>( "SELECT * FROM Users WHERE Name = @name", new [] { new SqlParameter("@name", "John") } ); Once we've converted to this syntax, though, we can just use the much clearer and compile-time--checked version that we started with, repeated below. var users = context.Users.Where(u => u.Name == "John"); foreach (var u in users) { context.Users.Remove(u); } context.SaveChanges(); So we're back where we started, but perhaps a little wiser for having tried. [Deleting objects with Quino] As a final footnote, I just want to point out how you would perform multiple deletes with the Quino ORM. It's quite simple, really. Any query that you can use to select objects you can also use to delete objects [4]. So, how would I execute the query above in Quino? Session.Delete(Session.CreateQuery().WhereEquals(User.MetaProperties.Name, "John").Query); To make it a little clearer instead of showing off with a one-liner: var query = Session.CreateQuery(); query.WhereEquals(User.MetaProperties.Name, "John"); Session.Delete(query); Quino doesn't support using Linq to create queries, but its query API is still more statically checked than ESQL. You can see how the query could easily be extended to restrict on much more complex conditions, even including fields on joined tables. -------------------------------------------------------------------------------- [1] As I wrote, we're using Code-first, which is much more comfortable than using the database-diagram editor of old. We're also using the nascent "Migrations" support, which has so far worked OK, though it's nowhere near as convenient as Quino's automated schema-migration. 
[2] Though it is inefficient, it's better than a lot of other examples out there, which almost universally include the call to context.SaveChanges() inside the foreach-loop. Doing so is wasteful and does not give EF an opportunity to optimize the delete calls into a single SQL statement (see footnote below).
[3] Translates to: "Hope is the last (thing) to die."
[4] With the following caveats, which generally apply to all queries with any ORM:
* Many databases use a different syntax and provide different support for DELETE vs. SELECT operations.
* Therefore, it is more likely that more complex conditions are not supported for DELETE operations on some database back-ends.
* Since the syntax often differs, it's more likely that a more complex query will fail to map properly in a DELETE operation than in a SELECT operation simply because that particular combination has never come up before.
* That said, Quino has quite good support for deleting objects with restrictions not only on the table from which to delete data but also from other, joined tables.
Some combination of these reasons possibly accounts for EF's lack of support for batch deletes.
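The difference between the two strategies discussed above (load, remove and save versus a single set-based DELETE) can be seen outside of any ORM. The following is a minimal sketch using Python's built-in sqlite3 module rather than EF or Quino; the users table, its contents and the function names are hypothetical and only mirror the "John" example.

```python
import sqlite3


def make_users_db():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("John",), ("John",), ("Jane",)])
    conn.commit()
    return conn


def delete_by_loading(conn, name):
    """Load matching rows first, then delete them one statement at a time."""
    ids = [row[0] for row in
           conn.execute("SELECT id FROM users WHERE name = ?", (name,))]
    for user_id in ids:
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
    conn.commit()  # commit once, after the loop, not once per row
    return len(ids)


def delete_in_batch(conn, name):
    """Let the database find and delete the rows in a single statement."""
    cursor = conn.execute("DELETE FROM users WHERE name = ?", (name,))
    conn.commit()
    return cursor.rowcount


def remaining_names(conn):
    """Return the names still in the table, in insertion order."""
    return [row[0] for row in
            conn.execute("SELECT name FROM users ORDER BY id")]
```

Both functions leave the same rows behind; the first issues one SELECT plus one DELETE per matching row, while the second needs a single parameterized statement and never materializes the rows on the client.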

Merge conflicts in source control

2013-05-05T21:36:02+02:00

Published by marco on 5. May 2013 21:36:02 ------------------------------------------------------------------------ I was recently asked a question about merge conflicts in source-control systems. "[...] there keep being issues of files being over written, changes backed out etc. from people coding in the same file from different teams." My response was as follows: "tl;dr: The way to prevent this is to keep people who have no idea what they're doing from merging files." [Extended version] Let's talk about bad merges happening accidentally. Any source-control system worth its salt will support at least some form of automatic merging. An automatic merge is generally not a problem because the system will not automatically merge when there are conflicts (i.e. simultaneous edits of the same lines, or edits that are "close" to one another in the base file). An automatic merge can, however, introduce semantic issues. For example, if both sides declared a method with the same name, but in different places in the same file, an automatic merge will include both copies but the resulting file won't compile (because the same method was declared twice). Or, another example is as follows: [Base file] public void A(B b) { var a = new A(); b.Do(a); b.Do(a); b.Do(a); } [Team One version] public void A(B b) { var a = new A(); b.Do(a); b.Do(a); b.Do(a); a.Do(); } [Team Two version] public void A(B b) { A a = null; b.Do(a); b.Do(a); b.Do(a); } [Automatic merge] public void A(B b) { A a = null; b.Do(a); b.Do(a); b.Do(a); a.Do(); } The automatically merged result will compile, but it will crash at run-time. Some tools (like ReSharper) will display a warning when the merged file is opened, showing that a method is being called on a provably null variable. However, if the file is never opened or the warning ignored or overlooked, the program will crash when run. In my experience, though, this kind of automatic-merge "error" doesn't happen very often. 
Code-organization techniques like putting each type in its own file and keeping method bodies relatively compact go a long way toward preventing such conflicts. They help to drastically reduce the likelihood that two developers will be working in the same area in a file. With these relatively rare automatic-merge errors taken care of, let's move on to errors introduced deliberately through maliciousness or stupidity. This kind of error is also very rare, in my experience, but I work with very good people. "Let's say we have two teams: Team One - branch one > Works on file 1 Team Two - branch two > Works on file 1 Team One promotes file 1 into the Master B branch, there are some conflicts that they are working out but the file is promoted." I originally answered that I wasn't sure what it meant to "promote" a file while still working on it. How can a file be committed or checked in without having resolved all of the conflicts? As it turns out, it can't. As documented in "TFS Server 2012 and Promoting changes" , promotion simply means telling TFS to pick up local changes and add them to the list of "Pending Changes". This is part of a new TFS2012 feature called "Local Workspaces". A promoted change corresponds to having added a file to a change list in Perforce or having staged a file in Git. The net effect, though, is that the change is purely local. That it has been promoted has nothing to do with merging or committing to the shared repository. Other users cannot see your promoted changes. When you pull down new changes from the server, conflicts with local "promoted" changes will be indicated as usual, even if TFS has already indicated conflicts between a previous change and another promoted, uncommitted version of the same file. Any other behavior would be madness. [1] "Team Two checks in their file 1 into the Master B branch. They back out the changes that Team One made without telling anyone anything." There's your problem. 
This should never happen unless Team Two has truly determined that their changes have replaced all of the work that Team One did or otherwise made it obsolete. If people don't know how to deal with merges, then they should not be merging. Just as Stevie Wonder's not allowed behind the wheel of a car, neither should some developers be allowed to deal with merge conflicts. In my opinion, though, any developer who can't deal with merges in code that he or she is working on should be moved to another team or, possibly, job. You have to know your own code and you have to know your tools. [2] "Team One figures out the conflicts in their branch and re-promotes file one (and other files) to Master B branch. The source control system remembers that file 1 was backed out by Team Two so it doesn't promote file 1 but doesn't let the user know." This sounds insane. When a file is promoted -- i.e. added to the pending changes -- it is assumed that the current version is added to the pending changes, akin to staging a file in Git. When further changes are made to the file locally, the source-control system should indicate that it has changed since having been promoted (i.e. staged). When you re-promote the file (re-stage it), TFS should treat that as the most recent version in your workspace. When you pull down the changes from Team 2, you will have all-new conflicts to resolve because your newly promoted file will still be in conflict with the changes they made to "file 1" -- namely that they threw away all of the changes that you'd made previously. 
And, I'm not sure how it works in TFS, but in Git, you can't "back out" a commit without leaving a trail:
* Either there is a merge commit where you can see that Team Two chose to "accept their version" rather than "merge" or "accept other version"
* Or, there is a "revert" commit that "undoes" the changes from a previous commit
Either way, your local changes will cause a conflict because they will have altered the same file in the same place as either the "merge" or "revert" commit and -- this is important -- will have done so after that other commit. To recap, let me summarize what this sounds like:
* T1: I want to check in file1
* TFS: You have conflicts
* T1: Promote file1 so that TFS knows about it (other users can't see it yet because it hasn't been committed)
* TFS: Okie dokie
* T2: I want to check in file1
* TFS: You have conflicts
* T2: Fuck that. Use my version. Oh, and, fuck T1.
* TFS: I hear and obey. T2/file1 it is.
* T1: OK, I resolved conflicts; here's the final version of file1
* TFS: Thanks! *tosses T1/file1 out the window*
I don't believe that this is really possible -- even with TFS -- but, if this is a possibility with your source-control, then you have two problems:
1. You have team members who don't know how to merge
2. Your source control is helping them torpedo development
There is probably a setting in your source-control system that disallows simultaneous editing for files. This is a pretty huge restriction, but if your developers either can't or won't play nice, you probably have no choice.
--------------------------------------------------------------------------------
[1] This is not to rule out such behavior 100%, especially in a source-control system with which I am largely unfamiliar. It only serves to indicate the degree to which I would be unwilling to work with any system that exhibits this kind of behavior.
[2] Different companies can have different grace periods for learning these two things, of course.
I suppose that grace period can be interminably long, but...
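The "clean merge, broken program" scenario from the extended version can be reproduced with a toy line-based merge. The sketch below is in Python and is not how TFS, Git or any real tool merges; naive_merge and the three file versions are invented for illustration. It only shows that edits which do not overlap textually can still combine into code that fails at run-time.

```python
from difflib import SequenceMatcher


def edits(base, side):
    """Return (start, end, replacement) edits that turn base into side."""
    matcher = SequenceMatcher(a=base, b=side, autojunk=False)
    return [(i1, i2, side[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def naive_merge(base, ours, theirs):
    """Combine both sides' edits; refuse when they overlap or touch."""
    our_edits = edits(base, ours)
    merged_edits = list(our_edits)
    for edit in edits(base, theirs):
        if edit in our_edits:
            continue  # both sides made the identical change
        for other in our_edits:
            if max(edit[0], other[0]) <= min(edit[1], other[1]):
                raise ValueError("textual conflict: manual merge needed")
        merged_edits.append(edit)
    merged = list(base)
    # Apply from the bottom up so earlier line numbers stay valid.
    for start, end, replacement in sorted(
            merged_edits, key=lambda e: e[0], reverse=True):
        merged[start:end] = replacement
    return merged


BASE = [
    "def run(b):",
    "    a = Helper()",
    "    b.use(a)",
    "    b.use(a)",
    "    b.use(a)",
]
TEAM_ONE = BASE + ["    a.finish()"]  # appends a call on a
TEAM_TWO = [BASE[0], "    a = None"] + BASE[2:]  # a is now None

MERGED = naive_merge(BASE, TEAM_ONE, TEAM_TWO)


class Recorder:
    """Stand-in for B: accepts any argument, including None."""

    def use(self, value):
        pass


def run_merged():
    """Execute the merged function and report how it fails, if it does."""
    namespace = {}
    exec("\n".join(MERGED), namespace)
    try:
        namespace["run"](Recorder())
    except AttributeError as error:
        return str(error)
    return "no crash"
```

The merge goes through without a textual conflict because the two edits are several lines apart, yet run_merged() reports an AttributeError: the merged function sets a to None and then calls a method on it, which neither team's version did on its own.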

A provably safe parallel language extension for C#

2013-02-12T21:44:37+01:00

Published by marco on 12. Feb 2013 21:44:37 Updated by marco on 12. Apr 2013 10:01:17 ------------------------------------------------------------------------ The paper "Uniqueness and Reference Immutability for Safe Parallelism" by Colin S. Gordon, Matthew J. Parkinson, Jared Parsons, Aleks Bromfield, Joe Duffy is quite long (26 pages), detailed and involved. To be frank, most of the notation was foreign to me -- to say nothing of making heads or tails of most of the proofs and lemmas -- but I found the higher-level discussions and conclusions quite interesting. The abstract is concise and describes the project very well: "A key challenge for concurrent programming is that side-effects (memory operations) in one thread can affect the behavior of another thread. In this paper, we present a type system to restrict the updates to memory to prevent these unintended side-effects. We provide a novel combination of immutable and unique (isolated) types that ensures safe parallelism (race freedom and deterministic execution). The type system includes support for polymorphism over type qualifiers, and can easily create cycles of immutable objects. Key to the system's flexibility is the ability to recover immutable or externally unique references after violating uniqueness without any explicit alias tracking. Our type system models a prototype extension to C# that is in active use by a Microsoft team. We describe their experiences building large systems with this extension. We prove the soundness of the type system by an embedding into a program logic." The project proposes a type-system extension with which developers can write provably safe parallel programs -- i.e. "race freedom and deterministic execution" -- with the amount of actual parallelism determined when the program is analyzed and compiled rather than decided by a programmer creating threads of execution. 
[Isolating objects for parallelism] The "isolation" part of this type system reminds me a bit of the way that SCOOP addresses concurrency. That system also allows programs to designate objects as "separate" from other objects while also releasing the program from the onus of actually creating and managing separate execution contexts. That is, the syntax of the language allows a program to be written in a provably correct way (at least as far as parallelism is concerned; see the "other provable-language projects" section below). In order to execute such a program, the runtime loads not just the program but also another file that specifies the available virtual processors (commonly mapped to threads). Sections of code marked as "separate" can be run in parallel, depending on the available number of virtual processors. Otherwise, the program runs serially. In SCOOP, methods are used as a natural isolation barrier, with input parameters marked as "separate". See "SCOOP: Concurrency for Eiffel" and "SCOOP (software)" for more details. The paper also contains an entire section listing other projects -- many implemented on the JVM -- that have attempted to make provably safe programming languages. The system described in this paper goes much further, adding immutability as well as isolation (the same concept as "separate" in SCOOP). An interesting extension to the type system is that isolated object trees are free to have references to immutable objects (since those can't negatively impact parallelism). This allows for globally shared immutable state and reduces argument-passing significantly. Additionally, there are readable and writable references: the former can only be read but may be modified by other objects (otherwise it would be immutable); the latter can be read and written and is equivalent to a "normal" object in C# today. In fact, "[...] 
writable is the default annotation, so any single-threaded C# that does not access global state also compiles with the prototype." [Permission types] In this safe-parallel extension, a standard type system is extended so that every type can be assigned such a permission and there is "support for polymorphism over type qualifiers", which means that the extended type system includes the permission in the type, so that, given B => A, a reference to an immutable B can be passed to a method that expects a readable A. Covariance is also supported for generic parameter types. When they say that the "[k]ey to the system's flexibility is the ability to recover immutable or externally unique references after violating uniqueness without any explicit alias tracking", they mean that the type system allows programs to specify sections that accept isolated references as input, lets them convert to writable references and then convert back to isolated objects -- all without losing provably safe parallelism. This is quite a feat since it allows programs to benefit from isolation, immutability and provably safe parallelism without significantly changing common programming practice. In essence, it suffices to decorate variables and method parameters with these permission extensions to modify the types and let the compiler guide you as to further changes that need to be made. That is, an input parameter for a method will be marked as immutable so that it won't be changed and subsequent misuse has to be corrected. Even better, they found that, in practice, it is possible to use extension methods to allow parallel and standard implementations of collections (lists, maps, etc.) to share most code. "A fully polymorphic version of a map() method for a collection can coexist with a parallelized version pmap() specialized for immutable or readable collections. [...] 
Note that the parallelized version can still be used with writable collections through subtyping and framing as long as the mapped operation is pure; no duplication or creation of an additional collection just for concurrency is needed." [Real projects and performance impact] Much of the paper is naturally concerned with proving that their type system actually does what it says it does. As mentioned above, at least 2/3 of the paper is devoted to lemmas and large swaths of notation. For programmers, the more interesting part is the penultimate section that discusses the extension to C# and the experiences in using it for larger projects. "A source-level variant of this system, as an extension to C#, is in use by a large project at Microsoft, as their primary programming language. The group has written several million lines of code, including: core libraries (including collections with polymorphism over element permissions and data-parallel operations when safe), a webserver, a high level optimizing compiler, and an MPEG decoder." Several million lines of code is, well, it's an enormous amount of code. I'm not sure how many programmers they have or how they're counting lines or how efficiently they write their code, but millions of lines of code suggests generated code of some kind. Still, taken with the next statement on performance, that much code more than proves that the type system is viable. "These and other applications written in the source language are performance-competitive with established implementations on standard benchmarks; we mention this not because our language design is focused on performance, but merely to point out that heavy use of reference immutability, including removing mutable static/global state, has not come at the cost of performance in the experience of the Microsoft team." 
Not only is performance not impacted, but the nature of the typing extensions allows the compiler to know much more about which values and collections can be changed, which affects how aggressively this data can be cached or inlined. "In fact, the prototype compiler exploits reference immutability information for a number of otherwise-unavailable compiler optimizations. [...] Reference immutability enables some new optimizations in the compiler and runtime system. For example, the concurrent GC can use weaker read barriers for immutable data. The compiler can perform more code motion and caching, and an MSIL-to-native pass can freeze immutable data into the binary." [Incremental integration ("unstrict" blocks)] In the current implementation, there is an unstrict block that allows the team at Microsoft to temporarily turn off the new type system and to ignore safety checks. This is a pragmatic approach which allows the software to be run before it has been proven 100% parallel-safe. This is still better than having no provably safe blocks at all. Their goal is naturally to remove as many of these blocks as possible -- and, in fact, this requirement drives further refinement of the type system and library. "We continue to work on driving the number of unstrict blocks as low as possible without over-complicating the type system’s use or implementation." The project is still a work-in-progress but has seen quite a few iterations, which is promising. The paper was written in 2012; it would be very interesting to take it for a test drive in a CTP. [Other provable-language projects] A related project at Microsoft Research "Spec#" contributed a lot of basic knowledge about provable programs. The authors even state that the "[...] type system grew naturally from a series of efforts at safe parallelism. [...] The earliest version was simply copying Spec#’s [Pure] method attribute, along with a set of carefully designed task-and data-parallelism libraries." 
Spec#, in turn, is a "[...] formal language for API contracts (influenced by JML, AsmL, and Eiffel), which extends C# with constructs for non-null types, preconditions, postconditions, and object invariants". Though the implementation of this permissions-based type system may have started with Spec#, the primary focus of that project was more a valiant attempt to bring Design-by-Contract principles ("examples and some discussion here" ) to the .NET world via C#. Though spec# has "downloadable code" , the project hasn't really been updated in years. This is a shame, as support for "Eiffel" [1] in .NET, mentioned above as one of the key influences of spec#, was dropped by ISE Eiffel long ago. Spec#, in turn, was mostly replaced by Microsoft Research's "Contracts" project (an older version of which was covered in depth in "Microsoft Code Contracts: Not with a Ten-foot Pole" ). The Contracts project seems to be alive and well: the most recent release is from October, 2012. I have not checked it out since my initial thumbs-down review (linked above) but did note in passing that the implementation is still (A) library-only and (B) does not support Visual Studio 2012. The library-only restriction is particularly galling, as such an implementation can lead to repeated code and unwieldy anti-patterns. As documented in the "Contracts FAQ" , the current implementation of the "tools take care of enforcing consistency and proper inheritance of contracts" but this is presumably accomplished with compiler errors that require the programmer to include contracts from base methods in overrides. The seminal work Object-oriented Software Construction by Bertrand Meyer (vol. II in particular) goes into tremendous detail on a type system that incorporates contracts directly. The type system discussed in this article covers only parallel safety: null-safety and other contracts are not covered at all. 
If you're at all interested in these types of language extensions, volume 2 of OOSC is a great read. The examples are all in "Eiffel" but should be relatively accessible. Though some features -- generics, notably, but also tuples, once routines and "agents" -- have since made their way into C# and other more commonly used languages, many others -- such as contracts, anchored types (contravariance is far too constrained in C# to allow them), covariant return types, covariance everywhere, multiple inheritance, explicit feature removal, loop variants and invariants, etc. -- are still not available. Subsequent interesting work has also been done on extensions that allow creation of "provably null-safe programs", something also addressed in part by Microsoft Research's "Contracts" project.

Programming in the current age

2013-02-03T23:04:09+01:00

Published by marco on 3. Feb 2013 23:04:09 ------------------------------------------------------------------------ In order to program in 2013, it is important not to waste any time honing your skills with outdated tools and work-flows. What are the essential pieces of software for developing software in 2013? Runtime A runtime is a given for all but the most esoteric of programming exercises. Without something to execute your code, there is almost no point in writing it. Debugger Programming without an integrated debugger can be very time-consuming, error-prone and will quite frankly suck the fun right out of the whole endeavor. And, by "debugger" I mean a source-level single-step debugger with call-stack and variable/object/structure inspection as well as expression evaluation. Poring through logs and inserting print statements is not a viable long-term or even medium-term solution. You shouldn't be writing in a language without one of these unless you absolutely can't avoid it (NAnt build scripts come to mind). Compiler/Checker A syntax/semantics checker of some sort integrated into the editor ensures a tighter feedback/error-finding loop and saves time, energy and frustration. I was deliberately cagey with the "checker" because I understand that some languages, like Javascript [1], do not have a compiled form. Duck-typed languages like Python or Ruby also limit static checking but anything is better than nothing. Versioning A source-control system is essential in order to track changes, test ideas and manage releases. A lot of time can be wasted -- and effort lost -- without good source control. Great source control decreases timidity, encourages experimentation and allows for interruptible work-flows. I will argue below that private branches and history rewriting are also essential. Even for the smallest projects, there is no reason to forgo any of these tools. [Managing your Source Code] "tl;dr: It's 2013 and your local commit history is not sacrosanct. 
No one wants to see how you arrived at the solution; they just want to see clean commits that explain your solution as clearly as possible. Use git; use rebase; use "rebase interactive"; use the index; stage hunks; squash merge; go nuts. [2]" [image]I would like to focus on the "versioning" part of the tool-chain. Source control tells the story of your code, showing how it evolved to where it is at any given point. If you look closely at the "Encodo Branching Model" [3] diagram (click to enlarge), you can see the story of the source code: 1. All development was done in the master branch until v1.0 was released 2. Work on B was started in a feature branch 3. Work on hotfix v1.0.1 was started in a hotfix branch 4. Work on A was started in another feature branch 5. Hotfix v1.0.1 was released, tagged and merged back to the master branch 6. Development continued on master and both feature branches 7. Master was merged to feature branch A (includes hotfix v1.0.1 commits) 8. Finalization for release v1.1 was started in a release branch 9. Feature A was completed and merged back to the master branch 10. Version v1.1 was released, tagged and merged back to the master branch 11. Master was merged to feature branch B (includes v1.1 and feature A commits) 12. Development continued on master and feature B 13. Version v1.2 was released and tagged Small, precise, well-documented commits are essential in order for others to understand the project -- especially those who weren't involved in developing the code. It should be obvious from which commits you made a release. You should be able to go back to any commit and easily start working from there. You should be able to maintain multiple lines of development, both for maintenance of published versions and for development of new features. The difficulty of merging these branches should be determined by the logical distance between them rather than by the tools. Merging should almost always be automatic. 
Nowhere in those requirements does it say that you're not allowed to lie about how you got to that pristine tree of commits. [Why you should be using private branches and history rewriting] A few good articles about Git have recently appeared -- "Understanding the Git Workflow" by Benjamin Sandofsky is one such -- explaining better than ever why rewriting history is better than server-side, immutable commits. In the article cited above, Sandofsky divides his work up into "Short-lived work [...] larger work [...] and branch bankruptcy." These concepts are documented to some degree in the Branch Management chapter of the "Encodo Git Handbook" (of which I am co-author). I will expand on these themes below. Note: The linked articles deal exclusively with the command line, which isn't everyone's favorite user interface (I, for one, like it). We use the "SmartGit/Hg" client for visualizing diffs, organizing commits and browsing the log. We also use the command line for a lot of operations, but SmartGit is a very nice tool and version 3 supports nearly all of the operations described in this article. [What is rebasing?] As you can see from the diagram above, a well-organized and active project will have multiple branches. Merging and rebasing are two different ways of getting commits from one branch into another. Merging commits into a branch creates a merge commit, which shows up in the history to indicate that n commits were made on a separate branch. Rebasing those commits instead re-applies those commits to the head of the indicated branch without a merge commit. In both cases there can be conflicts, but neither method poses a greater risk of them than the other. [4] You cannot tell from the history that rebased commits were developed in a separate branch. You can, however, tell that the commits were rebased because the author date (the time the commit was originally created) differs from the commit date (the last time that the commit was applied).
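A minimal sketch of the difference between the two dates, assuming a throwaway repository: the file names, the dates and the "Demo" identity are invented for illustration. It rebases a one-commit feature branch onto master and then prints the number of parents of the rebased commit along with its author and commit dates.

```shell
#!/bin/sh
# Sketch: rebase a feature branch onto master, then compare author and commit dates.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the article
git config user.name "Demo"               # hypothetical identity
git config user.email "demo@example.com"

# commit <file> <message> <date>: create a commit with fixed author/commit dates
commit() {
  echo "$2" > "$1"
  git add "$1"
  GIT_AUTHOR_DATE="$3" GIT_COMMITTER_DATE="$3" git commit -q -m "$2"
}

commit base.txt "base" "2013-01-01 10:00:00 +0000"
git checkout -q -b feature
commit feature.txt "feature work" "2013-01-02 10:00:00 +0000"
git checkout -q master
commit master.txt "master work" "2013-01-03 10:00:00 +0000"

# Re-apply the feature commit on top of master; no merge commit is created.
git checkout -q feature
GIT_COMMITTER_DATE="2013-01-04 10:00:00 +0000" git rebase -q master

parents=$(git log -1 --format=%P | wc -w | tr -d ' ')   # 1 parent means linear history
author_date=$(git log -1 --format=%ad --date=short)     # when the change was written
commit_date=$(git log -1 --format=%cd --date=short)     # when it was last applied
echo "parents=$parents author=$author_date committer=$commit_date"
```

A merge of the same branch would instead produce a commit with two parents, which is what graphical tools draw as a separate strand.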
[What do you recommend?] At "Encodo", we primarily work in the master branch because we generally work on very manageable, bite-sized issues that can easily be managed in a day. Developers are free to use local branches but are not required to do so. If some other requirement demands priority, we shunt the pending issue into a private branch. Such single-issue branches are focused and involve only a handful of files. It is not at all important to "remember" that the issue was developed in a branch rather than the master branch. If there are several commits, it may be important for other users to know that they were developed together and a merge-commit can be used to indicate this. Naturally, larger changes are developed in feature branches, but those are generally the exception rather than the rule. "Remember: Nowhere in those requirements does it say that you're not allowed to lie about how you got to that pristine tree of commits." Otherwise? Local commit history is absolutely not sacrosanct. We rebase like crazy to avoid unwanted merge commits. That is, when we pull from the central repository, we rebase our local commits on top of the commits that come from the origin. This has worked well for us. If the local commit history is confusing -- and this will sometimes come up during the code review -- we use an interactive rebase to reorganize the files into a more soothing and/or understandable set of commits. See "Sandofsky's article" for a good introduction to using interactive rebasing to combine and edit commits. Naturally, we weigh the amount of confusion caused by the offending commits against the amount of effort required to clean up the history. We don't use bisect [5] very often, so we don't invest a lot of time in enforcing the clean, compilable commits required by that tool. For us, the history is interesting, but we rarely go back farther than a few weeks in the log. [6] [When to merge? When to rebase?]
At Encodo, there are only a few reasons to retain a merge commit in the official history: 1. If we want to remember which commits belonged to a particular feature. Any reasonable tool will show these commits graphically as a separate strand running alongside the master branch. 2. If a rebase involves too much effort or is too error-prone. If there are a lot of commits in the branch to be integrated, there may be subtle conflicts that resolve more easily if you merge rather than rebase. Sometimes we just pull the e-brake and do a merge rather than waste time and effort trying to get a clean rebase. This is not to say that the tools are lacking or at fault but that we are pragmatic rather than ideological. [7] 3. If there are merge commits in a feature branch with a large number of well-organized commits and/or a large number of changes or affected files. In this case, using a squash merge and rebuilding the commit history would be onerous and error-prone, so we just merge to avoid issues that can arise when rebasing merge commits (related to the point above). [When should I use private branches? What are they exactly?] There are no rules for local branches: you can name them whatever you like. However, if you promote a local branch to a private branch, at Encodo we use the developer's initials as the prefix for the branch. My branches are marked as "mvb/feature1", for example. What's the difference between the two? Private branches may get pushed to our common repository. Why would you need to do that? Well, I, for example, have a desktop at work and, if I want to work at home, I have to transfer my workspace somehow to the machine at home. One solution is to work on a virtual machine that's accessible to both places; another is to remote in to the desktop at work from home; the final one is to just push that work to the central repository and pull it from home. The offline solution has the advantage of speed and less reliance on connectivity. 
What often happens to me is that I start work on a feature but can only spend an hour or two on it before I get pulled off onto something else. I push the private branch, work on it a bit more at home, push back, work on another, higher-priority feature branch, merge that in to master, work on master, whatever. A few weeks later and I've got a private branch with a few ugly commits, some useful changes and a handful of merge commits from the master branch. The commit history is a disgusting mess and I have a sneaking suspicion that I've only made changes to about a dozen files but have a dozen commits for those changes. That's where the aforementioned "branch bankruptcy" comes in. You're not obligated to keep that branch; you can keep the changes, though. As shown in the "referenced article", you execute the following git commands:
git checkout master
git checkout -b cleaned_up_branch
git merge --squash private_feature_branch
git reset
The --squash option tells git to stage all of the changes from private_feature_branch in the index (staging) without creating a commit; the subsequent git reset clears the index so that those changes are only in the working tree. From here, you can make a single, clean, well-written commit or several commits that correspond logically to the various changes you made. Git also lets you lose your attachment to checking in all the changes in a file at once: if a file has changes that correspond to different commits, you can add only selected differences in a file to the index (staging). "In praise of Git’s index" by Aristotle Pagaltzis provides a great introduction. If you, like me, regularly take advantage of refactoring and cleanup tools while working on something else, you'll appreciate the ability to avoid checking in dozens of no-brainer cleanup/refactoring changes along with a one-liner bug-fix. [8]
[One final example: cherry picking and squashing] I recently renamed several projects in our solution, which involved renaming the folders as well as the project files and all references to those files and folders. Git automatically recognizes these kinds of renames as long as the old file is removed and the new file is added in the same commit. I selected all of the files for the rename in SmartGit and committed them, using the index editor to stage only the hunks from the project files that corresponded to the rename. Nice and neat. I selected a few other files and committed those as a separate bug-fix. Two seconds later, the UI refreshed and showed me a large number of deleted files that I should have included in the first commit. Now, one way to go about fixing this is to revert the two commits and start all over, picking the changes apart (including playing with the index editor to stage individual hunks). Instead of doing that, I did the following:
1. I committed the deleted files with the commit message "doh!" (to avoid losing these changes in the reset in step 3)
2. I created a "temp" branch to mark that commit (to keep the commit visible once I reset in step 3)
3. I hard-reset my master branch to the origin
4. I cherry-picked the partial-rename commit to the workspace
5. I cherry-picked the "doh!" commit to the workspace
6. Now the workspace had the rename commit I'd wanted in the first place
7. I committed that with the original commit message
8. I cherry-picked and committed the separate bug-fix commit
9. I deleted the "temp" branch (releasing the incorrect commits on it to be garbage-collected at some point)
Now my master branch was ready to push to the server, all neat and tidy. And nobody was the wiser.
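The nine steps above can be reproduced on the command line. The following is a sketch under stated assumptions rather than a record of what was actually run: the file names and commit messages are invented, a single renamed file stands in for the renamed projects, and a local branch called published is a hypothetical stand-in for the origin's master branch. "Cherry-picking to the workspace" is expressed here with git cherry-pick -n (--no-commit), which applies a commit to the index and working tree without committing it.

```shell
#!/bin/sh
# Sketch: repair a rename that was split across two commits, using cherry-pick.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"               # hypothetical identity
git config user.email "demo@example.com"

echo "lib" > old_name.txt
git add old_name.txt
git commit -q -m "initial"
git branch published                      # hypothetical stand-in for origin/master

# The mistake: the rename commit adds the new file but forgets the deletion.
cp old_name.txt new_name.txt
git add new_name.txt
git commit -q -m "Rename project"
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix bug"

# Steps 1-2: commit the forgotten deletion and mark it with a temp branch.
git rm -q old_name.txt
git commit -q -m "doh!"
git branch temp

# Step 3: hard-reset master back to the published state.
git reset -q --hard published

# Steps 4-7: apply both halves of the rename without committing, then commit once.
git cherry-pick -n temp~2                 # the partial "Rename project" commit
git cherry-pick -n temp                   # the "doh!" commit
git commit -q -m "Rename project"

# Step 8: re-apply the separate bug-fix as its own commit.
git cherry-pick temp~1 >/dev/null

# Step 9: delete the temp branch; the old commits become unreachable.
git branch -q -D temp

git log --format=%s
```

The temp branch only exists to keep the three original commits reachable between the reset and the cherry-picks; the reflog would also keep them around, but a named branch is easier to see in a graphical client.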
--------------------------------------------------------------------------------
[1] There are alternatives now, though, like Microsoft's TypeScript, that warrant a look if only because they help tighten the error-finding feedback loop and have the potential to make you more efficient (the efficiency may be robbed immediately back, however, if debugging generated code becomes difficult or even nightmarish).
[2] Once you've pushed, though? No touchie. At that point, you've handed in your test and you get graded on that.
[3] According to my business card, I'm a "senior developer and partner" at "Encodo System AG".
[4] With the exception, mentioned elsewhere as well, that rebasing merge-commits can sometimes require you to re-resolve previously resolved conflicts, which can be error-prone if the conflicts were difficult to resolve in the first place. Merging merge-commits avoids this problem.
[5] bisect is a git feature that executes a command against various commits to try to localize the commit that caused a build or test failure. Basically, you tell it the last commit that worked and git uses a binary search to find the offending commit. Of course, if you have commits that don't compile, this won't work very well. We haven't used this feature very much because we know the code in our repositories well and using blame and log is much faster. Bisect is much more useful for maintainers that don't know the code very well, but still need to figure out at which commit it stopped working.
[6] To be clear: we're only so cavalier with our private repositories to which access is restricted to those who already know what's going on. If we commit changes to public, open-source or customer repositories, we make sure that every commit compiles. See "Aristotle's index article" (cited above) for tips on building and testing against staged files to ensure that a project compiles, runs and passes all tests before making a commit -- even if you're not committing all extant changes.
[7] That said, with experience we've learned that an interactive rebase and judicious squashing will create commits that avoid these problems. With practice, these situations crop up more and more rarely.
[8] Of course, you can also create a separate branch for your refactoring and merge it all back together, but that's more work and is in my experience rarely necessary.

Are all errors exceptions?

2012-11-22T23:24:02+01:00

Published by marco on 22. Nov 2012 23:24:02 ------------------------------------------------------------------------ "The following ruminations were written seven years ago but have held up remarkably well. They have been published with minor updates." This article deals with the situation illustrated below, specifically the question raised in the comment.
if (! $folder_id)
{
  $this->db->logged_query ("SELECT folder_id FROM " . $this->app->table_names->objects . " WHERE id = $obj->object_id");
  if ($this->db->next_record ())
    $folder_id = $this->db->f ("folder_id");
  else
    // raise exception? ignore? what to do?
}
Above we see a situation in which you may decide against a stricter enforcement because whereas the error is clear, the reaction is not. This is also a big part of working with contracts: deferring reactions. Often -- especially when developing libraries -- you're in code so deep that the desired reaction could be one of many depending on the deployment of that code. The code above is taken from the publishing loop in the "webcore"; it's used to publish comments. In effect, the code has detected that a comment object id has been passed in that doesn't correspond to anything in the system. It's bogus. It's wrong. Some deployments -- I would hazard most -- would just like to silently ignore the error and publish as much as possible. Silently ignoring an error will always bite you in the ass in the end (pun intended). The key here is that whereas the person deploying the final system should be perfectly free to ignore the error, you, as the library developer, cannot and must not. Let's see what kind of reactions we could have here. Well, isn't that what exceptions were invented for? They're for transmitting error conditions up out of deep library code. Problem solved. For more severe errors in which the code cannot continue, the answer is quite clear: you simply throw an exception. However, in the situation above, it's not so clear.
The problem is easily skipped and most of the rest of the job can be finished. Here is where deferral comes in. Just call a function that will handle it later. This function can log the error or warning, display it to the user, ask to abort/retry/ignore, consult a table for same, throw an exception or just ignore it. It's not your problem to dispatch solutions to encountered errors. It's your job to detect them and maintain the integrity of the running code. Simply throwing an exception no matter what the error condition is, in effect, making a decision about how the error will be handled. Control is lost because the exception handler is necessarily higher up. This is a bad thing if you'd actually like the code to do the best it can. As any experience at all will have shown you, some errors are just warnings or hints. It's not just black and white, error or not. Many deployments of the system containing the code above will actually treat the issue as a warning and log it for the database techs to address. However, to assume the opposite, that callers want errors to be swallowed, cheats those callers as well. It cheats them because it becomes incredibly hard to find errors; they must be detected by subtle logic or data problems (e.g. Hmmm...the log shows it only sent 500 emails, I thought there were 503 subscribers...). If the system never complains or logs anything, the end user calls you first. It cheats you because you can never adequately test your system because it never complains. Everything's OK. It kind of works. It mostly works. The desire for safety or avoiding crashes or exceptions on the client side should never override the desire to have correct code that detects error conditions. If you write library code or end-user code, that code deals mostly with detecting and reporting misused functions. The functionality itself is generally straightforward; it's wrapping the interface around it that's hard. 
The only thing that you'll probably spend more time on is hunting down memory bugs -- and, yes, even if you're using a garbage-collected runtime, you can still have memory bugs. What would you call it if the memory required by your publication script increases in proportion to the number of subscribers and mails? If you're wondering what I ended up doing in the case above, I decided on a function call 'raise'. It sounds like an exception, and that's usually what happens: it breaks the code on that line, but a little more elegantly than the usual PHP die statement. While the default handler for exceptions simply issues a fancy die statement, that handler can be replaced with a different one, one that redirects to an HTML page with a nicely formatted error printout and a form for submitting the error. Since this code is likely to run inside a script that just wants to send subscriptions and doesn't care about data integrity errors, the handler would probably be replaced with something that suppresses the exception, but logs the error. That way, once the subscription run is done, you can view the error log and see if there are data integrity problems, and, perhaps more importantly, you see them all at once instead of just one at a time. And, more importantly still, those subscribers for whom there were no problems received their mail on time.
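The idea of a replaceable handler can be sketched in a few lines. This is not the webcore implementation (which is PHP); it is a shell illustration in which every name other than raise is hypothetical, including the lookup table that pretends comment 2 has no folder. The library code only detects the problem and calls raise; the deployment decides whether that stops the run or merely logs.

```shell
#!/bin/sh
# Sketch: library code reports problems through a replaceable handler.

# Default handler: print the message and stop, like a fancy "die".
handle_error_die() {
  echo "fatal: $1" >&2
  exit 1
}

# Deployment-specific handler: record the problem and let the caller continue.
error_log=""
error_count=0
handle_error_log() {
  error_log="${error_log}$1
"
  error_count=$((error_count + 1))
}

error_handler=handle_error_die

# Library entry point: detect the problem here, decide the reaction elsewhere.
raise() {
  "$error_handler" "$1"
}

# Hypothetical lookup: only comments 1 and 3 belong to a folder.
folder_for_comment() {
  case "$1" in
    1) echo "10" ;;
    3) echo "30" ;;
    *) return 1 ;;
  esac
}

# Publishing loop: publish what can be published, report the rest via raise.
publish_comments() {
  published=""
  for id in "$@"; do
    if folder=$(folder_for_comment "$id"); then
      echo "published comment $id to folder $folder"
      published="$published $id"
    else
      raise "comment $id has no folder"
    fi
  done
}

# A subscription run swaps in the logging handler before publishing.
error_handler=handle_error_log
publish_comments 1 2 3

# After the run, all data-integrity problems can be reviewed at once.
printf '%s problem(s) logged:\n%s' "$error_count" "$error_log"
```

Leaving error_handler at its default would stop the run at the first bad comment, which is the behavior you want while testing the library itself.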

A scalable pattern for building metadata

2012-11-22T19:45:47+01:00

Published by marco on 22. Nov 2012 19:45:47 ------------------------------------------------------------------------ The following article was originally published on the "Encodo blogs" and is cross-published here. -------------------------------------------------------------------------------- In the latest version of Quino -- version 1.8.5 -- we took a long, hard look at the patterns we were using to create metadata. The metadata for an application includes all of the usual Quino stuff: classes, properties, paths, relations. With each version, though, we're able to use the metadata in more places. That means that the metadata definition code grows and grows. We needed some way to keep a decent overview of that metadata without causing too much pain when defining it. In order to provide some background, the following are the high-level requirements that we kept in mind while designing the new pattern and supporting framework.
* Manage complexity: A simple model should be easy and straightforward to build, with no cumbersome boilerplate; complex models should support multiple layers and provide an overview
* Leverage existing know-how: Our users don't want to learn a new language/IDE in order to create metadata; neither do we want to provide support for our own metadata-definition language
* Support modularization: Modules can be used to hide complexity but are also sometimes necessary to define hard boundaries in the application metadata
* Support extensibility: Interdependent modules and overlays will need to refer to elements in other modules; there needs to be a standard mechanism for defining and accessing metadata elements that doesn't rely on string constants [1]
* Support refactoring: Rely on convention and name-matching as little as possible to avoid subtle errors
* Support introspection: Developers that stick to the pattern should be able to maximize efficiency using common navigation and introspection [2] tools like Visual Studio or ReSharper.
[Definition Language] Quino metadata has always been defined using a .NET language -- in our case, we always use C# to define the metadata, using the MetaBuilder or InMemoryMetaBuilder to compose the application model. This approach satisfies the need to leverage existing tools, refactoring and introspection. Since Quino metadata is an in-memory construct, there will always be a .NET API for creating metadata. This is not to say that there will never be a DSL to define Quino metadata but that such an approach is not the subject of this post. [Modularization] Quino applications have always been able to define and integrate metadata modules (e.g. reporting or security) using an IMetaModuleBuilder. Modules solved interdependency issues by splitting the metadata-generation into several phases: * Add classes and foreign keys * Add paths between classes (depends on foreign keys) * Add calculated properties and relations (depends on paths) * Add layouts (depends on all properties) In this way, when a module needed to add a path between a class that it had defined and a class defined in another module, it could be guaranteed that classes and foreign keys for all modules had been defined before any paths were created. Likewise for classes that wanted to define relations based on paths defined in other modules. The limitation of the previous implementation was that a module generator always created its own module and builder and could not simply re-use those created by another generator. Basically, there was no "lightweight" way of splitting metadata-generation into separate files for purely organizational purposes. There were also a few issues with the implementation of the main model-generation code as well. The previous pattern depended heavily on local variables, all defined within one mammoth function. 
Separating code into individual method calls was ad-hoc -- each project did it a little differently -- and involved a lot of migration of local variables to instance variables. With all code in a single method, file-structure navigation tools couldn't help at all. The previous pattern prescribed using file comments or regions that could be located using "find in file". This was clearly sub-optimal. [The new pattern] The new pattern, which can be applied to all models, big or small, includes the following parts:
* Model generator: As before, there is a class that implements the IMetaModelGenerator interface. This class is used by the application configuration and various tools (e.g. the code generator or UML generator) to create the model.
* Model elements: Metadata that is referenced from multiple steps in the metadata-generation process is stored in a separate object (or objects) called the model elements. (E.g. classes are created in the AddClasses() step and referenced in the AddPaths(), AddProperties() and AddLayouts() steps.) The model-elements object typically has two properties called Classes and Paths.
* Metadata generators: Module generators still exist, but there are now also metadata generators that are lightweight, using a metadata builder and elements defined by another generator (typically a module generator or the model generator itself).
This may sound like a lot of overhead for a simple application, but it's really not that much extra code. The benefits are:
* Models, modules and lightweight parts all use the same pattern, with the same phases and method names. That makes it far easier to know where to look for a definition.
* Since the pattern is the same, it's easy to move functionality from one module to another or to split one module into multiple lightweight parts without doing a lot of refactoring.
* A small model will naturally grow to a medium or large model, all while using the same pattern.
There is no moment during development where you have to do a major refactoring in order to get organized: the pattern will naturally support a clean coding style. [Building a model, step by step] But enough chatter; let's take a look at the absolute minimum boilerplate for an empty model. [Step zero: create the boilerplate] public class DemoModelElements { public DemoModelElements() { Classes = new DemoModelClasses(); Paths = new DemoModelPaths(); } public DemoModelClasses Classes { get; private set; } public DemoModelPaths Paths { get; private set; } } public class DemoModelPaths { } public class DemoModelClasses { } public class DemoCoreGenerator : DependentMetadataGeneratorBase { } public class DemoModelGenerator : MetaBuilderBasedModelGeneratorBase { protected override void AddMetadata() { Builder.Include(); } } The code above is functional but doesn't actually create any metadata. So what does it do? 1. It uses the generic MetaBuilderBasedModelGeneratorBase to indicate the type of Elements that will be exposed by this model generator. The elements class is created automatically and is available as the property Elements (as we'll see in the examples below). Additionally, we're using a ModelGeneratorBase that is based on a MetaBuilder which means that the property Builder is also available and is of type MetaBuilder. 2. It includes the DemoCoreGenerator which is a dependent generator -- it's lightweight and uses the elements and builder from its owner. The exact types are shown in the class declaration; it can be read as: get elements of type DemoModelElements and a builder of type MetaBuilder from the generator with type DemoModelGenerator. The initial generic argument can be any other metadata generator that implements the IElementsProvider interface. 3. The model generator overrides AddMetadata to include the metadata created by DemoCoreGenerator in the model. 
Even though it's not very much code, you can create a snippet or a file template with Visual Studio or a Live Template or file template with ReSharper to quickly create a new model. [Step one: define the model] Now, let's fill the empty model with some metadata. The first step is to define the model that we're going to build. That part goes in the AddMetadata() method. [3] public class DemoModelGenerator : MetaBuilderBasedModelGeneratorBase { protected override void AddMetadata() { Builder.CreateModel("Demo", /*Guid*/); Builder.CreateMainModule("Encodo.Quino"); Builder.Include(); } } [Step two: add a class] A typical next step is to define a class. Let's do that. public class DemoModelClasses { public IMetaClass Company { get; set; } } public class DemoCoreGenerator : DependentMetadataGeneratorBase { protected override void AddClasses() { Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/); } } As you can see, we added a new class to the elements and created and assigned it in the AddClasses() phase of metadata-generation. [Step three: add another class and a path] An obvious next step is to create another class and define a path between them. public class DemoModelClasses { public IMetaClass Company { get; set; } public IMetaClass Person { get; set; } } public class DemoCoreGenerator : DependentMetadataGeneratorBase { protected override void AddClasses() { Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/); Elements.Classes.Person = Builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/); Builder.AddInvisibleProperty(Elements.Classes.Person, "CompanyId", MetaType.Key, true, /*Guid*/); } protected override void AddPaths() { Elements.Paths.CompanyPersonPath = Builder.AddOneToManyPath( Elements.Classes.Company, "Id", Elements.Classes.Person, "CompanyId", /*Guid*/, /*Guid*/ ); } } [Step four: add relations] Having a path is not enough, though. 
We can also define how the relations on that path are exposed in the classes. public class DemoCoreGenerator : DependentMetadataGeneratorBase { protected override void AddClasses() { Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/); Elements.Classes.Person = Builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/); Builder.AddInvisibleProperty(Elements.Classes.Person, "CompanyId", MetaType.Key, true, /*Guid*/); } protected override void AddPaths() { Elements.Paths.CompanyPersonPath = Builder.AddOneToManyPath( Elements.Classes.Company, "Id", Elements.Classes.Person, "CompanyId", /*Guid*/, /*Guid*/ ); } protected override void AddProperties() { Builder.AddRelation(Elements.Classes.Company, "People", "", Elements.Paths.CompanyPersonPath); Builder.AddRelation(Elements.Classes.Person, "Company", "", Elements.Paths.CompanyPersonPath); } } OK, now we have a model with two entities -- companies and people -- that are related to each other so that a company has a list of people and each person belongs to a company. [Step five: add translations] Now we'd like to make the metadata support German as well as English. Quino naturally supports more generalized ways of doing this (e.g. importing from files), but let's just add the metadata manually to see what that would look like (unaffected methods are left off for brevity). 
public class DemoModelElements { public DemoModelElements() { Classes = new DemoModelClasses(); Paths = new DemoModelPaths(); } public ILanguage English { get; set; } public ILanguage German { get; set; } public DemoModelClasses Classes { get; private set; } public DemoModelPaths Paths { get; private set; } } public class DemoCoreGenerator : DependentMetadataGeneratorBase { protected override void AddCoreElements() { Elements.English = Builder.AddDisplayLanguage("en-US", "English"); Elements.German = Builder.AddDisplayLanguage("de-CH", "Deutsch"); } protected override void AddClasses() { var company = Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/); company.Caption.SetValue(Elements.English, "Company"); company.Caption.SetValue(Elements.German, "Firma"); company.PluralCaption.SetValue(Elements.English, "Companies"); company.PluralCaption.SetValue(Elements.German, "Firmen"); var person = Elements.Classes.Person = Builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/); Builder.AddInvisibleProperty(person, "CompanyId", MetaType.Key, true, /*Guid*/); person.Caption.SetValue(Elements.English, "Person"); person.Caption.SetValue(Elements.German, "Person"); person.PluralCaption.SetValue(Elements.English, "People"); person.PluralCaption.SetValue(Elements.German, "Personen"); } } Note that I created a local variable for both company and person. I did this for two reasons: * The code is shorter and easier to read * There are fewer references to the Elements.Classes.Person and Elements.Classes.Company properties. It's useful to keep the number of references to a minimum in order to make searching for usages with a tool like ReSharper of maximum benefit. Otherwise, there's a lot of noise to signal and you'll get hundreds of references when there are only actually a few dozen "real" references. [Step six: using private methods] You can see that the metadata-generation code is still manageable, but it's growing. 
Once we've filled out all of the properties, relations, translations, layouts and view aspects for the person and company classes, we'll have a file that's several hundred lines long. A file of that size is still manageable and, since we have methods, it's eminently navigable with a file-structure browser. If we don't mind keeping -- or we'd rather keep -- everything in one file, we can see more structure by splitting the code into more methods. This is really easy to do because we're using the elements to reference other parts of metadata instead of local variables. For example, let's move the class initialization code for the person and company entities to separate methods (unaffected methods are left off for brevity). public class DemoCoreGenerator : DependentMetadataGeneratorBase { protected override void AddClasses() { AddCompany(); AddPerson(); } private void AddCompany() { var company = Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/); company.Caption.SetValue(Elements.English, "Company"); company.Caption.SetValue(Elements.German, "Firma"); company.PluralCaption.SetValue(Elements.English, "Companies"); company.PluralCaption.SetValue(Elements.German, "Firmen"); } private void AddPerson() { var person = Elements.Classes.Person = Builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/); Builder.AddInvisibleProperty(person, "CompanyId", MetaType.Key, true, /*Guid*/); person.Caption.SetValue(Elements.English, "Person"); person.Caption.SetValue(Elements.German, "Person"); person.PluralCaption.SetValue(Elements.English, "People"); person.PluralCaption.SetValue(Elements.German, "Personen"); } } [Step seven: using multiple generators] While this is a good technique for small models -- with anywhere up to five entities -- most models are larger and include entities with sizable metadata definitions. 
Another thing to consider is that, when working with larger teams, it's often best to keep a central item like the metadata definition as modular as possible. To scale the pattern up for larger models, we can move code for larger entity definitions into separate generators. As soon as we move an entity to its own generator, we're faced with the question of where we should create paths for that entity. A path doesn't really belong to one class or another; in which generator should it go? Well, we thought about that and came to the conclusion that the pattern should be to just create a separate generator for all paths in the model (or multiple path-only generators if you have a larger model). That is, when a model gets a bit larger, it should include the following generators (using the name "Demo" from the examples above): * DemoCoreGenerator * DemoPathGenerator * DemoCompanyGenerator * DemoPersonGenerator The DemoCoreGenerator will create metadata and assign elements like the display languages. It's also recommended to define base types like enumerations and very simple classes [4] in the core. Obviously, as the model grows, the core generator may also get larger. This isn't a problem: just split the contents logically into multiple generators. For the purposes of this example, though, we only have a single core generator, a single path generator and two entity generators. Since these generators will all be dependent on the model's builder and elements, the first step is to define a base class that will be used by the other generators. 
public class DemoDependentGenerator : DependentMetadataGeneratorBase { } public class DemoCoreGenerator : DemoDependentGenerator { protected override void AddCoreElements() { Elements.English = Builder.AddDisplayLanguage("en-US", "English"); Elements.German = Builder.AddDisplayLanguage("de-CH", "Deutsch"); } } public class DemoPathGenerator : DemoDependentGenerator { protected override void AddPaths() { Elements.Paths.CompanyPersonPath = Builder.AddOneToManyPath( Elements.Classes.Company, "Id", Elements.Classes.Person, "CompanyId", /*Guid*/, /*Guid*/ ); } } public class DemoCompanyGenerator : DemoDependentGenerator { protected override void AddClasses() { var company = Elements.Classes.Company = Builder.AddClassWithDefaultPrimaryKey("Company", /*Guid*/, /*Guid*/); company.Caption.SetValue(Elements.English, "Company"); company.Caption.SetValue(Elements.German, "Firma"); company.PluralCaption.SetValue(Elements.English, "Companies"); company.PluralCaption.SetValue(Elements.German, "Firmen"); } protected override void AddProperties() { Builder.AddRelation(Elements.Classes.Person, "Company", "", Elements.Paths.CompanyPersonPath); } } public class DemoPersonGenerator : DemoDependentGenerator { protected override void AddClasses() { var person = Elements.Classes.Person = Builder.AddClassWithDefaultPrimaryKey("Person", /*Guid*/, /*Guid*/); Builder.AddInvisibleProperty(person, "CompanyId", MetaType.Key, true, /*Guid*/); person.Caption.SetValue(Elements.English, "Person"); person.Caption.SetValue(Elements.German, "Person"); person.PluralCaption.SetValue(Elements.English, "People"); person.PluralCaption.SetValue(Elements.German, "Personen"); } protected override void AddProperties() { Builder.AddRelation(Elements.Classes.Company, "People", "", Elements.Paths.CompanyPersonPath); } } public class DemoModelGenerator : MetaBuilderBasedModelGeneratorBase { protected override void AddMetadata() { Builder.CreateModel("Demo", /*Guid*/); Builder.CreateMainModule("Encodo.Quino"); Builder.Include(); Builder.Include(); 
Builder.Include(); Builder.Include(); } } You'll note that we only moved code around and didn't have to change any implementation or add any new elements or anything that might introduce subtle errors in the metadata. Please note, the classes are all shown in a single code block above, but the pattern dictates that each class should be in its own file. [Step eight: integrating external modules] So far, we've only worked with generators that are dependent on the model generator. How do we access information -- and elements -- generated in other modules? For example, let's include the security module and change a translation for a caption. public class DemoModelElements { public DemoModelElements() { Classes = new DemoModelClasses(); Paths = new DemoModelPaths(); } public ILanguage English { get; set; } public ILanguage German { get; set; } public SecurityModuleElements Security { get; set; } public DemoModelClasses Classes { get; private set; } public DemoModelPaths Paths { get; private set; } } public class DemoCoreGenerator : DemoDependentGenerator { protected override void AddCoreElements() { Elements.English = Builder.AddDisplayLanguage("en-US", "English"); Elements.German = Builder.AddDisplayLanguage("de-CH", "Deutsch"); Elements.Security = Builder.Include().Elements; } protected override void AddProperties() { Elements.Security.Classes.User.Caption.SetValue(Elements.German, "Benutzer"); } } This approach works well with any module that has adhered to the pattern and exposes its elements in a standardized way. [5] In this case, the core module includes the security module and retains a reference to its elements. Any code that uses the core module will now have access not only to the core elements but also to the security elements, as well. Another major benefit to using this pattern is that the resulting code is quite self-explanatory: it's no mystery to what the Elements.Security.Classes.User.Caption is referring. 
[One last thing: folder structure] The previous pattern had a single monolithic file. The new pattern increases the number of files -- possibly by quite a lot. It's recommended to put these new files into the following structure: [-] Models [+] Aspects [+] Elements [+] Generators The "Aspects" folder isn't new to this pattern, but it's worth mentioning that any model-specific aspects should go into a separate folder. That's all for now. Happy modeling! -------------------------------------------------------------------------------- [1] Naturally, the IMetaModel is always available and any part of the generation process can access metadata in the model at any time. However, the API for the model is quite generic and requires knowledge of the unique identifier or index for a piece of metadata. [2] By introspection, we mean that if metadata is accessed through .NET code structures -- like properties or constants -- we should be able to find all usages of a particular metadata element without resorting to a "find in files" for a particular string. [3] It doesn't have to go there. The DemoCoreGenerator could also set up the builder (since it's using the same builder object). To do that, you'd override AddCoreElements() and set up the model there. However, it's clearer to keep it in the generator that actually owns the builder that is being configured. [4] Simple classes generally have few extra properties and no layouts or short description classes. [5] Through the IElementsProvider mentioned above.

Updating to a touch-friendly UI

2012-11-21T23:08:51+01:00

Published by marco on 21. Nov 2012 23:08:51 Updated by marco on 8. Mar 2013 09:44:48 ------------------------------------------------------------------------ I was recently redesigning a web page and wanted to make it easier to use from touch-screen browsers. Links made only of text are relatively easy to click with a mouse, but tend to make poor touch targets. If the layout has enough space around the link, this can be remedied by applying CSS. [The basic box] "First" <#>"Second" <#>"Third" <#> Suppose we have a box with three links in it, as shown to the right. [Setting the height] The first step is to make this box taller, so the logical thing to do is to set the height. We'll have to pick a value, so set height: 40px on the gray box. "First" <#>"Second" <#>"Third" <#> [Aligning vertically] This isn't exactly what we want, though; we'd rather have the vertical space equally distributed. Also, if you hover over the links, you can see that the space below the text is not active. Maybe we can try to add vertical-align: middle to align the content. "First" <#>"Second" <#>"Third" <#> Unfortunately, this doesn't have the desired effect. The vertical-align property works when used this way in table cells, but otherwise has no effect for block elements. Knowing that, we can set display: table-cell for the gray box. "First" <#>"Second" <#>"Third" <#> And now the box has become longer, because the 50% width of the box is calculated differently for table cells than for regular boxes (especially when a table cell is found outside of a table). [Relative positioning] Let's abandon the vertical-alignment approach and try using positioning instead. Set position: relative and top: 25% to center the links vertically. "First" <#>"Second" <#>"Third" <#> Now that looks much better, but the space above and below the links is still not active. Perhaps we can use the height trick again, to make the individual links taller as well. So we set height: 100% on each of the links. 
"First" <#>"Second" <#>"Third" <#> We didn't get the expected result, but we should have expected that: the links are inline elements and can only have a height set if we set display: inline-block on each link as well. We use inline-block rather than block so that the links stay on the same line. "First" <#>"Second" <#>"Third" <#> The links are now the right size, but they stick out below the gray box, which isn't what we wanted at all. We're kind of out of ideas with this approach, but there is another way we can get the desired effect. [Positive padding and negative margins] Let's start with the original gray box and, instead of choosing a random height as we did above -- 40px -- let's set padding: 8px on the gray box to make room above and below the links. "First" <#>"Second" <#>"Third" <#> With just one CSS style, we've already got the links nicely aligned and, as an added benefit, this technique scales even if the font size is changed. The 8-pixel padding is preserved regardless of how large the font gets. [1] "First" <#>"Second" <#>"Third" <#> This approach seems promising, but the links are still not tall enough. The naive approach of setting height: 100% on the links probably won't work as expected, but let's try it anyway. "First" <#>"Second" <#>"Third" <#> It looks like the links were already 100% of the height of the container; in hindsight it's obvious, since the height of the gray box is determined by the height of the links. The 100% height refers to the client area of the gray box, which doesn't include the padding. We'd actually like the links to have padding above and below just as the gray box has. As we saw above, the links will only honor the padding if they also have display: inline-block, so let's set that in addition to padding: 8px. "First" <#>"Second" <#>"Third" <#> We're almost there. The only thing remaining is to make the vertical padding of the links overlap with the vertical padding of the gray box. 
We can do this by using a negative vertical margin, setting margin: -8px. "First" <#>"Second" <#>"Third" <#> We finally have the result we wanted. The links are now large enough for the average finger to strike without trying too hard. Welcome to the CSS-enabled touch-friendly world of web design. The code for the final example is shown below, with the sizing/positioning styles highlighted: .gray-box { background-color: gray; border: 1px solid black; border-width: 1px 0; width: 50%; text-align: center; padding: 8px 0; } .gray-box a { background-color: #8F8F8F; display: inline-block; padding: 8px 20px; margin: -8px 0; }

First Second Third

-------------------------------------------------------------------------------- [1] Naturally, we could also use .8em instead and then the padding will scale with the font size. This would work just as well with the height. Let's pretend that we're working with a specification that requires an 8-pixel padding instead of a flexible one.
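The flexible variant mentioned in the footnote can be sketched as follows. This is only an illustration that swaps the fixed 8px values of the final example for the .8em value from the footnote; it assumes the links inherit their font size from the gray box, since em units resolve against each element's own font size and the negative margin only cancels the box padding exactly when the two sizes match.

```css
/* Sketch only: em-based variant of the final example.
   The 0.8em value comes from the footnote; all other rules
   mirror the final .gray-box styles shown in the article. */
.gray-box {
  background-color: gray;
  border: 1px solid black;
  border-width: 1px 0;
  width: 50%;
  text-align: center;
  padding: 0.8em 0;      /* vertical padding scales with the box's font size */
}

.gray-box a {
  background-color: #8F8F8F;
  display: inline-block; /* needed so the vertical padding and margin affect layout */
  padding: 0.8em 20px;   /* same vertical padding on each link */
  margin: -0.8em 0;      /* pulls the link's padding over the box's padding */
}
```

If the links were given a different font size than the box, the two em values would no longer be equal and the links would either stick out of the box or fall short of its edges.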