Links and Notes for August 20th, 2021

Published by marco on

Below are links to articles, highlighted passages[1], and occasional annotations[2] for the week ending on the date in the title, enriching the raw data from Instapaper Likes and Twitter. They are intentionally succinct, else they’d be articles and probably end up in the gigantic backlog of unpublished drafts. YMMV.

[1] Emphases are added, unless otherwise noted.↩

[2] Annotations are only lightly edited.↩

COVID-19
Economy & Finance
Public Policy & Politics
Journalism & Media
Philosophy & Sociology
Programming

COVID-19

Economy & Finance

China’s income inequality is among the world’s worst by Nicu Calcea (New Statesman)

GINI Coefficients for 'important' countries

The title of the article and graph both discuss China’s inequality, but more interesting is that Turkey, Israel, and the U.S. are all worse than China—and they shouldn’t be, should they?

The Long Road to a New Ideology: Piketty on Trump, Democrats, and Inequality by John Plotz & Adaner Usmani (Public Books)

“I propose a minimum inheritance for all, €120,000 at the age of 25. This would really be for all, whether your ancestors were slaves or slave owners. Everybody would receive €120,000 at the age of 25.”

“We need to have some specific reparation: sometimes symbolic (like a pedagogical museum), sometimes material for some specific injustice of the past. And at the same time, we need to look at the future of a universal redistribution mechanism. This would, in practice, benefit a lot of people from the minority groups. And these are, of course, still very much concentrated in the lower socioeconomic groups in societies, minority society, or postcolonial migrants in European societies.”

“Finding your counter-ideologies is usually not so simple. That’s really what I want to stress in the book: there’s always a tendency on the left to say, We know what we should do. And the only problem is that we have a group of very powerful people who don’t want this to happen. So all that matters is the balance of power. I’m not saying the balance of power is not important. I’m not saying that you don’t have people who are trying to protect what they have—that’s obvious. The problems that we are trying to solve are not simple.”

“In the end, Trump was, of course, an awful and a terrible president. But to me, compared to George W. Bush—who went to war in Iraq and caused half a million [Iraqi deaths] after 2003 and 2004 in the Iraq War—in a way Trump was less damaging. I understand that in the US you view Trump as damaging. But if we take a world perspective? It could have been worse.”

“If he had used the US military to do things, it could have been worse. After Vietnam, after Iraq, the question is, When is the next time that America will use its military to do very bad things? And at least Trump was not the answer to this question.”

How to Invest: The Few Key Things You Need to Know by Thomas Pueyo (Uncharted Territories)

“Imagine you have $150,000 in assets with an advisor that charges 1%. That means you pay them $1,500 per year. For them to make $200,000 per year, they need 130 people like you. With about 200 working days a year, that means they can only spend about 1.5 days per year on your account. How well do you think they’re going to serve you?”
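The arithmetic in that passage is easy to check. Below is a minimal sketch that uses only the figures from the quote; the function and parameter names are my own. Rounding the client count up gives 134 rather than the quoted "130", which still works out to roughly 1.5 days per client.

```python
import math


def advisor_time_per_client(assets, fee_rate, target_income, working_days):
    """Return (fee per client, clients needed, working days per client)."""
    fee_per_client = assets * fee_rate
    # The advisor needs whole clients, so round up.
    clients_needed = math.ceil(target_income / fee_per_client)
    days_per_client = working_days / clients_needed
    return fee_per_client, clients_needed, days_per_client


fee, clients, days = advisor_time_per_client(
    assets=150_000, fee_rate=0.01, target_income=200_000, working_days=200
)
print(f"fee per client:  ${fee:,.0f}")
print(f"clients needed:  {clients}")
print(f"days per client: {days:.2f}")
```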

Public Policy & Politics

The US and UK Got Things So Wrong in Afghanistan Because They do Not Understand the Afghan Way of War by Patrick Cockburn (CounterPunch)

“[…] the Taliban no longer need help from al-Qaeda and there is every reason why they should reject a renewed alliance. On the other hand, there may be Taliban commanders who feel ideologically akin to al Qaeda and its clones and will give them covert aid.

“The Taliban are visibly astonished by the completeness of their victory and will take time to digest and consolidate it. The outside world will be wondering what to make of the new Afghan regime and what will be the implications of its success for them and for the region.

“It is in the interests of the Taliban for the moment to show a moderate face, but they have fought a ferocious war for two decades, taking heavy casualties. There will be many in their ranks who do not wish to dilute their social and religious beliefs for the sake of political convenience. Despite the amnesty just declared by Taliban leaders, many will seek vengeance against former government supporters whom they have long denounced as traitors.”

There's still plunder in them thar' hills!

Every Option in Afghanistan Was Bad by Nicholas Grossman (Arc Digital)

“Afghanistan is landlocked, so flying there requires going through airspace controlled by Pakistan or Iran, which the U.S. can get to from international waters, or over Turkmenistan, Uzbekistan or Tajikistan, which requires flying over the Caucasus, Russia or China. The U.S. might still try, especially if there’s evidence that terrorists based in Afghanistan are plotting direct attacks on America, but it will be more challenging than it was over the last two decades — not least because good intelligence will be harder to come by.”

Spoken like a deluded imperialist. What evidence? What intelligence?

“Regime change endgames based on a full handoff to local government forces are likely to fail.”

Regime change from outside of a country is just wrong, even if you could make it “succeed”. The people in the country should decide, not others. The others will always decide on what’s best for themselves, with the needs of the natives being purely ancillary.

GOOGLE LLC v. ORACLE AMERICA, INC. by Justice Breyer (U.S. Supreme Court)

“Google’s purpose was to create a different task-related system for a different computing environment (smartphones) and to create a platform—the Android platform—that would help achieve and popularize that objective. The record demonstrates numerous ways in which reimplementing an interface can further the development of computer programs. Google’s purpose was therefore consistent with that creative progress that is the basic constitutional objective of copyright itself.”

Position 47-50

“Google copied approximately 11,500 lines of declaring code from the API, which amounts to virtually all the declaring code needed to call up hundreds of different tasks. Those 11,500 lines, however, are only 0.4 percent of the entire API at issue, which consists of 2.86 million total lines. In considering “the amount and substantiality of the portion used” in this case, the 11,500 lines of code should be viewed as one small part of the considerably greater whole. As part of an interface, the copied lines of code are inextricably bound to other lines of code that are accessed by programmers. Google copied these lines not because of their creativity or beauty but because they would allow programmers to bring their skills to a new smartphone computing environment. The “substantiality” factor will generally weigh in favor of fair use where, as here, the amount of copying was tethered to a valid, and transformative, purpose.”

Position 51-57

“Applying the principles of the Court’s precedents and Congress’ codification of the fair use doctrine to the distinct copyrighted work here, the Court concludes that Google’s copying of the API to reimplement a user interface, taking only what was needed to allow users to put their accrued talents to work in a new and transformative program, constituted a fair use of that material as a matter of law.”

Position 62-65

“[…] a programmer building a new application for personal banking may wish to use various tasks to, say, calculate a user’s balance or authenticate a password. To do so, she need only learn the method calls associated with those tasks. In this way, the declaring code’s shortcut function is similar to a gas pedal in a car that tells the car to move faster or the QWERTY keyboard on a typewriter that calls up a certain letter when you press a particular key. As those analogies demonstrate, one can think of the declaring code as part of an interface between human beings and a machine.”

Position 141-145

“[…] the symbols by themselves do nothing. She must also use software that connects the symbols to the equivalent of file cabinets, drawers, and files. The API is that software. It includes both the declaring code that links each part of the method call to the particular task-implementing program, and the implementing code”

Position 166-168
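For readers who don't program, the distinction between declaring code and implementing code can be sketched roughly as follows. This is my own illustration in Python rather than Java, and every name in it (MathApi, OriginalMath, ReimplementedMath, larger_balance) is hypothetical. The point is only that a caller who has learned the declaration can use any implementation behind it.

```python
from typing import Protocol


class MathApi(Protocol):
    """'Declaring code': the name of a task, its inputs, and its output."""

    def max(self, a: int, b: int) -> int: ...


class OriginalMath:
    """One vendor's 'implementing code' for the declared task."""

    def max(self, a: int, b: int) -> int:
        return a if a >= b else b


class ReimplementedMath:
    """An independent implementation behind the same declaration."""

    def max(self, a: int, b: int) -> int:
        return sorted((a, b))[1]


def larger_balance(api: MathApi, x: int, y: int) -> int:
    # The caller only relies on the method call it has already learned.
    return api.max(x, y)


print(larger_balance(OriginalMath(), 4, 6))
print(larger_balance(ReimplementedMath(), 4, 6))
```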

“For most of the packages in its new API, Google also wrote its own declaring code. For 37 packages, however, Google copied the declaring code from the Sun Java API. Id., at 106–107. As just explained, that means that, for those 37 packages, Google necessarily copied both the names given to particular tasks and the grouping of those tasks into classes and packages.”

Position 173-175

“[…] copyright’s protection may be stronger where the copyrighted material is fiction, not fact, where it consists of a motion picture rather than a news broadcast, or where it serves an artistic rather than a utilitarian function.”

Position 275-276

“The Reexamination Clause is no bar here, however, for, as we have said, the ultimate question here is one of law, not fact. It does not violate the Reexamination Clause for a court to determine the controlling law in resolving a challenge to a jury verdict, as happens any time a court resolves a motion for judgment as a matter of law.”

Position 343-346

Journalism & Media

Rock of Ages by Scott Greenfield (Simple Justice)

“McWhorter can say this because he’s now a New York Times columnist, a Columbia linguistics professor and, well, black. Sometimes, the things black people demand are just dumb, and it’s neither woke nor anti-racist to acquiesce to dumb crap like removing a rock under some misguided vision of wokiosity that black people are always right when they claim to feel something.”

“Want to not be racist? Then accept the premise that people of any race or gender can do stupid, ridiculous, even crazy stuff, and don’t let them get away with it just because of their skin or genitalia. Real equality means that when someone demands something monumentally idiotic, like removing a 42-ton rock, you say “no.” And if that’s the worst racist thing they can manufacture, be happy that their lives are so wonderfully free of racism that they can’t come up with anything more serious to cry about than a rock.”

What The Media Hasn’t Told You About The Cuomo Debacle by Michael Tracey (Substack)

“At her press conference, James proclaimed that one purpose of the investigation was to demonstrate that “we should believe women.” But it’s unclear whether the women who reportedly attested that they “valued” Cuomo’s conduct also merit such “belief”—and if so, why their testimonies were twisted to signify the opposite of what they apparently said. Either way, the attorney general’s standard of “belief” seems to involve explicitly accusing public officials of lawbreaking, while forsaking any obligation to actually prove those accusations in court.”

“One of the most eyebrow-raising characters in this entire mess is Charlotte Bennett, arguably the most significant of Cuomo’s “accusers” given that she “broke the dam” by being the second person to publicly come forward with claims. Did anyone bother to do basic research on this person before deciding that her allegations — which are really more of an interpretative paradigm she’s constructed than any one tangible “allegation” — had to be relayed to the public almost completely uncritically?”

“However much Cuomo might’ve had this coming to him, do the precedent-setting implications of the ordeal seem conducive to a healthier political and cultural climate? Does the empowerment and/or “vindication” of the people who employed these tactics against him — and received the most kid-glove possible treatment in the media — seem like a positive thing in the long run?”

“It should really be emphasized that Attorney General of the State of New York, Letitia James, did something here that previously would’ve been close to unthinkable. She went before the TV cameras and simply declared that Cuomo had violated the law, but then washed her hands of any responsibility to prove her allegations of lawbreaking. This law enforcement official might have radically discarded the most basic notions of due process, but a political objective was achieved, and you can bet James will be reaping dividends ahead of the next New York gubernatorial election in 2022.”

Philosophy & Sociology

Why Is It So Hard to Be Rational? by Joshua Rothman (New Yorker)

“COVID deniers and climate activists are different kinds of people, but they’re united in their frustration with the systems built by experts on our behalf—both groups picture élites shuffling PowerPoint decks in Davos while the world burns. From this perspective, the root cause of mass irrationality is the failure of rationalists. People would believe in the system if it actually made sense.”

“The realities of rationality are humbling. Know things; want things; use what you know to get what you want. It sounds like a simple formula. But, in truth, it maps out a series of escalating challenges. In search of facts, we must make do with probabilities. Unable to know it all for ourselves, we must rely on others who care enough to know. We must act while we are still uncertain, and we must act in time—sometimes individually, but often together. For all this to happen, rationality is necessary, but not sufficient. Thinking straight is just part of the work.”

Programming

In Search of an Understandable Consensus Algorithm (Extended Version) by Diego Ongaro and John Ousterhout (GitHub)

“Different servers may observe the transitions between terms at different times, and in some situations a server may not observe an election or even entire terms. Terms act as a logical clock [14] in Raft, and they allow servers to detect obsolete information such as stale leaders. Each server stores a current term number, which increases monotonically over time. Current terms are exchanged whenever servers communicate; if one server’s current term is smaller than the other’s, then it updates its current term to the larger value. If a candidate or leader discovers that its term is out of date, it immediately reverts to follower state. If a server receives a request with a stale term number, it rejects the request.”

Position 172-177
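The term rules in that passage fit in a few lines. This is a minimal sketch with a hypothetical Server class, not the paper's code. It models only the three rules quoted above: adopt a larger term, step down if out of date, and reject stale requests. It ignores everything else about elections and RPC handling.

```python
from dataclasses import dataclass


@dataclass
class Server:
    current_term: int = 0
    role: str = "follower"  # "follower", "candidate" or "leader"

    def on_message(self, sender_term: int) -> bool:
        """Apply the term rules; return True if the message may be processed."""
        if sender_term < self.current_term:
            # Stale term number: reject the request.
            return False
        if sender_term > self.current_term:
            # Our term is out of date: adopt the larger one and step down.
            self.current_term = sender_term
            self.role = "follower"
        return True


leader = Server(current_term=3, role="leader")
print(leader.on_message(2), leader)  # stale sender is rejected
print(leader.on_message(5), leader)  # newer term deposes the leader
```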

“If desired, the protocol can be optimized to reduce the number of rejected AppendEntries RPCs. For example, when rejecting an AppendEntries request, the follower can include the term of the conflicting entry and the first index it stores for that term. With this information, the leader can decrement nextIndex to bypass all of the conflicting entries in that term; one AppendEntries RPC will be required for each term with conflicting entries, rather than one RPC per entry. In practice, we doubt this optimization is necessary, since failures happen infrequently and it is unlikely that there will be many inconsistent entries.”

Position 253-257
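A rough sketch of that optimization, under my own assumptions: logs are plain lists of term numbers with 1-based entry indexes, and the function names and the example logs are illustrative. The quote does not cover the case where the follower's log is simply too short, so the handling of that case here is an assumption.

```python
def conflict_hint(follower_terms, prev_index, prev_term):
    """Follower side of the consistency check.

    Returns None if the entry at prev_index matches prev_term, otherwise
    (conflicting term, first index the follower stores for that term).
    """
    if prev_index == 0:
        return None  # nothing precedes the first entry
    if prev_index > len(follower_terms):
        # Assumption: a too-short log asks the leader to retry at its end.
        return (None, len(follower_terms) + 1)
    conflict_term = follower_terms[prev_index - 1]
    if conflict_term == prev_term:
        return None
    first = prev_index
    while first > 1 and follower_terms[first - 2] == conflict_term:
        first -= 1
    return (conflict_term, first)


def sync_next_index(leader_terms, follower_terms):
    """Leader side: back up nextIndex one conflicting term per RPC."""
    next_index = len(leader_terms) + 1
    rpcs = 0
    while True:
        rpcs += 1
        prev_index = next_index - 1
        prev_term = leader_terms[prev_index - 1] if prev_index > 0 else 0
        hint = conflict_hint(follower_terms, prev_index, prev_term)
        if hint is None:
            return next_index, rpcs
        next_index = hint[1]  # skip every conflicting entry of that term


leader_log = [1, 1, 1, 4, 4, 5, 5, 6, 6, 6]
follower_log = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3]
print(sync_next_index(leader_log, follower_log))  # (4, 3)
```

With these logs, the leader finds the matching point after three RPCs. Decrementing nextIndex by one each time would take eight.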

“Raft determines which of two logs is more up-to-date by comparing the index and term of the last entries in the logs. If the logs have last entries with different terms, then the log with the later term is more up-to-date. If the logs end with the same term, then whichever log is longer is more up-to-date.”

Position 286-288
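That comparison is a lexicographic ordering on (last term, last index), which Python tuples give for free. A minimal sketch; the function name and the tuple representation are my own.

```python
def more_up_to_date(a, b):
    """Each argument is (term of last entry, index of last entry).

    A later last term wins; with equal last terms, the longer log wins.
    Tuple comparison in Python is lexicographic, which is exactly this rule.
    """
    return a > b


print(more_up_to_date((3, 5), (2, 9)))  # True: later term beats a longer log
print(more_up_to_date((2, 9), (2, 5)))  # True: same term, longer log
print(more_up_to_date((2, 5), (2, 5)))  # False: identical logs
```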

“This snapshotting approach departs from Raft’s strong leader principle, since followers can take snapshots without the knowledge of the leader. However, we think this departure is justified. While having a leader helps avoid conflicting decisions in reaching consensus, consensus has already been reached when snapshotting, so no decisions conflict. Data still only flows from leaders to followers, just followers can now reorganize their data.”

Position 437-440

“[…] sending the snapshot to each follower would waste network bandwidth and slow the snapshotting process. Each follower already has the information needed to produce its own snapshots, and it is typically much cheaper for a server to produce a snapshot from its local state than it is to send and receive one over the network.”

Position 441-443

“The Leader Completeness Property guarantees that a leader has all committed entries, but at the start of its term, it may not know which those are. To find out, it needs to commit an entry from its term. Raft handles this by having each leader commit a blank no-op entry into the log at the start of its term.”

Position 466-468

“[…] a leader must check whether it has been deposed before processing a read-only request (its information may be stale if a more recent leader has been elected). Raft handles this by having the leader exchange heartbeat messages with a majority of the cluster before responding to read-only requests. Alternatively, the leader could rely on the heartbeat mechanism to provide a form of lease [9], but this would rely on timing for safety (it assumes bounded clock skew).”

Position 468-471

“Algorithms are often designed with correctness, efficiency, and/or conciseness as the primary goals. Although these are all worthy goals, we believe that understandability is just as important. None of the other goals can be achieved until developers render the algorithm into a practical implementation, which will inevitably deviate from and expand upon the published form. Unless developers have a deep understanding of the algorithm and can create intuitions about it, it will be difficult for them to retain its desirable properties in their implementation.”

Position 573-577

Orthogonal Optimization of Subqueries and Aggregation by Cesar A. Galindo-Legaria & Milind M. Joshi (National University of Singapore)

“[…] we make the observation that there is significant overlap between techniques proposed for subquery execution and others such as GroupBy evaluation. Therefore we take the approach of identifying and implementing more primitive, independent optimizations that collectively generate efficient execution plans.”

Position 14-16

“By implementing all these orthogonal techniques, the query processor should then produce the same efficient execution plan for the various equivalent SQL formulations we have listed above, achieving a degree of syntax-independence.”

Position 59-60

“Another problematic construct is conditional scalar execution, expressed in SQL as case when <cond> then <value1> else <value2> end. The point is, <value2> should not be evaluated when <cond> is true. Therefore, eager execution of a subquery, say contained in <value2>, is incorrect, in particular if it happens to generate a run-time error.”

Position 166-169
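The problem is easier to see outside of SQL. In this sketch the two branches of the CASE are modeled as Python callables (thunks), and failing_subquery is a hypothetical stand-in for a scalar subquery that raises a run-time error. None of it comes from the paper.

```python
def failing_subquery():
    # Stand-in for a scalar subquery that fails at run time.
    raise RuntimeError("subquery returned more than one row")


def case_when_eager(cond, value1, value2):
    # Incorrect: both branches are evaluated before one is chosen.
    v1, v2 = value1(), value2()
    return v1 if cond else v2


def case_when_lazy(cond, value1, value2):
    # Correct: only the selected branch is ever evaluated.
    return value1() if cond else value2()


print(case_when_lazy(True, lambda: 1, failing_subquery))  # 1

try:
    case_when_eager(True, lambda: 1, failing_subquery)
except RuntimeError as error:
    print("eager evaluation failed:", error)
```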

“This step transforms an operator tree into a simplified/normalized form. Simplifications include, for example, turning outerjoins into joins, when possible, and detecting empty subexpressions. For subqueries, mutual recursion between relational and scalar execution is removed, which is always possible; and correlations are removed, which is usually possible. At the end of normalization, most common forms of subqueries have been turned into some join variant.”

Position 324-327

“Subqueries and aggregation should be handled by orthogonal optimizations. Earlier work has sometimes combined multiple, independent primitives to derive strategies that are suitable for some cases. What we do instead is to separate out those independent, small primitives. This allows finer granularity of their application; it generates a richer set of execution plans; it makes for more modular proofs; and it simplifies implementation.”

Position 346-349

Efficiently Compiling Efficient Query Plans for Modern Hardware by Thomas Neumann (Very Large Data Base Endowment Inc.)

“The algebraic operator model is very useful for reasoning over the query, but it is not necessarily a good idea to exhibit the operator structure during query processing itself. In this paper we therefore propose a query compilation strategy that differs from existing approaches in several important ways:”

Position 37-39

“The overall framework produces code that is very friendly to modern CPU architectures and, as a result, rivals the speed of hand-coded query execution plans. In some cases we can even outperform hand-written code, as using the LLVM assembly language allows for some tricks that are hard to do in a high-level programming language like C++. Furthermore, by using an established compiler framework, we benefit from future compiler, code optimization, and hardware improvements, whereas other approaches that integrate processing optimizations into the query engine itself will have to update their systems manually.”

Position 41-46

“The main point is that we consider spilling data to memory as a pipeline-breaking operation. During query processing, all data should be kept in CPU registers as long as possible.”

Position 73-74

“[…] how can we organize query processing such that the data can be kept in CPU registers as long as possible? The classical iterator model is clearly ill-suited for this, as tuples are passed via function calls to arbitrary functions – which always results in evicting the register contents. The block-oriented execution models have fewer passes across function boundaries, but they clearly also break the pipeline as they produce batches of tuples beyond register capacity”

Position 74-77

“As we have to materialize the tuples anyway at some point, we therefore propose to compile the queries in a way that all pipelining operations are performed purely in CPU (i.e., without materialization), and the execution itself goes from one materialization point to another.”

Position 101-102

“All four fragments in themselves are strongly pipelining, as they can keep their tuples in CPU registers and only access memory to retrieve new tuples or to materialize their results. Furthermore, we have very good code locality as small code fragments are working on large amounts of data in tight loops. As such, we can expect to get very good performance from such an evaluation scheme.”

Position 107-109

“The query execution code is no longer operator centric but data centric: Each code fragment performs all actions that can be done within one part of the execution pipeline, before materializing the result into the next pipeline breaker. The individual operator logic can, and most likely will, be spread out over multiple code fragments, which makes query compilation more difficult than usual.”

Position 114-116

“The iterator model has a nice, simple interface, but it pays for this by using virtual function calls and frequent memory accesses. By exposing the operator structure, we can generate near optimal assembly code, as we generate exactly the instructions that are relevant for the given situation, and we can keep all relevant values in CPU registers.”

Position 119-122
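To make the contrast concrete, here is the shape of the two approaches for a query like "sum b where a > 10". This is only a sketch in Python with names of my own choosing. It can't show the register-level effects the paper measures, and the fused loop is written by hand where the paper generates LLVM code.

```python
class Scan:
    """Iterator-model leaf: hands out one tuple per next() call."""

    def __init__(self, rows):
        self._rows = iter(rows)

    def next(self):
        return next(self._rows, None)


class Select:
    """Iterator-model filter: pulls from its child until a tuple qualifies."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def next(self):
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row
        return None


def sum_iterator(rows):
    # Operator-centric: every tuple crosses several function-call boundaries.
    plan = Select(Scan(rows), lambda row: row[0] > 10)
    total = 0
    while (row := plan.next()) is not None:
        total += row[1]
    return total


def sum_fused(rows):
    # Data-centric: scan, filter and aggregate fused into one tight loop.
    total = 0
    for a, b in rows:
        if a > 10:
            total += b
    return total


rows = [(5, 1), (20, 2), (30, 3)]
print(sum_iterator(rows), sum_fused(rows))  # 5 5
```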

“The real translation code is significantly more complex, of course, as we have to keep track of the loaded attributes, the state of the operators involved, attribute dependencies in the case of correlated subqueries, etc., but in principle this simple mapping already shows how we can translate algebraic expressions into imperative code.”

Position 143-146

“[…] producing assembler code using LLVM is much more robust than writing it manually. For example LLVM hides the problem of register allocation by offering an unbounded number of registers (albeit in Single Static Assignment form). We can therefore pretend that we have a CPU register available for every attribute in our tuple, which simplifies life considerably. And the LLVM assembler is portable across machine architectures, as only the LLVM JIT compiler translates the portable LLVM assembler into architecture dependent machine code.”

Position 155-159

“Furthermore, the LLVM assembler is strongly typed, which caught many bugs that were hidden in our original textual C++ code generation. And finally LLVM is a full strength optimizing compiler, which produces extremely fast machine code, and usually requires only a few milliseconds for query compilation, […]”

Position 159-161

“While staying in LLVM, we can keep the tuples in CPU registers all the time, which is about as fast as we can expect to be. When calling an external function all registers have to be spilled to memory, which is somewhat expensive. In absolute terms it is very cheap, of course, as the registers will be spilled on the stack, which is usually in cache, but if this is done millions of times it becomes noticeable.”

Position 175-177

“[…] it makes sense to define functions within LLVM itself, that can then be called from places within the LLVM code. Again, one has to make sure that the hot path does not cross a function boundary. Thus a pipelining fragment of the algebraic expression should result in one compact LLVM code fragment.”

Position 185-187

“All these issues complicate code generation, of course. But overall the effort required to avoid these pitfalls is not too severe. The LLVM code is generated anyway, and spending effort on the code generator once will pay off for all subsequent queries. The code generator is relatively compact. In our implementation the code generation for all algebraic operators required for SQL-92 consists of about 11,000 lines of code, which is not a lot.”

Position 232-235

“This style of block processing where values are packed into a (large) register fits very naturally into our framework, as the operators always pass register values to their consumers. LLVM directly allows for modeling SIMD values as vector types, thus the impact on the overall code generation framework are relatively minor.”

Position 245-247

“By relying on mainstream compilation frameworks the DBMS automatically benefits from future compiler and processor improvements without re-engineering the query engine.”

Position 318-320

“When aggregating three columns, the system processes tuple attributes at a rate of 6.5GB/s, which is the bandwidth of the memory bus. We cannot expect to get faster than this without changes to the storage system. Our query processing is so fast that it is basically “I/O bound”, where I/O means RAM access.”

Position 483-486