Most interesting links of December '12

December 31, 2012

Recommended Readings

Software development

Kent Beck: When Worse Is Better: Incrementally Escaping Local Maxima - Kent reintroduces his Sprinting Centipede strategy ("reduce the cost of each change as much as possible so as to enable small changes to be chained together nearly continuously" => "From the outside it is clear that big changes are happening, even though from the inside it's clear that no individual change is large or risky.") and advices how to deal with situations where improvements have reached a local maxima by making the design temporarily intentionally worse (f.ex. by inlining all the ugly methods or writing to both the old and the new data store); strongly recommended
- Related: Efficient Incremental Change - transmute risk into time by doing small, safe steps, then optimize your ability to make these steps quickly and thus being able to achieve large changes
Researchers: It is not profitable to outsource development - the Scandinavian research organisation SINTEF ICT has studied the effects of outsourcing and discovered that often it is more expensive than in-country development due to hidden costs caused by worse communication and cultural differences (f.ex. Indians tend not to ask questions and work based on their, often incomplete, understanding) and very high people turn-over; even after the true cost is discovered, companies irrationally stay there. However it is possible to succeed, in some cases.
Bjørn Borud: Tractor pulling and software engineering - very valuable and pragmatic advices on producing good software (i.e. avoiding accumulating so much crap that the software just stops progressing). Don't think only about the happy path. Simplify. Write for other developers, i.e. avoid too "smart" solutions, test & document, dp actually think about design and its implication w.r.t performance etc. Awake the scientist in you: "Do things because you know they work, not because it happens to be the hip thing to do." (Note: I see the good intention behind "design for the weakest programmer you can think of" but plase don't take it too far! Software should be primarily simple, not necessarily easy.
Know your feedback loop – why and how to optimize it - to succeed, we need to learn faster; the only way to do that is to optimize our feedback loops, i.e. shorten the path our assumptions travel before they are (in)validated, whether about our code, business functionality, or the whole project idea. Conscise, valuable.
Code quality is the least important reason to pair program - the author argues, based on his experience, other benefits of pair programming are more important than code quality: "[..] the most important reasons why we pair: it contributes to an amazing company culture, it’s the best way to bring new developers up to speed, and it provides a great way to share knowledge across the development team."
You Can’t Refactor Your Way Out of Every Problem - refactoring can’t help you if the design is fundamentally wrong, you need to rewrite it; know when it can or cannot help and act accordingly (related to how much design is needed upfront since some design decision cannot be reverted/improved upon)

Languages

Josh Bloch: Java - the good, bad and ugly parts (video, 15 min); summary: right design decisions (VM, GC, threads, dynamic linking, OOP, static typing, exceptions, ...), some bad details (signed byte, lossy long-> double, == doesn't cal .equals, ability to call overriden methods from constructors, ...); Mr. Bloch has also given a longer talk examining the evolution of Java from 1.0 to 1.7 in The Evolution of Java: Past, Present, and Future.
True Scala complexity - a thoughtful criticism of the complexity of Scala, based on code samples; "[it is true that] Scala is a language with a smaller number of orthogonal features than found in many other languages. [...] However, the problem is that each feature has substantial depth, intersecting in numerous ways that are riddled with nuances, limitations, exceptions and rules to remember. It doesn’t take long to bump into these edges, of which there are many."; however, its possible to avoid many of the problems mentioned by resorting to less smart, more clumsy and verbose Java-like code; also, the author still likes Scala.
Scala or Java? Exploring myths and facts (3/2012) - a balanced view of Scala's strengths and weaknesses; "[..] the same features that makes Scala expressive can also lead to performance problems and complexity. This article details where this balance needs to be considered." Topics: productivity, complexity, concurrency support, language extensibility, Java interoperability, quality of tooling, speed, backward compatibility. Plenty of useful links.

Big data & Cloud:

Dean Wampler's slides from Beyond Map Reduce - 1) Hadoop Map Reduce is the EJB 2 of big data but there are better APIs such as Cascading with Scala/Clojure wrappers; there are also "alternative" solutions like Spark and Storm; 2) functional/relational programming with simple data structures (lists, sets, maps etc.) is much more suitable for big data than OOP (for we do mostly stateless data transformations)
Apache HBase vs Apache Cassandra - comparison sheet - if you want to decide between the two
Optimizing MongoDB on AWS - 20 min talk about the current state of the art. Simplicity: Mongo AMIs by 10gen, Cloudformation template etc. Stability & perf.: new storage options - EBS with provisioned IOPS volumes (high I/O) + EBS Optimized Instances (dedicated throughput to EBS), High IO instances (hi1.4xlarge - SSD)); comparison of throughput (number of operations, MBs) of these storages; tips for filesystem config. Scalability: scale horizontally and vertically, shrink as needed.
Getting Real About Distributed System Reliability by Jay Kreps, the author of the Voldemort DB: distributed software is NOT somehow innately reliable; a common mistake is to consider only probability of independent failures but failures typically are dependent (e.g. network problems affect the whole data center, not a single machine); the theoretical reliability "[..] is an upper bound on reliability but one that you could never, never approach in practice"; "For example Google has a fantastic paper that gives empirical numbers on system failures in Bigtable and GFS and reports empirical data on groups of failures that show rates several orders of magnitude higher than the independence assumption would predict. This is what one of the best system and operations teams in the world can get: your numbers may be far worse." The new systems are far less mature (=> mor bugs, worse monitoring, less experience) and thus less reliable (it takes a decade for a FS to become mature, likely similar here). Distributed systems are of course more complex to configure and operate. "I have come around to the view that the real core difficulty of these systems is operations, not architecture or design." Some nice examples of failures.

Other

Talks To Help You Become A Better Front-End Engineer In 2013 (tl;dr) - topics such as mobile web development, modern web devel. workflow, current/upcoming featrues of CSS3, ECMAScript 6, CSS preprocessors (LESS etc.), how to write maintainable JS, modular CSS, responsive design, JS debugging, offline webapps, CSS profiling and speed tracer, JS testing
On Being A Senior Engineer - valuable insights into what makes an engineer "senior" (i.e. mature; from the field of web operations but applies to IT in general): mature engineers seek out constructive criticism of their designs, understand the non-technical areas of how they are perceived (=> assertive, nice to work with etc.), understand that not all of their projects are filled with rockstar-on-stage work, anticipate the impact of their code (on resource usage, others' ability to understand & extend it etc.), lift the skills and expertise of those around them, make their trade-offs explicit when making decisions, do not practice "Cover Your Ass Engineering," are able to view the project from another person’s (stakeholder's) perspective, are aware of cognitive biases (such as the Planning Fallacy), practice egoless programming, know the importance of (sometimes irrational) feelings people have.

Clojure Corner

Polymorphism in Clojure - Tim Ewald's 1h live coding talk at Øredev conference introducing mechanisms for polymorphism (and Java interoperability) in Clojure and explaining well the different use cases for them. Included: why records, protocols & polymorphism with them (shapes, area => open, not explicit switch) (also good for Java interop.: interfaces), reify, multimethods.
Stuart Sierra: Thinking in Data (1h talk) - Sierra introduces data-oriented programming, i.e. programming with generic, immutable data structures (such as maps), pure functions, and isolated side-effects. Some other points: Records are an optimization, only for perforamnce (rarely) or polymorphism (ot often); the case for composable functions; testing using simulations (generative testing) etc.; visualization of state & process

Tools & Libs

Netflix' Hysterix: library to make distributed systems more resilitent by preventing a single slow/failing dependency from causing resource (thread etc.) exhaustion etc. by wrapping external calls in a separate thread with clear timeouts and support for fallbacks, with good monitoring etc. Read "Problem Definition" on the page to understand the problem it tries to solve.

Favorite Quotes

if you build something that is fundamentally broken it isn't really interesting that you followed the plan or you followed some methodology -- the thing you built is fundamentally broken.

- Bjørn Borud, Chief Architect at Comoyo.no, in an email 12/2012

The root of the Toyota Way is to be dissatisfied with the status quo; you have to ask constantly, "Why are we doing this?"

- Katsuaki Watanabe, Tyota President 2005 - 2009 (from the talk Deliberate Practice)

Tags: scala clojure webdev java kent beck methodology DevOps data