Friday, December 28, 2012

Diagnostic Logging

"Which logging framework should I use?" isn't the first question you should be asking in relation to diagnostic logs. Consider these instead: 

  • Can you debug your application using only the logs without a debugger attached?
  • Is there so much noise and useless information in the logs that you are searching for a needle in a haystack?
  • If you refactor your code, does your logging reflect the new code layout / workflow allowing you to pinpoint the cause of the problems easily?
  • Do you treat your logs like a user interface? Do you regularly review them for usefulness and usability?
  • Are your logs a security risk?
  • Can you correlate logs to business events or user actions (e.g. correlation ID)?
  • Have you tested the performance and scalability of your logs to ensure they have a negligible impact on your systems?
  • Do your logs have unintended side-effects on your application? (for example, forcing participation in a distributed transaction, or even deadlocks under heavy load)
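To make the correlation-ID question concrete, here's a minimal sketch using Python's standard logging module. The logger name, the `CorrelationFilter` class, and the "req-42" identifier are all illustrative, not from any particular framework — the point is just that stamping every record with the ID of the triggering user action makes log lines correlatable after the fact.

```python
import io
import logging

# A filter that stamps every record with a correlation ID so each log
# line can be tied back to a single user action or business event.
class CorrelationFilter(logging.Filter):
    def __init__(self, correlation_id):
        super().__init__()
        self.correlation_id = correlation_id

    def filter(self, record):
        record.correlation_id = self.correlation_id
        return True  # never suppress a record, just annotate it

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("[%(correlation_id)s] %(levelname)s %(message)s"))

log = logging.getLogger("orders")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(handler)
log.addFilter(CorrelationFilter("req-42"))

log.info("order placed")
print(buffer.getvalue().strip())  # → [req-42] INFO order placed
```

In a real system the ID would come from the incoming request rather than being hard-coded, but the mechanism is the same.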
Once you're confident that you can answer the above questions satisfactorily, we can return to the original question of which logging framework to use. The answer: it depends on your technology and your needs. Whatever those turn out to be, here's the list of features I usually look for:

  • Filtering/Targeting: Based on category or log source, I can filter the log output or target log output from different components to different places.
  • Pluggable sinks: Can I redirect log output to custom sinks and are there libraries of built-in sinks to choose from?
  • Can the framework gather extra contextual information so I don't have to gather it when I want to log (e.g. thread id/name, class name, machine name, timestamp, etc.)?
  • Is the framework performant? Does it mostly evaluate to a no-op if I have logging verbosity turned down?
  • Can I dynamically change the logging verbosity in production in response to an event? I've seen some production systems direct all Debug info to an in-memory buffer and only dump that buffer to disk when an error level event happens (allowing you to see a bunch of debug info from *prior* to the error)
  • Can I adjust the formatting of the output? Are there built in formats compatible with log reading tools?
  • Do you actively manage the size of your logs? Rolling file and auto cleanup are useful.
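Several of these features can be demonstrated with Python's standard library alone. The sketch below shows the buffer-until-error trick described above using `logging.handlers.MemoryHandler`: verbose records stay in memory and are flushed to the real sink only when an error arrives, preserving the debug trail from *before* the failure. The logger name and the list-backed target handler are illustrative (a real deployment would target a rolling file instead).

```python
import logging
import logging.handlers

# The "real" sink. A list stands in for a file so the effect is visible;
# logging.handlers.RotatingFileHandler would be the production choice.
records = []

class ListHandler(logging.Handler):
    def emit(self, record):
        records.append(self.format(record))

target = ListHandler()
target.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

buffered = logging.handlers.MemoryHandler(
    capacity=1000,              # flush if the buffer fills up...
    flushLevel=logging.ERROR,   # ...or as soon as an error is logged
    target=target,
)

log = logging.getLogger("demo")
log.setLevel(logging.DEBUG)
log.propagate = False
log.addHandler(buffered)

log.debug("step 1 ok")
log.debug("step 2 ok")
assert records == []            # nothing has reached the sink yet
log.error("step 3 failed")      # triggers a flush of the whole buffer
print(records)                  # → the two debug lines plus the error
```

The debug lines cost almost nothing while things go well, yet they're all there when something breaks.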
Good logging practices have served me well over the years, but for every new system I build, I tend to use less and less logging - down to the point where I'm only logging macro level tasks that are going to change the internal state or configuration of the system in a major way. I find myself getting closer to the guidelines posted by Jeff Atwood.

Thursday, December 20, 2012

Fixing broken windows

At some point in our software development careers, we experience a phenomenon commonly described as the “broken window theory”. In terms of code, this loosely translates to the idea that code quality will deteriorate over time because small lapses in quality replicate themselves throughout the codebase. A corresponding belief is that the only way to improve quality is to either start with perfect code or start again (i.e. demolish and rewrite) with good intentions in mind. I would like to challenge this belief and present a way that even the worst legacy codebases can improve over time. If it’s broken – fix it!
Some of my favourite background reading on this topic includes:

More than once, I have received questions from developers asking how they can convince their manager to let them rewrite the component that was badly written in a rush during the previous iteration. Let me tell you a secret: there is almost never the time, business desire, or budget to rewrite anything – even when poor quality is bringing team velocity to a halt like a tractor pulling a sled across a muddy field.

Sustainability is the key to overcoming the reluctance to rewrite. By its definition, sustainability points to being in it for the long haul and making the codebase a better place to work. Think of it as renting a dingy apartment and putting a coat of paint here, a new piece of furniture there, and continuing to make minor fixes. It gets better over time.
Improving something is almost pointless if you don’t measure what you’re improving. When untangling a jumble of string, how do you know whether you’re creating more knots or fewer, unless you keep track of that straight part at the end? Measuring an increase in quality over time has a huge impact on how you feel about the codebase. It can also guide you in knowing where the most work is needed.

So how does one measure code quality? You should have a set of metrics that can be calculated automatically (e.g. cyclomatic complexity, coupling levels, etc.). These are discrete metrics, meaning they give definite numerical values which can be trended over time. You know you’re making an improvement if the average complexity goes down, despite that shiny new feature you’ve just added.
It doesn't matter that your complexity is 1,000 times greater than it should be – what matters is that the next time you measure it, it is reduced. These metrics can be integrated easily into the continuous build system so that the review effort is minimal.
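As a sketch of how such a metric can be computed automatically, here's a deliberately crude cyclomatic-complexity estimate in Python's standard library. Real tools (radon for Python, NDepend for .NET, and so on) are far more precise — treat the branch list and scoring below as an approximation that's nonetheless good enough to trend in a continuous build.

```python
import ast

# A rough cyclomatic-complexity estimate: 1 per function, plus 1 for
# each branching construct found inside it. Crude, but trendable.
BRANCHES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
            ast.BoolOp, ast.IfExp)

def complexity(source):
    """Return {function_name: complexity score} for a source string."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            score = 1 + sum(isinstance(n, BRANCHES) for n in ast.walk(node))
            scores[node.name] = score
    return scores

sample = """
def grade(x):
    if x > 90:
        return "A"
    if x > 80:
        return "B"
    return "C"
"""
print(complexity(sample))  # → {'grade': 3}
```

Run something like this over the whole tree on every build, record the average, and the trend line falls out for free.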

Besides the hard metrics, there are less defined ways to measure quality. How long does it take for you to implement a new feature compared to an ideal scenario? How long does it take to on-board a new developer? How long do you spend waiting for the build to complete each day? Are you confident with releasing this to production? All of these “gut” checks hint towards areas for improvement. So the next time you find yourself wishing things were better, stop for a moment and invent a few ways that you can make the situation even marginally better. In short, numerous small, iterative wins will trump big, disruptive wins any day.

Next time you go to your manager with the intention of begging them to give you time to rewrite the component that is badly implemented: don’t. Instead, approach them with a graph of metrics showing the baseline quality and the quality after you've implemented your next feature and fixed a few things along the way. Demonstrating a sustainable increase in quality without a large investment shows initiative and problem-solving. Besides, managers always like pretty graphs!
Here are a few tips on areas you should concentrate on when improving code:
  • Reduce friction and increase team productivity. Take some time to organize the solution structure so projects are easier to find. Optimizing the build so it completes faster should be tackled first, as it makes everything easier down the line.
  • Remove the human factor. Automate as much as possible. Write scripts for deploying to your test environment. Write a script that builds and runs all unit tests locally. The more repetitive things you can automate, the more time you can concentrate on doing things that are worthy of your time.
  • Measure! Don’t do anything without first measuring a baseline and subsequently measuring the impact. This is critical in providing information on how much impact you've had. It is also a great reward when you see things start to trend in the right direction.
  • Improve the code quality. Numerous metrics for measuring code quality exist, but probably the most important ones relate to reducing complexity and increasing the quality of tests.
  • Clean as you cook. Did you open a class to change a few lines of code? Review that class and its associated tests and do a quick cleanup while you are there. Missing tests? Write some. Didn't factor time into your estimates to do this extra work? Spend a little extra effort now and account for this next time.
  • Put barriers between poor code and good code. Define an integration point and translation layer where good code meets legacy code. Don’t inherit the bad when attempting to write the good. Leave the mess contained in the translation layer.
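The last point can be sketched as a small adapter: new code depends only on a clean gateway, and the gateway alone knows about the legacy quirks. Every class and field name below is hypothetical — the shape, not the names, is what matters.

```python
# A sketch of the "translation layer" idea: new code talks to a clean
# gateway, and only the gateway knows the legacy module's conventions.

class LegacyOrderSystem:
    """Stand-in for the messy legacy code: stringly-typed, odd names."""
    def get_ord(self, ord_id):
        return {"ID": str(ord_id), "TOT": "19.99", "STAT": "S"}

class Order:
    """The clean domain object the new code works with."""
    def __init__(self, order_id, total, shipped):
        self.order_id = order_id
        self.total = total
        self.shipped = shipped

class OrderGateway:
    """Translation layer: converts legacy records into domain objects."""
    STATUS_SHIPPED = "S"

    def __init__(self, legacy):
        self._legacy = legacy

    def fetch(self, order_id):
        raw = self._legacy.get_ord(order_id)
        # All the legacy weirdness is contained right here.
        return Order(
            order_id=int(raw["ID"]),
            total=float(raw["TOT"]),
            shipped=(raw["STAT"] == self.STATUS_SHIPPED),
        )

order = OrderGateway(LegacyOrderSystem()).fetch(7)
print(order.order_id, order.total, order.shipped)  # → 7 19.99 True
```

If the legacy system is ever rewritten, only the gateway changes; nothing built on the good side of the barrier inherits the mess.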
I've seen situations that range from systems outliving the language they’re written in, to systems which have grown “organically” with little thought to design. Gradual improvements will make your codebase a nicer place to live.

Thursday, December 6, 2012


For those of you wondering what this blog is about, head on over to the about page to find out more about the focus, naming, taxonomy, and guiding principles I've set out.

Tuesday, December 4, 2012

Lenovo Ideapad Yoga 13 review

Having bought the mid-range Lenovo Ideapad Yoga 13 (i5/8GB/128GB SSD) primarily for home use, here's my completely biased take on it.

The good:
  • Perfect machine for having around the home
  • Love the "rubberised" feel of the cover and keyboard area.
  • Great, bright, colourful screen.
  • Faster than we need for its intended purpose
  • Decent battery life (<5h)
  • Very responsive touch screen. I agree with Scott Hanselman when he said that "every laptop should (and will) have a touchscreen in a year". I will never buy a non-touch laptop again.
  • The power supply brick is small, thin, and light.
The bad:
  • Lenovo's stupid, stupid HDD partitioning scheme. The 128GB drive leaves ~60GB of usable space out of the box. The first thing you should do is create recovery media, then reformat the entire machine (removing the 5 recovery partitions) and reinstall the drivers. See this Lenovo page for a hotfix if you don't want to do this yourself.
  • The keyboard is not up to Lenovo standards. Biggest pain is the shortened backspace key to accommodate the page up/down key column. I keep hitting page up when I want to hit backspace. Why haven't we agreed on a common keyboard layout yet?
  • The touchpad is nowhere near what I have come to expect from Lenovo. The integrated buttons are just weird.
  • The heat vents exhaust right where the hinges are. This means that when I have it in "stand mode" on my lap, my embarrassingly ample belly blocks the heat's escape route. It doesn't get super hot, but it is an issue.
  • There is no keyboard light (backlight or otherwise).
  • The power supply connector is one of the new USB-ish square connectors, making it incompatible with the multitude of other Lenovo power adapters I have around the place.

Things commonly cited in reviews that aren't actually issues for me:
  • Having the keyboard upside-down on my lap. It doesn't feel weird. I don't feel it at all. 
  • Touch screen laptops are flimsy and the screen moves when you touch it. It doesn't move. It flexes a bit if I'm rougher than I need to be, but springs back easily.

Overall, I love this machine for home use. It really does highlight the difference between consumer and business class machines. I miss features like the great keyboard/touchpad, keyboard light, docking port, user replaceable battery, etc. from my Lenovo T430s that I currently use for work.

Key question: Would I buy this again? Yes, absolutely.

Key question: Can it be improved? Yes. Give me a business class version.