Togetherness

You know that feeling when you’re part of a focused team working together towards a common goal? It’s great. It’s productive and engaging. But it’s kind of hard to get there.

Maybe it’s because it requires quite a few things. For instance, the possibility to focus on one thing at a time, with few disturbances. This can be difficult to achieve working in an open-plan office space within a company managing a few of the most visited web sites in Sweden. People tend to make sounds, technology tends to brake down, problems arises from seemingly nowhere and on top of that there’s a future to plan. Also you’re often requested to estimate how long something you’ve never done before will take. Etc.

So we try to find ways to make this manageable. We apply different processes, organize and reorganize, invest in new technology, use deck of cards to help us estimate. Etc.

Da Mob

At the moment many of us at Bonnier Broadcasting think that the best tool at hand right now is mob programming. The web teams take on it is simple. A group of people. One computer. One screen, a really big one. One problem. And then we attack it. From start to production.

We’ve created a few mob spaces where we manage to be part of the open-plan, but still isolated enough to keep an all-day conversation going without upsetting our co-workers too much.

We enforce hard restrictions on our business surrounding when it comes to prioritizing. We can handle one thing a time, pick the one you wish for the most. That is of course not entirely true. We are still responsible for a fairly complex product that needs tending to. Meetings needs to be attended and so forth. But we can manage that. For instance, one person can leave the mob to focus on a problem that emerged or attend a meeting. The work still goes on in the mob.

The mob becomes a natural place for communication and cooperation with people and functions outside the team. For instance we have a great UX team. Unfortunately they don’t have time to hang around our team all day long. But with the mob approach it becomes natural for them to swing by when they have an opening. They can become part of the mob for a while, and together we can discuss problems and develop solutions. If a stakeholder wants to now how things are going, they can simple stop by. Maybe just watch for a while, as the result of our work is almost always right there on the screen, the really big one.

imagesliderw800

A few weeks ago the top priority was an initiative to enhance the news section of TV4 Play. This has been in the making for a while. Lots of discussions, UX work, wire framing and so on. But now it was finally time for us to build something. It had been agreed that the most wished for feature was a new player module. Where the live news stream would be rolling right from the start but with no sound. A single click on the video should start the sound and on the right hand side there should be a list of the latest news clips that could be started quickly.

imagesliderw800-2

So the mob started to work. Things went great. Seemingly, our previous work reactifying our views paid off. At the beginning the UX team spent a lot of time in the mob. We identified problems and found solutions. Together. A few days in, everything started to feel production-ready. At that time we brought in people from SEO and Analytics who helped us to set up reasonable tracking and make sure google would still be able to find our news content. We were ready for production. One day ahead of the US election.

The sound of commercials

One thing you simply do not do, when developing an online video service all day long in an open-plan office, is having your volume set to more than a bare minimum. Especially when we’re sharing one big screen and not using headphones. Maybe that’s the reason we didn’t notice earlier that we had a huge problem with our video player. When switching between the latest news clips at a swift, but reasonable pace, the player could end up playing the sound of the commercial from one clip while playing sound and video from another. That’s a horrible user experience.

Into the mob enters one of our developers from the video player team. With common effort we could determine the problem was mainly caused by a bug in third party code. But we could also see a way around it. So we pulled up the code for the player and together we managed to fix the problem, and at the same time refactor parts of the player to make it more manageable and more easily integrated with TV4 Play. Now, we were really ready for production. We only missed the US election by two days. Well. At least we hadn’t estimated, “sync up meeted” and “moved stuff around in jira” our way into missing, what seemed to be, a perfect launch opportunity.

The bug we found in our third party code is acknowledged in their support system. We’re told that a fix is planned to be included in a release that will be available in two weeks from now.

C More 2.5.0 – made by Tegeluddsvägen

About a week ago Bonnier Broadcasting had the privilege to announce the release of C More’s new mobile apps for iOS and Android. At first sight you will see that the UI is brand new and it is aligned with the new design guidelines for C More and Bonnier Broadcasting. The new apps are much more responsive and easier to use, and we have put a lot of effort into making them robust. But the biggest transformation lies beneath this pretty surface – a brand new engine!

cmore_app_01

Nowadays, we have full control of the development process in-house. We have implemented a new modern code base, which give us better flexibility and capability to be more effective in our future product development. This project delivery is the first one out for Bonnier Broadcasting’s new development department, and specialists from various teams have been involved and contributed.

Our Android client programmers have been working with new innovating techniques such as: Model-View-Presenter design pattern, Data Binding, RxJava, Lambda expressions, ExoPlayer, annotations and dependency injections with Dagger. The iOS application is written in the Swift programming language.

During the first week we have already reached over half a million downloads for both iOS and Android.

teamet2

Some of the fabulous people who made this happen.
This pic is taken shortly after release to App Store and Google Play.

Engineers, engineers, engineers

Today and tomorrow on the 22-23 of November Bonnier Broadcasting will attend the engineering career fair THS Armada at KTH Royal Institute of Technology

We will talk to engineering students about how we are building the next generation TV-experience on TV4 and C More through our tech development.

Ask us questions and share your views on our technologies and platforms. Talk to us about our cloud platforms, programming languages and how our engineers build our streaming services TV4 and C More on one shared tech platform.

We will also have fun quizzes with nice prices and some great cupcakes that you definitely want to taste.

Also students will have an opportunity to refine our suggestions or pitch own ideas for exciting master thesis projects next year at Bonnier Broadcasting.

Test-Driven Development is not about testing

An early morning last month (October 2016) a mix of software developers from several teams of Bonnier Broadcasting gathered in the cosy offices of Agical in Gamla Stan to spend a full day listening to the wise words of Mr. J. B. Rainsberger on the topic of TDD — Test-Driven Development.

Here follows an attempt to distil some of my reflections and take-aways from the event in the form of a combined summary and short intro to the topic.

What is a test; what is TDD?

As developers we are human, and as humans, we err. Writing tests is a way for us to eliminate, or at least reduce, errors by ensuring that what we build functions in the way we expect. If we write code that can add two numbers, we might write a test that asserts that given one plus two, we get three.

TDD takes the idea of testing a step further by in essence letting the test define correct behaviour, as opposed to verifying correct behaviour after the fact. Writing tests early on turns out to have positive effects not only on the correctness of our code, but also its design, with reduced coupling and better separation of concerns — perhaps this is the greatest benefit of TDD.

Take-aways

Many things were covered during the event and it would have been odd if everyone went home with the exact same insights. Here are just a couple of take-aways that struck me as having particular value.

Design

Design is hard. Done well it brings many benefits and can be intellectually rewarding. Done wrong it can be the cause of high costs both financially and in the form of lowered productivity and motivation.

TDD is not a silver bullet that magically solves all design problems, and I will not claim that TDD is essential to designing things well, but it is a valuable tool that if used right allows us to find design mistakes early so that they may be corrected, before they become expensive. Rainsberger relates to the concept of risk exposure.

E = ∑([pm * cm, …])

Exposure is the sum of the probability of each possible mistake multiplied by the cost incurred should it happen.

To reduce exposure we may choose to reduce the probability of a mistake — in essence figuring out the potential mistake before it happens. If it is true that design is hard, then predicting future mistakes is not a trivial task.

Alternatively, we could assume that mistakes will be made and instead try to reduce their cost.

Code that is easy to test tends to be better designed, since — and one could almost get away with giving the inverse as the reason — well-designed code tends to be easier to test. It is like a virtuous circle, and also a concept within TDD; write a test, implement, test, refactor (improve the design), loop. This way TDD helps keep a clean design. If we only take care to watch for the tell-tales indicated by how straightforward and clear the tests are, it is easier to, at an early stage, spot potential flaws in our design.

Contracts

When designing we make certain assumptions. That code that adds two numbers might assume that its arguments are numbers. In a statically typed language this particular case is of course automatically enforced. In a dynamically typed language, it has to be assumed or checked. This is a precondition for the code to work correctly and essentially becomes part of its contract, whether explicitly documented or not.

Another example is the Read method as defined by the interface io.Reader of the Go standard library. Read is used to read bytes from a data source by copying it into a given buffer p. Its contract states “if some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more”. It is the responsibility of the caller to be aware of this behaviour.

Assumptions always exist about how the code we write will be used, and about how code we use behaves. A clear contract not only defines what can and cannot be expected, but it helps define to what extent the code should be tested. While some things are difficult to capture in a test (such as unexpected behaviour), writing tests can be a nice way to explicitly demonstrate what kind of usage we expect and support.

Unit tests vs. integration tests

A unit test serves to test a small part of a system in isolation.

A unit test:

  • …is isolated
  • …is simple
  • …is small
  • …runs fast
  • …is clear on what is being tested
  • …makes it easy to pinpoint the cause of specific failure
  • …drives design

An integration test serves to test how multiple layers of a system work together as a whole.

An integration test:

  • …tests many parts of the system as a whole
  • …might make calls to things outside of our control
  • …can fail in many different ways
  • …is slow

Side note: Rainsberger would correct the term “integration test” replacing it with “integraTED test”, defining it such that an integraTION test tests isolated points of contact between two parts, whereas an integraTED test is, quote, “any test whose result (pass or fail) depends on the correctness of the implementation of more than one piece of non-trivial behaviour“. While these definitions are clear in themselves, the use of two similar-sounding terms to mean different things tends to cause confusion (cf. authentication vs. authorization — both conveniently abbreviated ‘auth’), so I choose to simply walk around the definition swamp by using the term “integration test” to mean what Rainsberger calls “integrated test”, and for the time being avoid this other definition of “integration test”. How to name things: one of the hard problems in programming.

Rainsberger metaphorically compares the two methods of testing saying that the former is like painting a wall with a brush, while the latter is more like trying to cover all of it by throwing buckets of paint at it.

He describes the latter kind of tests as a scam and drew for us a vicious circle, illustrating that the more we start to rely on these tests, the more we lose the ability to iteratively improve our design, which in turn negatively affects the ability to write isolated unit tests, in turn resulting in even more end-to-end tests being added.

Perhaps one can extend the paint metaphor by saying that the more buckets of paint we throw at the wall, the harder it gets to move closer to it without getting our feet covered in paint.

Above being said, it should be pointed out that acceptance tests for the purpose of verifying behaviour from the end user’s point of view are perfectly fine, as long as they are not used as a substitute for tests which are usable in driving the design.

Conclusion

Even for someone already at least somewhat familiar with the idea of TDD I found it refreshing to during this event be able to reiterate on some of the concepts.

For the past two-or-so weeks we tried forcing ourselves to use TDD at one of our mob stations. In the context of mob programming it turned out to be quite useful in defining something of a road map of what to do next, and also as a means to keep focus on a particular task.

The most important insight that I gained from this day deserves its own line:

TDD is not about writing tests, it is about writing better code

Lastly, some things noteworthy.

  • – Prepare for design change from any direction.
  • – Not prepared for any particular change, but prepared for change
  • – Test integrations with external parties in isolation.
  • – Abstract external parties using interfaces, exposing only what is needed.
  • – Avoid mocking types that you don’t own.
  • – Do not pass nil to things intentionally.
  • – Complex mocks can be an indication of bad design.
  • – Make the documentation clear, entertaining, and small, with code examples.
  • – 100% test coverage does not necessarily mean that all logic is being tested.
  • – A worker should never talk directly to the client in multi-threaded code.
  • – Test the worker in isolation; it should not know whether or not it is being called asynchronously.
  • – Avoid large anonymous functions, because they are hard to test in isolation.

I shall also add that the above are my own interpretations, and as such do not necessarily correspond exactly to the ideas presented during the event.

Thank you for reading.

A platform for our platform

In June this year (2016), a new team was formed within Bonnier Broadcasting with a mission to build a realtime data platform. The purpose being to address a set of high priority data needs we have identified, but also to support the growing data driven culture we see in both C More and TV4. A realtime data platform has of course no formal definition, and it can mean slightly different things to different people. To us, it means an accessible data warehouse where all our data sources are consolidated and integrated. These data sources range from relatively static sources, such as user or video databases, to event streams from apps and services, ingested in realtime. From all these sources we need to create data models, analysis, and visualizations that help the rest of the organization do a better job.

So, when starting out on this journey, one of the first questions to arise was, where should we build this platform? What platform should we choose as the foundation for our platform? This post recounts the reasoning behind our choice, and our experiences with the result so far.

TL;DR

We went with Google Cloud Platform for our data platform and we are mostly happy with it.

A cloud platform

For the majority of our system components (infrastructure services, databases, video platform) we have already made the move to the cloud. Several years ago we started using AWS and Heroku (and we are still migrating legacy services). At the time there were not many options, and we are still happy with AWS for our core needs.

This year, however, when discussing our future data platform, the cloud landscape has evolved substantially from when we picked AWS, and we saw an opportunity to take a step back and evaluate our options. And there are a host of options these days, at various abstractions levels, and with various levels of managed offerings.

Apart from pure functional requirements, of which none is particularly unique, we have a few non-functional constraints that effectively narrow us down a bit.

First, we want as much control as possible. We want to avoid, or at least minimize, vendor lock-in and dependencies to third-party products and solutions. This rules out a number of packaged solutions, such as Cloudera, Databricks, MapR, and others.

Second, we are a small team with an ambitous task. We want to minimize, as much as possible, the cost of operation and maintenance that comes with doing too much ourselves. The most obvious path from there would perhaps be to go in the Apache direction, where Hadoop, Spark, and an abundance of other tools, await the willing, and set them them up on our AWS account. Much, if not all, of what we want to achieve has already been done with these tools, and many well-known companies have contributed to the toolbox over the years.

However, a third constraint is that we have picked Go to be our language of choice. This is a conscious choice, made from a number of reasons (which I would be happy to share in a separate post). As a consequence, we have invalidated (or at least made awkward to use) from our list of options the majority of open source data processing tools and frameworks, which are available to your everyday Java, or Scala based shop.

GCP

Google Cloud Platform (GCP) has been moving forward at tremendous speed during the last year(s) to catch up with AWS, trying to establish itself as a serious contender in the cloud market. And with Spotify announcing early that they are migrating their data processing platform to GCP, the risk associated with making the same move was substantially lowered.

As it turns out, what Google Cloud wants to be matches our needs and constraints to a very large extent. Most, if not all, of the infrastruture we need is there. The following four components are what we use the most today.

  • BigQuery is the managed data warehouse we needed but did not find in Redshift.
  • Container Engine (GKE) is the managed Kubernetes platform we don’t have to waste our own time on setting up and maintaining.
  • Dataflow is the managed data processing service that automatically runs and scales our stream and batch jobs, that we write with Apache Beam/Dataflow.
  • PubSub is the managed message bus that we use to pass data between the services above.

A key word here is managed, and it translates to a huge amount of time and effort that our small team does not have to spend on stuff (e.g. setting up, configure, monitor) that is not pushing our data platform forward.

Many people I talk to seem to look at GCP as simply a Google version (albeit less mature) of AWS or Azure, that aside from subjective bias and pricing, they have more or less the same offering. Having started out with AWS for our platform, we have found this assumption to be false, at least when it comes to our use case. The managed aspect of the GCP services is a differentiator.

What about Go?

The third constraint I mentioned above was the Go language. These days, the JVM is the ruler of data engineering, and to a large extent data science as well. As a consequence, we are not completely free from the JVM dependency, as much as we want it. What’s holding us back is Beam/Dataflow, where the stream processing SDK is still only available in Java. The Python SDK is moving along, however, and there are rumors that a Go SDK is in the works somewhere inside Google. Hopefully we can contribute to that once it is released to the community.

However, since all Google Cloud API’s have official Go client libraries (of which most are hand crafted and idiomatic, and not generated from spec), we can do most of our plumbing and infrastructure in the language we feel gives us most speed and quality.

The flip side

When I earlier wrote about what Google Cloud wants to be, I was hinting at the major downside we have experienced with our choice of cloud platform. Several must-have’s for us in terms of functionality are still in beta, or even alpha, and one or two GA components are still lacking (I’m looking at you, Stackdriver). Each component is moving at a very high speed, and one shouldn’t invest too much effort in trying to work around a current limitation, as it may well be solved two months later.

So, while moving in the right direction at high speed generally is perceived as a good thing, it is also frustrating when you try to build something real on top of it. We wouldn’t want it the opposite way though, and all in all we are happy with the choice we made.