Often and for real: #895

In our new projects and environments we have made deploys as simple and automatic as possible. “Deploy often and for real” is the mantra. To keep the momentum up we are trying (as the only development department in the world..?) to publish the information on teletext. In our case on page 895 on TV4:

The inspiration comes from Flickr, among others. Then again, it helps that we have a teletext channel at our disposal too ;).

crash and burn 2012

On March 2 2012 Peter Svensson hosted Crash and Burn at KTH Forum in Kista, Stockholm. The theme of the conference was integration, testing, deployment and virtualization. It was a great conference and I hope it happens again next year, as it added quite a few software projects to look at until then. Links to the speakers and their presentations follow:

Sam Newman Designing for rapid release

You can’t (and shouldn’t) design huge monolithic systems, especially if you want fast feedback and deployments.

Yan Pujante glu: open source deployment automation platform

You don’t have to build your own deployment system, especially if you are deploying Java. The glu project provides tons of features to deploy most if not all types of web based systems (currently used by linkedin.com).

Mårten Gustavsson Ops side of Dev

Developers and operations have to work together if you are going to have any chance of a sane production environment. There are a lot of small things, like logging, that benefit from dev and ops agreeing on what to log. Metrics are another key component of good cooperation (check out http://metrics.codahale.com/, heck, anything on https://github.com/codahale/).
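For a taste of what that metrics library gives you, here is a minimal, self-contained sketch. It assumes the Metrics 3.x com.codahale.metrics package names, and the metric names are made up:

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

import java.util.concurrent.TimeUnit;

public class MetricsSketch {
    public static void main(String[] args) throws InterruptedException {
        MetricRegistry registry = new MetricRegistry();
        Meter requests = registry.meter("requests");        // rate of events
        Timer latency = registry.timer("request-latency");  // duration distribution

        // dump all metrics to stdout once per second
        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry)
                .convertRatesTo(TimeUnit.SECONDS)
                .convertDurationsTo(TimeUnit.MILLISECONDS)
                .build();
        reporter.start(1, TimeUnit.SECONDS);

        for (int i = 0; i < 50; i++) {
            requests.mark();                     // count one request
            Timer.Context ctx = latency.time();
            Thread.sleep(20);                    // stand-in for real work
            ctx.stop();                          // record how long it took
        }
    }
}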

John Stäck DNS in the Spotify Infrastructure (pdf 2.7 mb)

Lots of good information on how Spotify uses DNS as a distributed data store.

Carl Byström Load testing with locust

Load testing tools should be programmable (i.e. not XML; Python fits well here) and they should reflect what the end user is actually going to do.

Leonard Axelsson & Ville Svärd Graphite – the village pump of your team

Metrics on a live system, seeing what your application and its users are doing, are invaluable for finding performance issues.
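Graphite itself is fed through a very simple plaintext protocol: one “path value timestamp” line per datapoint, sent over TCP to carbon (port 2003 by default). A minimal sketch in Java, with a made-up host and metric path:

import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;

public class GraphiteSend {
    public static void main(String[] args) throws Exception {
        // graphite.example.com is a placeholder; 2003 is carbon's default plaintext port
        try (Socket socket = new Socket("graphite.example.com", 2003);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), "UTF-8")) {
            long now = System.currentTimeMillis() / 1000;  // unix timestamp in seconds
            // one "path value timestamp\n" line per datapoint
            out.write("web.frontpage.response_ms 123 " + now + "\n");
            out.flush();
        }
    }
}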

Brian Riddle Continuous Integration the good, bad and ugly

I need to talk a little slower and maybe add a demo. In preparation for this talk I gave a lunch seminar at Valtech’s headquarters; more info and video are on their blog. That presentation is here.

Zach Holman Scaling Github

Every time someone from github gives a talk you find interesting tidbits, and the one that struck me the most? github has an employee retention rate of 100% and they are *still* growing. imagine working for a company like that.

Friends help friends deploy and build

It’s no secret that we use Jenkins pretty extensively here at TV4. Anything we can automate, we do. We have about 100 jobs running in Jenkins and 3600 builds and deploys to date.

Jenkins is a crucial part of what we do and how we work, so when the call came out that Jenkins needed help we jumped at the chance to support an open source project that has more than paid for itself.

Jenkins wants more friends, be one today!

Quality means predictability

The word “quality” can be used as a measure of predictability: if something is of good quality you know what to expect (and every unit should live up to the product’s quality requirements). KNOWING what quality you will deliver (regardless of whether it is high or low) is thus a quality measure in itself (even though consistently high quality is better than consistently low quality).

During 2010 we have been very focused on quality. That is often hard when the temptation is to prioritize doing something “fast” or “cheap”, but within TV4 Digitala Medier we have collectively realized the value of doing things properly. The questions to ask are really quite simple: “Is it important that what we build works well?”, “Will we still be using what we build today a few years from now?” and so on, yet very often time is still given top priority (and the time allotted is short).

The quality mindset means, among other things, that all code we write these days is unit tested, that we run CI (Continuous Integration) and that we measure a lot. One of the most accessible measurements is Pingdom. Pingdom is a website monitoring tool that measures response times by regularly (once per minute) downloading a page from different measurement points on the internet. All times are stored and can be examined in detail or displayed in graphs. We keep an eye on all our own sites, and many competing ones, with Pingdom.
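To illustrate the kind of measurement Pingdom automates, here is a trivial sketch that times one download of a page (the URL is just an example; Pingdom of course does this from many locations and stores the history):

import java.io.InputStream;
import java.net.URL;

public class ResponseTimeProbe {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://www.tv4play.se/");  // page to measure; just an example
        long start = System.nanoTime();
        try (InputStream in = url.openStream()) {      // request the page
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // drain the body so we time the full html download
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(url + " took " + elapsedMs + " ms");
    }
}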

If you look at the graph for TV4 Play you can clearly see what the quality work has brought:

The graph shows the download time of the front page (html only), and the red line marks the launch of the new version we released in October. A visitor to the new site experiences considerably more consistent (and higher) quality than the same visitor did on the previous version, even though we have had more visitors, higher traffic peaks and more live broadcasts during the second half.

Giving tv4 a pulse with CI

tl;dr

on a java project using ant

  1. get it to compile locally
  2. download and start hudson: ‘java -jar hudson.war’
  3. configure hudson to checkout and run ‘ant compile’
  4. add unit tests
  5. configure hudson to checkout and run ‘ant test’
  6. move hudson and make it poll
  7. get emma to run so you can get some metrics
  8. add hudson emma plugin and configure hudson to run ‘ant emma-run-tests’
  9. using jsp? create an ant target to compile your jsp’s
  10. configure hudson to ‘ant emma-run-tests compile-jsp’

one of the most critical pieces of our infrastructure is our continuous integration server. we use hudson to build all of our projects and run our tests. setting up a new project that uses java or ruby takes about 10 minutes. getting to the point where it takes only 10 minutes to configure a new job took a few weeks. the following is how we got ci up and running for one of our large java projects that had no tests. it probably took ten 8-hour days to get this working, but those days were spread out over a year and a half as we were working on other things. if you’re looking for details on how to set up hudson this is not the place to get started; this is more the general approach we took.

start with getting it to compile locally

this seems simple but don’t take it as a given. one of the first reasons i installed hudson was to make sure that our large code base would compile. this was a challenge as we had developers from different projects working in pretty much the same areas of the code. we use ant and have a target called compile, but not everybody ran it and sometimes files were not properly checked in. it took a couple of hours to get this working as there was some special setup required to get it to run at all locally. finally we got to the point where running:

ant compile

was all it took to compile our code.

the next step is to download and start hudson on your machine. don’t get overly ambitious here. as long as you have java installed, running:

java -jar hudson.war

should get hudson up and running, and you can begin to configure it to check out your code from your version control system. one tip is to create a special user in your version control system that only has read access.

configure hudson to checkout and run ‘ant compile’

in hudson, under build steps -> target, add compile. this is the target that will be run when you trigger a build.

add unit tests

now is when hudson starts paying off. adding unit tests to a large code base can be frustrating. it can be done, but it will seem like a monumental task if you have never used junit/testng before. even if you are familiar with junit/testng it will not be easy, but the payoff is more than worth it. if you have worked with unit testing before, remember that the goal now is simply to get

ant test

to work. start off by creating a directory next to your java src code called test; this will be the root of your test hierarchy. add a new target to ant that includes your build classpath (ie the classpath used to run ant compile). you might at this point need to refactor the classpath out of the compile target so you can reuse it as part of the test classpath. next add one test that passes and one test that fails. by this i mean something like this:
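(a sketch assuming junit 4; the class and method names are made up)

import org.junit.Test;
import static org.junit.Assert.assertTrue;

public class WiringTest {
    @Test
    public void shouldPass() {
        assertTrue(true);   // proves the test target runs at all
    }

    @Test
    public void shouldFail() {
        assertTrue("delete this test once the wiring works", false);  // proves failures are reported
    }
}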
running ant test should now give you something like this:
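test:
    [junit] Running WiringTest
    [junit] Tests run: 2, Failures: 1, Errors: 0, Time elapsed: 0.031 sec

(output abbreviated; the exact lines depend on how your junit task is configured)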

if you get Tests run: 2, Failures: 1, Errors: 0 you can remove the test file and start writing real unit tests.

configure hudson to checkout and run ‘ant test’

change the build step from compile to test.

move hudson and make it poll

I ran hudson for almost a year on my local machine with little trouble. the hardest part was remembering to leave my machine on during vacation. the next step is to find a new home for hudson. the requirements are pretty low: one computer, power and network. the computer doesn’t have to be anything special; a 40GB hard disk and 1GB of ram should do in the beginning. install your operating system of choice, then install and configure hudson. now that it’s not local anymore you can make it poll version control for you. under build triggers -> Poll SCM add */2 * * * *. this will make hudson check your version control every 2 minutes; if anything has been checked in it will start a new build.

make hudson talk

having hudson up and building everything, the next step i do is add an email notification task. using hudson’s default email notification plugin, hudson will send an email every time the build fails. to make this as visible as possible we created a special mail group that includes everyone that has access to our version control systems. to make it even more visible we use the email-ext plugin and configure it to send mail even when the build is successful. i know what you’re thinking: omg! not more mail. my response is zomg it’s almost 2011, use a mail filter.

it’s now that hudson is becoming more and more relevant to your organization. especially if

  • your boss is getting mail every time it’s green (yeah!)
  • your boss is getting mail every time it’s broken (buuuuu). what do you mean it doesn’t compile?
  • your builds stay red for a long time. what’s keeping them from being fixed?

get emma to run so you can get some metrics

once hudson is up and running it’s time to get some metrics. why? well, without measuring anything it’s really hard to know if things are getting any better. it’s good to reflect on how the code base is shaping up now that more and more tests are being written. metrics are good points to see whether the code that is tested the most is the easiest to work with, or how well the code that is used the most is tested. there are tons of observations that can be made, but only if you have some metrics. we use emma, because it’s free and not too hard to get working. the emma-run-tests target is basically the same as the test target, except it excludes some generated classes and needs both the build classpath and the test classpath. but once you can run ant emma-run-tests it’s a small matter to hook hudson up and make it pretty.

add emma plugin to hudson

here you will need to add the ‘hudson emma plugin’ and configure the emma section of your hudson job to point at the emma output (ie build/emma/report/coverage.xml).

using jsp? create an ant target to compile your jsp’s

the next step was to get our jsp’s to compile. this came out of the need to upgrade our servlet container from one version of resin to another. we have 500+ jsp pages and no practical way to navigate to all of them, but compiling them gave a bunch of benefits. we got rid of all the pages with jsp scriptlets that did not compile, which gave 503 errors every time someone navigated to them. there were not many of them, as any jsp’s that were surfed to often enough were kept in ok shape, but fixing the few that did not work ended up solving a couple of really low-prioritized bugs.

the other benefit was that when we started to move from one version of resin to another (3.0 to 3.1), we found that resin 3.1 was much more strict in the way it interpreted jstl and el syntax. this script should work for both resin 3.0 and 3.1.
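a script wrapping resin’s bundled jsp precompiler would look something like the line below. this is a sketch, not our actual script: com.caucho.jsp.JspCompiler is resin’s documented precompiler class, but the paths are made up.

java -cp $RESIN_HOME/lib/resin.jar com.caucho.jsp.JspCompiler -app-dir src/main/webapp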

once that was in place it was a matter of running the script and correcting the errors. it took about 3 hours to get everything clean. the amount of time and money this has saved is not small. finding out that a jsp doesn’t work only when it’s live causes a chain of patches, redeploys and testing that really eats time. now this is not an issue, as hudson screams way before the code is deployed, because the jsp’s are compiled every time hudson builds this job. fail early, fail fast.

configure hudson to run ‘ant emma-run-tests compile-jsp’

the last configuration change we made is to have hudson run both emma-run-tests and compile-jsp.

is it worth it? yes. the further i get down the list of testing our code, the better i sleep 🙂 . is there more to do? always. we could add integration tests, but in this project we have reached the point of diminishing returns. this project probably has the worst test coverage of any project we have right now; all the others have more than 50% and all ruby projects have at least 80%.
