Running Dropwizard as a Guava Service

There are many things to like about the Dropwizard Framework, but if you’re like me, you might want to “own your main” function. A normal Dropwizard application gives you a run hook as part of its Application parent class, but in the end your code is still subservient to the Dropwizard framework code.

One pattern that works very well for organizing small logical services in your application is to use Guava’s Services to break your application into service-level groupings (and avoid drinking the microservice kool-aid prematurely). Guava services give you a common operational interface for coordinating all your in-process logical components. In this model, the Dropwizard web server is no more important than my periodic polling service, my separate RPC service, and so on. I’d like my Dropwizard web stack to be a peer to the other services I’m running inside my application. It only takes two steps to make this work.

Step 1: Create the Guava Service

Create a new class that extends AbstractIdleService. In the service’s startUp method, we need to replicate the key bootstrap setup that the Dropwizard framework normally performs when you invoke the server command.

import com.codahale.metrics.MetricRegistry;

import com.google.common.util.concurrent.AbstractIdleService;
import com.google.inject.Guice;
import com.google.inject.Inject;
import com.google.inject.Injector;

import io.dropwizard.setup.Bootstrap;
import io.dropwizard.setup.Environment;

import org.eclipse.jetty.server.Server;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DashboardService extends AbstractIdleService {
    private static final Logger logger = LoggerFactory.getLogger(DashboardService.class);
    private final DashboardApplication application;
    private final DashboardConfiguration config;
    private final MetricRegistry metrics;
    private Server server;

    @Inject
    public DashboardService(final DashboardApplication application,
                            final DashboardConfiguration config,
                            final MetricRegistry metrics) {
        this.application = application;
        this.config = config;
        this.metrics = metrics;
    }

    @Override
    protected void startUp() throws Exception {
        logger.info("Starting DashboardService");
        final Bootstrap<DashboardConfiguration> bootstrap = new Bootstrap<>(application);
        application.initialize(bootstrap);
        final Environment environment = new Environment(bootstrap.getApplication().getName(),
                                                        bootstrap.getObjectMapper(),
                                                        bootstrap.getValidatorFactory().getValidator(),
                                                        metrics,
                                                        bootstrap.getClassLoader());
        bootstrap.run(config, environment);
        application.run(config, environment);
        this.server = config.getServerFactory()
            .build(environment);
        server.start();
    }

    @Override
    protected void shutDown() throws Exception {
        logger.info("Stopping DashboardService");
        server.stop();
    }
}

Step 2: Reinitialize logging inside the Dropwizard Application

import ch.qos.logback.classic.LoggerContext;
import ch.qos.logback.classic.joran.JoranConfigurator;
import ch.qos.logback.core.joran.spi.JoranException;

import io.dropwizard.Application;
import io.dropwizard.setup.Environment;

import org.slf4j.LoggerFactory;

public class DashboardApplication extends Application<DashboardConfiguration> {
    @Override
    public void run(DashboardConfiguration config,
                    Environment environment) {
        reinitializeLogging(environment);
    }

    /**
     * Because Dropwizard clears our logback settings, reload them.
     */
    private void reinitializeLogging(Environment env) {
        LoggerContext context = (LoggerContext) LoggerFactory.getILoggerFactory();
        try {
            JoranConfigurator configurator = new JoranConfigurator();
            configurator.setContext(context);
            context.reset();
            String logBackConfigPath = System.getProperty("logback.configurationFile");
            if (logBackConfigPath != null) {
                configurator.doConfigure(logBackConfigPath);
            }
        } catch (JoranException e) {
            throw new RuntimeException("Unable to initialize logging.", e);
        }
    }
}

Start your Guava Service

Now you have a nicely contained Guava service that you can manage alongside your other Guava services that aren’t necessarily Dropwizard related. You can startAsync and stopAsync the service to start and stop the web server, even while other code is still running.
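
To tie it together, here’s a minimal sketch of what a main function might look like, using Guava’s ServiceManager to compose the Dropwizard service with whatever other services you run. The Main class, the no-arg DashboardConfiguration construction, and the shutdown hook are my own illustrative assumptions; in practice you’d build the configuration and wire the service with Guice as shown above.

import com.codahale.metrics.MetricRegistry;

import com.google.common.collect.ImmutableList;
import com.google.common.util.concurrent.ServiceManager;

public class Main {
    public static void main(String[] args) {
        // Hypothetical wiring: build the configuration however your
        // application does (Guice injection, YAML deserialization, etc.).
        DashboardConfiguration config = new DashboardConfiguration();

        DashboardService dashboard = new DashboardService(new DashboardApplication(),
                                                          config,
                                                          new MetricRegistry());

        // The Dropwizard web stack is just one peer service in this list;
        // add your polling, RPC, and other Guava services alongside it.
        ServiceManager manager = new ServiceManager(ImmutableList.of(dashboard));
        manager.startAsync().awaitHealthy();

        // Stop the web server (and everything else) cleanly on JVM shutdown.
        Runtime.getRuntime().addShutdownHook(
            new Thread(() -> manager.stopAsync().awaitStopped()));
    }
}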

Terraform AWS Static Site with CloudFront

UPDATE: This pull request has been merged into Terraform

A recent patch on the Terraform GitHub repository adds support for CloudFront distributions to the Terraform AWS Provider. The patch has not been merged into Terraform mainline yet, but I wanted to share my experience setting up an S3 static site, fronted with CloudFront and DNS routed with Route53. Until the CloudFront PR gets merged, you’ll have to build the branch from source in order to use the aws_cloudfront_distribution resource. The words you’re reading right now were served up from this very Terraform configuration via CloudFront, migrated from a simple nginx setup. If you haven’t used Terraform before, please review the Introduction and Getting Started Guide before proceeding.

Step 1: Set Up your S3 Static Site Bucket

The first thing you need to do is set up an S3 bucket to act as your ‘origin’. This is where all your static HTML files and assets will live. Here’s what the code looks like:

provider "aws" {
  alias = "prod"

  region = "us-east-1"
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
}

resource "aws_s3_bucket" "origin_blakesmith_me" {
  provider = "aws.prod"

  bucket = "origin.blakesmith.me"
  acl = "public-read"
  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadForGetBucketObjects",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": ["arn:aws:s3:::origin.blakesmith.me/*"]
    }
  ]
}
POLICY

  website {
    index_document = "index.html"
  }
}

After running terraform apply, you will have an S3 bucket that’s set up to serve HTTP traffic from the root of the bucket. Let’s examine some of the important parameters:

  • policy: Bucket policy that makes the bucket publicly readable.
  • website: Configure the S3 bucket to serve up a static website, in this case setting the default index_document to index.html.

You can find other configurations on the aws_s3_bucket resource page.

After uploading your static site to the S3 bucket, you should already be able to view the website at http://${bucketname}.s3-website-${aws_region}.amazonaws.com. As an example, my blog can be served up at http://origin.blakesmith.me.s3-website-us-east-1.amazonaws.com/.
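
If you’d like Terraform to print this endpoint after each apply, you could also add an output; the output name below is just an example, but website_endpoint is an attribute exported by the aws_s3_bucket resource:

# Example output (name is arbitrary): prints the S3 website endpoint after apply.
output "origin_website_endpoint" {
  value = "${aws_s3_bucket.origin_blakesmith_me.website_endpoint}"
}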

Step 2: Add a Route53 Record for your Origin

We need a DNS entry for this origin. In this example, we’ll create one for origin.blakesmith.me. This gives us a helpful DNS record that does not route through CloudFront and can be used to access the S3 bucket static site directly, with no caching or other CloudFront routing rules applied.

resource "aws_route53_zone" "blakesmith_me" {
  provider = "aws.prod"
  name = "blakesmith.me"
}

resource "aws_route53_record" "origin" {
  provider = "aws.prod"
  zone_id = "${aws_route53_zone.blakesmith_me.zone_id}"
  name = "origin.blakesmith.me"
  type = "A"

  alias {
    name = "${aws_s3_bucket.origin_blakesmith_me.website_domain}"
    zone_id = "${aws_s3_bucket.origin_blakesmith_me.hosted_zone_id}"
    evaluate_target_health = false
  }
}

First we set up our top-level zone, then create a record associated with that zone. The alias configuration targets the S3 bucket we created before. Here are the important parameters:

  • zone_id: A reference to our top level DNS zone id
  • alias#zone_id: A reference to the S3 bucket’s existing zone identifier

Once you terraform apply this, you should be able to access your origin at the new DNS record, http://origin.blakesmith.me, with exactly the same behavior as the S3 static website hosting we set up before.

Step 3: Set Up your CloudFront Distribution

We have all the basic pieces in place; now comes the meat: let’s set up a CloudFront distribution that will use the origin we just configured to serve the website at the edge. If you’re not too familiar with CDNs, think of them as a “big distributed cache across the globe”. After we configure this distribution, our content will be served from the edge servers closest to each visitor’s location.

resource "aws_cloudfront_distribution" "blakesmith_distribution" {
  provider = "aws.prod"
  origin {
    domain_name = "origin.blakesmith.me.s3.amazonaws.com"
    origin_id = "blakesmith_origin"
    s3_origin_config {}
  }
  enabled = true
  default_root_object = "index.html"
  aliases = ["blakesmith.me", "www.blakesmith.me"]
  price_class = "PriceClass_200"
  retain_on_delete = true
  default_cache_behavior {
    allowed_methods = [ "DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT" ]
    cached_methods = [ "GET", "HEAD" ]
    target_origin_id = "blakesmith_origin"
    forwarded_values {
      query_string = true
      cookies {
        forward = "none"
      }
    }
    viewer_protocol_policy = "allow-all"
    min_ttl = 0
    default_ttl = 3600
    max_ttl = 86400
  }
  viewer_certificate {
    cloudfront_default_certificate = true
  }
  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }
}

There’s a lot going on here, so let’s break it down a bit. The most important part is our origin declaration:

  origin {
    domain_name = "origin.blakesmith.me.s3.amazonaws.com"
    origin_id = "blakesmith_origin"
    s3_origin_config {
    }
  }

  • domain_name: The domain CloudFront pulls content from; here it’s the S3 endpoint of the origin bucket we created in Step 1
  • origin_id: A unique identifier for this origin configuration. Since you can set up multiple origins, this id is what links cache behaviors to origin configurations.
  • s3_origin_config: Extra S3-specific origin options; we leave this block empty

Next we have some other important top level declarations:

  enabled = true
  default_root_object = "index.html"
  aliases = ["blakesmith.me", "www.blakesmith.me"]
  price_class = "PriceClass_200"
  retain_on_delete = true

  • enabled: Enable our CloudFront distribution
  • default_root_object: Use index.html as our root object
  • aliases: The HTTP hostnames you will be serving your site from. These must match your DNS records, or you will get 403 Forbidden errors.
  • price_class: How CloudFront will prioritize where traffic gets served from, based on price. See: CloudFront Pricing.
  • retain_on_delete: Causes a CloudFront deletion to simply disable your distribution instead of destroying it. Useful since CloudFront distributions can take upwards of 15 minutes to propagate.

Then we set up our basic caching behavior:

  default_cache_behavior {
    allowed_methods = [ "DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT" ]
    cached_methods = [ "GET", "HEAD" ]
    target_origin_id = "blakesmith_origin"
    forwarded_values {
      query_string = true
      cookies {
        forward = "none"
      }
    }
    viewer_protocol_policy = "allow-all"
    min_ttl = 0
    default_ttl = 3600
    max_ttl = 86400
  }

The most important part is the target_origin_id, which links this cache behavior back to the origin stanza above.

  • allowed_methods: Which HTTP verbs we permit our distribution to serve
  • cached_methods: Which HTTP verbs CloudFront will cache responses for
  • target_origin_id: Must match the origin_id from our origin stanza above
  • forwarded_values: Which request values (query strings and cookies here) are passed from the edge to our origin.
  • viewer_protocol_policy: Which HTTP protocol policy to enforce. One of: allow-all, https-only, or redirect-to-https.
  • min_ttl: Minimum time (seconds) to live for objects in the distribution cache
  • max_ttl: Maximum time (seconds) objects can live in the distribution cache
  • default_ttl: The default time (seconds) objects will live in the distribution cache

Finally, allow CloudFront to use its default SSL cert and serve anywhere:

  viewer_certificate {
    cloudfront_default_certificate = true
  }
  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

Once you run terraform apply with your patched version of Terraform and wait the 10-15 minutes for AWS to asynchronously set up your distribution, you will have a CloudFront-provided domain name that you can use to validate your setup. For example, this website is viewable via its CloudFront domain name at: d1u25xzl6dnmgy.cloudfront.net
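
If you’d rather not fish that domain name out of the AWS console, you could surface it as a Terraform output as well; the output name here is just an example, but domain_name is the same attribute the Route53 records reference in the next step:

# Example output (name is arbitrary): prints the distribution's domain name.
output "cloudfront_domain_name" {
  value = "${aws_cloudfront_distribution.blakesmith_distribution.domain_name}"
}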

Until the aws_cloudfront_distribution resource gets released, you’ll have to consult the documentation provided in the pull request if you need to deviate from my simple setup here. There are also helpful examples in the integration tests if you want to see other settings in action.

Step 4: Add Root Route53 Records

The last step adds Route53 records that reference the CloudFront distribution we just set up.

resource "aws_route53_record" "root" {
  provider = "aws.prod"
  zone_id = "${aws_route53_zone.blakesmith_me.zone_id}"
  name = "blakesmith.me"
  type = "A"

  alias {
    name = "${aws_cloudfront_distribution.blakesmith_distribution.domain_name}"
    zone_id = "Z2FDTNDATAQYW2"
    evaluate_target_health = false
  }
}

resource "aws_route53_record" "www" {
  provider = "aws.prod"
  zone_id = "${aws_route53_zone.blakesmith_me.zone_id}"
  name = "www.blakesmith.me"
  type = "A"

  alias {
    name = "${aws_cloudfront_distribution.blakesmith_distribution.domain_name}"
    zone_id = "Z2FDTNDATAQYW2"
    evaluate_target_health = false
  }
}

Here we point our zone apex (root) record and the www record at our CloudFront distribution. The two critical new pieces you should observe are:

  • alias#name: This ALIAS name references our CloudFront distribution created in the previous step
  • alias#zone_id: A fixed, hardcoded zone_id that AWS uses for all CloudFront distributions

One final terraform apply and voilà! You have your final product: a static site served from an S3 bucket, fronted by a CloudFront distribution, with Route53 knitting everything together.

You can verify everything is working by examining the HTTP response headers (shown here via curl’s verbose output) and looking for the CloudFront headers:

> GET / HTTP/1.1
> User-Agent: curl/7.37.1
> Host: blakesmith.me
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/html
< Content-Length: 36938
< Connection: keep-alive
< Date: Sat, 02 Apr 2016 17:48:17 GMT
< Last-Modified: Fri, 01 Apr 2016 11:44:21 GMT
< ETag: "1ae13b1b0471e67bad10eb95347f99da"
< Accept-Ranges: bytes
* Server AmazonS3 is not blacklisted
< Server: AmazonS3
< Age: 29
< X-Cache: Hit from cloudfront
< Via: 1.1 62e12fdf0f65bd8388f763f504606830.cloudfront.net (CloudFront)
< X-Amz-Cf-Id: 4F85Bkl9_nSPQvqjDWAEMYkssuPA04gl8V5qLLIU3cPlS5E1Gtam7A==
<

Helpful headers:

  • Server: The origin server identifier, in our case AmazonS3
  • Age: The age (in seconds) of the object you just retrieved from the distribution cache
  • X-Cache: Whether the HTTP request was a cache hit or miss.

Here is the full code example if you’d like to see the complete setup that powers this site.

Happy Terraforming!

Code Review Essentials for Software Teams

Code Review is an essential part of any collaborative software project. Large software systems are usually written by more than one person, and so a highly functioning software team needs a robust process to keep its members, as well as the code base itself, moving in the right direction.

Code Review is a powerful tool that:

  1. Helps team members adapt their mental model of the system as it’s changing
  2. Ensures the change correctly solves the problem
  3. Opens discussion for strengths and weaknesses of a design
  4. Catches bugs before they get to production
  5. Keeps the code style and organization consistent

It’s helpful to think of these benefits as a hierarchy of needs.

Code Review: Hierarchy of Needs

Keeping Team Members Together

The most critical function of a Code Review is to keep every member of the team moving in the right direction. You can’t safely change a system you don’t understand, and so Code Review keeps the team mentally aligned. When Bob submits a pull request for the accounting subsystem, Amy keeps her mental model of that system updated as she reviews Bob’s code. Amy gets the chance to ask questions about pieces she doesn’t understand, and Bob gets the benefit of clarifying his design decisions as well as teaching someone else about his work. When Amy has to make a change to the accounting subsystem a month later, her mental model of the system is already up to date and ready for action. She spends less time reading code and trying to piece the system together in her brain, and more time thinking about higher-level abstractions and designs. Everyone wins, because everyone stays together.

Executing a Good Pull Request

Before you even write a line of code changing the system, ask yourself the following questions:

  • “Is this the right thing to be working on?” There will always be competing needs from customers, internal team members and other parties. This can be a good way to keep priorities straight in your head before you dive in deep. Other processes like iteration / sprint planning meetings also help keep this on track.
  • “Does the team already agree that the change is the right one?” If not, it’d be better to start a design discussion by email or maybe in person. Your changes are more likely to get accepted when people are in agreement about the design before you change it.
  • “How can I break this change into digestible chunks that are easy to review and understand?” Small changes are easier to think about and understand. Good discussions flow from the team being able to comprehend your change quickly. If your change is massive, your teammates’ eyes will glaze over, and you might only get a few style nitpicks from them.
  • “How am I going to test this change to kill bugs and ensure correctness?” You might have a QA department, but it’s still your job as the developer to ship quality working software. Easily testable software is usually more decoupled, broken into smaller chunks, and easier to reason about. You need a testing plan for all your changes.

I’ve found that answering these questions ahead of time has saved me a lot of headache in the long run. The last thing you want to do is spend days coding up a change only to have it rejected based on fundamental design flaws or team disagreement. Or for your change to get held up because no one can verify it for correctness. Again, the goal is to change the system while keeping other team members up to date with your changes. Asking yourself about how you’re going to break your work into bite-sized chunks and test those chunks is helpful on many fronts:

  • It reduces risk
  • It makes changes easier to reason about
  • It pushes you towards a better design

You’d rather make many small, precise cuts with a scalpel than one giant gash with a machete. I prefer scalpel-driven development in most additive cases, and like to save the machete for when it’s time to delete large blocks of dead code.

Sending the Pull Request

Ok, so you’ve gotten buy-in from the team that your changes are good, and you’ve achieved the design you set out to build. What’s the most effective way to actually send your pull request? You’ve been working hard on your changes, and you want other members of your team to pay attention to your pull request and give you fast feedback. How can you do that?

Remember earlier when I said that the most important part of a code review is to keep the team’s collective mental model well aligned? Your other teammates are probably working on something completely different than you, maybe in a completely different part of the system. Their brains are in a completely different context, so you must overcome this by giving them the helpful guidance they need to do the review. This means writing a well-organized description of the changes you made, why you made them and any other relevant information they won’t get from just reading the code alone. Don’t make your teammates do more mental work than they have to.

Let’s look at some good and bad examples of pull request descriptions:

Bad Example:

Title: Fix uninitialized memory bug
Description: 

This is the bug Bob and I talked about earlier. I had
trouble with the compiler but managed to make this work. Let me
know what you guys think.

If you’re the developer of this pull request, stop and put yourself in your code reviewer’s shoes for a second. The title is vague and raises more questions than it answers: Where is the memory bug? How critical is this change? What was the bug that Bob and you talked about earlier? What trouble did you have with the compiler? The description doesn’t provide any helpful context about the problem, nor any useful description of your changes. If this pull request is long, the reviewer is going to have to dive into the code and do mental gymnastics in an attempt to gain context before even getting a chance to think about how this change fits into the coherent design of the system.

Here’s an improved example that helps clarify the changeset:

Title: Fix process crash on startup from uninitialized memory [#54633]
Description:

This bug was causing process crashes on boot due to a memory
initialization error in our statistics Counter class. I talked this
over with Bob, and we both agree the crashes are a rare edge case
that don't warrant a hot-fix release. Here's a summary of the
changes:

  - Moved the underlying int variable into the class initializer
    to prevent ununitialized memory in the Counter.
  - Reworked the Counter interface to simplify caller conditional
    logic and prevent further off-by-one counting problems.
  - Added a unit test that exposes the crash

Testing: I've verified the test suite still passes, and verified
manually that the crash doesn't happen locally.

A few things have improved in this pull request description: The title is descriptive enough to give the reviewer some quick context and entice them to click on the email and learn more. The reviewer knows it’s a process crash (which is usually a really bad thing), and there’s also a bug report number where they can read more details if they’re interested in understanding more about the problem. The new description also outlines where the problem was occurring, and gives some context about how critical the fix is. There’s a summary that lists the high-level structural changes in the pull request, which will give the reviewer a mental picture of the changes before they even look at the code. Code reviewers should not be surprised by what they find. This example also outlines the testing that was performed, which gives reviewers confidence that the changes were well thought out, well tested, and ready to be merged. Set your reviewers up for success by making it stupid easy for them to click ‘merge’.

Reviewers: Giving Constructive Feedback

The next part of the review process to consider is giving constructive feedback to your peers. If the goal is clarity and alignment, giving quality constructive feedback helps everyone understand the system better and push towards better code.

Avoid saying things like:

  • “This design is broken.” Why is it broken? How can it be made better? Statements like this hurt confidence and can bruise egos.
  • “I don’t like this change.” Why don’t you like it? What would you like instead? It’s okay to not like something, but you should articulate your thoughts and provide helpful clues about how the code can be improved.
  • “Can you rewrite this to be more clear?” What’s wrong with what I have now? How should I rewrite it? What’s unclear? Comments like this are themselves unclear, and don’t give a simple path forward.

Instead, say things like:

  • “How does this code handle negative integers?” This feedback is specific, and causes the developer to think through the outcomes themselves. As a reviewer, you might know that the code in question crashes with negative integers, but it’s better to have the developer intuit this themselves. Questions like this might also be indicative of a testing gap, or a need for further specification of scope.
  • “This section is confusing to me, I don’t understand why class A is talking to class B” If the developer hasn’t provided a helpful description, and the code isn’t neatly structured, this kind of comment helps drive for clarity of design and cleaner code.
  • “It looks like you broke an interface boundary here. How will that affect the user?” You’ve pointed out an issue you noticed, giving them the benefit of the doubt that they meant to do it. Now they can think through the unseen ramifications of breaking the interface boundary and either provide a rationale for their decision, or choose to change it.

In general, framing feedback as questions is a good way to drive for clarity and correctness, and at the same time help the developer improve their designs in the future. This is usually how quality creative writing groups give each other feedback. In a creative writing setting, it’s harmful to say things like, “I don’t like this character,” whereas the same comment can be reframed more clearly: “In chapter one your character was warm and compassionate, and now he’s cold and icy. He doesn’t seem like a real person to me.” Now there’s specific feedback that can be discussed to suss out the problem.

Programmers like to solve problems and point out issues, so by nature they like to discover and point out flaws. It’s tempting to see code reviews as a way to prove how smart you are by finding problems in your peer’s code. Don’t do it. Code review is a way to get more eyes on a change and suss out critical problems, but your goal should be to review in a way that encourages your team members to improve their skills while fixing the problems at hand.

Style Points

Brace positions, variable and function names, indentation, and spacing issues should be addressed, but they are not the central purpose of good code review (notice that I put them at the top of the code review pyramid). If you find that your team is spending 90% of its time nitpicking indentation and variable names, you’re probably wasting everyone’s time on something that could be mostly automated. Write up a style guide, enforce indentation and spacing on check-in, and spend your time focusing on higher-value issues. I don’t want to diminish the importance of a consistent style. On the contrary, having a consistent, idiomatic style is one of the easiest ways to make your codebase easy to read and comprehend. Still, if you spend all of your code review focused on these simplistic tasks, ask yourself whether you’re avoiding the harder and more important work of keeping your team mentally aligned and thinking about higher-level designs.
