Property Based Testing of JSON Encoders and Decoders

I’ve built a small Scala project that is a client for Ubiquiti UniFi controllers. As part of this I need to write JSON encoders and decoders to work with the controller API’s data model using the circe library. I want to test these in a reasonably comprehensive way but I also want to be lazy. By using property based testing with ScalaCheck I’ve managed to cover many scenarios in relatively little code.

When testing the encoders and decoders we can take advantage of their mirror nature. If you run an instance of the model through the encoder to turn it into JSON and then apply the decoder you should get back a data structure that is identical to the input. If you don’t then either the encoder or decoder has a bug. If we generate a large number of instances of the data structure covering the possible variations we can have a high degree of confidence that the encoder and decoder are compatible.

There are a couple of things this will not validate for us. Firstly, it won’t verify that the encoder and decoder work with the actual JSON we want to use. The structure they agree on could be completely different to the structure we wish to support. Secondly, it won’t verify what happens with invalid JSON. If our data structure can only represent valid data then ScalaCheck can’t generate an invalid structure to encode, so we can’t exercise the decoder’s error handling using this mechanism. There are potentially some exceptions to this, such as string length limits and other constraints that may not be worth enforcing in the data model. We will need additional tests for these cases.

Although an end-to-end test using ScalaCheck and the encoder and decoder doesn’t cover all cases, it still gives us a lot of value. We can write additional tests, which may or may not be property based [1], to cover decoding JSON that matches the expected structure and any error cases we need to handle. We can then have property based tests to ensure compatibility between the validated decoder and the encoder. I’ve found that for my use case the only test needed for the encoder is the property based end-to-end test. In other cases some explicit tests for the encoder may be valuable if you are particularly concerned with translating a data structure to JSON. That was not warranted in my situation, where I am primarily interested in decoding and the main use for the encoders is to feed other tests.
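As a rough sketch of what those additional decoding tests could check, the example below decodes one hand-written document in the expected shape and one with a required field missing. PortGroupSummary, FixtureChecks and the JSON field names are hypothetical stand-ins for illustration, not the real model or the real UniFi schema.

```scala
import io.circe.Decoder
import io.circe.parser.decode

// Hypothetical, simplified model. The field names used by the decoder
// are assumptions for illustration only.
final case class PortGroupSummary(id: String, name: String, members: List[Int])

object PortGroupSummary {
  implicit val decoder: Decoder[PortGroupSummary] =
    Decoder.forProduct3("_id", "name", "group_members")(PortGroupSummary.apply)
}

object FixtureChecks {
  // A hand-written document in the shape we expect to receive.
  val valid: String =
    """{ "_id": "abc123", "name": "web", "group_members": [80, 443] }"""

  // The same document with the required "name" field missing.
  val missingName: String =
    """{ "_id": "abc123", "group_members": [80, 443] }"""

  // decode parses the string and then applies the decoder,
  // returning a Left on either kind of failure.
  val decodedValid   = decode[PortGroupSummary](valid)
  val decodedInvalid = decode[PortGroupSummary](missingName)
}
```

A fixture test would assert that decodedValid is a Right holding the expected instance and that decodedInvalid is a Left.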

Let’s look at an example. UniFi Firewall Groups allow you to define groups of elements for use in firewall rules rather than having to enter the values every time a rule uses them. They can contain port numbers, IPv4 addresses/subnets or IPv6 addresses/subnets. FirewallGroup is a type that I have defined to capture this. I’m not currently supporting IPv6 configuration with this library so that’s not represented. There is however an UnknownFirewallGroup that is used for cases the code doesn’t exactly support.

case class SiteId(id: String)
case class FirewallGroupId(id: String)

sealed trait IPv4
// Implementations of IPv4 omitted

sealed trait FirewallGroup {
  val id: FirewallGroupId
  val name: String
}

object FirewallGroup {
  case class PortGroup(
    id: FirewallGroupId,
    name: String,
    members: List[Int],
    siteId: SiteId
  ) extends FirewallGroup

  case class Ipv4AddressSubnetGroup(
    id: FirewallGroupId,
    name: String,
    members: List[IPv4],
    siteId: SiteId
  ) extends FirewallGroup

  case class UnknownFirewallGroup(
    id: FirewallGroupId,
    name: String,
    siteId: SiteId
  ) extends FirewallGroup
}

Now that we have our structure we’ll need some ScalaCheck generators. We define generators for the three different types of FirewallGroup we have defined and then set up a generator that will randomly choose one of them every time it’s asked for one.

The ScalaCheck User Guide discusses what generators are and how to write them in more detail.

import org.scalacheck.Gen

// Generators for IPv4 (ipAddressV4, cidrV4) omitted

val siteId: Gen[SiteId] = for {
  id <- Gen.identifier
} yield SiteId(id)

val firewallGroupId: Gen[FirewallGroupId] = for {
  id <- Gen.identifier
} yield FirewallGroupId(id)

val portGroup: Gen[PortGroup] = for {
  id <- firewallGroupId
  name <- Gen.identifier
  count <- Gen.choose(1, 5)
  members <- Gen.listOfN(count, Gen.choose(1, 65535))
  siteId <- siteId
} yield PortGroup(id, name, members, siteId)

val ipV4AddressSubnetGroup: Gen[Ipv4AddressSubnetGroup] = for {
  id <- firewallGroupId
  name <- Gen.identifier
  count <- Gen.choose(1, 5)
  members <- Gen.listOfN(count, Gen.oneOf(ipAddressV4, cidrV4))
  siteId <- siteId
} yield Ipv4AddressSubnetGroup(id, name, members, siteId)

val unknownFirewallGroup: Gen[UnknownFirewallGroup] = for {
  id <- firewallGroupId
  name <- Gen.identifier
  siteId <- siteId
} yield UnknownFirewallGroup(id, name, siteId)

val firewallGroup: Gen[FirewallGroup] = Gen.oneOf(portGroup, ipV4AddressSubnetGroup, unknownFirewallGroup)

We’ll also need an Arbitrary[FirewallGroup] to satisfy the test framework requirements. This will need to be in scope for the test to compile.

implicit val arbitraryFirewallGroup: Arbitrary[FirewallGroup] = Arbitrary(Generators.firewallGroup)

Now that we have a means to generate the instances we want to test, the test itself is trivial.

package com.abstractcode.unifimarkdownextractor.unifiapi.models

import com.abstractcode.unifimarkdownextractor.Arbitraries._
import com.abstractcode.unifimarkdownextractor.unifiapi.models.FirewallGroup._
import io.circe.syntax._
import org.scalacheck.Prop.{forAll, propBoolean}
import org.scalacheck.Properties

object FirewallGroupSpec extends Properties("FirewallGroup") {
  property("can round trip encode and decode") = forAll {
    (firewallGroup: FirewallGroup) => firewallGroup.asJson.as[FirewallGroup] == Right(firewallGroup)
  }
}

Our test object extends Properties and takes a name to associate the tests with in the output. We define an individual test with property and supply a name that indicates what the test itself asserts. In forAll we define a function that takes a FirewallGroup and returns a boolean indicating whether the property being tested holds for the supplied FirewallGroup instance. This test turns the instance into a circe Json object and then back into a FirewallGroup. As that process can fail, as[T] returns an Either, and we assert that it is a Right containing the original FirewallGroup. We rely on the automatically generated case class equals implementation for this comparison.
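If several models need the same check, the round trip can be pulled into a small helper. This is only a sketch: RoundTrip and holds are names invented here, not part of circe or of the project.

```scala
import io.circe.{Decoder, Encoder}
import io.circe.syntax._

object RoundTrip {
  // True when encoding `value` to Json and decoding the result
  // gives back a value equal to the original.
  def holds[A: Encoder: Decoder](value: A): Boolean =
    value.asJson.as[A] == Right(value)
}
```

A property could then be written as forAll { (firewallGroup: FirewallGroup) => RoundTrip.holds(firewallGroup) }, provided an Encoder and a Decoder for the type are in implicit scope.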

When run, ScalaCheck uses the generators we have defined via the Arbitrary to create FirewallGroup instances. By default it generates 100 instances, and the test passes if the property holds for all of them. Otherwise it reports the failing instance.
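If 100 cases is too few for a particular model, the count can be raised per Properties object. The sketch below assumes a ScalaCheck version that provides overrideParameters on Properties. MoreCasesSpec and its list property are placeholders so that the example stands alone.

```scala
import org.scalacheck.Prop.{forAll, propBoolean}
import org.scalacheck.{Properties, Test}

object MoreCasesSpec extends Properties("MoreCases") {
  // Require 500 passing cases instead of the default 100
  // for every property defined in this object.
  override def overrideParameters(p: Test.Parameters): Test.Parameters =
    p.withMinSuccessfulTests(500)

  // Placeholder property; a round trip test would go here instead.
  property("reversing twice is the identity") = forAll { (xs: List[Int]) =>
    xs.reverse.reverse == xs
  }
}
```

More cases means longer test runs, so it is probably worth raising the count only where the generated structures have many variations.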

ScalaCheck can supply multiple generated values at once: add them as parameters to the function and ensure that there is an Arbitrary in scope for each type. It can also do test minimisation where, on finding a failed case, it will attempt to shrink the case to a minimal set of inputs that fail the property being tested. See the ScalaCheck documentation for details.
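A minimal sketch of a property taking two generated inputs follows. It uses plain lists rather than the FirewallGroup model so that it is self-contained. ScalaCheck already provides an Arbitrary for both parameter types and circe provides the codecs. MultipleInputsSpec is a name made up for this example.

```scala
import io.circe.syntax._
import org.scalacheck.Prop.{forAll, propBoolean}
import org.scalacheck.Properties

object MultipleInputsSpec extends Properties("MultipleInputs") {
  // Each parameter is generated independently from the Arbitrary
  // instance for its type.
  property("names and ports both round trip") = forAll {
    (names: List[String], ports: List[Int]) =>
      names.asJson.as[List[String]] == Right(names) &&
        ports.asJson.as[List[Int]] == Right(ports)
  }
}
```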

Considerations

Performance

ScalaCheck allows you to compose generators that can produce arbitrarily complex structures. You can take the output of a generator and transform it using functional operations like filter and map. This means you can potentially make generating instances very slow. I have seen a single property go from taking over 5 seconds to run to taking milliseconds just by changing a string generator to not use a filter. If you are filtering heavily then you are generating far more initial values than the generator ends up producing, and that cost can quickly become non-trivial. Taking some care that your generators are efficient will help keep test runs within reasonable bounds.
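To illustrate the difference, here are two ways to generate a non-empty alphanumeric string. This is a sketch of the general idea rather than the generator that caused the slowdown described above. NameGenerators and the length bound of 20 are arbitrary choices for the example.

```scala
import org.scalacheck.{Arbitrary, Gen}

object NameGenerators {
  // Filtering: generates arbitrary strings and throws away every one
  // that is empty or contains a non-alphanumeric character.
  val filtered: Gen[String] =
    Arbitrary.arbitrary[String]
      .suchThat(s => s.nonEmpty && s.forall(_.isLetterOrDigit))

  // Constructive: picks a length, then builds the string from characters
  // that are already valid, so nothing is discarded.
  val constructed: Gen[String] = for {
    length <- Gen.choose(1, 20)
    chars  <- Gen.listOfN(length, Gen.alphaNumChar)
  } yield chars.mkString
}
```

Both describe the same kind of value, but the filtered version discards most of what it generates while the constructed version never discards anything.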

Handling partial support of a JSON structure

The UnknownFirewallGroup will be encoded as JSON that doesn’t fully match the desired structure. This is not an issue in this case where the client only decodes the structure but would not be desirable if we were sending encoded JSON out into the world. There are a number of possible solutions here.

Firstly you could not have UnknownFirewallGroup and validate that you get a decoder error if something unknown is encountered. In this instance I preferred to provide those elements I supported rather than fail to decode, and to provide sufficient information to identify that there are other groups the client cannot handle. You can still do useful things in this case; you just can’t support IPv6. If you are generating encoded output you must then choose whether or not to encode the incomplete data. If not, you should also exclude it from your generators or the end-to-end test won’t work. If you are sending out encoded JSON, not encoding invalid values is likely best. I’m not, so I can get away with it here.

Secondly you could silently drop the unknown elements. This is similar to the first option but doesn’t let a user of the client know that information is missing. Whether this is valid depends on the use case and I decided it was better to indicate that the data was incomplete and let any consuming code determine how it will handle the response.

Thirdly you could error on finding an unknown element. You would not model the unsupported parts of the structure so you can’t encode it and have nothing to decode it to. This is only acceptable if you are sure you’ll never see the unsupported elements or that seeing them is an error case.
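For comparison with these options, the approach this post settled on, decoding unrecognised groups into an unknown case, might look roughly like the sketch below. It uses a simplified stand-in for the FirewallGroup hierarchy. The group_type discriminator, its values and the other field names are assumptions for illustration and not necessarily what the UniFi API returns.

```scala
import io.circe.{Decoder, HCursor}
import io.circe.parser.decode

// Simplified, hypothetical stand-in for the FirewallGroup hierarchy.
sealed trait Group
object Group {
  final case class Ports(name: String, members: List[Int]) extends Group
  final case class Unknown(name: String) extends Group

  implicit val decoder: Decoder[Group] = Decoder.instance { (c: HCursor) =>
    c.get[String]("group_type").flatMap {
      case "port-group" =>
        for {
          name    <- c.get[String]("name")
          members <- c.get[List[Int]]("group_members")
        } yield Ports(name, members)
      // Any other type is kept, but only with the fields common to all groups.
      case _ =>
        c.get[String]("name").map(Unknown(_))
    }
  }
}

object FallbackExample {
  val known = decode[Group](
    """{ "group_type": "port-group", "name": "web", "group_members": [80, 443] }""")
  val unsupported = decode[Group](
    """{ "group_type": "ipv6-address-group", "name": "v6-hosts" }""")
}
```

Here the unsupported document still decodes, so consuming code can see that a group exists without knowing its members. Switching to the third option would mean replacing the fallback case with a decoding failure.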


  1. Mine were not and used Specs2