Coyote Testing Tool

a simple environment, operations and runtime-meta testing tool
Marios Andreopoulos

A few days ago we open-sourced Coyote, a tool we created to automate testing of our Landoop Boxes, which feature a large range of environments for Big Data and Fast Data (see Kafka).

Coyote does one simple thing: it takes a .yml file with a list of commands, runs them, and checks their exit code and/or output. It has some other functionality too, but this is its essence. The source code is short and I don’t expect any praise for it; Coyote is a tool, a useful one.
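To make the idea concrete, here is a minimal Go sketch of that core loop: run a command, capture its output and look at its exit code. It is not taken from Coyote's source. The `result` type, the `runCommand` helper and the choice to run commands through `sh -c` are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os/exec"
)

// result holds what a single command produced. The type and its fields are
// hypothetical names for this sketch, not Coyote's own.
type result struct {
	ExitCode int
	Stdout   string
	Stderr   string
}

// runCommand runs cmdline through "sh -c" (an assumption of this sketch) and
// captures its exit code, stdout and stderr.
func runCommand(cmdline string) (result, error) {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command("sh", "-c", cmdline)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr

	err := cmd.Run()
	res := result{Stdout: stdout.String(), Stderr: stderr.String()}

	var exitErr *exec.ExitError
	switch {
	case err == nil:
		res.ExitCode = 0
	case errors.As(err, &exitErr):
		// The command ran but exited with a non-zero status.
		res.ExitCode = exitErr.ExitCode()
	default:
		// The command could not be started at all (e.g. sh is missing).
		return res, err
	}
	return res, nil
}

func main() {
	failures := 0
	for _, c := range []string{"echo hello", "exit 3"} {
		res, err := runCommand(c)
		if err != nil || res.ExitCode != 0 {
			failures++
		}
		fmt.Printf("%q -> exit code %d, stdout %q\n", c, res.ExitCode, res.Stdout)
	}
	fmt.Println("failed commands:", failures)
}
```

Keeping stdout and stderr in separate buffers is what makes it possible to check each stream on its own later.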

We use it for environment and operations testing, as well as runtime meta-testing.

Environment and operations testing verifies that an environment is set up and working as expected: that a specific port or piece of software is accessible, or that a command produces certain results, such as compiling a program that requires particular tools and libraries to be present and configured, or running a command that needs certain environment variables set. Performance testing can also be seen as a subset of environment and operations testing.

Runtime meta-testing is a shortcut to proper software run tests. If you trust your software to be robust enough, then instead of verifying the actual results (e.g. entries in a database), you can scan the output of the software for expected or unexpected log messages. Of course, this kind of test should not be used without awareness of the underlying dangers, chiefly that the software may log success while the actual result is wrong.
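As a rough illustration of scanning output for expected and unexpected messages, the Go sketch below checks a block of text against two lists of regular expressions. The `checkOutput` function and the sample log lines are made up for this example and are not Coyote's implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// checkOutput reports every violated expectation: a pattern in mustMatch
// that is absent from output, or a pattern in mustNotMatch that is present.
// It returns an error if any pattern is not a valid regular expression.
func checkOutput(output string, mustMatch, mustNotMatch []string) ([]string, error) {
	var problems []string
	for _, p := range mustMatch {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, err
		}
		if !re.MatchString(output) {
			problems = append(problems, "missing expected pattern: "+p)
		}
	}
	for _, p := range mustNotMatch {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, err
		}
		if re.MatchString(output) {
			problems = append(problems, "found unexpected pattern: "+p)
		}
	}
	return problems, nil
}

func main() {
	// Made-up log output, for illustration only.
	logs := "INFO job started\nINFO Linecount: 19\nWARN retrying connection\n"
	problems, err := checkOutput(logs, []string{`Linecount: \d+`}, []string{`ERROR`, `Exception`})
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println("problems found:", len(problems))
}
```

Note that a check like this only proves what the software said, not what it did, which is exactly the danger mentioned above.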


Coyote has two important outputs: (a) an HTML report with the commands run, their exit status, stdout, stderr and some statistics, and (b) its exit code, which indicates the number of errors that occurred. The exit code saturates at 255: values up to 254 are exact counts, while 255 means that 255 or more errors occurred. The HTML report is for humans, the exit code for machines. We use it with Jenkins CI where, amongst other things, we need quick visibility of failures and verbose output for debugging.
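The saturating exit code can be sketched in a few lines of Go. Exit statuses on POSIX systems are a single byte, so without clamping a count of 256 would wrap around to 0 and look like success. The function name `saturatedExitCode` is hypothetical and the code is only an illustration of the behaviour described above, not Coyote's source.

```go
package main

import "fmt"

// saturatedExitCode maps a failure count to a process exit code: counts up
// to 254 are reported as-is, anything from 255 upwards is reported as 255.
// Negative counts are treated as 0 (a defensive choice of this sketch).
func saturatedExitCode(failures int) int {
	if failures < 0 {
		return 0
	}
	if failures > 255 {
		return 255
	}
	return failures
}

func main() {
	for _, n := range []int{0, 3, 254, 255, 1000} {
		fmt.Printf("%d failed tests -> exit code %d\n", n, saturatedExitCode(n))
	}
	// A real runner would finish with something like
	// os.Exit(saturatedExitCode(failures)).
}
```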

You may see some example HTML outputs here and here.


Test configuration is set via a YML file, partially inspired by Ansible. Let’s see a real-world example before we delve into specifics. Below are two of our Box tests, a basic HDFS/Hadoop test and a Spark test. I’ve added some comments to explain the non-intuitive parts.

- name: coyote
  # name is the name of a group of tests. “coyote” is a reserved keyword for
  # global settings, such as html title and universal timeout (can be overridden)
  title: Box CDH Tests

# Hadoop/HDFS test
- name: Hadoop Tutorial
  skip: _hadoop_tutorial_
  # When skip is set to “true”, the group will be skipped. Useful for automation.
  entries:
    - name: Clone Landoop Intro Repo
      command: git clone
      workdir: /home/coyote

    - name: hdfs put
      command: hadoop fs -put ../
      workdir: /home/coyote/landoop-intro/Hadoop-101

    - name: wordcount example
      command: hadoop jar hadoop-examples.jar wordcount results
      workdir: /home/coyote/landoop-intro/Hadoop-101

    - name: build code
      command: hadoop
      workdir: /home/coyote/landoop-intro/Hadoop-102

    - name: create jar
      command: |
        jar cf wc.jar
      workdir: /home/coyote/landoop-intro/Hadoop-102

    - name: run jar
      command: hadoop jar wc.jar WordCount results-2
      workdir: /home/coyote/landoop-intro/Hadoop-102

    - command: hadoop fs -rm -r results results-2
      nolog: true
      # nolog will not log the test results, nor count them as successes
      # or failures

# Spark Tests
- name: Spark Tests
  skip: _spark_tests_
  entries:
    - command: mkdir -p coyote-spark-test-app/src/main/scala
      nolog: true

    - name: Create Test Scala App
      command: tee coyote-spark-test-app/src/main/scala/CoyoteTestApp.scala
      stdin: |
        /* SimpleApp.scala */
        import org.apache.spark.SparkContext
        import org.apache.spark.SparkContext._
        import org.apache.spark.SparkConf

        object CoyoteTestApp {
          def main(args: Array[String]) {
            val logFile = "CoyoteTestApp.scala" // file should be in HDFS
            val conf = new SparkConf().setAppName("Coyote Application")
            val sc = new SparkContext(conf)
            val logData = sc.textFile(logFile, 2).cache()
            val numAs = logData.filter(line => line.contains("a")).count()
            val numBs = logData.filter(line => line.contains("b")).count()
            val lineCount = logData.count()
            println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
            println("Linecount: %s".format(lineCount))
            sc.parallelize(List(("Linecount: %s".format(lineCount))),1).saveAsTextFile("spark-results.txt")
          }
        }

    - name: Create Sbt File
      command: tee coyote-spark-test-app/main.sbt
      stdin: |
        name := "Coyote Test Project"
        version := "1.0"
        scalaVersion := "2.11.7"
        libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.2"

    - name: Put into HDFS a Test File
      command: |
        hadoop fs -put
                  -f coyote-spark-test-app/src/main/scala/CoyoteTestApp.scala

    - name: Build SBT Package
      command: sbt package
      workdir: coyote-spark-test-app

    - name: Spark-submit Application (Local)
      command: |
            --class "CoyoteTestApp"
            --master local[4]
      stdout_has: [ 'Linecount: 19' ]
      # stdout_has takes an array of regular expressions to check against stdout
      # there is also stdout_not_has, stderr_has, stderr_not_has

    - name: Verify via HDFS
      command: hadoop fs -cat spark-results.txt/part-00000
      stdout_has: [ 'Linecount: 19' ]

    - command: hadoop fs -rm -r -f spark-results.txt
      nolog: true

    - name: Spark-submit Application (CDH Cluster)
      command: |
          --class "CoyoteTestApp"
          --master yarn
          --deploy-mode cluster

    - name: Verify via HDFS
      command: hadoop fs -cat spark-results.txt/part-00000
      stdout_has: [ 'Linecount: 19' ]

    - command: hadoop fs -rm -r -f spark-results.txt
      nolog: true
    - command: rm -rf coyote-spark-test-app
      nolog: true

You can see the output for these two tests here. Your first reaction may be that the configuration is too verbose. This is the Ansible-inspired part. In practice you write your tests once or a few times and run them many times. The person running the tests and/or evaluating the results may not be the one who wrote them. Even the author can forget the purpose of each command after a couple of months. At Landoop we also use the Coyote configuration files as reference examples for both newcomers and ourselves.

Currently Coyote supports these settings for each command:

  • workdir: run the command in this directory
  • stdin: pass this as standard input to the command
  • nolog: do not count or log this command; we usually use it for small cleanup tasks
  • env: array of environment variables to make available to the command
  • timeout: if the command hasn’t completed in this time, kill it. It takes golang time duration strings, such as “30s”, “2m15s”, “1h”. There is a global default timeout of 5 minutes, which you can override globally and/or per command
  • stdout_has, stdout_not_has, stderr_has, stderr_not_has: arrays of regular expressions to check against standard output and standard error
  • ignore_exit_code: do not fail the test if the exit code is not zero; it may still fail from a timeout or a stdout/stderr check
  • skip: if set to true, skip this test (or the group of tests if “skip” is at the group level); useful for automating test runs by using sed to select tests
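The timeout rules above (a 5 minute default, a global override, a per-command override) can be sketched with Go's standard `time.ParseDuration`, which accepts exactly the kind of strings shown. The `effectiveTimeout` helper, and the convention that an empty string means "not set", are assumptions of this example rather than Coyote's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// defaultTimeout mirrors the 5 minute global default described above.
const defaultTimeout = 5 * time.Minute

// effectiveTimeout picks the timeout for one command: the per-command value
// wins, then the global override, then the built-in default. Empty strings
// mean "not set". An invalid duration string is returned as an error.
func effectiveTimeout(global, perCommand string) (time.Duration, error) {
	for _, s := range []string{perCommand, global} {
		if s == "" {
			continue
		}
		return time.ParseDuration(s)
	}
	return defaultTimeout, nil
}

func main() {
	cases := [][2]string{{"", ""}, {"1h", ""}, {"1h", "2m15s"}, {"", "30s"}}
	for _, tc := range cases {
		d, err := effectiveTimeout(tc[0], tc[1])
		if err != nil {
			fmt.Println("invalid duration:", err)
			continue
		}
		fmt.Printf("global=%q command=%q -> %v\n", tc[0], tc[1], d)
	}
}
```

One way to enforce the resulting duration is to combine `context.WithTimeout` with `exec.CommandContext` from the standard library, which kills the process once the deadline passes.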

It also supports some special strings:

  • %UNIQUE%: if this string is used inside a command, stdin, or env variable, it will be replaced by a unique numeric string at runtime
  • %UNIQUE_[0-9A-Za-z_-]+%: if such a string (e.g. %UNIQUE_VAR1%, %UNIQUE_1-B%) is used, all its instances inside commands, stdin and env variables will be replaced by a common unique numeric string at runtime
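A possible way to implement this kind of substitution is sketched below in Go. The `expander` type and its methods are hypothetical. The sketch also assumes that a plain %UNIQUE% gets a fresh value on every occurrence, which the description above does not spell out, while named strings share one value as described.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// uniqueToken matches %UNIQUE% and named variants such as %UNIQUE_VAR1%.
var uniqueToken = regexp.MustCompile(`%UNIQUE(_[0-9A-Za-z_-]+)?%`)

// expander replaces tokens with numeric strings drawn from next and
// remembers the value handed out to each named token.
type expander struct {
	next  func() string
	named map[string]string
}

func newExpander(next func() string) *expander {
	return &expander{next: next, named: map[string]string{}}
}

// expand substitutes every token in s. A plain %UNIQUE% gets a fresh value
// each time (an assumption of this sketch); a named token reuses the value
// first assigned to that name, across all calls on the same expander.
func (e *expander) expand(s string) string {
	return uniqueToken.ReplaceAllStringFunc(s, func(tok string) string {
		if tok == "%UNIQUE%" {
			return e.next()
		}
		if v, ok := e.named[tok]; ok {
			return v
		}
		v := e.next()
		e.named[tok] = v
		return v
	})
}

func main() {
	// A simple counter stands in for whatever source of uniqueness is used.
	counter := 1000
	e := newExpander(func() string {
		counter++
		return strconv.Itoa(counter)
	})
	fmt.Println(e.expand("create topic-%UNIQUE_T1%"))
	fmt.Println(e.expand("delete topic-%UNIQUE_T1%"))
	fmt.Println(e.expand("tmp-%UNIQUE% tmp-%UNIQUE%"))
}
```

Named strings are what let a later command refer to a resource created by an earlier one, for example to clean it up.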

Some features you may miss are:

  • loop constructs (such as Ansible’s “with”)
  • global variables / templating
  • abort if a test fails (this is a design choice for now; we always run all tests)

Jenkins Integration at Landoop

Most of the time it is Jenkins that runs our tests.

We’ve set it to use Coyote’s exit code not only to mark the build as failed, but also to add the number of errors to the job name, which gives us quick visibility of the current status. We also keep a history of test reports.

Enabling or disabling tests is easy: add a boolean variable for each test group and then a single bash line that sets the group’s skip flag to true when the variable is false:

[[ ${HADOOP_TEST} == false ]] && sed -e 's/_hadoop_tutorial_/true/' -i box-configuration.yml
[[ ${SQOOP_TEST}  == false ]] && sed -e 's/_sqoop_tutorial_/true/'  -i box-configuration.yml
[[ ${KAFKA_TEST}  == false ]] && sed -e 's/_kafka_tests_/true/'     -i box-configuration.yml
[[ ${SPARK_TEST}  == false ]] && sed -e 's/_spark_tests_/true/'     -i box-configuration.yml

Final Words

Although code testing is a well-established practice, our research didn’t reveal many testing tools for the latter part of the devops pipeline, such as the deployment and execution phase. Coyote has served us well so far, and this is the reason we wanted to share it.

Of course, as with every such effort, we are bound to re-invent the wheel in some parts. The codebase is kept small and golang makes it easy to add features as we go.

For the time being we don’t have a release plan; just grab the latest Coyote commit from master, it works:

go get

To use it:

coyote -c configuration.yml -out test.html

To make changes to the source code:

cd  "$GOPATH/src/"
# edit code and/or html
go generate
go build

Should we make any breaking changes to the configuration format, we will explore how to communicate about it and protect Coyote’s current users.

If you are interested in examples or extending the code, visit Coyote’s GitHub repository.

Thank you for your time,
