Gradle task inputs & outputs

Published in SoftwareMill Tech Blog · 5 min read · Jun 6, 2019

In my previous post about common pitfalls with Gradle tasks, I mentioned the inputs and outputs of a task as one of the things that confuse users. Since I barely scratched the surface there, here comes another part on that topic.

Don’t repeat yourself

You may be surprised, but an important feature of a fast build tool is actually avoiding work that doesn’t need to be done. That makes perfect sense: if a source file hasn’t changed since it was compiled (or, in a more general sense, processed), it’s pointless to compile it again and waste time and the machine’s resources.

This feature is called incremental build and is, of course, supported by Gradle. Every task can have zero or more inputs and outputs defined. Before a task is executed, a snapshot of its inputs is taken, and after the task finishes, another snapshot is taken, this time of its outputs. Gradle stores these snapshots. If the task is invoked again and none of its inputs or outputs have changed since the last run, it is considered UP-TO-DATE and will not be executed. The outputs produced by the previous run are simply reused.
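To make the mechanism concrete, here is a minimal sketch of a task with one declared input and one declared output. The task name `generateGreeting`, the property name and the file name are made up for illustration; only the `inputs` and `outputs` calls are standard Gradle API.

```kotlin
// build.gradle.kts (hypothetical task, for illustration only)
tasks.register("generateGreeting") {
    val greeting = "Hello"
    val outputFile = layout.buildDirectory.file("greeting.txt")

    // Declared input: a simple value that becomes part of the input snapshot.
    inputs.property("greeting", greeting)
    // Declared output: the file this task produces.
    outputs.file(outputFile)

    doLast {
        val target = outputFile.get().asFile
        target.parentFile.mkdirs()
        target.writeText(greeting)
    }
}
```

Running this task twice in a row should mark it as UP-TO-DATE the second time. Changing the value of `greeting` or deleting the generated file should cause it to run again.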

Incremental build is typically associated with, you guessed it, compiling files. But there are many other time-consuming tasks that can benefit heavily from this feature, e.g.:

applying DB migrations,
processing template files,
downloading dependencies and so on.

Typically, an input may be defined as:

a simple value like a string or a number (something that implements the Serializable interface),
a filesystem type like a directory or a file,
a nested value, i.e. an object that does not fit into the two previous categories but has its own properties that may be both inputs and outputs, e.g. a value object or a data class.
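The three kinds of inputs listed above can be sketched with a custom task class. Everything here (`RenderTemplate`, `RenderOptions`, the file paths) is hypothetical and only meant to show where the `@Input`, `@InputFile`, `@Nested` and `@OutputFile` annotations go.

```kotlin
// build.gradle.kts (hypothetical task, for illustration only)
import org.gradle.api.DefaultTask
import org.gradle.api.tasks.Input
import org.gradle.api.tasks.InputFile
import org.gradle.api.tasks.Nested
import org.gradle.api.tasks.OutputFile
import org.gradle.api.tasks.TaskAction
import java.io.File

// A nested value: not an input by itself, but its annotated properties are.
open class RenderOptions {
    @get:Input
    var trim: Boolean = false
}

open class RenderTemplate : DefaultTask() {
    // A simple (serializable) value.
    @get:Input
    var title: String = "untitled"

    // A filesystem input.
    @get:InputFile
    lateinit var template: File

    // A nested value whose own properties are inspected by Gradle.
    @get:Nested
    val options = RenderOptions()

    @get:OutputFile
    lateinit var rendered: File

    @TaskAction
    fun render() {
        val text = template.readText().replace("{{title}}", title)
        rendered.parentFile.mkdirs()
        rendered.writeText(if (options.trim) text.trim() else text)
    }
}

tasks.register<RenderTemplate>("renderTemplate") {
    template = file("src/template.txt") // assumed to exist
    rendered = layout.buildDirectory.file("rendered.txt").get().asFile
}
```

With the annotations in place, Gradle should take `title`, the content of the template file and `options.trim` into account when deciding whether `renderTemplate` is up to date.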

In turn, an output of a task is typically a file or a directory, but a task may also declare destroyables (via the @Destroys annotation), i.e. files that the task removes, such as a collection of files. For the full list of possible inputs and outputs, please check the documentation.
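As a rough illustration of a task that removes files rather than producing them, the sketch below registers a directory as a destroyable through the runtime API. The task name `removeReports` and the directory are assumptions made up for this example.

```kotlin
// build.gradle.kts (hypothetical task, for illustration only)
tasks.register("removeReports") {
    val reportsDir = layout.buildDirectory.dir("reports")

    // Tell Gradle that this task deletes the directory instead of producing it.
    destroyables.register(reportsDir)

    doLast {
        reportsDir.get().asFile.deleteRecursively()
    }
}
```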

An example

Let’s have a look at the following class diagram:

It represents a Version interface, its abstract AbstractVersion implementation and two concrete versions that extend the latter. One is correct, whereas the other one is spoilt on purpose: it always returns false for the snapshot field, regardless of the argument passed via the constructor.

There’s also a test suite, VersionTest, but which implementation of AbstractVersion is tested is decided at runtime, based on the class name in the VERSION_CLASS environment variable (DefaultVersion is used if the variable is empty). Why such a strange setup? This is part of a programming quiz where a user is asked to write a set of tests for the Version class (which is much more complicated than in this trivial example). When the user’s tests are run, the Version implementations are loaded dynamically and it’s checked whether the tests passed or failed for a given version. After that, the user gets a score.
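The setup described above could look roughly like the sketch below. It is a hypothetical reconstruction, not the code from the repo: the class names come from the post, but the constructor signature, the `versionClassName` and `createVersion` helpers, and the use of plain Java reflection are assumptions.

```kotlin
// Hypothetical reconstruction; the real classes live in the linked repo.
interface Version {
    val snapshot: Boolean
}

abstract class AbstractVersion(override val snapshot: Boolean) : Version

// Correct implementation: reports the flag it was constructed with.
class DefaultVersion(snapshot: Boolean) : AbstractVersion(snapshot)

// Deliberately broken implementation: ignores the constructor argument.
class FailedVersionSnapshotAlwaysFalse(
    @Suppress("UNUSED_PARAMETER") snapshot: Boolean
) : AbstractVersion(false)

// Picks the class under test: VERSION_CLASS if set, DefaultVersion otherwise.
fun versionClassName(env: Map<String, String> = System.getenv()): String =
    env["VERSION_CLASS"].takeUnless { it.isNullOrBlank() }
        ?: DefaultVersion::class.java.name

// Loads the chosen implementation dynamically via reflection.
fun createVersion(className: String, snapshot: Boolean): Version =
    Class.forName(className)
        .getConstructor(Boolean::class.java)
        .newInstance(snapshot) as Version
```

A test could then call `createVersion(versionClassName(), true)` and assert that `snapshot` is true. That assertion is expected to hold for DefaultVersion and to fail for FailedVersionSnapshotAlwaysFalse.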

The GitHub repo with the example above can be found here; please clone it and check out the task-io-no-input tag. When you run ./gradlew, the inputs and outputs defined for the test task are listed on the screen. As you can see, the inputs contain both source sets (test and main) and the dependencies. Interestingly, the Gradle script is an input itself, although it’s not listed. The outputs contain the test results and a test report.
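How the repo prints this list is not shown here, but a listing like the one described could be produced with a small default task along these lines. The task name `printTestIo` is made up, and the sketch assumes the java plugin is applied so that a `test` task exists.

```kotlin
// build.gradle.kts (hypothetical task, for illustration only)
tasks.register("printTestIo") {
    doLast {
        val test = tasks.getByName("test")

        println("Inputs:")
        test.inputs.files.forEach { println("  file: $it") }
        test.inputs.properties.forEach { (name, value) -> println("  property: $name = $value") }

        println("Outputs:")
        test.outputs.files.forEach { println("  file: $it") }
    }
}

// Run the listing when ./gradlew is invoked without arguments.
defaultTasks("printTestIo")
```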

Now, please run the following command:

You will see that the test task was run and all the outputs were generated. If you run the same command a second time, you’ll notice that the test task (and the others) were marked as UP-TO-DATE and nothing has changed.

Now, we want to check if tests fail for the deliberately spoilt class: FailedVersionSnapshotAlwaysFalse, thus we run:

However... nothing happened this time either. We expected the tests to fail, but what happened is actually the correct behavior. The sources are compiled and none of the outputs got deleted, so why would Gradle need to run the tests a second time?

Now, we have 3 options here to make it work: The Good, The Bad and The Ugly.

The Bad is to add clean to the tasks being run. This will delete all the outputs and repeat all the work all over again. This is pointless. The sources are compiled, we don’t need to recompile them since the test classes are loaded dynamically.

The Ugly is to delete one of the outputs defined for the test task — Gradle will notice that one of the outputs is missing and will re-run the task. It works, but this way we’re hacking the system instead of using its features properly.

The Good is to define a proper input that Gradle will consider while calculating the snapshot of the inputs. To do it, you need to add the following piece of code to the build.gradle.kts script:

Alternatively, check out the HEAD of the master branch in the cloned repo, where it’s already defined. Now, when you change the value of the VERSION_CLASS variable, you will notice that Gradle detects that an input has changed and reruns the tests. However, since the sources haven’t changed, they are not recompiled.
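For reference, declaring such an input might look roughly like the sketch below. This is an assumption about the shape of the fix, not a copy of the repo’s code: it registers the value of VERSION_CLASS as an input property of the test task, falling back to an empty string when the variable is not set.

```kotlin
// build.gradle.kts (a sketch, assuming the java plugin provides the "test" task)
tasks.named<Test>("test") {
    // The environment variable becomes part of the input snapshot,
    // so changing it makes the task out of date.
    inputs.property("VERSION_CLASS", System.getenv("VERSION_CLASS") ?: "")
}
```

Because only the test task declares this property, the compile tasks should stay UP-TO-DATE when the variable changes.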

If you wonder whether defining an input as a property is a real-life case, please have a look at this question posted on StackOverflow. And upvote the answer ;)

Summary

After reading this post you should know that:

to speed things up you should avoid doing them ;)
each task in Gradle has configurable inputs and outputs
you’re responsible for configuring inputs and outputs correctly
you should use built-in features of your toolbox

Written by Maciek Opała