


A guide to using Bazel, an artifact-based build system, for Scala developers with an example and step-by-step instructions.
The build time of a project has a significant impact on a team’s development efficiency. The larger the code base, the longer it takes to build. And the longer the build time, the worse the developer experience becomes.
While sbt is a great build tool, its design (in particular the lack of a reliable content-addressable build cache) makes it poorly suited to large projects.
This blog post introduces Bazel, a build system designed for fast builds even in Google-scale repositories, by walking through the build of simple Scala applications.
First of all, let’s learn the concept of Bazel. Bazel is a build system whose motto is “{Fast, Correct} choose two”. In order to realize these properties, Bazel is designed to be an “artifact-based build system”. In this section, we will learn about the nature of Bazel by looking at what an artifact-based build system is and what “{Fast, Correct} choose two” means!
Traditional build systems like Ant and Maven are called task-based build systems. In the build configuration for task-based build systems, we describe an imperative set of tasks like do task A, then do task B, and then do task C.
On the other hand, in artifact-based build systems such as Buck, Pants, and Bazel, we describe a declarative set of artifacts to build, a list of dependencies, and limited options for the build.
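So, the basic idea of Bazel is that your build is a pure function: sources and dependencies are the inputs, artifacts are the outputs, and there are no side effects.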
For more details about the concept of artifact-based build systems, I recommend reading through Chapter 18 of Software Engineering at Google.
To better understand the claim “{Fast, Correct} choose two”, it is necessary to understand Bazel’s hermeticity property.
A hermetic build system like Bazel returns the same output whenever it is given the same input sources and configuration.
Thanks to hermeticity, Bazel is able to provide reproducible builds: given the same inputs, Bazel always returns the same output on everyone’s computer.
And thanks to reproducible builds, Bazel can provide a remote cache feature to share the build cache within a team. With a remote cache, we can build large projects fast by reusing build results shared across team members. Build reproducibility is also a critical aspect of supply chain security, required to achieve the highest level of SLSA compliance.
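As a rough illustration of why hermeticity makes a shared cache safe, here is a minimal Scala sketch. It is not how Bazel is implemented: BuildInput, CacheSketch, and the fake compile step are hypothetical names invented for this example, and an in-memory map stands in for the remote cache. The point is only that when the output depends on nothing but the declared inputs, a digest of those inputs can serve as the cache key.

```scala
import java.security.MessageDigest
import scala.collection.mutable

// Hypothetical model: everything that may influence the output is declared here.
final case class BuildInput(sources: Map[String, String], options: List[String])

object CacheSketch {
  // Content-addressed key: SHA-256 over the sorted sources and the options.
  def digest(input: BuildInput): String = {
    val md = MessageDigest.getInstance("SHA-256")
    input.sources.toSeq.sortBy(_._1).foreach { case (path, content) =>
      md.update(s"$path\n$content\n".getBytes("UTF-8"))
    }
    input.options.foreach(o => md.update(s"$o\n".getBytes("UTF-8")))
    md.digest().map("%02x".format(_)).mkString
  }

  // Stand-in for a cache shared across team members (assumption: in-memory map).
  val cache: mutable.Map[String, String] = mutable.Map.empty

  // Counts how often the fake compiler actually runs.
  var compilations: Int = 0

  // Fake "compiler": its result depends only on the declared input.
  private def compile(input: BuildInput): String = {
    compilations += 1
    s"jar(${input.sources.keys.toList.sorted.mkString(",")})"
  }

  // Reuse the cached artifact when the same input has been built before.
  def build(input: BuildInput): String =
    cache.getOrElseUpdate(digest(input), compile(input))
}
```

Building the same BuildInput twice runs the fake compiler once; changing a source or an option changes the digest and triggers a new compilation.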
In this tutorial, we are building a simple Scala application with Bazel. The complete source code is available here: https://github.com/tanishiking/bazel-tutorial-scala/tree/main/01_scala_tutorial
The project structure looks like this:
|-- WORKSPACE
`-- src
    `-- main
        `-- scala
            |-- cmd
            |   |-- BUILD.bazel
            |   `-- Runner.scala
            `-- lib
                |-- BUILD.bazel
                `-- Greeting.scala
The Bazel configuration files are WORKSPACE and BUILD.bazel files.
The WORKSPACE file declares the external dependencies (for both Bazel and the JVM). For example, in the WORKSPACE file we download rules_scala, a Bazel extension for compiling Scala.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")  # Import a rule from Bazel's "standard library"
...
http_archive(  # This rule can download and import an archived repo
    name = "io_bazel_rules_scala",  # The name that will be used to reference the repo
    sha256 = "77a3b9308a8780fff3f10cdbbe36d55164b85a48123033f5e970fdae262e8eb2",
    strip_prefix = "rules_scala-20220201",  # Only files from this directory will be unpacked and imported
    type = "zip",
    url = "https://github.com/bazelbuild/rules_scala/releases/download/20220201/rules_scala-20220201.zip",
)
For more details, see the project README.
To define the build in Bazel, we write BUILD.bazel files.
Before jumping into the BUILD.bazel files, let’s take a quick look at Scala files to build. This project consists of two packages, each containing a single Scala file.
// src/main/scala/lib/Greeting.scala
package lib

object Greeting {
  def sayHi = println("Hi!")
}

// src/main/scala/cmd/Runner.scala
package cmd

import lib.Greeting

object Runner {
  def main(args: Array[String]) = {
    Greeting.sayHi
  }
}
As you can see, lib.Greeting is a library module that provides the sayHi method, and cmd.Runner depends on lib.Greeting.
Next, let’s see how to write BUILD.bazel files to build these Scala sources.
To build lib.Greeting in this example, we put the BUILD.bazel file adjacent to Greeting.scala, and we define a build target using the scala_library rule provided by rules_scala.
A rule in Bazel is a declaration of a set of instructions for building or testing code. For example, there’s a set of rules for building Java programs, which Bazel supports natively. rules_scala provides a set of rules for building Scala programs.
scala_library compiles the given Scala sources and generates a JAR file.
# src/main/scala/lib/BUILD.bazel
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_library")

scala_library(
    # unique identifier of this target
    name = "greeting",
    # list of Scala files to build
    srcs = ["Greeting.scala"],
)
Now we have Scala sources to build and a BUILD.bazel configuration. Let’s build it using the bazel command line.
$ bazel build //src/main/scala/lib:greeting
...
INFO: Found 1 target...
Target //src/main/scala/lib:greeting up-to-date:
  bazel-bin/src/main/scala/lib/greeting.jar
INFO: Elapsed time: 0.152s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
Build succeeded 🎉, but wait, what is //src/main/scala/lib:greeting?
//src/main/scala/lib:greeting is something called a label in Bazel, and it points to the greeting in src/main/scala/lib/BUILD.bazel. In Bazel, we use a label to uniquely identify a build target.
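A label consists of three components. For example, in @myrepo//my/app/main:app_binary, @myrepo// is the repository name (it can be omitted when the target lives in the same workspace), my/app/main is the package name (the path from the workspace root to the directory containing the BUILD.bazel file), and app_binary is the target name.
In other words, //src/main/scala/lib:greeting points to a target that is in the same workspace, defined in the BUILD.bazel file located at src/main/scala/lib, and named greeting.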
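To make the three components concrete, the following Scala sketch splits a fully written label into repository, package, and target. Label and Label.parse are hypothetical helpers for illustration only, and the regular expression covers just the absolute form used in this article; Bazel's real label grammar has more cases (for example relative labels and omitted target names).

```scala
// Hypothetical model of a label: optional repository, package path, target name.
final case class Label(repo: Option[String], pkg: String, target: String)

object Label {
  // Matches "@repo//path/to/pkg:target" or "//path/to/pkg:target" only.
  private val Pattern = """(?:@([^/]+))?//([^:]*):(.+)""".r

  def parse(s: String): Option[Label] = s match {
    // An unmatched optional group is null, so Option(repo) becomes None.
    case Pattern(repo, pkg, target) => Some(Label(Option(repo), pkg, target))
    case _                          => None
  }
}
```

With this sketch, Label.parse("//src/main/scala/lib:greeting") yields a label with no repository, the package src/main/scala/lib, and the target greeting.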
Next, let’s build cmd.Runner. Because it depends on lib.Greeting, we declare a dependency between the two targets using the deps attribute.
# src/main/scala/cmd/BUILD.bazel
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_binary")

scala_binary(
    name = "runner",
    main_class = "cmd.Runner",
    srcs = ["Runner.scala"],
    deps = ["//src/main/scala/lib:greeting"],
)
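The differences from the previous example are that we use the scala_binary rule, which produces an executable; that main_class names the class containing the main method; and that deps lists the targets this target depends on.
Now, we should be able to build the application with bazel build //… but it fails!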
$ bazel build //src/main/scala/cmd:runner
ERROR: .../01_scala_tutorial/src/main/scala/cmd/BUILD.bazel:3:13:
in scala_binary rule //src/main/scala/cmd:runner:
target '//src/main/scala/lib:greeting' is not visible from
target '//src/main/scala/cmd:runner'.
Bazel has a concept of visibility. By default, all targets’ visibility is private, meaning only targets within the same package (i.e. same BUILD.bazel file) can access each other.
To make lib:greeting visible from cmd, add the visibility attribute to greeting.
 scala_library(
     name = "greeting",
     srcs = ["Greeting.scala"],
+    visibility = ["//src/main/scala/cmd:__pkg__"],
 )
//src/main/scala/cmd:__pkg__ is a visibility specification that grants access to the package //src/main/scala/cmd.
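The visibility check can be pictured with a small Scala model. This is a simplified sketch rather than Bazel's actual algorithm: VisibilitySketch and isVisible are hypothetical names, packages are written without the leading //, and only the //visibility:public, __pkg__, and __subpackages__ specifications are modeled (package groups, for instance, are ignored).

```scala
object VisibilitySketch {
  // Returns true if a target defined in targetPkg with the given visibility
  // list may be depended on from consumerPkg.
  def isVisible(targetPkg: String, visibility: List[String], consumerPkg: String): Boolean =
    // Targets in the same package can always see each other.
    consumerPkg == targetPkg || visibility.exists {
      case "//visibility:public" => true
      case spec if spec.endsWith(":__pkg__") =>
        // Grants access to exactly one package.
        spec.stripPrefix("//").stripSuffix(":__pkg__") == consumerPkg
      case spec if spec.endsWith(":__subpackages__") =>
        // Grants access to a package and everything beneath it.
        val root = spec.stripPrefix("//").stripSuffix(":__subpackages__")
        root.isEmpty || consumerPkg == root || consumerPkg.startsWith(root + "/")
      case _ => false
    }
}
```

In this model, an empty visibility list reproduces the error above (cmd cannot see lib), while adding //src/main/scala/cmd:__pkg__ makes the dependency legal.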
Now we can build the app:
$ bazel build //src/main/scala/cmd:runner
...
INFO: Found 1 target...
Target //src/main/scala/cmd:runner up-to-date:
  bazel-bin/src/main/scala/cmd/runner.jar
  bazel-bin/src/main/scala/cmd/runner
INFO: Elapsed time: 0.146s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
As you can see, the scala_binary rule generates another file named runner in addition to runner.jar. This is a wrapper script for runner.jar, and we can easily run the JAR with this script, or let Bazel build and run it in one step with bazel run.
$ ./bazel-bin/src/main/scala/cmd/runner
Hi!
$ bazel run //src/main/scala/cmd:runner
Hi!
In the examples above, we specify a target’s label and build one target, but is it possible to build multiple build targets at the same time?
The answer is yes. We can use a wildcard to select multiple targets. For example, we can build all targets in the workspace with bazel build //... (three literal dots, meaning every package below the workspace root).
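For intuition about which targets a wildcard selects, here is a simplified Scala model of target pattern matching. TargetPattern.matches is a hypothetical helper, not part of Bazel, and it only covers the recursive form (//... and //some/dir/...), the :all form, and exact labels.

```scala
object TargetPattern {
  // Returns true if the absolute label (e.g. "//a/b:c") is selected by the pattern.
  def matches(pattern: String, label: String): Boolean = {
    val bare = label.stripPrefix("//")
    val pkg  = bare.takeWhile(_ != ':') // package part of the label
    pattern.stripPrefix("//") match {
      // "//..." selects every target in every package.
      case "..." => true
      // "//dir/..." selects targets in dir and in all packages beneath it.
      case p if p.endsWith("/...") =>
        val root = p.stripSuffix("/...")
        pkg == root || pkg.startsWith(root + "/")
      // "//dir:all" selects all targets of exactly one package.
      case p if p.endsWith(":all") => pkg == p.stripSuffix(":all")
      // Anything else is treated as an exact label.
      case p => p == bare
    }
  }
}
```

Under this model, //... selects both //src/main/scala/lib:greeting and //src/main/scala/cmd:runner, whereas //src/main/scala/cmd:all selects only the latter.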
Now, we have learned the basics of Bazel by building a simple application in Scala, but how do we use third-party libraries in Bazel?
Let’s now learn how to use third-party libraries from Maven by building a simple application that parses Scala programs using scalameta and pretty prints the AST using pprint.
In this example, we will use rules_jvm_external, a standard rule set for managing external JVM dependencies.
Note: we can download jars from Maven repositories using maven_jar, a rule natively supported by Bazel. However, I recommend using rules_jvm_external because it has a number of useful features over maven_jar.
The complete code example is available here: https://github.com/tanishiking/bazel-tutorial-scala/tree/main/02_scala_maven.
This project has only one Scala program.
// src/main/scala/example/App.scala
package example

import scala.meta._

object App {
  def main(args: Array[String]) = {
    pprint.pprintln(parse(args.head))
  }

  private def parse(arg: String) = {
    arg.parse[Source].get
  }
}
To download scalameta and pprint from Maven repositories, we use rules_jvm_external. So, we have to download rules_jvm_external first.
To download rules_jvm_external, copy and paste setup statements from the release page to your WORKSPACE file, like this:
http_archive(
    name = "rules_jvm_external",
    strip_prefix = "rules_jvm_external-4.5",
    sha256 = "b17d7388feb9bfa7f2fa09031b32707df529f26c91ab9e5d909eb1676badd9a6",
    url = "https://github.com/bazelbuild/rules_jvm_external/archive/refs/tags/4.5.zip",
)
...
Then list all the dependencies in the maven_install statement, which is also in the WORKSPACE file.
load("@rules_jvm_external//:defs.bzl", "maven_install")

maven_install(
    artifacts = [
        "org.scalameta:scalameta_2.13:4.5.13",
        "com.lihaoyi:pprint_2.13:0.7.3",
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
)
Now, Bazel can download the dependencies, but how can we use them?
To use the downloaded dependencies, we need to add them to the deps attribute of the build rules. rules_jvm_external automatically generates targets for the libraries under the @maven repository, in the following format:
> The default label syntax for an artifact foo.bar:baz-qux:1.2.3 is @maven//:foo_bar_baz_qux
(Source: https://github.com/bazelbuild/rules_jvm_external#usage)
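The naming rule quoted above can be approximated in a few lines of Scala. This is a sketch under assumptions, not the rules_jvm_external implementation: MavenLabel.of is a hypothetical helper, it assumes every non-alphanumeric character in the group and artifact IDs becomes an underscore, and it only handles plain group:artifact:version coordinates.

```scala
object MavenLabel {
  // Derives the default target label for a "group:artifact:version" coordinate.
  // The repo parameter defaults to "maven", the repository name used in this article.
  def of(coordinate: String, repo: String = "maven"): Option[String] =
    coordinate.split(":") match {
      case Array(group, artifact, _) =>
        // The version is dropped; separators such as '.' and '-' become '_'.
        val name = s"${group}_$artifact".replaceAll("[^A-Za-z0-9]", "_")
        Some(s"@$repo//:$name")
      case _ => None // other coordinate shapes are out of scope for this sketch
    }
}
```

For the documentation's own example, MavenLabel.of("foo.bar:baz-qux:1.2.3") gives @maven//:foo_bar_baz_qux.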
Therefore, we can refer to com.lihaoyi:pprint_2.13:0.7.3 with the label @maven//:com_lihaoyi_pprint_2_13. So, put the following BUILD.bazel file adjacent to App.scala.
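# src/main/scala/example/BUILD.bazel
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_binary")

scala_binary(
    name = "app",
    main_class = "example.App",
    srcs = ["App.scala"],
    # third-party libraries generated by maven_install
    deps = [
        "@maven//:com_lihaoyi_pprint_2_13",
        "@maven//:org_scalameta_scalameta_2_13",
    ],
)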
And build it!
$ bazel build //src/main/scala/example:app
...
INFO: Found 1 target...
Target //src/main/scala/example:app up-to-date:
  bazel-bin/src/main/scala/example/app.jar
  bazel-bin/src/main/scala/example/app
INFO: Elapsed time: 0.165s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
...
$ bazel-bin/src/main/scala/example/app "object main { println(1) }"
Source(
  stats = List(
    Defn.Object(
      ...
    )
  )
)
Nice! 🎉 So, that’s how we use external JVM dependencies with rules_jvm_external.
In this article, we have shown how Bazel enables fast builds in large repositories and introduced the basic concepts and usage of Bazel by building a simple Scala application.
As you saw, Bazel requires users to manage a lot more build configuration than other build tools such as sbt and Maven. However, this is an acceptable tradeoff for very large projects, considering the build speed at scale that Bazel’s advantages, such as reproducible builds and remote caching, make possible.
I hope this article helps you take the first steps in getting started with Bazel.
If you want to learn more about Bazel, I recommend reading through the official Bazel getting started guide and getting your hands dirty with your first Bazel project. There are many more interesting topics in Bazel to learn about and explore!