How to: Generic Derivation
A short story from reflection to shapeless
Coming from a Java background, I saw reflection as a useful, albeit highly advanced, tool in daily work. Since it is mostly used in libraries and frameworks, understanding its possibilities and usages was usually enough for me.
While advancing my career and moving to the Scala ecosystem, I learned about macros as a kind of compile-time reflection, used mostly by libraries. Still, since they are a highly advanced language feature and not very ergonomic for daily programming, I felt more comfortable in the world of regular code.
Then, while developing a feature for the client I was working for, I felt the code I had written contained so much boilerplate that there had to be a way to shorten it (unfortunately I cannot remember exactly what it was about, but it definitely had to do with some kind of mapping between data and their corresponding case classes…). In Java I would have tried my hand at reflection to extract and generate POJOs, which could be done in Scala as well. But I’ve always felt reflection isn’t the right tool for custom-written production code: it is a slow, purely runtime process that the compiler never optimizes. I asked a senior colleague whether using a macro to extract the field names and values would be a way to solve this, since it would bring me some compile-time safety. He then introduced me to the shapeless library, and the rabbit hole opened up.
Heterogeneous lists and Tuples
One way to look at case classes (product-types) would be to list their fields and the types of those fields.
A case class in Scala is nothing more than a named tuple of N elements (N-arity).
However, for regular coding we’re used to homogeneous[1] collections: a List[String], Set[Int], Array[Byte], etc.
What would that look like for a regular case class?
case class Person(name: String, age: Int, address: Address)
case class Address(street: String, number: Int)
Person("Chiel", 36, Address("my-street", 1))
Person can be described by a heterogeneous list of its field types: [String, Int, Address]. In the same manner, Address could be described by a heterogeneous list [String, Int].
The shapeless library uses a type HList to describe case classes.
An HList is, like a regular functional List, a recursive structure of a head H and a tail HList, always ending in the empty list HNil.
The shapeless representation of Person is:[4]
String :: Int :: Address :: HNil
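To make the recursive head-and-tail structure concrete, here is a minimal hand-rolled heterogeneous list. It is only a sketch for illustration and not how shapeless itself is implemented: the names HCons and the `::` type alias are defined locally here and are assumptions made for this example.

```scala
// A minimal, hand-rolled heterogeneous list (illustration only, not shapeless itself).
sealed trait HList
final case class HCons[+H, +T <: HList](head: H, tail: T) extends HList
sealed trait HNil extends HList
case object HNil extends HNil

// Infix alias: type operators ending in ':' associate to the right,
// so String :: Int :: HNil reads as HCons[String, HCons[Int, HNil]].
type ::[H, T <: HList] = HCons[H, T]

// The type records every element type, in order.
val address: String :: Int :: HNil = HCons("my-street", HCons(1, HNil))

// Each position keeps its own static type.
val street: String = address.head
val number: Int = address.tail.head
```

Unlike a List[Any], the compiler knows that the first element is a String and the second an Int, which is what makes such a list usable as a representation of a case class.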
Scala 3 generic programming
With Scala 3 released, we no longer need shapeless to do this: generic programming is available out of the box!
It enables us to represent every case class with Tuples.
Person would be (String, Int, Address), but Tuples can also be defined as recursive structures, like a functional list:
// These two types are equivalent; this line only compiles because they are
summon[(String *: Int *: Address *: EmptyTuple) =:= (String, Int, Address)]
All recursive structures need to end in something[5], and EmptyTuple is analogous to Nil for lists.
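As a small sketch of this correspondence, the snippet below converts a Person into its tuple representation and builds the same tuple by hand with *:. It relies on Tuple.fromProductTyped from the Scala 3 standard library; the value names are chosen just for this example.

```scala
case class Address(street: String, number: Int)
case class Person(name: String, age: Int, address: Address)

val person = Person("Chiel", 36, Address("my-street", 1))

// fromProductTyped keeps the field types in the type of the resulting tuple.
val asTuple: (String, Int, Address) = Tuple.fromProductTyped(person)

// The same tuple, built recursively from a head and a tail ending in EmptyTuple.
val built: (String, Int, Address) =
  "Chiel" *: 36 *: Address("my-street", 1) *: EmptyTuple

// head and tail behave like they do on a functional list, but stay fully typed.
val first: String = asTuple.head
val rest: (Int, Address) = asTuple.tail
```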
Generic derivation: a simple random Generator
The simplest way to get an example up and running is probably a random case class Generator:
- no need to access field names
- no need for an instance that should be inspected for its values
A generator for Person would first generate a random String, then a random Int and last a random Address, and then call the constructor of Person with those generated values.[6]
The basic trait for a Generator is:
trait Generator[A] {
  def generate: A
}
Let’s first provide instances for the basic Generators.
The generator for a random Int:
import scala.util.Random

given int: Generator[Int] = new Generator {
  override def generate: Int = Random.nextInt()
}
given makes the Generator available in scope, so that a later using clause can find it.
We will need this later.
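The interplay between given and using can be sketched in isolation. The example below is an assumption-laden illustration: it uses a fixed-value generator (fixedInt) and a helper method (twice), both hypothetical names that are not part of the generator built in this post, so that the result is predictable.

```scala
trait Generator[A] {
  def generate: A
}

// A fixed-value instance instead of a random one, so the result is predictable.
given fixedInt: Generator[Int] = new Generator[Int] {
  override def generate: Int = 42
}

// `using` asks the compiler to find a matching given in scope.
def twice[A](using gen: Generator[A]): (A, A) = (gen.generate, gen.generate)

// The compiler fills in fixedInt here.
val pair: (Int, Int) = twice[Int]

// The same call with the instance passed explicitly; note the `using` keyword.
val explicitPair: (Int, Int) = twice(using fixedInt)
```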
The generator for String picks a random word from "Lorem ipsum …":
// full lorem ipsum text inserted here
// split on spaces, dropping an optional trailing '.' or ','
val lorumWords = "Lorem ipsum dolor .... ".split("[.,]? ")
given string: Generator[String] = new Generator {
  override def generate: String = lorumWords(Random.between(0, lorumWords.length))
}
To generate an Address we need a generator of its Tupled values: String *: Int *: EmptyTuple.
The last generator for the simple types is the one for EmptyTuple. It is also the simplest, since EmptyTuple is a singleton!
given empty: Generator[EmptyTuple] = new Generator {
  override def generate: EmptyTuple = EmptyTuple
}
Now we can build a Generator for the Tuple representation of any case class that has fields of type Int or String.
After that Generator is done, our code will automatically use it for other fields that are case classes themselves, since it is generic!
As a Tuple is a recursive structure of H *: T where T is the rest of the Tuple, we can define a Generator for any Tuple, when there is a Generator for H and a Generator for T available:
given tuple[H, T <: Tuple](using hGen: Generator[H], tGen: Generator[T]): Generator[H *: T] = new Generator {
  override def generate: H *: T = hGen.generate *: tGen.generate
}
// For a representation of Address we could invoke this manually.
// Arguments for a using clause must be passed with the `using` keyword.
val generated: String *: Int *: EmptyTuple = tuple(using string, tuple(using int, empty)).generate
Finally, a Generator of a case class needs to be able to construct one from its representation, using a Generator of its representation:
import scala.deriving.Mirror

given product[A <: Product](using m: Mirror.ProductOf[A], ts: Generator[m.MirroredElemTypes]): Generator[A] = new Generator {
  override def generate: A = {
    // generate the tuple representation, then let the Mirror build an A from it
    val representation = ts.generate
    m.fromProduct(representation)
  }
}
Let’s break this down:
- A <: Product: a case class is a Product in Scala 3, and the A we will generate should therefore be a subtype of Product.
- using m: Mirror.ProductOf[A]: a Mirror is the way to introspect at compile time. It extracts the representation of our Product under construction: A.
- ts: Generator[m.MirroredElemTypes]: we also need a given Generator for the representation of A. This is the given tuple we defined above! We’re able to extract the representation from m: Mirror.ProductOf by m.MirroredElemTypes.[7][8]
- m.fromProduct: calls the constructor of A with the representation we generated.
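What a Mirror offers can also be inspected directly. The following is a small sketch for Address only, assuming nothing beyond scala.deriving.Mirror from the standard library; the value names are chosen for this example.

```scala
import scala.deriving.Mirror

case class Address(street: String, number: Int)

// The compiler synthesizes a Mirror for every case class.
val mirror = summon[Mirror.ProductOf[Address]]

// Compiles only if the element types of Address really are (String, Int).
val evidence = summon[mirror.MirroredElemTypes =:= (String, Int)]

// fromProduct rebuilds the case class from its tuple representation.
val address: Address = mirror.fromProduct(("my-street", 1))
```

Note that fromProduct accepts any Product, so the compiler does not check that the tuple passed in has the right shape. In the generator this is safe because the tuple is produced by a Generator[m.MirroredElemTypes].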
Start Generating
And that’s our basic case class generator done. For convenience, we’ll add a method that implicitly uses the generators we defined above.
When invoked for Person, it will use string, int and product to construct values for name, age and address. product for address uses string and int again to generate its values.
def generate[A](using gen: Generator[A]): A = gen.generate
//Generate a person like this
val person: Person = generate[Person]
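For reference, the pieces from this post can be assembled into one file roughly as follows. This is a sketch under a few assumptions: the short words array is a hypothetical stand-in for the full lorem ipsum text, the type arguments of the anonymous Generator instances are written out explicitly, and the demo entry point is only there to make the file runnable.

```scala
import scala.deriving.Mirror
import scala.util.Random

trait Generator[A] {
  def generate: A
}

// Hypothetical short word list standing in for the full lorem ipsum text.
val words: Array[String] = Array("lorem", "ipsum", "dolor", "sit", "amet")

given int: Generator[Int] = new Generator[Int] {
  override def generate: Int = Random.nextInt()
}
given string: Generator[String] = new Generator[String] {
  override def generate: String = words(Random.nextInt(words.length))
}
given empty: Generator[EmptyTuple] = new Generator[EmptyTuple] {
  override def generate: EmptyTuple = EmptyTuple
}

// A tuple is generated head first, then recursively its tail.
given tuple[H, T <: Tuple](using hGen: Generator[H], tGen: Generator[T]): Generator[H *: T] =
  new Generator[H *: T] {
    override def generate: H *: T = hGen.generate *: tGen.generate
  }

// A case class is generated via its tuple representation.
given product[A <: Product](using m: Mirror.ProductOf[A], ts: Generator[m.MirroredElemTypes]): Generator[A] =
  new Generator[A] {
    override def generate: A = m.fromProduct(ts.generate)
  }

def generate[A](using gen: Generator[A]): A = gen.generate

case class Address(street: String, number: Int)
case class Person(name: String, age: Int, address: Address)

// Prints a randomly generated Person; the output differs on every run.
@main def demo(): Unit = println(generate[Person])
```

Adding a new field type, for example Boolean, should then only require one more given instance for that type.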
Resources
The blogs and videos below greatly helped me understand the topic of generic programming:
List[A] in functional programming is recursively composed of a head and a tail, and could be written as A :: A :: .. :: Nil, with Nil being the empty List.