Weeding your micro service landscape
When you look at how a given organization develops software over time, it’s not uncommon to spot evolutionary patterns that mimic what we see in nature. Particularly where micro services are involved, you can often deduce which services came first and which were developed later by looking at subtle variations in project structure, tools, dependencies, build pipelines and deployment descriptors. Later services copy elements from the initial template services and apply variations to fit their needs; successful variations are used again in future services, or even applied back to the original template services.
Variations from service to service are essential to maintaining a vibrant engineering culture, with room for experimentation within appropriate boundaries. Over time, however, all the subtle variations can make it harder to reason across services, particularly when you want to apply broader changes.
In this blog post I’ll outline how I harmonize such diverse micro service landscapes, with various samples of the scripts I use to reduce the accidental complexity of maintaining all those variations.
The goal
One of the formative articles for how I approach any assignment has for quite some time been this post from Twitter’s Engineering Effectiveness group: Let a 1,000 flowers bloom. As outlined there, standardization gives even minor improvements maximum impact, allowing all developers to be more productive with a bit of effort. Maintaining a certain level of standardization makes it easier to reason across services, apply common patterns and solutions, and onboard new developers.
Typically when I join an organization there have already been a good few years of development, with multiple teams maintaining dozens of micro services, showing all the subtle variations commonly seen. My goal in such instances is to frequently apply small, automated, incremental improvements that raise the level of standardization across the organization. Accidental, undesirable variations in particular, such as outdated dependencies and tools, can benefit from even crude automation to bring services back in line.
The goal need not be to create perfect scripts that can be maintained and extended for years and years. Small targeted scripts that lift services up to the current desired state are often enough, and can be discarded once they’ve run.
Low level automation
The types of changes I apply most often are not that complicated: simple substitutions, more or less context aware, or minor commands run per micro service source code repository. For precisely those changes you can go a long way with standard command line tools such as awk, sed and grep, occasionally supplemented with more specific tools such as jq and xmlstarlet.
Using only such command line tools I’ve written quite a few scripts that each reduce a certain aspect of variation between services.
Shell scripting can be daunting at first; expect to look up a lot of commands and quirks as you go along. Luckily there are excellent online resources nowadays, and even tools like TLDR.sh available directly from the command line; try it! Also be sure to check out ShellCheck, and integrate it with your editor for quick warnings and suggestions.
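Both tools can be tried from your terminal right away; the script name below is merely a placeholder for one of your own:
# Concise community-maintained usage examples for a command
tldr sed
# Lint a transformation script for common shell pitfalls
shellcheck my-transformation.sh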
The basic steps are always the same:
- select a subset of micro service repositories that meet certain criteria, such as the same build tool, pipeline or deployment;
- apply some transformation(s) to convert repositories from an old state to the new desired state;
- start a merge request with the new state, for optional review, or directly merge once the build pipeline succeeds.
Selecting repositories
Selection for me is typically a two-step process: I start with a collection of services either manually curated, or (periodically) extracted from GitLab or GitHub. You will likely want to limit your initial collection to actively maintained repositories, typically using a single programming language.
Here’s a sample script that allows you to extract a list of repositories per programming language from GitLab.
#!/bin/bash
set -e
# https://docs.gitlab.com/ee/api/projects.html
# --data-urlencode encodes special characters, such as the # in C#
curl --header "Private-Token: ${GITLAB_ACCESS_TOKEN}" \
--silent --get \
--data-urlencode "with_programming_language=$1" \
--data "simple=true" \
--data "archived=false" \
--data "order_by=name" \
--data "sort=asc" \
--data "last_activity_after=2021-01-01T00:00:00Z" \
--data "per_page=100" \
--data "page=1" \
https://your.gitlab.host/api/v4/projects | \
jq -r '.[] | "\(.path_with_namespace)"'
Similar scripts can be created for GitHub, and tweaked to your needs to restrict results to a particular organization or namespace.
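As a rough sketch of such a GitHub variant, where the organization name and token environment variable are assumptions to adapt:
# https://docs.github.com/en/rest/repos/repos#list-organization-repositories
curl --header "Authorization: Bearer ${GITHUB_ACCESS_TOKEN}" \
--silent \
"https://api.github.com/orgs/your-org/repos?per_page=100&page=1" | \
jq -r '.[] | select(.archived == false and .language == "Java") | .full_name'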
Note how you can easily pipe the output into sort, head or grep to make quick sub-selections already.
The output will contain matching repositories, each on a new line, similar to this:
libraries/authentication
services/accounts
services/lending
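For example, assuming the script above is saved as gitlab-projects.sh, you can narrow the selection down before storing it:
# Keep only active Java repositories under services/, saved for the next steps
./gitlab-projects.sh 'Java' | grep '^services/' | sort > repos.txt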
For the next step in repository selection we need to check out the initial set of repositories. Having a local copy allows you to further restrict your selection based on which files are present or missing, or whether they contain certain elements. You can then decide to skip transformation for repositories that do not match the old state. This final step of selection is implemented as part of the transformation script outlined below.
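A minimal local filter sketch, assuming clones under repos/ and a list in repos.txt as above:
# Keep only repositories that contain a Gradle build file
while read -r p
do
[ -f "repos/$p/build.gradle" ] && echo "$p"
done < repos.txt > gradle-repos.txt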
Transformation script outline
All my transformation scripts use the same outline:
- take an argument file or standard input pipe (1),
- clone or update each repository from input (2),
- start a new branch for your transformation (3),
- test the repository matches the requirements for transformation (4),
- apply the transformation to affected files (5),
- commit if there are any resulting changes (6),
- create a merge request containing the changes (7).
You can see how the above steps are implemented in the below sample script, with matching step numbers in the comments:
#!/bin/bash
set -e
# TODO Name the branches created in each repository
branch="insert-branch-name-here"
# Regulate script output folder and file
folder=$(pwd)
gitpush=$folder/git-push.out
truncate -s 0 "$gitpush"
mkdir -p repos/
# Loop over each repository
while read -r p
do
cd "$folder";
echo -e "\n\n$p"
# Clean/clone local repository (2)
if [ -d "repos/$p" ]
then
cd "repos/$p"
git reset --hard
git checkout master || git checkout main
git pull
git branch -D $branch || true
else
cd "repos"
git clone "git@your.gitlab.host:$p.git" "$p"
cd "$p"
fi
# Start a new branch (3)
git checkout -b $branch
# TODO Insert match conditionals here, which continue onto the next repository if failed (4)
# TODO Insert transformations here to go from old state to new state (5)
# Only commit if this resulted in changes (6)
if [[ ! $(git status --porcelain) ]]
then
continue;
fi
# Commit all outstanding changes
git commit -a -m "TODO Insert commit message here"
# Show differences compared to master/main
git --no-pager diff origin/master || git --no-pager diff origin/main
# Push branch with push options for merge requests (7)
# https://docs.gitlab.com/ee/user/project/push_options.html
git push --force -u origin $branch \
-o merge_request.create \
-o merge_request.merge_when_pipeline_succeeds \
-o merge_request.remove_source_branch \
2>&1 | tee -a "${gitpush}"
# Loop over either first command line argument file, or piped input
done < "${1:-/dev/stdin}" # (1)
# Open all new pull requests in Chrome (optional)
grep 'merge_request' "${gitpush}" | awk '{print $2}' | xargs google-chrome
There are four TODO items in the above sample; these are the elements that change for each transformation.
Running such a script will create merge requests, which are set to merge automatically only if the build pipeline completes successfully, through GitLab Push Options. This automatic merging is of course optional, and can be disabled when you want to inspect the changes per repository more closely. Any fixes required can be applied to the same merge request as usual, allowing you to successfully conclude the transformation of all repositories. If you’re still unsure of a local script change, you can of course comment out the git push command and inspect the differences locally.
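If you prefer to review each change by hand, a variant of the push in step (7) can simply omit the auto-merge option:
# Create the merge request, but leave merging as a manual review step
git push --force -u origin $branch \
-o merge_request.create \
-o merge_request.remove_source_branch \
2>&1 | tee -a "${gitpush}"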
With the outline for each transformation script in place, we can begin to apply transformations to a selection of repositories.
Example transformations
Below you’ll find a wide sampling of transformations that can be applied to a micro service landscape. Most examples are geared towards Java development, but the same approach can be applied to any language or platform.
Upgrade dependencies
One of the easiest automations you can apply to make repositories more alike is to ensure they at the very least all use the latest versions of their respective dependencies. This ensures there are no more subtle version differences for developers to keep in mind, and highlights any incompatibilities as soon as possible. ThoughtWorks highlighted this in their 2019 & 2020 Technology Radar as Dependency drift, with special mention of tools such as Dependabot and Snyk to detect and resolve discrepancies.
Dependabot is available for GitHub, and can also be made to run on GitLab. Low level automation is perfectly capable of detecting which build tools a given repository uses, to create an initial Dependabot configuration file per repository. See my previous blog post for details on bulk configuration of Dependabot.
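A rough sketch of such detection, assuming GitHub-hosted repositories and the default .github/dependabot.yml location:
# Detect the build tool and generate a minimal Dependabot configuration
ecosystem=''
[ -f pom.xml ] && ecosystem='maven'
[ -f build.gradle ] && ecosystem='gradle'
if [ -n "$ecosystem" ] && [ ! -f .github/dependabot.yml ]
then
mkdir -p .github
cat > .github/dependabot.yml << EOF
version: 2
updates:
  - package-ecosystem: "$ecosystem"
    directory: "/"
    schedule:
      interval: "weekly"
EOF
fi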
Upgrade build tools
Build tools such as Gradle and Maven are frequently added to repositories in the form of wrappers such as gradlew and mvnw, which allow developers and build pipelines to reliably run the same version. As the tool version is checked into the source code repository, it’s typically only updated sporadically, which means you can miss out on performance improvements, or fail to spot deprecations in newer versions. This can cause a backlog of issues when you then want to upgrade to, say, Java 16, which requires Gradle 7.
Below you’ll find instructions on how to upgrade the Gradle and Maven wrappers to their latest versions.
How to upgrade Gradle
Given an initial list of repositories, we want to detect which repositories contain the Gradle wrapper, and if present upgrade the wrapper to the latest version.
# Only transform when gradle wrapper is present
if [ ! -f gradlew ]
then
continue
fi
# Upgrade gradle wrapper, which consumes stdin if not fed /dev/null
./gradlew wrapper --gradle-version 6.8.3 < /dev/null
As the comments show, there’s a slight quirk when this runs within a script that loops over piped input: gradlew itself would consume the remaining input, but it’s easily worked around by explicitly feeding it /dev/null.
Drop JCenter
As you upgrade to Gradle 7 and beyond, you might encounter the jcenter() deprecation. As this can cause problems with the long term stability of your builds, we can proactively remove JCenter from our repositories.
sed -i \
-e "/maven { url \"https:\/\/jcenter.bintray.com\" }/d" \
-e "/jcenter()/d" \
build.gradle
Replace compile, testCompile & runtime
Similarly, when you upgrade to Gradle 7 you might encounter Could not find method testCompile() due to the removal of the compile and runtime configurations. Another few quick replacement patterns should help clear those out.
sed -i \
-e 's/compile /implementation /g' \
-e 's/testCompile /testImplementation /g' \
-e 's/runtime /runtimeOnly /g' \
build.gradle
How to upgrade Maven
We again want to detect any repositories that contain the Maven wrapper, and if present, upgrade the wrapper to the latest version.
# Only transform when the Maven wrapper is present
if [ ! -f mvnw ]
then
continue
fi
# Upgrade Maven wrapper
./mvnw -N io.takari:maven:0.7.7:wrapper -Dmaven=3.8.1
# Replace http: with https: for Maven 3.8.1
sed -i 's|<url>http://|<url>https://|g' pom.xml
Replace mvn install with verify
Frequently Maven projects unnecessarily call the clean or install goals as part of the build. These goals slow down the build, and are rarely needed; the verify goal is usually sufficient to replace install or even deploy. With another small substitution we can remove and replace these goals in our CI pipelines.
sed -i -r \
-e 's/(mvnw.*)(clean)/\1/g' \
-e 's/(mvnw.*)(package|install)/\1verify/g' \
Jenkinsfile # or .gitlab-ci.yml or GitHub Actions or CircleCI
Upgrade runtimes
Runtimes are another source of frequent accidental variation, with repositories adopting a mix of Java versions as needed. With the increased pace of Java releases, and the dropped support for anything but the latest and LTS versions, you will likely want to keep up with the latest supported version.
Low level replacements, as sketched after this list, will allow you to:
- increase the Gradle sourceCompatibility or Maven equivalent,
- upgrade to suitable JaCoCo and Checkstyle versions,
- swap out your Docker base image if not already managed through Dependabot,
- replace or drop JVM options as needed.
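A hedged sketch of two such replacements, where the Java version and base image tags are assumptions to adapt to your context:
# Bump the Java version in a Gradle build and its Dockerfile
sed -i 's/sourceCompatibility = .*/sourceCompatibility = JavaVersion.VERSION_17/' build.gradle
sed -i 's|^FROM eclipse-temurin:11|FROM eclipse-temurin:17|' Dockerfile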
This way you can eliminate uncertainty around supported language features, and ensure you’re running all applications only on the latest supported version.
Upgrade test libraries
Test libraries are another common source of outdated dependencies. Upgrading these to their latest versions rarely gets priority, but luckily simple substitutions get us most of the way there. And since tests are executed as part of our build, we can safely apply these merge requests automatically if the build succeeds.
How to upgrade from JUnit4 to JUnit5 / Jupiter
JUnit 5 offers various advantages over JUnit 4, such as leveraging features from Java 8, organizing your tests hierarchically, and using more than one extension at a time. It also includes some breaking changes though, particularly around asserting exceptions, replacing test rules with extensions, and the constructs needed for Hamcrest assertions. The bulk of the changes can again be achieved with simple substitutions, which for some repositories might already get you all the way there.
The following script replaces JUnit 4 imports and constructs with their JUnit 5 equivalents:
# Skip if not using JUnit 4 Test
if ! grep -q -r 'org.junit.Test' .
then
echo 'org.junit.Test not found'
continue
fi
# Find any Java files
for file in $(find . -type f -name '*.java')
do
# Clean up windows line endings
fromdos $file;
# Remove public modifiers from classes and test methods
if grep -q 'org.junit.Test' $file ; then
sed -i "s|public class|class|g" $file
sed -i "s|public void|void|g" $file
fi
# Replace verbose Assert.assert with shorthand
sed -i -r "s|(\s+)Assert.assert|\1assert|g" $file
sed -i "s|import org.junit.Assert;|import static org.junit.jupiter.api.Assertions.*;|" $file
# Remove RunWith if used for Spring
if grep -q 'RunWith(Spring' $file ; then
sed -i -r '/import org\.junit\.runner\.RunWith;/d' $file
fi
# Replace boolean equals assertions
sed -i -r "s|assertEquals\(true,\s*|assertTrue(|g" $file
sed -i -r "s|assertEquals\(false,\s*|assertFalse(|g" $file
# Remove @RunWith(SpringRunner.class) and @RunWith(SpringJUnit4ClassRunner.class)
sed -i -r '/\@RunWith\(SpringRunner\.class\)/d' $file
sed -i -r '/\@RunWith\(SpringJUnit4ClassRunner\.class\)/d' $file
sed -i -r '/import org\.springframework\.test\.context\.junit4\.SpringRunner;/d' $file
sed -i -r '/import org\.springframework\.test\.context\.junit4\.SpringJUnit4ClassRunner;/d' $file
# Based on: https://github.com/sbrannen/junit-converters/blob/master/junit4ToJUnitJupiter.zsh
# Replace imports of TestCase and Assert with Assertions.
sed -i -r 's/junit.framework.TestCase/org.junit.jupiter.api.Assertions/' $file
sed -i -r 's/org.junit.Assert/org.junit.jupiter.api.Assertions/' $file
# Replace JUnit 4's @Test, @BeforeClass, @AfterClass, @Before,
# @After, and @Ignore annotations with their JUnit Jupiter
# counterparts.
sed -i -r 's/org\.junit\.Test/org.junit.jupiter.api.Test/g' $file
sed -i -r 's/org\.junit\.BeforeClass/org.junit.jupiter.api.BeforeAll/g' $file
sed -i -r 's/org\.junit\.AfterClass/org.junit.jupiter.api.AfterAll/g' $file
sed -i -r 's/org\.junit\.Before/org.junit.jupiter.api.BeforeEach/g' $file
sed -i -r 's/org\.junit\.After/org.junit.jupiter.api.AfterEach/g' $file
sed -i -r 's/org\.junit\.Ignore/org.junit.jupiter.api.Disabled/g' $file
sed -i -r 's/\@BeforeClass/\@BeforeAll/g' $file
sed -i -r 's/\@AfterClass/\@AfterAll/g' $file
sed -i -r 's/\@Before$/\@BeforeEach/g' $file
sed -i -r 's/\@After$/\@AfterEach/g' $file
sed -i -r 's/\@Ignore/\@Disabled/g' $file
# Replace @RunWith(MockitoJUnitRunner.class) with @ExtendWith(MockitoExtension.class).
sed -i -r 's/org\.junit\.runner\.RunWith/org.junit.jupiter.api.extension.ExtendWith/g' $file
sed -i -r 's/\@RunWith/\@ExtendWith/g' $file
sed -i -r 's/org\.mockito\.junit\.MockitoJUnitRunner/org.mockito.junit.jupiter.MockitoExtension/g' $file
sed -i -r 's/MockitoJUnitRunner/MockitoExtension/g' $file
git add $file
done
When run across a whole set of repositories you’ll quickly find, based on failing builds, which ones need further changes, and you can continue from there to finish the migration. Using only the latest version of JUnit across all your repositories again lowers the cognitive load when switching between services.
How to upgrade to Mockito 2+
Mockito 2 introduced some breaking changes that could hinder direct adoption of versions 2 and 3. Not upgrading means you’re missing out on features such as Java 8 support, unnecessary stub detection and incorrect argument detection. Luckily, the team has provided a few helpful regular expressions to aid migration. Converted to sed’s backreference syntax, these can once again easily be applied through a script to facilitate a quick migration.
find . -name '*.java' \
-exec sed -i -r \
-e 's/import org.mockito.runners.MockitoJUnitRunner;/import org.mockito.junit.MockitoJUnitRunner;/g' \
-e 's/import (.*) org.mockito.(Mockito|Matchers).argThat;/import \1 org.mockito.hamcrest.MockitoHamcrest.argThat;/g' \
-e 's/(\w+).getArgumentAt[(]([a-zA-Z0-9]*),\s*(.*?).class[)]\./((\3)\1.getArgument(\2))./g' \
-e 's/.getArgumentAt[(]([a-zA-Z0-9]*),.*?[)]/.getArgument(\1)/g' \
{} \;
How to upgrade to Cucumber-JVM 5+
Cucumber-JVM 5 updated the package structure, forcing you to change the class imports during migration. Adopting the new imports can mostly be achieved through use of the following:
find . -name '*.java' \
-exec sed -i 's/import cucumber.api/import io.cucumber/g' {} \; \
-exec sed -i 's/cucumber.CucumberOptions/cucumber.junit.CucumberOptions/g' {} \;
Spring Boot
Spring Boot is another area where I frequently see various services use different versions, forcing developers to take this into account, in particular for non-functional requirements such as metrics. Upgrading between versions sometimes forces you to adopt moved properties, which can be achieved with common replacement patterns in sed.
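As a sketch of one such moved property, taking the rename of server.max-http-header-size in newer Spring Boot versions as an assumed example:
# Adopt the renamed property wherever the old name is still used
sed -i 's/server.max-http-header-size/server.max-http-request-header-size/g' src/main/resources/application.properties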
Enable Spring Boot graceful shutdown
Occasionally there are new features not enabled by default, which nonetheless make sense to adopt widely, such as graceful application shutdown. The following script inserts the property to enable graceful shutdown into application.properties and application.yml if not already present.
# Only apply to projects using Spring Boot
if ! grep -q spring-boot build.gradle # or pom.xml
then
continue
fi
# Insert into application.properties
if [ -f src/main/resources/application.properties ] ; then
if grep -q shutdown src/main/resources/application.properties ; then
continue
fi
if grep -q '^server\.' src/main/resources/application.properties ; then
sed -i '0,/^server./s/^server\./server.shutdown=graceful\nserver./' src/main/resources/application.properties
else
printf '\nserver.shutdown=graceful\n' >> src/main/resources/application.properties
fi
fi
# Insert into application.yml
if [ -f src/main/resources/application.yml ] ; then
yq -i eval '.server.shutdown="graceful"' src/main/resources/application.yml
fi
You’ll find properties files are easier to edit with crude direct substitutions than YAML files. When editing YAML files it helps to look into dedicated tools such as yq.
Add application name and metric tag
Adding your application name consistently can help with service discovery, as well as with adding consistent metric tags. A simple script can check whether the application name is already defined, and if not, add it using the repository name.
if [ -f src/main/resources/application.properties ]
then
# Prepend application name if not already present
if ! grep -q 'spring.application.name' src/main/resources/application.properties
then
sed -i "1s;^;spring.application.name=$(basename $p)\n;" src/main/resources/application.properties
fi
# Append metrics tags if not already present
if ! grep -q 'management.metrics.tags' src/main/resources/application.properties
then
if grep -q '^management\.' src/main/resources/application.properties ; then
sed -i '0,/^management\./s/^management\./management.metrics.tags.application=${spring.application.name}\nmanagement./' src/main/resources/application.properties
else
echo -e '\nmanagement.metrics.tags.application=${spring.application.name}\n' >> src/main/resources/application.properties
fi
fi
fi
# Insert into application.yml with YQ; no-op if already present
if [ -f src/main/resources/application.yml ]
then
if ! $(yq eval '.spring.application | has("name")' src/main/resources/application.yml)
then
yq -i eval ".spring.application.name=\"$(basename $p)\"" src/main/resources/application.yml
fi
if ! $(yq eval '.management.metrics | has("tags")' src/main/resources/application.yml)
then
yq -i eval '.management.metrics.tags.application="${spring.application.name}"' src/main/resources/application.yml
fi
fi
Remove Spring Boot Maven plugin
When you switch to layered Docker images created through Spring Boot 2.3+ buildpacks, you might find you no longer need the Spring Boot Maven plugin in your pom.xml. Here's a sample script using xmlstarlet to remove the plugin by name, as well as any empty elements it might leave behind.
# Remove spring-boot-maven-plugin
xmlstarlet ed -P -L -N x="http://maven.apache.org/POM/4.0.0" -d '//x:plugin/x:artifactId[text()="spring-boot-maven-plugin"]/..' pom.xml
# Clean up now empty build/plugins tags
xmlstarlet ed -P -L -N x="http://maven.apache.org/POM/4.0.0" -d '//x:plugins[not(./*) and (not(./text()) or normalize-space(./text())="")]' pom.xml
xmlstarlet ed -P -L -N x="http://maven.apache.org/POM/4.0.0" -d '//x:build[not(./*) and (not(./text()) or normalize-space(./text())="")]' pom.xml
# Remove empty lines
sed -i '/^\s*$/d' pom.xml
Kubernetes
Deployments are another area that frequently, and often needlessly, vary slightly from project to project. In particular when you have been using Kubernetes for a while, you might still have a few references to beta versions in your descriptors. Any such older v1beta1 references can be cleaned up fairly easily:
for file in $(find k8s -type f -name '*.yaml')
do
# Deployments
sed -i -r \
-e "s|extensions/v1beta1|apps/v1|g" \
-e "s|apps/v1beta1|apps/v1|g" \
-e "s|batch/v1beta1|batch/v1|g" \
$file
done
Clearing out old references ensures you remain compatible with newer Kubernetes releases, before a cluster migration forces your hand. Similar replacements can be applied when you want to add Prometheus scraping, service ports, or update the initial health check delay to reduce deployment delays.
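As one final hedged sketch, here tuning an assumed initial probe delay across all descriptors:
for file in $(find k8s -type f -name '*.yaml')
do
# Lower the initial health check delay to speed up rollouts
sed -i -r 's/(initialDelaySeconds:) [0-9]+/\1 15/' $file
done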
Untrack inadvertently added files
There are a lot more small, simple fixes you can apply once you get into the habit of applying changes broadly through scripts. Here's a small sample script to clear out any files your colleagues or predecessors likely had not intended to add to the repository.
# Untrack IntelliJ IDEA settings if present
if [ -d .idea ]
then
git rm -r --cached .idea
echo '.idea' >> .gitignore
fi
# Untrack macOS Finder metadata if present
if [ -f .DS_Store ]
then
git rm --cached .DS_Store
echo '.DS_Store' >> .gitignore
fi
Final thoughts
Having a diverse set of sample transformations enables you to quickly apply similar transformations whenever you encounter a potential future cause of drift between services, such as a method deprecation or an improvement in the build pipeline or deployment. Scripts need not be perfect or long-lived to be of value in reducing complexity, although it helps to share your scripts within a team to give everyone access to the same toolset. Once you share your scripts within an organization, you can even consider running a certain set periodically, such that any new projects immediately conform to the latest desired state.
Should your needs or organization grow beyond what can be achieved through simple means, it might be worthwhile to explore dedicated tools that operate in the same solution space, such as Atomist.
In any case it's worthwhile to keep in mind that automation allows you to have more impact throughout your day with only minimal extra effort, ensuring you can apply your cognitive abilities where it really matters: producing business value.