When you look at how a given organization develops software over time, it’s not uncommon to spot evolutionary patterns that mimic what we see in nature. Particularly where microservices are involved, you can often deduce which services came first, and which were developed later, from subtle variations in project structure, tools, dependencies, build pipelines and deployment descriptors. Later services copy elements from the initial template services and apply variations to fit their needs; successful variations are reused in future services, or even applied back to the original template services.

Variations from service to service are essential to maintaining a vibrant engineering culture, with room for experimentation within appropriate boundaries. Over time however, all the subtle variations can make it harder to reason across services, particularly when you want to apply broader changes.

In this blog post I’ll outline how I harmonize such diverse microservice landscapes, with various samples of the scripts I use to reduce the accidental complexity of maintaining all those variations.

The goal

One of the formative articles shaping how I approach any assignment has for quite some time been this post by Twitter’s Engineering Effectiveness group: Let 1,000 flowers bloom. As outlined there, standardization maximizes the impact of even minor improvements, allowing all developers to become more productive with a bit of effort. Maintaining a certain level of standardization makes it easier to reason across services, apply common patterns and solutions, and onboard new developers.

Typically when I join an organization there have already been a good few years of development, with multiple teams maintaining dozens of microservices, showing all the subtle variations commonly seen. My goal in such instances is to frequently apply small, automated, incremental improvements that raise the level of standardization across the organization. Accidental, undesirable variations in particular, such as outdated dependencies and tools, can benefit from even crude automation to bring services back in line.

The goal need not be to create perfect scripts that can be maintained and extended for years and years. Small targeted scripts that lift services up to the current desired state are often enough, and can be discarded once they’ve run.

Low level automation

The types of changes I apply most often are not that complicated: simple substitutions, more or less context aware, or minor commands run per microservice source code repository. For precisely those changes you can go a long way with standard command-line tools such as awk, sed & grep, occasionally supplemented with more specific tools such as jq and xmlstarlet. Using only such command-line tools I’ve written quite a few scripts that each reduce a certain aspect of variation between services.

Shell scripting can be daunting at first; expect to look up a lot of commands and quirks as you go along. Luckily there are excellent online resources nowadays, and even tools like TLDR.sh available directly from the command line; try it! Also be sure to check out ShellCheck, and integrate it with your editor for quick warnings and suggestions.
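As a tiny illustration of the kind of issue ShellCheck catches: unquoted variable expansions (its warning SC2086) break on values containing spaces, while the quoted form below behaves as intended. The directory name is just a made-up example.

```shell
# Work in a throwaway directory
cd "$(mktemp -d)"
# ShellCheck flags unquoted $dir; quoting prevents word splitting on spaces
dir="my project"
mkdir -p "$dir"
[ -d "$dir" ] && echo "created: $dir"
```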

The basic steps are always the same:

  1. select a subset of microservice repositories that meet certain criteria, such as the same build tool, pipeline or deployment;

  2. apply some transformation(s) to convert repositories from an old state to the new desired state;

  3. start a merge request with the new state, for optional review, or directly merge once the build pipeline succeeds.

Selecting repositories

Selection for me is typically a two step process; I start with a collection of services either manually curated, or (periodically) extracted from GitLab or GitHub. You will likely want to limit your initial collection to actively maintained repositories, typically using a single programming language.

Here’s a sample script that allows you to extract a list of repositories per programming language from GitLab.

Listing 1. Select source code repositories by programming language from GitLab
set -e
# https://docs.gitlab.com/ee/api/projects.html
# Encode special characters, such as in C#
curl --header "Private-Token: ${GITLAB_ACCESS_TOKEN}" \
  --silent --get \
  --data-urlencode "with_programming_language=$1" \
  --data "simple=true" \
  --data "archived=false" \
  --data "order_by=name" \
  --data "sort=asc" \
  --data "last_activity_after=2021-01-01T00:00:00Z" \
  --data "per_page=100" \
  --data "page=1" \
  https://your.gitlab.host/api/v4/projects | \
    jq -r '.[] | "\(.path_with_namespace)"'

Similar scripts can be created for GitHub, and tweaked to your needs to limit access to a particular organization or namespace. Note how you can easily pipe the output into sort, head or grep to make quick sub-selections.
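As a sketch with hypothetical group and project names, such quick sub-selections might look like:

```shell
# Work in a throwaway directory with hypothetical selection output
cd "$(mktemp -d)"
printf 'team-a/orders-service\nteam-a/payments-service\nteam-b/emails-service\n' > repos.txt
# Quick sub-selections by piping into grep, sort and head
grep '^team-a/' repos.txt | sort | head -1
```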

The output will contain matching repositories, each on its own line, similar to this (with hypothetical group and project names):

Listing 2. Sample output of repository selection
team-a/orders-service
team-a/payments-service
team-b/emails-service
For the next step in repository selection we need to check out the initial set of repositories. Having a local copy allows you to further restrict your selection based on files that are present, missing, or containing certain elements. You can then decide to skip transformation for repositories that do not match the old state. This final step of selection is implemented as part of the transformation script outlined below.

Transformation script outline

All my transformation scripts use the same outline:

  1. take an argument file or standard input pipe,

  2. clone or update each repository from input,

  3. start a new branch for your transformation,

  4. test repository matches requirements for transformation,

  5. apply the transformation to affected files,

  6. commit if there are any resulting changes,

  7. create a merge request containing the changes.

You can see how the above steps are implemented in the sample script below:

Listing 3. Transformation script outline
set -e

# TODO Name the branches created in each repository
branch=feature/your-transformation

# Regulate script output folder and file
folder=$(pwd)
gitpush="$folder/gitpush.txt"
truncate -s 0 "$gitpush"
mkdir -p repos/

# Loop over each repository
while read -r p ; do
  cd "$folder"
  echo -e "\n\n$p"

  # Clean/clone local repository (2)
  if [ -d "repos/$p" ] ; then
    cd "repos/$p"
    git reset --hard
    git checkout master || git checkout main
    git pull
    git branch -D $branch || true
  else
    cd "repos"
    git clone "git@your.gitlab.host:$p.git" "$p"
    cd "$p"
  fi

  # Start a new branch (3)
  git checkout -b $branch

  # TODO Insert match conditionals here, which continue on to the next repository when not met (4)

  # TODO Insert transformations here to go from old state to new state (5)

  # Skip to the next repository if there are no resulting changes (6)
  if [[ ! $(git status --porcelain) ]] ; then
    continue
  fi
  # Commit all outstanding changes
  git commit -a -m "TODO Insert commit message here"

  # Show differences compared to master/main
  git --no-pager diff origin/master || git --no-pager diff origin/main

  # Push branch with push options for merge requests (7)
  # https://docs.gitlab.com/ee/user/project/push_options.html
  git push --force -u origin $branch \
    -o merge_request.create \
    -o merge_request.merge_when_pipeline_succeeds \
    -o merge_request.remove_source_branch \
    2>&1 | tee -a "${gitpush}"

# Loop over either first command line argument file, or piped input
done < "${1:-/dev/stdin}" # (1)

# Open all new merge requests in Chrome (optional)
grep 'merge_request' "${gitpush}" | awk '{print $2}' | xargs google-chrome

There are four TODO items in the above sample; these are the elements that change for each transformation.

Running such a script will create merge requests, which are set to merge automatically only if the build pipeline completes successfully, through GitLab Push Options. This automatic merging is of course optional, and can be disabled when you want to inspect the changes per repository more closely. Any fixes required can be applied to the same merge request as usual, allowing you to successfully conclude the transformation of all repositories. If you’re still unsure of a local script change, you can of course comment out the git push command and inspect the differences locally.

With the outline for each transformation script in place, we can begin to apply transformations to a selection of repositories.

Example transformations

Below you’ll find a wide sampling of transformations that can be applied to a microservice landscape. Most examples are geared towards Java development, but the same approach can be applied to any language or platform.

Upgrade dependencies

One of the easiest automations you can apply to make repositories more alike is to ensure they all, at the very least, use the latest versions of their respective dependencies. This ensures there are no more subtle version differences for developers to keep in mind, and highlights any incompatibilities as soon as possible. ThoughtWorks highlighted this in their 2019 & 2020 TechRadar as Dependency drift, with special mention of tools such as Dependabot and Snyk to detect and resolve discrepancies.

Dependabot is available for GitHub, and can also be made to run on GitLab. Low level automation is perfectly capable of detecting which build tools a given repository uses, to create an initial Dependabot configuration file per repository. See my previous blog post for details on configuring Dependabot in bulk.
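A minimal sketch of that detection step might look like the following; the repository layout and generated configuration here are illustrative assumptions, so check the Dependabot documentation for the exact options your setup needs.

```shell
# Work in a throwaway directory standing in for a checked-out repository
cd "$(mktemp -d)"
touch pom.xml
# Emit a minimal Dependabot config based on which build files are present
mkdir -p .github
{
  echo 'version: 2'
  echo 'updates:'
  if [ -f pom.xml ] ; then
    printf -- '  - package-ecosystem: "maven"\n    directory: "/"\n    schedule:\n      interval: "weekly"\n'
  fi
  if [ -f build.gradle ] ; then
    printf -- '  - package-ecosystem: "gradle"\n    directory: "/"\n    schedule:\n      interval: "weekly"\n'
  fi
} > .github/dependabot.yml
cat .github/dependabot.yml
```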

Upgrade build tools

Build tools such as Gradle and Maven are frequently added to repositories in the form of wrappers such as gradlew and mvnw, which allow developers and build pipelines to reliably run the same version. As the tool version is checked into the source code repository, it’s typically only updated incidentally, which means you can miss out on performance improvements, or fail to spot deprecations in newer versions. This can cause a backlog of issues when you then want to upgrade to say Java 16, which requires Gradle 7. Below you’ll find instructions on how to upgrade the Gradle and Maven wrappers to their latest versions.

How to upgrade Gradle

Given an initial list of repositories, we want to detect which repositories contain the Gradle wrapper, and if present, upgrade the wrapper to the latest version.

Listing 4. Upgrade Gradle wrapper
# Only transform when the Gradle wrapper is present
if [ ! -f gradlew ] ; then
  continue
fi
# Upgrade the Gradle wrapper, which consumes stdin if not fed /dev/null
./gradlew wrapper --gradle-version 6.8.3 < /dev/null

As the comments show, there’s a slight quirk when this is run within a script that uses piped input, but it’s easily worked around by providing an explicit pipe in.
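The quirk can be demonstrated in isolation: inside a while-read loop, any command that reads stdin will swallow the remaining input lines unless it is explicitly redirected. The file and loop below are a contrived stand-in for the repository loop.

```shell
# Work in a throwaway directory with a three-line input file
cd "$(mktemp -d)"
printf 'one\ntwo\nthree\n' > input.txt
# Without the explicit < /dev/null, a stdin-consuming command inside the
# loop (such as ./gradlew) would eat the remaining lines of input.txt
while read -r line ; do
  echo "processing $line"
  cat < /dev/null  # stand-in for a stdin-consuming command
done < input.txt
```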

Drop JCenter

As you upgrade to Gradle 7 and beyond, you might encounter the jcenter() deprecation. As this can cause problems with the long-term stability of your builds, we might as well remove JCenter from our repositories now.

Listing 5. Drop jcenter bintray references before May 1st 2021 sunset
sed -i \
  -e "/maven { url \"https:\/\/jcenter.bintray.com\" }/d" \
  -e "/jcenter()/d" \
  build.gradle

Replace compile, testCompile & runtime

Similarly, when you upgrade to Gradle 7 you might encounter Could not find method testCompile() due to the removal of compile and runtime configurations. Another quick few replacement patterns should help clear those out.

Listing 6. Replace deprecated gradle compile, runtime and testCompile
sed -i \
  -e 's/compile /implementation /g' \
  -e 's/testCompile /testImplementation /g' \
  -e 's/runtime /runtimeOnly /g' \
  build.gradle

How to upgrade Maven

We again want to detect any repositories that contain the Maven wrapper, and if present, upgrade the wrapper to the latest version.

Listing 7. Upgrade Maven wrapper
# Only transform when the Maven wrapper is present
if [ ! -f mvnw ] ; then
  continue
fi
# Upgrade the Maven wrapper
./mvnw -N io.takari:maven:0.7.7:wrapper -Dmaven=3.8.1
# Replace http: with https: for Maven 3.8.1
sed -i 's|<url>http://|<url>https://|g' pom.xml

Replace mvn install with verify

Maven projects frequently call the clean or install goals unnecessarily as part of the build. These goals slow down the build and are rarely needed; the verify goal is usually sufficient to replace install or even deploy. With another small substitution we can remove and replace these goals in our CI pipelines.

Listing 8. Drop Maven clean and replace Maven package/install with verify
sed -i -r \
  -e 's/(mvnw.*)(clean)/\1/g' \
  -e 's/(mvnw.*)(package|install)/\1verify/g' \
  Jenkinsfile # or .gitlab-ci.yml or GitHub Actions or CircleCI

Upgrade runtimes

Runtimes are another source of frequent accidental variation, with repositories adopting a mix of Java versions as needed. With the increased pace of Java releases, and with that the dropped support for all but the latest and LTS versions, you will likely wish to keep up with the latest supported version.

Low level replacements will allow you to:

  1. increase the Gradle sourceCompatibility or Maven equivalent,

  2. upgrade to suitable Jacoco and CheckStyle versions,

  3. swap out your Docker base image if not already managed through Dependabot,

  4. replace or drop JVM options as needed.

This way you can eliminate uncertainty around supported language features, and ensure you’re running all applications only on the latest supported version.
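A minimal sketch of the first and third of these replacements could look like the following; the version numbers, file contents and matched patterns are assumptions, so adjust them to what your own build files actually contain.

```shell
# Work in a throwaway directory with minimal sample build files
cd "$(mktemp -d)"
printf "sourceCompatibility = '11'\n" > build.gradle
printf 'FROM openjdk:11\n' > Dockerfile
# Bump the Gradle source compatibility (hypothetical versions)
sed -i "s/sourceCompatibility = '11'/sourceCompatibility = '16'/" build.gradle
# Swap the Docker base image, if not already managed through Dependabot
sed -i 's|FROM openjdk:11|FROM openjdk:16|' Dockerfile
```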

Upgrade test libraries

Test libraries are another common source of outdated dependencies. Upgrading these to their latest versions rarely gets priority, but luckily simple substitutions get us most of the way there. And since tests are executed as part of our build, we can safely apply these merge requests automatically if the build succeeds.

How to upgrade from JUnit4 to JUnit5 / Jupiter

JUnit 5 offers various advantages over JUnit 4, such as leveraging features from Java 8, organizing your tests hierarchically, and using more than one extension at a time. It also includes some breaking changes though, particularly around asserting exceptions, replacing test rules with extensions, and the constructs needed for Hamcrest assertions. The bulk of the changes though can again be achieved with simple substitutions, which for some repositories might already get you all the way there:

Listing 9. Replace JUnit 4 imports and constructs with JUnit 5
# Skip if not using JUnit 4 Test
if ! grep -q -r 'org.junit.Test' . ; then
  echo 'org.junit.Test not found'
  continue
fi

# Find any Java files
for file in $(find . -type f -name '*.java') ; do
  # Clean up Windows line endings
  fromdos "$file"

  # Remove public modifiers from classes and test methods
  if grep -q 'org.junit.Test' "$file" ; then
    sed -i "s|public class|class|g" "$file"
    sed -i "s|public void|void|g" "$file"
  fi

  # Replace verbose Assert.assert with shorthand
  sed -i -r "s|(\s+)Assert.assert|\1assert|g" "$file"
  sed -i "s|import org.junit.Assert;|import static org.junit.jupiter.api.Assertions.*;|" "$file"

  # Remove RunWith import if used for Spring
  if grep -q 'RunWith(Spring' "$file" ; then
    sed -i -r '/import org\.junit\.runner\.RunWith;/d' "$file"
  fi

  # Replace boolean equals assertions
  sed -i -r "s|assertEquals\(true,\s*|assertTrue(|g" "$file"
  sed -i -r "s|assertEquals\(false,\s*|assertFalse(|g" "$file"

  # Remove @RunWith(SpringRunner.class) and @RunWith(SpringJUnit4ClassRunner.class)
  sed -i -r '/\@RunWith\(SpringRunner\.class\)/d' "$file"
  sed -i -r '/\@RunWith\(SpringJUnit4ClassRunner\.class\)/d' "$file"
  sed -i -r '/import org\.springframework\.test\.context\.junit4\.SpringRunner;/d' "$file"
  sed -i -r '/import org\.springframework\.test\.context\.junit4\.SpringJUnit4ClassRunner;/d' "$file"

  # Based on: https://github.com/sbrannen/junit-converters/blob/master/junit4ToJUnitJupiter.zsh

  # Replace imports of TestCase and Assert with Assertions
  sed -i -r 's/junit.framework.TestCase/org.junit.jupiter.api.Assertions/' "$file"
  sed -i -r 's/org.junit.Assert/org.junit.jupiter.api.Assertions/' "$file"

  # Replace JUnit 4's @Test, @BeforeClass, @AfterClass, @Before,
  # @After, and @Ignore annotations with their JUnit Jupiter
  # counterparts.
  sed -i -r 's/org\.junit\.Test/org.junit.jupiter.api.Test/g' "$file"
  sed -i -r 's/org\.junit\.BeforeClass/org.junit.jupiter.api.BeforeAll/g' "$file"
  sed -i -r 's/org\.junit\.AfterClass/org.junit.jupiter.api.AfterAll/g' "$file"
  sed -i -r 's/org\.junit\.Before/org.junit.jupiter.api.BeforeEach/g' "$file"
  sed -i -r 's/org\.junit\.After/org.junit.jupiter.api.AfterEach/g' "$file"
  sed -i -r 's/org\.junit\.Ignore/org.junit.jupiter.api.Disabled/g' "$file"

  sed -i -r 's/\@BeforeClass/\@BeforeAll/g' "$file"
  sed -i -r 's/\@AfterClass/\@AfterAll/g' "$file"
  sed -i -r 's/\@Before$/\@BeforeEach/g' "$file"
  sed -i -r 's/\@After$/\@AfterEach/g' "$file"
  sed -i -r 's/\@Ignore/\@Disabled/g' "$file"

  # Replace @RunWith(MockitoJUnitRunner.class) with @ExtendWith(MockitoExtension.class)
  sed -i -r 's/org\.junit\.runner\.RunWith/org.junit.jupiter.api.extension.ExtendWith/g' "$file"
  sed -i -r 's/\@RunWith/\@ExtendWith/g' "$file"
  sed -i -r 's/org\.mockito\.junit\.MockitoJUnitRunner/org.mockito.junit.jupiter.MockitoExtension/g' "$file"
  sed -i -r 's/MockitoJUnitRunner/MockitoExtension/g' "$file"

  git add "$file"
done

When run across a whole set of repositories you’ll quickly find which need further changes based on failing builds, and can continue from there to finish the migration. Using only the latest version of JUnit across all your repositories again lowers the cognitive complexity when switching between services.

How to upgrade to Mockito 2+

Mockito 2 introduced some minor changes that could hinder direct adoption of version 2 and 3. Not upgrading means you’re missing out on features such as Java 8+ support, unnecessary stub detection and incorrect argument detection. Luckily, the team has provided a few helpful regular expressions to aid migration. These can once again easily be applied through a script to facilitate a quick migration.

Listing 10. Replace Mockito 1 imports and methods with Mockito 2 and 3 replacements
find . -name '*.java' \
  -exec sed -i -r \
    -e 's/import org.mockito.runners.MockitoJUnitRunner;/import org.mockito.junit.MockitoJUnitRunner;/g' \
    -e 's/import (.*) org.mockito.(Mockito|Matchers).argThat;/import \1 org.mockito.hamcrest.MockitoHamcrest.argThat;/g' \
    -e 's/(\w+).getArgumentAt[(]([a-zA-Z0-9]*),\s*([^)]*).class[)]\./((\3)\1.getArgument(\2))./g' \
    -e 's/.getArgumentAt[(]([a-zA-Z0-9]*),[^)]*[)]/.getArgument(\1)/g' \
    {} \;

How to upgrade to Cucumber-JVM 5+

Cucumber-JVM 5 updated the package structure, forcing you to change the class imports during migration. Adopting the new imports can mostly be achieved through the following:

Listing 11. Replace Cucumber-JVM <4 imports with Cucumber-JVM 5+ io.cucumber
find . -name '*.java' \
  -exec sed -i 's/import cucumber.api/import io.cucumber/g' {} \; \
  -exec sed -i 's/cucumber.CucumberOptions/cucumber.junit.CucumberOptions/g' {} \;

Spring Boot

Spring Boot is another area where I frequently see various services use different versions, forcing developers to take the differences into account, particularly for non-functional requirements such as metrics. Upgrading between versions sometimes forces you to adopt moved properties, which can be achieved with common replacement patterns in sed.
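As one concrete sketch of such a moved property: Spring Boot 2.4 replaced spring.profiles with spring.config.activate.on-profile in profile-specific documents, and a crude substitution covers the straightforward cases. The sample properties file below is hypothetical.

```shell
# Work in a throwaway directory with a minimal sample properties file
cd "$(mktemp -d)"
printf 'spring.profiles=dev\n' > application.properties
# Adopt the Spring Boot 2.4 replacement for the moved property
sed -i 's/^spring\.profiles=/spring.config.activate.on-profile=/' application.properties
cat application.properties
```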

Enable Spring Boot graceful shutdown

Occasionally there’s new features not enabled by default, which nonetheless make sense to adopt widely, such as graceful application shutdown. The following script inserts the property to enable graceful shutdown into application.properties and application.yml if not already present.

Listing 12. Add graceful application shutdown
# Only apply to projects using Spring Boot
if ! grep -q spring-boot build.gradle ; then # or pom.xml
  continue
fi

# Insert into application.properties
if [ -f src/main/resources/application.properties ] ; then
  if ! grep -q shutdown src/main/resources/application.properties ; then
    if grep -q ^server. src/main/resources/application.properties ; then
      # Insert just before the first existing server. property
      sed -i '0,/^server./s/^server\./server.shutdown=graceful\nserver./' src/main/resources/application.properties
    else
      printf '\nserver.shutdown=graceful\n' >> src/main/resources/application.properties
    fi
  fi
fi

# Insert into application.yml
if [ -f src/main/resources/application.yml ] ; then
  yq -i eval '.server.shutdown="graceful"' src/main/resources/application.yml
fi

You’ll find properties files lend themselves to crude direct substitutions more easily than YAML files. When editing YAML files it helps to look into dedicated tools such as yq.

Add application name and metric tag

Adding your application name consistently can help with service discovery, as well as with adding consistent metric tags. A simple script can check whether the application name is already defined, and if not add it using the repository name.

Listing 13. Add Spring Boot application name and metrics tags
if [ -f src/main/resources/application.properties ] ; then
  # Prepend application name if not already present
  if ! grep -q 'spring.application.name' src/main/resources/application.properties ; then
    sed -i "1s;^;spring.application.name=$(basename "$p")\n;" src/main/resources/application.properties
  fi
  # Append metrics tags if not already present
  if ! grep -q 'management.metrics.tags' src/main/resources/application.properties ; then
    if grep -q ^management. src/main/resources/application.properties ; then
      sed -i '0,/^management\./s/^management\./management.metrics.tags.application=${spring.application.name}\nmanagement./' src/main/resources/application.properties
    else
      echo -e '\nmanagement.metrics.tags.application=${spring.application.name}' >> src/main/resources/application.properties
    fi
  fi
fi

# Insert into application.yml with yq; no-op if already present
if [ -f src/main/resources/application.yml ] ; then
  if ! $(yq eval '.spring.application | has("name")' src/main/resources/application.yml) ; then
    yq -i eval ".spring.application.name=\"$(basename "$p")\"" src/main/resources/application.yml
  fi
  if ! $(yq eval '.management.metrics | has("tags")' src/main/resources/application.yml) ; then
    yq -i eval '.management.metrics.tags.application="${spring.application.name}"' src/main/resources/application.yml
  fi
fi

Remove Spring Boot Maven plugin

When you switch to layered Docker images created through Spring Boot 2.3+ buildpacks, you might find you no longer need the Spring Boot Maven plugin in your pom.xml. Here's a sample script using xmlstarlet to remove the plugin by name, as well as any empty elements it might leave behind.

Listing 14. Remove Spring Boot Maven plugin
# Remove spring-boot-maven-plugin
xmlstarlet ed -P -L -N x="http://maven.apache.org/POM/4.0.0" -d '//x:plugin/x:artifactId[text()="spring-boot-maven-plugin"]/..' pom.xml
# Clean up now empty build/plugins tags
xmlstarlet ed -P -L -N x="http://maven.apache.org/POM/4.0.0" -d '//x:plugins[not(./*) and (not(./text()) or normalize-space(./text())="")]' pom.xml
xmlstarlet ed -P -L -N x="http://maven.apache.org/POM/4.0.0" -d '//x:build[not(./*) and (not(./text()) or normalize-space(./text())="")]' pom.xml
# Remove empty lines
sed -i '/^\s*$/d' pom.xml


Upgrade deployment descriptors

Deployments are another area that frequently, and often needlessly, vary slightly from project to project. In particular when you have been using Kubernetes for a while, you might still have a few references to beta API versions in your descriptors. Any such older v1beta1 references can be cleaned up fairly easily:

Listing 15. Replace v1beta1
for file in $(find k8s -type f -name '*.yaml') ; do
  # Deployment, StatefulSet, DaemonSet and CronJob API versions
  sed -i -r \
    -e "s|extensions/v1beta1|apps/v1|g" \
    -e "s|apps/v1beta1|apps/v1|g" \
    -e "s|batch/v1beta1|batch/v1|g" \
    "$file"
done

Clearing out old references ensures you remain compatible with newer Kubernetes releases, before a cluster migration forces the issue. Similar replacements can be applied when you want to add Prometheus scraping, service ports, or update the initial health check delay to reduce deployment delays.
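The health check delay case can be sketched in the same way; the k8s/ folder layout and the delay values below are assumptions, so match them against your own descriptors.

```shell
# Work in a throwaway directory with a hypothetical deployment descriptor
cd "$(mktemp -d)"
mkdir -p k8s
printf '        initialDelaySeconds: 60\n' > k8s/deployment.yaml
# Shorten the initial health check delay across all descriptors
for file in k8s/*.yaml ; do
  sed -i 's/initialDelaySeconds: 60/initialDelaySeconds: 10/' "$file"
done
```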

Untrack inadvertently added files

There are a lot more small, simple fixes you can do once you get into the habit of applying changes broadly through scripts. Here's a small sample script to clear out any files your colleagues or predecessors likely had not intended to add to the repository.

Listing 16. Untrack IDE and file system folders
# Skip if missing
if [ -d .idea ] ; then
  git rm -r --cached .idea
  echo '.idea' >> .gitignore
fi
# Skip if missing
if [ -f .DS_Store ] ; then
  git rm --cached .DS_Store
  echo '.DS_Store' >> .gitignore
fi

Final thoughts

Having a diverse set of sample transformations enables you to quickly apply similar transformations whenever you encounter a potential future cause of drift between services, such as a method deprecation or an improvement in the build pipeline or deployment. Scripts need not be perfect or long-lived to be of value in reducing complexity, although it helps to share your scripts within a team to give everyone access to the same toolset. Once you share your scripts within an organization, you can even consider running a certain set periodically, so that any new projects immediately conform to the latest desired state.

Should your needs or organization grow beyond what can be achieved through simple means, it might be worthwhile to explore dedicated tools that operate in the same solution space, such as Atomist.

In any case it's worthwhile to keep in mind that automation allows you to have more impact throughout your day with only minimal extra effort, to ensure you can apply your cognitive abilities where it really matters: producing business value.