Implementing a REST API for Object Detection with KotlinDL and Ktor

After completing the Kotlin for Java Developers course on Coursera I was looking for an excuse to put my freshly gained Kotlin knowledge into action. I decided to address my frustration about the large number of false motion detections by one of my security cameras.

One of the core components of that solution is a REST API that receives an image and returns a list of detected objects. I decided to develop that using Kotlin, KotlinDL and Ktor.

This blog post describes the core components of the solution. The source code of the example is available on GitHub.

Implementing Object Detection

The security camera I use detects motion based on the amount of "change" it detects between frames over time. This leads to lots of false positives, for example when it’s raining or when it’s windy and trees are moving back and forth. The camera generates a snapshot and a movie for each detected motion. Performing object detection on the snapshots to find out what actually caused the movement helps a lot in reducing the number of false positives.
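To make the idea concrete, here is a minimal sketch of how detected class labels could be used to decide whether a snapshot deserves a look. The function name and the set of "interesting" labels are my own assumptions for illustration; they are not part of the example project.

```kotlin
// Hypothetical sketch: decide whether a motion snapshot deserves attention,
// based on the class labels an object detector returned for it.

/** Labels considered interesting; this set is an assumption for illustration. */
val relevantLabels = setOf("person", "car", "truck", "bicycle", "dog")

/** A motion event is kept when at least one detected label is relevant. */
fun isRelevantMotion(detectedLabels: List<String>, relevant: Set<String> = relevantLabels): Boolean =
    detectedLabels.any { it in relevant }

fun main() {
    // Rain or moving trees: the detector finds nothing of interest.
    println(isRelevantMotion(emptyList()))              // false
    println(isRelevantMotion(listOf("potted plant")))   // false
    // A person walking by: the snapshot is worth reviewing.
    println(isRelevantMotion(listOf("person", "car")))  // true
}
```

Which labels count as relevant depends entirely on what the camera is pointed at, so the set is best kept configurable.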

KotlinDL

To keep it simple, I used KotlinDL to implement the object detection in Kotlin.

"KotlinDL is a high-level Deep Learning API written in Kotlin and inspired by Keras. Under the hood, it uses TensorFlow Java API and ONNX Runtime API for Java. KotlinDL offers simple APIs for training deep learning models from scratch, importing existing Keras and ONNX models for inference, and leveraging transfer learning for tailoring existing pre-trained models to your tasks." — KotlinDL Website

The implementation will use KotlinDL to fetch the pretrained SSD ONNX model and use it to perform the actual object detection on an image file.

The code

Below you’ll find the code of the Kotlin service that implements the object detection using KotlinDL.

Listing 1. ObjectDetectionService.kt

class ObjectDetectionService(private val maxResults: Int = 10, private val minProbability: Double = 0.25) {

    // (1)
    private val model: SSDObjectDetectionModel by lazy {
        val modelHub = ONNXModelHub(cacheDirectory = File("cache/pretrainedModels"))
        ONNXModels.ObjectDetection.SSD.pretrainedModel(modelHub)
    }

    // (2)
    fun detectObjects(file: File): List<DetectedObject> {
        require(file.exists()) { "File not found ${file.absolutePath}" }
        return model.detectObjects(imageFile = file, topK = maxResults).filter { it.probability >= minProbability }
    }

    // (3)
    suspend fun detectObjects(bytes: ByteArray): List<DetectedObject> = withContext(Dispatchers.IO) {
        val tmpFile = kotlin.io.path.createTempFile()
        try {
            tmpFile.writeBytes(bytes)
            detectObjects(tmpFile.toFile())
        } finally {
            // Remove the temporary file so repeated requests don't fill up the disk.
            tmpFile.toFile().delete()
        }
    }
}

1	Lazily initializes the ONNX SSD model, downloading it or using a cached version if one is already present in the specified folder.
2	Performs object detection on the given file, returning the detected objects whose probability is at least the configured minimum.
3	Stores the data from the byte array in a temporary file, performs object detection on it and deletes the file afterwards. The function is marked suspend so that it can call withContext(Dispatchers.IO), which moves the blocking I/O off the calling thread when we call the code from our Ktor REST API.
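The combination of a result limit and a probability threshold can be illustrated without KotlinDL. The sketch below uses a stand-in data class instead of KotlinDL's DetectedObject, and it assumes that the topK parameter keeps the highest-scoring detections; treat both as assumptions rather than a description of the library's internals.

```kotlin
// Stand-in for KotlinDL's DetectedObject, reduced to the two fields used here.
data class Detection(val classLabel: String, val probability: Double)

/**
 * Keeps at most [maxResults] of the most probable detections and then drops
 * those whose probability is below [minProbability].
 */
fun selectDetections(
    all: List<Detection>,
    maxResults: Int = 10,
    minProbability: Double = 0.25
): List<Detection> =
    all.sortedByDescending { it.probability }
        .take(maxResults)
        .filter { it.probability >= minProbability }

fun main() {
    val raw = listOf(
        Detection("car", 0.40),
        Detection("person", 0.91),
        Detection("bird", 0.10)
    )
    // "bird" falls below the 0.25 threshold and is dropped.
    println(selectDetections(raw).map { it.classLabel })                  // [person, car]
    // With maxResults = 1 only the most probable detection survives.
    println(selectDetections(raw, maxResults = 1).map { it.classLabel }) // [person]
}
```

Note that the limit is applied before the threshold, as in the service above, so the result can contain fewer than maxResults entries.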

The Unit Test

The following unit test verifies that everything is working as expected.

Listing 2. ObjectDetectionServiceTestSuite.kt

internal class ObjectDetectionServiceTestSuite {

    // (1)
    private val objectDetectionService = ObjectDetectionService(10, 0.25)

    // (2)
    @Test
    fun predictionSSDTest() {
        assertAll(
            // Picture by Anna Rye via Pexels
            { verifyDetection("snapshots/pexels-anna-rye-9975158.jpg", listOf("person")) },
            // Picture by Rachel Claire via Pexels
            { verifyDetection("snapshots/pexels-rachel-claire-6761052.jpg", listOf("person", "car", "car", "truck", "car")) },
        )
    }

    private fun verifyDetection(fileName: String, detectedObjectClasses: List<String>) {
        val result = objectDetectionService.detectObjects(getFileFromResource(fileName))
        assertContentEquals(detectedObjectClasses, result.map(DetectedObject::classLabel))
    }

    private fun getFileFromResource(fileName: String): File {
        // (3)
        val resource = javaClass.classLoader.getResource(fileName) ?: throw IllegalArgumentException("File not found! $fileName")
        return File(resource.toURI())
    }
}

1	Initializes our object detection service to detect at most 10 objects with a probability of at least 0.25.
2	Runs the object detection for the two given images and verifies the result.
3	Gets a reference to the resource or throws an IllegalArgumentException when the lookup returns null.

Implementing the REST API

Now that we have the object detection in place, it’s time to expose it by implementing a REST API using Ktor.

About Ktor

"Ktor is a framework to easily build connected applications – web applications, HTTP services, mobile and browser applications. Modern connected applications need to be asynchronous to provide the best experience to users, and Kotlin coroutines provide awesome facilities to do it in an easy and straightforward way." — Ktor Website

With Ktor it’s really easy to implement a straightforward REST API because it:

is easy to start and configure using an embedded server, also for integration testing;
supports JSON serialization through its ContentNegotiation plugin, which allows us to specify the request and response payloads as Kotlin data classes;
offers request routing to handle the incoming requests.

The list above contains only the subset of Ktor features that matter for our use case. Ktor has many more features, such as authentication, authorization, sessions, sockets, monitoring and administration.

The code

The code below initializes and configures our Ktor application:

Listing 3. ODSRestApi.kt

// (1)
fun main(args: Array<String>): Unit = EngineMain.main(args)

// (2)
private val objectDetectionService = ObjectDetectionService()

fun Application.module() {
    // (3)
    install(ContentNegotiation) {
        json(Json)
    }
    // (4)
    routing {
        get("/") {
            call.respondText("ODS")
        }
        post("/") {
            // (5)
            val odsRequest = call.receive<ODSRequest>()

            // (6)
            val detectedObjects = objectDetectionService
                .detectObjects(Base64.getDecoder().decode(odsRequest.image))
                .map { DetectedObject(it.classLabel, it.probability, BoundingBox(it.xMax, it.xMin, it.yMax, it.yMin)) }

            // (7)
            call.respond(ODSResponse(odsRequest.id, detectedObjects))
        }
    }
}

1	Launches Ktor using the configuration from application.conf (see below).
2	Initializes our ObjectDetectionService using the default values for the constructor parameters.
3	Installs the ContentNegotiation plugin and registers the default JSON converter used to serialize our input and output.
4	Configures the routing to handle a GET and a POST request.
5	Deserializes the request body to an ODSRequest.
6	Decodes the Base64 image data from the ODSRequest, passes it as a byte array to our ObjectDetectionService and transforms the result into a list of detected objects.
7	Returns the id from the request and the list of detected objects using our ODSResponse object.
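Because the image travels as Base64 text inside the JSON body, client and server need to agree on the encoding. The sketch below shows the round trip using only java.util.Base64. The helper names are hypothetical and chosen for illustration; only the decoding call mirrors the handler above.

```kotlin
import java.util.Base64

/** Client side: turn raw image bytes into the Base64 text carried in the JSON body. */
fun encodeImageBytes(bytes: ByteArray): String = Base64.getEncoder().encodeToString(bytes)

/** Server side: the same decoding call that the POST handler uses. */
fun decodeImageText(text: String): ByteArray = Base64.getDecoder().decode(text)

fun main() {
    // The first three bytes of a JPEG file.
    val original = byteArrayOf(0xFF.toByte(), 0xD8.toByte(), 0xFF.toByte())
    val encoded = encodeImageBytes(original)
    println(encoded)                                          // /9j/
    println(decodeImageText(encoded).contentEquals(original)) // true
}
```

The basic decoder throws an IllegalArgumentException on input that is not valid Base64, for instance text produced by a MIME encoder that inserts line breaks, so a client should use the basic encoder as shown.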

Listing 4. application.conf

ktor {
    deployment {
        # (1)
        port = 9090
    }
    application {
        # (2)
        modules = [ ods.RestServiceKt.module ]
    }
}

1	The port the embedded server of Ktor will bind to.
2	The module(s) Ktor will initialize, in this case our module from ODSRestApi.

The Integration Test

Because Ktor is so lightweight, it’s also well suited to launching a server and executing some integration tests. The code below shows the integration test that verifies our simple REST API. More information about testing Ktor applications can be found in the Ktor documentation.

Listing 5. ODSRestApiTestSuite.kt

internal class ODSRestApiTestSuite {

    // ...

    @Test
    fun testPost() {
        withTestApplication({ module() }) {
            val id = generateId()
            handleRequest(HttpMethod.Post, "/") {
                addHeader("Content-Type", "application/json")
                setBody(Json.encodeToString(ODSRequest(id, encodeImage("snapshots/pexels-anna-rye-9975158.jpg"))))
            }.apply {
                assertEquals(HttpStatusCode.OK, response.status())
                assertNotNull(response.content)

                // Named odsResponse to avoid shadowing the test call's own response property.
                val odsResponse = Json.decodeFromString<ODSResponse>(response.content!!)
                assertEquals(id, odsResponse.id)
                assertEquals(1, odsResponse.detectedObjects.size)
                assertEquals("person", odsResponse.detectedObjects[0].labelName)
            }
        }
    }
    // ...
}
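Outside the test harness, a client could call the running service with the HTTP client that ships with JDK 11 and later. The sketch below only builds the request so that it can be inspected without a server. It assumes that the id is a plain string, that the JSON field names match the id and image properties of ODSRequest, and that the service listens on port 9090 of localhost as configured in Listing 4. The function name is hypothetical.

```kotlin
import java.net.URI
import java.net.http.HttpRequest
import java.util.Base64

/** Builds (but does not send) a POST request for the detection endpoint. */
fun buildDetectionRequest(
    id: String,
    image: ByteArray,
    baseUrl: String = "http://localhost:9090/"
): HttpRequest {
    val encodedImage = Base64.getEncoder().encodeToString(image)
    // Hand-written JSON to stay dependency-free; assumes `id` needs no escaping.
    val body = """{"id":"$id","image":"$encodedImage"}"""
    return HttpRequest.newBuilder(URI.create(baseUrl))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()
}

fun main() {
    val request = buildDetectionRequest("example-1", byteArrayOf(1, 2, 3))
    println("${request.method()} ${request.uri()}") // POST http://localhost:9090/

    // To actually send it (requires the service to be running):
    // val response = java.net.http.HttpClient.newHttpClient()
    //     .send(request, java.net.http.HttpResponse.BodyHandlers.ofString())
    // println(response.body())
}
```

In a real client a JSON library would be the safer choice for building the body, since it takes care of escaping.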

Conclusion

It was surprisingly easy, fun and satisfying to create an initial version of the Object Detection REST API using Kotlin, KotlinDL and Ktor.

KotlinDL made it easy to implement the object detection logic using a pretrained model exposed by the KotlinDL ModelHub while hiding lots of the complexity. One of my next steps is training my own model using KotlinDL, understanding the underlying concepts better and improving the performance of the actual recognition.

KotlinDL is in active development. In this implementation I only touched the surface of its power. New features and additional models are added as we speak. Have a look at the latest additions here.

Ktor made it really easy to expose the functionality as a REST API. The framework feels intuitive and lightweight, and it is certainly on my list for the next REST API I develop in Kotlin.

Finally, it saves me a lot of time reviewing my camera snapshots. I only have to look at the images in which objects are detected.

You can find the source code of a full working example on GitHub.

If you have any questions or comments feel free to reach out to me on Twitter.