Look Ma! No Swagger! An approach to automated doc generation for Serverless Apps

All of this started with this question

I was tired. We had this large API that we'd built out over time. And of course, just like a lot of small teams, we had NOT documented it. 😱 Now, like many small teams, we were looking to make some sense of this madness.

To give you some context about the "madness": we have a reasonably large, single serverless stack of about 170 functions. While we'd done some inline comments and function-specific documentation, we didn't really have API docs.

Naturally, I turned to the one thing that all devs turn to, when they need to document APIs.

They get Swagger!

Swagger (the specification is now called OpenAPI) is probably the most popular specification to document APIs. IMO, it's emerged as the specification that seems to have won the API documentation wars, if there ever was such a thing. I had heard of competing API doc standards, but apparently the specification with the most vibrant ecosystem happened to be OpenAPI/Swagger.

Don't get me wrong. Swagger is great. It's comprehensive. It accounts for all possibilities of API documentation. It can be composed in the preferred language of all Infrastructure tomes, YAML. And beyond all that, there have been a LOT of tools built to auto-generate OpenAPI/Swagger specifications from existing APIs.

Some of them are framework specific, like this:

https://flask-restplus.readthedocs.io/en/stable/swagger.html

And some of them are integrated into tooling, like this:

serverless-openapi3-plugin
Serverless plugin to resolve $ref syntax of OpenAPI3

Obviously, being lazy, I started looking at the auto-generated variants for OpenAPI documentation. I'd love nothing more than being able to auto-gen my API documents from my serverless.yml specification file.

For those who may not know, the serverless.yml specification file is created for the sole purpose of packaging and deploying serverless apps to AWS Lambda among others (it supports Google Cloud, Azure Functions, etc).

In the serverless yaml file, you define functions like this:

create_user:
    handler: cti.user.create_user.create_user
    description: This function is used by the super-admin to create a user
    events:
      - http:
          path: user/create_user
          method: post

The above specification:

  1. Deploys the create_user function located in cti/user/create_user/ with its dependencies.
  2. Creates an API Gateway route for it with the necessary configuration for API Gateway, in this case a POST with a particular path.
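
To make the doc-generation angle concrete, here is a minimal Python sketch of pulling the documentable fields out of such a definition. It assumes the file has already been parsed into a dict and that function definitions sit under the top-level functions key, as they do in a full serverless.yml. The helper name iter_http_endpoints is hypothetical, and only the expanded path/method form of the http event is handled.

```python
def iter_http_endpoints(config):
    """Yield one record per HTTP event found under the `functions` key."""
    for name, spec in config.get("functions", {}).items():
        for event in spec.get("events", []):
            http = event.get("http") if isinstance(event, dict) else None
            # Only the expanded mapping form (path/method keys) is handled here.
            if not isinstance(http, dict):
                continue
            yield {
                "function": name,
                "handler": spec.get("handler", ""),
                "description": spec.get("description", ""),
                "path": http["path"].strip("/"),
                "method": http["method"].upper(),
            }


# Hand-written dict standing in for the parsed serverless.yml (assumption:
# this is the shape a YAML parser would return for the snippet above).
config = {
    "functions": {
        "create_user": {
            "handler": "cti.user.create_user.create_user",
            "description": "This function is used by the super-admin to create a user",
            "events": [{"http": {"path": "user/create_user", "method": "post"}}],
        }
    }
}

endpoints = list(iter_http_endpoints(config))
print(endpoints[0]["method"], endpoints[0]["path"])
```

With PyYAML installed, yaml.safe_load on the real file would be one way to obtain a dict of this shape.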

All in all, the serverless framework is very convenient, especially for deployment and keeping track of deployments over time.

My initial quest was for a plugin (yes, the serverless framework has several plugins). And there were a few options. But none of them complied with the demands of my indolence.

They were not really "generators" so much as "you document, I generate" tools, where I would end up doing the bulk of the work. The only thing the "generator" would do is, quite literally, generate the OpenAPI-compliant YAML.

My next approach was to look for an OpenAPI generator for my favorite REST API Client, Insomnia.

Insomnia is an awesome tool. We use it every day, especially for functional testing for our REST APIs. And like many good tools, Insomnia also has a plethora of plugins available. But unfortunately, no OpenAPI plugins. Not any useful ones anyway.

Apparently documenting APIs was not really high on a developer's TODO list. Who knew? 😒

I really didn't want to document OpenAPI myself. Principal reasons being:

  1. I needed to learn the spec from scratch. I had documented an API with it once before, but it took a while. That was a project with approx 10 API requests. This had approx 200. Not happening! Someone sent me this as a way to make my life easier. It seemed more daunting than I had first imagined.
  2. I really wanted more wiki style README docs, with hopefully some diagrams, etc that would be easy for my devs to consume.

Which is when I decided...

why not write an auto-generator myself?

I know what you're thinking.

You're just reinventing the wheel!

But not really. I had done enough research to know that I wasn't. The alternative was spending a ton of time hand-writing the spec, whereas I could connect the disparate dots in my existing artefacts and obtain passable API documentation. So I asked myself a question:

What do you need for API docs?

I realized I would need:

  • Request and Response Data and JSON Format
  • URL info
  • Description about the API endpoint
  • a diagram of execution, maybe??

And I realized that I already had ALL of this:

  • Insomnia - URL, Request and Response data for each API endpoint
  • Serverless YAML - URL, Description and Event Execution Information

I just needed to map these two disparate datasets, and I would (hopefully) have passable API documentation.

I decided to use docsify to "write" the docs. We love docsify at we45 and use it for nearly all our documentation.

All you have to do is write markdown files into a directory, and you can host that directory on GitHub Pages, Netlify, etc. In addition, docsify already has plugins for syntax highlighting, a collapsible sidebar that supports dynamic content, etc.

I used the following inputs:

  • Serverless YAML file - Parse file, pull out specific data with context
  • Insomnia Export JSON file - Parse file, pull out request and header data first

Update - I realize that Insomnia has an HTTP Archive (HAR) export option, which captures info in a more structured format with both request and response. Probably going with this next.

I would generate:

  • API docs with description and request URL and payload (initially)
  • Auto-generated Event Execution Diagram with Diagrams - basically "Diagrams as Code"

This is what an exported Insomnia HTTP request (JSON) looks like. The url field is the one to match against the path in the serverless file:

{
            "_id": "req_8500ae05ca7b45bc8bc864ab1a828cc1",
            "parentId": "wrk_c80ae37aec444a67b3f8bb8d1fb5723f",
            "modified": 1601657010868,
            "created": 1599300358937,
            "url": "{{ BASE_URL }}ops/delete-server",
            "name": "Delete Running Server",
            "description": "",
            "method": "POST",
            "body": {
                "mimeType": "application/json",
                "text": "{\n\t\"server_sk\": \"something-thats-noyb\"\n}"
            },
            "parameters": [],
            "headers": [
                {
                    "id": "pair_e948cbc380b3472eb6f25b39a3d8e616",
                    "name": "Content-Type",
                    "value": "application/json"
                },
                {
                    "description": "",
                    "id": "pair_4ee5877f60fa46aebc54d160f3fffa41",
                    "name": "Authorization",
                    "value": "{{ AUTH_TOKEN }}"
                }
            ],
            "authentication": {},
            "metaSortKey": -1599300358937,
            "isPrivate": false,
            "settingStoreCookies": true,
            "settingSendCookies": true,
            "settingDisableRenderRequestBody": false,
            "settingEncodeUrl": true,
            "settingRebuildPath": true,
            "settingFollowRedirects": "global",
            "_type": "request"
 }
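
Most of that export is editor state. As a rough sketch (not the exact script behind this post), the fields that matter for docs can be reduced like this in Python. The function name summarize_request is hypothetical, and the code assumes the URL starts with a template tag such as {{ BASE_URL }} that should be stripped before matching. Header values are deliberately dropped so that tokens never land in the docs.

```python
import json
import re

# Matches a leading Insomnia template tag such as "{{ BASE_URL }}".
TEMPLATE_PREFIX = re.compile(r"^\{\{.*?\}\}")


def summarize_request(resource):
    """Reduce one exported request to the fields the docs need."""
    path = TEMPLATE_PREFIX.sub("", resource["url"]).strip("/")
    body_text = resource.get("body", {}).get("text", "")
    try:
        payload = json.loads(body_text) if body_text else None
    except json.JSONDecodeError:
        payload = body_text  # keep non-JSON bodies verbatim
    return {
        "name": resource.get("name", ""),
        "path": path,
        "method": resource["method"].upper(),
        "headers": [h["name"] for h in resource.get("headers", [])],
        "payload": payload,
    }


# Trimmed copy of the exported request shown above.
resource = {
    "url": "{{ BASE_URL }}ops/delete-server",
    "name": "Delete Running Server",
    "method": "POST",
    "body": {
        "mimeType": "application/json",
        "text": "{\n\t\"server_sk\": \"something-thats-noyb\"\n}",
    },
    "headers": [
        {"name": "Content-Type", "value": "application/json"},
        {"name": "Authorization", "value": "{{ AUTH_TOKEN }}"},
    ],
    "_type": "request",
}

summary = summarize_request(resource)
print(summary["path"], summary["payload"])
```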

This is what a serverless function (YAML) looks like:

delete-lab:
    handler: cti.ops.delete_server.delete_user_server
    description: This function is used to delete a user test server that has been initialized
    events:
      - http:
          path: ops/delete-server
          method: post
          cors: true

I had to match the path in the serverless yml to the URL in the Insomnia JSON and pull out the specific datasets to form my documentation for every API endpoint:

  • URL
  • Request method
  • Request payload
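
The matching step can be sketched as a join on (method, path). This is an illustration under assumptions, not the original script: both inputs are taken to be already-normalized lists of dicts, and the name join_endpoints and the dict keys are hypothetical. Functions with no matching Insomnia request are reported separately, which doubles as a list of endpoints nobody has a saved request for.

```python
def join_endpoints(endpoints, insomnia_requests):
    """Pair serverless endpoints with Insomnia requests on (method, path)."""
    by_key = {
        (r["method"].upper(), r["path"].strip("/")): r for r in insomnia_requests
    }
    matched, undocumented = [], []
    for ep in endpoints:
        key = (ep["method"].upper(), ep["path"].strip("/"))
        req = by_key.get(key)
        if req is None:
            undocumented.append(ep["function"])
            continue
        matched.append({
            "function": ep["function"],
            "description": ep["description"],
            "url": ep["path"],
            "method": key[0],
            "payload": req["payload"],
        })
    return matched, undocumented


endpoints = [
    {"function": "delete-lab", "path": "ops/delete-server", "method": "post",
     "description": "This function is used to delete a user test server that has been initialized"},
    {"function": "create_user", "path": "user/create_user", "method": "post",
     "description": "This function is used by the super-admin to create a user"},
]
insomnia_requests = [
    {"path": "ops/delete-server", "method": "POST",
     "payload": {"server_sk": "something-thats-noyb"}},
]

matched, undocumented = join_endpoints(endpoints, insomnia_requests)
print([m["function"] for m in matched], undocumented)
```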

In addition, since I was dealing with an event-driven system, I also wanted to capture the event execution scope in a diagram.
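
The diagrams in this post come from Diagrams; the sketch below is not that code. To stay dependency-free it swaps in plain Graphviz DOT text (Diagrams itself renders through Graphviz), just to show how trigger-to-function edges can be derived from the events list. The helper names event_source and to_dot are hypothetical, and each event entry is assumed to be a one-key mapping.

```python
def event_source(event):
    """Label the trigger of one serverless event entry."""
    kind, detail = next(iter(event.items()))  # assumed: one key per entry
    if isinstance(detail, dict):
        detail = detail.get("path", "")
    return f"{kind}: {detail}" if detail else kind


def to_dot(function_name, events):
    """Render trigger -> function edges as Graphviz DOT text."""
    lines = ["digraph execution {", "  rankdir=LR;"]
    for event in events:
        lines.append(f'  "{event_source(event)}" -> "{function_name}";')
    lines.append("}")
    return "\n".join(lines)


# Events list as it would appear for the delete-lab function.
events = [{"http": {"path": "ops/delete-server", "method": "post", "cors": True}}]
dot = to_dot("delete-lab", events)
print(dot)
```

The same edges would map naturally onto Diagrams nodes joined with its >> operator.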

All of this was written into a dynamically generated markdown document.

I decided to go with the following doc structure

docs
----- API Reference (master reference for all API Endpoints)
----- Detailed API Reference
--------- individual-api-refs.md (individual API files with diagram)
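
A minimal Python sketch of writing the structure above follows. The file and directory names (README.md as the master reference, a detailed/ folder for per-endpoint pages) and the function names are assumptions for illustration; the entries are taken to be the already-merged records, one per endpoint.

```python
import json
import tempfile
from pathlib import Path

FENCE = "`" * 3  # markdown code fence, built here to keep this listing readable


def render_endpoint(entry):
    """Build the markdown page for one API endpoint."""
    return "\n".join([
        f"# {entry['function']}",
        "",
        entry["description"],
        "",
        f"- URL: `{entry['url']}`",
        f"- Method: `{entry['method']}`",
        "",
        "## Request payload",
        "",
        FENCE + "json",
        json.dumps(entry["payload"], indent=2),
        FENCE,
        "",
    ])


def write_docs(entries, docs_dir):
    """Write one page per endpoint plus a master index linking to them."""
    detail_dir = Path(docs_dir) / "detailed"
    detail_dir.mkdir(parents=True, exist_ok=True)
    index = ["# API Reference", ""]
    for entry in entries:
        page = detail_dir / f"{entry['function']}.md"
        page.write_text(render_endpoint(entry), encoding="utf-8")
        index.append(f"- [{entry['method']} {entry['url']}](detailed/{page.name})")
    readme = Path(docs_dir) / "README.md"
    readme.write_text("\n".join(index) + "\n", encoding="utf-8")


entries = [{
    "function": "delete-lab",
    "description": "This function is used to delete a user test server that has been initialized",
    "url": "ops/delete-server",
    "method": "POST",
    "payload": {"server_sk": "something-thats-noyb"},
}]

docs_dir = Path(tempfile.mkdtemp()) / "docs"  # temp dir so the sketch is safe to run
write_docs(entries, docs_dir)
print((docs_dir / "README.md").read_text(encoding="utf-8"))
```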

All of this was written to a docs directory and published to a password-protected Netlify app as soon as the repo is pushed to GitHub.

The final result looks something like this (initial work, to be built on):

Running API Reference Page
Individual API Page with auto-gen diagram (request info not in image)