Look Ma! No Swagger! An approach to automated doc generation for Serverless Apps
All of this started with this question
I was tired. We had this large API that we'd built out over time. And of course, just like a lot of small teams, we had NOT documented it. 😱 Now, like many small teams, we were looking to make some sense of this madness.
To give you some context about the "madness": we have a reasonably large, single serverless stack of about 170 functions. While we'd done some inline comments and function-specific documentation, we didn't really have API docs.
Naturally, I turned to the one thing that all devs turn to, when they need to document APIs.
Swagger (now called OpenAPI) is probably the most popular specification for documenting APIs. IMO, it's emerged as the specification that won the API documentation wars, if there ever was such a thing. I had heard of competing API doc standards like:
- RAML - REST API Markup Language
- API Blueprint
But apparently, the specification with the most vibrant ecosystem happened to be OpenAPI/Swagger.
Don't get me wrong. Swagger is great. It's comprehensive. It accounts for all possibilities of API documentation. It can be composed in the preferred language of all infrastructure tomes, YAML. And beyond all that, there have been a LOT of tools built to auto-generate OpenAPI/Swagger specifications from existing APIs.
Some of them are framework specific, like this:
https://flask-restplus.readthedocs.io/en/stable/swagger.html
And some of them are integrated into tooling, like this:
Obviously, being lazy, I started looking at the auto-generated variants for OpenAPI documentation. I'd love nothing more than being able to auto-gen my API documents from my serverless.yml specification file.
For those who may not know, the serverless.yml specification file is created for the sole purpose of packaging and deploying serverless apps to AWS Lambda, among others (it also supports Google Cloud, Azure Functions, etc.).
In the serverless yaml file, you define functions like this:
create_user:
  handler: cti.user.create_user.create_user
  description: This function is used by the super-admin to create a user
  events:
    - http:
        path: user/create_user
        method: post
The above specification:
- Deploys the create_user function located in cti/user/create_user/ with its dependencies.
- Creates an API Gateway route for it with the necessary configuration, in this case a POST with a particular path.
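To make those two pieces of information concrete, here is a minimal sketch of pulling the HTTP endpoints out of a serverless.yml that has already been parsed into a dictionary (for example with a YAML parser such as PyYAML). The function name `http_endpoints` and the assumption that functions live under a top-level `functions` key are mine, not part of the original tooling.

```python
def http_endpoints(config):
    """Yield one record per `http` event declared under `functions`."""
    for name, spec in config.get("functions", {}).items():
        for event in spec.get("events", []):
            http = event.get("http") if isinstance(event, dict) else None
            if not isinstance(http, dict):
                # Non-HTTP events and the one-line "http: GET path" shorthand
                # are skipped in this sketch.
                continue
            yield {
                "function": name,
                "handler": spec.get("handler", ""),
                "description": spec.get("description", ""),
                "path": http["path"],
                "method": http["method"].upper(),
            }


# Stand-in for what a YAML parser would return for the snippet above.
config = {
    "functions": {
        "create_user": {
            "handler": "cti.user.create_user.create_user",
            "description": "This function is used by the super-admin to create a user",
            "events": [{"http": {"path": "user/create_user", "method": "post"}}],
        }
    }
}

endpoints = list(http_endpoints(config))
```

Each record carries everything the serverless side can contribute to a doc page: the route, the method, and the human-written description.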
All in all, the serverless framework is very convenient, especially for deployment and keeping track of deployments over time.
My initial quest was for a plugin (yes, the serverless framework has several plugins). There were a few options, but none of them complied with the demands of my indolence.
They were not really "generators" so much as "you document, I generate" tools, where I would end up doing the bulk of the work. The only thing the "generator" would do is, quite literally, generate the OpenAPI-compliant YAML.
My next approach was to look for an OpenAPI generator for my favorite REST API Client, Insomnia.
Insomnia is an awesome tool. We use it every day, especially for functional testing for our REST APIs. And like many good tools, Insomnia also has a plethora of plugins available. But unfortunately, no OpenAPI plugins. Not any useful ones anyway.
Apparently documenting APIs was not really high on a developer's TODO list. Who knew? 😒
I really didn't want to document OpenAPI myself. Principal reasons being:
- I needed to learn the spec from scratch. I had documented it once before, but it took a while. That was a project with approx 10 API requests. This had approx 200. Not happening! Someone sent me this as a way to make my life easier. It seemed more daunting than I had first imagined
- I really wanted more wiki style README docs, with hopefully some diagrams, etc that would be easy for my devs to consume.
Which is when I decided...
why not write an auto-generator myself?
I know what you're thinking.
You're just reinventing the wheel!
But not really. I had done enough research to know that I wasn't doing that. The alternative was spending a ton of time hand-writing the spec. Instead, I could connect the disparate dots in my existing artefacts and obtain passable API documentation. So I asked myself a question:
What do you need for API docs?
I realized I would need:
- Request and Response Data and JSON Format
- URL info
- Description about the API endpoint
- a diagram of execution, maybe??
And I realized that I already had ALL of this:
- Insomnia: URL, Request and Response data for each API endpoint
- Serverless YAML - URL, Description and Event Execution Information
I just needed to map these two disparate datasets, and I would (hopefully) have passable API documentation.
I decided to use docsify to "write" the docs. We love docsify at we45 and use it for nearly all our documentation.
All you have to do is write markdown files into a directory, and you can host that directory on GitHub Pages, Netlify, etc. In addition, docsify already has plugins for syntax highlighting, a collapsible sidebar that supports dynamic content, etc.
I used the following inputs:
- Serverless YAML file - Parse file, pull out specific data with context
- Insomnia Export JSON file - Parse file, pull out request and header data first
Update - I realize that Insomnia has an HTTP Archive (HAR) export option, which captures info in a more structured format with both request and response. Probably going with this next.
I would generate:
- API docs with description and request URL and payload (initially)
- Auto-generated Event Execution Diagram with Diagrams - basically "Diagrams as Code"
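As a rough illustration of the "diagrams as code" idea, the sketch below turns a function's events into Graphviz DOT text, one edge per event source. This is a dependency-free stand-in that I am using for illustration, not the Diagrams library itself. The function name `events_to_dot` and the node labels are hypothetical.

```python
def events_to_dot(function_name, events):
    """Build Graphviz DOT text with one edge from each event source to the function."""
    lines = ["digraph execution {", "  rankdir=LR;"]
    for event in events:
        for kind, detail in event.items():
            if kind == "http" and isinstance(detail, dict):
                method = detail["method"].upper()
                path = detail["path"].lstrip("/")
                label = f"API Gateway: {method} /{path}"
            else:
                # Any other trigger is labelled by its event type only.
                label = kind
            lines.append(f'  "{label}" -> "{function_name}";')
    lines.append("}")
    return "\n".join(lines)


dot = events_to_dot(
    "delete-lab",
    [{"http": {"path": "ops/delete-server", "method": "post", "cors": True}}],
)
```

The resulting text can be rendered by any Graphviz-compatible tool. With Diagrams, the same edge list would drive node objects instead of strings.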
This is what an Insomnia HTTP request looks like when exported as JSON (the url field is the one to match against the serverless path):
{
  "_id": "req_8500ae05ca7b45bc8bc864ab1a828cc1",
  "parentId": "wrk_c80ae37aec444a67b3f8bb8d1fb5723f",
  "modified": 1601657010868,
  "created": 1599300358937,
  "url": "{{ BASE_URL }}ops/delete-server",
  "name": "Delete Running Server",
  "description": "",
  "method": "POST",
  "body": {
    "mimeType": "application/json",
    "text": "{\n\t\"server_sk\": \"something-thats-noyb\"\n}"
  },
  "parameters": [],
  "headers": [
    {
      "id": "pair_e948cbc380b3472eb6f25b39a3d8e616",
      "name": "Content-Type",
      "value": "application/json"
    },
    {
      "description": "",
      "id": "pair_4ee5877f60fa46aebc54d160f3fffa41",
      "name": "Authorization",
      "value": "{{ AUTH_TOKEN }}"
    }
  ],
  "authentication": {},
  "metaSortKey": -1599300358937,
  "isPrivate": false,
  "settingStoreCookies": true,
  "settingSendCookies": true,
  "settingDisableRenderRequestBody": false,
  "settingEncodeUrl": true,
  "settingRebuildPath": true,
  "settingFollowRedirects": "global",
  "_type": "request"
}
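Reading these objects back is mostly a matter of filtering and decoding. The sketch below assumes the export file keeps its entries in a top-level `resources` array (an assumption about the export layout) and relies on the `_type` and `body.text` fields shown above. The helper names are hypothetical.

```python
import json


def insomnia_requests(export):
    """Return only the request resources from a parsed Insomnia export."""
    return [r for r in export.get("resources", []) if r.get("_type") == "request"]


def request_payload(request):
    """Decode the request body, which Insomnia stores as a string in body.text."""
    text = request.get("body", {}).get("text", "")
    if not text:
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Templated or non-JSON bodies are returned as raw text.
        return text


# Trimmed stand-in for json.load() on an export file.
export = {
    "resources": [
        {"_type": "workspace", "name": "hypothetical workspace"},
        {
            "_type": "request",
            "name": "Delete Running Server",
            "method": "POST",
            "url": "{{ BASE_URL }}ops/delete-server",
            "body": {
                "mimeType": "application/json",
                "text": "{\n\t\"server_sk\": \"something-thats-noyb\"\n}",
            },
        },
    ]
}

requests_found = insomnia_requests(export)
payload = request_payload(requests_found[0])
```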
This is what a serverless function (yaml) looks like
delete-lab:
  handler: cti.ops.delete_server.delete_user_server
  description: This function is used to delete a user test server that has been initialized
  events:
    - http:
        path: ops/delete-server
        method: post
        cors: true
I had to match the path in the serverless.yml to the URL in the Insomnia JSON, and pull out the specific datasets to form my documentation for every API:
- URL
- Request method
- Request payload
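The matching step can be sketched as a lookup keyed on method and normalized path. This assumes every Insomnia URL is a template tag such as `{{ BASE_URL }}` followed by the same path that appears in the serverless file. Full URLs, or routes with path parameters, would need more normalization than shown. The names `normalize` and `match_endpoints` are hypothetical.

```python
import re


def normalize(path_or_url):
    """Drop template tags such as {{ BASE_URL }} and any surrounding slashes."""
    return re.sub(r"\{\{.*?\}\}", "", path_or_url).strip("/")


def match_endpoints(endpoints, insomnia_requests):
    """Pair each serverless endpoint with the request sharing its method and path."""
    by_key = {
        (r["method"].upper(), normalize(r["url"])): r for r in insomnia_requests
    }
    matched, unmatched = [], []
    for endpoint in endpoints:
        key = (endpoint["method"].upper(), normalize(endpoint["path"]))
        request = by_key.get(key)
        if request is None:
            unmatched.append(endpoint)  # deployed, but no saved request to document it
        else:
            matched.append((endpoint, request))
    return matched, unmatched


endpoints = [
    {"function": "delete-lab", "path": "ops/delete-server", "method": "post"},
    {"function": "create_user", "path": "user/create_user", "method": "post"},
]
saved_requests = [
    {
        "name": "Delete Running Server",
        "method": "POST",
        "url": "{{ BASE_URL }}ops/delete-server",
    }
]

matched, unmatched = match_endpoints(endpoints, saved_requests)
```

Keeping the unmatched list around is useful in its own right: it shows which deployed functions have no saved request yet.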
In addition, since I was dealing with an event-driven system, I also wanted to capture the event execution scope in a diagram
All of this was written into a dynamically generated markdown document.
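A minimal sketch of that generation step, assuming each page is built from one matched endpoint and request pair, might look like this. The page layout, the field names on `endpoint`, and the function name `render_endpoint` are illustrative assumptions rather than the actual generator.

```python
FENCE = "`" * 3  # markdown code fence, built here to keep this example readable


def render_endpoint(endpoint, request):
    """Render one endpoint as a markdown page: title, description, URL, method, payload."""
    title = request.get("name") or endpoint["function"]
    payload = request.get("body", {}).get("text", "")
    lines = [
        f"## {title}",
        "",
        endpoint.get("description", ""),
        "",
        f"- URL: `/{endpoint['path'].lstrip('/')}`",
        f"- Method: `{endpoint['method'].upper()}`",
    ]
    if payload:
        lines += ["", "Request payload:", "", FENCE + "json", payload, FENCE]
    return "\n".join(lines) + "\n"


endpoint = {
    "function": "delete-lab",
    "description": "This function is used to delete a user test server that has been initialized",
    "path": "ops/delete-server",
    "method": "post",
}
request = {
    "name": "Delete Running Server",
    "body": {"text": '{"server_sk": "something-thats-noyb"}'},
}

page = render_endpoint(endpoint, request)
```

Writing `page` to a file under the docs directory is then enough for docsify to pick it up.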
I decided to go with the following doc structure
docs
----- API Reference (master reference for all API Endpoints)
----- Detailed API Reference
--------- individual-api-refs.md (individual API files with diagram)
All of this is written to a docs directory and published to a password-protected Netlify app as soon as the repo is pushed to GitHub.
Final result looks something like this (initial work, to be built on more)