AWS documentation frustrations

AWS logo (sad)

Introduction

AWS has a ton of documentation, and sometimes it misses the mark. Recently, I came across two different examples of AWS documentation either perplexing or frustrating me.

Now, I don't want to come across as a complainy Franey, but since I've been working on documentation for Blurry, a static site generator I open-sourced (and that builds this site), I've been thinking about what makes good technical documentation good. I don't think I've figured it out quite yet, but I have found two traits of AWS docs that can hurt documentation, and maybe doing the opposite can improve it.

Read on for a break-down of two AWS documentation breakdowns and what I learned therefrom.

Trait 1: perplexing

How many steps does it take to get a direct link to the VS Code extension?

From the Lambda function console page, it took four clicks (plus reading and scrolling) to get to the actual VS Code extension. Even starting from the AWS Toolkit for Visual Studio Code site, it was still two clicks to reach the page with the direct VS Code link (and that link was below the fold).

That's quite a long walk for what ended up being a link that opened the extension in VS Code itself. Four clicks for something that could be one click? That's pretty perplexing.

There's some risk in having that much documentation about a VS Code extension, too. Having a documentation site separate from the README could mean:

  1. You have to maintain documentation in multiple places, which can be difficult to keep in sync
  2. If the documentation in one place isn't distinct enough to need syncing, the docs may be so general that they're not that helpful

Trait 2: frustrating

It's frustrating when documentation is incomplete—especially in code examples.

Take this Python code sample for AWS CDK's PythonFunction, for instance:

entry = "/path/to/function"
image = DockerImage.from_build(entry)

python.PythonFunction(self, "function",
    entry=entry,
    runtime=Runtime.PYTHON_3_8,
    bundling=python.BundlingOptions(
        build_args={"PIP_INDEX_URL": "https://your.index.url/simple/", "PIP_EXTRA_INDEX_URL": "https://your.extra-index.url/simple/"}
    )
)

Linting the code using Pylint (via Pyrfecter), we can see a number of linting issues:

2:9: undefined name 'DockerImage'
4:1: undefined name 'python'
4:23: undefined name 'self'
6:13: undefined name 'Runtime'
7:14: undefined name 'python'

There are a couple of missing imports that were easy to sort out (Runtime, DockerImage), but python? I couldn't find that in the CDK Developer Guide, the CDK API documentation, or the AWS CDK Examples repo.

After getting help from a search engine, I found that the missing python import is:

from aws_cdk import aws_lambda_python_alpha as python

And the PyPI package to install is:

aws-cdk-aws-lambda-python-alpha

And self? The python.PythonFunction call should be inside a CDK Stack.
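
To spell out what "sorting that out" looks like, here's the original sample with the missing imports added and the call moved inside a Stack. This is a minimal sketch: the stack class name is mine, and the entry path and index URLs are the placeholders from AWS's sample.

# pip install aws-cdk-aws-lambda-python-alpha  (the package named above)
from aws_cdk import DockerImage, Stack
from aws_cdk import aws_lambda_python_alpha as python
from aws_cdk.aws_lambda import Runtime
from constructs import Construct


class PythonFunctionStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        entry = "/path/to/function"
        # Assigned but never used in the original sample; kept here for fidelity
        image = DockerImage.from_build(entry)

        python.PythonFunction(
            self,
            "function",
            entry=entry,
            runtime=Runtime.PYTHON_3_8,
            bundling=python.BundlingOptions(
                build_args={
                    "PIP_INDEX_URL": "https://your.index.url/simple/",
                    "PIP_EXTRA_INDEX_URL": "https://your.extra-index.url/simple/",
                }
            ),
        )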

If we check the documentation for that, we get an example of an App and a Construct, but the Stack classes are stubbed:

from aws_cdk import App, Stack
from constructs import Construct

# imagine these stacks declare a bunch of related resources
class ControlPlane(Stack): pass
class DataPlane(Stack): pass
class Monitoring(Stack): pass

class MyService(Construct):

  def __init__(self, scope: Construct, id: str, *, prod=False):
  
    super().__init__(scope, id)
  
    # we might use the prod argument to change how the service is configured
    ControlPlane(self, "cp")
    DataPlane(self, "data")
    Monitoring(self, "mon")
    
app = App();
MyService(app, "beta")
MyService(app, "prod", prod=True)

app.synth()  

There are no linting errors, at least!

For a code example of a Stack, I went to the documentation for App and found this:

class MyFirstStack(Stack):

    def __init__(self, scope: Construct, id: str, **kwargs):
        super().__init__(scope, id, **kwargs)

        s3.Bucket(self, "MyFirstBucket")

And the linting output?

1:20: undefined name 'Stack'
3:31: undefined name 'Construct'
6:9: undefined name 's3'

We can get the Stack and Construct imports from the code sample on the Stacks page.
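
Stitched together, the App-page example becomes a self-contained snippet (a sketch on my part; the s3 import alias and the stack ID follow common CDK conventions rather than anything shown in the docs):

from aws_cdk import App, Stack
from aws_cdk import aws_s3 as s3
from constructs import Construct


class MyFirstStack(Stack):

    def __init__(self, scope: Construct, id: str, **kwargs):
        super().__init__(scope, id, **kwargs)

        s3.Bucket(self, "MyFirstBucket")


app = App()
MyFirstStack(app, "my-first-stack")
app.synth()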

So. After checking the aws_cdk.aws_lambda_python_alpha API docs, multiple pages of the AWS CDK Developer Guide, the GitHub examples repo, and some searching to find the docs for the "@aws-cdk/aws-lambda-python-alpha" module (which are separate from the CDK API docs), we have enough documentation to use these new (and very helpful!) Python CDK constructs.

But, boy, did I have to work for it. I can't remember how much time it took for me to get a complete working example, and I shudder to think of how many dev-hours have been spent in a similar pursuit.

A complete, working code example would save oodles of time, and I'm happy to oblige:

from aws_cdk import Duration, RemovalPolicy, Stack
from aws_cdk import aws_dynamodb as dynamodb
from aws_cdk import aws_lambda as lambda_
from aws_cdk import aws_lambda_event_sources as event_sources
from aws_cdk import aws_lambda_python_alpha as lambda_python
from aws_cdk.aws_lambda_python_alpha import PythonFunction
from constructs import Construct


class WorkingStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Resources
        python_dependencies_layer = lambda_python.PythonLayerVersion(
            self,
            "PoetryLayer",
            entry="./src/layer",
            compatible_runtimes=[lambda_.Runtime.PYTHON_3_10],
        )

        dynamodb_table: dynamodb.Table = dynamodb.Table(
            self,
            "WorkingTable",
            partition_key=dynamodb.Attribute(
                name="PK", type=dynamodb.AttributeType.STRING
            ),
            stream=dynamodb.StreamViewType.NEW_IMAGE,
            sort_key=dynamodb.Attribute(name="SK", type=dynamodb.AttributeType.STRING),
            removal_policy=RemovalPolicy.RETAIN,
            billing_mode=dynamodb.BillingMode.PAY_PER_REQUEST,
        )

        # Functions
        react_to_new_entries_function: PythonFunction = PythonFunction(
            self,
            "ReactToNewEntries",
            entry="./src/functions",
            index="react_to_new_entries.py",
            runtime=lambda_.Runtime.PYTHON_3_10,
            layers=[python_dependencies_layer],
            environment={
                "TABLE_NAME": dynamodb_table.table_name,
            },
            timeout=Duration.seconds(29),
        )

        # Event handling
        react_to_new_entries_function.add_event_source(
            event_sources.DynamoEventSource(
                dynamodb_table,
                starting_position=lambda_.StartingPosition.TRIM_HORIZON,
                batch_size=5,
                bisect_batch_on_error=True,
                retry_attempts=1,
            )
        )

        dynamodb_table.grant_read_write_data(react_to_new_entries_function)
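
For completeness, deploying that stack still needs an app entry point. A minimal one might look like this (the working_stack module name is an assumption about where the class above lives):

# app.py
from aws_cdk import App

from working_stack import WorkingStack

app = App()
WorkingStack(app, "WorkingStack")
app.synth()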

Summary

Documentation should yield more answers than questions

If your documentation requires multiple other (possibly unlinked) documentation sources to be helpful, there's a problem. Code samples, especially, should be complete so it's easy to get started using your code.

Documentation should anticipate the information a reader is most likely looking for and make that easy to find

If someone clicks a link about a VS Code extension, make the call-to-action to download that extension dead simple to find. Additional information can be helpful, too, especially for someone learning a new skill or concept, but make it easy for someone who already has a good idea of what they're looking for.

Last rites writes words

Writing documentation is hard. And it's easy to pick on AWS, not only because they have so much documentation, but also because they have the resources to maintain great docs.

But reading documentation is much easier than writing it, and complaining about documentation is much easier than fixing it. So keep an eye out for things that make documentation great and grody, send up PRs, and write some code...y.