Build a Hugo docker image using AWS CodeCommit, AWS CodeBuild, AWS CodePipeline and Amazon Elastic Container Registry

Introduction

Docker is an open-source platform for building, shipping, and running distributed applications. It allows developers to package their applications and dependencies into containers that can run on any host with Docker installed. This provides consistency, portability, and ease of deployment, as containers can be run in development, testing, and production environments with the same results. Docker also provides tools for managing and orchestrating containers, making it easier to scale applications and ensure their availability.

To fast-track your docker adoption, you can easily download one of the many publicly available prebuilt images from https://hub.docker.com/.
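
For example, pulling one of these prebuilt images and running it takes just two commands (hello-world is used here purely as an illustration):

# download the image from Docker Hub and then run it
docker pull hello-world
docker run hello-world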

For those who have used docker previously, have you ever wondered how those images are built? Have you ever wanted to build your own images?

Well, you are in luck. In this blog, I will walk through the steps for creating your very own docker image using AWS CodeCommit, AWS CodeBuild and AWS CodePipeline, with the resulting image stored in Amazon Elastic Container Registry. We will be building a docker image for Hugo, a popular open-source static website generator. You can read more about it at https://gohugo.io/.

The docker image will run Hugo in server mode, which instructs it to become a webserver and start serving the supplied files.

For anyone interested, I recently created and published a docker image for hugo at https://hub.docker.com/r/nivleshc/hugo. It was also built using the solution described in this blog; however, that image lets you run hugo as a CLI, which means you get all the functionality you would expect had you installed hugo locally on your computer.

The code for this solution is written in Terraform. For simplicity, the state file will be stored locally. For anyone that has used Terraform previously, you know how important the state file is. Just as a refresher (and for beginners), the state file tracks all the resources that Terraform provisions. Terraform uses it to calculate the changes it needs to make when you update the code. Without the state file, your provisioned resources become “orphaned” and you will have to manage (or delete) them manually. This can quickly become a nightmare if you have lots of resources. You can read more about protecting the state file using remote backends at https://developer.hashicorp.com/terraform/language/settings/backends/configuration.
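
As a quick illustration of what the state file tracks, once this project has been applied you can list and inspect the resources Terraform is managing (run from inside the terraform folder):

# list every resource recorded in the state file
terraform state list

# show the recorded attributes of a single resource, for example the ECR repository defined later in this blog
terraform state show aws_ecr_repository.hugo_repo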

High Level Architecture

Let’s go through the high-level architecture of what we will be building.

Here is an explanation for each of the steps, as marked by numbers in the above diagram.

  1. The developer (you) will push code to the AWS CodeCommit repository.
  2. An AWS EventBridge event rule will detect that a new push was done to the AWS CodeCommit repository.
  3. The AWS EventBridge event rule will trigger the AWS CodePipeline pipeline.
  4. The AWS CodePipeline pipeline will retrieve the updated code from the AWS CodeCommit repository (“Source” stage of the pipeline).
  5. The code will be zipped, encrypted and stored in an Amazon S3 Bucket (the AWS CodePipeline’s artifact store).
  6. The AWS CodePipeline pipeline will then trigger the AWS CodeBuild project to build the docker image (“Build” stage).
  7. The AWS CodeBuild project will retrieve the artifact from the AWS S3 Bucket, unzip the bundle and retrieve the buildspec.yml file. This file contains the instructions on how to build the docker image. As part of the build process, the AWS CodeBuild server will retrieve the debian docker image from dockerhub (8) and the hugo package from hugo’s GitHub repository (9). It will then use these to build the docker image.

  10. Once complete, the AWS CodeBuild server will upload the image to the Amazon Elastic Container Registry repository.

Walkthrough of the code

Now that we know what the overall solution looks like, let’s get familiar with the code.

  1. Download the code from my GitHub repository using the following command
    • git clone https://github.com/nivleshc/blog-create-docker-image.git

  2. Open the folder called terraform. As you might have guessed, this folder contains all the terraform files that will be used to deploy this solution. For readability and easier troubleshooting, I always split my terraform code into separate files, based on functionality.

As a quick summary, here is what the terraform files in this folder contain:

  • provider.tf – this file contains the terraform provider declarations, along with version constraints (pro tip – to ensure you are not left scratching your head when previously working terraform code suddenly breaks, version-lock your providers and your terraform binary. Newer releases often contain breaking changes).
  • main.tf – this contains resource definitions for all the resources that will be created.
  • iam.tf – this file contains resource declarations for all the iam roles and policies that will be created.
  • locals.tf – this file contains declarations for locals.
  • variables.tf – this file contains declarations for the variables that will be used in the terraform code.
  • outputs.tf – this file contains the outputs that will be displayed when the terraform code is run.

With that out of the way, let’s go through each of these files in more detail. I will use the same file sequence as the summary above.

Open provider.tf in your favourite IDE.

Here is the first block of code.

terraform {
  required_version = "~> 1.0.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

The above puts a constraint on the terraform binary version that can be used to run this terraform project (the ~> symbol means that only the rightmost version number can change, so all versions matching the pattern 1.0.x are allowed).

The above also states that the provider hashicorp/aws will be used, and only versions matching the pattern 4.x are allowed.

The next block defines the region for the aws provider.

provider "aws" {
region = "ap-southeast-2"
}

Next, open main.tf. As previously mentioned, this contains definitions for all the resources that will be used in this solution.

Lets look at the code in small chunks, so I can explain as we go through the file.

Here are the first two lines.

data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

These are data sources and will be used to get information from external sources.

In the above case, the first line gets information about the aws_caller_identity. This will be used to get the account id for the AWS account that this code will run in.

The second line gets information about the aws region that the code is running in.

Spoiler alert: The above two data sources are used to define the locals in locals.tf. More about that later on.
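
If you are curious about what these data sources resolve to, the AWS CLI can show you roughly the same information:

# account details (including the account id) - roughly what data.aws_caller_identity.current returns
aws sts get-caller-identity

# the region configured for your CLI profile
aws configure get region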

The next block is another data source.

data "aws_s3_bucket" "codepipeline_artifacts_s3_bucket" {
bucket = var.codepipeline_artifacts_s3_bucket_name
}

The above tries to get information about the AWS S3 bucket whose name matches the value contained in the variable var.codepipeline_artifacts_s3_bucket_name. This AWS S3 bucket will be used to store the AWS CodePipeline artifacts.
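
Note that because this is a data source and not a resource, the bucket must already exist before the terraform code is applied, otherwise the plan will fail. A quick way to check from the command line:

# exits silently if the bucket exists and you can access it, errors otherwise
aws s3api head-bucket --bucket <yours3bucketname>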

I guess you have had enough of data sources, right? The next block has the resource definition for creating the AWS CodeCommit repository. This will be used to store the code that will be used to build the docker images for hugo.

resource "aws_codecommit_repository" "hugo_repo" {
repository_name = var.codecommit_repo_name
description = "The AWS CodeCommit repository where the code to build the container will be stored."
default_branch = var.codecommit_repo_default_branch_name
}

The next resource definition block creates an Amazon Elastic Container Registry repository. This will be used to store the hugo docker images.

resource "aws_ecr_repository" "hugo_repo" {
name = var.ecr_repo_name
image_tag_mutability = "MUTABLE"
image_scanning_configuration {
scan_on_push = false
}
}

Now we get to the engine room of this project. The next resource definition creates an AWS CodeBuild project, which will take the buildspec.yml file stored in the AWS CodeCommit repository and create the hugo docker image.

resource "aws_codebuild_project" "hugo_imagebuild_project" {
name = var.codebuild_project_name
description = "AWS CodeBuild Project to build the container imager for hugo"
build_timeout = "5"
service_role = aws_iam_role.codebuild_service_role.arn
artifacts {
type = "NO_ARTIFACTS"
}
environment {
compute_type = "BUILD_GENERAL1_SMALL"
image = "aws/codebuild/amazonlinux2-x86_64-standard:4.0"
type = "LINUX_CONTAINER"
image_pull_credentials_type = "CODEBUILD"
privileged_mode = true
environment_variable {
name = "AWS_REGION"
value = local.region_name
}
environment_variable {
name = "AWS_ACCOUNT_ID"
value = local.account_id
}
environment_variable {
name = "ECR_REPO_NAME"
value = aws_ecr_repository.hugo_repo.name
}
environment_variable {
name = "IMAGE_TAG_PREFIX"
value = var.image_tag_prefix
}
}
logs_config {
cloudwatch_logs {
group_name = var.codebuild_cloudwatch_logs_group_name
stream_name = var.codebuild_cloudwatch_logs_stream_name
}
}
source {
type = "CODECOMMIT"
location = aws_codecommit_repository.hugo_repo.clone_url_http
git_clone_depth = 1
git_submodules_config {
fetch_submodules = true
}
}
tags = {
Environment = var.env
}
}

AWS CodeBuild projects run inside a container. So what we are doing here is creating a docker image inside a docker container (that is why we have to set privileged_mode = true under environment).

The above AWS CodeBuild project is of type “LINUX_CONTAINER” and uses the image “aws/codebuild/amazonlinux2-x86_64-standard:4.0”. It uses a compute type of “BUILD_GENERAL1_SMALL”.

The AWS CodeBuild project uses a service role. This will provide it permissions to access other resources. This service role’s definition is in iam.tf and will be discussed further below.

The AWS CodeBuild project uses the following environment variables during the build. These are referenced in the buildspec.yml file (a rough sketch of how a buildspec might use them follows the list below). The environment variables that are passed are:

  • AWS_REGION
  • AWS_ACCOUNT_ID
  • ECR_REPO_NAME
  • IMAGE_TAG_PREFIX

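I won’t reproduce the buildspec.yml here, but to give you an idea of how these variables are consumed, the commands it runs are along the lines of the following sketch (the actual file in the repository is the source of truth):

# authenticate docker to the Amazon ECR registry
aws ecr get-login-password --region "$AWS_REGION" | docker login --username AWS --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com"

# build the image and tag it with the prefix plus the first 8 characters of the commit sha
IMAGE_TAG="${IMAGE_TAG_PREFIX}_${CODEBUILD_RESOLVED_SOURCE_VERSION:0:8}"
ECR_URI="$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$ECR_REPO_NAME"
docker build -f Dockerfile-hugo-amd64 -t "$ECR_URI:$IMAGE_TAG" .

# push the image to the Amazon ECR repository
docker push "$ECR_URI:$IMAGE_TAG"
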
The AWS CodeBuild project stores its logs in AWS CloudWatch Logs. The name of the log group is contained in var.codebuild_cloudwatch_logs_group_name and the stream name is stored in var.codebuild_cloudwatch_logs_stream_name.
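
If you would rather follow the build logs from a terminal than from the console, AWS CLI v2 can tail the log group (substitute the value you chose for var.codebuild_cloudwatch_logs_group_name):

# stream new log events from the CodeBuild log group as the build runs
aws logs tail <codebuild_cloudwatch_logs_group_name> --follow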

Under the source block, you will see where the source files are stored. This is set to the AWS CodeCommit repository.

Note: when running this AWS CodeBuild project independently, it will retrieve the source files from the AWS CodeCommit repository. However, when it runs as a stage in the AWS CodePipeline pipeline, it actually gets the source from the AWS CodePipeline artifact store S3 bucket.

The last part of the above resource definition adds a tag to the AWS CodeBuild project, to state the environment it is running in.

Security should always be on your mind. To this end, since our AWS CodePipeline pipeline will store its artifacts in an AWS S3 bucket, we will encrypt them with an AWS KMS Customer Managed Key (CMK) to keep them secure from prying eyes.

The next resource block will create this AWS KMS CMK.

resource "aws_kms_key" "hugo_kms_key" {
description = "KMS key used by hugo CodePipeline pipeline"
}

The next resource definition block creates the orchestrator of the solution – our AWS CodePipeline pipeline.

resource "aws_codepipeline" "hugo_imagebuild_pipeline" {
name = var.codepipeline_pipeline_name
role_arn = aws_iam_role.codepipeline_role.arn
artifact_store {
location = data.aws_s3_bucket.codepipeline_artifacts_s3_bucket.id
type = "S3"
encryption_key {
id = aws_kms_key.hugo_kms_key.id
type = "KMS"
}
}
stage {
name = "Source"
action {
name = "Source"
category = "Source"
owner = "AWS"
provider = "CodeCommit"
version = "1"
output_artifacts = ["source_output"]
configuration = {
RepositoryName = aws_codecommit_repository.hugo_repo.id
BranchName = "main"
PollForSourceChanges = "false"
}
}
}
stage {
name = "Build"
action {
name = "Build"
category = "Build"
owner = "AWS"
provider = "CodeBuild"
input_artifacts = ["source_output"]
output_artifacts = ["build_output"]
version = "1"
configuration = {
ProjectName = aws_codebuild_project.hugo_imagebuild_project.name
}
}
}
}

The AWS CodePipeline pipeline needs permissions to access other resources. A role (referenced above via role_arn) is used to provide these permissions.

In the above resource definition, you will see the artifact_store being defined, which is the AWS S3 bucket that we referenced earlier via a data source. The artifacts will be encrypted using the AWS KMS CMK, as defined in the encryption_key section.

The first stage (“Source”) of the pipeline obtains the files from the AWS CodeCommit repository, zips them and puts them in the artifact store.

The second stage (“Build”) of the pipeline calls the AWS CodeBuild project and passes the zip file from the source stage to it.

Now, you might be thinking: this is all fine and dandy, but what happens when I (or others I have given access to) push commits to the AWS CodeCommit repository? Do I (or they) then need to manually trigger the AWS CodePipeline pipeline?

Well, that is one way of doing it, however it doesn’t sound very elegant, does it? Things should be automated; the more manual processes we remove, the better the solution becomes.

AWS CodePipeline has the ability to poll the source for changes (have a look at the parameter PollForSourceChanges in the “Source” stage of the above pipeline). The good thing is that this is automated; the bad thing is that it is a poll. A better approach is to create a trigger so that whenever new code is pushed to the AWS CodeCommit repository, the AWS CodePipeline pipeline is run.

That is why PollForSourceChanges is set to false, and it explains the last two resource definition blocks.

Below is the first of the last two resource definitions.

resource "aws_cloudwatch_event_rule" "trigger_image_build" {
name = var.cloudwatch_events_rule_name
description = "Trigger the CodePipeline pipline to build the image for hugo whenever a new push is made to hugo CodeCommit repository"
event_pattern = <<PATTERN
{
"source": [
"aws.codecommit"
],
"detail-type": [
"CodeCommit Repository State Change"
],
"resources": [
"${aws_codecommit_repository.hugo_repo.arn}"
],
"detail": {
"event": [
"referenceCreated",
"referenceUpdated"
],
"referenceType": [
"branch"
],
"referenceName": [
"main"
]
}
}
PATTERN
}

This is an AWS CloudWatch event rule (this is the AWS EventBridge rule from the architecture diagram – EventBridge was formerly known as CloudWatch Events), which matches event patterns. As you can see, the event pattern above matches pushes to the main branch of the AWS CodeCommit repository.

The last resource definition block in main.tf defines the target for the above AWS CloudWatch event rule.

resource "aws_cloudwatch_event_target" "trigger_image_build" {
target_id = "trigger_hugo_image_build"
rule = aws_cloudwatch_event_rule.trigger_image_build.id
arn = aws_codepipeline.hugo_imagebuild_pipeline.arn
role_arn = aws_iam_role.cloudwatch_events_role.arn
}

When the AWS CloudWatch event rule finds the above event pattern, it will trigger the AWS CodePipeline pipeline. The AWS CloudWatch event target requires permissions to trigger the pipeline and this is granted via the role_arn defined above. This is defined in iam.tf and will be discussed further below.
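
In effect, this rule automates what you could otherwise do by hand from the command line:

# manually trigger the pipeline (the default pipeline name in this project is hugo_dev_imagebuild_pipeline)
aws codepipeline start-pipeline-execution --name hugo_dev_imagebuild_pipeline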

Now let’s look at iam.tf. Open it in your favourite IDE and let’s walk through the code.

As previously mentioned, this file contains definitions for all the AWS IAM roles and policies.

Here is the first resource definition.

resource "aws_iam_role" "codebuild_service_role" {
name = var.codebuild_service_role_name
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "codebuild.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
}

The above creates the AWS IAM role that will be used as the AWS CodeBuild service role. The role starts with no permissions attached; its trust policy allows the service codebuild.amazonaws.com to assume it.

The next resource definition contains the AWS IAM role policy that grants the various permissions to the AWS CodeBuild service role.

resource "aws_iam_role_policy" "codebuild_service_role_policy" {
role = aws_iam_role.codebuild_service_role.name
policy = <<POLICY
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AccessToAWSCloudWatchLogs",
"Effect": "Allow",
"Resource": [
"arn:aws:logs:${local.region_name}:${local.account_id}:log-group:${var.codebuild_cloudwatch_logs_group_name}:*"
],
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
]
},
{
"Sid":"AccessToAmazonECR",
"Effect":"Allow",
"Action":[
"ecr:BatchGetImage",
"ecr:BatchCheckLayerAvailability",
"ecr:DescribeImages",
"ecr:DescribeRepositories",
"ecr:GetDownloadUrlForLayer",
"ecr:InitiateLayerUpload",
"ecr:ListImages",
"ecr:CompleteLayerUpload",
"ecr:GetAuthorizationToken",
"ecr:InitiateLayerUpload",
"ecr:PutImage",
"ecr:UploadLayerPart"
],
"Resource": [
"${aws_ecr_repository.hugo_repo.arn}"
]
},
{
"Effect": "Allow",
"Action": "ecr:GetAuthorizationToken",
"Resource": "*"
},
{
"Sid": "CodeBuildAccessToS3",
"Effect":"Allow",
"Action": [
"s3:GetObject",
"s3:GetObjectVersion",
"s3:GetBucketVersioning",
"s3:PutObjectAcl",
"s3:PutObject"
],
"Resource": [
"${data.aws_s3_bucket.codepipeline_artifacts_s3_bucket.arn}",
"${data.aws_s3_bucket.codepipeline_artifacts_s3_bucket.arn}/*"
]
},
{
"Sid": "CodeBuildAccesstoKMSCMK",
"Effect": "Allow",
"Action": [
"kms:DescribeKey",
"kms:GenerateDataKey*",
"kms:Encrypt",
"kms:ReEncrypt*",
"kms:Decrypt"
],
"Resource": [
"${aws_kms_key.hugo_kms_key.arn}"
]
}
]
}
POLICY
}

The above AWS IAM role policy grants the AWS CodeBuild service role access to:

  • AWS CloudWatch Logs
  • Amazon ECR repository
  • Amazon S3
  • AWS KMS CMK

The next resource definition creates an empty AWS IAM role with a trust policy that allows the AWS Service codepipeline.amazonaws.com to assume it. This will be used as the codepipeline role.

resource "aws_iam_role" "codepipeline_role" {
name = var.codepipeline_role_name
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "codepipeline.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
}

Then comes the resource definition that grants permissions to the codepipeline role.

resource "aws_iam_role_policy" "codepipeline_policy" {
name = var.codepipeline_role_policy_name
role = aws_iam_role.codepipeline_role.id
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "CodePipelineAccessToS3",
"Effect":"Allow",
"Action": [
"s3:GetObject",
"s3:GetObjectVersion",
"s3:GetBucketVersioning",
"s3:PutObjectAcl",
"s3:PutObject"
],
"Resource": [
"${data.aws_s3_bucket.codepipeline_artifacts_s3_bucket.arn}",
"${data.aws_s3_bucket.codepipeline_artifacts_s3_bucket.arn}/*"
]
},
{
"Sid": "CodePipelineAccesstoKMSCMK",
"Effect": "Allow",
"Action": [
"kms:DescribeKey",
"kms:GenerateDataKey*",
"kms:Encrypt",
"kms:ReEncrypt*",
"kms:Decrypt"
],
"Resource": [
"${aws_kms_key.hugo_kms_key.arn}"
]
},
{
"Sid": "AccessToCodeCommitRepo",
"Effect": "Allow",
"Resource": [
"${aws_codecommit_repository.hugo_repo.arn}"
],
"Action": [
"codecommit:GetBranch",
"codecommit:GetCommit",
"codecommit:UploadArchive",
"codecommit:GetUploadArchiveStatus",
"codecommit:GitPull"
]
},
{
"Effect": "Allow",
"Action": [
"codebuild:BatchGetBuilds",
"codebuild:StartBuild"
],
"Resource": "*"
}
]
}
EOF
}

The above resource definition gives the codepipeline role permissions on the following AWS resources:

  • Amazon S3
  • AWS KMS CMK
  • AWS CodeCommit repository
  • AWS CodeBuild (to start builds and check their status)

Following the above, the next resource definition creates an empty AWS IAM role that will be used by the AWS CloudWatch event rule. The resource definition attaches a trust policy that allows the AWS Service events.amazonaws.com to assume it.

resource "aws_iam_role" "cloudwatch_events_role" {
name = var.cloudwatch_events_role_name
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "events.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
}

The last resource definition in iam.tf gives the above role access to trigger the AWS CodePipeline pipeline.

resource "aws_iam_role_policy" "cloudwatch_events_policy" {
name = var.cloudwatch_events_role_policy_name
role = aws_iam_role.cloudwatch_events_role.id
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "CloudWatchPermissionToStartCodePipelinePipeline",
"Effect": "Allow",
"Action": [
"codepipeline:StartPipelineExecution"
],
"Resource": [
"${aws_codepipeline.hugo_imagebuild_pipeline.arn}"
]
}
]
}
EOF
}

The next file that we will walk through is locals.tf. Open it in your favourite IDE.

This file just has the following content.

locals {
  account_id  = data.aws_caller_identity.current.account_id
  region_name = data.aws_region.current.name
}

Instead of having to use the full data source path to get the account id and region name each time, I assign them to local values. This improves the readability of the code (and there is less typing to do).

Next up, let’s look at variables.tf. Open it in your favourite IDE.

Here are the contents of this file.

variable "env" {
type = string
description = "The name of the environment where this project is being run. eg dev, test, preprod, prod."
}
variable "codecommit_repo_name" {
type = string
description = "The name of the AWS CodeCommit Repository where the code to build the container image for hugo will be stored."
}
variable "codecommit_repo_default_branch_name" {
type = string
description = "The default branch for the AWS CodeCommit Repo"
}
variable "ecr_repo_name" {
type = string
description = "The name of the AWS Elastic Container Repository where the image for hugo will be stored"
}
variable "codebuild_project_name" {
type = string
description = "The name of the AWS CodeBuild project"
}
variable "codebuild_service_role_name" {
type = string
description = "The name of the AWS CodeBuild Project's service role"
}
variable "image_tag_prefix" {
type = string
description = "The prefix to use when tagging the newly build docker image"
}
variable "codepipeline_pipeline_name" {
type = string
description = "The name of the AWS CodePipeline pipeline"
}
variable "codepipeline_role_name" {
type = string
description = "The name of the AWS CodePipeline role"
}
variable "codepipeline_role_policy_name" {
type = string
description = "The name of the AWS CodePipeline role policy"
}
variable "cloudwatch_events_role_name" {
type = string
description = "The name for the AWS CloudWatch Events role policy"
}
variable "cloudwatch_events_role_policy_name" {
type = string
description = "The name for the AWS CloudWatch Events role policy"
}
variable "cloudwatch_events_rule_name" {
type = string
description = "The name of the AWS CloudWatch Event rule that will trigger the image build pipeline"
}
variable "codepipeline_artifacts_s3_bucket_name" {
type = string
description = "The name of the S3 bucket where CodePipeline will store its artifacts"
}
variable "codebuild_cloudwatch_logs_group_name" {
type = string
description = "The cloudwatch logs group name for CodeBuild"
}
variable "codebuild_cloudwatch_logs_stream_name" {
type = string
description = "The cloudwatch logs stream name for CodeBuild"
}

As you can see, this file defines all the variables that are used in the terraform code. However, notice that none of these variables have any default values. Wonder why that is so? Wait till we get to the implementation section and then you will find out.

Let’s now look at the last file. Open outputs.tf in your favourite IDE.

Below are the contents of this file.

output "codecommit_repo_clone_url" {
description = "AWS CodeCommit Repository Clone URL"
value = aws_codecommit_repository.hugo_repo.clone_url_http
}

Output values are used to display information on the command line. I use them to provide the person running terraform with information about resources that might be of interest.

In the above code, I am outputting the URL that will be used to clone the AWS CodeCommit repository.
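
As a side note, you can retrieve this value again at any time after the deployment by querying the outputs stored in the state (run from inside the terraform folder):

# print the clone url of the AWS CodeCommit repository
terraform output codecommit_repo_clone_url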

At this point, I am hoping you are now well versed with the code. Let’s move on to the interesting bit: implementing the above solution!

Implementation

In this section, I will take you through the steps to deploy the above solution in your own AWS Account. Let’s begin.

  1. From the root folder that you cloned from my GitHub repository, open Makefile in your favourite IDE.
  2. As previously mentioned when going through variables.tf, none of the variables have a default value. This is because we should not hardcode values; instead, they should be easily customisable. I achieve this by defining them in the Makefile and passing them to terraform when it is called. Now you can appreciate the importance of the Makefile (it also makes running commands much easier).
  3. Below are the only variables that (if required) need changing.
    • PROJECT_NAME=hugo – this is the project name (set to hugo by default). It will be used to prefix the names of all the resources. I recommend keeping the default.
    • ENV ?= dev – this is the environment name. It is used to suffix the names of the resources. Leave it at the default (dev) unless you are running this in a different environment.
    • TF_VAR_codepipeline_artifacts_s3_bucket_name = <yours3bucketname> – this is the Amazon S3 bucket where AWS CodePipeline artifacts will be stored. Provide the name of an existing Amazon S3 bucket here.
  4. Open a command prompt (terminal on MacOS, cmd on Windows) and cd into the root of the above cloned folder.
  5. At this point, ensure you have already installed git and the AWS CLI, and have configured your AWS CLI profile. Once done, run the following commands
    • run “make terraform_init” – this will initialise the terraform project by downloading the required providers.
    • run “make terraform_plan” – this will generate and display the changes that will be done to your AWS Account. You should see a list with all the resources that we discussed in the walk through. The changes will also be saved in a file called hugo_plan.tfplan inside the terraform folder.
    • run “make terraform_apply” to apply the changes to your AWS Account.
    • Once terraform has successfully provisioned all the resources, the AWS CodeCommit repository’s clone url will be displayed. Copy this url as it will be used later.
    • If you want, you can login to the AWS Management Console to validate that all the resources have been provisioned correctly.
    • Open another command prompt session – this will be used to clone the AWS CodeCommit repository. You will need git credentials for this; if you don’t have these, follow this document to generate them: https://docs.aws.amazon.com/codecommit/latest/userguide/setting-up-gc.html.
    • In the new command prompt session, clone this new repository using the url that was provided above; when prompted, use your git credentials. Use this command to clone the repository: “git clone <AWS CodeCommit repository clone url>”.
    • Once the git repository has been cloned (it will be empty), check out a branch called main (if you are not on this branch already) using “git checkout -b main”.
    • Go back to the folder that contains the files you cloned from my GitHub repository. You will find a subfolder called codecommit_repo_files. Copy the “contents” of this folder into the folder you cloned the newly created AWS CodeCommit repository into. Don’t copy the codecommit_repo_files folder, just its contents.
    • Stage these files, commit them and push them to your repository (a command sketch for these steps follows this list).
    • Login to the AWS Management Console and open the AWS CodePipeline portal. You should see the newly created pipeline (default name is hugo_dev_imagebuild_pipeline). Within a couple of seconds, it will automatically trigger and start the “Source” stage. It will then proceed to the “Build” stage. You can click on the respective urls within the AWS CodePipeline pipeline job to view its progress.
    • Once successfully complete, the docker image will be stored in your Amazon Elastic Container Registry. Open the Amazon Elastic Container Registry console from the AWS Management Console and look for the repository (default name is hugo_dev). Inside it, you will see an image whose tag starts with 0.110.0_ext_amd64_debian_stable-20230109-slim and ends with the first 8 characters of the commit sha of the push you made to the AWS CodeCommit repository.
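
To put the clone, commit and push steps above together, the commands look roughly like this (the clone url is the value output by terraform; the local folder names are just examples):

# clone the (empty) AWS CodeCommit repository and switch to the main branch
git clone <AWS CodeCommit repository clone url> hugo-image-repo
cd hugo-image-repo
git checkout -b main

# copy in the contents of the codecommit_repo_files folder from my GitHub repository
cp -R ../blog-create-docker-image/codecommit_repo_files/. .

# stage, commit and push - the push triggers the pipeline
git add .
git commit -m "Add files to build the hugo docker image"
git push --set-upstream origin main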

That’s it! That’s all it takes to create a new docker image every time you push code to your AWS CodeCommit repository.

One last thing before we finish up. The docker image is created using the “recipe” contained in a Dockerfile. For this project, you will find it inside the codecommit_repo_files folder (in the files you cloned from my GitHub repository), in a file called Dockerfile-hugo-amd64. Open this file in your favourite IDE.

The contents are pasted below.

FROM --platform=amd64 debian:stable-20230109-slim
RUN apt update -y
RUN apt install git wget -y
ARG HUGO_VERSION=0.110.0
ARG HUGO_PLATFORM=linux-amd64.deb
RUN cd /tmp && wget https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_extended_${HUGO_VERSION}_${HUGO_PLATFORM}
RUN dpkg -i /tmp/hugo_extended_${HUGO_VERSION}_${HUGO_PLATFORM}
RUN mkdir /www
COPY websitefiles/ /www
WORKDIR /www
EXPOSE 1313
CMD ["hugo", "server", "--buildDrafts", "--bind", "0.0.0.0"]

Let’s quickly go through the above code (a sketch for building and running this image locally follows the list below).

  • To create the hugo docker image, we will use the debian:stable-20230109-slim base docker image (retrieved from dockerhub).
  • we will then run “apt update -y” to refresh the package lists.
  • we will then install git and wget onto it.
  • the next two lines are defining some variables that will be used to retrieve the hugo package.
  • next, using wget, we will download the hugo debian package from hugo’s GitHub repository and store it in /tmp
  • we will then install this debian package
  • next we will create a folder called /www
  • the next line copies the contents of the folder named websitefiles to /www inside the docker image.
  • the next line sets the working directory to /www so that when we run anything from within this docker image, that is the default directory it will look into.
  • the next line exposes tcp port 1313. This is the port that hugo server listens on.
  • the last command starts hugo in server mode so that it will serve the website based on the files in the /www folder.
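
If you have docker installed locally, you can also build and run this image on your own machine without going through the pipeline. A rough sketch (run from a folder that contains Dockerfile-hugo-amd64 and the websitefiles folder; the image name hugo-local is just an example, and the --platform flag in the Dockerfile needs a docker version with BuildKit enabled):

# build the image locally
docker build -f Dockerfile-hugo-amd64 -t hugo-local .

# run it, mapping the hugo server port to your machine
docker run --rm -p 1313:1313 hugo-local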

Testing

Since we have done all this hard work to create this docker image, it would be a shame not to test it, right? Here are the steps to run the image on your local computer using docker.

  1. A few prerequisites need to be met before we continue. Ensure you have installed docker locally on your computer and that the AWS user that you used to configure the AWS CLI profile has permissions to log in to Amazon Elastic Container Registry.
  2. Using the AWS Management Console, open the Amazon ECR portal. Browse to the newly created repository (default name is hugo_dev). The newly created image will be visible inside. Click on the Image tag of the image and then in the next screen, copy its URI.
  3. Open a command prompt. Use the following command to login to the Amazon ECR repository
    • aws ecr get-login-password --region $AWS_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com
      • where
        • AWS_REGION – is the region where your Amazon ECR repository exists
        • AWS_ACCOUNT_ID – is your AWS Account ID
  4. After successfully authenticating to Amazon ECR, run the following command to run the image locally on your computer
    • docker run -p 1313:1313 <IMAGE URI>
      • where
        • IMAGE URI – this is the URI that you copied above
  5. You should see the image being downloaded from the Amazon ECR repository. Once the container is up and running, use your browser to go to http://localhost:1313.
  6. You should see the website (built from the files in the websitefiles folder) served by hugo.

Cleaning up

After you have finished testing the above solution in your AWS Account, to ensure you don’t get charged unnecessarily, delete all the resources that were created.

Before doing so, make sure you have deleted all the images from inside the Amazon ECR repository, otherwise Terraform will not be able to delete the repository (a command line sketch for this is included below). Then, from within the folder that contains the files that you cloned from my GitHub repository, run the following command.
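
If you would rather empty the Amazon ECR repository from the command line, a sketch like the following should do it (it assumes the default repository name hugo_dev):

# delete every image in the repository so that terraform can then delete the repository itself
aws ecr batch-delete-image --repository-name hugo_dev --image-ids "$(aws ecr list-images --repository-name hugo_dev --query 'imageIds[*]' --output json)"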

make terraform_destroy

Note: You might get an error on the first try about the AWS CodeBuild project refusing to be deleted. Just rerun the destroy command and it should be removed successfully.

Summary

I hope you enjoyed this blog and got a great insight into how you can use AWS CodeCommit, AWS CodeBuild, AWS CodePipeline and Amazon Elastic Container Registry to automatically build docker images.

The recipe for creating the docker image is contained within the Dockerfile. I hope you got a good understanding about this as well.

In the next blog, I will extend this solution further. I will create an Amazon Elastic Kubernetes Service (EKS) cluster and then deploy this docker image into it.

I will see you in the next blog. Till then, stay safe.