GitHub CI/CD pipeline design choices for deploying Terraform code to GCP data domain
We have built our GCP data platform using Data Mesh principles. Each line-of-business (LoB) roughly translates to a domain on GCP. Each domain has three environments (Dev, Non-Prod and Prod), and each environment has three GCP projects/zones.
This is the mapping between GitHub, Terraform and GCP domains.
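As a rough illustration of the GCP side of that mapping, here is a minimal Python sketch that enumerates the projects for one domain. The zone names, the domain name and the project naming scheme are assumptions made up for the example; the only facts taken from the setup above are three environments and three projects/zones per environment.

```python
from itertools import product

ENVIRONMENTS = ["dev", "non-prod", "prod"]
# Hypothetical zone names: the post only states that each environment
# has three projects/zones, not what they are called.
ZONES = ["zone-1", "zone-2", "zone-3"]


def projects_for_domain(domain: str) -> list[str]:
    """Return one (hypothetical) project ID per environment/zone pair."""
    return [f"{domain}-{env}-{zone}" for env, zone in product(ENVIRONMENTS, ZONES)]


projects = projects_for_domain("finance")  # "finance" is an example domain
print(len(projects))  # 9
print(projects[0])    # finance-dev-zone-1
```

So every domain ends up with nine projects that the pipeline has to target.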

I have come up with four designs for the CI/CD deployment pipeline. Let me walk through them here.
Design 1: 3 environments, 3 branches and 3 code bases in GitHub

In design 1, every GitHub environment has its own branch and code base. Deployment always follows the same sequence: code changes are first merged to Dev, then to Non-Prod and finally to Prod, each via a Pull Request (PR).
Each PR merges the code into the corresponding code base and also runs “terraform apply” on the GCP side.
The advantage of this design is that it’s simple to understand and implement. On the downside, maintaining three code bases within one repo is cumbersome, and they can easily drift out of sync.
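The fixed Dev, Non-Prod, Prod sequence in design 1 can be expressed as a simple rule on PR source and target branches. The sketch below is only an illustration: the branch names are assumptions, and in practice such a rule would more likely live in branch protection settings or a workflow check than in a standalone script.

```python
from typing import Optional

# Hypothetical branch names, one per environment, in promotion order.
PROMOTION_ORDER = ["dev", "non-prod", "prod"]


def required_source(target: str) -> Optional[str]:
    """Return the branch a PR into `target` must come from.

    Returns None for the first environment, which accepts feature branches.
    """
    idx = PROMOTION_ORDER.index(target)
    return PROMOTION_ORDER[idx - 1] if idx > 0 else None


def is_valid_promotion(source: str, target: str) -> bool:
    """Check that a PR respects the Dev -> Non-Prod -> Prod sequence."""
    required = required_source(target)
    if required is None:
        # Any branch that is not itself an environment branch may target dev.
        return source not in PROMOTION_ORDER
    return source == required


print(is_valid_promotion("feature/add-bigtable", "dev"))  # True
print(is_valid_promotion("dev", "prod"))                  # False
```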
Design 2: 3 environments, 1 branch and 1 code base in GitHub

In design 2, there is only one branch and one code base. We check out feature branches from the Prod branch and make the code changes there. For the Dev and Non-Prod envs, the pipeline only runs “terraform apply” from the feature branch; nothing is merged. The code changes are merged into the Prod code base only when the PR is created for the Prod deployment.
The advantage of this design is lower maintenance overhead, as there is only one code base to manage. On the downside, it leads to problems when there are many feature branches. Because no code is merged until the Prod deployment, each feature branch contains only its own changes, so when it is applied to Dev or Non-Prod, Terraform will destroy the resources that earlier feature branches deployed there.
This design suits small teams where only one or two people perform the deployment work.
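To see why unmerged feature branches undo each other's work in design 2, here is a deliberately simplified Python model of how Terraform reconciles configuration with state. It is not Terraform itself; the resource names are made up, and it assumes the Dev environment has a single state shared by all feature branches.

```python
def plan(state: set[str], config: set[str]) -> dict[str, set[str]]:
    """Simplified plan: create what the config adds, destroy what it lacks."""
    return {"create": config - state, "destroy": state - config}


def apply(state: set[str], config: set[str]) -> set[str]:
    """After an apply, the state contains exactly what the config declares."""
    return set(config)


# Hypothetical resources. Both feature branches are cut from the Prod branch.
prod_branch = {"gcs_bucket"}
feature_a = prod_branch | {"bigtable_instance"}      # branch A adds Bigtable
feature_b = prod_branch | {"data_fusion_instance"}   # branch B adds Data Fusion

dev_state = set(prod_branch)
dev_state = apply(dev_state, feature_a)   # branch A is deployed to Dev first

result = plan(dev_state, feature_b)       # branch B is deployed to Dev next
print(result["create"])   # {'data_fusion_instance'}
print(result["destroy"])  # {'bigtable_instance'}
```

Branch B knows nothing about Bigtable because branch A was never merged, so the plan for branch B removes it from Dev.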
Design 3: 3 environments, 2 branches and 2 code bases in GitHub

In design 3, there are two branches and two code bases. Code changes made in feature branches are merged into the Dev code base through a PR, and that PR also runs “terraform apply” on the Dev and Non-Prod envs.
For Prod, we check out a new feature branch from the Prod branch and manually copy into it, from the Dev branch, the code files for the services we want to deploy to Prod.
The advantage of this design is that it’s easy to understand and implement. On the downside, preparing a Prod deployment involves manual work.
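The manual copy step in design 3 amounts to overlaying selected files from the Dev branch onto a fresh checkout of the Prod branch. The sketch below models each branch as a mapping from file path to content and assumes, purely for illustration, that each service lives in its own top-level directory; the paths and contents are hypothetical.

```python
def prepare_prod_branch(
    dev_files: dict[str, str],
    prod_files: dict[str, str],
    services: list[str],
) -> dict[str, str]:
    """Return the files of a new Prod feature branch.

    Starts from the Prod branch and copies over the Dev files that belong
    to the chosen services (assumed layout: "<service>/<file>").
    """
    new_branch = dict(prod_files)
    for path, content in dev_files.items():
        if any(path.startswith(f"{service}/") for service in services):
            new_branch[path] = content
    return new_branch


dev = {
    "bigtable/main.tf": "# bigtable resources",
    "data_fusion/main.tf": "# data fusion resources",
}
prod = {"network/main.tf": "# existing prod resources"}

release = prepare_prod_branch(dev, prod, ["data_fusion"])
print(sorted(release))  # ['data_fusion/main.tf', 'network/main.tf']
```

Anything not explicitly selected stays out of Prod, which is exactly the part that has to be done by hand in this design.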
Design 4: 3 environments, 1 branch and 1 code base in GitHub

This design is similar to design 2, with a small but significant difference. There is again only one branch and one code base, and code changes are deployed to the Dev and Non-Prod envs via a PR. Here, however, the same PR also merges the code changes into the Prod branch.
For the Prod deployment, we may not want to deploy everything that has been merged into the Prod branch. This is where we use feature toggles.
For example, let’s say we deployed both the Bigtable and Data Fusion services to the Dev and Non-Prod GCP envs. The Terraform code for both services has already been merged into the Prod branch by the PR that deployed it to Dev and Non-Prod. For the Prod deployment, we want to deploy only the Data Fusion service to the Prod GCP env. “Feature toggles” let us select only the services we want to deploy to Prod.
The advantage of this design is lower maintenance overhead, as there is only one branch and one code base. On the downside, implementing “feature toggles” can become complicated and has to be handled carefully.
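The toggle idea from the Bigtable and Data Fusion example can be sketched as a per-environment table of flags. This is a minimal illustration in Python, not the actual implementation; the toggle names and the structure of the table are assumptions.

```python
# Hypothetical per-environment feature toggles. All code is merged in the
# single branch; the flags decide what actually gets deployed where.
FEATURE_TOGGLES = {
    "dev":      {"bigtable": True,  "data_fusion": True},
    "non-prod": {"bigtable": True,  "data_fusion": True},
    "prod":     {"bigtable": False, "data_fusion": True},
}


def services_to_deploy(env: str) -> list[str]:
    """Return the services whose toggle is switched on for `env`."""
    toggles = FEATURE_TOGGLES[env]
    return sorted(name for name, enabled in toggles.items() if enabled)


print(services_to_deploy("dev"))   # ['bigtable', 'data_fusion']
print(services_to_deploy("prod"))  # ['data_fusion']
```

In Terraform itself, one common way to wire up such a flag is a boolean input variable per service combined with a conditional “count” on the resource, with the values supplied per environment. Whether that fits depends on how the modules are organised.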
Disclaimer: The posts here represent my personal views and not those of my employer or any specific vendor. Any technical advice or instructions are based on my own personal knowledge and experience.