Wednesday, August 16, 2023

Optimizing GitHub Workflows: How to Auto cleanup your cache after use


Intro


In the CI/CD space, every second counts, just like in an F1 race. That's where the GitHub Actions cache comes in. Caching is like a pit stop for your code: it saves precious time by storing previously downloaded dependencies and reusing them in later runs, which speeds up pipeline execution.


However, using a cache, especially in public repositories, can be risky and open to malicious intrusion, so you need to shield yourself against potential vulnerabilities and cache attacks. In this blog post, we’ll optimize GitHub cache usage across different workflow jobs, but clean things up as soon as we’re done.

 

Caching vulnerabilities in public Repos


When it comes to caching in a public repository (with secrets), running workflows can become a dangerous practice. Here's why:

  • Exposure to Unauthorized Access: Anyone with read access can create a pull request and access the sensitive data within a public repo cache.

  • Forks of a repository can also create PRs on the base branch and access caches on the base branch.

  • Data Exfiltration: Hackers can exploit cached secrets to gain unauthorized access to your systems.

  • No Encryption: Caches are not encrypted by default, making stored secrets easily readable if discovered.

  • Inadvertent Exposure: Devs might accidentally push sensitive data to a public repo without realizing it.

  • Cache Poisoning: A malicious tool used in a test workflow can poison its cache, and another workflow using the same cache later might be affected. Read more in this github-cache-poisoning article.


It is even worse for artifacts, as there’s literally a download button accessible to anyone on the internet.


Remediation


There are essential security best practices to minimize this risk, but today, I'll focus on only one from the list below.

  • Secret Management Tools: Use secret management tools such as Vault or AWS Secrets Manager.

  • Use OIDC: See my previous blog post (OIDC in GitHub actions)

  • Private Repositories: Not always possible (OSS projects), but helps limit access to authorized users only.

  • Encryption: Ensure strong encryption of the cached data

  • Don't store any sensitive information in the cache

  • Temporary Caching: If you need to use caching for performance optimization, ensure that it's temporary and short-lived.


Demo: Instant Cache cleanup


As mentioned earlier, one way to minimize the attack surface is to clear the cache regularly to prevent long-term exposure. The following example demonstrates exactly that, using the cache action and the GitHub CLI.


Cache Retention in GitHub

  • By default, GitHub removes cache entries that have not been accessed in over 7 days.

  • There is no limit on the number of caches, but the total size of all caches in a repository is limited to 10 GB.

  • Artifacts and workflow log files, on the other hand, are retained for 90 days by default before being deleted automatically.

Cache action

We’ll be using a cache action called actions/cache@v3 that has 3 main parameters:

  • path: A list of files, directories, and wildcard patterns to cache and restore

  • key: An explicit key for a cache entry

  • restore-keys: A list of prefix-matched keys used to restore a stale cache if no cache hit occurred for key.
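To make the prefix matching concrete, here is a minimal sketch of a more conventional setup than the run-scoped key used later in this post. The lock file path is an assumption for illustration only; hashFiles() is GitHub's built-in expression function.

```yaml
# Sketch only: the lock file path is an assumption, adjust it to your project.
- name: Cache Terraform plugins
  uses: actions/cache@v3
  with:
    path: ~/.terraform.d/plugin-cache
    # Exact key: changes whenever the provider lock file changes
    key: ${{ runner.os }}-terraform-${{ hashFiles('**/.terraform.lock.hcl') }}
    # Fallback: restore the most recent cache whose key starts with this prefix
    restore-keys: |
      ${{ runner.os }}-terraform-
```

On an exact hit, the cache is restored as-is. On a prefix hit, a stale cache is restored and a fresh entry is saved under the exact key at the end of the job.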


                                              PREREQUISITES
 

  • A repository

    Example   Repo: brokedba/githubactions_hacks  Branch: git_actions
    You can clone my repo and reload it into your GitHub but remember to add the environment & branch.

  • An environment
    Name: lab_tests, with deployment branch set to `selected branch`, i.e. git_actions

  • A workflow

      You can download test_cache_cleanup.yml under .github/workflows.

                 
  • The common workflow and jobs declaration

  • Trigger event: push     Target branch: git_actions

  • paths: our yaml workflow test_cache_cleanup.yml

# test_cache_cleanup.yml

name: 'My_Cache_cleanup_Workflow'
on:
  push:                                              # <-- Trigger
    branches: [ "git_actions" ]
    paths:
      - '.github/workflows/test_cache_cleanup.yml'   # <-- File

jobs:
  terraform_setup_cache_load:
    runs-on: ubuntu-latest
    environment: test-labs                           # <-- Environment linked to the git_actions branch
    # snippet ...

  • Initial steps: checkout the repo and install a specific version of terraform (1.0.3)

    steps:
      # 1. Checkout the repository to the GitHub Actions runner
      - name: Checkout
        uses: actions/checkout@v3

      # 2. Install a specific version of the Terraform CLI
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v1
        with:
          terraform_version: 1.0.3
          terraform_wrapper: false
 

  • Prepare the dependencies directory (for the terraform provider files)

  • Note: My repo has no Terraform config files, but we’ll assume we ran terraform init (see step #4).

      # 3. Create a cache for the terraform plugin and copy terraform binary
      - name: Config Terraform plugin cache
        run: |
          echo 'plugin_cache_dir="$HOME/.terraform.d/plugin-cache"' > ~/.terraformrc
          mkdir --parents ~/.terraform.d/plugin-cache
          terraform -v
          terra_bin=$(which terraform)
          cp "$terra_bin" .   # copy terraform binary to local directory

      # 4. Perform remaining steps, e.g. terraform init before caching.
      # - name: terraform init
      #   run: |
      #     # Initialize the Terraform directory (creating initial files, loading modules etc.)
      #     # example: terraform init ...

      # snippet ...

  • Cache all dependencies (terraform 1.0.3 binary + provider plugin)

    • Cache key includes github.run_id which is a unique ID for our workflow run

    • Restore-keys will use the same pattern, e.g. “Linux-terraform-5868971041”

      # ###################################
      # Save directory files into our cache
      # ###################################

      # Save all plugin files and the working directory in a cache
      - name: Cache Terraform
        uses: actions/cache@v3
        with:
          path: |
            ~/.terraform.d/plugin-cache
            ./*
          key: ${{ runner.os }}-terraform-${{ github.run_id }}   # <-- Our unique cache key
          restore-keys: |
            ${{ runner.os }}-terraform-${{ github.run_id }}

  • Now it's time to restore the cache in another job (runner), avoiding the repo checkout, terraform install and initialization.

# ###################################
# JOB 2 : terraform Plan
# ###################################

  Terraform_Plan:
    name: 'Terraform Plan'
    runs-on: ubuntu-latest
    environment: test-labs                # <-- Environment linked to the git_actions branch
    permissions: write-all
    needs: [terraform_setup_cache_load]   # <-- dependency on the previous job's success

    # Use default shell
    defaults:
      run:
        shell: bash

    steps:
      # ######################################
      # Restore directory files from the cache
      # ######################################

      # 1. Restore all plugin files, tf binaries, and working directory from the cache
      - name: Cache Terraform
        uses: actions/cache@v3
        with:
          path: |
            ~/.terraform.d/plugin-cache
            ./*
          key: ${{ runner.os }}-terraform-${{ github.run_id }}
          # Our restore cache key
          restore-keys: |
            ${{ runner.os }}-terraform-${{ github.run_id }}

  • Right after that, we can run additional steps like terraform plan

      # 2. Configure terraform in the new runner reusing the cache.
      - name: Config Terraform plugin cache
        run: |
          echo 'plugin_cache_dir="$HOME/.terraform.d/plugin-cache"' > ~/.terraformrc
          # terraform init not needed here. Provider files are already in the cache

      # 3. Execute terraform PLAN
      - name: Terraform Plan
        run: |
          echo "== Reusing cached version of terraform binary 1.0.3 =="
          sudo cp ./terraform /usr/local/bin/
          terraform -v
          # example: terraform plan -input=false -no-color -out tf.plan

  • Finally, we clean up our cache when we're done with our workflow, using the GitHub CLI (cache deletion requires a token with write permission).

  • That's all there is to it. I chose the GitHub CLI over the REST API to list and delete the cache with our unique key. Now let's look at the logs.
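A minimal sketch of such a cleanup step, not necessarily identical to the one in test_cache_cleanup.yml, could look like the following. It assumes the step runs last in the Terraform_Plan job, that the job token has write access to Actions, and that the gh-actions-cache extension for the GitHub CLI is used. The step name and the environment variable names are illustrative.

```yaml
# Sketch only: assumes this is the last step of the Terraform_Plan job and that
# the job token has write access to Actions (e.g. permissions: write-all).
- name: Cleanup cache
  if: always()   # assumption: clean up even when an earlier step failed
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    REPO: ${{ github.repository }}
    BRANCH: ${{ github.ref }}
    CACHE_KEY: ${{ runner.os }}-terraform-${{ github.run_id }}
  run: |
    # Assumption: use the gh-actions-cache extension of the GitHub CLI
    gh extension install actions/gh-actions-cache
    echo "== Caches currently available on this branch =="
    gh actions-cache list -R "$REPO" -B "$BRANCH"
    echo "== Deleting cache $CACHE_KEY =="
    gh actions-cache delete "$CACHE_KEY" -R "$REPO" -B "$BRANCH" --confirm
```

Newer GitHub CLI releases also ship built-in gh cache list and gh cache delete commands, which would avoid installing the extension.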


Workflow Execution result

                    

  • The cache is visible for a short period during the execution
             

  • But at the end, we can see the listed available caches and the matching cache being deleted.
     

Artifact vs. Caching

Both are similar concepts that speed up the execution of CI/CD pipelines, but each serves a slightly different purpose.

Caching

  • Involves storing intermediate results, dependencies and commonly reused files from previous job runs.

  • When your workflow/job runs again, it quickly retrieves the stored items instead of recreating them.

  • This greatly speeds up the execution time of the pipeline, since the same operations are performed only once.

  • It’s ideal for components such as libraries, dependencies, or intermediate build outputs. It's like keeping your tools on standby, so you don't need to fetch them every time you work on a task.

  • GitHub does not allow modifications once entries are pushed – cache entries are read-only records.

Artifacts

  • Allow you to share data between running jobs and save them after the workflow is complete.

  • An artifact is a file or collection of files produced during a workflow run.

  • Example: a Docker image that is built early in the CI workflow but needs to be pushed or run in a later stage.



Difference

  • Caching is used to re-use non-changing files between jobs/workflows like sharing build dependencies.

  • Artifacts are used to save files after the workflow has ended, such as logs, manifests, state files, built binaries etc.
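To contrast with the cache steps used earlier, here is a minimal sketch of passing a file between two jobs as an artifact. The job names, the artifact name and the tf.plan file content are illustrative assumptions, and the sketch relies on the official upload-artifact and download-artifact actions.

```yaml
# Sketch only: job names, artifact name and file content are hypothetical.
jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - name: Produce a file
        run: echo "plan output" > tf.plan
      - name: Upload it as an artifact
        uses: actions/upload-artifact@v4
        with:
          name: tf-plan
          path: tf.plan
          retention-days: 1   # keep it short-lived instead of the 90-day default
  apply:
    runs-on: ubuntu-latest
    needs: plan
    steps:
      - name: Download the artifact from the previous job
        uses: actions/download-artifact@v4
        with:
          name: tf-plan
      - name: Use it
        run: cat tf.plan
```

Unlike a cache entry, the artifact stays attached to the workflow run after it completes, which is why a short retention period matters for anything sensitive.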

Conclusion:

  • Caching optimizes workflows, but you should ensure your caches don't become targets for malicious actors.

  • We concentrated on instant GitHub cache cleanup today as one way to mitigate security vulnerabilities.

  • GitHub CLI is the ideal tool that streamlines cache cleanup directly from within your workflow.

  • Strategic setup and cleanup (i.e. regularly clearing GitHub caches) sustain a secure workflow.

  • Next, I will implement these tips in OIDC based .

Stay tuned