In CI/CD, every second counts, just like in an F1 race. That's where the GitHub Actions cache comes in. Caching is like a pit stop for your code: it speeds up pipeline execution by storing previously downloaded dependencies and reusing them in later runs.
However, using the cache, especially in public repositories, can be risky and open to abuse, so you need to protect yourself against potential vulnerabilities and cache attacks. In this blog post, we'll share a GitHub cache across different workflow jobs, then clean things up as soon as we're done.
Caching vulnerabilities in public Repos
Running workflows that rely on caching in a public repository (one with secrets) can be a dangerous practice. Here's why:
Exposure to Unauthorized Access: Anyone with read access can create a pull request and access sensitive data stored in a public repo's cache.
Forks of a repository can also open PRs against the base branch and access caches on the base branch.
Data Exfiltration: Attackers can exploit cached secrets to gain unauthorized access to your systems.
No Encryption: Caches are not encrypted by default, so stored secrets are easily readable if discovered.
Inadvertent Exposure: Developers might accidentally push sensitive data to a public repo without realizing it.
Cache Poisoning: A malicious tool used in a test workflow can poison its cache, and another workflow that later uses the same cache might be affected. Read more in this github-cache-poisoning article.
It is even worse for artifacts, as there's literally a download button accessible to anyone on the internet.
Several security best practices help minimize this risk, but today I'll focus on only one from the list below.
Secret Management Tools: Use secret management tools such as HashiCorp Vault or AWS Secrets Manager.
Use OIDC: See my previous blog post (OIDC in GitHub Actions).
Private Repositories: Not always possible (OSS projects), but helps limit access to authorized users only.
Encryption: Ensure strong encryption of the cached data.
No Secrets in the Cache: Don't store any sensitive information in the cache.
Temporary Caching: If you need to use caching for performance optimization, ensure that it's temporary and short-lived.
Demo: Instant Cache cleanup
As mentioned earlier, one way to reduce the attack surface is to clear the cache regularly to prevent long-term exposure. The following example demonstrates exactly that, using the cache action and the GitHub CLI.
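To give a rough idea of the cleanup part, here is a minimal sketch of a step that deletes a cache entry with the GitHub CLI. It is an illustration under assumptions, not necessarily the exact step used in the demo workflow: it assumes a runner whose gh version ships the built-in `gh cache` command, a job granted `permissions: actions: write`, and a cache key that follows the Linux-terraform-<run_id> pattern used later in this post.

```yaml
# Sketch only: goes in the steps of the LAST job that needs the cache.
# The job is assumed to declare:
#   permissions:
#     actions: write
- name: Delete the cache created by this run
  env:
    GH_TOKEN: ${{ github.token }}        # token used by the gh CLI
    GH_REPO: ${{ github.repository }}    # lets gh work without a repo checkout
  # Assumed key pattern: <runner.os>-terraform-<run_id>
  run: gh cache delete "${{ runner.os }}-terraform-${{ github.run_id }}"
```

Because actions/cache saves a new cache at the end of the job that created it, a delete step like this only makes sense in a later job, once every consumer has restored what it needs.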
Cache Retention in GitHub
By default, GitHub removes any cache that has not been accessed in over 7 days.
There is no limit on the number of caches, but the total size of all caches in a repository is limited to 10 GB.
Artifacts and workflow log files, on the other hand, are retained for 90 days by default before automatic deletion.
We’ll be using a cache action called actions/cache@v3 that has 3 main parameters:
path: A list of files, directories, and wildcard patterns to cache and restore
key: An explicit key for a cache entry
restore-keys: A list of prefix-matched keys used to restore a stale cache if no cache hit occurred for key.
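To make the three parameters concrete, here is a generic sketch of a cache step. The path and key values are placeholders chosen for illustration (they are not taken from the demo repo); only the input names come from actions/cache itself.

```yaml
# Illustrative step: path and key values below are hypothetical placeholders.
- name: Cache dependencies
  uses: actions/cache@v3
  with:
    # Files, directories, or wildcard patterns to save and restore
    path: |
      ~/.cache/example
      vendor/**
    # Exact key for this cache entry
    key: ${{ runner.os }}-example-${{ hashFiles('**/lockfile') }}
    # Fallback prefixes tried in order when the exact key is not found
    restore-keys: |
      ${{ runner.os }}-example-
```

On an exact key match the action restores the cache and skips saving at the end of the job; on a miss it falls back to the newest entry matching a restore-keys prefix and saves a fresh entry under key when the job finishes.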
Example Repo: brokedba/githubactions_hacks
Branch: git_actions
You can clone my repo and push it to your own GitHub account, but remember to create the environment and the branch.
Environment name: lab_tests, with the deployment branch set to `selected branch`, i.e. git_actions
You can find test_cache_cleanup.yml under .github/workflows.
The common workflow and jobs declaration
Trigger event: push
Target branch: git_actions
Paths filter: our YAML workflow file, test_cache_cleanup.yml
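Put together, the trigger described above could look roughly like the fragment below. The workflow name is an assumption; the branch and path come from the settings listed above.

```yaml
name: test_cache_cleanup   # assumed workflow name

on:
  push:
    branches:
      - git_actions
    paths:
      # Only run when the workflow file itself changes
      - .github/workflows/test_cache_cleanup.yml
```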
Initial steps: check out the repo and install a specific version of Terraform (1.0.3)
Prepare the dependencies directory (for the Terraform provider files)
Note: My repo has no Terraform config files, but we'll assume we ran terraform init (see section #4).
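As a sketch of these initial steps, the first job might start like this. The job name, the directory path, and the use of hashicorp/setup-terraform to install the binary are all assumptions made for illustration; the actual workflow in the repo may install Terraform differently.

```yaml
jobs:
  build:                         # hypothetical job name
    runs-on: ubuntu-latest
    environment: lab_tests
    steps:
      - name: Check out the repo
        uses: actions/checkout@v3

      - name: Install Terraform 1.0.3
        uses: hashicorp/setup-terraform@v2   # assumed install method
        with:
          terraform_version: 1.0.3
          terraform_wrapper: false

      # Hypothetical directory that will hold the provider files
      - name: Prepare the dependencies directory
        run: mkdir -p "$HOME/terraform-deps"
```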
Cache all dependencies (Terraform 1.0.3 binary + provider plugin)
The cache key includes github.run_id, which is a unique ID for our workflow run
restore-keys uses the same pattern, e.g. "Linux-terraform-5868971041"
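A cache step matching this description could be sketched as follows. The path is a hypothetical placeholder for wherever the Terraform binary and provider plugins live; the key pattern is the one described above, where runner.os resolves to Linux on an Ubuntu runner.

```yaml
- name: Cache Terraform binary and provider plugin
  uses: actions/cache@v3
  with:
    # Hypothetical directory holding the binary and provider files
    path: ~/terraform-deps
    # Resolves to something like Linux-terraform-5868971041
    key: ${{ runner.os }}-terraform-${{ github.run_id }}
    restore-keys: |
      ${{ runner.os }}-terraform-${{ github.run_id }}
```

Tying the key to github.run_id means each workflow run gets its own cache entry, which is what makes it practical to delete that entry once the run is over.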
Now it's time to restore the cache in another job (on a different runner), skipping the repo checkout, Terraform install, and initialization.
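Here is a minimal sketch of what such a second job could look like. The job names, step id, and path are hypothetical; the important part is that path and key are identical to the ones used when the cache was created, so the lookup is an exact hit.

```yaml
jobs:
  # ... the first job (called build here, a hypothetical name) goes above ...
  restore:                       # hypothetical job name
    needs: build                 # wait until the cache has been saved
    runs-on: ubuntu-latest
    steps:
      - name: Restore cached dependencies
        id: tf-cache
        uses: actions/cache@v3
        with:
          path: ~/terraform-deps # must match the path used when saving
          key: ${{ runner.os }}-terraform-${{ github.run_id }}

      - name: Fail if the cache was not found
        if: steps.tf-cache.outputs.cache-hit != 'true'
        run: exit 1

      - name: List restored files
        run: ls -R ~/terraform-deps
```

Since github.run_id is the same for every job in a workflow run, the second job computes exactly the same key without any value being passed between jobs.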