Using the RUN instruction in a Dockerfile with 'source' does not work


Why 'source' in Dockerfile RUN instruction fails and how to fix it


Understand the nuances of shell execution in Dockerfiles and learn effective methods to execute scripts and set environment variables using the RUN instruction.

When building Docker images, developers often encounter unexpected behavior when using the source command (or its POSIX equivalent, .) within a RUN instruction in a Dockerfile. The build either fails with a "source: not found" error, or it succeeds while the environment variables set by the script are missing in later steps. This article explains the reasons behind both symptoms and presents reliable ways to make your Docker builds behave predictably.

The Problem: Shell Execution Context in Dockerfiles

The core of the problem lies in how Docker executes RUN instructions. In shell form, each RUN instruction starts a new shell process: /bin/sh -c by default on Linux images, or whatever the SHELL instruction specifies (such as /bin/bash -c). Two things can go wrong. First, source is a bash builtin rather than a POSIX one. On images where /bin/sh is dash (Debian, Ubuntu), RUN source ... fails outright with a "source: not found" error, while the POSIX spelling . works in every sh. Second, even where source is available (bash, or BusyBox ash on Alpine), it runs the script in the current shell process, so any variables or functions the script defines exist only in that process. Once the RUN instruction completes, the shell exits and everything it sourced is lost.

FROM alpine:latest

# This will NOT work as expected
RUN echo 'export MY_VAR="hello"' > /tmp/script.sh
RUN source /tmp/script.sh
RUN echo $MY_VAR

Example of source failing to persist environment variables across RUN instructions.

In the example above, the source /tmp/script.sh command successfully sets MY_VAR within its own shell context. However, the subsequent RUN echo $MY_VAR command executes in a new shell context, which has no knowledge of MY_VAR, resulting in an empty output.

flowchart TD
    A[Dockerfile RUN Instruction 1] --> B{"Shell Context 1 (/bin/sh -c)"}
    B --> C["Execute 'source script.sh'"]
    C --> D["MY_VAR set in Shell Context 1"]
    D --> E["Shell Context 1 Exits"]
    E --x F["MY_VAR is Lost"]
    F --> G[Dockerfile RUN Instruction 2]
    G --> H{"Shell Context 2 (/bin/sh -c)"}
    H --> I["Execute 'echo $MY_VAR'"]
    I --> J["MY_VAR is Undefined in Shell Context 2"]
    J --> K["Empty Output"]

Flowchart illustrating why source fails across separate RUN instructions.
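The same behavior can be reproduced outside Docker. The following is a minimal sketch, assuming a POSIX sh and mktemp are available; it stands in for two RUN instructions by starting two separate sh -c processes. The names workdir, script, first, and second are illustrative only and do not come from Docker.

```shell
#!/bin/sh
# Make sure the variable is not inherited from the calling environment.
unset MY_VAR

workdir=$(mktemp -d)
script="$workdir/script.sh"
echo 'export MY_VAR="hello"' > "$script"

# Stand-in for "RUN . /tmp/script.sh": MY_VAR exists only in this child shell.
first=$(sh -c '. "$1" && printf "%s" "$MY_VAR"' sh "$script")

# Stand-in for "RUN echo $MY_VAR": a fresh child shell that never sourced the script.
second=$(sh -c 'printf "%s" "${MY_VAR:-}"')

echo "inside first shell:  '$first'"
echo "inside second shell: '$second'"

rm -rf "$workdir"
```

The second process starts from a clean environment, just as the shell for the next RUN instruction does, so its value is empty.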

Solutions for Persistent Environment Variables and Script Execution

To correctly set environment variables or execute scripts that modify the shell environment for subsequent RUN instructions, you need to ensure they operate within the same shell context or use Docker's built-in mechanisms.

Method 1: Combine Commands in a Single RUN Instruction

The most straightforward solution is to combine all related commands, including the source command and any commands that rely on its output, into a single RUN instruction. This ensures they all execute within the same shell context.

FROM alpine:latest

# This will work
RUN echo 'export MY_VAR="hello from combined run"' > /tmp/script.sh && \
    source /tmp/script.sh && \
    echo "Value of MY_VAR: $MY_VAR"

Correctly sourcing a script and using its variables within a single RUN instruction.
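As a rough illustration of why chaining works, the sketch below runs the whole chain inside one sh -c process, which is approximately what a single RUN instruction does. It uses the portable . spelling, and the variable names (workdir, combined, chain_status) are hypothetical. It also shows that && stops at the first failing command, which is what makes a RUN step fail early.

```shell
#!/bin/sh
unset MY_VAR

workdir=$(mktemp -d)
echo 'export MY_VAR="hello from combined run"' > "$workdir/script.sh"

# Sourcing and use happen in one shell process, so MY_VAR is still set.
combined=$(sh -c '. "$1/script.sh" && printf "Value of MY_VAR: %s" "$MY_VAR"' sh "$workdir")
echo "$combined"

# With &&, a failing command skips the rest of the chain and the
# child shell returns a non-zero status.
chain_status=0
sh -c 'false && echo "not reached"' || chain_status=$?
echo "exit status of failed chain: $chain_status"

rm -rf "$workdir"
```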

Method 2: Use the ENV Instruction

For simple, static environment variables, the ENV instruction is the idiomatic Docker approach. Variables set with ENV are visible to every subsequent RUN instruction, to the CMD and ENTRYPOINT processes, and in containers started from the final image. ENV values are written literally in the Dockerfile, so this method cannot capture a value that a script computes during the build.

FROM alpine:latest

ENV MY_VAR="hello from ENV"

RUN echo "Value of MY_VAR: $MY_VAR"

Using the ENV instruction to set a persistent environment variable.
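One way to picture ENV, as an analogy rather than a description of Docker's internals, is a value that is placed in the environment of every process started afterwards. The sketch below imitates that with the standard env utility. The helper run_step is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
unset MY_VAR

# Hypothetical stand-in for a RUN step in an image that declares
# ENV MY_VAR="hello from ENV": every child shell gets the variable
# through its process environment, with no sourcing involved.
run_step() {
    env MY_VAR="hello from ENV" sh -c "$1"
}

step_one=$(run_step 'printf "%s" "$MY_VAR"')
step_two=$(run_step 'printf "%s" "$MY_VAR"')

echo "step one sees: $step_one"
echo "step two sees: $step_two"
```

Both steps see the value even though they are separate shell processes, which is the property that source alone cannot provide.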

Method 3: Change the Default Shell (Advanced)

If you have complex scripting needs or prefer a specific shell (e.g., bash for its advanced features), you can change the shell used by subsequent shell-form RUN instructions with the SHELL instruction. This makes the source builtin available on images whose /bin/sh lacks it, but it does not make variables persist: each RUN instruction still starts a fresh shell, so dependent commands must still be chained.

FROM ubuntu:latest

# ubuntu images already include bash; on Alpine, install it first
# (for example: RUN apk add --no-cache bash)

# Set the default shell for subsequent RUN instructions
SHELL ["/bin/bash", "-c"]

RUN echo 'export MY_VAR="hello from bash"' > /tmp/script.sh && \
    source /tmp/script.sh && \
    echo "Value of MY_VAR: $MY_VAR"

Changing the default shell to bash for RUN instructions.

Best Practices

To avoid issues and maintain clear, efficient Dockerfiles:

1. Chain dependent commands

Use && \ to chain multiple commands into a single RUN instruction, especially when one command's output or environment changes are needed by the next.

2. Use ENV for static variables

For environment variables that are known at build time and don't require complex logic, ENV is the cleanest and most efficient method.

3. Minimize RUN layers

Each RUN instruction creates a new layer. Combining related commands reduces the number of layers and keeps the Dockerfile easier to follow. Clean up temporary files in the same RUN instruction that created them; files deleted in a later instruction still take up space in the earlier layer.

4. Understand shell differences

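Be aware of the differences between /bin/sh (dash on Debian/Ubuntu, BusyBox ash on Alpine) and /bin/bash. The . command is defined by POSIX and works in all of them, whereas source is not available in dash. If your scripts rely on bash-specific features, set SHELL explicitly and make sure bash is installed in the image.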
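To see which behavior a given /bin/sh has, a small probe like the following can help. It is a sketch that assumes sh and mktemp are on the PATH; the names workdir and dot_result are illustrative. The . form is expected to work everywhere, while the result of the source check depends on which shell sh actually is, so no particular output is claimed for it.

```shell
#!/bin/sh
unset MY_VAR

workdir=$(mktemp -d)
echo 'MY_VAR="hello"' > "$workdir/script.sh"

# POSIX spelling: works in dash, BusyBox ash, and bash alike.
dot_result=$(sh -c '. "$1" && printf "%s" "$MY_VAR"' sh "$workdir/script.sh")
echo "'.' loaded MY_VAR=$dot_result"

# bash-style spelling: availability depends on the shell behind sh.
if sh -c 'command -v source >/dev/null 2>&1'; then
    echo "this sh provides 'source'"
else
    echo "this sh has no 'source'; use '.' or switch to bash"
fi

rm -rf "$workdir"
```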