[CT-452] dbt Dockerfile #4990

Closed
ptking777 opened this issue Apr 4, 2022 · 5 comments
Labels: bug (Something isn't working), docker (Related to official Docker files/images for dbt)

Comments

@ptking777

Running https://github.com/dbt-labs/dbt-core/docker/test.sh fails with the error message:
Running command git clone --filter=blob:none --quiet https://github.com/dbt-labs/ /tmp/pip-install-nz8ucm09/dbt-core_99b5634ed7b7477babc4f5921baae91d
remote: Not Found
fatal: repository 'https://github.com/dbt-labs/' not found
error: subprocess-exited-with-error

I believe there is a problem with the scope of the ARG statements in the Dockerfile at
https://github.com/dbt-labs/dbt-core/raw/main/docker/Dockerfile
I have made a fix at https://github.com/ptking777/dbt-core/raw/main/docker/Dockerfile.
This is based on the Docker documentation at https://docs.docker.com/engine/reference/builder/#arg

Please see attachment.
Dockerfile.zip
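
A minimal sketch of the scoping rule from the Docker documentation linked above: an ARG declared before the first FROM is only visible to FROM lines, and an ARG declared inside a stage goes out of scope when that stage ends. The shell function below writes a small illustrative Dockerfile. The function name, the stage names, example_ref, and its default value are assumptions made for illustration and are not part of the dbt Dockerfile.

```shell
#!/bin/sh
# Write a tiny Dockerfile that declares each ARG in the scope where it is used.
# All names below (stages, example_ref, its value) are illustrative only.
write_arg_scope_demo() {
  dir="$1"
  mkdir -p "$dir"
  cat > "$dir/Dockerfile" <<'EOF'
# Declared before the first FROM: usable only in FROM lines.
ARG base_image=python:3.10.3-slim-bullseye

FROM ${base_image} AS base

FROM base AS consumer
# Declared inside the stage, so ${example_ref} is non-empty in the RUN below.
ARG example_ref=example@v0.0.0
RUN echo "installing from ${example_ref}"
EOF
}
```

Calling write_arg_scope_demo /tmp/arg-demo and then running docker build --target consumer /tmp/arg-demo should let the RUN step see the default value with either builder, since the ARG is declared in the stage that uses it.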

github-actions bot changed the title from "dbt Dockerfile" to "[CT-452] dbt Dockerfile" on Apr 4, 2022
jtcohen6 added the triage, docker, Team:Execution, and bug labels on Apr 4, 2022
@VShkaberda

The whole Dockerfile is problematic. Using --target doesn't skip stages; it stops at the specified build stage. For example, if --target dbt-snowflake is used, all the preceding adapters are built as well: dbt-postgres, dbt-redshift and dbt-bigquery. Building dbt-all is impossible unless the dbt_third_party argument is specified, because the preceding dbt-third-party stage, which requires it, gets executed first.
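
The behaviour described here matches Docker's documentation on multi-stage builds: the legacy builder processes every stage leading up to the selected --target, whether or not the target depends on it, while BuildKit builds only the stages the target depends on. Below is a small sketch for observing the difference, assuming Docker is installed. The stage names, the alpine:3.15 base image, and both helper functions are made up for illustration.

```shell
#!/bin/sh
# Write a Dockerfile with two sibling stages; "wanted" does not depend on "unrelated".
write_target_demo() {
  dir="$1"
  mkdir -p "$dir"
  cat > "$dir/Dockerfile" <<'EOF'
FROM alpine:3.15 AS base
FROM base AS unrelated
RUN echo "building unrelated"
FROM base AS wanted
RUN echo "building wanted"
EOF
}

# Build only the "wanted" stage. $2 selects the builder: 0 = legacy, 1 = BuildKit.
# DOCKER can be overridden (e.g. DOCKER=echo) for a dry run; that override is a
# convenience of this sketch, not a Docker feature.
build_wanted() {
  dir="$1"
  buildkit="$2"
  DOCKER_BUILDKIT="$buildkit" "${DOCKER:-docker}" build --target wanted "$dir"
}
```

With build_wanted DIR 0 the log should include the RUN step of the unrelated stage; with build_wanted DIR 1 that stage should be skipped.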

@iknox-fa iknox-fa removed the triage label Apr 7, 2022
@iknox-fa
Contributor

iknox-fa commented Apr 7, 2022

@ptking777 , @VShkaberda
Thanks for your feedback. However, I am having a hard time replicating the issue. The test script is working fine (with the exception that the 1.0.1-b1 builds are failing due to a broken dependency; I'll update that in a PR shortly), and the Dockerfile is used to build our packages on every release, so it's regularly tested.

Can you provide some more specifics about your environments (OS, Docker engine and version, etc.)?

@VShkaberda

@iknox-fa

docker version:

Output of docker version

Client: Docker Engine - Community
Version:           19.03.13
API version:       1.40
Go version:        go1.13.15
Git commit:        4484c46d9d
Built:             Wed Sep 16 17:02:36 2020
OS/Arch:           linux/amd64
Experimental:      false
 
Server: Docker Engine - Community
Engine:
  Version:          19.03.13
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       4484c46d9d
  Built:            Wed Sep 16 17:01:11 2020
  OS/Arch:          linux/amd64
  Experimental:     false
containerd:
  Version:          1.3.7
  GitCommit:        8fba4e9a7d01810a393d5d25a3621dc101981175
runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

docker info:

Output of docker info

Client:
Debug Mode: false
 
Server:
Containers: 103
  Running: 0
  Paused: 0
  Stopped: 103
Images: 235
Server Version: 19.03.13
Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 8fba4e9a7d01810a393d5d25a3621dc101981175
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
  seccomp
   Profile: default
Kernel Version: 4.18.0-193.19.1.el8_2.x86_64
Operating System: CentOS Linux 8 (Core)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.25GiB
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Live Restore Enabled: false 

Executing docker build --tag dbt-snowflake:0.1 --target dbt-snowflake . fails with:

Error

Step 19/27 : RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_core_ref}#egg=dbt-core&subdirectory=core"
---> Running in 82fba1df3eaf
Collecting dbt-core
  Cloning https://github.com/dbt-labs/ to /tmp/pip-install-udllnuvd/dbt-core_202d0bd389964dae8d220f119608063f
  Running command git clone --filter=blob:none --quiet https://github.com/dbt-labs/ /tmp/pip-install-udllnuvd/dbt-core_202d0bd389964dae8d220f119608063f
  remote: Not Found
  fatal: repository 'https://github.com/dbt-labs/' not found
  error: subprocess-exited-with-error
 
  × git clone --filter=blob:none --quiet https://github.com/dbt-labs/ /tmp/pip-install-udllnuvd/dbt-core_202d0bd389964dae8d220f119608063f did not run successfully.
  │ exit code: 128
  ╰─> See above for output.
 
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
 
× git clone --filter=blob:none --quiet https://github.com/dbt-labs/ /tmp/pip-install-udllnuvd/dbt-core_202d0bd389964dae8d220f119608063f did not run successfully.
│ exit code: 128
╰─> See above for output.
 
note: This error originates from a subprocess, and is likely not a problem with pip.
The command '/bin/sh -c python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_core_ref}#egg=dbt-core&subdirectory=core"' returned a non-zero code: 1 


Altered Dockerfile, with each ARG moved into the scope of the stage that uses it:

Altered Dockerfile

##
#  Generic dockerfile for dbt image building.
#  See README for operational details
##
 
# Top level build args
ARG build_for=linux/amd64
 
##
# base image (abstract)
##
FROM --platform=$build_for python:3.10.3-slim-bullseye as base
 
# N.B. The refs updated automagically every release via bumpversion
# N.B. dbt-postgres is currently found in the core codebase so a value of dbt-core@<some_version> is correct
 
# special case args
 
# System setup
RUN apt-get update \
  && apt-get dist-upgrade -y \
  && apt-get install -y --no-install-recommends \
    git \
    ssh-client \
    software-properties-common \
    make \
    build-essential \
    ca-certificates \
    libpq-dev \
  && apt-get clean \
  && rm -rf \
    /var/lib/apt/lists/* \
    /tmp/* \
    /var/tmp/*
 
# Env vars
ENV PYTHONIOENCODING=utf-8
ENV LANG=C.UTF-8
 
# Update python
RUN python -m pip install --upgrade pip setuptools wheel --no-cache-dir
 
# Set docker basics
WORKDIR /usr/app/dbt/
VOLUME /usr/app
ENTRYPOINT ["dbt"]
 
##
# dbt-core
##
FROM base as dbt-core
ARG dbt_core_ref=dbt-core@v1.1.0b1
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_core_ref}#egg=dbt-core&subdirectory=core"
 
##
# dbt-postgres
##
FROM base as dbt-postgres
ARG dbt_postgres_ref=dbt-core@v1.1.0b1
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_postgres_ref}#egg=dbt-postgres&subdirectory=plugins/postgres"
 
 
##
# dbt-redshift
##
FROM base as dbt-redshift
ARG dbt_redshift_ref=dbt-redshift@v1.0.0
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_redshift_ref}#egg=dbt-redshift"
 
 
##
# dbt-bigquery
##
FROM base as dbt-bigquery
ARG dbt_bigquery_ref=dbt-bigquery@v1.0.0
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_bigquery_ref}#egg=dbt-bigquery"
 
 
##
# dbt-snowflake
##
FROM base as dbt-snowflake
ARG dbt_snowflake_ref=dbt-snowflake@v1.0.0
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_snowflake_ref}#egg=dbt-snowflake"
 
##
# dbt-spark
##
FROM base as dbt-spark
ARG dbt_spark_ref=dbt-spark@v1.0.0
ARG dbt_spark_version=all
RUN apt-get update \
  && apt-get dist-upgrade -y \
  && apt-get install -y --no-install-recommends \
    python-dev \
    libsasl2-dev \
    gcc \
    unixodbc-dev \
  && apt-get clean \
  && rm -rf \
    /var/lib/apt/lists/* \
    /tmp/* \
    /var/tmp/*
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]"
 
 
##
# dbt-third-party
##
FROM dbt-core as dbt-third-party
ARG dbt_third_party
RUN python -m pip install --no-cache-dir "${dbt_third_party}"
 
##
# dbt-all
##
FROM base as dbt-all
ARG dbt_postgres_ref=dbt-core@v1.1.0b1
ARG dbt_redshift_ref=dbt-redshift@v1.0.0
ARG dbt_bigquery_ref=dbt-bigquery@v1.0.0
ARG dbt_snowflake_ref=dbt-snowflake@v1.0.0
ARG dbt_spark_ref=dbt-spark@v1.0.0
ARG dbt_spark_version=all
RUN apt-get update \
  && apt-get dist-upgrade -y \
  && apt-get install -y --no-install-recommends \
    python-dev \
    libsasl2-dev \
    gcc \
    unixodbc-dev \
  && apt-get clean \
  && rm -rf \
    /var/lib/apt/lists/* \
    /tmp/* \
    /var/tmp/*
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_redshift_ref}#egg=dbt-redshift"
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_bigquery_ref}#egg=dbt-bigquery"
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_snowflake_ref}#egg=dbt-snowflake"
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]"
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_postgres_ref}#egg=dbt-postgres&subdirectory=plugins/postgres"

The altered Dockerfile throws the following error when executing docker build --tag dbt-all:0.1 --target dbt-all .:

Error during execution of the dbt_third_party step, after some earlier steps succeeded

Step 28/45 : RUN apt-get update   && apt-get dist-upgrade -y   && apt-get install -y --no-install-recommends     python-dev     libsasl2-dev     gcc     unixodbc-dev   && apt-get clean   && rm -rf     /var/lib/apt/lists/*     /tmp/*     /var/tmp/*
---> Using cache
---> d734f483498a
Step 29/45 : RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]"
---> Using cache
---> 931dd46540a3
Step 30/45 : FROM dbt-core as dbt-third-party
---> b58b739383c0
Step 31/45 : ARG dbt_third_party
---> Using cache
---> 582aa205c947
Step 32/45 : RUN python -m pip install --no-cache-dir "${dbt_third_party}"
---> Running in cd5a77dd884d
ERROR: Invalid requirement: ''
The command '/bin/sh -c python -m pip install --no-cache-dir "${dbt_third_party}"' returned a non-zero code: 1 


Activating experimental BuildKit

After activating BuildKit, everything works as described in the README:

  • enable experimental features on the daemon ({"experimental":true}) in the daemon.json configuration file
  • set DOCKER_BUILDKIT=1 environment variable to use buildkit

If BuildKit is to be used, it would be desirable to add these requirements to the README.
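
The two steps listed above can be sketched as shell helpers. This is a hedged illustration rather than an official procedure: the function names and the DOCKER override are hypothetical, the JSON is the setting quoted in the list, and /etc/docker/daemon.json is the usual location of the daemon configuration on Linux.

```shell
#!/bin/sh
# Print the daemon.json setting quoted above. On Linux the file usually lives
# at /etc/docker/daemon.json; merge by hand if it already has other settings,
# then restart the Docker daemon.
print_daemon_json() {
  printf '%s\n' '{"experimental": true}'
}

# Build one stage of the dbt Dockerfile with BuildKit enabled for this command only.
# DOCKER may be overridden (e.g. DOCKER=echo) for a dry run; that override is a
# convenience of this sketch, not a Docker feature.
build_dbt_stage() {
  target="$1"
  tag="$2"
  DOCKER_BUILDKIT=1 "${DOCKER:-docker}" build --tag "$tag" --target "$target" .
}
```

For example, build_dbt_stage dbt-snowflake dbt-snowflake:0.1, run from the directory containing the Dockerfile, repeats the build command from earlier in this comment with BuildKit switched on.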

@iknox-fa
Contributor

iknox-fa commented Apr 8, 2022

@VShkaberda thanks for the info, and you're 100% right about BuildKit. It didn't even cross my mind that people may still be using Docker without BuildKit / buildx. I'll update the docs in a few minutes to help avoid future confusion!

@iknox-fa iknox-fa self-assigned this Apr 8, 2022
@jtcohen6
Contributor

Closing as it looks like this was resolved by #5018
