RFC: experimenting with docker containers on Travis #13569

Merged 3 commits on Dec 5, 2015

tkelman
Contributor

@tkelman tkelman commented Oct 12, 2015

may help with #11553 ?

will close #10205

[av skip]

@pao
Member

pao commented Oct 12, 2015

Package libunwind7-dev is not available, but is referred to by another package.

This may mean that the package is missing, has been obsoleted, or

is only available from another source

However the following packages replace it:

  libunwind8-dev

@pao
Member

pao commented Oct 12, 2015

I would just change this in your branch to kick Travis off again but don't want to mess you up.

@tkelman
Contributor Author

tkelman commented Oct 12, 2015

Saw that, I don't think it would get much further though, as it's also failing to install jq and openlibm. Not sure why the former is missing (http://packages.ubuntu.com/trusty/jq), the latter is apparently not being built in the juliadeps PPA for trusty.

My next option here is trying to make Julia-build docker container(s) that we could have building on docker hub and pull from them for deps. I think @malmaud started trying this (he was more interested in Cxx support IIRC, but they could overlap to a large extent) and would be curious to know how far he got.

@malmaud
Contributor

malmaud commented Oct 12, 2015

Just to make sure I understand, are you proposing something like this?

  1. Create a Dockerfile that describes an image that has compiled versions of Julia's dependencies
  2. Use the Dockerhub automated buildbot feature to build that image
  3. Have the Travis CI script pull that image from dockerhub and create a new Docker container from that image
  4. Build Julia and run tests in that container
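
Steps 3 and 4 above can be sketched as two Docker invocations. This is only an illustration, not the project's actual CI script: the function name `docker_test_commands` is hypothetical, and the image name, working directory, and `make -j3 testall` target are the ones tkelman reports later in this thread. The sketch builds the command lines without running them.

```python
import shlex


def docker_test_commands(image, workdir, make_target="testall", jobs=3):
    """Return argv lists for pulling an image and running the tests inside it.

    Hypothetical helper: it only constructs the commands, it does not run them.
    """
    # Step 3: fetch the prebuilt dependencies image from the registry.
    pull = ["docker", "pull", image]
    # Step 4: start a container from that image and run make in the workdir.
    run = ["docker", "run", "-w", workdir, image, "make", f"-j{jobs}", make_target]
    return [pull, run]


if __name__ == "__main__":
    # Image name and path as quoted later in the thread.
    for argv in docker_test_commands("tkelman/julia-llvm33", "/home/julia-x86_64"):
        print(shlex.join(argv))
```

Keeping the commands as argv lists makes it straightforward to pass them to `subprocess.run` on a machine that has Docker available, with `sudo` prepended where the host requires it.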

@tkelman
Contributor Author

tkelman commented Oct 12, 2015

Exactly. Ref http://blog.travis-ci.com/2015-08-19-using-docker-on-travis-ci/

I'm not really sure how long the download for step 3 would take, and since you reported the time limits on dockerhub are a bit tight, steps 1-2 might need to be split into several layered dockerfiles - maybe one for llvm, one for openblas, and another for the rest.

@malmaud
Contributor

malmaud commented Oct 12, 2015

I think this is a great idea.

It's worth keeping in mind that you don't have to use the Dockerhub buildbot - you can push binary images that were built anywhere onto Dockerhub. There are other open-source projects that will build Dockerfiles as a service and might be more generous with resources, or someone could just take responsibility for building the dependencies image locally and pushing it when one of the dependencies changes. Given the same Dockerfile and inputs, they should all produce an equivalent image.

Regarding the time needed to download the image, it'll really just depend on how big it ends up being and the bandwidth Travis provides. It's always possible to host the image on a private repository like S3 if the limiting factor is the egress bandwidth of Dockerhub.

@pao
Member

pao commented Oct 12, 2015

64-bit Linux tests are running, anyways, so that's progress.

@tkelman
Contributor Author

tkelman commented Oct 12, 2015

aww, almost worked:

ERROR: LoadError: On worker 2:
LoadError: UDP send failed: network is unreachable (ENETUNREACH)
 in yieldto at ./task.jl:67
 in wait at ./task.jl:367
 in wait at ./task.jl:282
 in stream_wait at stream.jl:59
 in send at socket.jl:528
 [inlined code] from essentials.jl:111
 in include_string at loading.jl:266
 in include_from_node1 at ./loading.jl:307
 [inlined code] from util.jl:179
 in runtests at /tmp/julia/share/julia/test/testdefs.jl:7
 in anonymous at multi.jl:892
 in run_work_thunk at multi.jl:645
 [inlined code] from multi.jl:892
 in anonymous at task.jl:59
while loading /tmp/julia/share/julia/test/socket.jl, in expression starting on line 185
while loading /tmp/julia/share/julia/test/runtests.jl, in expression starting on line 13
    From worker 2:       * socket               

@tkelman
Contributor Author

tkelman commented Oct 13, 2015

I'm somewhat doubtful whether UDP will work inside Travis' docker containers if it doesn't work on the host here, but I have a trio of containers up on docker hub with all Julia deps in them:

https://hub.docker.com/r/tkelman/julia-openblas https://github.com/tkelman/julia-openblas/blob/master/Dockerfile
https://hub.docker.com/r/tkelman/julia-otherdeps https://github.com/tkelman/julia-otherdeps/blob/master/Dockerfile
https://hub.docker.com/r/tkelman/julia-llvm33 https://github.com/tkelman/julia-llvm33/blob/master/Dockerfile

These images are really big at the moment since I'm not cleaning up the builds yet (or using out-of-tree builds like I should be, since I'll want to play with variants of these for release-0.4), but I can keep experimenting with that. This does allow us to try out some of the fully-Docker-based competitors to Travis pretty easily though. Docker hub lets you hook up different repos so changes to one will automatically trigger rebuilds of its dependents.

@tkelman
Contributor Author

tkelman commented Oct 14, 2015

Pleasant surprise, the socket test passes if I run it from inside a docker container: https://travis-ci.org/tkelman/julia/jobs/85385799

So you can do sudo docker run -w /home/julia-x86_64 tkelman/julia-llvm33 make -j3 testall from pretty much anywhere, with the deps already all in place.

@tkelman changed the title from "use sudo: required and dist: trusty on Travis" to "WIP: experimenting with docker containers on Travis" Oct 14, 2015
@malmaud
Contributor

malmaud commented Oct 14, 2015

This. Changes. Everything.

On Oct 14, 2015, at 2:58 PM, Tony Kelman wrote:

Pleasant surprise, the socket test passes if I run it from inside a docker container: https://travis-ci.org/tkelman/julia/jobs/85385799
These are a bit rough right now, but the dockerfiles are at
https://hub.docker.com/r/tkelman/julia-llvm33/~/dockerfile/
https://hub.docker.com/r/tkelman/julia-otherdeps/~/dockerfile/
https://hub.docker.com/r/tkelman/julia-openblas/~/dockerfile/
So you can do sudo docker run -w /home/julia-x86_64 tkelman/julia-llvm33 make -j3 testall from pretty much anywhere, with the deps already all in place.

@ihnorton added the domain:building (Build system, or building Julia or its dependencies) label Oct 20, 2015
@tkelman
Contributor Author

tkelman commented Oct 20, 2015

I'm going to close this for now. #13577 has at least swept the OOM under the rug a bit. Running the tests in a Docker container would work quite nicely on master, but would be a bit harder to get running on PRs and branch builds. We would basically have to use Docker hub as a really roundabout way of creating a tarball of the compiled dependencies. We could use Travis' built-in caching feature for this too, but I don't think we can do a full source build of everything within the time limit. We would have to populate the cache in stages, which layered Dockerfiles work well for but travis.yml files don't.

@tkelman closed this Oct 20, 2015
@tkelman deleted the tk/trustytravis branch October 20, 2015 14:18
@tkelman restored the tk/trustytravis branch November 28, 2015 15:07
@tkelman reopened this Nov 28, 2015
@tkelman force-pushed the tk/trustytravis branch 6 times, most recently from 919d737 to e3ece76 November 28, 2015 16:05
@tkelman
Contributor Author

tkelman commented Nov 28, 2015

Revisited this. Travis is going to be forcing a migration to newer infrastructure in a few weeks, which does not have IPv6 available so would cause some of our tests to fail - travis-ci/travis-ci#4964 (somewhat heavy-handed of them to lock every related issue, otherwise I'd post a workaround for any other projects whose tests might start failing when this rollover happens)
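
To check whether a given environment shows the same symptom, a small probe along these lines can help. It is a rough sketch in Python rather than the Julia test in test/socket.jl; the function name `udp_send_error` and the choice of host and port are hypothetical. It tries to send one UDP datagram and reports the errno name (for example ENETUNREACH) if the send fails.

```python
import errno
import socket


def udp_send_error(host, port, family=socket.AF_INET6):
    """Try to send one UDP datagram; return None on success or an error name.

    Hypothetical probe, not the Julia socket test itself.
    """
    try:
        with socket.socket(family, socket.SOCK_DGRAM) as sock:
            # UDP is connectionless, so no listener is needed for the send.
            sock.sendto(b"ping", (host, port))
    except OSError as exc:
        # Map the numeric errno to its symbolic name where one is known.
        return errno.errorcode.get(exc.errno, str(exc))
    return None


if __name__ == "__main__":
    # "::1" is the IPv6 loopback address; port 9 (discard) is an arbitrary choice.
    print("IPv6 UDP send:", udp_send_error("::1", 9) or "ok")
    print("IPv4 UDP send:", udp_send_error("127.0.0.1", 9, socket.AF_INET) or "ok")
```

Running it both on the host and inside a container gives a quick comparison of the two network setups.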

As I noticed last month, the tests seem to work when run from inside a docker container, just not when run directly from the newer-infrastructure host. I was able to get docker auto-builds to work for all dependencies of Julia, rather than having to use the PPA. This requires a fairly roundabout setup with a chain of 4 different containers, 2 for 64 bit deps and 2 for 32 bit deps. Docker Hub used to have a setup where it would automatically trigger linked chains of builds when you modify the parent, but that doesn't seem to be working right now. These auto builds were also timing out for a while; something seems to have gotten faster recently, and I hope it'll stay that way.

@tkelman
Contributor Author

tkelman commented Nov 28, 2015

Anyone mac-savvy know why this is giving install: usr/lib/libccalltest.dylib.dSYM: Inappropriate file type or format ?

@tkelman
Contributor Author

tkelman commented Nov 30, 2015

Turns out building and running in a docker container was actually overkill. The sudo: false workers, which run inside docker but always in the image that Travis has prepared rather than a custom one, can also pass the IPv6-related tests here. The downside is that deps need to be built from source, but sudo: false gives you access to dependency caching. #13812 might make the deps caching cleaner or more compact, but that needs a rebase and was having some CI failures when it was last active.

The time limit is nominally 50 minutes per build which is a bit tight for a from-scratch build of all of LLVM, OpenBLAS, etc in default configurations, but apparently it's not enforced all that strictly? When cached copies of the deps are available and used, the timings are comparable or maybe a bit better than what we have now using the PPA for Linux deps.

@tkelman changed the title from "WIP: experimenting with docker containers on Travis" to "RFC: experimenting with docker containers on Travis" Nov 30, 2015
@yuyichao
Contributor

It might also be worth trying the system libunwind on Linux again (which might save a minute or so...).

@tkelman
Contributor Author

tkelman commented Nov 30, 2015

I'm actually not using system anything on Linux here.

@tkelman
Contributor Author

tkelman commented Dec 1, 2015

If Travis starts inexplicably failing on the socket test with an ENETUNREACH any time in the next couple of weeks, this will (should) fix it. The first run after this gets merged will take about an hour on Travis to prime the deps cache, but from then on all other branches and PRs should use the master cache by default (or release-0.4 if this gets backported, for PRs against that branch) and take the usual half hour or so.

@tkelman tkelman mentioned this pull request Dec 1, 2015
19 tasks
something is still wrong with make install on osx, filtering out dSYMs

try running with bash -lc

[av skip]
with cached source build of deps
[av skip]
when an old PR build gets restarted after some time, the travis API request
might not return enough builds to include the most recent one for the PR,
so the null result would incorrectly cause a 'superceded' fast-fail condition
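
The fix described in this commit message can be sketched as follows. This is an illustration under assumptions, not the script from the commit: the function names are hypothetical, and the field names `number` and `pull_request_number` are assumed for the build records returned by the API. The point is that a missing result is treated as "unknown" rather than as "superseded".

```python
from typing import Iterable, Mapping, Optional


def latest_build_for_pr(builds: Iterable[Mapping], pr_number: int) -> Optional[int]:
    """Return the highest build number listed for a PR, or None if none is listed."""
    numbers = [
        int(build["number"])
        for build in builds
        if build.get("pull_request_number") == pr_number
    ]
    return max(numbers) if numbers else None


def is_superseded(current_build: int, builds: Iterable[Mapping], pr_number: int) -> bool:
    """Decide whether the current build should fast-fail as superseded."""
    latest = latest_build_for_pr(builds, pr_number)
    if latest is None:
        # The API page did not include any build for this PR (for example an
        # old PR build restarted much later), so do not fast-fail.
        return False
    return latest > current_build
```

Returning False for the empty case means a restarted old build runs to completion instead of being cancelled on incomplete information.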
tkelman added a commit that referenced this pull request Dec 5, 2015
RFC: experimenting with docker containers on Travis
@tkelman merged commit db3afd7 into master Dec 5, 2015
@tkelman deleted the tk/trustytravis branch December 5, 2015 19:40
tkelman added a commit that referenced this pull request Dec 6, 2015
with cached source build of deps

ref #13569

(cherry picked from commit d7fc5d8)

Only cache deps on Travis for master and release branches

(cherry picked from commit 05301b9)

leave old travis cache in place for non-master builds

delete .pyc files from doc build

(cherry picked from commit 0055107)
Labels
domain:building Build system, or building Julia or its dependencies
Successfully merging this pull request may close these issues.

travis builds libgit2, pcre unnecessarily
5 participants