GitHub workflows, Dockerfile, and Cargo.toml out of sync with blog. #75

Closed
cldershem opened this issue Mar 19, 2021 · 12 comments

@cldershem

I've been following along with the blog posts pretty regularly and everything has been going pretty smoothly. However, the most recent one has led to a bunch of headaches getting everything running properly again. I shut down the App on DigitalOcean between chapters to keep the cost down, but I can't get it to spin back up. While troubleshooting this problem I discovered that the GitHub workflows, Dockerfile, and Cargo.toml dependencies have all been updated between chapters without any mention.

I realize that this wouldn't happen if someone was reading a physical book straight through, but in the blog format, it is inevitable. Maybe at minimum there could be a quick note at the top of the chapter to check the dependencies first and follow the compiler when updating.

@LukeMathWalker
Owner

Can you be more specific - what did you have to change?

@cldershem
Author

Oh shoot. I didn't actually mean to submit that yet! ha. I was just making a draft while I got it working again. I will update once I get it all running and can do a diff.

@LukeMathWalker
Owner

Ahah no worries 😁

@cldershem
Author

Somehow in my troubleshooting, I have really flubbed something up and can't get the tests to pass on GitHub. They work fine on my machine, but fail every time I try to run them in CI.

Everything was working 100% at the end of last month when the last chapter came out. Other than one DB migration at the beginning of Chapter 7.2 I hadn't changed anything when it started failing. I likely had something simple wrong, but got distracted by the dependency changes and broke it all. I am guessing my original problem was a bug with sqlx that requires a cargo clean before doing cargo sqlx prepare -- --bin zero2prod.

I have absolutely no idea why it works on my machine, but not in CI...but I'm out of patience to troubleshoot it anymore today. Feel free to ignore this issue and I'll come back to it when I figure out what's going on.

@LukeMathWalker
Owner

LukeMathWalker commented Mar 20, 2021

Your issue, I believe, is due to this file, https://github.com/cldershem/zero2prod/blob/master/.env.example, being named .env.example instead of .env.
sqlx does not see any DATABASE_URL environment variable, so it tries to run in offline mode, but the data generated by cargo sqlx prepare does not cover the queries in your tests.
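
The failure mode described above can be sketched as a small decision function. This is a simplified model for illustration only, not sqlx's actual code: the `Check` enum, the `resolve` function, and both example queries are hypothetical, and the prepared data is modeled as a plain set of query strings.

```rust
use std::collections::HashSet;

/// How a compile-time query check gets resolved in this simplified model.
#[derive(Debug, PartialEq)]
enum Check {
    /// DATABASE_URL is visible: the query is validated against the live database.
    LiveDatabase,
    /// No DATABASE_URL, but the query was recorded by `cargo sqlx prepare`.
    PreparedData,
    /// No DATABASE_URL and the query is absent from the prepared data.
    Missing,
}

/// Decide how a single query would be checked (hypothetical helper).
fn resolve(database_url: Option<&str>, prepared: &HashSet<&str>, query: &str) -> Check {
    match database_url {
        Some(_) => Check::LiveDatabase,
        None if prepared.contains(query) => Check::PreparedData,
        None => Check::Missing,
    }
}

fn main() {
    // Hypothetical queries: one from the application, one used only in tests/.
    let app_query = "INSERT INTO subscriptions (email) VALUES ($1)";
    let test_query = "SELECT email FROM subscriptions";

    // Prepared data that covers only the application query.
    let mut prepared = HashSet::new();
    prepared.insert(app_query);

    // In CI, with no DATABASE_URL visible, only the prepared query resolves.
    println!("{:?}", resolve(None, &prepared, app_query));
    println!("{:?}", resolve(None, &prepared, test_query));

    // With a database URL available, the test query is checked live instead.
    let url = Some("postgres://localhost/newsletter"); // assumed URL
    println!("{:?}", resolve(url, &prepared, test_query));
}
```

Under this model, a query that exists only in the tests lands in the `Missing` branch whenever no DATABASE_URL is visible, which matches the symptom of tests passing locally but failing in CI.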

@cldershem
Author

The .env.example is a convention I use so that I don't accidentally check in legit credentials to source control. I commit an .env.example with dummy data so you know what is required in a .env, and then add the environment variables using GitHub's secrets. All of that was actually correct.

What I believe happened was that, while troubleshooting, I went back and copied and pasted this Dockerfile, which has a typo. That line should be FROM lukemathwalker/cargo-chef AS cacher. planner is already used as a stage name, which leads to an odd error message that sent me down the wrong rabbit hole, assuming that there was something else wrong.

I haven't actually gotten it to deploy completely yet, but now it's because I'm running into #71.

@cldershem
Author

It's building again. At this point I'm not sure there were actually any dependency problems when I started. I think there was likely something simple wrong on my end, and when I copied and pasted a few of the files from your build I ran into a few hiccups.

  1. Several dependencies had been updated, so it was difficult to compare our code.
  2. The typo in the Dockerfile on the Ch7.1 branch.
  3. DigitalOcean running out of memory, as described in "5.4 DigitalOcean Build error out of memory" #71.

I will close this issue, but if anyone else should come along to find it here is a diff between chapters. I happened to make some minor, unrelated changes, but I think one can figure it out.

@cldershem
Author

I finally got around to finishing this chapter and ran into the same issue that started this thread when I went to deploy it to production. I spent half of my day yesterday figuring out what was causing the issue. I am now wondering if the 7.2 branch can actually be deployed exactly as is.

During the test stage of CI it will fail every time with sqlx-data.json as it currently is in the 7.2 branch. The error message, while accurate, took me a while to figure out. I think that when running cargo sqlx prepare -- --bin zero2prod, sqlx ignores the tests/ directory and does not add the queries that are specific to tests to the sqlx-data.json file. My workaround is to add the same query! to a fake function that cargo sqlx prepare will find. The query is then cached in the sqlx-data.json file and can be reused by the functions in the tests/ directory during CI.

Troubleshooting this was difficult for a few reasons. A few of these reasons were already mentioned in this thread: dependency version mismatches and the like. After I couldn't get things to build for a few hours, I started copying and pasting your code directly and that led to a few other problems.

  • The version of scripts/init_db in 7.2 just loops and never actually starts postgres or migrates the DB. I reverted to the script that I had in 7.1 and it works.
  • sqlx prepare sometimes needs a cargo clean between runs. The issue has been reported, but there doesn't seem to be a lot of traction on it.
  • cargo sqlx prepare -- --bin zero2prod only prepares the queries from the zero2prod binary, which obviously doesn't include our integration tests in tests/. I don't know how to solve this; all other options led to similar results.
  • sqlx prepare grabs queries including their whitespace and uses that to create its hash keys. This makes it very difficult to compare your code to my code because of whitespace preferences and rustfmt changing things between versions. In the long run, sqlx prepare should probably strip whitespace and then use that, but I'm not sure it's a real issue.
  • I never had the issue on my machine because, I think, even with SQLX_OFFLINE=true, sqlx was finding PG and running the tests/ queries. Once I shut down the docker container, I was able to reproduce the same issues as in CI. This took me a little bit to notice though.
  • A few GitHub workflows seemed to have updates. Almost all of the updates and version mismatches ended up being non-issues, but they were definitely red herrings while I tracked down the root cause of the issue.
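
The whitespace point in the list above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `normalize` and `key` are hypothetical helpers, the query text is made up, and `DefaultHasher` merely stands in for whatever hash sqlx uses for its keys. It is not how sqlx prepare is implemented.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Collapse every run of whitespace into a single space and trim both ends.
fn normalize(query: &str) -> String {
    query.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Hash a string. `DefaultHasher` is a stand-in, not the hash sqlx uses.
fn key(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // The same hypothetical query, formatted two different ways.
    let one_line = "SELECT email, name FROM subscriptions";
    let reformatted = "SELECT email, name\n    FROM subscriptions";

    // Keys computed from the raw text depend on the formatting.
    println!("raw keys equal: {}", key(one_line) == key(reformatted));

    // Keys computed from the normalized text do not.
    println!(
        "normalized keys equal: {}",
        key(&normalize(one_line)) == key(&normalize(reformatted))
    );
}
```

One caveat with this naive approach: it also collapses whitespace inside SQL string literals, which would change a query's meaning, so a real implementation would need something more careful than `split_whitespace`.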

I'm not 100% sure where to go from here, but I thought I'd at least publish the findings from this chapter's headache.

@cldershem cldershem reopened this Apr 8, 2021
@LukeMathWalker
Owner

Thanks for the write-up - this might definitely turn out useful for others facing similar problems 🙇🏼

Unfortunately I don't think there is much I can do to improve the situation on my side.
Just one question, following what you mentioned here:

> The version of scripts/init_db in 7.2 just loops and never actually starts postgres or migrates the DB. I reverted to my script that I had in 7.1 and it works.

The two scripts are identical as far as I can see 👀

@cldershem
Author

I think I was referring to the version of scripts/init_db from my repo at the end of chapter 7.1. Mine is a little different from yours. I think you likely made it more robust on your branch, but something didn't work when I used that version. I didn't look into it too far since it wasn't a big deal.

I agree that there likely isn't much that you can do about it. I think I just happened upon a domino path of failure and wanted to document it for anyone else who has a similar problem. I was trying to avoid the ole DenverCoder9 situation.

@LukeMathWalker
Owner

I managed to reproduce the looping issue - just updated all branches with a fix!

@cldershem
Author

Oh, I'm so glad to hear I haven't gone completely mad! Thanks for the update. I think it's safe to close this issue and trust that anyone else with my luck will find it when searching.
