cargo sqlx prepare expecting DATABASE_URL in workspace scenario #1223

Open
lweberk opened this issue May 12, 2021 · 13 comments

@lweberk

lweberk commented May 12, 2021

When generating sqlx-data.json offline information for CI/CD within a workspace whose subcrates target multiple databases, each containing its own valid .env file, cargo sqlx prepare expects DATABASE_URL to be set.

cargo check works as expected: the macros fetch their respective DATABASE_URL from each workspace subcrate's .env.

The expected behavior would be for cargo sqlx prepare to do the same when run at the workspace root, or alternatively not to conflate the different DATABASE_URL values when preparing independently in the respective subcrates.

cargo sqlx prepare --merged, if it is supposed to enable merging of sqlx-data.json information across the workspace, does not change this behaviour.
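For concreteness, a hypothetical layout of the scenario described (crate names, paths, and URLs are illustrative, not from the actual project):

workspace-root/
├── Cargo.toml              # [workspace] members = ["crates/users", "crates/billing"]
├── crates/users/
│   ├── .env                # DATABASE_URL=postgres://localhost/users
│   └── src/lib.rs          # sqlx::query!() macros against the users database
└── crates/billing/
    ├── .env                # DATABASE_URL=mysql://localhost/billing
    └── src/lib.rs          # sqlx::query!() macros against the billing database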

Spun out of #121

@lweberk
Author

lweberk commented May 14, 2021

I have been looking into fixing this and found the following mechanics:

  1. cargo sqlx prepare in essence wipes target/sqlx/query-*.json, then runs cargo check so those files are generated again.
  2. It then takes the freshly generated files and unifies their contained JSON objects into one, keyed by each object's inner hash in the new unifying super-object.

The caveat is that, throughout that process, it also accesses DATABASE_URL to determine the type of database, which it then adds to the top-level object as "db": "{Postgres, MySql, ...}".
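For illustration, the resulting top level looks roughly like this (a sketch; the hash key is abbreviated and the describe contents are elided):

{
    "db": "PostgreSQL",
    "b0e8c3...": {
        "query": "SELECT id, name FROM users WHERE id = $1",
        "describe": { ... }
    }
}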

Question: What is that being used for? Can I safely bypass it in the case of --merged?

I have manually merged those files with a small python script, omitting the db field altogether. So far, running SQLX_OFFLINE=true cargo check has passed as if everything were OK. I will attempt to reproduce this in a clean CI/CD environment.

Insights from a developer of this project into the implications of what I'm touching on would be greatly appreciated. Fumbling along the surface is very different from having a deeper understanding of the underlying motivations and reasons.

@lweberk
Author

lweberk commented May 14, 2021

OK, so the db type information is required; without it, the checker will not run. It also guides the set of types used when mapping the wire types to their local equivalents. There is no way around it, and messing with it would probably be well out of scope for a workaround until #121 gets properly solved.

The only workaround I can think of is to have an individual sqlx-data.json in each workspace subcrate and to have checks resolve to those first. It would also mean fixing the conflation of DATABASE_URL environment vars between them.

Any pointers on where to start, to save time?

@lweberk
Author

lweberk commented May 19, 2021

For the sake of completeness.

A current workaround:

  1. Remove the target/sqlx/* files
  2. Run cargo check so that each workspace subcrate generates the files for each query
  3. Run the python snippet below to merge all queries into one sqlx-data.json at the root of the project directory
  4. Copy it to each of the workspace subcrates that have queries to check
  5. Remove all the queries that are not specific to that subcrate (optional)
  6. At the top of each of these files, add a "db": "PostgreSQL | MySQL | ..." entry so the checker can properly map the type conversions
import json
import glob

# Load every per-query JSON file generated by cargo check
data   = [json.load(open(path, "r")) for path in glob.glob("target/sqlx/query-*.json")]
# Key each query object by its hash, as cargo sqlx prepare would
merged = {v["hash"]: v for v in data}

json.dump(merged, open("sqlx-data.json", "w"), indent=4)

Checking against offline data for the individual subcrates just works; it's sqlx prepare that does not. Until the devs of this project provide some clarity on the intentions for the whole multi-database question, I'll defer to working around it.

@jplatte
Contributor

jplatte commented Sep 2, 2021

I just ran into this same issue and adapted your script. Here's what I now committed as sqlx-prepare.py:

#!/usr/bin/env python3

import json
import glob

# Merge all generated query files, keyed by hash, with the db type up front
data   = [json.load(open(path, "r")) for path in glob.glob("target/sqlx/query-*.json")]
merged = { "db": "PostgreSQL", **{ v["hash"]: v for v in data } }

json.dump(merged, open("sqlx-data.json", "w"), indent=4)

There isn't actually a need to copy sqlx-data.json to any subdirectories; the macros will find it in the workspace root. You just have to run SQLX_OFFLINE=false cargo check --workspace and then ./sqlx-prepare.py whenever some query code changes. Maybe I should even make the script run that command automatically (and also remove target/sqlx before doing so) 🤔

@jplatte
Contributor

jplatte commented Sep 6, 2021

Updated script that fully replaces cargo sqlx prepare:

#!/usr/bin/env python3

import json
import glob
import os
import shutil
import subprocess

# Start fresh so stale query files don't end up in sqlx-data.json
shutil.rmtree("target/sqlx", ignore_errors=True)

# Force the macros to hit the live database and regenerate the query files
os.environ["SQLX_OFFLINE"] = "false"
subprocess.run(["cargo", "check", "--workspace"], check=True)

# Merge all generated query files, keyed by hash, with the db type up front
data   = [json.load(open(path, "r")) for path in glob.glob("target/sqlx/query-*.json")]
merged = { "db": "PostgreSQL", **{ v["hash"]: v for v in data } }

json.dump(merged, open("sqlx-data.json", "w"), indent=4)

@arlyon
Contributor

arlyon commented Sep 21, 2021

Here is a shell script using jq. I've found that even when you delete target/sqlx every time, the files are only regenerated if there have been changes to those specific crates, so I touch the source files for good measure as well. It also fails if you happen to have DATABASE_URL set, since sqlx will use that instead of the crate-specific .env values, so unset that as well.

#!/usr/bin/env bash
# Start fresh, then force a recompile of every crate so all query files regenerate
rm -rf target/sqlx
touch crates/*/src/*.rs
# Unset DATABASE_URL so each crate's own .env wins, and disable offline mode
env -u DATABASE_URL SQLX_OFFLINE=false cargo check --workspace
# Merge all query files into one object keyed by hash, with the db type up front
jq -s '{"db": "MySQL"} + INDEX(.hash)' target/sqlx/query-*.json > sqlx-data.json

@zopieux

zopieux commented Oct 1, 2021

Thanks @arlyon! For reproducibility, I think it's a good idea to make sure the hashes are sorted; it also reduces the probability of useless VCS diffs.

jq -s '{"db": "MySQL"} + (INDEX(.hash)|to_entries|sort_by(.key)|from_entries)' target/sqlx/query-*.json > sqlx-data.json

I can live with that workaround, but that's still a pretty big hack. 😞

@LLBlumire
Contributor

This doesn't seem to be working for me. cargo check --workspace doesn't appear to generate a target/sqlx directory?

@cleverjam
Contributor

cleverjam commented Jul 4, 2023

This doesn't seem to be working for me. cargo check --workspace doesn't appear to generate a target/sqlx directory?

Same. I can't manually locate the target/sqlx directory either.

Did you find a solution?

@exellentcoin26

I did not test this further; however, I found the following. The sqlx prepare command internally runs cargo check (--workspace) with the following env variables set: SQLX_TMP=<some-dir>, SQLX_OFFLINE="false", SQLX_OFFLINE_DIR=<some-dir>. If these directories exist before running cargo check with the environment variables set, the query files will end up in the last directory.
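A minimal sketch of reproducing that invocation by hand; only the variable names come from the observation above, while the target/sqlx-tmp and target/sqlx paths are my assumption:

import os
import subprocess

# Hypothetical directories; sqlx prepare chooses its own
os.makedirs("target/sqlx-tmp", exist_ok=True)
os.makedirs("target/sqlx", exist_ok=True)

env = dict(os.environ)
env.update({
    "SQLX_TMP": "target/sqlx-tmp",
    "SQLX_OFFLINE": "false",
    "SQLX_OFFLINE_DIR": "target/sqlx",
})

# The per-query JSON files should end up in SQLX_OFFLINE_DIR
subprocess.run(["cargo", "check", "--workspace"], env=env, check=True)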

@kontsaki

Hello, I wanted to make sure I understand correctly: doesn't sqlx support a workspace with crates that use different databases?

@cleverjam
Contributor

Hello, I wanted to make sure I understand correctly: doesn't sqlx support a workspace with crates that use different databases?

The issue, as I understand it, is that when running sqlx prepare from the workspace root, with a different .env file in each subcrate pointing to a different DATABASE_URL, the prepare command does not work as expected.

For now I get around this by running sqlx prepare from each subcrate's directory, as sketched below... this is more of an inconvenience than a lack of support, imho.
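A minimal sketch of that per-subcrate workaround (the crates/*/ layout is hypothetical):

import glob
import subprocess

# Run cargo sqlx prepare inside each subcrate so its own .env is picked up
for crate_dir in sorted(glob.glob("crates/*/")):
    subprocess.run(["cargo", "sqlx", "prepare"], cwd=crate_dir, check=True)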

@exellentcoin26

exellentcoin26 commented May 4, 2024

You can also solve this by loading the .env file during compilation from each crate separately, for example using dotenv-build in the build script of each crate. You do have to make sure that the crate is recompiled when the file changes, for example via the cargo flags printed by the build script. This makes it possible to run all commands from the root of the workspace, since during compilation cargo will set the correct env variables for each crate.
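A minimal build.rs sketch of that approach, assuming dotenv-build's default configuration (untested):

// build.rs
fn main() {
    // Recompile this crate whenever its .env changes
    println!("cargo:rerun-if-changed=.env");
    // Emit cargo:rustc-env=KEY=VALUE for every entry in .env
    dotenv_build::output(dotenv_build::Config::default()).unwrap();
}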
