Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

redshift_connector Driver throws error while running SELECT 1 (or any query) #20137

Closed
3 tasks done
niravpeak opened this issue May 20, 2022 · 7 comments
Closed
3 tasks done
Assignees
Labels
#bug Bug report

Comments

@niravpeak
Copy link

A clear and concise description of what the bug is.
We were using driver redshift+psycopg2 , that worked well so far. As part of enhanced security we moved from that driver to redshift+redshift_connector driver. Although it does successful connect & dropdown of tables. it is unable to display correct dataset & getting mentioned error below:

How to reproduce the bug

  1. install redshift_connector
    bootstrapScript: |
    #!/bin/bash
    rm -rf /var/lib/apt/lists/* &&
    pip install
    psycopg2-binary==2.9.1
    redis==3.5.3
    sqlalchemy-redshift==0.8.9
    redshift-connector==2.0.907 &&
    if [ ! -f ~/bootstrap ]; then echo "Running Superset with uid {{ .Values.runAsUser }}" > ~/bootstrap; fi

  2. Go to 'databases' => Add database using username / password or IAM based. test connect

  3. Go to sql editor page

  4. select database & relevant schema.

  5. On editor type "SELECT 1"

  6. See below error:
    First element of field tuple is neither a tuple nor str

Expected results

Output with data 1
what you expected to happen.

Actual results

what actually happens.
On pod we are getting below error trace:
raceback (most recent call last):
File "/app/superset/views/base.py", line 207, in wraps
return f(self, *args, **kwargs)
File "/app/superset/utils/log.py", line 245, in wrapper
value = f(*args, **kwargs)
File "/app/superset/views/core.py", line 2393, in sql_json
command_result: CommandResult = command.run()
File "/app/superset/sqllab/command.py", line 104, in run
raise ex
File "/app/superset/sqllab/command.py", line 96, in run
status = self._run_sql_json_exec_from_scratch()
File "/app/superset/sqllab/command.py", line 138, in _run_sql_json_exec_from_scratch
raise ex
File "/app/superset/sqllab/command.py", line 133, in _run_sql_json_exec_from_scratch
return self._sql_json_executor.execute(
File "/app/superset/sqllab/sql_json_executer.py", line 111, in execute
raise SupersetErrorsException(
superset.exceptions.SupersetErrorsException: [SupersetError(message='First element of field tuple is neither a tuple nor str', error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': 'Amazon Redshift', 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]

Screenshots

If applicable, add screenshots to help explain your problem.

Environment

(please complete the following information):

  • browser type and version: chrome:
  • superset version: superset version: Superset 0.0.0dev
  • python version: python --version: Python 3.8.12
  • node.js version: node -v:
  • any feature flags active: [ installed with helm chart ]
    helm install superset . --values=values.yaml -n superset-experiment

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

Add any other context about the problem here.
Few more details about debug trace from redshift_connector can be seen as below:
2022-05-17 05:50:12,363:DEBUG:redshift_connector:===================================
2022-05-17 05:50:12,363:DEBUG:redshift_connector.cursor:Cursor.paramstyle=named
2022-05-17 05:50:12,363:DEBUG:redshift_connector.core:===================================
2022-05-17 05:50:12,363:DEBUG:redshift_connector.core:Establishing a connection
2022-05-17 05:50:12,363:DEBUG:redshift_connector.core:{'user': 'IAM:uksegmentexplorer', 'database': 'dev', 'application_name': 'sqlalchemy-redshift', 'replication': None, 'client_protocol_version': '2', 'driver_version': 'Redshift Python Driver 2.0.907', 'os_version': 'Linux-5.4.181-99.354.amzn2.x86_64-x86_64-with-glibc2.2.5'}
2022-05-17 05:50:12,363:DEBUG:redshift_connector.core:===================================
2022-05-17 05:50:12,369:DEBUG:redshift_connector.cursor:Cursor.paramstyle=format
2022-05-17 05:50:12,369:DEBUG:redshift_connector.core:Sending start-up message
2022-05-17 05:50:12,551:DEBUG:redshift_connector.core:Server indicated EXTENDED_RESULT_METADATA transfer protocol will be used rather than protocol requested by client: BINARY
2022-05-17 05:50:12,551:DEBUG:redshift_connector.cursor:Cursor.paramstyle=format
2022-05-17 05:50:12,559:DEBUG:redshift_connector.core:field count=1

@niravpeak niravpeak added the #bug Bug report label May 20, 2022
@rusackas
Copy link
Member

@eschutho do you think this change in driver is something we should look at and support?

@Brooke-white
Copy link

Hi @rusackas & @eschutho , I maintain redshift_connector. We haven't tested integration with superset, so I'm new to the superset codebase. Is this issue something you folks have seen in the past?

I'm working on reproducing this issue locally and will update any more details I find here.

@eschutho
Copy link
Member

eschutho commented Jun 8, 2022

psycopg2 is the default driver for redshift on Superset, so it's definitely likely that we would need to make some small adjustments to use the redshift_connector driver. If we were to support it natively, I believe right now we would have to create a new db engine spec for redshift+redshift_connector. @betodealmeida do you have any additional insight on this?

@Brooke-white
Copy link

please let me know if there is anything needed from redshift_connector to support this

@betodealmeida
Copy link
Member

betodealmeida commented Jun 9, 2022

@Brooke-white the cursor description returned by theredshift_connector has the column name as bytes, and we expect it to be a string. For example:

[
    (b"date", 1043, None, None, None, None, None),
    (b"open", 701, None, None, None, None, None),
    (b"high", 701, None, None, None, None, None),
    (b"low", 701, None, None, None, None, None),
    (b"close", 701, None, None, None, None, None),
    (b"adj close", 701, None, None, None, None, None),
    (b"volume", 20, None, None, None, None, None),
]

There's no standard type for the column name (https://peps.python.org/pep-0249/#cursor-attributes), but in all DB API 2.0 drivers I've seen it's returned as a string, so you might want to change that. I'll also update Superset so that it works with bytes, just in case.

@Brooke-white
Copy link

Thanks for the heads up @betodealmeida -- we are working with @niravpeak to determine what a migration path looks like for retrieving column names as strings

@eschutho
Copy link
Member

I'm going to close this ticket given @betodealmeida's fix, but feel free to comment if you feel that action was premature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
#bug Bug report
Projects
None yet
Development

No branches or pull requests

5 participants