Code marked as unreachable after pandas concat #5630

ldouteau · 2024-03-13T07:59:29Z

Environment data

Language Server version: 2024.3.1
OS and version: win32 x64
Python version (and distribution if applicable, e.g. Anaconda): 3.11.5
python.analysis.indexing: true
python.analysis.typeCheckingMode: off

Code Snippet

import pandas as pd

a = 1
pd.concat()
a = 1
pd.concat([])
a = 1

Repro Steps

Open this piece of code
Code at line 7 is flagged as unreachable by Pylance

Expected behavior

Code flagged as unreachable, and thus greyed out

Actual behavior

Code should be reachable

Logs

I'm not sure what i should provide here, or if it's relevant. Let me know if more is required

ArthurPERE · 2024-03-13T09:02:02Z

Hello,

I have the same problem, but I am on debian 12.

How do i know it's pylance, when i disable pylance and reload, the code is suddenly reachable.

My version of python is 3.12.1 in an environment conda.
And my version of pandas is :
pandas 2.2.1 py312h526ad5a_0

clementlefevre · 2024-03-13T10:24:14Z

Hello same here with pylance v2024.3.1

each pandas.concat call make the following lines unreachable.

Version: 1.87.2 (user setup)
Commit: 863d2581ecda6849923a2118d93a088b0745d9d6
Date: 2024-03-08T15:20:17.278Z
Electron: 27.3.2
ElectronBuildId: 26836302
Chromium: 118.0.5993.159
Node.js: 18.17.1
V8: 11.8.172.18-electron.0
OS: Windows_NT x64 10.0.22621

torext · 2024-03-13T10:34:38Z

Have a duplicate report here with additional details possibly: #5631

debonte · 2024-03-13T14:57:32Z

The matching overload of concat in pandas-stubs currently returns Never, so Pylance is behaving correctly here. https://github.com/pandas-dev/pandas-stubs/blob/main/pandas-stubs/core/reshape/concat.pyi#L39

This overload was added in pandas-dev/pandas-stubs#858.

Please file an issue against pandas-stubs at https://github.com/pandas-dev/pandas-stubs/issues.

Btw, in this case there are multiple matching overloads because the input parameter is list[Any]. Following our overload matching rules we choose the first matching overload, which in this case is the new Never overload.

They seemt to be abusing Never in this case. I believe they want it to mean "not a legal call" whereas it really means "this call never returns".

debonte · 2024-03-13T15:06:41Z

Btw, while waiting for a fix, you could work around the pandas-stubs issue by commenting out the Never overload in ...\.vscode\extensions\ms-python.vscode-pylance-2024.3.1\dist\bundled\stubs\pandas\core\reshape\concat.pyi.

@overload
def concat(
    objs: Iterable[None] | Mapping[HashableT1, None],
    *,
    axis: Axis = ...,
    join: Literal["inner", "outer"] = ...,
    ignore_index: bool = ...,
    keys: Iterable[HashableT2] = ...,
    levels: Sequence[list[HashableT3] | tuple[HashableT3, ...]] = ...,
    names: list[HashableT4] = ...,
    verify_integrity: bool = ...,
    sort: bool = ...,
    copy: bool = ...,
) -> Never: ...

torext · 2024-03-13T15:21:32Z

Thanks for the explanation @debonte, I've submitted an issue to to Pandas devs here: pandas-dev/pandas-stubs#888

Dr-Irv · 2024-03-13T17:57:00Z

Note - both of the calls below will raise an exception with pandas, so in the example by the OP, pylance is absolutely correct in terms of what is reported!

pd.concat()
pd.concat([])

ldouteau · 2024-03-14T08:27:04Z

Note - both of the calls below will raise an exception with pandas, so in the example by the OP, pylance is absolutely correct in terms of what is reported!
pd.concat()
pd.concat([])

FYI Pylance (through pandas-stubs) doesn't mark code as unreachable after the first code, only after the second one.
About the relevance of the test case, sure it's a minimal example that doesnt make much sense.

Dr-Irv · 2024-03-14T12:15:25Z

Please see #5640 . pylance is doing the unreachable analysis by looking at the stubs even if type checking is off.

Dr-Irv · 2024-03-14T16:24:56Z

FYI Pylance (through pandas-stubs) doesn't mark code as unreachable after the first code, only after the second one.
About the relevance of the test case, sure it's a minimal example that doesnt make much sense.

I'm guessing that you had type checking turned off. Because if you turned it on, the first line would be flagged as a typing error.

The second line will raise an exception, so then code below it is unreachable.

What's debatable here is whether there should be some control over whether you can turn on/off the unreachable analysis, or whether pylance should be doing the unreachable analysis when you are not using type checking.

zorion · 2024-03-14T18:04:46Z

Why is this (and #5631) closed? Is it fixed? If it is not, which open issue will fix the question?

I will refer to #5631, but since that is duplicated to this one, the arguments should be relevant here.

Pylance says it is ok to decide that it is unreachable code because pandas-stubs starts checking if all list is None then it will fail. So pylance is correct.
Pandas-stubs says it is ok to check if all list elements are None, then this will obviously fail. They claim that pylance shouldn't state that "Any" means all-None.

I've been in both sides while reading the issues but now I feel like pandas-stubs are right and the issue is that if we have Any there is a possibility that we end with a unreachable code (all-None), but it isn't guaranteed. You can receive a list including some DataFrame. Don't you?
For instance, the #5631 question.
Ok, #5631's OP could state that generate_series will return a pandas.Series, which it does, but it's not always that easy.

from typing import Any

import pandas as pd


def generate_series() -> Any:
    return pd.Series([1, 2, 3])


df = pd.concat({k: generate_series() for k in range(10)}, axis=1)

print("Unreachable?!")

Thanks!

Dr-Irv · 2024-03-14T18:14:28Z

Ok, #5631's OP could state that generate_series will return a pandas.Series, which it does, but it's not always that easy.

I agree with the aspect of it not always being that easy. You could have a 3rd party library that is fully untyped (or improperly typed!) where the type checker cannot infer the result of a method or function from that library to be a Series or DataFrame (or an Iterable of either). So you have code that calls that 3rd party library and now the type checker thinks the result is Iterable[Any], even though you "know" it is Iterable[Series], so your only solution is to cast in order for the call to pd.concat() to work.

erictraut · 2024-03-14T18:27:20Z

Why is this closed?

Pyright and pylance are working as intended here (and in compliance with the Python type system). There's nothing actionable for the pylance or pyright teams, so it's appropriate to close this issue. The behavior you're seeing is due to the way the pandas stubs are using NoReturn in overloads. I recommend that the maintainers of these stubs reconsider the way they're using this part of the Python typing system. If necessary, we can discuss ways to extend the type system to better support their needs.

Dr-Irv · 2024-03-14T18:42:00Z

Pyright and pylance are working as intended here (and in compliance with the Python type system).

Is "reachability" part of the python type system? I don't see that in the link you provided.

Remember, with the stubs as they are in pandas-stubs, the code at #5630 (comment) passes fine with pyright on command line. So the stubs are correct with respect to both mypy and pyright as command line type checkers. So I would argue that the stubs are correct with respect to formal type checking.

You're saying they are incorrect in the context of doing "reachability analysis".

The issue here is the assumptions about reachability inferred by pyright in the context of pylance and VSCode, and the lack of control for users to turn off the reachability analysis.

debonte · 2024-03-14T18:58:23Z

the code at #5630 (comment) passes fine with pyright on command line

@Dr-Irv, please see my recent reply on the pandas-stubs issue. While pyright doesn't emit any diagnostics about unreachability, the Never overload effectively causes false negatives in the code after the concat call. pandas-dev/pandas-stubs#888 (comment)

Dr-Irv · 2024-03-14T19:03:56Z

@Dr-Irv, please see my recent reply on the pandas-stubs issue. While pyright doesn't emit any diagnostics about unreachability, the Never overload effectively causes false negatives in the code after the concat call. pandas-dev/pandas-stubs#888 (comment)

@debonte and see my response at pandas-dev/pandas-stubs#888 (comment)

Command line pyright does not exhibit what you show (and neither does mypy).

The fundamental issue is about the reachability analysis done within pylance/VSCode, which is why I opened #5640

erictraut · 2024-03-14T19:18:15Z

As I've already said, correct type evaluation requires reachability analysis. One cannot be done without the other. You can't just "turn off reachability analysis" and expect to get correct types.

Dr-Irv · 2024-03-14T19:21:16Z

As I've already said, correct type evaluation requires reachability analysis. One cannot be done without the other. You can't just "turn off reachability analysis" and expect to get correct types.

But if a user turns off type analysis in VS Code, then shouldn't the reachability analysis also be turned off if one can't be done without the other?

erictraut · 2024-03-14T19:25:13Z

There is no way to turn off type analysis in Pylance. All functionality within Pylance (completion suggestions, hover text, signature help, semantic tokens, etc.) relies on type analysis. If you set typeCheckingMode to "off", that simply suppresses certain classes of diagnostics. Reachability analysis and type analysis go hand in hand.

Dr-Irv · 2024-03-14T21:02:21Z

There is no way to turn off type analysis in Pylance. All functionality within Pylance (completion suggestions, hover text, signature help, semantic tokens, etc.) relies on type analysis. If you set typeCheckingMode to "off", that simply suppresses certain classes of diagnostics. Reachability analysis and type analysis go hand in hand.

OK, that makes sense. Maybe the option that is needed is "ignore stubs included with pylance" or "ignore specific packaged stubs included with pylance" ?? At least that would give people the option to temporarily suppress the issue if any stub package like pandas-stubs had an error like this one.

debonte · 2024-03-14T23:13:48Z

Maybe the option that is needed is "ignore stubs included with pylance" or "ignore specific packaged stubs included with pylance" ?? At least that would give people the option to temporarily suppress the issue if any stub package like pandas-stubs had an error like this one.

That seems excessive to me. I suspect that a small percentage of our users have sufficient understanding of Python typing, stubs, Pylance/Pyright, etc. to know whether such a command would help in any particular scenario. Those that do would likely know enough to work around the issue in their code or in the stubs, thereby retaining the benefit of the unbroken/non-buggy portion of the stubs rather than throwing the baby out with the bath water. In the case of pandas-dev/pandas-stubs#888, they could cast the parameter passed to concat or remove the Never overload from the stubs. Worst-case, this type of advanced user could easily find the bundled stubs in our install directory and delete them manually.

Btw, you pointed out that Pylance shipped with the stubs from your main branch rather than your most recent release. It sounded like you might have been frustrated that we were shipping prerelease stubs. To be honest, that hadn't occurred to me before. Currently we just grab whatever's in main once a week. Would you recommend that we always ship the latest official release instead?

Dr-Irv · 2024-03-14T23:59:38Z

Btw, you pointed out that Pylance shipped with the stubs from your main branch rather than your most recent release. It sounded like you might have been frustrated that we were shipping prerelease stubs. To be honest, that hadn't occurred to me before. Currently we just grab whatever's in main once a week. Would you recommend that we always ship the latest official release instead?

Whatever is in main has been tested with our CI and would go in the "next" release whenever I decide to do that release. So pulling once a week is fine. In fact, in this case, it helped us identify an issue that we can fix quickly (which I think we will do, based the latest remarks in the pandas-stubs issue). It would be worthwhile if you could do something (you could add a Literal to pandas-stubs/_version.pyi) that indicated which commit you pulled from in case we have to figure out the difference again.

zorion · 2024-03-15T07:21:15Z

This worked like a charm:
#5630 (comment)

I think this is better than adding a flag saying "I don't want pylance using pandas-stub", so I wouldn't add that option.

Thanks for your great support to both teams, pylance(-release) and pandas(-stubs). I love your projects!

rcyost · 2024-03-18T18:21:35Z

Hi guys is this fixed? Pandas concat messes up my editor viewing.

debonte · 2024-03-18T18:24:39Z

Hi guys is this fixed? Pandas concat messes up my editor viewing.

It has been fixed in pandas-stubs. We will pick up that fix in our next prerelease build, which will likely ship on Wednesday.

In the meantime you can use the workaround I mentioned above.

ahmedmohammed107 · 2024-03-21T13:32:55Z

A naive solution is to define my_concat function that returns pd.concat. Something like:

def my_concat(dfs_list, axis=0): return pd.concat(dfs_list, axis=axis)

df = my_concat([df_1, df_2], axis=0)

debonte · 2024-03-25T23:19:42Z

This issue is also fixed in our most recent stable release -- 2024.3.2

github-actions bot assigned debonte Mar 13, 2024

github-actions bot added the needs repro Issue has not been reproduced yet label Mar 13, 2024

debonte mentioned this issue Mar 13, 2024

Code mistakenly marked as unreachable after Pandas concat of Any types #5631

Closed

debonte added typestub Issue relating to our bundled type stubs external bug and removed needs repro Issue has not been reproduced yet labels Mar 13, 2024

debonte closed this as completed Mar 13, 2024

torext mentioned this issue Mar 13, 2024

concat of types Iterable[Any] cause Pyright to think subsequent code is unreachable because of recent stubs change pandas-dev/pandas-stubs#888

Closed

Dr-Irv mentioned this issue Mar 14, 2024

Unreachable conclusion reached even if Python->Analysis->Type Checking Mode is "off" #5640

Closed

debonte mentioned this issue Mar 15, 2024

Users should be able to tell which pandas-stubs we have bundled #5643

Closed

debonte mentioned this issue Mar 15, 2024

Code is unreachable #5646

Closed

thistlillo mentioned this issue Mar 20, 2024

Grey code with no reasons microsoft/vscode#208020

Closed

leolavaur mentioned this issue Mar 20, 2024

Code marked "Unreachable" incorrectly. #5664

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code marked as unreachable after pandas concat #5630

Code marked as unreachable after pandas concat #5630

ldouteau commented Mar 13, 2024

ArthurPERE commented Mar 13, 2024

clementlefevre commented Mar 13, 2024

torext commented Mar 13, 2024

debonte commented Mar 13, 2024 •

edited

Loading

debonte commented Mar 13, 2024 •

edited

Loading

torext commented Mar 13, 2024

Dr-Irv commented Mar 13, 2024

ldouteau commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

zorion commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

erictraut commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

debonte commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024 •

edited

Loading

erictraut commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

erictraut commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

debonte commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

zorion commented Mar 15, 2024

rcyost commented Mar 18, 2024

debonte commented Mar 18, 2024 •

edited

Loading

ahmedmohammed107 commented Mar 21, 2024 •

edited

Loading

debonte commented Mar 25, 2024

Code marked as unreachable after pandas concat #5630

Code marked as unreachable after pandas concat #5630

Comments

ldouteau commented Mar 13, 2024

Environment data

Code Snippet

Repro Steps

Expected behavior

Actual behavior

Logs

ArthurPERE commented Mar 13, 2024

clementlefevre commented Mar 13, 2024

torext commented Mar 13, 2024

debonte commented Mar 13, 2024 • edited Loading

debonte commented Mar 13, 2024 • edited Loading

torext commented Mar 13, 2024

Dr-Irv commented Mar 13, 2024

ldouteau commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

zorion commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

erictraut commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

debonte commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024 • edited Loading

erictraut commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

erictraut commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

debonte commented Mar 14, 2024

Dr-Irv commented Mar 14, 2024

zorion commented Mar 15, 2024

rcyost commented Mar 18, 2024

debonte commented Mar 18, 2024 • edited Loading

ahmedmohammed107 commented Mar 21, 2024 • edited Loading

debonte commented Mar 25, 2024

debonte commented Mar 13, 2024 •

edited

Loading

debonte commented Mar 13, 2024 •

edited

Loading

Dr-Irv commented Mar 14, 2024 •

edited

Loading

debonte commented Mar 18, 2024 •

edited

Loading

ahmedmohammed107 commented Mar 21, 2024 •

edited

Loading