Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-87092: compiler's CFG construction moved to after codegen stage #102320

Merged
merged 26 commits into from
Mar 7, 2023

Conversation

iritkatriel
Copy link
Member

@iritkatriel iritkatriel commented Feb 28, 2023

After we merge this I will make a PR to move codegen to a separate file.

Skipping news - we will have one entry for the entire refactor once it's done.

@iritkatriel iritkatriel added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 28, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @iritkatriel for commit b8f4acf 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 28, 2023
@iritkatriel iritkatriel added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 1, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @iritkatriel for commit e5653ee 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 1, 2023
@iritkatriel iritkatriel added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 2, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @iritkatriel for commit 788da74 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 2, 2023
@iritkatriel
Copy link
Member Author

The test failures are a couple of timeouts plus a couple of errors that are also showing up on other branches.

@markshannon
Copy link
Member

Do we really want another instruction struct and another instruction sequence struct?
A basic block, extended block and general instruction sequence, are much the same. The main differences being where jumps are allowed and how they are resolved.
The allowed use and meaning of branches should be clear from the context.
If you want stronger typing in C, just wrap the base instruction sequence:

typedef struct InstructionSequence InstructionSequence;
typedef struct {
    InstructionSequence instructions;
} InstructionList;

typedef struct {
    InstructionSequence instructions;
} BasicBlock;

The reason I'd like a single InstructionSequence, is that we can stick an object header in front of it and access it from Python.
Having instruction sequences be Python objects would make introspection and debugging so much nicer.

@iritkatriel
Copy link
Member Author

We can have an InstructionSequence type which is reused in basic blocks, but basic blocks have a lot more fields that we don't need for codegen, and they don't need a label map because they should have pointers to the target blocks.

@markshannon
Copy link
Member

We can have an InstructionSequence type which is reused in basic blocks

That makes sense.

@iritkatriel
Copy link
Member Author

We can have an InstructionSequence type which is reused in basic blocks

That makes sense.

I'm not sure.

The instruction in a basicblock is

struct cfg_instr {
    int i_opcode;
    int i_oparg;
    location i_loc;
    /* The following fields should not be set by the front-end: */
    struct basicblock_ *i_target; /* target block (if jump instruction) */
    struct basicblock_ *i_except; /* target block when exception is raised */
};

We can replace the first 3 fields by an instruction struct (this makes sense, I only avoided it to keep the diff smaller). But we would still need the target pointers, or to create a parallel data structure in the basic block for the pointers (and keep it in sync with the instruction sequence when we insert or delete instructions inside a block). It would be simpler to write a function that translates a CFG to something that can be exposed to python.

@iritkatriel
Copy link
Member Author

In any case, I don't think we should refactor the CFG data structures in the same PR. It will be another huge change.

@iritkatriel
Copy link
Member Author

iritkatriel commented Mar 3, 2023

I've created a function for resizing the doubling arrays, so this logic is not repeated 3 times now.

To share the instruction type between codegen and cfg, we could do:

struct cfg_instr {
    instruction i_instr;
    struct basicblock_ *i_target; /* target block (if jump instruction) */
    struct basicblock_ *i_except; /* target block when exception is raised */
};

instead of the current:

struct cfg_instr {
     int i_opcode;
     int i_oparg;
     location i_loc;
    struct basicblock_ *i_target; /* target block (if jump instruction) */
    struct basicblock_ *i_except; /* target block when exception is raised */
};

Do you think we should do it in this PR? It would make the diff quit a lot messier.

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of questions, otherwise looks good.

Python/compile.c Show resolved Hide resolved
Python/compile.c Outdated Show resolved Hide resolved
@iritkatriel
Copy link
Member Author

A couple of questions, otherwise looks good.

Thank you. I'll merge then.

@iritkatriel iritkatriel merged commit 54060ae into python:main Mar 7, 2023
carljm added a commit to carljm/cpython that referenced this pull request Mar 7, 2023
* main:
  pythongh-102493: fix normalization in PyErr_SetObject (python#102502)
  pythongh-87092: compiler's CFG construction moved to after codegen stage (python#102320)
  pythongh-95913: Consolidate build requirements changes in 3.11 WhatsNew (pythonGH-98781)
  Remove redundant `_ensure_future` in favor of `ensure_future` in `asyncio` (python#102398)
  pythongh-95913: Edit Faster CPython section in 3.11 WhatsNew (pythonGH-98429)
  pythongh-90110: Fix the c-analyzer Tool (python#102483)
  pythongh-101759: Update macOS installer SQLite 3.40.1 checksum (pythongh-102485)
  Remove unused import of `warnings` from `unittest.loader` (python#102479)
  Add gettext support to tools/extensions/c_annotations.py (python#101989)
carljm added a commit to carljm/cpython that referenced this pull request Mar 8, 2023
* main:
  pythongh-102493: fix normalization in PyErr_SetObject (python#102502)
  pythongh-87092: compiler's CFG construction moved to after codegen stage (python#102320)
  pythongh-95913: Consolidate build requirements changes in 3.11 WhatsNew (pythonGH-98781)
  Remove redundant `_ensure_future` in favor of `ensure_future` in `asyncio` (python#102398)
  pythongh-95913: Edit Faster CPython section in 3.11 WhatsNew (pythonGH-98429)
  pythongh-90110: Fix the c-analyzer Tool (python#102483)
  pythongh-101759: Update macOS installer SQLite 3.40.1 checksum (pythongh-102485)
  Remove unused import of `warnings` from `unittest.loader` (python#102479)
  Add gettext support to tools/extensions/c_annotations.py (python#101989)
@iritkatriel iritkatriel deleted the instruction-stream branch April 3, 2023 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants