Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add script formatter, comment parsing in parser, and editor settings #76211

Open
wants to merge 37 commits into
base: master
Choose a base branch
from

Conversation

RolandMarchand
Copy link

@RolandMarchand RolandMarchand commented Apr 18, 2023

This PR builds upon and finalizes the work done in godotengine/godot/pull/55835.

This PR introduces a GDScript formatter in Godot's script editor. Users can access it through Edit/Format Code, Alt+Shift+F, or by enabling the new Format On Save editor setting while saving their work.

Integrating a formatter in Godot's script editor improves it as an IDE for GDScript. Additionally, the formatter will improve developer's adherence to the official GDScript style guide.

We encourage users to test the formatter on large code bases in order to detect any quirks or bugs that need to be addressed.

Additionally, this PR includes support for comments through a header and inline. During parsing, each comment of a comment block are added to a comment header vector. The next node (variable, function, annotation) has this vector set as the header comments. The following comment until a newline, if present, is set as the inline comment.

The commits have been kept separate for ease of review, but they will be squashed before merging. Lastly, there are no new dependencies included in this PR.

Production edit: Closes godotengine/godot-proposals#3630

@YuriSizov
Copy link
Contributor

Thanks for looking into this!

The commits have been kept separate for ease of review

This PR contains only one commit.

@NathanLovato
Copy link
Contributor

NathanLovato commented Apr 18, 2023

Just tested it on a number of files, here are snippets causing unexpected results. I'll post one code pattern/snippet/group of related snippets with before/after comparisons per comment.

I'm testing 2 things:

  1. Formatting gives the expected output.
  2. Formatting is stable: running the formatter on already formatted code doesn't make any changes.

The first case involves signal connections with a function literal (it may just be the inner block with a parenthesis at the end of the last line)

func _ready() -> void:
	energy_slider.value_changed.connect(func(value):
		light.energy = energy_slider.value)

Gets formatted into this:

func _ready() -> void:
	energy_slider.value_changed.connect(func(value):
		light.energy = energy_slider.value
)

This happens differently as you duplicate the two lines with the function literal.

This:

func _ready() -> void:
	energy_slider.value_changed.connect(func(value):
		light.energy = energy_slider.value)

	height_slider.value_changed.connect(func(value):
		light.height = height_slider.value)

	height_slider.value_changed.connect(func(value):
		light.height = height_slider.value)

Alternates between the following and the starting state when running the formatter.

func _ready() -> void:
	energy_slider.value_changed.connect(func(value):
		light.energy = energy_slider.value
)

	height_slider.value_changed.connect(func(value):
		light.height = height_slider.value
)

	height_slider.value_changed.connect(func(value):
		light.height = height_slider.value)

@NathanLovato
Copy link
Contributor

NathanLovato commented Apr 18, 2023

Longer annotations get placed on their own lines, unlike shorter ones. Not a bug per say, but I'm not 100% sure if it's intentional.

@export_multiline var text := ""
@export var logo_visible := false

Becomes:

@export_multiline
var text := ""
@export var logo_visible := false

@NathanLovato
Copy link
Contributor

Comment inside a function gets indented based on previous block.

func popup(immediate := false) -> void:
	if not is_inside_tree():
		await ready
	
	# Force container to be its smallest size possibles
	size = Vector2.ZERO

Becomes:

func popup(immediate := false) -> void:
	if not is_inside_tree():
		await ready

		# Force container to be its smallest size possibles
	size = Vector2.ZERO

@NathanLovato
Copy link
Contributor

Last line in a setter function attached to a property definition gets pushed down up to two times if following an ending code block.

var text := "":
	set(value):
		text = value
		if not is_inside_tree():
			await ready
		rich_text_label.text = text

First becomes

var text := "":
	set(value):
		text = value
		if not is_inside_tree():
			await ready

		rich_text_label.text = text

And after a second run of the formatter, it becomes

var text := "":
	set(value):
		text = value
		if not is_inside_tree():
			await ready


		rich_text_label.text = text

After that it's stable with subsequent formatter runs.

@NathanLovato
Copy link
Contributor

Arrays struggle to wrap when combined with method calls currently. Here's a very long line to generate a random nickname:

var nickname: String = ["Brave", "Horrific", "Courageous", "Terrific", "Fair", "Conqueror", "Victorious", "Glorious", "Invicible"].pick_random() + ["Leopard", "Cheetah", "Bear", "Turtle", "Rabbit", "Porcupine", "Hare", "Pigeon", "Albatross", "Crow" ].pick_random()

The formatter doesn't manage to wrap it to stay below the max line length. It produces this:

var nickname: String = (
		["Brave", "Horrific", "Courageous", "Terrific", "Fair", "Conqueror", "Victorious", "Glorious", "Invicible"].pick_random(
			)
		+ ["Leopard", "Cheetah", "Bear", "Turtle", "Rabbit", "Porcupine", "Hare", "Pigeon", "Albatross", "Crow"].pick_random(
			)
)

Not a super important case, but each array is longer than max line length, so I'd expect them to be wrapped vertically, as the formatter does well when all you have is an array (without the method call). E.g. this

var adjectives := ["Brave", "Horrific", "Courageous", "Terrific", "Fair", "Conqueror", "Victorious", "Glorious", "Invicible"]

Becomes the following, as expected:

var adjectives := [
	"Brave",
	"Horrific",
	"Courageous",
	"Terrific",
	"Fair",
	"Conqueror",
	"Victorious",
	"Glorious",
	"Invicible",
]

@NathanLovato
Copy link
Contributor

Overall it's starting to work really well on our code at least. Good job!

@anvilfolk
Copy link
Contributor

I've been working with GDScript docstrings lately, and from a super quick look through the code, that feels relatively similar to this idea that comments get assigned to relevant members. Is the purpose of that assignment to be able to format comments associated with members?

Because if it is, I wonder whether there's some refactoring possible there. Docstrings can be considered a special case of comment (they use ## instead of a single #). Either way, that refactoring should be done in a separate PR. I am just trying to figure out whether this is being used for documentation at all, and might thus be duplicated some work already done for docstrings!

@monkeez
Copy link

monkeez commented Apr 23, 2023

Will it be possible to call the formatter via the CMD line? Main use case would be if working with an external editor.

@NathanLovato
Copy link
Contributor

NathanLovato commented Apr 23, 2023

Will it be possible to call the formatter via the CMD line? Main use case would be if working with an external editor.

For external editors, the first goal should be to add formatting to the language server, which will instantly provide support for most editors (vscode, neovim, jetbrain, emacs, ...). That's definitely something to add once the formatter gets merged.

Command line support can and should probably be added for use in continuous integration and other use cases outside of an editor (e.g. batch formatting an existing project that didn´t have the formatter).

@ajreckof
Copy link
Member

ajreckof commented Apr 24, 2023

I've tested it a bit on a few of my codes. It feels really good but there are still some hiccups here are what I found.

I'm not sure the formatter should change annotation placement, I feel it should keep how you had it, especially since there is no explicit guideline on whether to put it in its separate line or not.

large number are wrongly formatted, if they are not formatted they do not change and if they are formatted every _ gets removed. See here for proper formatting of large numbers
Capture d’écran 2023-04-24 à 07 10 45

Adam:

  • fixed

when formatting

printt("a lot of arguments", "a lot of arguments", "a lot of arguments", "a lot of arguments", "a lot of arguments")

it produces this

printt(
			"a lot of arguments",
			"a lot of arguments",
			"a lot of arguments",
			"a lot of arguments",
			"a lot of arguments"
	)

which is missing the trailing coma (only tested with function but the trailing comma might be misiing in arrays and dictionaries too )

Moreover when separating into multiple line as in the last example if the line is just after a unindent than it will put two empty lines instead of the one needed. reformating will then reduce to one line.

I also had a similar issue to the one reported here #76211 (comment) but it was myself with nested functions

when having multiple unindent you will have as much empty line as the number of unindent which might result in the next line being a 4 line after (strangely this does not happen in the last function of the script) moreover on reformat it will add one more empty line.
This :
Capture d’écran 2023-04-24 à 07 40 59
Will become this :
Capture d’écran 2023-04-24 à 07 41 12
And then this :
Capture d’écran 2023-04-24 à 07 41 18

@RolandMarchand
Copy link
Author

@anvilfolk Thank you very much for the feedback!

For the annotations, @tool and @icon are applied to the parser itself, similar to a flag. There is no information in the parser as to which one came first. Other annotations do have a stack system, though. We could implement a system to keep track of those two annotations' order, but I do not think it's worth the effort.

Aside from that, yes, there are still some bugs. Thank you so much for the screenshots, I will make test cases out of them and fix them ASAP 👍

@dalexeev
Copy link
Member

dalexeev commented Apr 24, 2023

large number are wrongly formatted, if they are not formatted they do not change and if they are formatted every _ gets removed.

The exact formatting of the literals is lost:

-    print(0xFFFF)
-    print(1.2e-5)
-    print(0.1111111111111133) # <- Out of precision.
-    print("\t \n \uFFFD")
-    print(""" key="value" """)
+    print(65535)
+    print(0.000012)
+    print(0.11111111111111) # <- Out of precision.
+    print("	 
+ �")
+    print(' key="value" ')

Adam edit:

  • fixed

Possible solution: when allocating a LiteralNode, copy the Token::source property into a new LiteralNode property and use it in the formatter.

I'm a bit worried about this approach (restoring source code from AST) but overall the code looks good to me, excellent work!

Now we need to make sure that the parser has all the necessary information, including the order of the elements. For example, now the formatter moves the setter before the getter, but we do not have this recommendation in the style guide (and in the GDScript reference examples, the getter is placed before the setter). I think the formatter should not change anything that is not regulated by the style guide.

1. Class doc comment moves down.

 extends Node
-## Class doc comment.
 
 
+## Class doc comment.
 ## Method doc comment.
 func test():
     pass

Adam edit:

  • fixed

2. A space is added before commented code.

     # Comment.
-    #print("code")
+    # print("code")

Adam edit:

  • fixed

3. Regression: comment now acts like statement (pass is not required).


Adam edit:

  • fixed

4. The formatter removes unnecessary parentheses added for clarity. As far as I know, this is difficult to solve, since the parser uses parentheses for grouping, but does not create grouping nodes or otherwise store them in the AST.

-if not (x is Node) or (y >= 0 and y < 10):
+if not x is Node or y >= 0 and y < 10:
-_flags[flag / 8] &= ~(1 << (flag % 8))
+_flags[flag / 8] &= ~1 << flag % 8

Adam edit:

  • fixed

@aaronfranke
Copy link
Member

@dalexeev We should consider changing the recommended order of the class doc comment to be like this:

## Class doc comment.
class_name MyClass
extends Node

Every other doc comment, and all annotations, appear above/before what they describe, not after.

To avoid breaking compat, we could do this in steps, first support both orders for one Godot release (like 4.1), then enforce it being above for the next release (like 4.2).

@dalexeev
Copy link
Member

@aaronfranke You are probably right, although both class_name and extends are optional (if extends is omitted, the class extends RefCounted).

@adamscott adamscott force-pushed the code-format branch 3 times, most recently from 69dbfc6 to 11cdcd9 Compare May 10, 2023 17:50
@adamscott
Copy link
Member

adamscott commented May 11, 2023

Fixed the issue about literals being parsed:

Capture d’écran du 2023-05-11 11-07-19_merged

Note: The picture is outdated, I modified my commit and it makes it so that strings and numbers take the source code as is. So the quotes doesn't change anymore (like on line 6)

@lethefrost
Copy link

I'm kind of wondering if there's an update? It seems the last commit was two months ago. Have been craving such an essential feature for the built-in editor for a long time. Thank you for your efforts and work, and really looking forward to having it as a part of the official deliverables!

@AThousandShips
Copy link
Member

We are currently in feature freeze until 4.2 is released so no changes can be merged like this, only then can this be considered and merged, so come back and see when 4.2 has been released 🙂

@lethefrost
Copy link

We are currently in feature freeze until 4.2 is released so no changes can be merged like this, only then can this be considered and merged, so come back and see when 4.2 has been released 🙂

Thank you for the clarification!

@Calinou
Copy link
Member

Calinou commented Nov 9, 2023

Tested locally with https://github.com/godotengine/godot-demo-projects/tree/2d/platformer, it mostly works as expected. Great work so far 🙂

All files format without errors in that 2D platformer demo, and the demo keeps working after having all its scripts formatted.

Questions

  • Is there an annotation or comment to avoid formatting the next line, or within a series of lines (like // clang-format comments with begin and end markers)?

Bugs

  • Running formatter twice sometimes results in different code (that is still valid, but nonetheless different):

No formatter

if mode == DisplayServer.WINDOW_MODE_FULLSCREEN or \
				mode == DisplayServer.WINDOW_MODE_EXCLUSIVE_FULLSCREEN:

Formatted once

		if (
			mode
			== DisplayServer.WINDOW_MODE_FULLSCREEN
			or mode
			== DisplayServer.WINDOW_MODE_EXCLUSIVE_FULLSCREEN
		):

Formatted twice

		if (
			(
				mode
			== DisplayServer.WINDOW_MODE_FULLSCREEN
			or mode
			== DisplayServer.WINDOW_MODE_EXCLUSIVE_FULLSCREEN
			)
		):

  • The formatter sometimes inserts more than one blank line within a function, which isn't allowed as per the GDScript style guide:
	if not floor_detector_left.is_colliding():
		velocity.x = WALK_SPEED
	elif not floor_detector_right.is_colliding():
		velocity.x = -WALK_SPEED

	if is_on_wall():
		velocity.x = -velocity.x


	move_and_slide()  # ^ Two blank lines above ^

	if velocity.x > 0.0:
		sprite.scale.x = 1.0
	elif velocity.x < 0.0:
		sprite.scale.x = -1.0

  • Running formatter on 3D platformer's player.gd results in a script error due to incorrect formatting:
facing_mesh = (facing_mesh - Vector3.UP * facing_mesh.dot(Vector3.UP)).normalized()

This is because the formatter gets rid of parentheses too aggressively in this situation:

facing_mesh = facing_mesh - Vector3.UP * facing_mesh.dot(Vector3.UP).normalized()

Suggestions for improving formatting

  • Both operands within a comparison operation should always be written on the same lime, regardless of length. This looks strange even though it's valid GDScript:
		if (
			mode
			== DisplayServer.WINDOW_MODE_FULLSCREEN
			or mode
			== DisplayServer.WINDOW_MODE_EXCLUSIVE_FULLSCREEN
		):

I'd prefer the following:

		if (
			mode == DisplayServer.WINDOW_MODE_FULLSCREEN
			or mode == DisplayServer.WINDOW_MODE_EXCLUSIVE_FULLSCREEN
		):

  • Spacing between extends and variables, and variables and the first function should be 1 line only, not 2. There isn't much of a benefit of using 2-line spacing in this context:
class_name Game
extends Node


@onready var _pause_menu := $InterfaceLayer/PauseMenu as PauseMenu


func _unhandled_input(event: InputEvent) -> void:
	pass


func other_func():
	pass

I'd expect this instead:

class_name Game
extends Node

@onready var _pause_menu := $InterfaceLayer/PauseMenu as PauseMenu

func _unhandled_input(event: InputEvent) -> void:
	pass


func other_func():
	pass

  • The formatter will sometimes reorganize comments placed at the end of a block to put them at the beginning of the next block, which can be confusing:
	... 
	elif col_left.is_empty() and not col_right.is_empty():
		# If only right ray is occluded, turn the camera around to the left.
		difference = Basis(Vector3.UP, deg_to_rad(delta * autoturn_speed)) * difference
	# Do nothing otherwise, left and right are occluded but center is not, so do not autoturn.

	# Apply lookat.
	if difference.is_zero_approx():
		difference = (pos - target).normalized() * 0.0001

Becomes:

	... 
	elif col_left.is_empty() and not col_right.is_empty():
		# If only right ray is occluded, turn the camera around to the left.
		difference = Basis(Vector3.UP, deg_to_rad(delta * autoturn_speed)) * difference

	# Do nothing otherwise, left and right are occluded but center is not, so do not autoturn.
	# Apply lookat.
	if difference.is_zero_approx():
		difference = (pos - target).normalized() * 0.0001

  • The 2D platformer demo currently uses single-line class name declarations like this: class_name Enemy extends CharacterBody2D, which reminds me of C++/Java. This saves one line in every named class script without much of a downside (as this line is usually not wide enough to warrant splitting into two lines). The formatter always reformats this to two lines, but this would need to be discussed separately as the current style guide also disagrees.

  • The formatter doesn't do anything about signals that have no parameters but parentheses like this: signal coin_collected(). We should discuss whether the parentheses should be enforced or forbidden by the style guide, and make the formatter act accordingly. (I'd go towards enforcing them personally, like functions.)

  • The formatter should add 2 spaces before inline comments: var something # Like this, instead of just 1. This is consistent with what Python formatters do.

Other suggestions

  • I think the default keyboard shortcut should be Ctrl + Shift + I for consistency with VS Code. I got pretty used to that shortcut for formatting over the years 🙂
    • This is also because keyboard shortcuts that use Alt + Shift aren't very common, and they cause issues on Windows where you accidentally switch keyboard layouts while trying to enter them (Alt + Shift cycles keyboard layouts on Windows).

  • There should be a --format CLI argument for usage on CI, analogous to --check-only. Returns exit code 0 (EXIT_SUCCESS) if no file was reformatted, 1 (EXIT_FAILURE) if at least one file was reformatted, so that CI can exit with a failure if files don't follow formatting standards set in the project.
    • This argument should be recursive if you specify a folder name. A top-level addons/ folder (i.e. next to a project.godot file) should be ignored unless you explicitly specify it (or one of its files) by name, so that formatting errors in add-ons don't impact your project. Specifying by name should still be allowed so that add-ons themselves can use automated formatting.

@lethefrost
Copy link

lethefrost commented Nov 9, 2023

I think the default keyboard shortcut should be Ctrl + Shift + I for consistency with VS Code. I got pretty used to that shortcut for formatting over the years 🙂

  • This is also because keyboard shortcuts that use Alt + Shift aren't very common, and they cause issues on Windows where you accidentally switch keyboard layouts while trying to enter them (Alt + Shift cycles keyboard layouts on Windows).

If I may just add one thing, the shortcut for formatting documents in VSC on macOS is shift + opt + F, where the opt usually corresponds to Windows' alt. I think perhaps having different default shortcuts for different platforms would be a good idea, as VSC does. By the way, I personally think using F has a little advantage over I because you can use only your left hand to trigger the shortcut while you are holding the mouse using your right hand.

@emgast
Copy link

emgast commented Jan 23, 2024

Is this feature still being actively developed? I'd love to see it get into 4.3, but I don't know how realistic that is. Either way, I really appreciate the effort!

@aaronfranke
Copy link
Member

@emgast It is not being actively developed, the last time changes were pushed was 4 months ago.

@AThousandShips
Copy link
Member

It needs a rebase, might need some new fixes

@NathanLovato
Copy link
Contributor

I can confirm it is currently not in active development. Adam, the last developer we sponsored to work on this, was rehired by the Godot foundation, and he's now working on higher priority features and fixes for the community. The groundwork is here, with many automated tests. We ran out of budget for this, now we'll need a new contributor or sponsor willing to pick it up and complete this.

@AThousandShips
Copy link
Member

AThousandShips commented Jan 23, 2024

I will take a look over this when I have the time and might be able to salvage it, but will see if I can get my head around all the details

Edit: Haven't found the time or energy so not planning to dig into it now

@NathanLovato
Copy link
Contributor

If anyone gives it a stab, don't hesitate to ask Adam, he can surely give you a quick overview of how the thing works and the areas that caused issues.

Mainly, from what I understood, the current implementation faces the same challenges as any formatter that tries to output nice-looking code (like Python's black, the reference for this one): line splitting is complicated, and the algorithm might need a rework.

Then, there are lots of small edge cases to address, many of which were listed by Calinou above.

@Scony
Copy link
Contributor

Scony commented Jan 24, 2024

Not sure what was the reasoning behind implementing this feature as a single PR, but IMO it's virtually impossible to finish it this way - this is basically because GDScript internals are likely changing much faster than this PR can be aligned by a single person.

What I would suggest instead is implementing formatter in small steps (PRs up to 200-300LOC of code + tests) - one part at a time.

The first PR would basically:

  • introduce a "dummy" formatter that can format empty file or a file containing just some simple statements like pass.
  • introduce a means to run the formatter e.g. by using the keyboard shortcut in the code editor yet with the limitation such that formatter should only work if some Godot setting (disabled by default) is enabled. This way, just after the first PR people would be able to enable such "experimental" formatter at their own risk and start testing.
  • introduce test cases covering whatever is implemented in the formatter at this point.

Every consecutive PR would add one small functionality along with tests covering it.

The benefit of such "baby steps" approach would be that:

  1. PRs are easy to review since they are small
  2. More people would be able to get involved into development
  3. More people would be able to get involved into testing
  4. Most important! Every change in the GDScript internals that is breaking formatter would make formatter unit/integration tests failing thus forcing people working on GDScript internals to fix formatter in the same PR they're making changes. Example: Imagine the formatter is able to format pass statements and has unit tests for that. Now imagine some contributor is changing pass statement to noop statement for whatever reason. PR with such change would fail formatter tests immediately thus forcing the contributor to fix formatter before committing such PR

So, in other words, working in the master directly would enable organic development of formatter and make it a team effort - especially on the development side.

@NathanLovato
Copy link
Contributor

@Scony initially, the development was staged like that with a simpler formatter that would focus on the foundations. It couldn't get merged because to preserve user comments, they need to be added to the parser somehow, and from that point on the formatter becomes a project that adds a change to the GDScript parser and needs to be thoroughly tested before merging. The brief was that it generally shouldn't break any code we know before being exposed to users.

It's great if the team can find a way to split the work now. For instance, by keeping the formatter unexposed to users.

Anyway, this code belongs to the project, we did what we could, and I'll let the GDScript team decide on where to take it from there. You're more than welcome to split it, change it, discard it, or anything else you need to do.

@vnen
Copy link
Member

vnen commented Feb 8, 2024

I agree with @Scony's idea of doing it in small steps, but I don't know if I agree on what the steps should. IMO the first step would be a "noop" formatter that is able to read to the code and leave it exactly as it was found. Then start adding rules one by one, always backed by automated tests.

Doing a noop is much harder than formatter that only does one rule and discards the rest, but we can get that into production faster even without all the rules.

Not sure if @adamscott is still interested in this. He could work on this as part of his tasks within the Godot Foundation if his schedule allows for it.

@chexmo
Copy link

chexmo commented Feb 21, 2024

Sorry in advance for being new in these repos and be trying to quote stuff.... You can dismiss this message if it's garbage.

I was thinking... If both the parser and the formatter need to follow "code rules"... isn't it a chance to make "code rules" an actual system that them both can connect to?

I example: A code rule may be "the sign of a method should look like this", and both the parser and the formatter (even though they are pretty different functionalities in the user's eyes) can read the rule and then do what they need to do depending the tool selected and their configuration

The benefit from this is that you can add and delete code rules when the GDScript specification changes without touching the tools that use them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement a code formatter for GDScript in the script editor