Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename String .find_char -> find, rfind_char -> rfind, contains_char -> contains. #100520

Closed
wants to merge 1 commit into from

Conversation

Ivorforce
Copy link
Member

@Ivorforce Ivorforce commented Dec 17, 2024

Disclaimer: This is a change proposal, so if you have reservations about the rename, please comment them.

In my opinion, there is no real reason for the _char suffix. Instead, it reduces findability of these functions.

The PR itself is just a find-replace of find_char( -> find( and contains_char( -> contains(.

@Ivorforce Ivorforce force-pushed the string-char-nosuffix branch from 38be04c to 2372b12 Compare December 17, 2024 18:05
@AThousandShips
Copy link
Member

AThousandShips commented Dec 17, 2024

This makes potentially exposing these functions to extensions and scripting impossible without breaking the ability to have both module and extension support, as scripting doesn't allow overloadeds

It also nominally breaks compatibility but we don't guarantee module compatibility, but it's still very annoying for people making modules

I'm also against this on the principle of readability, and it now makes get_slicec odd

It also breaks other planned additions, which are planned to be exposed, which would either have to not be exposed or be different from this

@AThousandShips
Copy link
Member

This absolutely needs a proposal to track the discussion to not bother all reviewers etc.

@Ivorforce
Copy link
Member Author

Ivorforce commented Dec 17, 2024

I appreciate the comment, and was hoping you would pop in, since you mentioned before you have reservations :).

This absolutely needs a proposal to track the discussion to not bother all reviewers etc.

I did not expect this many reservations about the change. Should I close this PR and make a proposal?

This makes potentially exposing these functions to extensions and scripting impossible without breaking the ability to have both module and extension support, as scripting doesn't allow overloadeds

Is the reason that it's a hard rule to call in-engine functions the same as exposed functions? Because from a technical perspective, this would not be a problem.

It also nominally breaks compatibility but we don't guarantee module compatibility, but it's still very annoying for people making modules

True.

I'm also against this on the principle of readability, and it now makes get_slicec odd

Can you elaborate what you mean by that?

It also breaks other planned additions, which are planned to be exposed, which would either have to not be exposed or be different from this

Can you explain which exactly, and how this rename interferes with them?

@AThousandShips
Copy link
Member

Is the reason that it's a hard rule to call in-engine functions the same as exposed functions? Because from a technical perspective, this would not be a problem.

Yes because you can't use it in modules otherwise if you want to be able to use them as both a proper module and an extension, see modules/text_server_adv as an example, it's designed to be able to used as an extension as well, which is why the changes in #99328 skipped the servers

Can you elaborate what you mean by that?

Why rename find_char to find but not get_slicec to get_slice (not saying to do so, it'll break things)

Can you explain which exactly, and how this rename interferes with them?

The additions of replace_char and remove_char, this change only makes sense if we remove all character specific naming, otherwise it is far more confusing to have find be an overload but not replace etc.

@AThousandShips
Copy link
Member

AThousandShips commented Dec 17, 2024

So to summarise:

  • Bound methods can't be overloads
  • Bound methods should whenever possible (especially for core types) match to make working with both extensions and engine code in parallel simpler (or possible without lots of compiler checks)
  • This naming standard was pretty much settled on as discussed when adding contains_char, to move away from the c suffix

To me the suffix is helpful, most of the time we use a literal and then it doesn't matter, but it communicates intent and helps if it's a variable, where not having to go look for what that variable is is appreciated

Edit: note to self that we should document some things about extensions and method bindings about this as a lot of people aren't aware about the ability to make code thats both a module and an extension and things to think about with that

Edit 2: this is also a kind of change that's disruptive by causing hard to notice changes when merged for any PR that uses these methods, relying on people realising a PR needs a rebase to notice the renames fail CI

@Ivorforce
Copy link
Member Author

Ivorforce commented Dec 17, 2024

Is the reason that it's a hard rule to call in-engine functions the same as exposed functions? Because from a technical perspective, this would not be a problem.

Yes because you can't use it in modules otherwise if you want to be able to use them as both a proper module and an extension, see modules/text_server_adv as an example, it's designed to be able to used as an extension as well, which is why the changes in #99328 skipped the servers

If this is the only reason, it wouldn't be a hard limitation. It just means that the function is not exposed through the GDExtension API, and that modules designed to be used as both cannot use the char-taking function. It will not be auto-selected if a char * or String is passed.

Why rename find_char to find but not get_slicec to get_slice (not saying to do so, it'll break things)

I was not aware they were called that because they are taking chars. In the spirit of this PR, it would make sense to rename them as well.

Can you explain which exactly, and how this rename interferes with them?

The additions of replace_char and remove_char, this change only makes sense if we remove all character specific naming, otherwise it is far more confusing to have find be an overload but not replace etc.

I was trying to get this PR out fast because if we introduce more char-optimized versions, it will mean more and more effort to change it back. Changing existing unmerged PR shouldn't be a huge effort, since it's a simple find-and-replace action.

Bound methods can't be overloads

I still don't understand why that would be the case. Overloads can, technically, be bound just as easily as renames. find is already an overload of String& and const char* arguments. This should be enough evidence it's not a problem.

Bound methods should whenever possible (especially for core types) match to make working with both extensions and engine code in parallel simpler (or possible without lots of compiler checks)

I don't understand this point, I think?

This naming standard was pretty much settled on as discussed when adding contains_char, to move away from the c suffix

In the relevant PR there were several voices in favour of "merging it now but bulk-renaming functions to a non-_char suffix later". I have now created the relevant PR.

To me the suffix is helpful, most of the time we use a literal and then it doesn't matter, but it communicates intent and helps if it's a variable, where not having to go look for what that variable is is appreciated

The point is that the two function calls will yield equivalent results in all cases:

string.find("a")
string.find('a')

If the function was named differently, it would imply different behavior. However, the single-char version is merely an optimized case of the full string version. That's the entire reason why i'm against different names.

Edit 2: this is also a kind of change that's disruptive by causing hard to notice changes when merged for any PR that uses these methods, relying on people realising a PR needs a rebase to notice the renames fail CI

That's true. I'd be willing to take responsibility and notify authors with affected outstanding PRs in case of a merge. There's ways to search for that (though it's an admittedly manual act).

@AThousandShips
Copy link
Member

AThousandShips commented Dec 17, 2024

I was not aware they were called that because they are taking chars. In the spirit of this PR, it would make sense to rename them as well.

Let's not break compatible in classes that don't have support for handling it

I still don't understand why that would be the case. Overloads can, technically, be bound just as easily as renames.

They could but we don't have a system for them, so they can't be bound, you can bind a function that has an overload in c++ but the name can't be the same in extensions, which is the entire issue there

Essentially we can't have both find(String) and find(int) in extensions or scripting

To explain again, the bound methods can be used in extensions and modules, and it's possible to compile a module either as built into the engine, or as an extension using godot-cpp, the text servers are examples of this

In those cases only code that's exposed can be used, or it won't compile, so we use find(":") etc. in those instead of the character specific one because that is not exposed


I don't see the reason to make these methods different from every other method in this class, they cannot be named the same without forcing us to not exposing them

It is far more confusing to have replace_char and replace but only find with different overloads

Simply i don't see the logic behind this and to me it goes against established logic and preference for readability

By changing them like this we effectively say "we won't ever expose these" which I think is a bad idea


To reiterate the issue:

  1. Overloads are not supported in scripting and extensions, so find(String) and find(int) can't both be bound
  2. We want code that uses bound methods to be usable in extensions, as you can make a module that can both be compiled as a module under modules and as an extension, in that code we can't use methods that aren't bound
  3. We might well want to bind these methods especially to use in text servers etc., where we can't use find_char because it's not exposed, and we couldn't if this change is made (regardless what we name it as long as it's an overload)

Edit: Realized I had missed that you used find_char in that module, will make a fix to correct that error

@Ivorforce
Copy link
Member Author

Ivorforce commented Dec 17, 2024

Alright, to get me to understand, allow me to rephrase your (I think) two overarching points.

One is simply that find_char is a better name.
This is a totally valid opinion; while I disagree, between a few voices we could find a consensus on which is a better naming scheme.

More importantly, you are making the point that the GDExtension API does not support binding two different functions with the same name. This would be a technical limitation of the current implementation of the system.
The implications are clear. If this is the case, I am also against a rename.

Am I getting your points right? I appreciate your patience in continuing the discussion :)

PS. I do think this is a rather awkward limitation of the GDExtension system, but it's obviously out of scope to address that just for this rename.

@AThousandShips
Copy link
Member

AThousandShips commented Dec 17, 2024

You are correct, now this doesn't currently affect us as such, but it would prevent us from binding this very useful method until we support overloads, and that's currently not being heavily worked on and isn't a high priority (it would need GDScript support, the extension side wouldn't be difficult as such, but extension must match scripting)

So it's indeed a mix of convention, that we should either change all methods or none, having get_slicec be odd out would be weird, but it's bound so it can't be renamed, and these methods are pretty useful (and I was going to expose find_char, specifically because of the code in the modules, but things got in the way)

@Ivorforce
Copy link
Member Author

You are correct, now this doesn't currently affect us as such, but it would prevent us from binding this very useful method until we support overloads, and that's currently not being heavily worked on and isn't a high priority (it would need GDScript support, the extension side wouldn't be difficult as such, but extension must match scripting)

In this specific case, binding against GDScript wouldn't be useful because GDScript doesn't have a char32_t data type 😄
But yeah, I get that even so it would mean additional effort in adding to the binding system for other languages.

@AThousandShips
Copy link
Member

We already expose get_slicec you just use an int, and extensions don't use char32_t for that either but an int

@Ivorforce
Copy link
Member Author

Ivorforce commented Dec 17, 2024

Alright then. Thanks again for the explanations. I am closing this PR for the implication that, with the current implementation, this rename would prevent the find_char etc. functions to be bound and exposed to GDExtensions. This is a technical limitation that outweighs potential benefits for which the PR was created.

@Ivorforce Ivorforce closed this Dec 17, 2024
@Ivorforce Ivorforce deleted the string-char-nosuffix branch December 17, 2024 23:31
@AThousandShips
Copy link
Member

Thank you for resolving this confusing topic!

@AThousandShips AThousandShips removed this from the 4.x milestone Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants