Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make use of latin1 encoding explicit in gdextension_interface.cpp. #101352

Merged
merged 1 commit into from
Mar 11, 2025

Conversation

Ivorforce
Copy link
Member

@Ivorforce Ivorforce commented Jan 9, 2025

This is a small sanity PR with no change in behavior. latin1 constructors and parsing functions are exposed with this PR (as is parse_utf32 with similar reasoning).

Some functions in the gdextension interface advertise to parse latin1, but they currently parse "whatever the String constructor uses" - which is parse_latin1. This is correct, but imprecise.

To give more context, for most String constructor callers, the current parse_latin1 is incorrect, as most use it to parse utf-8 literals1. It would be sane to make the use of latin1 explicit, to avoid confusing it with the other cases.

I also took the opportunity to clean up the memnew calls of related functions a bit.

Footnotes

  1. https://github.com/godotengine/godot/issues/100641

@Ivorforce Ivorforce requested review from a team as code owners January 9, 2025 15:26
@Repiteo Repiteo added this to the 4.x milestone Jan 9, 2025
Copy link
Contributor

@Repiteo Repiteo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How am I just now realizing we've never had a parsing equivalent for ascii of all things. Yeah, there's the CharString function, but the fact we've never had a parse_ascii or static String ascii feels like a wild oversight.

@Ivorforce
Copy link
Member Author

How am I just now realizing we've never had a parsing equivalent for ascii of all things. Yeah, there's the CharStringfunction, but the fact we've never had aparse_ascii or static String ascii feels like a wild oversight.

Hah, yeah that surprised me too. I have a PR open for it: #101304

@@ -31,6 +31,7 @@
#ifndef SORT_EFFECTS_RD_H
#define SORT_EFFECTS_RD_H

#include "servers/rendering/renderer_rd/shader_rd.h"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick fix due to build failing because of erroneous cache restores.

I have previously established that including a non-generated file explicitly solves such cache issues for good.

@Ivorforce Ivorforce force-pushed the string-latin1-explicit branch from b8c8ed1 to 5b834db Compare January 9, 2025 16:23
@Ivorforce Ivorforce force-pushed the string-latin1-explicit branch from 5b834db to 334e027 Compare January 13, 2025 12:16
Copy link
Contributor

@dsnopek dsnopek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Overall, this is a great idea, however, I'm a little worried about introducing subtle behavior changes. There's two spots where I think behavior may be changing, and we should discuss if this is a change we want or not.

@Ivorforce Ivorforce force-pushed the string-latin1-explicit branch from 334e027 to 329774e Compare January 13, 2025 15:57
Copy link
Contributor

@dsnopek dsnopek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in gdextension_interface.cpp look good to me!

@Repiteo
Copy link
Contributor

Repiteo commented Mar 9, 2025

Needs a rebase before this can be merged

@@ -521,6 +519,14 @@ class String {
char32_t unicode_at(int p_idx) const;

CharString ascii(bool p_allow_extended = false) const;
CharString latin1() const { return ascii(true); }
void parse_latin1(const StrRange<char> &p_cstr);
static String latin1(const StrRange<char> &p_string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it return a CharString (like the latin1 above) ? Or maybe be called parse_latin1 instead ?
Because that would mean that :

String str("Hello");
CharString s1 = str.latin1();
String s2 = String::latin1("Hello"); // This seems like a shorthand for above, but it is not, one is CharString and the other is String

Wouldn't it make more sense to have it return a CharString, and maybe have a static parse_latin1 too ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I see that utf8 has the same behaviour, so at least it is consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, the naming is unnecessarily confusing. I opted it to be consistent for now but agree renaming the bunch may be a good idea.

@Ivorforce Ivorforce force-pushed the string-latin1-explicit branch from 329774e to 5efc71a Compare March 10, 2025 18:32
@Ivorforce Ivorforce force-pushed the string-latin1-explicit branch from 5efc71a to 1818453 Compare March 10, 2025 18:34
@Ivorforce
Copy link
Member Author

I rebased the PR to use Span. I took the opportunity to wipe my changes from gdextension_string_name_new_with_latin1_chars, which would have been somewhat of a performance regression. I can address that separately, some time.

Other than that, it should be as before!

@Repiteo Repiteo modified the milestones: 4.x, 4.5 Mar 11, 2025
@Repiteo Repiteo merged commit 854e973 into godotengine:master Mar 11, 2025
20 checks passed
@Repiteo
Copy link
Contributor

Repiteo commented Mar 11, 2025

Thanks!

@Ivorforce Ivorforce deleted the string-latin1-explicit branch March 11, 2025 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants