Unicode parsing error (Linux) #79806

Open
GabrielLins64 opened this issue Jul 23, 2023 · 9 comments

@GabrielLins64

Godot version

4.1.1 - .NET (with C# support)

System information

Linux Ubuntu 20.04 - Godot Engine 4.1.1 - .NET (with C# support)

Issue description

I've installed all the prerequisites, including C#, .NET 6.0 and Mono. I haven't changed any engine settings, but on a fresh installation, creating any new project prints the following output (without even building):

Godot Engine v4.1.1.stable.mono.official (c) 2007-present Juan Linietsky, Ariel Manzur & Godot Contributors.
● modules/gltf/register_types.cpp:73 - Blend file import is enabled in the project settings, but no Blender path is configured in the editor settings. Blend files will not be imported.
--- Debug adapter server started ---
--- GDScript language server started ---
● Unicode parsing error, some characters were replaced with � (U+FFFD): Invalid UTF-8 leading byte (82)
● Unicode parsing error, some characters were replaced with � (U+FFFD): Invalid UTF-8 leading byte (82)
● Unicode parsing error, some characters were replaced with � (U+FFFD): Invalid UTF-8 continuation byte (f1 ... 3 ...)
● Unicode parsing error, some characters were replaced with � (U+FFFD): Invalid UTF-8 leading byte (88)
● Unicode parsing error, some characters were replaced with � (U+FFFD): Invalid UTF-8 continuation byte (cc ... 30 ...)
● Unicode parsing error, some characters were replaced with � (U+FFFD): Invalid UTF-8 leading byte (86)
● Unicode parsing error, some characters were replaced with � (U+FFFD): Invalid UTF-8 leading byte (86)
● Unicode parsing error, some characters were replaced with � (U+FFFD): Invalid UTF-8 continuation byte (f7 ... d ...)

I can use the engine and build the project, but I can't figure out why these errors are appearing.
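For anyone trying to read the messages above, here is a minimal sketch that classifies a byte by its role in UTF-8 as defined in RFC 3629. It assumes the values in parentheses in the log are hexadecimal byte values, which is my reading of the output and is not confirmed in this thread. The function name `classify_utf8_byte` is made up for illustration, and Godot's own parser may treat some bytes differently from the strict standard.

```python
def classify_utf8_byte(b: int) -> str:
    """Classify a single byte value by its role in UTF-8 (RFC 3629)."""
    if b <= 0x7F:
        return "ascii"         # 0xxxxxxx: a complete one-byte character
    if b <= 0xBF:
        return "continuation"  # 10xxxxxx: only valid after a leading byte
    if b in (0xC0, 0xC1) or b >= 0xF5:
        return "invalid"       # never appears in well-formed UTF-8
    if b <= 0xDF:
        return "lead-2"        # 110xxxxx: starts a 2-byte sequence
    if b <= 0xEF:
        return "lead-3"        # 1110xxxx: starts a 3-byte sequence
    return "lead-4"            # 0xF0..0xF4: starts a 4-byte sequence


# Byte values taken from the log output above (assumed to be hexadecimal).
for value in (0x82, 0x88, 0x86, 0xF1, 0xCC, 0xF7):
    print(f"{value:#04x}: {classify_utf8_byte(value)}")
```

Under that reading, 0x82, 0x88 and 0x86 are continuation bytes that showed up where a new character was expected to start, which is what you would see if binary data or text in another encoding were parsed as UTF-8.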

Steps to reproduce

  1. Install Godot Engine 4.1.1 - .NET (C# support) from here.
  2. Create a new project (any type)
  3. Check the Output console

Minimal reproduction project

Project with only icon.svg, icon.svg.import and project.godot

; project.godot
; Engine configuration file.
; It's best edited using the editor UI and not directly,
; since the parameters that go here are not all obvious.
;
; Format:
;   [section] ; section goes between []
;   param=value ; assign values to parameters

config_version=5

[application]

config/name="Test"
config/features=PackedStringArray("4.1", "Forward Plus")
config/icon="res://icon.svg"

[dotnet]

project/assembly_name="Test"
@dalexeev
Member

dalexeev commented Jul 28, 2023

Unicode parsing error, some characters were replaced with � (U+FFFD)

godot/core/string/ustring.cpp

Lines 1751 to 1757 in da81ca6

void String::print_unicode_error(const String &p_message, bool p_critical) const {
	if (p_critical) {
		print_error(vformat("Unicode parsing error, some characters were replaced with � (U+FFFD): %s", p_message));
	} else {
		print_error(vformat("Unicode parsing error: %s", p_message));
	}
}

I noticed the error when running tests. Here are the backtraces:

Backtrace 1
#0  String::print_unicode_error (this=0x7fffffffd570, p_message=..., 
    p_critical=true) at core/string/ustring.cpp:1753
#1  0x000055555d92809f in String::operator+= (this=0x7fffffffd570, 
    p_char=0 U'\000') at core/string/ustring.cpp:577
#2  0x000055555d3c978f in JSON::_get_token (
    p_str=0x5555627111f0 U"\"\\u0000\"", index=@0x7fffffffd5e4: 6, p_len=8, 
    r_token=..., line=@0x7fffffffda10: 0, r_err_str=...)
    at core/io/json.cpp:313
#3  0x000055555d3ca8bb in JSON::_parse_string (p_json=..., r_ret=..., 
    r_err_str=..., r_err_line=@0x7fffffffda10: 0) at core/io/json.cpp:524
#4  0x000055555d3caa64 in JSON::parse (this=0x7fffffffd820, 
    p_json_string=..., p_keep_text=false) at core/io/json.cpp:548
#5  0x0000555558210037 in TestJSON::DOCTEST_ANON_FUNC_372 ()
    at ./tests/core/io/test_json.h:190
#6  0x00005555588da709 in doctest::Context::run (this=0x7fffffffdec8)
    at ./thirdparty/doctest/doctest.h:7007
#7  0x00005555586e8b05 in test_main (argc=3, argv=0x7fffffffe568)
    at tests/test_main.cpp:195
#8  0x00005555581ac13b in Main::test_entrypoint (argc=3, argv=0x7fffffffe568, 
    tests_need_run=@0x7fffffffdf87: true) at main/main.cpp:688
#9  0x0000555558126f2d in main (argc=3, argv=0x7fffffffe568)
    at platform/linuxbsd/godot_linuxbsd.cpp:56
Backtrace 2
#0  String::print_unicode_error (this=0x7fffffffd6f0, p_message=..., 
    p_critical=true) at core/string/ustring.cpp:1753
#1  0x000055555d92809f in String::operator+= (this=0x7fffffffd6f0, 
    p_char=0 U'\000') at core/string/ustring.cpp:577
#2  0x0000555558210536 in TestJSON::DOCTEST_ANON_FUNC_372 ()
    at ./tests/core/io/test_json.h:219
#3  0x00005555588da709 in doctest::Context::run (this=0x7fffffffdec8)
    at ./thirdparty/doctest/doctest.h:7007
#4  0x00005555586e8b05 in test_main (argc=3, argv=0x7fffffffe568)
    at tests/test_main.cpp:195
#5  0x00005555581ac13b in Main::test_entrypoint (argc=3, argv=0x7fffffffe568, 
    tests_need_run=@0x7fffffffdf87: true) at main/main.cpp:688
#6  0x0000555558126f2d in main (argc=3, argv=0x7fffffffe568)
    at platform/linuxbsd/godot_linuxbsd.cpp:56
Backtrace 3
#0  String::print_unicode_error (this=0x7fffffffd938, p_message=..., 
    p_critical=true) at core/string/ustring.cpp:1753
#1  0x000055555d92cab8 in String::parse_utf8 (this=0x7fffffffd938, 
    p_utf8=0x55555df295e4 <TestString::DOCTEST_ANON_FUNC_4463()::u8str+20> "\201", p_len=-1, p_skip_cr=false) at core/string/ustring.cpp:1930
#2  0x000055555835572f in TestString::DOCTEST_ANON_FUNC_4463 ()
    at ./tests/core/string/test_string.h:175
#3  0x00005555588da709 in doctest::Context::run (this=0x7fffffffdec8)
    at ./thirdparty/doctest/doctest.h:7007
#4  0x00005555586e8b05 in test_main (argc=3, argv=0x7fffffffe568)
    at tests/test_main.cpp:195
#5  0x00005555581ac13b in Main::test_entrypoint (argc=3, argv=0x7fffffffe568, 
    tests_need_run=@0x7fffffffdf87: true) at main/main.cpp:688
#6  0x0000555558126f2d in main (argc=3, argv=0x7fffffffe568)
    at platform/linuxbsd/godot_linuxbsd.cpp:56
Backtrace 4
#0  String::print_unicode_error (this=0x7fffffffd948, p_message=..., 
    p_critical=true) at core/string/ustring.cpp:1753
#1  0x000055555d92c5c2 in String::parse_utf8 (this=0x7fffffffd948, 
    p_utf8=0x555563001df0 "Eお\217\343㘏よう\300\200🎤\360\202\202\254\355\240\201", p_len=-1, p_skip_cr=false) at core/string/ustring.cpp:1839
#2  0x000055555d92c388 in String::utf8 (
    p_utf8=0x555563001df0 "Eお\217\343㘏よう\300\200🎤\360\202\202\254\355\240\201", p_len=-1) at core/string/ustring.cpp:1782
#3  0x000055555835607a in TestString::DOCTEST_ANON_FUNC_4465 ()
    at ./tests/core/string/test_string.h:195
#4  0x00005555588da709 in doctest::Context::run (this=0x7fffffffdec8)
    at ./thirdparty/doctest/doctest.h:7007
#5  0x00005555586e8b05 in test_main (argc=3, argv=0x7fffffffe568)
    at tests/test_main.cpp:195
#6  0x00005555581ac13b in Main::test_entrypoint (argc=3, argv=0x7fffffffe568, 
    tests_need_run=@0x7fffffffdf87: true) at main/main.cpp:688
#7  0x0000555558126f2d in main (argc=3, argv=0x7fffffffe568)
    at platform/linuxbsd/godot_linuxbsd.cpp:56

@bruvzg
Member

bruvzg commented Jul 28, 2023

Most likely, this is caused by some system paths or environment variables not being encoded in UTF-8.
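One way to test this hypothesis is to look at the raw bytes of the environment instead of the decoded strings. The following is a minimal sketch, assuming a POSIX system where Python exposes `os.environb`. The helper name `find_non_utf8_env` is made up for illustration, and the script knows nothing about which variables Godot actually reads.

```python
import os


def find_non_utf8_env(environ=None):
    """Return the names of environment variables whose name or value is not valid UTF-8."""
    if environ is None:
        environ = os.environb  # raw bytes view of the environment (POSIX only)
    bad = []
    for name, value in environ.items():
        try:
            name.decode("utf-8")
            value.decode("utf-8")
        except UnicodeDecodeError:
            # Decode the name lossily so it can still be reported.
            bad.append(name.decode("utf-8", errors="replace"))
    return sorted(bad)


for name in find_non_utf8_env():
    print("not valid UTF-8:", name)
```

If this prints nothing, every variable in the current shell decodes cleanly, and the cause is more likely a file the editor reads at startup.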

I noticed the error when running tests.

Some tests deliberately include invalid Unicode sequences and are expected to trigger this error.
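To illustrate what such deliberately invalid input looks like, here is a small sketch using three byte sequences that appear in octal form in the Backtrace 4 test string above. It uses Python's built-in decoder rather than Godot's parser, so the exact error wording and the number of replacement characters may differ from the engine's.

```python
# Byte sequences visible (in octal) in the Backtrace 4 test string.
SAMPLES = {
    "overlong encoding of U+0000": b"\xc0\x80",
    "overlong encoding of U+20AC": b"\xf0\x82\x82\xac",
    "encoded UTF-16 surrogate": b"\xed\xa0\x81",
}


def decode_lossy(data: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for anything that is not well-formed."""
    return data.decode("utf-8", errors="replace")


for label, raw in SAMPLES.items():
    try:
        raw.decode("utf-8")  # strict decoding raises on malformed input
        status = "valid"
    except UnicodeDecodeError as exc:
        status = f"invalid at byte {exc.start}: {exc.reason}"
    print(f"{label}: {status}; lossy result = {decode_lossy(raw)!r}")
```

All three are rejected by a strict decoder, and the lossy decode substitutes U+FFFD, which matches the replacement behaviour the engine's error message describes.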

@dalexeev
Member

Some tests deliberately include invalid Unicode sequences and are expected to trigger this error.

I thought that in this case we temporarily disable error printing (but the test framework should check the output).

@bruvzg
Member

bruvzg commented Jul 28, 2023

I thought that in this case we temporarily disable error printing (but the test framework should check the output).

Tests should suppress printing; this is done in the String tests but not in the JSON tests (also, at least one test is wrong), and the error printing itself uses the wrong encoding.

@GabrielLins64
Author

GabrielLins64 commented Mar 10, 2024

Update: I'm still having this issue in version 4.2.1.

Most likely, this is caused by some system paths or environment variables not being encoded in UTF-8.

I noticed the error when running tests.

Some tests deliberately include invalid Unicode sequences and are expected to trigger this error.

I've checked my project path, Godot path and the related environment variables, but I couldn't find anything unusual:

Godot path: /home/gabriellins/Applications/godot/Godot_v4.2.1-stable_linux.x86_64
Project path: /home/gabriellins/Study/Game_Dev/Godot/Projects/Primeiro_Projeto
Related environment var: GODOT4=/home/gabriellins/Applications/godot/Godot_v4.2.1-stable_linux.x86_64

To reproduce:

  1. Open Godot 4.2.1
  2. Run the scene

Output:

image

Tested in: Ubuntu 20.04.6 LTS

@pdragon

pdragon commented Jul 24, 2024

This error happens if you copy something from the Godot editor into Visual Studio. The Godot editor uses UTF-8 characters as tabs, and when you paste them into Visual Studio it converts them into unreadable characters. You need to remove all unreadable characters from the file that the error refers to.

@pdragon

pdragon commented Jul 24, 2024

I also discovered that this error occurs when there are non-Latin characters, for example in comments. As soon as I delete them, the error disappears.
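Non-Latin characters saved as valid UTF-8 should decode without errors, so one possible explanation (an assumption, not confirmed in this thread) is that the affected files were saved in a different encoding. Here is a minimal sketch that walks a project directory and reports files that fail strict UTF-8 decoding. The suffix list and the name `find_invalid_utf8_files` are illustrative choices, not anything Godot defines.

```python
from pathlib import Path

# Assumed list of text file types worth checking; adjust for your project.
TEXT_SUFFIXES = {".cs", ".gd", ".tscn", ".tres", ".godot", ".csv", ".json", ".cfg"}


def find_invalid_utf8_files(root, suffixes=TEXT_SUFFIXES):
    """Yield (path, byte_offset, reason) for each file that is not valid UTF-8."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            path.read_bytes().decode("utf-8")  # strict decoding
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first offending byte.
            yield path, exc.start, exc.reason
```

Calling `list(find_invalid_utf8_files("/path/to/project"))` returns the offending files together with the offset of the first bad byte, which narrows the search to specific files instead of requiring every comment to be deleted.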

@begili

begili commented Aug 13, 2024

@GabrielLins64 Maybe I'm a bit late to the conversation, but I faced the exact same issue. After debugging the editor, I found that the error was caused by the loading of the system certificates: my /etc/ssl/certs/ca-certificates.crt file had bad certificate content. Just make sure that you update the certificates file (sudo update-ca-certificates). Afterwards, the file is not in UTF-8 but does not contain any invalid characters, and therefore your problem should be solved. Let me know if this helps, if you read this.

Steps to reproduce: add a self-signed certificate to ca-certificates.crt without running update-ca-certificates afterwards.
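A quick way to check the bundle before regenerating it is to look for bytes outside the ASCII range, since a PEM bundle normally consists of Base64 text and ASCII marker lines. This is a rough sketch under that assumption. The helper name `find_non_ascii_lines` is made up for illustration, and the check does not validate the certificates themselves.

```python
def find_non_ascii_lines(data: bytes):
    """Return (line_number, raw_line) pairs for lines containing bytes >= 0x80."""
    hits = []
    for number, line in enumerate(data.splitlines(), start=1):
        if any(byte >= 0x80 for byte in line):
            hits.append((number, line))
    return hits


# Example usage, with the path mentioned in the comment above:
#   with open("/etc/ssl/certs/ca-certificates.crt", "rb") as handle:
#       for number, line in find_non_ascii_lines(handle.read()):
#           print(number, line[:40])
```

An empty result means the bundle is plain ASCII. Any reported lines are candidates for the bad certificate content described above.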

@morcillo-alex

Hello community,

I was able to fix these issues by simply saving the files with Visual Studio Code, forcing the encoding to UTF-8.

image

Make sure to select this option.

image

Of course, this only helps for text-based files (CSV, code, ...)
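The same re-save can be scripted for many files at once. The sketch below assumes the files were originally written in Windows-1252 (`cp1252`). That encoding is only a guess and has to be replaced with whatever your editor actually used. The function name `reencode_to_utf8` is hypothetical, and you should back up the files first.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="cp1252"):
    """Rewrite `path` as UTF-8 if it is not already valid UTF-8.

    Returns True if the file was rewritten, False if it was left untouched.
    """
    path = Path(path)
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        pass
    # Raises UnicodeDecodeError if the assumed source encoding cannot decode the bytes.
    text = raw.decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
    return True
```

Files that already decode as UTF-8 are skipped, so running it twice is harmless. A wrong `source_encoding` can still produce readable-looking but incorrect characters, so spot-check the results.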

I hope this helps
