Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codegen generates inefficient Java code for List parameters #282

Open
jkronegg opened this issue Feb 6, 2025 · 5 comments
Open

Codegen generates inefficient Java code for List parameters #282

jkronegg opened this issue Feb 6, 2025 · 5 comments

Comments

@jkronegg
Copy link
Contributor

jkronegg commented Feb 6, 2025

👓 What did you see?

For List parameters, the codegen generates Java code which looks like this:

public TableRow(
    Location location,
    java.util.List<TableCell> cells,
    String id
) {
    this.location = requireNonNull(location, "TableRow.location cannot be null");
    this.cells = unmodifiableList(new ArrayList<>(requireNonNull(cells, "TableRow.cells cannot be null")));
    this.id = requireNonNull(id, "TableRow.id cannot be null");
}

This means that the cells parameters is converted to ArrayList (a modifiable list), then to a non-modifiable list.
I think this is done to ensure a proper conversion from Collection -> ArrayList -> UnmodifiableList.
However, all list arguments are List and not Collection (see java.rb). This lead to inefficient code because two list allocations are done, no matter the input list is a modifiable or non-modifiable list.

The performance impact has been detected while working on cucumber/gherkin#361, using IntelliJ Profiler (it's hard to see the real impact because there is a lot of generated classes which suffer from this syndrome).

✅ What did you expect to see?

If the java.java.erb code generator see a List, it should generate convert it to a non-modifiable list using unmodifiableList(List) without doing a new ArrayList(Collection).
This will avoid memory allocation and list traversal, so will lead to faster code.

📦 Which tool/library version are you using?

gherkin 31.0.0 (which uses messages-27.2.0)

🔬 How could we reproduce it?

The test-case below shows the issue:

import io.cucumber.messages.types.Location;
import io.cucumber.messages.types.TableCell;
import io.cucumber.messages.types.TableRow;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TableRowTest {
    @Test
    public void modifiable_list_converted_to_unmodifiable_list() {
        // Given
        List<TableCell> tableCells = new ArrayList<>();

        // When
        TableRow row = new TableRow(new Location(1L,1L), tableCells, "1");

        // Then
        Assertions.assertNotSame(tableCells, row.getCells());
        Assertions.assertThrows(UnsupportedOperationException.class, () -> row.getCells().add(null));
    }

    @Test
    public void unmodifiable_list_not_converted() {
        // Given
        List<TableCell> tableCells = Collections.unmodifiableList(new ArrayList<>());

        // When
        TableRow row = new TableRow(new Location(1L,1L), tableCells, "1");

        // Then
        Assertions.assertSame(tableCells, row.getCells()); // fails
    }
}

📚 Any additional context?

No response

@mpkorstanje
Copy link
Contributor

Unfortunately, the semantics that we want are that of an immutable list, not merely an unmodifiable list.

So if we did not create a new list first, then when creating a new TableRow(new Location(1L,1L), tableCells, "1") anyone with a reference to tableCells can modify the contents.

As demo:

    List<String> original = new ArrayList<>();
    original.add("hello");
    List<String> dontChangeMe = Collections.unmodifiableList(original);
    System.out.println(dontChangeMe); // prints [hello]
    original.add("world");
    System.out.println(dontChangeMe);  // prints [hello, world]

While what we want:

    List<String> original = new ArrayList<>();
    original.add("hello");
    List<String> dontChangeMe = Collections.unmodifiableList(new ArrayList<>(original));
    System.out.println(dontChangeMe); // prints [hello]
    original.add("world"); 
    System.out.println(dontChangeMe); // prints [hello]

@jkronegg
Copy link
Contributor Author

jkronegg commented Feb 6, 2025

Most of the time, the input parameters are created by the cucumber framework itself. Can't we trust the caller that he will not change the List ?
The constructor from cucumber-messages contribute to 7% of the duration of io.cucumber.gherkin.Parser.parse(). Worth the effort ?

@mpkorstanje
Copy link
Contributor

Most of the time, the input parameters are created by the cucumber framework itself. Can't we trust the caller that he will not change the List ?

I don't think we'd make any changes intentionally. But these messages are shared across threads, if there are mistakes, problems would be incredibly hard to spot and impossible to trace back to their origin.

A possible solution would be to use `List.copyOf`` and then use an immutable collection where possible in the caller. But that requires Java 10+.

@jkronegg
Copy link
Contributor Author

jkronegg commented Feb 7, 2025

Yes, Java 10+ List.copyOf is safer.
Is there some plan for Java 10+ in Cucumber ?

@mpkorstanje
Copy link
Contributor

mpkorstanje commented Feb 9, 2025

Under consideration in cucumber/cucumber-jvm#2962 but no time line attached. Looking at the timeline that JUnit 5 is following, I would expect September 2025.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants