Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-15187: [Java] Made reader initialization lazy and added new getTransferPair() function that takes in a Field type #34424

Merged
merged 13 commits into from
Jun 18, 2023
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -580,6 +580,14 @@ public TransferPair getTransferPair(BufferAllocator allocator) {
*/
public abstract TransferPair getTransferPair(String ref, BufferAllocator allocator);

/**
* Construct a transfer pair of this vector and another vector of same type.
* @param field Field object used by the target vector
* @param allocator allocator for the target vector
* @return TransferPair
*/
public abstract TransferPair getTransferPair(Field field, BufferAllocator allocator);

/**
* Transfer this vector'data to another vector. The memory associated
* with this vector is transferred to the allocator of target vector
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,13 +17,15 @@

package org.apache.arrow.vector;

import java.lang.reflect.InvocationTargetException;
import java.util.Collections;
import java.util.Iterator;

import org.apache.arrow.memory.ArrowBuf;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.ReferenceManager;
import org.apache.arrow.util.Preconditions;
import org.apache.arrow.vector.complex.reader.FieldReader;
import org.apache.arrow.vector.util.DataSizeRoundingUtil;
import org.apache.arrow.vector.util.TransferPair;
import org.apache.arrow.vector.util.ValueVectorUtility;
Expand All @@ -50,6 +52,8 @@ public abstract class BaseValueVector implements ValueVector {

protected final BufferAllocator allocator;

protected volatile FieldReader fieldReader;

protected BaseValueVector(BufferAllocator allocator) {
this.allocator = Preconditions.checkNotNull(allocator, "allocator cannot be null");
}
Expand Down Expand Up @@ -143,6 +147,43 @@ long computeCombinedBufferSize(int valueCount, int typeWidth) {
return allocator.getRoundingPolicy().getRoundedSize(bufferSize);
}

/**
* Each vector has a different reader that implements the FieldReader interface. Overridden methods must make
* sure to return the correct type of the reader implementation to instantiate the reader properly.
*
* @return Returns the implementation class type of the vector's reader.
*/
protected abstract Class<? extends FieldReader> getReaderImplClass();

/**
* Default implementation to create a reader for the vector. Depends on the individual vector
* class' implementation of `getReaderImpl()` to initialize the reader appropriately.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be {@link #getReaderImplClass}? JavaDoc does not use Markdown.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

*
* @return Concrete instance of FieldReader by using lazy initialization.
*/
public FieldReader getReader() {
FieldReader reader = fieldReader;

if (reader != null) {
return reader;
}
synchronized (this) {
if (fieldReader == null) {
try {
fieldReader =
(FieldReader) Class.forName(getReaderImplClass().getName()).getConstructor(getClass())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getReaderImplClass already returns a Class, why do we have to look it up again?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: why can't getReaderImplClass return a factory function instead of a class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please clarify what you mean by return a factory function? I can change the usage to getReaderImplClass().getInstance if that is preferable. I just used Class.forName since I am more familiar with that option is all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't these all be Function<FieldVector, FieldReader>?

Copy link
Contributor Author

@rtadepalli rtadepalli Jun 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well returning a FieldReader instance would mean that I'd need to construct it. I want to do that in the synchronized block so I am returning the class type to construct an instance out of.

Out of curiosity, what is the advantage of returning a Function type over the current implementation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getReaderImplClass can return a Function which you then call.

Explicit reflection/metaprogramming isn't necessary here over just a factory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh I guess the construction would only happen when I call .apply which would in the synchronized block. That does seem nicer...

Any chance I could clean that up in a follow up ticket? Or would you prefer I make the changes here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, looks like we commented the same thing at the same time, :jinx:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not include reflection here that isn't strictly necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized we don't need really need the ValueVector as an argument to the Function, so I decided to return a Supplier instead.

.newInstance(this);
} catch (ClassNotFoundException | NoSuchMethodException | InstantiationException | IllegalAccessException |
InvocationTargetException e) {
logger.error("Unable to instantiate FieldReader for {} because of: ", getClass().getSimpleName(), e);
throw new RuntimeException(e);
}
}

return fieldReader;
}
}

/**
* Container for primitive vectors (1 for the validity bit-mask and one to hold the values).
*/
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,6 @@
*/
public final class BigIntVector extends BaseFixedWidthVector implements BaseIntVector {
public static final byte TYPE_WIDTH = 8;
private final FieldReader reader;

/**
* Instantiate a BigIntVector. This doesn't allocate any memory for
Expand Down Expand Up @@ -71,17 +70,11 @@ public BigIntVector(String name, FieldType fieldType, BufferAllocator allocator)
*/
public BigIntVector(Field field, BufferAllocator allocator) {
super(field, allocator, TYPE_WIDTH);
reader = new BigIntReaderImpl(BigIntVector.this);
}

/**
* Get a reader that supports reading values from this vector.
*
* @return Field Reader for this vector
*/
@Override
public FieldReader getReader() {
return reader;
protected Class<? extends FieldReader> getReaderImplClass() {
return BigIntReaderImpl.class;
}

/**
Expand Down Expand Up @@ -286,7 +279,7 @@ public static long get(final ArrowBuf buffer, final int index) {


/**
* Construct a TransferPair comprising of this and a target vector of
* Construct a TransferPair comprising this and a target vector of
* the same type.
*
* @param ref name of the target vector
Expand All @@ -298,6 +291,19 @@ public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
return new TransferImpl(ref, allocator);
}

/**
* Construct a TransferPair comprising of this and a target vector of
* the same type.
*
* @param field Field object used by the target vector
* @param allocator allocator for the target vector
* @return {@link TransferPair}
*/
@Override
public TransferPair getTransferPair(Field field, BufferAllocator allocator) {
return new TransferImpl(field, allocator);
}

/**
* Construct a TransferPair with a desired target vector of the same type.
*
Expand Down Expand Up @@ -331,6 +337,10 @@ public TransferImpl(String ref, BufferAllocator allocator) {
to = new BigIntVector(ref, field.getFieldType(), allocator);
}

public TransferImpl(Field field, BufferAllocator allocator) {
to = new BigIntVector(field, allocator);
}

public TransferImpl(BigIntVector to) {
this.to = to;
}
Expand Down
31 changes: 20 additions & 11 deletions java/vector/src/main/java/org/apache/arrow/vector/BitVector.java
Original file line number Diff line number Diff line change
Expand Up @@ -46,8 +46,6 @@ public final class BitVector extends BaseFixedWidthVector {

private static final int HASH_CODE_FOR_ONE = 19;

private final FieldReader reader;

/**
* Instantiate a BitVector. This doesn't allocate any memory for
* the data in vector.
Expand Down Expand Up @@ -80,17 +78,11 @@ public BitVector(String name, FieldType fieldType, BufferAllocator allocator) {
*/
public BitVector(Field field, BufferAllocator allocator) {
super(field, allocator, 0);
reader = new BitReaderImpl(BitVector.this);
}

/**
* Get a reader that supports reading values from this vector.
*
* @return Field Reader for this vector
*/
@Override
public FieldReader getReader() {
return reader;
protected Class<? extends FieldReader> getReaderImplClass() {
return BitReaderImpl.class;
}

/**
Expand Down Expand Up @@ -542,7 +534,7 @@ public void setRangeToOne(int firstBitIndex, int count) {


/**
* Construct a TransferPair comprising of this and a target vector of
* Construct a TransferPair comprising this and a target vector of
* the same type.
*
* @param ref name of the target vector
Expand All @@ -554,6 +546,19 @@ public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
return new TransferImpl(ref, allocator);
}

/**
* Construct a TransferPair comprising this and a target vector of
* the same type.
*
* @param field Field object used by the target vector
* @param allocator allocator for the target vector
* @return {@link TransferPair}
*/
@Override
public TransferPair getTransferPair(Field field, BufferAllocator allocator) {
return new TransferImpl(field, allocator);
}

/**
* Construct a TransferPair with a desired target vector of the same type.
*
Expand All @@ -572,6 +577,10 @@ public TransferImpl(String ref, BufferAllocator allocator) {
to = new BitVector(ref, field.getFieldType(), allocator);
}

public TransferImpl(Field field, BufferAllocator allocator) {
to = new BitVector(field, allocator);
}

public TransferImpl(BitVector to) {
this.to = to;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,6 @@
public final class DateDayVector extends BaseFixedWidthVector {

public static final byte TYPE_WIDTH = 4;
private final FieldReader reader;

/**
* Instantiate a DateDayVector. This doesn't allocate any memory for
Expand Down Expand Up @@ -72,17 +71,11 @@ public DateDayVector(String name, FieldType fieldType, BufferAllocator allocator
*/
public DateDayVector(Field field, BufferAllocator allocator) {
super(field, allocator, TYPE_WIDTH);
reader = new DateDayReaderImpl(DateDayVector.this);
}

/**
* Get a reader that supports reading values from this vector.
*
* @return Field Reader for this vector
*/
@Override
public FieldReader getReader() {
return reader;
protected Class<? extends FieldReader> getReaderImplClass() {
return DateDayReaderImpl.class;
}

/**
Expand Down Expand Up @@ -290,7 +283,7 @@ public static int get(final ArrowBuf buffer, final int index) {


/**
* Construct a TransferPair comprising of this and a target vector of
* Construct a TransferPair comprising this and a target vector of
* the same type.
*
* @param ref name of the target vector
Expand All @@ -302,6 +295,19 @@ public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
return new TransferImpl(ref, allocator);
}

/**
* Construct a TransferPair comprising this and a target vector of
* the same type.
*
* @param field Field object used by the target vector
* @param allocator allocator for the target vector
* @return {@link TransferPair}
*/
@Override
public TransferPair getTransferPair(Field field, BufferAllocator allocator) {
return new TransferImpl(field, allocator);
}

/**
* Construct a TransferPair with a desired target vector of the same type.
*
Expand All @@ -320,6 +326,10 @@ public TransferImpl(String ref, BufferAllocator allocator) {
to = new DateDayVector(ref, field.getFieldType(), allocator);
}

public TransferImpl(Field field, BufferAllocator allocator) {
to = new DateDayVector(field, allocator);
}

public TransferImpl(DateDayVector to) {
this.to = to;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,6 @@
*/
public final class DateMilliVector extends BaseFixedWidthVector {
public static final byte TYPE_WIDTH = 8;
private final FieldReader reader;

/**
* Instantiate a DateMilliVector. This doesn't allocate any memory for
Expand Down Expand Up @@ -74,17 +73,11 @@ public DateMilliVector(String name, FieldType fieldType, BufferAllocator allocat
*/
public DateMilliVector(Field field, BufferAllocator allocator) {
super(field, allocator, TYPE_WIDTH);
reader = new DateMilliReaderImpl(DateMilliVector.this);
}

/**
* Get a reader that supports reading values from this vector.
*
* @return Field Reader for this vector
*/
@Override
public FieldReader getReader() {
return reader;
protected Class<? extends FieldReader> getReaderImplClass() {
return DateMilliReaderImpl.class;
}

/**
Expand Down Expand Up @@ -293,7 +286,7 @@ public static long get(final ArrowBuf buffer, final int index) {


/**
* Construct a TransferPair comprising of this and a target vector of
* Construct a TransferPair comprising this and a target vector of
* the same type.
*
* @param ref name of the target vector
Expand All @@ -305,6 +298,19 @@ public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
return new TransferImpl(ref, allocator);
}

/**
* Construct a TransferPair comprising this and a target vector of
* the same type.
*
* @param field Field object used by the target vector
* @param allocator allocator for the target vector
* @return {@link TransferPair}
*/
@Override
public TransferPair getTransferPair(Field field, BufferAllocator allocator) {
return new TransferImpl(field, allocator);
}

/**
* Construct a TransferPair with a desired target vector of the same type.
*
Expand All @@ -323,6 +329,10 @@ public TransferImpl(String ref, BufferAllocator allocator) {
to = new DateMilliVector(ref, field.getFieldType(), allocator);
}

public TransferImpl(Field field, BufferAllocator allocator) {
to = new DateMilliVector(field, allocator);
}

public TransferImpl(DateMilliVector to) {
this.to = to;
}
Expand Down
Loading