[R] Problem using month() binding on augmented column #39548

Open
thisisnic opened this issue Jan 10, 2024 · 0 comments

Describe the bug, including details regarding any error messages, version, and platform.

library(stringr)
library(arrow)
library(dplyr)

tf <- tempfile()
dir.create(tf)
data <- tibble::tibble(x = 1)

write_parquet(data, file.path(tf, "2023_01_01.parquet"))
write_parquet(data, file.path(tf, "2023_02_02.parquet"))
write_parquet(data, file.path(tf, "2023_03_03.parquet"))

# works
start_of_query <- open_dataset(tf) |>
  mutate(file = add_filename()) |>
  mutate(file_date = str_remove(file, ".*/")) |>
  mutate(file_date = str_remove(file_date, ".parquet")) |>
  mutate(file_date = ymd(file_date)) |>
  mutate(year = year(file_date)) 

# works
start_of_query |> 
  collect()
#> # A tibble: 3 × 4
#>       x file                                                 file_date   year
#>   <dbl> <chr>                                                <date>     <int>
#> 1     1 /tmp/Rtmpsy0nYY/file2fa7d1c483f2a/2023_03_03.parquet 2023-03-03  2023
#> 2     1 /tmp/Rtmpsy0nYY/file2fa7d1c483f2a/2023_01_01.parquet 2023-01-01  2023
#> 3     1 /tmp/Rtmpsy0nYY/file2fa7d1c483f2a/2023_02_02.parquet 2023-02-02  2023

# doesn't work
start_of_query |> 
  mutate(month = month(file_date)) |>
  collect()
#> Error: Expression month(file_date) not supported in Arrow
#> Call collect() first to pull data into R.

The error comes from this bit of code at the top of the binding for month():

call_binding("is.integer", x)

which results in this error:

Error: Invalid: No match for FieldRef.Name(__filename) in x: double

This is most likely the same underlying problem as #33464.
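
Until this is fixed, one possible workaround follows the hint in the error message: call collect() before the step that needs month(). The sketch below is only an illustration, not a confirmed fix. It assumes the dataset is small enough to fit in memory, loads lubridate explicitly (the original example does not), and parses the date with base R after collect() instead of relying on the Arrow bindings. The name `result` is made up for this example.

```r
library(arrow)
library(dplyr)
library(lubridate)

# Recreate the example dataset from the report
tf <- tempfile()
dir.create(tf)
data <- tibble::tibble(x = 1)

write_parquet(data, file.path(tf, "2023_01_01.parquet"))
write_parquet(data, file.path(tf, "2023_02_02.parquet"))
write_parquet(data, file.path(tf, "2023_03_03.parquet"))

# Only add_filename() is evaluated by Arrow; after collect() every
# later step runs in R on an ordinary tibble, so month() is the
# regular lubridate function rather than the Arrow binding.
result <- open_dataset(tf) |>
  mutate(file = add_filename()) |>
  collect() |>
  mutate(
    # Strip the directory and the extension, then parse "2023_01_01" as a Date
    file_date = as.Date(
      sub("\\.parquet$", "", basename(file)),
      format = "%Y_%m_%d"
    ),
    year = year(file_date),
    month = month(file_date)
  ) |>
  arrange(file_date)

result
```

The trade-off is that collect() pulls every row into R, so this does not help when the dataset is larger than memory or when the month is needed for filtering before data is read.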

Component(s)

R
