FEAT-#7448: Allow QueryCompilerCaster to apply cost-optimization on automatic casting #7464

Draft
wants to merge 13 commits into
base: main

Conversation

sfc-gh-jkew
Contributor

@sfc-gh-jkew sfc-gh-jkew commented Mar 11, 2025

What do these changes do?

Currently, a QueryCompiler derived from QueryCompilerCaster casts all parameters, regardless of type, to the query compiler the operation is called on. This CL introduces the ability to cast symmetrically to a different query compiler based on a cost estimate.

Each query compiler implements a new function that returns a cost estimate of data migration from that specific compiler's perspective. On each operation involving mixed types, we calculate the aggregate cost of migration across all query compilers and pick the query compiler class with the lowest cost.

Each query compiler can implement its cost estimate however it likes, but the returned value is constrained to the range 0 to 1000.
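To illustrate the selection step described above, here is a minimal sketch. It is not the code in this PR: `SketchCompiler`, `migration_cost`, `SmallLocal`, `LargeRemote`, and `pick_compiler_class` are hypothetical names, and the cost values are made up. It only assumes what the description states, namely that each compiler reports a 0 to 1000 estimate and the class with the lowest aggregate cost wins.

```python
from typing import Dict, List, Type


class SketchCompiler:
    """Hypothetical stand-in for a query compiler (not Modin's BaseQueryCompiler)."""

    def migration_cost(self, target: Type["SketchCompiler"]) -> int:
        """Estimated cost, in the range 0 to 1000, of moving this data to `target`."""
        raise NotImplementedError


class SmallLocal(SketchCompiler):
    def migration_cost(self, target: Type[SketchCompiler]) -> int:
        # Staying put is free; a small local frame is assumed cheap to move.
        return 0 if target is type(self) else 100


class LargeRemote(SketchCompiler):
    def migration_cost(self, target: Type[SketchCompiler]) -> int:
        # Moving a large remote frame to another engine is assumed expensive.
        return 0 if target is type(self) else 900


def pick_compiler_class(operands: List[SketchCompiler]) -> Type[SketchCompiler]:
    """Return the operand class with the lowest aggregate migration cost."""
    # Candidate targets are the distinct classes among the operands, in call order.
    candidates: List[Type[SketchCompiler]] = []
    for qc in operands:
        if type(qc) not in candidates:
            candidates.append(type(qc))

    totals: Dict[Type[SketchCompiler], int] = {}
    for target in candidates:
        total = 0
        for qc in operands:
            cost = qc.migration_cost(target)
            if not 0 <= cost <= 1000:
                raise ValueError(f"cost {cost} is outside the range 0 to 1000")
            total += cost
        totals[target] = total

    # min() keeps the first candidate on ties, so the class of the first
    # operand wins when costs are equal.
    return min(candidates, key=lambda cls: totals[cls])


chosen = pick_compiler_class([SmallLocal(), LargeRemote()])
```

With these made-up costs, casting everything to `SmallLocal` totals 900 and casting everything to `LargeRemote` totals 100, so `LargeRemote` is chosen regardless of which operand the operation was called on.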

Max Cost Not Implemented

This implementation can still move a workload to the wrong engine, because no threshold on the cost of migration is implemented. This can be implemented in the future.
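One possible shape for such a threshold, purely as a sketch of future work: `MAX_MIGRATION_COST`, `cheapest_within_budget`, and the engine names are hypothetical and nothing like this exists in this PR. The idea is to reject the cheapest option when even it exceeds a maximum cost, and leave the fallback decision to the caller.

```python
from typing import Dict, Optional

# Hypothetical threshold; the PR does not define one.
MAX_MIGRATION_COST = 500


def cheapest_within_budget(
    totals: Dict[str, int], max_cost: int = MAX_MIGRATION_COST
) -> Optional[str]:
    """Return the engine with the lowest aggregate cost, or None if it exceeds max_cost."""
    if not totals:
        return None
    best = min(totals, key=lambda name: totals[name])
    # Even the cheapest option is too expensive: report that no migration is acceptable.
    if totals[best] > max_cost:
        return None
    return best


within_budget = cheapest_within_budget({"local": 900, "remote": 100})
over_budget = cheapest_within_budget({"local": 900, "remote": 700})
```

Returning `None` rather than raising keeps the policy question (fall back to the existing casting behavior, or fail the operation) out of the selection helper.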

  • first commit message and PR title follow format outlined here

    NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.

  • passes flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
  • passes black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
  • signed commit with git commit -s
  • Resolves Guard in QueryCompilerCaster to allow a QC to override casting behavior #7448?
  • tests added and passing
  • module layout described at docs/development/architecture.rst is up-to-date

from types import FunctionType, MethodType
from typing import Any, Dict, Tuple, TypeVar

from pandas.core.indexes.frozen import FrozenList

from modin.core.storage_formats.base.query_compiler import BaseQueryCompiler
from modin.core.storage_formats.base.query_compiler import (
BaseQueryCompiler,

Check failure

Code scanning / CodeQL

Module-level cyclic import Error

'BaseQueryCompiler' may not be defined if module modin.core.storage_formats.base.query_compiler is imported before module modin.core.storage_formats.pandas.query_compiler_caster, as the definition of BaseQueryCompiler occurs after the cyclic import of modin.core.storage_formats.pandas.query_compiler_caster.
@sfc-gh-jkew sfc-gh-jkew changed the title FEAT-7448 - Allow QueryCompilerCaster to apply cost-optimization on automatic casting FEAT-7448: Allow QueryCompilerCaster to apply cost-optimization on automatic casting Mar 12, 2025
@sfc-gh-jkew sfc-gh-jkew changed the title FEAT-7448: Allow QueryCompilerCaster to apply cost-optimization on automatic casting FEAT-#7448: Allow QueryCompilerCaster to apply cost-optimization on automatic casting Mar 12, 2025
from modin.core.storage_formats.base.query_compiler import BaseQueryCompiler
from modin.core.storage_formats.base.query_compiler import (
BaseQueryCompiler,
QCCoercionCost,

Check failure

Code scanning / CodeQL

Module-level cyclic import Error

'QCCoercionCost' may not be defined if module modin.core.storage_formats.base.query_compiler is imported before module modin.core.storage_formats.pandas.query_compiler_caster, as the definition of QCCoercionCost occurs after the cyclic import of modin.core.storage_formats.pandas.query_compiler_caster.
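A common way to avoid this class of alert is to keep the name out of the module-level import and resolve it lazily. The sketch below shows the general pattern only and is not a proposed patch: `decimal.Decimal` is a stand-in for the class on the other side of the cycle, and `coerce` is a hypothetical function.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers, so it adds no runtime import edge.
    from decimal import Decimal  # stand-in for the cyclically imported class


def coerce(value: str) -> "Decimal":
    # Deferred import: resolved when the function is called, by which time
    # both modules in the cycle have finished executing their top level.
    from decimal import Decimal

    return Decimal(value)


result = coerce("1.5")
```

The string annotation keeps the return type visible to type checkers without needing the name at import time.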
Development

Successfully merging this pull request may close these issues.

Guard in QueryCompilerCaster to allow a QC to override casting behavior
1 participant