(workflows) explore computation within Clickhouse #1729

Open
Mitan opened this issue Jan 10, 2025 · 1 comment

Mitan commented Jan 10, 2025

Explore whether some part of the distance computation can be done within ClickHouse:
https://clickhouse.com/docs/en/sql-reference/aggregate-functions/reference
https://clickhouse.com/docs/en/sql-reference/functions/distance-functions

Mitan self-assigned this Jan 10, 2025

Mitan commented Jan 13, 2025

Summary:
It's challenging to move all of the algorithm execution into ClickHouse, but ClickHouse could handle data aggregation and the selection of important attributes.

Details:
ClickHouse has a rich set of aggregate functions (see the aggregate functions reference linked above). Some of these aggregations can be used to select the important attributes.
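
As a rough illustration of pushing aggregation into the database, here is a minimal sketch that builds a query computing per-attribute statistics. `uniq` and `varPop` are existing ClickHouse aggregate functions; the table name, the column names, and the idea of using distinct counts and variance as a proxy for "importance" are assumptions for illustration only, not the criteria Seer uses. `varPop` only makes sense for numeric columns.

```python
def build_attribute_stats_query(table: str, attributes: list[str]) -> str:
    """Build a ClickHouse SELECT that aggregates simple per-attribute stats.

    Hypothetical helper: the table and attribute names are placeholders,
    and the chosen statistics are an assumed proxy for importance.
    """
    # Only accept plain identifiers so names can be interpolated safely.
    for name in [table, *attributes]:
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")

    select_parts = []
    for attr in attributes:
        # uniq(): approximate number of distinct values of the attribute.
        select_parts.append(f"uniq({attr}) AS {attr}_distinct")
        # varPop(): population variance (numeric attributes only).
        select_parts.append(f"varPop({attr}) AS {attr}_variance")

    return "SELECT\n    " + ",\n    ".join(select_parts) + f"\nFROM {table}"


if __name__ == "__main__":
    # "events", "duration" and "size" are made-up names.
    print(build_attribute_stats_query("events", ["duration", "size"]))
```

The client would then rank attributes on the returned statistics, so only the aggregated row leaves ClickHouse rather than the raw data.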

ClickHouse also has a number of functions for computing distances. Most of them are L-family (Lp) distances/norms, such as Euclidean distance or Manhattan distance. The one most closely related to the distance used by the algorithm is cosine distance. However, the algorithm in Seer requires more sophisticated analysis, so it cannot be done purely within ClickHouse.
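
For reference, a small Python sketch of the three distances mentioned above, which could be used to cross-check values returned by ClickHouse's `L1Distance`, `L2Distance`, and `cosineDistance` functions. This is only a plain reimplementation of the standard definitions, not the Seer algorithm; the function names and the error handling for zero vectors are choices made for this sketch.

```python
import math


def _check_same_length(a, b):
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def l1_distance(a, b):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    _check_same_length(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean (L2) distance: square root of summed squared differences."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine of the angle between a and b."""
    _check_same_length(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Sketch-level choice: the angle is undefined for a zero vector.
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    u, v = (1, 2), (2, 3)
    print(l1_distance(u, v), l2_distance(u, v), cosine_distance(u, v))
```

Unlike the L1 and L2 distances, cosine distance ignores vector magnitude and depends only on direction, which is one reason it behaves differently from the L-family norms.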

Mitan closed this as completed Jan 13, 2025
Mitan reopened this Jan 15, 2025