ETL framework for indexing data for AI applications such as RAG, with real-time incremental updates and support for custom logic that composes like Lego.
Updated Apr 2, 2025 - Rust
Big Data and Machine Intelligence Course in Autumn 2019.
Patient intake form extraction using an LLM
🌲 Improved Interval B+ tree implementation, in TS 🌲
This repository contains an application designed to recommend scientific papers that are most similar to a given input paragraph. The application uses the llama and weaviate libraries to achieve this.
A zero-dependency library of classes that make filtering, sorting and observing changes to arrays easier and more efficient.
Designed to store and retrieve high-dimensional data, such as embeddings, efficiently. It enables fast similarity searches by leveraging techniques.
System for managing the data generated by the SEAGrid Science Gateway
A subgraph indexing runtime that prioritises performance and cost efficiency
Time series analysis showing trend, seasonality, and periodicity decomposition, with forecasting using Facebook Prophet. The analysis makes extensive use of data indexing tools and of the pandas and datetime libraries.
BORDS is an open-access reaction search engine that leverages Google's Open Reaction Database to provide ultra-fast, comprehensive access to millions of chemical reactions. Built with a modern cloud stack, it streamlines reaction data extraction, transformation, and indexing for researchers in chemistry and related fields.
Python implementation of a search engine based on TF-IDF weighting and cosine similarity
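For readers unfamiliar with the technique named in the last entry, here is a minimal sketch of TF-IDF ranking with cosine similarity. It is not taken from that repository: the function names (`tokenize`, `build_index`, `vectorize`, `cosine`, `search`), the whitespace tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration, using only the Python standard library.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def vectorize(tokens, idf):
    """Build a sparse TF-IDF vector; terms unknown to the index are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in tf.items()}


def build_index(docs):
    """Return (idf, vectors) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant) so every indexed term keeps a positive weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    vectors = [vectorize(tokens, idf) for tokens in tokenized]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, idf, vectors, top_k=3):
    """Return up to top_k (document, score) pairs with a non-zero score."""
    q = vectorize(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    # Highest score first; ties broken by original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(docs[i], score) for score, i in scored[:top_k] if score > 0.0]
```

A typical call would be `idf, vectors = build_index(docs)` followed by `search("some query", docs, idf, vectors)`. Storing vectors as dicts keeps the sketch sparse and dependency-free; a real engine would more likely use an inverted index and a proper tokenizer.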