A sub-millisecond on-device RAG memory library for iOS and macOS AI agents, implemented in pure Swift with hybrid BM25 + HNSW vector search, local MiniLM embeddings, Metal GPU acceleration, token budgeting, and a single portable .wax file format — no server or API required.