Image Credit: LlamaIndex

Multimodal RAG is going to seem so intuitive and normal soon.
Think about how we teach little kids animal sounds by pointing to pictures.
Picture using a bird call recorded on your phone to find images and text that teach you what the bird eats.
Or contributing photos to scientific databases, with all the location and timestamp metadata logged automatically.
As we're able to interact with the similarity score calculations, fine-tuning and refining search results interactively and intentionally, these systems are going to be extremely powerful.
By “going to be” I mean that the time has already arrived for the technology, and now it’s all about applications.
This flow is still missing key components of the UI/UX design, the ones that are going to drive adoption and make it so broadly applicable.
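
To make the idea of interacting with similarity scores a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the `cosine` and `rank` helpers, the per-modality `weights`, and the toy two-dimensional embeddings are all made up, and none of it is LlamaIndex's API. It only shows how a user-adjustable weight per modality could re-rank the same results.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query, items, weights):
    """Rank items by a weighted sum of per-modality cosine similarities."""
    scored = []
    for name, embeddings in items.items():
        score = sum(
            weights.get(modality, 0.0) * cosine(query[modality], vec)
            for modality, vec in embeddings.items()
            if modality in query
        )
        scored.append((name, score))
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 2-dimensional embeddings, invented for illustration only.
query = {"audio": [1.0, 0.0], "text": [0.0, 1.0]}
items = {
    "robin": {"audio": [1.0, 0.0], "text": [1.0, 0.0]},
    "sparrow": {"audio": [0.0, 1.0], "text": [0.0, 1.0]},
}

# The same query, ranked twice with different user-chosen weights.
audio_first = rank(query, items, {"audio": 0.8, "text": 0.2})
text_first = rank(query, items, {"audio": 0.2, "text": 0.8})
```

With the audio weight turned up, the item whose audio embedding matches the query ranks first. Turning the text weight up flips the order. A slider in the UI could map directly onto those weights.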
