All LLMs today need Retrieval Augmented Generation (RAG). Why? To access private knowledge and to work around the small context window.
GPT-4 is a state-of-the-art reasoning engine, trained on a vast public corpus of human knowledge across numerous domains.
But it has NO access to our private knowledge: proprietary documents, internal chat history, video conference transcripts, customer information, commercial contracts, and so on. It's also RARELY possible to inject the entire body of private knowledge into the context window, which is limited to 128k tokens.
For a given query, RAG runs a semantic search over the private knowledge corpus and retrieves only the relevant parts as context for an LLM.
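A minimal sketch of that retrieval step. The corpus, the `embed` helper, and the hashed bag-of-words trick are all illustrative stand-ins; a real system would use a learned embedding model and a vector store:

```python
import re
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Toy embedding: hashed bag-of-words, L2-normalised. A production system
    # would call an embedding model; this keeps the sketch self-contained.
    v = np.zeros(dim)
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        v[hash(token) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Rank documents by cosine similarity to the query and keep the top k.
    q = embed(query)
    scores = [float(embed(doc) @ q) for doc in corpus]
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

corpus = [
    "Q3 revenue rose 12% driven by the fixed income desk",
    "Office lease renewal terms for the London branch",
    "Counterparty credit limits for FX forwards",
]
question = "What are the counterparty credit limits for FX forwards?"
context = retrieve(question, corpus, k=1)
# Only the retrieved snippet, not the whole corpus, goes into the prompt.
prompt = f"Context:\n{context[0]}\n\nQuestion: {question}"
```

The point is that the LLM only ever sees the few retrieved snippets, so the private corpus can be arbitrarily large while the prompt stays within the context window.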
However, the retrieval (RAG) feature announced with GPT-4-Turbo doesn't work out of the box for most applications in finance, because:
1) One has to manually upload up to 20 files, with each file limited to 512MB.
2) Only a few file types are supported (e.g. pdf, csv, txt).
3) There is no connectivity to any data source (e.g. database, data lakes, document stores).
4) The cost of storing these files is very steep: $0.20/GB/agent/day, which is about 260x of AWS S3 standard pricing at $0.023/GB/month.
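The ~260x figure in point 4 follows from converting the daily rate to a monthly one (assuming a 30-day month):

```python
assistants_per_gb_month = 0.20 * 30  # $0.20/GB/day over ~30 days = $6.00/GB/month
s3_per_gb_month = 0.023              # AWS S3 standard storage, $/GB/month
ratio = assistants_per_gb_month / s3_per_gb_month
print(f"{ratio:.0f}x")  # roughly 260x
```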
Also, LLMs in general (even with RAG) are fundamentally poor at time series analysis.
At SigTech we combine state-of-the-art tools and datasets in finance with LLMs to maximise the productivity of our human users.