Alternative to Assistant or how to reduce response time? – API


I currently store all Assistants, Threads and messages in our database.
When using an Assistant, the response time goes from 1 second to 10+ seconds, even when streaming, once the instructions are over 8k and there is a large number of messages on the thread. I believe this is because they send the whole conversation and the instructions back for every response.
I'd like the AI to have the history of previous messages, specific instructions, and files attached to the AI, but also allow the user to add files.
What is the best way to get fast responses with these features?
Could I store the previous messages and attach them to the Assistant as a file, maybe?
Why does OpenAI return so much data every time, and can we turn off everything but the message?

First, retrieve only the last 2 messages. Are you retrieving messages with the default limit of 20? params = {'order': 'desc', 'limit': '20'}
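
A rough sketch of that suggestion, assuming the official `openai` Python package (v1.x) and its `client.beta.threads.messages.list` call. The helper name `fetch_latest_messages` and the thread ID you pass in are placeholders for illustration, not anything from the original post. It asks for only the newest two messages instead of the default page of 20 and keeps just the role and text of each:

```python
def fetch_latest_messages(client, thread_id, limit=2):
    """Return (role, text) pairs for the newest `limit` messages, newest first.

    `client` is assumed to be an `openai.OpenAI()` instance (hypothetical setup).
    """
    page = client.beta.threads.messages.list(
        thread_id=thread_id,
        order="desc",   # newest messages first
        limit=limit,    # the API default is 20; ask only for what is needed
    )
    results = []
    for message in page.data:
        # A message's content is a list of blocks; keep only the text blocks.
        text_parts = [
            block.text.value
            for block in message.content
            if block.type == "text"
        ]
        results.append((message.role, "\n".join(text_parts)))
    return results
```

With the real SDK this would be called along the lines of `fetch_latest_messages(OpenAI(), thread_id)`. Note that this only trims what is fetched after a run finishes; it does not change how much of the thread is sent to the model on each run.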

How I retrieve messages:


