Solutons Lounge

Can Instructions be reused at no cost? Or, how to save on tokens – API


Hi, I am learning the API, so please pardon me if the question is weak. Specifically for the Assistants API (although this might apply to all the APIs), I start by giving the Assistant some fairly lengthy instructions. Does OpenAI charge me tokens for the same initial instructions on every thread? Suppose the instructions are 1,000 tokens and the assistant executes 100 threads; does that mean I will be charged 100,000 tokens just for the instructions? And if that's the case, is there a way to make this less token-expensive, something that lets me be charged for the same instructions only once? How about the other APIs? What I have seen so far is that you have to instruct ChatGPT from zero every time an interaction starts; with thousands of interactions a day, this can add up real fast and become too expensive. I really hope I got this one wrong. Thanks.

How a language model accessed over an API handles chat or new knowledge:

  • You have to supply all the data you want it to know, such as the previous chat, in every subsequent independent API call.
  • Processing the input tokens, up to the point where the AI can begin generating its own tokens, is computationally expensive.
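The statelessness described above can be sketched in plain Python (hypothetical helper, mirroring the Chat Completions message format): every independent call must carry the instructions and all prior turns again, so the instructions are billed as input tokens on every single request.

```python
def build_messages(instructions, history, user_input):
    """Assemble the full payload sent on EVERY call: the system
    instructions and the entire prior conversation are resent each turn."""
    return (
        [{"role": "system", "content": instructions}]
        + history
        + [{"role": "user", "content": user_input}]
    )

instructions = "You are a helpful assistant."  # resent (and billed) every call
history = []

# Turn 1: the payload contains the instructions plus one user message.
turn1 = build_messages(instructions, history, "Hello")
print(len(turn1))  # 2 messages

# After the model replies, both sides are appended to the history...
history += [{"role": "user", "content": "Hello"},
            {"role": "assistant", "content": "Hi!"}]

# Turn 2: the instructions AND the whole first exchange are resent in full.
turn2 = build_messages(instructions, history, "What did I just say?")
print(len(turn2))  # 4 messages
```

Nothing on the server remembers turn 1; the only "memory" is whatever your code resends.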

Assistants aggravate this:

  • They place no limit on how long a conversation (a thread) can grow before the model hits its context-length limit.
  • They inject additional text instructions, load knowledge without regard to its current relevance, etc.
  • They can iterate internally multiple times while performing a task, each iteration a new call to an AI model, even looping on errors.
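The first bullet is the expensive one, and the arithmetic is easy to see. A rough sketch (illustrative token counts, not real pricing): when a thread is never trimmed, each turn resends everything before it, so cumulative input tokens grow roughly quadratically with the number of turns.

```python
INSTRUCTION_TOKENS = 1000  # the ~1,000-token instructions from the question
TOKENS_PER_TURN = 200      # assumed average size of one user+assistant exchange

def cumulative_input_tokens(turns):
    """Total input tokens billed across a whole untrimmed thread."""
    total = 0
    for t in range(turns):
        # Turn t resends the instructions plus all t earlier exchanges.
        total += INSTRUCTION_TOKENS + t * TOKENS_PER_TURN
    return total

print(cumulative_input_tokens(10))   # 19000
print(cumulative_input_tokens(100))  # 1090000
```

Ten turns cost 19k input tokens; a hundred turns cost over a million, and that is before any internal tool-use iterations multiply the count further.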

A costly proposition, with many forum cautionary tales.

So the place to start is the Chat Completions API and your own code, where you have direct access to the AI's input and can send just the instructions and the amount of chat history needed each turn to maintain the topic or quality you desire.
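One way to keep per-turn cost bounded with that approach (a sketch, using a crude 4-characters-per-token heuristic and a hypothetical token budget rather than a real tokenizer): always keep the system instructions, and drop the oldest messages until the remaining history fits the budget.

```python
def rough_tokens(text):
    """Crude token estimate (~4 chars per token); not a real tokenizer."""
    return len(text) // 4

def trim_history(history, budget_tokens):
    """Keep the most recent messages whose rough token total fits the budget."""
    kept, used = [], 0
    for msg in reversed(history):
        cost = rough_tokens(msg["content"])
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = [
    {"role": "user", "content": "a" * 400},       # ~100 tokens (oldest)
    {"role": "assistant", "content": "b" * 400},  # ~100 tokens
    {"role": "user", "content": "c" * 400},       # ~100 tokens (newest)
]
trimmed = trim_history(history, budget_tokens=250)
print(len(trimmed))  # 2 -- only the two most recent messages fit
```

A production version would count tokens with a real tokenizer (e.g. tiktoken) and might summarize dropped turns instead of discarding them, but the principle is the same: you, not the platform, decide how much context is resent each turn.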



