How to teach a model relational data?


I have a set of relational data that I would like a model to understand. I tried using the Assistants API to create an assistant with a large set of data in a JSON file. I also tried Markdown. The results from both experiments were awful.

The data has a few simple fields and one very large text blob in one of the fields. I basically want to fetch some of the simple fields (id, url) based on analysis of the large text blob for that item. However, the model (gpt-4-turbo-preview) seems to choke on the data, and when it does produce an answer it never gives me the correct ID or URL. In fact, it fabricates an ID that looks right but is not in the retrieval document.

The document size is 5.4 MB and there are about 109,000 tokens in the document. Am I doing something wrong here? Should I be using a different format? Is there any way to get a ChatGPT model to “understand” relational data the way we do when, for example, writing SQL queries and interpreting the results?

Thanks for any help or shared experiences with this.

Have you tried adding a JSON schema alongside the object? Seems to help in my experience 🤷
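A rough sketch of what that could look like, in case it helps. Only `id` and `url` are taken from the question; the `text` field name, the `build_document` helper, the sample record, and the wording of the descriptions are all made up for illustration. The idea is just to ship the schema in the same file as the data so the model sees what each field means before it sees the records.

```python
import json

# Hypothetical record layout: only `id` and `url` come from the question;
# `text` is an assumed name for the large text field.
RECORD_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "id": {
                "type": "string",
                "description": "Unique identifier of the item. Copy it verbatim.",
            },
            "url": {
                "type": "string",
                "description": "Canonical URL of the item. Copy it verbatim.",
            },
            "text": {
                "type": "string",
                "description": "Large free-text body to analyse.",
            },
        },
        "required": ["id", "url", "text"],
        "additionalProperties": False,
    },
}


def build_document(records):
    """Return one JSON string holding the schema followed by the records."""
    return json.dumps({"schema": RECORD_SCHEMA, "records": records}, indent=2)


# Made-up sample data, only to show the resulting shape.
records = [
    {"id": "a1", "url": "https://example.com/a1", "text": "First item body."},
]
document = build_document(records)
```

The resulting `document` string is what you would upload in place of the bare JSON file. Whether this is enough for a file of that size is something you would have to test.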

For Markdown, you can achieve a similar result by using a ToC (Table of Contents) for the document.
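For the Markdown route, here is a minimal sketch of generating the ToC from the records rather than writing it by hand. The `text` field name, the `slugify` and `build_markdown` helpers, the "Item <id>" heading format, and the sample record are assumptions for illustration, not anything prescribed by the API.

```python
def slugify(title):
    """Lowercase the title and replace non-alphanumeric characters with hyphens."""
    return "".join(c if c.isalnum() else "-" for c in title.lower()).strip("-")


def build_markdown(records):
    """Render records as Markdown sections preceded by a table of contents."""
    toc = ["## Table of Contents", ""]
    body = []
    for record in records:
        heading = f"Item {record['id']}"
        # One ToC entry per record, linking to that record's section.
        toc.append(f"- [{heading}](#{slugify(heading)})")
        body += [
            f"## {heading}",
            "",
            f"- id: {record['id']}",
            f"- url: {record['url']}",
            "",
            record["text"],
            "",
        ]
    return "\n".join(toc + [""] + body)


# Made-up sample data, only to show the resulting shape.
records = [
    {"id": "a1", "url": "https://example.com/a1", "text": "First item body."},
]
markdown = build_markdown(records)
```

Keeping the id and url directly under each heading puts the fields you want returned next to the text they belong to.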


