Open Source Model Benchmark: Financial Function Calling Polygon.io Testset 20240425–1

Alvin Cho
2 min read · May 8, 2024

Testset: 20240423–1 Financial API Function Calling

Test Plan Details

Use an LLM to generate a function call from the context. The datasets contain question-and-answer pairs whose answers should be clear, unique, and verifiable. Each question is asked multiple times, and correctness is calculated from the models’ answers. Datasets and prompts may be modified to achieve better results.
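The scoring idea above can be sketched as follows. The article does not give the exact formula, so this assumes correctness is simply the fraction of attempts that match the expected answer. The function names and the sample trials are hypothetical.

```python
from collections import defaultdict


def correctness_by_question(trials):
    """Return the fraction of correct attempts for each question.

    `trials` is an iterable of (question_id, is_correct) pairs,
    one pair per attempt (assumed layout, not the article's format).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for question_id, is_correct in trials:
        totals[question_id] += 1
        hits[question_id] += int(is_correct)
    return {q: hits[q] / totals[q] for q in totals}


def overall_correctness(trials):
    """Return the fraction of correct attempts across all questions."""
    trials = list(trials)
    if not trials:
        return 0.0
    return sum(int(ok) for _, ok in trials) / len(trials)


# Illustrative data only: two questions, three attempts each.
sample_trials = [
    ("q1", True), ("q1", True), ("q1", False),
    ("q2", True), ("q2", True), ("q2", True),
]
per_question = correctness_by_question(sample_trials)
overall = overall_correctness(sample_trials)
```

Reporting the per-question rate alongside the overall rate makes it easier to tell a model that is unreliable on every question from one that consistently fails a few.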

Test Set Details

Use LLMs to generate a Polygon.io API endpoint call from the prompted questions and instructions. The question set has been revised with detailed instructions about the date format and the meaning of multiplier and timespan. There are 20 questions in total.
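To make the role of multiplier, timespan, and the date format concrete, here is a minimal sketch of how a reference answer could be assembled. The article does not say which endpoints the questions target. This assumes the Polygon.io aggregates (bars) path `/v2/aggs/ticker/{ticker}/range/{multiplier}/{timespan}/{from}/{to}`, and the helper name is hypothetical.

```python
import re
from datetime import date

# Timespan values as listed in the test instructions.
ALLOWED_TIMESPANS = {"minute", "hour", "day", "week", "month", "year"}
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def _check_date(value):
    """Require a real calendar date written strictly as yyyy-mm-dd."""
    if not DATE_PATTERN.fullmatch(value):
        raise ValueError(f"date must be yyyy-mm-dd, got {value!r}")
    date.fromisoformat(value)  # raises ValueError for dates like 2024-02-30
    return value


def build_aggs_endpoint(ticker, multiplier, timespan, start, end):
    """Build an aggregates endpoint path (assumed target endpoint)."""
    if isinstance(multiplier, bool) or not isinstance(multiplier, int) or multiplier < 1:
        raise ValueError("multiplier must be a positive integer")
    if timespan not in ALLOWED_TIMESPANS:
        raise ValueError(f"unsupported timespan: {timespan!r}")
    start, end = _check_date(start), _check_date(end)
    return f"/v2/aggs/ticker/{ticker}/range/{multiplier}/{timespan}/{start}/{end}"


# "Daily bars for AAPL in January 2024" under these assumptions:
example_endpoint = build_aggs_endpoint("AAPL", 1, "day", "2024-01-02", "2024-01-31")
```

A multiplier of 1 with a timespan of day means one bar per day. A multiplier of 5 with minute would mean five-minute bars.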

Date Published 2024-05-07

Methodology

See our blog post to learn more about our methodology.

Key Finding

Click the following link to view results.

Datasets

Polygonio API Function Calling Q&A dataset

Contains questions and answers for calling Polygon.io API functions. Answers are simple and can be verified programmatically. Generated by ChatGPT.
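Because each answer is a single endpoint string, verification can be a plain string comparison. The sketch below is one possible checker, not the benchmark's actual code. The normalization rules (trimming whitespace and a trailing slash) are assumptions, and the function names are hypothetical.

```python
def normalize_endpoint(endpoint):
    """Trim surrounding whitespace and any trailing slash (assumed rules)."""
    endpoint = endpoint.strip()
    if len(endpoint) > 1:
        endpoint = endpoint.rstrip("/")
    return endpoint


def is_correct(model_answer, expected_answer):
    """Return True when the model's endpoint matches the expected one."""
    return normalize_endpoint(model_answer) == normalize_endpoint(expected_answer)


# Illustrative comparison only; the expected value is made up for this example.
expected = "/v2/aggs/ticker/AAPL/range/1/day/2024-01-02/2024-01-31"
match = is_correct(" /v2/aggs/ticker/AAPL/range/1/day/2024-01-02/2024-01-31/ ", expected)
mismatch = is_correct("/v2/aggs/ticker/AAPL/range/1/week/2024-01-02/2024-01-31", expected)
```

Keeping normalization this narrow means a wrong multiplier, timespan, or date still counts as an error, which is the point of asking for a unique answer.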

Download data

Prompt Template

You are a financial application developer. You will question below is asking about Polygon.io API endpoint. Answer the question below based on instruction provided. If you don’t know the answer, just answer you don’t know.

Question: %question%

Instruction: %instruction%

multiplier is an integer number. timespan can be one of minute, hour, day, week, month, year. from and to are date in the format yyyy-mm-dd. Response only in JSON format as {"answer": "some endpoint"} without any other text. No explanation is required.

No host name http://hostname required. Answer Like {"answer":"/v2/somefunction"}
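A minimal sketch of how the template might be filled in and the model's reply read back. It assumes `%question%` and `%instruction%` are substituted by plain string replacement and that a reply counts only if it is a JSON object with a string `answer` field. The function names and the shortened template are hypothetical.

```python
import json

# Shortened stand-in for the full template above.
PROMPT_TEMPLATE = "Question: %question%\n\nInstruction: %instruction%"


def render_prompt(template, question, instruction):
    """Substitute the two placeholders with the question and instruction."""
    return (template
            .replace("%question%", question)
            .replace("%instruction%", instruction))


def parse_reply(reply_text):
    """Return the endpoint from a JSON reply, or None if the reply is unusable."""
    try:
        payload = json.loads(reply_text)
    except json.JSONDecodeError:
        return None  # the model added prose or produced malformed JSON
    if not isinstance(payload, dict):
        return None
    answer = payload.get("answer")
    return answer if isinstance(answer, str) else None


prompt = render_prompt(
    PROMPT_TEMPLATE,
    "Which endpoint returns daily bars for AAPL?",  # made-up question
    "Use ticker AAPL.",                             # made-up instruction
)
good = parse_reply('{"answer": "/v2/somefunction"}')
bad = parse_reply("Sure! The endpoint is /v2/somefunction")
```

Treating any reply that is not pure JSON as unusable mirrors the instruction to respond without other text. A more lenient harness could try to extract the JSON object from surrounding prose instead.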

Results by Models

Download data

Test Results

The first 100 rows are displayed. The full data can be downloaded from our GitHub repository.


Written by Alvin Cho

Independent consultant. 30+ years experience in enterprise applications for trading and risk management. 
