Decoding Market Chatter: Using Large Language Models to Extract Key Deal Insights from Trading Conversations (2)

Part 2: Utilizing Small Open-Source LLMs for Information Extraction
Introduction to Small Open-Source LLMs
In the realm of AI and machine learning, the emergence of small open-source Large Language Models (LLMs) has been a game-changer, particularly for sectors like finance where data security and confidentiality are paramount. These models offer a powerful blend of advanced language processing capabilities and the flexibility of local deployment, making them an ideal choice for financial institutions.
Why Small LLMs are Crucial for Financial Institutions
- Confidentiality and Data Security: Financial institutions handle sensitive information, including confidential trade details that are legally and ethically prohibited from being shared externally. Utilizing locally hosted LLMs ensures that all data processing occurs in-house, maintaining data confidentiality and adhering to strict regulatory standards.
- Customization and Control: Open-source LLMs provide the flexibility to be customized according to specific institutional needs. Financial organizations can tailor these models to better understand and process their unique financial jargon and trading language, enhancing the model’s relevance and accuracy.
- Cost-Effectiveness and Accessibility: Smaller LLMs, being less resource-intensive, offer a more accessible and cost-effective solution for financial institutions that might not have the infrastructure to support larger models like GPT-4. This makes advanced AI technologies available to a wider range of institutions, democratizing access to cutting-edge tools.
Process of Extracting Trade Information Using Small Open-Source LLMs
The quest to extract meaningful information from financial trading conversations using LLMs is a two-fold process. It begins with identifying the trade type from conversations and then extracting specific deal information using templates tailored to each trade type. Let’s delve into these steps in detail, complemented by our findings using different LLM models.
Step 1: Identifying the Trade Type
The first crucial step involves determining the trade type from textual conversation data. This is achieved through the following process:
- Generation of Synthetic Data: We generated synthetic trade data in Part 1: Knowledge Distillation for Generating Synthetic Deal Information. Here is an example record:
{
  "model": "gpt-4-1106-preview",
  "trade_type": "FX Double Barrier Knock-In Option",
  "currency": "NZDUSD",
  "trader1": {
    "name": "Alice",
    "style": "descriptive",
    "emotion": "skeptical",
    "tone": "neutral",
    "attitude": "open-minded",
    "perspective": "first-person"
  },
  "trader2": {
    "name": "Bob",
    "style": "conversational",
    "emotion": "happy",
    "tone": "passive",
    "attitude": "contrary",
    "perspective": "local"
  },
  "deal": {
    "trade_type": "FX Double Barrier Knock-In Option",
    "currency_pair": "NZDUSD",
    "lower_barrier": "1.2500",
    "upper_barrier": "1.3500",
    "strike_price": "1.3000",
    "expiry_date": "2024-07-01",
    "option_type": "Call",
    "premium": "0.0180",
    "amount": "500000",
    "buyer": "Alice",
    "seller": "Bob",
    "conversation": [
      {"name": "Alice", "message": "I'm exploring the idea of a double barrier knock-in option on NZDUSD. I want the lower and upper barriers set at 1.2500 and 1.3500, respectively, with a strike price at 1.3000."},
      {"name": "Bob", "message": "Oh, how delightful! A double barrier knock-in option, you say? Setting the lower barrier at 1.2500 and the upper one at 1.3500 sounds like an adventure. And the strike at 1.3000, correct?"},
      {"name": "Alice", "message": "Correct. The focus should be on protecting the position, considering the volatility in the NZDUSD pair. Do you have any thoughts on the premium and amount?"},
      {"name": "Bob", "message": "Well, if we're taking a stroll through this garden, I'd say a premium of 0.0180 for an amount of 500,000 sounds like the right path to take. It's congenial, yet considerate of the risks."},
      {"name": "Alice", "message": "That premium seems fair, given the constraints of the barriers. Let's formalize this option with an expiry of July 1st, 2024. Does that fit within your timeline?"},
      {"name": "Bob", "message": "Indeed, Alice, July 1st, 2024, as the expiry date is like the perfect time for this option to blossom. It's all coming together quite nicely."},
      {"name": "Alice", "message": "Great. We'll put this FX double barrier knock-in call option on NZDUSD in motion then. Just to recap: lower barrier at 1.2500, upper barrier at 1.3500, strike price at 1.3000, expiry date on 2024-07-01, option type 'Call', premium at 0.0180, for the amount of 500,000."},
      {"name": "Bob", "message": "What a splendid summary! Our little financial seedling is ready to be planted. I'll get the paperwork started. Pleasure doing this dance with you."}
    ]
  }
}
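Throughout the snippets below, records like the one above are assumed to be loaded from the Part 1 output. Here is a minimal loading sketch; the deals.jsonl file name and the one-record-per-line format are assumptions:

import json

def load_deals(path="deals.jsonl"):
    # One synthetic deal record per line, shaped like the example above
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

deals = load_deals()
deal = deals[0]  # sample record used in later snippets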
- Conversation Formatting: We start by formatting the raw conversational data into a continuous text stream suitable for LLM processing.
def conversation_to_text(conversation):
    # Flatten the message list into "Name: message" lines
    text = ""
    for message in conversation:
        text += message['name'] + ": " + message['message'] + "\n"
    return text
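Applied to the sample record above (via the illustrative deal variable), this yields a plain transcript:

conversation_text = conversation_to_text(deal["deal"]["conversation"])
print(conversation_text.splitlines()[0])
# -> Alice: I'm exploring the idea of a double barrier knock-in option on NZDUSD. ...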
- Trade Type Definition: 21 different types of financial trades are defined, and a JSON template for each was generated by GPT-4; a template lookup is sketched after the code below.
trade_types = ['FX Spot', 'FX Swap', 'FX Vanilla Option',
               'FX Down and In Option', 'FX Down and Out Option',
               'FX Up and In Option', 'FX Up and Out Option',
               'FX Double Barrier Knock-In Option',
               'FX Double Barrier Knock-Out Option',
               'FX Range Accrual Option', 'Fixed-Floating IRS',
               'Floating-Floating IRS', 'IRO Cap', 'Fixed-Fixed IRS',
               'Stock', 'Single Stock Option', 'Bond', 'Commodity',
               'Credit Default Swap', 'Equity Swap', 'Autocallable Swap']
trade_template = {
    "FX Spot": {
        "trade_type": "FX Spot",
        "currency_pair": "EURUSD",
        "rate": "1.1800",
        "amount": "1000000",
        "trade_date": "2023-01-15",
        "settlement_date": "2023-01-17",
        "buyer": "Alice",
        "seller": "Bob",
        "conversation": [{"name": "Alice", "message": "Looking to buy EURUSD at 1.1800"}, {"name": "Bob", "message": "Confirmed, selling EURUSD at 1.1800"}]
    },
    "FX Swap": {
        "trade_type": "FX Swap",
        "near_leg": {
            "currency_pair": "USDJPY",
            "near_rate": "110.00",
            "near_amount": "500000",
            "near_date": "2023-01-15"
        },
        "far_leg": {
            "currency_pair": "USDJPY",
            "far_rate": "110.25",
            "far_amount": "500000",
            "far_date": "2023-06-15"
        },
        "buyer": "Alice",
        "seller": "Bob",
        "conversation": [{"name": "Alice", "message": "Looking to do a near/far swap on USDJPY"}, {"name": "Bob", "message": "Agreed on rates 110.00 and 110.25"}]
    }
    # ... templates for the remaining 19 trade types
}
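In Step 2 (covered in the next post), the type identified here selects its matching template to drive field extraction. A minimal lookup sketch, using an illustrative helper name:

def get_template(trade_type):
    # Returns None when the LLM proposes a type outside the 21 defined above
    return trade_template.get(trade_type)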
- LLM Prompting for Trade Type Identification: The formatted conversation is then fed to the LLM with a prompt to identify the trade type.
trade_types_str = ", ".join(trade_types)
question = (
    "Given trade type is one of " + trade_types_str + ". "
    'Return only a valid JSON object containing one element: {"trade_type": "Type"}. '
    "Identify the trade type from the following conversation:\n\n"
    + conversation_text
)
# response = LLM_API_Call(question)
# API call to the LLM; one possible implementation is sketched below
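The serving stack is not prescribed here, but the model names in the results below (Mistral, Llama2:13b, Orca2:13b, Neural-Chat, Deepseek-LLM) match Ollama model tags, so one plausible implementation is a request to a locally hosted Ollama server; the llm_api_call name is ours:

import requests

def llm_api_call(prompt, model="mistral"):
    # Ollama's generate endpoint; stream=False returns a single JSON object
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["response"]

response = llm_api_call(question, model="mistral")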
- Post-Processing the Response: After prompting the LLM to identify the trade type (and, later in Step 2, to extract specific deal information), the responses we receive are raw and not always in a readily usable format. This is where post-processing becomes essential: refining the LLM's output into structured data that can be effectively analyzed and utilized. Here's how we approach this critical step:
1. Extracting the Core Response:
- The LLM’s output may contain additional text or formatting that is not relevant to our analysis. We first isolate the core response — the actual information we need. For instance, if we’re looking for a JSON object, we extract the portion of the response that conforms to JSON formatting.
def extract_core_response(answer):
    # Keep only the text between the first '{' and the last '}'
    start = answer.find('{')
    end = answer.rfind('}') + 1
    if start == -1 or end == 0:
        return ""  # no JSON object found in the response
    return answer[start:end]
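For example, given a chatty answer (an illustrative string):

raw = 'Sure! Here is the result:\n{"trade_type": "FX Swap"}\nHope that helps.'
extract_core_response(raw)  # -> '{"trade_type": "FX Swap"}'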
2. Cleaning and Formatting:
- LLM outputs might include escape characters, misplaced symbols, or inconsistent formatting. Our next step is to clean these anomalies to ensure the data is in a usable format.
def clean_response(response):
    # Strip escaped newlines and escaped underscores that some models emit
    cleaned = response.replace("\\n", " ").replace("\\_", "_")
    return cleaned
3. Converting to Structured Data:
- Particularly for financial data, structured formats like JSON are preferred. We convert the cleaned response to JSON, handling any parsing errors that may occur.
import json

def convert_to_json(cleaned_response):
    try:
        result = json.loads(cleaned_response)
    except json.JSONDecodeError:
        # Handle JSON conversion errors, possibly using regex or other methods
        result = handle_parsing_error(cleaned_response)
    return result
4. Handling Parsing Errors:
- In cases where the standard JSON parsing fails, we implement additional error handling mechanisms, such as using regular expressions to correct common formatting issues.
import re

def handle_parsing_error(response):
    # One common repair: drop trailing commas before '}' or ']'
    # (extend with further fixes for other recurring formatting issues)
    fixed_response = re.sub(r',\s*([}\]])', r'\1', response)
    try:
        return json.loads(fixed_response)
    except json.JSONDecodeError:
        return {'error': 'Failed to parse JSON', 'original_response': response}
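Chaining these helpers gives a single entry point for post-processing (a sketch built from the functions above; the postprocess name is ours):

def postprocess(raw_answer):
    core = extract_core_response(raw_answer)
    cleaned = clean_response(core)
    return convert_to_json(cleaned)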
5. Validating and Verifying Data:
- Once we have the data in JSON format, we validate it against our expected structure and verify its accuracy. This step is crucial to ensure the data aligns with our financial models and analysis requirements.
if "trade_type" in processed:
df_types.loc[index,model+"_type"]=processed["trade_type"]
df_types.loc[index,model]=processed["trade_type"]==df_deals["trade_type"][index]
if df_types.loc[index,model]:
df_types.loc[index,"Matches"]+=1
else:
df_types.loc[index,"NotMatches"]+=1
else:
if retries==1:
df_types.loc[index,"NaN"]+=1
Model Performance Analysis
| Model | Matches | NotMatches | NaN |
|--------------|---------|------------|-----|
| Mistral | 387 | 218 | 25 |
| Llama2:13b | 300 | 330 | 0 |
| Orca2:13b | 487 | 131 | 12 |
| Neural-Chat | 421 | 165 | 44 |
| Deepseek-LLM | 352 | 263 | 14 |
Our experiments with various LLMs yielded the following insights:
- Overall Performance: Different models showed varying levels of success in correctly identifying trade types. Orca2:13b had the highest match rate, followed by Neural-Chat and Deepseek-LLM.
- Inconsistencies and Challenges: All models faced some challenges, as indicated by the number of mismatches and instances where no clear trade type was identified (NaN).
- Interpretation of Results: The high match rate of Orca2:13b suggests a superior capability in understanding financial dialogues, while the zero NaN count of Llama2:13b indicates a more decisive response pattern, albeit with more mismatches.
Contextualizing the Model Performance Results
It’s important to note that the results presented in the table above are based on a simple and preliminary experiment, primarily designed to illustrate the process of using Large Language Models (LLMs) for extracting trade information from conversations. These findings are not intended to be a comprehensive or scientific assessment of the models’ performance. Several factors contribute to this context:
- Experiment Design: The experiment was conducted as a quick test to demonstrate the feasibility and methodology of using LLMs in financial conversation analysis, rather than a rigorous evaluation of model capabilities.
- Sample Variability: The outcomes are dependent on the specific set of samples used in the test. A different set of conversational data could yield varying results, highlighting the influence of the dataset’s nature and composition on model performance.
- Model Inconsistency: LLMs, particularly those based on complex algorithms and large datasets, can exhibit variability in their performance. The same model might produce different results under different test conditions or even when re-tested with the same data due to the inherent stochastic nature of these AI systems.
- Purpose of the Test: The primary objective of this experiment was to shed light on the process of applying LLMs in a financial context, rather than definitively ranking the models’ effectiveness. The results should be viewed as indicative of the potential of LLMs in this application area, rather than conclusive evidence of their relative performance.
Conclusion of the First Step in Part 2
As we reach the end of this segment of our exploration into using Large Language Models (LLMs) for financial data extraction, we have successfully navigated the initial step: identifying trade types from conversational data. This phase, crucial in setting the groundwork for detailed analysis, demonstrates the practical application and potential of LLMs in deciphering complex financial dialogues.
Our journey so far has involved formatting conversational data, prompting LLMs to categorize this data into distinct trade types, and then meticulously processing the output to glean actionable insights. The results from our initial experiments, while preliminary, offer a glimpse into the capabilities of various LLMs in understanding and classifying financial conversations.
Looking Ahead: Extracting Trade Information
The next installment of our blog series will delve deeper into the second step of this process. Building on the foundation laid by correctly identifying trade types, we will explore how to harness the power of LLMs to extract specific trade information from these conversations. This step is where the real magic happens: transforming raw data into precise, valuable insights that can drive decision-making in financial contexts.