Replace traditional NLP approaches with prompt engineering and Large Language Models (LLMS) for Jira ticket text classification. A code sample walkthrough
Remember the times when classifying text meant embarking on a machine learning journey? When you’ve been within the ML space long enough, you’ve probably witnessed at the very least one team disappear down the rabbit hole of constructing the “perfect” text classification system. The story normally goes something like this:
- Month 1: “We’ll just quickly train a NLP model!”
- Month 2: “We want more training data…”
- Month 3: “This is sweet enough”
For years, text classification has fallen into the realm of classic ML. Early in my profession, I remember training a support vector machine (SVM) for email classification. Plenty of preprocessing, iteration, data collection, and labeling.
But here’s the twist: it’s 2024, and generative AI models can “generally” classify text out of the box! You possibly can construct a sturdy ticket classification system without, collecting 1000’s of labeled training examples, managing ML training pipelines, or maintaining custom models.
On this post, we’ll go over tips on how to setup a Jira ticket classification system using large language models on Amazon Bedrock and other AWS services.
DISCLAIMER: I’m a GenAI Architect at AWS and my opinions are my very own.
Why Classify Jira Tickets?
A typical ask from corporations is to know how teams spend their time. Jira has tagging features, but it could actually sometimes fall short through human error or lack of granularity. By doing this exercise, organizations can recover insights into their team activities, enabling data-driven decisions about resource allocation, project investment, and deprecation.
Why Not Use Other NLP Approaches?
Traditional ML models and smaller transformers like BERT need lots of (or 1000’s) of labeled examples, while LLMs can classify text out of the box. In our Jira ticket classification tests, a prompt-engineering approach matched or beat traditional ML models, processing 10k+ annual tickets for ~$10/yr using Claude Haiku (excluding other AWS Service costs). Also, prompts are easier to update than retraining models.
This github repo incorporates a sample application that connects to Jira Cloud, classifies tickets, and outputs them in a format that will be consumed by your favorite dashboarding tool (Tableu, Quicksight, or another tool that supports CSVs).
Essential Notice: This project deploys resources in your AWS environment using Terraform. You’ll incur costs for the AWS resources used. Please pay attention to the pricing for services like Lambda, Bedrock, Glue, and S3 in your AWS region.
Pre Requisites
You’ll have to have terraform installed and the AWS CLI installed within the environment you wish to deploy this code from
The architecture is pretty simple. You will discover details below.
Step 1: An AWS Lambda function is triggered on a cron job to fetch jira tickets based on a time window. Those tickets are then formatted and pushed to an S3 bucket under the /unprocessed prefix.
Step 2: A Glue job is triggered off /unprocessed object puts. This runs a PySpark deduplication task to make sure no duplicate tickets make their strategy to the dashboard. The deduplicated tickets are then put to the /staged prefix. This is helpful for cases where you manually upload tickets in addition to depend on the automated fetch. When you can ensure no duplicates, you possibly can remove this step.
Step 3: A classification task is kicked off on the brand new tickets by calling Amazon Bedrock to categorise the tickets based on a prompt to a big language model (LLM). After classification, the finished results are pushed to the /processed prefix. From here, you possibly can pick up the processed CSV using any dashboarding tool you’d like that may devour a CSV.
To start, clone the github repo above and move to the /terraform directory
$ git clone https://github.com/aws-samples/jira-ticket-classification.git$ cd jira-ticket-classification/terraform
Run terraform init, plan, & apply. Ensure that you’ve got terraform installed in your computer and the AWS CLI configured.
$ terraform init$ terraform plan
$ terraform apply
Once the infrastructure is deployed into your account, you possibly can navigate to AWS Secrets Manager and update the key along with your Jira Cloud credentials. You’ll need an API key, base url, and email to enable the automated pull
And that’s it!
You possibly can (1) wait for the Cron to kick off an automatic fetch, (2) export the tickets to CSV and upload them to the /unprocessed S3 bucket prefix, or (3) manually trigger the Lambda function using a test.
Jira Fetch:
Jira fetch uses a Lambda function with a Cloudwatch cron event to trigger it. The Lambda pulls within the AWS Secret and uses a get request shortly loop to retrieve paginated results until the JQL query completes:
def fetch_jira_issues(base_url, project_id, email, api_key):
url = f"{base_url}/rest/api/3/search"# Calculate the date 8 days ago
eight_days_ago = (datetime.now() - timedelta(days=8)).strftime("%Y-%m-%d")
# Create JQL
jql = f"project = {project_id} AND created >= '{eight_days_ago}' ORDER BY created DESC"
# Pass into params of request.
params = {
"jql": jql,
"startAt": 0
}
all_issues = []
auth = HTTPBasicAuth(email, api_key)
headers = {"Accept": "application/json"}
while True:
response = requests.get(url, headers=headers, params=params, auth=auth)
if response.status_code != 200:
raise Exception(f"Did not fetch issues for project {project_id}: {response.text}")
data = json.loads(response.text)
issues = data['issues']
all_issues.extend(issues)
if len(all_issues) >= data['total']:
break
params['startAt'] = len(all_issues)
return all_issues
It then creates a string representation of a CSV and uploads it into S3:
def upload_to_s3(csv_string, bucket, key):
try:
s3_client.put_object(
Bucket=bucket,
Key=key,
Body=csv_string,
ContentType='text/csv'
)
except Exception as e:
raise Exception(f"Did not upload CSV to S3: {str(e)}")
Glue Job
An S3 event on the /unprocessed prefix kicks off a second lambda that starts an AWS Glue job. This is helpful when there’s multiple entry points that Jira tickets can enter the system through. For instance, if you wish to do a backfill.
import boto3 # Initialize Boto3 Glue client
glue_client = boto3.client('glue')
def handler(event, context):
# Print event for debugging
print(f"Received event: {json.dumps(event)}")
# Get bucket name and object key (file name) from the S3 event
try:
s3_event = event['Records'][0]['s3']
s3_bucket = s3_event['bucket']['name']
s3_key = s3_event['object']['key']
except KeyError as e:
print(f"Error parsing S3 event: {str(e)}")
raise
response = glue_client.start_job_run(
JobName=glue_job_name,
Arguments={
'--S3_BUCKET': s3_bucket,
'--NEW_CSV_FILE': s3_key
}
)
The Glue job itself is written in PySpark and will be present in the code repo here. The necessary take away is that it does a leftanti join using the problem Ids on the items in the brand new CSV against all of the Ids within the /staged CSVs.
The outcomes are then pushed to the /staged prefix.
Classify Jira Tickets:
That is where it it gets interesting. Because it seems, using prompt engineering can perform on par, if not higher, than a text classification model using a pair techniques.
- You possibly can define the classifications and their descriptions in a prompt,
- Ask the model to think step-by-step (Chain of Thought).
- After which output the classification without having to coach a single model. See the prompt below:
Note: It’s necessary to validate your prompt using a human curated subset of classified / labelled tickets. You must run this prompt through the validation dataset to make certain it aligns with the way you expect the tickets to be classified
SYSTEM_PROMPT = '''
You might be a support ticket assistant. You might be given fields of a Jira ticket and your task is to categorise the ticket based on those fieldsBelow is the list of potential classifications together with descriptions of those classifications.
ACCESS_PERMISSIONS_REQUEST: Used when someone doesn't have the write permissions or cannot log in to something or they cannot get the proper IAM credentials to make a service work.
BUG_FIXING: Used when something is failing or a bug is found. Often times the descriptions include logs or technical information.
CREATING_UPDATING_OR_DEPRECATING_DOCUMENTATION: Used when documentation is outdated. Normally references documentation within the text.
MINOR_REQUEST: This is never used. Normally a bug fix but it is very minor. If it seems even remotely complicated use BUG_FIXING.
SUPPORT_TROUBLESHOOTING: Used when asking for support for some engineering event. Also can appear like an automatic ticket.
NEW_FEATURE_WORK: Normally describes a brand new feature ask or something that may not operational.
The fields available and their descriptions are below.
Summmary: This can be a summary or title of the ticket
Description: The outline of the problem in natural language. The vast majority of context needed to categorise the text will come from this field
* It is feasible that some fields could also be empty through which case ignore them when classifying the ticket
* Think through your reasoning before making the classification and place your thought process in tags. That is your space to think and reason concerning the ticket classificaiton.
* Once you've got finished considering, classify the ticket using ONLY the classifications listed above and place it in tags.
'''
USER_PROMPT = '''
Using only the ticket fields below:
{summary}
{description}
Classify the ticket using ONLY 1 of the classifications listed within the system prompt. Remember to think step-by-step before classifying the ticket and place your thoughts in tags.
When you find yourself finished considering, classify the ticket and place your answer in tags. ONLY place the classifaction in the reply tags. Nothing else.
'''
We’ve added a helper class that threads the calls to Bedrock to hurry things up:
import boto3
from concurrent.futures import ThreadPoolExecutor, as_completed
import re
from typing import List, Dict
from prompts import USER_PROMPT, SYSTEM_PROMPTclass TicketClassifier:
SONNET_ID = "anthropic.claude-3-sonnet-20240229-v1:0"
HAIKU_ID = "anthropic.claude-3-haiku-20240307-v1:0"
HYPER_PARAMS = {"temperature": 0.35, "topP": .3}
REASONING_PATTERN = r'(.*?) '
CORRECTNESS_PATTERN = r'(.*?) '
def __init__(self):
self.bedrock = boto3.client('bedrock-runtime')
def classify_tickets(self, tickets: List[Dict[str, str]]) -> List[Dict[str, str]]:
prompts = [self._create_chat_payload(t) for t in tickets]
responses = self._call_threaded(prompts, self._call_bedrock)
formatted_responses = [self._format_results(r) for r in responses]
return [{**d1, **d2} for d1, d2 in zip(tickets, formatted_responses)]
def _call_bedrock(self, message_list: list[dict]) -> str:
response = self.bedrock.converse(
modelId=self.HAIKU_ID,
messages=message_list,
inferenceConfig=self.HYPER_PARAMS,
system=[{"text": SYSTEM_PROMPT}]
)
return response['output']['message']['content'][0]['text']
def _call_threaded(self, requests, function):
future_to_position = {}
with ThreadPoolExecutor(max_workers=5) as executor:
for i, request in enumerate(requests):
future = executor.submit(function, request)
future_to_position[future] = i
responses = [None] * len(requests)
for future in as_completed(future_to_position):
position = future_to_position[future]
try:
response = future.result()
responses[position] = response
except Exception as exc:
print(f"Request at position {position} generated an exception: {exc}")
responses[position] = None
return responses
def _create_chat_payload(self, ticket: dict) -> dict:
user_prompt = USER_PROMPT.format(summary=ticket['Summary'], description=ticket['Description'])
user_msg = {"role": "user", "content": [{"text": user_prompt}]}
return [user_msg]
def _format_results(self, model_response: str) -> dict:
reasoning = self._extract_with_(model_response, self.REASONING_PATTERN)
correctness = self._extract_with_(model_response, self.CORRECTNESS_PATTERN)
return {'Model Answer': correctness, 'Reasoning': reasoning}
@staticmethod
def _extract_with_(response, ):
matches = re.search(, response, re.DOTALL)
return matches.group(1).strip() if matches else None
Lastly, the classified tickets are converted to a CSV and uploaded to S3
import boto3
import io
import csvs3 = boto3.client('s3')
def upload_csv(data: List[Dict[str, str]]) -> None:
csv_buffer = io.StringIO()
author = csv.DictWriter(csv_buffer, fieldnames=data[0].keys())
author.writeheader()
author.writerows(data)
current_time = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"processed/processed_{current_time}.csv"
s3.put_object(
Bucket=self.bucket_name,
Key=filename,
Body=csv_buffer.getvalue()
)
The project is dashboard agnostic. Any popular tool/service will work so long as it could actually devour a CSV. Amazon Quicksight, Tableu or anything in between will do.
On this blog we discussed using Bedrock to robotically classify Jira tickets. These enriched tickets can then be used to create dashboards using various AWS Services or 3P tools. The takeaway, is that classifying text has develop into much simpler for the reason that adoption of LLMs and what would have taken weeks can now be done in days.
When you enjoyed this text be happy to attach with me on LinkedIn