Building my first n8n workflow, as a data scientist, felt like cheating.
I could connect to APIs without reading 30-page docs, trigger workflows from Gmail or Sheets, and deploy something useful in minutes.
However, the main drawback is that n8n is not natively optimised to run a Python environment in the cloud instances used by our customers.
Like many data scientists, my daily toolbox for data analytics is built on NumPy and Pandas.
To stay in my comfort zone, I often delegate calculations to external APIs instead of using n8n JavaScript Code nodes.
For example, this is what is done with a Production Planning Optimisation tool, which is orchestrated through a workflow that includes an Agent node calling a FastAPI microservice.
This approach worked, but I had clients who asked for complete visibility of the data analytics tasks in their n8n user interface.
I realised that I needed to learn just enough JavaScript to perform data processing with n8n's native Code nodes.

In this article, we'll experiment with small JavaScript snippets inside n8n Code nodes to perform everyday data analytics tasks.
For this exercise, I'll use a dataset of sales transactions and walk it through an ABC and Pareto analysis, which are widely used in Supply Chain Management.

I'll provide side-by-side examples of Pandas vs. JavaScript in n8n Code nodes, allowing us to translate our familiar Python data analysis steps directly into automated n8n workflows.

The idea is to implement these solutions for small datasets or quick prototyping within the capabilities of a cloud enterprise n8n instance (i.e. without community nodes).

I'll end the experiment with a quick comparative study of the performance versus a FastAPI call.
You can follow along and replicate the entire workflow using a Google Sheet and a workflow template shared in the article.
Let’s start!
Building a Data Analytics Workflow using JavaScript in n8n
Before building the nodes, let me introduce the context of this analysis.
ABC & Pareto Charts for Supply Chain Management
For this tutorial, I propose building a simple workflow that takes sales transactions from Google Sheets and transforms them into comprehensive ABC and Pareto charts.
This will replicate the ABC and Pareto Analysis module of the LogiGreen Apps developed by my startup, LogiGreen.

The goal is to generate a set of visuals for the inventory teams of a supermarket chain to help them understand the distribution of sales across their stores.
We will focus on generating two visuals.
The first chart shows an ABC-XYZ analysis of sale items:

- X-axis (Percentage of Turnover %): the contribution of each item to total revenue.
- Y-axis (Coefficient of Variation): demand variability of each item.
- Vertical red lines split items into A, B, and C classes based on turnover share.
- The horizontal blue line marks stable vs variable demand (CV = 1).
Together, the chart highlights which items are high-value and stable (A, low CV) versus those that are low-value or highly variable, guiding prioritisation in inventory management.
The second visual is a Pareto analysis of sales turnover:

- X-axis: percentage of SKUs (ranked by sales).
- Y-axis: cumulative percentage of annual turnover.
- The curve illustrates how a small fraction of items contributes to the majority of revenue.
In short, this highlights (or not) the classic Pareto rule, which states that 80% of sales can come from 20% of the SKUs.
How did I generate these two visuals? I simply used Python.
On my YouTube channel, I shared a complete tutorial on how to do it using Pandas and Matplotlib.
The objective of this tutorial is to prepare sales transactions and generate these visuals in a Google Sheet using only n8n's native JavaScript nodes.
Constructing a Data Analytics Workflow in n8n
I propose building a workflow that is manually triggered, to facilitate debugging during development.

To follow this tutorial, you need to duplicate the Google Sheet shared with the workflow template.
You can now connect your duplicated sheet using the second node, which will extract the dataset from the worksheet Input Data.

This dataset includes retail sales transactions at the daily granularity:
- ITEM: an item that can be sold in multiple stores
- SKU: represents an ITEM sold in a specific store
- FAMILY: a group of items
- CATEGORY: a product category that can include multiple families
- STORE: a code representing a sales location
- DAY: the day of the transaction
- QTY: sales quantity in units
- TO: sales turnover in euros
The output is the table's content in JSON format, ready to be ingested by other nodes.
import pandas as pd
df = pd.read_csv("sales.csv")
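On the n8n side, each row arrives as an item that wraps its fields in a json key. Here is a minimal sketch of that shape (illustrative values only, with the field names of the dataset above):
// Illustrative only: shape of the items emitted by the Google Sheets node
return [
  { json: { ITEM: "I001", SKU: "I001-S01", FAMILY: "F01", CATEGORY: "C01",
            STORE: "S01", DAY: "2024-01-02", QTY: 4, TO: 12.5 } },
  // ...one item per transaction row
];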
We can now begin processing the dataset to build our two visualisations.
Step 1: Filter out transactions without sales
Let us begin with the simple action of filtering out transactions with a sales quantity QTY
equal to zero.

We don't need JavaScript; a simple Filter node can do the job.
df = df[df["QTY"] != 0]
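If you prefer to keep everything in code, a minimal sketch of an equivalent Code node (an alternative to the Filter node used in the template) would be:
// Keep only the transactions with a non-zero sales quantity
return items.filter(({ json }) => Number(json.QTY) !== 0);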
Step 2: Prepare data for the Pareto Analysis
We first need to aggregate the sales per ITEM and rank products by turnover.
sku_agg = (df.groupby("ITEM", as_index=False)
.agg(TO=("TO","sum"), QTY=("QTY","sum"))
.sort_values("TO", ascending=False))
In our workflow, this step will be done in the JavaScript node TO, QTY GroupBY ITEM:
const agg = {};
for (const {json} of items) {
const ITEM = json.ITEM;
const TO = Number(json.TO);
const QTY = Number(json.QTY);
if (!agg[ITEM]) agg[ITEM] = { ITEM, TO: 0, QTY: 0 };
agg[ITEM].TO += TO;
agg[ITEM].QTY += QTY;
}
const rows = Object.values(agg).sort((a,b)=> b.TO - a.TO);
return rows.map(r => ({ json: r }));
This node returns a ranked table of sales per ITEM, in quantity (QTY) and turnover (TO):
- We initialise agg as a dictionary keyed by ITEM
- We loop over the n8n rows in items
- We convert TO and QTY to numbers
- We add the QTY and TO values to the running totals of each ITEM
- We finally transform the dictionary into an array sorted by TO descending and return the items

Sales per ITEM – (Image by Samir Saci)
We now have the data ready to perform a Pareto Analysis on sales quantity (QTY) or turnover (TO).
For that, we need to calculate cumulative sales and rank SKUs from the highest to the lowest contributor.
abc = sku_agg.copy() # from Step 2, already sorted by TO desc
total = abc["TO"].sum() or 1.0
abc["cum_turnover"] = abc["TO"].cumsum()
abc["cum_share"] = abc["cum_turnover"] / total
abc["sku_rank"] = range(1, len(abc) + 1)
abc["cum_skus"] = abc["sku_rank"] / len(abc)
abc["cum_skus_pct"] = abc["cum_skus"] * 100
This step will be done in the Code node Pareto Analysis:
const rows = items
  .map(i => ({
    ITEM: i.json.ITEM,
    TO: Number(i.json.TO) || 0,
    QTY: Number(i.json.QTY) || 0,
  }))
  .sort((a, b) => b.TO - a.TO);
const n = rows.length; // number of ITEMs
const totalTO = rows.reduce((s, r) => s + r.TO, 0) || 1;
We collect the dataset items from the previous node:
- For each row, we clean up the fields TO and QTY (in case we have missing values)
- We sort all SKUs by turnover in descending order
- We store the number of items and the total turnover in variables
let cumTO = 0;
rows.forEach((r, idx) => {
cumTO += r.TO;
r.cum_turnover = cumTO;
r.cum_share = +(cumTO / totalTO).toFixed(6);
r.sku_rank = idx + 1;
r.cum_skus = +((idx + 1) / n).toFixed(6);
r.cum_skus_pct = +(r.cum_skus * 100).toFixed(2);
});
return rows.map(r => ({ json: r }));
Then we loop over all items in sorted order:
- We use the variable cumTO to compute the cumulative contribution
- We add several Pareto metrics to each row:
  - cum_turnover: cumulative turnover up to this item
  - cum_share: cumulative share of turnover
  - sku_rank: ranking position of the item
  - cum_skus: cumulative number of SKUs as a fraction of total SKUs
  - cum_skus_pct: same as cum_skus, but in %
We are then done with the data preparation for the Pareto chart.
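As an optional sanity check (a sketch of an extra Code node, not part of the original template), you could read the Pareto rows and report the share of SKUs that covers 80% of the turnover:
// Hypothetical follow-up node placed after Pareto Analysis
const rows = items.map(i => i.json);
const hit = rows.find(r => r.cum_share >= 0.8);
return [{ json: { skus_pct_for_80_pct_of_turnover: hit ? hit.cum_skus_pct : null } }];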

This dataset will be stored in the worksheet Pareto by the node Update Pareto Sheet.
And with a bit of magic, we can generate this graph in the first worksheet:

We can now proceed with the ABC-XYZ chart.
Step 3: Calculate the demand variability and sales contribution
We could reuse the output of the Pareto chart for the sales contribution, but we will treat each chart as independent.
I'll split the code of the node Demand Variability x Sales % into multiple segments for clarity.
function mean(a){
  if (!a.length) return 0;
  return a.reduce((s, x) => s + x, 0) / a.length;
}
function stdev_samp(a){
if (a.length <= 1) return 0;
const m = mean(a);
const v = a.reduce((s,x)=> s + (x - m) ** 2, 0) / (a.length - 1);
return Math.sqrt(v);
}
These two functions will be used for the coefficient of variation (CoV):
- mean(a): computes the average of an array
- stdev_samp(a): computes the sample standard deviation
They take as inputs the daily sales distributions of each ITEM that we build in this second block.
const series = {}; // ITEM -> { day -> qty_sum }
let totalQty = 0;
for (const { json } of items) {
const item = String(json.ITEM);
const day = String(json.DAY);
const qty = Number(json.QTY || 0);
if (!series[item]) series[item] = {};
series[item][day] = (series[item][day] || 0) + qty;
totalQty += qty;
}
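Here is the Pandas equivalent for building these daily series: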
import pandas as pd
import numpy as np
df['QTY'] = pd.to_numeric(df['QTY'], errors='coerce').fillna(0)
daily_series = df.groupby(['ITEM', 'DAY'])['QTY'].sum().reset_index()
Now we can compute the metrics applied to the daily sales distributions.
const out = [];
for (const [item, dayMap] of Object.entries(series)) {
const daily = Object.values(dayMap); // daily sales quantities
const qty_total = daily.reduce((s,x)=>s+x, 0);
const m = mean(daily); // average daily sales
const sd = stdev_samp(daily); // variability of sales
const cv = m ? sd / m : null; // coefficient of variation
const share_qty_pct = totalQty ? (qty_total / totalQty) * 100 : 0;
out.push({
ITEM: item,
qty_total,
share_qty_pct: Number(share_qty_pct.toFixed(2)),
mean_qty: Number(m.toFixed(3)),
std_qty: Number(sd.toFixed(3)),
cv_qty: cv == null ? null : Number(cv.toFixed(3)),
});
}
For each ITEM, we calculate:
- qty_total: total sales
- mean_qty: average daily sales
- std_qty: standard deviation of daily sales
- cv_qty: coefficient of variation (variability measure for XYZ classification)
- share_qty_pct: % contribution to total sales (used for ABC classification)
Here is the Python version in case you were lost:
summary = daily_series.groupby('ITEM').agg(
qty_total=('QTY', 'sum'),
mean_qty=('QTY', 'mean'),
std_qty=('QTY', 'std')
).reset_index()
summary['std_qty'] = summary['std_qty'].fillna(0)
total_qty = summary['qty_total'].sum()
summary['cv_qty'] = summary['std_qty'] / summary['mean_qty'].replace(0, np.nan)
summary['share_qty_pct'] = 100 * summary['qty_total'] / total_qty
We are nearly done.
We just need to sort by descending contribution to prepare for the ABC class mapping:
out.sort((a,b) => b.share_qty_pct - a.share_qty_pct);
return out.map(r => ({ json: r }));
We now have, for each ITEM, the key metrics needed to create the scatter plot.

Demand Variability x Sales % – (Image by Samir Saci)
Only the ABC classes are missing at this step.
Step 4: Add ABC classes
We take the output of the previous node as input.
let rows = items.map(i => i.json);
rows.sort((a, b) => b.share_qty_pct - a.share_qty_pct);
Just in case, we sort the ITEMs by descending sales share (%): the most important SKUs come first.
Then we can apply the class based on hardcoded conditions:
- A: SKUs that together represent the first 5% of sales
- B: SKUs that together represent the next 15% of sales
- C: everything after 20%
let cum = 0;
for (let r of rows) {
cum += r.share_qty_pct;
// 3) Assign class based on cumulative %
if (cum <= 5) {
r.ABC = 'A'; // top 5%
} else if (cum <= 20) {
r.ABC = 'B'; // next 15%
} else {
r.ABC = 'C'; // rest
}
r.cum_share = Number(cum.toFixed(2));
}
return rows.map(r => ({ json: r }));
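To verify the split, a small follow-up Code node (again a sketch, not part of the original template) could count the SKUs per class:
// Hypothetical check node: count SKUs per ABC class
const counts = { A: 0, B: 0, C: 0 };
for (const { json } of items) counts[json.ABC] = (counts[json.ABC] || 0) + 1;
return [{ json: counts }];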
The same ABC mapping can be done the following way in Python:
df = df.sort_values('share_qty_pct', ascending=False).reset_index(drop=True)
df['cum_share'] = df['share_qty_pct'].cumsum()
def classify(cum):
if cum <= 5:
return 'A'
elif cum <= 20:
return 'B'
else:
return 'C'
df['ABC'] = df['cum_share'].apply(classify)
The results can now be used to generate this chart, which can be found in the first sheet of the Google Sheet:

I struggled (probably due to my limited knowledge of Google Sheets) to find a “manual” way to create this scatter plot with the proper colour mapping.
Therefore, I used a Google Apps Script available in the Google Sheet to create it.

As a bonus, I added more nodes to the n8n template that perform the same kind of GroupBy to calculate sales by store or by ITEM-store pair.

They can be used to create visuals like this one:

To conclude this tutorial, we can confidently declare that the job is done.
For a live demo of the workflow, you can have a look at this short tutorial.
Our customers, who run this workflow on their n8n cloud instance, can now gain visibility into each step of the data processing.
But at what cost? Are we losing performance?
That is what we will find out in the next section.
Comparative Study of Performance: n8n JavaScript Node vs. Python in FastAPI
To answer this question, I prepared a simple experiment.
The same dataset and transformations were processed using two different approaches inside n8n:
- All in JavaScript nodes with functions directly inside n8n.
- Outsourcing to FastAPI microservices by replacing the JavaScript logic with HTTP requests to Python endpoints.

These two endpoints are connected to functions that load the data directly from the VPS instance where I host the microservice.
# Assumed imports for this excerpt; logger, load_config and SpeedAnalysis are defined elsewhere in the service
from os import path, makedirs
import traceback

import numpy as np
import pandas as pd
from fastapi import APIRouter, Request, HTTPException

router = APIRouter()

@router.post("/launch_pareto")
async def launch_speedtest(request: Request):
    try:
        session_id = request.headers.get('session_id', 'session')
        folder_in = 'data/session/speed_test/input'
        if not path.exists(folder_in):
            makedirs(folder_in)
        file_path = folder_in + '/sales.csv'
        logger.info(f"[SpeedTest]: Loading data from session file: {file_path}")
        df = pd.read_csv(file_path, sep=";")
        logger.info(f"[SpeedTest]: Data loaded successfully: {df.head()}")
        speed_tester = SpeedAnalysis(df)
        output = await speed_tester.process_pareto()
        result = output.to_dict(orient="records")
        result = speed_tester.convert_numpy(result)
        logger.info(f"[SpeedTest]: /launch_pareto completed successfully for {session_id}")
        return result
    except Exception as e:
        logger.error(f"[SpeedTest]: Error /launch_pareto: {str(e)}\n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Failed to process Speed Test Analysis: {str(e)}")

@router.post("/launch_abc_xyz")
async def launch_abc_xyz(request: Request):
    try:
        session_id = request.headers.get('session_id', 'session')
        folder_in = 'data/session/speed_test/input'
        if not path.exists(folder_in):
            makedirs(folder_in)
        file_path = folder_in + '/sales.csv'
        logger.info(f"[SpeedTest]: Loading data from session file: {file_path}")
        df = pd.read_csv(file_path, sep=";")
        logger.info(f"[SpeedTest]: Data loaded successfully: {df.head()}")
        speed_tester = SpeedAnalysis(df)
        output = await speed_tester.process_abcxyz()
        result = output.to_dict(orient="records")
        result = speed_tester.convert_numpy(result)
        logger.info(f"[SpeedTest]: /launch_abc_xyz completed successfully for {session_id}")
        return result
    except Exception as e:
        logger.error(f"[SpeedTest]: Error /launch_abc_xyz: {str(e)}\n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Failed to process Speed Test Analysis: {str(e)}")
I want to focus this test only on the data processing performance.
The SpeedAnalysis class includes all the data processing steps listed in the previous section:
- Grouping sales by ITEM
- Sorting ITEMs in descending order and calculating cumulative sales
- Calculating the standard deviations and means of the daily sales distribution by ITEM
class SpeedAnalysis:
    def __init__(self, df: pd.DataFrame):
        config = load_config()
        self.df = df

    def processing(self):
        try:
            sales = self.df.copy()
            sales = sales[sales['QTY'] > 0].copy()
            self.sales = sales
        except Exception as e:
            logger.error(f'[SpeedTest] Error for processing: {e}\n{traceback.format_exc()}')

    def prepare_pareto(self):
        try:
            sku_agg = self.sales.copy()
            sku_agg = (sku_agg.groupby("ITEM", as_index=False)
                       .agg(TO=("TO", "sum"), QTY=("QTY", "sum"))
                       .sort_values("TO", ascending=False))
            pareto = sku_agg.copy()
            total = pareto["TO"].sum() or 1.0
            pareto["cum_turnover"] = pareto["TO"].cumsum()
            pareto["cum_share"] = pareto["cum_turnover"] / total
            pareto["sku_rank"] = range(1, len(pareto) + 1)
            pareto["cum_skus"] = pareto["sku_rank"] / len(pareto)
            pareto["cum_skus_pct"] = pareto["cum_skus"] * 100
            return pareto
        except Exception as e:
            logger.error(f'[SpeedTest] Error for prepare_pareto: {e}\n{traceback.format_exc()}')

    def abc_xyz(self):
        daily = self.sales.groupby(["ITEM", "DAY"], as_index=False)["QTY"].sum()
        stats = (
            daily.groupby("ITEM")["QTY"]
            .agg(
                qty_total="sum",
                mean_qty="mean",
                std_qty="std"
            )
            .reset_index()
        )
        stats["cv_qty"] = stats["std_qty"] / stats["mean_qty"].replace(0, np.nan)
        total_qty = stats["qty_total"].sum()
        stats["share_qty_pct"] = (stats["qty_total"] / total_qty * 100).round(2)
        stats = stats.sort_values("share_qty_pct", ascending=False).reset_index(drop=True)
        stats["cum_share"] = stats["share_qty_pct"].cumsum().round(2)

        def classify(cum):
            if cum <= 5:
                return "A"
            elif cum <= 20:
                return "B"
            else:
                return "C"

        stats["ABC"] = stats["cum_share"].apply(classify)
        return stats

    def convert_numpy(self, obj):
        if isinstance(obj, dict):
            return {k: self.convert_numpy(v) for k, v in obj.items()}
        elif isinstance(obj, list):
            return [self.convert_numpy(v) for v in obj]
        elif isinstance(obj, (np.integer, int)):
            return int(obj)
        elif isinstance(obj, (np.floating, float)):
            return float(obj)
        else:
            return obj

    async def process_pareto(self):
        """Main processing function that calls the other methods in order."""
        self.processing()
        outputs = self.prepare_pareto()
        return outputs

    async def process_abcxyz(self):
        """Main processing function that calls the other methods in order."""
        self.processing()
        outputs = self.abc_xyz().fillna(0)
        logger.info(f"[SpeedTest]: ABC-XYZ analysis completed {outputs}.")
        return outputs
Now that we have these endpoints ready, we can begin testing.

The results are shown above:
- JavaScript-only workflow: the whole process was completed in a bit more than 11.7 seconds.
- FastAPI-backed workflow: the equivalent “outsourced” process was completed in ~11.0 seconds.
In other words, outsourcing complex computations to Python actually improves performance.
The reason is that the FastAPI endpoints execute optimised, vectorised Pandas functions, while the JavaScript nodes inside n8n must iterate over rows with explicit loops.
For large datasets, I would expect a delta that is not negligible.
This demonstrates that you can do simple data processing inside n8n using small JavaScript snippets.
However, our Supply Chain Analytics products can require more advanced processing involving optimisation and advanced statistical libraries.

For that, customers can accept dealing with a “black box” approach, as seen in the Production Planning workflow presented in this Towards Data Science article.
But for simpler data processing steps, we can integrate them into the workflow to provide visibility to non-coder users.
For another project, I use n8n to connect Supply Chain IT systems for the transfer of Purchase Orders using Electronic Data Interchange (EDI).

This workflow, deployed for a small logistics company, parses EDI messages entirely with JavaScript nodes.

As you can discover in this tutorial, we performed 100% of the Electronic Data Interchange message parsing using JavaScript nodes.
This helped us improve the robustness of the solution and reduce our workload by handing over the maintenance to the client.
What's the perfect approach?
For me, n8n must be used as an orchestration and integration tool connected to our core analytics products.
These analytics products require specific input formats that won't align with our customers’ data.
Subsequently, I might advise using JavaScript code nodes to perform this preprocessing.

For instance, the workflow above connects a Google Sheet (containing input data) to a FastAPI microservice that runs an algorithm for Distribution Planning Optimisation.
The idea is to plug our optimisation algorithm into a Google Sheet used by Distribution Planners to organise store deliveries.

The JavaScript Code node is used to transform the data collected from the Google Sheet into the input format required by our algorithm.
By doing the job inside the workflow, it stays under the control of the client, who runs the workflow on their own instance.
And we can keep the optimisation part in a microservice hosted on our own instance.
To better understand the setup, feel free to have a look at this short presentation.
I hope this tutorial and the examples above have given you enough insight into what can be done with n8n in terms of data analytics.
Feel free to share your comments about the approach and your thoughts on what could be improved to boost the workflow's performance.
About Me
Let's connect on LinkedIn and Twitter. I am a Supply Chain Engineer who uses data analytics to improve logistics operations and reduce costs.
For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.
Find your complete guide for Supply Chain Analytics: the Analytics Cheat Sheet.
If you are interested in Data Analytics and Supply Chain, have a look at my website.