From Python to JavaScript: A Playbook for Data Analytics in n8n with Code Node Examples


When I built my first n8n workflow, as a data scientist, it felt like I was cheating.

I could connect to APIs without reading 30-page docs, trigger workflows from Gmail or Sheets, and deploy something useful in minutes.

However, the main drawback is that n8n is not natively optimised to run a Python environment in the cloud instances used by our customers.

Like many data scientists, my daily toolbox for data analytics is built on NumPy and Pandas.

To stay in my comfort zone, I often outsourced calculations to external APIs instead of using n8n JavaScript code nodes.

Production Planning n8n workflow with API function calling – (Image by Samir Saci)

For example, this is what is done with a Production Planning Optimisation tool, which is orchestrated through a workflow that includes an Agent node calling a FastAPI microservice.

This approach worked, but I had clients who requested complete visibility of the data analytics tasks in their n8n user interface.

I realised that I needed to learn just enough JavaScript to perform data processing with n8n’s native code nodes.

Example of JavaScript node grouping sales by ITEM – (Image by Samir Saci)

In this article, we will experiment with small JavaScript snippets inside n8n Code nodes to perform everyday data analytics tasks.

For this exercise, I will use a dataset of sales transactions and walk it through an ABC and Pareto analysis, which are widely used in Supply Chain Management.

ABC XYZ & Pareto Charts widely used in Supply Chain Management – (Image by Samir Saci)

I will provide side-by-side examples of Pandas vs. JavaScript in n8n Code nodes, allowing us to translate our familiar Python data analysis steps directly into automated n8n workflows.

Example of JavaScript vs. Pandas – (Image by Samir Saci)

The idea is to implement these solutions for small datasets or quick prototyping, within the capabilities of a cloud enterprise n8n instance (i.e. without community nodes).

The experimental workflow we will build together – (Image by Samir Saci)

I will end the experiment with a quick comparative study of the performance versus a FastAPI call.

You can follow along and replicate the entire workflow using a Google Sheet and a workflow template shared in the article.

Let’s start!

Building a Data Analytics Workflow using JavaScript in n8n

Before starting to build nodes, I will introduce the context of this analysis.

ABC & Pareto Charts for Supply Chain Management

For this tutorial, I propose building a simple workflow that takes sales transactions from Google Sheets and transforms them into comprehensive ABC and Pareto charts.

This will replicate the ABC and Pareto Analysis module of the LogiGreen Apps developed by my startup, LogiGreen.

ABC Analysis Module of the LogiGreen Apps – (Image by Samir Saci)

The goal is to generate a set of visuals for the inventory teams of a supermarket chain to help them understand the distribution of sales across their stores.

We will focus on generating two visuals.

The first chart shows an ABC-XYZ analysis of sales items:

ABC XYZ Chart – (Image by Samir Saci)
  • X-axis (Percentage of Turnover %): the contribution of each item to total revenue.
  • Y-axis (Coefficient of Variation): demand variability of each item.
  • Vertical red lines split items into A, B, and C classes based on turnover share.
  • The horizontal blue line marks stable vs variable demand (CV = 1).

Together, they highlight which items are high-value & stable (A, low CV) versus those that are low-value or highly variable, guiding prioritisation in inventory management.

The second visual is a Pareto analysis of sales turnover:

Pareto Chart generated by the Logigreen App – Image by Samir Saci
  • X-axis: percentage of SKUs (ranked by sales).
  • Y-axis: cumulative percentage of annual turnover.
  • The curve illustrates how a small fraction of items contributes to the vast majority of revenue.

In brief, this highlights (or not) the classic Pareto rule, which states that 80% of sales can come from 20% of the SKUs.

How did I generate these two visuals? I simply used Python.

On my YouTube channel, I shared a complete tutorial on how to do it using Pandas and Matplotlib.

The objective of this tutorial is to prepare sales transactions and generate these visuals in a Google Sheet using only n8n’s native JavaScript nodes.

Building a Data Analytics Workflow in n8n

I propose building a workflow that is manually triggered to facilitate debugging during development.

Final Workflow manually triggered to collect data from Google Sheets and generate visuals – (Image by Samir Saci)

To follow this tutorial, you need to make a copy of the Google Sheet and import the workflow template shared in the article.

You can now connect your duplicated sheet using the second node, which will extract the dataset from the worksheet Input Data.

Connect the second node to your copy of the Google Sheet to collect input data – (Image by Samir Saci)

This dataset includes retail sales transactions at the daily granularity:

  • ITEM: an item that can be sold in multiple stores
  • SKU: an ITEM sold in a specific store
  • FAMILY: a group of items
  • CATEGORY: a product category can include multiple families
  • STORE: a code representing a sales location
  • DAY: date of the transaction
  • QTY: sales quantity in units
  • TO: sales turnover in euros

The output is the table’s content in JSON format, ready to be ingested by other nodes.

import pandas as pd
df = pd.read_csv("sales.csv") 
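
Before going further, here is a rough sketch of how a Code node sees this data; the field values below are purely illustrative, not actual rows from the dataset:

// In an n8n Code node, the previous node's rows arrive in the `items` array,
// each row wrapping its fields in a `json` object, roughly like this:
// [
//   { json: { ITEM: "ITEM-001", SKU: "ITEM-001-STORE-12", FAMILY: "FAM-A", CATEGORY: "CAT-1",
//             STORE: "STORE-12", DAY: "2023-01-01", QTY: 4, TO: 25.6 } },
//   ...
// ]
// A Code node must return the same shape; this pass-through simply forwards every row:
return items.map(i => ({ json: i.json }));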

We can now begin processing the dataset to build our two visualisations.

Step 1: Filter out transactions without sales

Let us begin with the simple action of filtering out transactions with a sales QTY equal to zero.

Filter out transactions without sales using the filter node – (Image by Samir Saci)

We don’t need JavaScript; a simple Filter node can do the job.

df = df[df["QTY"] != 0]
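
If you prefer to keep everything in Code nodes, a minimal JavaScript sketch equivalent to this Filter node could look like this:

// Keep only the rows with a non-zero sales quantity
return items.filter(i => Number(i.json.QTY) !== 0);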

Step 2: Prepare data for Pareto Analysis

We first need to aggregate the sales per ITEM and rank products by turnover.

sku_agg = (df.groupby("ITEM", as_index=False)
             .agg(TO=("TO","sum"), QTY=("QTY","sum"))
             .sort_values("TO", ascending=False))

In our workflow, this step will be done in the JavaScript node TO, QTY GroupBY ITEM:

// Aggregate turnover (TO) and quantity (QTY) per ITEM
const agg = {};
for (const { json } of items) {
  const ITEM = json.ITEM;
  const TO = Number(json.TO);
  const QTY = Number(json.QTY);
  if (!agg[ITEM]) agg[ITEM] = { ITEM, TO: 0, QTY: 0 };
  agg[ITEM].TO += TO;
  agg[ITEM].QTY += QTY;
}
// Sort by turnover (descending) and return n8n items
const rows = Object.values(agg).sort((a, b) => b.TO - a.TO);
return rows.map(r => ({ json: r }));

This node returns a ranked table of sales per ITEM in quantity (QTY) and turnover (TO): 

  1. We initialise agg as a dictionary keyed by ITEM
  2. We loop over the n8n rows in items:
  • Converting TO and QTY to numbers
  • Adding the QTY and TO values to the running totals of each ITEM
  3. We finally transform the dictionary into an array sorted by TO descending and return the items

Output data of the aggregation of sales by ITEM – (Image by Samir Saci)

We now have the data ready to perform a Pareto Analysis on sales quantity (QTY) or turnover (TO).

For that, we need to calculate cumulative sales and rank SKUs from the highest to the lowest contributor.

abc = sku_agg.copy()  # from Step 2, already sorted by TO desc
total = abc["TO"].sum() or 1.0
abc["cum_turnover"] = abc["TO"].cumsum()
abc["cum_share"]    = abc["cum_turnover"] / total             
abc["sku_rank"]     = range(1, len(abc) + 1)
abc["cum_skus"]     = abc["sku_rank"] / len(abc)               
abc["cum_skus_pct"] = abc["cum_skus"] * 100                    

This step will be done in the code node Pareto Analysis:

// Clean up the fields TO and QTY and sort items by turnover (descending)
const rows = items
  .map(i => ({ ...i.json, TO: Number(i.json.TO) || 0, QTY: Number(i.json.QTY) || 0 }))
  .sort((a, b) => b.TO - a.TO);

const n = rows.length;                                   // number of ITEMs
const totalTO = rows.reduce((s, r) => s + r.TO, 0) || 1; // total turnover (avoid division by zero)

We collect the dataset items from the previous node:

  1. For each row, we clean up the fields TO and QTY (in case we have missing values)
  2. We sort all SKUs by turnover in descending order
  3. We store in variables the number of items and the total turnover

let cumTO = 0;
rows.forEach((r, idx) => {
  cumTO += r.TO;
  r.cum_turnover = cumTO;                     
  r.cum_share = +(cumTO / totalTO).toFixed(6); 
  r.sku_rank = idx + 1;
  r.cum_skus = +((idx + 1) / n).toFixed(6);   
  r.cum_skus_pct = +(r.cum_skus * 100).toFixed(2);
});

return rows.map(r => ({ json: r }));

Then we loop over all items in sorted order.

  1. Use the variable cumTO to compute the cumulative contribution
  2. Add several Pareto metrics to each row:
  • cum_turnover: cumulative turnover up to this item
  • cum_share: cumulative share of turnover
  • sku_rank: ranking position of the item
  • cum_skus: cumulative number of SKUs as a fraction of total SKUs
  • cum_skus_pct: same as cum_skus, but in %

We are then done with the data preparation for the Pareto chart.

Final results – (Image by Samir Saci)

This dataset will be stored in the worksheet Pareto by the node Update Pareto Sheet.

And with a bit of magic, we can generate this graph in the first worksheet:

Pareto Chart generated using data processed by the n8n workflow – (Image by Samir Saci)

We can now proceed with the ABC XYZ chart.

Step 3: Calculate the demand variability and sales contribution

We could reuse the output of the Pareto chart for the sales contribution, but we will treat each chart as independent.

I will split the code for the node Demand Variability x Sales % into multiple segments for clarity.

function mean(a){
  return a.reduce((s, x) => s + x, 0) / a.length;
}
function stdev_samp(a){
  if (a.length <= 1) return 0;
  const m = mean(a);
  const v = a.reduce((s,x)=> s + (x - m) ** 2, 0) / (a.length - 1);
  return Math.sqrt(v);
}

These two functions will be used to compute the coefficient of variation (CV):

  • mean(a): computes the average of an array.
  • stdev_samp(a): computes the sample standard deviation.
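
As a quick sanity check with made-up numbers (not taken from the dataset), the coefficient of variation is simply the ratio of the two:

// Illustrative daily sales quantities for a single ITEM
const demo = [4, 0, 6, 2];
const cv = stdev_samp(demo) / mean(demo); // ≈ 2.58 / 3 ≈ 0.86, just below the CV = 1 threshold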

They take as input the daily sales distributions of each ITEM that we build in this second block:

const series = {};  // ITEM -> { day -> qty_sum }
let totalQty = 0;

for (const { json } of items) {
  const item = String(json.ITEM);
  const day  = String(json.DAY);
  const qty  = Number(json.QTY || 0);

  if (!series[item]) series[item] = {};
  series[item][day] = (series[item][day] || 0) + qty;
  totalQty += qty;
}

import pandas as pd
import numpy as np
df['QTY'] = pd.to_numeric(df['QTY'], errors='coerce').fillna(0)
daily_series = df.groupby(['ITEM', 'DAY'])['QTY'].sum().reset_index()

Now we can compute the metrics applied to the daily sales distributions.

const out = [];
for (const [item, dayMap] of Object.entries(series)) {
  const daily = Object.values(dayMap); // daily sales quantities
  const qty_total = daily.reduce((s,x)=>s+x, 0);
  const m = mean(daily);               // average daily sales
  const sd = stdev_samp(daily);        // variability of sales
  const cv = m ? sd / m : null;        // coefficient of variation
  const share_qty_pct = totalQty ? (qty_total / totalQty) * 100 : 0;

  out.push({
    ITEM: item,
    qty_total,
    share_qty_pct: Number(share_qty_pct.toFixed(2)),
    mean_qty: Number(m.toFixed(3)),
    std_qty: Number(sd.toFixed(3)),
    cv_qty: cv == null ? null : Number(cv.toFixed(3)),
  });
}

For each ITEM, we calculate:

  • qty_total: total sales
  • mean_qty: average daily sales
  • std_qty: standard deviation of daily sales
  • cv_qty: coefficient of variation (variability measure for XYZ classification)
  • share_qty_pct: % contribution to total sales (used for ABC classification)

Here is the Python version in case you were lost:

summary = daily_series.groupby('ITEM').agg(
    qty_total=('QTY', 'sum'),
    mean_qty=('QTY', 'mean'),
    std_qty=('QTY', 'std')
).reset_index()

summary['std_qty'] = summary['std_qty'].fillna(0)

total_qty = summary['qty_total'].sum()
summary['cv_qty'] = summary['std_qty'] / summary['mean_qty'].replace(0, np.nan)
summary['share_qty_pct'] = 100 * summary['qty_total'] / total_qty

We are nearly done.

We just need to sort by descending contribution to prepare for the ABC class mapping:

out.sort((a,b) => b.share_qty_pct - a.share_qty_pct);
return out.map(r => ({ json: r }));

We now have, for each ITEM, the key metrics needed to create the scatter plot.

Output of the node Demand Variability x Sales % – (Image by Samir Saci)

Only the ABC classes are missing at this step.

Step 4: Add ABC classes

We take the output of the previous node as input.

let rows = items.map(i => i.json);
rows.sort((a, b) => b.share_qty_pct - a.share_qty_pct);

Just in case, we sort ITEMs in descending order by sales share (%): most important SKUs first.


Then we can apply the class based on hardcoded conditions:

  • A: SKUs that together represent the first 5% of sales
  • B: SKUs that together represent the next 15% of sales
  • C: everything after 20%

let cum = 0;
for (let r of rows) {
  cum += r.share_qty_pct;

  // 3) Assign class based on cumulative %
  if (cum <= 5) {
    r.ABC = 'A';   // top 5%
  } else if (cum <= 20) {
    r.ABC = 'B';   // next 15%
  } else {
    r.ABC = 'C';   // rest
  }

  r.cum_share = Number(cum.toFixed(2));
}

return rows.map(r => ({ json: r }));

This can be done the following way using Python code:

df = df.sort_values('share_qty_pct', ascending=False).reset_index(drop=True)
df['cum_share'] = df['share_qty_pct'].cumsum()
def classify(cum):
    if cum <= 5:
        return 'A'
    elif cum <= 20:
        return 'B'
    else:
        return 'C'
df['ABC'] = df['cum_share'].apply(classify)

The results can now be used to generate this chart, which can be found in the first sheet of the Google Sheet:

ABC XYZ Chart generated with the data processed by the workflow using JavaScript – (Image by Samir Saci)

I struggled (probably due to my limited knowledge of Google Sheets) to find a “manual” way to create this scatter plot with the proper colour mapping.

Therefore, I used a Google Apps Script available in the Google Sheet to create it.

Script included in the Google Sheet to generate the visual – (Image by Samir Saci)
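
For reference, here is a minimal Apps Script sketch of how such a chart could be created programmatically. This is an assumption for illustration only, not the exact script shipped with the sheet, and the sheet name and data range are hypothetical:

function buildAbcXyzChart() {
  // Hypothetical layout: column A = % of turnover (X-axis),
  // columns B-D = CV values split into one series per ABC class
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('ABC XYZ');
  const chart = sheet.newChart()
    .setChartType(Charts.ChartType.SCATTER)
    .addRange(sheet.getRange('A1:D100'))   // hypothetical data range
    .setOption('series', {                 // one colour per ABC class
      0: { color: 'green' },
      1: { color: 'orange' },
      2: { color: 'red' },
    })
    .setPosition(1, 6, 0, 0)
    .build();
  sheet.insertChart(chart);
}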

As a bonus, I added more nodes to the n8n template that perform the same kind of GroupBy to calculate sales by STORE or by ITEM-STORE pair.

The experimental workflow we built together – (Image by Samir Saci)

They can be used to create visuals like this one:

Total Daily Sales Quantity per Store – (Image by Samir Saci)

To conclude this tutorial, we can confidently declare that the job is done.

For a live demo of the workflow, you can have a look at this short tutorial.

Our customers, who run this workflow on their n8n cloud instance, can now gain visibility into each step of the data processing.

But at what cost? Are we losing performance?

That is what we will discover in the next section.

Comparative Study of Performance: n8n JavaScript Node vs. Python in FastAPI

To answer this question, I prepared a simple experiment.

The same dataset and transformations were processed using two different approaches inside n8n:

  1. All in JavaScript: code nodes with functions directly inside n8n.
  2. Outsourcing to FastAPI microservices: replacing the JavaScript logic with HTTP requests to Python endpoints.

Simple Workflow using a FastAPI microservice – (Image by Samir Saci)

These two endpoints are connected to functions that load the data directly from the VPS instance where I hosted the microservice.

@router.post("/launch_pareto")
async def launch_speedtest(request: Request):
    try:
        session_id = request.headers.get('session_id', 'session')

        folder_in = f'data/session/speed_test/input'
        if not path.exists(folder_in):
            makedirs(folder_in)

        file_path = folder_in + '/sales.csv'
        logger.info(f"[SpeedTest]: Loading data from session file: {file_path}")
        df = pd.read_csv(file_path, sep=";")
        logger.info(f"[SpeedTest]: Data loaded successfully: {df.head()}")

        speed_tester = SpeedAnalysis(df)
        output = await speed_tester.process_pareto()
        
        result = output.to_dict(orient="records")
        result = speed_tester.convert_numpy(result)
        
        logger.info(f"[SpeedTest]: /launch_pareto accomplished successfully for {session_id}")
        return result
    except Exception as e:
        logger.error(f"[SpeedTest]: Error /launch_pareto: {str(e)}n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Didn't process Speed Test Evaluation: {str(e)}")
    
@router.post("/launch_abc_xyz")
async def launch_abc_xyz(request: Request):
    try:
        session_id = request.headers.get('session_id', 'session')

        folder_in = f'data/session/speed_test/input'
        if not path.exists(folder_in):
            makedirs(folder_in)

        file_path = folder_in + '/sales.csv'
        logger.info(f"[SpeedTest]: Loading data from session file: {file_path}")
        df = pd.read_csv(file_path, sep=";")
        logger.info(f"[SpeedTest]: Data loaded successfully: {df.head()}")

        speed_tester = SpeedAnalysis(df)
        output = await speed_tester.process_abcxyz()
        
        result = output.to_dict(orient="records")
        result = speed_tester.convert_numpy(result)
        
        logger.info(f"[SpeedTest]: /launch_abc_xyz accomplished successfully for {session_id}")
        return result
    except Exception as e:
        logger.error(f"[SpeedTest]: Error /launch_abc_xyz: {str(e)}n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Didn't process Speed Test Evaluation: {str(e)}")
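
For context, the HTTP Request node that replaces the JavaScript logic essentially does the equivalent of the following call (the base URL is a placeholder for the VPS hosting the microservice):

// Rough JavaScript equivalent of the HTTP Request node configuration
const response = await fetch('https://your-vps.example.com/launch_pareto', {
  method: 'POST',
  headers: { session_id: 'my-session' }, // the endpoint falls back to 'session' if the header is missing
});
const pareto = await response.json();    // array of records, one per ITEM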

I want to focus this test only on the data processing performance.

The SpeedAnalysis class includes all the data processing steps listed in the previous section:

  • Grouping sales by ITEM
  • Sorting ITEMs in descending order and calculating cumulative sales
  • Calculating standard deviations and means of the sales distributions by ITEM

class SpeedAnalysis:
    def __init__(self, df: pd.DataFrame):
        config = load_config()
        
        self.df = df
        
    def processing(self):
        try:
            sales = self.df.copy()
            sales = sales[sales['QTY']>0].copy()
            self.sales = sales

        except Exception as e:
            logger.error(f'[SpeedTest] Error for processing: {e}\n{traceback.format_exc()}')
            
    def prepare_pareto(self):
        try:
            sku_agg = self.sales.copy()
            sku_agg = (sku_agg.groupby("ITEM", as_index=False)
             .agg(TO=("TO","sum"), QTY=("QTY","sum"))
             .sort_values("TO", ascending=False))
            
            pareto = sku_agg.copy()  
            total = pareto["TO"].sum() or 1.0
            pareto["cum_turnover"] = pareto["TO"].cumsum()
            pareto["cum_share"]    = pareto["cum_turnover"] / total              
            pareto["sku_rank"]     = range(1, len(pareto) + 1)
            pareto["cum_skus"]     = pareto["sku_rank"] / len(pareto)               
            pareto["cum_skus_pct"] = pareto["cum_skus"] * 100
            return pareto                    
        except Exception as e:
            logger.error(f'[SpeedTest] Error for prepare_pareto: {e}\n{traceback.format_exc()}')
            
    def abc_xyz(self):
            daily = self.sales.groupby(["ITEM", "DAY"], as_index=False)["QTY"].sum()
            stats = (
                daily.groupby("ITEM")["QTY"]
                .agg(
                    qty_total="sum",
                    mean_qty="mean",
                    std_qty="std"
                )
                .reset_index()
            )
            stats["cv_qty"] = stats["std_qty"] / stats["mean_qty"].replace(0, np.nan)
            total_qty = stats["qty_total"].sum()
            stats["share_qty_pct"] = (stats["qty_total"] / total_qty * 100).round(2)
            stats = stats.sort_values("share_qty_pct", ascending=False).reset_index(drop=True)
            stats["cum_share"] = stats["share_qty_pct"].cumsum().round(2)
            def classify(cum):
                if cum <= 5:
                    return "A"
                elif cum <= 20:
                    return "B"
                else:
                    return "C"
            stats["ABC"] = stats["cum_share"].apply(classify)
            return stats
        
    def convert_numpy(self, obj):
        if isinstance(obj, dict):
            return {k: self.convert_numpy(v) for k, v in obj.items()}
        elif isinstance(obj, list):
            return [self.convert_numpy(v) for v in obj]
        elif isinstance(obj, (np.integer, int)):
            return int(obj)
        elif isinstance(obj, (np.floating, float)):
            return float(obj)
        else:
            return obj

    async def process_pareto(self):
        """Fundamental processing function that calls all other methods so as."""
        self.processing()
        outputs = self.prepare_pareto()
        return outputs
    
    async def process_abcxyz(self):
        """Fundamental processing function that calls all other methods so as."""
        self.processing()
        outputs = self.abc_xyz().fillna(0)
        logger.info(f"[SpeedTest]: ABC-XYZ evaluation accomplished {outputs}.")
        return outputs

Now that we have these endpoints ready, we can begin testing.

Results of the experimentation (Top: Processing using native code nodes / Bottom: FastAPI Microservice) – (Image by Samir Saci)

The results are shown above:

  • JavaScript-only workflow: the entire process was completed in a bit more than 11.7 seconds.
  • FastAPI-backed workflow: the equivalent “outsourced” process was completed in ~11.0 seconds.

In other words, outsourcing complex computations to Python actually improves performance.

The reason is that FastAPI endpoints execute optimised Python functions directly, while the JavaScript nodes inside n8n must iterate over the rows with explicit loops.
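
If you want to estimate that loop cost on your own data before choosing an approach, a quick check outside n8n (plain Node.js, synthetic rows) could look like this:

// Time a plain-JavaScript groupby on synthetic rows (sizes are arbitrary)
const rows = Array.from({ length: 500_000 }, (_, i) => ({
  ITEM: `ITEM-${i % 2000}`,
  TO: Math.random() * 100,
}));

console.time('js-groupby');
const agg = {};
for (const r of rows) {
  if (!agg[r.ITEM]) agg[r.ITEM] = { ITEM: r.ITEM, TO: 0 };
  agg[r.ITEM].TO += r.TO;
}
console.timeEnd('js-groupby'); // compare against the response time of the FastAPI endpoint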

For large datasets, I would expect a delta that is not negligible.

This demonstrates that you can do simple data processing inside n8n using small JavaScript snippets.

However, our Supply Chain Analytics products can require more advanced processing involving optimisation and advanced statistical libraries.

AI Workflow for Production Planning Optimisation (Image by Samir Saci)

For that, customers can accept dealing with a “black box” approach, as seen in the Production Planning workflow presented in this Towards Data Science article.

But for simpler data processing tasks, we can integrate them into the workflow to provide visibility to non-coding users.

For another project, I use n8n to connect Supply Chain IT systems for the transfer of Purchase Orders using Electronic Data Interchange (EDI).

Example of Electronic Data Interchange (EDI) Parsing Workflow – (Image by Samir Saci)

This workflow, deployed for a small logistics company, entirely parses EDI messages using JavaScript nodes.

Example of Electronic Data Interchange Message – (Image by Samir Saci)

As you can discover in this tutorial, we have performed 100% of the Electronic Data Interchange message parsing using JavaScript nodes.

This helped us improve the robustness of the solution and reduce our workload by handing over the maintenance to the client.

What is the best approach?

For me, n8n should be used as an orchestration and integration tool connected to our core analytics products.

These analytics products require specific input formats that may not align with our customers’ data.

Therefore, I would advise using JavaScript code nodes to perform this preprocessing.

Workflow for Distribution Planning Optimisation Algorithm – (Image by Samir Saci)

For instance, the workflow above connects a Google Sheet (containing input data) to a FastAPI microservice that runs an algorithm for Distribution Planning Optimisation.

The idea is to plug our optimisation algorithm into a Google Sheet used by Distribution Planners to organise store deliveries.

Worksheet used by Planning Teams – (Image by Samir Saci)

The JavaScript code node is used to transform the data collected from the Google Sheet into the input format required by our algorithm.
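
To give you an idea, such a mapping node usually boils down to a few lines like these; the output field names are hypothetical and not the actual schema expected by our algorithm:

// Rename and reshape the Google Sheet columns into the payload expected by the API
return items.map(i => ({
  json: {
    store_id: String(i.json.STORE),
    sku: String(i.json.SKU),
    order_qty: Number(i.json.QTY) || 0,
    delivery_date: i.json.DAY,
  },
}));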

By doing the job inside the workflow, it stays under the control of the client, who runs the workflow in their own instance.

And we can keep the optimisation part in a microservice hosted on our instance.

To better understand the setup, feel free to have a look at this short presentation.

I hope this tutorial and the examples above have given you enough insight to understand what can be done with n8n in terms of data analytics.

Feel free to share your comments about the approach and your thoughts on what could be improved to boost the workflow’s performance.

About Me

Let’s connect on LinkedIn and Twitter. I am a Supply Chain Engineer who uses data analytics to improve logistics operations and reduce costs.

For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.

Find the complete guide for Supply Chain Analytics: Analytics Cheat Sheet.

If you are interested in Data Analytics and Supply Chain, have a look at my website.

Samir Saci | Data Science & Productivity
