How I Optimized My Leaf Raking Strategy Using Linear Programming

It's fall, and it's officially leaf-raking season. As I worked through this tedious task, I noticed that it is essentially one big optimization problem.

When raking my leaves, I made four piles: one on either side of the tree, one near the front, and one on the far side of the yard to catch the sparsely distributed leaves the wind had carried.

As I worked, I started to wonder: how far from optimal was this? Would fewer piles be faster, because I'd bag everything directly? Or more piles, so I wouldn't have to rake as far? Maybe a single back-to-front pass would minimize total time?

Desperate to minimize the time I spent raking, my brain stopped raking and started modeling.

Why Optimization Still Matters

Optimization is a powerful yet often overlooked tool in data science circles now dominated by machine learning. Don't get me wrong, machine learning is great, but sometimes I feel it is easy to skip the efficient solution (optimization algorithms) and jump straight to the fun one (ML techniques).

Hopefully, after reading this, you'll think optimization can be fun, too. I know that once I first learned it, I began to see it everywhere. It literally changed the way I perceived the world around me.

In this article, I'll show how a simple yard chore can be modeled as a linear programming problem. Along the way, we'll:

1) break down the formulation behind a linear program (LP),
2) compare the optimal plan to intuitive human strategies, and
3) see how the identical reasoning scales to other real-world problems.

Defining the Problem

In defining any optimization problem, there are a few key elements:

Objective function: Determining the objective function is simply determining the goal of the problem. The objective function is always designed to maximize or minimize some value. In this problem, we're minimizing the time spent raking and bagging leaves.

Decision variables: The decision variables are the parts of the problem you can control. For our leaf-raking problem, the decision variables are the number of piles the raker decides to make, where those piles are placed, and which leaves are raked into each of those piles.

Constraints: We also have some factors that are out of our control. When we determine constraints, we're asking, "What aspects are beyond my control?" or "What are the rules?" Usually, there are plenty of these. When it comes to raking leaves, there are rules we can't change if the job is to be done well. Every leaf must be raked into a pile, and leaves can only be raked to a location where a pile has already been started. This also means we cannot have an infinite number of piles (maybe we could, but that defeats the purpose of consolidating leaves into piles before bagging). Lastly, we're constrained by the amount of leaves we can fit into each bag, since bags have a limited capacity.

And just like that, we have the essential elements of a linear program.

These elements are present in every linear program formulation, but it can also be useful to define sets and parameters at this stage. We will do this in the following sections.

Because the model uses binary variables to select open piles and integer variables to represent the number of bags (with linear constraints and costs), we formulate it as an integer linear program (ILP). If the assignment of one cell to one pile is relaxed so that a cell may be split between several piles, it becomes a mixed-integer linear program (MILP).

Parameters and Assumptions

At first I assumed I’d just need my raking speed and bagging time.

However, that quickly expanded into a short list, including raking speed, bagging time, leaf distribution, wind direction, and yard slope. I also needed a way to represent each location in the yard.

A Quick Conceptual Experiment

If I were to calibrate this, I'd rake a 100-ft section, mark every 10 ft, and record split times to estimate how raking speed changes with density and distance. My hunch is that as I rake leaves a farther distance, I slow down. Perhaps I can rake a pound of leaves one foot in less than half the time that I can rake a pound of leaves two feet.

Likewise, I could time bagging for piles of different sizes: ¼ bag, ½ bag, ¾ bag, and a full bag-sized pile. Then, I could fit a function to show how bagging time grows with leaf volume. I suspect it is roughly linear.
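That linear fit could be done in a few lines. The numbers below are made-up calibration estimates (pile size as a fraction of a bag vs. bagging seconds), purely to sketch the idea:

```python
import numpy as np

# Hypothetical calibration data: pile size (fraction of a bag) vs. bagging time (s).
# These are rough guesses, not real measurements.
pile_fraction = np.array([0.25, 0.5, 0.75, 1.0])
bag_seconds = np.array([35.0, 50.0, 68.0, 80.0])

# Fit a line: time ≈ b1 * volume + b0 (marginal rate plus fixed setup time).
b1, b0 = np.polyfit(pile_fraction, bag_seconds, 1)
print(f"setup time b0 ≈ {b0:.1f} s, marginal rate b1 ≈ {b1:.1f} s per bag-volume")
```

The intercept b₀ plays the role of the per-bag setup time and the slope b₁ the per-volume bagging rate used later in the model.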

I didn't actually time myself doing these tasks. Instead, I made rough estimates. The goal isn't to achieve perfect numbers, but to demonstrate the structure and approach of linear programming.

All data in this article is simulated or estimated from my own yard. No external datasets were involved.

The Full Model

1. Representing the yard

To account for leaf density per square foot and decide which leaves to rake (or assign) to which pile, we discretize the yard into a grid of cells.

Sets:

  • I: set of yard cells (grid cells), indexed by i.
  • J: set of candidate pile sites, indexed by j.

We also define other parameters, such as the yard length, the yard width, the size of a side of a grid square, the center of the tree trunk (from which the leaf distribution is derived), the radius of the tree trunk, the wind direction, the ellipticity of the leaf distribution, the spread of the leaves, the base density of the leaves, an accumulation parameter, and a decay power parameter. All of these factors determine where the leaves sit at the very beginning of the problem, and their default values are listed in Table 1.

Table 1. Default parameter values

Parameter     Description                           Value
L             Yard length                           60 ft
W             Yard width                            40 ft
s             Grid cell size                        1.0 ft
tree_center   Tree center coordinates (x, y)        (15, 20) ft
r_trunk       Tree trunk radius                     1.5 ft
φ             Wind direction                        90°
axis_ratio    Ellipticity of leaf distribution      1.5
σ             Spread (std. dev.) of leaf density    10
ρ₀            Base leaf density                     0.03
A_amp         Wind accumulation amplitude           0.28
p_pow         Decay power in leaf density           2.0

From these parameters and the leaf-density model, we obtain:

  • mᵢ: mass of leaves (in pounds) in cell i ∈ I.
  • dᵢⱼ: distance from the center of cell i to candidate pile site j.

These are computed numerically (in code) and treated as given constants in the optimization model.
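As a sketch of how mᵢ and dᵢⱼ might be computed, here is a simplified version that uses an isotropic Gaussian peaking at the tree (it omits the wind, ellipticity, and decay-power terms of Table 1, so it is illustrative, not the full density model; the pile sites are arbitrary examples):

```python
import numpy as np

# Yard geometry and simplified density parameters (from Table 1).
L_yd, W_yd, s = 60.0, 40.0, 1.0          # yard length, width, cell size (ft)
tree = np.array([15.0, 20.0])            # tree center (ft)
sigma, rho0 = 10.0, 0.03                 # spread and base density (lb/ft^2)

# Cell centers on a regular grid.
xs = np.arange(s / 2, L_yd, s)
ys = np.arange(s / 2, W_yd, s)
cells = np.array([(x, y) for x in xs for y in ys])

# m_i: leaf mass per cell, peaking near the tree (isotropic Gaussian sketch).
r2 = ((cells - tree) ** 2).sum(axis=1)
m = rho0 * np.exp(-r2 / (2 * sigma**2)) * s**2

# d_ij: distance from each cell center to each candidate pile site.
sites = np.array([(10.0, 20.0), (20.0, 20.0), (40.0, 10.0)])  # example sites
d = np.linalg.norm(cells[:, None, :] - sites[None, :, :], axis=2)
print(m.shape, d.shape)
```

The resulting arrays m (one mass per cell) and d (one distance per cell–site pair) are exactly the constants the model consumes.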

2. Additional model parameters (timing / capacity)

  • α > 0: raking-time scaling factor; a baseline for how long it takes to rake a small mass of leaves over a short distance. Higher α corresponds to a lower overall raking speed.
  • β > 0: distance scaling parameter that models how raking effort grows with distance (i.e., raking becomes slower as leaves are moved farther).
  • b₀ ≥ 0: fixed setup time per bag (seconds per bag opened).
  • b₁ ≥ 0: bagging time per pound of leaves (seconds per lb).
  • C > 0: bag capacity (lb per bag).
  • Kₘₐₓ ∈ ℕ₀: maximum number of piles that may be opened.

We also define the derived pile mass mⱼ as the total mass of leaves assigned to pile j:

mⱼ = ∑ᵢ mᵢ xᵢⱼ

3. Decision Variables

Now we have all of the parameters needed to represent the leaves distributed across the yard, along with the parameters that govern the mechanics of how those leaves can be raked and bagged. Let's move on to our decision variables: raking (assigning) leaves to piles, opening piles, and the number of bags used.

Assignment variables

To determine which leaves will be raked to which piles, we set up an assignment variable as follows:

xᵢⱼ ∈ {0, 1}

for all i ∈ I, j ∈ J: xᵢⱼ = 1 if cell i is raked to pile j, else 0.

Pile-open variables

To determine where to rake leaves, we define a pile-opening decision variable:

yⱼ ∈ {0, 1}

for all j ∈ J: yⱼ = 1 if pile j is opened (used), else 0.

Bag-count variables

Finally, to make sure we have enough bags to hold the leaves at each pile, we have a bag-count variable for each pile, defined as:

nⱼ ∈ ℕ₀

for all j ∈ J: integer number of bags used at pile j.

Every cell's leaves must be fully assigned to exactly one pile (via the xᵢⱼ's), and bag counts nⱼ must be sufficient to hold the mass assigned to each open pile.

4. Integrality Conditions

Next, we declare the variable domains explicitly. This is just notation (also used above), so don't be confused by it. All it says is that each yard cell's leaves either are assigned to pile j or are not, each pile either is opened or is not, and the number of bags used at pile j must be a nonnegative integer.

Remember, i indexes the cells of leaves in the yard and j indexes the candidate pile locations.

xᵢⱼ ∈ {0, 1}, for all i ∈ I, j ∈ J
yⱼ ∈ {0, 1}, for all j ∈ J
nⱼ ∈ ℕ₀,  for all j ∈ J

In Python, we specify these as:

# Decision variables (B[j] plays the role of n_j in the formulation)
x = [[pulp.LpVariable(f"x_{i}_{j}", cat="Binary") for j in range(m)] for i in range(n)]
y = [pulp.LpVariable(f"y_{j}", cat="Binary") for j in range(m)]
B = [pulp.LpVariable(f"B_{j}", lowBound=0, cat="Integer") for j in range(m)]

5. Objective Function

Raking time depends on mass and distance; bagging time depends on pile mass and the number of bags.
We minimize total time:

minimize over x, y, n:

∑ᵢ∈ᴵ ∑ⱼ∈ᴶ xᵢⱼ · α · mᵢ · dᵢⱼ^β + ∑ⱼ∈ᴶ ( b₀ · nⱼ + b₁ · mⱼ )

The first term approximates raking effort: mass × distance^β, scaled by α. The second term adds bag setup time (b₀ times the number of bags nⱼ) and bagging time proportional to pile mass mⱼ via b₁.

In code, this is implemented as:

∑ᵢ,ⱼ xᵢⱼ ( α mᵢ dᵢⱼ^β + b₁ mᵢ ) + b₀ ∑ⱼ nⱼ,

which is algebraically identical, since

∑ᵢ,ⱼ xᵢⱼ b₁ mᵢ = b₁ ∑ⱼ ∑ᵢ mᵢ xᵢⱼ = b₁ ∑ⱼ mⱼ.
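As a quick sanity check, the identity between the two objective forms can be verified numerically. The numbers below are random made-up values, not the article's calibrated parameters:

```python
import numpy as np

# Verify numerically that the two objective forms agree for a random assignment.
rng = np.random.default_rng(0)
n_cells, n_piles = 50, 4
m_i = rng.uniform(0.1, 2.0, n_cells)            # lb of leaves per cell
d = rng.uniform(1.0, 40.0, (n_cells, n_piles))  # ft from each cell to each pile
alpha, beta, b0, b1 = 0.5, 1.3, 20.0, 0.8       # illustrative parameters

# Random feasible assignment: each cell goes to exactly one pile.
x = np.zeros((n_cells, n_piles))
x[np.arange(n_cells), rng.integers(0, n_piles, n_cells)] = 1
n_bags = np.ones(n_piles)                        # placeholder bag counts

# Form 1: raking term plus per-pile bagging on pile masses m_j.
m_j = m_i @ x
form1 = (x * alpha * m_i[:, None] * d**beta).sum() + (b0 * n_bags + b1 * m_j).sum()

# Form 2: fold b1 * m_i into the per-cell sum, as in the code.
form2 = (x * (alpha * m_i[:, None] * d**beta + b1 * m_i[:, None])).sum() + b0 * n_bags.sum()
assert np.isclose(form1, form2)
```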

6. Constraints

Now recall the constraints we discussed earlier. All we're doing here is taking those same constraints and defining them in formal math so that we can formulate and solve the full problem.

(C1) Assignment

Each cell's leaves must be assigned to exactly one pile:

∑ⱼ∈ᴶ xᵢⱼ = 1, for all i ∈ I.

for i in range(n):
    prob += pulp.lpSum(x[i][j] for j in range(m)) == 1

(C2) Linking: use a pile only if it is opened

The leaves in a cell can't be assigned to a location without an opened pile. We define this constraint as:

xᵢⱼ ≤ yⱼ, for all i ∈ I, j ∈ J.

for i in range(n): 
    for j in range(m): 
        prob += x[i][j] <= y[j]

(C3) Bag capacity

Total mass assigned to pile j must not exceed the capacity of its bags:

∑ᵢ∈ᴵ mᵢ xᵢⱼ = mⱼ ≤ C nⱼ, for all j ∈ J.

# In this snippet, B[j] represents n_j
for j in range(m): 
    prob += pulp.lpSum(masses[i] * x[i][j] for i in range(n)) <= bag_capacity_lb * B[j]

(C4) Maximum number of piles

We limit the total number of piles that can be opened:
∑ⱼ∈ᴶ yⱼ ≤ Kₘₐₓ.

prob += pulp.lpSum(y[j] for j in range(m)) <= K_max

The Full Model

Putting all of it together, we get:

minimize over x, y, n:

∑ᵢ∈ᴵ ∑ⱼ∈ᴶ xᵢⱼ · α · mᵢ · dᵢⱼ^β + ∑ⱼ∈ᴶ ( b₀ · nⱼ + b₁ · mⱼ )

subject to:

∑ⱼ∈ᴶ xᵢⱼ = 1, for all i ∈ I.

xᵢⱼ ≤ yⱼ, for all i ∈ I, j ∈ J.

∑ᵢ∈ᴵ mᵢ xᵢⱼ = mⱼ ≤ C nⱼ, for all j ∈ J.

∑ⱼ∈ᴶ yⱼ ≤ Kₘₐₓ.

This completes the model specification.
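To make the formulation concrete, here is a brute-force solve of a tiny toy instance (3 cells, 2 candidate sites) in plain Python. It enumerates every feasible assignment and picks the cheapest, choosing the smallest feasible bag count per open pile; all numbers are made up, and the real model is solved with PuLP rather than enumeration:

```python
import itertools
import math

# Toy instance: 3 cells, 2 candidate pile sites, illustrative parameters.
m = [1.0, 2.0, 1.5]                        # lb of leaves per cell
d = [[2.0, 8.0], [5.0, 5.0], [9.0, 3.0]]   # ft from cell i to site j
alpha, beta, b0, b1, C, K_max = 0.5, 1.3, 20.0, 0.8, 30.0, 2

best = (math.inf, None)
for assign in itertools.product(range(2), repeat=3):   # pile choice per cell (C1)
    opened = set(assign)
    if len(opened) > K_max:                            # (C4), never binds here
        continue
    pile_mass = {j: sum(m[i] for i in range(3) if assign[i] == j) for j in opened}
    n_bags = {j: math.ceil(pile_mass[j] / C) for j in opened}   # smallest n_j meeting (C3)
    rake = sum(alpha * m[i] * d[i][assign[i]] ** beta for i in range(3))
    bag = sum(b0 * n_bags[j] + b1 * pile_mass[j] for j in opened)
    if rake + bag < best[0]:
        best = (rake + bag, assign)
print(best)
```

With these numbers the optimum sends all three cells to the second site: the fixed per-bag setup cost b₀ makes opening a second pile more expensive than the extra raking distance, which is exactly the trade-off the ILP balances.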

For the full implementation (calibration, solver setup, and plots), see the project repository: Leaf-Raking Optimization Code.

Testing Easy Heuristics

A heuristic is a "rule of thumb" approach to solving a problem. When most people rake their yards, they use a simple heuristic, whether or not they realize it.

To measure the value added by optimization, I compared the ILP to three simple heuristics:

  • Front Sweep: a continuous rake from the back of the yard forward.
  • Micro-piles: many small piles near leaf-dense areas.
  • Back-Front-Centers: just a few large piles in central spots.

Each heuristic returns a set of pile centers based on simple rules. Once these piles are placed, we assume that the raker will rake each cell of leaves to the nearest pile. Note that under the optimization, the raker can rake leaves to a pile even when that pile is not the nearest.
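Evaluating a heuristic under that nearest-pile rule can be sketched as follows. The helper `heuristic_time` and all numbers here are hypothetical, not taken from the repository:

```python
import numpy as np

def heuristic_time(cells, masses, centers,
                   alpha=0.5, beta=1.3, b0=20.0, b1=0.8, C=30.0):
    """Total time if every cell is raked to its nearest pile center (sketch)."""
    d = np.linalg.norm(cells[:, None, :] - centers[None, :, :], axis=2)
    nearest = d.argmin(axis=1)                      # nearest-pile rule
    rake = (alpha * masses * d[np.arange(len(masses)), nearest] ** beta).sum()
    pile_mass = np.bincount(nearest, weights=masses, minlength=len(centers))
    bags = np.ceil(pile_mass / C)                   # smallest bag count per pile
    return rake + (b0 * bags + b1 * pile_mass).sum()

# Illustrative cells, masses, and heuristic pile centers.
cells = np.array([[5.0, 5.0], [30.0, 20.0], [55.0, 35.0]])
masses = np.array([2.0, 3.0, 1.0])
centers = np.array([[10.0, 10.0], [50.0, 30.0]])
print(round(heuristic_time(cells, masses, centers), 1))
```

Because the ILP is free to assign a cell to a farther pile when that lowers total bagging cost, its objective value can only be at or below this nearest-pile figure for the same centers.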

Formulating the problem as a linear program before creating formal heuristics is essential to ensure that the values returned by the heuristics are feasible and align with the optimization objective.

Even when solving an optimization problem is impractical, formulating one is often very helpful.

Below is an example snippet showing how the “micro-piles” heuristic initializes and refines centers based on leaf density:

Example: Micro-piles heuristic

import numpy as np

def centers_micro(cells, masses, bag_capacity_lb, seed=42):
    rng = np.random.default_rng(seed)
    M_total = masses.sum()
    # Aim for roughly one pile per two bags' worth of leaves
    k = max(1, int(round(M_total / (2 * bag_capacity_lb))))
    # Sample initial centers, weighted by leaf density
    probs = masses / (M_total + 1e-12)
    centers = cells[rng.choice(len(masses), size=k, replace=False, p=probs)]
    # Iteratively split dense clusters (full logic in the repository)
    for _ in range(8):
        ...
    return centers

Full implementations of all heuristics are available in the repository under src/core/centers.py and src/core/front_sweep.py.

Results

Figure 1: Total time taken to rake the yard by each strategy

The solution found by the optimization identifies five piles, mostly centered around the tree. Its improvement over the simple heuristics is notable but not dramatic.

Notice that there is no massive optimality gap between the micro-piles method and the optimized plan. This illustrates the power of heuristic methods, particularly when they are benchmarked against an optimal solution to quantify their performance.

Figure 2: Heatmap of heuristic vs. optimal raking (image by author). Rendered GIF visible on GitHub.

Conclusion

In many situations, solving the full linear program is too computationally expensive, so we must use a heuristic that is "close enough" to optimal. Even for a simple problem like this, I had to cut the solver off once it reached an optimality gap of 5% from the lower bound. In activities as commonplace and trivial as raking leaves, we use heuristics all the time. Perhaps we lose around 2.5 minutes by raking suboptimally, but we save hours by not having to formulate and code a linear program.

In many other applications, however, a few hours (or days) of math and code can save several weeks, or tens of millions of dollars. Think, for example, of the money and time saved by improving the routing of planes to various airports around the globe, or of trucks across the country.

These problems are all around us, even when we don't solve them with explicit optimization methods. Optimization can serve as an effective method for translating everyday problems into a robust decision-making framework.

To summarize the process: 1) identify the discrete and continuous decisions you control, 2) determine the parameters of the problem, 3) define the objective (what you minimize or maximize) clearly, 4) state constraints explicitly (capacity, assignment, etc.), and 5) formulate and solve the problem.

Once you internalize this process, you'll spot optimization opportunities everywhere.

References

[1] Leaf-raking optimization code and data simulation (2025). GitHub repository: https://github.com/Josiah-DeValois/Optimize-Leaf-Raking

Author note

If you enjoyed this, I write about analytical reasoning, decision science, optimization, and data science.
