Best Fit Bezier Curve

During my grad school days I came a cross a pretty fun math problem.

I was working with digital ink data, that is, time-stamped x-y coordinates which represented handwritten pen strokes. We wanted to compute the best-fit Bezier curve for a portion of any arbitrary pen stroke. In my googling I only came across approximations, e.g., gradient descent, simulated annealing, etc. These types of solutions are typically non-deterministic, slow, and do not guarantee the best solution. For that reason, I worked out a best-fit solution to this problem, very similar to the least squares approach to linear regression. In this article, I will walk through all the math I did, but if you want to jump to the end, I put it all in a code snippet on Source Forge.

Lets start by defining a cubic Bezier curve:

diagram1.png

Each c is a two dimensional Euclidean point and t ∈ [0,1] as illustrated in the figure below.

A simple cubic Bezier curve and its four control points.

A simple cubic Bezier curve and its four control points.

What is a simple/intuitive explanation of what this curve is doing? At t = 0, B(0) = p1 and the curve follows a trajectory towards the control point p2. As t increases though, the "influence" will shift, such that the curve follows a trajectory towards p4 coming from the control point p3. Finally, at t = 1, B(1) = p4.

To make it easier to do the least-squares derivation, we express the Bezier function in terms of matrix operations. Namely, given the matrices:

diagram3.png
diagram4.png
diagram5.png

We can write the function for a Bezier curve as:

diagram6.png

Next, lets define the input data we are trying to fit. Namely, we have a series of, n, Cartesian points comprising a segment of a handwritten stroke:

diagram7.png

Such that:

diagram8.png

If we consider the x values separately from the y values, we effectively have two vectors:

diagram9.png
diagram10.png

We need a way to synchronize the points in the stroke segment with points in the Bezier curve when computing the error of fit. Namely, we need an index into the Bezier curve for each point along the handwritten segment. To do this, we use the path length of the segment, defined at the ith point in the stroke as:

diagram11.png

With d1 = 0. We can then define a vector that maps the index, i of each point in the stroke to a corresponding index, ti, in the Bezier curve.

diagram12.png

We fit the y-values first and then repeat the process for the x-values.

As in any least squares problem, we need an equation that computes the error of fit for a given Bezier curve, defined by its control points P to the given y-values, E(P). I use the residual sum of squares, summing the squared distance from each handwritten point to its Bezier curve approximation:

diagram13.png

If we define the matrix:

diagram14.png

Then we can rewrite the error equation and find its maximum:

diagram15.png

Taking the derivative and setting it equal to zero to find the maxima:

Finally, solving for Py:

diagram17.png

And that's Howie do it. If you want to know the y-components of the best fit Bezier curve for our n, simply apply the transformations shown in the last equations. Similarly, to find the x-components, replace the y-vector with x:

I received several requests for code and have thus created a Source Forge project that hosts a JAVA implementation of this code. It requires the UJMP package to work.

Jim Herold

Jim Herold is a Catholic, Husband, Father, and Software Engineer.

He has a PhD in Computer Science with a focus on machine learning and how it improves natural, sketch-based interfaces.

Jim researched at Harvey Mudd and UC Riverside, taught at Cal Poly Pomona and UC Riverside, and worked at JPL NASA and Google.

Previous
Previous

Greedy Number Partitions in JAVA