The Challenge
The goal was simple to state but hard to achieve: bridge the analog and digital worlds of mathematics. We wanted to build a system that could read a handwritten equation from a sheet of paper, much as a student would, and interpret it immediately.
The core difficulty was not recognizing digits, which is largely a solved problem on MNIST, but reliably distinguishing mathematical operators (+, -, x) from digits in noisy, real-world conditions, with only limited training data for the symbols.
Technical Architecture
1. The "7-Second" Pipeline
A custom OpenCV loop opens a stabilization window. The user has 7 seconds to align the equation within a green region of interest (ROI) before the frame is captured automatically.
2. Intelligent Router
Why use one model when you can use two? We use HSV color segmentation to route each character: black ink goes to the CNN, red ink goes to the SVM.
3. Hybrid Inference
A CNN (TensorFlow) handles digit recognition. An SVM (scikit-learn) handles symbol classification, where the datasets are smaller.
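The stabilization window from step 1 could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual loop: the function names (`centered_roi`, `seconds_left`, `capture_after_countdown`), the ROI proportions, the window title, and the camera index are all hypothetical. Only the 7-second window and the green ROI come from the description above.

```python
import time

CAPTURE_DELAY_S = 7.0  # alignment window described above


def centered_roi(frame_w, frame_h, frac_w=0.75, frac_h=0.5):
    """Return (x0, y0, x1, y1) of a centered box. The fractions are assumed values."""
    w, h = int(frame_w * frac_w), int(frame_h * frac_h)
    x0, y0 = (frame_w - w) // 2, (frame_h - h) // 2
    return x0, y0, x0 + w, y0 + h


def seconds_left(started_at, now, delay=CAPTURE_DELAY_S):
    """Time remaining before auto-capture, clamped at zero."""
    return max(0.0, delay - (now - started_at))


def capture_after_countdown(camera_index=0):
    """Show a live preview with a green ROI, then return the cropped ROI (or None)."""
    import cv2  # imported here so the helpers above work without OpenCV installed

    cap = cv2.VideoCapture(camera_index)
    started_at = time.monotonic()
    crop = None
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break  # camera unavailable or stream ended
            h, w = frame.shape[:2]
            x0, y0, x1, y1 = centered_roi(w, h)
            remaining = seconds_left(started_at, time.monotonic())
            if remaining <= 0.0:
                # Auto-capture: crop before drawing so the overlay is not in the image
                crop = frame[y0:y1, x0:x1].copy()
                break
            cv2.rectangle(frame, (x0, y0), (x1, y1), (0, 255, 0), 2)  # green in BGR
            cv2.putText(frame, f"{remaining:.0f}s", (x0, y0 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
            cv2.imshow("Align equation", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break  # user aborted
    finally:
        cap.release()
        cv2.destroyAllWindows()
    return crop
```

Keeping the geometry and timing in small pure functions makes them easy to check without a camera attached. Only `capture_after_countdown` touches OpenCV.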
Deep Dive: The "Smart Router" Logic
One of the most innovative decisions in this project was to avoid a single monolithic model. Instead, we implemented a heuristic-based routing system inspired by how humans use color to highlight importance.
- Black Ink (Digits): Sent to a custom CNN trained on 60,000 MNIST images. Deep learning excels here due to the variety of handwriting styles.
- Red Ink (Operators): Sent to an SVM. We found that for simple geometric shapes (+, -, x) with limited training data, SVMs generalize better than deep networks.
Final Step: SymPy
Once the string is assembled (e.g., "2*x+4=0"), we pass it to SymPy to solve it algebraically, which lets us solve for x rather than merely evaluate an arithmetic expression.
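As a rough sketch of this step: SymPy's parser does not accept "=" inside an expression, so one way to handle the assembled string is to split it into two sides and build an `Eq`. The helper name `solve_equation_string` and the rule that a string without "=" means "expression = 0" are assumptions made for illustration.

```python
import sympy


def solve_equation_string(text, symbol_name="x"):
    """Solve 'lhs=rhs' for one symbol. A string without '=' is treated as 'expr=0'."""
    symbol = sympy.Symbol(symbol_name)
    lhs_text, _, rhs_text = text.partition("=")
    lhs = sympy.sympify(lhs_text)
    rhs = sympy.sympify(rhs_text) if rhs_text else sympy.Integer(0)
    return sympy.solve(sympy.Eq(lhs, rhs), symbol)


print(solve_equation_string("2*x+4=0"))  # [-2]
```

Note that `sympify` evaluates its input, so it should only ever see strings produced by the recognizer, never arbitrary user text.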
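# Pseudo-code logic from digitRecognition.py
import numpy as np
import sympy

x = sympy.Symbol("x")
equation_string = ""

for char_img in contours:
    # 1. Analyze color in HSV space
    if is_predominantly_red(char_img):
        # ROUTE TO SVM (symbols)
        # Preprocessing: centering based on aspect ratio
        flat_img = preprocess_symbol(char_img).flatten()
        # scikit-learn expects a 2D array of shape (n_samples, n_features)
        prediction = svm_model.predict(flat_img.reshape(1, -1))[0]
        equation_string += symbol_map[prediction]
    else:
        # ROUTE TO CNN (digits)
        # Preprocessing: MNIST-style 28x28 padding
        # norm_img must carry a leading batch axis, e.g. shape (1, 28, 28, 1)
        norm_img = preprocess_digit(char_img)
        prediction = cnn_model.predict(norm_img)
        equation_string += str(np.argmax(prediction))

# 2. Solve algebraically
# SymPy cannot parse "=" directly, so split into two sides (assumes exactly one "=")
lhs, rhs = equation_string.split("=")
solution = sympy.solve(sympy.Eq(sympy.sympify(lhs), sympy.sympify(rhs)), x)
print(f"Solution: {solution}")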
Current State: Prototype Phase
This project is currently an early prototype. The main limitation of the current version is its strict requirement for color separation: digits and symbols must be written in different colors (e.g., black numbers, red operators) for the heuristic segmentation to work.
Roadmap / Future Work
The next major update will focus on replacing the color-based heuristic with a robust end-to-end OCR model (such as a CRNN or Transformer) capable of recognizing full equations in monochrome, making the app usable in real-world conditions.
Project Context
Developed as a university group project to explore the internal logic of Computer Vision and Pattern Recognition. The goal was to build a functional "teacher's assistant" tool for digitizing and checking math problems without relying on "black box" external APIs.