Scan This Picture And Index The Whole Video/Document/Ppt/Textbook!

image_scanning
math_formulas
non_max_suppression
ocr
copyright
chargrid
imagematch
This article collects notes on scanning images and documents that contain mathematical formulas: collapsing overlapping detections with non-max suppression, character-level OCR (Chargrid OCR), image search with Image Match (used for copyright violation detection), and tools that turn formula pictures into LaTeX with as little manual formatting as possible.
Published

August 25, 2022


non-max suppression for collapsing similar (overlapping) bounding boxes into one

the lib:

from imutils.object_detection import non_max_suppression
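
To make the idea concrete, here is a minimal plain-Python sketch of greedy non-max suppression. It is not the imutils implementation: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the 0.3 threshold are assumptions chosen for illustration. The highest-scoring box is kept, and any remaining box that overlaps a kept box too much is dropped.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.3):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than iou_thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return [boxes[i] for i in keep]


# two near-duplicate detections of one formula, plus one separate detection
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [(10, 10, 50, 50), (100, 100, 140, 140)]
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed. The third box does not overlap anything and survives.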

basically greek letters

maybe you can cover another large range of symbols just by enabling the system to search in Greek?

could also search among math symbols, i.e. do math OCR.

kind reminders

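when building Python C++ libraries on macOS without Xcode (Command Line Tools only), the build has to be able to find the Python header files, for example:

/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/Headers/Python.h

to make them visible during the build, symlink the Command Line Tools directory to the place where Xcode's Developer directory would normally be: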


ln -s /Library/Developer/CommandLineTools /Applications/Xcode.app/Contents/Developer
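
Before starting a build, it can help to check where the running interpreter expects its headers and whether `Python.h` is actually there. The sketch below uses only the standard library. The helper name `find_python_header` is made up for this note, and the directory it reports depends on which Python runs it, so it may not be the Command Line Tools path shown above.

```python
import os
import sysconfig


def find_python_header():
    """Return (include_dir, header_exists) for the running interpreter."""
    include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return include_dir, os.path.isfile(header)


include_dir, found = find_python_header()
print(include_dir, "-> Python.h found" if found else "-> Python.h missing")
```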

search by image instead of cranking latex out

image search libraries

character-level optical character segmentation, called Chargrid OCR

Image Match, used for copyright violation detection, using the phash algorithm

## get the latex out

### first detect the location of math formula

scanning single shot detection for math formulas

dataset for math symbol detection

detect different parts of a document with yolov3

math expression detection

### next find the tool for picture to latex conversion

Pix2Text (P2T) is a newly open-sourced Python tool that aims to be an open-source Python alternative to Mathpix. It can now recognize math formulas in screenshots and convert them to LaTeX, and it can also recognize Chinese and English text in images. Online demo: https://huggingface.co/spaces/breezedeus/pix2text . GitHub: https://github.com/breezedeus/pix2text , Gitee: https://gitee.com/breezedeus/pix2text .

attention based math ocr

gui of image2latex

im2latex-tensorflow

im2markup, with only nvidia support, using torch; the model for latex conversion can be found here

deeplearning picture to latex

pix2tex

using pix2tex

Chinese formulas, handwritten formula recognition (needs further training)

Chinese, handwritten, PyTorch version

## mask latex area and get conventional things out

easyocr with pytorch support

## search the formula

formula search based on sympy

can we render latex to picture with sympy?

latex search engine
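
One way to search formulas without producing LaTeX at all is the image route noted earlier: hash each formula crop perceptually and compare hashes by Hamming distance. Below is a rough plain-Python sketch of a textbook DCT-based pHash. It is not Image Match's own code. It assumes the crop has already been converted to a 32x32 grayscale grid (loading and resizing need an image library and are left out), and the names `dct_1d`, `dct_2d`, `phash` and `hamming` are made up for this note.

```python
import math


def dct_1d(v):
    """Unnormalised DCT-II of a list of numbers."""
    n = len(v)
    return [sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]


def dct_2d(m):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in m]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def phash(gray, hash_size=8):
    """64-bit hash from the low-frequency corner of the DCT.
    gray: square 2D list of grayscale values, side >= hash_size."""
    coeffs = dct_2d(gray)
    low = [coeffs[y][x] for y in range(hash_size) for x in range(hash_size)]
    med = sorted(low)[len(low) // 2]  # upper-middle element as the median
    bits = 0
    for c in low:
        bits = (bits << 1) | (1 if c > med else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


# synthetic 32x32 "image" and a uniformly brightened copy of it
base = [[(x * y * 7 + x * 13 + y * 29) % 256 for x in range(32)] for y in range(32)]
brighter = [[v + 10 for v in row] for row in base]
print(hamming(phash(base), phash(brighter)))
```

A uniform brightness shift only moves the DC coefficient, so the two hashes should stay very close. Formula crops whose hashes are within a small Hamming distance of each other are candidates for the same formula.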