Scan This Picture And Index The Whole Video/Document/PPT/Textbook!
This article collects techniques for scanning images and documents that contain mathematical formulas: merging overlapping detections with non-max suppression, searching by image with libraries such as Chargrid OCR or Image Match (a tool used for copyright-violation detection), and converting formula images to LaTeX to minimize manual formatting.
non-max suppression for combining similar bounding boxes
the lib:
from imutils.object_detection import non_max_suppression
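To illustrate what non-max suppression does with overlapping detections, here is a minimal pure-Python sketch. It is a hand-written greedy variant based on intersection-over-union, not the imutils implementation; the function names, the (x1, y1, x2, y2) box format, and the 0.3 threshold are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.3):
    """Keep the highest-scoring box of each group of heavily overlapping boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Accept a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return [boxes[i] for i in keep]


# Two near-duplicate detections of one formula plus one separate detection.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

The near-duplicate box is dropped because it overlaps the higher-scoring one by far more than the threshold, while the distant box survives.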
the symbols are basically Greek letters
maybe you can cover another large range of symbols just by enabling the system to search in Greek?
you could also search among math symbols, i.e. do math OCR.
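One way to make Greek letters searchable is to normalize them to LaTeX command names before indexing. The sketch below assumes the OCR output arrives as a Unicode string; the lookup table and the `to_search_tokens` helper are hypothetical and cover only a handful of letters.

```python
# Hypothetical lookup table: a few Greek characters and their LaTeX commands.
GREEK_TO_LATEX = {
    "α": r"\alpha", "β": r"\beta", "γ": r"\gamma", "θ": r"\theta",
    "λ": r"\lambda", "π": r"\pi", "σ": r"\sigma",
    "Σ": r"\Sigma", "Δ": r"\Delta",
}


def to_search_tokens(text):
    """Split text into single-character tokens, mapping known Greek letters to LaTeX."""
    tokens = []
    for ch in text:
        if ch.isspace():
            continue  # whitespace carries no meaning for the index
        tokens.append(GREEK_TO_LATEX.get(ch, ch))
    return tokens


print(to_search_tokens("α + β = π"))
```

With such tokens, a query typed as LaTeX and a query typed in Greek characters end up in the same index vocabulary.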
friendly reminders
when building Python C++ libraries without Xcode, add the Command Line Tools header files to the include path.
to make this header available during the build:
/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/Headers/Python.h
we need to do:
1 |
|
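A minimal sketch of one way to expose that header directory, assuming the build honors the `CPLUS_INCLUDE_PATH` environment variable (GCC and Clang do). Using `sysconfig` to locate the directory and the variable name `export_line` are assumptions for illustration, and the directory printed depends on which interpreter runs the script.

```python
import os
import sysconfig

# Directory that should contain Python.h for the interpreter running this script.
include_dir = sysconfig.get_paths()["include"]
header = os.path.join(include_dir, "Python.h")
print("Python.h present:", os.path.exists(header), "->", header)

# Assumption: the compiler reads CPLUS_INCLUDE_PATH.
# Print a shell line that can be pasted before starting the build.
export_line = f'export CPLUS_INCLUDE_PATH="{include_dir}:$CPLUS_INCLUDE_PATH"'
print(export_line)
```

Run it with the Command Line Tools python3 so that the reported directory matches the header path above.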
search by image instead of cranking latex out
image search libraries
Chargrid OCR: character-level optical character segmentation
Image Match: used for copyright violation detection, built on a perceptual image signature similar to pHash
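The idea behind searching by perceptual hash can be sketched in a few lines. The example below uses a simplified difference hash on a tiny grayscale grid; it is not pHash (which relies on a DCT) and not the Image Match API, and it assumes the image has already been downscaled to a small grid. The names `dhash_bits` and `hamming` are illustrative.

```python
def dhash_bits(gray):
    """Difference hash: 1 where a pixel is brighter than its right neighbor."""
    return [1 if left > right else 0
            for row in gray
            for left, right in zip(row, row[1:])]


def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))


original = [[10, 20, 30, 40], [40, 30, 20, 10], [5, 50, 5, 50]]
brighter = [[p + 10 for p in row] for row in original]  # same picture, brighter
different = [[40, 30, 20, 10], [10, 20, 30, 40], [50, 5, 50, 5]]

dist_same = hamming(dhash_bits(original), dhash_bits(brighter))
dist_other = hamming(dhash_bits(original), dhash_bits(different))
print(dist_same, dist_other)
```

A uniformly brightened copy hashes identically because only the ordering of neighboring pixels matters, which is why small Hamming distances are a usable signal for near-duplicate images.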
get the latex out
first, detect the location of the math formula
scanning single shot detector for math formulas (ScanSSD)
dataset for math symbol detection
detect different parts of a document with YOLOv3
math expression detection
next, find a tool for picture-to-LaTeX conversion
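Pix2Text (P2T), a newly open-sourced Python tool, aims to be an open-source Python alternative to Mathpix. It can now recognize mathematical formulas in screenshots and convert them to LaTeX, and it can also recognize Chinese and English text in images. Online demo: https://huggingface.co/spaces/breezedeus/pix2text . GitHub: https://github.com/breezedeus/pix2text , Gitee: https://gitee.com/breezedeus/pix2text .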
attention-based math OCR
GUI for image2latex
im2latex-tensorflow
im2markup, built on Torch with NVIDIA-only support; a model for LaTeX conversion is available
deep learning picture to LaTeX
pix2tex
using pix2tex
Chinese formulas and handwritten formula recognition need further training
Chinese, handwritten: PyTorch version
mask the LaTeX areas and extract the conventional content
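Masking can be as simple as painting the detected formula boxes white before handing the page to an ordinary OCR engine. The sketch below works on a 2D list of grayscale values; the `mask_regions` helper, the (x1, y1, x2, y2) box format with exclusive upper bounds, and the fill value are assumptions for illustration.

```python
def mask_regions(gray, boxes, fill=255):
    """Return a copy of a 2D grayscale image with every box filled with `fill`."""
    masked = [row[:] for row in gray]  # copy so the input page is untouched
    height = len(masked)
    width = len(masked[0]) if height else 0
    for x1, y1, x2, y2 in boxes:
        # Clamp each box to the image bounds before painting.
        for y in range(max(0, y1), min(height, y2)):
            for x in range(max(0, x1), min(width, x2)):
                masked[y][x] = fill
    return masked


page = [[0] * 6 for _ in range(4)]   # tiny all-black "page"
formula_boxes = [(1, 1, 4, 3)]       # one detected formula region
masked = mask_regions(page, formula_boxes)
for row in masked:
    print(row)
```

The formula crops can be sent to a picture-to-LaTeX model separately, while the masked page goes through conventional text OCR.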
search the formula
formula search based on SymPy
can we render LaTeX to a picture with SymPy? (sympy.preview looks like a candidate; it needs a LaTeX installation)
LaTeX search engine
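As a rough picture of what a LaTeX search engine does, here is a toy ranking by token overlap. It is neither SymPy-based nor modeled on any existing engine; the tokenizer regex, the Jaccard scoring, and the sample corpus are all assumptions for illustration.

```python
import re

# A token is a LaTeX command, a single letter or digit, or one other symbol.
# Braces and whitespace are ignored.
TOKEN = re.compile(r"\\[A-Za-z]+|[A-Za-z0-9]|[^\sA-Za-z0-9{}]")


def tokenize(latex):
    """Return the set of tokens appearing in a LaTeX string."""
    return set(TOKEN.findall(latex))


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def search(query, corpus):
    """Rank corpus formulas by token overlap with the query, best first."""
    q = tokenize(query)
    scored = [(jaccard(q, tokenize(formula)), formula) for formula in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


corpus = [r"E = m c^2", r"\frac{a}{b} + \alpha", r"\int_0^1 x^2 dx"]
results = search(r"\alpha + \frac{x}{y}", corpus)
for score, formula in results:
    print(f"{score:.2f}  {formula}")
```

Token overlap ignores structure, so it cannot tell that two differently written expressions are mathematically equal; a symbolic comparison would be needed for that.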