Volume 2 - Issue 1, January - February 2026
📑 Paper Information
| 📑 Paper Title | Automated Extraction of Structured Data from Unstructured Documents Using Hybrid OCR + NLP |
| 👤 Authors | Dipti Kumari, Harshit |
| 📘 Published Issue | Volume 2 Issue 1 |
| 📅 Year of Publication | 2026 |
| 🆔 Unique Identification Number | IJAMRED-V2I1P85 |
📝 Abstract
As industries worldwide become fully computerized, the difficulty of processing unstructured documents such as statements, medical records, and legal contracts has become one of the main obstacles to operational planning. Conventional Optical Character Recognition (OCR) can convert pixels into text, but it cannot preserve semantic hierarchy or field relationships. This research presents a hybrid framework that combines cutting-edge OCR with multimodal Natural Language Processing (NLP) mechanisms and architectures. By embedding the spatial layout of each document and adding a deep NLP-based verification layer, our system attains a detailed understanding of document structure. We apply a scalable, Spark-based cloud processing system and a LayoutLM-based transformer model for entity recognition. An investigative analysis on more than 10,000 mixed documents shows that the approach substantially improves data accuracy, attaining a score of 95.4 and saving over 70 percent of the time spent on manual validation.
Index Terms: Optical Character Recognition (OCR), Natural, LayoutLM, NLP, Multimodal AI, Data, Hypotomony, Deep Learning, Document Intelligence.
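The abstract names two steps without giving implementation details: feeding layout positions to a LayoutLM-based model and running a verification layer over the extracted fields. The following is a minimal sketch of what those steps might look like, not the authors' implementation. It assumes OCR output arrives as words with pixel bounding boxes, and relies on the convention that LayoutLM-style models take boxes scaled to a 0-1000 grid. The names `OcrWord`, `normalize_box`, `FIELD_PATTERNS`, and `verify_fields`, as well as the validation rules, are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class OcrWord:
    """One OCR token with its pixel bounding box (x0, y0, x1, y1). Hypothetical type."""
    text: str
    box: tuple


def normalize_box(box, page_width, page_height):
    """Scale a pixel box to the 0-1000 grid that LayoutLM-style models expect."""
    x0, y0, x1, y1 = box
    return (
        1000 * x0 // page_width,
        1000 * y0 // page_height,
        1000 * x1 // page_width,
        1000 * y1 // page_height,
    )


# Assumed validation rules for illustration; a real verification layer
# would be driven by the document schema.
FIELD_PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "amount": re.compile(r"\d+(\.\d{2})?"),
}


def verify_fields(fields):
    """Return the names of extracted fields that fail their pattern check."""
    return [
        name
        for name, value in fields.items()
        if name in FIELD_PATTERNS and not FIELD_PATTERNS[name].fullmatch(value)
    ]


word = OcrWord(text="Total", box=(100, 50, 300, 100))
layout_box = normalize_box(word.box, page_width=1000, page_height=500)
needs_review = verify_fields({"date": "2026-01-15", "amount": "12.5"})
```

In a setup like this, only the fields returned by `verify_fields` would be routed to a human reviewer, which is one way a verification layer could reduce manual validation time.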
📝 How to Cite
Dipti Kumari, Harshit, "Automated Extraction of Structured Data from Unstructured Documents Using Hybrid OCR + NLP", International Journal of Advanced Multidisciplinary Research and Educational Development, V2(1): Page(572-577) Jan-Feb 2026. ISSN: 3107-6513. www.ijamred.com. Published by Scientific and Academic Research Publishing.