Peer Reviewed Open Access Journal

Volume 2 - Issue 1, January - February 2026

📑 Paper Information
📑 Paper Title Automated Extraction of Structured Data from Unstructured Documents Using Hybrid OCR + NLP
👤 Authors Dipti Kumari, Harshit
📘 Published Issue Volume 2 Issue 1
📅 Year of Publication 2026
🆔 Unique Identification Number IJAMRED-V2I1P85
📝 Abstract
As industries worldwide become fully digitized, processing unstructured documents such as statements, medical records, and legal contracts has become one of the main obstacles to operational planning. Conventional Optical Character Recognition (OCR) can convert pixels into text, but it cannot preserve semantic hierarchy or field relationships. This research presents a hybrid framework that combines state-of-the-art OCR with multimodal Natural Language Processing (NLP) mechanisms and architectures. By introducing spatial layout information through position embeddings, together with a deep NLP-based verification layer, our system attains a sophisticated understanding of document structure. We implement a scalable, Spark-based cloud processing system and a LayoutLM-based transformer model for entity recognition. Experimental analysis on more than 10,000 mixed documents shows that the approach substantially improves data accuracy, attaining a score of 95.4 and saving over 70 percent of the time spent on manual validation.
Index Terms: Optical Character Recognition (OCR), Natural Language Processing (NLP), LayoutLM, Multimodal AI, Data Hypotomony, Deep Learning, Document Intelligence.
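To make the layout-embedding idea in the abstract concrete, the following is a minimal sketch of one common preprocessing step for LayoutLM-style models: rescaling OCR word boxes from pixel coordinates to a 0 to 1000 grid so that pages of different sizes share one coordinate space. It is not the authors' implementation. The names `OcrWord`, `normalize_box`, and `to_layout_inputs` are hypothetical, and the OCR output is assumed to arrive as words with pixel bounding boxes.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)


@dataclass
class OcrWord:
    """One word from an OCR engine (hypothetical container, not from the paper)."""
    text: str
    box: Box  # pixel coordinates on the source page


def normalize_box(box: Box, page_width: int, page_height: int, scale: int = 1000) -> Box:
    """Rescale a pixel box to a 0..scale grid, independent of page size."""
    x0, y0, x1, y1 = box
    return (
        int(scale * x0 / page_width),
        int(scale * y0 / page_height),
        int(scale * x1 / page_width),
        int(scale * y1 / page_height),
    )


def to_layout_inputs(
    words: List[OcrWord], page_width: int, page_height: int
) -> List[Tuple[str, Box]]:
    """Pair each OCR word with its normalized box, ready for a layout-aware tokenizer."""
    return [(w.text, normalize_box(w.box, page_width, page_height)) for w in words]


if __name__ == "__main__":
    # Assumed example page of 1000 x 500 pixels with two OCR words.
    page = [OcrWord("Invoice", (100, 50, 300, 80)), OcrWord("Total", (600, 400, 700, 430))]
    for text, box in to_layout_inputs(page, page_width=1000, page_height=500):
        print(text, box)
```

The text and normalized box pairs would then be passed to a layout-aware tokenizer and model. That step is omitted here because the abstract does not specify the model configuration.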
📝 How to Cite
Dipti Kumari, Harshit, "Automated Extraction of Structured Data from Unstructured Documents Using Hybrid OCR + NLP," International Journal of Advanced Multidisciplinary Research and Educational Development, V2(1): Page(572-577), Jan-Feb 2026. ISSN: 3107-6513. www.ijamred.com. Published by Scientific and Academic Research Publishing.

Copyright © Scientific and Academic Research Publishing. All Rights Reserved.