
Create App for Full-Text Searchable PDFs and Images using Tesseract and Solr

$750-1500 USD

Closed
Posted more than 7 years ago

Paid on delivery
Create an application that indexes HTML, PDF, and image files on an Ubuntu web server.
Create a search page that performs a full-text search across all files using Apache Solr or Elasticsearch.
Run Tesseract OCR on all PDFs and images so that each one has a JSON file with word coordinates.
Search results should show every page that contains the search phrase, with short excerpts and search-term highlighting.
Display PDFs using an in-browser renderer like [login to view URL]
When a user clicks on a search result, it should take them to the exact page the term occurs on. If the result is in a PDF, it needs to open that page and highlight the search term.
If this could be built on a CMS platform like WordPress, that would be ideal, but it is not required.
The overall solution has been described in this blog post: [login to view URL]
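As a rough illustration of the OCR requirement (a JSON file of word coordinates that can later drive search-term highlighting), here is a minimal Python sketch. It assumes the OCR step was run with Tesseract's TSV output (for example `tesseract page.png page tsv`) and is not taken from the referenced blog post. The function names `tsv_to_words` and `find_phrase`, the JSON field names, and the sample data are all hypothetical.

```python
import csv
import io
import json

# Hypothetical sample of Tesseract TSV output (header row plus four data rows).
SAMPLE_TSV = "\n".join([
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num"
    "\tleft\ttop\twidth\theight\tconf\ttext",
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t",
    "5\t1\t1\t1\t1\t1\t36\t92\t60\t24\t96\tFull",
    "5\t1\t1\t1\t1\t2\t102\t92\t50\t24\t95\ttext",
    "5\t1\t1\t1\t1\t3\t158\t92\t80\t24\t94\tsearch.",
])


def tsv_to_words(tsv_text):
    """Convert Tesseract TSV text into a list of word dicts with pixel boxes."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        # Level 5 rows are words; other levels are page/block/paragraph/line boxes.
        if row["level"] != "5" or not (row["text"] or "").strip():
            continue
        words.append({
            "page": int(row["page_num"]),
            "text": row["text"],
            "x": int(row["left"]),
            "y": int(row["top"]),
            "w": int(row["width"]),
            "h": int(row["height"]),
        })
    return words


def find_phrase(words, phrase):
    """Return one list of word boxes per case-insensitive phrase occurrence."""
    terms = [t.lower() for t in phrase.split()]
    if not terms:
        return []
    # Strip surrounding punctuation so "search." still matches "search".
    normalized = [w["text"].lower().strip(".,;:!?\"'()") for w in words]
    hits = []
    for i in range(len(words) - len(terms) + 1):
        window = words[i:i + len(terms)]
        same_page = window[0]["page"] == window[-1]["page"]
        if same_page and normalized[i:i + len(terms)] == terms:
            hits.append(window)
    return hits


if __name__ == "__main__":
    words = tsv_to_words(SAMPLE_TSV)
    # This JSON is what would be stored next to each PDF or image.
    print(json.dumps(words, indent=2))
    print(find_phrase(words, "text search"))
```

The boxes returned by `find_phrase` are in image pixels, so a viewer would need to scale them to the rendered page size before drawing highlights.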
Project ID: 11242870

About the project

20 proposals
Remote project
Active 7 years ago

20 freelancers are bidding an average of $1,348 USD for this job
My name is Mike and I'm from the UK. I work with individual clients and also provide outsourcing services for a number of UK- and USA-based agencies. Your project description sounds interesting to me, and I have the skills and experience required to complete this project. I can show you some examples of my work. Please contact me to discuss your project.
$1,250 USD in 20 days
4.9 (45 reviews)
6.8

Hello, I understand the initial scope of this project, but I would like to discuss the job further in order to prepare the final concept. After a complete discussion by call or chat, I will prepare the following for you: a technical project proposal, a flow chart for the project, and an execution plan (a step-by-step procedure explaining how and when we are going to execute each task).
$1,546 USD in 40 days
4.8 (50 reviews)
6.9

I want to discuss this project with you further. Let me know the most suitable time for you to schedule a meeting. Feel free to message me at any time; I am usually online 14 hours a day on this website, so you will probably get a quick response from me.
$1,546 USD in 40 days
5.0 (15 reviews)
6.7

Hi, I'm an expert Java developer with over 10 years of experience in Spring IoC, Hibernate/JPA, Spring Data, RESTful services, SQL, Unix systems, etc. Please contact me for a detailed description of the solution. Best regards, Dmitry, Miami 786 656 1921
$1,500 USD in 20 days
4.5 (26 reviews)
6.6

Hi! I'm a senior software engineer at a reputable global company. In my free time I like to freelance. I think your project is interesting and I would love to hear more about it. Please contact me with details. Regards, Ivan R.
$1,250 USD in 20 days
4.9 (11 reviews)
5.1

I am an IITK graduate and a software professional with 9 years of experience, and I have top-notch developers on my team with experience across a range of technologies. The members of my team have worked with top-notch tech organizations such as Amazon, Cisco, Oracle, etc. We have been involved in similar projects in the past and our track record has been excellent.
$1,250 USD in 20 days
4.4 (15 reviews)
4.5

Hi. We are the best OCR development team. Why do you want to use Tesseract? Our OCR engine is better than Tesseract. Please contact me. We have done a similar job before, which recognizes driver's license cards on a web page. Best regards.
$2,000 USD in 20 days
5.0 (6 reviews)
3.7

A proposal has not yet been provided
$1,000 USD in 7 days
5.0 (3 reviews)
3.3

A proposal has not yet been provided
$777 USD in 20 days
1.0 (1 review)
0.8

Hello. Thank you for your job posting. I am Mi XiaoLi, lead developer of a Chinese developer team. We're a talented mobile and web development team in China. Our developers are senior developers who specialize in mobile and web development and have rich experience with OpenCV and image processing. Here is a brief overview of my solution to your project. First, the user inputs a search term and clicks the search button. Then iteration over all HTML, PDF, and image files starts. You need OCR for images in HTML, for PDFs, and for image files. Tesseract already provides OCR, but its weak point is that it frequently confuses O, Q, C, and B, so we must implement a special feature to distinguish them. In my previous experience, printed PDFs work well, but scanned PDFs give bad results with Tesseract. Our team has developed a car number recognition system in our country and achieved 95% accuracy. If you want, I can show you our desktop car number recognition system. We are not only talented and dedicated but also a well-coordinated team, so we can overcome any obstacle with our teamwork. I hope our team can work on your project. Regards. Mi.
$1,500 USD in 10 days
0.0 (0 reviews)
0.0

ANDRIY TROFYMYAK
SUMMARY
• 7+ years of experience in software development
• Advanced knowledge of Java, Apache Lucene
• Quick learner, good communication and teamwork skills
TECHNICAL BACKGROUND
Programming Languages: Java, C#
Programming Technologies: Java EE, Java Beans, Java SE
J2EE Frameworks: Apache Lucene, Apache Struts, Hibernate, iBATIS, Spring Framework
Application Servers and Middleware: BEA WebLogic Portal, BEA WebLogic Server, Jakarta Tomcat
Integrated Development Environments: Eclipse Platform, NetBeans IDE, Borland Delphi
Internet Technologies: Java Applets, Java Server Pages (JSP), AJAX, Front-end development, Google Web Toolkit, XML/XSL/XSLT
UXD. Development: (X)CSS Development
Operating Systems: MS Windows
Technical Writer Tools: MS Word
Building Tools: Ant/NAnt/CPPAnt
RDBMS: MySQL
Construction Languages: C (including ANSI C), C++, Object Pascal, SQL
Scripting Languages: JavaScript
XML Protocols and Standards: XSL-FO, XSLT
Software Design: Design patterns
Project Management/Defect Tracking Systems: EPAM Project Management Center, JIRA
EXPERIENCE
EPAM Systems (03/2013 – now)
Developer
German project to provide a Solr search engine. Search service implementation for Wolters Kluwer.
Environment: Solr, Jetty, Zookeeper, Eclipse IDE.
$1,250 USD in 20 days
0.0 (0 reviews)
0.0

I have already built such a search feature. I used Tesseract and built it with Elasticsearch, and I have very good knowledge of how things work.
$1,111 USD in 30 days
0.0 (0 reviews)
0.0

About the client

United States
4.9
35
Member since Jul 13, 2004
