
automation app

$250-750 USD

Closed
Posted almost 12 years ago


Paid on delivery
Here is the outline of how the script will work:

For a given input file (see attached), some key information is missing for certain records, such as telephone, company name, URL, email address, and address. The script will attempt to find this data and output it (see attached output file). All input and output files will be in CSV format. The script will use 14 IP addresses to send searches to Google and Bing. A W9 will be required from the developer who gets selected.

Specifics of the logic:

1. Take any record that does not have a company name and do an address search.
   a. Do the search with Google and take the title tag from each of the top 10 results.
   b. Do the same search on Bing and take the title tag from each of the top 10 results.
   c. Pattern match the page titles; this should give a nearly unanimous company name.

2. Use the pattern-matched company name if the company name was empty; otherwise use the company name we already had. Take the company name and full address and Google it. Street names are off in many of the examples, so we would simplify them by removing 's, directions, and street extensions. This search returns all kinds of results, so we have to score them:
   a. Go to the home page of each of the sites in the top 10 results. Page analysis:
      i. The name of the business should be located on this page.
      ii. The name of the business should be located in the title tag.
   b. If the name of the business is found both on the page and in the title, we can be fairly sure this is the company's website.
   c. If the company name is in neither place, go to the next site in the top 10 until we achieve a pattern match.
   d. Once this subroutine is complete, we have the URL of the company. See below for an example search; from the PDF in the results, the script would find the URL of the business.

3. Get email.
   a. Search for name, email, and URL.
   b. Use common business email address structures and pattern match for them anywhere in the resulting pages:
      i. firstlastname @[login to view URL]
      ii. [login to view URL] @[login to view URL]
      iii. first_lastname @[login to view URL]
      iv. First initial last name @[login to view URL]
   If one of these structures is found, that is a success; move on to #5. If not, move on to #4.

4. Email search:
   a. As the regular search for email did not work, now we do the reverse.
   b. This produces many searches looking for the right name. If a match is found, the match is graded:
      i. If the match is on the company website, +10
      ii. If the match is in a PDF, +5
      iii. If the match is in a PPT, +5
      iv. If the match is found by both Google and Bing, +5
   The match with the highest grade is the one the script will use.

5. Phone search
   a. Google search for name, phone, and actual email address.
   b. This usually returns some type of result that has “phone:” on the page; from this we will parse the page, pulling back all the digits to the right of the word “phone:”.

Thanks for your time.
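The parsing and grading rules in steps 2 to 5 could be sketched roughly as below. This is only an illustration under stated assumptions, not the client's script: every function name, the direction and street-extension word lists, and the example.com domain are hypothetical, the second email structure is masked in the posting and is therefore left out, and no search requests are actually sent.

```python
import re
from urllib.parse import urlparse

# Assumed word lists; the posting does not say which tokens to strip.
DIRECTIONS = {"n", "s", "e", "w", "ne", "nw", "se", "sw",
              "north", "south", "east", "west"}
STREET_EXTENSIONS = {"st", "street", "ave", "avenue", "blvd", "boulevard",
                     "rd", "road", "dr", "drive", "ln", "lane"}


def simplify_street(street):
    """Step 2: drop possessive 's, directions, and street extensions."""
    cleaned = re.sub(r"'s\b", "", street.lower())
    words = cleaned.replace(".", "").replace(",", "").split()
    kept = [w for w in words
            if w not in DIRECTIONS and w not in STREET_EXTENSIONS]
    return " ".join(kept)


def email_candidates(first, last, domain):
    """Step 3b: build structures i, iii, and iv (ii is masked in the posting)."""
    first, last = first.strip().lower(), last.strip().lower()
    local_parts = [
        f"{first}{last}",     # i.   firstlastname
        f"{first}_{last}",    # iii. first_lastname
        f"{first[0]}{last}",  # iv.  first initial + last name
    ]
    return [f"{local}@{domain}" for local in local_parts]


def find_known_email(page_text, candidates):
    """Return the first candidate that appears in the page text, else None."""
    lowered = page_text.lower()
    for candidate in candidates:
        if candidate in lowered:
            return candidate
    return None


def grade_match(result_url, company_domain, found_by_google, found_by_bing):
    """Step 4b: score one match with the weights listed in the posting."""
    parsed = urlparse(result_url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.lower()
    score = 0
    if host == company_domain or host.endswith("." + company_domain):
        score += 10  # match is on the company website
    if path.endswith(".pdf"):
        score += 5   # match is in a PDF
    if path.endswith((".ppt", ".pptx")):
        score += 5   # match is in a PPT (treating .pptx the same is an assumption)
    if found_by_google and found_by_bing:
        score += 5   # match was returned by both engines
    return score


def extract_phone(page_text):
    """Step 5b: collect the digits that follow the label 'phone:'."""
    match = re.search(r"phone:\s*([0-9()\-.\s+]{7,})", page_text, re.IGNORECASE)
    if not match:
        return None
    return re.sub(r"\D", "", match.group(1))
```

With these helpers, the script would keep the candidate with the highest grade_match score, for example via max() over the graded results. The phone pattern stops at the first character that is not a digit, space, or common separator, so two numbers printed back to back with nothing between them would run together and would need extra handling.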
Project ID: 1680725

About the project

1 proposal
Remote project
Active 12 years ago

1 freelancer is bidding an average of $500 USD for this job
I can do it.
$500 USD in 5 days
5.0 (3 reviews)
3.4
I can do it. Regards.
$750 USD in 15 days
0.0 (0 reviews)
0.0

About the client

Sacramento, United States
5.0
1
Payment method verified
Member since Mar 19, 2012

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)