Content scraper with URL crawling - PHP and cron jobs

Closed. Posted May 20, 2012. Paid on delivery.

I need a couple of PHP-based content scrapers that run automatically via cron jobs. The scrapers must provide at least the following functionality:

1. Easy to manage: in a simple administrator dashboard I can add the URL that must be scraped. I can then enter the classes on the site that I want to scrape and select from a drop-down menu the database column in which the content will be stored.

* For example: I want to scrape the title of a site article. For this I add the class 'class=title' and select the database column 'column1'.

* Multiple columns (at least 10) must be available.
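To illustrate the class-to-column mapping described in point 1, here is a minimal sketch. It is written in Python with only the standard library purely for illustration (the delivered scripts must be PHP). The names `ClassTextExtractor` and `extract_row`, and the dictionary format of the mapping, are assumptions made for this example and not part of the requirement.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of the first element carrying each configured class."""

    def __init__(self, class_to_column):
        super().__init__()
        self.class_to_column = class_to_column  # e.g. {"title": "column1"}
        self.row = {}          # column name -> scraped text
        self._column = None    # column currently being filled, if any
        self._depth = 0        # nesting depth inside the matched element
        self._parts = []       # text fragments of the matched element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._column is not None:
            self._depth += 1   # nested tag inside the matched element
            return
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            column = self.class_to_column.get(name)
            if column and column not in self.row:
                self._column, self._depth, self._parts = column, 1, []
                break

    def handle_endtag(self, tag):
        if self._column is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:   # matched element closed: store its text
            text = "".join(self._parts)
            self.row[self._column] = " ".join(text.split())
            self._column = None

    def handle_data(self, data):
        if self._column is not None:
            self._parts.append(data)


def extract_row(html, class_to_column):
    """Return a dict of column name -> text for one scraped page."""
    parser = ClassTextExtractor(class_to_column)
    parser.feed(html)
    parser.close()
    return parser.row
```

With a mapping such as `{"title": "column1"}`, calling `extract_row(page_html, mapping)` would return one dictionary per page, ready to be written to the selected database columns.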

2. Automatic scan of the whole website: when I want the script to crawl the whole website, I can select this option in the administrator panel. The script will then crawl every URL that matches the URL I have submitted and check whether there is corresponding data to add to my database.

* Check for duplicate entries: the script checks for duplicate entries. Duplicated content will not be placed in the database.

* Add the URL to the database: the URL from which the data was scraped must be stored with every row of content in the database, so the script can check whether the URL has already been crawled.

* Check daily for new updates: the script can check the sites daily for new content. When the site has new articles or new products, the script will automatically pick up these items and add the content to my database.

* Dates: the database must include a date column so I can see when the data was scraped.

* Use random IP addresses and fill the list automatically: the system must have a separate database for IP addresses. A script must update the IP list daily and scrape new IP addresses from websites. Open-source IP websites should also be used to update the list.

○ The scraping scripts must use the random IP addresses to scrape data.
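The duplicate check, the stored URL, and the date column from point 2 could fit together roughly as follows. This is only a sketch under assumptions: it uses Python and an in-memory SQLite database as a stand-in for the PHP/MySQL setup the project requires, it shows two content columns instead of ten, and the table name `content` and the functions `open_store` and `store_row` are hypothetical.

```python
import hashlib
import sqlite3
from datetime import date


def open_store(path=":memory:"):
    """Create the content table (SQLite here; the real project would use MySQL)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS content (
               id INTEGER PRIMARY KEY,
               url TEXT NOT NULL UNIQUE,           -- already-crawled check
               content_hash TEXT NOT NULL UNIQUE,  -- duplicate-content check
               column1 TEXT,
               column2 TEXT,
               scraped_on TEXT NOT NULL            -- date the row was scraped
           )"""
    )
    return conn


def store_row(conn, url, row, today=None):
    """Insert one scraped row; return False if the URL or content is a duplicate."""
    values = [row.get("column1") or "", row.get("column2") or ""]
    digest = hashlib.sha256("\x1f".join(values).encode("utf-8")).hexdigest()
    scraped_on = (today or date.today()).isoformat()
    # INSERT OR IGNORE skips the row when either UNIQUE constraint would be violated.
    cur = conn.execute(
        "INSERT OR IGNORE INTO content "
        "(url, content_hash, column1, column2, scraped_on) VALUES (?, ?, ?, ?, ?)",
        (url, digest, values[0], values[1], scraped_on),
    )
    conn.commit()
    return cur.rowcount == 1
```

A daily cron run would call `store_row` for every crawled page; pages whose URL or content is already stored are skipped, so only new articles or products are added.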

My budget is low, so don't bid more than the project budget. Also, only bid and send a PM if you can do the job in at most 7 days.

NOTE: Only bid when you have read the project details. Don't send messages with all kinds of links to examples you built before that are not relevant. Only send messages relevant to this project, or I will report them as spam.

NOTE: No milestones will be paid. I only do small projects, so the delivery can be quick and the scripting will not be too hard to finish. If you don't work without milestones, please do not bid on this project.

Thanks in advance for the replies. I hope to find a long-term development partner.

Best regards

Skills: Graphic Design, HTML, MySQL, PHP, Website Design

Project ID: #1646433

About the project

2 proposals. Online project. Active Jun 26, 2012.

2 freelancers are bidding an average of €83 for this job

SigmaVisual

We can help with your project. Please check the PMB and our ratings/reviews to get an idea of our experience.

€95 EUR in 2 days
(240 reviews)
7.8

lakshanigamage

Please check PM

€70 EUR in 5 days
(0 reviews)
0.0