Closed

Create Scraper To Extract Data From Site

This project received 24 bids from talented freelancers with an average bid price of $246 USD.

Employer working
Project Budget: $30 - $250 USD
Total Bids: 24
Project Description

I'm looking to have an entire website, both data and media (images), scraped and saved to a MySQL database. I also need the script itself so I can run it again in the future.

Extraction target site: [url removed, login to view]

Details:

1. Select Counties in the top navbar. The first county is Alachua.

2. Click on the current date, e.g. "Wed 11-24 17 Arrests".

3. The page lists the arrests for that date by person's name. Click on every name on every day and grab that person's information and mugshot. In this case, Dovon Anderson is first, so we click his name/link.

[url removed, login to view] all of this data and store it in a MySQL database. Classify by county and state (right now we're just doing FL, but we will move on to the other states soon).

- Data to be scraped on an individual mugshot page: ALL of it, except the advertisements and 'Tag This Mugshot'.

Scrape: Arrest Information (Full name, Date, Time, Arresting Agency, Total Bond), Personal Information, Charges.
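
One possible shape for a stored record, classified by state and county, is sketched below. This is only an illustration under stated assumptions: the table name `arrests`, the column names, the `ArrestRecord` class, and the `save_record` helper are all hypothetical, and Python's built-in `sqlite3` stands in for MySQL so the sketch runs without a server. Because the posting does not list the fields inside Personal Information or Charges, they are kept as JSON text here.

```python
import json
import sqlite3
from dataclasses import dataclass, field


@dataclass
class ArrestRecord:
    """One scraped page; field names are illustrative, not from the site."""
    state: str
    county: str
    full_name: str
    arrest_date: str          # ISO date string, e.g. "2010-11-24"
    arrest_time: str
    arresting_agency: str
    total_bond: str           # kept as text; the site's format is unknown
    personal_info: dict = field(default_factory=dict)
    charges: list = field(default_factory=list)
    image_url: str = ""       # where the mugshot ends up (e.g. on a CDN)


# sqlite3 is a stand-in for MySQL; column types would differ there.
SCHEMA = """
CREATE TABLE IF NOT EXISTS arrests (
    id               INTEGER PRIMARY KEY,
    state            TEXT NOT NULL,
    county           TEXT NOT NULL,
    full_name        TEXT NOT NULL,
    arrest_date      TEXT NOT NULL,
    arrest_time      TEXT NOT NULL,
    arresting_agency TEXT,
    total_bond       TEXT,
    personal_info    TEXT,
    charges          TEXT,
    image_url        TEXT,
    UNIQUE (state, county, full_name, arrest_date, arrest_time)
)
"""


def save_record(conn: sqlite3.Connection, rec: ArrestRecord) -> bool:
    """Insert a record; return False if an identical row already exists."""
    before = conn.total_changes
    conn.execute(
        "INSERT OR IGNORE INTO arrests (state, county, full_name, arrest_date,"
        " arrest_time, arresting_agency, total_bond, personal_info, charges,"
        " image_url) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (
            rec.state, rec.county, rec.full_name, rec.arrest_date,
            rec.arrest_time, rec.arresting_agency, rec.total_bond,
            json.dumps(rec.personal_info), json.dumps(rec.charges),
            rec.image_url,
        ),
    )
    conn.commit()
    return conn.total_changes > before
```

The uniqueness constraint is a design choice for the "run it again in the future" requirement: re-running the script over dates that were already collected would skip existing rows instead of duplicating them.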

This should be repeated for all counties in FL, as well as for all dates listed in the gray bar that shows dates and arrest counts like the one above. I have a list of 40+ private proxies to be used and am willing to use a captcha service if need be. I prefer Mac OS X but can run it on Windows if that's easier for you.
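
As a rough sketch of how an entry in that date bar could be turned into structured data, the snippet below parses a label such as "Wed 11-24 17 Arrests". The names `DayEntry`, `LABEL_RE`, and `parse_day_label` are hypothetical, the exact label layout on the site is an assumption based on the example above, and since the label carries no year the caller has to supply one.

```python
import re
from datetime import date
from typing import NamedTuple


class DayEntry(NamedTuple):
    weekday: str   # as printed in the label, e.g. "Wed"
    day: date
    arrests: int   # number of arrests listed for that day


# Assumed layout: weekday, MM-DD, count, "Arrests"; any whitespace
# (including line breaks) may separate the parts.
LABEL_RE = re.compile(
    r"(?P<weekday>[A-Za-z]{3})\s+(?P<month>\d{1,2})-(?P<day>\d{1,2})"
    r"\s+(?P<count>\d+)\s+Arrests?",
    re.IGNORECASE,
)


def parse_day_label(text: str, year: int) -> DayEntry:
    """Parse one date-bar label; the year is not in the label, so pass it in."""
    m = LABEL_RE.search(text)
    if m is None:
        raise ValueError(f"unrecognized date label: {text!r}")
    return DayEntry(
        weekday=m["weekday"],
        day=date(year, int(m["month"]), int(m["day"])),
        arrests=int(m["count"]),
    )
```

The parsed count gives a simple completeness check: the number of names collected for a day can be compared against `arrests` before moving on to the next date.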

It will need to run through the list of proxies that I have, and be ready to handle a captcha if one comes up during the test scrape.

I prefer to save the images to a CDN.

Summary: I want all the data and mugshot photos from [url removed, login to view] put into a database periodically so my dev can use the data to display on our own website.

Output: MySQL database
