Some Spark and Hive queries

₹1500-12500 INR

Closed
Posted over 1 year ago

Paid on delivery
Spark Use Case (Movie Review Analysis)

IMDb is an online database of movie-related information. IMDb users rate movies and provide reviews. Ratings are on a scale of 1 to 5, with 1 being the worst and 5 the best. The dataset also has additional information, such as the release year of each movie. You have to analyze the collected data and answer the following questions.

You need to find:
1) The total number of movies
2) The maximum rating of movies
3) The number of movies that have the maximum rating
4) The movies with ratings 1 and 2
5) The list of years and the number of movies released each year
6) The number of movies that have a runtime of two hours

Steps to follow:
1. Create a table in an RDBMS (MySQL, MSSQL, or Oracle) and load the data into the table (using bulk insert).
2. Ingest the data into an HDFS location using Sqoop.
3. Create a Hive external table.
4. Read the external table using a PySpark session.
5. Perform the Spark POC queries and save the results in Parquet format.
6. After saving the files, create another external table in Hive and load the Parquet data.
7. Optional: create a BI report using Tableau, Power BI, or Kibana.

Note: I'm sharing the bulk insert query for your reference (MSSQL):

    create table customers (
        Customer_id int,
        Cust_name varchar(100),
        City varchar(20),
        Grade nvarchar(10),
        Salesman_id int
    )

    BULK INSERT customers
    FROM 'C:\Users\Ramkrishna\Desktop\SQL\MYSQL\Qerry\[login to view URL]' --location with filename
    WITH (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '\n'
    )
    GO

The data file required for the above can be downloaded from Myeclass, in the Project section, named: [login to view URL]
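As a rough illustration, the six questions and the Parquet steps (5 and 6) could be sketched in Python along the following lines. This is only a sketch under stated assumptions: the posting does not give the dataset's schema, so the table name `movies` and the columns `title`, `rating`, `release_year`, and `runtime_minutes` are hypothetical, as is the reading of "two hours" as a runtime of exactly 120 minutes. The helper names `build_queries`, `external_parquet_ddl`, and `run_and_save` are also made up for this example.

```python
# Sketch only: table and column names are assumptions, not the real schema.


def build_queries(table="movies"):
    """Return one SQL string per question, keyed by a short name."""
    return {
        # 1) total number of movies
        "total_movies": f"SELECT COUNT(*) AS total_movies FROM {table}",
        # 2) maximum rating
        "max_rating": f"SELECT MAX(rating) AS max_rating FROM {table}",
        # 3) number of movies that have the maximum rating
        "max_rated_count": (
            f"SELECT COUNT(*) AS max_rated_count FROM {table} "
            f"WHERE rating = (SELECT MAX(rating) FROM {table})"
        ),
        # 4) movies rated 1 or 2
        "low_rated": f"SELECT title, rating FROM {table} WHERE rating IN (1, 2)",
        # 5) number of movies released per year
        "movies_per_year": (
            f"SELECT release_year, COUNT(*) AS movie_count FROM {table} "
            f"GROUP BY release_year ORDER BY release_year"
        ),
        # 6) movies with a two-hour runtime (assumes runtime stored in minutes)
        "two_hour_count": (
            f"SELECT COUNT(*) AS two_hour_count FROM {table} "
            f"WHERE runtime_minutes = 120"
        ),
    }


def external_parquet_ddl(table, columns, location):
    """Build Hive DDL for an external table over Parquet files (step 6).

    columns is a list of (name, hive_type) pairs; location is an HDFS path.
    """
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols}) "
        f"STORED AS PARQUET LOCATION '{location}'"
    )


def run_and_save(spark, table, out_dir):
    """Steps 4 and 5: run each query on a SparkSession and write Parquet.

    spark is expected to be a SparkSession created with Hive support.
    """
    for name, sql in build_queries(table).items():
        spark.sql(sql).write.mode("overwrite").parquet(f"{out_dir}/{name}")
```

To use it, a session created with `SparkSession.builder.enableHiveSupport().getOrCreate()` could be passed to `run_and_save`, and the string returned by `external_parquet_ddl` could then be executed with `spark.sql(...)`. Keeping the queries as plain SQL strings means the same text can also be tried in the Hive shell before running it through Spark.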
Project ID: 35296358

About the project

8 proposals
Remote project
Active 1 yr ago

8 freelancers are bidding on average ₹15,238 INR for this job
Hello... I am interested
₹57,000 INR in 7 days
5.0 (131 reviews)
6.5
Hi, I'm an experienced data scientist with over 7 years of active development experience building machine learning and AI systems using multiple tools and technologies, including R, Python, and PySpark. I hold a Master's degree in Data Science from Trinity College Dublin as well as a Bachelor's degree in Computer Science. I have plenty of experience with big data, Hadoop, and its ecosystem components, especially Sqoop, Flume, Oozie, Hive, and Impala, and I am currently working with all of them, including PySpark. Feel free to check out my profile reviews. Cheers!
₹12,000 INR in 7 days
5.0 (50 reviews)
6.0
Hello, I have read your project description, "Spark and Hive queries". I am an expert in database systems and have developed many of them, including but not limited to systems for a school, a county bursary, a POS system, and a Sacco system. I have used tools such as SQL Developer, SQL Plus, SQLyog, MySQL Workbench, and SSMS. I am confident I can handle your project. Kindly open a discussion with me so that we can talk about the best way to work. If you award me the project, I will work closely with you throughout the whole project life span, communicating continuously at every stage with updates until the project comes to fruition, in perfect condition and ready for submission. Hope to work with you soon! Thanks, Nyaronyari
₹9,900 INR in 5 days
5.0 (36 reviews)
5.0
PYSPARK EXPERT HERE!!! "Satisfy the client with my ability and passion" This is my slogan here. I hope you will be interested in me. Thanks.
₹10,000 INR in 3 days
5.0 (1 review)
1.8
* Extensive experience working with structured data using HiveQL, join operations, and optimizing Hive queries
* Experience importing and exporting data between HDFS and relational databases using Sqoop
₹4,000 INR in 7 days
0.0 (0 reviews)
0.0
Hi, I can analyze and visualize this data as per your requirements. I will also provide a description of each step to help you understand the project.
₹6,000 INR in 7 days
0.0 (0 reviews)
0.0
I have 4+ years of experience as a Data Engineer. I have hands-on experience with Python, SQL, Hadoop, AWS services, and Power BI as a visualization tool. I have worked with different databases and file formats, such as SQL, SAP HANA, Parquet, CSV, Excel, and DynamoDB. I have worked on end-to-end projects.
₹11,000 INR in 7 days
0.0 (0 reviews)
0.0
I can do the work following the steps you mentioned. I can create a Python and Spark script that you can run on the data at any time; just change the location of the data file.
₹12,000 INR in 7 days
0.0 (0 reviews)
0.0

About the client

B 5 Block, India
0.0
0
Member since Jan 19, 2020

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)