Simple ETL process using Java/Spark

Completed Posted 6 years ago Paid on delivery

Write an ETL process using Java, Spark, and HDFS.

Copy the input file to HDFS

Read the input file from HDFS using Java & Spark

Perform the following function on the dataset:

Average_Calculation()

For each stock, calculate the average trading volume and the average trading price for each month.

So for each stock, for each month, calculate the average of the volume and the average of the stock's closing price.

STOCK, AVG_VOLUME, AVG_CLOSING_PRICE, MONTH, YEAR

AAPL, 4343434, 85, JULY, 2007
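The calculation described above can be sketched in plain Java, without Spark, to make the grouping explicit: rows are grouped by stock and then by calendar month, and the volume and closing price are averaged within each group. This is only an illustration of the logic, not the Spark job the posting asks for. The class names `AverageCalculation`, `Trade`, and `MonthlyAverage` are hypothetical, and the two-decimal price formatting is an assumption, since the posting does not specify rounding.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Plain-Java illustration of Average_Calculation(); this is not Spark code. */
final class AverageCalculation {

    /** Hypothetical holder for the input fields the averages need. */
    static final class Trade {
        final String stock;
        final double closePrice;
        final long volume;
        final LocalDate date;

        Trade(String stock, double closePrice, long volume, LocalDate date) {
            this.stock = stock;
            this.closePrice = closePrice;
            this.volume = volume;
            this.date = date;
        }
    }

    /** Hypothetical holder for one output row. */
    static final class MonthlyAverage {
        final String stock;
        final double avgVolume;
        final double avgClosingPrice;
        final YearMonth month;

        MonthlyAverage(String stock, double avgVolume, double avgClosingPrice, YearMonth month) {
            this.stock = stock;
            this.avgVolume = avgVolume;
            this.avgClosingPrice = avgClosingPrice;
            this.month = month;
        }

        /** Formats as STOCK, AVG_VOLUME, AVG_CLOSING_PRICE, MONTH, YEAR. */
        String toCsv() {
            // ASSUMPTION: whole-number volume and two-decimal price; the posting gives no rounding rule.
            return String.format(Locale.ROOT, "%s, %.0f, %.2f, %s, %d",
                    stock, avgVolume, avgClosingPrice, month.getMonth(), month.getYear());
        }
    }

    /** Averages volume and closing price per stock per calendar month. */
    static List<MonthlyAverage> averageCalculation(List<Trade> trades) {
        // Group by stock, then by year-month; TreeMap keeps the output in a stable order.
        Map<String, Map<YearMonth, List<Trade>>> groups = trades.stream().collect(
                Collectors.groupingBy((Trade t) -> t.stock, TreeMap::new,
                        Collectors.groupingBy((Trade t) -> YearMonth.from(t.date),
                                TreeMap::new, Collectors.toList())));

        List<MonthlyAverage> result = new ArrayList<>();
        groups.forEach((stock, byMonth) -> byMonth.forEach((month, rows) -> {
            double avgVolume = rows.stream().mapToLong(t -> t.volume).average().orElse(0.0);
            double avgClose = rows.stream().mapToDouble(t -> t.closePrice).average().orElse(0.0);
            result.add(new MonthlyAverage(stock, avgVolume, avgClose, month));
        }));
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<Trade> trades = Arrays.asList(
                new Trade("AAPL", 80.0, 4_000_000L, LocalDate.of(2007, 7, 2)),
                new Trade("AAPL", 90.0, 4_686_868L, LocalDate.of(2007, 7, 30)),
                new Trade("AAPL", 100.0, 1_000L, LocalDate.of(2007, 8, 1)));
        averageCalculation(trades).forEach(row -> System.out.println(row.toCsv()));
    }
}
```

In a Spark job the same idea would typically be expressed as a group-by on stock, year, and month followed by two average aggregations; the grouping key is the part worth getting right, since grouping by month alone would merge the same month across different years.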

Write the output file back to HDFS

Run the ETL process with Spark in both cluster mode and client mode

Document all errors encountered and how each was resolved

------------------------------------------------------------------------------------------------------------------------------

Source input file:

STOCK, ASK_PRICE, BID_PRICE, OPEN_PRICE, CLOSE_PRICE, VOLUME, DATE

AAPL, 100.01, 100.02, 99.5, 99.7, 343434000, 12/7/2001
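A row in this layout can be parsed with a few lines of plain Java, sketched below. The posting does not say whether DATE is month-first or day-first, so `12/7/2001` could mean 7 December or 12 July; the sketch therefore takes the date format as a parameter and offers both as constants. The names `StockRowParser`, `StockRow`, `MONTH_FIRST`, and `DAY_FIRST` are hypothetical, chosen only for this illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Plain-Java sketch of parsing one source row; all names here are illustrative. */
final class StockRowParser {

    // ASSUMPTION: the posting does not state the date convention, so both are offered.
    static final DateTimeFormatter MONTH_FIRST = DateTimeFormatter.ofPattern("M/d/yyyy");
    static final DateTimeFormatter DAY_FIRST = DateTimeFormatter.ofPattern("d/M/yyyy");

    /** One parsed row: STOCK, ASK_PRICE, BID_PRICE, OPEN_PRICE, CLOSE_PRICE, VOLUME, DATE. */
    static final class StockRow {
        final String stock;
        final double askPrice;
        final double bidPrice;
        final double openPrice;
        final double closePrice;
        final long volume;
        final LocalDate date;

        StockRow(String stock, double askPrice, double bidPrice, double openPrice,
                 double closePrice, long volume, LocalDate date) {
            this.stock = stock;
            this.askPrice = askPrice;
            this.bidPrice = bidPrice;
            this.openPrice = openPrice;
            this.closePrice = closePrice;
            this.volume = volume;
            this.date = date;
        }
    }

    /** True when the first field is the literal column name STOCK. */
    static boolean isHeader(String line) {
        return line.split(",", 2)[0].trim().equals("STOCK");
    }

    /** Splits on commas and trims each field, since the sample has uneven spacing. */
    static StockRow parse(String line, DateTimeFormatter dateFormat) {
        String[] f = line.split(",", -1);
        if (f.length != 7) {
            throw new IllegalArgumentException("Expected 7 fields but found " + f.length);
        }
        return new StockRow(
                f[0].trim(),
                Double.parseDouble(f[1].trim()),
                Double.parseDouble(f[2].trim()),
                Double.parseDouble(f[3].trim()),
                Double.parseDouble(f[4].trim()),
                Long.parseLong(f[5].trim()),
                LocalDate.parse(f[6].trim(), dateFormat));
    }

    public static void main(String[] args) {
        String sample = "AAPL, 100.01, 100.02, 99.5, 99.7, 343434000, 12/7/2001";
        System.out.println(parse(sample, MONTH_FIRST).date);
        System.out.println(parse(sample, DAY_FIRST).date);
    }
}
```

Whichever convention the real file uses should be confirmed before computing monthly averages, because the choice changes which month a row such as the sample falls into.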

Destination file:

STOCK, AVG_VOLUME, AVG_CLOSING_PRICE, MONTH, YEAR

AAPL, 4343434, 85, JULY, 2007

Big Data Sales, Hadoop, Java, Oracle, Software Architecture

Project ID: #16258187

About the project

10 proposals Remote project Active 6 years ago

Awarded to:

deytps86

Hello, I didn't notice the additional calculation in the second line below, so I am increasing my bid by an extra $20. "so for each stock , for each month , calculate the avg of volumen and average of stock close price" If pro…

$60 USD in 2 days
(42 Reviews)
5.4

10 freelancers are bidding on average $153 for this job

chrisvwn

Hi, I have experience using Java, Spark, and HDFS in a Hadoop cluster and can implement your task for both Spark client and cluster modes. As an IT specialist I am able to set up, configure, and troubleshoot th…

$155 USD in 3 days
(11 Reviews)
5.1

dineshrajputit

Hi, I have expertise in Spark, Scala, Java, and Hadoop. I have written production scripts and Scala jobs that process HDFS data and write back to HDFS. I have read JSON, XML, CSV, tab-delimited, Avro, Parquet, and ORC file formats. I have read Hive…

$133 USD in 2 days
(7 Reviews)
4.0

amitkumar0327

Hi, I am Amit. I have experience in Spark and Java. I can write the code as per the requirements you have given. Please share the input file for testing. I can also provide documentation. Looking forward…

$100 USD in 3 days
(12 Reviews)
4.1

farrukhcheema23

Hi, I am a professional Big Data consultant with over 5 years of experience. I have read your request and am interested in working for you, as I am an expert in Spark with Scala and in HDFS, and can write a Spark script for this pr…

$111 USD in 3 days
(3 Reviews)
2.7

VirtualBrainInc

I have briefly read the description of this Java development project, and I can deliver as per the requirements.

$200 USD in 4 days
(4 Reviews)
2.2

haadfreelancing

I am interested in working on this project, as I have relevant experience in Big Data, Sqoop, Hadoop, Spark, Hive, Kafka, Spark Streaming, RDD, DataFrame, Dataset, Python, Scala, and Java. I am well versed in installation an…

$55 USD in 3 days
(0 Reviews)
0.0