Sunnyvale, CA

Spark/Scala Data Engineer



Job Description

Spark/Scala Data Engineer
Austin, TX / Sunnyvale, CA
Activities:
Work independently on an end-to-end development track with an engineering mindset and a product-ownership attitude
Requirements analysis and coding in accordance with coding standards
Requirements and design playback with the business
Design pipelines end to end to meet customer requirements
Coding in Scala using existing frameworks to meet business requirements
Work in an agile methodology involving daily standups and frequent sync-ups with peers
Deliver high-quality code, review peers' code, and mentor peers
Skills:
Must have:
Scala, Spark, SQL, Hadoop, AWS, Hive, Spark SQL, HiveQL, CI/CD (Continuous Integration/Continuous Delivery), VCS (GitHub)
Coding in Scala (60% of time)
Designing in the Hadoop ecosystem (20% of time)
Hands-on experience with AWS tools such as EMR and EC2 (10% of time)
Hands-on experience with SQL in Big Data: SQL, Spark SQL, HiveQL (60% of time)
Proficient in working with large data sets and pipelines (20% of time)
Proficient with workflow scheduling / orchestration tools (20% of time)
Well versed in the CI/CD process and VCS (20% of time)
In Scala, hands-on experience with the following is essential:
o Functional Programming and Object-Oriented Programming
o Currying and Partially Applied Functions
o Higher Order Functions
o Tail Recursion
o Futures
o Case classes
o PureConfig
o Implicit Functions
o Practical knowledge of all collections (containers) such as List, Map, and Array
o Implementing sorting by hand with foreach loops and similar constructs instead of calling built-in sort functions
Nice to have:
Python, Java, Iceberg, Kafka, Amazon EKS, Kubernetes, Amazon S3, Airflow, Oozie, Postgres
Experience in Python and Java coding.
Experience using Kafka to set up workflows.
Experience with Iceberg tables.
Experience with workflow scheduling / orchestration such as Airflow.
Experience with schema design and data modeling.
Machine learning experience is good to have.
Experience using the Cassandra database (NoSQL) is good to have.

Recommended Skills

  • Agile Methodology
  • Amazon Elastic Compute Cloud
  • Amazon S3
  • Amazon Web Services
  • Apache Hadoop
  • Apache Hive