PySpark Developer
Location: US-TX-Irving
Jobcode: 292bba539a0561c07f06aa7db6a8b5e4-122020
Key Responsibilities:
- Development experience building frameworks to automate high-volume and real-time data processing using big data technologies such as Python, PySpark, Hadoop, Spark, Hive, Pig, Kafka, and NiFi
- Ability to develop, transform, and deploy complex analytical models in scalable, production-ready solutions
- Provide support for an advanced anomaly-detection machine learning platform
- Implement CI/CD on cloud platforms
- Troubleshoot and debug production code issues and participate in peer reviews
Key Result Areas (KRA):
- Technical delivery of system components, development, and deployment – 75%
- Contribution toward building reusable components – 25%
Experience & Skillset
MUST-HAVE
- Experience developing sustainable, data-driven solutions with current-generation data technologies to drive business and technology strategies
- Experience in developing data APIs and data delivery services to support critical operational and analytical applications
- Leveraging reusable code modules to solve problems across the team and organization
- Should possess a problem-solving attitude
- At least 2 years of development experience designing and building data pipelines for data ingestion or transformation using Java, Scala, Python, or PySpark (see the sketch after this list)
- At least 2 years of development experience with big data frameworks: file formats (Parquet, Avro, ORC), resource management, distributed processing, and RDBMS
- At least 2 years of experience developing applications with monitoring, build tools, version control, unit testing, TDD, and change management to support DevOps
- At least 2 years of development experience with SQL and shell scripting
- Experience developing and deploying production-level data pipelines using tools from the Hadoop stack (PySpark, HDFS, Hive, Spark, HBase, Kafka, NiFi, MongoDB, Neo4j, Oozie, Splunk, etc.)
- Experience in troubleshooting JVM-related issues.
- Development experience dealing with mutable data in Hadoop
- Development experience with StreamSets
- Experience with data visualization tools such as Kibana, Grafana, or Tableau, and their associated architectures
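As a rough illustration of the pipeline work described above, here is a minimal sketch of a PySpark batch ingestion/transformation job. The file paths, column names, and Hive table name are hypothetical, and reading Avro assumes the spark-avro package is on the classpath.

```python
# Minimal sketch of a PySpark ingestion/transformation pipeline.
# All paths, column names, and the Hive table name are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("transactions-ingest")   # hypothetical job name
    .enableHiveSupport()              # assumes a Hive metastore is configured
    .getOrCreate()
)

# Ingest raw Avro files from HDFS (path is an assumption;
# requires the spark-avro package on the classpath).
raw = spark.read.format("avro").load("hdfs:///data/raw/transactions")

# Basic cleansing and transformation.
cleaned = (
    raw.dropDuplicates(["transaction_id"])   # hypothetical key column
       .filter(F.col("amount") > 0)
       .withColumn("ingest_date", F.current_date())
)

# Write to a partitioned, Parquet-backed Hive table (name is an assumption).
(cleaned.write
        .mode("append")
        .partitionBy("ingest_date")
        .format("parquet")
        .saveAsTable("analytics.transactions"))
```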
GOOD-TO-HAVE
- Experience in Angular 4 and React development with cloud technologies
- Experience in the banking domain
- 1+ years' experience with Amazon Web Services (AWS), Google Compute, or another public cloud service
- 1+ years of experience working with streaming using Spark, Flink, Kafka, or NoSQL (see the streaming sketch after this list)
- 1+ years of experience working with dimensional data models and the pipelines that feed them
- Intermediate-level experience or knowledge in at least one scripting language (Python, Perl, JavaScript)
- Experience with various NoSQL databases (Hive, MongoDB, Couchbase, Cassandra, and Neo4j) is a plus
- Experience in Ab Initio technologies including, but not limited to, Ab Initio graph development, EME, Co-Op, BRE, and Continuous Flows
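For the streaming experience mentioned above, a minimal Spark Structured Streaming sketch reading JSON events from Kafka is shown below. The broker address, topic name, message schema, and checkpoint path are all hypothetical, and the Kafka source assumes the spark-sql-kafka package is on the classpath.

```python
# Minimal sketch of Spark Structured Streaming from Kafka.
# Broker address, topic, schema, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Hypothetical JSON payload schema for the Kafka messages.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
])

# Read from Kafka (requires the spark-sql-kafka package on the classpath).
events = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")  # assumption
         .option("subscribe", "events")                     # hypothetical topic
         .load()
         .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
         .select("e.*")
)

# Stream the parsed events to Parquet with checkpointing for recovery.
query = (
    events.writeStream
          .format("parquet")
          .option("path", "hdfs:///data/streams/events")       # assumption
          .option("checkpointLocation", "hdfs:///chk/events")  # assumption
          .start()
)
query.awaitTermination()
```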
Education: Minimum Bachelor's degree in Computer Science, Engineering, Business Information Systems, or a related field. A Master's degree in a computing discipline related to scalable and distributed computing is a major plus.
MTH Technologies