PySpark Developer
Location: US-TX-Irving
Jobcode: 292bba539a0561c07f06aa7db6a8b5e4-122020
Key Responsibilities:
- Development experience building frameworks to automate high-volume and real-time data processing using big data technologies such as Python, PySpark, Hadoop, Spark, Hive, Pig, Kafka, and NiFi
- Ability to develop, transform, and deploy complex analytical models in scalable, production-ready solutions
- Provide support for an advanced anomaly-detection machine learning platform
- Implement CI/CD on cloud platforms
- Troubleshoot and debug production code issues and participate in peer reviews
Key Result Areas (KRA):
- Technical delivery of system components, development, and deployment – 75%
- Contribution toward building reusable components – 25%
Experience & Skillset
MUST-HAVE
- Experience developing sustainable, data-driven solutions with current-generation data technologies to drive business and technology strategies
- Experience in developing data APIs and data delivery services to support critical operational and analytical applications
- Leveraging reusable code modules to solve problems across the team and organization
- Should possess a problem-solving attitude
- At least 2 years of development experience designing and building data pipelines for data ingestion or transformation using Java, Scala, Python, or PySpark (see the sketch after this list)
- At least 2 years of development experience with big data frameworks: file formats (Parquet, Avro, ORC), resource management, distributed processing, and RDBMS
- At least 2 years of experience developing applications with monitoring, build tools, version control, unit testing, TDD, and change management to support DevOps
- At least 2 years of development experience with SQL and shell scripting
- Experience developing and deploying production-level data pipelines using tools from the Hadoop stack (PySpark, HDFS, Hive, Spark, HBase, Kafka, NiFi, MongoDB, Neo4j, Oozie, Splunk, etc.)
- Experience in troubleshooting JVM-related issues.
- Development experience dealing with mutable data in Hadoop
- Development experience with StreamSets
- Experience with data visualization tools such as Kibana, Grafana, or Tableau, and their associated architectures
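As a rough illustration of the pipeline work described above, here is a minimal sketch of a PySpark batch ingestion/transformation job. The file paths, column names, and Hive table name are hypothetical, and reading Avro assumes the spark-avro package is on the classpath.

```python
# Minimal sketch of a PySpark ingestion/transformation pipeline.
# All paths, column names, and the Hive table name are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("transactions-ingest")   # hypothetical job name
    .enableHiveSupport()              # assumes a Hive metastore is configured
    .getOrCreate()
)

# Ingest raw Avro files from HDFS (path is an assumption;
# requires the spark-avro package on the classpath).
raw = spark.read.format("avro").load("hdfs:///data/raw/transactions")

# Basic cleansing and transformation.
cleaned = (
    raw.dropDuplicates(["transaction_id"])   # hypothetical key column
       .filter(F.col("amount") > 0)
       .withColumn("ingest_date", F.current_date())
)

# Write to a partitioned, Parquet-backed Hive table (name is an assumption).
(cleaned.write
        .mode("append")
        .partitionBy("ingest_date")
        .format("parquet")
        .saveAsTable("analytics.transactions"))
```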
GOOD-TO-HAVE
- Experience in Angular 4 and React development with cloud technologies
- Experience in the banking domain
- 1+ years' experience with Amazon Web Services (AWS), Google Compute, or another public cloud service
- 1+ years of experience working with streaming using Spark, Flink, Kafka, or NoSQL (see the streaming sketch after this list)
- 1+ years of experience working with dimensional data models and the pipelines that feed them
- Intermediate-level experience or knowledge in at least one scripting language (Python, Perl, JavaScript)
- Experience with various NoSQL databases (Hive, MongoDB, Couchbase, Cassandra, and Neo4j) is a plus
- Experience in Ab Initio technologies including, but not limited to, Ab Initio graph development, EME, Co-Op, BRE, and Continuous Flows
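For the streaming experience mentioned above, a minimal Spark Structured Streaming sketch reading JSON events from Kafka is shown below. The broker address, topic name, message schema, and checkpoint path are all hypothetical, and the Kafka source assumes the spark-sql-kafka package is on the classpath.

```python
# Minimal sketch of Spark Structured Streaming from Kafka.
# Broker address, topic, schema, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Hypothetical JSON payload schema for the Kafka messages.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
])

# Read from Kafka (requires the spark-sql-kafka package on the classpath).
events = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")  # assumption
         .option("subscribe", "events")                     # hypothetical topic
         .load()
         .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
         .select("e.*")
)

# Stream the parsed events to Parquet with checkpointing for recovery.
query = (
    events.writeStream
          .format("parquet")
          .option("path", "hdfs:///data/streams/events")       # assumption
          .option("checkpointLocation", "hdfs:///chk/events")  # assumption
          .start()
)
query.awaitTermination()
```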
Education: Minimum Bachelor's degree in Computer Science, Engineering, Business Information Systems, or a related field. A Master's degree in a computing discipline related to scalable and distributed computing is a major plus.
MTH Technologies