Data Engineer Job at VDart Inc, Remote

  • VDart Inc
  • Remote

Job Description

Title: Data Engineer

Location: Remote

Duration: 6 Months

Work Description:

We are in the process of migrating off CRMA Data Manager by rewriting queries and implementing the required data transformations in AWS. This platform modernization effort includes working through a backlog of datasets that must be migrated to AWS and transformed to meet current and future reporting needs.

Business Knowledge:

Limited business knowledge is needed.

Technical Skills:

Must-Have Technical Skills:

AWS Data Services (Hands-on)

  • S3: Data lake design, partitioning strategies, lifecycle management
  • IAM: Roles & policies, least-privilege access, cross-account access
  • Glue / EMR: Crawlers, Data Catalog, ETL job development
  • Athena: Querying data lakes with performance and cost optimization
  • Lake Formation: Basic governance and permission management

Compute & Processing

  • Apache Spark (PySpark): Batch processing, performance tuning, joins, partitioning
  • Python: Production-grade coding (packaging, testing, logging, type hints)
  • SQL: Advanced querying (window functions, query optimization, data modeling support)

Orchestration & Scheduling

  • Airflow / MWAA / AWS Step Functions
  • DAG design
  • Retry mechanisms
  • SLA management
  • Backfills

Data Warehousing & Modeling

  • Redshift / Snowflake (on AWS): Fundamentals and performance considerations
  • Dimensional Modeling: Star/Snowflake schema design

ETL/ELT Patterns:

  • CDC (Change Data Capture)
  • SCD (Slowly Changing Dimensions)
  • Idempotent data pipelines

Data Reliability & Observability

  • Data quality frameworks: Great Expectations / Deequ (or equivalent)
  • Data reconciliation & validation
  • Monitoring & observability: CloudWatch logs, metrics, alerts

DevOps & Delivery

  • Version Control: Git, branching strategies, code reviews
  • CI/CD: Data pipeline automation (e.g., GitLab CI/CD)
  • Infrastructure-as-Code: OpenTofu / CloudFormation for AWS resource deployment

Security & Compliance

  • Encryption: At rest & in transit (KMS)
  • Secrets management: AWS Secrets Manager / SSM
  • Networking fundamentals: VPC, private subnets, endpoints (data access control)

Role Expectations (Hands-on Experience Required):

  • Designed, developed, and maintained production-grade ETL pipelines using AWS Glue (PySpark)
  • Built scalable data ingestion pipelines from S3, databases, and streaming sources into S3 data lakes
  • Implemented complex transformations and joins in PySpark, optimizing performance (partitioning, broadcast joins, caching)
  • Developed incremental and idempotent pipelines, including handling CDC and SCD
  • Automated schema discovery using Glue Crawlers and Data Catalog
  • Tuned Glue Spark jobs for performance, concurrency, and cost efficiency
  • Integrated pipelines with orchestration tools like Airflow (MWAA) or Step Functions
  • Collaborated with data teams to load curated data into Redshift / Snowflake / Iceberg for analytics
  • Implemented data quality checks using built-in validations or tools like Great Expectations / Deequ
  • Applied AWS security best practices (IAM roles, KMS encryption, secure data access)
  • Contributed to CI/CD pipelines for Glue job deployment using Git and IaC tools
  • Monitored pipelines using CloudWatch, ensuring reliability and quick incident resolution
  • Worked closely with stakeholders to define data contracts, SLAs, and business expectations

Key Skills: Data Engineer, AWS Glue, IAM, ETL, Athena, PySpark

Job Tags

Full time
