ETL Developer
ETL (Extract, Transform, Load) Developers build and maintain data pipelines that move and transform data between source systems, data warehouses, and analytics platforms. They design the workflows that extract data from databases, APIs, and files, apply business logic transformations, and load processed data into destinations where analysts and business intelligence tools can use it.
Role at a glance
- Typical education: Bachelor's in CS, Information Systems, or related field
- Typical experience: Not specified
- Key certifications: None typically required
- Top employer types: Enterprises, cloud data warehouse providers, SaaS companies, tech-driven organizations
- Growth outlook: Growing demand as the role evolves into Data Engineering, driven by cloud warehouse adoption and legacy modernization.
- AI impact (through 2030): Augmentation — AI automates routine SQL generation and mapping, but the role is expanding into complex data engineering, orchestration, and data product governance.
Duties and responsibilities
- Design and implement ETL workflows to extract data from relational databases, flat files, APIs, and SaaS platforms
- Write SQL transformations and stored procedures that apply business rules, data cleansing, and aggregation logic
- Build and maintain data pipelines using ETL tools such as Informatica, SSIS, Talend, dbt, or Apache Airflow
- Develop data quality checks and validation rules that detect and flag data integrity issues before loading to production
- Map source system data structures to target schema designs in collaboration with data architects and analysts
- Debug pipeline failures by analyzing error logs, identifying root causes, and implementing fixes or alerting
- Optimize slow ETL processes through query tuning, parallelization, and incremental load strategies
- Document pipeline logic, data lineage, transformation rules, and dependency maps for current and future maintainers
- Participate in data warehouse design reviews, proposing ETL-friendly schema structures and load strategies
- Collaborate with business stakeholders to understand data requirements and validate that transformed data meets needs
Overview
ETL Developers are the engineers who make sure data gets from where it's created to where it needs to be used — reliably, accurately, and on schedule. When a business analyst runs a report showing last month's sales by region, it's an ETL developer's pipeline that ensures the underlying data is accurate, current, and correctly formatted for the analytics tool.
The 'extract' phase is more complex than it sounds. Source systems — CRM platforms, ERP systems, transactional databases, third-party APIs — each have their own data models, authentication requirements, and operational limitations. Extracting from a production Oracle database without affecting user-facing performance, handling API rate limits, and managing authentication tokens that expire are all extraction problems that require careful engineering.
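To make those extraction concerns concrete, here is a minimal Python sketch of pulling pages from a rate-limited REST API with an expiring bearer token. The endpoints, field names, and pagination scheme are hypothetical, not any specific vendor's API.

```python
import time
import requests

# Hypothetical API extraction sketch: OAuth client-credentials token, paging,
# rate-limit backoff, and token refresh. URLs and field names are invented.

def get_token(session: requests.Session) -> str:
    resp = session.post("https://auth.example.com/oauth/token",
                        data={"grant_type": "client_credentials"})
    resp.raise_for_status()
    return resp.json()["access_token"]

def extract_orders(since: str):
    session = requests.Session()
    token = get_token(session)
    params = {"updated_since": since, "page": 1}
    while True:
        resp = session.get("https://api.example.com/v1/orders", params=params,
                           headers={"Authorization": f"Bearer {token}"})
        if resp.status_code == 429:                      # rate limited: honor Retry-After
            time.sleep(int(resp.headers.get("Retry-After", "30")))
            continue
        if resp.status_code == 401:                      # token expired: refresh and retry
            token = get_token(session)
            continue
        resp.raise_for_status()
        payload = resp.json()
        yield from payload["results"]
        if not payload.get("next_page"):                 # assumed pagination flag
            break
        params["page"] += 1
```

A production version would also cap retries and log failures, but the shape of the problem is the same.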
Transformation is where business logic lives in data pipelines. Joining customer records from three source systems into a single unified customer view requires matching on keys that don't align cleanly, handling nulls and empty strings consistently, applying deduplication logic, and standardizing address and phone number formats. Each transformation rule should be documented and testable — a pipeline that silently changes how it transforms a field after a source system update is one of the hardest problems to diagnose in production.
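As an illustration of those transformation rules, the following pandas sketch unifies two hypothetical customer sources, treats empty strings and NULLs consistently, standardizes phone numbers, and deduplicates on email. The frame and column names are assumptions for illustration, not a prescribed data model.

```python
import pandas as pd

def standardize_phone(s: pd.Series) -> pd.Series:
    digits = s.fillna("").str.replace(r"\D", "", regex=True)
    return digits.where(digits.str.len() == 10, other=pd.NA)   # keep only 10-digit numbers

def unify_customers(crm: pd.DataFrame, ecom: pd.DataFrame) -> pd.DataFrame:
    # Treat empty strings and NULLs the same before matching (mutates inputs for brevity).
    for df in (crm, ecom):
        df["email"] = df["email"].replace("", pd.NA).str.strip().str.lower()

    # Both frames are assumed to carry phone and updated_at columns.
    merged = crm.merge(ecom, on="email", how="outer", suffixes=("_crm", "_ecom"))
    merged["phone"] = standardize_phone(
        merged["phone_crm"].combine_first(merged["phone_ecom"])
    )
    # Deduplicate: keep the most recently updated record per email.
    return (merged.sort_values("updated_at_crm", ascending=False)
                  .drop_duplicates(subset="email", keep="first"))
```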
Data quality is not someone else's problem. ETL developers who treat data quality checks as a core engineering responsibility — building validation rules that detect anomalies before data reaches the warehouse — are significantly more valuable than those who deliver pipelines and move on. A missed quality check that allows bad data into a financial report can cause real organizational damage.
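A sketch of what such pre-load checks can look like in Python follows; the column names, thresholds, and the choice to block the load outright are illustrative, and real rules would come from the business context.

```python
import pandas as pd

# Illustrative pre-load validation rules for a staged orders extract.

def validate_orders(df: pd.DataFrame, expected_min_rows: int = 1000) -> list[str]:
    problems = []
    if len(df) < expected_min_rows:
        problems.append(f"row count {len(df)} below expected minimum {expected_min_rows}")
    if df["order_id"].duplicated().any():
        problems.append("duplicate order_id values")
    if (df["order_total"] < 0).any():
        problems.append("negative order totals")
    null_rate = df["customer_id"].isna().mean()
    if null_rate > 0.01:
        problems.append(f"customer_id null rate {null_rate:.1%} exceeds 1% threshold")
    return problems

# Typical use: run against the staged extract and block the warehouse load on failure.
# problems = validate_orders(staged_orders)
# if problems:
#     raise RuntimeError("pre-load validation failed: " + "; ".join(problems))
```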
Orchestration and scheduling are increasingly sophisticated. Modern ETL involves managing hundreds of interdependent pipeline steps with varying schedules, handling failures gracefully, alerting on delays, and maintaining dependency graphs that show when downstream jobs can safely start. Tools like Apache Airflow, Prefect, and Dagster have become standard for this purpose and are now expected skills rather than differentiators.
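For a sense of what that looks like in practice, here is a minimal Airflow 2.x DAG sketch with three dependent tasks; the DAG id, schedule, and placeholder callables are invented for illustration.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder task bodies; a real pipeline would call extraction, load, and
# transformation code here.
def extract_orders(): ...
def load_orders(): ...
def build_sales_mart(): ...

with DAG(
    dag_id="nightly_orders",
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",      # Airflow 2.4+ keyword; older versions use schedule_interval
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_orders", python_callable=load_orders)
    mart = PythonOperator(task_id="build_sales_mart", python_callable=build_sales_mart)

    extract >> load >> mart    # downstream tasks wait for upstream success
```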
Qualifications
Education:
- Bachelor's in computer science, information systems, or related field (common)
- Associate degree with a strong SQL and data background accepted at some organizations
- Relevant certifications in specific ETL tools or cloud platforms can substitute for academic credentials
Core technical skills:
- SQL: advanced queries, window functions, CTEs, joins, aggregations, query optimization
- At least one ETL/ELT tool: Informatica PowerCenter/IICS, SQL Server SSIS, Talend, dbt, or Apache Spark
- Pipeline orchestration: Apache Airflow, Prefect, Dagster, or Azure Data Factory
- Scripting: Python for custom transformations, file handling, API integration, and automation
- Data warehouse concepts: dimensional modeling (star schema, snowflake schema), SCD (slowly changing dimensions) types 1, 2, 3
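Of the concepts in that list, SCD Type 2 is the one a small worked example clarifies most. Below is a hedged pandas sketch of a Type 2 update: rows whose tracked attributes changed are expired and re-inserted as new current versions. The column names (valid_from, valid_to, is_current) are illustrative conventions, not a fixed schema.

```python
import pandas as pd

def apply_scd2(dim: pd.DataFrame, incoming: pd.DataFrame,
               key: str = "customer_id",
               tracked: tuple = ("email", "segment")) -> pd.DataFrame:
    """Type 2 SCD sketch: expire changed rows, append new current versions.
    Mutates `dim` in place for brevity; NaN-vs-NaN comparisons would need
    extra care in real code."""
    now = pd.Timestamp.now()
    current = dim[dim["is_current"]]

    # Compare incoming rows to the current dimension rows on the tracked columns.
    merged = incoming.merge(current[[key, *tracked]], on=key, how="left",
                            suffixes=("", "_old"))
    changed = pd.Series(False, index=merged.index)
    for col in tracked:
        changed |= merged[col].ne(merged[f"{col}_old"])   # brand-new keys also flag as changed
    changed_keys = merged.loc[changed, key]

    # Expire the superseded versions of changed keys.
    to_expire = dim[key].isin(changed_keys) & dim["is_current"]
    dim.loc[to_expire, "is_current"] = False
    dim.loc[to_expire, "valid_to"] = now

    # Insert new current versions for changed and brand-new keys.
    new_rows = incoming[incoming[key].isin(changed_keys)].copy()
    new_rows["valid_from"] = now
    new_rows["valid_to"] = pd.NaT
    new_rows["is_current"] = True
    return pd.concat([dim, new_rows], ignore_index=True)
```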
Source system experience:
- Relational databases: SQL Server, Oracle, PostgreSQL, MySQL
- Cloud data warehouses: Snowflake, BigQuery, Amazon Redshift, or Azure Synapse
- SaaS data sources: Salesforce, HubSpot, Shopify, or similar via API or certified connectors
- File-based sources: CSV, JSON, XML, Parquet; handling encoding issues, schema variation, malformed data
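The file-handling item above is easy to underestimate. Here is a small Python sketch of defensive CSV ingestion with an encoding fallback, a schema check, and a quarantine path for malformed rows; the expected columns and file layout are assumptions for illustration.

```python
import csv
from pathlib import Path

EXPECTED_COLUMNS = {"order_id", "customer_id", "order_total"}   # illustrative schema

def read_orders_csv(path: Path):
    # Try UTF-8 first, then a legacy single-byte encoding.
    for encoding in ("utf-8-sig", "latin-1"):
        try:
            text = path.read_text(encoding=encoding)
            break
        except UnicodeDecodeError:
            continue
    else:
        raise ValueError(f"{path}: could not decode with known encodings")

    reader = csv.DictReader(text.splitlines())
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"{path}: missing expected columns {missing}")

    good, bad = [], []
    for row in reader:
        try:
            row["order_total"] = float(row["order_total"])   # malformed numerics get quarantined
            good.append(row)
        except (TypeError, ValueError):
            bad.append(row)
    return good, bad
```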
Advanced skills:
- dbt: models, tests, sources, macros, documentation
- Change data capture: Debezium, AWS DMS, Oracle GoldenGate
- Streaming pipelines: Apache Kafka, Spark Streaming, Kinesis for near-real-time use cases
- Cloud storage and data lake patterns: AWS S3, Azure Data Lake, GCS with Parquet and Delta/Iceberg formats
Soft skills:
- Documentation discipline — ETL logic that isn't documented becomes a black box
- Communication with business analysts about data quality findings and rule exceptions
- Systematic debugging approach for pipeline failures
Career outlook
The ETL Developer role is evolving significantly, but demand for people who can build reliable data pipelines is growing, not shrinking. The shift is more about title and tool than about the underlying need.
The emergence of 'data engineer' as a distinct and well-compensated role has absorbed much of what ETL developers traditionally did, with an expanded scope that includes streaming data, large-scale processing frameworks (Spark, Flink), and cloud data infrastructure management. ETL developers who have expanded their skills toward data engineering — orchestration frameworks, cloud-native tools, dbt, CDC — are the ones moving into this higher-compensated adjacent role.
Cloud data warehouse adoption has accelerated dramatically. Snowflake, BigQuery, and Redshift are the standard targets for new data warehouse projects, and organizations building on these platforms need developers who understand their specific capabilities, pricing models, and optimization patterns. The shift away from on-premises Teradata and Oracle data warehouses has created both a migration market and an ongoing new-project market.
Data mesh and modern data stack adoption is creating demand for developers who understand not just pipeline mechanics but data product thinking — designing pipelines as products with owners, consumers, and SLAs. This requires broader software engineering habits (testing, documentation, version control) applied to data work, which traditional ETL development often skipped.
Legacy ETL modernization is a sustained demand category. Large enterprises with 15-year-old Informatica or DataStage pipelines are funding modernization programs to move to cloud-native tools. Developers who understand both the legacy patterns and the modern migration approaches are specifically valuable for this work, which can span several years.
Salary growth is stronger for developers who move toward full data engineering scope. Data Engineers with streaming, orchestration, and cloud-native skills earn $115K–$160K, and the role is one of the most in-demand specializations in data infrastructure.
Sample cover letter
Dear Hiring Manager,
I'm applying for the ETL Developer position at [Company]. I've been building and maintaining data pipelines for four years, currently at a retail company where I own the pipeline that loads order, inventory, and customer data from our ERP, e-commerce platform, and three fulfillment warehouse systems into our Snowflake data warehouse.
The most complex challenge in that work has been customer identity resolution across three source systems that use different customer identifiers. Our ERP uses a legacy account number scheme, our e-commerce platform uses email as the primary key, and our warehouse systems use shipping address plus name. I built a matching pipeline in Python that uses fuzzy matching on name and address with exact matching on email where available, assigns a canonical customer ID, and maintains a crosswalk table that maps all source IDs to the canonical one. The pipeline runs nightly and handles about 4,000 new customer merge decisions per day, with a manual review queue for records below the confidence threshold.
I've recently moved most of our transformations from SSIS to dbt, which has been a net improvement for the team — the SQL-based approach is easier for our analysts to read and contribute to, and the built-in testing and documentation are significantly better than what we had before. I maintain the Airflow DAGs that orchestrate the full pipeline dependency graph.
Your organization's need for a reliable data pipeline supporting the analytics team is closely aligned with what I've been building. I'd welcome the chance to discuss how my experience fits the role.
[Your Name]
Frequently asked questions
- Is ETL development still relevant when ELT has become more common?
- ELT (Extract, Load, Transform) has become the dominant pattern in cloud data warehouses because platforms like Snowflake, BigQuery, and Redshift can apply transformations at scale within the warehouse itself — using tools like dbt — rather than transforming before loading. However, ETL is still necessary when data must be cleaned or masked before loading for privacy reasons, when source systems are sensitive and raw data can't be staged in the cloud, or when legacy on-premises data warehouses don't have in-warehouse transformation capabilities. Many ETL developers have expanded their skills to include dbt and consider themselves data engineers.
- What is dbt and how does it relate to traditional ETL development?
- dbt (data build tool) is an open-source transformation framework that allows analysts and engineers to write transformations as SELECT statements in SQL, compile them into views or tables, test data quality, and document lineage — all within the data warehouse. It's become the standard for the 'T' in ELT pipelines. ETL developers who add dbt proficiency bridge the gap between traditional pipeline development and modern analytics engineering, which is the direction the field has moved.
- What is incremental loading and why does it matter?
- An incremental load only extracts and processes records that have changed since the last load, rather than re-loading the full source table every time. For large tables — millions to billions of rows — the difference in processing time and cost between full and incremental loads is dramatic. Implementing incremental loads correctly requires understanding how the source system tracks changes: timestamps, CDC (change data capture), or sequence-based watermarks. Incorrect incremental load design is a common source of data quality issues.
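A minimal sketch of the timestamp-watermark variant follows, assuming DB-API style connections (psycopg2-style %s parameters) and illustrative table names; CDC and sequence-based watermarks follow the same read-watermark, extract-delta, load pattern.

```python
# Hypothetical watermark-based incremental load. `source_conn` and
# `warehouse_conn` stand in for whatever database drivers the stack uses.

def incremental_load(source_conn, warehouse_conn):
    with warehouse_conn.cursor() as cur:
        cur.execute("SELECT COALESCE(MAX(updated_at), '1970-01-01') FROM staging.orders")
        watermark = cur.fetchone()[0]                    # high-water mark from the last run

    with source_conn.cursor() as cur:
        # Strict '>' can miss rows sharing the watermark timestamp; '>=' plus
        # dedup on load is a common refinement.
        cur.execute(
            "SELECT order_id, customer_id, order_total, updated_at "
            "FROM orders WHERE updated_at > %s ORDER BY updated_at",
            (watermark,),
        )
        changed_rows = cur.fetchall()                    # only rows touched since last load

    with warehouse_conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO staging.orders (order_id, customer_id, order_total, updated_at) "
            "VALUES (%s, %s, %s, %s)",
            changed_rows,
        )
    warehouse_conn.commit()
```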
- How has cloud migration affected ETL development?
- Cloud-hosted data warehouses have fundamentally changed the trade-offs. On-premises ETL required expensive server infrastructure and ETL tool licenses; cloud-native pipelines can use managed services that scale automatically and charge per use. The shift has also democratized transformation work — SQL-fluent analysts can now write transformations using dbt rather than needing to use complex GUI-based ETL tools. This has blurred the line between ETL developer and analytics engineer, and many ETL developers are building skills in orchestration (Airflow, Prefect) and transformation (dbt) rather than focusing on traditional ETL tools.
- What is change data capture (CDC) and when is it used in ETL?
- Change Data Capture extracts only the records that have been inserted, updated, or deleted since the last extraction, usually by reading the database transaction log rather than querying tables. CDC enables near-real-time data replication with minimal load on source systems and is used when latency requirements are tight or source tables are too large to query efficiently. Common CDC tools include Debezium, AWS DMS, Oracle GoldenGate, and Attunity. ETL developers who understand CDC have a significant advantage in streaming and real-time pipeline architectures.
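On the consuming side, a CDC feed often arrives as Kafka messages in Debezium's change-event format. The sketch below, using the kafka-python client, shows how events might be routed by operation type; the topic name, brokers, and envelope handling are assumptions that depend on the connector configuration.

```python
import json
from kafka import KafkaConsumer   # kafka-python client

consumer = KafkaConsumer(
    "dbserver1.public.orders",                     # Debezium topic: server.schema.table
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v) if v else None,
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value
    if event is None:                              # tombstone after a delete; skip
        continue
    payload = event.get("payload", event)          # envelope depends on converter config
    op = payload["op"]                             # c = insert, u = update, d = delete, r = snapshot
    if op in ("c", "r"):
        print("upsert", payload["after"])
    elif op == "u":
        print("update", payload["after"])
    elif op == "d":
        print("delete", payload["before"])
```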