Read-Write ETL on NAS Data with EMR Serverless Spark — No Cluster, No Copy

May 26, 2026 · 4:57 PM UTC · 12 min read

⚡ TL;DR · AI summary

The article shows how EMR Serverless Spark can run read-write ETL on NAS data without managing a cluster or copying the data. In the reported run, the full ETL pipeline finished in 37 seconds including cold start, of which 16 seconds was Spark execution, at roughly $0.05 per job. Because FSx for ONTAP exposes its volumes through S3 Access Points, the EMR Serverless job reads from and writes to the NAS storage directly.

Key facts

▪ EMR Serverless Spark can read, transform, and write back Parquet files on FSx for ONTAP via S3 Access Points.
▪ Spark execution for the full ETL pipeline takes 16 seconds; the total job time, including cold start, is 37 seconds.
▪ The serverless approach removes the need for cluster management and costs approximately $0.05 per job.
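
The read, transform, and write-back flow in the key facts above can be sketched roughly as follows. This is a minimal illustration, not the article's code: the access point alias, application ID, role ARN, script location, prefixes, job name, and the `dropDuplicates()` transform are all hypothetical placeholders. It assumes the FSx for ONTAP S3 Access Point alias is used where a bucket name would normally go, and that `spark` is a `pyspark.sql.SparkSession`.

```python
def access_point_uri(alias: str, prefix: str) -> str:
    """Build an s3:// URI that uses an S3 Access Point alias as the bucket name."""
    return f"s3://{alias}/{prefix.strip('/')}/"


def run_etl(spark, input_uri: str, output_uri: str) -> None:
    """Read Parquet, apply a placeholder transform, and write Parquet back.

    `spark` is expected to be a pyspark.sql.SparkSession.
    """
    df = spark.read.parquet(input_uri)
    cleaned = df.dropDuplicates()  # placeholder transform (assumption)
    cleaned.write.mode("overwrite").parquet(output_uri)


def build_job_run_request(application_id: str, execution_role_arn: str,
                          script_uri: str, input_uri: str,
                          output_uri: str) -> dict:
    """Keyword arguments for boto3's emr-serverless start_job_run call."""
    return {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        "name": "nas-parquet-etl",  # hypothetical job name
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": script_uri,
                # Passed to the job script as command-line arguments.
                "entryPointArguments": [input_uri, output_uri],
            }
        },
    }


# All values below are hypothetical placeholders.
ALIAS = "my-fsxn-access-point-alias"
input_uri = access_point_uri(ALIAS, "raw/events")
output_uri = access_point_uri(ALIAS, "curated/events")
request = build_job_run_request(
    application_id="00example1234",
    execution_role_arn="arn:aws:iam::123456789012:role/example-emr-serverless-role",
    script_uri="s3://example-scripts-bucket/jobs/etl.py",
    input_uri=input_uri,
    output_uri=output_uri,
)
```

With AWS credentials and an existing EMR Serverless application, the request could be submitted with `boto3.client("emr-serverless").start_job_run(**request)`. The job script itself would create a `SparkSession` and call `run_etl` with the two URIs it receives as arguments. Writing to a different prefix than the one being read avoids overwriting the input while it is still being scanned.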

Original article

DEV.to (Top)

Read full at DEV.to (Top) →

Opening excerpt (first ~120 words)

Yoshiki Fujiwara (藤原善基) @AWS Community Builder, for AWS Community Builders
Posted on May 26

Read-Write ETL on NAS Data with EMR Serverless Spark — No Cluster, No Copy
#aws #spark #emr #amazonfsxfornetappontap

FSx for ONTAP S3 Access Points × Lakehouse Deep Dive (7 Part Series)
1 Query NAS Data In Place with Athena and FSx for ONTAP S3 Access Points
2 FSx for ONTAP S3 Access Points Lakehouse — What Works, What Doesn't, and Why
... 3 more parts...

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at DEV.to (Top).
