Subscribers 44,200
Views 6,998,591
Videos 1,790
Country US
Created Nov 2012 (12 years old)
Topics Knowledge, Technology, Lifestyle (sociology)

Description

As a Data Engineer with expertise in Apache Hudi and Iceberg, I specialize in building scalable data ingestion pipelines in the AWS Big Data ecosystem and data lakes. I am the creator of the "LakeBoost" framework, which integrates Apache Hudi with AWS Glue ETL, significantly improving operational efficiency and reducing costs for large-scale data operations. With advanced skills in Apache Spark and modern data platforms, I design and implement high-performance systems that enable robust data workflows. Beyond engineering, I am passionate about educating others and share my knowledge through my YouTube channel, which has 43,000 subscribers and over 1,600 videos on big data technologies. My goal is to make complex data engineering concepts accessible to a global audience, fostering both technical innovation and learning.

Videos from channel

Published Title Description Views
Jan 18, 2025 Thank you everyone for attending talk ... 90
Jan 18, 2025 Stay away from these financial mistakes ... 559
Jan 18, 2025 Financial mistakes made at different stages of life #music ... 549
Jan 15, 2025 Learn How to configure Spark Session to Join Managed (S3 Table Buckets) and Unmanaged Iceberg Table Learn How to configure your Spark Session to Join Managed (S... 73
Jan 11, 2025 Learn How to Ship and Publish Your First Python Package to PyPI with UV 🚀 Want to ship your first Python package to PyPI? 🚀 In my l... 171
Jan 10, 2025 Amazon S3 Tables Store Tabular Data in S3 Amazon Web Services ... 104
Jan 10, 2025 How to Ingest Data Incrementally from S3 Using S3 Events & SQS into S3 table Bucket|Run it Locally code https://github.com/soumilshah1995?tab=repositories Bl... 117
Jan 05, 2025 How to use Custom Images in EMR Serverless | Custom Docker File How to use Custom Images in EMR Serverless | Custom Docker ... 58
Jan 04, 2025 How to Use the New Hudi Streamer with Hudi 1.0.0 on EMR Serverless 7.5.0 | hands on Labs How to Use the Hudi Streamer with New Hudi 1.0.0 on EMR Serv... 48
Jan 04, 2025 Learn about Apache Hudi 1.0.0 Expression Index | hands on Labs #2 Learn about Apache Hudi 1.0.0 Expression Index | hands on La... 60
Jan 01, 2025 How Customers and Companies Can Use Hudi 1.0.0 on EMR Serverless | Developer Guide Code https://github.com/soumilshah1995/hudi-1-0-0-emr-serv... 72
Jan 01, 2025 ✨ Happy New Year! 2025 🎉 ✨ ✨ Happy New Year! 🎉 ✨ Wishing everyone a year full of growt... 150
Dec 29, 2024 Q/A | Office Hours | Ask Questions | 12/28/2024 ... 86
Dec 28, 2024 Creating a Ray Cluster on EMR on EC2 and Submitting Your First Ray Job #6 Creating a Ray Cluster on EMR on EC2 and Submitting Your Fir... 71
Dec 28, 2024 Learn to Use S3 Table Buckets Locally: Ingest Data from Raw to Silver with MERGE INTO Commands code https://github.com/soumilshah1995/s3tables-locally/blob... 108