Unix/Linux Basics for Data Engineers – Complete ETL Server Guide

Linux commands are essential for every Data Engineer working with ETL pipelines. Most production data pipelines run on Linux servers, and understanding command-line tools helps in debugging, automation, monitoring, and file processing.

1. pwd – Check Current Directory

Purpose: Shows your current working directory.

pwd

Example Output:

/home/etl_user/projects/sales_pipeline
ETL Scenario: Before running a pipeline script, confirm you're inside the correct project directory.

2. ls -l – List Files with Permissions

Purpose: Displays files with detailed permissions and ownership.

ls -l

Example Output:

-rwxr-xr-- 1 etl_user data_team 2048 Mar 1 02:00 etl_job.sh
ETL Scenario: Verify whether your ETL script has execution permission.

3. head – View Beginning of File

Purpose: View the first few lines of a file, such as a large CSV, without opening the whole file.

head -n 10 sales.csv

Example Output:

id,name,amount
1,John,200
2,Alice,150
ETL Scenario: Validate headers before loading data into database.
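
The header check can also be scripted so a load step refuses to run on an unexpected layout. This is a minimal sketch, assuming the expected header is id,name,amount as in the sample output; the temporary sample file and the variable names are made up for illustration.

```shell
# Create a small sample file so the sketch runs anywhere (illustrative data).
workdir=$(mktemp -d)
printf 'id,name,amount\n1,John,200\n2,Alice,150\n' > "$workdir/sales.csv"

expected_header='id,name,amount'                 # assumed column layout
actual_header=$(head -n 1 "$workdir/sales.csv")  # first line only

if [ "$actual_header" = "$expected_header" ]; then
    header_status="ok"
else
    header_status="mismatch"
fi
echo "header check: $header_status"
```

A file exported on Windows may end the header with a carriage return. Piping the header through tr -d '\r' before comparing avoids a false mismatch.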

4. tail – View End of Log File

Purpose: View the last lines of a log file.

tail -n 50 etl_job.log

Live Monitoring:

tail -f etl_job.log
ETL Scenario: Most ETL failures appear at the end of logs.

5. grep – Search Errors in Logs

Purpose: Search specific keywords inside files.

grep -i "error" etl_job.log

Example Output:

ERROR: Database connection timeout
ETL Scenario: Quickly identify root cause of job failure.
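
Two grep options are often useful on top of -i when triaging a log: -c to count matching lines and -n to show where they are. The sketch below runs against a small made-up log, so the file contents are an assumption, not real job output.

```shell
# Build a tiny sample log (illustrative content only).
workdir=$(mktemp -d)
cat > "$workdir/etl_job.log" <<'EOF'
INFO: Job started
INFO: Extract complete
ERROR: Database connection timeout
INFO: Retrying
error: retry failed
EOF

# -c counts matching lines; -i ignores case.
error_count=$(grep -c -i "error" "$workdir/etl_job.log")
echo "lines mentioning error: $error_count"

# -n prefixes each match with its line number in the log.
first_error=$(grep -n -i "error" "$workdir/etl_job.log" | head -n 1)
echo "first match: $first_error"
```

The first match is usually the most useful one, because later errors are often consequences of it.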

6. wc -l – Count Records

Purpose: Count the number of lines in a file. For a CSV with a header row, the record count is the line count minus one.

wc -l sales.csv

Example Output:

100001 sales.csv
ETL Scenario: Validate source vs target record count.
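
A source vs target comparison might look like the sketch below. The target count is hard-coded here as a hypothetical value; in a real pipeline it would come from a query against the target table.

```shell
# Sample file with a header and two records (illustrative data).
workdir=$(mktemp -d)
printf 'id,name,amount\n1,John,200\n2,Alice,150\n' > "$workdir/sales.csv"

# Redirecting the file into wc prints only the number, without the file name.
total_lines=$(wc -l < "$workdir/sales.csv")
source_rows=$((total_lines - 1))   # subtract the header line

target_rows=2   # hypothetical count reported by the target table after loading

if [ "$source_rows" -eq "$target_rows" ]; then
    count_status="match"
else
    count_status="mismatch"
fi
echo "source=$source_rows target=$target_rows -> $count_status"
```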

7. awk – Perform Calculations

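Purpose: Perform column-based operations. The example sums the third column (amount); the header row adds 0 to the total because awk treats non-numeric text as zero.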

awk -F',' '{sum+=$3} END {print sum}' sales.csv

Example Output:

350000
ETL Scenario: Validate total revenue before loading to warehouse.
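
A slightly more defensive variant skips the header explicitly and counts rows whose amount is not a plain number. This is a sketch on made-up sample data, and it assumes a simple CSV with no quoted commas inside fields, which -F',' cannot handle.

```shell
# Sample file with a header and two records (illustrative data).
workdir=$(mktemp -d)
printf 'id,name,amount\n1,John,200\n2,Alice,150\n' > "$workdir/sales.csv"

# NR > 1 skips the header row; $3 is the amount column.
total=$(awk -F',' 'NR > 1 { sum += $3 } END { print sum + 0 }' "$workdir/sales.csv")
echo "total amount: $total"

# Count rows whose amount is empty or not a plain number, as a basic sanity check.
bad_rows=$(awk -F',' 'NR > 1 && $3 !~ /^-?[0-9]+(\.[0-9]+)?$/ { n++ } END { print n + 0 }' "$workdir/sales.csv")
echo "rows with non-numeric amount: $bad_rows"
```

A non-zero bad row count is worth investigating before trusting the total, since awk silently treats those values as 0.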

8. chmod – Fix Permission Issues

Purpose: Grant execution permission to script.

chmod +x etl_job.sh
ETL Scenario: Fix "Permission Denied" error in production.
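
A deployment step can check for the execute bit before trying to run the script. The sketch below creates a throwaway etl_job.sh in a temporary directory (a stand-in for the real script) and grants execute permission only if it is missing.

```shell
# Create a stand-in script without execute permission (illustrative only).
workdir=$(mktemp -d)
printf '#!/bin/sh\necho "etl run"\n' > "$workdir/etl_job.sh"

# -x tests whether the current user may execute the file.
if [ ! -x "$workdir/etl_job.sh" ]; then
    # u+x grants execute permission to the owner only, which is narrower than +x.
    chmod u+x "$workdir/etl_job.sh"
fi

job_output=$("$workdir/etl_job.sh")
echo "$job_output"
```

Using u+x instead of +x is a design choice: it avoids making the script executable for every user on a shared server.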

9. ps -ef – Check Running Jobs

Purpose: See running processes. The brackets in the pattern stop grep from matching its own command line.

ps -ef | grep '[e]tl'
ETL Scenario: Check whether scheduled pipeline is still running.

10. kill -9 – Stop Stuck Job

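Purpose: Terminate a process forcefully. kill -9 sends SIGKILL, which gives the process no chance to clean up temporary files or release locks, so try a plain kill (SIGTERM) first.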

kill -9 12345
ETL Scenario: Stop a frozen ETL process consuming high CPU.
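
A gentler stop sequence sends SIGTERM first and escalates to kill -9 only if the process is still alive after a grace period. The sketch below uses a background sleep as a stand-in for the stuck job, and the 5 second grace period is an arbitrary assumption.

```shell
# Start a stand-in for a stuck job (a real PID would come from ps -ef).
sleep 300 &
job_pid=$!

# Ask politely first: plain kill sends SIGTERM, which a process can handle to clean up.
kill "$job_pid"

# Wait up to 5 seconds for the process to exit; kill -0 only tests that the PID exists.
waited=0
while kill -0 "$job_pid" 2>/dev/null && [ "$waited" -lt 5 ]; do
    sleep 1
    waited=$((waited + 1))
done

# Escalate to SIGKILL only if the process ignored SIGTERM.
if kill -0 "$job_pid" 2>/dev/null; then
    kill -9 "$job_pid"
fi

# wait works here only because the stand-in is a child of this shell.
exit_status=0
wait "$job_pid" 2>/dev/null || exit_status=$?
echo "job exited with status $exit_status"
```

An exit status above 128 means the process was ended by a signal: 128 plus the signal number.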

11. crontab – Schedule ETL Job

Edit Cron:

crontab -e

Example (Run Daily at 2 AM):

0 2 * * * /home/etl_user/etl_job.sh >> /home/etl_user/job.log 2>&1
ETL Scenario: Automate daily data warehouse load.
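
Cron starts a new run on schedule even if the previous one has not finished, which can lead to two loads writing at once. One common guard is a lock directory, since mkdir either succeeds or fails atomically. This is a sketch only: the lock location and the function name run_etl_once are hypothetical.

```shell
workdir=$(mktemp -d)               # stand-in for a shared lock location
lock_dir="$workdir/etl_job.lock"   # hypothetical lock name

run_etl_once() {
    # mkdir either creates the directory or fails, atomically, so it works as a lock.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "previous run still active, skipping"
        return 1
    fi
    echo "running ETL job"          # the real job command would go here
    rmdir "$lock_dir"               # release the lock when the job finishes
    return 0
}

run_etl_once
```

If the job can be killed midway, a stale lock may be left behind. A trap that removes the lock on exit is a common addition.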

12. df -h – Check Disk Space

Purpose: Check used and available space on each mounted filesystem. The -h option prints sizes in human-readable units.

df -h
ETL Scenario: Full disk is a common cause of ETL failures.
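
A pre-flight check can read the usage percentage and warn before the pipeline starts writing. In this sketch the 90 percent threshold and the checked path are assumptions; a real job would point at its data or staging directory.

```shell
threshold=90          # hypothetical alert level, in percent
target_path="/"       # filesystem to check; a data directory would be typical

# -P forces the POSIX one-line-per-filesystem layout, so column 5 is the usage percentage.
used_pct=$(df -P "$target_path" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }')

if [ "$used_pct" -ge "$threshold" ]; then
    echo "WARNING: $target_path is ${used_pct}% full"
else
    echo "OK: $target_path is ${used_pct}% full"
fi
```

The -P option is used instead of -h here because its fixed layout is easier to parse in a script.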

13. gzip – Compress Data Files

Purpose: Compress file before transfer.

gzip sales.csv

Result (gzip replaces the original file with the compressed one):

sales.csv.gz
ETL Scenario: Compress file before sending to S3 or FTP.
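
Plain gzip removes the original file once the compressed copy is written. When the source file should stay in place until the transfer is confirmed, one option is to write the compressed stream to a new file and verify it. The sketch below uses made-up sample data.

```shell
# Sample file with a header and two records (illustrative data).
workdir=$(mktemp -d)
printf 'id,name,amount\n1,John,200\n2,Alice,150\n' > "$workdir/sales.csv"

# -c writes the compressed stream to stdout, so the original file is left in place.
gzip -c "$workdir/sales.csv" > "$workdir/sales.csv.gz"

# -t tests the archive's integrity without extracting it.
if gzip -t "$workdir/sales.csv.gz"; then
    archive_status="ok"
else
    archive_status="corrupt"
fi

# Decompress to stdout and count lines to confirm nothing was lost.
lines_in_archive=$(gzip -dc "$workdir/sales.csv.gz" | wc -l)
echo "archive: $archive_status, lines: $lines_in_archive"
```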

Conclusion

Mastering these Unix/Linux commands enables Data Engineers to debug production issues, monitor ETL pipelines, validate data, and automate workflows efficiently. Strong command-line knowledge significantly improves troubleshooting speed and reliability in real-world data engineering environments.
