Hadoop Distributed File System (HDFS) commands

HDFS stands for the Hadoop Distributed File System. As I have explained in some of my earlier articles on Hadoop, HDFS is designed for storing very large amounts of data. Working with HDFS is similar to working with the local file system in Linux: just as we use shell commands on the local file system, there is a set of commands for HDFS. To keep things clear, we will study the HDFS commands in two parts: Linux-like commands and Hadoop-specific commands.

Here we go!

Linux-like commands:
cat, cp, df, du, ls, mkdir, mv, rm, stat, tail, etc.

Display the content of a file (cat):
root@kb:/home/kb# hadoop fs -cat /pcode/abc.txt 
Copy file from local disk to hdfs (cp):
root@kb:/home/kb# hadoop fs -cp file:///home/kb/a.c hdfs://localhost:9000/pcode/abc.txt 
Free disk space (df):
root@kb:/home/kb# hadoop fs -df 
Filesystem               Size        Used    Available     Use%
hdfs://localhost:9000  45793492992  1549187  12586029056    0% 
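
The Use% column is rounded to a whole number, so a nearly empty cluster shows 0%. As a rough sketch, the percentage can be recomputed with more precision from the Size and Used columns. The function name df_use_pct is my own, and the script assumes the column order shown above (Filesystem, Size, Used, Available, Use%).

```shell
# Print Used as a percentage of Size for each line of `hadoop fs -df` output.
# Assumption: columns are Filesystem, Size, Used, Available, Use% (as above).
df_use_pct() {
  # NR > 1 skips the header line; $2 > 0 avoids dividing by zero.
  awk 'NR > 1 && NF >= 4 && $2 > 0 { printf "%s %.4f%%\n", $1, ($3 / $2) * 100 }'
}

# On a running cluster this would be: hadoop fs -df | df_use_pct
# Here the sample output from above is fed in directly.
printf '%s\n' \
  'Filesystem               Size        Used    Available     Use%' \
  'hdfs://localhost:9000  45793492992  1549187  12586029056    0%' | df_use_pct
```

If you only want readable sizes rather than a precise percentage, hadoop fs -df -h prints the same table in human-readable units.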
Disk usage (du):
root@kb:/home/kb# hadoop fs -du / 
564473  /MaxTemp 
0        /hcommand 
23059    /op 
95       /op123 
95       /output 
13948    /pcode 
604749  /tmp 
9        /user 
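
A common follow-up is finding which paths take the most space. The sketch below sorts du output by its first column; largest_paths is a hypothetical helper name, and it assumes the size is the first field and the path is the last field (newer Hadoop versions print an extra column in between, which this tolerates).

```shell
# Show the N largest entries from `hadoop fs -du` output read on stdin.
# Assumption: size is the first field and the path is the last field.
largest_paths() {
  # $1 = number of entries to show (defaults to 3)
  sort -k1,1nr | head -n "${1:-3}" | awk '{ print $1, $NF }'
}

# On a running cluster this would be: hadoop fs -du / | largest_paths 3
# Here part of the sample output from above is fed in directly.
printf '%s\n' \
  '564473  /MaxTemp' \
  '0        /hcommand' \
  '23059    /op' \
  '13948    /pcode' \
  '604749  /tmp' | largest_paths 3
```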
Display files from hdfs (ls):
root@kb:/home/kb# hadoop fs -ls / 
Found 8 items 
drwxr-xr-x   - root supergroup          0 2016-01-09 08:48 /MaxTemp 
drwxr-xr-x   - root supergroup          0 2016-01-15 11:34 /hcommand 
drwxr-xr-x   - root supergroup          0 2016-01-11 06:54 /op 
drwxr-xr-x   - root supergroup          0 2016-01-13 11:12 /op123 
drwxr-xr-x   - root supergroup          0 2016-01-13 06:57 /output 
drwxr-xr-x   - root supergroup          0 2016-01-15 11:14 /pcode 
drwx-wx-wx   - root supergroup          0 2016-01-09 09:28 /tmp 
drwxr-xr-x   - root supergroup          0 2016-01-09 09:29 /user 
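
Because the listing has the same shape as ls -l on Linux, it can be filtered with the usual text tools. As a small sketch, the helper below (hdfs_dirs is my own name) keeps only directories by checking that the permission string starts with d. It assumes paths contain no spaces, and the file entry in the sample input is a made-up line added only to show the filtering.

```shell
# Print only the directory paths from `hadoop fs -ls` output read on stdin.
# Assumption: paths contain no whitespace, so the path is the last field.
hdfs_dirs() {
  # Directory entries have a permission string starting with "d";
  # the "Found N items" header and plain files do not match.
  awk '$1 ~ /^d/ { print $NF }'
}

# On a running cluster this would be: hadoop fs -ls / | hdfs_dirs
# The -rw-r--r-- line below is a hypothetical file entry for illustration.
printf '%s\n' \
  'Found 3 items' \
  'drwxr-xr-x   - root supergroup          0 2016-01-15 11:14 /pcode' \
  '-rw-r--r--   1 root supergroup      13948 2016-01-15 11:14 /example.txt' \
  'drwxr-xr-x   - root supergroup          0 2016-01-09 09:29 /user' | hdfs_dirs
```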
Creating directory in hdfs (mkdir):
root@kb:/home/kb# hadoop fs -mkdir /hcommand
HDFS has no editor of its own, so the usual way to create a file with content is to create it in the local file system and then put it into HDFS.
Move file from one location to another (mv):
root@kb:/home/kb# hadoop fs -mv /pcode/abc.txt /hcommand/a.txt
Remove/Delete the file or directory (rm -r):
Directory : 
root@kb:/home/kb# hadoop fs -rm -r /hcommand 

File (the -r flag is only required for directories):
root@kb:/home/kb# hadoop fs -rm /pcode/wcinput.txt 
Display the status of a file (stat):
root@kb:/home/kb# hadoop fs -stat /MaxTemp/1990 
2016-01-09 03:18:50 
Display last part of a file (tail):
root@kb:/home/kb# hadoop fs -tail /MaxTemp/1990

Shows the last kilobyte of the file (not the last 10 lines, as the Linux tail command does by default).
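
Since -tail works in bytes (the last kilobyte) rather than lines, one option when a fixed number of lines is wanted is to stream the file with -cat and let the Linux tail command do the cutting. This is only a sketch: hdfs_last_lines and the HADOOP_CMD variable are names I made up, and streaming reads the whole file, so it is not a good fit for very large files.

```shell
# HADOOP_CMD lets the hadoop binary be swapped out (for example in tests).
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Print the last N lines of an HDFS file.
# $1 = HDFS path, $2 = number of lines (defaults to 10)
hdfs_last_lines() {
  "$HADOOP_CMD" fs -cat "$1" | tail -n "${2:-10}"
}

# Usage on a running cluster:
#   hdfs_last_lines /MaxTemp/1990 10
```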
Hadoop Specific Commands:
copyFromLocal, put, copyToLocal, get, moveFromLocal, setrep, getmerge, distcp, appendToFile, checksum, count, etc.
Copy file from local disk to HDFS:
copyFromLocal :
root@kb:/home/kb# hadoop fs -copyFromLocal /home/kb/wcinput.txt /pcode

put:
root@kb:/home/kb# hadoop fs -put /home/kb/wcinput.txt /pcode
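
Both commands fail if the destination file already exists. In scripts, a simple way around this is to check first with hadoop fs -test -e, which returns exit status 0 when the path exists. The sketch below shows the idea; upload_if_missing and HADOOP_CMD are hypothetical names, and the destination must be the full file path in HDFS rather than just the target directory.

```shell
# HADOOP_CMD lets the hadoop binary be swapped out (for example in tests).
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Upload a local file only when the HDFS destination does not exist yet.
# $1 = local source file, $2 = full destination file path in HDFS
upload_if_missing() {
  if "$HADOOP_CMD" fs -test -e "$2"; then
    echo "skip: $2 already exists"
  else
    "$HADOOP_CMD" fs -put "$1" "$2" && echo "uploaded: $1 -> $2"
  fi
}

# Usage on a running cluster:
#   upload_if_missing /home/kb/wcinput.txt /pcode/wcinput.txt
```

If overwriting is acceptable, hadoop fs -put -f replaces the existing file and no check is needed.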
Copy file from HDFS to local disk:
copyToLocal :
root@kb:/home/kb# hadoop fs -copyToLocal  /pcode/wcinput.txt /home/kb/Desktop/wcip.txt

get: 
root@kb:/home/kb/Desktop# hadoop fs -get /pcode/wcinput.txt /home/kb/Desktop/wcip.txt 
Move file from local disk to HDFS:
root@kb:/home/kb/Desktop# hadoop fs -moveFromLocal rank.txt /pcode
Note: the reverse operation, moveToLocal, is not available; running it prints “moveToLocal: Option '-moveToLocal' is not implemented yet.”
Setting number of replication in HDFS:
root@kb:/home/kb# hadoop fs -setrep 5 /pcode/rank.txt 
Replication 5 set: /pcode/rank.txt 
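Merge files from HDFS into a single local file (getmerge):
root@kb:/home/kb# hadoop fs -getmerge /pcode/rank.txt /pcode/wcinput.txt
The last argument is a path on the local file system, not in HDFS, so the command above writes the contents of /pcode/rank.txt to the local file /pcode/wcinput.txt. To merge several files, give a source directory (or, in recent Hadoop versions, several source files) first and the local destination file last.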
Parallel copying with distcp :
root@kb:/home/kb# hadoop distcp /pcode /MaxTemp
Internally, this command runs as a MapReduce job, which is how the copy is done in parallel.
Appending file in HDFS:
root@kb:/home/kb# hadoop fs -appendToFile brank.txt /MaxTemp/pcode/rank.txt
Appends a single source file, or multiple source files, from the local file system to the destination file in HDFS.
Checksum :
root@kb:/home/kb# hadoop fs -checksum /MaxTemp/pcode/rank.txt 
/MaxTemp/pcode/rank.txt MD5-of-0MD5-of-512CRC32C 000002000000000000000000f2c5599bc7fa5ba8fe3895b0b291fdb2 
Count of (directory, files, total_size) :
root@kb:/home/kb# hadoop fs -count /MaxTemp 
2(no. of directory) 3(no. of files) 579037(total size) /MaxTemp 
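
The real output contains only the numbers and the path; the labels in brackets above are my annotations. If you want labelled output in a script, a small awk filter can add them. This is a sketch that assumes the default column order of directory count, file count, content size and path; count_summary is a name I made up.

```shell
# Label the columns of `hadoop fs -count` output read on stdin.
# Assumption: default column order is DIR_COUNT FILE_COUNT CONTENT_SIZE PATH.
count_summary() {
  awk 'NF >= 4 { printf "path=%s dirs=%s files=%s bytes=%s\n", $NF, $1, $2, $3 }'
}

# On a running cluster this would be: hadoop fs -count /MaxTemp | count_summary
# Here the sample values from above are fed in directly.
printf '%s\n' '           2            3             579037 /MaxTemp' | count_summary
```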
That's all for this tutorial. Please let me know your views on this article in the comment section below, and stay tuned for more articles.
