This document shows a number of essential commands for manipulating files on Hadoop's HDFS file system. Nearly all of these commands begin with hdfs dfs, followed by an option that resembles an ordinary Unix file system command, although there are a few special cases. Keep in mind that every user has a home directory in HDFS, which is the default location for any file named without a path prefix. You have full access to your own home directory, but HDFS manages permissions much as Unix does, so your access to other directories in HDFS may be limited. The hadoop user's home directory serves as a storage place for various public files, and all users have read access to it.
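As a rough illustration of the command form described above, the following sketch runs a few common hdfs dfs operations. The names notes.txt and demo are hypothetical, and /user/hadoop is only the conventional default location of the hadoop user's home directory; it may differ on Lemuria. The run helper is not part of Hadoop. It simply echoes each command, and skips executing it when DRY_RUN is set.

```shell
# Sketch only: notes.txt, demo, and /user/hadoop are assumptions for illustration.
# run echoes a command, then executes it unless DRY_RUN is set.
run() {
    echo "+ $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

hdfs_tour() {
    run hdfs dfs -ls                   # no path given: lists your HDFS home directory
    run hdfs dfs -mkdir -p demo        # relative path: created under your home directory
    run hdfs dfs -put notes.txt demo/  # copy a local file into HDFS
    run hdfs dfs -cat demo/notes.txt   # print the contents of an HDFS file
    run hdfs dfs -ls /user/hadoop      # assumed path of the hadoop user's public files
}
```

Running DRY_RUN=1 hdfs_tour prints the five commands without contacting HDFS, which is a safe way to preview them. Running hdfs_tour alone executes them and requires a local file named notes.txt.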
Before you can use Hadoop on Lemuria, you must first set up your environment. Do this by executing the following command:
$ source useHadoop.sh
You must run this command each time you open a new terminal session on Lemuria. The environment it creates is lost when you close the terminal, which avoids potential conflicts with other programs when you are not using Hadoop.
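Because the setup must be repeated in every new terminal, a quick check can tell you whether the current session is ready. The sketch below assumes that useHadoop.sh makes the hdfs command available on your PATH; this document does not say exactly what the script changes, so treat that as an assumption. The function name hadoop_ready is hypothetical.

```shell
# Sketch only: assumes useHadoop.sh puts the hdfs command on PATH.
hadoop_ready() {
    if command -v hdfs >/dev/null 2>&1; then
        echo "Hadoop environment is active."
    else
        echo "hdfs not found; run 'source useHadoop.sh' in this terminal." >&2
        return 1
    fi
}
```

Calling hadoop_ready before issuing any hdfs dfs command gives a clearer message than the shell's generic "command not found" error.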
Last Revised: 2025-01-09
© Copyright 2025 by Peter Chapin <peter.chapin@vermontstate.edu>