Pentesting beyond the basics — Cloud WAFs, reverse proxies, IPS, MVC routes (level: hard)

10 min read · Mar 17, 2024

Even a seasoned pentester should read this and get in on the discussion

I'm not writing this to school anyone. I am open to learning, so please comment if you can help. This is a discussion, not me telling you or talking at you ;p After doing 10 bug bounties over the past 4–5 months, here is what I've noticed:

Let's look at a modern web application:

This was found on laravel.com, the site for the Laravel PHP framework, but it goes for almost any modern web app.

Now let's look at a historical web application:

This is your web app up until a few years ago

Let's break down these different layers / abstractions. Keep in mind, I'm not professionally a programmer, so cut me some slack.

MVC —

In modern web applications, the MVC architecture is designed to enhance code organization and maintenance by separating different concerns. This separation allows developers to manage each component independently. While the traditional MVC pattern consists of three components (Model, View, and Controller), some frameworks extend this architecture to include a fourth component called “Routing.” If it's a new concept, forget what you thought you knew about paths and web URIs, and don't confuse layer 3 “routing” with what we are discussing here. This discussion has nothing to do with networking.

The exponential growth of the Internet has been good and bad, depending on your viewpoint. It's gotten more complex to attack web apps, but it's gotten much MORE COMPLEX to configure them right ;)

Here is a typical setup for today's corp:

Let's just say that prior to Web 3.0, web applications were far more likely to be housed in a datacenter, you might unknowingly deal with two databases to place an order, and a web application firewall was rarely part of the picture.

To clarify this so there is no confusion:

Conventional Firewall vs WAF (overview):

Layer of Operation:

Conventional Firewall:
- Operates primarily at the network level (Layers 3 and 4 of the OSI model)
- Focuses on IP addresses and ports
- Filters traffic based on protocol, source and destination IPs, and ports, allowing or blocking traffic based on these parameters
Web Application Firewall (WAF):
- Operates at the application layer (Layer 7)
- Designed to inspect, monitor, and filter HTTP/HTTPS traffic to and from a web application

Critical Difference: WAFs understand and analyze the content of web traffic, including URLs, headers, and the body of messages.
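To make that difference concrete, here is a rough sketch (not any real product's logic) of the two decision styles. The `Packet` class, the blocked network, and the `SUSPICIOUS` marker list are all made up for illustration; a real WAF parses HTTP properly and uses far richer rules.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass
class Packet:
    src_ip: str
    dst_port: int
    protocol: str   # "tcp" or "udp"
    payload: bytes  # raw application data carried by the packet


# Hypothetical rule set, for illustration only.
ALLOWED_PORTS = {80, 443}
BLOCKED_NETS = [ip_network("203.0.113.0/24")]

# Hypothetical content markers a layer 7 filter might look for.
SUSPICIOUS = ("<script", "' or 1=1", "../")


def l4_firewall_allows(pkt: Packet) -> bool:
    """Layer 3/4 decision: only addresses, ports and protocol are consulted."""
    if any(ip_address(pkt.src_ip) in net for net in BLOCKED_NETS):
        return False
    return pkt.protocol == "tcp" and pkt.dst_port in ALLOWED_PORTS


def waf_allows(pkt: Packet) -> bool:
    """Layer 7 decision: the HTTP content itself is inspected."""
    text = pkt.payload.decode("latin-1").lower()
    return not any(marker in text for marker in SUSPICIOUS)


attack = Packet(
    "198.51.100.7", 443, "tcp",
    b"GET /searchItem.php?id=0438' OR 1=1-- HTTP/1.1\r\n\r\n",
)
print("L4 firewall allows:", l4_firewall_allows(attack))
print("WAF allows:", waf_allows(attack))
```

The same request passes the port/IP check and fails the content check, which is the whole point of putting a WAF in front of the app.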

The next major difference between WAF and regular FW:

Focus of what is being protected:

Conventional Firewall:
Aimed at securing the network from unauthorized access and attacks
Examples:
— IP spoofing, port scans, and DDoS attacks at the network level
Web Application Firewall (WAF):
Protects web applications from application-layer attacks
Examples:
— SQL injection
— Cross-Site Scripting (XSS)
— file inclusion, security misconfigurations, and countless others.

Critical Difference: The WAF is designed to understand and mitigate attacks that exploit the specific logic of web applications.

The next major difference between WAF and regular FW: Traffic Inspection:

Conventional Firewall:
- Performs stateful inspection of packets, examining the state and context of network connections
- Typically does not delve into the payloads of packets
Web Application Firewall (WAF):
- Performs deep packet inspection, focused on the application layer (Layers 5–7)
- Analyzes the contents and context of HTTP/HTTPS requests and responses, usually in depth, including form fields, cookies, and API calls, for malicious patterns

The next major difference between WAF and regular FW:

Deployment Location:

Conventional Firewall:
Typically deployed at the network perimeter to protect an organization’s internal networks from external threats.
Web Application Firewall (WAF):
Positioned in front of web applications, either in the cloud or on-premises, to specifically protect web applications from attacks and vulnerabilities.

In addition, and without boring you to death (it is important to understand these key differences), lastly:

Rules and Policies:

Conventional Firewall:
Utilizes a set of predefined rules based on IP addresses, ports, and protocols to allow or block traffic. These rules are relatively static and require manual updates for changes in the network structure or policy.
Web Application Firewall (WAF):
Employs a more dynamic set of rules that can be automatically updated and customized to the specific security requirements of each web application. WAFs can use signatures, anomaly detection, and behavior-based analysis to identify and mitigate threats.
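As a rough illustration of signature matching combined with anomaly scoring, here is a toy scorer. The patterns, weights, and `BLOCK_THRESHOLD` are invented for this sketch and are not taken from any real WAF rule set.

```python
import re

# Hypothetical signatures and weights; real rule sets are far larger.
SIGNATURES = [
    (re.compile(r"<\s*script", re.I), 5),             # script tag injection
    (re.compile(r"\bunion\b.+\bselect\b", re.I), 5),  # SQL UNION SELECT
    (re.compile(r"\.\./"), 4),                        # path traversal
    (re.compile(r"%00"), 3),                          # encoded null byte
]
BLOCK_THRESHOLD = 5  # assumed cut-off for this sketch


def anomaly_score(request_text: str) -> int:
    """Sum the weight of every signature that matches the request."""
    return sum(weight for pattern, weight in SIGNATURES
               if pattern.search(request_text))


def is_blocked(request_text: str) -> bool:
    """Block once the accumulated score reaches the threshold."""
    return anomaly_score(request_text) >= BLOCK_THRESHOLD


for req in ("GET /cart HTTP/1.1", "GET /a/../b", "GET /a/../b%00"):
    print(req, "->", anomaly_score(req), "blocked" if is_blocked(req) else "passed")
```

With scoring, one weak indicator can slip through while two together trip the threshold, which is why small changes to a payload can flip the outcome.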

Firewalls, in the conventional (Layer 3–4) sense, were merely permit / deny based on the typical TCP/IP attributes that are still in use today. So, with that being said, taking an Ubuntu box running Apache2 and PHP, let's just say the historic web server looked like:

https://www.wallymart.com
https://www.wallymart.com/cart
https://www.wallymart.com/searchItem.php?id=0438
https://dev.wallymart.com # The development server, likely on the same VM

All 4 of those examples are URLs / URIs (Uniform Resource Locators / Identifiers).

Even a midsize company could probably run off a single server, and there wasn't much knowledge about containers and cloud orchestration. Let's look at a hypothetical current site now:

The difference now:

Now the /paths are considered “routes” and, many times, bring you to different systems altogether just by visiting different paths.

https://wallycloudmart.com/ # Rev Proxy -> WAF -> Forward Proxy -> Server
https://wallycloudmart.com/cart/ # 301 to another node “SalesForce”
https://wallycloudmart.com/searchItem ->
https://search.wallycloudmart.com/ # Hosted in a diff. cloud infra.

https://wallycloudmart.com/extranet # 301 ->
https://extranet.wallycloudmart.com — which is in a diff subnet / webserver

Let's try to digest what we just discussed. The moral of the story, from my experience both architecting these solutions and pentesting them, back then and today, is that the cloud is playing a pivotal role. Stepping back for a moment: the MVC methodology and separating web apps by /routes are not cloud specific, so please do not think that. I am just discussing both at once. What I covered is that:
/thispath # Takes me to internal back end webserver 10.8.2.1:443
/thatpath # Takes me to internal back end webserver 10.8.2.2:443

On modern web apps, different paths could mean completely different UI experiences, applications and credentials needed.
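Here is a minimal sketch of how a reverse proxy might map a path prefix to a back end. The two back-end addresses come from the example above; the `ROUTES` table and `pick_backend` helper are hypothetical names, and real proxies have far more involved matching rules.

```python
from typing import Optional

# Routing table assumed for illustration; addresses echo the example above.
ROUTES = {
    "/thispath": "https://10.8.2.1:443",
    "/thatpath": "https://10.8.2.2:443",
}


def pick_backend(path: str) -> Optional[str]:
    """Return the upstream for the longest matching path prefix, if any."""
    best = None
    for prefix, upstream in ROUTES.items():
        # Match the prefix itself or anything below it, but not
        # look-alikes such as /thispathology.
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, upstream)
    return best[1] if best else None


for p in ("/thispath/login", "/thatpath", "/thispathology", "/other"):
    print(p, "->", pick_backend(p))
```

From the outside you only see one hostname, so changes in headers, cookies, or error pages between paths are often the only hint that you crossed into a different back end.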

MVC is a pattern composed of Model, View, and Controller, which many frameworks extend with Routing.
Model: represents the application's:
- data structure
- business logic
- rules
- data retrieval
- storage
- manipulation

When you hit a /path on a server nowadays, imagine that you are accessing a completely different system.

The Model communicates with the database and is typically accessed by the Controller (to retrieve or save data).

Importantly, it remains unaware of the user interface.

View

Purpose: The View is responsible for presenting the user interface. It displays data to users and formats it for interaction. Whether it’s a web page or a desktop window, the View ensures a seamless user experience.
Interaction: The View receives data from the Controller and focuses solely on display. It doesn’t process or store data.

Controller

Purpose: Acting as an intermediary, the Controller bridges the gap between the Model and the View. It listens to user inputs (such as mouse clicks or keyboard events), retrieves data from the Model, and selects an appropriate View for rendering.

Interaction: The Controller manages data flow into the Model and updates the View whenever data changes. By keeping the View and Model separate, it maintains a clear separation of concerns.

I find this interesting, considering it's such an integral part of this whole framework: Routing is not traditionally part of MVC.

Routing determines how the application responds to client requests based on specific endpoints (URIs or paths) and HTTP methods (such as GET or POST). While not a standard part of MVC, it plays a crucial role in directing requests to the appropriate components.
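To tie the four pieces together, here is a toy sketch in plain Python, with no real framework involved. The `CART` data, function names, and the single `/cart` route are all invented for illustration.

```python
from typing import Callable, Dict, Tuple

# Model: data and rules, with no knowledge of the user interface.
CART = {"0438": {"name": "widget", "qty": 2}}


def model_get_item(item_id: str) -> dict:
    return CART.get(item_id, {})


# View: formatting only, no data access.
def view_item(item: dict) -> str:
    return f"{item['name']} x{item['qty']}" if item else "not found"


# Controller: takes the input, asks the Model, picks a View.
def cart_controller(item_id: str) -> str:
    return view_item(model_get_item(item_id))


# Routing: (HTTP method, path) -> controller.
ROUTES: Dict[Tuple[str, str], Callable[[str], str]] = {
    ("GET", "/cart"): cart_controller,
}


def dispatch(method: str, path: str, item_id: str) -> Tuple[int, str]:
    """Look up the controller for this method and path, or answer 404."""
    handler = ROUTES.get((method.upper(), path))
    if handler is None:
        return 404, "no route"
    return 200, handler(item_id)


print(dispatch("GET", "/cart", "0438"))
print(dispatch("POST", "/cart", "0438"))
```

Note that the same path answers differently per HTTP method, which is one reason it is worth trying other verbs against every route you find.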

Remember that this extended architecture provides flexibility and adaptability, especially when handling complex web applications.

I am sorry if that was torture, but if you made it this far, I commend you, and it will be worth it. The way I can tell where I am within a web app I am testing comes down primarily to 2 methods:
1) Inspect: console, network, elements, etc.
2) Burp: Repeater / Intruder

Let's look at this request, always being on the lookout for the following:
* Anything “non-standard” or “weird” looking
* Mismatched, misspelled or out-of-order headers
* Duplicate headers
* Inconsistent responses to the same request
* CUSTOM HEADERS (like Tesla uses: X-Tzla-)
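Part of that checklist can be automated. Below is a small sketch that flags duplicate, X- prefixed, and unrecognized header names. The `KNOWN` set is a deliberately short, incomplete list I picked for the example, and `X-Tzla-Foo` is a made-up header name built on the prefix mentioned above.

```python
from collections import Counter
from typing import Dict, List, Tuple

# A small, deliberately incomplete set of common header names (assumption).
KNOWN = {
    "host", "user-agent", "accept", "accept-encoding", "accept-language",
    "cookie", "content-type", "content-length", "connection", "referer",
    "server", "date", "cache-control", "set-cookie",
}


def triage_headers(headers: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group header names worth a second look."""
    names = [name.lower() for name, _ in headers]
    counts = Counter(names)
    return {
        "duplicates": sorted(n for n, c in counts.items() if c > 1),
        "x_prefixed": sorted({n for n in names if n.startswith("x-")}),
        "unknown": sorted({n for n in names
                           if n not in KNOWN and not n.startswith("x-")}),
    }


sample = [
    ("Host", "example.test"),
    ("X-Tzla-Foo", "1"),   # hypothetical custom header
    ("Cookie", "a=1"),
    ("Cookie", "b=2"),     # duplicate header
    ("Severr", "nginx"),   # misspelled header name
]
print(triage_headers(sample))
```

Headers are kept as a list of pairs rather than a dict on purpose, because a dict would silently hide the duplicates you are hunting for.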

Here is a request to pagead2.googlesyndication.com. Most of the information is being “sent” via the GET request in the URL:

Anytime I see a header (response or request) that starts with X-, I am all over it. However, looking up X-Client-Data, according to the GPT pentester:

The thing to do, in my mind, is send another request and look for deltas:

Request:

Response #1:

Always look for 1) URLs 2) Known headers of interest 3) Unknown responses
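Looking for deltas by eye gets old quickly, so here is a minimal sketch of a header diff between two responses. The sample header values are invented; in practice you would paste in what Burp or the browser shows you.

```python
from typing import Dict, Optional, Tuple


def header_deltas(
    first: Dict[str, str], second: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return headers whose value differs, appears, or disappears."""
    a = {k.lower(): v for k, v in first.items()}
    b = {k.lower(): v for k, v in second.items()}
    deltas = {}
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            deltas[name] = (a.get(name), b.get(name))
    return deltas


# Hypothetical responses to the same request, sent twice.
resp1 = {"Server": "ATS", "Age": "0"}
resp2 = {"server": "ATS", "Age": "3", "X-Cache": "hit"}
print(header_deltas(resp1, resp2))
```

A header that only shows up on the second response (a cache hit marker, for example) tells you another layer answered that time.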

Response #2: (With comments)

This response is providing lots of info. Let's just go over what I see:

First of all we made a POST request to espanol.yahoo.com on TCP/443.
Looking back, the POST URI was extensive:

/tdv2_fp/api/resource/NotificationHistory.getHistory;count=5;imageTag=img%3A40x40%7C2%7C80;theme=default;notificationTypes=breakingNews;lastUpdate=1710642684;loadInHpViewer=true;includePersonalized=;partner=yahoo

When we see a URI like that, it's surprising that it's so long; usually the data is in the cookies, etc. However, I notice a URL (percent / hex) encoded string. When you see a string like this, ask yourself why they went through the trouble of encoding it:

The string provided, img%3A40x40%7C2%7C80;, was URL encoded.

%3A is the URL-encoded representation of :
%7C is the URL-encoded representation of |

So, decoding img%3A40x40%7C2%7C80; results in img:40x40|2|80;
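If you would rather not decode by hand, Python's standard library does it. The `decode_fully` helper is my own addition for values that have been encoded more than once; it is a sketch, not something from the original traffic.

```python
from urllib.parse import unquote


def decode_fully(value: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the value stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value


encoded = "img%3A40x40%7C2%7C80;"
print(unquote(encoded))               # img:40x40|2|80;
print(decode_fully("img%253A40x40"))  # img:40x40 (was encoded twice)
```

Double encoding is worth checking for, since one layer may decode once and the next layer may decode again.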

It doesn't look too interesting, but get used to decoding strings. Another response: notice Server: ATS (Apache Traffic Server). Start googling. We know they run Kubernetes on port 4080. I would start adding X-Forwarded headers.

Look for any custom header responses or anything starting with X-*
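For the X-Forwarded idea, here is a small sketch that builds every combination of a few candidate values so they can be replayed one by one in Repeater or a script. The header names are the common de facto ones; the values are placeholders I chose, so swap in addresses that make sense for your target and scope.

```python
from itertools import product
from typing import Dict, List

# Candidate values are assumptions for illustration only.
CANDIDATES: Dict[str, List[str]] = {
    "X-Forwarded-For": ["127.0.0.1", "10.8.2.1"],
    "X-Forwarded-Host": ["localhost"],
    "X-Forwarded-Proto": ["http", "https"],
}


def forwarded_header_sets(
    candidates: Dict[str, List[str]] = CANDIDATES,
) -> List[Dict[str, str]]:
    """Return one header dict per combination of candidate values."""
    names = list(candidates)
    return [
        dict(zip(names, combo))
        for combo in product(*(candidates[n] for n in names))
    ]


for headers in forwarded_header_sets():
    print(headers)
```

Send each set with an otherwise identical request and compare the responses; any difference suggests something behind the proxy trusts those headers.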

In this example, several things jump out at me, and they should be jumping out at you too. I hope by now you know what k8s is.

Some key injection points on that URL: GET /tdv2_fp/api/resource/NotificationHistory.getHistory;count=<HERE>5;imageTag=<HERE>img%3A40x40%7C2%7C80<HERE>;theme=<HERE>default;notificationTypes=breakingNews;lastUpdate=1710642684;loadInHpViewer=<HERE>true;includePersonalized=<HERE>;partner=<HERE>yahoo HTTP/1.1
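Here is a sketch of how those injection points could be enumerated programmatically: split the `;key=value` parameters and place a marker in front of one value at a time. The `FUZZ` marker and the function names are my own choices for the example.

```python
from typing import List, Tuple

URI = (
    "/tdv2_fp/api/resource/NotificationHistory.getHistory;count=5;"
    "imageTag=img%3A40x40%7C2%7C80;theme=default;"
    "notificationTypes=breakingNews;lastUpdate=1710642684;"
    "loadInHpViewer=true;includePersonalized=;partner=yahoo"
)


def split_matrix_params(uri: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split the path from its semicolon-delimited key=value parameters."""
    base, *segments = uri.split(";")
    params = []
    for segment in segments:
        key, _, value = segment.partition("=")
        params.append((key, value))
    return base, params


def inject_each(uri: str, marker: str) -> List[str]:
    """Return one URI per parameter, with the marker placed before its value."""
    base, params = split_matrix_params(uri)
    variants = []
    for i in range(len(params)):
        mutated = [
            (k, marker + v if j == i else v)
            for j, (k, v) in enumerate(params)
        ]
        variants.append(base + "".join(f";{k}={v}" for k, v in mutated))
    return variants


for variant in inject_each(URI, "FUZZ"):
    print(variant)
```

Changing one parameter per request keeps the deltas readable: if a response changes, you know exactly which value caused it.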

I have an entire list of payloads at DJ Substance / XSS Payload List / Short .

They look like:
;{JAVASCRIPT};
{JAVASCRIPT};
<SCR%00IPT>{JAVASCRIPT}</SCR%00IPT>
\";{JAVASCRIPT};//
<STYLE TYPE="text/javascript">{JAVASCRIPT};</STYLE>
<<SCRIPT>{JAVASCRIPT}//<</SCRIPT>
"{EVENTHANDLER}={JAVASCRIPT}
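The placeholders in those templates are meant to be filled in before use. A minimal sketch of that substitution follows; the `alert(1)` and `onerror` defaults are the usual harmless test values and are my assumption, not part of the payload list.

```python
# A few templates copied from the list above.
TEMPLATES = [
    ";{JAVASCRIPT};",
    "<SCR%00IPT>{JAVASCRIPT}</SCR%00IPT>",
    "\"{EVENTHANDLER}={JAVASCRIPT}",
]


def expand(template: str,
           javascript: str = "alert(1)",
           eventhandler: str = "onerror") -> str:
    """Fill the {JAVASCRIPT} and {EVENTHANDLER} placeholders."""
    return (template
            .replace("{JAVASCRIPT}", javascript)
            .replace("{EVENTHANDLER}", eventhandler))


for template in TEMPLATES:
    print(expand(template))
```

Plain `str.replace` is used instead of `str.format` because payloads often contain literal braces that `format` would choke on.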

This should be nothing new to you. Now to the meat and potatoes of the article. To get to your target you need a passing “score” to get past the WAF... OR, if you're lucky, look for DNS misconfigurations like I have seen on fastly.net. If you use https://dnsdumpster.com and look up sites like yahoo.com, you will see CNAMEs and other entries that look interesting, and if the stars are shining in your favor, one of those links may just lead you directly to the target without having to deal with the defenses. The odds are low.

Here is a response from googletagservices.com. Let me mention that I did not initiate it; it is “background traffic”, per se:

No server response EVER *HAS* TO disclose what type of server it is, so at least make sure you note it when one does.

We see HTTP/2, which is somewhat current. This might as well not be using a CSP, because it's worthless.

Note the last header: Alt-Svc. The Alt-Svc (Alternative Services) header:
- is defined in RFC 7838, from the HTTP/2 era, but is mostly used today to advertise HTTP/3
- informs the client that the requested resource is available through alternative services or protocols.

This allows a web server to indicate to the client that it can connect using different network protocols or hostnames, potentially optimizing connection efficiency, reliability, and performance. (Also possibly misdirecting it)

In the example provided, Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000, the header specifies alternative services for accessing the resource. Here's a breakdown of its components:

h3=":443": This indicates that the HTTP/3 protocol is available on port 443. HTTP/3 runs over QUIC, a transport layer network protocol designed to improve performance by reducing connection / transport latency. We will cover QUIC in another article.
ma=2592000: The ma parameter stands for "max-age", and it specifies the time, in seconds, that the client should consider the alternative service as valid. In this case, 2592000 seconds equals 30 days. During this period, the client may use the alternative service for subsequent requests to the same server.
h3-29=":443": This is similar to h3=":443" but specifies a specific draft version of the HTTP/3 protocol (draft 29 in this case) that the client can use to communicate.

The use of the Alt-Svc header allows a server to direct clients to use more efficient or appropriate protocols without requiring any changes to the URI or affecting caching mechanisms. It's part of ongoing efforts to evolve the web's infrastructure to improve speed, efficiency, and security.
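The breakdown above can be reproduced with a few lines of Python. This is a naive sketch, not a full RFC 7838 parser: it splits on commas and semicolons, so it assumes no quoted value contains either character.

```python
from typing import Dict, List


def parse_alt_svc(value: str) -> List[Dict[str, str]]:
    """Naively split an Alt-Svc value into one dict per advertised service."""
    services = []
    for entry in value.split(","):
        parts = [p.strip() for p in entry.split(";") if p.strip()]
        if not parts or "=" not in parts[0]:
            continue  # skips things like "clear" or empty entries
        protocol, _, authority = parts[0].partition("=")
        service = {
            "protocol": protocol.strip(),
            "authority": authority.strip().strip('"'),
        }
        for param in parts[1:]:
            key, _, val = param.partition("=")
            service[key.strip()] = val.strip().strip('"')
        services.append(service)
    return services


header = 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'
for svc in parse_alt_svc(header):
    days = int(svc["ma"]) // 86400
    print(svc["protocol"], svc["authority"], f"valid for {days} days")
```

Having the header as structured data makes it easier to spot when a server advertises an unexpected port or hostname.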

Keep in mind, some headers are not sent by the client, and some are not sent by the server. That doesn't mean we can't try ; )

Given the premise that clients do not send Alt-Svc headers, let's try some fuzzing techniques that could be applied to server response headers like Alt-Svc in a testing scenario. Remember, clients never send this:

Hypothetical Fuzzed Alt-Svc Value(s)

Extended Max-Age: Alt-Svc: h3=":443"; ma=999999999999
Invalid Protocol: Alt-Svc: h99=":443"; ma=2592000
Non-numeric Port: Alt-Svc: h3=":xyz"; ma=2592000
Negative Max-Age: Alt-Svc: h3=":443"; ma=-2592000
Multiple Services with Conflicting Information: Alt-Svc: h3=":443"; ma=2592000, h3=":443"; ma=10
Missing Max-Age: Alt-Svc: h3=":443"
Oversized Value Length: Alt-Svc: h3=":443"; ma=2592000; param=" followed by a very long string exceeding typical header size limits.
Special Characters in Protocol Name: Alt-Svc: h3$=":443"; ma=2592000
Whitespace Manipulation: Alt-Svc: h3 = ":443" ; ma = 2592000 (unconventional spacing around equals and semicolons)
Control Characters: Embedding control characters or non-printable characters within the header value.
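The list above can be turned into a small generator so the values are easy to feed into whatever tool you use. This is only a sketch: the labels, the 8192 character length, and the specific control characters are my assumptions. It only builds strings and sends nothing. Note that many HTTP client libraries refuse control characters in header values, so those cases may need a raw socket.

```python
from typing import List, Tuple


def alt_svc_fuzz_cases(long_len: int = 8192) -> List[Tuple[str, str]]:
    """Return (label, header value) pairs mirroring the list above."""
    base = 'h3=":443"; ma=2592000'
    return [
        ("extended max-age", 'h3=":443"; ma=999999999999'),
        ("invalid protocol", 'h99=":443"; ma=2592000'),
        ("non-numeric port", 'h3=":xyz"; ma=2592000'),
        ("negative max-age", 'h3=":443"; ma=-2592000'),
        ("conflicting services", 'h3=":443"; ma=2592000, h3=":443"; ma=10'),
        ("missing max-age", 'h3=":443"'),
        ("oversized value", base + '; param="' + "A" * long_len + '"'),
        ("special chars in protocol", 'h3$=":443"; ma=2592000'),
        ("whitespace manipulation", 'h3 = ":443" ; ma = 2592000'),
        ("control characters", base + "\x00\x07"),
    ]


for label, value in alt_svc_fuzz_cases(long_len=16):
    print(f"{label}: Alt-Svc: {value!r}")
```

Keeping a label next to each value makes it much easier to match a weird response back to the case that caused it.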

— — — — —

That's enough for now. To be continued.

substance


Written by DJ SUBSTANCE