Verified list of 1000 random websites with sensitive keywords found in source, matched on "API|Secret|JWT|token|secret|AWS". Goal: iterate the list (with curl?) and identify leaked info. Why? Sharpen your skills and learn how *not* to code pages ;) (10/23) (99% verified up):

9 min read · Oct 16, 2023
dj substance, bringing you a 0day list of mostly non-US/CA websites containing potential API keys in code or comments.
Know your enemy! - RAGE
Verified dump of the list of hosts, certified to have potential sensitive
info in the client-side source:

[List on page *do not wget* ]
[wget link]

Example of usage (do this as a non-root user - no need for root):
mkdir secrets
mv urlz[tab] urls.txt

- Let's verify that we have data in the list:
bash$ head ./urls.txt # Remember? we renamed it

[ ************ top entries of file. You will notice a lot of prominent sites ** ]
[ *************** END OF HEAD STDOUT - just verify the entire list has no stray
[ line feeds OR dynamic urls ]

bash$ which curl
/usr/bin/curl # Verify curl is ready
bash$ which xargs

You might want to apt-get install html2text; it's very useful.

Try something like this if the list of 1000 URLs is in urlz.txt:
cat urlz.txt | xargs -I{} sh -c "echo {};echo; curl -skL --connect-timeout 4 \
--user-agent 'mozilla' '{}'"
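If you'd rather keep the responses around to grep offline, a minimal variation of that loop saves each page into a secrets/ directory. This is just a sketch of how I'd do it; the filename scheme (anything odd flattened to underscores) is my own convention, not anything standard:

```shell
#!/bin/sh
# Sketch: fetch every URL in urls.txt and save the raw source to ./secrets/,
# one file per URL, so the whole dump can be grepped offline later.
mkdir -p secrets
while IFS= read -r url; do
  [ -n "$url" ] || continue                      # skip blank lines
  # flatten the URL into a safe filename (every non [A-Za-z0-9._-] char -> '_')
  out="secrets/$(printf '%s' "$url" | tr -c 'A-Za-z0-9._-' '_')"
  curl -skL --connect-timeout 4 --user-agent 'mozilla' "$url" > "$out"
done < urls.txt
```

Then a single `grep -ri <keyword> secrets/` hits the entire haul at once.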

- At this point, verify you can see the source being dumped. I'm not going to
explain how to grep and pattern match. Good luck.

You are looking at a list of 1000 websites with a high likelihood
of info exposure.

These are the things I encounter most of the time:
potential API keys in code or comments -
The comments are especially *easy to miss*.
I have almost missed things in client-side source comments that would have
made the difference in me completing the pen test.
With that being said, make a list of keywords to grep these sites for, but you have to be good with:

grep -v (display anything NOT matching) <keyword>

You can use grep on either the input side or piped through to the output (there are various reasons for doing both). For example:

cat urlz.txt | grep -i 'secret\|api' # Note: " vs ' in bash behave much differently
The above will list out (initially to stdout) the URLs in urlz.txt and check each "line" (which, if the site is minified and we curl it, comes back as thousands of characters on one line), making it very hard to spot anything good.
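Note that the grep above only matches against the URL strings themselves. To check the *fetched source* for keywords, the grep goes on the output side of curl instead. A sketch of how I'd wire that up; the keyword pattern here is my own starting point, extend it:

```shell
#!/bin/sh
# Sketch: fetch each URL and grep the returned page source for hot keywords.
# -i case-insensitive, -E extended regex, -o print only the match itself,
# so a minified one-liner can't flood the terminal with the whole line.
PATTERN='api[_-]?key|secret|jwt|bearer|token|aws'
while IFS= read -r url; do
  [ -n "$url" ] || continue
  hits=$(curl -skL --connect-timeout 4 "$url" | grep -ioE "$PATTERN" | sort | uniq -c)
  # only print hosts that actually matched something
  [ -n "$hits" ] && printf '%s\n%s\n\n' "$url" "$hits"
done < urls.txt
```

The `sort | uniq -c` gives you a quick hit count per keyword per host, which is handy for triaging 1000 results.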

It's worth assuming the code you're viewing is generated by a CMS (WordPress, etc.).
This unfortunately complicates things when trying to find keywords in
the source, due to the fact that most CMSes (Joomla and Drupal for sure)
are just JS files that dynamically generate the page for your
client session DOM. In English, what this has meant in my dealings
was heavy use of minification,
*sometimes making a 2 million character script all on one line*.
The main issue here is you need to do some regex foo to get just
the info you want; typical greps likely aren't going to work.
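One bit of regex foo that has worked for me on those one-line monsters: have grep -o print only a small window of context around each keyword instead of the whole line. The 40-character window below is arbitrary (tune it), and bundle.min.js is just a stand-in name for whatever script you saved:

```shell
# Sketch: extract ~40 chars of context around each keyword from a minified
# one-line script, instead of echoing the entire multi-megabyte line.
grep -ioE '.{0,40}(api[_-]?key|secret|token).{0,40}' bundle.min.js
```

That won't parse the DOM the way XPath would, but it's usually enough to eyeball whether a hit is real.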

The *only* way I know how to deal with minified 10-million-character one-liner scripts is using XPath or XSLT (google them) - or another nifty method would be

using a PHP library called "simple_html_dom.php" (google it)

Without further ado, here is the list:

Don't do anything illegal with this information, please -

*note: this is just how I would go about scraping these potentially exposed hosts*

cat urls.txt | xargs -I{} sh -c "echo {}"

Note what these developers (or whoever wrote the source you're looking at) are doing.
Examine the structure and how the DOM is constructed as you start
working through the list. Take it or leave it, this is my article and I
can rant about what I want ;p

Q: Why did I bother posting this BS on Medium?
A: You could likely take over an extremely large % of these hosts.
Q: What kind of info should I curl for, specifically?
A: Pick an interesting-sounding name out of this list and incognito-visit the site.
Once it's loaded, inspect element, then (in most browsers) hit CTRL+T
(new tab) - hit the page:

** Def bookmark for future use - get the remote source of any page
** It is acting as a proxy, so you get another perspective on the source.

Remote Source Code Viewer: [free]

Next tasks, to get the most out of this exercise (instead of just writing a script
that does a foreach and greps for the keywords):
Once it loads, paste your link in - I usually pick stylize and hit go.
Finally - it may seem pointless, but hit CTRL+T (new tab) and
view the source on your local machine. Sometimes I have noticed (in Brave and Chrome)
I cannot find the selection to "View Source".
If you only see Inspect Element, try right-clicking outside any kind
of <canvas>, <svg>, or <video> element; these prevent viewing of the source.

Regardless of your role, this is simply for training
in red team exercises. If you are currently working at an org in an IT
security position, you absolutely need to check whether your sites

contain keyword matches on: API|Secret|JWT|Bearer|token|password|secret

Examine all public-facing webservers
(this should take you like 4 min by now to automate) with a quick
bash script:
> curl -sLk --connect-timeout 3 '' | grep -i disall
- I have found it *essential* to use '--connect-timeout 3' in all my
automated headless browsing like this; you will definitely run into
non-responsive hosts.
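For the defensive side, the same loop works pointed at your own infrastructure. A sketch, assuming a hosts.txt of your own public hostnames (one per line) - that filename and the plain-https fetch are my assumptions; the keyword set mirrors the list above:

```shell
#!/bin/sh
# Blue-team sketch: scan YOUR OWN public hosts for the same keyword leaks
# before someone else finds them. hosts.txt: one hostname per line (assumed).
KEYWORDS='API|Secret|JWT|Bearer|token|password'
while IFS= read -r host; do
  [ -n "$host" ] || continue
  # -q: we only care whether anything matched, not what
  if curl -sLk --connect-timeout 3 "https://$host/" | grep -qiE "$KEYWORDS"; then
    echo "REVIEW: $host"
  fi
done < hosts.txt
```

Every "REVIEW:" line is a page a human should eyeball; most will be false positives (the word "token" is everywhere), but the real hits pay for the noise.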


_ _
| | | |
_ _ _| |__ _ _ __| | ___ _ _ _ _____
| | | | _ \| | | | / _ |/ _ \ | | | | ___ |
| | | | | | | |_| | ( (_| | |_| | | | | | ____|
\___/|_| |_|\__ | \____|\___/ \___/|_____)
/ _ \
____ _____ ____ ____(_( ) )
/ ___|____ |/ ___) ___ | (_/
( (___/ ___ | | | ____| _
\____)_____|_| |_____)(_)

---- You may wonder how this can help you reduce info exposure about your
company. Think about it this way - index engines/spiders/crawlers/whatever
you call them do this (most of the time) in order to:
- Identify the content of a site and its category, for listing in search
- Cache it for viewing later on (like
Check out - that is a search engine for insecure
S3 cloud buckets (all types) - the bottom line is: check out what others
are doing *WRONG* to do the best thing for your org. Stay
ahead of your adversary by studying their activities stealthily, but make
sure this is done over time. Let's get to the list:

one nation :: underground




Twenty years professionally as a Network Engineer; more recently I have focused mostly on red teaming, but I am always up for learning and exchanging info.