Web application scanning with Htcap

Htcap is a free web application scanner that can crawl single page applications in a recursive manner by intercepting Ajax calls and DOM changes.

The app is focused mainly on the crawling process and uses external tools to discover vulnerabilities. It’s designed to be a tool for both the manual and automated penetration testing of modern web applications.


Htcap scan process and modes

The scan process is divided into two parts. Htcap crawls the target and collects as many requests as possible and saves them to a SQLite database. When the database is populated, you can explore it with tools such as SQLite3 or DBEaver, or export the results using built-in scripts.

“Htcap has been designed to focus on the crawling process, leaving the fuzzing phase to ready-available scanners and security tools, such as Arachni and sqlmap. Doing so, Htcap can obtain a greater coverage in the discovery process and also take advantage of a wide range of well-written and reliable fuzzers without reinventing the wheel,” Htcap lead developer Filippo Cavallarin, CEO at Segment Srl, told Help Net Security.

The tool supports three scan modes: passive, active and aggressive. When in passive mode, the app doesn’t interact with the page and only follows links. Active mode triggers all discovered events, while aggressive mode makes Htcap also fill input values and post forms.

“The architecture of the scanning engine is modular and it’s easily extendible by writing ‘scanner modules’. Anyone with a little bit of Python knowledge can write it’s own scanner module to fuzz the requests captured by the crawler. Even the report generated by Htcap is focused on the discovery process. It displays information about pages that make use of Ajax, WebSockets, JSONP along with the ones that contains vulnerabilities,” Cavallarin added.

Development challenges

Unsurprisingly, the main challenge was the development of the algorithm able to perform recursive Ajax crawling.

“The algorithm was written in JavaScript so that the concept of ‘waiting for something’ is limited to the asynchronous nature of the language. For example, if you want to detect when an Ajax call is completed, you have to pool its status with a non-synchronous loop. Since the algorithm is recursive, Htcap must wait both for Ajax calls and child call of the recursive function, in an asynchronous environment,” Cavallarin explained.

Don't miss