- Mar 27, 2022
N3XT-G3N Web App Firewall (NGWAF)
The Motivation | What is the N3XT ST3P?
With the explosive growth of web applications since the early 2000s, web-based attacks have become progressively more rampant. One common solution is the Web Application Firewall (WAF). However, tweaking the rules of current WAFs to improve their detection mechanisms can be complex and difficult. NGWAF seeks to address these drawbacks with a novel machine learning and quarantine-to-honeypot based architecture.
Inspired by actual pain points from operating WAFs, NGWAF intends to simplify and reimagine WAF operations through the following processes:
|Pain point|NGWAF Feature|
|---|---|
|Maintenance of detection mechanisms and rules can be complex|Leverage machine learning to automate the process of creating and updating detection mechanisms|
|Immediate blocking of malicious traffic reduces chances of learning from threat actor behavior for future WAF improvements|Threat elimination through redirected quarantine as opposed to conventional dropping and blocking of malicious traffic|
The deployment has been tested on macOS (Docker Desktop) and Linux (Ubuntu).
Check out our
NGWAF is created by , , and
Special shoutout to for her contributions to the initial stages of NGWAF.
How does NGWAF work?
NGWAF runs out of the box with three key components. These components are all containerised and can be scaled according to desired usage. The protected resource can be customised by making a deployment change within the setup.
High level architecture of NGWAF with expected traffic flows from different parties
Key Benefits
NGWAF was engineered with the following key user benefits in mind:
1. Rule Complexity Reduction
NGWAF replaces traditional rulesets with deep learning models to reduce the complexity of managing and updating rules. Instead of requiring rules to be edited manually, NGWAF’s machine learning automates the process of learning patterns from malicious data. Data collected from the quarantine environment is automatically scrubbed and batched, so that it can be used to retrain our detection model if desired.
2. Cyber Deception
NGWAF adopts a novel architecture consisting of an interactive quarantine environment built to isolate potentially hostile attackers. Unlike conventional WAFs, which block upon detection, NGWAF diverts threat actors to emulated systems, trapping them to soften the impact of their malicious actions. The environment also acts as a sinkhole to gather current attack methods, enabling the observation and collection of malicious data. This data can be used to further improve NGWAF’s detection capability.
NGWAF in action: Upon detection of SQL injection, NGWAF redirects to our quarantine environment, instead of dropping or blocking the attempt.
3. Compliance with Internationally Recognised Standards
The guiding principle behind the creation of NGWAF is to guard against the risks highlighted in the Open Web Application Security Project’s standard awareness document.
Training data and compliance checks for NGWAF are collected and conducted based on this requirement.
The Components of NGWAF
1. The Brains - Machine-Learning based WAF | Who needs manual when we can go NEURAL
Instead of relying on traditional rulesets, which require analysts to manually identify and add rules over time, NGWAF uses end-to-end machine learning pipelines for its detection mechanism. This greatly reduces the complexity of WAF rule management, especially for detecting complex payloads.
Base Model
To do so, we first needed to create a base model and architecture that users can start with, before later using data collected from their own applications for retraining and fine-tuning:
- We collected malicious and non-malicious payloads from various application logs (total of ~40k observations)
- Instead of manually identifying rules, we leverage machine and deep learning to automate the process of learning patterns from previous malicious data.
- We then experimented with several model architectures, and our final model utilized a sequential neural network to predict whether an incoming payload was malicious or not.
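A neural network cannot consume raw payload strings directly, so some encoding step has to come first. The following is a minimal sketch of one common option, character-level encoding with padding to a fixed length. It is not NGWAF's actual preprocessing: `MAX_LEN`, `build_vocab`, `encode`, and the sample payloads are all hypothetical.

```python
# Hypothetical preprocessing sketch; names and sizes are illustrative assumptions.
MAX_LEN = 64   # assumed fixed input length for the model
PAD_ID = 0     # id reserved for padding
UNK_ID = 1     # id reserved for characters not seen during training


def build_vocab(payloads):
    """Map every character seen in the training payloads to an integer id."""
    chars = sorted({ch for p in payloads for ch in p})
    return {ch: i + 2 for i, ch in enumerate(chars)}  # ids 0 and 1 are reserved


def encode(payload, vocab, max_len=MAX_LEN):
    """Turn a payload into a fixed-length list of ids (truncate, then right-pad)."""
    ids = [vocab.get(ch, UNK_ID) for ch in payload[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


train_payloads = ["id=1", "id=1' OR '1'='1"]
vocab = build_vocab(train_payloads)
encoded = encode("id=2", vocab)  # "2" was never seen, so it maps to UNK_ID
```

Fixed-length integer sequences like `encoded` are the usual input to an embedding layer at the front of a sequential model.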
Performance
Our model achieved 99.6% accuracy on our training dataset.
Maintenance & Retraining
Although we have included logs from various applications to improve the generalizability of the base model, further maintenance and retraining of the model will be important to:
- Tune the model for better performance on traffic from the user’s specific application
- Reduce model degradation over time, as threat actors discover new methods and opportunities
2. The Looking Glass - Scalable Interactive Quarantine Environment | Don’t let them go, DETAIN THEM!
Unlike traditional WAFs, where malicious traffic is blocked or dropped right away, NGWAF takes a more flexible approach: it redirects and detains malicious actors within a quarantine environment. This environment consists of various interactive emulated honeypots that try to gather more attack methods and data, which can then be used to improve NGWAF’s detection rate for more modern and complex attacks.
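The redirect-instead-of-block idea can be sketched as a simple routing decision. This is an illustration under stated assumptions rather than NGWAF's actual code: `dest_server` and `honey_pot_server` are the variable names used in NGWAF's configuration, while `choose_upstream`, the score, and the `threshold` value are hypothetical.

```python
dest_server = "dvwa"              # protected application
honey_pot_server = "drupot:5000"  # quarantine honeypot


def choose_upstream(malicious_score, threshold=0.5):
    """Pick the upstream host for a request from a classifier score in [0, 1].

    Traffic judged malicious is sent to the honeypot rather than dropped,
    so the attacker keeps interacting and the payloads can be collected.
    """
    if malicious_score >= threshold:
        return honey_pot_server
    return dest_server


benign_target = choose_upstream(0.03)
attack_target = choose_upstream(0.97)
```

Because both branches return a live upstream, the attacker receives a normal-looking response either way, which is what makes the quarantine hard to notice.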
Capturing of Malicious Data and Auto-Scrubbing for Retraining Purposes
Currently, NGWAF’s quarantine environment forwards all data submitted by the trapped attacker to our ELK stack for analysis and visualisation. The data is auto-scrubbed into the different components of the HTTP request, then packaged in JSON format on the environment’s backend before forwarding. This lowers the manpower cost of cleaning and indexing the data when we kickstart the retraining process.
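As a rough illustration of what scrubbing a request into components and packaging it as JSON could look like, here is a minimal sketch using only the Python standard library. The function name `scrub_request`, the field names, and the sample request are assumptions for illustration; NGWAF's backend may structure its records differently.

```python
import json
from urllib.parse import parse_qs, urlsplit


def scrub_request(raw):
    """Split a raw HTTP/1.1 request string into its components."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    url = urlsplit(target)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "path": url.path,
        "query": parse_qs(url.query),  # decoded query parameters
        "version": version,
        "headers": headers,
        "body": body,
    }


raw = (
    "POST /login?next=%2Fadmin HTTP/1.1\r\n"
    "Host: example.test\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "user=admin&pass=' OR '1'='1"
)
record = scrub_request(raw)
document = json.dumps(record)  # JSON string ready to be forwarded for indexing
```

Keeping each part of the request in its own field means the payload-bearing pieces (query, headers, body) can later be selected individually when assembling a retraining set.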
Creating your customised quarantine environment
NGWAF currently allows users to change the look and feel of the front end of the honeypots within the quarantine environment (which is based on a customised version of drupot). Users simply have to replace the assets folder within the Docker volume with their front-end assets of choice.
NGWAF also accommodates users who would like to link their own honeypots into the quarantine environment. Users just have to forward the honeypot’s HTTP requests to the environment’s backend server; backend processes will automatically scrub the data and forward it to the analysis dashboard (the ELK stack).
3. The Library - Retraining Sequence to Reinforce the Brains | Smart isn’t really smart till you can keep learning.
As new payloads and attack vectors emerge, detection capabilities must be upgraded to stay secure. Hence, a retraining function is built into NGWAF so that defenders can train the machine learning model to detect those newer payloads.
Retraining on new datasets is one of the main features of NGWAF. On our dashboard, users can upload a new dataset for retraining to strengthen and improve NGWAF’s detection of malicious payloads.
This can be achieved in the following steps:
- Create a new dataset (.csv) for upload in the following format (empty column, training data, label). You can refer to patch_sqli.csv as an example.
- Navigate to to view NGWAF admin panel.
- Select the “Import Dataset” tab and upload the training set you have created
- Confirm that the training set has been uploaded successfully under the “Manage Datasets” tab.
- Under “Manage Model” tab, select the dataset(s) you want to retrain the model on and click on the “UPDATE WAF MODEL” button.
- Congrats! The model should finish re-training after some time.
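The dataset from the first step could be produced with a short script. The sketch below follows the three-column layout described above (empty column, training data, label); the label encoding (1 for malicious, 0 for benign), the absence of a header row, and the sample payloads are assumptions, so check patch_sqli.csv for the exact conventions before uploading.

```python
import csv
import io

# (payload, label) pairs; 1 = malicious, 0 = benign is an assumed encoding.
rows = [
    ("id=1' OR '1'='1", 1),
    ("id=1 UNION SELECT username, password FROM users", 1),
    ("id=42", 0),
]


def to_retraining_csv(rows):
    """Render rows as CSV text: empty first column, then payload, then label."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for payload, label in rows:
        writer.writerow(["", payload, label])
    return buffer.getvalue()


csv_text = to_retraining_csv(rows)
# To save for upload: open("my_patch.csv", "w", newline="").write(csv_text)
```

Using the `csv` module rather than joining strings by hand matters here, because attack payloads frequently contain commas and quotes that must be escaped.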
4. Additional Features
NGWAF uses the ELK stack to capture logs of the network data passing through it, allowing users to monitor that traffic for further analysis.
NGWAF also comes with live Telegram notifications to inform owners about malicious threats as they are detected.
Sample Usage Scenarios
- Newly normal application (Use the inbuilt web cloner / create another duplicate deployment to use as isolation environment)
- Integrate into existing honeypot/honeynet (Update the configuration to point to honeypot/honeynet)
Setting up NGWAF | Requirements, installation, and usage
Requirements
Tested Operating Systems
- macOS (Docker Desktop)
- tensorflow (tentative)
WAF Admin Panel Component
- Create React App
- Elastic Search Stack Components (Elasticsearch, Logstash, Kibana, Filebeats)
Installation and Usage
With Docker running, run the following file using the command below:
To replace the targets, point the dest_server and honey_pot_server variables to the correct targets in the /waf/WafApp/waf.py file:
    # Replace me
    dest_server = "dvwa"
    honey_pot_server = "drupot:5000"
Once the Docker containers are up, you can visit localhost, where the following ports expose these services:
|Port|Service|Remarks|Credentials (If applicable)|
|---|---|---|---|
|8080|DVWA|Where the WAF resides|admin:password|
|5601|Elasticsearch|To view logs|elastic:changeme|
|8088|Admin Dashboard|Dashboard to manage the WAF model||
    token = ''
    CHAT_ID = ''
    WAF_NAME = 'Tester_WAF'
    WARN_MSG = "ALERT [Security Incident] Malicious activity detected on " + WAF_NAME + ". Please alert relevant teams and check through incident artifacts."
    URL = " ".format(token, CHAT_ID, WARN_MSG)
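For readers wiring up their own alerts, here is a minimal sketch of building a Telegram Bot API `sendMessage` request URL from the same three values (bot token, chat ID, and warning message). That NGWAF calls this exact endpoint is an assumption; `build_alert_url` and the placeholder credentials are hypothetical.

```python
from urllib.parse import urlencode


def build_alert_url(token, chat_id, text):
    """Build a Telegram Bot API sendMessage URL with a URL-encoded query string."""
    query = urlencode({"chat_id": chat_id, "text": text})
    return "https://api.telegram.org/bot{}/sendMessage?{}".format(token, query)


WAF_NAME = "Tester_WAF"
WARN_MSG = "ALERT [Security Incident] Malicious activity detected on " + WAF_NAME + "."

# Placeholder credentials; real values come from your own bot and chat.
url = build_alert_url("123456:PLACEHOLDER", "987654321", WARN_MSG)
```

Encoding the message with `urlencode` keeps spaces and brackets in the alert text from producing an invalid URL. Sending the request, for example with `urllib.request.urlopen(url)`, is left out so the sketch has no network side effects.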
Disclaimers & Other Considerations
NGWAF is a work-in-progress, open-source project; functions and features may change from patch to patch. If you are interested in contributing, please feel free to create an issue or pull request!