uWSGI, pywb and Copilot

Photo by John Gruber on Unsplash

We use pywb to keep a complete day-by-day snapshot of our website. It’s a Python take on the famous Wayback Machine, and we run it under uWSGI (a WSGI server; WSGI stands for Web Server Gateway Interface).

Anyway, look at this:

curl -k https://wayback.herts.ac.uk:8443//static/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd

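Well, I’ve truncated the number of %2e%2e’s required to reach the passwd file, but you can see what is happening: %2e is the URL encoding of a dot, so each %2e%2e is a “..” and the request climbs out of the static directory. So it’s vulnerable to path traversal.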
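To make the mechanics concrete, here is a small sketch of what a naive static file handler would do with that URL. This is not pywb’s actual code; the STATIC_ROOT directory and the resolve_naively function are made up purely for illustration.

```python
import posixpath
from urllib.parse import unquote

# Hypothetical static directory, chosen only for this illustration.
STATIC_ROOT = "/usr/local/pywb/static"

def resolve_naively(url_path):
    """Map a /static/ URL to a file path without any traversal check."""
    decoded = unquote(url_path)                  # %2e%2e becomes ..
    relative = decoded.split("/static/", 1)[1]   # part after /static/
    # Joining and normalising lets the ".." segments climb out of STATIC_ROOT.
    return posixpath.normpath(posixpath.join(STATIC_ROOT, relative))

print(resolve_naively("/static/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd"))
print(resolve_naively("/static/css/site.css"))
```

The first call ends up outside the static directory, the second stays inside it. Any handler that decodes and joins like this without checking the result has the same problem.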

My mate Copilot reckoned there was a way to fix this within pywb itself, but in fact it didn’t work. So my mate Copilot came up with a wrapper solution instead:

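#!/usr/local/pywb/venv/bin/python3
# wayback_wrapper.py: WSGI wrapper that rejects path traversal attempts
# before they reach pywb.
from pywb.apps.wayback import application

def block_malicious_requests(env, start_response):
    path = env.get('PATH_INFO', '')
    print(f"Processing path: {path}")  # Debug statement
    # Compare in lower case so %2E%2E is caught as well as %2e%2e.
    lowered = path.lower()
    # Parentheses make the precedence explicit: "and" binds tighter than "or".
    if '..' in lowered or '%2e%2e' in lowered or ('/static/' in lowered and 'passwd' in lowered):
        print("Blocked path traversal attempt")  # Debug statement
        start_response('403 Forbidden', [('Content-Type', 'text/plain')])
        return [b'Access Denied']
    # Everything else is handed to pywb unchanged.
    return application(env, start_response)

# uWSGI loads this callable.
app = block_malicious_requests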

Great, it works, at least as a stopgap until I can confirm whether there is a better way of stopping this.
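For what it’s worth, here is a sketch of a slightly stricter variant of the same idea. It is only my assumption of what a more careful check might look like, not anything from pywb or uWSGI: the names has_traversal, make_guard and demo_app are hypothetical. It decodes the path a few times (to catch double encoding such as %252e%252e) and looks for a “..” path segment rather than a substring. A stub application stands in for pywb so the sketch runs on its own.

```python
from urllib.parse import unquote

def has_traversal(path, max_rounds=3):
    """Return True if any path segment is '..' after percent-decoding."""
    for _ in range(max_rounds):
        decoded = unquote(path)
        if decoded == path:      # nothing left to decode
            break
        path = decoded
    # Treat backslashes as separators too, then look for a bare '..' segment.
    return '..' in path.replace('\\', '/').split('/')

def make_guard(wrapped_app):
    """Wrap any WSGI app so traversal attempts are refused up front."""
    def guard(env, start_response):
        if has_traversal(env.get('PATH_INFO', '')):
            start_response('403 Forbidden', [('Content-Type', 'text/plain')])
            return [b'Access Denied']
        return wrapped_app(env, start_response)
    return guard

def demo_app(env, start_response):
    """Stub standing in for pywb's application (assumption for the demo)."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']

guarded = make_guard(demo_app)

if __name__ == '__main__':
    def show(status, headers):
        print(status)
    guarded({'PATH_INFO': '/static/%2e%2e/%2e%2e/etc/passwd'}, show)
    guarded({'PATH_INFO': '/static/css/site.css'}, show)
```

In the real wrapper you would pass pywb’s application to make_guard instead of demo_app. Matching on whole segments means a harmless name like a..b is not blocked, which the plain substring test would reject.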

One last thing: the uwsgi.ini file needs a line like this (we called the wrapper program wayback_wrapper.py):

#wsgi = pywb.apps.wayback
wsgi = wayback_wrapper:app

So, another victory for Copilot. It was all done in record time. I was robbed of some personal satisfaction, but it was certainly more productive.

Epilogue

The maintainers at Webrecorder promptly addressed the issue: https://github.com/webrecorder/pywb/issues/931 so all that is needed now is to update pywb.