|
@@ -1,3 +1,64 @@
|
|
|
-# http-proxy
|
|
|
+# httpirate
|
|
|
+
|
|
|
+
|
|
|
+**What is HTTPirate?**
|
|
|
+------
|
|
|
+HTTPirate is a web proxying framework that supports JavaScript modification, reverse proxying, caching, and dynamic routing.
+It is meant to help counteract DNS black-holing when you try to reach a blocked resource from a protected network,
+or to access a web application privately.
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+**How does it help with either of those use cases?**
|
|
|
+-----
|
|
|
+As it stands, HTTPirate runs in aggressive mode, or 'shotgun' mode. In the configuration file you specify the 'expected' domains that
+the target web resource uses, e.g. CDNs, third-party authentication services, APIs, and so forth. You then supply the shotgun 'slugs'
+in the *routemap*. When you request a web page through your proxy's domain name, the proxy rewrites every instance of whatever is specified in the
+*rewrite* file. This lets the JavaScript be rewritten mid-flight, before it lands on the client, so that when your machine executes the
+JavaScript, it automatically targets your proxy. **It is up to you to specify all the expected domains and have them rewritten with your proxy's
+domain name in the rewrite file**. When the proxy then receives a request, it spins off goroutines that attempt to fetch the requested resource
+from each domain. This is not foolproof and can result in errors. Manual intervention *may* be required for certain routes.
+When the proxy receives a 200 response, it updates its *routecache* with a mapping of that URI path to the responding domain. This gives
+the proxy a relatively quick way to map that URI to the appropriate domain the next time it is requested.
|
|
|
+
|
|
|
+**Adding cookies**
|
|
|
+------
|
|
|
+As an administrator, it is your job to know which cookies your web application requires to operate. Once you have manually identified and extracted the cookies,
+you can add them to the program's configuration via the *httpirate-cmd* command line tool. (More information on this will be added later; it is not done yet.)
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+**Routemap persistence**
|
|
|
+-----
|
|
|
+As the proxy stands right now, when the program receives a SIGINT, it writes the contents of its routemap to the file named by the
+*route_map_path* config option. That file is loaded again when the program starts, so you do not need to rebuild the routemap each time.
|
|
|
+
|
|
|
+
|
|
|
+**Page Modification rewrites**
|
|
|
+-----
|
|
|
+At times, manual intervention is required when a page cannot load a resource correctly. Some resource files, such as a JavaScript file, may need to retrieve
+other JavaScript files from a CDN, and you may need to exclude that route from alteration in the config file.
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+**To-Do**
|
|
|
+----
|
|
|
+
|
|
|
+- Get the program to handle all Unix signals (SIGTERM, SIGHUP, etc.)
|
|
|
+- Remove need for exceptions in page modification rewrites
|
|
|
+- unit tests
|
|
|
+- remote caching server support (routes and resources)
|
|
|
+- configuration file generation
|
|
|
+- Save the program's PID to the filesystem, or create a Unix socket for httpirate-cmd to talk to httpirate through
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
|
|
|
-HTTP Proxy for an upwork client
|