4.9 HTTP Proxy for local system consumption

Introduction

For systems that rely on HTTP(S)-based networking, especially when local zones with constrained connectivity are involved, it can help to have a proxy server handle the interaction with remote services such as IPS package repositories, git hosts, downloads of remote source code tarballs, IDE or Jenkins plugins, or other general browsing. My personal example also involves a laptop installation which roams from one connection type (and quality) to another, so it would be cumbersome to rewrite proxy settings everywhere upon each reconnection. As for connection quality, a caching web proxy also helps when downloads break mid-flight and have to be restarted (e.g. the NetBeans IDE failed to update over a bad cellular connection, because it wanted the bundle of all dependent plugins to be available at once).

The solution proposed in this article is to deploy a simple SQUID proxy server, allow it to cache large files as well (as needed for source tarballs, packages and plugins), set up optional links to corporate or ISP proxies that may be available in the different LANs I connect to, and finally configure all the clients running on or alongside this OpenIndiana instance (e.g. shell, zones, other VMs on the laptop) to use this proxy.

Walkthrough

Set up SQUID

Install the package:

:; pkg install squid

Optionally, dedicate ZFS filesystems to the proxy's data and cache; the cache can live on a separate pool, e.g.:

:; zfs create -o compression=gzip-9 -o mountpoint=/var/squid rpool/SHARED/var/squid
:; zfs create -o compression=lz4 -o mountpoint=/var/squid/cache -o com.sun:auto-snapshot=false pool/var-squid-cache
:; mkdir /var/squid/logs
:; mkdir /var/squid/run
:; chown -R webservd:webservd /var/squid/*

Set up additional options in /etc/squid/squid.conf:

# Extend the list of remote HTTP and HTTPS ports that we allow connections to:
acl SSL_ports port 8443
acl SSL_ports port 9443
acl Safe_ports port 8080 # http
 
# In my setup, the "hierarchy_stoplist" caused some issues and seemed better disabled. YMMV.
#OBSOLETE#hierarchy_stoplist cgi-bin ?
 
# Customize the caching options
maximum_object_size 256 MB
cache_dir aufs /var/squid/cache 10240 16 256
 
# Some targets are better left un-proxied:
### VirtualBox host-only network for other VMs on this machine:
acl INSIDE_IP dst 192.168.56.0/24
### Etherstub for constrained local zones on this OI host:
acl INSIDE_IP dst 192.168.127.0/24
acl INSIDE_IP dst 127.0.0.1
acl INSIDE_IP dst ::1
### You can specify your machine's hostname or organization domain here too:
acl INSIDE_DOM dstdomain localhost myOIhostname
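# ACLs listed on one rule line are ANDed, so use one rule per ACL to match either:
always_direct allow INSIDE_IP
always_direct allow INSIDE_DOM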

# In some cases, we may want to force all data to go through "parents"
# (upstream HTTP proxies, if configured); in that case uncomment this line:
#USUALLY_OFF# never_direct allow all
  • If your physical external network provides proxy servers you might want to use, you can specify them as required "parents" or as optional "sibling" proxies. Such peers can be queried with ordinary HTTP requests, like any other client would send, or be specially set up with the ICP protocol to exchange metadata about objects they have already cached and then pass the cached payloads around, so they conserve upstream traffic and use the fast LAN interconnections. My interest with a roaming laptop was to have any connectivity at all from the corporate LAN (which requires their proxy to be used), as well as to share data with the proxy at home, and I had better luck with a "sibling" setup in both cases.

    ### CUSTOM CONFIG to use remote HTTP proxy servers
    # http://wiki.squid-cache.org/Features/CacheHierarchy
    # http://wiki.squid-cache.org/SquidFaq/CompleteFaq#SquidFaq.2FConfiguringSquid.How_do_I_configure_Squid_to_work_behind_a_firewall.3F
    # http://www.squid-cache.org/Doc/config/cache_peer/
    # http://www.christianschenk.org/blog/using-a-parent-proxy-with-squid/
    # http://serverfault.com/questions/42728/two-squid-chaining
     
    cache_peer proxy.work.com sibling 8080 0 no-query no-digest no-netdb-exchange login=corpuser:n0Trust%21 connect-fail-limit=5 allow-miss default
    cache_peer cache.work.com sibling 8080 0 no-query no-digest no-netdb-exchange login=corpuser:n0Trust%21 connect-fail-limit=5 allow-miss
    cache_peer proxy.home.net sibling 3128 3130 icp background-ping no-digest connect-fail-limit=5 allow-miss
    
    # Allow going direct when no upstream proxies are "up"
    prefer_direct on
    nonhierarchical_direct on

    Note that when roaming, you often need to run svcadm restart squid after the IP addressing changes, so that squid picks up the new IP and DNS settings and tries to reconnect to its parent or sibling proxies.
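
The restart after roaming can be made conditional. The following is only a minimal sketch, assuming that a change in the resolver configuration is a good enough sign that the network has changed; the function name and the RESOLV, STATE and RESTART_CMD variables are hypothetical names invented for this illustration, not part of squid or SMF.

```shell
#!/bin/sh
# Hypothetical helper: restart squid only when the resolver config changed.
# RESOLV, STATE and RESTART_CMD are illustrative and can be overridden.
RESOLV="${RESOLV:-/etc/resolv.conf}"
STATE="${STATE:-/var/squid/run/resolv.cksum}"
RESTART_CMD="${RESTART_CMD:-svcadm restart squid}"

restart_squid_if_dns_changed() {
    # Checksum of the current resolver configuration
    new_sum=$(cksum < "$RESOLV") || return 1
    # Checksum remembered from the previous run (empty on the first run)
    old_sum=$(cat "$STATE" 2>/dev/null || true)
    if [ "$new_sum" != "$old_sum" ]; then
        printf '%s\n' "$new_sum" > "$STATE"
        # Intentionally unquoted so the command and its arguments are split
        $RESTART_CMD
    fi
}
```

You could call restart_squid_if_dns_changed by hand after reconnecting, or from whatever hook your network setup runs at that point.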

Finally, enable the service:

:; svcadm enable squid ; svcadm clear squid ; svcadm restart squid

Set up clients

Shell: I defined a file that can be included from the shell profiles of the accounts and VMs that should use this proxy. Note that all of them must be able to resolve the commonly provided name of the OI host to its corresponding IP address (on the LAN, host-only VM network, etherstubs, etc. as applicable); an entry in /etc/hosts or its equivalent on each of those clients is a good way to solve this:

# Include-file
http_proxy='http://myOIhostname:3128/'
https_proxy="$http_proxy"
ftp_proxy="$http_proxy"
ftps_proxy="$http_proxy"
export http_proxy https_proxy ftp_proxy ftps_proxy
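
When the proxy is unavailable, it is handy to drop these variables quickly and restore them later. The variant below is a sketch of one way to do that, not something the tools require: the names PROXY_URL, proxy_on and proxy_off are my own invention, and the no_proxy list (honored by tools such as curl and wget) is an assumption to adapt to your networks.

```shell
# Sketch of an include-file variant with on/off switches.
# PROXY_URL, proxy_on and proxy_off are illustrative names, not part of any tool.
PROXY_URL='http://myOIhostname:3128/'

proxy_on() {
    http_proxy="$PROXY_URL"
    https_proxy="$http_proxy"
    ftp_proxy="$http_proxy"
    ftps_proxy="$http_proxy"
    # Assumed list of targets to reach directly; adapt to your networks
    no_proxy='localhost,127.0.0.1,myOIhostname'
    export http_proxy https_proxy ftp_proxy ftps_proxy no_proxy
}

proxy_off() {
    unset http_proxy https_proxy ftp_proxy ftps_proxy no_proxy
}

proxy_on    # default: use the local squid
```

Running proxy_off in an interactive shell then bypasses squid for that session only, and proxy_on restores the settings.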

For IPS packaging, the options depend on the IPS version on your system. On current OpenIndiana Hipster it is possible to set the proxy per publisher, which can be useful for local development (e.g. I do not need it for repositories I host locally, especially when using file:/// URIs). Alternatives include setting up a proxy for all of IPS, and using http_proxy and friends from the environment; see https://docs.oracle.com/cd/E26502_01/html/E29024/glqjr.html for details. The environment variable approach is arguably the most flexible option when you want to quickly bypass the proxy because squid has gone south, or to use some alternate proxy server. A proxy pre-set for a publisher (or for the whole of IPS) makes the setting independent of shell profile snippets and of whatever environment variables your scripts or cron jobs might inherit or set; use it if you specifically want only one type of proxying, as a setting that is not easily changed or corrupted.
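
As an illustration of the per-publisher variant, the sketch below only prints a candidate command for review instead of running it. It assumes that your pkg(1) supports the --proxy option of set-publisher; check the man page on your release before relying on it. The function name ips_proxy_cmd is hypothetical, and the publisher and origin shown are just example arguments.

```shell
# Hypothetical dry-run helper: prints a per-publisher proxy command for review.
# Assumes "pkg set-publisher --proxy" is available; verify against pkg(1) first.
ips_proxy_cmd() {
    publisher="$1"               # publisher name, e.g. openindiana.org
    origin="$2"                  # origin URI the proxy should be used for
    proxy="${3:-$http_proxy}"    # defaults to http_proxy from the environment
    printf 'pkg set-publisher --proxy %s -g %s %s\n' "$proxy" "$origin" "$publisher"
}

ips_proxy_cmd openindiana.org https://pkg.openindiana.org/hipster http://myOIhostname:3128/
```

If the origin is already configured for that publisher, it may have to be removed (-G) and re-added for the proxy to be recorded; I have not verified this on every IPS version.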

NetBeans, Jenkins, etc.: see their interactive configuration options.

Generic Java applications (e.g. OpenGrok, Tomcat, etc.) might ignore the Unix/Linux-specific http_proxy and friends, and instead use Java system properties passed on the command line (or through other configuration), such as -Dhttp.proxyHost=myOIhostname -Dhttp.proxyPort=3128, with https.proxyHost and https.proxyPort as the HTTPS counterparts.
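
To avoid maintaining the proxy address in two places, the Java options can be derived from the same URL the shell uses. This is a minimal sketch under a few assumptions: the function name proxy_url_to_java_opts is made up for this example, a missing port falls back to 3128, and IPv6 literal addresses are not handled.

```shell
# Hypothetical helper: turn an http_proxy-style URL into Java proxy properties.
proxy_url_to_java_opts() {
    url="${1:-$http_proxy}"
    hostport="${url#*://}"        # strip the scheme
    hostport="${hostport%%/*}"    # strip any path
    hostport="${hostport##*@}"    # strip any user:password@ prefix
    host="${hostport%%:*}"
    port="${hostport##*:}"
    # No colon present: fall back to the squid default port (assumption)
    [ "$port" = "$hostport" ] && port=3128
    printf '%s' "-Dhttp.proxyHost=$host -Dhttp.proxyPort=$port -Dhttps.proxyHost=$host -Dhttps.proxyPort=$port"
}

# Possible usage (app.jar is a placeholder):
#   java $(proxy_url_to_java_opts) -jar app.jar
```

For services whose command line you do not control, the same string can usually be placed into the application's own options variable or, on JVMs that honor it, into JAVA_TOOL_OPTIONS.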
