This week-of-unix-tools is intended to be a high concentration of
information with little fluff. I'll be covering only the GNU versions of the
tools, to stay sane by sticking to one implementation.
Data comes from lots of places. Loosely categorized, it comes from three places:
- Files and devices
- Output of other tools
- The network (via other tools)
Cat means 'concatenate'. It is mostly useful for a few things:
- Cat lots of files together, e.g. 'cat *.c', for processing by
another tool, or generally gluing data sets (from files) together.
- Make a shell script more readable by making the input more obvious
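Here is a minimal sketch of both uses. The two files and their contents are made up for illustration and live in a throwaway temporary directory:

```shell
# Scratch directory with two hypothetical input files.
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmpdir/a.txt"
printf 'gamma\n' > "$tmpdir/b.txt"

# Use 1: glue several files together and hand the result to another tool.
line_count=$(cat "$tmpdir/a.txt" "$tmpdir/b.txt" | wc -l | tr -d '[:space:]')
echo "$line_count"

# Use 2: put the input first so the pipeline reads left to right.
last_sorted=$(cat "$tmpdir/a.txt" | sort -r | head -1)
echo "$last_sorted"

rm -r "$tmpdir"
```

The second form does the same work as 'sort -r < a.txt', but the data source is the first thing you read.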
Netcat basically gives you the ability to talk tcp and udp from the
shell. You send data on standard input, and receive data on
standard output. Simple.
- tcp client (connect to google.com port 80)
- nc google.com 80
- tcp server (listen on port 8080)
- nc -l 8080
- udp client (connect to ns1.slashdot.org port 53)
- nc -u ns1.slashdot.org 53
- udp server (listen on port 5353)
- nc -l -u 5353
Examples:
- Basic HTTP request
% printf 'GET / HTTP/1.0\r\n\r\n' | nc google.com 80 | head -1
HTTP/1.0 200 OK
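Whether echo expands "\n" depends on your shell (zsh does, bash does not by default), so printf is the safer way to build a request by hand. HTTP wants each line to end in CRLF and the request to end with an empty line. This small sketch only inspects the bytes and never touches the network:

```shell
# Count the bytes in the request: 14 for the request line plus two CRLF pairs.
request_bytes=$(printf 'GET / HTTP/1.0\r\n\r\n' | wc -c | tr -d '[:space:]')
echo "$request_bytes"

# Count the carriage returns to confirm printf expanded the escapes.
cr_count=$(printf 'GET / HTTP/1.0\r\n\r\n' | tr -cd '\r' | wc -c | tr -d '[:space:]')
echo "$cr_count"
```

To send the request for real, pipe the same printf into nc as in the example above.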
openssl is a command that any unix-like system will probably have
installed. The command itself can do many things, but for this article
I'll only cover the s_client command.
'openssl s_client' is essentially 'netcat + ssl'. This tool is extremely
useful for debugging text-based protocols behind SSL such as ssl'd nntp,
imaps, and https.
Example:
- Open an https connection to addons.mozilla.org
% printf 'GET / HTTP/1.0\r\n\r\n' \
| openssl s_client -quiet -connect addons.mozilla.org:443 \
| col \
| sed -e '/^$/q'
depth=3 /C=BE/O=GlobalSign nv-sa/OU=Root CA/CN=GlobalSign Root CA
verify error:num=19:self signed certificate in certificate chain
verify return:0
HTTP/1.1 302 Found
Date: Fri, 25 May 2007 10:07:25 GMT
Server: Apache/2.0.52 (Red Hat)
Location: http://www.mozilla.com/
Content-Length: 293
Keep-Alive: timeout=300, max=1000
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
* The 'col' command will strip the \r (carriage return) characters
from the http response, allowing sed's /^$/ to match an empty line
(end of headers).
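You can see why the \r stripping matters without talking to a server. This sketch runs the same sed expression over a canned response (the response text is invented for illustration), and uses tr -d '\r' as a stand-in for col, since tr is available on any system:

```shell
# Canned response standing in for real server output (hypothetical content).
response='HTTP/1.1 302 Found\r\nLocation: http://www.mozilla.com/\r\n\r\n<html>body</html>\n'

# Drop the carriage returns, then let sed print up to the first empty
# line and quit. Only the headers survive.
headers=$(printf '%b' "$response" | tr -d '\r' | sed -e '/^$/q')
echo "$headers"

# Without stripping, the "empty" line is really a lone \r, so /^$/ never
# matches and sed prints everything, body included.
unstripped_lines=$(printf '%b' "$response" | sed -e '/^$/q' | wc -l | tr -d '[:space:]')
echo "$unstripped_lines"
```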
You can query web servers (http) with any number of tools, and you'll get
the raw source or data for any page you query, ready to pipe into other tools.
- GET, POST, lwp-request, et al. These come with libwww-perl
- curl
- wget
- fetch (FreeBSD)
Most of the time, when I need to fetch pages to stdout, I use GET, because
it's less typing. Here are some examples of the above commands:
- Fetch / from www.w3schools.com and output page to stdout
- GET http://www.w3schools.com/
- wget -O - -q http://www.w3schools.com/
- fetch -q -o - http://www.w3schools.com/
- curl http://www.w3schools.com/
But what if you don't want the raw html from a webpage? You can have
w3m and lynx do some basic rendering for you, also to stdout. I
recommend w3m instead of lynx, but use whatever.
- w3m -dump http://www.google.com/
- lynx -dump http://www.google.com/
w3m's output looks like
this.
ssh can be a data source too. Run a command on 1000 machines and process
the output locally, for fun and profit.
- Log in to N systems and get uptime. Prefix output with the hostname
% printf 'fury\ntempest\n' \
| xargs -I@ sh -c 'ssh @ "uptime" | sed -e "s/^/@/"'
fury 6:18am up 2:25, 1 user, load average: 0.06, 0.04, 0.04
tempest 06:18:00 up 9:01, 2 users, load average: 0.12, 0.09, 0.09
Combining xargs and ssh lets you execute commands on multiple machines
easily, even in parallel (see the -P flag to xargs).
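Here is a rough sketch of the parallel version. The host names are placeholders, and 'echo up' stands in for 'ssh @ uptime' so the sketch runs without real machines. Swap the ssh call back in for actual use:

```shell
# Hypothetical host list; replace with real hostnames.
hosts='fury
tempest
gale'

# -P 3 runs up to three jobs at once; -I@ substitutes each input line for @.
# "echo up" is a stand-in for: ssh @ uptime
# Parallel jobs finish in no fixed order, so sort the result.
output=$(printf '%s\n' "$hosts" \
    | xargs -P 3 -I@ sh -c 'echo "up" | sed -e "s/^/@ /"' \
    | sort)
echo "$output"
```

Since @ is pasted straight into the sh -c string, only feed this a host list you trust.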