Lecture Five |
5.1 Objectives |
This lecture introduces other useful UNIX system utilities and covers:
- Connecting to remote machines.
- Network routing utilities.
- Remote file transfer.
- Other Internet-related utilities.
- Facilities for user information and communication.
- Printer control.
- Email utilities.
- Advanced text file processing with sed and awk.
- Target directed compilation with make.
- Version control with CVS.
- C/C++ compilation utilities.
- Manual pages.
5.2 Connecting to Remote Machines |
- telnet machinename
telnet provides an insecure mechanism for logging into remote machines. It is insecure because all data (including your username and password) is passed in unencrypted format over the network. For this reason, telnet login access is disabled on most systems and where possible it should be avoided in favour of secure alternatives such as ssh.
telnet is still a useful utility, however, because, by specifying different port numbers, telnet can be used to connect to other services offered by remote machines besides remote login (e.g. web pages, email, etc.) and reveal the mechanisms behind how those services are offered. For example,
$ telnet www.doc.ic.ac.uk 80
Trying 146.169.1.10...
Connected to seagull.doc.ic.ac.uk (146.169.1.10).
Escape character is '^]'.
GET / HTTP/1.0

HTTP/1.1 200 OK
Date: Sun, 10 Dec 2000 21:06:34 GMT
Server: Apache/1.3.14 (Unix)
Last-Modified: Tue, 28 Nov 2000 16:09:20 GMT
ETag: "23dcfd-3806-3a23d8b0"
Accept-Ranges: bytes
Content-Length: 14342
Connection: close
Content-Type: text/html

<HTML>
<HEAD>
<TITLE>Department of Computing, Imperial College, London: Home Page</TITLE>
</HEAD>
(etc)
Here www.doc.ic.ac.uk is the name of the remote machine (in this case the web server for the Department of Computing at Imperial College in London). Like most web servers, it offers web page services on port 80 through the daemon httpd (to see what other services are potentially available on a machine, have a look at the file /etc/services; and to see what services are actually active, see /etc/inetd.conf). By entering a valid HTTP GET command (HTTP is the protocol used to serve web pages) we obtain the top-level home page in HTML format. This is exactly the same process that is used by a web browser to access web pages.
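The request typed into telnet above can also be generated by a script. The following is a minimal sketch rather than part of the original session: the function name http_get_request is hypothetical, and the nc (netcat) program mentioned afterwards is an assumption, since it is not a standard utility and may not be installed. The sketch only shows that an HTTP/1.0 request is a line of text followed by an empty line, with each line ending in carriage return and line feed.

```shell
#!/bin/sh
# http_get_request is a hypothetical helper (not a standard utility).
# It prints a minimal HTTP/1.0 GET request for the path given as $1.
# The request line is followed by an empty line, which tells the
# server that the request is complete; each line ends in CR LF.
http_get_request() {
    printf 'GET %s HTTP/1.0\r\n\r\n' "$1"
}

# Show the request with its control characters made visible.
http_get_request / | od -c
```

To send the request, pipe it into a program that opens a TCP connection, for example http_get_request / | nc www.doc.ic.ac.uk 80 on systems where nc is available.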
- rlogin, rsh
rlogin and rsh are insecure facilities for logging into remote machines and for executing commands on remote machines respectively. Along with telnet, they have been superseded by ssh.
- ssh machinename (secure shell)
ssh is a secure alternative for remote login and also for executing commands on a remote machine. It is intended to replace rlogin and rsh, and to provide secure encrypted communications between two untrusted hosts over an insecure network. X11 connections (i.e. graphics) can also be forwarded over the secure channel (another advantage over telnet, rlogin and rsh). ssh is not a standard system utility, although it is a de facto standard. It can be obtained from http://www.ssh.org. A good introduction page giving more background and showing you how to set up ssh is http://www.tac.nyc.ny.us/~kim/ssh/.
ssh clients are also available for Windows machines (e.g. there is a good ssh client called putty).
5.3 Network routing utilities |
- ping machinename
The ping utility is useful for checking the round-trip response time between machines. For example,
$ ping www.doc.ic.ac.uk
measures the response time delay between the current machine and the web server at the Department of Computing at Imperial College. ping is also useful to check whether a machine is still "alive" in some sense.
- traceroute machinename
traceroute shows the full path taken to reach a remote machine, including the delay to each machine along the route. This is particularly useful in tracking down the location of network problems.
5.4 Remote File Transfer |
- ftp machinename (file transfer protocol)
ftp is an insecure way of transferring files between computers. When you connect to a machine via ftp, you will be asked for your username and password. If you have an account on the machine, you can use it, or you can often use the user "ftp" or "anonymous". Once logged in via FTP, you can list files (dir), receive files (get and mget) and send files (put and mput). (Unusually for UNIX) help will show you a list of available commands. Particularly useful are binary (transfer files preserving all 8 bits) and prompt n (do not confirm each file on multiple file transfers). Type quit to leave ftp and return to the shell prompt.
- scp sourcefiles destination (secure copy)
scp is a secure way of transferring files between computers. It works just like the UNIX cp command except that the arguments can specify a user and machine as well as files. For example:
$ scp will@rose.doc.ic.ac.uk:~/hello.txt .
will (subject to correct authentication) copy the file hello.txt from the user account will on the remote machine rose.doc.ic.ac.uk into the current directory (.) on the local machine.
5.5 Other Internet-related utilities |
- netscape
netscape is a fully-fledged graphical web browser (like Internet Explorer).
- lynx
lynx provides a way to browse the web on a text-only terminal.
- wget URL
wget provides a way to retrieve files from the web (using the HTTP protocol). wget is non-interactive, which means it can run in the background, while the user is not logged in (unlike most web browsers). The content retrieved by wget is stored as raw HTML text (which can be viewed later using a web browser).
Note that netscape, lynx and wget are not standard UNIX system utilities, but are frequently-installed application packages.
5.6 User Information and Communication |
- finger, who
finger and who show the list of users logged into a machine, the terminal they are using, and the date they logged in on.
$ who
will pts/2 Dec 5 19:41
$
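Because who prints one line per login session, its output combines well with other text utilities. The sketch below counts the distinct users logged in. The function name count_users is hypothetical, and the sketch assumes that the user name is the first whitespace-separated field on each line, as in the listing above.

```shell
#!/bin/sh
# count_users is a hypothetical helper: it reads who-style output on
# standard input and prints the number of distinct user names.
# Assumption: the user name is the first field on every line.
count_users() {
    awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# A user with several login sessions is counted only once.
who | count_users
```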
- write, talk
write is used by users on the same machine who want to talk to each other. You should specify the user and (optionally) the terminal they are on:
$ write will pts/2
hello will
Lines are only transmitted when you press Return. To return to the shell prompt, press ctrl-d (the UNIX end of file marker).
talk is a more sophisticated interactive chat client that can be used between remote machines:
$ talk will@rose.doc.ic.ac.uk
Unfortunately, because of ever tighter security restrictions, it is increasingly unlikely that talk will work (it requires a special daemon called talkd to be running on the remote computer). Sometimes an application called ytalk will succeed if talk fails.
5.7 Printer Control |
lpr adds a document to a print queue, so that the document is printed when the printer is available. Look at /etc/printcap to find out what printers are available.
lpq checks the status of the specified print queue. Each job will have an associated job number.
lprm removes the given job from the specified print queue.
Note that lpr, lpq and lprm are BSD-style print management utilities. If you are using a strict SYSV UNIX, you may need to use the SYSV equivalents lp, lpstat and cancel.
5.8 Email Utilities |
mail is the standard UNIX utility for sending and receiving email. Typing mail with no arguments lists the messages in your mailbox:
$ mail
Mail version 8.1 6/6/93. Type ? for help.
"/var/spool/mail/will": 2 messages 2 new
1 jack@sprat.com Mon Dec 11 10:37 "Beanstalks"
2 bill@whitehouse.gov Mon Dec 11 11:00 "Re: Monica"
&
Some of the more important commands (type ? for a full list) are given below in Fig. 5.1. Here a messagelist is either a single message specified by a number (e.g. 1) or a range (e.g. 1-2). The special messagelist * matches all messages.
?                help
q                quit, saving changes to mailbox
x                quit, restoring mailbox to its original state
t messagelist    displays messages
+/-              show next/previous message
d messagelist    deletes messages
u messagelist    undelete messages
m address        send a new email
r messagelist    reply to sender and other recipients
R messagelist    reply only to sender
Fig. 5.1: Common mail commands
You can also use mail to send email directly from the command line. For example:
$ mail -s "Hi" wjk@doc.ic.ac.uk < message.txt
$
emails the contents of the (ASCII) file message.txt to the recipient wjk@doc.ic.ac.uk with the subject "Hi".
- mutt, elm, pine
mutt, elm and pine are more friendly (but non-standard) email interfaces that you will probably prefer to use instead of mail. All have good in-built help facilities.
- sendmail, exim
Email is actually sent using a Mail Transfer Agent (MTA), which uses a protocol called SMTP (Simple Mail Transfer Protocol). Two widely used Mail Transfer Agents are sendmail and exim. You can see how these agents work by using telnet to connect to port 25 of any mail server, for example:
$ telnet mail.doc.ic.ac.uk 25
Trying 146.169.1.47...
Connected to diver.doc.ic.ac.uk (146.169.1.47).
Escape character is '^]'.
220 diver.doc.ic.ac.uk ESMTP Exim 3.16 #7
HELP
214-Commands supported:
214- HELO EHLO MAIL RCPT DATA AUTH
214 NOOP QUIT RSET HELP
MAIL FROM: alien@xfiles.com
250 <alien@xfiles.com> is syntactically correct
RCPT TO: wjk@doc.ic.ac.uk
250 <wjk@doc.ic.ac.uk> verified
DATA
354 Enter message, ending with "." on a line
Hi
This is a message from an alien
.
250 OK id=145UqB-0002t6-00
QUIT
221 diver.doc.ic.ac.uk closing connection
Connection closed by foreign host.
$
This sends an email to wjk@doc.ic.ac.uk, apparently from alien@xfiles.com. Email advertisers (aka spammers) often use this technique to attempt to confuse recipients as to the true source of messages. Fortunately exim and sendmail include extensive header information when they forward email, including the IP address of the computer from where the message was sent.
5.9 Advanced Text File Processing |
- sed (stream editor)
sed allows you to perform basic text transformations on an input stream (i.e. a file or input from a pipeline). For example, you can delete lines containing a particular string of text, or you can substitute one pattern for another wherever it occurs in a file. Although sed is a mini-programming language all on its own and can execute entire scripts, its full language is obscure and probably best forgotten (being based on the old and esoteric UNIX line editor ed). sed is probably at its most useful when used directly from the command line with simple parameters:
$ sed "s/pattern1/pattern2/" inputfile > outputfile
(substitutes pattern2 for the first occurrence of pattern1 on each line)
$ sed "s/pattern1/pattern2/g" inputfile > outputfile
(substitutes pattern2 for every occurrence of pattern1 on each line)
$ sed "/pattern1/d" inputfile > outputfile
(deletes all lines containing pattern1)
$ sed "y/string1/string2/" inputfile > outputfile
(replaces each character in string1 with the corresponding character in string2)
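The four forms above can be tried on a small file. The following is an illustrative sketch only: the file name fruit.txt and its contents are invented for the example, and the result of each command is noted in the comment beside it (with / separating output lines).

```shell
#!/bin/sh
# Create a small sample file in a scratch directory (the name
# fruit.txt and its contents are invented for this example).
cd "$(mktemp -d)" || exit 1
printf 'apple apple\nbanana\ncherry apple\n' > fruit.txt

# Replace only the first "apple" on each line.
sed "s/apple/pear/" fruit.txt      # pear apple / banana / cherry pear

# Replace every "apple" on each line.
sed "s/apple/pear/g" fruit.txt     # pear pear / banana / cherry pear

# Delete every line that contains "banana".
sed "/banana/d" fruit.txt          # apple apple / cherry apple

# Map each character of "abc" to the corresponding one in "xyz".
sed "y/abc/xyz/" fruit.txt         # xpple xpple / yxnxnx / zherry xpple
```

Note that the s and d forms treat pattern1 as a regular expression, whereas y works on individual characters, so its two strings must be the same length.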
- awk (Aho, Weinberger and Kernighan)
awk is useful for manipulating files that contain columns of data on a line by line basis. Like sed, you can either pass awk statements directly on the command line, or you can write a script file and let awk read the commands from the script.
Say we have a file of cricket scores called cricket.dat containing columns for player number, name, runs and the way in which they were dismissed:
1 atherton 0 bowled
2 hussain 20 caught
3 stewart 47 stumped
4 thorpe 33 lbw
5 gough 6 run-out
To print out only the second and third columns (name and runs) we can say:
$ awk '{ print $2 " " $3 }' cricket.dat
atherton 0
hussain 20
stewart 47
thorpe 33
gough 6
$
Here $n stands for the nth field or column of each line in the data file. $0 can be used to denote the whole line.
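A few more one-line uses of fields are sketched below, using the same cricket.dat data (recreated in a scratch directory so that the example runs on its own). The built-in variable NF and the use of a condition in front of the braces are standard awk features that are not described above.

```shell
#!/bin/sh
# Recreate the cricket.dat file from the text in a scratch directory.
cd "$(mktemp -d)" || exit 1
cat > cricket.dat <<'EOF'
1 atherton 0 bowled
2 hussain 20 caught
3 stewart 47 stumped
4 thorpe 33 lbw
5 gough 6 run-out
EOF

# $2 is the name and $3 the runs; a comma in print inserts a space.
awk '{ print $2, $3 }' cricket.dat

# A condition before the braces selects lines: here, scores above 20.
awk '$3 > 20 { print $0 }' cricket.dat

# NF holds the number of fields on the current line; $NF is the last one.
awk '{ print $2 " was " $NF }' cricket.dat
```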
We can do much more with awk. For example, we can write a script cricket.awk to calculate the team's batting average and to check if Mike Atherton got another duck:
$ cat > cricket.awk
BEGIN { players = 0; runs = 0 }
{ players++; runs +=$3 }
/atherton/ { if ($3 == 0) print "atherton duck!" }
END { print "the batting average is " runs/players }
(ctrl-d)
$ awk -f cricket.awk cricket.dat
atherton duck!
the batting average is 21.2
$
The BEGIN clause is executed once at the start of the script, the main clause once for every line, the /atherton/ clause only if the word atherton occurs in the line and the END clause once at the end of the script.
awk can do a lot more. See the manual pages for details (type man awk).
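As one further illustration, the sketch below finds the highest scorer in cricket.dat. The script name topscore.awk and the variable names are hypothetical, and the data file is recreated in a scratch directory so that the example is self-contained.

```shell
#!/bin/sh
# Recreate the cricket.dat file from the text in a scratch directory.
cd "$(mktemp -d)" || exit 1
cat > cricket.dat <<'EOF'
1 atherton 0 bowled
2 hussain 20 caught
3 stewart 47 stumped
4 thorpe 33 lbw
5 gough 6 run-out
EOF

# topscore.awk is a hypothetical script name used for this example.
cat > topscore.awk <<'EOF'
# Remember the highest score seen so far and who made it.
NR == 1 || $3 > best { best = $3; name = $2 }
# Add up the runs so the top score can be set against the team total.
{ total += $3 }
END { print name " top-scored with " best " of " total " runs" }
EOF

awk -f topscore.awk cricket.dat
```

Here NR is the built-in count of lines read so far, so the first clause always fires on the first line and thereafter only when a higher score is seen.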
5.10 Target Directed Compilation |
- make
make is a utility which can determine automatically which pieces of a large program need to be recompiled, and issue the commands to recompile them. To use make, you need to create a file called Makefile or makefile that describes the relationships among files in your program, and states the commands for updating each file.
Here is an example of a simple makefile:
scores.out: cricket.awk cricket.dat
[TAB]awk -f cricket.awk cricket.dat > scores.out
Here [TAB] indicates the TAB key. The interpretation of this makefile is as follows:
- scores.out is the target of the compilation
- scores.out depends on cricket.awk and cricket.dat
- if either cricket.awk or cricket.dat has been modified since scores.out was last modified, or if scores.out does not exist, update scores.out by executing the command:
awk -f cricket.awk cricket.dat > scores.out
make is invoked simply by typing:
$ make
awk -f cricket.awk cricket.dat > scores.out
$
Since scores.out did not exist, make executed the commands to create it. If we now invoke make again, nothing happens:
$ make
make: `scores.out' is up to date.
$
But if we modify cricket.dat and then run make again, scores.out will be updated:
$ touch cricket.dat    (touch simulates file modification)
$ make
awk -f cricket.awk cricket.dat > scores.out
$
make is mostly used when compiling large C, C++ or Java programs, but can (as we have seen) be used to automatically and intelligently produce a target file of any kind.
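A common stumbling block is that the command line in a makefile must begin with a real TAB character, not with spaces. The sketch below is one way to avoid the problem when creating the example makefile from the shell: printf turns \t into a TAB, and a check with grep confirms it. The scratch directory, the shortened cricket.awk and the check are additions made for this illustration, and make is run only if it is installed.

```shell
#!/bin/sh
cd "$(mktemp -d)" || exit 1

# Recreate the inputs from the text (cricket.awk is shortened here).
cat > cricket.dat <<'EOF'
1 atherton 0 bowled
2 hussain 20 caught
3 stewart 47 stumped
4 thorpe 33 lbw
5 gough 6 run-out
EOF
cat > cricket.awk <<'EOF'
{ players++; runs += $3 }
END { print "the batting average is " runs/players }
EOF

# \t in the printf format becomes the TAB that must start the command line.
printf 'scores.out: cricket.awk cricket.dat\n\tawk -f cricket.awk cricket.dat > scores.out\n' > Makefile

# Confirm that the second line of the makefile really starts with a TAB.
tab=$(printf '\t')
sed -n 2p Makefile | grep -q "^$tab" && echo "command line starts with a TAB"

# Build the target if make is available on this system.
if command -v make >/dev/null 2>&1; then
    make
    cat scores.out
fi
```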
5.11 Version control with CVS |
- cvs (Concurrent Versions System)
cvs is a source code control system often used on large programming projects to control the concurrent editing of source files by multiple authors. It keeps old versions of files and maintains a log of when and why changes occurred, and who made them.
cvs keeps a single copy of the master sources. This copy is called the source "repository"; it contains all the information needed to extract previous software releases at any time, based on either a symbolic revision tag or a date in the past.
cvs has a large number of commands (type info cvs for a full cvs tutorial, including how to set up a repository from scratch or from existing code). The most useful commands are:
- cvs checkout modules
This gives you a private copy of source code that you can work on without interfering with others.
- cvs update
This updates the code you have checked out, to reflect any changes that have subsequently been made by other developers.
- cvs add files
You can use this to add new files to a repository that you have checked out. It does not actually affect the repository until a "cvs commit" is performed.
- cvs remove files
This removes files from a checked-out repository. It does not affect the repository until a "cvs commit" is performed.
- cvs commit files
This command publishes your changes to other developers by updating the source code in the central repository.
5.12 C/C++ compilation utilities |
- cc, gcc, CC, g++
UNIX installations usually come with a C and/or C++ compiler. The C compiler is usually called cc or gcc, and the C++ compiler is usually called CC or g++. Most large C or C++ programs will come with a makefile and will support the configure utility, so that compiling and installing a package is often as simple as:
$ ./configure
$ make
$ make install
However, there is nothing to prevent you from writing and compiling a simple C program yourself:
$ cat > hello.c
#include <stdio.h>
int main(void) {
    printf("hello world!\n");
    return 0;
}
(ctrl-d)
$ cc hello.c -o hello
$ ./hello
hello world!
$
Here the C compiler (cc) takes as input the C source file hello.c and produces as output an executable program called hello. The program hello may then be executed (the ./ tells the shell to look in the current directory to find the hello program).
5.13 Manual Pages |
- man
More information on most UNIX commands is available via the online manual pages, which are accessible through the man command. The online documentation is in fact divided into sections. Traditionally, they are
1 User-level commands
2 System calls
3 Library functions
4 Devices and device drivers
5 File formats
6 Games
7 Various miscellaneous stuff - macro packages etc.
8 System maintenance and operation commands
Sometimes man gives you a manual page from the wrong section. For example, say you were writing a program and you needed to use the rmdir system call. man rmdir gives you the manual page for the user-level command rmdir. To force man to look in Section 2 of the manual instead, type man 2 rmdir (or man -s2 rmdir on some systems).
man can also find manual pages which mention a particular topic. For example, man -k postscript should produce a list of utilities that can produce and manipulate postscript files.
- info
info is an interactive, somewhat more friendly and helpful alternative to man. It may not be installed on all systems, however.