Users, Permissions, Processes, and Pipes

USERS

UID and GID

In Unix, there are User IDs and Group IDs. For convenience user IDs are usually referred to as UIDs, and group IDs as GIDs. Each user has one UID, and one or more GIDs. UIDs are unique, in that each user has their own personal UID. GIDs can be shared between users. Each file is owned by one UID, and has access permissions for one GID. Groups are particularly useful when a file needs to be accessed by more than one user, in which case the GID of the file is set to a group which the users have in common.

For instance, user "gordon" could have UID 1000, and user "robert" could have UID 1001. They could be both in a group called "staff", with a GID of 500.
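To see these values for the account you are currently logged in as, the id command can be used. This is only a minimal sketch; the variable names are my own, and the numbers printed depend entirely on your system.

```shell
# Show the numeric UID and primary GID of the current user.
my_uid=$(id -u)
my_gid=$(id -g)
echo "UID: $my_uid  primary GID: $my_gid"

# List every group the current user belongs to, first by name, then by number.
id -Gn
id -G
```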

Users

For system administrative purposes, the relationships between users, groups, UIDs, and GIDs are all stored in 4 system files. These also contain user and group identity information, such as usernames, real names, and passwords. The 4 files are:

/etc/passwd - General User details.
/etc/shadow - User passwords.
/etc/group - The user's groups.
/etc/gshadow - Passwords for groups.

$ cat /etc/passwd

Here is an example containing the first few lines of the /etc/passwd file.

 root:x:0:0:root:/root:/bin/bash
 bin:x:1:1:bin:/bin:/sbin/nologin
 daemon:x:2:2:daemon:/sbin:/sbin/nologin

The file has the following format:

Username, x, uid, gid, text name, home directory, login shell.

Username is the user-friendly name for that particular user. It is used for logging in. The 'x' is where the encrypted password string used to be stored for each user, but this has since been moved to /etc/shadow, and the 'x' indicates that the password is now held there. UID is the UID of this entry (thus root has UID 0). The GID is the primary group of that user: when that user logs in and starts to create files, those files will belong to that primary GID.

The "text name" of a user is effectively a comment field describing that user. So for user "gordon" it could be "Gordon Russell". For system users it is usually just the same text as the username. The home directory is the location that the user is "cd"ed to when they log into their account, as well as the location they end up in if they do "cd ~". You can also cd to other users' home directories by doing "cd ~username", e.g. "cd ~root".

Lastly, the login shell is the program which controls the prompt when that user logs in. A common shell program is "/bin/bash", the "Bourne Again SHell" (an in-joke in the Linux community, as one of the original shells was called the Bourne Shell). Other entries may refer to ksh, csh, or tcsh. If the shell is given as "/sbin/nologin" then anyone who logs in with that account is immediately logged out. Such nologin accounts are only useful for partitioning system files into categories, so that not all system files end up being owned by root. Root is the system administration account, or "super-user".
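The seven-field format can be picked apart directly in the shell. The sketch below runs on a copy of the three sample lines shown earlier rather than the real file, and the variable names are only illustrative.

```shell
# Sample data: the three /etc/passwd lines shown earlier.
sample_passwd='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin'

# Split each line on ":" into the seven fields and report a few of them.
passwd_report=$(printf '%s\n' "$sample_passwd" |
  while IFS=: read -r username placeholder uid gid realname home shell; do
    echo "$username uid=$uid gid=$gid home=$home shell=$shell"
  done)
echo "$passwd_report"
```

To try the same loop on a live system, feed it from /etc/passwd instead of the sample text.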

$ head -3 /etc/shadow

root:$1$RcFIaOlb$bwl5dvTECg3M1ZgMQ7e6I.:12663:0:99999:7:::
bin:*:12621:0:99999:7:::
daemon:*:12621:0:99999:7:::

This holds the encrypted password information for each user. Most system files are readable by anyone, but this file is only readable by root. Thus to steal a password you must be root first, and then you still need to crack the encrypted string. The passwords themselves are stored using one of the algorithms considered to be of "reasonable strength" at the time, which in this case is a salted MD5 (indicated by the $1$ prefix).

The rest of the line allows you to specify rules about password expiry information, should you want to use such things. Things which can be stored here are:

Date of last password change
Days to wait before a password change is permitted
Days left before a password change is required
Days left before a warning about a password change requirement
Days left before the account becomes inactive
Days until the account expires
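The date of the last password change is stored as a count of days since 1 January 1970. As a rough sketch, and assuming GNU date is available (the "-d @seconds" form is a GNU extension), the value from the root line in the extract above can be turned into a calendar date:

```shell
# Sample line: the root entry from the /etc/shadow extract above.
shadow_line='root:$1$RcFIaOlb$bwl5dvTECg3M1ZgMQ7e6I.:12663:0:99999:7:::'

# Field 3 is the date of the last password change, in days since 1970-01-01.
last_change_days=$(printf '%s\n' "$shadow_line" | cut -d: -f3)

# Convert days to seconds and let GNU date format the result.
last_change_date=$(date -u -d "@$((last_change_days * 86400))" +%F)
echo "root last changed the password on $last_change_date"
```

For this sample line the date works out as 2004-09-02.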

Password and account expiry is not considered further here.

$ tail -3 /etc/group

gdm:x:42:
dovecot:x:97:
mysql:x:27:

The group file contains the translations from GID to group names. So for instance, group "mysql" is GID 27. The 'x' is another placeholder similar to that found in /etc/passwd, and can be safely ignored.

It is perfectly reasonable for a user to be in more than 1 group, but as the /etc/passwd file only specifies the user's primary group then additional group membership needs to be stored here. So for instance if "gordon" has a primary group id of 500 (staff), but also belonged to group 501 (student) then you could have an entry like:

staff:x:500:
student:x:501:gordon,robert
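Working out every group a user belongs to therefore means looking in both files. The sketch below does this for "gordon" using sample text only. The home directory and shell on his passwd line are made up for illustration, and the variable names are my own.

```shell
# Sample data based on the examples above (home directory and shell are hypothetical).
sample_passwd='gordon:x:1000:500:Gordon Russell:/home/gordon:/bin/bash'
sample_group='staff:x:500:
student:x:501:gordon,robert'

user=gordon

# Primary group: field 4 of the passwd line, translated to a name via the group data.
primary_gid=$(printf '%s\n' "$sample_passwd" | awk -F: -v u="$user" '$1 == u {print $4}')
primary_group=$(printf '%s\n' "$sample_group" | awk -F: -v g="$primary_gid" '$3 == g {print $1}')

# Additional groups: any group line whose member list (field 4) names the user.
extra_groups=$(printf '%s\n' "$sample_group" |
  awk -F: -v u="$user" '{ n = split($4, m, ","); for (i = 1; i <= n; i++) if (m[i] == u) print $1 }')

echo "$user: primary=$primary_group additional=$extra_groups"
```

On a real machine "id -Gn username" gives the same answer without any parsing.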

$ tail -3 /etc/gshadow

gdm:x::
dovecot:x::
mysql:x::

This file allows a group to be given a password, which in turn allows a user who knows that password to do things as if they belonged to the group, even when the /etc/group file does not list them as a member. This is technically possible, but I have never seen it used in the real world, and you are very unlikely to see any password stored here. The encrypted password would go where the 'x' is.

Changing ownership information of Files

> touch /tmp/test
> ls -l /tmp/test
-rw-r--r--.  1 root root 0 Sep 23 15:47 /tmp/test
> chmod og+wx /tmp/test
> ls -l /tmp/test
-rw-rwxrwx.  1 root root 0 Sep 23 15:47 /tmp/test
> chown ftp:mem /tmp/test
> ls -l /tmp/test
-rw-rwxrwx.  1 ftp mem 0 Sep 23 15:47 /tmp/test
> chgrp root /tmp/test
> ls -l /tmp/test
-rw-rwxrwx.  1 ftp root 0 Sep 23 15:47 /tmp/test

Files are created with the ownership UID of the current user, and the GID of that user's primary group. Sometimes though you need to change the GID or the UID of a file. A normal user can change the GID of a file they own to a group they have membership of. However root can change the UID or the GID to anything they desire.
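The part of this that a normal user can do can be tried safely on a throwaway file. This is a sketch only: it assumes GNU stat (for the -c option), and the variable names are illustrative.

```shell
# Work on a throwaway file so nothing important is changed.
demo_file=$(mktemp)
chmod 644 "$demo_file"                  # start from -rw-r--r--, as in the example
chmod og+wx "$demo_file"                # add write and execute for group and others
demo_mode=$(stat -c %a "$demo_file")    # GNU stat: permissions in octal
echo "mode is now $demo_mode"

# A normal user may set the group to any group they belong to;
# here the file is simply given the user's own primary group.
chgrp "$(id -g)" "$demo_file"
demo_gid=$(stat -c %g "$demo_file")
echo "group is now GID $demo_gid"

rm -f "$demo_file"
```

Changing the owner with chown, as in the transcript above, needs root.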

PROCESSES

Processes

A process is an executable program which is running somewhere on the machine. Each process has its own identifier, a number called a PID (traditionally below 32768, although the kernel limit can be raised). Some processes are named after the executable file used to start them (e.g. the ls command, when executed, will create a process called /bin/ls) but others have more dynamic or symbolic names, depending on their purpose. Strangely named processes often have names which appear in square brackets like [kjournald], and these usually indicate processes deep inside the kernel, which should not be messed with. All processes have a parent (the process which started them or is taking responsibility for them). The top process, which is ultimately the parent of all processes, is called INIT, and has PID 1.

$ ps aux

USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root       1    0.1  0.8  1480  496 ?        S    12:57   0:00 init [5]
root       2    0.0  0.0     0    0 ?        SWN  12:57   0:00 [ksoftirqd/0]
root       3    0.0  0.0     0    0 ?        SW<  12:57   0:00 [events/0]
root       4    0.0  0.0     0    0 ?        SW<  12:57   0:00 [khelper]
root       16   0.0  0.0     0    0 ?        SW   12:57   0:00 [kjournald]
...
root       527  0.0  0.9  1464  576 ?        S    12:58   0:00 syslogd -m 0
rpc        553  0.0  0.9  1544  584 ?        S    12:58   0:00 portmap
rpcuser    573  0.1  1.3  1644  812 ?        S    12:58   0:00 rpc.statd
root       658  0.4  2.4  3656 1484 ?        S    12:58   0:00 /usr/sbin/sshd
gordon   15521  0.0  0.1  3992  760 pts/1    R    20:41   0:00 ps aux

Here is a brief list of the processes on one of my machines. The init process has PID 1, is using 0.1% of the CPU and 0.8% of the memory. VSZ is the virtual size of a process, of which some may be in main memory and some may have been swapped to disk. The amount of main memory used is the resident size, or RSS.

The TTY is the device name related to the controlling shell which actually started the process. If this is "?" then the process was probably started without a shell (e.g. directly by the kernel). However "ps aux" at the end was typed at the prompt, thus is being controlled by a shell, and the screen device is /dev/pts/1. STAT holds the state codes, which are discussed in the next section. Next come the start time of the process and its total run time in active CPU seconds. Last is the name of the process, which is most likely the name of the executable used to start it.

State Codes

The STAT or state codes can tell you about the behaviour of a process:

Standard Codes
- D uninterruptible sleep (usually IO)
- R runnable (on run queue)
- S sleeping
- T traced or stopped
- W paging
- X dead
- Z a defunct ("zombie") process
Additional Codes
- W has no resident pages
- < high-priority process
- N low-priority task
- L has pages locked into memory (for real-time and custom IO)
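The same one-letter code can be read without ps, from the status file the kernel keeps for each process under /proc. A small sketch, looking at the shell that runs it (the variable name is my own):

```shell
# Read the state of the current shell (PID $$) from /proc.
# The "State:" line holds the one-letter code followed by a description.
shell_state=$(awk '/^State:/ {print $2}' "/proc/$$/status")
echo "shell $$ is in state $shell_state"
```

The letter printed should match the first character of the STAT column that ps shows for the same PID.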

Process Relationships

All processes have parents, except INIT. A process may have a parent, who has a parent, who has another parent, etc, so long as eventually the parent is INIT. Knowing the parentage of a process may help you understand what is actually being executed on your machine. A great tool for this is called pstree.

In this example extract pstree is executed, in the bash shell, after logging in to the machine using ssh (controlled by sshd).

$ pstree

       init---anacron
              +-atd(daemon)
              +-crond
              +-sshd---sshd---bash---pstree
              +-syslogd
              +-xfs(xfs)
              +-xinetd
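If pstree is not installed, the chain of parents can be followed by hand, since each process records its parent's PID. The loop below is only a sketch with illustrative variable names. It starts at the current shell and stops when it reaches a process with no visible parent.

```shell
# Walk from the current shell up through its ancestors using /proc.
# Each /proc/PID/status file has a "PPid:" line naming the parent.
pid=$$
ancestry=""
while [ "$pid" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
  name=$(awk '/^Name:/ {print $2}' "/proc/$pid/status")
  ancestry="$name($pid) $ancestry"          # prepend, so the oldest ancestor comes first
  pid=$(awk '/^PPid:/ {print $2}' "/proc/$pid/status")
  pid=${pid:-0}                             # treat a vanished process as the end of the chain
done
echo "$ancestry"
```

The output is a single line reading from the oldest ancestor down to the shell itself.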

/proc

All processes also appear as directories in Linux. This is a convenience, so that you can use your favourite file and directory commands to look for and understand processes without having to learn a new set of commands. Each process has a subdirectory in /proc with a name equal to its PID. In each PID directory is a great deal of information about memory usage, process state, how the process was created, and even its current raw memory pages. Remember that PID 658 was the sshd daemon in the earlier ps example:

$ ls -l /proc/658

-r--------. 1 root root 0 Sep 21 16:32 auxv
-r--r--r--. 1 root root 0 Sep 21 16:31 cmdline
lrwxrwxrwx. 1 root root 0 Sep 21 16:32 cwd -> /
-r--------. 1 root root 0 Sep 21 16:32 environ
lrwxrwxrwx. 1 root root 0 Sep 21 16:32 exe -> /usr/sbin/sshd
dr-x------. 2 root root 0 Sep 21 16:32 fd
-r--r--r--. 1 root root 0 Sep 21 16:32 maps
-rw-------. 1 root root 0 Sep 21 16:32 mem
-r--r--r--. 1 root root 0 Sep 21 16:32 mounts
lrwxrwxrwx. 1 root root 0 Sep 21 16:32 root -> /
-r--r--r--. 1 root root 0 Sep 21 16:31 stat
-r--r--r--. 1 root root 0 Sep 21 16:32 statm
-r--r--r--. 1 root root 0 Sep 21 16:31 status
dr-xr-xr-x. 3 root root 0 Sep 21 16:32 task
-r--r--r--. 1 root root 0 Sep 21 16:32 wchan

Some of the files are hard to read. "cwd" is the current working directory, or in effect where the process has "cd"ed to. "exe" is the name of the executable file used to start the process. "cmdline" contains the options and flags used when the executable was started. "fd" contains all the files which this executable currently has open. This fd directory can be useful for forensic analysis.

$ ls -l /proc/658/fd

lrwx------. 1 root root 64 Sep 21 16:32 0 -> /dev/null
lrwx------. 1 root root 64 Sep 21 16:32 1 -> /dev/null
lrwx------. 1 root root 64 Sep 21 16:32 2 -> /dev/null
lrwx------. 1 root root 64 Sep 21 16:32 3 -> socket:[4230]

So here sshd appears to have 4 things open. Almost all processes have a minimum of 3 open files, numbered 0, 1, and 2. These have standard names: STDIN, STDOUT, and STDERR. If a process wants information from the keyboard, it reads from STDIN. If it wants to print normal information, it writes to STDOUT. If something is wrong, and it wants to print diagnostic information, it writes to STDERR.

In the case of sshd, which is a daemon, it won't have a keyboard or screen directly connected to it. Almost all daemons instead point handles 0, 1, and 2 at /dev/null when they start. This special device can pretend to be a keyboard or screen, but it never has any key presses and anything printed to it is immediately discarded. So in effect STDIN, STDOUT, and STDERR are closed.

There is another handle open: handle 3. Its name indicates that it is a socket. Sockets are network connections, and this is sensible as sshd will be listening on TCP port 22 for new ssh connections. So in this case the process is working properly.
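You can explore these files for a process of your own without being root. The sketch below looks at the shell that runs it; the variable names are illustrative, and the values printed will differ from machine to machine.

```shell
# Inspect the current shell (PID $$) through its /proc directory.
proc_dir=/proc/$$

shell_cwd=$(readlink "$proc_dir/cwd")      # where the process has "cd"ed to
shell_exe=$(readlink "$proc_dir/exe")      # executable used to start it
# cmdline separates its arguments with NUL bytes; turn them into spaces.
shell_cmdline=$(tr '\0' ' ' < "$proc_dir/cmdline")

echo "cwd:     $shell_cwd"
echo "exe:     $shell_exe"
echo "cmdline: $shell_cmdline"

# List the open file handles; 0, 1 and 2 are STDIN, STDOUT and STDERR.
ls -l "$proc_dir/fd"
```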

Consider the following example:

> sleep 20 > /tmp/hia &
[1] 854
> ls -l /proc/854
lrwxrwxrwx. 1 root root 0 Sep 21 16:45 cwd -> /root
-r--------. 1 root root 0 Sep 21 16:45 environ
lrwxrwxrwx. 1 root root 0 Sep 21 16:45 exe -> /bin/sleep
dr-x------. 2 root root 0 Sep 21 16:45 fd

> ls -l /proc/854/fd
lrwx------. 1 root root 64 Sep 21 16:45 0 -> /dev/pts/0
l-wx------. 1 root root 64 Sep 21 16:45 1 -> /tmp/hia
lrwx------. 1 root root 64 Sep 21 16:45 2 -> /dev/pts/0

Here a process "sleep" is started from the shell (and the user is logged in using /dev/pts/0 as their keyboard and screen). As the process was started with a redirection, output which would normally go to the screen (i.e. STDOUT or handle 1) is redirected to the file /tmp/hia. This can be seen from the fd directory of this process.
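The same check can be scripted. This sketch uses a temporary file in place of /tmp/hia and reads only the one handle of interest. The one second pause is simply an assumption that the background process has started by then.

```shell
# Start a background process whose STDOUT is redirected to a temporary file.
out_file=$(mktemp)
sleep 20 > "$out_file" &
bg_pid=$!

# Give the new process a moment to start, then see where handle 1 points.
sleep 1
fd1_target=$(readlink "/proc/$bg_pid/fd/1")
echo "handle 1 of PID $bg_pid -> $fd1_target"

# Tidy up: stop the background process and remove the file.
kill "$bg_pid"
rm -f "$out_file"
```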

Daemons

A daemon is a process, started when you boot, which runs in the background. Not everything started at boot stays running (some things set something up and then exit). To help us, daemons usually have a name which ends with a "d" (e.g. syslogd, sshd).

$ top

The top command is an interactive tool which tells you what is going on, process wise, on your machine. It refreshes itself automatically every few seconds.

top - 16:03:17 up  1:04,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  35 total,   2 running,  33 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:     59764k total,    52308k used,     7456k free,     6192k buffers
Swap:   205816k total,        0k used,   205816k free,    32472k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  807 root      16   0  1624  728 1412 R  0.0  1.2   0:00.02 in.telnetd
  934 root      16   0  1828  872 1628 R  0.0  1.5   0:00.00 top

The list of processes, by default, only includes those which have been running since the screen last refreshed. In the example I logged in using telnet, thus connected to in.telnetd, then typed top. Both the telnet daemon and top need to work together to show the information on my screen, so both are active.

Useful information includes:

- load average - this shows the overall load of the machine over the last 1, 5, and 15 minutes. A load of 1.0 shows that 1 CPU was fully loaded. Numbers exceeding the number of CPUs mean there was more work to be done than CPUs available.
- CPU load in terms of USer, SYstem, NIce, and IDle percentages. High user times mean time spent running user processes. System time is time spent in the kernel, on things like disk activity and networking. Nice is like User, but for low priority background tasks. Idle time is time spent doing nothing. Another useful measure is WAit, which is the amount of time the computer was doing nothing because it was waiting for information from a hard drive. High wait times indicate too much work dependent on disk (so-called disk-bound activities), which should be avoided if possible.
- Mem is the actual memory usage. Used is the memory needed to run processes, while buffers is how much memory is being used to buffer reads or writes to the disk.
- Swap is how much of the virtual swap space is being used. High used values indicate that many virtual memory blocks have been swapped out to disk.
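The load average figures are also available outside top, in /proc/loadavg. As a rough sketch (assuming the nproc command from GNU coreutils; the variable names and the two verdict strings are my own), they can be compared against the number of CPUs like this:

```shell
# /proc/loadavg starts with the 1, 5 and 15 minute load averages.
read -r load1 load5 load15 rest < /proc/loadavg
cpus=$(nproc)     # number of CPUs available to this process
echo "load averages: $load1 (1 min), $load5 (5 min), $load15 (15 min) on $cpus CPU(s)"

# Compare the 1 minute figure with the CPU count; awk handles the decimals.
load_verdict=$(awk -v l="$load1" -v c="$cpus" \
  'BEGIN { if (l + 0 > c + 0) print "more work than CPUs"; else print "within capacity" }')
echo "$load_verdict"
```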

SYSLOG

The syslogd daemon is your friend. It helps other daemons record what is going on into a file. On the website, you can click on "syslog output" and see what syslogd has noticed. This output is known as the syslog. The kernel's own messages can also be seen from the prompt using "dmesg".

$ who

The who command tells you "who" is currently logged onto the machine. It gives the username, the controlling device, and when they connected. The string in brackets, when present, is the hostname from which that user has logged in.

root     pts/0        Sep 21 15:59 (hub1-gw)

So in this case:

I am root, logged on from hub1-gw.
My session is linked to a device which handles my screen and keyboard, called pts/0
This refers to /dev/pts/0

$ ls -l /dev/pts/0

All hardware devices, such as keyboards, screens, disks, network cards, etc, will have a file to allow them to be controlled located somewhere in /dev. In the example above the screen and keyboard of the remote user's connection is a pseudo-terminal called /dev/pts/0. It works a character at a time, so is called a character device. Hard drives for instance work a disk block at a time, so are called block devices.

crw--w----.  1 root tty 136, 0 Sep 21 16:49 /dev/pts/0

This is a character device.
The person connected via it always owns it.
There are no sizes with block or char devices.
136 is the major device number
0 is the minor device number.

The major device number is a number which helps the kernel know what device you are talking about. Basically it is an index number which the kernel uses to look up the code which handles this device. In the case of a terminal, there may be multiple users connected and thus multiple pts devices in use. However the same code looks after all the users, so the minor device number is used to tell each different pts device apart in the kernel.
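The major and minor numbers can be read with stat as well as ls. The sketch below assumes GNU stat and looks at /dev/null rather than /dev/pts/0, simply because /dev/null exists on every Linux system. It is a character device with major number 1 and minor number 3.

```shell
# /dev/null is a character device that is always present.
dev=/dev/null
dev_type=$(stat -c %F "$dev")            # GNU stat: file type in words

# %t and %T print the major and minor numbers in hexadecimal; convert to decimal.
dev_major=$((0x$(stat -c %t "$dev")))
dev_minor=$((0x$(stat -c %T "$dev")))
echo "$dev: $dev_type, major $dev_major, minor $dev_minor"
```

Pointing dev at your own terminal (the device name shown by who) should report major number 136 for a pts device.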

mknod

In the olden days an administrator had to look up major and minor numbers for each piece of hardware they wanted to use, then create the device files they wanted by hand. These days this is done almost entirely automatically (by devfs on older systems, and by udev on current ones). However, it may be useful to know how to do this for diagnostic reasons.

To create a new file to represent a device, use mknod. The four parameters are the name of the device file, the type (c for character), the major number, and the minor number. So let's create a copy of the /dev/pts/0 device.

> mknod /tmp/screen c 136 0
> echo "hello there" > /tmp/screen
hello there

Strings sent to this device will appear on the screen of the user currently using the /dev/pts/0 pseudoterminal (probably you).

$ man ps

Rather than going through hundreds of commands in the notes, you can just discover commands for yourself using the man command.

"man ls" gives you pages of information about the ls command and all its flags. To leave man press "q". If you are not sure of the name of the command you need, you can do keyword searches using "man -k" (although some keyword searches can produce a lot of output).

> man -k directory
....
lookup_dcookie (2)   - return a directory entry's path
ls (1)               - list directory contents
ls (1p)              - list directory contents
mcd (1)              - change MSDOS directory
....

PIPES

Pipes

In the last lecture you saw ">" and "<" as redirections. For example, to copy file a to file b you could do:

$ cp a b

But you could also use the following (ugly) command (please don't):

$ cat < a > b

Remember the cat command prints what it gets, and here it gets from a and puts to b. Don't do this, as it's too ugly for a real admin to do. These redirections work fine, unless you want the output to go to another program (rather than a file). For example, I am looking for all the users who have a username beginning with "a". I will use a regular expression for this, "^a".

$ grep "^a" /etc/passwd
adm:x:3:4:adm:/var/adm:/sbin/nologin
apache:x:48:48:Apache:/var/www:/sbin/nologin
andrew:x:501:500:Andrew Cumming:/home/andrew:/bin/bash

That got the usernames, but lots of other pieces of information as well.

There is a command called "cut", which chops things out of a line. It will split a field out of a line so long as it knows what character marks the end of 1 field and the start of another.

In /etc/passwd, ":" splits each field, so we tell cut that the delimiter character which splits fields is ":" by using -d":". The username is in field 1, so we tell cut to show us only field 1 using the flag -f1.

$ grep "^a" /etc/passwd > a
$ cut -d":" -f1 < a 
adm
apache
andrew

We can do this in one line, instructing the shell to give the output from grep as the input to cut. We use a pipe "|" to do this.

$ grep "^a" /etc/passwd | cut -d":" -f1
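Pipes can be chained, so further commands can be added to the end. The sketch below repeats the pipeline on sample text (the three matching lines from the grep example plus the root line) and adds a sort stage. The variable names are my own.

```shell
# Sample data: the three matching lines from the grep example, plus root.
sample_passwd='root:x:0:0:root:/root:/bin/bash
adm:x:3:4:adm:/var/adm:/sbin/nologin
apache:x:48:48:Apache:/var/www:/sbin/nologin
andrew:x:501:500:Andrew Cumming:/home/andrew:/bin/bash'

# grep selects the lines, cut keeps field 1, sort orders the names.
a_users=$(printf '%s\n' "$sample_passwd" | grep "^a" | cut -d":" -f1 | sort)
echo "$a_users"

# grep -c counts the matching lines instead of printing them.
a_count=$(printf '%s\n' "$sample_passwd" | grep -c "^a")
echo "$a_count usernames begin with a"
```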

HARD and SOFT LINKS

File links

Sometimes we want to have the same file contents in two or more different places.

- Edit one version and you edit all versions.
- You only use up the disk space for one version no matter how many copies you have.

This is done using a file link.

Links are common in the system directories, and are used for configuration as well as dynamic libraries. There are 2 different types of links in Linux; hard and soft.

Hard Link

Let's assume that the current working directory is /home/john. We wish to create a link 'hardfile2' within the sub-directory projects to the file 'hardfile'.

% date > hardfile ( create the file )
% ls -l
-rwxr-xr-x. 1 john users 605 Nov 18 12:25 hardfile

% ln hardfile projects/hardfile2
% ls -l
-rwxr-xr-x. 2 john users 605 Nov 18 12:25 hardfile

% ls -l projects/hardfile2
-rwxr-xr-x. 2 john users 605 Nov 18 12:25 projects/hardfile2

The file 'hardfile' and its link 'hardfile2' are now indistinguishable, so if 'hardfile' is updated then 'hardfile2' is also updated. Notice the link count has increased to 2. This is shown in both listings.

If one of the links is deleted, the link count decreases to 1. Only when the count reaches 0 does the data which the links point to get deleted. Hard links only work within the same partition, not across partitions. So although hard links are useful and effective, soft links are more common.
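The link count behaviour is easy to confirm in a throwaway directory. This is a sketch assuming GNU stat (%h prints the link count, %i the inode number); the directory and variable names are illustrative.

```shell
# Build the example in a throwaway directory.
work=$(mktemp -d)
mkdir "$work/projects"
date > "$work/hardfile"

ln "$work/hardfile" "$work/projects/hardfile2"

# Both names now report a link count of 2 and share one inode number.
links_after_ln=$(stat -c %h "$work/hardfile")
inode_a=$(stat -c %i "$work/hardfile")
inode_b=$(stat -c %i "$work/projects/hardfile2")
echo "link count: $links_after_ln, inodes: $inode_a and $inode_b"

# Removing one name leaves the data reachable through the other.
rm "$work/hardfile"
links_after_rm=$(stat -c %h "$work/projects/hardfile2")
echo "link count after rm: $links_after_rm"

rm -rf "$work"
```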

Soft Links

Again, let us assume that the current working directory is /home/john and we wish to create a link 'softfile2' within the subdirectory projects to the file 'softfile'. Notice the '-s' switch which is really the only difference between the hard and soft link examples:

% date > softfile ( create the file )
% ls -l
-rwxr-xr-x. 1 john users 605 Nov 18 12:25 softfile
% ln -s /home/john/softfile projects/softfile2
% ls -l projects/softfile2
lrwxrwxrwx. 1 john users 19 Nov 18 12:25 projects/softfile2 -> /home/john/softfile

Notice the appended pathname on the long listing. The link count of the original file has not changed, and the permissions show an 'l' at the beginning of the long listing rather than a '-'.

Again any updates in 'softfile' will be reflected in 'softfile2'. Thus you can access the same disk information from either the original file or via the softlink. However, if you delete the original file then the original data is immediately destroyed, and accessing the soft link will result in a file not found error. Thus with soft links it is up to you to manage the files.
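The broken link case can be reproduced safely in a throwaway directory. A minimal sketch, with illustrative names:

```shell
# Build the example in a throwaway directory.
work=$(mktemp -d)
mkdir "$work/projects"
date > "$work/softfile"

ln -s "$work/softfile" "$work/projects/softfile2"
link_target=$(readlink "$work/projects/softfile2")
echo "softfile2 -> $link_target"

# Delete the original: the link itself remains but now points at nothing.
rm "$work/softfile"
if [ -L "$work/projects/softfile2" ] && [ ! -e "$work/projects/softfile2" ]; then
  link_status="dangling"
else
  link_status="ok"
fi
echo "after rm the link is $link_status"

rm -rf "$work"
```

Here -L tests that the name is still a soft link, while -e follows the link and fails because the target has gone.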

Discussion

Future of file permission:

Is User/Group/Other sufficient?
Simple control methods? ACL...
Complex control methods? SELinux

Linuxzoo created by Gordon Russell.
@ Copyright 2004-2025 Edinburgh Napier University