Beginner Linux Tools and Troubleshooting
I wrote this wiki for my interns and junior admins as a quick overview of basic things to check and use when they encountered an issue. The intent of this guide is to provide nothing more than the fundamental tools that one uses to operate a Linux Server and act as a central jumping point (use the links to learn more). This was originally written around the year 2014/5, but still remains relevant due to the core nature of these utilities. They will most likely still be relevant to troubleshooting and navigating a linux server in 10 or 20 years.
This was written during the height of CentOS 6. Some concepts may be stale and out of date.
Summary
There is a practically unlimited number of tools and ways to resolve an issue. 10 times out of 10, if you have a problem, someone else has already encountered it and documented how to resolve it somewhere out there, and a search on google.com will find it.
- EVERYTHING on Linux is a file. Understanding this fact is key to understanding Linux, permissions, access, drivers, kernels, and debugging
- Follow through! Do not just hit enter and walk away, understand the consequences (good and bad) of what you are doing before you change something.
- Start small, the first thing every aspiring Linux Admin should build is a LAMP or LAPP Stack and then move on to more complicated services/setups
- Take notes, so that you remember where you got stuck and what you did so it can be a repeatable process that should be able to be duplicated many times over (this is what the wiki is for)
- Don’t Panic! When you mess up, learn how to fix it. This will make you a better admin; trial by fire is the way of life for all of us.
- Take snapshots, backups, SVN/GIT, copy files before any major changes, always start at Testing/DEV and work your way up to BTA/PROD deployments.
- Ask questions, discuss what you are doing before/after you change something(even if it is just research), measure twice/cut once: This allows everyone to know where you are and what you are doing, that way if something bad happens we can quickly triage the situation.
Bash - Shell Scripting
Bash scripting is the bread and butter of systems administration. Too often we have to repeat a set of commands across many machines, and everyone's goal should be to do that with as much automation as possible. Any command that you can type on the command line is valid in a bash script, along with basic if/elif/else logic, for loops, variable substitution, and mathematical expressions.
#!/bin/bash
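As a minimal sketch of those building blocks, a script might combine a loop, a condition, variable substitution, and arithmetic like this. The host names and the threshold are made-up placeholders, not taken from any real environment.

```shell
#!/bin/bash
# Hypothetical host list and threshold, for illustration only.
hosts="web01 web02 db01"
threshold=2

count=0
for host in $hosts; do
    # Variable substitution: drop everything from the first digit onward.
    role=${host%%[0-9]*}
    if [ "$role" = "web" ]; then
        echo "$host is a web server"
    elif [ "$role" = "db" ]; then
        echo "$host is a database server"
    else
        echo "$host has an unknown role"
    fi
    count=$((count + 1))   # arithmetic expansion
done

if [ "$count" -gt "$threshold" ]; then
    echo "checked $count hosts (more than $threshold)"
fi
```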
Man Pages
If you are ever unsure of what a command does, man pages are the way to read and understand locally how to use any command in Linux. They are literally the command manual in text format.
man ifconfig
IFCONFIG(8) Linux Programmer’s Manual IFCONFIG(8)
NAME
ifconfig - configure a network interface
SYNOPSIS
ifconfig [interface]
ifconfig interface [aftype] options | address ...
DESCRIPTION
Ifconfig is used to configure the kernel-resident network interfaces.
It is used at boot time to set up interfaces as necessary. After that,
it is usually only needed when debugging or when system tuning is
needed.
If no arguments are given, ifconfig displays the status of the cur-
rently active interfaces. If a single interface argument is given, it
displays the status of the given interface only; if a single -a argu-
ment is given, it displays the status of all interfaces, even those
that are down. Otherwise, it configures an interface.
[there is much more to this document in full]
/var/log/messages
/var/log/messages is the default catch-all for any system errors that may be occurring. Some services write to their own logs, so always check first whether the service has its own log file or directory (httpd, cron, secure, php-fpm, mysql, postgresql). If it does not exist or is empty, check /var/log/messages; it will probably have the errors, warnings, INFO messages, emergencies, kernel panics, and any other frightening logs that you need to resolve the issue.
journalctl
The command journalctl can be used to read system and application logs through systemd’s journaling service. This is the newer, sexier way to quickly access logs, but by default the journal only keeps logs around for a limited period (depending on configuration, possibly only until the next reboot) or until you max out the reserved amount of space.
$ journalctl -f # follow the logs
$ journalctl -n 100 # show the last 100 lines of the journal
If you have checked both the service log and /var/log/messages and still cannot find any information about the service you are trying to fix, you may have to enable debugging mode(s) or level(s) on the service itself. This is different for all services and can usually be found in the product's documentation.
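When a log is long, a quick count of severity keywords can show where to look first. The sketch below is only an illustration: the keyword list and the sample log lines are assumptions, and it runs against a throwaway file rather than a real log.

```shell
#!/bin/bash
# Count how many lines in a log file mention each severity keyword.
# The keyword list is an assumption; adjust it to what your service logs.
count_levels() {
    logfile=$1
    for level in error warn crit; do
        # grep -c counts matching lines, -i ignores case
        n=$(grep -ci "$level" "$logfile")
        echo "$level $n"
    done
}

# Demo against a made-up sample file instead of /var/log/messages.
sample=$(mktemp)
cat > "$sample" <<'EOF'
May 13 11:17:01 host1 kernel: WARNING: soft lockup
May 13 11:17:02 host1 httpd: error: cannot bind to port 80
May 13 11:17:03 host1 crond: INFO job started
May 13 11:17:04 host1 httpd: Error: child exited
EOF
count_levels "$sample"
```

On a real machine you would point count_levels at the service's own log or at /var/log/messages.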
Machine Info
The following is a collection of commands and their result to gather basic information about the machine.
Get name, kernel # and date of build with uname
$ uname -a
Linux devsql005.example.com 3.10.0-123.8.1.el7.x86_64 #1 SMP Mon Sep 22 19:06:58 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
OS release info can be found in the following files
$ cat /etc/redhat-release; # shows official CentOS/Redhat version number
CentOS Linux release 7.0.1406 (Core)
or for Ubuntu
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.1 LTS"
Use uptime to get the amount of time since the server started, along with the load averages
$ uptime
20:40:43 up 1 day, 22:36, 0 users, load average: 0.23, 0.16, 0.18
IP address and network information
$ ip addr;
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:50:56:84:59:79 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.[?]/24 brd 10.0.0.255 scope global ens160
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fe84:5979/64 scope link
valid_lft forever preferred_lft forever
$ ifconfig;
ens160: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.0.[?] netmask 255.255.255.0 broadcast 10.0.50.255
inet6 fe80::250:56ff:fe84:5979 prefixlen 64 scopeid 0x20<link>
ether 00:50:56:84:59:79 txqueuelen 1000 (Ethernet)
RX packets 1675151509 bytes 296413707167 (276.0 GiB)
RX errors 0 dropped 235945 overruns 0 frame 0
TX packets 1428044385 bytes 471991954342 (439.5 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 0 (Local Loopback)
RX packets 409040766 bytes 330694444196 (307.9 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 409040766 bytes 330694444196 (307.9 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Find the directory you are in
$ pwd
/var/lib/pgsql/9.3/data
Show local users
$ cat /etc/passwd;
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
....
Show local groups
$ cat /etc/group;
root:x:0:
bin:x:1:
daemon:x:2:
sys:x:3:
adm:x:4:
tty:x:5:
disk:x:6:
....
Find which groups the user ‘apache’ is in
$ groups apache;
apache : apache svc_account1
grep/awk/sed
Learn these three tools; they are the Swiss Army knife in the systems administrator’s war chest. They can be used to do almost anything, and you will see many example combinations and use cases in the sections below.
Like Captain Planet, when you combine the power of all three of these tools you can create very powerful and flexible scripts that do all the heavy lifting for you.
grep
grep is a tool mainly used for searching or weeding out information.
$ grep -r mysql .; # searches every file under the current directory tree for lines containing the word mysql
$ cat /var/log/maillog | grep user@fqdn.com
https://en.wikipedia.org/wiki/Grep
awk
I generally use awk piped at the end of a command to filter parts of the returned information. But awk is its own programming language and can be just as powerful as perl or any other scripting language.
$ df -h -t nfs -P | grep /vol/ | awk '{ print $5 " " $6}';
# df -h gives me a listing of the mounts, I search for /vol/ to get mounted drives, then I awk for result $5 and $6 of each returned line which is always used percentage and the partition that is in question.
$ grep "request ID" history | awk 'match($0,"is"){print substr($0,RSTART+3,50)}';
# this searches for "request ID" in a file called history and awk looks for the word "is" then prints 50 characters, 3 characters after "is", which is always the request ID in this example.
http://en.wikipedia.org/wiki/AWK
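awk can also do the filtering itself instead of only printing columns. As a small sketch, the function below prints the usage and mount point of any filesystem at or above a limit. The sample text stands in for `df -hP` output, and the limit of 40 is an arbitrary choice for the demo.

```shell
#!/bin/bash
# Print "Use% mountpoint" for any filesystem at or above a threshold.
df_sample='Filesystem Size Used Avail Use% Mounted on
/dev/sda3 12G 4.9G 6.7G 43% /
/dev/sda1 497M 115M 383M 24% /boot
10.10.0.248:/vol/remote_data 1.9T 1.2T 628G 66% /mnt/remote_data'

over_threshold() {
    # NR > 1 skips the header; $5 + 0 turns "43%" into the number 43
    awk -v limit="$1" 'NR > 1 && $5 + 0 >= limit { print $5 " " $6 }'
}

echo "$df_sample" | over_threshold 40
```

In real use you would pipe `df -hP` into over_threshold instead of the sample text.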
sed
sed is used for inline text editing and manipulation
$ sed -n 51,61p sbr/index.html | sed -i '50r /dev/stdin' testsed;
# this takes lines 51-61 of index.html and appends them after line 50 in testsed
IP=`nslookup $HOSTNAME | grep Address | grep -v "#53"| awk '{ print $2}'`;
# looks up hostname using DNS and greps for address, ignores anything with #53,prints second value on returned line
$ echo "$IP"
$ sed -i "s/0\.0\.0\.0/$IP/g" /etc/monitrc; # edits the file in place, replacing every literal 0.0.0.0 with the value of $IP
$ curl -s http://cbsg.sourceforge.net/cgi-bin/live | grep -Eo '^<li>.*</li>' | sed s,\</\\?li\>,,g | shuf -n 1
https://en.wikipedia.org/wiki/Sed
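One thing that bites people with sed substitutions is that an unescaped dot matches any character. The sketch below shows the difference on a scratch file. The file contents and the IP address are made up, and it prints to the screen instead of using -i so nothing is modified.

```shell
#!/bin/bash
# Compare an unescaped and an escaped pattern for the address 0.0.0.0.
conf=$(mktemp)
printf 'bind 0.0.0.0\nallow 010.0.0.1\n' > "$conf"
IP=192.0.2.10   # documentation-range address, purely illustrative

# Unescaped: "." matches any character, so "010.0.0" is replaced too.
sed "s/0.0.0.0/$IP/g" "$conf"
# Escaped: only the literal string 0.0.0.0 is replaced.
sed "s/0\.0\.0\.0/$IP/g" "$conf"
```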
Disk is full
The following are commands and explanation of the many things you can do to find and remove large log/temporary files, unusual file names, and similar scenarios.
Below is a list of the most likely to be filled areas and should be the first place(s) one looks for space that can be reclaimed
/var/log
/var/spool/{clientmqueue,mqueue,mail}
/tmp
/var/tmp
df
df -h shows the local and mounted filesystems and the amount of space used vs. available on each.
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 12G 4.9G 6.7G 43% /
/dev/sda1 497M 115M 383M 24% /boot
10.10.0.248:/vol/remote_data 1.9T 1.2T 628G 66% /mnt/remote_data
du
du -sh /var/log/ adds up the size of every file within a directory and prints the total.
$ du -sh /var/log
113M /var/log
$ du -shx /var/log/*
4.0K /var/log/alternatives.log
100K /var/log/apt
0 /var/log/btmp
4.0K /var/log/dist-upgrade
92K /var/log/dpkg.log
4.0K /var/log/journal
4.0K /var/log/landscape
4.0K /var/log/lastlog
4.0K /var/log/unattended-upgrades
0 /var/log/wtmp
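To find the worst offenders quickly, the du output can be sorted by size. This is a rough sketch that runs on a scratch directory. The function name and the default of five entries are my own choices for the example.

```shell
#!/bin/bash
# List the largest entries under a directory, biggest first.
# Sizes are in kilobytes (-k) so that sort -n can compare them numerically.
largest() {
    dir=$1
    top=${2:-5}
    du -skx "$dir"/* 2>/dev/null | sort -rn | head -n "$top"
}

# Demo on a scratch directory instead of a real /var/log.
scratch=$(mktemp -d)
head -c 200000 /dev/urandom > "$scratch/big.log"
head -c 1000 /dev/urandom > "$scratch/small.log"
largest "$scratch" 1
```

On a real server the call would look like `largest /var/log 10`.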
Clearing large Log files
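Do not just rm -rf the logfile. If a process, say apache, is still writing to the logfile when you remove it, the process keeps writing to the deleted file: the disk space is not released until the process is restarted, and its new log lines are no longer visible at the old path. Remember we wish to have an uptime of 24/7/365. Instead, truncate the file to size 0, which frees the space and allows the process to continue writing to it.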
$ cat /dev/null > /var/log/messages
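A quick way to convince yourself that truncating leaves the file itself in place for whatever process has it open is to compare the inode number before and after. This sketch uses a scratch file, not a real log.

```shell
#!/bin/bash
# Truncate a file and show that its inode number does not change.
log=$(mktemp)
echo "old log data" > "$log"
before=$(ls -i "$log" | awk '{print $1}')

: > "$log"          # same effect as: cat /dev/null > "$log"

after=$(ls -i "$log" | awk '{print $1}')
size=$(wc -c < "$log" | tr -d ' ')
echo "inode before=$before after=$after size=$size"
```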
No Space left to Delete
Rarely, but it does happen, a volume will fill up and be so full that rm will not work, as it creates temporary records while it deletes files. There is another way to delete the file: find its inode number and use find to delete it.
Let’s assume you followed all the steps above and found the large offending file.
$ rm /var/tmp/3a8066e5-a90c-4ae5-bdc6-47e117acf354.error
rm: remove regular file ‘/var/tmp/3a8066e5-a90c-4ae5-bdc6-47e117acf354.error’? y
rm: cannot remove ‘/var/tmp/3a8066e5-a90c-4ae5-bdc6-47e117acf354.error’: No space left on device
$ ls -li /var/tmp/3a8066e5-a90c-4ae5-bdc6-47e117acf354.error
56436168 -rw-r--r-- 1 gerbn308 zxdev 0 May 13 11:17 /var/tmp/3a8066e5-a90c-4ae5-bdc6-47e117acf354.error
$ find /var/tmp -xdev -inum 56436168 -delete
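If you want to rehearse the inode approach before trying it on a full volume, the same steps can be run in a scratch directory. The file name here is hypothetical.

```shell
#!/bin/bash
# Delete a file by inode number, rehearsed in a scratch directory.
scratch=$(mktemp -d)
touch "$scratch/3a80-example.error"      # hypothetical awkward file name

inode=$(ls -i "$scratch/3a80-example.error" | awk '{print $1}')
# -xdev keeps find on one filesystem; inode numbers are only unique
# within a single filesystem. -print shows what is about to be removed.
find "$scratch" -xdev -inum "$inode" -print -delete
```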
find
Find is really really useful and below are some of the ways find has solved strange issues for me
find date range
$ ll -tr;
# lists files in time/date order, oldest -> newest (remove the r if you would like newest -> oldest); ll is a common alias for ls -l
$ find . -type f -newer file_xyz.txt ! -newer recent_zzy.txt -exec ls -l {} \; -print &> output.txt;
# check output.txt to make sure the range is correct (earliest and latest dates) before deleting
$ find . -type f -newer file_xyz.txt ! -newer recent_zzy.txt -delete;
find a specific file type
In this example we are finding all mp3 files and moving them to a mounted usb drive on /mnt/mp3
$ ls -R | grep -i '\.mp3$'
# let's assume there is a mix of file types with no standard naming convention and there are thousands of them
$ find / -iname "*.mp3" -exec mv {} /mnt/mp3 \;
# you can have any command in the -exec section
find anything older than 12 hours
Sometimes you have to alleviate some pressure so the file system does not fill up and you do not want to clear out files that may be actively being worked on.
ls -R | wc -l;
# will list number of files in directory
find . -type f -mmin +720 -delete;
# find and delete anything over 12 hours old
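Because -delete is unforgiving, it is worth previewing the matches first. The sketch below rehearses that two-step habit in a scratch directory. It assumes GNU touch for the `-d '13 hours ago'` syntax.

```shell
#!/bin/bash
# Rehearse an age-based cleanup: preview the matches, then delete them.
scratch=$(mktemp -d)
touch "$scratch/fresh.tmp"
touch -d '13 hours ago' "$scratch/stale.tmp"   # GNU touch syntax

# Step 1: list what would be removed (older than 720 minutes = 12 hours)
find "$scratch" -type f -mmin +720 -print
# Step 2: only after checking the list, run the same expression with -delete
find "$scratch" -type f -mmin +720 -delete
ls "$scratch"
```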
find like grep
find can be used much in the same way as grep to search for the name of files in a directory.
cd /usr/local/lib;
find ./ -name "*gdal*" -print;
# prints out all files containing the word gdal in their name
find ./ -name "*gdal*" -delete;
# deletes all the files containing the word gdal in their name
Server seems slow
This is the holy grail of complaints as there are literally millions of things that could cause a server to be slow.
top
top gives one a task-explorer-like peek into the activity of the server. While top is running, press u to filter by a specific username, or press c to show more detail on the processes running. By default top tries to list everything in order of highest %CPU.
$ top -u apache
top - 08:37:14 up 41 days, 20:30, 3 users, load average: 0.00, 0.04, 0.08
Tasks: 140 total, 3 running, 136 sleeping, 0 stopped, 1 zombie
%Cpu(s): 1.7 us, 2.0 sy, 0.0 ni, 95.9 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem: 1885520 total, 1339856 used, 545664 free, 0 buffers
KiB Swap: 4095996 total, 52368 used, 4043628 free. 462444 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11462 apache 20 0 1005776 12212 4072 S 0.0 0.6 1:13.79 httpd
11492 apache 20 0 1005640 12688 4208 S 0.0 0.7 1:13.22 httpd
11493 apache 20 0 1005636 12124 4104 S 0.0 0.6 1:13.60 httpd
12802 apache 20 0 254124 1636 836 S 0.0 0.1 0:00.00 httpd
12803 apache 20 0 255300 1532 672 S 0.0 0.1 0:14.62 httpd
12804 apache 20 0 255300 1508 652 S 0.0 0.1 0:00.18 httpd
12806 apache 20 0 1005940 12892 3548 S 0.0 0.7 4:35.21 httpd
19201 apache 20 0 656156 13036 1832 S 0.0 0.7 0:00.03 php-fpm
19202 apache 20 0 656156 13036 1832 S 0.0 0.7 0:00.03 php-fpm
free
The free command will give one a snapshot of the current memory and swap usage. Remember to subtract buffered and cached memory from used to get an actual representation of the amount of RAM in use.
$ free
total used free shared buffers cached
Mem: 3924876 3663056 261820 0 298528 1755512
-/+ buffers/cache: 1609016 2315860
Swap: 4194296 235456 3958840
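The subtraction can be done with awk. This sketch hard-codes the two relevant lines of the sample output above, and the column positions assume that older layout. Newer versions of free print different columns, so check the header before reusing it.

```shell
#!/bin/bash
# Work out "really used" memory from the older free layout:
# used - buffers - cached.
free_sample='             total       used       free     shared    buffers     cached
Mem:       3924876    3663056     261820          0     298528    1755512'

echo "$free_sample" | awk '/^Mem:/ { print $3 - $6 - $7 }'
```

For the sample numbers this prints 1609016, which matches the used column of the -/+ buffers/cache line.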
ps
The ps (process status) command gives a very detailed account of every single process running at the exact moment you enter the command. It does not refresh like top, but it can clue you in to process trees and zombie/dead/defunct processes that might not show up in top. I will truncate the output below as it can be quite lengthy.
$ ps fax
PID TTY STAT TIME COMMAND
1 ? Ss 0:18 /sbin/init
480 ? S<s 0:00 /sbin/udevd -d
5799 ? S< 0:00 \_ /sbin/udevd -d
1057 ? S 1:29 /opt/chef-server/embedded/service/bookshelf/erts-5.9.3.1/bin/epmd -daemon
1349 ? Sl 174:03 /usr/sbin/vmtoolsd
1836 ? S<sl 0:47 auditd
1854 ? Ss 0:00 /sbin/portreserve
...
2194 ? Ssl 1:07 hald
2195 ? S 0:00 \_ hald-runner
2224 ? S 0:00 \_ hald-addon-input: Listening on /dev/input/event2 /dev/input/event0
2235 ? S 0:00 \_ hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
2255 ? Ssl 1:44 automount --pid-file /var/run/autofs.pid
2307 ? Ss 0:00 rpc.rquotad
3078 tty2 Ss+ 0:00 /sbin/mingetty /dev/tty2
3081 tty3 Ss+ 0:00 /sbin/mingetty /dev/tty3
3083 tty4 Ss+ 0:00 /sbin/mingetty /dev/tty4
3092 tty5 Ss+ 0:00 /sbin/mingetty /dev/tty5
3095 tty6 Ss+ 0:00 /sbin/mingetty /dev/tty6
4846 tty1 Ss+ 0:00 /sbin/mingetty /dev/tty1
21685 ? Ss 3:27 /usr/sbin/httpd
7578 ? S 0:18 \_ /usr/sbin/httpd
7579 ? S 0:00 \_ /usr/sbin/httpd
7580 ? S 0:00 \_ /usr/sbin/httpd
7581 ? S 0:00 \_ /usr/sbin/httpd
7582 ? S 0:00 \_ /usr/sbin/httpd
7583 ? S 0:00 \_ /usr/sbin/httpd
7584 ? S 0:00 \_ /usr/sbin/httpd
7585 ? S 0:00 \_ /usr/sbin/httpd
7586 ? S 0:00 \_ /usr/sbin/httpd
6584 ? Sl 24:30 /usr/bin/monit
6133 ? Ss 0:07 /usr/sbin/sssd -f -D
6135 ? S 0:02 \_ /usr/libexec/sssd/sssd_nss --debug-to-files
6136 ? S 0:02 \_ /usr/libexec/sssd/sssd_pam --debug-to-files
6137 ? S 0:01 \_ /usr/libexec/sssd/sssd_ssh --debug-to-files
6166 ? S 0:05 \_ /usr/libexec/sssd/sssd_be --domain default --debug-to-files
$ ps faux # the u adds per-process resource usage (user, %CPU, %MEM) to the process tree
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2951 0.0 0.0 117296 1256 ? Ss 2014 1:35 crond
root 6584 0.0 0.0 179628 2872 ? Sl Mar17 24:30 /usr/bin/monit
jenkins 5454 1.4 14.7 2503936 577244 ? Ssl Apr09 700:00 /etc/alternatives/java -Djava.awt.headless=true
root 28121 0.0 0.0 101428 3364 ? Ss May11 0:09 /var/cfengine/bin/cf-execd
root 28130 0.6 0.1 366888 4236 ? Ss May11 24:15 /var/cfengine/bin/cf-serverd
root 28141 0.0 0.1 35624 5284 ? Ss May11 0:59 /var/cfengine/bin/cf-monitord
root 17905 0.0 0.0 22180 988 ? Ss May12 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
dhcpd 18334 0.0 0.1 49000 4376 ? Ss May12 0:02 /usr/sbin/dhcpd -user dhcpd -group dhcpd
root 6133 0.0 0.0 199608 2372 ? Ss May12 0:07 /usr/sbin/sssd -f -D
root 6135 0.0 0.3 201872 14588 ? S May12 0:02 \_ /usr/libexec/sssd/sssd_nss --debug-to-files
root 6136 0.0 0.0 192212 2808 ? S May12 0:02 \_ /usr/libexec/sssd/sssd_pam --debug-to-files
root 6137 0.0 0.0 189892 2688 ? S May12 0:01 \_ /usr/libexec/sssd/sssd_ssh --debug-to-files
root 6166 0.0 0.1 229764 6916 ? S May12 0:05 \_ /usr/libexec/sssd/sssd_be --domain default
kill
Now is the time to learn how to stop runaway processes. Always try a service restart first, as the service will exit “correctly” and start again, and if there are more problems you should be able to see them in a logfile. If you just kill it instead, the process will not have time to show you errors.
$ kill PID PID2 PID3;
# you can kill any number of specific processes as long as you get their PID number from either top or ps fax.
$ kill -9 PID PID2 PID3;
# this REALLY kills it if the above does not work; sometimes you have to resort to the most drastic measure to get a stuck process out of there. Use sparingly, as a regular kill allows the process to stop and let go of files before exiting. Adding -9 kills it immediately and may leave behind file locks.
$ killall httpd;
# will kill all processes named httpd
$ killall -u user1;
# will kill all processes running as the user1 user
Kill - my favorite one liner
$ ps fax | grep '[h]ttpd' | awk '{print $1}' | xargs kill
# substitute any process name in the grep and kill a bunch at one time, good for when things are going really crazy; the [h] bracket keeps grep from matching its own process
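The "try a normal kill first, then -9" habit can be wrapped in a small function. This is only a sketch: the function name and the 5-second grace period are arbitrary choices, and the demo stops a throwaway sleep job rather than a real service.

```shell
#!/bin/bash
# Ask a process to stop with SIGTERM, and only escalate to SIGKILL if it
# is still alive after a grace period (in seconds).
stop_gently() {
    pid=$1
    grace=${2:-5}
    kill "$pid" 2>/dev/null || return 0          # already gone
    i=0
    while [ "$i" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # kill -0 only tests existence
        sleep 1
        i=$((i + 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0
    echo "PID $pid ignored SIGTERM, sending SIGKILL" >&2
    kill -9 "$pid"
}

# Demo: start a throwaway background job and stop it.
sleep 300 &
stop_gently $!
```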
Networking
Often times you will have to determine if a service is working correctly by whether or not it is listening on the correct port or if it is responding at all
ping
The most common network debugging tool, ping. It is good for a quick up/down test of the machine, but NOTE if the machine is super busy it will not respond to ping right away and you could be seeing slow ping times and it could have nothing to do with the network or the NIC.
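$ ping -c 4 8.8.8.8
# on Linux, ping runs continuously until you quit with Ctrl-c; -c 4 stops after four packets (the continuous -t flag is the Windows syntax, on Linux -t sets the TTL)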
telnet
Telnet is a good way to query a specific remote listening port to see if a service is responding as it should
$ telnet sql_host 5432
Trying sql_host...
Connected to sql_host.
Escape character is '^]'.
^]
exit
netstat
netstat produces a list of listening interfaces, ports, and connections to and from the machine.
$ netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:5308 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:46341 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN
tcp6 0 0 :::443 :::* LISTEN
tcp6 0 0 :::34459 :::* LISTEN
tcp6 0 0 :::22 :::* LISTEN
udp 0 0 0.0.0.0:53348 0.0.0.0:*
udp 0 0 0.0.0.0:111 0.0.0.0:*
udp 0 0 0.0.0.0:123 0.0.0.0:*
udp 0 0 0.0.0.0:36748 0.0.0.0:*
udp6 0 0 :::111 :::*
udp6 0 0 :::55414 :::*
udp6 0 0 :::48913 :::*
raw6 0 0 :::58 :::* 7
$ netstat -anetu | grep 514;
tcp 0 0 0.0.0.0:514 0.0.0.0:* LISTEN
tcp 0 0 :::514 :::* LISTEN
tcp 0 0 :::5514 :::* LISTEN
tcp 0 0 ::ffff:127.0.0.1:44946 ::ffff:127.0.0.1:9300 ESTABLISHED 497 3320514
udp 0 0 0.0.0.0:514 0.0.0.0:*
udp 0 0 :::514 :::*
udp 0 0 :::5514 :::*
ss
The ss command can do the same as netstat and more, and should be used going forward. netstat has been deprecated in favor of ss since CentOS 6.4.
$ ss -apnetu | grep 443;
tcp LISTEN 0 128 :::443 :::* ino:202599741 sk:ffff88013beba100
tcp ESTAB 0 0 ::ffff:10.0.20.63:443 ::ffff:70.198.42.193:2400 timer:(keepalive,119min,0) uid:48 ino:228626831 sk:ffff880011629880
tcp TIME-WAIT 0 0 ::ffff:10.0.20.63:443 ::ffff:70.198.42.193:2426 timer:(timewait,48sec,0) ino:0 sk:ffff88013cb3c940
tcp ESTAB 0 0 ::ffff:10.0.20.63:443 ::ffff:70.198.42.193:2421 timer:(keepalive,119min,0) uid:48 ino:228626830 sk:ffff8800027380c0
tcp TIME-WAIT 0 0 ::ffff:10.0.20.63:443 ::ffff:70.198.42.193:2410 timer:(timewait,47sec,0) ino:0 sk:ffff88013cb3ca80
tcp ESTAB 0 0 ::ffff:10.0.20.63:443 ::ffff:70.198.42.193:2414 timer:(keepalive,119min,0) uid:48 ino:228626829 sk:ffff88006031c100
tcp TIME-WAIT 0 0 ::ffff:10.0.20.63:443 ::ffff:70.198.42.193:2401 timer:(timewait,48sec,0) ino:0 sk:ffff880017550e80
Find all incoming connections from unique IP Addresses
$ netstat -tapn | awk '{print $5}' | sed 's/::ffff://' | sed 's/:.*//' | sort | uniq -c | sort;
1 10.0.20.63
1 10.0.50.232
1 10.0.50.234
1 10.0.50.244
1 10.0.5.54
1 10.10.1.231
2 10.0.5.154
2 129.82.224.115
22 129.82.224.137
3 10.0.20.201
39 10.0.20.52
4 10.0.20.30
9 0.0.0.0
Find all outgoing connections to unique IP Addresses
$ ss -tapw | awk '{print $5}' | sed 's/::ffff://' | sed 's/:.*//' | sort | uniq -c | sort;
1 10.10.1.63
12 10.0.20.63
1 Local
3 10.0.50.63
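Note that the plain sort at the end of those pipelines orders the counts alphabetically (22 sorts before 3). A small variation with sort -rn puts the busiest address first. The sketch below runs on made-up sample lines that imitate the address column, so it does not need netstat or ss.

```shell
#!/bin/bash
# Count connections per remote address, busiest first.
peers='::ffff:10.0.20.52:51234
10.0.20.52:51240
10.0.5.154:40022
::ffff:10.0.20.52:51299
10.0.5.154:40023
10.0.20.30:2049'

count_peers() {
    # strip the IPv4-mapped prefix, then the port, then count duplicates;
    # sort -rn orders by the numeric count instead of alphabetically
    sed -e 's/::ffff://' -e 's/:.*//' | sort | uniq -c | sort -rn
}

echo "$peers" | count_peers
```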
lsof
lsof can also be used to determine which process is using a specific port. We once had an issue with a process running on a sendmail port which would not allow sendmail to start on one of our production web servers. This was key in tracking that down and has become a quicker way to check specific ports than netstat.
$ lsof -i :514
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
rsyslogd 4835 root 1u IPv4 3354631 0t0 TCP *:shell (LISTEN)
rsyslogd 4835 root 2u IPv6 3354632 0t0 TCP *:shell (LISTEN)
rsyslogd 4835 root 3u IPv4 3354623 0t0 UDP *:syslog
rsyslogd 4835 root 4u IPv6 3354624 0t0 UDP *:syslog
Copying large number of files
rsync
rsync is the method to use when copying a large number of files, big or small. The trailing / on the source path is very important: it is the difference between copying the directory itself or just everything within the directory to the new location.
$ rsync -avr /nfs/data othermachine:/new/dataarea/ --progress;
# copies the local directory and all of its contents to othermachine:/new/dataarea/data
$ rsync -avr othermachine:/new/dataarea/ /nfs/data --progress;
# copies all data FROM the othermachine under dataarea/ to /nfs/data/
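The trailing slash behaviour is easy to test locally before pointing rsync at another machine. This sketch uses scratch directories with made-up names, and it skips the copy if rsync is not installed.

```shell
#!/bin/bash
# Compare the two trailing-slash behaviours locally, in scratch directories.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir "$src/data"
echo hello > "$src/data/file.txt"

if command -v rsync >/dev/null 2>&1; then
    rsync -a "$src/data"  "$dst/no_slash/"    # creates no_slash/data/file.txt
    rsync -a "$src/data/" "$dst/with_slash/"  # creates with_slash/file.txt
    find "$dst" -type f | sort
fi
```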
screen
It is highly recommended to run any long-running operation in a screen session so that if your connection to the machine times out, the operation does not end. This is also a way to do things privately on a server: no one can see inside a screen session unless you allow it.
$ screen -S copyfiles;
# start a new session, named copyfiles
$ rsync -avr /nfs/data othermachine:/new/dataarea/ --progress;
## example command
Ctrl-a d
# While the rsync is running, Control-a d will detach but leave running the operation and you can continue doing what ever you want
$ screen -r copyfiles;
# will reattach to the rsync screen session
$ screen -d -R session_share;
# allows you to share your session with someone else (http://technonstop.com/screen-commands-for-terminal-sharing), useful for training.
$ screen -x session_share;
# allows your friend on the same machine to connect to your session
Ctrl-d - while inside the screen will exit the shell and terminate the session
I need my script to run at startup
Scenario: On many of our servers we need NFS mounts to other servers/NAS devices to be present at start up so that users/services/websites can get to their data.
$ vim /usr/local/bin/nfs_mounts.sh
# insert nfs mount command to new file nfs_mounts.sh
$ chmod 700 /usr/local/bin/nfs_mounts.sh
# changes permissions so only root user can execute.
$ vim /etc/rc.local
# add the line at the end
/usr/local/bin/nfs_mounts.sh &
# make sure there is an & symbol after the command so that boot does not wait on it
$ chmod +x /etc/rc.local
# the rc.local file should be set to executable permissions
Side note: do not use fstab/mtab to auto-mount NFS volumes. If they are not present at boot time, the system will hang indefinitely until they become available. Our nfs_mounts shell script is the way to get around that.
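As a rough idea of what such a script could contain, here is a sketch of a retry helper that keeps trying a command a few times instead of blocking forever. The helper name, the retry counts, and the server, export, and mount point in the commented example are all hypothetical.

```shell
#!/bin/bash
# Retry a command up to N times, pausing between attempts.
# Usage: retry ATTEMPTS DELAY_SECONDS command [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        echo "attempt $n of $attempts failed: $*" >&2
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example call (commented out because it needs root and a real NFS server):
# retry 10 30 mount -t nfs nas01.example.com:/vol/data /mnt/data
```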
PATH
Sometimes a problem occurs when one builds a project from source and the executables are not located in /usr/bin or /usr/local/bin. The shell does not automatically know where to find them by name, so one has to type out the full path to run them.
env
$ env | grep PATH
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/var/cfengine/bin:/root/bin
Add new location to PATH
To temporarily add a new location to path type the following:
$ export PATH=/home/user/new_path:$PATH;
# adding $PATH will keep the old PATH along with the new one, and order matters. If you forget $PATH, almost no command will be found by name; open a new session (or reboot) to get the original PATH back
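If the export line ends up in a file that gets sourced more than once, PATH keeps growing. One way around that, sketched here with a hypothetical helper name, is to prepend only when the directory is not already present.

```shell
#!/bin/bash
# Prepend a directory to PATH only if it is not already there.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, do nothing
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

path_prepend /home/user/new_path
path_prepend /home/user/new_path      # second call is a no-op
echo "$PATH"
```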
To permanently add a new location to path there are 2 options:
User Specific paths
If the user always logs onto the same machine and needs certain files.
$ vim /nethome/username/.bashrc
add export command to the file, examples of some below
# .bashrc
export SVN_EDITOR=vim
export PATH=/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/opt/grads-2.0.1.oga.1/Contents:/opt/grib2/wgrib2:/home/gempak/NAWIPS/os/linux64/bin:/home/ldm/bin:
export GAVERSION=2.0.1.oga.1
source /home/gempak/NAWIPS/Gemenviron.profile
export LAPS_DATA_ROOT=/wxrnd/lapsdata
export LAPS_SRC_ROOT=/opt/laps-0-50-19
export LAPSINSTALLROOT=/opt/laps-0-50-19
Machine Specific paths
This is probably the “best practices” way, and I would recommend doing this from now on instead of .bashrc, as it allows anyone who logs into the machine to have the same PATH settings.
$ touch /etc/profile.d/postgresql.sh
$ vim /etc/profile.d/postgresql.sh; # make sure the file name ends with .sh or it will not be read at login
export PATH=/usr/pgsql-9.3/bin:$PATH
export MANPATH=$MANPATH:/usr/pgsql-9.3/share/man
# save, log out and log back in
$ env | grep PATH;
# and you should now see the new PATH settings
Repeat this for any/all source built files that have non-standard paths or settings.
Change Hostname
The following files are what you need to edit to change the hostname of a machine
$ vim /etc/hostname;
# edit to newhostname.example.com
$ vim /etc/hosts;
# add/edit "newhostname.example.com newhostname" to beginning of both lines in file
$ hostname newhostname.example.com
$ export HOSTNAME=newhostname.example.com;