Ryan I Am

How to extend partitions and filesystems in Linux with the LVM

This is a fairly straightforward process, so I’ve created a quick reference that can hopefully benefit someone else.

First, why use the LVM? The advantages are clear: It gives you the ability to resize partitions dynamically and span them across multiple block devices. Additionally, you can take snapshots, but I’ll save that for a future post.

These are our storage layers, in ascending order of abstraction:

  1. Physical Disk
  2. Disk Partition (Primary and Logical)
  3. Physical Volume
  4. Volume Group
  5. Logical Volume
  6. Filesystem

Here’s a quick summary of what we’re going to do:

  1. Increase the size of the physical disk.
  2. Create new logical partitions from that unallocated disk space.
  3. Create new Physical Volumes in the LVM from those partitions.
  4. Extend the existing LVM Volume Group by adding the new Physical Volumes to it.
  5. Extend the size of the Logical Volumes that are mapped to certain filesystems, from unallocated space in the Volume Group.
  6. Grow the actual filesystems themselves.
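
Steps 3 through 6 above can be sketched as a dry run that only prints the commands it would use. This is a rough sketch, not a finished tool: plan_lvm_grow is a made-up helper name, the volume group, logical volume and partition names are the ones used later in this post, and the last step assumes an ext2/3/4 filesystem (which is what resize2fs handles).

```shell
# Dry run: print the commands for steps 3-6; nothing is executed.
# plan_lvm_grow is a hypothetical helper, not part of LVM.
plan_lvm_grow() {
    vg=$1 lv=$2 size=$3
    shift 3
    # Steps 3 and 4: turn each new partition into a PV and add it to the VG
    for part in "$@"; do
        echo "pvcreate $part"
        echo "vgextend $vg $part"
    done
    # Step 5: grow the logical volume
    echo "lvextend -L +$size /dev/$vg/$lv"
    # Step 6: grow the filesystem (assumes ext2/3/4)
    echo "resize2fs /dev/$vg/$lv"
}

plan_lvm_grow thor var 500M /dev/sda6 /dev/sda7
```

Printing the plan first makes it easy to read the commands over before running any of them as root.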

So, suppose you have a VM or a VPS named thor with a single 5GB disk. You originally set up LVM through the OS installer and created partitions through the installation wizard (or used Kickstart), called the volume group “thor,” and gave each logical volume (which looked a lot like a partition in the installer) a friendly label like “home,” “var,” etc. You allocated much of the space to /home, but now you’re getting nervous about /var filling up. Here’s the current status of things:

# df -h
Filesystem              Size  Used  Avail  Use%  Mounted on
tmpfs                   124M  0     124M   0%    /lib/init/rw
udev                    120M  128K  120M   1%    /dev
tmpfs                   124M  0     124M   0%    /dev/shm
/dev/sda1               228M  15M   202M   7%    /boot
/dev/mapper/thor-root   322M  139M  167M   46%   /
/dev/mapper/thor-home   1.6G  37M   1.5G   3%    /home
/dev/mapper/thor-tmp    124M  13K   118M   1%    /tmp
/dev/mapper/thor-usr    1.7G  603M  979M   39%   /usr
/dev/mapper/thor-var    843M  484M  317M   61%   /var
# swapon -s
Filename   Type       Size    Used  Priority
/dev/dm-1  partition  241656  12    -1
# ls -l /dev/mapper/
total 0
crw------- 1 root root 10, 59 Oct 11 23:13 control
lrwxrwxrwx 1 root root      7 Oct 11 23:13 thor-home ->   ../dm-5
lrwxrwxrwx 1 root root      7 Oct 11 23:13 thor-root ->   ../dm-0
lrwxrwxrwx 1 root root      7 Oct 11 23:13 thor-swap_1 -> ../dm-1
lrwxrwxrwx 1 root root      7 Oct 11 23:13 thor-tmp ->    ../dm-4
lrwxrwxrwx 1 root root      7 Oct 11 23:13 thor-usr ->    ../dm-2
lrwxrwxrwx 1 root root      7 Oct 11 23:20 thor-var ->    ../dm-3
# fdisk -l
Disk /dev/sda: 4.76 GB, 16106127360 bytes
Device     Boot  Start  End  Blocks    Id  System
/dev/sda1   *    1      32   248832    83  Linux
/dev/sda2        32     1958 15476756  5   Extended
/dev/sda5        32     653  4990976   8e  Linux LVM

Output from fdisk has been snipped to show just the pertinent information.

As you can see, /boot is a physical disk partition (although this is no longer strictly necessary, as GRUB 2 can boot from an LVM volume), while /, /home, /tmp, /usr, and /var look like some sort of LVM thing. This is correct: the ‘dm’ in the dm-N device names stands for ‘device mapper’.

From an LVM perspective, here’s what we have:

# pvdisplay
 --- Physical volume ---
 PV Name               /dev/sda5
 VG Name               thor
 PV Size               4.76 GiB / not usable 1.81 MiB
 Allocatable           yes (but full)
 PE Size               4.00 MiB
 Total PE              1218
 Free PE               0
 Allocated PE          1218
 PV UUID               SjKAWG-ak9K-1Qf3-0Jmd-8xvY-SJcL-mN4QW5
# vgdisplay
 --- Volume group ---
 VG Name               thor
 System ID 
 Format                lvm2
 Metadata Areas        1
 Metadata Sequence No  9
 VG Access             read/write
 VG Status             resizable
 MAX LV                0
 Cur LV                6
 Open LV               6
 Max PV                0
 Cur PV                1
 Act PV                1
 VG Size               4.76 GiB
 PE Size               4.00 MiB
 Total PE              1218
 Alloc PE / Size       1218 / 4.76 GiB
 Free PE / Size        0 / 0 
 VG UUID               OWp6UN-8ssJ-nPTG-3MUA-v29T-PJfR-wNp6nR

Even if you don’t understand everything above, you can see pretty clearly that there is less than 5GiB of space in total on the disk.
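
If you want to sanity-check those numbers, VG Size is just Total PE multiplied by PE Size. Here is a minimal sketch in plain shell; pe_to_size is a made-up helper name, not an LVM command, and the inputs are copied from the vgdisplay output above.

```shell
# Convert a number of LVM extents to MiB and GiB.
# pe_to_size is a hypothetical helper name, not an LVM command.
pe_to_size() {
    extents=$1 pe_mib=$2
    awk -v e="$extents" -v p="$pe_mib" \
        'BEGIN { mib = e * p; printf "%d MiB = %.2f GiB\n", mib, mib / 1024 }'
}

pe_to_size 1218 4   # Total PE and PE Size; prints "4872 MiB = 4.76 GiB"
```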

To remedy the situation, you decide an extra 10GB should be plenty to cover /var, add a little extra swap, and leave room for anticipated growth in /home. So you must first either increase the physical disk size in the host’s VM management software, or ask your VPS provider to do it for you (alternatively, you could simply add a second disk, but this creates more clutter since it requires an extra file on the host). After a reboot, you see:

# fdisk -l
Disk /dev/sda: 16.1 GB, 16106127360 bytes

Great, the disk is bigger. That only completes step one, however. We need a larger partition. Since you can’t increase a physical partition size while it’s mounted (the process involves deleting it and recreating it as a larger size, which we can’t do since our OS is running on it), let’s just create a new one (don’t worry, the LVM will solve the problem of how to extend the existing filesystems onto this new partition). For this, I used cfdisk, which has a simple ncurses interface, so I have no output to paste below. In this case, I decided to create two additional 5GB Logical Partitions, so that I would have three Physical Volumes, all of approximately equal size.

I now have:

# pvscan
 PV /dev/sda5    VG thor    lvm2    [4.76 GiB / 0 free]
 Total: 1 [4.76 GiB] / in use: 1 [4.76 GiB] / in no VG: 0 [0 ]

Wait, where are my new physical volumes? We haven’t created them yet. We have new logical partitions, but not LVM PVs yet. So let’s fix that.

# pvcreate /dev/sda6
 Physical volume "/dev/sda6" successfully created
# pvcreate /dev/sda7
 Physical volume "/dev/sda7" successfully created

Now we have them, but they haven’t yet been assigned to a Volume Group:

# pvscan
 PV /dev/sda5   VG thor       lvm2 [4.76 GiB / 0 free]
 PV /dev/sda6                 lvm2 [4.76 GiB]
 PV /dev/sda7                 lvm2 [5.24 GiB]
 Total: 3 [14.76 GiB] / in use: 1 [4.76 GiB] / in no VG: 2 [10.00 GiB]

Simple enough, we will extend the Volume Group:

# vgextend thor /dev/sda6
 Volume group "thor" successfully extended
# vgextend thor /dev/sda7
 Volume group "thor" successfully extended

Now we have a Volume Group that spans all of the Physical Volumes (and logical partitions):

# vgdisplay thor
 --- Volume group ---
 VG Name                thor
 System ID 
 Format                 lvm2
 Metadata Areas         3
 Metadata Sequence No   11
 VG Access              read/write
 VG Status              resizable
 MAX LV                 0
 Cur LV                 6
 Open LV                6
 Max PV                 0
 Cur PV                 3
 Act PV                 3
 VG Size                14.75 GiB
 PE Size                4.00 MiB
 Total PE               3777
 Alloc PE / Size        1218 / 4.76 GiB
 Free PE / Size         2559 / 10.00 GiB
 VG UUID                OWp6UN-8ssJ-nPTG-3MUA-v29T-PJfR-wNp6nR

It’s kind of as if we took a few disks and spanned them in a JBOD array, and are now ready to present them to the OS as a single device.

So now we can extend the logical volumes to our desired size. Let’s give /var an extra 500MB:

# lvextend -L +500M /dev/mapper/thor-var
 Extending logical volume var to 1.32 GiB
 Logical volume var successfully resized

And to confirm:

# lvdisplay /dev/thor/var
 --- Logical volume ---
 LV Name               /dev/thor/var
 VG Name               thor
 LV UUID               MJAjZB-EC1P-ODqE-d1KC-9jQG-3J8h-ZfVUyz
 LV Write Access       read/write
 LV Status             available
 # open                1
 LV Size               1.32 GiB
 Current LE            339
 Segments              2
 Allocation            inherit
 Read ahead sectors    auto
 - currently set to    256
 Block device          254:3

Remember, this is /dev/dm-3, which has symlinks pointing to it at /dev/mapper/thor-var and also /dev/thor/var. Now that we’ve finally increased the size of the layer directly underneath the filesystem, it’s time to resize the filesystem itself:

# resize2fs /dev/thor/var
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/thor/var is mounted on /var; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 1
Performing an on-line resize of /dev/thor/var to 347136 (4k) blocks.
The filesystem on /dev/thor/var is now 347136 blocks long.
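
A quick way to check that the filesystem now fills the logical volume is to convert the block count that resize2fs reports into MiB and compare it with Current LE multiplied by PE Size from lvdisplay. This is only a sketch; blocks_to_mib is a made-up helper name.

```shell
# Convert a filesystem block count to MiB (block size given in KiB).
# blocks_to_mib is a hypothetical helper name.
blocks_to_mib() {
    blocks=$1 block_kib=$2
    echo $(( blocks * block_kib / 1024 ))
}

blocks_to_mib 347136 4   # filesystem size from resize2fs: 1356 MiB
echo $(( 339 * 4 ))      # LV size, Current LE x PE Size: 1356 MiB
```

Both lines print 1356, so the filesystem and the logical volume are the same size.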

And we can confirm:

# df -h /dev/mapper/thor-var 
Filesystem            Size  Used  Avail  Use%  Mounted on
/dev/mapper/thor-var  1.4G  485M  785M   39%   /var

One great benefit of adding all the new disk space to the LVM is that we can easily see how much disk space has been allocated to logical volumes, as well as how much is still unallocated (look towards the bottom):

# vgdisplay thor
 --- Volume group ---
 VG Name                 thor
 System ID
 Format                  lvm2
 Metadata Areas          3
 Metadata Sequence No    12
 VG Access               read/write
 VG Status               resizable
 MAX LV                  0
 Cur LV                  6
 Open LV                 6
 Max PV                  0
 Cur PV                  3
 Act PV                  3
 VG Size                 14.75 GiB
 PE Size                 4.00 MiB
 Total PE                3777
 Alloc PE / Size         1343 / 5.25 GiB
 Free PE / Size          2434 / 9.51 GiB
 VG UUID                 OWp6UN-8ssJ-nPTG-3MUA-v29T-PJfR-wNp6nR

In old-fashioned terms, we’ve still got over 9GiB of disk space to allocate to other partitions. Said correctly, we have over 9GiB of unallocated space in the VG that can be allocated to LVs.
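
Before extending anything else, you can check whether a planned increase fits into the free extents that vgdisplay reports. Here is a minimal sketch; fits_in_vg is a made-up helper name, and the example sizes (5GiB plus 256MiB) are just an illustration.

```shell
# Succeeds if the requested growth (in MiB) fits in the VG's free extents.
# fits_in_vg is a hypothetical helper, not an LVM command.
fits_in_vg() {
    want_mib=$1 free_pe=$2 pe_mib=$3
    need_pe=$(( (want_mib + pe_mib - 1) / pe_mib ))
    [ "$need_pe" -le "$free_pe" ]
}

# 5 GiB plus 256 MiB, checked against the 2434 free extents shown above
if fits_in_vg $(( 5 * 1024 + 256 )) 2434 4; then
    echo "fits"
else
    echo "not enough free extents"
fi
```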

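So we can now extend our swap logical volume as well. Swap has no filesystem to resize, so instead we turn swap off, recreate the swap area with mkswap, and turn it back on: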

# lvextend -L +256M /dev/mapper/thor-swap_1
 Extending logical volume swap_1 to 492.00 MiB
 Logical volume swap_1 successfully resized
# swapoff -v /dev/thor/swap_1 
swapoff on /dev/thor/swap_1
# mkswap /dev/thor/swap_1 
mkswap: /dev/thor/swap_1: warning: don't erase bootbits sectors
        on whole disk. Use -f to force.
Setting up swapspace version 1, size = 503804 KiB
no label, UUID=f60c0c9a-9ff5-4766-875e-4d9cb17b6891
# swapon -va
swapon on /dev/mapper/thor-swap_1
swapon: /dev/mapper/thor-swap_1: found swap signature: version 1, page-size 4, same byte order
swapon: /dev/mapper/thor-swap_1: pagesize=4096, swapsize=515899392, devsize=515899392
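
The size mkswap reports is slightly smaller than the logical volume because, as far as I know, the first page of the device holds the swap header. A quick sketch of the arithmetic, with swap_usable_kib as a made-up helper name:

```shell
# Usable swap in KiB: the LV size minus one page for the swap header.
# swap_usable_kib is a hypothetical helper name.
swap_usable_kib() {
    lv_mib=$1 page_kib=$2
    echo $(( lv_mib * 1024 - page_kib ))
}

swap_usable_kib 492 4   # 492 MiB LV, 4 KiB pages: prints 503804
```

That matches the "size = 503804 KiB" line in the mkswap output above.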

And for /home:

# lvextend -L +5G /dev/thor/home
 Extending logical volume home to 6.59 GiB
 Logical volume home successfully resized
# resize2fs /dev/thor/home
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/thor/home is mounted on /home; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 1
Performing an on-line resize of /dev/thor/home to 1726464 (4k) blocks.
The filesystem on /dev/thor/home is now 1726464 blocks long.

Now, let’s confirm:

# df -h
Filesystem             Size  Used  Avail  Use%  Mounted on
tmpfs                  124M  0     124M   0%    /lib/init/rw
udev                   120M  140K  120M   1%    /dev
tmpfs                  124M  0     124M   0%    /dev/shm
/dev/sda1              228M  15M   202M   7%    /boot
/dev/mapper/thor-root  322M  139M  167M   46%   /
/dev/mapper/thor-home  6.5G  38M   6.2G   1%    /home
/dev/mapper/thor-tmp   124M  13K   118M   1%    /tmp
/dev/mapper/thor-usr   1.7G  603M  979M   39%   /usr
/dev/mapper/thor-var   1.4G  485M  785M   39%   /var
# swapon -s
Filename    Type         Size     Used   Priority
/dev/dm-1   partition    503800   1212   -1

Here are some of the benefits of handling this the way we did:

  1. Only one physical disk. On the host, this means only one virtual disk file.
  2. Only one reboot was necessary, and that’s because we changed the size of the physical disk. Without the LVM, this would have required booting into an alternate OS, and we would have been limited to expanding only the last partition on the disk (into the space to the right of it).
  3. We can keep adding space through the LVM, in chunks as small as we like.
  4. We can easily see disk allocation stats in vgdisplay, so we know at a glance whether there is room to grow a logical volume, without performing any calculations.
  5. We gain the ability to take LVM snapshots.

As usual, please let me know if I missed something in the comments.

IPv6 Notes

So I finally got around to finishing up he.net’s free IPv6 certification process:

Not the prettiest of badges, but what do you expect for free?

One thing that tripped me up when I was testing IPv6 in BIND is that it only seemed to return my A records, not my AAAA records. I finally realized that by default, the dig command queries A records only. To test an AAAA record, you have to explicitly specify it:

dig aaaa @ns1.example.com subdomain.example.com
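
To avoid forgetting the second query, you could wrap both lookups in a small function. This is a sketch only: query_both is a made-up name, the server and hostname are the placeholders from the command above, and the DIG variable exists purely so the function can be dry-run with echo instead of the real dig.

```shell
# Ask one name server for both the A and the AAAA record of a host.
# query_both is a hypothetical helper; DIG defaults to the real dig command.
query_both() {
    server=$1 host=$2
    for rrtype in A AAAA; do
        "${DIG:-dig}" +short "$rrtype" "@$server" "$host"
    done
}

# Usage (needs network access):
#   query_both ns1.example.com subdomain.example.com
```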

Starting out, I also had trouble figuring out what glue records are. It turns out this is a very simple concept: They’re records served by the TLD registry (Verisign in the case of the .com TLD) that contain the hostnames and IP addresses of your name servers. Without these, you have a chicken-and-egg scenario in which the client cannot resolve ns1.example.com because it cannot access ns1.example.com to look it up, because the client doesn’t know its IP. If your domain uses name servers accessible by a domain name other than your own, such as ns3196.dns.dyn.com, you don’t have this problem and you don’t need glue records, and you can let Dyn worry about setting up their own glue records properly. But for both IPv4 and IPv6, if your domain’s DNS is hosted by name servers within that very same 2nd-level domain, you need them.

Also interesting is that glue records don’t always go by that name. In my domain registrar’s interface, there are a couple of text fields to enter “DNS Hosts” with no mention of the word ‘glue’. And in dig output, they show up under ‘;; ADDITIONAL SECTION:’.

So to find the glue records for google.com, we query the registry:

$ dig @f.gtld-servers.net google.com
; <<>> DiG 9.6-ESV-R4-P3 <<>> @f.gtld-servers.net google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17116
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 4
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;google.com. IN A
;; AUTHORITY SECTION:
google.com. 172800 IN NS ns2.google.com.
google.com. 172800 IN NS ns1.google.com.
google.com. 172800 IN NS ns3.google.com.
google.com. 172800 IN NS ns4.google.com.
;; ADDITIONAL SECTION:
ns2.google.com. 172800 IN A 216.239.34.10
ns1.google.com. 172800 IN A 216.239.32.10
ns3.google.com. 172800 IN A 216.239.36.10
ns4.google.com. 172800 IN A 216.239.38.10
;; Query time: 193 msec
;; SERVER: 192.35.51.30#53(192.35.51.30)
;; WHEN: Sun Aug 26 22:11:54 2012
;; MSG SIZE rcvd: 164

The glue records are visible in the Additional Section, thus hatching the chicken from the egg.
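
If you only care about the glue, the Additional Section can be picked out of the dig output with a little awk. This is a rough sketch: glue_records is a made-up name, and the sample input is a shortened copy of the output above, so the block runs without network access.

```shell
# Print name, record type and address for every record in the
# ADDITIONAL section of dig output read from stdin.
# glue_records is a hypothetical helper name.
glue_records() {
    awk '/^;; ADDITIONAL SECTION:/ { in_add = 1; next }
         /^;;/                     { in_add = 0 }
         in_add && NF              { print $1, $4, $5 }'
}

# Sample input taken from the output above. On a live system:
#   dig @f.gtld-servers.net google.com | glue_records
glue_records <<'EOF'
;; AUTHORITY SECTION:
google.com. 172800 IN NS ns1.google.com.
;; ADDITIONAL SECTION:
ns1.google.com. 172800 IN A 216.239.32.10
ns2.google.com. 172800 IN A 216.239.34.10
;; Query time: 193 msec
EOF
```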

A tweet

WordPress recently added the ability to embed tweets simply by pasting the tweet’s URL into the editor. Bam, fancy tweet styling, no plugins or CSS required. This is me testing it out.

All I did was paste this into the editor:

http://twitter.com/IT_BORAT/statuses/238376762961195008

Haters gonna hate, but I don’t care. I love WordPress.

Ode to Bojangles

Oh Bojangles, you’re not newfangled
Thine chicken is classically fried
A delicious dish, and my one true wish
Is that I may within thy walls abide.

Your chicken is divine, and to eat it, there is no line
Of morally questionable pseudo-protesters
Who, at Chick-fil-A, feel the urge to portray
Solidarity as anti-gay sandwich investors.

Bojangles, the political sidelines suit you just fine
No need to enter the tired, intolerant, fray
As long as in you don’t chime, I’ll throw you my dime
For chicken so sublime, I can scarcely contain my hooray!

OpenBSD as a router and troubleshooting traffic problems

I am a big fan of using OpenBSD for a router, primarily for three reasons:

  1. pf, the OpenBSD packet filter, which is a joy to use
  2. The OS is secure by default
  3. Ease of maintenance (stable, with very few security updates)

The other night, I was bored and searched for every package with the word top, tcp, or pf in it. I was aware of some of these, but some were new to me. Here is my list of useful traffic analysis packages to install on an OpenBSD router (I didn’t include netflow tools such as tcpflow as I don’t think those are very helpful in everyday troubleshooting):

trafshow, iftop, dnstop, ifstat, ntop, pkstat, 
tcpstat, vnstat, pftop, tcptraceroute

As a bonus, OpenBSD gives you tcpdump (and libpcap) in base, meaning they are audited and secure. And standards such as netstat, netcat, and telnet are already included as well. But the power of certain pf-specific commands, such as pfctl, shouldn’t be underestimated. For example, to see stats on your QoS queues:

# pfctl -sq -v
queue root_em5 on em5 bandwidth 4.90Mb priority 0 cbq( wrr root ) {std_out, voice_out}
  [ pkts:    3006154  bytes:  716517235  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue  std_out on em5 bandwidth 3.10Mb cbq( borrow default ) 
  [ pkts:    3006154  bytes:  716517235  dropped pkts:    285 bytes: 361341 ]
  [ qlength:   0/ 50  borrows:  74630  suspends:      0 ]
queue  voice_out on em5 bandwidth 1.80Mb cbq( borrow ) 
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue root_em0 on em0 bandwidth 49.70Mb priority 0 cbq( wrr root ) {std_in, voice_in}
  [ pkts:    3133090  bytes: 1132912428  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue  std_in on em0 bandwidth 47.90Mb cbq( borrow default ) 
  [ pkts:    3133090  bytes: 1132912428  dropped pkts:     90 bytes: 136260 ]
  [ qlength:   0/ 50  borrows:  22270  suspends:      0 ]
queue  voice_in on em0 bandwidth 1.80Mb cbq( borrow ) 
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
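
When all you want from that output is the drop count per queue, a short awk filter does the job. Treat this as a sketch: queue_drops is a made-up name, it assumes the two-line layout shown above (a "queue" line followed by a "[ pkts: ..." line), and the sample input is copied from that output.

```shell
# Print each queue name with its dropped-packet count, reading
# `pfctl -sq -v` style output from stdin.
# queue_drops is a hypothetical helper name.
queue_drops() {
    awk '$1 == "queue"                { name = $2 }
         $1 == "[" && $6 == "dropped" { print name, $8 }'
}

# Sample input taken from the output above. On the router itself:
#   pfctl -sq -v | queue_drops
queue_drops <<'EOF'
queue  std_out on em5 bandwidth 3.10Mb cbq( borrow default )
  [ pkts:    3006154  bytes:  716517235  dropped pkts:    285 bytes: 361341 ]
  [ qlength:   0/ 50  borrows:  74630  suspends:      0 ]
queue  voice_out on em5 bandwidth 1.80Mb cbq( borrow )
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
EOF
```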

 

The same goes for the OpenBSD version of netstat. Unlike its Linux counterpart, OpenBSD’s ifconfig command doesn’t display interface statistics. At first, I was annoyed by this, until I discovered it can be done with netstat, in continuous output format:

# netstat -dw 2 -I em5 
  em5 in        em5 out                    total in      total out                  
 packets  errs  packets  errs colls drops   packets  errs  packets  errs colls drops
35067840     4 34045239     0     0     0  87713393     4 87166093     0     0     0
       6     0        5     0     0     0        12     0       11     0     0     0
       1     0        1     0     0     0         1     0        1     0     0     0
       1     0        1     0     0     0       125     0      125     0     0     0
       1     0        1     0     0     0         7     0        7     0     0     0
^C

 

That is a thing of beauty, and much easier to mentally parse than the Linux ifconfig output. Compare that to the Linux equivalent:

# netstat -ic
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500 0    477799      0      0 0         41170      0      0      0 BMRU
lo        16436 0       573      0      0 0           573      0      0      0 LRU
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500 0    477801      0      0 0         41171      0      0      0 BMRU
lo        16436 0       573      0      0 0           573      0      0      0 LRU
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500 0    477803      0      0 0         41172      0      0      0 BMRU
lo        16436 0       573      0      0 0           573      0      0      0 LRU

 

Yes, those columns really do line up like that in my terminal, and no, I can’t make sense of the alignment, either. Additionally, it repeats the headers every time. This is much less readable.

To be fair, the ip command in Linux does give something a bit better:

# ip -s link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    RX: bytes  packets  errors  dropped overrun mcast   
    57702      573      0       0       0       0      
    TX: bytes  packets  errors  dropped carrier collsns 
    57702      573      0       0       0       0      
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:97:a2:69 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    63160319   479755   0       0       0       0      
    TX: bytes  packets  errors  dropped carrier collsns 
    7531611    41613    0       0       0       0

 

But I can’t get it to display continuous output, and if it did, the formatting of the column headers would make it too difficult to read.
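
One possible workaround on Linux, which is not something from my own setup, is to poll the kernel’s per-interface counters under /sys/class/net and print the deltas with the header shown once. A minimal sketch follows: watch_iface is a made-up name, and STATS_ROOT is an assumption added only so the function can be pointed at a fake directory tree for testing.

```shell
# Print per-interval packet deltas for one interface, header shown once.
# watch_iface is a hypothetical helper. Leave STATS_ROOT unset on a real
# system; a count of 0 means "run until interrupted".
watch_iface() {
    iface=$1 interval=${2:-2} count=${3:-0}
    dir="${STATS_ROOT:-/sys/class/net}/$iface/statistics"
    printf '%10s %10s\n' "rx_pkts" "tx_pkts"
    prev_rx=$(cat "$dir/rx_packets")
    prev_tx=$(cat "$dir/tx_packets")
    n=0
    while [ "$count" -eq 0 ] || [ "$n" -lt "$count" ]; do
        sleep "$interval"
        rx=$(cat "$dir/rx_packets")
        tx=$(cat "$dir/tx_packets")
        printf '%10d %10d\n' $(( rx - prev_rx )) $(( tx - prev_tx ))
        prev_rx=$rx prev_tx=$tx
        n=$(( n + 1 ))
    done
}

# Usage: watch_iface eth0 2      (every 2 seconds until ^C)
```

It only covers packet counts, so it is nowhere near as complete as the OpenBSD netstat output.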

One of my favorite aspects of routing in OpenBSD, compared to Linux, is that the ifconfig and route commands are not in the adolescent process of being replaced by `ip`. Instead, they are functional and mature in their IPv6 capability. For example, no pseudo-interface hackery is required to add aliases, as OpenBSD’s ifconfig can do this without resorting to eth0:0-style interfaces. This eliminates the need for adding new tools such as iproute2. I’m not against using iproute2 on Linux; far from it. But what I do find interesting is that the Linux net-tools code was so full of cruft that a complete rewrite was required, while the equivalent OpenBSD commands are still maintained. I think this speaks to the quality of code inherited from 4.4BSD, and the careful stewardship of it in the years since.