2009/04/10

[ FBSD ] FreeBSD ZFS filesystem

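ZFS (Zettabyte File System) is a file system that breaks with earlier designs. It was developed by Sun Microsystems, and because of licensing issues it is currently available only on Solaris, Mac OS X, and the BSDs. ZFS is a 128-bit file system. How good is it? The quickest way to find out is to try it; for IT staff, ZFS really does feel like a gift from above.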

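Advantages:

1. Ease of Management: two simple commands, zpool and zfs, are all that is needed to manage pools and to grow or limit the space available to each file system.
2. Scalability: capacity can be added to a file system online, without downtime.
3. Data Integrity: no more fsck to repair data. Everything in ZFS is checksummed: a checksum is stored when a block is written and verified when the block is read back.
4. Breathtaking Performance: when data is written, it goes to the first free block instead of overwriting in place, which cuts the time spent waiting on rotational latency and head movement. Intelligent Prefetch also predicts the next read and places that data in the cache ahead of time.
5. Enterprise-grade features: Quota, Reservation, Compression, Snapshot, and Clone.
6. ZFS is free to use: Apple's next Mac OS X release, 10.6 Snow Leopard, is slated to include built-in ZFS support. FreeBSD 7.x already supports ZFS, and from 8.0 onward it is planned to become one of the main file systems.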

Test environment:

i386 P4-1.6
Real RAM 512M
OS FreeBSD 7.1R

1. Enabling ZFS

# vi /etc/rc.conf # Enable ZFS at boot
zfs_enable="YES"

# vi /boot/loader.conf # Kernel tuning for ZFS on FreeBSD
vm.kmem_size="330M"
vm.kmem_size_max="330M"
vfs.zfs.arc_max="40M"
vfs.zfs.vdev.cache.size="5M"

# /etc/rc.d/zfs start # Start ZFS manually
# reboot # Reboot so the tuned loader.conf values are loaded at boot
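
Before rebooting, it can help to confirm that all four tunables actually made it into the file. The sketch below is only an illustration: the function name missing_zfs_tunables is made up for this post and is not part of FreeBSD. It prints any of the four keys that are absent and prints nothing when all are present.

```sh
#!/bin/sh
# Hypothetical helper (not a FreeBSD tool): list which of the tunables
# from this post are missing from a loader.conf-style file.
missing_zfs_tunables() {
    conf_file="$1"
    for key in vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max vfs.zfs.vdev.cache.size; do
        # Match "key=" at the start of a line, so commented-out
        # entries (lines starting with #) do not count as present.
        if ! grep -q "^${key}=" "$conf_file"; then
            echo "$key"
        fi
    done
}

# Typical use on the FreeBSD host (path from the post):
#   missing_zfs_tunables /boot/loader.conf
```

After the reboot, the live values can be read back with sysctl, for example `sysctl vm.kmem_size vfs.zfs.arc_max`.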


2. The zpool command

# zpool create storage mirror ad2 ad3 # Create a RAID 1 style mirror (1+1=1): data is written to both disks at the same time.
# df -h # Check that the storage pool has been created
Filesystem Size Used Avail Capacity Mounted on
storage 5.8G 6.9M 5.8G 0% /storage

# zpool status # Show ZFS status on this host: there is one pool named storage, built as a mirror of the physical disks ad2 and ad3
pool: storage
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
mirror ONLINE 0 0 0
ad2 ONLINE 0 0 0
ad3 ONLINE 0 0 0

errors: No known data errors

# zpool list # Show space usage for each pool
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
storage 5.94G 7.00M 5.93G 0% ONLINE -

# zpool offline storage ad2 # Take disk ad2 in the storage pool offline temporarily
Bringing device ad2 offline
# zpool status storage # The status now reports DEGRADED
pool: storage
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
storage DEGRADED 0 0 0
mirror DEGRADED 0 0 0
ad2 OFFLINE 0 0 0
ad3 ONLINE 0 0 0

errors: No known data errors

# zpool online storage ad2 # Bring disk ad2 in the storage pool back online
Bringing device ad2 online
# zpool status storage # The DEGRADED state has cleared
pool: storage
state: ONLINE
scrub: resilver completed with 0 errors on Thu Apr 9 13:49:50 2009
config:

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
mirror ONLINE 0 0 0
ad2 ONLINE 0 0 0
ad3 ONLINE 0 0 0

errors: No known data errors
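
The state: line in the output above is easy to check from a script, for example from cron. This is a minimal sketch, assuming the output keeps the layout shown in this post; pool_state and pool_is_healthy are hypothetical names invented here, not zpool subcommands.

```sh
#!/bin/sh
# Hypothetical helper: read `zpool status` output on stdin and print the
# value of the first "state:" line (ONLINE, DEGRADED, ...).
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Hypothetical wrapper: succeed only when the pool reports ONLINE.
pool_is_healthy() {
    [ "$(pool_state)" = "ONLINE" ]
}

# On the FreeBSD host this would be fed from the real command, e.g.:
#   zpool status storage | pool_is_healthy || echo "storage needs attention"
```

For a quick manual check, `zpool status -x` reports only pools that have problems.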

# zpool export -f storage # Export the storage pool, which unmounts its file systems; -f forces the export
# df -h # Confirm the unmount: storage is no longer listed
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1a 11G 1.4G 8.5G 14% /
devfs 1.0K 1.0K 0B 100% /dev

# zpool import storage # Import the storage pool, which mounts its file systems again
# df -h # Confirm the mount: the storage pool is listed again
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1a 11G 1.4G 8.5G 14% /
devfs 1.0K 1.0K 0B 100% /dev
storage 5.8G 6.9M 5.8G 0% /storage

# zpool scrub storage # Manually verify the integrity of all data in the storage pool
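# zpool replace storage ad2 ad4 # Replace disk ad2 in the storage pool with the new disk ad4; the syntax is "zpool replace pool old-device new-device", so swapping ad3 for ad5 takes a second run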
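
zpool replace takes the pool name, then the device being replaced, then the new device, so moving a two-disk mirror onto two new disks means one command per disk. The dry-run sketch below only prints the commands and runs nothing. The helper name print_replace_plan and the old:new pair notation are made up for this example.

```sh
#!/bin/sh
# Hypothetical dry-run helper: print one "zpool replace" command per
# old:new device pair, without executing anything.
print_replace_plan() {
    pool="$1"; shift
    for pair in "$@"; do
        old="${pair%%:*}"   # text before the first colon
        new="${pair#*:}"    # text after the first colon
        echo "zpool replace $pool $old $new"
    done
}

# Device names below are the ones used in this post.
print_replace_plan storage ad2:ad4 ad3:ad5
```

When running the real commands, it is prudent to wait until zpool status shows the resilver has completed before replacing the second disk, so the mirror always has one complete copy.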


3. The zfs command

# zfs create storage/compressed # Create a file system named compressed in the storage pool
# zfs list # List the ZFS file systems
NAME USED AVAIL REFER MOUNTPOINT
storage 7.02M 5.84G 6.91M /storage
storage/compressed 18K 5.84G 18K /storage/compressed

# zfs set compression=gzip storage/compressed # Enable gzip compression on the compressed file system
# zfs set compression=off storage/compressed # Turn compression off again for the compressed file system

# zfs mount # List all mounted ZFS file systems
storage /storage
storage/compressed /storage/compressed

# zfs get all storage/compressed # Show the properties of the storage/compressed file system
NAME PROPERTY VALUE SOURCE
storage/compressed type filesystem -
storage/compressed creation Thu Apr 9 14:14 2009 -
storage/compressed used 18K -
storage/compressed available 5.84G -
storage/compressed referenced 18K -
storage/compressed compressratio 1.00x -
storage/compressed mounted yes -
storage/compressed quota none default
storage/compressed reservation none default
storage/compressed recordsize 128K default
storage/compressed mountpoint /storage/compressed default
storage/compressed sharenfs off default
storage/compressed checksum on default
storage/compressed compression gzip local
storage/compressed atime on default
storage/compressed devices on default
storage/compressed exec on default
storage/compressed setuid on default
storage/compressed readonly off default
storage/compressed jailed off default
storage/compressed snapdir hidden default
storage/compressed aclmode groupmask default
storage/compressed aclinherit secure default
storage/compressed canmount on default
storage/compressed shareiscsi off default
storage/compressed xattr off temporary
storage/compressed copies 1 default

# zfs set quota=3G storage/compressed # Cap storage/compressed at 3G. The quota is a hard limit: once 3G is used, writes to /storage/compressed fail even though the 5.8G pool still has free space
# df -h # Size and Avail for storage/compressed are now 3.0G
Filesystem Size Used Avail Capacity Mounted on
storage 5.8G 7.0M 5.8G 0% /storage
storage/compressed 3.0G 128K 3.0G 0% /storage/compressed

# zfs set reservation=3g storage/compressed # Reserve 3G for /storage/compressed
# df -h # Size for storage drops to 2.8G because 3G is now reserved for /storage/compressed; that 3G is guaranteed to it and other file systems cannot use it
Filesystem Size Used Avail Capacity Mounted on
storage 2.8G 7.0M 2.8G 0% /storage
storage/compressed 3.0G 128K 3.0G 0% /storage/compressed

# zfs create storage/data # Create another file system, data, in the storage pool
# zfs set mountpoint=/data storage/data # Set the mount point: Mounted on changes from /storage/data to /data
# df
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/ad0s1a 11294270 1468766 8921964 14% /
devfs 1 1 0 100% /dev
storage 2982656 7040 2975616 0% /storage
storage/compressed 3145728 128 3145600 0% /storage/compressed
storage/compressed@2009-04-09 3145728 128 3145600 0% /storage/compressed/.zfs/snapshot/2009-04-09
storage/data 2975616 0 2975616 0% /data
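
As a rough cross-check of the reservation arithmetic, the 1K-block figures in the df output above can be added back together: the space still available to storage plus the 3G reserved for storage/compressed should land near the 5.84G that zfs list reported earlier. The helper name kb_to_gib is made up for this sketch.

```sh
#!/bin/sh
# Hypothetical helper: convert a size in 1K blocks (as printed by plain
# `df`) to GiB with two decimals.
kb_to_gib() {
    awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / 1048576 }'
}

# Figures copied from the df output above.
storage_avail_kb=2975616   # Avail column for "storage"
reservation_kb=3145728     # 1K-blocks for storage/compressed (the 3G reservation)

kb_to_gib "$reservation_kb"                        # prints 3.00
kb_to_gib $((storage_avail_kb + reservation_kb))   # prints 5.84
```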

# zfs set sharenfs=rw storage/data # Share storage/data over NFS with read and write access

# touch /storage/compressed/1234 # Create a file in the compressed file system
# md5 /storage/compressed # Hash the directory before taking the snapshot and note the value
MD5 (/storage/compressed) = 7bffed2808dfba7915f89f8f42b09f83
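# zfs snapshot storage/compressed@2009-04-09 # Snapshot the compressed file system; the name after @ is arbitrary, here simply today's date
# md5 /storage/compressed/.zfs/snapshot/2009-04-09 # Hash the snapshot copy to confirm it matches what was there before the snapshot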
MD5 (/storage/compressed/.zfs/snapshot/2009-04-09) = 7bffed2808dfba7915f89f8f42b09f83
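
Since the snapshot name here is just the date, it can be generated instead of typed, which is handy if snapshots are taken from a daily job. A minimal sketch, assuming one snapshot per day is enough; snapshot_name is a hypothetical helper, not a zfs subcommand.

```sh
#!/bin/sh
# Hypothetical helper: build "dataset@YYYY-MM-DD" from today's date.
snapshot_name() {
    echo "$1@$(date +%Y-%m-%d)"
}

# Prints the name only; nothing is created by this line.
snapshot_name storage/compressed

# On the FreeBSD host, the name would be passed to zfs, e.g.:
#   zfs snapshot "$(snapshot_name storage/compressed)"
#   zfs list -t snapshot
```

A second run on the same day would produce the same name, and zfs snapshot refuses to create a snapshot that already exists, so a more frequent schedule would need the time in the name as well.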

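# zfs destroy -r storage/compressed # Destroy the storage/compressed file system; -r is needed here because the snapshot taken above has to be destroyed along with it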


Ref.
http://en.wikipedia.org/wiki/ZFS
http://opensolaris.org/os/community/zfs/
http://wiki.freebsd.org/ZFS
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html

2009/04/02

[ FBSD ] Lagg Interface Setup

Bonding two (or more) network cards on FreeBSD with lagg, for NIC fault tolerance and higher network throughput.

# vi /etc/rc.conf

defaultrouter="192.168.1.254"
ifconfig_vr0="up"
ifconfig_vr1="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto failover laggport vr0 laggport vr1 inet 192.168.1.9 netmask 255.255.255.0"

# reboot

Available protocols

failover
Sends and receives traffic only through the master port. If the master port becomes unavailable, the next active port is used. The first interface added is the master port; any interfaces added after that are used as failover devices.

fec
Supports Cisco EtherChannel. This is a static setup and does not negotiate aggregation with the peer or exchange frames to monitor the link.

lacp
Supports the IEEE 802.3ad Link Aggregation Control Protocol (LACP) and the Marker Protocol. LACP will negotiate a set of aggregable links with the peer into one or more Link Aggregated Groups. Each LAG is composed of ports of the same speed, set to full-duplex operation. The traffic will be balanced across the ports in the LAG with the greatest total speed; in most cases there will only be one LAG which contains all ports. In the event of changes in physical connectivity, Link Aggregation will quickly converge to a new configuration.

loadbalance
Balances outgoing traffic across the active ports based on hashed protocol header information and accepts incoming traffic from any active port. This is a static setup and does not negotiate aggregation with the peer or exchange frames to monitor the link. The hash includes the Ethernet source and destination address, and, if available, the VLAN tag, and the IP source and destination address.

roundrobin
Distributes outgoing traffic using a round-robin scheduler through all active ports and accepts incoming traffic from any active port.

none
This protocol is intended to do nothing: it disables any traffic without disabling the lagg interface itself.
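
Switching between the protocols listed above only changes the laggproto word in the rc.conf entry shown earlier. The sketch below builds that line for a given protocol and set of ports and rejects names that are not in the list. The function lagg_rc_line is invented for this post; the interface names and addresses are the ones from the example configuration.

```sh
#!/bin/sh
# Hypothetical helper: print an rc.conf line for lagg0.
# Usage: lagg_rc_line PROTO IP NETMASK PORT [PORT...]
lagg_rc_line() {
    proto="$1"; ip="$2"; mask="$3"; shift 3
    # Accept only the protocol names listed in this post.
    case "$proto" in
        failover|fec|lacp|loadbalance|roundrobin|none) ;;
        *) echo "unknown laggproto: $proto" >&2; return 1 ;;
    esac
    ports=""
    for port in "$@"; do
        ports="$ports laggport $port"
    done
    echo "ifconfig_lagg0=\"laggproto ${proto}${ports} inet $ip netmask $mask\""
}

# Same hosts and ports as the failover example, but using lacp.
lagg_rc_line lacp 192.168.1.9 255.255.255.0 vr0 vr1
```

Note that lacp and fec also need matching configuration on the switch side, while failover works with an unconfigured switch.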

Ref.

http://www.cyberciti.biz/faq/freebsd-network-link-aggregation-trunking/