Comments welcome. These are just some things I've decided work best and save hassles:
General Storage
- Compression should be used on the back-end storage, not in the VM or host that is mounting the storage.
- Data should be grouped in volumes based on general
purpose/function, performance requirements, backup requirements, and any
isolation requirements. For example:
- DB logs should be on a separate hosting aggregate/raid-set from the databases that they help to protect.
- General business data should not be on the same volume as Financials data, because they have distinct back-up requirements.
- File share data and VM data should not be on the same underlying volume (though they may live on the same RAID set / aggregate).
- NAS/SAN volumes and shares should be named to say as briefly as possible what they actually are:
- Scratch/temp data should always be named such that the user can see that it is "scratch" data, e.g., "localscratch0", "nas_scratch0".
- vi_eng_0 (virtual infrastructure for Engineering's VMs, number 0)
- volume IT_Software shared out as \\corp.mycompany.com\LS\IT\Software.
- Security
- Permissions should be set for at least the share and
directly-contained folders using domain-local security groups (using
nested groups) that are specific to just that share and those folders.
- "Everyone" should almost never be used.
- "Deny" permissions should almost never be used; instead, create a group that includes only the desired people/roles.
- Scratch data should not be backed up. Take every relevant opportunity to remind users of that.
- Linux
- LVM
- LVM should be used where possible -- always for the OS; for data
disks it may be omitted for cases where separate LUNs are presented for
data and there is only one filesystem per LUN.
- The OS should use a different volume group from major application
data. (Simple LAMP boxes with a single disk device may use a
single volume group for everything; systems with more disk devices, or where one may
wish to restore data LUNs from snapshot, should have a separate VG_Data
including all physical volumes holding application data.)
- All partitions on a single disk device should belong to the
same volume group. (This makes re-assembling a broken system easier:
if the data volume group has a physical volume missing/broken or
restored from snapshot, the system can still boot if the OS volume group
is intact; then the data volume group can be re-assembled.)
- Logical volumes should be named with "LV" plus the mount path,
substituting "_" for "/"; LV_root for the root "/" filesystem, and
LV_swap for the swap filesystem.
- /boot should not be on a logical volume.
- Logical volumes should always be thick-provisioned.
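The naming rule for logical volumes is mechanical enough to sketch in a few lines of Python. This is only an illustration of the convention; the function name lv_name and the use of the literal string "swap" to stand for the swap volume are my own assumptions.

```python
def lv_name(mount_path: str) -> str:
    """Derive a logical volume name from a mount path, per the convention above."""
    if mount_path == "swap":
        return "LV_swap"   # swap has no mount path, so it is special-cased
    if mount_path == "/":
        return "LV_root"   # "/" alone would otherwise give a bare "LV_"
    if not mount_path.startswith("/"):
        raise ValueError(f"expected an absolute path, got {mount_path!r}")
    # "/var/log" -> "LV" + "_var_log"
    return "LV" + mount_path.rstrip("/").replace("/", "_")
```

For example, lv_name("/var/log") gives "LV_var_log".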
- Resizing
- Before any resize operation, always back up the data to external storage first!
- When growing a logical volume, always grow the logical volume
first, using a round "g" size, then resize the filesystem (don't specify a
size, where possible, and let the resize tool detect the new size). If the
hosting physical volume is to be grown, grow the hosting partition/LUN
and then run pvresize before growing the logical volume.
- When shrinking a logical volume, always shrink the filesystem
first, using a round "g" size, then resize the logical volume, also using
a round "g" size specification. If the hosting partition/LUN is also to
be shrunk, then it should be shrunk last, using a round "g" size
specification.
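The ordering is the part that is easy to get backwards, so here is a rough dry-run sketch that only builds the command lists and runs nothing. It assumes an ext2/3/4 filesystem (hence resize2fs); pvresize is the LVM command that makes a physical volume take up a grown partition/LUN. The function names and the device path layout /dev/VG/LV are assumptions for illustration. Unmounting and the forced fsck that ext4 needs before a shrink are left out, as is resizing the underlying partition/LUN itself.

```python
from typing import List, Optional


def grow_plan(vg: str, lv: str, size_g: int, pv: Optional[str] = None) -> List[List[str]]:
    """Commands to grow an LV to size_g GiB: PV (if grown), then LV, then filesystem."""
    dev = f"/dev/{vg}/{lv}"
    steps: List[List[str]] = []
    if pv is not None:
        # The partition/LUN was already grown; let LVM see the new space.
        steps.append(["pvresize", pv])
    steps.append(["lvextend", "-L", f"{size_g}g", dev])
    # No size given: resize2fs grows the filesystem to fill the LV.
    steps.append(["resize2fs", dev])
    return steps


def shrink_plan(vg: str, lv: str, size_g: int) -> List[List[str]]:
    """Commands to shrink an LV to size_g GiB: filesystem first, then the LV."""
    dev = f"/dev/{vg}/{lv}"
    return [
        ["resize2fs", dev, f"{size_g}G"],
        ["lvreduce", "-L", f"{size_g}g", dev],
    ]
```

Printing the plan and reading it before typing anything is the whole point; nothing here replaces the backup.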
- Filesystems should be labeled with the mount path, e.g. "/var/log"
where it fits; if the path is too long, then the last unique part of
the path should be used, e.g. "postgres-backups".
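A small sketch of the labeling rule, assuming ext2/3/4, where labels are limited to 16 characters (XFS allows only 12, so pass max_len=12 there). I treat "the last unique part of the path" as simply the last path component, which is a simplification, and the function name fs_label is made up.

```python
import posixpath


def fs_label(mount_path: str, max_len: int = 16) -> str:
    """Return the mount path if it fits as a label, else its last component."""
    if len(mount_path) <= max_len:
        return mount_path
    last = posixpath.basename(mount_path.rstrip("/"))
    if not last or len(last) > max_len:
        raise ValueError(f"no part of {mount_path!r} fits in {max_len} characters")
    return last
```

With a hypothetical mount path of /srv/data/postgres-backups this returns "postgres-backups", which is exactly 16 characters.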
- Windows
- Major applications, and those where one may wish to restore data
LUNs from snapshot separately from the OS, should
place application data on separate disk devices (not just separate
partitions).
- Filesystems should be labeled based on mount/drive path and
purpose, e.g. "C_OS", "D_Data", "F_Logs", "G_MailboxDB00",
"C_Appdata_logs"
General SAN
- LUNs should always be masked on the SAN target device to allow
access only from those initiators that require access to that LUN.
- LUNs should be formatted, mounted (fstab), and mapped using the multipath device, not an individual path device.
- Multipathing
- Multipathing should be used where possible, with redundancy at each layer (redundant initiators, switches/paths, and targets).
- Redundant targets should be used, as well: Initiator A should
connect on subnet/switch A to the target LUN on SAN head A; initiator B
should connect on subnet/switch B to the target LUN through SAN head
B.
- Active-active with round-robin load balancing should be used.
- LUNs should generally be thinly provisioned (except for LUNs
hosting essential databases); the monitoring system (Zabbix) should be
configured to alert when the hosting aggregate is at 80%, 90%, and 95% of
capacity.
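With thin provisioning the aggregate can fill up even though every LUN looks fine from the host side, which is why the thresholds matter. The following is not Zabbix configuration, just a sketch of the threshold logic in Python; the names THRESHOLDS and alert_level are assumptions.

```python
from typing import Optional

# Alert thresholds in percent of aggregate capacity, highest first.
THRESHOLDS = (95, 90, 80)


def alert_level(used_bytes: int, total_bytes: int) -> Optional[int]:
    """Return the highest threshold reached (80, 90 or 95), or None if below 80%."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    pct = 100 * used_bytes / total_bytes
    for threshold in THRESHOLDS:
        if pct >= threshold:
            return threshold
    return None
```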
FC
- FC switches should use WWN-based zoning.
- Any host OS installation should include *unplugging* the HBA so
that the OS does not wipe all visible LUNs. (If the host OS is being
installed for boot-from-SAN, then all LUN masks and target attributes
should be triple-checked to avoid inadvertent destruction of data.)
- Switches in separate paths should be managed separately. (Combining
them into a single management domain would combine them into a single
fault domain for administrative errors).
- WWNs should, where possible, be aliased on all storage devices
(switches and SAN) to include the hostname (and optionally a function)
so that they can be easily identified and referenced, and so that all
related zonings/mappings can be updated by updating only the alias
itself. (For example, if an HBA must be replaced, that would invalidate
all WWN-based mappings; but since we used an alias for all mappings, we
just update the single alias definition that included that WWN.)
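The aliasing argument is just one level of indirection, which a toy model in Python shows well enough. This is not any switch vendor's API; the alias names, zone name, and WWNs below are all made up for illustration.

```python
from typing import Dict, List

# Hypothetical alias table: alias name -> WWN (made-up values).
aliases: Dict[str, str] = {
    "db01_hba0": "10:00:00:00:c9:00:00:01",
    "san_a_p0": "50:0a:09:80:00:00:00:01",
}

# Zones refer to aliases only, never to raw WWNs.
zones: Dict[str, List[str]] = {
    "z_db01_san_a": ["db01_hba0", "san_a_p0"],
}


def resolve_zone(zone: str) -> List[str]:
    """Expand a zone's alias members into the WWNs they currently point at."""
    return [aliases[member] for member in zones[zone]]


def replace_hba(alias: str, new_wwn: str) -> None:
    """Point an existing alias at a replacement HBA's WWN (stored lower-case)."""
    aliases[alias] = new_wwn.lower()
```

After replace_hba("db01_hba0", ...) every zone that names db01_hba0 resolves to the new WWN, and the zone definitions themselves are never touched.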
iSCSI
- All IQNs should always be typed in all lower-case.
- Initiator IQN names should be:
- iqn.2014-01.local:hostname[-boot]
- The bracketed "-boot" style suffix is optional, and is only needed if there are several different initiators on a single host.
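A minimal sketch of building initiator names to this pattern, forcing lower-case along the way. The function name initiator_iqn and its parameters are assumptions; the prefix is the one given in the pattern above.

```python
from typing import Optional


def initiator_iqn(hostname: str, suffix: Optional[str] = None,
                  prefix: str = "iqn.2014-01.local") -> str:
    """Build an initiator IQN as prefix:hostname[-suffix], always lower-case."""
    name = f"{prefix}:{hostname}"
    if suffix:
        # Only needed when one host has several different initiators.
        name += f"-{suffix}"
    return name.lower()
```

For example, initiator_iqn("DB01", "boot") gives "iqn.2014-01.local:db01-boot".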
- No iSCSI traffic should be routed. In other words, any given initiator should be connecting to a target on the same subnet.
- Targets should be mapped using IP address, not hostname.
- Jumbo frames (9000-byte MTU) should be used where possible.
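The "no routed iSCSI" rule can be checked before a target is ever mapped. Here is a small sketch using Python's standard ipaddress module; the function name same_subnet and the example addresses are made up.

```python
import ipaddress


def same_subnet(initiator_if: str, target_ip: str) -> bool:
    """True if target_ip lies in the subnet of the initiator interface.

    initiator_if is an address with prefix length, e.g. "10.0.1.15/24".
    """
    iface = ipaddress.ip_interface(initiator_if)
    return ipaddress.ip_address(target_ip) in iface.network
```

So same_subnet("10.0.1.15/24", "10.0.1.200") is True, while a target at 10.0.2.200 would need a router and should be rejected.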