2009/07/09

Parallelizing Tasks in Unix/Linux

From Ian C. Blenke, The easiest way is with parallelized xargs:

$ find . -name '*.jpg' | sed -e 's/\.jpg$//' | xargs -P4 -I{} convert {}.jpg {}.png

The -P flag for xargs is a _wonderful_ thing to learn. Do it now, it
will forever save you time. I use it daily in our huge farm of linux
servers, makes for far more bearable adminning.
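A slightly more defensive variant of the same idea, as a sketch only: it passes NUL-delimited names so files with spaces survive, and it uses cp as a stand-in for convert so it runs on a box without ImageMagick. The scratch directory and file names are made up for the demo, and -print0, -0 and -P assume GNU (or BSD) find and xargs.

```shell
# Demo directory with made-up file names, including one with a space.
workdir=$(mktemp -d)
touch "$workdir/a.jpg" "$workdir/b c.jpg" "$workdir/d.jpg"

# Run up to 4 jobs at once, one file per job.
# cp stands in for "convert"; swap convert back in for real image work.
find "$workdir" -name '*.jpg' -print0 |
  xargs -0 -P4 -I{} sh -c 'cp "$1" "${1%.jpg}.png"' _ {}

ls "$workdir"
```

The sh -c wrapper is only there so the extension can be stripped per file with ${1%.jpg}; it replaces the sed step.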

2009/06/08

CentOS NFS permission denied on mount

An NFS mount fails with a permission denied error, even though the export permissions have been checked and are right.

The solution is to add (on the exporting host):

nfsd /proc/fs/nfsd nfsd auto,defaults 0 0

to /etc/fstab and then type:

$ mount -a

on the client.
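If you script this, a small guard keeps the fstab entry from being added twice. This is a sketch: add_fstab_line is a hypothetical helper name, and the demo writes to a scratch file rather than the real /etc/fstab.

```shell
# add_fstab_line FILE LINE
# Append LINE to FILE only if an identical line is not already there.
add_fstab_line() {
  file=$1
  line=$2
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >>"$file"
}

# Try it on a scratch file first; point it at /etc/fstab (as root) once happy.
fstab=$(mktemp)
add_fstab_line "$fstab" 'nfsd /proc/fs/nfsd nfsd auto,defaults 0 0'
add_fstab_line "$fstab" 'nfsd /proc/fs/nfsd nfsd auto,defaults 0 0'
cat "$fstab"
```

The second call is a no-op, so the file ends up with a single nfsd line.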

Not sure why this error occurs.

2009/05/20

Remote execution on Windows

I've been trying to set up a poor man's backup: from my linux box with the SCSI tape drive attached, remotely execute ntbackup on each of my windows boxes, then dump those backups to tape.

In the past, I've had separate scheduled tasks on each windows server. The problem is that there's no central error-reporting mechanism. The idea of the new approach is to have all of the backup reporting (and exit statuses) in one cron log report.
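The one-log-for-everything idea could look roughly like this. It's a sketch, not my actual script: run_backups, RUNNER and fake_runner are made-up names, and the runner is a placeholder for whatever actually kicks off the remote backup.

```shell
# run_backups LOGFILE HOST...
# Calls $RUNNER once per host and records each exit status in LOGFILE.
# Returns 1 if any host failed, 0 otherwise.
run_backups() {
  log=$1
  shift
  failed=0
  for host in "$@"; do
    "$RUNNER" "$host" >>"$log" 2>&1
    status=$?
    printf '%s host=%s exit=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$host" "$status" >>"$log"
    [ "$status" -eq 0 ] || failed=1
  done
  return "$failed"
}

# Stand-in runner for the demo: pretends the host named "bad" fails.
fake_runner() {
  echo "backing up $1"
  [ "$1" != bad ]
}

RUNNER=fake_runner
logfile=$(mktemp)
overall=0
run_backups "$logfile" win1 bad win2 || overall=$?
echo "overall status: $overall"
cat "$logfile"
```

Run from cron, the non-zero overall status plus the per-host exit= lines give one place to look when a backup breaks.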

I've been using winexe, which is pretty cool. It lets you run remote windows commands from Linux. It appears to be part of Samba4, although you don't need all of Samba4 to make it work.

...It hasn't worked properly.

This thread appears to say why:
'Any process you can access or create on a remote machine will not be able to "touch" any other machine in the network. Only an "interactive" session can do this by default.
'You would need to tell Active Directory to "Trust" the machine for "Delegation" to make this work. This is usually not a good idea as it can present a considerable security risk if not managed closely.'
If true, then that might have something to do with it.

...the selection lists, the backup scripts, and the backup targets are located on a linux samba server. Then again, it appears to be able to see and execute those files. Hmm... too tired, need to think about this more.

2009/05/19

OpenSolaris Notes

I'm primarily a linux guy (I used Solaris from 1996 to 2002), so here are some notes to self:

Service log files are stored under /var/svc/log

There are a few problems getting printing to work in 2009.06:
http://defect.opensolaris.org/bz/show_bug.cgi?id=2656
http://defect.opensolaris.org/bz/show_bug.cgi?id=6366

patch /etc/dbus-1/system.d/hal.conf

pkg install SUNWsmmgr
svcadm enable network/device-discovery/printers:snmp
svcadm refresh svc:/system/dbus:default
svcadm restart svc:/system/dbus:default
svcadm disable snmp
svcadm enable snmp
svcadm clear printers:snmp
svcs printers:snmp
tail -f /var/svc/log/*print* (to see what's happening)

2009/05/16

Centos AD Authentication and users and groups

To configure your linux workstation to pull user, group, and authentication information from AD, run these commands.  They do the dirty work of configuring pam, samba+winbind, nscd, and Kerberos.

Substitute your admin user account where mine is used below (admin-username), your AD dns domain/realm where domainname.com is used, and the netbios domain name where domainname is used.

yum install samba pam_krb5.x86_64 pam_smb.x86_64 nscd

authconfig --enableshadow --passalgo=sha512 --disablenis --disableldap --disableldapauth --disableldaptls --disablesmartcard --disablerequiresmartcard --enablekrb5 --krb5kdc=dc1.domainname.com --krb5adminserver=dc1.domainname.com --krb5realm=DOMAINNAME.COM --enablekrb5kdcdns --enablekrb5realmdns --disablesmbauth --smbworkgroup=DOMAINNAME --smbservers=dc1.domainname.com,dc2.domainname.com --enablewinbind --disablewinbindauth --smbsecurity=ads --smbrealm=DOMAINNAME.COM --smbidmapuid=10000000-20000000 --smbidmapgid=10000000-20000000 --winbindseparator=\\ --winbindtemplatehomedir=/home/%D/%U --winbindtemplateshell=/bin/bash --enablewinbindusedefaultdomain --enablewinbindoffline --winbindjoin=admin-username --disablewins --disablehesiod --enablecache --enablelocauthorize --enablepamaccess --disablesysnetauth --enablemkhomedir --updateall

Note that all users on the domain will now be able to log in to your computer over the network, unless you either:

1. Set up an ssh AllowUsers or AllowGroups parameter in /etc/ssh/sshd_config (see man page for sshd_config); or

2. Use pam_access (see man page for pam_access).

2009/05/12

XenServer VM won't shutdown

I had a windows x64 vm domU that was locked up. It would not shut down through XenCenter. It would not shut down through the command line with "xe vm-shutdown vm=" or "xe vm-shutdown --force vm=".

In the logs were these messages:
VM.hard_shutdown R:82ca53505e13|xapi] VM.hard_shutdown locking failed: caught transient failure OTHER_OPERATION_IN_PROGRESS: [ VM; OpaqueRef:6c1f16b5-7a80-c3fa-eb07-6b605d1fa305 ]
[20090512 08:43:59.181|debug|vserve0|166 unix-RPC|VM.hard_shutdown R:82ca53505e13|xapi] Waiting for 12.745108 seconds before retrying...

I could not find any information on what "OTHER_OPERATION_IN_PROGRESS" meant (other than the obvious), or how to address it.

A host reboot also did not work: the xen host would not shut down gracefully, because it could not terminate this vm -- it would just hang at "terminating remaining VM's".

Finally, too late, I found the answer in the Citrix forums:

To know the cause of the rejection, you can try running xe task-list and see if anything is in pending state that might be related to the command failure.

If you do see anything that is likely to be in the way, try removing the task with xe task-cancel uuid=TASK-UUID, then try the shutdown operation again.


You may also have to run xe-toolstack-restart.
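For next time, here's a sketch of a helper that turns a list of pending task UUIDs into the matching cancel commands. It only prints them, so nothing gets cancelled by accident. cancel_cmds is a made-up name, the UUIDs in the demo are placeholders, and feeding it from "xe task-list status=pending --minimal" assumes your xe version accepts that filter and prints a comma-separated list.

```shell
# cancel_cmds UUID_LIST
# Turn a comma-separated list of task UUIDs into one "xe task-cancel" line each.
# It only prints the commands, so you can read them before running anything.
cancel_cmds() {
  printf '%s\n' "$1" | tr ',' '\n' | while read -r uuid; do
    if [ -n "$uuid" ]; then
      printf 'xe task-cancel uuid=%s\n' "$uuid"
    fi
  done
}

# On the XenServer host this would be something like (assumption, see above):
#   cancel_cmds "$(xe task-list status=pending --minimal)"
# Demo with two placeholder UUIDs:
cancel_cmds 'TASK-UUID-1,TASK-UUID-2'
```

Check what each task actually is with xe task-list before cancelling; not every pending task is the one blocking the shutdown.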

2009/05/04

Samba and SELinux

In a previous post, I mentioned that you can keep selinux enabled to keep your system a bit more secure, by applying a label to your system.  For example, with a Samba share, you might do this:

chcon -R -t samba_share_t /srv/exports/backups

This labels (recursively) the /srv/exports/backups share as a samba share.

But this change won't persist across a filesystem relabel.  So, we have to do this:

semanage fcontext -a -t samba_share_t '/srv/exports/backups(/.*)?'
restorecon -R -v /srv/exports/backups


For more tips and more options, see http://danwalsh.livejournal.com/14195.html .

2009/02/11

Sun xVM Server evaluation

The following is a letter I sent after evaluating Sun xVM Server Early Access 3. After EA3, there should be an EA4, and then, I'm told, there will be the General Availability. Letter follows:


First, let me say, I really appreciate the decent customer service at the beginning of this engagement, both from my reseller and from Sun. It was a little painful to actually get the software to download, but my many questions were answered well to my satisfaction.

I like Xen. I like Sun because of history. I think Sun Xen (xVM) offers an easy migration path for my existing RHEL 5 Xen infrastructure. I like the Microsoft support of Sun xVM. But I haven’t used Solaris seriously in 8 years, so I know I won’t be able to do the kind of under-the-hood troubleshooting that I routinely do on linux-based systems. It’s an uphill battle getting Sun hardware and software, though, because other folks here are much more comfortable with, for example, HP and VMware. That said, we’re in a growth mode, moving to virtualize a lot of stuff this year, buying hardware and possibly software for virtualization.

While I expected some occasional glitches in xVM server, perhaps even panics, etc., I have really been left feeling that this EA3 is not a Beta, RC1, or RC2 type product, but really an alpha-level development snapshot.

It would not recognize the RAID in my HP server. Okay, so I can’t run it on any of my HP servers, and won’t be able to do so in the GA.

…the install cd panics on my HP xw8600 workstation. In fact, of the OpenSolaris releases, 2008.05 works, but 2008.11 panics (and an upgrade from 2008.05 to 2008.11 also results in a non-bootable, panicky system). The panics happen in several different modules, but appear to be related to the Intel SATA chipset. After wasting many hours getting it working, I was able to mask out the proper devices in the BIOS, and get xVM server to install. The Hardware Compatibility List is apparently wrong about my particular workstation.

Finally, having it installed, the web UI is painfully slow on my quad-core, 10GB RAM workstation.

Some web UI features don’t work, because they use flash, and flash isn’t distributed with xVM. Okay, I’ll just look at the charts from a remote box.

Wait, the box isn’t on the network. DHCP didn’t work. (tried both NICs). I assigned a static IP. Still no-go. No console diagnostics, apparently, to try to ping a gateway, see NIC state (UP, RUNNING, counters, etc.), or further troubleshoot my problem. *sigh*. I guess I also can’t try the registration feature, the “update” feature, or access the flashy graphs from another system; nor can I import an iso nor create a library on my nfs server.

Okay, so I’ll play around with the other features.

Hmm…, I can’t see how to set up raid-Z in the storage part of the management console. I was hoping to do things like stripe the main test volume disks, but replicate snapshots to a separate internal disk, then to a remote zfs target. No luck there.

No nic teaming nor vlan trunks seems like a glaring omission for a virtualization platform (no redundant network paths?!! No bandwidth aggregation / load balancing?). I knew this from the Q&A at the beginning of the eval, but it’s still alarming that this feature set would be absent in a modern server product distribution.

Basically, I expected 80-90% functionality. I figure that I’ve gotten a pretty (slow) UI, with 20% functionality under the hood. Blame a lot of it on the hardware, but, come on, Intel SATA chipset, HP workstation, and Broadcom NICs?? Can’t you support that?

I can’t imagine that the GA will be that great, if this pre-release is this bad, and release is scheduled (slipped) for Q2CY2009. Maybe in a year. But then, only if Sun is showing true market leadership and profitability in at least some major area, and demonstrates that it can “not just survive, but thrive”.

Yes, I’m frustrated because I’ve spent time I didn’t have, evaluating a product that didn’t work (not even to beta levels); even though I was a “believer” going in to the experience, trying to give it every chance I could, I call this eval a failure.

Can I try to run my own opensolaris Xen solution? Well, my trial of opensolaris has been less than stellar (panics on install).

Thank you for the opportunity to look at this product. I’ll give EA4 exactly one chance, but I won’t waste any more time trying to troubleshoot stuff that should just work.

P.S. I’d have a Sun workstation, but the HP gave me more, at a much lower cost than the Sun model.

2009/01/19

OpenSolaris Review, Troubleshooting Tips

Trying to install OpenSolaris 2008.11 on my HP Pavilion xw8600 workstation. It's certified for Solaris 10, but apparently not for OpenSolaris. The certification carries the caveat that the SATA controller must use PATA emulation, as noted on the certification link.

Successfully installed 2008.05, then upgraded to the latest; the system panics on the first reboot after the upgrade.

No luck so far with 2008.11. Kernel panics galore. Have tried passing various versions of these options to the kernel, with no luck:

disable-uhci=true (disable usb)
disable-ehci=true (made this one up, disable usb 2.0?)
acpi-user-options=2 (disable acpi)
acpi-user-options=8 (run acpi in legacy mode)
use_mp=0 (disable cpu cores other than core 0)
ata-dma-enabled=0 (disable ata dma transfers)

To use the above options, modify the grub kernel line. To do that, boot the system from the cd. When the grub menu is displayed, press a key to stop the countdown. Then press "e" and "e" again to edit the kernel line of the default boot option. Append "-B" followed by a space and a comma-separated list of the above parameters. Then press "ENTER" and "b" to boot it.
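To avoid typos at the grub prompt, the string to append can be built ahead of time. A minimal sketch; boot_args is a made-up helper name, and the three properties passed to it are just picked from the list above.

```shell
# boot_args PROP...
# Join property settings with commas and prefix them with "-B".
boot_args() {
  old_ifs=$IFS
  IFS=,
  joined="$*"        # "$*" joins the arguments with the first character of IFS
  IFS=$old_ifs
  printf '%s %s\n' -B "$joined"
}

boot_args acpi-user-options=2 use_mp=0 ata-dma-enabled=0
# prints: -B acpi-user-options=2,use_mp=0,ata-dma-enabled=0
```

Note there must be no spaces inside the comma-separated list itself.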



To boot into single-user mode, add "-s" to the end of the kernel line.

Debugger tips
To enter the debugger, add "-kdv" to the end of the kernel line
use_mp/W 0 :c (to disable multi-processor)
moddebug/W 80000000 (to print debug info from each module that loads)
:c - continue boot (or reboot if it's panicked)
$C prints a stack backtrace.
::msgbuf - print the console messages
::status - the state of the machine
::stack - print the stack
::modinfo - display info on loaded modules
::findstack


example for how to debug a core file: mdb -k unix.0 vmcore.0
::panicinfo
see dumpadm man page

Network Configuration
edit nwam config
vi /etc/nwam/llp
disable nwam (network automagic configurer)
svcadm disable svc:/network/physical:nwam
enable default network config-ability
svcadm enable svc:/network/physical:default

see if the network interface device driver is installed by doing ifconfig -a -- you should see some interfaces other than lo0
If it's not loaded, then do "ifconfig bge0 plumb" to enable the module, where bge0 is replaced by the correct device name for your driver.

Once it's loaded, you can do
ifconfig bge0 dhcp
to configure the interface for dhcp, or
ifconfig bge0 /
ifconfig bge0 up
route add default
default router config file: /etc/defaultrouter
...then, edit /etc/resolv.conf to list your dns servers, e.g.
nameserver 10.0.0.5
nameserver 10.0.0.6
edit /etc/hostname. to set the hostname, or sys-unconfig

Working with Services
svcs -a (list all services and states)


Package Management Tips
see http://opensolaris.org/os/project/pkg/
find a package, e.g.:
pkg search java
install a package
pkg install
e.g.,
pfexec pkg install SUNWjre-config-plugin

2008/11/14

bond interfaces in Xen

CentOS 5.2 x86_64 / RHEL 5U2

bond type: 802.3ad

We're going to bond eth0 and eth1 in a 802.3ad bond (bond0), and make that the primary bridge for Xen (xenbr0). It will be in vlan access mode (i.e. no vlan trunk). It will also be the primary interface for dom0. Eth2 and Eth3 are present, but not used.

Configure the switchports where eth0 and eth1 are plugged in for 802.3ad, lacp.

/etc/modprobe.conf (bond interface must be first, then load drivers for other network interfaces):
alias bond0 bonding
alias eth0 e1000
alias eth1 e1000
alias eth2 e1000
alias eth3 e1000

Change this line in /etc/xen/xend-config.sxp:
(network-script 'network-bridge netdev=bond0')

Ensure you have a line in /etc/sysconfig/network to define that bond0 should be used as the default gateway:
GATEWAYDEV=bond0

/etc/sysconfig/network-scripts/ifcfg-bond0 (adjust for your ip settings, use a unique MAC address):
DEVICE=bond0
IPADDR=10.10.10.119
MACADDR=00:00:10:01:01:19
NETMASK=255.255.255.0
NETWORK=10.10.10.0
BROADCAST=10.10.10.255
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
BONDING_OPTS="mode=4 miimon=100"

/etc/sysconfig/network-scripts/ifcfg-eth0 (adjust according to your device's true mac address):
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
HWADDR=00:1E:68:37:FA:92

/etc/sysconfig/network-scripts/ifcfg-eth1 (adjust according to your device's true mac address):
DEVICE=eth1
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
HWADDR=00:1E:68:37:FA:93

/etc/sysconfig/network-scripts/ifcfg-eth2 (adjust according to your device's true mac address):
DEVICE=eth2
BOOTPROTO=dhcp
HWADDR=00:1E:68:37:FA:94
ONBOOT=no

/etc/sysconfig/network-scripts/ifcfg-eth3 (adjust according to your device's true mac address):
DEVICE=eth3
BOOTPROTO=dhcp
HWADDR=00:1E:68:37:FA:95
ONBOOT=no
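
Once the box is back up, it's worth checking that the bond really came up in 802.3ad mode with both slaves. The kernel's bonding driver exposes a status file under /proc/net/bonding/ for this. Below is a sketch of a quick summary; bond_summary is a made-up helper name, the sample text is an abbreviated, made-up stand-in for the real file, and the bond's name under /proc may differ once xend's network script has run.

```shell
# bond_summary FILE
# Print the bonding mode and the number of slaves listed in a
# /proc/net/bonding/<bond> style status file.
bond_summary() {
  awk -F': ' '
    /^Bonding Mode:/    { mode = $2 }
    /^Slave Interface:/ { slaves++ }
    END { printf "mode=%s slaves=%d\n", mode, slaves }
  ' "$1"
}

# On the dom0, something like: bond_summary /proc/net/bonding/bond0
# Demo against an abbreviated, made-up sample:
sample=$(mktemp)
cat >"$sample" <<'EOF'
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: up
EOF
bond_summary "$sample"
```

If the slave count is lower than expected, check the switchport LACP config and the MASTER/SLAVE lines in the ifcfg files first.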