2011/09/27

Uninstall GDM from Ubuntu 10.x and up

If you just uninstall the desktop packages and/or gdm, you will find that your system will hang upon reboot: it's trying to start a gdm that's not there!

First, this is braindead behaviour from Canonical/Ubuntu.  Uninstalling GDM should obviously clean up after itself, including setting a new default display manager or, if there is none, switching the system to boot to text mode.

Here's what you have to do *before* you reboot (otherwise you'll be booting into a rescue CD):

Edit /etc/default/grub, and change this line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
to
GRUB_CMDLINE_LINUX_DEFAULT="text"

Then run "sudo update-grub".  (If, and only if, you are already in a rescue CD, you have to edit /boot/grub/grub.cfg so that the default menu option has init=/bin/bash at the end of the kernel load line; reboot, remount / read-write, run "update-grub", and then reboot again.)

Failing that (or if you want to be sure), verify that the file /etc/X11/default-display-manager is empty (or that its one line is commented out).
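
The grub edit above can be scripted. This is a minimal sketch, not a tested installer step: the function name set_text_boot is made up for illustration, and it assumes the GRUB_CMDLINE_LINUX_DEFAULT line is present in the file. It takes the file as an argument so you can try it on a scratch copy first, and it leaves a .bak copy behind.

```shell
#!/bin/sh
# set_text_boot (hypothetical helper): rewrite GRUB_CMDLINE_LINUX_DEFAULT so
# the system boots to text mode. The grub defaults file is an optional
# argument, so the edit can be tried on a scratch copy first.
set_text_boot() {
    grub_file="${1:-/etc/default/grub}"
    # -i.bak edits in place and keeps the original as <file>.bak
    sed -i.bak 's/^GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT="text"/' "$grub_file"
}

# Typical use, as root, before rebooting:
#   set_text_boot /etc/default/grub && update-grub
```

If the result looks wrong, copying the .bak file back over the original and re-running update-grub should undo it.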


2011/09/09

(another) Oracle Support Fail

on a 7410C enterprise NAS device (clustered active-active pair)


1. Fail. I asked to be contacted by phone.  Just saw an email in my box. 


2. Fail.  We pay a premium for a 4-hour response time; that was exceeded. And I requested SEV1: 3 outages spanning the infrastructure in 24 hours, what more do we need?

3. Uploading two more bundles:
nodeA: on its way up
nodeB: (the node that is in takeover mode and now not serving data): webUI barely responsive enough to upload a bundle: still building the same bundle. Perhaps someone who knows what they're doing should just log into the bloomin' thing and look at it.  I don't need an altar boy on a distant mountain to look at a dump.  I need the dude with the beard and mustard stain to log in and do the diagnostic work. (I could, but it's supposed to be an appliance, and if I go to the shell, I am warned that any fooling around may void my warranty.)

4. Support tech told me that he saw AD problems on the device. AD configuration is fine, unless the product is broken. See my previous ticket about AD unconfiguring itself. It only ever seems to work on one node of the cluster; that's okay, because we only use CIFS on fs02.



Larry Ellison, all of your stuff stinks on ice.  It did before you acquired Sun, and their stuff did when all the bright egg-heads got too ADD to finish products before they released them.  The only reason people buy Oracle products is because 1) they can throw so much money, time and consultants at their ERP that they can even make this stuff work and compensate for the lousy support; and 2) nobody else has really given this a serious try to outdo you.  Good luck, Salesforce.com: as soon as you establish a reputation for delivering, you'll have an easy job replacing Oracle.


It's hard not to be cynical when the largest, most successful and profitable companies (EMC, CommVault, Symantec, Oracle) decide that the customer is just an annoyance that must be helped minimally and at a distance; and if you put up enough barriers, maybe you don't even need to deal with the customer.

Quest Software Web Site Fail

Another one from the file of "I want to give you money, please help me to give you money."


1. I really want information, quickly, on Bakbone NetVault. The “request a quote” page is completely broken; http://www.bakbone.com/request_information.php gives this error:
Microsoft VBScript compilation error '800a0401'
Expected end of statement
/products/request_information/check_request.asp, line 335
"
City: " & LeadRequest.City & _
^

2. So I click through to another “request more info” page, which gives a “not found” error.
3. The http://www.quest.com/products/request-more-info-landing.aspx?requestdefid=34948 page requires me to log in before I can request more info.
4. Trying to get info through your support site: my login (john_doe@foo.bar.com) required a password reset, but the site won’t accept my new password and won’t let me go past it until it does.  It does not tell me why it doesn’t like my password; I tried a number of random passwords with no luck.

2011/06/13

The future of CentOS

It appears that the CentOS project is languishing. There are insufficient resources to release 5.6, 6.0, and 6.1 in a timely fashion; installed servers are going unpatched. It appears from a quick health-check that the resulting attrition may prove an existential threat to the distribution. (Once I start moving everything to Debian, Ubuntu LTS, or some such, I'm probably not going to go back.)

Here's my thought on how to address resource constraints; with 100 businesses such as my own, 4-8 full-time devs could earn a decent wage, and CentOS may be able to survive.

Some reasonably sized consumers of CentOS, such as myself, have found that:
1. RHEL subscription costs are not reasonable.

2. CentOS is good enough that we run 80+ physical and virtual hosts.

3. CentOS is free. We like free. But we also recognize the value of the CentOS product to our business, and feel that our business should (and would) be willing to contribute something to the development, in particular if a release schedule can be assured. Perhaps $100 per vm/physical per year... for me, that might be $8000/yr to contribute to the project. Surely there are several more businesses that would be willing to contribute, again, for an assurance in return? A requirement of this, though, is for you to have a formal purchase order and billing process in place.

Bottom line is, my company (a very large company) does not have a process or a desire for "donation". If there's not a process for it, in a big business, it nearly can't be done. But ordering software, licenses and subscriptions, is something that happens every day as part of the regular business process.

I realize that for some devs it may be the love of the work, or a need for a good distro, or prestige more than money; but some who might be willing could jump in and help if their income were supplemented.

It's a great distro, a great product where stability is king. But we also need consistency and steadiness, or stability of lifecycle, if you will.

2011/03/08

Check return code of piped command OR Export variable to parent shell

I just spent a day or more with a peer working the problem of "how do we get the return code of the first command in a series of piped commands in bash?"

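The problem is: $? will hold the return code of the last command in the pipe sequence. We tried doing various things like ( foocmd ; export OUTERR=$?) | gzip... The problem with that is that the parentheses run in a subshell, and in bash exported variables are not "global": they are passed down to child processes, never back up to the parent. So the value of $OUTERR is lost as soon as we hit the ). {}'s also did not work, because each element of a pipeline runs in its own subshell anyway.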

My partner's hack was to read stdout from foocmd into a variable, which he later cat'ed into gzip. Ick. Since we're dumping databases with foocmd, we're sure to run into architecture-dependent bash variable size limits, not to mention the RAM requirement of storing entire DB dumps into a variable.

As is usually the case, in the end we found that we'd spent so much time because we were going about it the wrong way, and we lacked the simple truth that would help us solve the problem efficiently.

${PIPESTATUS[@]} is an array similar to $?, except that it stores the return code of each component in a piped series. Thus, if we do this:

# /bin/false | tr x y | wc > /dev/null 2>&1
# echo ${PIPESTATUS[@]}
...we get as output...
1 0 0
...the first command, /bin/false returned "1", and the others returned "0" each.

Properly, we would test the value of each array element before considering the execution of the whole to be a success.

Now, that array will be overwritten by the very next command, so the first thing we want to do is copy the array:
FOO_EXITCODE=("${PIPESTATUS[@]}")
So simple in the end.
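
Testing each element, as described above, might look something like this. It is a sketch only: pipeline_ok is a name invented for the example, and the /bin/false pipeline stands in for the real dump-and-compress command. It needs bash, since PIPESTATUS is a bash feature.

```shell
#!/usr/bin/env bash
# pipeline_ok (hypothetical helper): succeed only if every exit code passed
# to it is zero.
pipeline_ok() {
    local code
    for code in "$@"; do
        [ "$code" -eq 0 ] || return 1
    done
    return 0
}

# Stand-in for "foocmd | gzip ..."; the first stage fails on purpose.
/bin/false | tr x y | wc > /dev/null 2>&1
# Copy the array immediately, before any other command overwrites it.
codes=("${PIPESTATUS[@]}")

if pipeline_ok "${codes[@]}"; then
    echo "pipeline ok"
else
    echo "pipeline failed: ${codes[*]}"   # prints "pipeline failed: 1 0 0"
fi
```

If you only care whether anything in the pipe failed, and not which stage, bash's "set -o pipefail" makes $? reflect the failure instead.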

2011/02/25

Remote bulk file edits and administration with SSH and SED ( sed examples )

Want to deploy the zabbix agent to a bunch of Ubuntu Linux systems? Easy. But wait... the config file for each needs to be updated. How about this:
for target in host1 host2 host3 host4; do echo $target; ssh -t $target "apt-get -y install zabbix-agent; sed -i.bak -e \"s/Server=localhost/Server=10.10.1.11/g\" -e \"s/Hostname=localhost/Hostname=$target/g\" /etc/zabbix/zabbix_agent.conf /etc/zabbix/zabbix_agentd.conf; update-rc.d zabbix-agent enable; /etc/init.d/zabbix-agent restart"; done

This will:
  1. ssh to each host
  2. install the agent on that host
  3. replace the default "Server=" and "Hostname=" lines in the two config files, zabbix_agent.conf and zabbix_agentd.conf, where 10.10.1.11 is the zabbix server IP address.
  4. make a backup of the two config files
  5. configure the zabbix_agent to auto-start
  6. restart the zabbix agent to pick up the config file changes.
That was easier and more reliable than trying to complete the procedure by hand on 50 systems.
(Think about updating fstab and others for a mass of hosts.)
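
Before pushing a sed expression out to 50 hosts, it may be worth a dry run on a scratch file. A minimal sketch: configure_agent is a made-up helper name, and the scratch file contents are invented to resemble the two default lines mentioned above.

```shell
#!/bin/sh
# configure_agent (hypothetical helper): apply the same two substitutions as
# the one-liner above to a single config file.
#   $1 = config file, $2 = zabbix server IP, $3 = hostname to report
configure_agent() {
    sed -i.bak \
        -e "s/Server=localhost/Server=$2/g" \
        -e "s/Hostname=localhost/Hostname=$3/g" \
        "$1"
}

# Dry run against a scratch file (contents are made up for the example).
scratch=$(mktemp)
printf 'Server=localhost\nHostname=localhost\n' > "$scratch"
configure_agent "$scratch" 10.10.1.11 host1
cat "$scratch"    # shows the file after both substitutions
rm -f "$scratch" "$scratch.bak"
```

Once the output looks right locally, the same expressions can go into the ssh loop.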

For a simple file in-place edit of one line of a file (such as to comment out a line on all the systems' config files):
for target in host1 host2 host3 host4; do echo $target; ssh $target "sed -i.bak -e 's/^domain mynisdomain server mynismaster.company.com$/#domain mynisdomain server mynismaster.company.com/g' /etc/yp.conf"; done
If you have a file to edit, and the line you want to replace has quote marks, you'll need to escape them with \\\ like so:
for target in host1 host2 host3 host4; do ssh $target "sed -i.bak -e \"s/^ENABLED=\\\"false\\\"/ENABLED=\\\"true\\\"/g\" /etc/default/sysstat "; done
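
The triple backslash is needed because the command is parsed twice: once by the local shell, which turns \\\" into \", and once by the shell on the remote side, which turns \" into a plain ". A small sketch that shows this without any remote hosts, using "sh -c" as a stand-in for the shell that ssh would start (the scratch file and its contents are assumptions for the example):

```shell
#!/bin/sh
# Stand-in for a remote /etc/default/sysstat (scratch file, made-up contents).
f=$(mktemp)
echo 'ENABLED="false"' > "$f"

# sh -c plays the part of the remote shell; the quoting is the same as in
# the ssh one-liner above.
sh -c "sed -i.bak -e \"s/^ENABLED=\\\"false\\\"/ENABLED=\\\"true\\\"/g\" $f"

result=$(cat "$f")
echo "$result"    # prints ENABLED="true"
rm -f "$f" "$f.bak"
```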

2011/02/22

NetApp commands cheat sheet

Enable advanced commands

priv set advanced

Get RAID layout of an aggregate
aggr status -r

Grab a performance snapshot
statit -b
(wait)
statit -e

Monitor performance on-going
sysstat

Terse performance overview
perf report -t

Measure fragmentation
wafl scan measure_layout

Perf stats per-Qtree
qtree stats

Detailed performance status
sysstat -m 1

2011/01/11

mod_authnz_ldap searching root of Active Directory

If you try to do as the docs say, and specify:

AuthLDAPURL "ldap://mydc.foo.com:389/DC=foo,DC=com?sAMAccountName?sub?(objectClass=user)"

...it won't work. You'll get a weird error:
[warn] [client 10.10.10.1] [14343] auth_ldap authenticate: user my-ldap-acct authentication failed; URI /repo-path [ldap_search_ext_s() for user failed][Operations error]

...and yet, binding or searching from the root works from openldap, Apache Directory Studio, and myriad other tools.

Appears to be a bug with mod_authnz_ldap.

The workaround? Make sure your DCs all have the Global Catalog role, and then search on port 3268 instead of port 389! ...or 3269 for SSL/TLS.

Works. (tested on mod_authnz_ldap v 2.2....)
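
For reference, here is a small sketch that prints the directive with the Global Catalog port swapped in. gc_ldap_url is a made-up helper name; the host, base DN, attribute and filter are simply the ones from the example above, and using the ldaps:// scheme for port 3269 is my assumption about how you would write the SSL variant.

```shell
#!/bin/sh
# gc_ldap_url (hypothetical helper): build an AuthLDAPURL value that targets
# the Global Catalog instead of the regular LDAP port.
#   $1 = domain controller, $2 = base DN, $3 = "ssl" for port 3269 (optional)
gc_ldap_url() {
    if [ "$3" = "ssl" ]; then
        scheme=ldaps; port=3269
    else
        scheme=ldap; port=3268
    fi
    printf '%s://%s:%s/%s?sAMAccountName?sub?(objectClass=user)\n' \
        "$scheme" "$1" "$port" "$2"
}

# Prints the directive to paste into the Apache config.
echo "AuthLDAPURL \"$(gc_ldap_url mydc.foo.com 'DC=foo,DC=com')\""

# To check the same search from the command line with OpenLDAP's client
# (bind DN and user name are placeholders):
#   ldapsearch -x -H ldap://mydc.foo.com:3268 -b 'DC=foo,DC=com' \
#       -D '<bind-dn>' -W '(sAMAccountName=<user>)'
```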