vmware 2.0 startup problems (and a solution)

A few weeks ago I installed Vmware Server 2.0 Beta 1. I noted a regression from vmware server 1.3 (and 1.2): "raw disks" were seemingly no longer supported. The workaround was to manually edit the virtual machine's 'vmx' file to add the old entries which exposed raw disks to vmware.

Tonight, I rebooted my server after accidentally powering it off while cleaning dust off of the intake vents, and vmware didn't start back up. Technically, all of the startup scripts (/etc/init.d/vmware) ran fine and reported no errors, but I couldn't connect to the management interface on port 8333. Netstat output confirmed that nothing was listening on this port. Crap.

After grepping around in various places, I figured out that the tomcat server that comes with vmware (named webAccess) had no intention of listening on port 8333, and that this was normal. I checked /var/log/ for anything useful and found /var/log/vmware. In this directory was a set of hostd-N.log files, where N is a number. In hostd-0.log was this entry (truncated for readability):

[2008-01-08 21:31:23.790 'vm:/vmdisks/vms/filer (solaris 64bit)/filer (solaris 64bit
).vmx' 47879793637584 warning] Disk was not opened successfully. Backing type unknow
n: 0
[2008-01-08 21:31:23.790 'vm:/vmdisks/vms/filer (solaris 64bit)/filer (solaris 64bit
).vmx' 47879793637584 warning] Disk was not opened successfully. Backing type unknow
n: 0
[2008-01-08 21:31:23.791 'App' 47879793637584 error] 

Exception: ASSERT /build/mts/release/bora-63231/bfg-atlantis/bora/vim/hostd/vmsvc/vm

[2008-01-08 21:31:23.794 'App' 47879793637584 error] Backtrace:
<actual backtrace snipped>
Keep in mind that even though vmware-hostd was failing, /etc/init.d/vmware reported success for every operation. Eek.

So, I went to my filer's vmx file, commented out the rawDisk entries, and restarted vmware (with the init script). No more failures were logged in hostd-0.log, and a subsequent netstat showed vmware-hostd listening on port 8333. Peachy.
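For reference, the entries I commented out were the same rawDisk lines shown in the VMware Server 2.0 Beta post below (shown here disabled with a leading '#'):

```
# scsi0:1.present = "true"
# scsi0:1.filename = "zfs-sdb.vmdk"
# scsi0:1.deviceType = "rawDisk"
# scsi0:2.present = "true"
# scsi0:2.filename = "zfs-sdc.vmdk"
# scsi0:2.deviceType = "rawDisk"
```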

Back on my windows box, I ran the vmware console, and guess what happens... I can now manage my vmware sessions again.

I can only hope that VMware decides to allow raw, local disk access in the finished version of vmware 2.0, because I am rather dependent on it. If they don't, I might be able to get away with moving the data out of the zfs pool, initializing the drives with some random linux file system, creating a 500gig vmware virtual drive on each disk, and finally telling Solaris to rebuild its zfs pool on the new virtual disks. Since I don't have too much data there, I could drain one disk out of the zfs pool at a time and do the conversion from raw to virtual disk one physical disk at a time. Might be a useful exercise in learning more about zfs.

I'll cross that bridge when I get to it.

ssh honeypot.

Using slight variations on the techniques mentioned in my previous post, I've got a vmware instance running Fedora 8 that permits any and all logins. These login sessions are logged with script(1).

Fedora 8 comes with selinux enabled by default. This means sshd was being denied permission to execute my special logging shell. The logs in /var/log/audit/ explained why, and audit2allow even tried to help by generating a new policy entry for me. However, I couldn't figure out (read: couldn't be bothered to search for more than 10 minutes) how to install this new policy. While searching, I found out about chcon(1). A simple command fixed my problem:

chcon --reference=/bin/sh /bin/sugarshell

The symptoms prior to this fix were that I could authenticate, but upon login I would get a '/bin/sugarshell: Permission Denied' error that wasn't logged by sshd.

There are plenty of honeypot software tools out there, but I really wasn't in the mood for reading piles of probably-out-of-date documentation about how to use them. This hack (getpwnam + pam_permit + logging shell) took only a few minutes.
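The pam_permit piece of the hack is a one-line idea. As a sketch (not my exact config; module ordering and file names vary by distro), the auth section of Fedora's /etc/pam.d/sshd would gain a line like:

```
# honeypot only: pam_permit.so accepts everything, so any password works
auth    sufficient    pam_permit.so
```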

As a bonus, I found a feature in Fedora's yum tool that I already liked about freebsd's packaging system: it's trivial to ask "Where did this file come from?" Doing so made me finally look into how to do the same in Ubuntu.

FreeBSD: pkg_info -W /usr/local/bin/ssh
  /usr/local/bin/ssh was installed by package openssh-portable-4.7.p1,1

Fedora:  yum whatprovides /usr/bin/ssh
  openssh-server.x86_64 : The OpenSSH server daemon

Ubuntu:  dpkg -S /usr/bin/ssh
  openssh-client: /usr/bin/ssh

Let's see what I catch.

VMware Server 2.0 Beta

I upgraded my vmware machine from vmware 1.3 to vmware 2.0 beta. The install was much nicer than the last two releases for simple reasons: I didn't have to hack the perl script to keep it from misbehaving, and I didn't have to mess around compiling or finding my own vmware kernel modules. Everything Just Worked during the install.

On the downside, vmware-server-console is deprecated. Vmware Server 2.0 uses Vmware Infrastructure, which appears to be tomcat+xmlrpc and other things. The New Order seems to be that you manage your vms with the web browser, which isn't a bad idea. However, we must remember that Good Ideas do not always translate into Good Implementations.

The web interface looks fancy, but the code looks like it's from 1998. The login window consists of layers and layers of nested tables and a pile of javascript, all in the name of getting the login window centered in the browser. You can see the page align itself while rendering, even on my 2GHz workstation with Firefox. Horrible.

Once you log in, you're presented with a visually-useful-but-still-runs-like-shit interface. The interface itself appears useful and nice, but again fails to respond quickly, presumably due to the piles of poorly written javascript involved.

Since VMware thought this was a fresh install, it didn't know about any of my old virtual machines, and adding them using the web interface caused vmware to crash. Oops. So, I went looking and found a vmware infrastructure client executable in the package; "find ./ -name '*.exe'" will find it for you. I copied this to my windows box, installed it, and used it to re-add my old vmware machines.

Unfortunately, "raw disks" are disabled in this free version of vmware server. I'm not sure why. My Solaris VM uses raw disks for its zfs pool, so this was a problem. Luckily, this is purely a gui limitation and not a vmware limitation. To repair my Solaris VM, I created a new virtual machine with the same features and told it where its first disk lived (the first disk was a normal file-backed vmware disk image). After that, I copied the lines detailing the raw drives from the old vm's .vmx file into the new .vmx file:

scsi0:1.present = "true"
scsi0:1.filename = "zfs-sdb.vmdk"
scsi0:1.deviceType = "rawDisk"
scsi0:2.present = "true"
scsi0:2.filename = "zfs-sdc.vmdk"
scsi0:2.deviceType = "rawDisk"

Everything's back up and running sanely now in vmware. Hurray :)

Boredom, vmware cpu performance, and /dev/random

These are strictly cpu-bound tests using 'openssl speed'. I didn't compile any of the openssl binaries here, so it's possible that differences in compilation caused the differences in the numbers.

I've never noticed a performance difference between host and guest systems in vmware, and here's data confirming my suspicions.

guest/solaris10    OpenSSL 0.9.8e 23 Feb 2007
guest/freebsd6.2   OpenSSL 0.9.7e-p1 25 Oct 2004
host/linux         OpenSSL 0.9.8c 05 Sep 2006

'openssl speed blowfish'
                   type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
host/linux         blowfish cbc     72062.94k    77117.35k    78280.70k    78680.96k    79309.48k
guest/freebsd6.2   blowfish cbc     68236.69k    73335.83k    74060.50k    74423.40k    74703.29k
guest/solaris10    blowfish cbc     64182.15k    73944.47k    75952.21k    76199.94k    76931.07k

'openssl speed rsa'
                                      sign    verify    sign/s verify/s
host/linux         rsa  512 bits 0.000308s 0.000020s   3244.3  49418.3
guest/freebsd6.2   rsa  512 bits   0.0003s   0.0000s   3343.5  41600.1
guest/solaris10    rsa  512 bits 0.001289s 0.000116s    775.6   8630.8

host/linux         rsa 1024 bits 0.000965s 0.000049s   1036.7  20409.8
guest/freebsd6.2   rsa 1024 bits   0.0009s   0.0001s   1160.0  18894.2
guest/solaris10    rsa 1024 bits 0.007152s 0.000369s    139.8   2708.1

host/linux         rsa 2048 bits 0.004819s 0.000135s    207.5   7414.4
guest/freebsd6.2   rsa 2048 bits   0.0045s   0.0001s    222.8   6951.1
guest/solaris10    rsa 2048 bits 0.045780s 0.001334s     21.8    749.8

host/linux         rsa 4096 bits 0.028600s 0.000422s     35.0   2371.3
guest/freebsd6.2   rsa 4096 bits   0.0279s   0.0004s     35.8   2271.4
guest/solaris10    rsa 4096 bits 0.317812s 0.004828s      3.1    207.1

It's interesting that the blowfish performance was pretty close, but rsa was wildly different. The freebsd guest outperformed the linux host in signing by 10%, but fell behind in verification. Solaris performed abysmally. The freebsd-guest vs linux-host data tells me that the cpu speed difference between guest and host environments is probably zero, which is good.

Again, the compilation options for each openssl binary probably played a large part in the performance here. I'm not familiar with SunFreeware's compile options for openssl (the binary I used came from there).

Either way, the point here was not to compare speeds across different platforms, but to compare, in some small way, cpu performance between host and guest systems. There are too many uncontrolled variables in this experiment to consider it valid, but it is interesting data, and it put me on another path to learn about why the results were different.

My crypto is rusty, but I recall that rsa may need a fair bit of entropy to pick a big prime. Maybe solaris' entropy system is slower than freebsd's or linux's? This led me to poke at /dev/random on each system. I wrote a small perl script to read from /dev/random as fast as possible.
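The perl script itself isn't reproduced here, but a rough shell equivalent (my sketch, assuming GNU coreutils timeout(1) is available) looks like:

```shell
#!/bin/sh
# Read from a random device for a fixed number of seconds and report
# how many bytes came out. Rough stand-in for the perl script above.
DEV=${1:-/dev/random}
SECS=${2:-5}
bytes=$( { timeout "$SECS" cat "$DEV"; } | wc -c )
echo "$DEV: $bytes bytes in $SECS seconds ($((bytes / SECS)) bytes/sec)"
```

The numbers below came from the original perl version.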

host/linux        82 bytes in 5.01 seconds: 16.383394 bytes/sec
guest/solaris10   57200 bytes in 5.01 seconds: 11410.838461 bytes/sec
guest/freebsd6.2  210333696 bytes in 5.01 seconds: 41947398.850271 bytes/sec

I then ran the same test on the host/linux machine, while feeding /dev/random on the host with entropy from the freebsd machine:
% ssh [email protected] 'cat /dev/random' > /dev/random &
% perl                                  
448 bytes in 5.00 seconds: 89.563136 bytes/sec

# Kill that /dev/random feeder, and now look:
% perl
61 bytes in 5.01 seconds: 12.185872 bytes/sec

Since speed is often a trade-off for security, are FreeBSD's and Solaris's /dev/random implementations less secure than Linux's? Or is Linux just being dumb?

Googling finds data indicating that /dev/random on linux will block until entropy is available, so let's retry with /dev/urandom instead.

host/linux        29405184 bytes in 5.01 seconds: 5874687.437817 bytes/sec
guest/solaris10   70579600 bytes in 5.00 seconds: 14121588.405586 bytes/sec
guest/freebsd6.2  208445440 bytes in 5.02 seconds: 41502600.216189 bytes/sec

FreeBSD's /dev/urandom is a symlink to /dev/random, so seeing the same throughput here is expected. FreeBSD still wins by a landslide. Why? Then again, maybe that's not a useful question. How often do you need 40mb/sec of random data?

Back to the rsa question: if solaris' random generator is faster than linux's in all cases, then why is 'openssl speed rsa' slower on solaris than on linux? Compile-time differences? Perhaps it's some other system bottleneck I haven't explored yet.

Vmware Server Console on FreeBSD

  • Put the vmware-remotemks program where vmware console wants it:
    Symlink vmware-remotemks to /lib/vmware-server-console/bin/vmware-remotemks
  • Mount linprocfs on /proc:
    mount -t linprocfs - /proc
  • Hack fix for the vmware dep libraries, from vmware-server-console-distrib/lib/bin/:
    for i in ../lib/lib*/*; do ln -s $i `basename $i`; done
  • Copy the pixmaps:
    sudo cp -R share/ /usr/lib/vmware-server-console/share
Remote console works for consoling into freebsd guests, but for some reason it doesn't display the console for my solaris guest. Though, I can take screenshots, and those look fine. Weird.

Ubuntu 64bit / vmware server

Now that I have all the hardware/bios problems fixed on this system, I've started installing virtual machines. However, getting vmware going was no small task.

  • The install script failed to build the vmmon kernel module, so I hacked the script to not do it.
  • Ubuntu has packages for vmmon and vmnet, but installs them in /lib/modules/.../vmware-server/ instead of /lib/modules/.../misc/ where the vmware init.d script expects them. Hacked that with a symlink. The init script looks for 'foo.o' and the ubuntu package provides 'foo.ko'.
  • I couldn't verify my license key because vmware-vmx would fail to run with an error of "No such file or directory". Turns out this really means "You are running a 32 bit binary and I can't find the libraries it needs". The solution is to apt-get install ia32-libs and possibly others.
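The symlink hack from the second bullet looks roughly like this. I've sketched it in a throwaway directory so it's safe to run anywhere; on the real box, $root would be /lib/modules/$(uname -r), and the .ko files come from the ubuntu packages:

```shell
#!/bin/sh
# Ubuntu ships vmware-server/foo.ko, but the vmware init script wants
# misc/foo.o. A relative symlink bridges the two naming schemes.
root=$(mktemp -d)               # stand-in for /lib/modules/$(uname -r)
mkdir -p "$root/vmware-server" "$root/misc"
touch "$root/vmware-server/vmmon.ko" "$root/vmware-server/vmnet.ko"
for m in vmmon vmnet; do
    ln -s "../vmware-server/$m.ko" "$root/misc/$m.o"
done
ls -l "$root/misc"
```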
There are probably other hacks I had to do, but it's 5am and I don't remember them right now.