VDI Sun -- part 4 battling locked pages.
Okay, I'm learning a lot about memory allocation, more specifically how and when pages are locked into physical memory and which kernel functions are responsible for it, all while trying to figure out why pageslocked kept growing in kstats and what was using it. I first blamed my C program for being too aggressive and just locking memory blindly now that it's being started by cron without human intervention, so I am currently working on it and giving it some intelligence about locked memory. Even with my improvements, though, pageslocked kept growing, so I brought out the big guns: DTrace, with which I tried to catch VirtualBox locking memory. I also tried to see how much memory is being locked in the VirtualBox processes using pmap -x, without really finding a good answer. My current guess is that VirtualBox's kernel modules are locking memory using kernel functions, and that this isn't triggered by a normal syscall.
After going away and finding pageslocked still growing even though there were no VirtualBox guests running on the box at all, it finally clicked in my mind: the ZFS ARC is using more and more locked pages, up to the ARC maximum, which I had set to 4GB, or about 66% of free memory on my system. Now, ZFS will give it back, but not fast enough if you ask for it a gigabyte at a time and immediately lock it into physical memory.
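To see how much of the locked memory the ARC could account for, the two counters can be compared side by side. The following is only a rough Python sketch, not my actual C program: it assumes the counters are read with `kstat -p`, that `unix:0:system_pages:pageslocked` is reported in pages and `zfs:0:arcstats:size` in bytes, and that the page size is passed in by the caller. The helper names (`read_kstats`, `parse_kstat`, `locked_summary`) are made up for illustration.

```python
import subprocess

# Statistic names as printed by `kstat -p` (module:instance:name:statistic).
LOCKED_KEY = "unix:0:system_pages:pageslocked"   # counted in pages
ARC_SIZE_KEY = "zfs:0:arcstats:size"             # counted in bytes


def read_kstats():
    """Run kstat for the two counters and return its raw text output.

    Only works on a Solaris/OpenSolaris box where the kstat command exists.
    """
    return subprocess.run(
        ["kstat", "-p", LOCKED_KEY, ARC_SIZE_KEY],
        capture_output=True, text=True, check=True,
    ).stdout


def parse_kstat(text):
    """Parse `kstat -p` lines of the form 'key<whitespace>value' into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        try:
            stats[key] = int(value)
        except ValueError:
            continue  # skip non-numeric statistics
    return stats


def locked_summary(stats, page_size):
    """Return (locked bytes, ARC bytes, ARC size as a share of locked bytes)."""
    locked_bytes = stats[LOCKED_KEY] * page_size
    arc_bytes = stats[ARC_SIZE_KEY]
    share = arc_bytes / locked_bytes if locked_bytes else 0.0
    return locked_bytes, arc_bytes, share
```

On the box itself this would be called as `locked_summary(parse_kstat(read_kstats()), os.sysconf("SC_PAGESIZE"))` after `import os`. If the share stays high while no guests are running, that points at the ARC rather than at VirtualBox.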
Fighting the growth of pageslocked was bad enough, but the reason I'm working this hard is that I am running into this bug: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6910378. The warning messages are bad enough, but the real problem is the underlying upheaval it creates. Network connections die, iSCSI initiators on remote hosts time out (which taught me how to forcibly remove iSCSI LUNs that are formatted with XFS), and I can't log in. After a few minutes the box gets back to normal.
To add more pain, I find the bug and see that it's fixed in b133, which was released into OSOL /dev on Friday night. Since I am currently tracking /dev for ZFS dedup, I figured I might as well upgrade. The upgrade seems to go well: packages are downloaded, new versions are installed, the cleanup scripts all run, and I reboot the box. Since I rarely use the GUI on the system, I don't have it connected to a monitor or even a keyboard, so I open a shell on another box and run ping -s, waiting for the system to come back to life so I can log in. Five minutes go by, then 10, and finally, after 15 minutes of the system sitting there with no disk activity lights, I figure it's time to connect a monitor and keyboard, which requires a restart because video is not enabled on the machine unless a monitor is connected at startup. So I reboot, tell grub to boot into text mode, add -v for verbose output, and remove the graphic screen and the special screen colors, all of which make kernel output unreadable until X starts.
Finally I see Solaris dumping out tons of debugging information, all of which seems pretty normal for a -v enabled Solaris boot. After a while I see my iSCSI connections failing, but I still can't ping the box, so I think the iSCSI SMF manifest is screwed up and starting before physical networking is started. Finally a login prompt appears and I'm able to get into the console. I run ifconfig -a and see my normal network interfaces up: lo is there, VirtualBox's virtual NIC is there, but where is my gigabit adapter? That is strange. I look in dmesg, and nope, no mention of my NIC; I run scanpci and it doesn't see my NIC either. So I decide to try to plumb it, but no luck, it's not finding my NIC. Well, since OSOL has the best of live upgrade enabled using ZFS, I simply tell beadm to activate my earlier version and reboot the box.
ifconfig rge0 plumb
beadm activate opensolaris-dedup-7
shutdown -i6 -y -g0 ; sleep 300; reboot
Not sure why shutdown doesn't always reboot the box, so I forcibly reboot it after giving it a chance to update the boot archives. About 6 minutes later the system is back up and running, the network is fully intact, and everything is good to go. Hopefully they will fix the bug.
Since it was ZFS's ARC that was locking memory, I have gone back to setting the ARC maximum to 3GB. I had set it to 4GB, and that is fine for a while, but then it starts to grow slowly and consumes 66% of memory, leaving little memory for VirtualBox guests. I'm going to try to keep pageslocked at 75% of physical memory, which will leave about 3GB for the ZFS ARC plus 1.5GB for virtuals. With small Windows XP instances needing between 384 and 512MB, that is more than a few desktops that can be loaded. I may even drop the ARC maximum to 2GB. Once I deal with the memory pressure issues it should be fine, but this is a work in progress.
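The budget arithmetic can be written down as a small Python sketch. It assumes about 6GB of physical memory, a figure not stated outright but implied by the numbers above (4GB being about 66% and 75% coming to 3GB + 1.5GB). The function names are made up for illustration.

```python
MB = 1024 * 1024
GB = 1024 * MB


def guest_budget(physical_bytes, locked_fraction, arc_max_bytes):
    """Bytes left for guests once the ARC maximum is taken out of the
    locked-page target (locked_fraction of physical memory)."""
    return int(physical_bytes * locked_fraction) - arc_max_bytes


def guests_that_fit(budget_bytes, guest_bytes):
    """Whole guests of the given size that fit into the budget."""
    return max(budget_bytes, 0) // guest_bytes


PHYSICAL = 6 * GB   # assumption inferred from the post, not a measured value
TARGET = 0.75       # keep pageslocked at 75% of physical memory

for arc_max in (3 * GB, 2 * GB):
    budget = guest_budget(PHYSICAL, TARGET, arc_max)
    print(
        f"ARC max {arc_max // GB}GB: {budget // MB}MB for guests, "
        f"{guests_that_fit(budget, 512 * MB)} at 512MB or "
        f"{guests_that_fit(budget, 384 * MB)} at 384MB"
    )
```

Under that 6GB assumption, a 3GB ARC leaves 1536MB, enough for three 512MB guests or four 384MB guests, and dropping the ARC maximum to 2GB raises the guest budget to 2560MB.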
Hopefully tomorrow I will post an updated version of my code.