Tuesday, May 24, 2011

NetworkManager, stealthy scoundrel, corrupter of resolv.conf

So NetworkManager is the culprit in my mysteriously changing /etc/resolv.conf. How do I know?

Because it said so.

I was about to cut'n'paste the helpful comment that it put at the top of the modified resolv.conf, but when I went to look at it this time, it wasn't there.

It knows that I'm on to it. Probably monitoring my Google searches...

My thinking now is that it's not adding an OpenDNS server that I configured in some other tool, but that it's starting with a DNS configuration containing two OpenDNS servers, and replacing one with my router. And I still have no idea where it's getting the OpenDNS servers from.

Unless it's from the router itself... Hmmm. Why would it replace one in the local resolv.conf? Was it really the combination of local and remote DNS servers causing the problems that I was seeing before? The difference seemed quite dramatic, as documented here. Trying again, time netstat -a is back up to >8 seconds, after changing resolv.conf. I suppose that there's some incantation I need to do to restart networking, but I seem to recall that changing resolv.conf was supposed to have an immediate effect, since it's being referred to for each request.

I've set my network configuration to DHCP, addresses only. Let's see if that stops it from messing around.

In better news, I got xmonad working again after switching to the Haskell platform instead of the Ubuntu packages. Two quick cabal installs, change .xsession to point to the new executable locations, and I'm good to go.

A spontaneous failure of audio also auto-resolved after a reboot. It looks like the hardware was not properly identified, but who can tell when you're me, and just poking around through the GUIs?

Postscript:

The last refuge of the clueless system administrator: Reboot.

Now, NetworkManager has had its chance to work its magic on resolv.conf. Its fingerprints are there, but it behaved. It only put in the router entry and didn't propagate any of its DNS server settings.

Too bad this didn't work. netstat -a now takes ~23 seconds.

Heh. This just keeps getting better. I go back to emacs, use my trusty TRAMP to edit resolv.conf as root (boy, that sounds bad) and lo, and behold, and holy crap, it's changed!

Am I losing my mind? No! (Well, maybe, but more data is required.) I've still got the old un-TRAMPy buffer open, and I can see that my memory is at least accurate in this regard.

And the change is at least somewhat for the better. netstat -a is down to 4 seconds again. I'm assuming that removing my router as an entry will improve this even further, but I'm too tired to test that tonight.

Oh, ho! Another player is afoot! Who's been messing with my resolv.conf? And dammit, why won't you behave?
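Comparing the old buffer against the file on disk by eye works, but it's easy to automate. Here's a minimal sketch in Python (my choice of language, not anything from the tools above) that reports which lines were added or removed between two snapshots of resolv.conf. The function name `resolv_changes` is made up for illustration, and it only says what changed, not who changed it.

```python
import difflib


def resolv_changes(before, after):
    """Return the lines removed ('- ') and added ('+ ') between two snapshots.

    Both arguments are the full text of a resolv.conf file as strings.
    """
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff also emits unchanged lines ('  ') and hint lines ('? ');
    # keep only real additions and removals.
    return [line for line in diff if line.startswith(("+ ", "- "))]
```

Save a copy of /etc/resolv.conf when things work, read both files into strings later, and pass them in. An empty list means nobody touched it.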




Sunday, May 22, 2011

More evidence that I don't know what I'm doing.

Twice now, an entry has been added to /etc/resolv.conf without my intervention. Some program somewhere is deciding to help me out and hose my internet connection. Which program, and for what nefarious purpose?

The problem, I think, is too many layers. This is something that bugs me about the design of lots of things, computer UIs and information management systems in particular. I don't doubt that the culprit in messing with resolv.conf is an application for managing connections which is trying to be helpful, and that I have probably configured to behave in exactly this fashion without knowing it.

Perhaps it's not really the multiple layers of abstraction between basic networking and UI elements, but trying to work with more than one level at once. This isn't an uncommon problem in other areas of user-interface design. Try working with the output generated by a LaTeX front-end or other form of code generator sometime. You can have the polished interface, or you can work one level lower, but don't try to do both!

These days, I'm more inclined to strip a layer away than go back up. Too often, "user friendly" is achieved by being "user limiting". And I actually do know what I'm doing. Sometimes.

Sunday, May 15, 2011

GHC, Haskell Platform installations.

So after great angst, a good night's sleep, and a day not spent thinking about it, the problems of yesterday were dispatched pretty easily today.

GHC 7.0.3 installed without a hitch. As did the Haskell Platform. "Configure, make, sudo make install" for the win.

Then, after a tweak to the cabal file, my code compiled as well. It was limiting base library versions to <= 5. Hence, it couldn't be compiled after the upgrade.

The occasional problem aside, I'm always impressed with the reliability of cabal and Haskell packaging in general. It's not perfect, of course, but once I get past my own ignorance and the occasional somewhat less than helpful error message, it works more often and with less fuss than any other code management tool I've used.

Saturday, May 14, 2011

More adventures in Linux

After a distro upgrade, my Haskell code suddenly stopped compiling. The error message from cabal was less than helpful:

cabal: Package parallel-particle-mc-0.0.1 can't be built on this system.

Okayyy...

Apparently some packages I needed went missing in the upgrade. Tried cabal-installing them, which went without a hitch, but compiling my package still doesn't work.

"You know," I said to myself, "it's time to upgrade to a newer version of the Haskell Platform anyway."

I thought I'd try the Ubuntu Haskell Platform package. To synaptic! Search! Mark for Installation! Apply! Error! "Fix broken packages first."

Okayyy...

Edit -> Fix Broken Packages and...

"Resolve generated breaks, this may be caused by held packages."

To Google! A few pages, all of which suggest a magical incantation along the lines of:

apt-get clean
apt-get autoclean
apt-get update
apt-get upgrade
apt-get dist-upgrade

Tried that. Got the tee-shirt. Didn't work.

Well, there's always installing from source...

To Haskell.org! Download Now! Linux!

Don't want the package, tried that already. Download the source. Need a pre-compiled binary. Follow that link. Ignore the suggestion that I download the platform instead, since that's what brought me here.

Extract tarball. Open INSTALL. Follow directions.

/usr/bin/install: cannot change permissions of `/usr/local/share/man/man1': No such file or directory

*sigh*. I'm lazy. It's late. Let's just create that directory.

Creating directory: permission denied, /usr/local/share/man/man1

WTF? I'm root dammit! Don't tell me I can't do that!

I don't want to figure this out. I've already wasted too much time on this. Despair. Angst. Wonder if life is worth living. Wonder if Linux is worth using. Wonder if I've had too much wine.

Try the last command outside of emacs-shell.

Oh, now it works.

Never had that happen before. No idea why. In the past, eshell has never given me any problems with respect to permissions. Was this changed in the upgrade? Is there some other reason?

Even inquiring minds are beyond caring at this point. Will continue this adventure tomorrow. For now, going to have more wine and less whine.


P.S. First attempt at posting this generated an error in blogger:

Your HTML cannot be accepted: Tag is not allowed: META

I didn't put that in there, Blogger, you did! Edit HTML. Remove. Works.




Friday, April 22, 2011

Huh. Don't do that, I guess.

One quick change to resolv.conf and here are the new numbers:

> time netstat -an

real 0m0.047s
user 0m0.000s
sys 0m0.010s


> time netstat -a

real 0m0.312s
user 0m0.000s
sys 0m0.020s


Ahh. Much better. The random hangs seem to have gone away as well.

Apparently having both the router (and DHCP server) and some other DNS server listed in resolv.conf is a bad idea. Some time ago, I put two nameserver lines in there, both for OpenDNS. The mysterious inner workings of DHCP replaced the first one with my router's address, presumably because it forwards and maybe caches DNS queries.

So, there you go. I flailed my way to the solution of a technical problem with a combination of Google and blind luck. The worst part is, I have no damn clue why this was wrong, or even why it happened. Now that it works, I can't really justify spending more time on it just to satisfy my curiosity, and frankly the motivation isn't there either.
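For next time, a small check for the situation described above: pull the nameserver entries out of a resolv.conf and flag the case where a LAN address (like a router) sits alongside a public resolver. This is a rough sketch in Python with made-up function names. It doesn't explain why the mix caused trouble on my machine, and "private address" is only my stand-in for "the router".

```python
import ipaddress


def nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Lines beginning with '#' or ';' are comments in resolv.conf.
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def mixes_local_and_public(servers):
    """True if the list holds both private (LAN) and public addresses."""
    private = [ipaddress.ip_address(s).is_private for s in servers]
    return any(private) and not all(private)
```

Feed it the contents of /etc/resolv.conf; a True from `mixes_local_and_public` is the combination that coincided with my slow lookups.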

Thursday, April 21, 2011

Diving into the depths of Linux network configuration

Send help if I don't emerge in a few days.

Here's the situation: My internet connectivity sucks. It's fast (enough) when it works, but connections (e.g. opening a new web page) frequently "hang", taking anywhere from a very long time to forever to conclude. A reload and retry will frequently work instantly.

First clue was that the issue might be DNS related. Consulting the great Google yielded this at serverfault. One of the suggestions to diagnose problems with reverse DNS lookups is to check the speed difference between netstat -a and netstat -an, the idea being that -n doesn't require reverse DNS, since it just uses numeric addresses. So here goes:

> time netstat -an
...
real 0m0.039s
user 0m0.010s
sys 0m0.010s

> time netstat -a
...
real 0m37.941s
user 0m0.000s
sys 0m0.020s

Well, there's your problem.

Okay, so my reverse DNS records are probably screwed up. Problem is, I don't really know what that means, or what to do about it. Is this my ISP's fault? (I'm prepared to believe that, since it's Qwest, after all.)
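The netstat comparison times every reverse lookup at once. To see which addresses are the slow ones, the same idea can be applied one address at a time. A minimal sketch in Python, assuming the standard `socket.gethostbyaddr` call is a fair stand-in for what netstat does; the function name and the injectable `resolver` parameter are my own inventions, there only so the timing logic can be tried without a network.

```python
import socket
import time


def timed_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, seconds elapsed) for one reverse DNS lookup."""
    start = time.perf_counter()
    try:
        # gethostbyaddr returns (hostname, aliases, addresses)
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        hostname = None
    elapsed = time.perf_counter() - start
    return hostname, elapsed
```

Calling `timed_reverse_lookup` on each remote address that `netstat -an` lists should show whether one lookup is eating the whole 37 seconds or they are all slow.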

Stay tuned for more posts about me trying to get a clue.