Interesting "Who cares what you think?" in Linux 2.6.21

2 June 2007

I’ve found yet another oddity. Rather, gcc pointed it out to me, because I wasn’t paying too much attention.

The oddity is in the current patch which should fix RAM resume and suspend problems in 2.6.21.

It modifies the subroutine as follows:

static void tick_resume(int broadcast)
{
        struct tick_device *td = &__get_cpu_var(tick_cpu_device);
        unsigned long flags;
        int broadcast = tick_resume_broadcast();

If you don’t read nerd (or C), this is saying “Ok, so, you told me what ‘broadcast’ is, but I’m going to ignore that and figure it out myself.” A bit of strange logic, that.

However, there is a proposed update which clears up some of this mess.

Ugh. I miss FreeBSD at times.

Linux 2.6.21.3 - it's what's (eating you) for dinner.

31 May 2007

Linux has been going through many, many changes recently. One of the most impressive came with the release of 2.6.21, which offers a ‘no tick’ option for normal (32-bit) PCs.

What this means is that Linux can now understand: ‘If there’s nothing to do, don’t worry about it.’ It will take a siesta. Most people won’t really notice this, or care. However, it means a lot to people like myself who are currently stuck with it as an operating system on our laptops. It means that Linux won’t chew up our batteries as harshly as it did previously.

I’ve been fighting the most recent stable release (2.6.21.3) for most of the week that it’s been out. The CFQ scheduler was broken. The ALSA (sound) drivers are too old and don’t work very well with my hardware. There is a problem with its clock system: if you put your laptop to sleep, then when it wakes up it will have a fit, not knowing that it was asleep, but rather thinking that it just mysteriously lost time. The new headers break support for VMware’s drivers, and also destroy any binary-only drivers available by removing system calls.

But, hey, all in the name of progress, right?

Anyhow, I’ve opted to backtrack to 2.6.21, the latest-not-a-sub-release-release, and base my code on Debian’s own 2.6.21 unstable sources.

I’ve updated ALSA, cleaned up some of the SCSI defines, fixed the clock, and added a few suggestions by Intel for the 2.6.21 kernel, as well as many, many, many other changes. Most are laptop-centric (support for Core 2 Duo, a few driver updates, and even a forward-patch of BootSplash). These patches increase the stability of 2.6.21 and add stable features (with the exception of BootSplash, which is just eyecandy).

If you are a laptop user who’s having difficulties with the broken/strange ACPI support, or, heaven forbid, just want your old tools to work again, here’s a link to a diff from the STOCK 2.6.21 kernel, with Debian’s patches, my updates, a few Gentoo bugfixes, and even more.

My patches will also enable you to build fglrx, ipw3945, and many other things which you might find you have lost with 2.6.21.

Getting phone numbers from Packet8's DTA-310

28 May 2007

Since I’ve moved, my roommates have often destroyed the telephone to the point that I don’t want to search for it.

Let’s be honest – who wants to pound on someone else’s door, and perhaps, just perhaps peer through dirty undies to find the phone?

Not me.

This is a good reason to find out who might be calling BEFORE you go searching for it.

Enter the DTA-310. It’s Packet8’s answer to VoIP. It is outdated, and no longer available, but it offers a decent connection, at not too much bandwidth.

As you’ll see above, I opted to parse the ‘latest ten’ call log of my machine.

The DTA-310 returns its data entirely in JavaScript, for no specific reason, so I cheated by making a small parser which dumps the data into a temporary array so I could format it as I wished.

If you want the awful code, feel free to contact me for it; this “throwaway” is usable, but not worth documenting.

IP-To-Country conversion; RAM isn't always best!

24 May 2007

I’ve recently had an odd request: given an Apache log, could I write a tool to query a database and find what country a given IP is in?

The first rule is that this client did not wish to use the GeoIP database provided by MaxMind, despite the fact that their API already has this functionality… for free.

The second was, it had to generate two files; one with the IPs for the country we’re looking for, the second, the unwanted IPs.

The third, it had to be fairly fast, and written in PHP. It also had to have both a Web interface and a pretty command line. (Note the “spinnies” around the percentage completed.)

Of course.

Enter Webhosting.Info and their IP-To-Country database. Ok, fine. It’s not MaxMind’s software, but it’s CSV. No problem.

Looking at the file, we have 5 fields:
Starting IP, Ending IP, Country Code (2 Characters), Country Code (3 Characters), and the Country Name.

Let’s import this into an SQL database:

mysql>create database `ip2country`;
mysql>use `ip2country`;

Now, let’s create our table.

mysql>create table ip2c (`ip_start` int(4) unsigned \
  NOT NULL default '0', `ip_end` int(4) unsigned \
  NOT NULL default '0', `country_code2` char(2) \
  NOT NULL default '', `country_code3` char(3) \
  NOT NULL default '', `country_name` varchar(50) \
  NOT NULL default '', PRIMARY KEY (`ip_start`,`ip_end`));

Now that we have our table set up, let’s cheat and use the local file support inherent in MySQL (unless disabled):

mysql>LOAD DATA LOCAL INFILE 'ip-to-country.csv' \
INTO TABLE `ip2c` FIELDS TERMINATED BY ',' \
ENCLOSED BY '\"' LINES TERMINATED BY '\r\n';
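For a quick look at the same five fields outside of MySQL, here is a minimal Python sketch. The `IpRange` tuple, the `parse_rows` helper, and the two sample rows are all made up for illustration; only the field order and the quoted, comma-separated, CRLF-terminated layout come from the file description above.

```python
import csv
import io
from typing import NamedTuple


class IpRange(NamedTuple):
    # Hypothetical container mirroring the five CSV fields.
    ip_start: int
    ip_end: int
    code2: str
    code3: str
    name: str


def parse_rows(handle):
    """Yield one IpRange per well-formed five-field CSV row."""
    for row in csv.reader(handle):
        if len(row) != 5:
            continue  # skip blank or malformed lines
        start, end, code2, code3, name = row
        yield IpRange(int(start), int(end), code2, code3, name)


# Invented sample data, NOT taken from the real database.
SAMPLE = (
    '"1","10","AA","AAA","EXAMPLELAND"\r\n'
    '"11","20","BB","BBB","SAMPLESTAN"\r\n'
)

rows = list(parse_rows(io.StringIO(SAMPLE, newline="")))
for r in rows:
    print(r.ip_start, r.ip_end, r.code2, r.name)
```

With the real file you would pass `open('ip-to-country.csv', newline='')` instead of the `StringIO` object.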

Still with me? Let’s test it.

mysql> SELECT `country_code2`, `country_name` \
  FROM `ip2c` WHERE (`country_code2` != "" AND \
  `country_name` != "") GROUP BY `country_code2` \
   ORDER BY `country_name` ASC LIMIT 5;
+---------------+----------------+
| country_code2 | country_name   |
+---------------+----------------+
| AF            | AFGHANISTAN    |
| AL            | ALBANIA        |
| DZ            | ALGERIA        |
| AS            | AMERICAN SAMOA |
| AD            | ANDORRA        |
+---------------+----------------+
5 rows in set (0.12 sec)
mysql>

Good. Everything seems to be in there.

Now, then. Here’s the tricky part. This database stores its IP addresses in long numerical format.

That’s not too bad. So, we need to test our IPs in the long format. You can either do it in PHP, or, you can do as I did, and have MySQL do the grunt work with INET_ATON().
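If you are curious what the “long format” actually is, the conversion is just the four octets packed into one 32-bit number. A minimal Python sketch of the same idea follows; `ip_to_long` is a name I made up, and this is an illustration of what INET_ATON() computes for a plain dotted quad, not the code from the delivered tool.

```python
import ipaddress


def ip_to_long(dotted: str) -> int:
    """Pack a dotted-quad IPv4 string into its 32-bit integer form."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


# Cross-check against the standard library's own conversion.
address = "216.239.51.104"
assert ip_to_long(address) == int(ipaddress.IPv4Address(address))
print(ip_to_long(address))
```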

The only ‘gotcha’ with this approach is that each row holds a range of IPs, from a beginning to an end. So, you need to test if your IP is between, or equal to, them. Here’s sample MySQL code:

mysql>SELECT `country_name` FROM `ip2c` WHERE \
  `ip_start` <= inet_aton('216.239.51.104') AND `ip_end` \
   >= inet_aton('216.239.51.104');
...
UNITED STATES

The whole reasoning for this is that IP-To-Country attempts to discern when IPs are tunneled or sent to other countries. It’s a bit of a nuisance. It’s essentially asking “Ok, is this phone number between 555-0000 and 555-9999?”

It’s also a bit obnoxious that every IP address causes a MySQL call. I’ve thought about loading the array for our search into RAM, but herein lies the problem:

mysql> select NULL from ip2c WHERE \
  `country_code3` = "USA";
...
14557 rows in set (0.05 sec)

Yep. That’s right. There’s nearly fifteen THOUSAND sets for the US. This would mean that even in RAM, you’d have to load your full list, and test whether your numeric representation falls between the two numbers of each entry. This is still fairly easy, but can get messy in PHP.

The problem with this approach is that, for n hosts, you end up with O(n*14557) comparisons when no host matches any range. You can always break out of the loop, of course, if you find a match, but at what cost?

The SQL version is always O(n) for the loop, at the cost of an SQL query for each IP address.

You’d say, “Well, gee, RAM is faster, right?” Wrong.

Upon delivering the product, I was asked if there was a way to reduce the load (I admit, it sucks to hammer an SQL server, even one that’s dedicated for this purpose). I stated that I felt that due to the constraints of what I had to work with, it would be unlikely to be faster (I suggested moving the project to C, or even a more numerically-friendly scripting language, but was denied).

So, I noted once more that if this client wished to test for the United States, that alone would mean nearly 15,000 range comparisons for every IP lookup. I stated that a lookup would be faster when a match was found early, but would never reach the speed of an SQL server.

Still, they persisted. So, I wrote a companion that loaded the whole table they wanted to search for into an array (from MySQL, but with only ONE DB call), creating a multilevel array in RAM for each beginning and end IP, all numerically indexed (does this sound familiar?).

Then, much to my chagrin, I had it loop through this dynamic array for each IP, testing every case. If a test was true, it set the Boolean to true and broke out of the loop; otherwise, it completed the tests and the Boolean stayed false. After all, there were thousands of sets that it might be in, and I had to explicitly test each range.
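The shape of that loop, as a minimal Python sketch rather than the original PHP: `matches_any_range` and the tiny `ranges` list are hypothetical stand-ins, where the real array held the thousands of start/end pairs pulled from MySQL.

```python
def matches_any_range(ip_long, ranges):
    """Linear scan: True if ip_long lies inside any inclusive (start, end) pair."""
    found = False
    for start, end in ranges:
        if start <= ip_long <= end:
            found = True
            break  # stop at the first hit
    return found


# Made-up ranges purely for illustration.
ranges = [(10, 20), (100, 200), (1000, 2000)]

print(matches_any_range(150, ranges))  # inside the second range
print(matches_any_range(50, ranges))   # falls in a gap, so every pair gets tested
```

An IP that matches nothing is the worst case: every single pair is compared before the function can give up.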

This version only used the SQL server for its single query: “Gimme the sets of IPs I need to search for.” This made the client quite happy.

Then, I started both versions of the product in “Debug” mode.

They watched as the MySQL version happily chugged away with each IP, taking mere milliseconds per IP call. However, the RAM version had to loop through and try to fit the number between the two in its discovered IP set. At best, it took roughly 1 full second for every IP. Still not too bad, but I gave it a somewhat likely test case scenario of nearly 250,000 IPs.

The MySQL version took 9 seconds. We all opted to give up on the in-RAM processing version after two and a half hours. It was at nearly 60,000 IPs, which is roughly 25%. Given that the data was the same for the remainder of this test, it would have taken about ten hours in total.
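For anyone who wants to check the extrapolation, here it is as a few lines of Python. The inputs are the rounded figures quoted above (60,000 of 250,000 IPs after 2.5 hours, 9 seconds for MySQL); the variable names are mine, and the result is only as good as the assumption that the rate stayed constant.

```python
processed = 60_000      # IPs finished when we gave up (rounded)
total = 250_000         # IPs in the test case (rounded)
elapsed_hours = 2.5     # time spent on the in-RAM version
mysql_seconds = 9       # full run of the MySQL version

fraction_done = processed / total
projected_hours = elapsed_hours / fraction_done
slowdown = (projected_hours * 3600) / mysql_seconds

print(f"done: {fraction_done:.0%}")
print(f"projected total: {projected_hours:.1f} hours")
print(f"roughly {slowdown:.0f}x slower than the MySQL run")
```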

A lack of updates, but here, have some music!

21 May 2007

I’ve long ago turned my ‘nerd blog’ into a personal blog, and once more back into a ‘nerd blog’, noting many of the trials and tribulations of being a System Administrator / Programmer. This is hardly amusing, and nobody cares but voyeuristic nerds.

So, have some blog-quality content: A link to My Pandora Stations. Be forewarned, everything that doesn’t explicitly say the genre is usually a meld of electronica, since that doesn’t disturb me much when in a coding daze:

First of all, is my QuickMix (which is currently very electronica biased), and following, each station with (at least a slightly) different genre:

Bombatino · Telefroworld · Nevermindless · Unterwhirrl · Rezzounding · Tripanned · Downsrythmed · GrooveSash · Masquephoria · Bassempo · Melodightness · Newavities · Technocraptic · Gloobersnot · Jazzy Tichersize · Creamy Peanut Butter · Punkupine · Neon Babylon · Acoustic Vocalizations · Tempting Titches · Eventussinova · Limp ’n’ Flighty · Laid way back · Industrial Sprockets · Pleasant Cheezeball · Alternative Rock Blocks · Unsucky Rock · Bubble, Spike, & Glitz · Glitz & Cheese, and finally: Rock with your collar popped.

Shawn's Scraps

Because every Internet Presence deserves a web page.