While watching Oscars, I always hear of new and old films that I may wanna watch. Looks like here's the list for now: Finding Neverland, The Aviator, Sideways, Closer, Hotel Rwanda, Street Car Named Desire, Motorcycle Diaries, I Heart Huckabees, Two Brothers.
Watching the Oscars now. Loving it :) Let's see who wins the best Actor...
Been sick all weekend. Don't feel like doing anything. Went to a local doctor and got some meds. Gonna go rest now.
Had a pretty good day today. Lotsa minor software troubles at work. Left early to get my car serviced for the first time. Even got a full carwash. Then got a haircut, picked up a bouquet of flowers and a nice bottle of wine to go see Jessica. We went to Macaroni Grill for dinner and then later chilled for a while in her apartment playing with her kitty Trixie. Had a really good time. But I'm tired 'n sleepy now. Going to bed.
w00t! Walt Disney returned me my money :)
Fingerprint Matching 101Sun, 13th Feb '05, 12:30 pm::
I spent yet another weekend pouring over computer algorithms. This time, it's fingerprint extraction and identification. The whole topic came up from my discussion at work. I foolishly boasted that I could design a system that will let any of the 10-15 people use any of the 10-15 computers to log on to their user account by pressing any of their 10 fingers against a little Fingerprint Scanner. So in theory, if my boss's computer was busy, he could walk up to my computer, press his finger against the fingerprint scanner and it will automatically log him into the new business software that I'm gonna make.
So of course, now that I've told everyone it's possible, it's time to figure out how. The lazy computer programmer in me wants to spend a little money, buy something like the VeriFinger Standard SDK, hook it up to one of the cheap fingerprint scanners, let it do the scanning and recognition, write a small application to manage it all, and call it a day. It won't be cheap but it'll work pretty realiably. A lot of companies around the world have done it. The mathematician in me wants to do it all myself. It's not just the thrill of writing it on my own, it's the additional features that I can add on to it. So, if you are given the task of writing a software to do all of this from scratch, how would you go about it?
It's a known fact that every individual has absolutely unique fingerprints on each of their ten (or so) digits. If you want to design something that can let anyone use any finger to log on from any computer, you need (1) software + scanner on every computer, (2) a server that has the database of every person's every finger, and (3) some way of reliably matching a finger with the correct person no matter what computer they are on. So now we can break this whole operation into two phases. The first phase is the setup phase when we add each users's fingerprint to the database. And the second phase is application phase in which we recognize the user when they press their finger against a scanner. In either phase, a fingerprint has to be read, converted to a form that can be understood by a computer, and transferred to the server. Only difference is that in the first phase it is stored on the server and in the second, it is matched against every existing fingerprint on the server.
What we have are two different operations: Extraction and Identification/Recognition/Matching. Extraction is the process by which a fingerprint is read from the scanner and specific unique qualities about the fingerprint are extracted from the image. Identification/Recognition/Matching is simply looking up the server for other fingerprints with similar unique qualities. So how and what do we extract from the scan of a fingerprint? Look at your own index finger right now or if you don't have any fingers or fingerprints, look at this image. The first thing we notice and actually don't even realize is that the lines are actually ridges and valleys. The ridges are the thick bright protruding highs and the valleys are the thin dark low-lying crevasses. It is this pattern of ridges/valleys that is different for each person's each finger.
The first thought that comes to mind is that, if this pattern is different for everyone, just store a picture of their fingerprint and match it up against the database. Storing a picture to a database is easy. But matching up a picture of a fingerprint against a database is not. How do you match? Based on what? One idea is to just overlay the scanned fingerprint on to each of the 1000 fingerprints in the database and compare each pixel - if 95% of the pixels match, we have a match. That is how the older fingerprint matching systems worked. It works decently in identifying criminals, especially if you can wait 2 hours for it to match with 1000 fingerprints. But it's not fast enough for instant identification and poses a lot of problems, like what happens if your finger is positioned slightly to the left and/or at a 5% degree angle. It wouldn't even match against your own finger with 5% margin of error. So people have moved from the picture (raster) matching techniques on to the marker (vector) matching.
Instead of matching the whole fingerprint against 1000 others, why not extract unique characteristics of each finger and store them. This is called Minutiae Matching. Look at your finger again. If you notice carefully, you can identify many types of markers where ridges end, ridges bifurcate into two ridges, three ridges form a delta etc. If we can somehow chart this information like a graph or a map, then we can store this in a database much more easily. The information stored in the database, if read in English would seem something like this - "At the center of the fingerprint is a 'delta' and 5mm away on the right is a 'bifurcation.' 7mm below the bifurcation is an 'island' and 3mm to left of the island is another bifurcation." If this information is stored in the database for my right index finger, then when I press my right index finger against a scanner, the software asks the server to match my finger against all others who have a delta near a bifurcation and must have an island. This instantly narrows down the search only to those fingers which have deltas, islands, and bifurcations. Then it looks to see if they are positioned similar to my finger.
So now the extraction problem is just to find where the ridges end, bifurcate, or form deltas and map them on a graph. It's like saying plot Singapore, Mexico City, and Cape Town on a map. Not very difficult when you look at the big picture. The algorithm to extract markers looks at every pixel and it's neighborhood pixels. If they satisfy some special characteristics then it assigns it a marker type (delta, island etc.) and stores it in the database, relative to the position of other markers on the same finger.
Now comes the hard part - matching a finger's markers against that of the 10,000 in the database. If you think about it, it's actually an age-old problem that the ancient Greeks like Ptolemy busied themselves with - finding constellations. You must've heard of the constellation Ursa Major (The Big Dipper) or my favorite Ursa Minor (The Little Dipper). Astronomers and astrologers for centuries have stood under the night sky and identified tons of constellations simply by looking up and observing. They didn't need no fancy computers or telescopes to find The Libra in the night sky. We humans have built-in pattern-matching and pattern-recognition abilities that seem so natural to us but it is near impossible to replicate these on a computer.
Given a night sky full of stars, how do you find a constellation you are looking for? You can start by looking for small groups within the constellation. Maybe two of the stars in the constellation you are looking for, are really close to each other. So scan the whole sky for two stars very close to each other. If you find such a pair of stars then look for further signs - like is there a third star directly above or below one of them but at twice the distance. Stuff like this is what we humans are really REALLY good at. You don't even realize that you are performing one of the most difficult patterm-recognition operations right now - reading text from a computer screen. After all, OCR is big business. So is recognizing sounds (especially voice), handwriting, images, and videos - things that we so easily discern and detect.
Anyways, back to recognizing fingerprints. After the extraction process, the software will have to form a constellation with all the markings on the fingerprint. In the setup phase, this is stored on the server and in the matching phase this is what is searched for in the server. Searching a constellation of markers within each of the 1000 fingerprints in the database can be done in the following way. One of the things we need to realize is that due to the randomness of the physical act of positioning a finger on a scanner, you will almost never get two exact readings. However, the marker data in the middle of the finger will be much more accurately readable than the markings on the fringes or towards the sides of the finger. So give more importance to a bifurcation in the middle of the finger than an island at the far left - after all, it could be just a normal fully-connected ridge but the person might not have pressed the finger fully on the scanner.
My extraction method would be to start towards the middle, spiral out in a clockwise direction and note the position of every marker. Note the distance between each marker and nearby ones and store it in a cyclic data format. Now search the database for only those fingers which have similar markings in the center as there is a very high probability that center readings are accurate. Then narrow down the list to only those with similar markings near the immediate area surrounding the center. Keep narrowing down the search till you have at most 5-10 fingerprints. Then just cycle through each of them and compare each of them to the reading. Leave some margin for error, take into account the rotation and position of the markers and we should have a pretty damn reliable match. If you don't account for rotation or slight movement in position, you will almost always get an incorrect reading.
Note that throughout this discussion, we have concentrated mostly on the 1:N and not the 1:1 matching. 1:N matching means that one fingerprint is compared against N (10 or 10,000 or 10 million) fingerprints to identify the person. This is mainly used for easy identification, say to let people into a Government building. 1:1 matching is used for secure authentication - that is to verify that a person really is who they say they are, say to allow you access to your own safety deposit box in your bank. The 1:N method is geared towards faster searching and the 1:1 method is geared towards more reliable matching. It is quite difficult to design an algorithm that performs equally well in both situations for you can either do it fast or do it accurately, rarely both.
Anyways, I'm still writing algorithm for the pre-extraction phase right now. Before you extract markers, you gotta convert the true-color ridges and valleys to two color lines and gaps. Using a very simple algorithm, this is what I've come up with so far. In the next few days, I should get my own fingerprint reader and then I'll improve upon this code and do more cool things with it :) If I'm successful, then maybe I'll open-source the code for scanner and recognition and make it easy for others to use it in their applications. While I'm almost positive I won't be able to make it as good as these guys, if I make it sufficiently workable, it'll certainly make them review their pricing. These people have been doing this for years and have received national awards so I don't think I'm gonna be much of a competition (neither do I care to be). After all, I've only known about fingerprint techniques for less than 24 hours now :) But it seems like the whole scientific world has been at it for ages.
I still dunno what/how I'm gonna be setting this up for my work but all I know is that I really want to. Good thing is that I can add this feature to my work software anytime so even if it takes me months that's perfectly fine. Let's see what happens first once I have a fingerprint scanner sitting on my desk.
For people living in US, Google Maps is absolutely the best thing I've ever seen. I don't think I'm gonna use any other site ever again (unless of course, the directions on GMaps turn out to be wrong). But the interface... soooo kickass!
God I miss the bunker :( You can take the boy outta the bunker but you can't take the bunker outta the boy... I dunno. I have a nice 1000 sq.ft fully-carpetted apartment and I still yearn for a 8x10 sq. ft. room underground in Jersey. It was seriously soooo cosy!
Solving World's Greatest Mathematical ProblemsSun, 6th Feb '05, 3:05 pm::
So I spent this entire weekend trying to discover the world's first 30-digit Keith Number with no luck. It's kinda like Fibonacci Numbers. Basically, take any number and write out it's digits. Say we pick the number "123" and then write out 1,2,3. Now add all the digits up and put the sum at the end of the list. 1+2+3 = 6, so now the new list is 1,2,3,6. Now drop the first number and add the rest and put it at the end of the list again. Drop "1" and add 2+3+6 = 11. So now we have 2,3,6,11. Of course, you can keep doing this till infinity. Now the fun part happens when you pick a number like 197. Let's see: 1,9,7. Then 1+9+7 = 17. So we get 1,9,7,17. Dropping "1" and adding the rest we get 9+7+17 = 33. Drop 9 and adding 7+17+33, we get 57. Then next we get 17+33+57 = 107. Next we get 33+57+107 = 197 -> The same number we started with!
So if we get the same number as we started with, then it is a Keith Number. There are only two 3-digit Keith Numbers - 197 and 742. There are six 2-digit KNs - 14, 19, 28, 47, 61, 75. Try it for 47 -> 4,7,11,18,29,47! But it doesn't work for any other numbers. 48 -> 4,8,12,20,32,52... No "48" in this list. A few 4-digit KNs are: 1104, 1537, 2208, 2580 etc.
Keith Numbers are very rare and only about 90 of them are known. The largest known Keith Number is 988242310393860390066911414 (27-digits long). We don't even know if there are infinitely many Keith Numbers (though it seems so). There is no award or anything for finding new large Keith Numbers but there is a huge prize (and worldwide glory) for finding out a simple method for generating all Keith Numbers easily. Basically, the problem of finding Keith Numbers can be likened to solving the Knapsack Problem. Of course, if you solve the Knapsack Problem with a simple method, then you've just discovered a method for solving the P=NP problem which has a $1million award from Clay Mathematics Institute.
While I don't understand the whole P=NP problem as clearly as I want, I do understand it enough to be amazed by it. The biggest mystery in computer science right now is whether P is equal to NP or not. So what is P and NP? First of all, Computer Science (my primary major in college) is not about how to make PowerPoint presentations or write financial database software for the stock exchange. In the simplest terms, Computer Science is very much like pure Mathematics - studying problems and solving them using simple equations. Of course, if it was just solving math problems, then it would be called Math and not CompSci.
Computer Science normally deals with a different type of problems, that which involve a lot of calculation and especially estimating the time it takes to calculate. E.g., if someone gives you 10 different business cards, how do you sort them alphabetically by a person's name? Well one simple way is to put all the cards on a table and go through the list and pick out the ones that begin with A, then B, then C till you are done. You basically went through a list of 10 cards, 26 times (A to Z - 26 letters). Of course, you'd be a lot less than that, since everytime you pick a card, there's one less left in the pile. Anyways, this is a very simple sorting algorithm. Computer Science deals with sorting algorithms a lot.
Another thing is searching. If someone gives you 10 business cards and says find "Chirag Mehta." Well, you'll have to go through 10 cards at most to find mine. If it's a one-time thing then it's fine but if you often have to find a business card from a bunch of business cards, you're much better off alphabetizing them once, so that next time someone asks for "Steve Buscemi", you can directly jump to the "S" pile instead of going through all the cards. Computer Science also deals with searching and inventing better methods to find stuff from a bunch of stuff. This is why people love Google so much - it makes easy to find stuff, using their own special search methods. You just type in "flying pigs" and it looks through billions upon billions of documents to find pages which have the phrase "flying pigs" in it and returns those pages within milliseconds. Amazing isn't it!
So what does all of this have to do with P=NP problem and the million-dollar award? Well, after decades of research, computer scientists have found out that there are some problems like searching, sorting etc. whose answers can be solved within a fixed amount of time - seconds, hours, days, centuries whatever. Basically, if someone tells you to find "Chirag Mehta" from 1000 business cards, and you can read 10 a minute, it'll take you 100 minutes at most to find the card. If it was 1000 billion cards, it would take you 100 billion minutes. So the problem of searching has a fixed execution time and it can be very accurately estimated. Same goes with sorting.
However, there are tons of other problems out there, where we can't even estimate how long it's going to take to solve the problem, let alone solve the problem itself. For example, let's look at the Travelling Salesman Problem. Suppose you have to go on a business trip to 3 cities and you can go from any city to any city, what is the cheapest round-trip route that visits each city once and returns to your home city? In other words, if you live in Mumbai, and have to go to London, Beijing, and Los Angeles, what is the cheapest round-trip route that will visit each city once and end up at Mumbai? Of course, you'd need a list of all the airfares, say Mumbai to London (or back) = $100, Mumbai to Beijing = $50, Mumbai to Los Angeles = $200, London to Beijing = $50, London to Los Angeles = $100, Beijing to Los Angeles = $150. Let's assume the reverse fare is same in all the cases, so London to Mumbai equals Mumbai to London equals $100. Currently there is no way accurately find the cheapest route without going through all the combinations (Mumbai-London-Beijing-LA-Mumbai, Mumbai-Beijing-LA-London-Mumbai, Mumbai-LA-London-Beijing-Mumbai etc.)
So you think what's the big deal about having a simple computer go through all the 20-25 combinations and finding the best? Actually no big deal, until you increase the cities from 5 to 50. Or 50 to 5000. Then even the largest supercomputers in the world combined would take centuries to find the solution. Basically, there exists no way to predetermine how long it will take to solve the travelling salesman problem for x cities. We can't say if x = 5, then it'll take 5 seconds and if x = 10, it'll take 10 seconds. We can't even say if x = 5, then it'll take 25 seconds and if x = 10, it will take 100 seconds. Truth is, we don't know. Of course, if you are British Airways, you do want to know, otherwise how else are you going to schedule the flight route of 500 airplanes around 200 airports around the world? Even though the perfect solution doesn't exist, computers can do a pretty good job of going through a billion or so combinations and finding out the best routes. Yes, they cannot go through all the possible combinations, but in practical cases, going through a few million combinations results is pretty good, say as opposed to the 1950s when all the flight pattern/routing was done manually! Without the special flight-routing computers, the whole airline industry would be in chaos because nobody would be able to schedule flights between cities where there is most demand, most people, highest airfare etc.
While the airlines are using computers to plan flight routes, they are doing it as best as they possibly can, not absolutely the best. It's like me going out and asking 100 girls on a date and deciding to marry the one I think is the best. This doesn't mean she is "the one" because I did not ask the other 3 billion girls. It just means she's the best of the ones I asked. Similarly, airlines plan their flights out of the 10 billion combinations instead of the 10000000 trillion possible options. Still 10 billion is better than 100,000. And yet, it is not as good as 10000000 trillion. So there is absolutely a LOT of room for improvement.
Another similar problem called the Bin-Packing Problem, which is closely related to the Cookie-Cutter Problem. If you have 10 items and you need to pack them into 3 bins, such that no bin weighs more than 100lbs, what is the best way to do it? Of course, trial-and-error might work with 10 items and 3 bins, but what if you are the world's largest steel manufacturing company and want to make sure that you can ship the most amount of steel bars of different sizes into trucks while never putting more than 10 tons per truck and of course not overflowing any truck's container. Or if you are my mom and are making cookies. You make a big rectangle piece of dough and then use a cutter to cut fancy cookie shapes. How do you make sure you use as much of the dough as possible and not waste any? Or if you are the world's biggest cookie company and have the same problem. How do you solve it?
Turns out, there is no exact "optimal" way to solve this problem in a fixed or predictable amount of time. In other words, there is no polynomial time method to solve any of these problems. And now comes the beauty of computer science -> all of these problems, from finding Keith Numbers to Knapsack Problem, to Travelling Salesman Problem to Cookie-Cutter Problem can all be solved if even one of them is solved! Oh and the game of Minesweeper also falls into this list. So if you can find a perfect way to beat Minesweeper each time, guess what... you've just solved the world's greatest computer science problem and would most likely get the $1mllion award.
Well I've still not explained what P=NP really means. P means that the problem is easy to solve. Easy doesn't mean you can do it in your head. Easy means it is possible to do it in a fixed time, like searching, sorting etc. NP means that a solution to a problem is easy to verify. That is, if someone gives you a Keith Number, you can easily check in a specific amount of time whether it is a Keith Number or not. Or whether the flight-route from Mumbai-London-Los-Angeles-Beijing-Mumbai covers all the 3 cities and starts/ends at Mumbai. So what we don't know, is whether problems that are easy to verify can be solved easily or not. That is whether NP = P. Currently there is no known method for solving any of the problems I stated above in a fixed/predetermined amount of time.
Most scientists believe that P is not equal to NP and a lot of them believe that given the current state of knowledge in computer science and mathematics, we can't really answer this question. We need to learn a lot more and look at this P=NP problem from the different angle in order to solve it. Maybe that's where computers that work using Quantum Mechanics could help. I don't know much about Quantum Computing so I guess that I can take care of next weekend :)
Hope you enjoyed my little computer science introduction course. Oh and according to my calculations, looking for a 30-digit Keith Number involves going through about 10^30 (10 followed by 30 zeroes) numbers. Currently, I can verify over a hundred 30-digit numbers per second, about 10 million a day. That's 10^7. So it will take me 10^30/10^7 = 10^23 days to find all the 30-digit Keith Numbers. Suppose a million people join me in finding the 30-digit KNs, and 1 million computers run my software, it'll still take 10^23/10^7 = 10^16 days. Which is about twice the age of the Universe! Of course, if we involve all the 1 billion computers on this planet, and instead of doing 10 million a day, find a way to speed it up to 100 billion a day on each computer, then it'll take 10^30/10^11/10^9 = 10^10 days, which is still approximately 27 million years.
So how the hell am I gonna find a 30-digit Keith Number (if it even exists)? By pure luck :) My program just randomly generates 30 digit numbers and tests to see if they are KN or not. It has gone through about 8 million 30-digit numbers already. Of course that's like looking into a bucket of seawater from the Florida coast order to find the pearl necklace you threw into the Indian Ocean a million years ago. There's a VERY high probability that you won't find anything, but hey... who knows, you just might!
Oh and also I think there's more chance of me winning the World's Largest Powerball Lottery than finding a 30-digit Keith Number, even though I haven't even purchased a lottery ticket! But I guess it's good for my computer since it's getting some good exercise now :)
So my kitties figured out that poking a hole into my airbed can be a lot of fun. It was kinda cute though. I think Giga popped a hole and then both of them crowded around it as trying to figure out how to fix it. Of course, I tried to patch the hole but my bed deflated anyway after a while. I certainly won't buy a new airbed but I think I do need to get a bedroom furniture set soon. Maybe next month.
Gotta buy my tickets to India soon. I plan to go around 8th April and return around 20-21st April.
Walt Disney Travel Company Sucks Monkey AssWed, 2nd Feb '05, 7:15 pm::
After my friends Art 'n Michele left Florida in mid January, I thought my problems with Disney were over. Turns out, I couldn't be more wrong. I said it once and I would like to repeat this again: Walt Disney Travel Company Sucks Monkey Ass! They have absolutely the WORST customer service that I've experienced in US ever. Everytime I've talked to them, they have been completely arrogant, extremely uncooperative, and overall a bunch of jerks.
Basically, they billed me $61 more than they should have and are telling me to wait 2 months to get my money back! The only way I can get my money back sooner is if I return the original documentation they sent me. Of course, once I do that, I no longer have any proof that I should be getting my $61. Other than trying to sue them, the only sane thing I can do is just wait for 2 months. THIS is what happens when a company grows too big - they can afford to treat their customers like crap and actually get away with it.
I didn't go into the details of the whole nightmare before, but basically, I booked my Disney tickets + hotel via them around Jan 5-7th. It was gonna cost me $520-something. An absolutely horrible woman from the company called me a few days later to tell me that the hotel I had booked and paid for was overbooked and they were cancelling my reservation. After going through a LOT of convincing that I really needed a hotel, she finally moved my reservation to a hotel outside of Disney, that turned out to be awful. But fine, I didn't care. I'm gonna go have fun with my friends.
The lady told me my tickets will be at Disney when I get there. When we got to Magic Kingdom in Disney, of course my tickets weren't there. They were at the hotel where we were gonna sleep at night! Took us about 45 minutes of convincing the guys at the Disney entrance to let us in. They called the hotel, got some confirmation numbers, and finally gave us our tickets. Once inside Disney, we had a great time.
But of course, the hotel was run by a bunch of total retards. When we got to our hotel, they did not have our tickets to Disney for the next day. The room was stinky, the beds were uncomfortable, but whatever, I was with friends and even though it was waaaay overpriced, I didn't care. I was told that they will locate my Disney tickets by morning. I'm sure you can already guess that by morning, the tickets were still missing. I went to the hotel reservation desk and turns out, the genius lady who gave the ticket confirmation number to Disney the day before, decided to TEAR MY NEXT DAY'S TICKETS! According to her, now that the package was opened, it had to be destroyed by their policy. But she assured us to no extent, that our tickets could now be instantly issued at Disney. I didn't trust her at all, but well we had no choice.
After arriving at Epcot Center on Jan 16th (it was FREEZING cold that day), it came to me as no surprise that my tickets weren't there at Disney and they could not seem to re-issue them. I was not in their database. Well at least the lady at the counter from Casablanca, Morocco was very nice and helpful. So I think me and my friends waited for over an hour to get our tickets and finally get in. Well, so we finally got in and pretty much enjoyed the rest of the day.
My friends came here all the way from New Jersey and the whole point of the vacation was that we wanted to spend some time together and have fun. During the course of the whole mayhem, I did not whine or freak out about the absolute lack of service. I swear anyone in my position would have created a major scene. But once again, I was with friends and we were there to enjoy and not fight.
So as I stated above, after my friends left, I thought, phew, the nightmare with Disneyi s over, until an hour ago when I got my credit card statement by email. Turns out, Walt Disney Travel Company (which is a part of Disney) charged me $61 more than they should. They charged me for the original hotel instead of the hell hole they put us in. Now, they are not going to return me my own money for a whole month more.
I could technically dispute the charge on my credit card and make it harder on their part or I can just wait for a month or two till they return my money. I somehow think I have a lot better things to do than screw with them anymore. I'm sure if I had nothing to do all day, I'd fight with them and teach them a lesson but now I'm just so sick of them, I don't wanna deal with them anymore. I've already forgotten about my money so if they ever pay me back, it'll be like I found money in one of my old coat pockets. Though $61 can't even pay for my weekly grocery bills these days. Ha!
So the meeting in Dallas, Texas went well. I can't give a lot of details as everything is confidential, but I can say that for my first business meeting ever, it was pretty decent. The weather sucked and so did the highways and the return flight, but eh it's all over now. I'm just glad the meeting went well and we did a good job as a team.