November 19, 2012

Hide Amazon's list price to prevent impulsive shopping

Would you prefer

to this?


Note the missing list price and savings. If you want the only factor in your buying decisions to be the actual price of the item, install this stylish extension.

Edit:

Use in conjunction with an extension like Priceblink to know the true value of an item, by comparing with multiple vendors.

January 14, 2012

Adblock issues in Chrome

Apparently, the following 'Lazy Background Pages' flag in chrome://flags (or about:flags) is incompatible with Adblock Plus. Enabling it results in Chrome not being able to fetch any web page whatsoever. This had me scratching my head for a while.


Well, now you know.

January 04, 2012

Using visualization to mislead

Here's a recent advertisement in the Times of India, touting the Times Now television channel's viewership:
The difference between Times Now and CNN IBN is barely four percentage points, and yet the three-dimensional pie exaggerates the difference. In fact, at this angle, CNN IBN would have a smaller projected surface area even if its share of the viewership was larger than Times Now's.

Here's the full advertisement, which appeared on page 9 of the Mumbai edition of The Times of India on 27th of December. It should be accessible from here, with a few clicks.
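The geometry behind that claim can be sketched with a rough calculation. The numbers below (a 45% slice at the front, a 55% slice at the back, a 60 degree tilt, a pie thickness of 0.4 radii) are assumptions for illustration, not measurements taken from the advertisement. Tilting shrinks the top face of every slice by the same factor, so the distortion in this model comes from the side wall, which is only visible on the slice facing the viewer.

```python
import math

def projected_area(start, end, tilt_deg, radius=1.0, height=0.4, steps=20000):
    """Approximate on-screen area of one slice of a tilted 3D pie.

    Angles are in radians on the disc. The viewer faces the rim between
    pi and 2*pi, so only that part of the side wall is visible.
    tilt_deg is the tilt away from a straight top-down view.
    All default values are illustrative assumptions.
    """
    tilt = math.radians(tilt_deg)
    # Top face: every slice shrinks by the same factor cos(tilt).
    top = 0.5 * radius ** 2 * (end - start) * math.cos(tilt)
    # Side wall: midpoint sum over the part of the rim facing the viewer.
    d = (end - start) / steps
    rim = sum(max(0.0, -math.sin(start + (k + 0.5) * d)) for k in range(steps)) * d
    wall = radius * height * math.sin(tilt) * rim
    return top + wall

# Hypothetical two-slice pie: the smaller slice is centred on the front.
front_share = 0.45
half_width = math.pi * front_share
front = projected_area(1.5 * math.pi - half_width, 1.5 * math.pi + half_width, tilt_deg=60)
back = projected_area(1.5 * math.pi + half_width, 3.5 * math.pi - half_width, tilt_deg=60)
print(front > back)  # True
```

Under these assumed numbers the 45% slice at the front covers more of the picture than the 55% slice at the back, purely because of the visible side wall.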

September 11, 2011

Authenticating people over telephone

Never seen this being done before.

I contacted a Bank of America representative over the telephone, and they needed to authenticate me before we could get down to business. I was asked my name, and was then asked to confirm a series of questions. I was not asked to state these facts myself - they stated it themselves, and asked me to confirm it.

Silly way to verify a person's identity, one would think. Anyone could pass off as me if they had my name and my account number, and if they simply confirmed every detail that the CSR gave them. However, the CSR did make one minor mistake while stating the details - my phone number was off by one digit, and I corrected them promptly. Looking back, it is quite clear that the mistake was deliberate. In this way, I did not have to recite my personal details out loud (which would have been terrible in public), and I pretty much authenticated myself by correcting one random mistake that they chose to make.
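The idea can be sketched as a small challenge-and-response scheme. This is only a guess at the logic, not the bank's actual procedure: the field names, the sample record, and the rule of changing exactly one digit are all assumptions made for the example.

```python
import random

def alter_one_digit(value, rng):
    """Return value with exactly one digit replaced by a different digit."""
    positions = [i for i, ch in enumerate(value) if ch.isdigit()]
    i = rng.choice(positions)
    replacement = rng.choice([d for d in "0123456789" if d != value[i]])
    return value[:i] + replacement + value[i + 1:]

def make_challenge(record, rng):
    """Build the details the representative reads out, with one planted error."""
    planted = rng.choice(sorted(record))
    stated = dict(record)
    stated[planted] = alter_one_digit(record[planted], rng)
    return stated, planted

def verify(record, planted, objection):
    """objection is None (caller confirmed everything) or (field, corrected value)."""
    return objection == (planted, record[planted])

def genuine_caller(record, stated):
    """Someone who knows the real details objects to the first mismatch."""
    for field in sorted(stated):
        if stated[field] != record[field]:
            return field, record[field]
    return None

# Hypothetical customer record; every field must contain at least one digit.
record = {"phone": "555-0100", "zip": "10001", "account_last4": "4321"}
stated, planted = make_challenge(record, random.Random(7))

print(verify(record, planted, genuine_caller(record, stated)))  # True
print(verify(record, planted, None))                            # False: blind confirmation fails
```

An impostor who simply says "yes" to everything is rejected, while the real customer only ever speaks the one detail that was misstated.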

March 20, 2011

A distributed pipeline for processing text

Usually, Hadoop is the way to go.

However, I have joined a project that has been underway for more than a year, and the processes have been written in a mostly ad-hoc way - shell, Python, and standalone Java programs. Converting each of these to mappers and reducers would have been an arduous task.

I decided to re-write the pipeline in SCons. There are many things about this pipeline that resemble a conventional build. There are dependencies, and newer functionality/processing is usually added to the later stages of the pipeline. Luckily, SCons takes in regular Python functions as "Builders", which I hooked into XML-RPC functions, and we soon had SCons running the pipeline on multiple servers (just five, actually - that's all we'd get for our pipeline). The file-system is an NFS share, which simplifies things a great deal.
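Roughly, the arrangement looks like the sketch below. It is not the project's code: the `process` step, the round-robin choice of worker, and the local test server are assumptions made to keep the example self-contained. The one SCons-specific detail is that a Python function used as a builder action takes `(target, source, env)` and returns 0 or None on success. Because the file system is shared, only paths travel over XML-RPC.

```python
import os
import tempfile
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def process(source_path, target_path):
    """Hypothetical pipeline step, executed on the worker: upper-case a file."""
    with open(source_path, encoding="utf-8") as src:
        text = src.read()
    with open(target_path, "w", encoding="utf-8") as dst:
        dst.write(text.upper())
    return True

def start_worker():
    """Start an XML-RPC worker on a free local port; return (server, url)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(process)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    return server, f"http://{host}:{port}/"

def make_remote_builder(worker_urls):
    """Return a function with the (target, source, env) shape of an SCons action."""
    state = {"next": 0}

    def build(target, source, env):
        # Pick workers in turn (an assumed scheduling policy).
        url = worker_urls[state["next"] % len(worker_urls)]
        state["next"] += 1
        # Only paths are sent; both sides see the same shared file system.
        ok = ServerProxy(url).process(str(source[0]), str(target[0]))
        return 0 if ok else 1

    return build

def demo():
    """Run one build step through a local worker; return (status, output text)."""
    server, url = start_worker()
    build = make_remote_builder([url])
    try:
        with tempfile.TemporaryDirectory() as tmp:
            source = os.path.join(tmp, "in.txt")
            target = os.path.join(tmp, "out.txt")
            with open(source, "w", encoding="utf-8") as f:
                f.write("hello corpus")
            status = build([target], [source], env={})
            with open(target, encoding="utf-8") as f:
                return status, f.read()
    finally:
        server.shutdown()
        server.server_close()
```

In a real SConstruct the `build` function would be wrapped in a Builder and attached to the environment; here `demo()` calls it directly with plain path strings in place of SCons nodes.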

Python, however, has been a bit on the slower side. Also, invoking the Java VM every time you need to process a file feels like too much overhead. So while the pipeline is functional, and processes the corpus much faster than before (5-6 hours vs 20+ earlier), we are considering re-writing the XML-RPC server in Java. The standalone programs can be easily ported to the server implementation, and invoking shell scripts from Java shouldn't be very different from invoking them from Python - things should only improve. I wonder, however, if I should have written this in Hadoop to start with.

December 20, 2010

Correlation != Causation

(from PeteSearch)
My dog Thor hates getting wet, but even when there's rain lashing against the windows he still starts off dancing in circles when it's time for his walk. It's only when I pull out his yellow rain jacket that he slumps and stares at me mournfully. He seems convinced that if I just left the jacket off, the rain would go away.

Much as I try and convince him of the error in his logic, he's unmoved, and it's hard to blame him. Humans will happily swallow studies that use the weasel word 'link' to claim something that is associated with an outcome is its cause … I half-expect to get up one morning and discover that Thor's eaten the raincoat, in the hope of bringing back the sun …

December 19, 2010

Random funny

(from http://neil.fraser.name/news/2007/04/07/ ... the carving there is great too!)
Early Friday morning I walked by a garbage can outside one of the Google cafés. A bird which had been feeding on scraps saw me coming, exploded out of the garbage can and flew off. A passing coworker looked at me incredulously and said "Someone threw out a perfectly good bird."

October 11, 2010

The other side

A bit too much of soy sauce, but not too shabby.

This is my first attempt at making noodles, the Indo-Chinese way. I put a bit too much of soy sauce, but otherwise, it was nice. In fact, it was good enough for Hao to prefer it over the cucumbers that he had cooked! I wasn't that great with the chopsticks though, and used them as long as the volume of noodles was large enough for me to aim at.

September 09, 2010

Finally ...

... I had someone from China mistake me for a Chinese person. This, after years of being told that I look like a Chinese person while in India. Inevitable, I guess.

September 07, 2010

Cabbage and rice tastes great!

Or maybe it is the hunger speaking. I didn't have breakfast - skipped directly to lunch.

This was a day after I visited NYC. Thanks to Shrikanth, I have some photos from my university now.


Have a look. NYC photographs will follow soon.

Edit: Let's try a slideshow -



July 16, 2010

The weaker sex?

From http://esr.ibiblio.org/?p=2118:
... women, in general, are not willing to eat the kind of shit that men will swallow to work in this field. Now let’s talk about death marches, mandatory uncompensated overtime, the beeper on the belt, and having no life. Men accept these conditions because they’re easily hooked into a monomaniacal, warrior-ethic way of thinking in which achievement of the mission is everything. Women, not so much. Much sooner than a man would, a woman will ask: "Why, exactly, am I putting up with this?"

... if we really want to fix the problem of too few women in computing, we need to ask some much harder questions about how the field treats everyone in it.

More like the wiser one.

June 11, 2010

Using unicode to annotate emails

I have been using unicode characters like ★, ✘, ✔, to annotate my gmail labels. I used it today to mark the subject of one of my mails as a high-priority mail. Wonder if it'll work.

June 07, 2009

Using javascript to avoid the mouse and page scrolling

Here's the problem. It is not very easy to scroll a document when you're inside an input element. Arrow keys don't work, and Page Up/Page Down jump in big increments. What if you want to see just a few lines below the current element? Our clients hate to scroll. And they hate having to use the mouse. This just brings the two together.

FScroll is a jQuery plug-in which makes a page scroll to the currently focussed element, keeping its position centered with respect to the document. This helps keep a bit of "context" around the currently focussed element - since it is centered, you can see a few elements both above and below it.

Here are the sources. And here's a page explaining its usage in some detail. And oh, it does nested centering too. But it requires that the 'nesting' container have a CSS styling of position: relative (in the demo page, the div enclosing the table is positioned relative). This was not strictly necessary, but it made the coding a bit easier. If you can't live with the styling restriction, let me know. I'll try to do what I can.

You may report issues here.
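The arithmetic behind centering an element is short enough to write down. The sketch below is not FScroll's code (the plug-in itself is JavaScript); it is an assumed model in which all sizes are pixels and the element's top is measured from the top of the scrolled content.

```python
def centered_scroll_top(element_top, element_height, viewport_height, content_height):
    """Scroll offset that puts the element's midpoint at the viewport's midpoint.

    The result is clamped so the view never scrolls past either end of the content.
    """
    wanted = element_top + element_height / 2 - viewport_height / 2
    max_scroll = max(0, content_height - viewport_height)
    return min(max(0, wanted), max_scroll)

# A 20px-high input at 1000px, in a 600px viewport over 3000px of content.
print(centered_scroll_top(1000, 20, 600, 3000))  # 710.0
```

Elements near the top or bottom of the content cannot be centered, which is why the clamping step is needed.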

June 02, 2009

Importing git history into a new svn repository

So the management has finally approved your project, and has asked you to start working on it. Heh ... little do they know that you'd already been working on it, and have a nice prototype working, and it's all saved in your local git repository. But your company is not as cool as you are - it has its own svn repository, and now you have to import your code into it, history and all.

Here is the git tree, as you have developed it:

Original repository structure

.. and your svn repository looks similar to this -
$ svn co <svn repo url>
Checked out revision 0.

$ cd <svnrepo>

$ mkdir tags trunk branches

$ svn add *
A branches
A tags
A trunk

$ svn commit -m "initial directory structure"
Adding branches
Adding tags
Adding trunk

Committed revision 1.

Now you could copy all the files from the git repository into trunk, and commit it. But that is really not the way it should be. For one - no one will know the reason for *anything* in this repository before the big bang. Also, there might have been legitimate reasons for people to branch out from some earlier state of the code, but now no one will even know.

Fortunately, a mail on the kerneltrap archives tells us how we can export a git repository, along with all its history, into an svn repository.
(from http://kerneltrap.org/mailarchive/git/2008/10/26/3815034)

From: Björn <B.Steinbrink@...>

...
...

This should do and uses a graft to simplify the process a bit:

Initialize git-svn:
git svn init -s --prefix=svn/ https://svn/svn/SANDBOX/warren/test2

The --prefix gives you remote tracking branches like "svn/trunk" which
is nice because you don't get ambiguous names if you call your local
branch just "trunk" then. And -s is a shortcut for the standard
trunk/tags/branches layout.

Fetch the initial stuff from svn:
git svn fetch

Now look up the hash of your root commit (should show a single commit):
git rev-list --parents master | grep '^.\{40\}$'

Then get the hash of the empty trunk commit:
git rev-parse svn/trunk

Create the graft:
echo <root-commit-hash> <svn-trunk-commit-hash>  >> .git/info/grafts

Now, "gitk" should show svn/trunk as the first commit on which your
master branch is based.

Make the graft permanent:
git filter-branch -- ^svn/trunk --all

Drop the graft:
rm .git/info/grafts

gitk should still show svn/trunk in the ancestry of master

Linearize your history on top of trunk:
git svn rebase

And now git svn dcommit -n should tell you that it is going to commit
to trunk.

If you check your svn repository log, it will look like this.
SVN log

All the history, nice and linearised for svn.

Keep in mind though, that this method is lossy. All the branches have been linearised, and you can no longer "check them out" in the original git repository. Apart from that, things work just fine, and you can continue to commit in your local git repository, and push to svn as and when needed.
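For anyone puzzled by the grep in the quoted mail: `git rev-list --parents` prints each commit hash followed by the hashes of its parents, so a line holding only a single 40-character hash is a commit with no parents, i.e. a root commit. The sketch below does the same filtering on captured output and formats the graft line. The helper names and the sample hashes are made up for illustration.

```python
def root_commits(rev_list_output):
    """Return the hashes that list no parents.

    Expects the text printed by `git rev-list --parents <branch>`:
    one commit per line, followed by its parent hashes.
    """
    roots = []
    for line in rev_list_output.splitlines():
        fields = line.split()
        if len(fields) == 1:
            roots.append(fields[0])
    return roots

def graft_line(root_hash, new_parent_hash):
    """Format one line for .git/info/grafts: a commit followed by its new parent."""
    return f"{root_hash} {new_parent_hash}\n"

# Made-up hashes standing in for real output.
sample = "\n".join([
    "c" * 40 + " " + "b" * 40,
    "b" * 40 + " " + "a" * 40,
    "a" * 40,
])
print(root_commits(sample) == ["a" * 40])  # True
```

If more than one hash comes back, the repository has several root commits and the single-graft recipe above does not apply as written.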

May 28, 2009

Sights and sounds, in and around Vasai

This is what population growth, and the resulting competition, can do.

Tough job market



~

English is a foreign language to most Indians, and yet it seems to be preferred for dispensing information. It's not uncommon to see a gaffe every now and then.

Compushop @ Vasai

May 26, 2009

Do you see me?

Updating my profile picture isn't an easy task anymore. There are many sites where I own a profile, and I have to look just right on each one of them.
Silhouette

As of now, these are the places you can see my shiny new profile picture:

 
Phew. I hope that covers them all.

P.S. Last.fm and Twitter aren't playing nice right now. I guess I have to keep trying.

May 17, 2009

Tuning LINQ performance with Mr. P and Mr. S

I thought I'd take a second look at the Mr. P and Mr. S problem, which I'd posted more than a couple of years ago. The last time I tried it, I wasn't successful. I had a strategy to solve it, but somehow I just couldn't translate it into code.

I've been programming a lot with C# lately, and decided to use LINQ to solve the puzzle. Although not very concise, compared to the Python and Haskell solutions out there, it does print out the right answer. After you've tried to solve it yourself, you can have a look at my solution here.

There's something special about LINQ queries. LINQ query expressions are deferred, which means that they aren't executed until their results are enumerated. Also, they are re-executed every time they are enumerated, so they always reflect the current state of their source. Say we have a list of numbers, and a query on it like so:


var numbers = new List<int>();
var query =
    from i in numbers
    select i;


The query hasn't been executed yet. We add a few numbers to the list, and compare the counts of the list and the query.


numbers.Add(0);
numbers.Add(1);
numbers.Add(2);

// 3 elements in list, 3 in the query
Assert.AreEqual(numbers.Count, query.Count());


The test passes. LINQ queries are "live", very much like functions. Usually, this is a good thing, as no operation is performed until it is actually needed. However, there are exceptions. For example, I used these three ranges -


public static IEnumerable<int> OddRange(int stop) // returns odd numbers up to "stop"
{
    for (int i = 1; i < stop; i += 2) yield return i;
}

public static IEnumerable<int> EvenRange(int stop) // returns even numbers up to "stop"
{
    for (int i = 2; i < stop; i += 2) yield return i;
}

public static IEnumerable<int> Range(int stop) // returns all numbers up to "stop"
{
    for (int i = 0; i < stop; ++i) yield return i;
}


To define the Deferred() and Immediate() functions below:


public void Deferred()
{
    var all = Range(limit);
    var even = from e in EvenRange(limit) where all.Contains(e) select e;
    var odd = from o in OddRange(limit) where !even.Contains(o) select o;

    var query = from q in odd select q;

    foreach (var i in query) { var j = i + 1; }
}

public void Immediate()
{
    var all = Range(limit);
    var even = (from e in EvenRange(limit) where all.Contains(e) select e).ToArray();
    var odd = (from o in OddRange(limit) where !even.Contains(o) select o).ToArray();

    var query = (from q in odd select q).ToArray();

    foreach (var i in query) { var j = i + 1; }
}


all, even and odd are three sub queries, each using the previous one. The Immediate() function differs from Deferred() only in its forced execution of the subqueries with ToArray(). However, Immediate() performs much better than Deferred(). I knew that LINQ operators are really just sugar for method calls, and that iterator blocks are expanded by the compiler into a lot of code. But Deferred() was waaaayy slower than Immediate(), and the time taken grew steeply with the value of limit. This couldn't be just some extra code.

I posted a query on stackoverflow, and it did not disappoint. It is quite obvious in hindsight. This statement -

var odd = (from o in OddRange(limit) where !even.Contains(o) select o).ToArray();


in deferred mode, turns out to be pretty expensive indeed. It contains a call to even.Contains(o). While in the immediate mode this is an O(n) operation, in deferred mode, the sequence of calls looks like this -


odd --> even -+-> EvenRange()
              |
              +-> all --> Range()


What was a simple O(n) lookup now re-runs the nested queries on every call, which makes the deferred query as a whole O(n³). We can do better than O(n), however, by using a HashSet.

var evenSet = new HashSet<int>(even);
var odd = from o in OddRange(limit)
          where !evenSet.Contains(o) select o; // Contains() is now O(1)


It doesn't get much better than this.
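The same effect can be reproduced outside C#. The Python sketch below is an analogy, not a translation of the LINQ internals: the `Query` class is a made-up stand-in for a deferred query (it re-runs its generator every time it is iterated), and the counter simply records how many items the source ranges produce.

```python
counter = {"yields": 0}

def counted_range(start, stop, step):
    """Like range(), but counts every item it produces."""
    for i in range(start, stop, step):
        counter["yields"] += 1
        yield i

class Query:
    """Re-runs its factory on every iteration, like a deferred query.

    It defines no __contains__, so `x in query` iterates it from the start.
    """
    def __init__(self, factory):
        self._factory = factory

    def __iter__(self):
        return iter(self._factory())

def deferred(limit):
    all_numbers = Query(lambda: counted_range(0, limit, 1))
    even = Query(lambda: (e for e in counted_range(2, limit, 2) if e in all_numbers))
    odd = Query(lambda: (o for o in counted_range(1, limit, 2) if o not in even))
    return list(odd)

def immediate(limit):
    # Each stage is materialised once, like calling ToArray().
    all_numbers = list(counted_range(0, limit, 1))
    even = [e for e in counted_range(2, limit, 2) if e in all_numbers]
    return [o for o in counted_range(1, limit, 2) if o not in even]

def immediate_with_set(limit):
    all_numbers = list(counted_range(0, limit, 1))
    even = {e for e in counted_range(2, limit, 2) if e in all_numbers}
    return [o for o in counted_range(1, limit, 2) if o not in even]  # set lookup

def measure(fn, limit):
    """Return (result, number of items the source ranges produced)."""
    counter["yields"] = 0
    result = fn(limit)
    return result, counter["yields"]

for fn in (deferred, immediate, immediate_with_set):
    result, produced = measure(fn, 20)
    print(fn.__name__, produced)
```

All three functions return the same odd numbers, but the deferred version makes the source ranges produce far more items, because every membership test restarts the queries underneath it.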

May 10, 2009

Microblogging on identi.ca

If my journal template hasn't changed since this post, you should see a µBlog roll on the sidebar. If you've clicked on any of the links, you'd know that those notices (or 'dents') come from identi.ca.

identi.ca is a website very similar to twitter, only better. It's built with the open source laconi.ca project, and has tags and groups too. The killer feature for me is IM support, along with a decent command list. All you have to do is add their bot on google talk, and you can send/receive messages in real-time.

The commands currently supported by the IM bot are:

on - turn on notifications
off - turn off notifications
help - show this help
follow <nickname> - subscribe to user
leave <nickname> - unsubscribe from user
d <nickname> <text> - direct message to user
get <nickname> - get last notice from user
whois <nickname> - get profile info on user
fav <nickname> - add user's last notice as a 'fave'
stats - get your stats
stop - same as 'off'
quit - same as 'off'
sub <nickname> - same as 'follow'
unsub <nickname> - same as 'leave'
last <nickname> - same as 'get'
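To show how the aliases collapse onto the base commands, here is a small sketch of a parser built only from the list above. It says nothing about how the real bot is implemented; in particular, returning None for anything that is not a well-formed command is an assumption of this example.

```python
# Alias table taken from the "same as" entries in the command list.
ALIASES = {"stop": "off", "quit": "off", "sub": "follow", "unsub": "leave", "last": "get"}
NO_ARGUMENT = {"on", "off", "help", "stats"}
NICKNAME_ONLY = {"follow", "leave", "get", "whois", "fav"}

def parse_command(message):
    """Return (command, nickname, text), or None if message is not a listed command."""
    parts = message.strip().split(None, 2)
    if not parts:
        return None
    word = parts[0].lower()
    command = ALIASES.get(word, word)
    if command in NO_ARGUMENT and len(parts) == 1:
        return command, None, None
    if command in NICKNAME_ONLY and len(parts) == 2:
        return command, parts[1], None
    if command == "d" and len(parts) == 3:
        return command, parts[1], parts[2]
    return None

print(parse_command("unsub alice"))        # ('leave', 'alice', None)
print(parse_command("d bob hello there"))  # ('d', 'bob', 'hello there')
```

The nicknames `alice` and `bob` are placeholders.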


identi.ca also supports forwarding dents to twitter, so you won't completely alienate your fans on twitter. However, identi.ca doesn't pull tweets, so you won't see any @replies from twitter on identi.ca. At least until you can convince your friends to move from twitter.

identi.ca belongs to a larger ecosystem of OpenMicroBlogging software, which has adopted a common standard so that messages may be shared between services. If you use software that supports OMB, you won't alienate someone just because they happen to like something different (in contrast, the twitter community belongs only on twitter).

Another popular µBlogging site is jaiku, which will support OMB, and go open source soon. If identi.ca is not your cup of tea, or if you happen to like everything Google, jaiku may be for you.