Friday, March 22, 2013

Step 2. Visualization

How to show 15 million numbers?

Previously: Step 1. Digitizing
Now I have my binary file chrY.fa.tcag.n-t.hex.
Some way of presenting this whole bunch of numbers would be very useful.
Do you have any ideas? I did not.

So I tried the following: take 2 consecutive bytes and pretend that they are the X and Y coordinates of a point on a plane.
This way I get a 256 x 256 square, with both coordinates running from 0 to FF in hex.
I took the first 64K hex numbers from the file and put a dot at the position defined by each pair of coordinates. The more dots are drawn at the same coordinates, the darker that spot looks.
64K hex numbers represent about a quarter of a million nucleotides.
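
For readers who want to try this themselves, here is a minimal sketch of the idea in Python. It is not the program used for the pictures in this post. The function names `byte_pair_grid` and `to_pgm` are made up for illustration, and the pairing is an assumption: by default the bytes are taken as non-overlapping pairs (0,1), (2,3), and so on, while `step=1` gives a sliding window instead. The linear mapping from hit count to darkness is also just one possible choice.

```python
def byte_pair_grid(data: bytes, step: int = 2) -> list[list[int]]:
    """Count how often each (x, y) byte pair occurs in data.

    step=2 pairs the bytes as (0,1), (2,3), ... (an assumption);
    step=1 uses a sliding window (0,1), (1,2), ... instead.
    """
    grid = [[0] * 256 for _ in range(256)]  # grid[y][x] holds the hit count
    for i in range(0, len(data) - 1, step):
        x, y = data[i], data[i + 1]
        grid[y][x] += 1
    return grid


def to_pgm(grid: list[list[int]]) -> bytes:
    """Render the counts as a binary PGM image: more hits give a darker pixel."""
    peak = max(max(row) for row in grid) or 1  # avoid dividing by zero
    pixels = bytes(255 - (255 * count) // peak for row in grid for count in row)
    return b"P5\n256 256\n255\n" + pixels
```

A possible use: read the first 64K bytes of the file with `open("chrY.fa.tcag.n-t.hex", "rb").read(64 * 1024)`, pass them to `byte_pair_grid`, and write the result of `to_pgm` to a `.pgm` file that most image viewers can open.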

Frankly, I expected to see some randomly scattered dots.


Here was the result:

OK. This was kind of expected. I clicked a button in my program a few more times, each time adding another 64K numbers.
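
Adding data in 64K portions could look roughly like the sketch below. Again, this is only an illustration under assumptions, not my actual program: `accumulate` is a hypothetical helper, the bytes are assumed to be taken as non-overlapping pairs, and the chunk size is kept even so that pairs never straddle two chunks.

```python
import io

CHUNK = 64 * 1024  # one "button click" worth of data


def accumulate(stream: io.BufferedIOBase, grid: list[list[int]],
               chunk_size: int = CHUNK) -> int:
    """Read one more chunk from stream and add its byte pairs to grid.

    Returns the number of bytes read; 0 means the stream is exhausted.
    """
    chunk = stream.read(chunk_size)
    for i in range(0, len(chunk) - 1, 2):
        x, y = chunk[i], chunk[i + 1]
        grid[y][x] += 1  # same cell hit again means a darker dot later
    return len(chunk)
```

Starting from an all-zero 256 x 256 grid and a file opened in binary mode, each call plays the role of one click: the counts from the new 64K are added on top of everything drawn so far.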

256K: is there some pattern there, or is this just my imagination?

I started to add more and more data.
704K:
1344K:

1984K:

4416K:

Wow! This is incredible! This is the human matrix.
Have I discovered something amazing?
It even feels scary.
I stopped. I need to catch my breath. I need to somehow comprehend the result.
Human X-chromosome

First I decided to compare the picture with other binary files. Maybe there is something I do not know about my newly created way of presenting a file. Maybe they all look the same.

Let's take, say, a zip file.
Looks pretty random to me. I'd better try an exe or a dll. They are programs too, after all.
Looks cool, but very far from the human matrix.

I decided to try different species. 
Is this matrix uniquely human or what?

First I found a mouse.
No, we are not much different from a mouse.

What about, say, a lizard?
Wow! The lizard has the same pattern too.
I tried cow, dog, lancelet (a primitive fish-like creature), blowfish, wild boar, and orangutan.
They all have the same pattern!

I found a different pattern in worms:
which resembles the pattern of the mosquito:
and the honey bee:

I'm hooked. I want to try everything.

Bacteria. E. coli:


Influenza:
It looks closer to the human one than to the bee or E. coli.

Let me try plants.
Glycine (soybean):

I believe that you've got the idea. I have.

To be continued...
