Sunday, December 28, 2014

rDNS to Burn Ho Ho Ho

Every year, like clockwork, someone makes sure to do something clever like this. Dont forget to set your max ttl to a little over 100 so you get ALL the lyrics. You don't want to miss out on a Christmas miracle.

$ traceroute -m 120 xmas.futile.net
traceroute to xmas.futile.net (77.75.106.106), 120 hops max, 60 byte packets
1 ec2-50-112-0-86.us-west-2.compute.amazonaws.com (50.112.0.86) 1.700 ms ec2-50-112-0-82.us-west-2.compute.amazonaws.com (50.112.0.82) 2.160 ms 2.121 ms
2 100.64.1.143 (100.64.1.143) 1.287 ms 100.64.1.157 (100.64.1.157) 1.850 ms 100.64.1.179 (100.64.1.179) 1.514 ms
3 100.64.0.114 (100.64.0.114) 1.697 ms 100.64.0.56 (100.64.0.56) 1.431 ms 100.64.0.66 (100.64.0.66) 1.637 ms
4 100.64.16.89 (100.64.16.89) 0.816 ms 100.64.16.29 (100.64.16.29) 0.870 ms 100.64.16.153 (100.64.16.153) 0.929 ms
5 205.251.232.166 (205.251.232.166) 3.181 ms 2.279 ms 54.239.48.184 (54.239.48.184) 1.175 ms
6 205.251.232.154 (205.251.232.154) 1.731 ms 1.315 ms 205.251.232.198 (205.251.232.198) 1.409 ms
7 205.251.232.89 (205.251.232.89) 7.241 ms 205.251.232.91 (205.251.232.91) 8.558 ms 205.251.232.78 (205.251.232.78) 7.016 ms
8 205.251.225.201 (205.251.225.201) 7.385 ms 205.251.226.226 (205.251.226.226) 6.461 ms 205.251.226.192 (205.251.226.192) 5.849 ms
9 ae-8.r04.sttlwa01.us.bb.gin.ntt.net (198.104.202.189) 8.910 ms ae-18.r05.sttlwa01.us.bb.gin.ntt.net (129.250.201.177) 7.586 ms ae-8.r04.sttlwa01.us.bb.gin.ntt.net (198.104.202.189) 8.911 ms
10 ae-7.r21.sttlwa01.us.bb.gin.ntt.net (129.250.5.48) 7.377 ms ae-6.r21.sttlwa01.us.bb.gin.ntt.net (129.250.5.44) 6.058 ms 8.077 ms
11 ae-3.r22.nycmny01.us.bb.gin.ntt.net (129.250.2.50) 73.357 ms 74.689 ms 71.985 ms
12 ae-5.r22.londen03.uk.bb.gin.ntt.net (129.250.3.127) 145.012 ms 153.988 ms 154.003 ms
13 ae-1.r02.londen03.uk.bb.gin.ntt.net (129.250.5.25) 142.310 ms 143.254 ms 144.756 ms
14 xe-0-0-3.edge00.the.uk.hso-group.net (62.73.169.34) 155.629 ms 153.431 ms 170.395 ms
15 xe-3-3.core00.the.uk.hso-group.net (93.89.91.13) 154.304 ms 145.935 ms 155.607 ms
16 xe-4-4.core00.thw.uk.hso-group.net (77.75.108.135) 161.489 ms 170.428 ms 168.645 ms
17 xe-8-4.core00.thn.uk.hso-group.net (77.75.108.137) 145.769 ms 151.782 ms 150.956 ms
18 xe-4-4.core00.gs1.uk.hso-group.net (77.75.108.160) 155.492 ms 154.586 ms 153.756 ms
19 ae0-1203.edge00.sov.uk.hso-group.net (46.17.60.117) 162.573 ms 145.314 ms 156.443 ms
20 xoxoxoxoxoxo.Ho.Ho.Ho.xoxoxoxoxoxo (93.89.84.75) 152.557 ms 148.137 ms 149.111 ms
21 ooooxooooooxooo.V.ooooooxooooxoooo (82.133.91.37) 148.166 ms 158.433 ms 157.012 ms
22 ooxoooooxooooo.MMM.ooooooooxxoooxo (82.133.91.18) 146.747 ms 150.743 ms 148.490 ms
23 oooxoooooxooo.EEEEE.oooxoooooxoooo (82.133.91.63) 155.556 ms 150.612 ms 149.207 ms
24 ooooxooxooox.RRRRRRR.ooooooxooooox (82.133.91.56) 150.052 ms 147.187 ms 147.717 ms
25 oxooooooxoo.RRRRRRRRR.oooxooooooxo (82.133.91.55) 146.850 ms 157.273 ms 163.567 ms
26 xoooxooooo.YYYYYYYYYYY.oooxooooxoo (82.133.91.58) 146.490 ms 155.792 ms 155.358 ms
27 ooxoooooxooooo.CCC.ooooooooxoooxoo (82.133.91.96) 147.694 ms 146.370 ms 146.917 ms
28 oooooxooo.HHHHHHHHHHHHH.oxoooxoooo (82.133.91.23) 156.264 ms 157.224 ms 147.976 ms
29 ooxooxoo.RRRRRRRRRRRRRRR.ooxoooxoo (82.133.91.49) 149.529 ms 157.479 ms 146.768 ms
30 oxoooxo.IIIIIIIIIIIIIIIII.oooxooxo (82.133.91.60) 147.004 ms 149.610 ms 146.144 ms
31 oooxoo.SSSSSSSSSSSSSSSSSSS.ooxoooo (82.133.91.42) 159.816 ms 147.422 ms 147.647 ms
32 oooxoooxoooooo.TTT.ooooooooooooxoo (82.133.91.61) 146.592 ms 150.407 ms 147.971 ms
33 ooxoo.MMMMMMMMMMMMMMMMMMMMMM.oooxo (82.133.91.34) 158.860 ms 157.308 ms 146.151 ms
34 xxoo.AAAAAAAAAAAAAAAAAAAAAAAA.oxoo (82.133.91.80) 157.832 ms 154.396 ms 147.990 ms
35 oxo.SSSSSSSSSSSSSSSSSSSSSSSSSS.ooo (82.133.91.40) 147.656 ms 157.133 ms 146.486 ms
36 ooxooooooooooo.XXX.oooooooooooooxo (82.133.91.35) 156.158 ms 147.422 ms 148.664 ms
37 oxoooooooooooo.XXX.ooooooooooooxxo (82.133.91.10) 156.621 ms 156.882 ms 149.265 ms
38 Oh.the.weather.outside.is.frightful (82.133.91.41) 156.463 ms 159.244 ms 149.237 ms
39 But.the.fire.is.so.delightful (82.133.91.19) 155.633 ms 155.197 ms 148.495 ms
40 And.since.weve.no.place.to.go (82.133.91.77) 163.198 ms 148.568 ms 157.004 ms
41 Let.It.Snow.Let.It.Snow.Let.It.Snow (82.133.91.43) 145.524 ms 148.442 ms 148.235 ms
42 xXx (82.133.91.24) 154.886 ms 156.168 ms 148.119 ms
43 It.doesnt.show.signs.of.stopping (82.133.91.36) 155.728 ms 149.842 ms 147.509 ms
44 And.Ive.bought.some.corn.for.popping (82.133.91.73) 156.513 ms 155.195 ms 148.890 ms
45 The.lights.are.turned.way.down.low (82.133.91.76) 156.784 ms 156.929 ms 149.023 ms
46 Let.It.Snow.Let.It.Snow.Let.It.Snow (82.133.91.67) 157.132 ms 147.427 ms 146.847 ms
47 xXx (82.133.91.38) 155.047 ms 157.809 ms 158.599 ms
48 When.we.finally.kiss.good.night (82.133.91.62) 148.095 ms 156.498 ms 155.979 ms
49 How.Ill.hate.going.out.in.the.storm (82.133.91.45) 144.783 ms 153.313 ms 154.611 ms
50 But.if.youll.really.hold.me.tight (82.133.91.78) 145.371 ms 147.084 ms 154.588 ms
51 All.the.way.home.Ill.be.warm (82.133.91.17) 148.071 ms 148.590 ms 147.392 ms
52 xXx (82.133.91.70) 151.266 ms 148.426 ms 146.638 ms
53 The.fire.is.slowly.dying (82.133.91.95) 156.077 ms 155.992 ms 157.804 ms
54 And.my.dear.were.still.goodbying (82.133.91.57) 149.232 ms 157.089 ms 155.868 ms
55 But.as.long.as.you.love.me.so (82.133.91.31) 147.339 ms 148.318 ms 156.999 ms
56 Let.It.Snow.Let.It.Snow.Let.It.Snow (82.133.91.53) 153.937 ms 155.810 ms 147.369 ms
57 oOo (82.133.91.94) 162.711 ms 148.860 ms 154.739 ms
58 Ho.Ho.Ho.Are.We.Having.Fun.Yet (82.133.91.64) 148.437 ms 156.270 ms 163.520 ms
59 M.E.R.R.Y.C.H.R.I.S.T.M.A.S (82.133.91.86) 153.629 ms 146.195 ms 154.195 ms
60 oOo (82.133.91.15) 156.081 ms 145.935 ms 155.280 ms
61 Dashing.through.the.snow (82.133.91.14) 146.503 ms 150.094 ms 155.717 ms
62 In.a.one-horse.open.sleigh (82.133.91.83) 156.842 ms 156.586 ms 159.232 ms
63 Over.the.fields.we.go (82.133.91.27) 155.382 ms 146.111 ms 146.592 ms
64 Laughing.all.the.way (82.133.91.71) 156.139 ms 164.653 ms 148.819 ms
65 Bells.on.bobtail.ring (82.133.91.79) 156.060 ms 153.467 ms 147.566 ms
66 Making.spirits.bright (82.133.91.75) 154.545 ms 146.811 ms 164.212 ms
67 What.fun.it.is.to.ride.and.sing (82.133.91.82) 146.802 ms 156.057 ms 157.627 ms
68 A.sleighing.song.tonight (82.133.91.98) 147.468 ms 148.048 ms 148.202 ms
69 oOo (82.133.91.29) 169.488 ms 156.140 ms 146.594 ms
70 Jingle.bells.jingle.bells (82.133.91.91) 149.405 ms 147.309 ms 149.885 ms
71 Jingle.all.the.way (82.133.91.81) 147.528 ms 152.163 ms 162.643 ms
72 Oh.what.fun.it.is.to.ride (82.133.91.21) 148.221 ms 148.672 ms 155.447 ms
73 In.a.one-horse.open.sleigh (82.133.91.30) 145.895 ms 150.218 ms 150.723 ms
74 Jingle.bells.jingle.bells (82.133.91.59) 157.465 ms 157.238 ms 148.593 ms
75 Jingle.all.the.way (82.133.91.32) 149.462 ms 148.965 ms 150.207 ms
76 Oh.what.fun.it.is.to.ride (82.133.91.74) 156.425 ms 153.923 ms 148.276 ms
77 In.a.one-horse.open.sleigh (82.133.91.87) 146.874 ms 156.538 ms 149.652 ms
78 M-E-R-R-Y--C-H-R-I-S-T-M-A-S (82.133.91.72) 145.625 ms 156.200 ms 146.071 ms
79 Have.yourself.a.merry.little.Christmas (82.133.91.90) 155.789 ms 147.145 ms 156.051 ms
80 Let.your.heart.be.light (82.133.91.47) 148.308 ms 153.645 ms 147.440 ms
81 From.now.on (82.133.91.12) 148.154 ms 146.651 ms 149.013 ms
82 our.troubles.will.be.out.of.sight (82.133.91.99) 155.112 ms 154.470 ms 155.005 ms
83 oOo (82.133.91.22) 148.033 ms 150.713 ms 150.175 ms
84 Have.yourself.a.merry.little.Christmas (82.133.91.68) 148.183 ms 163.155 ms 155.142 ms
85 Make.the.Yule-tide.gay (82.133.91.52) 146.391 ms 144.659 ms 147.140 ms
86 From.now.on (82.133.91.66) 145.505 ms 146.407 ms 145.916 ms
87 our.troubles.will.be.miles.away (82.133.91.54) 148.138 ms 146.840 ms 153.899 ms
88 oOo (82.133.91.93) 162.987 ms 147.289 ms 156.463 ms
89 Here.we.are.as.in.olden.days (82.133.91.25) 156.433 ms 147.995 ms 157.982 ms
90 Happy.golden.days.of.yore (82.133.91.89) 155.275 ms 149.958 ms 155.338 ms
91 Faithful.friends.who.are.dear.to.us (82.133.91.46) 153.740 ms 146.164 ms 147.598 ms
92 Gather.near.to.us.once.more (82.133.91.69) 156.085 ms 154.181 ms 146.960 ms
93 oOo (82.133.91.85) 154.229 ms 145.904 ms 154.842 ms
94 Through.the.years (82.133.91.39) 159.886 ms 158.502 ms 148.180 ms
95 We.all.will.be.together (82.133.91.33) 147.462 ms 147.589 ms 158.578 ms
96 If.the.Fates.allow (82.133.91.44) 145.665 ms 155.215 ms 149.548 ms
97 Hang.a.shining.star.upon.the.highest.bough (82.133.91.97) 156.918 ms 155.705 ms 163.390 ms
98 And.have.yourself.A.merry.little.Christmas.now (82.133.91.88) 155.173 ms 145.481 ms 153.135 ms
99 o.O (82.133.91.11) 156.867 ms 145.141 ms 156.870 ms
100 48.61.70.70.79.20.48.6f.6c.69.64.61.79.73.20.46.72.65.65.6e.6f.64.65.20.23.63.69.73.63.6f (82.133.91.51) 148.932 ms 155.109 ms 155.636 ms
101 oOoOoOoOoOoOoOoOoOoOoOo.MyssT.oOoOoOoOoOoOoOoOoOoOoOo (77.75.106.106) 155.663 ms 150.728 ms 150.342 ms

Tuesday, December 23, 2014

Apache VirtualHost Proxy Configuration - Helpful for Tomcat, Node.js and similar frameworks

I recently came across this question on ServerFault:
" I've subsonic application running of [sic] tomcat. Everything else works on apache. I don't want to write port number everytime [sic] so I'd like to set-up [sic] a simple directory where subsonic will be accessible. So, I'm trying to make virtualhost file [sic] inside apache dir. [...] I tried many variations, but cannot make anything work. "
The poor chap than [sic - ha!] provided an example of his latest go at the problem, an excerpt from his httpd.conf file:

<VirtualHost *:80>
DocumentRoot /var/www/streamer
ProxyPass / http://mini.local:4040/
ProxyPassReverse / http://mini.local:4040/
</VirtualHost> 
Not sure a bad go of it, all thing considered. Still, it wasn't providing him with the sort of results he was looking for. Naturally I had encountered similar issues not long ago myself, with the implementation of a Ghost blogging software platform, which runs on node.js. Its been my first serious effort with node, other than some occasionally one-off one-page type of scripts for existing web sites.

So, I felt like I might be able to help the gentleman. Now, let's bear in mind his question: "I don't want to write port number everytime". He does not say, "I never want to write port number", just that it should be possible to render the page with a URL that does not append the port number. Ensuring that the port number is never written would require the solution below in addition to another step - there are a few ways to resolve it, but mod_rewrite would be the most obvious / most well known. Some of the other solutions might depend upon what version of Apache is being used, like an <if> directive for example.

In any event, here is what I provided to the questioner, which is quite similar to what I have implemented previously and serves my purposes quite well (notice I am using his somewhat strange hostname nomenclature):

<VirtualHost *:80>
ServerName mini.local
ProxyRequests Off
ProxyPreserveHost On
<Proxy *>
AddDefaultCharset Off
Order deny,allow
Allow from all
</Proxy>
ProxyPass / http://mini.local:4040/
ProxyPassReverse / http://mini.local:4040/
</VirtualHost>

Monday, December 22, 2014

University of Sydney Uses XTF to Index 60's Sci-Fi Comics

I grew up reading pulp science fiction. There was a time when I would never had admitted something like that in public. But times have changed. Computer programming is now a career instead of a bizarre waste of time that might get you arrested. People wear and fiddle with mobile computers; displaying them at coffee shops like peacock plumage. When I was a kid and told adults I liked computers they assured my parents it was a phase I would grow out of.

As bitter as I may be of the past, I was delighted to find that the University of Sydney Library had combined my youthful passion for computers and science fiction comics into one mammoth project of love. They digitized the Frontiers of Science, a comic strip which was a big deal in Australia in the early 60's by way of the Sydney Morning Herald.

But they did more than just scan the damn things. Any unpaid intern can do that. Instead, they relied on the eXtensible Text Framework (XTF) in order to provide contextual search capabilities for the comic strip. XTF is designed to allow you to work with heterogenous media sources, and index them based on disparate types of metadata. It comes in handy, if say you want to build a database of comics that have already been scanned, and say the knucklehead that did the scanning saved some images as BMPs, others as JPGs and the rest as PDFs. All of these images contain usable metadata, and XTF is clever enough to grab it all and re-index it into a consistent format.

So bear this in mind for future projects, and go check out the Frontiers of Science at the University of Sydney Library's web site.

Saturday, December 20, 2014

Symlink Analyzation Function in C

Here is a simple function for performing symlink analysis using C. The readlink function gets the value of the symbolic link filename. The file name that the link points to is copied into buffer. This file name string is not null-terminated; readlink normally returns the number of characters copied. The size argument specifies the maximum number of characters to copy, usually the allocation size of buffer.

There is significant utility for such a function in a variety of contexts. Have fun with it!


h/t Tanguy

Windows 7 and Windows 8 Basics: Searching by File Size, Modification Date and Other File Properties

It was one of these days, not long ago, that I work up one day and realized that I had become an Old Man. Mine is the last generation that remembers a time prior to the internet. I remember using acoustic couplers. My first laptop, a Toshiba, had dual 5 1/2 inch floppy drives, but had no hard drive. I was so excited when I got my hands on that machine. It meant I could connect to networks using my acoustic coupler from a pay phone!

My ruminations on aging is at least somewhat related to the topic at hand. You see, among the memories rattling around my grey hair ensconced head are a few about searching Windows file systems for files of specific types. This sort of thing is very important, even just for every day normal computer usage.

When your computer starts running out of space, wouldn't it be nice to be able to find all of the really large files on that computer? Or perhaps you are looking for an important document you wrote - you can't remember the name of the file but you remember the week that you wrote it. Doing this in Windows XP is straight-forward, because the Windows XP search box (what Microsoft calls the "Search Companion") includes these more advanced functions, and accessing that search box is as simple as clicking the Start button and clicking Search from the resulting contextual menu. Such a search box typically looks similar to this:

Windows XP, Josh Wieder, search, dog
Ruff!
As you can see selecting size and date modification are simple in this format. However, Microsoft, in their infinite wisdom, decided to abandon this simple and straight forward menu, replacing it with a single magnifying glass icon without any options whatsoever:

Windows 7, Josh Wieder, Search bar
Searching mad stupid.
Without the simple and easy to use Search Companion, how are we supposed to look for files based on their properties instead of their name?

The answer, unfortunately for users only accustomed to graphical interfaces, is a series of command line arguments.

Here is a list of such the available search commands for Windows 7 and Windows 8, taken from the relevant Microsoft KB article:

Example search termUse this to find
System.FileName:~<"notes"
Files whose names begin with "notes." The ~< means "begins with."
System.FileName:="quarterly report"
Files named "quarterly report." The = means "matches exactly."
System.FileName:~="pro"
Files whose names contain the word "pro" or the characters pro as part of another word (such as "process" or "procedure"). The ~= means "contains."
System.Kind:<>picture
Files that aren't pictures. The <> means "is not."
System.DateModified:05/25/2010
Files that were modified on that date. You can also type "System.DateModified:2010" to find files changed at any time during that year.
System.Author:~!"herb"
Files whose authors don't have "herb" in their name. The ~! means "doesn't contain."
System.Keywords:"sunset"
Files that are tagged with the word sunset.
System.Size:<1mb
Files that are less than 1 MB in size.
System.Size:>1mb
Files that are more than 1 MB in size.

In addition to these commands, users can also use a series of Boolean command line operators to further refine searches:

OperatorExampleUse this to
AND
tropical AND island
Find files that contain both of the words "tropical" and "island" (even if those words are in different places in the file). In the case of a simple text search, this gives the same results as typing "tropical island."
NOT
tropical NOT island
Find files that contain the word "tropical," but not "island."
OR
tropical OR island
Find files that contain either of the words "tropical" or "island."

Although the commands themselves are non-intuitive, using them is straight-forward. Simply type the appropriate command into the Windows search box, either in the Start menu or in the top-right corner of a File Manager menu. Here is an example, where we have searched for all files larger than 100MB in size in the drive C:\

Windows 7, Josh Wieder, search terms
A search example in Windows 7
There are a variety of circumstances where Windows' search implementation will fail to meet a user's needs. First and foremost, the search function is resource intensive, inaccurate and slow. Compared to Linux's `grep`, `find` and `locate` commands, Windows Search is almost laughably bad, particularly when attempting to search for strings inside of files.

There are other tools available for Windows that vastly improve on the default Windows search function. My recommendation at this time is GrepWin built by Stefan Kiing, available for download at the Google Code site.

GrepWin allows users to search by simple strings, operators and terms like those we described above, providing faster more accurate responses than those available from Windows' default search. In addition to basic search functionality, GrepWin also accepts regular expressions as input. While cryptic, and with a steep initial learning curve, regular expressions are incredibly powerful and a fundamental part of modern computer programming. With regular expressions, you may find specific and complex patterns from large datasets efficiently and quickly. We will almost certainly explore regular expressions in depth with their own post (or perhaps series of posts).

Thats it for now on Windows-based searching. When we return to searching in the future, we will likely spend more time on searching databases, arrays and other data structures as well as providing more theoretical explanations for file system search.

Friday, December 19, 2014

Ghost Blog Platform - Automatic sitemap.xml File Creation

There seems to be some mis-communication making the rounds about the new node.js-based Ghost blogging platform's ability to automatically generate a sitemap.xml.

As of August of this year (2014), automatic sitemap.xml file generation is a core feature, not a plugin. The new documentation on Ghost's development roadmap on Trello makes this clear:

Ghost blog, Josh Wieder, Trello, sitemap, sitemap.xml

But, if your like me and usually check on this sort of thing by browsing through the forums, its easy to get the idea that sitemaps will be handled in some nebulous, future plugin:

Ghost blog, Josh Wieder, Trello, sitemap, sitemap.xml

I'm fairly new to working with node, and my forrays with Ghost are partly an excuse to learn a bit more. So far, I like what I see. Its true Ghost isnt as mature as say, Wordpress, but it also doesnt have the same target painted on its back (ever go through your server logs for bad requests for wp-admin.php?), and a lot of what is not there or not built well I can re-do myself.

Anyone else jumping on the node.js blogger bandwagon, either with Ghost or something else? Whats your experience been like so far?

Wednesday, December 17, 2014

GDB Hash Script in C

Here is a very simple implementation of a GDB hash algorithm as written in C. The script accepts a single contiguous string of any size as input via command line argument. You can easily use a file as input using a pipe or redirect.

For example:

    #./gdb_script < input_file.txt

or

    #./gdb_script thisismystring

Each character in the string is input as a non-contiguous block of integers, for simplicity in reviewing output.

Enjoy!

Tuesday, December 16, 2014

Linked List Creation in C

Below is an example of how to implement a Linked List function using C. Note that the example below does not include a main function or preprocessor imperatives.

Thursday, December 11, 2014

Dynamically Assign Filenames within an Iterative Loop

This question came up on Stack Exchange; Using C, how can you write filenames dynamically using an iterative variable loop? Or in English, you want to write some stuff to a bunch of different files, and have those files named in a logical manner like file1.file, file2.file, file3.file, ... etc.

The answer is pretty simple, but a bit of a mind-bender when you are first getting started. I was also pretty astonished that all of the accepted "solutions" to the problem failed to compile. This is a pretty big issue on these programming solution websites.

I think the reason for the problem is there are two groups of people: There are the people who want other people to think they can fix stuff. Then there are the people who can fix stuff. The two groups, obviously, have some overlap: people who can fix stuff and who want others to know about it. But there is also a statistically significant portion of both groups that do not overlap - people who can fix stuff who don't care whether other people know about it, and people who cant fix stuff, but who want other people to think they can. This last group seems to have a somewhat outsized voice on internet forums. By all means, leave a comment sharing your thoughts on the answer (I will delete it, but go ahead and leave your comment).

So, I contributed my own answer. This answer has the benefit of actually compiling and resolving the problem, which the other "solutions" do not do. Here it is, if you want to spare yourself the click-through:

#include <stdio.h>
#include <string.h>

int main(void)
{
for (int k = 0; k < 51; k++)
{
char title[8];
sprintf(title, "%d.jpg", k);
FILE* img = fopen(title, "a");
char* data = "Write this down";
fwrite (data, 1, strlen(data) , img);
fclose(img);
}
}

Tuesday, December 9, 2014

Pointer Fun With Binky From Stanford Computer Lab

Stanford created a strange, claymation character named Binky to teach students about the concept of Pointers in programming. Watch the video and share in the severe WTF-ness that comes along with it.


Barack Obama on Sorting Algorithms

Watch a pre-Presidency Obama chat with Eric Schmidt of Google about sorting algorithms. Oh, how you've changed, Barry.

The Sounds of Sort

The video below provides a graphical display of a number of different sorting algorithms at work. Whats unique about this video, is that in addition to the graphical display, each operation in the sorting algorithm has also been assigned a tone - in other words, this video lets you hear the algorithms as well as see them! While counter-intuitive, the "song" that the tones play can provide insight into how the algorithm is working; at least for me, it is easier to distinguish small differences using sound than it is to do so using the often frenetic, fast moving images in sorting visualizations. 

Check it out for yourself!

Saturday, December 6, 2014

Asymptotic or "Big O" Notation for Understanding Algorithm Performance

What does it mean for an algorithm to be "fast" or "efficient"? When we say an algorithm is fast, we are not referring to an explicitly notation in time, like seconds or minutes. There is a distinction here because of the distinction between hardware and software - performance of a program in real time can be drastically impacted by the hardware it is running on.

In order to account for this distinction, while still providing an objective measurement for the efficiency of a program, computer scientists rely on Asymptotic Measurement and a notation format called Big O Notation to describe such measurements. The formal description as a formula is as follows:

f(x) ∈ g(x)
xo C
f(x) ≤ C • g(x)
x > xo

In less theoretical terms; as the size of your input increases toward infinity, how does the runtime of your program grow?

For example, imagine counting the number of characters in a string in the most simple way, by reading each character and summing the related values. Such an algorithm is set to run in linear time, compared to the number of characters "n" in the string. This can be notated by saying the runtime is O(n) (or Big O of N).

Using this approach, the time to traverse the entire string is directly proportional to the number of characters (it is linear). Counting the characters in a 20 char string will take twice as long as counting the number of characters in a 10 char string because you must look at all of the characters and each character takes the same amount of time to review.

What if linear time, O(n), is not fast enough for your task? O(n) is insufficient when dealing with large datasets.

Suppose we decide to store the number of characters in the string in a variable "len" pre-determined earlier in your program; before the string was enumerated. In such a scenario, to determine string length you can merely print the variable value. Such an operation - checking a predetermined variable - is referred to as an Asymptotically Constant Operation or O(1) (Big O of 1). Why is this label appropriate? Remember what aymptotical means in this context - how does the runtime of your program grow as the input grows? With this type of operation, the runtime remains unchanged by the size of the input. It is constant, regardless of input. If your string is one char or a million chars, the operation would run in the same time. There is, however, a drawback to this method - you must reserve memory to store the variable, and there is performance overhead to actually storing the variable as well.

There are a variety of common types of Big O notated algorithms. O(n^2) algorithms, for example, can be slower than linear O(n) algorithms. O(n^2) algorithms are not *always* slower, but they become slower as their input grows; and they become slower at an exponential rate. For large datasets, O(n^2) are inappropriate. For smaller datssets, however, they can be appropriate or even run faster than O(n) algorithms.

Another common type is O(log n). The classic "binary search" algorithm for presorted datasets is one such algorithm. Doubling the size of the array increases a binary search time by only one operation. Binary search and other logarithmic algorithms are incredibly powerful when working with large datasets.

There is a pseudo-random performance modifier to such algorithms. With binary search, for example, there is a statistically significant possibility that the desired result of a binary search will be in the absolute center of the array. In cases such as this, the binary search algorithm completes in one operation, no matter what the size of the array is. Because of these probabilistic outcomes, Asymptotic notations also includes expressions for "best case" and "worst case" outcomes, or the upper and lower bounds of run time.

Again using binary search as our example, the best case scenario is as previously mentioned right in the middle of the array to be searched. The best case scenario is always one operation, which also means that the best case scenario is constant, because the result is always achieved in one operation in the best case scenario, regardless of the input size. Best case scenarios are noted using Ω, so the binary search algorithm is said to run with Ω(1).

In the worst case scenario, the binary search is O(log n). Worst case scenarios are expressed simply using the Big O notation already covered.

Using a linear search example, we see that linear algorithms also have Ω(1), because the best case scenario is the item searched for is the first item of the array. The worst case would be the search item being the last item of the array, expressed as O(n).

One last notation we will cover is Θ. Θ is used to express algorithms where the best and worst case scenarios are the same. One example where Θ is appropriate is the variable example we used earlier; where the run time will always remain the same. With the variable example, Ω(1) and O(1), so Θ(1).

Friday, December 5, 2014

Apache Log Pong

Looking for an apache log visualization program recently, I came across logstalgia. It turns your log files into a game of Pong between your web server and the internet, with each new request being the ball going back and forth!

So cool.


Eric Garner's Last Words

Every time you see me, you want to mess with me. I'm tired of it. It stops today. Why would you...? Everyone standing here will tell you I didn't do nothing. I did not sell nothing. Because everytime you see me, you want to harass me. You want to stop me [garbled] Selling cigarettes. I'm minding my business, officer, I'm minding my business. Please just leave me alone. I told you the last time, please just leave me alone. please please, don't touch me. Do not touch me. [garbled] I can't breathe. I can't breathe. I can't breathe. I can't breathe. I can't breathe. I can't breathe. I can't breathe. I can't breathe. I can't breathe.

Tuesday, December 2, 2014

WonkaLeaks

Paid FBI informant, accused sex criminal and convicted embezzler Sigurdur Thordarson, who was paid thousands of dollars by his DOJ handler for 8 hard drives filled with internal Wikileaks documents, bears an uncanny resemblance to Augustus Gloop from Willy Wonka. Coincidence, or something more sinister?

Sigurdur Thordarson, Josh Wieder, Augustus Gloop, Wikileaks, Willy Wonka
Augustus Gloop, formerly of Willy
Wonka's Chocolate Factory
Sigurdur Thordarson, Josh Wieder, Augustus Gloop, Wikileaks, Willy Wonka
Sigurdur Thordarson, formerly of Wikileaks




















Sunday, November 30, 2014

Trust

When UNIX co-progenitor and super-smarty-pants Ken Ritchie was given a Turing Award, he provided a warning to those within ear shot. Admins and developers often find it satisfactory to review the source code of applications to determine maliciousness. And to a certain extent, this works out all right. Over time we have built a series of expectations of where to expect naughty code based on our experience. We have also chosen to trust other types of tools that we use during this process. We discriminate.

But there's no reason that bad stuff *has* to be in the applications that we expect to find it in. Yes, the clever among us know that compilers can be bad. But we check the source of our compilers and find no bad stuff, and so we assume we are safe.

We do, though, compile the compiler, don't we?

Well, alright then some megalomaniac at Intel or somewhere far upstream decided to embed badness in the embedded distro compilation software. We can still look at the binary of compiled programs to determine What Is Really Going On.

We do, though, tend to use applications to help make machine code human readable, though, don't we?

The point to the thought experiment isn't to stop the unbelievable interchange of ideas and applications that has brought us to where we are today in modern computing. It would be impossible to manually read the machine code of all of the applications that we use and still function in the workplace and our other communities as we are expected to.

Rather, we should be aware of the trust that we place in our tools. We should be aware that when we set out to solve or review problems, we take certain things for granted.

The point is not, that we should never trust. The point is that we should make trust decisions willfully and based on reasonable deductions and facts; not impulse or ease of use.

I came across Ritchie's chat today in a CS class, and its still as relevant today as it ever was. You can read the whole thing here.

Saturday, November 29, 2014

Chess, Encryption and Comic Books (Mind MGMT)

Lately, I've been hooked on a brilliant comic book from genius Matt Kindt, called Mind MGMT. In a nutshell, Mind MGMT follows a cold war era intelligence service based on the conceit that Men Who Stare at Goats-style ESP spook tactics work, and have silently and secretly played a role in the machinations of world politics throughout the 20th century. Mind MGMT is really clever, the art is striking and the whole business is worth a read on its own.

Part of the fun of the comic book is that the creators seamlessly weave the sort of subliminal messaging they use in the plot, into the layout of the comic itself. Fake advertisements in the back of issues contain hidden text, while the margins themselves are formatted like Scantron documents with little limericks where the dotted "fold here" lines usually go.

Just today I read through issue 23, which opens with a tale of a man gifted with the fore-mentioned spying super-powers; a reclusive Bobby Fischer type who communicates through the world with messages encoded in the notation of championship chess game layouts. Have a look for yourself (click to enlarge):  
MIND MGMT, Josh Wieder, comic book, chess, encryption
The story fills in the picture a bit, while also providing a series of six chess boards with notation beneath each one. I don't want to spoil the fun of decoding the image for you - what do you think the chess boards spell out? Some things to consider - does each board, or does each notation spell a unique character? Does every board / notation spell the same character every time?

Any way, seeing this inspired me. I don't have much in the way of formal education in cryptography, but even I know that chess boards have been used for cryptography before. What I think would be cool is creating a simple program that would allow you to export a chess board from a computer chess game and use it as part of a cipher for an encryption system. There's even been some more recent publishing being done with chess board cryptosystems (which I have yet to read ... I've got a lot on my plate lately).

Not necessarily the most practical project but IMO a fun distraction / way to sharpen development skills for integrating ciphers into applications.

Friday, November 28, 2014

Programming in C Chapter V - Typecasting

In its simplest sense Typecasting is altering a computer's interpretation of data by implicitly or explicitly changing its data type; for example, by changing an `int` to a `float` and vice verse.

To better understand typecasting, we must start with data types themselves. In programming languages like C, every variable has some kind of `type` that determines how the computer and the user interprets that variable. Each of these data types, for instance `int`, `long long`, `float` and `double` all have their own unique characteristics and are use to handle data types of various ranges and precision.

Typecasting allows us to take a floating point number, like 3.14, and specifying the number before the decimal - 3 - by parsing it to an `int`.

Let's us an example from the English language to better clarify what we mean.

example.

        WIND

Each carefully manipulated line in the example above forms a unique symbol. However, these symbols are immediately identifiable to those fluent in a Romance language as letters. We implicitly understand the data type `letter`.

Even more interesting, reviewing the string of `letter` data type symbols composing the example above, we can see that two very different, specific data types are formed. Each of the two words that are formed has a completely different meaning, connotation, pronunciation and history.

There is the noun wind, as in: "The wind blows outside". Yet there is also the verb wind, as in: "Wind up that spool".

This is a valuable analogy inasmuch as it leads us to understand that how we type the data determines how we use that data. The `noun` data type of WIND ought to be used in very different circumstances than the `verb` data type of WIND.

Setting aside more advanced topics such as Natural Language Processing for a moment, let's take for granted that computers do not care about English grammar. Computer programming languages, such as C, rely on the same idea - taking the same bit of data, and using it very differently based on how we cast the `type` of that data.

Here are most common data types of a 32 bit operating system:

1 byte  : char
4 bytes : int, float
8 bytes : long long, double

Each byte represents 8 bits of memory storage in a 32 bit OS. Thus, an variable of type `int` will use 32 bits of memory when it is stored. As long as that variable remains of type `int`, the processor will always be able to convert that variable back to its' relevant number. However, we could in theory cast those same 32 bits into a series of boolean operators. As a result, the computer would no longer see a number in that address space, but an altogether different sequence of binary characters. We could then try to read that data as a different numeric type, or even as a string of four characters.

When dealing with numbers and type casting, it is vital to understand how the *precision* of your value will be effected. Keep in mind that the precision can stay the same, or you can lose precision - as in our float 3.14 to int 3 example at the very beginning of our discussion. You cannot, however, gain precision. The data to do so simply does not exist in the addressed memory space you would be attempting to pull it from.

Let's review the 3 most common ways you can lose precision.

Casting a float to an int would cause truncation of everything after the decimal point, leaving us with a whole number.

To perform a float to int conversion, we can perform the following simple operation:

example.

        float -> int
float x = 3.7;
(int)x;

In the scenario above, (int)x = 3, because we will have truncated all values after the decimal point.

We can also convert a long long to an int:

long long -> int

As before, this will lead to a loss of higher-order bits. a long long takes up 8 bytes or 64 bits in memory.

Similarly a double can be cast a a float:

double -> float

This will give you the closest possible float to the double without rounding. A double will allow you to store 53 bits or 16 significant digits, while a float has 24 significant bits.

Because floats can only store 24 significant bits, you can only store number up to the value of 2^24 AKA two to the power of twenty-four AKA 16777217.

EXPLICIT VS IMPLICIT CASTING

Explicit casting is when we write the data type in parentheses before the variable name, like so:

(int)x -> explicit casting

Implicit casting is when the compiler automatically changes similar types to a "super-type", or performs some other form of casting without requiring any additional code from the user to perform the operation.

For example when we write the following:

5 + 1.1 -> implicit casting

The values already have types associated with them. 5 is an `int`, while 1.1 is a `float`. In order to add the two of them together, the computer implicitly casts the `int` 5 into a `float`:

(float)5.0 + (float)1.1 -> implicit casting

Implicit casting also allows us to assign variables of different types to each other. We can always assign a less precise type into a more precise one. For instance:

example.

        double x;
int y;

We cannot take from this example that `x=y`, because a `double` has more precision than an `int`. On the other hand, it would also be problematic to say `y=x`, because `y` might have a larger value than `x`, and may not be able to hold all of the information stored in the double.

Type casting is also used in comparison operators such as:

< LESS THAN
> GREATER THAN
== EQUAL TO

example.

        if (5.1 > 5)

The example above will be returned as one, because the compiler will implicitly cast 5 to a float in order to compare the two numbers. The same would be true of this example as well:

if(2.0 == 2)

Also, don't forget that `int's can be cast to `char's or ASCII values. `char's also need to be reduced to binary, which is why you can easily convert between char's and their respective ASCII values.

Programming in C Chapter IV - Precedence

Precedence is how we answer the question: What operations should we perform first? Whether in solving mathematical equations or writing source code, strict procedural rules of precedence allow the same operations to produce the same results every time.

The first rule of precedence in the C programming language (and many others) is that we always work from the inner-most parentheses out-ward. This is particularly important to remember during bug-testing. Adding parentheses can be a good debugging tactic, but it is bad form to litter your code with un-needed parentheses.

The second rule is that when operators have equal priority, we simply solve from left to right.

With simple arithmetic, precedence or order of operations conforms to PEMDAS - from first to last, in pairs: parentheses and  exponents, multiplication and division, and finally addition and subtraction. Multiplication and division share the same precedence in this scenario because, functionally, they are the same operation. After all, division is merely multiplying by the inverse of the denominator. Similarly subtraction can be seen as merely adding a negative value.

example.
3 + 10 / 2 * (9 - 1) - 1
| --> we start with exponents
|
3 + 10 / 2 * 8 - 1           --> * and / are equal, so we do
  |         both but from left to right
  |       starting with /
3 +    5   * 8 - 1
  |
  |
3 +  40 - 1
|
|
   43 -  1
   | --> last we subtract the two
   | remaining digits
42


In C, there are two types of increment/decrement operators:

*Prefix Form*: ++i or --i
*Suffix Form*: i++ or i--

The *Suffix Form* is commonly used in "For" loops. It represents the idea that the current value is used first, and then that value is incremented. The value will only be different the NEXT time the variable is used.

By contrast, using the *Prefix Form* the variable is incremented FIRST, and then it is used in the expression.

Let us consider an example using the integer `x`, which for the purposes of this example we will set to `5`.

example.

int x = 5;

x++; /* x on this line is still 5
if we were to print it immediately
we would get the value 5 */
x; // here, on this line, x = 6
++x; /* using the prefix increment, the value is
incremented first - so on this line the
value would be 7 */

POINTER NOTATION

The last precedence issue that we will consider deals with pointer notation. The Dereference Operator - * - has priority over basic math operators, but not over the suffix increment and decrement operators. This leads us to our final example.

example.

int x = 7;    // we create an integer, x and set it to 7
int *y = &x;  /* we create a pointer, y, and assign in
to the address of x. When we dereference
y we should get the number assigned to x
which in this case is 7 */
*y++;  /* this situation appears to be ambiguous.
do we dereference the pointer to 7, and
then increment? Or do we increment first
and then derefence? In reality, because
the dereference operator does not have
precedence over suffix increments, the
pointer address itself is incremented -
leading to *y being pointed away from x
and into a completely different segment
of memory entirely! */
(*y)++;  /* this is the solution to the connundrum
presented by *y++. (*y) dereferences
first to the address of x, which is 7
ONLY THEN does it increment, to 8 */

To recap, our Precedence can take list form as follows:

1. Follow Inner Parentheses Outward
2. i++, i-- Suffix operators
3. *x, &x   Dereferences and Address operators
  also 3. ++i, --i Prefix operators
4. *, /, %  Simple math operations mult, div, percent
5. +, - Simple math operations add, subtract

Programming in C Chapter III - Boolean Values & Operators

Today we will learn a bit about Boolean values, operators and expressions.

Boolean values and conditions are named after 19th century mathematician and logician George Boole who pioneered a field of logic now referred to as Boolean logic; which is based upon grouping and comparing *Boolean values*.

*Boolean Value* - a variable that has two possible conditions; TRUE and FALSE.
                               Similar to a light switch that can be either on or off, or how binary
                               numbers can be either 1 or 0.

Boolean values are seem fairly simply on the surface. However, they allow for a dynamic array of combined values that allow for nearly infinite complexity.

*Boolean Operator* - A Boolean Operator combines two Boolean values into a single value. The two most common of such operators are AND and OR, but there are quite a few additional Operators we will explore as well.

*AND* - results in a value of TRUE ONLY if BOTH input values
are TRUE.

examples.
false AND true = false
false AND false = false
true AND true = true

*OR*  - results in a value of TRUE if EITHER of the
operator's input values are TRUE

example.
false OR true = true
true OR true = true
false OR false = false

*NOT* - accepts a Boolean Variable and provides the
opposite of that variable.

example.
NOT true = false
NOT false = true

Combining variable and operators into a single statement results in a *Boolean Expression*. Boolean Expressions can be *nested* to provide more complex outcomes. As numbers are prioritized and grouped using the arithmetical order of operations (PEMDAS), Boolean expressions can be grouped using paranthesis.

example.

x=TRUE
y=TRUE
z=TRUE
x AND (y OR (NOT z))   ---> this is 3 expressions
|
|
x AND (y OR FALSE)         ---> evaluate the innermost expression 1st
|
|
 x AND TRUE ---> finally we reach a tautology
 |
 |
TRUE AND TRUE = TRUE ---> resolving the nested expressions as
      TRUE (all Boolean expressions
       necessarily resolve to tautologies or contradictions)

Next, let us review how these nested expressions can be used in a programming language. In C, the syntax for Boolean expressions is different than simply using the words AND, OR and NOT. The translation is as follows:

AND = &&
OR  = ||
NOT = !

The nested expressions from our previous example can be translated into C rather simply:

initial example.

x AND (y OR (NOT z))

translated to C.

x && (y || (!z))

Now that we know how to translate Boolean expressions into C, how do we use it? Consider the example of a script that should execute only if a given variable is TRUE. For this purpose, nearly all programming languages provide support for the IF condition.

example in pseudocode.

IF x=true
THEN do Y

example in C.

bool x=true;
if (x)
{
// do Y
}

What if we wish to provide an alternative branch, to assign the script a set of instructions for when x IS NOT true? In these circumstances, we would append ELSE to our IF condition (forming an IF...ELSE condition)

example in C.
bool x=true;
if (x)
{
// do Y
}
else
{
// do Z
}

Another important function is the ELSE IF condition. Using else if, we can branch application behavior in more than two directions.

Suppose you have two Boolean values you would like your script to consider, called X and Y. We want our program to do one thing if X and Y are both TRUE. We want the program to do another thing if either X or Y are TRUE. And finally, we want the program to do a third thing for every other combination of truth statements for X and Y not covered in the first two conditions.

We can express such an application in C using the following code and relying on ELSE IF (in this example we declare X and Y to be TRUE and FALSE, respectively)

example.

bool x = true;
bool y = false;
if (x && y)
{
//code
}
else if (x || y)
{
//code
}
else
{
//code
}


Working with variables in this manner is useful, but we appear to be limited to only a few conditions. Booleans become much more powerful when we introduce *comparisons*. *Comparisons* are ways to evaluate values that are not originally declared as Boolean.

To see if two values are the same, we use: ==

== returns TRUE if the values are equal and FALSE if they are not.

Other common comparisons are:

LESS THAN            : <
GREATER THAN            : >
LESS THAN OR EQUAL TO    : <=
GREATER THAN OR EQUAL TO : >=

Using one last concrete example, let's combine ALL of the topics related to Boolean values we have covered so far, as well as a few topics from our previous discussions as well.

Suppose we have two variables: `temperature` and `isHungry`. `temperature` is a floating point number that can have decimal places. This example is a very simple program to tell someone what to eat based on the temperature.

example.

if (hungry && (temperature >= 100))
{
printf("eat ice cream");
}
else if (hungry && (temperature <= 0))
{
printf("eat spicy food");
}
else if (!hungry)
{
printf("don't eat anything");
}

That's all for today's lesson on Boolean. Try writing your own program using a Boolean function in a loop, such that when the the Boolean value changes, the loop terminates.

Thursday, November 27, 2014

Tamir Rice Video Casts Doubt on Statements from Police

There seems to be a great deal of confusion about what happened between Tamir Rice, a 12 year old who was playing in a park with a BB gun, and the police officer who killed him.

Take, for example, this: 


Quite a few members of the "public at large" seem to be convinced that young Tamir Rice was brandishing a convincing pistol replica at the police. The police, after begging Rice to lay down his weapon multiple times, were forced to open fire when young Tamir made some sort of furtive movement toward his waist band, in which this make-believe pistol was ensconced. 

While I find it quite troubling that so many of our fellow citizens find it reasonable to leap to the defense of today's police force immediately after they gun down a pre-pubescent child, perhaps in this instance the Public can be forgiven. After all, the narrative described above has largely been formed from police statements of what happened. 

Here's the police version: A man calls 911, informing them that someone is brandishing a pistol in the park. We also know this man told 911 dispatchers that he believed the gun was fake. Police claim that they were not informed of this key detail - which, frankly, should be a controversy in and of itself. As we will see, though, its not the worst of what happened.


Its this last detail that is really the clincher. Based on the police description of events, their response was still tragic, but reasonable. Police told Tamir to raise his hands three times. The boy failed to respond, and instead reached for what the police had been told was a gun. They then killed him. 

The issue is that the officer's description of events is, at worst, an outright lie and at best an intentionally misleading misrepresentation of events. Fortunately, the Tamir family was able to receive and make public a video of the park that completely captures the events. This video is below. I would encourage readers to watch the video; the shooting occurs begin at time code 7:00 



As can be clearly seen in the video, police arrive on the scene by nearly bowling over Tamir Rice with their car. The officer on the right opens his car door and immediately opens fire. The New York Times has reported that the length of time that elapsed during this period is 2 seconds.

2 seconds is enough time to open a car door and shoot someone. 2 seconds is not long enough to order someone to raise their hands and for that person to respond to even one such warning. Particularly when the person responding is a 12 year old child who has nearly been struck by a car and is no doubt completely shocked and terrified about what is occurring. There is no way that police told Rice to raise his hands 3 times; 2 seconds is simply not enough time for that to occur.

It is my hope that those who currently support the police narrative of events watch this video. It is my hope that people will question why the police statement is completely irreconcilable with the events in this video. It is my hope that people will question why the image of a child with a toy pop gun is an event worthy of calling the police, when in the recent past such an image was as American as Apple Pie. Does the boy below strike you as being a legitimate threat to law enforcement?


The Justice system in the United States is broken. In our fear we have built a machine that destroys lives and devours children. We ought to pause, now, to consider what can be done to stop it.