Thursday, September 6, 2012

Red-black & blue


One of my neglected hobbies involves enliterating programs for the literate programming wiki. Donald Knuth originally described a literate program as one that has explanation of the code to humans as its primary goal. (Structured nicely, it even can be torn apart to assemble a real program with the right software.) Javadocs are nice, but focus on describing the public functions independently. That's fine if I'm only after the api, but a new programmer - so the reasoning goes - should study how great programs work the way english majors deconstruct Shakespeare.

However, most of the gaps in the literate programming wiki are either code dumps I can't easily enlitarate (language I don't know or algorithm I don't understand) or have no code at all. One page that I thought threaded that line was a python implementation of a red black tree. So, I began describing the tree node and so on, before I noticed that the implementation lacked deletion code.

Understand, I can do this as I have seen the algorithms before. But, they are tedious to implement and I have other coding projects I want to pursue. This was undertaken in a different spirit. So I let it languish. I dumped in some pseudocode and went back to working on Simpletron (which I will eventually submit to the wiki). A month and one half passed and no one else has touched it. So, I thought, why not check for something I can shoehorn in from github.

Big mistake of course. Little choices are stock and trade of writing a program. I call the values stored values, so & so calls them keys. (Only found one worth a damn, by headius.) I use tabs, he uses spaces and the compiler/interpreter expects consistent depth whichever is used. Suffice to say, I wasted half an hour correcting those things in the copypasta inserted in the established code. Then, I spent another half hour getting his native code to print nicely. Headius appearantly doesn't care about color or whether it looks vaguely treelike. Rather than leave you doubting me, here is an example. I had to add the periods so blogger wouldn't trim the spaces.

20
 . . 10
 . . . . . 3
 . . . . . 15
 . . 70
 . . . . . 55
 . . . . . . . . 30
 . . . . . . . . 62
 . . . . . 80
 . . . . . . . . 98

The point here isn't that I'm wiser than headius. I am faced with three choices.

A) Throw out everything, get some reliable pseudocode, and build it myself. While I may relish the finished product more, it is a waste of time and concession of failure on my part. ie, I couldn't understand what was going on so I spent good money after bad when something workable was at hand. Not likely.

B) Throw out everything I've enliterated on the wiki so far and enliterate this version that works totally. (After refactoring it, so I don't cringe looking at the one letter variables and so on.) This is a blow to my pride since I've essentially wasted effort so far. However, - after refactoring - this is the decision closest to my actual goal: just describing how the code is working.

C) Keep what I have and translate a deletion algorithm to work with the code (and explanation) already present. One path involves translating headius' algorithm. The other involves translating pseudocode to python. The problem with each involves rectifying various assumptions we've made even more fundamental than calling the stored integer value or key. Headius uses shells rather than nulls as the leaves, which means he can query the null's parents & so on whereas the current code has to check whether it was just handed a null. I haven't checked Cormen's pseudocode in a while but it may feature the same, or a different, disjoint assumption.

The trick in deciding between the latter two choices stems from how much perceived work comes from the translation and debugging it against reenliterating and refactoring a code base that won't fight me. To be fair, I haven't enliterated much as I left the insertion code unrefactored and uncommented. Printing is the same, thanks to that last half hour. Intro and some of the normal node are the same or trivially retargetable. And yet, I shouldn't take it's uncomplaining operation on faith. I need to test that this indeed deletes like a red black tree should.

While I began this post thinking that I would argue myself into patching the deletion hole with translated pseudocode, I will take on headius' legacy. Make it better, readable, literate. This just won't show up instantly on the wiki. Honestly, I can't complain about this. In similar realization last time, I will restrict simpletron to describing its operation (filling its wiki). Besides learning clojure, that means for coding practice I would have turned to Java and either finish the game portion of Conway's trees or Edwin's game. (Conway's trees is an implementation of his game with different types of trees underlying the map.)

Wednesday, September 5, 2012

Really Simple


So I find out my assessment, potentially, on Wednsday and see what path the remainder of my life will be like. It's a bit unsettling to think that - finally - I may have ended it. Its nice to think I can reapply but if I'm working fulltime or 'full'time, then I won't really have enough to spare. Oh well.

I had thought of not reporting about my progress on Simpletron and the Simple compiler. That's silly, of course, because I've linked to my github account where you can see all the commits and delays and current state. I actually gain by explaining what's going on (outside of the commits no one else will read).

Anyway, I was thinking on the nanopass compiler article and find it appealing for changing the Simple Compiler. Rather than cart around useless rem commands, I can just delete them. Unless, of course, a goto refers to it. And that goes for all the other line numbers too. But this is forward, I'll be more explicit for those who haven't read the article.

Very quickly, in case you don't recall or know that I've been working on Deitel's Simpletron/Simple Compiler and what those are: Deitel is a publisher with some computer texts. There are a few exercises that involve simulating a cpu and then a compiler with a subset of the computer language BASIC. I've completed the canonical implementation of the exercises (albiet in Python rather than Java or one of the C's). Working on the extension problems currently.

These, potentially, involve changing the structure of the compiler quite a bit so I've been dragging my feet. A tiny part of that is anticipating breaking the compiler, but barely. Reverting is effortless and there isn't any real danger anyway since these thoughts are clear. The more ambiguous aspect comes from when to split wildly divergent extensions and when to freeze a version to document. (Not necessarily the same point.)

One extension involves changing the compiler to an interpreter. Rather than transform 'Simple' code to simpletron instructions, the interpreter puts the instructions into memory. Then it runs just that command (or block if multiple) This restricts, for example, goto commands to pointing at already declared instructions.

Another extension (my own), involves introducing an assembler in between the compiler and simpletron. It isn't necessary since the language is barely above the assembler level anyway. However, introducing the interface lets me fantasize about either retargeting the compiler to another computer simulator. (Say, Notch's Dcpu10) Or, I could write a compiler for another simple language (a pascal? a scheme even?) to target the simpletron.

But, those are changes I will pursue later. For the moment, I am more concerned about incorporating the eerie basic subroutine. Normally, it would have a name, associated commands, and be callable from anywhere. Because 'simple' requires line numbers, a client can call any part as a subroutine. I can already goto any line (and depend on another pass to resolve lines I haven't enountered yet). But, a gosub must return to the point after the gosub. Consider

  1. print 4
  2. return
  3. gosub 1
  4. gosub 2

One of the problems of writing programs at this level is unstructured programming techniques like this. The print is fine, but then I move to the return. If the return were below the gosub, then - upon reaching the gosub - I insert a goto in the 'return's spot that points at the instruction right after the gosub. Here, I hit it early, so what is supposed to be there is ambiguous. I hadn't thought deeply and appear not to be able as I'm about to argue myself into a hole.

Suffice that it is late and I am tired. Returns need to initialize with a goto [next instruction] in case of this situation. But then what about after I've returned? Do I reset it to next instruction or leave it pointing at the first gosub? If I'm to protect the client, then the former is arguable. But, basic is so far down, that I cant' really justify it. If you want to make an infinite loop, who am I to stop you? If you want spagetti code, I have no right to stop you. So why would I make subroutines better than the spec calls for?

Mainly, because I know that it can be better and implementing a hexagonal wheel because it matches my presumed experience feels less useful than implementing the octagonal wheel I learned about earlier. In this case, that means named functions/regions of code. Those suggest some scope and I even heard how compilers (generally) graduate to that level: an abstract syntax tree for each function.

This is why I want an intermediate layer I can manipulate, unacknowleged by the spec. If I examine the code and find forward or backward references (a certainty), then I can resolve the reference before producing code. Consider a for loop

  1. for x = 3 to 5
  2. print x
  3. next

Without making an intermediate change to the simple code, I have to save the information about the for somewhere to use when I hit the next command that loops back to the for and increments x.

However, there is another paradigm. I encountered it in the Nanopass compiler article. Rather than make ad hoc changes to code in between compiler passes, the professor who wrote it demanded his students provide a grammer for the intermediate language that a particular pass expects and emits. So, in the above case, I would append the line number of the for to the next and the next to the for. The gosubs recieve the line number where the return's goto will be stored. For clarity:

  1. for x = 3 to 5 3
  2. print x
  3. next 1



  1. print 4
  2. return
  3. gosub 1 2
  4. gosub 2 2

This requires, though, that I have the whole source code available to me, as a list of strings. Otherwise, the separate passes would be opening the same file multiple times and resolving their adjustments inefficiently. Were I to do so though, then I can make a number of nice transformations. Like, after all the forward references are taken care of, I can cut all the line numbers and comments that aren't referenced. (Or even increase the line number past a comment and then cut them all.)

While I am quite excited to make that change, it flies in the face of both the interpreter version (which can't import the whole file) and documenting the code. The documentation, to my mind, should reflect Deitel's description of how to make the simple compiler. As I improve the compiler with functions or strings, some basic elements move farther from the original working version.

On one hand, it is unlikely that anyone will look at my implementation, much less the wiki documentation for it. And yet, I've read a fair amount of the only implementation uploaded before mine (that I could find) of both the simpletron and simple compiler. (Just the simpletron is more common.) If someone were to reference mine as a guide, then a version with multiple passes that defy Deitel's spec (to satisfy one or more of the embellishments) would likely disincline that person to continue. S/he either has to investigate where the old pass remains in the new version, construct the undocumented, canonical version from an old commit, or ignore Deitel and work from my template. Obviously, none of these are likely. So, I would have just failed that person as guide.

On the other hand, If I were to regard the literateprogramming wiki as my primary audience, I could justify completing the exercise and then writing the documentation. Of course, they could also see that I had gone beyond the canonical spec and mutter. They may not be invested in adherence, but if someone were to submit another implementation in a different language but only satisfy the canonical spec, there would be a significant disparity.

In the end, I see that I should redact my for loop changes and refactor scompiler and simpletron to their very best selves for documentation. I'll need to move the 1.0 tag up so the wiki references the right version. Only then can I branch out with extensions and interpreter. Honestly, I don't need the extensions for the interpreter so I won't feel bad about not rejiggering them for the interpreter.

I'm also working on some java projects and learning clojure, but that's another night's work.

Friday, March 23, 2012

Eye of the Needle of Public Opinion


These thoughts are better suited consigned to Twitter or Facebook, but 140 characters isn't sufficient. Normal qualifiers "in my social circle and in media portrayals I have experienced, contemporary and outdated" apply throughought.
The Hank Johnson curfuffle is embarrasing. Not because I am embarassed for him, but because I am embarassed of the social reaction. Obviously, what he says made no sense, even as a last ditch effort to address the infrastructural consequences of an increased military presence on Guam. But, really, the guy is human. Touting 'Bushisms' and the like as evidence of clinical idiocy is cruel and somewhat hypocritical.
We loudly recoil at the barter of our privacy for social networking perks. Polititians and celebrities survive under actual Orwellian surveillence. How else are the Hot/Not issues of Stars populated? My, incorrect, first reaction involved wondering if people expect their representatives to be perfect and are righteously angry that they aren't.
The support polls indicate the opposite, of course. Americans despise their representatives. More likely, they are pessimistic about the electees and relish the opportunity to hold up the evidence of unfit qualities. Really, Nicholas? This guy is fit? Bush Jr was fit? Perhaps he is a fine member on the Armed Services Committee or an anchor they couldn't tie to anything else. I simply consider it bad form to let this moment totally define my judgment about his fitness as a representative citizen. How couldn't it be? There was no story previously outside his home territory and he can't have the resources to defuse the PR that this infectious event solicits. Also, it's not prejudice if he actually did it. How intimate must I be to disdain someone? I dearly want to append "in power" to "someone", but the complaint works just fine without it.
I guess this just dredges up the ennui I exposed in the final post of my previous blog. My vote doesn't -statistically- matter in California for president. A bare four representatives between state and federal legislatures also diminishes the likelihood that My Vision of Responsible Government will ever come to pass. Short of changing national opinion about relevant issues, in the guise of a cultural icon. As if change was ever made that way. And no, I will not write you a story where you are the time-traveling Mary Sue inducting a premature Industrial Revolution into the Renaissance.
So, while that post laments, acceptingly, my impact on the bureaucracy, I chafe at how others handle their own doublethink. Who gets elected? Incumbents, corporate patsys, or extremist populists. And then they are subjected to this ahistorical level of nagging. I recall 2008 saw people decrying both candidates for biasing their stump speeches to the local audience, especially at fundraisers. I previously settled on a structuralist interpretation that false hope of 'change' keeps a compromise for the mentioned parties. (I mean voters, political parties, and governmental beneficiaries; not just the political parties.) How can we expect otherwise, how else can a candidate garner votes than to incite (unwarranted) passion?
The pattern of character assassination, of which Johnson is the newest enabler-victim, presents another filter on our political aspirants. You need skin like a battleship to smile and assure everyone that, no I didn't really mean that household proximity to Russia is tantamount to diplomatic experience. I can't see Mr. Smith going to Washington under these conditions. He has easier routes to earn prestige or charitable attention for his community than pour money into advertising for an uncertain victory that merits a single floor vote and perhaps an irrelevant committee post or two. So we get people we don't trust and occasionally people we shouldn't trust.
I was going to trot out machine learning as the utopian solution. One pillar of Scott Adam's party platform involves cowtowing first to our best (scientific) understanding of an issue or, where subjective, to statistical public opinion. But, we don't need a person to enact majority favor. I don't even mean just do what 51% say in one or many polls. The statistic weights built up from years of input could reflect our longer-term preferences. Or not, obvious literary figures have plumbed this depth and found it lacking as are our current AI capabilities.
Further, rehashing the capability of statistical algorithms detracts from the incipent cause of bringing it up. A human, even Adams on a neutral path, will flub some delivery and invite this sword of damoceles into his crown. A software neural network will produce the same answers without opening this chink. Because it will reflect our cultural decisions. It would be us, but only as a static oracle of introspection. Such a system would lack creativity. A perfect civil servant would be inhuman, at minimum in this capacity.
I don't ask that you avoid sharing in a gape mouthed blank stare. It may even be fun. We are Meaning Producing machines. (Superstition as path to Kuhn paradigms.) For example, I like looking at mixotronic. It returns three randomly chosen concepts, ostentibly for an onlooker to create a story or something with it. What would the union of 'machiavellian,' 'seige,' and 'Stranger in a Strange Land' be like? I opined, after a non sequitur commercial in between scenes of Awake, that perhaps a postmodern advertising scheme involves offloading meaning creation to a consumer. And, the work is largely instinctive within a domain of connectivity. So long as the brand is emphasized right alongside, perhaps it may be more effective (via active listening) than relying on celebrity association (or other techniques) to shill your product, as those tropes are easily identified and tuned out. Or maybe not. I hear some people tried to call the FTC on bioware for insufficient narrative hand-holding since Mass Effect 3's final scenes were too 'open ended.'

Thursday, March 1, 2012

Trading spaces


When learning to use Python for the first time, I turned to Zed Shaw's booklet, Learn Python the Hard Way. The initial exercises were a shade tedious as its intended audience is a complete beginner. In the end, I much enjoyed his style of introducing mechanisms without explanation and giving the answer later. In that space I form an association, whereas the novice can form a hypothesis. Both are useful. I heartily reccommend it to anyone who wants to learn to use Python and isn't an Expert in another language.

One suggestion he makes - I've seen it elsewhere, involves using underscored_variable_names rather than camelCasedVariableNames. I admit, it is more pleasant to look at and discern. But, the underscore is way high up on the keyboard, using my weakest finger. I would have little argument (compactness) if the key were as central as the spacebar it replaces. So, principle ignored.

But, today, I recalled another piece of advice I ignored, this by Steve Yegge. In a primer about using emacs, he suggests that a potential user switch the effects of the alt and caps lock key. Emacs uses a number of keyboard shortcuts that begin with alt. Yegge changed his setup to reduce strain in cupping his hand awkwardly to accomplish the most used sequences. He proceeded to explain how to do so in windows. I prefer emacs to vim, but prefer other editors to emacs, so that was a non-starter. (Not to mention an inconvenience with any computer I don't control, like when I switched to Dvorak.)

The connection came, what if I change the registry to promote a space into an underscore when used in conjunction with a shift key? It would promote this other practice that is useful across software. Admittedly, it may promote an uncomfortable habit when I work on other computers, but the habit would be nicer on the whole. Frankly, I'd have done it rather than writing this but for the death of our modem. It has served at least six years, when its expected lifespan was three. It operated at rates comparable to last year's model, so it is a great shame. Plus, no internet leaves Jack dull, boy. I occasionally wondered if we could improve its lifespan by leaving it off at night but stayed my hand in case it would frustrate my similarly nightowl sister.

I have not touched a project to implement Deitel's Simpletron in Python in twenty days. I felt disappointed by how crudely it failed to act as expected. The problem seemed to arise from a concern I had thought resolved earlier. (Instructions ended up pointing at some list index rather than a memory location.) While I originally hoped to use a dict(ionary/hash), I had more that a pair of values to associate. I decided to use a wrapper with a list and now the implementation largely reflects expecting index values and searching it. Tonight I realized that the symbol should be the key and the wrapper, sans symbol, should be the value. Obvious in retrospect. Further, I could even pare down to a normal dict if I use a string-number for line numbers and integers as const values. Then the value is the memory location.


My only reservation is the degree of change this will require. I'm finally studying how git handles branches, so I can make a verbose version separate. Yet another thing to try.

Monday, February 27, 2012

Transmigration of a Site's Soul


The first words I published in a weblog were on xanga.com. After some time, I used myspace's features. Still later, a friend of mine from high school - Chase Lloyd - offered to host a blog on his domain and server. However, in the last month, he published his last blog post and put a keep out sign as the main page. In the preceeding year, he had uploaded his final posts but not published them until the new year. So, it looked to all the world as though he and his brothers (whose blogs he also hosted) had gone silent.

I admit that I published little after June of last year. That's not to say I didn't write anything. I mostly produced programs in Java, as I was learning to use it last year. But I didn't journal either so it wasn't an introspective episode in my life. Further, Chase and I have not spoken substantially for about a year. I was always aloof from his end of our high school clique, much to my chagrin as it lasted longest locally after we graduated. My inclusion in D&D campaigns was always midstream and at another's prompting. This is no gripe, I state this to expose my perception. I orbited far from his social circle, and yet he usually found room in tales and on his server for my contributions.

So, I felt guilty about relying on a fading friendship for a personal and seldom updated archive. Occasionally I toyed with the idea of migrating to a site of my own. One project I'd like to finish is a family tree as a web of pages. If you've ever looked at a family tree of more than three generations (or three children), you can see the leaking abstraction in making a 'tree.' Namely, spouse's branches are almost never present. A multidimensional graph (as afforded by links) suits the project much more. It may lose some nicity in visualization, but they all look the same anyway because of the abstraction.

In the meantime, I had downloaded an offline wiki called Zim. I have posted a very few articles onto a site called everything2.org. (Imagine the lovechild of wikipedia and a fiction forum. It hosts a spectrum.) Unfortunately for my pride, many obvious articles have been posted. So, my main options are usually either large topics I know little about (1969, Jean Marie Jacquard), unreviewed media (Ingloruious Basterds, My Name is Earl) or a grab bag of stuff (revolving door, gas station). One good way of ingraining the knowledge I paid to acquire is to write it myself in an easily accessible format. Wikipedia is indeed good for a quick refresh or introduction, but a quick read is a shallow event for engraving.

Suffice to say, my offline migration is at a standstill as well. As noted, I wasn't expressive during the period. And the point of the preceeding is that I had many destinations for harvested thoughts, but let my fields lie fallow. So, I set the idea aside after noting that google sites offers little advantage for my needs and my budget won't permit vanity publishing currently. And then Chase gave a fast-forwarded 2011 retrospective before going dark. My attitude turned from elation to panic. Perhaps, he might let his domain expire. It wouldn't be just to let him subscribe for solely my account (ignoring any fondness he may have for the name himself). So I exported the xml and reserved this spot. This post might have eventually lead to my dispassion for xanga's archival export (straight html, I could have done that myself), but it is late.

A deque could be benefit a blog. One end points to the beginning and flows chronologically. New readers start there. Caught up readers read from the most recent end and scroll down, if necessary. Or use a main page with the most recent post and organize the blog as a queue? Of course, reading chronologically assumes that order is paramount in enjoying a blog, which is far from certain. At the moment - xanga aside - I have refrained from publising the imported shakytable posts. How much history is too much? I do anticipate a future where all we upload will be correlated to us without extreme care (ip masking from anonymous public terminals), so it won't be hard to find what I published as a teen. But, as BF Skinner's perspective notes, could it muddle my impact to keep everything in a single location or paste in my history every time I change horses?