2001-07-26 THURSDAY [TGM]
* 11:00 EDT. Got task description in PDF format. Started printing it
and jumped in the shower.
* Exited shower, dressed. Read description, highlighting interesting bits.
* 11:30 EDT. Started project with parser.
* 13:31 EDT. Initial parser is finished (thanks Parsec!) and passes
smoke tests. Time to think about using QuickCheck to raise my
confidence level. Time to also think about the big picture. I
don't want to over-focus on the details to the exclusion of what's
really important. ==> espresso time
* 13:40 EDT. Acccch! My espresso supply is running low. Time to
start roasting up another batch. (My friends Carsten and Molly are
arriving tomorrow, and I wouldn't want to offer them stale espresso!
One must be a good host, after all.)
* 13:59 EDT. Espresso consumed. Power flowing. All is right with
the world. Second batch of Brazilian Cerrado is roasting. Thoughts
during the break:
- Next step: Document -> String output (it's easy and on the critical
path, giving me time to think about the rest of the problem in the
back of my head)
* 14:10 EDT. Split out Document into its own module. Smoke test in
GHCi: works.
* 14:16 EDT. 2nd batch of Brazillian done. Starting Sumatra in
roaster...
* 14:29 EDT. Sumatra done... starting Guat. Huehue...
* 15:11 EDT. Show instances for Document are finished. Now I can go
round-trip from text to Document and back to text, just what I need
for QuickChecking the parser. (Whoops! Forgot to get the finished
Guat. Huehue. out of the roaster.)
* 15:21 EDT. Espresso roast/blend finished. Starting work on
QuickCheck.
* 17:13 EDT. QuickCheck is revealing that something is going wrong
with either the Parser our my Show instances. I've been able to
determine that when some Documents are shown, the resulting
strings can't be parsed. However, Parsec can't give me any
error messages because my grammar isn't left factored. So,
off to left factor the grammar...
* 17:33 EDT. Grammar is left factored now. Resuming QuickChecking...
* 17:37 EDT. QuickCheck to the rescue! QuickCheck found a test case
that falsified my RoundTrip property for Show->Parse->Show, and I
was able to hand-feed that case to my parser to determine the error.
The problem turned out to be a subtle typo in my Show instances.
This is one error that I'm confident I would have missed without
QuickCheck. (Big thanks to Koen Claessen and John Hughes!)
* 17:54 EDT. Parser and Show instances passed all tests. Time
for a food break. Leftover pasta in fridge.
* 18:24 EDT. Dinner of angel hair pasta, olive oil, and about a half
pound of minced garlic is now complete. Back to coding.
- I must now conclude that my attempts to bait my friend Bill into
joining Team Functional Beer have failed. It looks like solo
coding for the duration.
- Time to re-read the contest challenge, just to pick up anything
I may have missed the first time around.
* 20:09 EDT. Ordered delivery from local Chinese restaurant. Started
working on Document -> Meaning.
* 23:19 EDT. Finished eating a long time ago. Got Document->Meaning
module finished. Haven't yet figured out how best to use QuickCheck
on this bit of code. I think that I'm going to test it indirectly
via my first optimization attempt, which I'll code next. If there's
a problem, there will be a larger area to hunt for bugs, but I'm
willing to take that risk.
2001-07-27 FRIDAY [TGM]
* 10:05 EDT. Back at it. Breakfast finished, espresso in hand, I am
ready.
- It occurred to me during my downtime that I have been modeling the
problem with too much bias for my existing experience with
document processing -- SGML/XML/DOM-type stuff -- which suggests
that the document is a hierarchy. I may now prefer to
think of it as a linear stream of decorated characters.
- In the stream, any transition in decoration *must* be associated
with a corresponding open- or close-tag in the marked-up version.
This suggests that the ``naive optimizer'' I had been working on
might not be so naive after all. Building upon its logic might
yield good results.
- At a decoration transition, there are a finite number of tag
transitions that could have been responsible for the change in
decoration. One optimization technique is to try them all,
adjusting the surrounding tagging as needed, and taking the one
that results in the shortest overall markup. It's something to
think about.
* 12:19 EDT. My naive optimizer is done. Not so fast, QuickCheck
spotted a corner case that causes the optimizer to discard untagged
text at the end of a document. Oops.
* 14:20 EDT. I'm just back from a forced ``thought break'' (eating
and mowing the lawn) after halting work in the present optimization
strategy, which has a fundamental flaw.
New strategy: put the changed decoration that yields the
shortest run on the stack *last*. I.e., order options by run
length.
* 16:37 EDT. Still working on getting the optimizer debugged.
I'm making progress, but I don't know how long it will take.
I think that there is a problem in my ``Meaning'' library,
which might be causing false failures in testing.
* 16:40 EDT. Yes, there was a bug in the Meaning module's handling of
white space. Basically if two spaces were adjacent but within
different tagged elements, the spaces wouldn't be merged if they had
the same space context. But, even with it fixed, QuickCheck is
spotting problems in my optimizer... sigh..
* 16:43 EDT. The test cases that cause failures are huge... I'm
trying to find a smaller one. (A typical one is about three pages
long.) I finally got it down to this handsome devil:
<1>Q1>XDVV V<7><3>3>7><4>4><7>7><9>CF <4>4>9><4><7><1>1>7>VP4> N I <0>0>C URJ VJGMNUUQRZV<5>J BZWUH<0>0>Q5>ESHWWKBKPMPIWTZQJM YKDHSNMYBZJDZYK<9><7><6>F6>7>YPCF9><8>CZ U<4>4>8><8> P<3>N3>XQ8><1><0>0>QOMUE<7>7>1><4>4>JXXEDKVZFDGPXLE<9>9>A<6>6><8>8><8>8>TSHLA W<2><2>2><8><1>1>8><9>9>2>VX MXODFO<8>8>ZEIA<6>6>WA<7>7>DBQEXAMGRMU FY <4>4>ECZZIF D VPAS Y<5>5>F QO MNLC TG TJ<6>6>RVD<7><1>1>TYX<4>4>EO BXJ<3><1>1>3><9><1>1>9> HNSSUWTGXP7>XGCGDKJNADO AADGG<2>2><0><3>3>0><9>9><0>0><2> 2>FHEK<9><9>9>9>M<0>0>BLNSK SWBW<5>5>F FQQ<0><8>8>VJ0>PSIWBFI<8>8>KWR<7>7>UBGUMKL X T<1><3>J<5>5>3>RPGB B WD<2>YICXK<8>8><4>4>T2>1>
* 16:48 EDT. Running that string through the optimizer and then
diffing its meaning with the original version reveals a one character
difference in underlining. The original has a double underline but
the optimized version has only a single underline. Time to look at
that underlining code with a magnifying glass...
* 16:54 EDT. Oooh. Found a really small counterexample:
RG WD<9>9> MEYWTRMRWZ TZRU HW<2>2>BWN M
It has the exact same problem. Something that ought to be underlined
twice is underlined only once:
RG WD MEYWTRMRWZ TZRU HWBWN M
The "BWN" toward the end ought to be double-nested in U elements.
* 17:20 EDT. Okay, I think that I have the bug squashed. QuickChecking...
Joy!
OK, passed 100 tests (93% shorter!).
93% of the documents that the optimizer was fed could be reduced!
Here's the histogram of reductions for 300 test runs:
# ./histo
REDUCTION
10% ************
20% ********************************
30% **********************************************************
40% ***************************************************
50% *****************************
60% **********************
70% **********************
80% *************
90% **********
100% ***************************************************
My goal is to move the bars down toward 100%.
* 17:47 EDT. Time to check the files into CVS and take a break.
* 18:33 EDT. During the break, I realized that I hadn't accounted
for TT when handling space contraction in the Meaning module.
Added it to the TODO list. Worked up a quick fix, but my
brain is too fried to validate it.
* 18:35 EDT. Time to stop coding for the day and prepare for the
arrival of Molly and Carsten.
2001-07-28 SATURDAY [TGM]
* 08:23 EDT. Taking a quick look at the space contraction thing
before breakfast...
* 08:32 EDT. Validated the TT-space contraction fix via inspection
and QuickCheck.
* 18:54 EDT. Long time without an update. Basically, my wife and I
were having fun with our friends Molly and Carsten. Carsten (who
taught the graduate-level functional programming course at Yale in
the Spring) and I also had a fun chat about the contest and my
entry. He had some good ideas, which I'm sure will find a way into
my submission somehow.
Anyway, I've been working on a Perl wrapper that makes sure I can
produce decent output in the event that the Haskell executable
crashes.
I'm having a hard time producing a statically-linked executable, and
so I'm going to have to make sure that my ./buildme is flawless.
* 18:57 EDT. Updated the histogram code to produce an overall
average. Latest round:
AVERAGE REDUCTION = 30%
10% ********
20% *************************************
30% ***********************************************************
40% **********************************
50% *******************************
60% ********************
70% **************************
80% ***********
90% ************
100% **************************************************************
* 19:38 EDT. Added a quick whitespace optimization to the naiveOptmizer
that trims a lot of unnecessary markup from around whitespace. As
a result, the average reduction increased by about 20% to ~36%:
AVERAGE REDUCTION = 37%
10%
20% ***
30% **
40% *************
50% ********************************************
60% *********************************************
70% ***********************
80% ***********
90% ********
100% ***********************************************
* 20:57 EDT. Updated the README file and added some better commenting
to the programs.
* 21:02 EDT. Fixed a subtle bug in Meaning.hs. Basically, I forgot
to convert whitespace characters into spaces. Since this bug was in
Meaning, which my test cases rely on to determine equivalence, this
bug was invisible to my testing.
* 21:25 EDT. I decided to run a torture test of about 7 MB of deeply
nested tags and whitespace. The result is a stack overflow in the
Haskell executable.
(Stack space overflow: current size 1048576 bytes.
Use `+RTS -Ksize' to increase it.
Time to bump the stack size...
* 21:27 EDT. Bumped up ghc compile options to use -O2 optimization,
hoping that might help with the stack situation. Nope, didn't
help.
* 21:52 EDT. Arrgh! It's the parser that's eating up the space.
Time to hand roll a parser? Try Happy?
* 22:53 EDT. Okay, I'm not sure if it's the parser. I do know that
the overall running time is about O(N^2) with an input of size N.
Time to use GHC's profiling options to pinpoint the trouble.
* 22:57 EDT. Profiling located the culprit: showsDoc!
COST CENTRE MODULE %time %alloc
showsDoc Document 86.1 83.8
dm' Meaning 5.2 10.6
docParser Parse 4.3 2.2
mkTagParser Parse 1.7 2.8
... as called by naiveOptimizer.
* 23:17 EDT. What was the problem? A little test in naiveOptimizer
that made sure that the returned document wasn't larger than the
original! Once removed, run times became more reasonable. Still,
showsDoc is exhibiting N^2 runtime performance, which indicates
that I have a problem there. Adding it to the TODO list.
Trying the ``bigus-nestus'' torture test:
SPCSPCSPC ... 997 more SPC ...
takes about 100 seconds, 97% of which is spent in Meaning's
dm'. Can it be quickly improved?
2001-07-29 SUNDAY [TGM]
* 00:22 EDT. I've rewritten the dm and dm' functions to pass an
accumulator around a la shows to solve the recursive
left-re-expansion problem. The result is that the functions drop
off the profile map. Now docParser is the leader at 50%. I'll
trust that Parsec is wicked fast (that's one of its claims to
fame), and so I'm back to optimizing showsDoc.
* 00:42 EDT. YES! I found the showDoc speed killer. It was hidden
in a foldr'd application of (++). The following change made all
the difference (diff -u):
- foldr (++) s (map show dparts)
+ foldr showsDP s dparts
* 01:08 EDT. On large documents, the parser is overflowing the
already large 10 MB stack. Is it *now* time for a hand-rolled
parser?
* 02:41 EDT. I have written a custom parser. It is very fast, much
more so than the Parsec combinator parser, but I wonder about memory
usage... Can I handle a 1,000,000 element bigus-nestus test without
overflowing the stack?
* 03:21 EDT. I HAVE SUBMITTED MY FIRST ENTRY. It's fast and should
always produce correct output. Whether it is a good optimizer and
whether it won't run out of memory on beastly documents, I'm not
so sure.
* 05:21 EDT. It's getting late and so I've resubmitted my entry,
which now takes advantage of unboxed tuples in a few high-traffic
areas.
* 05:33 EDT. I'm going to bed.
* 10:15 EDT. I've just awakened. I'm working on Optimization1, a
longest-run-first optimizer, hoping against the odds that I can get
it working in the next few minutes, package up another submission,
and send it off before 11:00 AM.
* 11:00 EDT. It's O-vah! Well, I didn't get my smart optimizer in,
but I did submit a half decent entry. How far will it go? Only
time will tell...