
Deep Thoughts by Robert Felty

thoughts on wordpress, latex, cooking et alia

Archive for March, 2009

Sunday, March 29th, 2009

Converting LaTeX to Microsoft Word with plasTeX and Open Office

Sample of LaTeX document converted to Word

First of all, any LaTeX user might ask: why would I want to convert beautiful LaTeX into ugly Microsoft Word? The main reason is collaborators who want to use track changes. I recently sent a draft of a paper to some colleagues in two formats, .pdf and .doc. The pdf was formatted beautifully with LaTeX, but collaborators who are not comfortable editing a LaTeX file are left with commenting on the pdf itself, which is difficult, though there are some options for it (I like Mac OS X’s Preview application).

So when I sent out this latest paper for comments, I decided to send two versions. Converting to Word via copy and paste is very laborious, and not worth the effort. Recently though, I have been using plasTeX to convert LaTeX into html. I know that programs like Open Office can import html, so I decided to try that route to convert into .doc. First I used plasTeX to convert to html, and specified a few options to get the output I wanted:

plastex --theme minimal --sec-num-depth 0 --split-level 0 <filename>.tex

By default, this creates a subdirectory named after the input file (without the .tex extension), with an index.html file inside it.
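If you would rather script this step than type the command each time, here is a minimal sketch in Python. The flags are exactly the ones from the command above. The function names (plastex_command, expected_index, convert_to_html) are made up for illustration, and the output location assumes plasTeX's default of writing to a directory named after the input file, so adjust it if your setup differs.

```python
import shutil
import subprocess
from pathlib import Path


def plastex_command(tex_name):
    """Build the plasTeX call used above: minimal theme, no section numbers, one html page."""
    return [
        "plastex",
        "--theme", "minimal",
        "--sec-num-depth", "0",
        "--split-level", "0",
        str(tex_name),
    ]


def expected_index(tex_file):
    """Where index.html should end up (assumption: output directory = file name without .tex)."""
    tex_file = Path(tex_file)
    return tex_file.parent / tex_file.stem / "index.html"


def convert_to_html(tex_file):
    """Run plasTeX next to the .tex file and return the expected path of index.html."""
    tex_file = Path(tex_file)
    if shutil.which("plastex") is None:
        raise RuntimeError("plastex was not found on PATH")
    # Run inside the document's directory so relative figure paths still resolve.
    subprocess.run(plastex_command(tex_file.name), check=True, cwd=tex_file.parent)
    return expected_index(tex_file)
```

Calling convert_to_html("paper.tex") should leave you with the index.html file to insert into Open Office in the next step.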

Next I opened a new text document in Open Office, then selected Insert > File, and selected this index.html file. Presto! I had the document, complete with figures, tables, footnotes, and references. It wasn’t formatted as nicely as the pdf, but now my co-authors could insert their own comments and send it back to me electronically. One last step though. By default Open Office links to external figures instead of embedding them. To override this, select Edit > Links. Then highlight all the links, and click on the button to “break links”. Finally, save the document as a .doc file, and e-mail it to the collaborators as an attachment.

I have to incorporate their comments back into my original LaTeX file, but this is much less painful to me than having to write the whole thing in Word to begin with.

Tuesday, March 24th, 2009

Baingan Bharta

Baingan Bharta

One of my favorite Indian dishes is Baingan Bharta, which is an eggplant dish. I decided to finally try to make it at home. After browsing through several recipes, I mostly followed this one from Dil Se. I was actually very surprised to discover that none of the recipes I found had any peanuts or peanut butter. I have always sworn that the Baingan Bharta I get in restaurants tastes like peanuts. I wonder how I came to that conclusion.

The main difference from the recipe above is that I used 2 eggplants, and I roasted them under the broiler in the oven for about 30 minutes, turning every 5-10 minutes. I also used a can of diced tomatoes instead of fresh tomatoes. The result was ok. It was actually a bit too spicy (probably too many chilies and too much garam masala powder). It was also too tomatoey. It was also not enough compared to the 2 cups of brown rice (dry) I made. Next time I think I will use 4 eggplants, and 2 onions, and keep the spices and tomatoes about the same.

Thursday, March 19th, 2009

On choosing website style

The style options on my website which you may not have noticed

A couple years ago, I read an article in A List Apart about a style switcher for a website. I thought this was a great idea: let the user choose their favorite style. So I set about implementing it. I have had it on this blog since the beginning. Personally, I find black text on a white screen hard on the eyes. I do most of my editing in a terminal with green text on a black background. I know that is not for everyone though. That was my thinking on the different styles I have for this site.

Over the last couple years though, I have noticed that few people actually notice the option to change the style. I have noticed this mostly when people have complained that they didn’t like the style, and they did not even notice the switcher.

Just today, I was chatting with my friend Danny, who pointed me to an article by Joel Spolsky about how too many choices can be bad. In it he states that:

Every time you provide an option, you’re asking the user to make a decision. That means they will have to think about something and decide about it. It’s not necessarily a bad thing, but, in general, you should always try to minimize the number of decisions that people have to make.

I don’t agree with everything Joel says here. In general, I am in favor of lots of options. That is one of the reasons I like the KDE desktop (and why I haven’t switched from 3.5 to 4.x yet — I hear 4.2 has re-introduced many of the options absent in 4.0 and 4.1). But I am wondering whether I should get rid of the style switcher altogether. What do you think? Should I keep the switcher? If not, what is your favorite option?

Sunday, March 15th, 2009

Reading iptc captions from jpegs with imagemagick

Rob and Spencer with zebras

Once again I found myself needing to use imagemagick to do something, and was overwhelmed by the many options. After much fiddling around, I found out some options that worked for me.

In this case, I wanted to extract iptc captions from images, so that I could then insert the caption in a webpage with php. I use Picasa to edit photos and add captions. Picasa stores the captions in the iptc metadata of the image, which is the right place for them. To extract the caption from the image above, do the following:

convert robSpencer.jpg 8BIMTEXT:-

The result should be output to standard out:

8BIM#1028="IPTC"
2#120#Caption="Rob and Spencer with zebras"

If you want to output to a file, simply do something like:

convert robSpencer.jpg 8BIMTEXT:filename.iptc

Now it is easy to extract the caption using perl, python, php or whatever you like. For perl, we could simply pipe it:

convert robSpencer.jpg 8BIMTEXT:- | perl -ne 'print "$1\n" if /Caption="(.*)"/'
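The same extraction can be sketched in Python. This is only an illustration: the names CAPTION_RE, extract_caption and read_caption are mine, the pattern assumes the caption line looks exactly like the sample output above, and read_caption assumes ImageMagick's convert is on your PATH.

```python
import re
import subprocess

# Matches the caption line (IPTC record 2#120) in the 8BIMTEXT output shown above.
CAPTION_RE = re.compile(r'^2#120#Caption="(.*)"\s*$', re.MULTILINE)


def extract_caption(iptc_text):
    """Return the caption found in 8BIMTEXT output, or None if there is no caption line."""
    match = CAPTION_RE.search(iptc_text)
    return match.group(1) if match else None


def read_caption(image_path):
    """Run convert on the image and parse what it writes to standard out."""
    result = subprocess.run(
        ["convert", image_path, "8BIMTEXT:-"],
        capture_output=True, text=True, check=True,
    )
    return extract_caption(result.stdout)
```

For the image above, read_caption("robSpencer.jpg") would hand the convert output to extract_caption, which returns None when the image has no caption instead of printing an empty line.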

And in case you are interested, I needed to do this for the postie wordpress plugin which allows you to post to your blog via e-mail. In version 1.1.5 iptc captions will be read and displayed (if they are in the image you attach).