robfelty.com


Unicode block names in regular expressions

Posted in bash, java, perl, python, regex

November 3, 2014 by robfelty
Frequently, I find myself wanting to do some simple language detection. For Chinese, Japanese, and Korean, this can easily be done by looking at the types of characters in the text. The simplest and most robust way to do this is to use Unicode block names: it is straightforward to write a regular expression that tests whether a character belongs to a given block. For all the different possible blocks, see here: Unicode block names for use […] (Read more)
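As a rough illustration of the idea, here is a minimal Python sketch. Python's built-in re module does not understand block-name escapes such as \p{InHiragana} (which, as far as I know, Java and Perl accept), so this sketch spells out the code point ranges of four blocks by hand. The names BLOCKS, blocks_in, and guess_language are hypothetical, and the language heuristic is deliberately crude.

```python
import re

# Code point ranges of four Unicode blocks, written out by hand because
# the standard re module has no \p{In...} block-name escapes.
BLOCKS = {
    "Hiragana": re.compile(r"[\u3040-\u309F]"),
    "Katakana": re.compile(r"[\u30A0-\u30FF]"),
    "CJK Unified Ideographs": re.compile(r"[\u4E00-\u9FFF]"),
    "Hangul Syllables": re.compile(r"[\uAC00-\uD7AF]"),
}


def blocks_in(text):
    """Return the sorted names of the blocks with at least one character in text."""
    return sorted(name for name, pattern in BLOCKS.items() if pattern.search(text))


def guess_language(text):
    """Crude heuristic (an assumption of this sketch, not a real detector)."""
    found = blocks_in(text)
    if "Hiragana" in found or "Katakana" in found:
        return "ja"  # kana only occurs in Japanese text
    if "Hangul Syllables" in found:
        return "ko"
    if "CJK Unified Ideographs" in found:
        return "zh"  # ideographs with no kana or hangul
    return None


print(guess_language("こんにちは"))
print(guess_language("안녕하세요"))
print(guess_language("你好"))
```

Japanese is checked first because Japanese text usually mixes kana with ideographs, so ideographs alone are only treated as Chinese when no kana is present.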

Monkey patching in python

Posted in python

July 1, 2014 by robfelty
I was just reading an article about Martijn Pieters, who is a Python expert, and he mentioned monkey patching. I did not know what monkey patching was, so I googled it and found a great answer on Stack Overflow. Basically, it takes advantage of Python’s class access philosophy. Unlike Java, which has a strict access policy, Python leaves all attributes and methods of a class mutable. So it is possible to write code like this: from SomeOtherProduct.SomeModule import SomeClass […] (Read more)
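To make the idea concrete, here is a small self-contained sketch. It does not reproduce the Stack Overflow answer; the class Greeter and the function loud_speak are hypothetical names invented for this illustration.

```python
class Greeter:
    """Stand-in for a class that lives in someone else's module."""

    def speak(self):
        return "hello"


def loud_speak(self):
    """Replacement method; it takes self like any ordinary method."""
    return "HELLO!"


greeter = Greeter()
before = greeter.speak()

# The monkey patch: rebind the method on the class at runtime.
Greeter.speak = loud_speak

# Existing instances see the change, because methods are looked up on the class.
after = greeter.speak()

print(before, after)
```

This works for classes defined in Python code. Built-in types such as str reject this kind of attribute assignment.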

Pretty printing json

Posted in bash, python

January 3, 2014 by robfelty
Here is a really simple way to pretty print some unformatted JSON:

$ echo '{"foo": "lorem", "bar": "ipsum"}' | python -mjson.tool
{
    "bar": "ipsum",
    "foo": "lorem"
}
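The same result can be had from inside a Python script with the standard json module. This is a minimal sketch; the variable names raw and pretty are just for illustration, and sort_keys=True is passed explicitly to get the alphabetical key order shown in the output above.

```python
import json

raw = '{"foo": "lorem", "bar": "ipsum"}'

# Parse the string, then serialize it again with indentation and sorted keys.
pretty = json.dumps(json.loads(raw), indent=4, sort_keys=True)

print(pretty)
```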

Small fix for article class in plasTeX

Posted in latex, python

January 17, 2013 by robfelty
I recently found a small error in plasTeX, the program I like to use to convert LaTeX to HTML. Unfortunately, it looks like it is no longer being actively developed, but since it is open source and written in Python, which I know, I was able to track down the issue fairly quickly. When running plasTeX, I was getting this error: (Read more)