robfelty.com


Unicode block names in regular expressions

Posted in bash, java, perl, python, regex

Posted on November 3, 2014 by robfelty
Frequently, I find myself wanting to do some simple language detection. For Chinese, Japanese, and Korean, this can easily be done by looking at the types of characters in some text. The simplest and most robust way to do this is to use Unicode block names. It is very simple to write a regular expression which will test if a character is contained in a certain block. For all the different possible blocks, see here: Unicode block names for use […] (Read more)
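As a rough illustration of the idea: Perl and Java accept block names directly in a pattern (for example `\p{InHiragana}`), but Python's standard `re` module does not, so the sketch below spells out the code point ranges of a few blocks by hand. The helper names `blocks_in` and `guess_cjk_language` are hypothetical, and the decision order is a crude assumption of mine, not the method from the full post.

```python
import re

# Code point ranges for a few Unicode blocks, written out by hand because
# the standard-library `re` module has no \p{...} block-name syntax.
BLOCK_PATTERNS = {
    "Hiragana": re.compile("[\u3040-\u309f]"),
    "Katakana": re.compile("[\u30a0-\u30ff]"),
    "Hangul Syllables": re.compile("[\uac00-\ud7af]"),
    "CJK Unified Ideographs": re.compile("[\u4e00-\u9fff]"),
}


def blocks_in(text):
    """Return the names of the blocks above that occur in `text`."""
    return {name for name, pattern in BLOCK_PATTERNS.items() if pattern.search(text)}


def guess_cjk_language(text):
    """Crude heuristic (an assumption): Hangul wins, then kana, then ideographs."""
    found = blocks_in(text)
    if "Hangul Syllables" in found:
        return "Korean"
    if "Hiragana" in found or "Katakana" in found:
        return "Japanese"
    if "CJK Unified Ideographs" in found:
        return "Chinese"
    return None


print(guess_cjk_language("こんにちは"))  # prints: Japanese
```

Kana is checked before ideographs because Japanese text usually mixes both, while Chinese text contains no kana.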

Pretty printing json

Posted in bash, python

Posted on January 3, 2014 by robfelty
Here is a really simple way to pretty print some unformatted JSON:

    $ echo '{"foo": "lorem", "bar": "ipsum"}' | python -mjson.tool
    {
        "bar": "ipsum",
        "foo": "lorem"
    }
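If you are already inside a Python script, the same result can be had without the shell. This is a minimal sketch using the standard `json` module; the function name `pretty` is made up for the example, and `sort_keys=True` is passed only so that the key order matches the output shown above.

```python
import json


def pretty(raw):
    """Parse a JSON string and re-serialize it with 4-space indentation."""
    return json.dumps(json.loads(raw), indent=4, sort_keys=True)


print(pretty('{"foo": "lorem", "bar": "ipsum"}'))
```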

Using awk to sum rows of numbers

Posted in bash, linux, UNIX

Posted on November 14, 2013 by robfelty
I have a script which takes a tab-delimited file for regression tests and converts it to XML. I want to do a sanity check to make sure that the number of utterances in my XML files matches the number in the tab-delimited .txt file. I can do this in 2 lines in UNIX:

    robert_felty$ wc -l samples2.txt
    72148 samples2.txt
    robert_felty$ find . -name '*.xml' | xargs grep -c " (Read more)
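The excerpt cuts off before the summing step, so the following is only a sketch of the general idea in Python rather than the awk command from the full post. It assumes the input lines look like typical `grep -c` output for multiple files, that is `filename:count`; the function name `sum_grep_counts` and the sample file names are hypothetical.

```python
def sum_grep_counts(lines):
    """Add up the counts from lines shaped like 'path/to/file.xml:123'."""
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last colon so that colons in a file name do no harm.
        _, count = line.rsplit(":", 1)
        total += int(count)
    return total


sample = ["./set1/a.xml:10", "./set1/b.xml:32"]
print(sum_grep_counts(sample))  # prints: 42
```

The total can then be compared against the line count reported by `wc -l` for the tab-delimited file.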

Vetting vignetting

Posted in linux, photography

Posted on September 28, 2011 by robfelty
We recently got some family portraits taken at JCPenney. I think they turned out very nicely. They had a bunch of different effects that they could apply. While we were at the studio, I really liked the way that the vignetting effect brought out our faces. However, after investigating the full set (we bought the digital images), I decided I didn’t like the vignetting, because it was actually making my face a bit dark in one shot. Then I decided […] (Read more)

UNIX/Linux permissions and groups – getent

Posted in linux

Posted on September 6, 2011 by robfelty
I keep forgetting this command, so I am writing it here so I know where to find it. The getent command will list information about users and groups on a UNIX/Linux system, including NIS and LDAP users, which is crucial on networks with multiple nodes. For example, to list information about a user named robert_felty, you can do:

    $ getent passwd robert_felty
    robert_felty:$1$iPJ.svD/$ce77I/wxh129FLt2Z7UOm.:5440:112:Robert Felty:/home/robert_felty:/bin/bash
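The output uses the usual colon-separated passwd layout: login name, password field, UID, GID, full name (GECOS), home directory, and shell. As a small sketch of how to pick such a line apart in Python, the code below splits it into named fields. The names `PasswdEntry` and `parse_passwd_line` are invented for this example, and the password field is replaced with a placeholder `x`.

```python
from collections import namedtuple

# Field order of a passwd-format line.
PasswdEntry = namedtuple(
    "PasswdEntry", ["name", "passwd", "uid", "gid", "gecos", "home", "shell"]
)


def parse_passwd_line(line):
    """Split one passwd-format line into named fields, with numeric IDs as ints."""
    name, passwd, uid, gid, gecos, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, passwd, int(uid), int(gid), gecos, home, shell)


entry = parse_passwd_line(
    "robert_felty:x:5440:112:Robert Felty:/home/robert_felty:/bin/bash"
)
print(entry.uid, entry.gid, entry.shell)  # prints: 5440 112 /bin/bash
```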