Using awk to sum rows of numbers

Posted in bash, linux, UNIX

= 11142013 robfelty

I have a script which takes a tab-delimited file for regression tests and converts it to XML. As a sanity check, I want to make sure that the number of utterances in my XML files matches the number in the tab-delimited .txt file. I can do this in 2 lines in UNIX:

robert_felty$ wc -l samples2.txt
72148 samples2.txt
robert_felty$ find . -name '*.xml' | xargs grep -c "

In the first line, I count the number of lines in the tab-delimited file (there is a header line, so I expect one fewer utterance than lines, i.e. 72147).


In the next line, I find all the .xml files using find, then pipe that to xargs, where I use "grep -c" to count the number of matches to the utterance pattern. grep -c outputs rows like this:

filename:count
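To see that output format in isolation, here is a small sketch that builds two throwaway files in a temporary directory and runs the same find, xargs and grep -c combination over them. The file names, their contents and the '<utterance' pattern are made up for illustration; they are not the ones from my real data.

```shell
# Throwaway directory with two tiny XML files (hypothetical names and contents).
tmpdir=$(mktemp -d)
printf '<utterance>hi</utterance>\n<utterance>bye</utterance>\n' > "$tmpdir/a.xml"
printf '<utterance>yo</utterance>\n' > "$tmpdir/b.xml"

# grep -c prints one "filename:count" row per file; sort makes the order stable.
counts=$(cd "$tmpdir" && find . -name '*.xml' | xargs grep -c '<utterance' | sort)
printf '%s\n' "$counts"

# Clean up the temporary files.
rm -rf "$tmpdir"
```

Keep in mind that grep -c counts matching lines rather than individual matches, so the count only equals the number of utterances if each one starts on its own line.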

I want to sum up all the counts, so I cut out just the count field using cut, then use awk to add them up.
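The cut and awk step can be sketched on its own like this. It is a minimal illustration rather than my exact command: the sum_counts function name and the sample rows are hypothetical, and it assumes the file names contain no colons, since cut splits on the first one.

```shell
# Hypothetical helper: sum the counts from "filename:count" rows on stdin.
sum_counts() {
    # cut keeps the second colon-separated field (the count);
    # awk adds each count to a running total and prints it after the last row.
    # "total + 0" makes awk print 0 instead of an empty line on empty input.
    cut -d: -f2 | awk '{ total += $1 } END { print total + 0 }'
}

# Sample rows standing in for real grep -c output.
total=$(printf './a.xml:3\n./b.xml:0\n./c.xml:4\n' | sum_counts)
echo "$total"
```

In the real pipeline, the output of xargs grep -c would be piped into the same cut and awk commands in place of the printf sample.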
I love UNIX pipelines!