Ivor O’Connor

March 17, 2009

Sorry state of “agrep”…

Filed under: awk/sed, bash, cli, Linux, ubuntu, vi. Posted by ioconnor @ 7:27 pm

One very useful tool is “agrep”. Say a customer calls in and they live on “pentrose”. If you had a directory of customer files you might be able to search for them by typing “agrep -i -2 pentrose *.*”. The “-i” ignores case and the “-2” tells agrep to accept matches with up to two errors (inserted, deleted, or substituted characters) relative to “pentrose”. So “Penrose”, “pentroose”, “Pentrous” and “Pentrouse” would all match. This is very helpful, especially considering that Google Maps and Thomas Brothers maps often spell street names slightly differently. Unfortunately the “-l” option, which tells agrep to list only the names of the matching files, does not work: it prints a file name and then crashes. See the following dump:

/customers$ agrep -V

This is agrep version 3.0, 1994.

/customers$ locate agrep
/usr/bin/agrep
/usr/share/doc/agrep
/usr/share/doc/agrep/README.gz
/usr/share/doc/agrep/agrep.algorithms
/usr/share/doc/agrep/agrep.ps.1.gz
/usr/share/doc/agrep/agrep.ps.2.gz
/usr/share/doc/agrep/changelog.Debian.gz
/usr/share/doc/agrep/changelog.gz
/usr/share/doc/agrep/contribution.list
/usr/share/doc/agrep/copyright
/usr/share/man/man1/agrep.1.gz
/var/cache/apt/archives/agrep_4.17-5_i386.deb
/var/lib/dpkg/info/agrep.list
/var/lib/dpkg/info/agrep.md5sums
/customers$ agrep -il -1  biider *.*html
customer_2401.shtml
*** glibc detected *** agrep: double free or corruption (top): 0x0821bf48 ***
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6[0xb7e8aa85]
/lib/tls/i686/cmov/libc.so.6(cfree+0x90)[0xb7e8e4f0]
agrep[0x804b7df]
agrep[0x805751c]
agrep[0x8055dea]
agrep[0x80571bc]
agrep[0x8057228]
agrep[0x8066519]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0)[0xb7e35450]
agrep[0x8048b81]
======= Memory map: ========
08048000-08068000 r-xp 00000000 08:01 6931968    /usr/bin/agrep
08068000-08069000 rw-p 00020000 08:01 6931968    /usr/bin/agrep
08069000-0823a000 rw-p 08069000 00:00 0          [heap]
b7b00000-b7b21000 rw-p b7b00000 00:00 0
b7b21000-b7c00000 ---p b7b21000 00:00 0
b7cdf000-b7ce9000 r-xp 00000000 08:01 5744705    /lib/libgcc_s.so.1
b7ce9000-b7cea000 rw-p 0000a000 08:01 5744705    /lib/libgcc_s.so.1
b7cfd000-b7cfe000 rw-p b7cfd000 00:00 0
b7cfe000-b7d3d000 r--p 00000000 08:01 6964092    /usr/lib/locale/en_US.utf8/LC_CTYPE
b7d3d000-b7e1e000 r--p 00000000 08:01 6964091    /usr/lib/locale/en_US.utf8/LC_COLLATE
b7e1e000-b7e1f000 rw-p b7e1e000 00:00 0
b7e1f000-b7f68000 r-xp 00000000 08:01 4252758    /lib/tls/i686/cmov/libc-2.7.so
b7f68000-b7f69000 r--p 00149000 08:01 4252758    /lib/tls/i686/cmov/libc-2.7.so
b7f69000-b7f6b000 rw-p 0014a000 08:01 4252758    /lib/tls/i686/cmov/libc-2.7.so
b7f6b000-b7f6e000 rw-p b7f6b000 00:00 0
b7f70000-b7f71000 r--p 00000000 08:01 6964097    /usr/lib/locale/en_US.utf8/LC_NUMERIC
b7f71000-b7f72000 r--p 00000000 08:01 6964100    /usr/lib/locale/en_US.utf8/LC_TIME
b7f72000-b7f73000 r--p 00000000 08:01 6964095    /usr/lib/locale/en_US.utf8/LC_MONETARY
b7f73000-b7f74000 r--p 00000000 08:01 6971396    /usr/lib/locale/en_US.utf8/LC_MESSAGES/SYS_LC_MESSAGES
b7f74000-b7f75000 r--p 00000000 08:01 6964098    /usr/lib/locale/en_US.utf8/LC_PAPER
b7f75000-b7f76000 r--p 00000000 08:01 6964096    /usr/lib/locale/en_US.utf8/LC_NAME
b7f76000-b7f77000 r--p 00000000 08:01 6964090    /usr/lib/locale/en_US.utf8/LC_ADDRESS
b7f77000-b7f78000 r--p 00000000 08:01 6964099    /usr/lib/locale/en_US.utf8/LC_TELEPHONE
b7f78000-b7f79000 r--p 00000000 08:01 6964094    /usr/lib/locale/en_US.utf8/LC_MEASUREMENT
b7f79000-b7f80000 r--s 00000000 08:01 6947512    /usr/lib/gconv/gconv-modules.cache
b7f80000-b7f81000 r--p 00000000 08:01 6964093    /usr/lib/locale/en_US.utf8/LC_IDENTIFICATION
b7f81000-b7f83000 rw-p b7f81000 00:00 0
b7f83000-b7f84000 r-xp b7f83000 00:00 0          [vdso]
b7f84000-b7f9e000 r-xp 00000000 08:01 5693498    /lib/ld-2.7.so
b7f9e000-b7fa0000 rw-p 00019000 08:01 5693498    /lib/ld-2.7.so
bf89e000-bf8c4000 rw-p bffda000 00:00 0          [stack]
Aborted
customers$

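You’d think such a handy utility would be kept up to date. Yet it reports itself as version 3.0 from 1994, which is 15 years ago. That is not accurate. The package version Synaptic gives is 4.17-5, matching the agrep_4.17-5_i386.deb file that locate found in the apt cache.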

To get around this goofiness and get the behavior of the documented “-l” option without the crash, use:

/customers$ agrep -i -1 biider *.* | awk '{ print $1 }' | sort | uniq
customer_2401.shtml:
customer_6986.php:
/customers$
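The workaround leaves a trailing colon on each name because awk splits on whitespace by default. Here is a minimal sketch of a tidier variant, assuming agrep keeps printing its matches as "file: matched line" the way the output above suggests. The function name file_names and the sample lines are made up for illustration, so the sketch runs even without agrep installed.

```shell
#!/bin/sh
# Reduce "file: matched line" output on stdin to a sorted list of unique
# file names. Splitting on ":" drops the trailing colon; sort -u replaces
# the separate "sort | uniq" step.
file_names() {
    awk -F':' '{ print $1 }' | sort -u
}

# Hypothetical sample lines standing in for real agrep output.
printf '%s\n' \
    'customer_2401.shtml: first matching line' \
    'customer_6986.php: another matching line' \
    'customer_2401.shtml: second matching line' |
    file_names
# prints:
# customer_2401.shtml
# customer_6986.php
```

With agrep installed, the same function would be used as “agrep -i -1 biider *.* | file_names”.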

I just put it into a script file called “f” so I can simply type “f biider” and view the files…

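#!/bin/bash
# f: open in gvim every customer file that fuzzily matches the given pattern(s).
# The $(...) substitutions are left unquoted on purpose so each file name becomes its own argument.
if [ "1" = "$#" ]; then
    # One argument: files matching $1 with at most one error.
    time gvim +/"$1" $(agrep -1 -i "$1" /customers/*.*html /customers/*.php | awk -F":" '{ print $1 }' | sort -rn | uniq )
elif [ "2" = "$#" ]; then
    # Two arguments: of the files matching $1, keep those that also match $2 with at most two errors.
    time gvim +/"$1" -n $(agrep -2 -i "$2" $(agrep -1 -i "$1" /customers/*.*html /customers/*.php | awk -F":" '{ print $1 }' | sort -rn | uniq) | awk -F":" '{ print $1 }' | sort -rn | uniq )
else
    echo "wrong number of arguments"
fi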
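One thing the script above does not handle is the case where nothing matches: gvim then simply opens with no files. The following is only a sketch of how that could be guarded, not a replacement for the script. SEARCH and EDITOR_CMD are hypothetical variables I introduce here so the logic can be tried without agrep or gvim, and the function names are made up too. It also assumes file names contain no whitespace, which holds for names like customer_2401.shtml.

```shell
#!/bin/sh
# Sketch only. SEARCH and EDITOR_CMD are hypothetical overrides (not in the
# original script); by default they are agrep and gvim.
SEARCH=${SEARCH:-agrep}
EDITOR_CMD=${EDITOR_CMD:-gvim}

# Print each file that fuzzily matches pattern $1 (at most one error), once.
# The remaining arguments are the files to search.
matching_files() {
    mf_pattern=$1
    shift
    "$SEARCH" -1 -i "$mf_pattern" "$@" | awk -F':' '{ print $1 }' | sort -u
}

# Open the matching files in the editor with the pattern as the initial
# search; complain and return 1 instead when no file matched.
open_matches() {
    om_pattern=$1
    om_files=$(matching_files "$@")
    if [ -z "$om_files" ]; then
        echo "no files match '$om_pattern'" >&2
        return 1
    fi
    # Unquoted on purpose: one word per file name (assumes no whitespace).
    "$EDITOR_CMD" "+/$om_pattern" $om_files
}
```

It would be called as “open_matches biider /customers/*.*html /customers/*.php”. Note that sort -u lists the files in ascending order, whereas the script above uses “sort -rn | uniq”.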

agrep is so very useful. Somebody really should fix it. Until then use my workaround.
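P.S. To make the “-2” error count from the top of this post concrete, here is a rough sketch that computes the plain Levenshtein edit distance between two whole words, ignoring case. It is not how agrep works internally (agrep looks for an approximate match anywhere in a line and uses much faster algorithms), and the function name edit_distance is my own invention. It only illustrates why the four spellings are all within two errors of “pentrose”.

```shell
#!/bin/sh
# Hypothetical helper, not part of agrep: print the Levenshtein distance
# (insertions, deletions, substitutions) between $1 and $2, ignoring case.
edit_distance() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        a = tolower(a); b = tolower(b)
        la = length(a); lb = length(b)
        # prev[j] holds the distance between the empty string and b[1..j].
        for (j = 0; j <= lb; j++) prev[j] = j
        for (i = 1; i <= la; i++) {
            cur[0] = i
            for (j = 1; j <= lb; j++) {
                cost = (substr(a, i, 1) == substr(b, j, 1)) ? 0 : 1
                min = prev[j] + 1                                   # deletion
                if (cur[j-1] + 1 < min) min = cur[j-1] + 1          # insertion
                if (prev[j-1] + cost < min) min = prev[j-1] + cost  # substitution
                cur[j] = min
            }
            for (j = 0; j <= lb; j++) prev[j] = cur[j]
        }
        print prev[lb]
    }'
}

for name in Penrose pentroose Pentrous Pentrouse; do
    printf '%s %s\n' "$name" "$(edit_distance pentrose "$name")"
done
# prints:
# Penrose 1
# pentroose 1
# Pentrous 2
# Pentrouse 1
```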


February 22, 2009

Awk and Sed one liners explained

Filed under: awk/sed, bash, cli, php, Uncategorized. Posted by ioconnor @ 7:44 pm

There is an excellent article at OSNews: http://www.osnews.com/story/21004/Awk_and_Sed_One-Liners_Explained

Much of the stuff presented there is very useful. However… Yes, there is a big “however”. Do you really want to use these platform-specific, arcane relics that lack the error handling needed to cope with unexpected scenarios? Short answer… NO.

Use PHP instead. It can do everything awk/sed can, and much more easily. More importantly, you can plan for the unexpected, since you get errors back instead of just blindly assuming the awk and sed commands worked as they did in your microcosm of a test world.

Sure, PHP is not always installed on the platform. Say you are planning to put your code in a Debian package and you are afraid to use PHP since it does not come preinstalled. Well, simply list the latest release of PHP as a requirement for your package. Done. PHP then gets installed before your package does, so you have all the wonderful PHP tools, and perhaps libraries, available.

Then there is the matter of running your code on Windows platforms. Same thing. Require PHP to be installed first. It is available for free in the M$ world too. Sure, there are some cooler languages like Python and Ruby, but they simply are not as solid as PHP, nor do they have the excellent PEAR libraries.
