release: extract

extract is a small text-processing utility that prints only the matching part of each line. It works much like grep -o (grep --only-matching), but it is written in Ruby. Unfortunately, I didn’t look at the grep manual before writing this…

The pattern is a Ruby regular expression with no modifiers. If a line has no match, nothing is printed for that line. Its purpose, at least for me, is to extract values out of log files so that they are easier to scan.
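To make the matching behaviour concrete, here is a minimal sketch of the per-line logic. It is not taken from the extract source: the helper name extract_match is hypothetical, and returning only the first match on a line is an assumption.

```ruby
# Hypothetical helper for illustration; not taken from the extract source.
# Returns the first substring of `line` matching `pattern`, or nil if none.
# Assumption: only the first match on each line is of interest.
def extract_match(pattern, line)
  line[Regexp.new(pattern)]
end

lines = [
  "source=web.2 sample#memory_total=852.89MB sample#memory_rss=823.62MB",
  "a line with no memory figure"
]

lines.each do |line|
  match = extract_match('memory_total=\d+\.\d+MB', line)
  puts match if match # lines without a match print nothing
end
# prints: memory_total=852.89MB
```

The pattern is passed as a single-quoted string so that \d and \. reach Regexp.new unchanged.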

It’s a Ruby script. I didn’t know exactly how to make a Ruby script behave well in a pipeline, so I followed this excellent blog post by Jesse Storimer and re-used the core of Jesse’s gist.

Usage

You use this script much like any other text-processing utility for Unix:

$ extract PATTERN Gemfile
$ extract PATTERN Gemfile | more
$ extract PATTERN Gemfile > output.txt
$ extract PATTERN Gemfile Gemfile.lock
$ cat Gemfile* | extract PATTERN
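The invocations above cover both file arguments and piped input. One way a Ruby script can support both is sketched below. This is an illustration under assumptions, not the actual extract source: extract_stream is a hypothetical name, and treating a closed pipe as a quiet exit is my guess at what respecting pipelines involves.

```ruby
# Hypothetical sketch; not the actual extract source.
# Reads `input` line by line and writes the matched part of each
# matching line to `output`. Lines without a match are skipped.
def extract_stream(pattern, input, output)
  regex = Regexp.new(pattern)
  input.each_line do |line|
    match = regex.match(line)
    output.puts(match[0]) if match
  end
rescue Errno::EPIPE
  # The reader (for example `head`) closed the pipe early; stop quietly.
end
```

In a script this could be called as extract_stream(ARGV.shift, ARGF, $stdout). ARGF reads the files named in the remaining arguments in order, and falls back to standard input when no files are given, which is what makes the piped form work.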

Given an input like this

2014-10-31T23:56:02.937999+00:00 heroku[web.2]: source=web.2 dyno=heroku.2266096.83372ea7-5f35-4ced-941f-3d41379deaeb sample#memory_total=852.89MB sample#memory_rss=823.62MB sample#memory_cache=9.56MB sample#memory_swap=19.71MB sample#memory_pgpgin=640456pages sample#memory_pgpgout=427162pages
2014-10-31T23:56:13.129107+00:00 heroku[web.1]: source=web.1 dyno=heroku.2266096.83a6bbb7-bd87-4cd3-976d-ab97e91ae7e9 sample#memory_total=845.93MB sample#memory_rss=833.33MB sample#memory_cache=6.88MB sample#memory_swap=5.71MB sample#memory_pgpgin=433064pages sample#memory_pgpgout=217969pages
2014-10-31T23:56:17.908008+00:00 heroku[sidekiq.1]: source=sidekiq.1 dyno=heroku.2266096.20cb65fb-2cea-4e29-bbe2-13c67afe4a78 sample#memory_total=259.89MB sample#memory_rss=259.71MB sample#memory_cache=0.12MB sample#memory_swap=0.05MB sample#memory_pgpgin=1742002pages sample#memory_pgpgout=1675483pages

You want to be able to scan the memory_total value, so you run

$ extract "memory_total=\d+\.\d+MB" log.txt

To get this

memory_total=852.89MB
memory_total=845.93MB
memory_total=259.89MB

Installation

You can install it by downloading the source into a directory that is on your $PATH and making it executable.

$ curl https://gist.githubusercontent.com/vroy/8dae1d39544fad407924/raw/2d0da9234de179c43d5e3020e6b6272f10ed3d80/extract.rb > ~/bin/extract
$ chmod +x ~/bin/extract