Tuesday, October 9, 2012

Regular Expression using Linux Grep

The Linux grep command is commonly used to filter a file or a string. It is most powerful when combined with extended regular expressions, which are enabled by the -E option. The following examples may give you some inspiration:

Find a line that contains the word "team":
$ echo "we are in a team in US." | grep -E '\bteam\b'
we are in a team in US.

This regular expression is equivalent to using the -w option: grep -E -w 'team'.

Note:
\b matches a word boundary; it doesn't consume any character of the string.
See also the difference between basic regular expressions and extended regular expressions.

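A tricky point: if the string contains "teams", the above regular expression will not match, because there is no word boundary between "m" and "s". A string containing "team!" still matches, since the exclamation mark is not a word character.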

If we change the regular expression to '\bteam', "teams" matches as well:
$ echo "we are teams in US." | grep -E '\bteam'
we are teams in US.
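
The difference between the three patterns can be checked side by side. The following is a minimal sketch, assuming GNU grep (the \b escape is a GNU extension); the sample lines are made up for illustration. It counts how many lines each pattern selects.

```shell
#!/bin/bash
# Sample input: one candidate string per line (illustrative only).
input=$'we are in a team in US.\nwe are teams in US.\ngo team!\nsteam engine'

# \bteam\b : "team" as a whole word.
whole=$(printf '%s\n' "$input" | grep -cE '\bteam\b')
# -w : the same whole-word match, requested through an option.
word_opt=$(printf '%s\n' "$input" | grep -cw 'team')
# \bteam : any word that starts with "team".
prefix=$(printf '%s\n' "$input" | grep -cE '\bteam')

echo "whole=$whole word_opt=$word_opt prefix=$prefix"
```

Only the last pattern selects the "teams" line, and none of them selects "steam", because there is no word boundary before the "t".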

Tuesday, September 25, 2012

Use Curl to log in to a page

This article gives an example of using the Linux curl command to log in to a page.

Assuming we want to log in to a page like this:

So, the first thing is to look at the HTML code to determine which input fields are submitted to the server. We can use the browser's developer tools to view the source, where we find something like the following:
<form action="http://annunziato.org/academia/login/index.php" method="post" id="login">

          <div class="loginform">          

              <input type="text" name="username" id="username" size="15" value="">

              <input type="password" name="password" id="password" size="15" value="">

              <input type="submit" id="loginbtn" value="Login"> 

             <input type="checkbox" name="rememberusername" id="rememberusername" value="1"> 

          </div>

</form>

Under the form tag, three input tags have names: username, password and rememberusername. So we use the curl option --data to submit these fields.
#!/bin/bash
# Fetch the login page first so that the server's cookies are stored.
timemark=$(date +%s)
cookieFileName="cookie${timemark}"
curl --cookie-jar "$cookieFileName" \
     --location \
     --output login.html \
     http://www.somewebsite.com/academia/login/index.php
# Post the form fields, sending the stored cookies back.
curl --cookie-jar "$cookieFileName" \
     --cookie "$cookieFileName" \
     --data username=myname \
     --data @pwdfile \
     --data rememberusername=1 \
     --location \
     http://www.somewebsite.com/academia/login/index.php \
     --output welcome.html

Line 13 (--data @pwdfile) reads the password field from a file called "pwdfile", whose content follows the pattern "key=value".
The --data option passes form data such as user name and password to the specified URL using POST.
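
The password file can be prepared in advance. The following is a minimal sketch: the field name password comes from the form above, while the password value is hypothetical and only there for illustration. When curl reads a file through --data @file it strips newlines, so a trailing newline would do no harm, but printf without one keeps the file exact.

```shell
#!/bin/bash
# Hypothetical password, for illustration only.
password='s3cret'
# Files created from here on are readable by the owner only.
umask 077
# Write the form field in key=value form, as expected by --data @pwdfile.
printf 'password=%s' "$password" > pwdfile
# Show what curl will read.
cat pwdfile; echo
```

A password containing characters such as & or = would need URL encoding; curl's --data-urlencode option exists for that case.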

We save this script as curlLogin.sh and make it executable with chmod 755 curlLogin.sh.
Then, execute it.

If everything goes well, welcome.html should contain the page shown after a successful login.

Saturday, September 22, 2012

linux shell xargs command

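The xargs command reads items from standard input (for example, from a pipe), splits them on blanks and newlines, and passes them as arguments to the actual command. By default the actual command is executed once with all the arguments, as long as they fit within the system's command-line length limit.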

Example:


find /path -type f -print0 | xargs -0 rm

Some important options:

-print0 (an option of find, not of xargs)
terminate each file name with an ASCII NUL character. Generally used in concert with xargs -0

-t
show the actual command before it is executed

-L num
use at most num non-blank input lines per command invocation

-d char
specify the input delimiter (GNU xargs)
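
The options above can be tried without touching any files. The following is a minimal sketch, assuming GNU xargs; the input items are made up for illustration.

```shell
#!/bin/bash
# By default xargs packs all items into a single command line.
all=$(printf 'a\nb\nc\n' | xargs echo)
# -L 1 runs the command once per input line.
per_line=$(printf 'a\nb\nc\n' | xargs -L 1 echo item:)
# -0 reads NUL-terminated items, so a name containing a space stays one argument.
nul=$(printf 'one file\0two\0' | xargs -0 -I {} echo "[{}]")

echo "$all"
echo "$per_line"
echo "$nul"
```

The first call prints all three items on one line, while the other two print one line per item.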


Clean all Subversion directories out of the current directory, recursively (-print0 and -0 keep paths that contain spaces intact):

find . -type d -name ".svn" -print0 | xargs -0 rm -rf

ls | xargs echo
ls | xargs -I {} echo "file:"{}

Tuesday, September 11, 2012

Using bash to check IP validity

The following snippet of code checks the validity of an IP address.

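#!/bin/bash

# Print an error message for a malformed address.
usage () {
    echo "incorrect IP format."
}

# Prompt the user for an address.
readip () {
    read -r -p "Your IP: " IP
}

# Return 0 if $IP is a dotted quad with every octet in 0..255.
chkip () {
    # Reject non-digit characters, a leading or trailing dot, and empty octets.
    echo "$IP" | grep -Eq '[^0-9.]|^\.|\.$|\.\.' && return 1
    # Exactly four octets are required.
    [ $(echo "$IP" | tr '.' '\n' | wc -l) -eq 4 ] || return 1
    for i in ${IP//./ } ; do
        # 10# forces base 10 so that an octet such as 08 is not read as octal.
        [ $((10#$i)) -le 255 ] || return 1
    done
}

if [ "$1" ]; then
    IP=$1
else
    readip
fi
# Keep asking until the address is valid; stop if no more input is available.
until chkip; do
    usage
    readip || exit 1
done
echo "$IP is good!"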
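
The same check can also be written without external commands. The following is a sketch of an alternative, not the script above: valid_ip is a hypothetical helper name, and the approach swaps grep and wc for bash's built-in =~ operator.

```shell
#!/bin/bash
# valid_ip (hypothetical helper): returns 0 for a dotted quad with octets 0..255.
valid_ip () {
    local ip=$1 octet
    # Four groups of one to three digits, separated by dots.
    [[ $ip =~ ^[0-9]{1,3}(\.[0-9]{1,3}){3}$ ]] || return 1
    # Split on dots for the loop below; IFS is restored when the function returns.
    local IFS=.
    for octet in $ip; do
        # 10# forces base 10, so "08" is not parsed as octal.
        (( 10#$octet <= 255 )) || return 1
    done
}

# Try a few made-up candidates.
for candidate in 192.168.1.1 256.1.1.1 1..2.3 10.0.08.1; do
    if valid_ip "$candidate"; then
        echo "$candidate is good"
    else
        echo "$candidate is bad"
    fi
done
```

Keeping the check in a function that only returns a status makes it easy to reuse in a loop or in another script.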



Note: the grep option -E enables extended regular expressions, which are similar to the regular expressions of most modern tools and support more metacharacters. Reference: POSIX Extended Regular Expressions.

Note: POSIX ERE alternation returns the leftmost match; among the matches that start at that position, it returns the longest.
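
The leftmost-longest rule can be observed with grep -o, which prints only the matched part of a line. The following is a small sketch, assuming GNU grep; the sample strings are made up for illustration.

```shell
#!/bin/bash
# Both alternatives match at the start of the string; the longer one is chosen.
longest=$(echo "SetValue" | grep -oE 'Set|SetValue')
# Leftmost comes first: the earlier match is reported first even though it is shorter.
leftmost=$(echo "ab abcd" | grep -oE 'abcd|ab' | head -n 1)

echo "longest=$longest leftmost=$leftmost"
```

The order in which the alternatives are written does not change either result.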

Monday, July 23, 2012

Split command output into lines on a specified delimiter with a shell script

The scenario: I want to split a command's output into lines to make it readable.
The command comes from the Apache Hadoop MapReduce framework.

$ hadoop classpath

/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../conf:/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home/lib/tools.jar:/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/..:/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../hadoop-core-1.0.3.jar:/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../lib/asm-3.2.jar:/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../lib/aspectjrt-1.6.5.jar...

Save the following snippet of code as mysplit.sh:


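#!/usr/bin/env bash

# Usage: ./mysplit.sh DELIMITER
IN=$(hadoop classpath)
#echo "$IN"
# Turn every delimiter character into a newline, then print one entry per line.
echo "$IN" | tr "$1" "\n" | while IFS= read -r x
do
   echo ">[$x]"
done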

$ ./mysplit.sh ":"

>[/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../conf]
>[/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home/lib/tools.jar]
>[/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/..]
>[/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../hadoop-core-1.0.3.jar]
>[/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../lib/asm-3.2.jar]
>[/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../lib/aspectjrt-1.6.5.jar]
>[/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../lib/aspectjtools-1.6.5.jar]
>[/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../lib/commons-beanutils-1.7.0.jar]
>[/Users/zhouyaofei/Pictures/hadoop-1.0.3/libexec/../lib/commons-beanutils-core-1.8.0.jar]
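
The same splitting can be done on any string, without Hadoop installed. The following is a minimal sketch using bash's own field splitting instead of tr: split_lines is a hypothetical helper name, and the sample path list is made up for illustration.

```shell
#!/usr/bin/env bash
# split_lines (hypothetical helper): print each field of $1, split on the delimiter $2.
split_lines () {
    local IFS="$2" field
    local -a fields
    # read -a splits the string on IFS into an array.
    read -r -a fields <<< "$1"
    for field in "${fields[@]}"; do
        echo ">[$field]"
    done
}

split_lines "/opt/conf:/opt/lib/a.jar:/opt/lib/b.jar" ":"
```

Because IFS is declared local, the delimiter only applies inside the function and the caller's word splitting is left unchanged.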