The best way to code "htmlElementcount.sh"

Post by phil on Mon Mar 03, 2008 3:32 pm

Instead of reading each line of the access log in a loop, let the shell's command substitution and egrep do the hard work for you!

For performance, spell out both cases in the pattern (e.g. [Hh]) rather than using '-i', as this can be quicker.

So we would code...

#!/bin/bash

#
# htmlElementcount.sh


function die {
   echo "$*" >&2 ; exit 1
}

[ 1 -ne $# ] && die "usage: $(basename "$0") LOG_FILE"
[ ! -e "$1" ] && die "LOG_FILE [$1] does not exist"
[ ! -s "$1" ] && die "LOG_FILE [$1] is empty"

cat <<EOF
Page Hits
$(wc -l < "$1" | tr -d ' ') pages accessed - Form Elements Processed:
$(egrep -cq '[.][Hh][Tt][Mm][Ll]' "$1" | tr -d ' ') html pages accessed
$(egrep -cq '[.][Gg][Ii][Ff]' "$1" | tr -d ' ') GIF files accessed
$(egrep -cq '[.][Jj][Pp][Gg]' "$1" | tr -d ' ') jpg files accessed
EOF
EOF

exit 0
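
For anyone who wants to check the [Hh] idea before relying on it, here is a minimal sketch that runs both styles of search against a tiny made-up log and confirms they count the same lines. The log contents and the variable names (log, bracket_count, icase_count) are assumptions for illustration only. It uses grep -E, which is the same thing as egrep (newer GNU grep releases flag egrep as obsolescent).

```shell
#!/bin/bash
# Hypothetical sample log; the addresses and paths are made up.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [03/Mar/2008:15:32:00] "GET /index.html HTTP/1.0" 200 1043
10.0.0.2 - - [03/Mar/2008:15:32:01] "GET /LOGO.GIF HTTP/1.0" 200 2326
10.0.0.3 - - [03/Mar/2008:15:32:02] "GET /about.HTML HTTP/1.0" 200 512
10.0.0.4 - - [03/Mar/2008:15:32:03] "GET /photo.jpg HTTP/1.0" 200 9000
EOF

# Both cases spelled out in the pattern, as in the script above.
bracket_count=$(grep -Ec '[.][Hh][Tt][Mm][Ll]' "$log")

# The same search, letting -i ignore case instead.
icase_count=$(grep -Eic '[.]html' "$log")

echo "bracket: $bracket_count, -i: $icase_count"
rm -f "$log"
```

Both searches should report 2 for this sample. Which one is faster probably depends on the grep version and locale, so it is worth timing both on a real log rather than taking either result for granted.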
phil
contributor
 
Posts: 2
Joined: Mon Mar 03, 2008 3:24 pm

Post by laum on Sat Mar 08, 2008 2:33 am

Hey Phil,


Thanks for the tips. I never thought about using bracket expressions like [Hh] instead of e|grep's -i option for speed, but it makes sense since you're letting the pattern do the work for you.

Sorry I didn't respond to you sooner. I just transferred my Comcast service from one location to another and they "comcastically" gave me a new email address and I had to wait a week for them to get all the emails, etc, that they "lost" during that time period.

Appreciate the suggestion and thanks for taking the time to help out. I promise I won't use your better-method in our blog without crediting you, unless you don't want it out there at all with your name on it. Let me know; it would make a great follow-up post and I like to give attribution where I can (maybe promo a link to your site or something, rather than giving out your email). In any event, I won't post any of your info if you don't say it's okay.

Thanks, again :)

, Mike
laum
Site Admin
 
Posts: 46
Joined: Sun Oct 14, 2007 7:04 pm

Post by phil on Sat Mar 08, 2008 5:06 am

No problem, you're welcome to 'use' the post.

Post by laum on Sat Mar 08, 2008 11:43 pm

Hey Phil,

Thanks a lot :) I'll give propers to your username :)

Cheers,

, Mike

Post by laum on Mon Mar 10, 2008 2:02 pm

Hey Phil,

I just got a chance to check out your code and it looks good.

I just needed to alter the:

egrep -cq

to be

egrep -c

So the counts would print. I was using -q before because I was checking the exit status ($?) afterwards and counting outside of that. Like you said, my script was doing a lot of extra work ;)
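
To make the -cq versus -c difference concrete, here is a small sketch on a made-up three-line log. The file contents and the variable names (quiet_out, quiet_status, count_out) are assumptions for illustration. With -q, grep writes nothing to standard output and only reports through its exit status, so the count never reaches the here-document.

```shell
#!/bin/bash
# Hypothetical three-line log, just enough to show the behaviour.
log=$(mktemp)
printf '%s\n' 'GET /index.html' 'GET /logo.gif' 'GET /about.HTML' > "$log"

# With -q nothing is printed; only the exit status says whether a match exists.
quiet_out=$(grep -Ecq '[.][Hh][Tt][Mm][Ll]' "$log")
quiet_status=$?

# Without -q the number of matching lines is printed.
count_out=$(grep -Ec '[.][Hh][Tt][Mm][Ll]' "$log")

echo "with -cq: output='$quiet_out' status=$quiet_status"
echo "with -c:  output='$count_out'"
rm -f "$log"
```

Here the -cq call leaves quiet_out empty with an exit status of 0, while the -c call captures 2.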

Definitely great work, though - check out these timings

my original brutish script: real 0m3.358s
your streamlined script : real 0m0.108s

Those 3 seconds would add up - I was only using a 500-line access file!
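
For anyone who wants to reproduce a comparison like this, here is a rough sketch. The original line-by-line script isn't shown in this thread, so count_by_loop is only a hypothetical stand-in for that style (one grep process per log line), and the generated 500-line log is made up. count_by_grep is the single-pass approach from phil's post.

```shell
#!/bin/bash
# Build a hypothetical 500-line log: every fifth request is an .html page.
log=$(mktemp)
i=1
while [ "$i" -le 500 ]; do
    if [ $((i % 5)) -eq 0 ]; then
        echo "GET /page$i.html"
    else
        echo "GET /image$i.gif"
    fi
    i=$((i + 1))
done > "$log"

pattern='[.][Hh][Tt][Mm][Ll]'

# Stand-in for a line-by-line script: starts one grep process per log line.
count_by_loop() {
    n=0
    while IFS= read -r line; do
        if printf '%s\n' "$line" | grep -Eq "$pattern"; then
            n=$((n + 1))
        fi
    done < "$1"
    echo "$n"
}

# Single pass: one grep process counts the matching lines in the whole file.
count_by_grep() {
    grep -Ec "$pattern" "$1"
}

loop_count=$(count_by_loop "$log")
grep_count=$(count_by_grep "$log")
echo "loop: $loop_count, single grep: $grep_count"
rm -f "$log"
```

Both functions should agree on the count. In bash, moving the rm to the end and prefixing each call with the time keyword (for example, time count_by_loop "$log") gives a comparison along the lines of the one above, though the actual numbers will vary by machine.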

Good stuff. Thanks, again :)

, Mike

Post by yahoozer on Fri Apr 18, 2008 10:57 pm

Good solution. Thank you!

Yaz
yahoozer
 
Posts: 9
Joined: Sun Oct 28, 2007 4:40 pm

