Whenever I review a piece of code, I stress a lot to follow even
the simplest of the coding principles. When a developer questions the significance of
those principles, I give mostly theoretical examples of why the principles are
important in scenarios that don’t happen often. Seriously, how often do we see
those coding principles making an actual difference in a normal project lifecycle?
The simple coding principles may not always make a lot of difference.
But when they do, it’s an embarrassment to the developer and not many
developers get embarrassed on a regular basis.
They are rare but very important. I haven’t seen
the principles making a huge difference in practice, well, until recently. So
when I have a live example, I better document it, for re-usability of course. Here it goes.
Consider the piece of code below:
#/bin/bash
bash getpendinglist.sh # populates /tmp/list.txt by querying against
a database
while read
line
do
name=`echo $line | cut –d’|’ -f1`
location=`echo $line | cut –d’|’ -f2`
status=`echo $line | cut –d’|’ -f3`
if [ “$status” == “PENDING” ]; then
echo
“Pending entry identified: $location/$name”
bash
process.sh $name $location
fi
done <
/tmp/list.txt
At the first look, everything looks OK. The
getpendinglist.sh retrieves the list of pending entries (from a database using
sqlplus). The file with the retrieved entries is read line by line in a while
loop. The line contains 3 values delimited by a pipe (|) character. If the 3rd
value is “PENDING”, the process.sh is called with the name and location
variables. Otherwise the line is skipped. The process.sh is the actual business
process, in our case an ETL process. Straight-forward, right?
But it is not following the simple rule mentioned above. The
initialization of name and location is required only for lines with status
value as “PENDING”. So the 2
initialization statements should have been placed under if statement. But come
on, how much could 2 simple initialization statements hurt?
The script worked fine for almost 18 months. The maximum CPU
usage was around 70%. The CPU utilization itself didn’t bother us because after
all the ETL process.sh is supposed to be CPU intensive and there was no
operational impact.
All of a sudden, unexpected pauses started to appear in the
processing. Even when there was only one PENDING entry in the database, it took
a few hours even to start processing the PENDING entry and the CPU usage shot
up to a consistent 90%. Even when there are no entries with PENDING as status
in the list.txt, the CPU usage never came down below 75%. That triggered the
analysis because if there are no PENDING entries, the process.sh that is supposed
to be CPU-intensive never runs. So there is no way the CPU usage can be 90%.
There is an obvious problem to analyse now.
The unix admin for our servers, a
very good one at that, did some deep-diving and finally figured out a crucial cause. Of
90% the CPU usage, 75% is in the system kernel level and too many processes are
being created (~200 per second). The list.txt contained 1000s
of entries without having status as PENDING.
The getpendinglist.sh that populates the list.txt didn’t use a
filter clause in its database query, it retrieved all the entries from the
database rather than getting only the pending entries. Given that every cut
statement is a new process, every 1000 entries in the file created 3000
processes (3 per entry). Obviously the cut is going to be a very short-lived
process, only in milliseconds at best. As every new process creation has a kernel overhead, the CPU usage
at the kernel level was consistently high.
The fix was very simple indeed,
- Add a filter in the database query to remove entries that don't have status as PENDING
- Move the cut statements inside the if statement
A true silver bullet fix, brought down the CPU usage to 10% in
the system kernel level. This is a clear demonstration of the importance of the
coding principles that are overlooked many times.
- Write a line of code only when it is necessary; initialize a variable only when it is going to be used
- Don’t just go for a simple query, aim for the perfect query that will give you just what you wanted