Software Automation Testing Using Sikuli

Sikuli is a scripting tool that automates the testing of graphical user interfaces (GUIs) using screenshot images of the software under test.

By: Aakash Beniwal, OpenSource For You

The author works with Infosys Limited, Pune, as a testing engineer and has over two years’ experience in this domain.

Testing a software project is as important as developing it. It validates different aspects of the product through functional testing, security testing, database testing and so on. The testing process can be manual or automated. Manual testing is performed by a person sitting in front of a computer, carefully executing tedious and time-consuming tests. Testing can also be done with suitable automation tools, which can make it faster and more repeatable. Whether one opts for manual or automated testing depends on factors such as the project’s requirements, the budget, timelines, the expertise available and how well the application lends itself to automation.

A few major reasons why one should opt for automation testing are listed below:

• Manual testing is very time-consuming when it comes to testing the overall flow and covering all scenarios.

• Regression testing becomes tedious when done manually, as it requires repeating the same actions and steps.

• Manual testing is physically tiring.

• Manual testing is also less thorough than automation. To err is human, so manual testing is more error-prone.

Automation testing is not always the right choice, however. Sometimes it may be too sophisticated for the job and could wind up costing far more than it is worth. So it is important to choose the correct automation tool for your project. Free and open source tools include Selenium, Robotium, AutoIt, Sahi and Sikuli, while others, such as HP Unified Functional Testing (UFT) and Tosca, are licensed and must be paid for. Choosing open source tools can reduce the project cost; paid tools, however, often offer more features, can be less time-consuming to use and come with dedicated support teams. Depending on your project’s requirements, you may opt for any automation tool that enhances your testing scope and speed.

Let’s now discuss an emerging automation tool called Sikuli.

Introducing Sikuli

Sikuli is an open source automation tool that uses image recognition to identify and control GUI components. It can be integrated with Selenium WebDriver to automate Flash content and Java applets.

According to the official site, in the Huichol Native American language, Sikuli refers to God’s Eye, implying the power to see and understand things unknown. It is a cross-platform software framework released under the MIT License. It was started in mid-2009 as an open source project by Tom Yeh and Tsung-Hsiang Chang at the User Interface Design Group at MIT, USA. Both developers worked on Sikuli until Sikuli-X-1.0rc3 in 2012. Then Raimund Hocke (aka RaiMan) took over development and support for Sikuli. He developed it further as the SikuliX package (where X denotes eXperimental) together with the open source community, and continues to maintain it with its help.

Sikuli can automate anything you see on your desktop screen. It also comes with basic text recognition (OCR), powered by Tesseract, which can be used to search for text in images.

So we can say that us­ing Sikuli is WYSIWYS or What You See Is What You Script.

Sikuli can automate testing from screenshots placed on PowerPoint slides, while code lovers can write scripts in the IDE to extend its functionality. The framework is very useful in scenarios like the following:

• It is well suited to Flash applications. Selenium WebDriver needs programmatic access to an application’s elements, whereas Sikuli needs only what appears on screen. For example, to validate in Adobe Photoshop whether an image was opened or not, Sikuli can be used without any API.

• It is useful where an application has very complex source code but a very simple visual layout. Without going into the source code or XPath expressions, we can automate and test the functionality of that application.

• It is useful where the application code changes frequently but the GUI components remain the same. In such cases, the functionality of the application can still be validated using Sikuli.

• It is useful for game testers as well, who can perform some testing on a game without using an API.

The system requirements for Sikuli are:

• Windows XP and later, including Windows 8 and 10 (32-bit and 64-bit).

• Linux/UNIX systems, depending on what prerequisites are available (32-bit or 64-bit).

• Mac OS X 10.5 and later (64-bit only).

In­stal­la­tion of Sikuli on Win­dows

The path for down­load­ing and set­ting up Sikuli is https:// launch­­load.

Download the Sikuli setup .jar file from this link.

Once the download is complete, double-click the sikulixsetup1.1.1.jar executable file and follow the instructions to install the Sikuli IDE.

Now, to run the Sikuli IDE, open a command prompt, go to the folder where you have just installed Sikuli and run runsikulix.cmd. This will open the SikuliX IDE window.

Sikuli can be divided into two parts:

• Integrated Development Environment (IDE): This is used to create scripts by taking screenshots.

• Sikuli Script (API): This is a Jython and Java library that interacts with the GUI by generating keyboard and mouse events.

Both these components are part of SikuliX.

Some ba­sic fea­tures of Sikuli

Let us go through some ba­sic func­tions of Sikuli.

type(): The type command is a very basic command, which we can use to enter input or text:

type("This is sample text for the type command")

The type command can also be given a target image. While scripting, we capture a particular area of the application; during execution, type() first finds that area on screen, clicks it to give it focus, and then types there. We can also pass an optional modifier to the type command to hold down modifier keys, as shown in the example below:

type("text", KeyModifier.ALT)

wait() and waitVanish(): These methods pause the script until something appears on screen (wait) or disappears from it (waitVanish). They take an optional duration parameter, which can be a number of seconds or the constant FOREVER, which waits until the event happens.
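The two waiting calls can be combined as in the minimal sketch below. It assumes that region is a SikuliX Region or Screen object; the function name and the image file names are hypothetical and serve only as illustration.

```python
def save_and_confirm(region, save_button, progress_bar, timeout=10):
    """Click a Save button, then wait for a progress indicator to come and go.

    `region` is assumed to be a SikuliX Region or Screen; the two image
    arguments are hypothetical screenshot file names.
    """
    # wait() blocks until the image appears and returns the match;
    # in SikuliX it raises FindFailed if the timeout expires first.
    button = region.wait(save_button, timeout)
    region.click(button)

    # The progress bar is expected to show up shortly after the click...
    region.wait(progress_bar, timeout)
    # ...and waitVanish() reports True once it is gone, False on timeout.
    return region.waitVanish(progress_bar, timeout)
```

Inside the SikuliX IDE this might be called as save_and_confirm(SCREEN, "save.png", "progress.png"). Passing FOREVER as the timeout would make each step wait indefinitely.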

find() and findAll(): These are two other common operations in Sikuli, used to search for things and interact with them. find() returns the best match for an image, while findAll() is used when operating on a group of similar items on the screen. We can store the region returned by find() in a variable r, as shown below (image1.png stands in for the screenshot of the target):

r = find("image1.png")

Later, we can call wait(), click(), type() and other functions on that variable. This restricts the search area and hence helps speed up the script. Selecting a region and assigning it to a variable also helps when there are multiple similar items on screen and we want to deal with one particular item at a time. For example, r.click() acts on the stored region instead of searching the whole screen again.

highlight(): This is another basic command, used to draw a box around a particular region so that you can see what Sikuli has matched.
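As a rough illustration of working inside a stored region, the sketch below finds a panel once and then keeps later work within it. It assumes that screen is a SikuliX Screen or Region; the function name and the image names used in the usage note are hypothetical.

```python
def type_into_panel(screen, panel_image, field_image, text):
    """Locate a panel once, then do all further work inside that region."""
    # One search over the screen to locate the panel; the returned
    # match can itself be used as a (smaller) region.
    panel = screen.find(panel_image)

    # Frame the matched area for one second, which helps while debugging.
    panel.highlight(1)

    # type(target, text) looks for the field inside the panel only,
    # clicks it to give it focus, and then types the text.
    panel.type(field_image, text)
    return panel
```

In the SikuliX IDE a call could look like type_into_panel(SCREEN, "login_panel.png", "user_field.png", "alice"). Because the field is searched for only inside the panel, a similar-looking field elsewhere on the screen is not picked up by mistake.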

Flow con­trol technique in Sikuli

Sikuli scripts can use Python control structures, such as a for loop in combination with the findAll() function. A sample for loop is given below:


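# Find the area that contains the options, then search only inside it
below_options = find("image1.png")
checkboxes = below_options.findAll("image.png")
# Draw a box around each matching checkbox for one second
for checkbox in checkboxes:
    checkbox.highlight(1)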


Similarly, the if and while flow control mechanisms can also be used, which allow more complex interactions through scripting.
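A small sketch of if and while used together is given below. It assumes that region is a SikuliX Region or Screen and relies on exists(), which returns a match or None instead of raising an error. The function name, the image name and the max_rounds limit are hypothetical.

```python
def dismiss_dialogs(region, ok_button, max_rounds=5):
    """Keep clicking an OK button while it is visible; return the click count."""
    closed = 0
    while closed < max_rounds:
        # exists() waits up to one second and returns None if nothing is found.
        match = region.exists(ok_button, 1)
        if match is None:
            break  # no more dialogs on screen
        region.click(match)
        closed += 1
    return closed
```

The max_rounds limit is a design choice to stop the loop from running forever if a dialog keeps reappearing.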

Use of Python in script­ing Sikuli

To enhance the scripting capabilities, we can access the entire Python language. As an example, suppose we run our Sikuli script unattended and it fails. By using a capture helper such as captureTo(), we can save screenshots that can later be used to debug the script failure.
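One way such a helper could be written with Python's standard library is sketched below. This is an assumption-laden illustration, not Sikuli's own implementation: run_with_failure_capture is a hypothetical name, and capture is assumed to be a callable that returns the path of a freshly saved screenshot file, which is how capture() is understood to behave in SikuliX 1.1.x scripts.

```python
import os
import shutil
import time


def run_with_failure_capture(step, capture, out_dir):
    """Run step(); if it raises, copy a screenshot into out_dir and re-raise."""
    try:
        return step()
    except:
        # Bare except on purpose: under Jython this should also catch
        # Java-side errors such as FindFailed. The error is always re-raised.
        if not os.path.isdir(out_dir):
            os.makedirs(out_dir)
        # ASSUMPTION: capture() returns the path of a temporary image file.
        shot = capture()
        target = os.path.join(out_dir, "failure-%d.png" % int(time.time()))
        shutil.copy(shot, target)
        raise
```

In a Sikuli script this might be used as run_with_failure_capture(my_test, lambda: capture(SCREEN), "C:/sikuli-logs"), where my_test is a hypothetical function containing the test steps.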

Ac­cess­ing Java from Sikuli

To provide some kind of on-screen display, we can use Java classes and, in this way, give our script a GUI of its own. The script starts by importing the Swing classes, which are part of Java’s GUI libraries, and then uses Swing to show a borderless window with the designated text on top of everything on the screen for the specified number of seconds (the default is 1 second).

Start your first script with Sikuli

To begin with, let’s build a very simple ‘Hello world’ testing script for the WordPad application to understand what a basic script looks like. Figure 1 shows how our complete script will look.

First of all, start Sikuli, select the editor and write the following line of code:

App.open(r"C:\Program Files\Windows NT\Accessories\wordpad.exe")

This will start the WordPad ap­pli­ca­tion us­ing Sikuli.

Type ‘wait’ in the editor, click the Take Screenshot button and select the area of the screen that Sikuli should wait for before it types any text:

wait("screenshot1.png")

Now add input text to the WordPad application using type(), as follows:

type("Hello, this is my first Sikuli Code!")

You can check whether the text that you entered appears as expected, using the wait() command:

wait("Hello, this is my first Sikuli Code!")

Now save and run the script. It will open WordPad and type the input text into it. In this way, you can start coding and customise the script further according to your requirements.

A trick you can try out

If two similar buttons, such as two Save buttons, appear on the same screen and the script is unable to tell which one to click, capture a larger portion of the screen around the intended button when scripting. The extra surrounding context helps Sikuli distinguish between the buttons.

Ad­van­tages and dis­ad­van­tages of Sikuli


Advantages

• Sikuli is an open source tool, so it carries no licence cost, unlike tools such as UFT.

• It works very well with Flash objects.

• It is very handy in automation when working with Web elements that have dynamic XPaths and IDs.

• It supports multiple scripting and programming languages, such as Scala, JRuby and Jython.

• It is very good with boundary value analysis testing.

• With proper and smart use of scripting, it is easy to identify application crashes and bugs.

• Sikuli is simple to set up and use.

• Automation testing of mobile applications can also be done with the help of emulators.

• Its integration with Selenium makes it worth using. It can solve Selenium’s problems with handling browser dialogue boxes.

• It can read text in images with the help of its basic text recognition (OCR).

• It supports almost every platform, including Windows XP+, most Linux flavours and Mac OS X 10.5+.

• Sikuli is very useful in functional testing where input and output are predefined, so it can be used efficiently for testing the overall behaviour of applications.

Disadvantages

• The script cannot be run in the background, as it needs a visible application GUI while it is being executed.

• It is platform and resolution dependent.

• Running multiple scripts automatically, one by one, is very tricky in Sikuli.

• A slight change in a text label or image in the application’s GUI can cause the script to fail.

• Maintaining scripts is very hard if the GUI of the application changes frequently.

Fig­ure 1: First pro­gram code
