xrobot - Python X11 event automation library

Every once in a while, I need to step outside the command line. Sometimes I'm even forced to interact with deaf graphical programs, those that do not listen to standard input or a meager HTTP port. In those desperate times, were it not for tools such as XAUT or xdotool, I would have to type and click outside of VIM, like cavemen probably did.

Those two little programs are enough to make me happy when confronted with an X11 server. However, my computer, of a more whimsical nature, is reluctant to execute binaries other than a Python interpreter (like any other well-meaning general-purpose device assembled during the 21st century, really). That is why I have decided to write the simplest Python library I could think of that is able to:

Find out the position of the mouse pointer
Move the mouse pointer around the screen
Press and release mouse buttons
Press and release keys on the keyboard
Capture the screen

The xrobot library is lean, simple and Python[23]-compliant. It is just a wrapper around functions defined inside python-xlib. Since Xlib screen capture is painfully slow, the python-gtk bindings are used instead, if present. I have decided to return images as numpy arrays for my convenience; if you find that dependency unbearable, you can root it out easily from the code.
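To give a flavour of what "a wrapper around python-xlib" can mean, here is a minimal sketch of the pointer part only. It is not the xrobot source: the class name PointerSketch and the optional display argument are made up for illustration, and the real library may well do things differently. The python-xlib calls themselves (Display, screen().root, query_pointer, warp_pointer, sync) do exist.

```python
class PointerSketch(object):
    """Hypothetical thin wrapper over python-xlib (not xrobot's actual code)."""

    def __init__(self, display=None):
        if display is None:
            # Imported lazily so the class can also be exercised with a
            # stand-in display on machines without python-xlib or an X server.
            from Xlib import display as xdisplay
            display = xdisplay.Display()
        self._display = display
        self._root = display.screen().root

    def mouse_pos(self):
        """Return the pointer position relative to the root window."""
        reply = self._root.query_pointer()
        return reply.root_x, reply.root_y

    def move(self, x, y):
        """Warp the pointer to absolute screen coordinates (x, y)."""
        self._root.warp_pointer(x, y)
        # Requests are buffered by Xlib; sync() sends them to the server
        # and waits until they have been processed.
        self._display.sync()
```

With an X server running, PointerSketch().move(10, 10) followed by PointerSketch().mouse_pos() should report (10, 10). Passing the display in as an argument is purely a convenience for trying the class out without a screen.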

Here's me, at five hundred clicks/second.

I leave you with a link to the xrobot GitHub repository and some sample code:

from __future__ import print_function  # print() behaves the same on Python 2 and 3

import pylab as pl
from xrobot import XRobot

robot = XRobot()

# Mouse
x, y = robot.mouse_pos()
print('Current mouse position: x =', x, 'y =', y)
robot.move(10, 10)          # Move the pointer to (10, 10)
robot.click(1)              # Click button 1 (normally the left button)

# Keyboard
robot.key('a')              # Press and release 'a'
robot.key_down('comma')     # Press ','
robot.key_up('comma')       # Release ','

# Screen
width, height = robot.screen_resolution()
print('Screen width:', width, 'Screen height:', height)
img = robot.capture_screen()    # Returned as a numpy array
pl.imshow(img)
pl.show()
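If you want to reproduce something like the five hundred clicks/second above, a paced loop around robot.click is enough. The helper below is only a sketch: click_burst and its parameters are hypothetical names, not part of xrobot, and it assumes nothing about the robot except that it has a click(button) method. The rate you actually get depends on how fine-grained time.sleep is on your machine and on how long each click takes.

```python
import time


def click_burst(robot, count, rate_hz, button=1):
    """Call robot.click(button) `count` times at roughly `rate_hz` clicks/second.

    Hypothetical helper, not part of xrobot. Returns the elapsed time in seconds.
    """
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    interval = 1.0 / rate_hz
    # time.perf_counter exists on Python 3.3+; fall back to time.time otherwise.
    clock = getattr(time, "perf_counter", time.time)
    start = clock()
    for i in range(count):
        robot.click(button)
        # Each click gets a slot measured from the start of the burst, so
        # small sleep errors do not pile up over the whole run.
        remaining = start + (i + 1) * interval - clock()
        if remaining > 0:
            time.sleep(remaining)
    return clock() - start
```

For example, click_burst(robot, count=500, rate_hz=500) asks for one second of clicking with button 1; compare the returned elapsed time against one second to see how close your setup gets.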
