Chapter 8. String Objects in Python and Qt

Table of Contents
Introduction
String conversions
QCString — simple strings in PyQt
Unicode strings

Most likely, you won't need the information in this chapter very often. If you don't juggle character encodings on a regular basis, or work extensively with Unicode, then you can probably get by quite well with the automatic string handling of PyQt. However, situations may arise, especially when working with string-intensive applications, where PyQt's behavior might surprise you. That is when you will probably find yourself turning to this chapter.

Introduction

Working with strings is a delight in Python. Take the simple fact that you can, depending on your whim and on whether the string itself contains one kind of quote character, choose to enclose your string literals in single ('), double ("), or triple (''' or """) quotes. Triple-quoted strings can span multiple lines, so there is no more string concatenation just because the string doesn't fit on the line.
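The quoting choices described above can be sketched as follows. The variable names and sample texts are illustrative only and are not taken from the original chapter.

```python
# Pick the quote style that avoids escaping the quotes inside the text.
single = 'She said "hello"'   # double quotes need no escaping here
double = "It's a delight"     # an apostrophe needs no escaping here

# A triple-quoted literal may span lines; the line break becomes
# part of the string, so no concatenation is needed.
triple = """This string spans
two lines without any concatenation."""

print(single)
print(double)
print(triple)
```

Whichever quote style you choose, the result is an ordinary string object; the quotes are only a matter of notation in the source code.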

Once you have your string, and it can be a delightfully long string, megabytes if need be, you can transform it, mangle it, and search in it, using the string and re modules or the native methods of the string object. About the only snag is the immutability of strings: every modifying action creates a new string from the old, which can be costly.
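A minimal sketch of what immutability means in practice, using made-up sample data: a "modifying" method returns a new string, in-place assignment is refused, and one common way to limit the cost of repeated copying is to collect pieces in a list and join them once.

```python
original = "megabytes of text"

# upper() does not change the original; it returns a new string.
shouted = original.upper()

# Assigning to a position inside a string is not allowed.
raised = False
try:
    original[0] = "M"
except TypeError:
    raised = True

# Building a long string piece by piece copies the data repeatedly.
# Collecting the pieces in a list and joining once avoids most of that.
pieces = []
for i in range(5):
    pieces.append(str(i))
joined = ",".join(pieces)

print(original, shouted, raised, joined)
```

The join idiom is a general Python habit rather than something specific to PyQt, but it matters most in exactly the string-intensive applications this chapter is about.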

In C++, working with strings is not a delight. Working with strings in C++ requires using null-terminated character arrays and writing all your own support functions. The alternative is the C++ Standard Library string class, std::string, which is rather limited. This is why Trolltech created two string classes, QString and QCString, which are almost as powerful and friendly to use as the Python string classes. In fact, when Trolltech first created QString, there was no string class in the standard C++ library.

Python also has two string classes: the 'old' string class, in which every byte represents a character, and the newer Unicode string class, which holds a sequence of Unicode characters, each of which can, depending on the encoding, take between one and four bytes. The Qt QString class is equivalent to the Python Unicode string class, and the Qt QCString class is more like the 'old' 8-bit Python string.
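The difference between counting characters and counting bytes can be illustrated with a small sketch. It is written in Python 3 syntax, which is an assumption on my part: there the byte-oriented type is spelled bytes and the Unicode type is spelled str, whereas older Python versions call them str and unicode. UTF-8 is chosen as the example encoding, and the sample characters are arbitrary.

```python
# Four characters: 'a', e with acute accent, the euro sign,
# and the musical G clef (which lies outside the 16-bit range).
text = "a\u00e9\u20ac\U0001d11e"

# In UTF-8, each character takes between one and four bytes.
utf8_sizes = [len(ch.encode("utf-8")) for ch in text]

# The Unicode string counts characters; the encoded form counts bytes.
raw = text.encode("utf-8")
char_count = len(text)
byte_count = len(raw)

# Decoding with the same encoding restores the original text.
round_trip = raw.decode("utf-8")

print(utf8_sizes, char_count, byte_count, round_trip == text)
```

The byte string only becomes text again when it is decoded with the right encoding, which is why the encoding questions discussed in this chapter matter when strings cross the boundary between Python and Qt.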

Your friendly Python Library Reference will tell you all about the string module, the string class, and the re module for regular expression matching. In this chapter I am more concerned with the interaction between QString and Python strings, and with character encoding questions.