Is there some string class in Python like StringBuilder in C#?

1

8 Answers

There is no one-to-one correlation. For a really good article please see Efficient String Concatenation in Python:

Building long strings in the Python programming language can sometimes result in very slow running code. In this article I investigate the computational performance of various string concatenation methods.

TLDR the fastest method is below. It's extremely compact, and also pretty understandable:

    # Python 2 syntax: `num` is shorthand for repr(num)
    def method6():
        return ''.join([`num` for num in xrange(loop_count)])
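The snippet above only runs on Python 2, since backticks and xrange were removed in Python 3. A rough Python 3 sketch of the same idea follows; passing loop_count as a parameter is an assumption made here to keep the example self-contained (the quoted article defines it elsewhere).

```python
def method6(loop_count):
    # str(num) replaces the Python 2 backtick (repr) syntax,
    # and range replaces xrange.
    return ''.join([str(num) for num in range(loop_count)])


print(method6(5))  # prints 01234
```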
4

Relying on compiler optimizations is fragile. The benchmarks linked in the accepted answer and numbers given by Antoine-tran are not to be trusted. Andrew Hare makes the mistake of including a call to repr in his methods. That slows all the methods equally but obscures the real penalty in constructing the string.

Use join. It's very fast and more robust.

    $ ipython3
    Python 3.5.1 (default, Mar 2 2016, 03:38:02)
    IPython 4.1.2 -- An enhanced Interactive Python.

    In [1]: values = [str(num) for num in range(int(1e3))]

    In [2]: %%timeit
       ...: ''.join(values)
       ...:
    100000 loops, best of 3: 7.37 µs per loop

    In [3]: %%timeit
       ...: result = ''
       ...: for value in values:
       ...:     result += value
       ...:
    10000 loops, best of 3: 82.8 µs per loop

    In [4]: import io

    In [5]: %%timeit
       ...: writer = io.StringIO()
       ...: for value in values:
       ...:     writer.write(value)
       ...: writer.getvalue()
       ...:
    10000 loops, best of 3: 81.8 µs per loop
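For anyone who wants to repeat this comparison outside IPython, here is a minimal sketch using the standard timeit module. The function names are made up for illustration, and the timings will differ by machine and Python version, so no expected numbers are shown.

```python
import io
import timeit

values = [str(num) for num in range(1000)]


def with_join():
    # Single pass: join allocates the result once.
    return ''.join(values)


def with_concat():
    # Repeated += on an immutable string.
    result = ''
    for value in values:
        result += value
    return result


def with_stringio():
    # Write into an in-memory text buffer, then read it back.
    writer = io.StringIO()
    for value in values:
        writer.write(value)
    return writer.getvalue()


if __name__ == '__main__':
    for func in (with_join, with_concat, with_stringio):
        # Timings depend on the machine and Python version.
        seconds = timeit.timeit(func, number=1000)
        print(f'{func.__name__}: {seconds:.4f} s for 1000 calls')
```

All three functions build the same string, so any difference in the printed times comes from the construction method alone.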
3

I used Oliver Crow's code (linked by Andrew Hare) and adapted it slightly for Python 2.7.3, using the timeit package. I ran it on my personal computer, a Lenovo T61 with 6 GB RAM running Debian GNU/Linux 6.0.6 (squeeze).

Here are the results for 10,000 iterations:

    method1: 0.0538418292999 secs   process size 4800 kb
    method2: 0.22602891922 secs     process size 4960 kb
    method3: 0.0605459213257 secs   process size 4980 kb
    method4: 0.0544030666351 secs   process size 5536 kb
    method5: 0.0551080703735 secs   process size 5272 kb
    method6: 0.0542731285095 secs   process size 5512 kb

And for 5,000,000 iterations (method 2 was skipped because it ran far too slowly to finish):

    method1: 5.88603997231 secs   process size 37976 kb
    method3: 8.40748500824 secs   process size 38024 kb
    method4: 7.96380496025 secs   process size 321968 kb
    method5: 8.03666186333 secs   process size 71720 kb
    method6: 6.68192911148 secs   process size 38240 kb

It is quite clear that the Python developers have done a great job of optimizing string concatenation, and as the saying usually credited to Knuth (and sometimes to Hoare) goes: "premature optimization is the root of all evil" :-)

3

Python has several things that fulfill similar purposes:

  • One common way to build large strings from pieces is to grow a list of strings and join it when you are done. This is a frequently-used Python idiom.
    • To build strings incorporating data with formatting, you would do the formatting separately.
  • For insertion and deletion at a character level, you would keep a list of length-one strings. (To make this from a string, you'd call list(your_string).) You could also use a UserString.MutableString for this (Python 2 only; it was removed in Python 3).
  • (c)StringIO.StringIO is useful for things that would otherwise take a file, but less so for general string building.
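The idioms in the list above can be sketched roughly as follows. This is Python 3, so io.StringIO stands in for (c)StringIO.StringIO, and the sample data and variable names are invented for illustration.

```python
import io

# Idiom 1: collect pieces in a list, then join once at the end.
parts = []
for name, score in [('alice', 90), ('bob', 85)]:
    # Format each piece separately, then append it.
    parts.append('{}: {}'.format(name, score))
report = '\n'.join(parts)
print(report)

# Idiom 2: character-level edits on a list of length-one strings.
chars = list('helo world')
chars.insert(3, 'l')  # insert the missing 'l' -> 'hello world'
del chars[5]          # delete the space -> 'helloworld'
edited = ''.join(chars)
print(edited)  # prints helloworld

# Idiom 3: a StringIO works wherever a writable text file is expected.
buffer = io.StringIO()
print('written like a file', file=buffer)
captured = buffer.getvalue()
```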

Using method 5 from above (the pseudo file), we can get very good performance and flexibility:

    # Python 2; on Python 3 use: from io import StringIO
    from cStringIO import StringIO

    class StringBuilder:
        _file_str = None

        def __init__(self):
            self._file_str = StringIO()

        def Append(self, str):
            self._file_str.write(str)

        def __str__(self):
            return self._file_str.getvalue()

Now use it:

    sb = StringBuilder()
    sb.Append("Hello\n")
    sb.Append("World")
    print sb
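The class above targets Python 2 (cStringIO and the print statement). A possible Python 3 adaptation is sketched below; the lowercase append name and the chaining via return self are additions made for this sketch, not part of the original answer.

```python
from io import StringIO


class StringBuilder:
    """Small wrapper around io.StringIO (illustrative, not a standard class)."""

    def __init__(self):
        self._file_str = StringIO()

    def append(self, text):
        self._file_str.write(text)
        return self  # returning self allows chained calls

    def __str__(self):
        return self._file_str.getvalue()


sb = StringBuilder()
sb.append("Hello\n").append("World")
print(sb)
```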

You can try StringIO or cStringIO (both are Python 2 modules; in Python 3, use io.StringIO).

0

There is no explicit analogue. I think you are expected to use string concatenation (likely optimized, as said before) or a third-party class. I doubt such classes are much more efficient: lists in Python are dynamically typed, so I assume there is no fast char[] to use as a buffer. StringBuilder-like classes are not premature optimization, because strings are immutable in many languages, a feature that enables many optimizations (for example, slices and substrings can reference the same buffer). StringBuilder/StringBuffer/stringstream-like classes work much faster than concatenating strings, which produces many small temporary objects that still need allocation and garbage collection. They can also beat printf-like string formatting tools, because they avoid the overhead of interpreting a format pattern, which adds up over many format calls.

In case you are here looking for a fast string concatenation method in Python, then you do not need a special StringBuilder class. Simple concatenation works just as well without the performance penalty seen in C#.

    resultString = ""
    resultString += "Append 1"
    resultString += "Append 2"

See Antoine-tran's answer for performance results

0
