String reversal in Python
$begingroup$
This code will take as output any string (long, empty, with spaces...) and will return its reverse. Can I improve this algorithm so that it can be faster?
Right now its complexity is $O(n)$.
def reverse(stri):
output = ''
length = len(stri)
while length > 0:
output += stri[-1]
stri, length = (stri[0:length - 1], length - 1)
return output
python performance strings
New contributor
$endgroup$
add a comment |
$begingroup$
This code will take as output any string (long, empty, with spaces...) and will return its reverse. Can I improve this algorithm so that it can be faster?
Right now its complexity is $O(n)$.
def reverse(stri):
output = ''
length = len(stri)
while length > 0:
output += stri[-1]
stri, length = (stri[0:length - 1], length - 1)
return output
python performance strings
New contributor
$endgroup$
11
$begingroup$
I'd say your code isO(n**2)
because it creates at least n strings with length between1
andn
.
$endgroup$
– Eric Duminil
yesterday
1
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
1
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazyReverseString
class, which only accesses the data when needed.
$endgroup$
– Eric Duminil
yesterday
2
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
6 hours ago
add a comment |
$begingroup$
This code will take as output any string (long, empty, with spaces...) and will return its reverse. Can I improve this algorithm so that it can be faster?
Right now its complexity is $O(n)$.
def reverse(stri):
output = ''
length = len(stri)
while length > 0:
output += stri[-1]
stri, length = (stri[0:length - 1], length - 1)
return output
python performance strings
New contributor
$endgroup$
This code will take as output any string (long, empty, with spaces...) and will return its reverse. Can I improve this algorithm so that it can be faster?
Right now its complexity is $O(n)$.
def reverse(stri):
output = ''
length = len(stri)
while length > 0:
output += stri[-1]
stri, length = (stri[0:length - 1], length - 1)
return output
python performance strings
python performance strings
New contributor
New contributor
edited 1 hour ago
Jamal♦
30.4k11121227
30.4k11121227
New contributor
asked yesterday
MidosMidos
8116
8116
New contributor
New contributor
11
$begingroup$
I'd say your code isO(n**2)
because it creates at least n strings with length between1
andn
.
$endgroup$
– Eric Duminil
yesterday
1
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
1
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazyReverseString
class, which only accesses the data when needed.
$endgroup$
– Eric Duminil
yesterday
2
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
6 hours ago
add a comment |
11
$begingroup$
I'd say your code isO(n**2)
because it creates at least n strings with length between1
andn
.
$endgroup$
– Eric Duminil
yesterday
1
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
1
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazyReverseString
class, which only accesses the data when needed.
$endgroup$
– Eric Duminil
yesterday
2
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
6 hours ago
11
11
$begingroup$
I'd say your code is
O(n**2)
because it creates at least n strings with length between 1
and n
.$endgroup$
– Eric Duminil
yesterday
$begingroup$
I'd say your code is
O(n**2)
because it creates at least n strings with length between 1
and n
.$endgroup$
– Eric Duminil
yesterday
1
1
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
1
1
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazy
ReverseString
class, which only accesses the data when needed.$endgroup$
– Eric Duminil
yesterday
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazy
ReverseString
class, which only accesses the data when needed.$endgroup$
– Eric Duminil
yesterday
2
2
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
6 hours ago
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
6 hours ago
add a comment |
3 Answers
3
active
oldest
votes
$begingroup$
Yes, this can be faster. Adding strings using +
is usually a bad idea in Python, since strings are immutable. This means that whenever you add two strings, a new string needs to be allocated with the size of the resulting strings and then both string contents need to be copied there. Instead you usually want to build a list of strings and ''.join
them at the end (where you pay this cost only once).
But here you can just use the fact that strings can be sliced and you can specify a negative step:
def reverse_g(s):
return s[::-1]
Here is a timing comparison for random strings of length up to 1M characters, where reverse
is your function and reverse_g
is this one using slicing. Note the double-log scale, for the largest string your function is almost a hundred thousand times slower.
The reverse_s
function uses the reversed
built-in, as suggested in the answer by @sleblanc and assumes you actually need the reversed string and not just an iterator over it:
def reverse_s(s):
return ''.join(reversed(s))
$endgroup$
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 forreverse_g
significant and if so why?
$endgroup$
– gerrit
yesterday
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
23 hours ago
1
$begingroup$
@IsmaelMiguel: As mentioned in the textreverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.
$endgroup$
– Graipher
17 hours ago
|
show 7 more comments
$begingroup$
In terms of pure complexity, the answer is simple: No, it is not possible to reverse a string faster than O(n). That is the theoretical limit when you look at the pure algorithm.
However, your code does not achieve that because the operations in the loop are not O(1). For instance, output += stri[-1]
does not do what you think it does. Python is a very high level language that does a lot of strange things under the hood compared to languages such as C. Strings are immutable in Python, which means that each time this line is executed, a completely new string is created.
If you really need the speed, you could consider writing a C function and call it from Python. Here is an example:
rev.c:
#include <stddef.h>
void reverse(char * stro, char * stri, size_t length) {
for(size_t i=0; i<length; i++) stro[i]=stri[length-1-i];
stro[length]='';
}
Compile the above function with this command:
gcc -o rev.so -shared -fPIC rev.c
And here is a python script using that function.
rev.py:
from ctypes import *
revlib = cdll.LoadLibrary("rev.so");
reverse = revlib.reverse
reverse.argtypes = [c_char_p, c_char_p, c_size_t]
hello = "HelloWorld"
stri = create_string_buffer(hello)
stro = create_string_buffer(b'00' * (len(hello)+1))
reverse(stro, stri, len(stri)-1)
print(repr(stri.value))
print(repr(stro.value))
Please note that I'm by no means an expert on this. I tested this with string of length 10⁸, and I tried the method from Graipher, calling the C function from Python and calling the C function from C. I used -O3
optimization. When I did not use any optimization it was slower to call the C function from Python. Also note that I did NOT include the time it took to create the buffers.
stri[::-1] : 0.98s
calling reverse from python : 0.59s
calling reverse from c: 0.06s
It's not a huge improvement, but it is an improvement. But the pure C program was WAY faster. The main function I used was this one:
int __attribute__((optimize("0"))) // Disable optimization for main
main(int argc, char ** argv) { // so that reverse is not inlined
const size_t size = 1e9;
char * str = malloc(size+1);
static const char alphanum =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
// Take data from outside the program to fool the optimizer
alphanum[atoi(argv[1])]='7';
// Load string with random data to fool the optimizer
srand(time(NULL));
for (size_t i = 0; i < size; ++i) {
str[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
char *o = malloc(size+1);
reverse(o, str, strlen(str));
// Do something with the data to fool the optimizer
for(size_t i=0; i<size; i++)
if(str[i] != o[size-i-1]) {
printf("Errorn");
exit(1);
}
}
Then, to get the runtime I ran:
gcc -O3 -pg rev.c; ./a.out; gprof a.out gmon.out | head -n8
$endgroup$
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
6 hours ago
add a comment |
$begingroup$
Here's a very short code I would use to reverse any string.
a = 'Coding is fun'
rev = print("Reverse of this statement is "+ a[::-1])
New contributor
$endgroup$
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
8 hours ago
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Midos is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e) {
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom)) {
StackExchange.using('gps', function() { StackExchange.gps.track('embedded_signup_form.view', { location: 'question_page' }); });
$window.unbind('scroll', onScroll);
}
};
$window.on('scroll', onScroll);
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215179%2fstring-reversal-in-python%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Yes, this can be faster. Adding strings using +
is usually a bad idea in Python, since strings are immutable. This means that whenever you add two strings, a new string needs to be allocated with the size of the resulting strings and then both string contents need to be copied there. Instead you usually want to build a list of strings and ''.join
them at the end (where you pay this cost only once).
But here you can just use the fact that strings can be sliced and you can specify a negative step:
def reverse_g(s):
return s[::-1]
Here is a timing comparison for random strings of length up to 1M characters, where reverse
is your function and reverse_g
is this one using slicing. Note the double-log scale, for the largest string your function is almost a hundred thousand times slower.
The reverse_s
function uses the reversed
built-in, as suggested in the answer by @sleblanc and assumes you actually need the reversed string and not just an iterator over it:
def reverse_s(s):
return ''.join(reversed(s))
$endgroup$
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 forreverse_g
significant and if so why?
$endgroup$
– gerrit
yesterday
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
23 hours ago
1
$begingroup$
@IsmaelMiguel: As mentioned in the textreverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.
$endgroup$
– Graipher
17 hours ago
|
show 7 more comments
$begingroup$
Yes, this can be faster. Adding strings using +
is usually a bad idea in Python, since strings are immutable. This means that whenever you add two strings, a new string needs to be allocated with the size of the resulting strings and then both string contents need to be copied there. Instead you usually want to build a list of strings and ''.join
them at the end (where you pay this cost only once).
But here you can just use the fact that strings can be sliced and you can specify a negative step:
def reverse_g(s):
return s[::-1]
Here is a timing comparison for random strings of length up to 1M characters, where reverse
is your function and reverse_g
is this one using slicing. Note the double-log scale, for the largest string your function is almost a hundred thousand times slower.
The reverse_s
function uses the reversed
built-in, as suggested in the answer by @sleblanc and assumes you actually need the reversed string and not just an iterator over it:
def reverse_s(s):
return ''.join(reversed(s))
$endgroup$
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 forreverse_g
significant and if so why?
$endgroup$
– gerrit
yesterday
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
23 hours ago
1
$begingroup$
@IsmaelMiguel: As mentioned in the textreverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.
$endgroup$
– Graipher
17 hours ago
|
show 7 more comments
$begingroup$
Yes, this can be faster. Adding strings using +
is usually a bad idea in Python, since strings are immutable. This means that whenever you add two strings, a new string needs to be allocated with the size of the resulting strings and then both string contents need to be copied there. Instead you usually want to build a list of strings and ''.join
them at the end (where you pay this cost only once).
But here you can just use the fact that strings can be sliced and you can specify a negative step:
def reverse_g(s):
return s[::-1]
Here is a timing comparison for random strings of length up to 1M characters, where reverse
is your function and reverse_g
is this one using slicing. Note the double-log scale, for the largest string your function is almost a hundred thousand times slower.
The reverse_s
function uses the reversed
built-in, as suggested in the answer by @sleblanc and assumes you actually need the reversed string and not just an iterator over it:
def reverse_s(s):
return ''.join(reversed(s))
$endgroup$
Yes, this can be faster. Adding strings using +
is usually a bad idea in Python, since strings are immutable. This means that whenever you add two strings, a new string needs to be allocated with the size of the resulting strings and then both string contents need to be copied there. Instead you usually want to build a list of strings and ''.join
them at the end (where you pay this cost only once).
But here you can just use the fact that strings can be sliced and you can specify a negative step:
def reverse_g(s):
return s[::-1]
Here is a timing comparison for random strings of length up to 1M characters, where reverse
is your function and reverse_g
is this one using slicing. Note the double-log scale, for the largest string your function is almost a hundred thousand times slower.
The reverse_s
function uses the reversed
built-in, as suggested in the answer by @sleblanc and assumes you actually need the reversed string and not just an iterator over it:
def reverse_s(s):
return ''.join(reversed(s))
edited 7 hours ago
answered yesterday
GraipherGraipher
25.9k54089
25.9k54089
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 forreverse_g
significant and if so why?
$endgroup$
– gerrit
yesterday
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
23 hours ago
1
$begingroup$
@IsmaelMiguel: As mentioned in the textreverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.
$endgroup$
– Graipher
17 hours ago
|
show 7 more comments
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 forreverse_g
significant and if so why?
$endgroup$
– gerrit
yesterday
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
23 hours ago
1
$begingroup$
@IsmaelMiguel: As mentioned in the textreverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.
$endgroup$
– Graipher
17 hours ago
2
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 for
reverse_g
significant and if so why?$endgroup$
– gerrit
yesterday
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 for
reverse_g
significant and if so why?$endgroup$
– gerrit
yesterday
2
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
23 hours ago
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
23 hours ago
1
1
$begingroup$
@IsmaelMiguel: As mentioned in the text
reverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.$endgroup$
– Graipher
17 hours ago
$begingroup$
@IsmaelMiguel: As mentioned in the text
reverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.$endgroup$
– Graipher
17 hours ago
|
show 7 more comments
$begingroup$
In terms of pure complexity, the answer is simple: No, it is not possible to reverse a string faster than O(n). That is the theoretical limit when you look at the pure algorithm.
However, your code does not achieve that because the operations in the loop are not O(1). For instance, output += stri[-1]
does not do what you think it does. Python is a very high level language that does a lot of strange things under the hood compared to languages such as C. Strings are immutable in Python, which means that each time this line is executed, a completely new string is created.
If you really need the speed, you could consider writing a C function and call it from Python. Here is an example:
rev.c:
#include <stddef.h>
void reverse(char * stro, char * stri, size_t length) {
for(size_t i=0; i<length; i++) stro[i]=stri[length-1-i];
stro[length]='';
}
Compile the above function with this command:
gcc -o rev.so -shared -fPIC rev.c
And here is a python script using that function.
rev.py:
from ctypes import *
revlib = cdll.LoadLibrary("rev.so");
reverse = revlib.reverse
reverse.argtypes = [c_char_p, c_char_p, c_size_t]
hello = "HelloWorld"
stri = create_string_buffer(hello)
stro = create_string_buffer(b'00' * (len(hello)+1))
reverse(stro, stri, len(stri)-1)
print(repr(stri.value))
print(repr(stro.value))
Please note that I'm by no means an expert on this. I tested this with string of length 10⁸, and I tried the method from Graipher, calling the C function from Python and calling the C function from C. I used -O3
optimization. When I did not use any optimization it was slower to call the C function from Python. Also note that I did NOT include the time it took to create the buffers.
stri[::-1] : 0.98s
calling reverse from python : 0.59s
calling reverse from c: 0.06s
It's not a huge improvement, but it is an improvement. But the pure C program was WAY faster. The main function I used was this one:
int __attribute__((optimize("0"))) // Disable optimization for main
main(int argc, char ** argv) { // so that reverse is not inlined
const size_t size = 1e9;
char * str = malloc(size+1);
static const char alphanum =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
// Take data from outside the program to fool the optimizer
alphanum[atoi(argv[1])]='7';
// Load string with random data to fool the optimizer
srand(time(NULL));
for (size_t i = 0; i < size; ++i) {
str[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
char *o = malloc(size+1);
reverse(o, str, strlen(str));
// Do something with the data to fool the optimizer
for(size_t i=0; i<size; i++)
if(str[i] != o[size-i-1]) {
printf("Errorn");
exit(1);
}
}
Then, to get the runtime I ran:
gcc -O3 -pg rev.c; ./a.out; gprof a.out gmon.out | head -n8
$endgroup$
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
6 hours ago
add a comment |
$begingroup$
In terms of pure complexity, the answer is simple: No, it is not possible to reverse a string faster than O(n). That is the theoretical limit when you look at the pure algorithm.
However, your code does not achieve that because the operations in the loop are not O(1). For instance, output += stri[-1]
does not do what you think it does. Python is a very high level language that does a lot of strange things under the hood compared to languages such as C. Strings are immutable in Python, which means that each time this line is executed, a completely new string is created.
If you really need the speed, you could consider writing a C function and call it from Python. Here is an example:
rev.c:
#include <stddef.h>
void reverse(char * stro, char * stri, size_t length) {
for(size_t i=0; i<length; i++) stro[i]=stri[length-1-i];
stro[length]='';
}
Compile the above function with this command:
gcc -o rev.so -shared -fPIC rev.c
And here is a python script using that function.
rev.py:
from ctypes import *
revlib = cdll.LoadLibrary("rev.so");
reverse = revlib.reverse
reverse.argtypes = [c_char_p, c_char_p, c_size_t]
hello = "HelloWorld"
stri = create_string_buffer(hello)
stro = create_string_buffer(b'00' * (len(hello)+1))
reverse(stro, stri, len(stri)-1)
print(repr(stri.value))
print(repr(stro.value))
Please note that I'm by no means an expert on this. I tested this with string of length 10⁸, and I tried the method from Graipher, calling the C function from Python and calling the C function from C. I used -O3
optimization. When I did not use any optimization it was slower to call the C function from Python. Also note that I did NOT include the time it took to create the buffers.
stri[::-1] : 0.98s
calling reverse from python : 0.59s
calling reverse from c: 0.06s
It's not a huge improvement, but it is an improvement. But the pure C program was WAY faster. The main function I used was this one:
int __attribute__((optimize("0"))) // Disable optimization for main
main(int argc, char ** argv) { // so that reverse is not inlined
const size_t size = 1e9;
char * str = malloc(size+1);
static const char alphanum =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
// Take data from outside the program to fool the optimizer
alphanum[atoi(argv[1])]='7';
// Load string with random data to fool the optimizer
srand(time(NULL));
for (size_t i = 0; i < size; ++i) {
str[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
char *o = malloc(size+1);
reverse(o, str, strlen(str));
// Do something with the data to fool the optimizer
for(size_t i=0; i<size; i++)
if(str[i] != o[size-i-1]) {
printf("Errorn");
exit(1);
}
}
Then, to get the runtime I ran:
gcc -O3 -pg rev.c; ./a.out; gprof a.out gmon.out | head -n8
$endgroup$
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
6 hours ago
add a comment |
$begingroup$
In terms of pure complexity, the answer is simple: No, it is not possible to reverse a string faster than O(n). That is the theoretical limit when you look at the pure algorithm.
However, your code does not achieve that because the operations in the loop are not O(1). For instance, output += stri[-1]
does not do what you think it does. Python is a very high level language that does a lot of strange things under the hood compared to languages such as C. Strings are immutable in Python, which means that each time this line is executed, a completely new string is created.
If you really need the speed, you could consider writing a C function and call it from Python. Here is an example:
rev.c:
#include <stddef.h>
void reverse(char * stro, char * stri, size_t length) {
for(size_t i=0; i<length; i++) stro[i]=stri[length-1-i];
stro[length]='';
}
Compile the above function with this command:
gcc -o rev.so -shared -fPIC rev.c
And here is a python script using that function.
rev.py:
from ctypes import *
revlib = cdll.LoadLibrary("rev.so");
reverse = revlib.reverse
reverse.argtypes = [c_char_p, c_char_p, c_size_t]
hello = "HelloWorld"
stri = create_string_buffer(hello)
stro = create_string_buffer(b'00' * (len(hello)+1))
reverse(stro, stri, len(stri)-1)
print(repr(stri.value))
print(repr(stro.value))
Please note that I'm by no means an expert on this. I tested this with string of length 10⁸, and I tried the method from Graipher, calling the C function from Python and calling the C function from C. I used -O3
optimization. When I did not use any optimization it was slower to call the C function from Python. Also note that I did NOT include the time it took to create the buffers.
stri[::-1] : 0.98s
calling reverse from python : 0.59s
calling reverse from c: 0.06s
It's not a huge improvement, but it is an improvement. But the pure C program was WAY faster. The main function I used was this one:
int __attribute__((optimize("0"))) // Disable optimization for main
main(int argc, char ** argv) { // so that reverse is not inlined
const size_t size = 1e9;
char * str = malloc(size+1);
static const char alphanum =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
// Take data from outside the program to fool the optimizer
alphanum[atoi(argv[1])]='7';
// Load string with random data to fool the optimizer
srand(time(NULL));
for (size_t i = 0; i < size; ++i) {
str[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
char *o = malloc(size+1);
reverse(o, str, strlen(str));
// Do something with the data to fool the optimizer
for(size_t i=0; i<size; i++)
if(str[i] != o[size-i-1]) {
printf("Errorn");
exit(1);
}
}
Then, to get the runtime I ran:
gcc -O3 -pg rev.c; ./a.out; gprof a.out gmon.out | head -n8
$endgroup$
In terms of pure complexity, the answer is simple: No, it is not possible to reverse a string faster than O(n). That is the theoretical limit when you look at the pure algorithm.
However, your code does not achieve that because the operations in the loop are not O(1). For instance, output += stri[-1]
does not do what you think it does. Python is a very high level language that does a lot of strange things under the hood compared to languages such as C. Strings are immutable in Python, which means that each time this line is executed, a completely new string is created.
If you really need the speed, you could consider writing a C function and call it from Python. Here is an example:
rev.c:
#include <stddef.h>
void reverse(char * stro, char * stri, size_t length) {
for(size_t i=0; i<length; i++) stro[i]=stri[length-1-i];
stro[length]='';
}
Compile the above function with this command:
gcc -o rev.so -shared -fPIC rev.c
And here is a python script using that function.
rev.py:
from ctypes import *
revlib = cdll.LoadLibrary("rev.so");
reverse = revlib.reverse
reverse.argtypes = [c_char_p, c_char_p, c_size_t]
hello = "HelloWorld"
stri = create_string_buffer(hello)
stro = create_string_buffer(b'00' * (len(hello)+1))
reverse(stro, stri, len(stri)-1)
print(repr(stri.value))
print(repr(stro.value))
Please note that I'm by no means an expert on this. I tested this with string of length 10⁸, and I tried the method from Graipher, calling the C function from Python and calling the C function from C. I used -O3
optimization. When I did not use any optimization it was slower to call the C function from Python. Also note that I did NOT include the time it took to create the buffers.
stri[::-1] : 0.98s
calling reverse from python : 0.59s
calling reverse from c: 0.06s
It's not a huge improvement, but it is an improvement. But the pure C program was WAY faster. The main function I used was this one:
int __attribute__((optimize("0"))) // Disable optimization for main
main(int argc, char ** argv) { // so that reverse is not inlined
const size_t size = 1e9;
char * str = malloc(size+1);
static const char alphanum =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
// Take data from outside the program to fool the optimizer
alphanum[atoi(argv[1])]='7';
// Load string with random data to fool the optimizer
srand(time(NULL));
for (size_t i = 0; i < size; ++i) {
str[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
char *o = malloc(size+1);
reverse(o, str, strlen(str));
// Do something with the data to fool the optimizer
for(size_t i=0; i<size; i++)
if(str[i] != o[size-i-1]) {
printf("Errorn");
exit(1);
}
}
Then, to get the runtime I ran:
gcc -O3 -pg rev.c; ./a.out; gprof a.out gmon.out | head -n8
edited 6 hours ago
answered 20 hours ago
BromanBroman
35419
35419
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
6 hours ago
add a comment |
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
6 hours ago
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
6 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
6 hours ago
add a comment |
$begingroup$
Here's a very short code I would use to reverse any string.
a = 'Coding is fun'
rev = print("Reverse of this statement is "+ a[::-1])
New contributor
$endgroup$
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
8 hours ago
add a comment |
$begingroup$
Here's a very short code I would use to reverse any string.
a = 'Coding is fun'
rev = print("Reverse of this statement is "+ a[::-1])
New contributor
$endgroup$
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
8 hours ago
add a comment |
$begingroup$
Here's a very short code I would use to reverse any string.
a = 'Coding is fun'
rev = print("Reverse of this statement is "+ a[::-1])
New contributor
$endgroup$
Here's a very short code I would use to reverse any string.
a = 'Coding is fun'
rev = print("Reverse of this statement is "+ a[::-1])
New contributor
New contributor
answered 9 hours ago
DaachiDaachi
1
1
New contributor
New contributor
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
8 hours ago
add a comment |
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
8 hours ago
4
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
8 hours ago
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
8 hours ago
add a comment |
Midos is a new contributor. Be nice, and check out our Code of Conduct.
Midos is a new contributor. Be nice, and check out our Code of Conduct.
Midos is a new contributor. Be nice, and check out our Code of Conduct.
Midos is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e) {
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom)) {
StackExchange.using('gps', function() { StackExchange.gps.track('embedded_signup_form.view', { location: 'question_page' }); });
$window.unbind('scroll', onScroll);
}
};
$window.on('scroll', onScroll);
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215179%2fstring-reversal-in-python%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e) {
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom)) {
StackExchange.using('gps', function() { StackExchange.gps.track('embedded_signup_form.view', { location: 'question_page' }); });
$window.unbind('scroll', onScroll);
}
};
$window.on('scroll', onScroll);
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e) {
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom)) {
StackExchange.using('gps', function() { StackExchange.gps.track('embedded_signup_form.view', { location: 'question_page' }); });
$window.unbind('scroll', onScroll);
}
};
$window.on('scroll', onScroll);
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e) {
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom)) {
StackExchange.using('gps', function() { StackExchange.gps.track('embedded_signup_form.view', { location: 'question_page' }); });
$window.unbind('scroll', onScroll);
}
};
$window.on('scroll', onScroll);
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
11
$begingroup$
I'd say your code is
O(n**2)
because it creates at least n strings with length between1
andn
.$endgroup$
– Eric Duminil
yesterday
1
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
1
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazy
ReverseString
class, which only accesses the data when needed.$endgroup$
– Eric Duminil
yesterday
2
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
6 hours ago