forked from szabgab/perlmaven.com
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathsorting-arrays-in-perl.tt
181 lines (133 loc) · 4.74 KB
/
sorting-arrays-in-perl.tt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
=title Sorting arrays in Perl
=timestamp 2012-11-09T13:45:56
=indexes sort, $a, $b, cmp, <=>
=status show
=books beginner_book
=author szabgab
=index 1
=archive 1
=feed 1
=comments 1
=social 1
=abstract start
In this article we are going to see how we can <b>sort an array of strings or numbers in Perl</b>.
Perl has a built-in function called <hl>sort</hl> that can, unsurprisingly, sort an array. In its most simple form,
you just give it an array, and it returns the elements of that array in a sorted order. <hl>@sorted = sort @original</hl>.
=abstract end
<h2>Sort based on ASCII order</h2>
<code lang="perl">
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Data::Dumper qw(Dumper);
my @words = qw(foo bar zorg moo);
say Dumper \@words;
my @sorted_words = sort @words;
say Dumper \@sorted_words;
</code>
The above example will print
<code lang="perl">
$VAR1 = [
'foo',
'bar',
'zorg',
'moo'
];
$VAR1 = [
'bar',
'foo',
'moo',
'zorg'
];
</code>
The first output shows the array before sorting, the second output after sorting.
This is the most simple case, but it is not always what you want.
For example, what happens if some of the words start with an upper case letter?
<code lang="perl">
my @words = qw(foo bar Zorg moo);
</code>
The result in <hl>@sorted_names</hl> is going to be:
<code lang="perl">
$VAR1 = [
'Zorg',
'bar',
'foo',
'moo'
];
</code>
As you can see, the word that starts with an upper-case letter became first.
That's because <hl>sort</hl>, by default, sorts according to the ASCII table,
and all the upper case letters are located earlier than the lower case letters.
<h2>Comparison function</h2>
The way <hl>sort</hl> works in Perl is, that it goes over every two elements of
the original array; In every turn it puts the value from the left side into the variable <hl>$a</hl>,
and the value on the right side in the variable <hl>$b</hl>. Then it calls a <b>comparison function</b>.
That "comparison function" will return 1 if the content of <hl>$a</hl> should be on the left, -1 if the content of
<hl>$b</hl> should be on the left, or 0 if it does not matter as the two values are the same.
By default you don't see this comparison function and <b>sort</b> compares the values according to the ASCII table,
but if you want, you can write it explicitly:
<code lang="perl">
sort { $a cmp $b } @words;
</code>
This code will provide the exact same result as the one without the block: <hl>sort @words</hl>.
Here you can see that, by default, perl uses <hl>cmp</hl> in the comparison function.
That's because cmp is doing exactly what we need here. It compares the values on its two sides as strings,
returns 1 if the left argument is "less than" the right argument; returns -1 if the left argument
is "greater than" the right argument; and returns 0 if they are the same.
<h2>Sorting in alphabetical order</h2>
If you want to disregard the case in the strings - what is usually called alphabetical order -
you can do so as given in the next example:
<code lang="perl">
my @sorted_words = sort { lc($a) cmp lc($b) } @words;
</code>
Here, for the sake of comparison, we call the <hl>lc</hl> function that returns the <b>lower case</b> version of its argument.
Then <hl>cmp</hl> compares those lower case versions and decides which of the original strings must go first
and which second.
The result is
<code lang="perl">
$VAR1 = [
'bar',
'foo',
'moo',
'Zorg'
];
</code>
<h2>Sorting numbers in Perl</h2>
If we take an array of numbers and sort them with the default sorting,
the result is probably not what we are expecting.
<code lang="perl">
my @numbers = (14, 3, 12, 2, 23);
my @sorted_numbers = sort @numbers;
say Dumper \@sorted_numbers;
</code>
<code lang="perl">
$VAR1 = [
12,
14,
2,
23,
3
];
</code>
If you think about it, of course, this is not surprising. When the comparison function sees 12 and 3
it compares them as strings. That means comparing the first character in both strings. "1" to "3".
"1" is ahead of "3" in the ASCII table and thus the string "12" will come before the string "3".
Perl does not magically understand that you want to order these values as numbers.
No problem though as we can write a comparison function that will compare the two values as numbers.
For that we use the <lt><=></li>
(also called <a href="http://en.wikipedia.org/wiki/Spaceship_operator">spaceship operator</a>) that will
compare its two parameters as numbers and return 1, -1 or 0.
<code lang="perl">
my @sorted_numbers = sort { $a <=> $b } @numbers;
</code>
Results in:
<code lang="perl">
$VAR1 = [
2,
3,
12,
14,
23
];
</code>