<html>
<head>
<title>
A Tour of NTL: Tips for Getting the Best Performance out of NTL
</title>
</head>
<body>
<center>
<a href="tour-win.html"><img src="arrow1.gif" alt="[Previous]" align=bottom></a>
<a href="tour.html"><img src="arrow2.gif" alt="[Up]" align=bottom></a>
<a href="tour-impl.html"> <img src="arrow3.gif" alt="[Next]" align=bottom></a>
</center>
<h1>
<p align=center>
A Tour of NTL: Tips for Getting the Best Performance out of NTL
</p>
</h1>
<p> <hr> <p>
<ol>
<li>
Build NTL using GMP as the long integer package.
This is extremely important, as the GMP implementation
of long integer arithmetic is <i>much</i> faster
than the default implementation.
Go <a href="tour-gmp.html">here</a> for details.
<p>
<li>
On many machines that optionally offer 64-bit integer arithmetic
(recent Mac OS X machines, for instance),
you should
compile using <tt>gcc</tt> with the option <tt>-m64</tt>
to get the full benefit.
To do this,
pass <tt>"CFLAGS=-O2 -m64"</tt>
to the <tt>configure</tt> script (the quotes are needed because the value contains a space).
If you are using NTL with GMP on such a machine,
you <i>must</i> do this to get code that is compatible with the GMP library.
Note, however, that 64-bit is becoming the default, so this option
may not be necessary.
<p>
<li>
On SPARC machines,
pass the argument <tt>"CFLAGS=-O2 -mcpu=v8"</tt>
to the <tt>configure</tt> script.
On more recent, 64-bit SPARC machines, pass <tt>"CFLAGS=-O2 -mcpu=v9 -m64"</tt>
to get the full instruction set and 64-bit code.
<p>
<li>
Make sure you run the configuration wizard when you install NTL.
This is the default behaviour in the makefile
in the Unix distribution, so don't change this;
in the Windows distribution, there is unfortunately no
easy way to run the wizard.
<p>
<li>
In time-critical code, avoid creating unnecessary temporary
objects.
For example, instead of
<!-- STARTPLAIN
   ZZ InnerProduct(const ZZ *a, const ZZ *b, long n)
   {
      long i;
      ZZ res;
      for (i = 0; i < n; i++)
         res += a[i] * b[i];
      return res;
   }
ENDPLAIN -->
<!-- STARTPRETTY {{{ -->
<p><p><table cellPadding=10px><tr><td><font color="#000000">
<font face="monospace">
&nbsp;&nbsp;&nbsp;ZZ InnerProduct(<font color="#008b00"><b>const</b></font> ZZ *a, <font color="#008b00"><b>const</b></font> ZZ *b, <font color="#008b00"><b>long</b></font> n)<br>
&nbsp;&nbsp;&nbsp;{<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<font color="#008b00"><b>long</b></font> i;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ZZ res;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<font color="#b03060"><b>for</b></font> (i = <font color="#ff8c00">0</font>; i &lt; n; i++)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;res += a[i] * b[i];<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<font color="#b03060"><b>return</b></font> res;<br>
&nbsp;&nbsp;&nbsp;}<br>
</font>
</font></td></tr></table><p><p>
<!-- }}} ENDPRETTY -->
write this as
<!-- STARTPLAIN
   ZZ InnerProduct(const ZZ *a, const ZZ *b, long n)
   {
      long i;
      ZZ res, t;
      for (i = 0; i < n; i++) {
         mul(t, a[i], b[i]);
         add(res, res, t);
      }
      return res;
   }
ENDPLAIN -->
<!-- STARTPRETTY {{{ -->
<p><p><table cellPadding=10px><tr><td><font color="#000000">
<font face="monospace">
&nbsp;&nbsp;&nbsp;ZZ InnerProduct(<font color="#008b00"><b>const</b></font> ZZ *a, <font color="#008b00"><b>const</b></font> ZZ *b, <font color="#008b00"><b>long</b></font> n)<br>
&nbsp;&nbsp;&nbsp;{<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<font color="#008b00"><b>long</b></font> i;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ZZ res, t;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<font color="#b03060"><b>for</b></font> (i = <font color="#ff8c00">0</font>; i &lt; n; i++) {<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;mul(t, a[i], b[i]);<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;add(res, res, t);<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<font color="#b03060"><b>return</b></font> res;<br>
&nbsp;&nbsp;&nbsp;}<br>
</font>
</font></td></tr></table><p><p>
<!-- }}} ENDPRETTY -->
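The first version of <tt>InnerProduct</tt>
creates and destroys a temporary object, holding the value
<tt>a[i]*b[i]</tt>, in every loop iteration.
The second does not: it reuses the single object <tt>t</tt>,
so storage for the product need not be reallocated on each iteration.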
<p>
<li>
If you use the class <tt>ZZ_p</tt>, try to avoid switching the modulus
too often, as this can be a rather expensive operation.
If you <i>must</i> switch the modulus often,
use the class <tt>ZZ_pContext</tt> to save the information
associated with the modulus (see <a href="ZZ_p.cpp.html">ZZ_p.txt</a>).
</ol>
<p>
<center>
<a href="tour-win.html"><img src="arrow1.gif" alt="[Previous]" align=bottom></a>
<a href="tour.html"><img src="arrow2.gif" alt="[Up]" align=bottom></a>
<a href="tour-impl.html"> <img src="arrow3.gif" alt="[Next]" align=bottom></a>
</center>
</body>
</html>