Commit 8fe7382e2a1f7763be8b12db283cc1570eb64518

Authored by Loic Dachary
2 parents 363da207 9f9f005a
Exists in master and in 1 other branch v3

Merge branch 'manual' into 'master'

HTML manual fixes

Fixes to HTML manual, for mistakes I've noticed.
I'm sure there's more, but this is a start...

See merge request !14
Showing 1 changed file with 96 additions and 96 deletions
manual/gf-complete.html
... ... @@ -160,7 +160,7 @@ CONTENT <span class="aligning_page_number"> 3 </span>
160 160  
161 161  
162 162 <div class="sub_indices">
163   -4.1 Three Simple Command Line Tools: gf mult, gf div and gf add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . <span class="aligning_page_number"> 8</span> <br>
  163 +4.1 Three Simple Command Line Tools: gf_mult, gf_div and gf_add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . <span class="aligning_page_number"> 8</span> <br>
164 164 4.2 Quick Starting Example #1: Simple multiplication and division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . <span class="aligning_page_number"> 9 </span> <br>
165 165 4.3 Quick Starting Example #2: Multiplying a region by a constant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . <span class="aligning_page_number"> 10 </span> <br>
166 166 4.4 Quick Starting Example #3: Using w = 64 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . <span class="aligning_page_number"> 11 </span> <br>
... ... @@ -231,7 +231,7 @@ CONTENT <span class="aligning_page_number"> 3 </span>
231 231 7.4 Arguments to <b>"SPLIT"</b> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . <span class="aligning_page_number"> 28</span> <br>
232 232 7.5 Arguments to <b>"GROUP"</b> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . <span class="aligning_page_number">29 </span> <br>
233 233 7.6 Considerations with <b>"COMPOSITE"</b> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . <span class="aligning_page_number">30 </span> <br>
234   -7.7 <b>"CARRY FREE"</b> and the Primitive Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . <span class="aligning_page_number">31 </span> <br>
  234 +7.7 <b>"CARRY_FREE"</b> and the Primitive Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . <span class="aligning_page_number">31 </span> <br>
235 235 7.8 More on Primitive Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . <span class="aligning_page_number">31 </span> <br>
236 236  
237 237  
... ... @@ -426,7 +426,7 @@ defines some randomnumber generators to help test the programs. The randomnumber
426 426 <ul>
427 427  
428 428 of All" random number generator [Mar94] which we've selected because it has no patent issues. <b>gf_unit</b> and
429   -gf time use these random number generators.<br><br>
  429 +<b>gf_time</b> use these random number generators.<br><br>
430 430 <li><b>gf_int.h:</b> This is an internal header file that the various source files use. This is <em>not</em> intended for applications to
431 431 include.</li><br>
432 432 <li><b>config.xx</b> and <b>stamp-h1</b> are created by autoconf, and should be ignored by applications. </li>
... ... @@ -457,7 +457,7 @@ The following are tools to help you with Galois Field arithmetic, and with the l
457 457 detail elsewhere in this manual.<br><br>
458 458 <li> <b>gf_mult.c, gf_ div.c</b> and <b>gf_ add:</b> Command line tools to do multiplication, division and addition by single numbers</li><br>
459 459 <li> <b>gf_time.c:</b> A program that times the procedures for given values of <em>w </em> and implementation options</li><br>
460   -<li> <b>time tool.sh:</b> A shell script that helps perform rough timings of the various multiplication, division and region
  460 +<li> <b>time_tool.sh:</b> A shell script that helps perform rough timings of the various multiplication, division and region
461 461 operations in GF-Complete</li><br>
462 462 <li> <b>gf_methods.c:</b> A program that enumerates most of the implementation methods supported by GF-Complete</li><br>
463 463 <li> <b> gf_poly.c:</b> A program to identify irreducible polynomials in regular and composite Galois Fields</li><br>
... ... @@ -652,7 +652,7 @@ gf.multiply_region.w32 (&gf, r1, r2, a, 16, 0); <br><br>
652 652  
653 653 That last argument specifies whether to simply place the product into r2 or to XOR it with the contents that are already
654 654 in r2. Zero means to place the product there. When we run it, it prints the results of the <b>multiply_region.w32</b> in
655   -hexadecimal. Again, you can verify it using gf mult:<br><br>
  655 +hexadecimal. Again, you can verify it using <b>gf_mult</b>:<br><br>
656 656 <div id="number_spacing">
657 657 UNIX> gf_example_2 4 <br>
658 658 12 * 2 = 11 <br>
... ... @@ -917,7 +917,7 @@ memory consumption and their rough performance. The performance tests are on an
917 917 3.40 GHz, and are included solely to give a flavor of performance on a standard microprocessor. Some processors
918 918 will be faster with some techniques and others will be slower, so we only put numbers in so that you can ballpark it.
919 919 For other values of <em>w</em> between 1 and 31, we use table lookup when w &#8804 8, discrete logarithms when w &#8804 16 and
920   -"Bytwop" for w &#8804 32. </p>
  920 +"Bytwo<sub>p</sub>" for w &#8804 32. </p>
921 921 <br><br>
922 922 <center> With SSE
923 923 <div id="data1">
... ... @@ -972,15 +972,15 @@ For other values of <em>w</em> between 1 and 31, we use table lookup when w &#88
972 972 <td>32 </td><td>16 </td> <td>2,135</td> </tr>
973 973  
974 974 <tr>
975   -<td>32 </td><td>4K </td><td>Bytwop</td><td>19</td><td>Split Table (32,4)</td>
  975 +<td>32 </td><td>4K </td><td>Bytwo<sub>p</sub></td><td>19</td><td>Split Table (32,4)</td>
976 976 <td>4 </td><td>4 </td> <td>1,149</td> </tr>
977 977  
978 978 <tr>
979   -<td>64 </td><td>16K </td><td>Bytwop</td><td>9</td><td>Split Table (64,4)</td>
  979 +<td>64 </td><td>16K </td><td>Bytwo<sub>p</sub></td><td>9</td><td>Split Table (64,4)</td>
980 980 <td>8 </td><td>8 </td> <td>987</td> </tr>
981 981  
982 982 <tr>
983   -<td>128 </td><td>64K </td><td>Bytwop</td><td>1.4</td><td>Split Table (128,4)</td>
  983 +<td>128 </td><td>64K </td><td>Bytwo<sub>p</sub></td><td>1.4</td><td>Split Table (128,4)</td>
984 984 <td>16 </td><td>8 </td> <td>833</td> </tr>
985 985 </table>
986 986 </div>
... ... @@ -1194,30 +1194,30 @@ larger <em>w</em> than <b>"TABLE."</b> If the polynomial is not primitive (see s
1194 1194 an implementation. In that case,<b> gf_init_hard()</b> or <b>create_gf_from_argv()</b> will fail</li><br>
1195 1195  
1196 1196  
1197   -<li><b> "LOG ZERO:"</b> Discrete logarithm tables which include extra room for zero entries. This more than doubles
  1197 +<li><b> "LOG_ZERO:"</b> Discrete logarithm tables which include extra room for zero entries. This more than doubles
1198 1198 the memory consumption to remove an <b>if</b> statement (please see [GMS08] or The Paper for more description). It
1199 1199 doesn’t really make a huge deal of difference in performance</li><br>
1200 1200  
1201   -<li> <b>"LOG ZERO EXT:"</b> This expends even more memory to remove another <b>if</b> statement. Again, please see The
1202   -Paper for an explanation. As with <b>"LOG ZERO,"</b> the performance difference is negligible</li><br>
  1201 +<li> <b>"LOG_ZERO_EXT:"</b> This expends even more memory to remove another <b>if</b> statement. Again, please see The
  1202 +Paper for an explanation. As with <b>"LOG_ZERO,"</b> the performance difference is negligible</li><br>
1203 1203  
1204 1204 <li> <b>"SHIFT:"</b> Implementation straight from the definition of Galois Field multiplication, by shifting and XOR-ing,
1205 1205 then reducing the product using the polynomial. This is <em>slooooooooow,</em> so we don’t recommend you use it</li><br>
1206 1206  
1207 1207  
1208   -<li> <b>"CARRY FREE:"</b> This is identical to <b>"SHIFT,"</b> however it leverages the SSE instruction PCLMUL to perform
  1208 +<li> <b>"CARRY_FREE:"</b> This is identical to <b>"SHIFT,"</b> however it leverages the SSE instruction PCLMUL to perform
1209 1209 carry-freemultiplications in single instructions. As such, it is the fastest way to perform multiplication for large
1210 1210 values of <em>w</em> when that instruction is available. Its performance depends on the polynomial used. See The Paper
1211 1211 for details, and see section 7.7 below for the speedups available when <em>w </em>= 16 and <em>w</em> = 32 if you use a different
1212 1212 polynomial than the default one</li><br>
1213 1213  
1214 1214  
1215   -<li> <b>"BYTWO p:"</b> This implements multiplication by successively multiplying the product by two and selectively
  1215 +<li> <b>"BYTWO_p:"</b> This implements multiplication by successively multiplying the product by two and selectively
1216 1216 XOR-ing the multiplicand. See The Paper for more detail. It can leverage Anvin’s optimization that multiplies
1217 1217 64 and 128 bits of numbers in <em>GF(2<sup>w</sup>) </em> by two with just a few instructions. The SSE version requires SSE2</li><br>
1218 1218  
1219 1219  
1220   -<li> <b>"BYTWO b:"</b> This implements multiplication by successively multiplying the multiplicand by two and selectively
  1220 +<li> <b>"BYTWO_b:"</b> This implements multiplication by successively multiplying the multiplicand by two and selectively
1221 1221 XOR-ing it into the product. It can also leverage Anvin's optimization, and it has the feature that when
1222 1222 you're multiplying a region by a very small constant (like 2), it can terminate the multiplication early. As such,
1223 1223 if you are multiplying regions of bytes by two (as in the Linux RAID-6 Reed-Solomon code [Anv09]), this is
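Note on the hunk above: the "SHIFT", "BYTWO_p" and "BYTWO_b" entries describe three ways of computing the same product, and a small sketch may make the difference concrete. The code below is an illustration only, not GF-Complete's implementation or API. The function names are hypothetical, the polynomial is passed with its leading x^w term included, and w is assumed to be at most 32. With w = 4 and the polynomial 0x13 (an assumption about which polynomial is in use), all three reproduce the "12 * 2 = 11" line from the gf_example_2 output quoted earlier in this diff.

```c
#include <stdint.h>

/* "SHIFT" style: form the full carry-free (XOR) product, then reduce it
 * with the polynomial. poly includes the x^w term (e.g. 0x11d for w = 8).
 * Hypothetical helper for illustration; assumes w <= 32. */
uint32_t sketch_shift_mult(uint32_t a, uint32_t b, int w, uint64_t poly)
{
    uint64_t product = 0;

    /* Shift-and-XOR multiplication: the result has up to 2w-1 bits. */
    for (int i = 0; i < w; i++) {
        if ((b >> i) & 1) product ^= (uint64_t)a << i;
    }
    /* Reduce from the highest possible bit down to bit w. */
    for (int i = 2 * w - 2; i >= w; i--) {
        if ((product >> i) & 1) product ^= poly << (i - w);
    }
    return (uint32_t)product;
}

/* "BYTWO_p" style: walk b from its top bit down, multiplying the product
 * by two each step and selectively XOR-ing in the multiplicand a. */
uint32_t sketch_bytwo_p(uint32_t a, uint32_t b, int w, uint64_t poly)
{
    uint64_t product = 0;
    uint64_t top = (uint64_t)1 << w;

    for (int i = w - 1; i >= 0; i--) {
        product <<= 1;                     /* product = product * 2 */
        if (product & top) product ^= poly;
        if ((b >> i) & 1) product ^= a;
    }
    return (uint32_t)product;
}

/* "BYTWO_b" style: multiply the multiplicand by two each step and
 * selectively XOR it into the product. The loop ends as soon as b runs
 * out of set bits, so small constants finish early. */
uint32_t sketch_bytwo_b(uint32_t a, uint32_t b, int w, uint64_t poly)
{
    uint64_t product = 0;
    uint64_t m = a;
    uint64_t top = (uint64_t)1 << w;

    while (b != 0) {
        if (b & 1) product ^= m;
        b >>= 1;
        m <<= 1;                           /* m = m * 2 */
        if (m & top) m ^= poly;
    }
    return (uint32_t)product;
}
```

In this sketch, multiplying by the constant 2 takes two loop iterations in sketch_bytwo_b regardless of w, which is the early-termination property the "BYTWO_b" entry mentions.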
... ... @@ -1269,7 +1269,7 @@ In order to specify the base field, put appropriate flags after specifying <em>k
1269 1269 and after that, you may continue making specifications for the composite field. This process can be continued
1270 1270 for multiple layers of <b>"COMPOSITE."</b> As an example, the following multiplies 1000000 and 2000000
1271 1271 in <em>GF((2<sup>16</sup>)<sup>2</sup>),</em> where the base field uses <b>BYTWO_p</b> for multiplication: <br><br>
1272   -<center>./gf mult 1000000 2000000 32 -m COMPOSITE 2 <span style="color:red">-m BYTWO p - -</span> </center><br>
  1272 +<center>./gf_mult 1000000 2000000 32 -m COMPOSITE 2 <span style="color:red">-m BYTWO_p - -</span> </center><br>
1273 1273  
1274 1274 In the above example, the red text applies to the base field, and the black text applies to the composite field.
1275 1275 Composite fields have two defining polynomials - one for the composite field, and one for the base field. Thus, if
... ... @@ -1278,7 +1278,7 @@ form x<sup>2</sup>+sx+1, where s is an element of <em>GF(2<sup>k</sup>).</em> To
1278 1278 example below, we multiply 20000 and 30000 in <em>GF((2<sup>8</sup>)<sup>2</sup>) </em>, setting s to three, and using x<sup>8</sup>+x<sup>4</sup>+x<sup>3</sup>+x<sup>2</sup>+1
1279 1279 as the polynomial for the base field: <br><br>
1280 1280  
1281   -<center>./gf mult 20000 30000 16 -m COMPOSITE 2 <span style="color:red">-p 0x11d </span> - -p 0x3 - </center> <br><br>
  1281 +<center>./gf_mult 20000 30000 16 -m COMPOSITE 2 <span style="color:red">-p 0x11d </span> - -p 0x3 - </center> <br><br>
1282 1282  
1283 1283 If you use composite fields, you should consider using <b>"ALTMAP"</b> as well. The reason is that the region
1284 1284 operations will go much faster. Please see section 7.6.<br><br>
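Note on the hunk above: the composite-field text says an element of GF((2^8)^2) is defined by a polynomial of the form x^2+sx+1 over the base field. A minimal sketch of what that multiplication involves follows. It is not GF-Complete's code. The function names are hypothetical, the base field is fixed to GF(2^8) with polynomial 0x11d as in the example, and the high byte of each 16-bit word is assumed to be the coefficient of x (the library's actual word layout may differ).

```c
#include <stdint.h>

/* Base field GF(2^8) with polynomial 0x11d, multiplied by repeated
 * doubling of the multiplicand. Hypothetical helper for illustration. */
static uint8_t sketch_base_mult(uint8_t a, uint8_t b)
{
    uint16_t product = 0;
    uint16_t m = a;

    while (b != 0) {
        if (b & 1) product ^= m;
        b >>= 1;
        m <<= 1;
        if (m & 0x100) m ^= 0x11d;
    }
    return (uint8_t)product;
}

/* Multiply a = a1*x + a0 and b = b1*x + b0 in GF((2^8)^2), reducing with
 * x^2 = s*x + 1 (addition is XOR, so the signs vanish):
 *   a*b = (a1*b0 + a0*b1 + s*a1*b1) * x + (a0*b0 + a1*b1)
 * Assumption: the high byte holds the coefficient of x. */
uint16_t sketch_composite_mult(uint16_t a, uint16_t b, uint8_t s)
{
    uint8_t a1 = (uint8_t)(a >> 8), a0 = (uint8_t)(a & 0xff);
    uint8_t b1 = (uint8_t)(b >> 8), b0 = (uint8_t)(b & 0xff);
    uint8_t a1b1 = sketch_base_mult(a1, b1);

    uint8_t hi = sketch_base_mult(a1, b0) ^ sketch_base_mult(a0, b1)
               ^ sketch_base_mult(s, a1b1);
    uint8_t lo = sketch_base_mult(a0, b0) ^ a1b1;

    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

Every composite multiplication here costs five base-field multiplications, which is one way to see why the choice of base-field technique (the red text in the command lines above) matters for performance.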
... ... @@ -1340,13 +1340,13 @@ multiplication techniques which can leverage SSE instructions and which versions
1340 1340 <td><b>"SPLIT"</b></td><td>-</td><td>Yes</td><td>SSSE3</td><td>Only when the second argument equals 4.</td>
1341 1341  
1342 1342 <tr>
1343   -<td><b>"SPLIt"</b></td><td>- </td><td>Yes</td><td>SSE4</td><td>When <em>w </em> = 64 and not using <b>"ALTMAP".</b></td>
  1343 +<td><b>"SPLIT"</b></td><td>- </td><td>Yes</td><td>SSE4</td><td>When <em>w </em> = 64 and not using <b>"ALTMAP".</b></td>
1344 1344  
1345 1345 <tr>
1346   -<td><b>"BYTWO p"</b></td><td>- </td><td>Yes</td><td>SSE2</td><td></td>
  1346 +<td><b>"BYTWO_p"</b></td><td>- </td><td>Yes</td><td>SSE2</td><td></td>
1347 1347  
1348 1348 <tr>
1349   -<td><b>"BYTWO p"</b></td><td>- </td><td>Yes</td><td>SSE2</td><td></td>
  1349 +<td><b>"BYTWO_p"</b></td><td>- </td><td>Yes</td><td>SSE2</td><td></td>
1350 1350  
1351 1351 </table></div> <br><br>
1352 1352 Table 2: Multiplication techniques which can leverage SSE instructions when they are available.
... ... @@ -1425,12 +1425,12 @@ listed. If multiple region options are required, they should be specified indepe
1425 1425 and independent options for command-line tools and <b>create_gf_from_argv()).</b> </p>
1426 1426  
1427 1427  
1428   -<h3>6.2 &nbsp&nbsp&nbspDetermining Supported Techniques with gf methods </h3>
  1428 +<h3>6.2 &nbsp&nbsp&nbspDetermining Supported Techniques with gf_methods </h3>
1429 1429  
1430 1430  
1431 1431 The program <b>gf_methods</b> prints a list of supported methods on standard output. It is called as follows:<br><br>
1432 1432 <div id="number_spacing">
1433   -<center>./gf methods <em>w </em> -BADC -LUMDRB <br><br> </center> </div>
  1433 +<center>./gf_methods <em>w </em> -BADC -LUMDRB <br><br> </center> </div>
1434 1434  
1435 1435 The first argument is <em>w </em>, which may be any legal value of <em>w </em>. The second argument has the following flags: <br><br>
1436 1436 <ul>
... ... @@ -1583,7 +1583,7 @@ The performance of "Region-By-Zero" and "Region-By-One" will not change from tes
1583 1583 the same calls for these. "Region-By-Zero" with "XOR: 1" does nothing except set up the tests. Therefore, you may
1584 1584 use it as a control.</p>
1585 1585  
1586   -<h3>6.3.1 &nbsp &nbsp &nbsp time tool.sh </h3>
  1586 +<h3>6.3.1 &nbsp &nbsp &nbsp time_tool.sh </h3>
1587 1587  
1588 1588 Finally, the shell script <b>time_tool.sh</b> makes a bunch of calls to <b>gf_time</b> to give a rough estimate of performance. It is
1589 1589 called as follows:<br><br>
... ... @@ -1637,7 +1637,7 @@ error may be minimized. </p>
1637 1637  
1638 1638 6 &nbsp &nbsp <em> THE DEFAULTS </em> <span id="index_number">23 </span> <br><br><br>
1639 1639  
1640   -<h3>6.3.2 &nbsp &nbsp &nbsp An example of gf methods and time tool.sh </h3><br><br>
  1640 +<h3>6.3.2 &nbsp &nbsp &nbsp An example of gf_methods and time_tool.sh </h3><br><br>
1641 1641 Let's give an example of how some of these components fit together. Suppose we want to explore the basic techniques
1642 1642 in <em>GF(2<sup>32</sup>).</em> First, let's take a look at what <b>gf_methods</b> suggests as "basic" methods: <br><br>
1643 1643 <div id="number_spacing">
... ... @@ -1656,7 +1656,7 @@ UNIX> <br><br>
1656 1656  
1657 1657 <p>
1658 1658  
1659   -You'll note, this is on my old Macbook Pro, which doesn't support (PCLMUL), so <b>"CARRY FREE"</b> is not included
  1659 +You'll note, this is on my old Macbook Pro, which doesn't support (PCLMUL), so <b>"CARRY_FREE"</b> is not included
1660 1660 as an option. Now, let's run the unit tester on these to make sure they work, and to see their memory consumption: </p><br><br>
1661 1661  
1662 1662 <div id="number_spacing">
... ... @@ -1739,7 +1739,7 @@ which is why we don't use "<b>-m SPLIT 32 4 -r ALTMAP -.</b>"</p>
1739 1739 <p>
1740 1740 <b>Test question:</b> Given the numbers above, it would appear that <b>"COMPOSITE"</b> yields the fastest performance of
1741 1741 single multiplication, while "SPLIT 32 4" yields the fastest performance of region multiplication. Should I use two
1742   -gf t's in my application – one for single multiplication that uses <b>"COMPOSITE,"</b> and one for region multiplication
  1742 +gf_t's in my application – one for single multiplication that uses <b>"COMPOSITE,"</b> and one for region multiplication
1743 1743 that uses <b>"SPLIT 32 4?"</b></p>
1744 1744 <p>
1745 1745 The answer to this is "no." Why? Because composite fields are different from the "standard" fields, and if you mix
... ... @@ -1780,7 +1780,7 @@ void *scratch_memory); </div><br><br>
1780 1780  
1781 1781  
1782 1782 The arguments mult type, region type and divide type allow for the same specifications as above, except the
1783   -types are integer constants defined in gf complete.h: <br><br>
  1783 +types are integer constants defined in gf_complete.h: <br><br>
1784 1784 typedef enum {GF_MULT_DEFAULT,<br>
1785 1785 <div style="padding-left:124px">
1786 1786 GF_MULT_SHIFT<br>
... ... @@ -2044,26 +2044,26 @@ The performance difference using <b>"ALTMAP"</b> can be significant: <br><br><br
2044 2044 <div id="table_page28">
2045 2045 <table cellpadding="6" cellspacing="0" style="text-align:center;font-size:19px">
2046 2046 <tr>
2047   -<td> gf time 16 G 0 1048576 100 -m SPLIT 16 4 -</td> <td>Speed = 8,389 MB/s </td>
  2047 +<td> gf_time 16 G 0 1048576 100 -m SPLIT 16 4 -</td> <td>Speed = 8,389 MB/s </td>
2048 2048 </tr>
2049 2049 <tr>
2050   -<td>gf time 16 G 0 1048576 100 -m SPLIT 16 4 -r ALTMAP - </td> <td>Speed = 8,389 MB/s </td>
  2050 +<td>gf_time 16 G 0 1048576 100 -m SPLIT 16 4 -r ALTMAP - </td> <td>Speed = 8,389 MB/s </td>
2051 2051 </tr>
2052 2052  
2053 2053 <tr>
2054   -<td>gf time 32 G 0 1048576 100 -m SPLIT 32 4 -</td> <td> Speed = 5,304 MB/s</td>
  2054 +<td>gf_time 32 G 0 1048576 100 -m SPLIT 32 4 -</td> <td> Speed = 5,304 MB/s</td>
2055 2055 </tr>
2056 2056 <tr>
2057   -<td>gf time 32 G 0 1048576 100 -m SPLIT 32 4 -r ALTMAP -</td> <td> Speed = 7,146 MB/s</td>
  2057 +<td>gf_time 32 G 0 1048576 100 -m SPLIT 32 4 -r ALTMAP -</td> <td> Speed = 7,146 MB/s</td>
2058 2058 </tr>
2059 2059  
2060 2060  
2061 2061 <tr>
2062   -<td>gf time 64 G 0 1048576 100 -m SPLIT 64 4 - </td> <td>Speed = 2,595 MB/s </td>
  2062 +<td>gf_time 64 G 0 1048576 100 -m SPLIT 64 4 - </td> <td>Speed = 2,595 MB/s </td>
2063 2063 </tr>
2064 2064  
2065 2065 <tr>
2066   -<td>gf time 64 G 0 1048576 100 -m SPLIT 64 4 -r ALTMAP - </td> <td>Speed = 3,436 MB/s </td>
  2066 +<td>gf_time 64 G 0 1048576 100 -m SPLIT 64 4 -r ALTMAP - </td> <td>Speed = 3,436 MB/s </td>
2067 2067 </tr>
2068 2068 </div>
2069 2069  
... ... @@ -2179,15 +2179,15 @@ region(),</b> rather than simply calling <b>multiply()</b> on every word in the
2179 2179  
2180 2180 <table cellpadding="6" cellspacing="0" style="text-align:center;font-size:19px"><tr>
2181 2181 <td>
2182   -gf time 32 G 0 10240 10240 -m COMPOSITE 2 - -
  2182 +gf_time 32 G 0 10240 10240 -m COMPOSITE 2 - -
2183 2183 Speed = 322 MB/s </td> </tr>
2184 2184 <tr>
2185   -<td>gf time 32 G 0 10240 10240 -m COMPOSITE 2 - -r ALTMAP -
  2185 +<td>gf_time 32 G 0 10240 10240 -m COMPOSITE 2 - -r ALTMAP -
2186 2186 Speed = 3,368 MB/s </td> </tr>
2187 2187  
2188 2188 <tr>
2189 2189 <td>
2190   -gf time 32 G 0 10240 10240 -m COMPOSITE 2 -m SPLIT 16 4 -r ALTMAP - -r ALTMAP -
  2190 +gf_time 32 G 0 10240 10240 -m COMPOSITE 2 -m SPLIT 16 4 -r ALTMAP - -r ALTMAP -
2191 2191 Speed = 3,925 MB/s </td> </tr>
2192 2192 </center>
2193 2193 </table>
... ... @@ -2207,10 +2207,10 @@ as fast. The difference is the inlining of multiplication in the base field when
2207 2207  
2208 2208 <table cellpadding="6" cellspacing="0" style="text-align:center;font-size:19px">
2209 2209  
2210   - <tr><td>gf time 8 M 0 1048576 100 - Speed = 501 Mega-ops/s</td> </tr>
2211   - <tr><td>gf time 8 M 0 1048576 100 -m SPLIT 8 4 - Speed = 439 Mega-ops/s </td> </tr>
2212   - <tr><td>gf time 8 M 0 1048576 100 -m COMPOSITE 2 - - Speed = 207 Mega-ops/s </td> </tr>
2213   - <tr><td>gf time 8 M 0 1048576 100 -m COMPOSITE 2 -m SPLIT 8 4 - - Speed = 77 Mega-ops/s </td> </tr>
  2210 + <tr><td>gf_time 8 M 0 1048576 100 - Speed = 501 Mega-ops/s</td> </tr>
  2211 + <tr><td>gf_time 8 M 0 1048576 100 -m SPLIT 8 4 - Speed = 439 Mega-ops/s </td> </tr>
  2212 + <tr><td>gf_time 8 M 0 1048576 100 -m COMPOSITE 2 - - Speed = 207 Mega-ops/s </td> </tr>
  2213 + <tr><td>gf_time 8 M 0 1048576 100 -m COMPOSITE 2 -m SPLIT 8 4 - - Speed = 77 Mega-ops/s </td> </tr>
2214 2214  
2215 2215 </table>
2216 2216 </center>
... ... @@ -2235,17 +2235,17 @@ region operations (641 MB/s):
2235 2235  
2236 2236 <div id="number_spacing">
2237 2237 <center>
2238   -gf time 128 G 0 1048576 100 -m COMPOSITE 2 <span style="color:red">-m COMPOSITE 2 </span> <span style="color:blue">-m COMPOSITE 2 </span> <br>
  2238 +gf_time 128 G 0 1048576 100 -m COMPOSITE 2 <span style="color:red">-m COMPOSITE 2 </span> <span style="color:blue">-m COMPOSITE 2 </span> <br>
2239 2239 <span style="color:rgb(250, 149, 167)">-m SPLIT 16 4 -r ALTMAP -</span> <span style="color:blue">-r ALTMAP -</span> <span style="color:red"> -r ALTMAP -</span> -r ALTMAP -
2240 2240 </center>
2241 2241 </div><br>
2242 2242  
2243 2243 <p>Please see section 7.8.1 for a discussion of polynomials in composite fields.</p>
2244 2244  
2245   -<h2>7.7 &nbsp &nbsp &nbsp "CARRY FREE" and the Primitive Polynomial </h2>
  2245 +<h2>7.7 &nbsp &nbsp &nbsp "CARRY_FREE" and the Primitive Polynomial </h2>
2246 2246  
2247 2247  
2248   -If your machine supports the PCLMUL instruction, then we leverage that in <b>"CARRY FREE."</b> This implementation
  2248 +If your machine supports the PCLMUL instruction, then we leverage that in <b>"CARRY_FREE."</b> This implementation
2249 2249 first performs a carry free multiplication of two <em>w</em>-bit numbers, which yields a 2<em>w</em>-bit number. It does this with
2250 2250 one PCLMUL instruction. To reduce the 2<em>w</em>-bit number back to a <em>w</em>-bit number requires some manipulation of the
2251 2251 polynomial. As it turns out, if the polynomial has a lot of contiguous zeroes following its leftmost one, the number of
... ... @@ -2260,9 +2260,9 @@ You can see the difference in performance:
2260 2260 <table cellpadding="6" cellspacing="0" style="text-align:center;font-size:19px">
2261 2261 <tr>
2262 2262  
2263   -<td>gf time 32 M 0 1048576 100 -m CARRY FREE - </td> <td> Speed = 48 Mega-ops/s</td> </tr>
  2263 +<td>gf_time 32 M 0 1048576 100 -m CARRY_FREE - </td> <td> Speed = 48 Mega-ops/s</td> </tr>
2264 2264  
2265   -<tr><td>gf time 32 M 0 1048576 100 -m CARRY FREE -p 0xc5 -</td> <td> Speed = 81 Mega-ops/s </td> </tr>
  2265 +<tr><td>gf_time 32 M 0 1048576 100 -m CARRY_FREE -p 0xc5 -</td> <td> Speed = 81 Mega-ops/s </td> </tr>
2266 2266  
2267 2267 </table></center>
2268 2268 </div>
... ... @@ -2270,8 +2270,8 @@ You can see the difference in performance:
2270 2270  
2271 2271 <p>
2272 2272 This is relevant for <em>w </em> = 16 and <em>w </em> = 32, where the "standard" polynomials are sub-optimal with respect to
2273   -<b>"CARRY FREE."</b> For <em>w </em> = 16, the polynomial 0x1002d has the desired property. It’s less important, of course,
2274   -with <em>w </em> = 16, because <b>"LOG"</b> is so much faster than <b>CARRY FREE.</b> </p>
  2273 +<b>"CARRY_FREE."</b> For <em>w </em> = 16, the polynomial 0x1002d has the desired property. It’s less important, of course,
  2274 +with <em>w </em> = 16, because <b>"LOG"</b> is so much faster than <b>CARRY_FREE.</b> </p>
2275 2275  
2276 2276 <h2>7.8 &nbsp More on Primitive Polynomials </h3>
2277 2277  
... ... @@ -2383,7 +2383,7 @@ GF-Complete will successfully select a default polynomial in the following compo
2383 2383 6 &nbsp &nbsp <em> FURTHER INFORMATION ON OPTIONS AND ALGORITHMS </em> <span id="index_number">33 </span> <br><br><br>
2384 2384  
2385 2385  
2386   -<h3>7.8.3 The Program gf poly for Verifying Irreducibility of Polynomials </h3>
  2386 +<h3>7.8.3 The Program gf_poly for Verifying Irreducibility of Polynomials </h3>
2387 2387  
2388 2388 The program <b>gf_poly</b> uses the Ben-Or algorithm[GP97] to determine whether a polynomial with coefficients in <em> GF(2<sup>w </sup>) </em>
2389 2389 is reducible. Its syntax is:<br><br>
... ... @@ -2640,8 +2640,8 @@ stored in 16 16-byte regions.&lt;/p&gt;&lt;br&gt;
2640 2640 <h3>7.9.2 &nbsp Alternate mappings with "COMPOSITE" </h3>
2641 2641  
2642 2642 With <b>"COMPOSITE,"</b> the alternate mapping divides the middle region in half. The lower half of each word is stored
2643   -in the first half of the middle region, and the higher half is stored in the second half. To illustrate, gf example 6
2644   -performs the same example as gf example 5, except it is using <b>"COMPOSITE"</b> in GF((2<sup>16</sup>)<sup>2</sup>), and it is multiplying
  2643 +in the first half of the middle region, and the higher half is stored in the second half. To illustrate, gf_example_6
  2644 +performs the same example as gf_example_5, except it is using <b>"COMPOSITE"</b> in GF((2<sup>16</sup>)<sup>2</sup>), and it is multiplying
2645 2645 a region of 120 bytes rather than 60. As before, the pointers are not aligned on 16-bit quantities, so the region is broken
2646 2646 into three regions of 4 bytes, 96 bytes, and 20 bytes. In the first and third region, each consecutive four byte word is a
2647 2647 word in <em>GF(2<sup>32</sup>).</em> For example, word 0 is 0x562c640b, and word 25 is 0x46bc47e0. In the middle region, the low two
... ... @@ -2847,14 +2847,14 @@ section 7.1.</li><br>
2847 2847 <li> <b>MOA_Random_W()</b> in <b>gf_rand.h:</b> Creates a random w-bit number, where <em>w </em> &#8804 32. </li><br>
2848 2848 <li> <b>MOA_Seed()</b> in <b>gf_rand.h:</b> Sets the seed for the random number generator. </li><br>
2849 2849 <li> <b>gf_errno</b> in <b>gf_complete.h:</b> This is to help figure out why an initialization call failed. See section 6.1.</li><br>
2850   -<li> <b>gf_create_gf_from_argv()</b> in <b>gf method.h:</b> Creates a gf t using C style argc/argv. See section 6.1.1. </li><br>
  2850 +<li> <b>gf_create_gf_from_argv()</b> in <b>gf_method.h:</b> Creates a gf_t using C style argc/argv. See section 6.1.1. </li><br>
2851 2851 <li> <b>gf_division_type_t</b> in <b>gf_complete.h:</b> the different ways to specify division when using <b>gf_init_hard().</b> See
2852 2852 section 6.4. </li><br>
2853 2853 <li> <b>gf_error()</b> in <b>gf_complete.h:</b> This prints out why an initialization call failed. See section 6.1. </li><br>
2854 2854  
2855   -<li> <b>gf_extract</b> in <b>gf_complete.h:</b> This is the data type of <b>extract_word()</b> in a gf t. See section 7.9 for an example
  2855 +<li> <b>gf_extract</b> in <b>gf_complete.h:</b> This is the data type of <b>extract_word()</b> in a gf_t. See section 7.9 for an example
2856 2856 of how to use extract word().</li>
2857   -
  2857 +</ul>
2858 2858  
2859 2859  
2860 2860  
... ... @@ -3028,7 +3028,7 @@ composite field too. See 7.8.2 for the fields where GF-Complete will support def
3028 3028 explanation</li><br>
3029 3029  
3030 3030  
3031   -<li> <b>"ALTMAP" is confusing.</b> We agree. Please see section 7.9 for more explanation.
  3031 +<li> <b>"ALTMAP" is confusing.</b> We agree. Please see section 7.9 for more explanation.</li><br>
3032 3032  
3033 3033 <li> <b>I used "ALTMAP" and it doesn't appear to be functioning correctly.</b> With 7.9, the size of the region and
3034 3034 its alignment both matter in terms of how <b>"ALTMAP"</b> performs <b>multiply_region()</b>. Please see section 7.9 for
... ... @@ -3065,7 +3065,7 @@ per second.
3065 3065  
3066 3066 <p>As would be anticipated, the inlined operations (see section 7.1) outperform the others. Additionally, in all
3067 3067 cases with the exception of <em>w</em> = 32, the defaults are the fastest performing implementations. With w = 32,
3068   -"CARRY FREE" is the fastest with an alternate polynomial (see section 7.7). Because we require the defaults to
  3068 +"CARRY_FREE" is the fastest with an alternate polynomial (see section 7.7). Because we require the defaults to
3069 3069 use a "standard" polynomial, we cannot use this implementation as the default. </p>
3070 3070  
3071 3071 <h2>11.2 &nbsp Divide() </h2>
... ... @@ -3126,9 +3126,9 @@ For these tables, we performed 1GB worth of <b>multiply_region()</b> calls for a
3126 3126  
3127 3127 <tr><td>-m TABLE (Default) -</td> <td>11879.909</td> </tr>
3128 3128 <tr><td>-m TABLE -r CAUCHY -</td> <td>9079.712</td> </tr>
3129   -<tr><td>-m BYTWO b -</td> <td>5242.400</td> </tr>
3130   -<tr><td>-m BYTWO p -</td> <td>4078.431</td> </tr>
3131   -<tr><td>-m BYTWO b -r NOSSE -</td> <td>3799.699</td> </tr>
  3129 +<tr><td>-m BYTWO_b -</td> <td>5242.400</td> </tr>
  3130 +<tr><td>-m BYTWO_p -</td> <td>4078.431</td> </tr>
  3131 +<tr><td>-m BYTWO_b -r NOSSE -</td> <td>3799.699</td> </tr>
3132 3132 <tr><td>-m TABLE -r QUAD -</td> <td>3014.315</td> </tr>
3133 3133  
3134 3134 <tr><td>-m TABLE -r DOUBLE -</td> <td>2253.627</td> </tr>
... ... @@ -3138,7 +3138,7 @@ For these tables, we performed 1GB worth of <b>multiply_region()</b> calls for a
3138 3138  
3139 3139  
3140 3140 <tr><td>m SHIFT -</td> <td>157.749</td> </tr>
3141   -<tr><td>-m CARRY FREE -</td> <td>86.202</td> </tr>
  3141 +<tr><td>-m CARRY_FREE -</td> <td>86.202</td> </tr>
3142 3142 </div>
3143 3143 </table> <br><br>
3144 3144 </div> </center>
... ... @@ -3188,27 +3188,27 @@ of Computational Mathematics,</em> pages 346–361. Springer Verlag, 1997.
3188 3188 <tr><td>-m SPLIT 8 4 (Default)</td> <td>13279.146</td> </tr>
3189 3189 <tr><td>-m COMPOSITE 2 - -r ALTMAP -</td> <td>5516.588</td> </tr>
3190 3190 <tr><td>-m TABLE -r CAUCHY -</td> <td>4968.721</td> </tr>
3191   -<tr><td>-m BYTWO b -</td> <td>2656.463</td> </tr>
  3191 +<tr><td>-m BYTWO_b -</td> <td>2656.463</td> </tr>
3192 3192 <tr><td>-m TABLE -r DOUBLE -</td> <td>2561.225</td> </tr>
3193 3193 <tr><td>-m TABLE -</td> <td>1408.577</td> </tr>
3194 3194  
3195   -<tr><td>-m BYTWO b -r NOSSE -</td> <td>1382.409</td> </tr>
3196   -<tr><td>-m BYTWO p -</td> <td>1376.661</td> </tr>
3197   -<tr><td>-m LOG ZERO EXT -</td> <td>1175.739</td> </tr>
3198   -<tr><td>-m LOG ZERO -</td> <td>1174.694</td> </tr>
  3195 +<tr><td>-m BYTWO_b -r NOSSE -</td> <td>1382.409</td> </tr>
  3196 +<tr><td>-m BYTWO_p -</td> <td>1376.661</td> </tr>
  3197 +<tr><td>-m LOG_ZERO_EXT -</td> <td>1175.739</td> </tr>
  3198 +<tr><td>-m LOG_ZERO -</td> <td>1174.694</td> </tr>
3199 3199  
3200 3200  
3201 3201 <tr><td>-m LOG -</td> <td>997.838</td> </tr>
3202 3202 <tr><td>-m SPLIT 8 4 -r NOSSE -</td> <td>885.897</td> </tr>
3203 3203  
3204 3204  
3205   -<tr><td>-m BYTWO p -r NOSSE -</td> <td>589.520</td> </tr>
  3205 +<tr><td>-m BYTWO_p -r NOSSE -</td> <td>589.520</td> </tr>
3206 3206 <tr><td>-m COMPOSITE 2 - -</td> <td>327.039</td> </tr>
3207 3207  
3208 3208  
3209 3209 <tr><td>-m SHIFT -</td> <td>106.115</td> </tr>
3210 3210  
3211   -<tr><td>-m CARRY FREE -</td> <td>104.299</td> </tr>
  3211 +<tr><td>-m CARRY_FREE -</td> <td>104.299</td> </tr>
3212 3212  
3213 3213  
3214 3214 </div>
... ... @@ -3272,14 +3272,14 @@ Practice & Experience,</em> 27(9):995-1012, September 1997.
3272 3272 <tr><td>-m SPLIT 8 8 -</td> <td>2163.993</td> </tr>
3273 3273 <tr><td>-m SPLIT 16 4 -r NOSSE -</td> <td>1148.810</td> </tr>
3274 3274 <tr><td>-m LOG -</td> <td>1019.896</td> </tr>
3275   -<tr><td>-m LOG ZERO -</td> <td>1016.814</td> </tr>
3276   -<tr><td>-m BYTWO b -</td> <td>738.879</td> </tr>
  3275 +<tr><td>-m LOG_ZERO -</td> <td>1016.814</td> </tr>
  3276 +<tr><td>-m BYTWO_b -</td> <td>738.879</td> </tr>
3277 3277 <tr><td>-m COMPOSITE 2 - -</td> <td>596.819</td> </tr>
3278   -<tr><td>-m BYTWO p -</td> <td>560.972</td> </tr>
  3278 +<tr><td>-m BYTWO_p -</td> <td>560.972</td> </tr>
3279 3279 <tr><td>-m GROUP 4 4 -</td> <td>450.815</td> </tr>
3280   -<tr><td>-m BYTWO b -r NOSSE -</td> <td>332.967</td> </tr>
3281   -<tr><td>-m BYTWO p -r NOSSE -</td> <td>249.849</td> </tr>
3282   -<tr><td>-m CARRY FREE -</td> <td>111.582</td> </tr>
  3280 +<tr><td>-m BYTWO_b -r NOSSE -</td> <td>332.967</td> </tr>
  3281 +<tr><td>-m BYTWO_p -r NOSSE -</td> <td>249.849</td> </tr>
  3282 +<tr><td>-m CARRY_FREE -</td> <td>111.582</td> </tr>
3283 3283 <tr><td>-m SHIFT -</td> <td>95.813</td> </tr>
3284 3284  
3285 3285  
... ... @@ -3321,21 +3321,21 @@ of the Association for Computing Machinery,</em> 36(2):335-348, April 1989.
3321 3321 -m SPLIT 32 4 (Default) <br>
3322 3322 -m COMPOSITE 2 -m SPLIT 16 4 -r ALTMAP - -r ALTMAP - <br>
3323 3323 -m COMPOSITE 2 - -r ALTMAP - <br>
3324   --m SPLIT 8 8 <br>
3325   --m SPLIT 32 8 <br>
3326   --m SPLIT 32 16 <br>
  3324 +-m SPLIT 8 8 - <br>
  3325 +-m SPLIT 32 8 - <br>
  3326 +-m SPLIT 32 16 - <br>
3327 3327 -m SPLIT 8 8 -r CAUCHY <br>
3328 3328 -m SPLIT 32 4 -r NOSSE <br>
3329   --m CARRY FREE -p 0xc5 <br>
  3329 +-m CARRY_FREE -p 0xc5 <br>
3330 3330 -m COMPOSITE 2 - <br>
3331   --m BYTWO b <br>
3332   --m BYTWO p <br>
3333   --m GROUP 4 8 <br>
3334   --m GROUP 4 4 <br>
3335   --m CARRY FREE <br>
3336   --m BYTWO b -r NOSSE <br>
3337   --m BYTWO p -r NOSSE <br>
3338   --m SHIFT <br>
  3331 +-m BYTWO_b - <br>
  3332 +-m BYTWO_p - <br>
  3333 +-m GROUP 4 8 - <br>
  3334 +-m GROUP 4 4 - <br>
  3335 +-m CARRY_FREE - <br>
  3336 +-m BYTWO_b -r NOSSE - <br>
  3337 +-m BYTWO_p -r NOSSE - <br>
  3338 +-m SHIFT - <br>
3339 3339  
3340 3340 </td>
3341 3341  
... ... @@ -3382,16 +3382,16 @@ of the Association for Computing Machinery,</em> 36(2):335-348, April 1989.
3382 3382 -m COMPOSITE 2 - -r ALTMAP - <br>
3383 3383 -m SPLIT 64 16 - <br>
3384 3384 -m SPLIT 64 8 - <br>
3385   --m CARRY FREE - <br>
  3385 +-m CARRY_FREE - <br>
3386 3386 -m SPLIT 64 4 -r NOSSE - <br>
3387 3387 -m GROUP 4 4 - <br>
3388 3388 -m GROUP 4 8 - <br>
3389   --m BYTWO b - <br>
3390   --m BYTWO p - <br>
  3389 +-m BYTWO_b - <br>
  3390 +-m BYTWO_p - <br>
3391 3391 -m SPLIT 8 8 - <br>
3392   --m BYTWO p -r NOSSE - <br>
  3392 +-m BYTWO_p -r NOSSE - <br>
3393 3393 -m COMPOSITE 2 - - <br>
3394   --m BYTWO b -r NOSSE - <br>
  3394 +-m BYTWO_b -r NOSSE - <br>
3395 3395 -m SHIFT - <br>
3396 3396  
3397 3397 </td>
... ... @@ -3446,17 +3446,17 @@ of the Association for Computing Machinery,</em> 36(2):335-348, April 1989.
3446 3446  
3447 3447 <td>
3448 3448  
3449   --m SPLIT 128 4 -r ALTMAP- <br>
3450   --m COMPOSITE 2 -m SPLIT 64 4 -r ALTMAP - -r ALTMAP- <br>
3451   --m COMPOSITE 2 - -r ALTMAP- <br>
3452   --m SPLIT 128 8 (Default)- <br>
3453   --m CARRY FREE -<br>
  3449 +-m SPLIT 128 4 -r ALTMAP - <br>
  3450 +-m COMPOSITE 2 -m SPLIT 64 4 -r ALTMAP - -r ALTMAP - <br>
  3451 +-m COMPOSITE 2 - -r ALTMAP - <br>
  3452 +-m SPLIT 128 8 (Default) - <br>
  3453 +-m CARRY_FREE -<br>
3454 3454 -m SPLIT 128 4 -<br>
3455 3455 -m COMPOSITE 2 - <br>
3456 3456 -m GROUP 4 8 -<br>
3457 3457 -m GROUP 4 4 -<br>
3458   --m BYTWO p -<br>
3459   --m BYTWO b -<br>
  3458 +-m BYTWO_p -<br>
  3459 +-m BYTWO_b -<br>
3460 3460 -m SHIFT -<br>
3461 3461 </td>
3462 3462  
... ...