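atof is the C function that converts a string to a floating-point number. Java has a corresponding implementation, the familiar Double.parseDouble(String s).
Since this post is about the Java implementation of atof, it cannot be separated from the C implementation. To quote a line from [我的算法学习之路] (My Road to Learning Algorithms):
“stof is a ridiculously simple 'algorithm'”
That is indeed the case. The core of converting a string to a floating-point number comes down to one thing: how to turn the characters '0'~'9' into the numbers 0~9 that the computer can work with. C has a very simple way to do this: int x = (char)c - '0'; What remains is error handling and working out efficiently how many digits the string contains.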
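To make the `c - '0'` idea concrete, here is a minimal sketch of a naive atof written in Java. The class `NaiveAtof` and its `parse` method are hypothetical names made up for this illustration; the sketch assumes well-formed input with no exponent, and it makes no attempt at the careful rounding the JDK performs.

```java
// Hypothetical illustration only; this is not the JDK implementation.
final class NaiveAtof {
    /** Parses an optional sign, digits and an optional fraction. No exponent, no validation. */
    static double parse(String s) {
        int i = 0;
        boolean negative = false;
        if (i < s.length() && (s.charAt(i) == '-' || s.charAt(i) == '+')) {
            negative = (s.charAt(i) == '-');
            i++;
        }
        long mantissa = 0;        // every digit read so far, as one integer
        int fractionDigits = 0;   // how many of those digits followed the '.'
        boolean seenPoint = false;
        for (; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                mantissa = mantissa * 10 + (c - '0'); // the core trick: char minus '0'
                if (seenPoint) {
                    fractionDigits++;
                }
            } else if (c == '.' && !seenPoint) {
                seenPoint = true;
            } else {
                break; // stop at the first character that is not part of the number
            }
        }
        double scale = 1.0;
        for (int k = 0; k < fractionDigits; k++) {
            scale *= 10.0;
        }
        double value = mantissa / scale; // one division, so one rounding for short inputs
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        System.out.println(parse("-3.25")); // prints -3.25
        System.out.println(parse("42"));    // prints 42.0
    }
}
```

For short inputs like these the result matches Double.parseDouble, but long digit strings would overflow the `long` or lose precision, which is part of what the JDK code below guards against.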
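The main goal of this post is to understand the Java implementation and, through it, how Java converts a string to a floating-point number. The code shows that its authors paid attention to every variable declaration and to the if control flow in order to keep it efficient: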
<span style="font-size:18px;">public strictfp double doubleValue(){
    int kDigits = Math.min( nDigits, maxDecimalDigits+1 );
    long lValue;
    double dValue;
    double rValue, tValue;

    // First, check for NaN and Infinity values
    if(digits == infinity || digits == notANumber) {
        if(digits == notANumber)
            return Double.NaN;
        else
            return (isNegative?Double.NEGATIVE_INFINITY:Double.POSITIVE_INFINITY);
    }
    else {
        if (mustSetRoundDir) {
            roundDir = 0;
        }
        /*
         * convert the lead kDigits to a long integer.
         */
        // (special performance hack: start to do it using int)
        int iValue = (int)digits[0]-(int)'0';
        int iDigits = Math.min( kDigits, intDecimalDigits );
        for ( int i=1; i < iDigits; i++ ){
            iValue = iValue*10 + (int)digits[i]-(int)'0';
        }
        lValue = (long)iValue;
        for ( int i=iDigits; i < kDigits; i++ ){
            lValue = lValue*10L + (long)((int)digits[i]-(int)'0');
        }
        dValue = (double)lValue;
        int exp = decExponent-kDigits;
        /*
         * lValue now contains a long integer with the value of
         * the first kDigits digits of the number.
         * dValue contains the (double) of the same.
         */

        if ( nDigits <= maxDecimalDigits ){
            /*
             * possibly an easy case.
             * We know that the digits can be represented
             * exactly. And if the exponent isn't too outrageous,
             * the whole thing can be done with one operation,
             * thus one rounding error.
             * Note that all our constructors trim all leading and
             * trailing zeros, so simple values (including zero)
             * will always end up here
             */
            if (exp == 0 || dValue == 0.0)
                return (isNegative)? -dValue : dValue; // small floating integer
            else if ( exp >= 0 ){
                if ( exp <= maxSmallTen ){
                    /*
                     * Can get the answer with one operation,
                     * thus one roundoff.
                     */
                    rValue = dValue * small10pow[exp];
                    if ( mustSetRoundDir ){
                        tValue = rValue / small10pow[exp];
                        roundDir = ( tValue == dValue ) ? 0
                                  :( tValue < dValue ) ? 1
                                  : -1;
                    }
                    return (isNegative)? -rValue : rValue;
                }
                int slop = maxDecimalDigits - kDigits;
                if ( exp <= maxSmallTen+slop ){
                    /*
                     * We can multiply dValue by 10^(slop)
                     * and it is still "small" and exact.
                     * Then we can multiply by 10^(exp-slop)
                     * with one rounding.
                     */
                    dValue *= small10pow[slop];
                    rValue = dValue * small10pow[exp-slop];

                    if ( mustSetRoundDir ){
                        tValue = rValue / small10pow[exp-slop];
                        roundDir = ( tValue == dValue ) ? 0
                                  :( tValue < dValue ) ? 1
                                  : -1;
                    }
                    return (isNegative)? -rValue : rValue;
                }
                /*
                 * Else we have a hard case with a positive exp.
                 */
            } else {
                if ( exp >= -maxSmallTen ){
                    /*
                     * Can get the answer in one division.
                     */
                    rValue = dValue / small10pow[-exp];
                    tValue = rValue * small10pow[-exp];
                    if ( mustSetRoundDir ){
                        roundDir = ( tValue == dValue ) ? 0
                                  :( tValue < dValue ) ? 1
                                  : -1;
                    }
                    return (isNegative)? -rValue : rValue;
                }
                /*
                 * Else we have a hard case with a negative exp.
                 */
            }
        }

        /*
         * Harder cases:
         * The sum of digits plus exponent is greater than
         * what we think we can do with one error.
         *
         * Start by approximating the right answer by,
         * naively, scaling by powers of 10.
         */
        if ( exp > 0 ){
            if ( decExponent > maxDecimalExponent+1 ){
                /*
                 * Lets face it. This is going to be
                 * Infinity. Cut to the chase.
                 */
                return (isNegative)? Double.NEGATIVE_INFINITY : Double.POSITIVE_INFINITY;
            }
            if ( (exp&15) != 0 ){
                dValue *= small10pow[exp&15];
            }
            if ( (exp>>=4) != 0 ){
                int j;
                for( j = 0; exp > 1; j++, exp>>=1 ){
                    if ( (exp&1)!=0)
                        dValue *= big10pow[j];
                }
                /*
                 * The reason for the weird exp > 1 condition
                 * in the above loop was so that the last multiply
                 * would get unrolled. We handle it here.
                 * It could overflow.
                 */
                double t = dValue * big10pow[j];
                if ( Double.isInfinite( t ) ){
                    /*
                     * It did overflow.
                     * Look more closely at the result.
                     * If the exponent is just one too large,
                     * then use the maximum finite as our estimate
                     * value. Else call the result infinity
                     * and punt it.
                     * ( I presume this could happen because
                     * rounding forces the result here to be
                     * an ULP or two larger than
                     * Double.MAX_VALUE ).
                     */
                    t = dValue / 2.0;
                    t *= big10pow[j];
                    if ( Double.isInfinite( t ) ){
                        return (isNegative)? Double.NEGATIVE_INFINITY : Double.POSITIVE_INFINITY;
                    }
                    t = Double.MAX_VALUE;
                }
                dValue = t;
            }
        } else if ( exp < 0 ){
            exp = -exp;
            if ( decExponent < minDecimalExponent-1 ){
                /*
                 * Lets face it. This is going to be
                 * zero. Cut to the chase.
                 */
                return (isNegative)? -0.0 : 0.0;
            }
            if ( (exp&15) != 0 ){
                dValue /= small10pow[exp&15];
            }
            if ( (exp>>=4) != 0 ){
                int j;
                for( j = 0; exp > 1; j++, exp>>=1 ){
                    if ( (exp&1)!=0)
                        dValue *= tiny10pow[j];
                }
                /*
                 * The reason for the weird exp > 1 condition
                 * in the above loop was so that the last multiply
                 * would get unrolled. We handle it here.
                 * It could underflow.
                 */
                double t = dValue * tiny10pow[j];
                if ( t == 0.0 ){
                    /*
                     * It did underflow.
                     * Look more closely at the result.
                     * If the exponent is just one too small,
                     * then use the minimum finite as our estimate
                     * value. Else call the result 0.0
                     * and punt it.
                     * ( I presume this could happen because
                     * rounding forces the result here to be
                     * an ULP or two less than
                     * Double.MIN_VALUE ).
                     */
                    t = dValue * 2.0;
                    t *= tiny10pow[j];
                    if ( t == 0.0 ){
                        return (isNegative)? -0.0 : 0.0;
                    }
                    t = Double.MIN_VALUE;
                }
                dValue = t;
            }
        }

        /*
         * dValue is now approximately the result.
         * The hard part is adjusting it, by comparison
         * with FDBigInt arithmetic.
         * Formulate the EXACT big-number result as
         * bigD0 * 10^exp
         */
        FDBigInt bigD0 = new FDBigInt( lValue, digits, kDigits, nDigits );
        exp = decExponent - nDigits;

        correctionLoop:
        while(true){
            /* AS A SIDE EFFECT, THIS METHOD WILL SET THE INSTANCE VARIABLES
             * bigIntExp and bigIntNBits
             */
            FDBigInt bigB = doubleToBigInt( dValue );

            /*
             * Scale bigD, bigB appropriately for
             * big-integer operations.
             * Naively, we multiply by powers of ten
             * and powers of two. What we actually do
             * is keep track of the powers of 5 and
             * powers of 2 we would use, then factor out
             * common divisors before doing the work.
             */
            int B2, B5; // powers of 2, 5 in bigB
            int D2, D5; // powers of 2, 5 in bigD
            int Ulp2;   // powers of 2 in halfUlp.
            if ( exp >= 0 ){
                B2 = B5 = 0;
                D2 = D5 = exp;
            } else {
                B2 = B5 = -exp;
                D2 = D5 = 0;
            }
            if ( bigIntExp >= 0 ){
                B2 += bigIntExp;
            } else {
                D2 -= bigIntExp;
            }
            Ulp2 = B2;
            // shift bigB and bigD left by a number s. t.
            // halfUlp is still an integer.
            int hulpbias;
            if ( bigIntExp+bigIntNBits <= -expBias+1 ){
                // This is going to be a denormalized number
                // (if not actually zero).
                // half an ULP is at 2^-(expBias+expShift+1)
                hulpbias = bigIntExp+ expBias + expShift;
            } else {
                hulpbias = expShift + 2 - bigIntNBits;
            }
            B2 += hulpbias;
            D2 += hulpbias;
            // if there are common factors of 2, we might just as well
            // factor them out, as they add nothing useful.
            int common2 = Math.min( B2, Math.min( D2, Ulp2 ) );
            B2 -= common2;
            D2 -= common2;
            Ulp2 -= common2;
            // do multiplications by powers of 5 and 2
            bigB = multPow52( bigB, B5, B2 );
            FDBigInt bigD = multPow52( new FDBigInt( bigD0 ), D5, D2 );
            //
            // to recap:
            // bigB is the scaled-big-int version of our floating-point
            // candidate.
            // bigD is the scaled-big-int version of the exact value
            // as we understand it.
            // halfUlp is 1/2 an ulp of bigB, except for special cases
            // of exact powers of 2
            //
            // the plan is to compare bigB with bigD, and if the difference
            // is less than halfUlp, then we're satisfied. Otherwise,
            // use the ratio of difference to halfUlp to calculate a fudge
            // factor to add to the floating value, then go 'round again.
            //
            FDBigInt diff;
            int cmpResult;
            boolean overvalue;
            if ( (cmpResult = bigB.cmp( bigD ) ) > 0 ){
                overvalue = true; // our candidate is too big.
                diff = bigB.sub( bigD );
                if ( (bigIntNBits == 1) && (bigIntExp > -expBias+1) ){
                    // candidate is a normalized exact power of 2 and
                    // is too big. We will be subtracting.
                    // For our purposes, ulp is the ulp of the
                    // next smaller range.
                    Ulp2 -= 1;
                    if ( Ulp2 < 0 ){
                        // rats. Cannot de-scale ulp this far.
                        // must scale diff in other direction.
                        Ulp2 = 0;
                        diff.lshiftMe( 1 );
                    }
                }
            } else if ( cmpResult < 0 ){
                overvalue = false; // our candidate is too small.
                diff = bigD.sub( bigB );
            } else {
                // the candidate is exactly right!
                // this happens with surprising frequency
                break correctionLoop;
            }
            FDBigInt halfUlp = constructPow52( B5, Ulp2 );
            if ( (cmpResult = diff.cmp( halfUlp ) ) < 0 ){
                // difference is small.
                // this is close enough
                if (mustSetRoundDir) {
                    roundDir = overvalue ? -1 : 1;
                }
                break correctionLoop;
            } else if ( cmpResult == 0 ){
                // difference is exactly half an ULP
                // round to some other value maybe, then finish
                dValue += 0.5*ulp( dValue, overvalue );
                // should check for bigIntNBits == 1 here??
                if (mustSetRoundDir) {
                    roundDir = overvalue ? -1 : 1;
                }
                break correctionLoop;
            } else {
                // difference is non-trivial.
                // could scale addend by ratio of difference to
                // halfUlp here, if we bothered to compute that difference.
                // Most of the time ( I hope ) it is about 1 anyway.
                dValue += ulp( dValue, overvalue );
                if ( dValue == 0.0 || dValue == Double.POSITIVE_INFINITY )
                    break correctionLoop; // oops. Fell off end of range.
                continue; // try again.
            }
        }
        return (isNegative)? -dValue : dValue;
    }
}</span>
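The "easy case" near the top of doubleValue() is the part most readers will care about, so here is a simplified standalone sketch of it. The class `FastPathSketch`, the method `easyCase`, and the table `SMALL_10_POW` are hypothetical names for this illustration. The limits of 15 digits and 10^22 are assumptions standing in for `maxDecimalDigits` and `maxSmallTen`, and the sketch assumes the digits represent 0.d1d2... × 10^decExponent, which is what `exp = decExponent-kDigits` in the listing implies.

```java
// Simplified sketch of the "easy case"; names and limits are assumptions, not the JDK's code.
final class FastPathSketch {
    // Powers of ten that a double represents exactly (10^0 .. 10^22).
    private static final double[] SMALL_10_POW = {
        1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 1e11,
        1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
    };
    private static final int MAX_DECIMAL_DIGITS = 15; // assumed stand-in for maxDecimalDigits
    private static final int MAX_SMALL_TEN = SMALL_10_POW.length - 1;

    /** Converts nDigits decimal digits, read as 0.d1d2... x 10^decExponent, to a double. */
    static double easyCase(char[] digits, int nDigits, int decExponent) {
        if (nDigits > MAX_DECIMAL_DIGITS) {
            throw new IllegalArgumentException("too many digits for the easy case");
        }
        long lValue = 0;
        for (int i = 0; i < nDigits; i++) {
            lValue = lValue * 10L + (digits[i] - '0'); // exact: at most 15 digits fit in a long
        }
        double dValue = (double) lValue;               // exact: 15 digits fit in 53 bits
        int exp = decExponent - nDigits;               // value is lValue x 10^exp
        if (exp == 0 || dValue == 0.0) {
            return dValue;
        }
        if (exp > 0 && exp <= MAX_SMALL_TEN) {
            return dValue * SMALL_10_POW[exp];         // one multiplication, one rounding
        }
        if (exp < 0 && -exp <= MAX_SMALL_TEN) {
            return dValue / SMALL_10_POW[-exp];        // one division, one rounding
        }
        throw new IllegalArgumentException("hard case: needs the correction loop");
    }

    public static void main(String[] args) {
        // "3.125" is stored as digits "3125" with decExponent 1.
        System.out.println(easyCase("3125".toCharArray(), 4, 1)); // prints 3.125
    }
}
```

Because both operands are exact and IEEE 754 multiplication and division round correctly, a single operation yields the correctly rounded result. Everything after this branch in the listing exists for inputs where that shortcut is not available.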
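That is the full source of the implementation, listed here to save you the trouble of looking it up. Now let's go through the conversion step by step.
First, much as the C implementation skips leading whitespace, Java strips the whitespace at both ends of the string: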
<span style="font-size:18px;"> in = in.trim();</span>
Check whether the string is empty:
<span style="font-size:18px;"> int l = in.length();
 if (l == 0)
     throw new NumberFormatException("empty String");</span>
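The effect of these two steps can be observed from the outside through Double.parseDouble. The small demo below is only a sketch: `TrimAndEmptyDemo` and its helper `rejects` are hypothetical names introduced for this example.

```java
// Demonstrates the trimming and empty-string checks via the public API.
final class TrimAndEmptyDemo {
    /** Returns true if Double.parseDouble refuses the input with a NumberFormatException. */
    static boolean rejects(String s) {
        try {
            Double.parseDouble(s);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Double.parseDouble("  1.5\n")); // prints 1.5: surrounding whitespace is ignored
        System.out.println(rejects(""));                   // prints true: empty string
        System.out.println(rejects("   "));                // prints true: empty after trimming
    }
}
```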
Take the first character and determine whether the number is positive or negative:
<span style="font-size:18px;"> int i = 0;
 switch (c = in.charAt(i)) {
 case '-':
     isNegative = true;
     //FALLTHROUGH
 case '+':
     i++;
     signSeen = true;
 }</span>
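Check whether the input is Infinity or NaN (not a number, an undefined numeric result such as 0.0 / 0.0; dividing a nonzero value by 0.0 gives Infinity instead):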
<span style="font-size:18px;"> c = in.charAt(i);
 if (c == 'N' || c == 'I'); // possible NaN or Infinity (branch body not shown here)</span>
If it is neither Infinity nor NaN, check whether it is a hexadecimal floating-point number:
<span style="font-size:18px;"> else if (c == '0'); // possible hexadecimal literal (branch body not shown here)</span>
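The dispatch on the first significant character can be sketched as follows. `PrefixClassifier`, its `Kind` enum, and `classify` are hypothetical names for this illustration; the real parser does considerably more validation in each branch, and the check for an `x` after the `0` is an assumption about what the hexadecimal branch looks for.

```java
// Sketch of how the leading characters steer the parser; not the JDK's actual code.
final class PrefixClassifier {
    enum Kind { SPECIAL, HEX, DECIMAL }

    static Kind classify(String in) {
        in = in.trim();
        if (in.length() == 0) {
            throw new NumberFormatException("empty String");
        }
        int i = 0;
        char c = in.charAt(i);
        if (c == '-' || c == '+') {
            i++; // skip the sign
        }
        if (i >= in.length()) {
            throw new NumberFormatException("sign without digits");
        }
        c = in.charAt(i);
        if (c == 'N' || c == 'I') {
            return Kind.SPECIAL; // candidate for "NaN" or "Infinity"
        }
        if (c == '0' && i + 1 < in.length()
                && (in.charAt(i + 1) == 'x' || in.charAt(i + 1) == 'X')) {
            return Kind.HEX;     // candidate for a hexadecimal floating-point literal
        }
        return Kind.DECIMAL;
    }

    public static void main(String[] args) {
        System.out.println(classify("-Infinity"));           // prints SPECIAL
        System.out.println(classify("0x1.8p1"));             // prints HEX
        System.out.println(Double.parseDouble("0x1.8p1"));   // prints 3.0 (1.5 x 2^1)
        System.out.println(Double.isNaN(Double.parseDouble("NaN"))); // prints true
    }
}
```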
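After that, the digit characters of the string are extracted one by one and stored in an array.
Java then provides a method that converts this character array into a number, namely doubleValue().
In doubleValue() we can see: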
<span style="font-size:18px;">int iValue = (int)digits[0]-(int)'0';</span>
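Each character is converted to its numeric value directly with a cast and a subtraction. The remaining work is error checking and handling scientific notation.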
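To show roughly what "putting the digits in an array" and "handling scientific notation" amount to, here is a minimal sketch. `DigitCollector` and `collect` are hypothetical names; the sketch assumes unsigned, well-formed input and, unlike the JDK code (whose comments mention trimming leading and trailing zeros), it keeps every digit as written.

```java
// Sketch: split a decimal string into a digit array plus a decimal exponent.
// The value represented is 0.d1d2...dn x 10^decExponent.
final class DigitCollector {
    final char[] digits;
    final int nDigits;
    final int decExponent;

    private DigitCollector(char[] digits, int nDigits, int decExponent) {
        this.digits = digits;
        this.nDigits = nDigits;
        this.decExponent = decExponent;
    }

    static DigitCollector collect(String s) {
        char[] buf = new char[s.length()];
        int n = 0;
        int pointAt = -1; // number of digits seen before '.', or -1 if no '.' yet
        int i = 0;
        for (; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                buf[n++] = c;
            } else if (c == '.' && pointAt < 0) {
                pointAt = n;
            } else {
                break; // anything else ends the mantissa
            }
        }
        int exp = (pointAt < 0) ? n : pointAt; // digits before the decimal point
        if (i < s.length() && (s.charAt(i) == 'e' || s.charAt(i) == 'E')) {
            exp += Integer.parseInt(s.substring(i + 1)); // scientific notation shifts the point
        }
        return new DigitCollector(buf, n, exp);
    }

    public static void main(String[] args) {
        DigitCollector d = collect("123.45e2");
        // digits "12345", decExponent 5, i.e. 0.12345 x 10^5 = 12345
        System.out.println(new String(d.digits, 0, d.nDigits) + " " + d.decExponent);
    }
}
```

With the input split this way, doubleValue() only has to turn the digit array into an integer and scale it by a power of ten.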
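Summary: as the walkthrough shows, judged purely on the algorithm, Java comes off looking rather weak here. An algorithm that C can implement in a dozen or so lines takes Java well over a hundred lines of code.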
atof (a Java implementation of converting a string to a floating-point number)
Time: 2024-10-21 18:25:37