Thursday, March 5, 2015

[Perl] Perl seems to round decimals with 16 or more digits

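Perl seems to round decimals once they have 16 or more digits in total (integer and fractional parts combined). More precisely, this is what print shows: on typical builds Perl converts a number to a string with 15 significant digits, so the rounding happens at output time, not in the stored value. In the experiments below, a 16th digit of 5 was rounded down and a 16th digit of 6 was rounded up. The 5 case is presumably rounded down because the literal cannot be represented exactly in binary, and the nearest double is slightly smaller than the value as written.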

$ perl -e '
$tmp = 100.9999999999999;
print $tmp."\n";
'
101

$ perl -e '                   
$tmp = 100.999999999999;
print $tmp."\n";
'
100.999999999999

$ perl -e '
$tmp = 1000.999999999999;
print $tmp."\n";
'
1001

$ perl -e '
$tmp = 1000.99999999999;
print $tmp."\n";
'
1000.99999999999

$ perl -e '
$tmp = 1000.999999999995;
print $tmp."\n";
'
1000.99999999999

$ perl -e '
$tmp = 1000.999999999996;
print $tmp."\n";
'
1001
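
To see that the rounding happens only when the number is turned into a string, the stored value can be printed with more digits. The following is a minimal sketch; the helper names as_default and as_full are made up for illustration, and it assumes a Perl build whose default number formatting is equivalent to %.15g (the usual case).

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of any module.
# as_default: mimics what print shows, assuming the default format is %.15g.
sub as_default { my ($n) = @_; return sprintf('%.15g', $n); }

# as_full: 17 significant digits, enough to tell any two doubles apart.
sub as_full { my ($n) = @_; return sprintf('%.17g', $n); }

for my $n (100.9999999999999, 1000.999999999995, 1000.999999999996) {
    printf "%-18s %s\n", as_default($n), as_full($n);
}

# The stored value is still not equal to 101, even though print shows 101.
print "still different from 101\n" if 100.9999999999999 != 101;
```

When the exact digits matter in output, an explicit printf or sprintf format is a safer choice than relying on the default conversion.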

Tuesday, March 3, 2015

[Perl] Better not to implement decimal truncation with POSIX floor

While implementing decimal truncation logic with the floor function from POSIX, I ran into some puzzling behavior, so I am recording it here.

The code below implements logic that truncates an input number to two decimal places, and tests it while incrementing the input from 0.01 to 10.00 in steps of 0.01.

use strict;
use warnings;
use POSIX qw(floor);
use Test::More;

for (1..1000) {
    my $num = $_ / 100;
    is((floor($num*100))/100, $num, "$_/100 = $num case");
}

done_testing();


Test results

$ prove test.t
test.t .. 1/?
#   Failed test '29/100 = 0.29 case'
#   at test.t line 8.
#          got: '0.28'
#     expected: '0.29'

#   Failed test '57/100 = 0.57 case'
#   at test.t line 8.
#          got: '0.56'
#     expected: '0.57'

#   Failed test '58/100 = 0.58 case'
#   at test.t line 8.
#          got: '0.57'
#     expected: '0.58'

#   Failed test '113/100 = 1.13 case'
#   at test.t line 8.
#          got: '1.12'
#     expected: '1.13'

#   Failed test '114/100 = 1.14 case'
#   at test.t line 8.
#          got: '1.13'
#     expected: '1.14'

#   Failed test '115/100 = 1.15 case'
#   at test.t line 8.
#          got: '1.14'
#     expected: '1.15'

#   Failed test '116/100 = 1.16 case'
#   at test.t line 8.
#          got: '1.15'
#     expected: '1.16'

#   Failed test '201/100 = 2.01 case'
#   at test.t line 8.
#          got: '2'
#     expected: '2.01'

#   Failed test '203/100 = 2.03 case'
#   at test.t line 8.
#          got: '2.02'
#     expected: '2.03'

#   Failed test '205/100 = 2.05 case'
#   at test.t line 8.
#          got: '2.04'
#     expected: '2.05'

#   Failed test '207/100 = 2.07 case'
#   at test.t line 8.
#          got: '2.06'
#     expected: '2.07'

#   Failed test '226/100 = 2.26 case'
#   at test.t line 8.
#          got: '2.25'
#     expected: '2.26'

#   Failed test '228/100 = 2.28 case'
#   at test.t line 8.
#          got: '2.27'
#     expected: '2.28'

#   Failed test '230/100 = 2.3 case'
#   at test.t line 8.
#          got: '2.29'
#     expected: '2.3'

#   Failed test '232/100 = 2.32 case'
#   at test.t line 8.
#          got: '2.31'
#     expected: '2.32'

#   Failed test '251/100 = 2.51 case'
#   at test.t line 8.
#          got: '2.5'
#     expected: '2.51'

#   Failed test '253/100 = 2.53 case'
#   at test.t line 8.
#          got: '2.52'
#     expected: '2.53'

#   Failed test '255/100 = 2.55 case'
#   at test.t line 8.
#          got: '2.54'
#     expected: '2.55'

#   Failed test '402/100 = 4.02 case'
#   at test.t line 8.
#          got: '4.01'
#     expected: '4.02'

#   Failed test '406/100 = 4.06 case'
#   at test.t line 8.
#          got: '4.05'
#     expected: '4.06'

#   Failed test '410/100 = 4.1 case'
#   at test.t line 8.
#          got: '4.09'
#     expected: '4.1'

#   Failed test '414/100 = 4.14 case'
#   at test.t line 8.
#          got: '4.13'
#     expected: '4.14'

#   Failed test '427/100 = 4.27 case'
#   at test.t line 8.
#          got: '4.26'
#     expected: '4.27'

#   Failed test '431/100 = 4.31 case'
#   at test.t line 8.
#          got: '4.3'
#     expected: '4.31'

#   Failed test '435/100 = 4.35 case'
#   at test.t line 8.
#          got: '4.34'
#     expected: '4.35'

#   Failed test '439/100 = 4.39 case'
#   at test.t line 8.
#          got: '4.38'
#     expected: '4.39'

#   Failed test '452/100 = 4.52 case'
#   at test.t line 8.
#          got: '4.51'
#     expected: '4.52'

#   Failed test '456/100 = 4.56 case'
#   at test.t line 8.
#          got: '4.55'
#     expected: '4.56'

#   Failed test '460/100 = 4.6 case'
#   at test.t line 8.
#          got: '4.59'
#     expected: '4.6'

#   Failed test '464/100 = 4.64 case'
#   at test.t line 8.
#          got: '4.63'
#     expected: '4.64'

#   Failed test '477/100 = 4.77 case'
#   at test.t line 8.
#          got: '4.76'
#     expected: '4.77'

#   Failed test '481/100 = 4.81 case'
#   at test.t line 8.
#          got: '4.8'
#     expected: '4.81'

#   Failed test '485/100 = 4.85 case'
#   at test.t line 8.
#          got: '4.84'
#     expected: '4.85'

#   Failed test '489/100 = 4.89 case'
#   at test.t line 8.
#          got: '4.88'
#     expected: '4.89'

#   Failed test '502/100 = 5.02 case'
#   at test.t line 8.
#          got: '5.01'
#     expected: '5.02'

#   Failed test '506/100 = 5.06 case'
#   at test.t line 8.
#          got: '5.05'
#     expected: '5.06'

#   Failed test '510/100 = 5.1 case'
#   at test.t line 8.
#          got: '5.09'
#     expected: '5.1'

#   Failed test '803/100 = 8.03 case'
#   at test.t line 8.
#          got: '8.02'
#     expected: '8.03'

#   Failed test '804/100 = 8.04 case'
#   at test.t line 8.
#          got: '8.03'
#     expected: '8.04'

#   Failed test '812/100 = 8.12 case'
#   at test.t line 8.
#          got: '8.11'
#     expected: '8.12'

#   Failed test '820/100 = 8.2 case'
#   at test.t line 8.
#          got: '8.19'
#     expected: '8.2'

#   Failed test '828/100 = 8.28 case'
#   at test.t line 8.
#          got: '8.27'
#     expected: '8.28'

#   Failed test '829/100 = 8.29 case'
#   at test.t line 8.
#          got: '8.28'
#     expected: '8.29'

#   Failed test '837/100 = 8.37 case'
#   at test.t line 8.
#          got: '8.36'
#     expected: '8.37'

#   Failed test '845/100 = 8.45 case'
#   at test.t line 8.
#          got: '8.44'
#     expected: '8.45'

#   Failed test '853/100 = 8.53 case'
#   at test.t line 8.
#          got: '8.52'
#     expected: '8.53'

#   Failed test '854/100 = 8.54 case'
#   at test.t line 8.
#          got: '8.53'
#     expected: '8.54'

#   Failed test '862/100 = 8.62 case'
#   at test.t line 8.
#          got: '8.61'
#     expected: '8.62'

#   Failed test '870/100 = 8.7 case'
#   at test.t line 8.
#          got: '8.69'
#     expected: '8.7'

#   Failed test '878/100 = 8.78 case'
#   at test.t line 8.
#          got: '8.77'
#     expected: '8.78'

#   Failed test '879/100 = 8.79 case'
#   at test.t line 8.
#          got: '8.78'
#     expected: '8.79'

#   Failed test '887/100 = 8.87 case'
#   at test.t line 8.
#          got: '8.86'
#     expected: '8.87'

#   Failed test '895/100 = 8.95 case'
#   at test.t line 8.
#          got: '8.94'
#     expected: '8.95'

#   Failed test '903/100 = 9.03 case'
#   at test.t line 8.
#          got: '9.02'
#     expected: '9.03'

#   Failed test '904/100 = 9.04 case'
#   at test.t line 8.
#          got: '9.03'
#     expected: '9.04'

#   Failed test '912/100 = 9.12 case'
#   at test.t line 8.
#          got: '9.11'
#     expected: '9.12'

#   Failed test '920/100 = 9.2 case'
#   at test.t line 8.
#          got: '9.19'
#     expected: '9.2'

#   Failed test '928/100 = 9.28 case'
#   at test.t line 8.
#          got: '9.27'
#     expected: '9.28'

#   Failed test '929/100 = 9.29 case'
#   at test.t line 8.
#          got: '9.28'
#     expected: '9.29'

#   Failed test '937/100 = 9.37 case'
#   at test.t line 8.
#          got: '9.36'
#     expected: '9.37'

#   Failed test '945/100 = 9.45 case'
#   at test.t line 8.
#          got: '9.44'
#     expected: '9.45'

#   Failed test '953/100 = 9.53 case'
#   at test.t line 8.
#          got: '9.52'
#     expected: '9.53'

#   Failed test '954/100 = 9.54 case'
#   at test.t line 8.
#          got: '9.53'
#     expected: '9.54'

#   Failed test '962/100 = 9.62 case'
#   at test.t line 8.
#          got: '9.61'
#     expected: '9.62'

#   Failed test '970/100 = 9.7 case'
#   at test.t line 8.
#          got: '9.69'
#     expected: '9.7'

#   Failed test '978/100 = 9.78 case'
#   at test.t line 8.
#          got: '9.77'
#     expected: '9.78'

#   Failed test '979/100 = 9.79 case'
#   at test.t line 8.
#          got: '9.78'
#     expected: '9.79'

#   Failed test '987/100 = 9.87 case'
#   at test.t line 8.
#          got: '9.86'
#     expected: '9.87'

#   Failed test '995/100 = 9.95 case'
#   at test.t line 8.
#          got: '9.94'
#     expected: '9.95'
# Looks like you failed 69 tests of 1000.
test.t .. Dubious, test returned 69 (wstat 17664, 0x4500)
Failed 69/1000 subtests

Test Summary Report
-------------------
test.t (Wstat: 17664 Tests: 1000 Failed: 69)
  Failed tests:  29, 57-58, 113-116, 201, 203, 205, 207
                226, 228, 230, 232, 251, 253, 255, 402
                406, 410, 414, 427, 431, 435, 439, 452
                456, 460, 464, 477, 481, 485, 489, 502
                506, 510, 803-804, 812, 820, 828-829, 837
                845, 853-854, 862, 870, 878-879, 887, 895
                903-904, 912, 920, 928-929, 937, 945, 953-954
                962, 970, 978-979, 987, 995
  Non-zero exit status: 69
Files=1, Tests=1000,  0 wallclock secs ( 0.18 usr  0.02 sys +  0.29 cusr  0.01 csys =  0.50 CPU)
Result: FAIL

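So, for example, truncating 9.95 to two decimal places with floor yields 9.94. What makes this nasty is that most inputs pass (about 93% in the run above, 931 of 1000), so a handful of hand-picked test cases will likely pass, the bug goes unnoticed, and it only surfaces once the system is running for real.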
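
The pass rate can also be checked without Test::More by counting the inputs for which the truncation changes the value. This is a rough sketch: failing_inputs is a name made up for this example, and the result assumes ordinary IEEE 754 double-precision numbers.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: returns every n in 1..$max for which truncating
# n/100 to two decimal places with floor() does not give n/100 back.
sub failing_inputs {
    my ($max) = @_;
    return grep {
        my $num = $_ / 100;
        floor($num * 100) / 100 != $num;
    } 1 .. $max;
}

my @failed = failing_inputs(1000);
printf "%d of 1000 failed (%.1f%% passed)\n",
    scalar @failed, 100 * (1000 - @failed) / 1000;
```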

Incidentally, the same thing happens with int. Replace floor with int in the script above and run it again, and you should see the same result.
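
A small side-by-side check makes this easy to confirm. In this sketch, trunc_floor and trunc_int are hypothetical helper names; both multiply by 100, drop the fraction, and divide by 100 again. Printing the intermediate product with 17 significant digits shows the value that floor and int actually receive.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helpers: truncate to two decimal places via floor() and int().
sub trunc_floor { my ($n) = @_; return floor($n * 100) / 100; }
sub trunc_int   { my ($n) = @_; return int($n * 100) / 100; }

for my $n (0.29, 9.95) {
    # %.17g exposes the product that floor() and int() are applied to.
    printf "%s: product=%.17g floor=%s int=%s\n",
        $n, $n * 100, trunc_floor($n), trunc_int($n);
}
```

For the positive inputs used in the test the two behave identically. They differ only for negative numbers, where int truncates toward zero and floor rounds toward negative infinity.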

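The cause is most likely that floating-point numbers are handled internally in binary. A value such as 0.29 has no exact binary representation, so 0.29 * 100 comes out as 28.999999999999996 rather than 29, and floor cuts that down to 28. Given these results, any code that truncates below the Nth decimal place needs to be tested over a wide range of values like the above, and since int and floor cannot be relied on here, I think the implementation has to fall back on something like regular expressions.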
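
As a rough sketch of what such a regular-expression-based implementation could look like, the function below works on the string form of the number rather than on arithmetic. The name truncate_decimal is made up for this example. It assumes the number stringifies in plain decimal notation (no exponent), relies on Perl's default stringification (which itself rounds to about 15 significant digits), and truncates toward zero, so negative inputs behave like int rather than floor.

```perl
use strict;
use warnings;

# Hypothetical helper: cut a number off after $places decimal digits by
# editing its string form, so no floating-point multiplication is involved.
sub truncate_decimal {
    my ($num, $places) = @_;
    my $str = "$num";    # default stringification

    # Accept only plain decimal notation such as "9.95" or "-12".
    $str =~ /\A(-?\d+)(?:\.(\d+))?\z/
        or die "unsupported number format: $str";
    my ($int, $frac) = ($1, defined $2 ? $2 : '');

    $frac = substr($frac, 0, $places);    # keep at most $places digits
    return length($frac) ? "$int.$frac" : $int;
}

print truncate_decimal(9.95, 2),  "\n";
print truncate_decimal(1.239, 2), "\n";
```

Run over the same 1..1000 inputs as the test above, this version should return every value unchanged, because the default string form of each n/100 already has at most two decimal places.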